Lecture Notes
Working with files
Before Python can access file information it must “open” the file. open() is the built-in function which tells Python to open the file. open() takes two parameters ‘filename’ and ‘mode’. If you leave the second parameter off ‘open(filename)’ Python will simply open the file in read mode (‘r’). Opening a file does not cause Python to read all the data in the file, but it makes the information in the file available to Python to use – it creates a connection between Python and the file on the hard drive, referred to as a “Handle”.
If the file cannot be found you will receive a traceback error.
The newline character
‘\n’ represents a newline. Newline is one character, even though it is represented by two.
Code patterns
Counting lines in a file
fhand = open("words.txt") count = 0 for line in fhand: count = count + 1 print "line count", count
Reading the whole file
fhand = open("words.txt") inp = fhand.read() # reads the whole file into memory print len(inp)# returns the number of characters in a file print inp # prints the whole file
Searching through a file
fhand= open("mbox-short.txt") for line in fhand: # line = line.rstrip() if line.startswith("From "): print line
- note that methods something.strip() or something.rstrip() should be used to get rid of the extra \n (new line character).
Chapter 7 Exercises
The above page is intended as a place for students to work out solutions and answers to the exercises from the textbook. Please do not post answers to exercises that are actual graded assignments.
The exercises with file introduce the idea of opening a text file and choosing a handle to it.
In both tasks we need only to read. Therefore most of the exercises will start with a open/ handle idiom:
Opening
inp = raw_input("Enter file name")# asks for a file name fhand= open(inp) # handling the file
A more elegant idiom includes a guardian
inp = raw_input("Enter file name")# asks for a file name try: fhand= open(inp) # handling the file except: print "Invalid filename" exit()
The try/except means to Py “try to open the file. In case the file cannot be opened, print invalid filename and kill the program”.
Working the data
Most of the exercises will ask to:
- Count or print the lines
- Find a line with specific data: (e-mail, server name, hours, dates, etc.)
To run this tasks we usually use a for loop
fhand = open("words.txt") #before the loop you may need to set counters for line in fhand: #after the loop you will give instructions on what needs to be done to each line
Some examples of the full loops are described above in the section “code patterns”
Create and write in a file
In the lecture it’s all about how to open and read a file. So here is a very simple example about how to create a file and write something in it.
The code below will read the file ‘mbox.txt’, and find all emails from ‘umich.edu’, and write these email addresses into a file named ’emailaddress.txt’
fhand = open('mbox.txt','r') whand = open('mailaddress.txt','w') for line in fhand: if line.startswith('From:') and line.endswith('umich.edu\n'):#don't forget the newline '\n', otherwise you will get nothing in your new file whand.write(line[6:])#I simply count the index number, in fact you may use 'find' or something fhand.close() whand.close()
Closing the file
Though not mentioned in the lecture, it’s really useful to close the file handle you opened earlier with open
once we’re done reading the file contents. This mightn’t make a big difference with small files we’re reading throughout the course, but it’ll help python(and the operating system underneath) to optimize its resource usage by telling python and the OS that they can release resources allocated for the file. Opening files typically involves allocating and managing buffers under the hood, and closing a file allows python to flush these buffer and give back their resources so that they can be used by other programs.
To close a file, simply call close
on the file handle obtained from open
.
fhand.close()
A complete scenario for dealing with a file would thus look like this:
fhand = open("words.txt") # Do whatever you like with the file contents here, like reading them through a loop. fhand.close() # Close the file.
An even more elegant way to write the previous scenario is to use the with
statement:
with open("words.txt") as fhand: # Do whatever you want here with the file contents.
Note that there’s no need to explicitly close the file here, since a with
statement automatically calls close
on the file handle once the last statement in its block is executed. fhand
is only usable inside the with
statement, that is, inside the code block indented under with
.
The with
statement is a python construct that can be used for some other types as well, in situations where scoped resource management is desired. The with
statement entails some deeper details on objects it can be used with, and file handles satisfy those requirements to be used with the with
statement. The with
statement comes handy here for ensuring resources associated with file handling are managed correctly.
Printing the results
When the loop is finished you should print you results (e.g. the whole file, the total number of lines, all the lines containing e-mail addresses.
More Resource Topics
Add resources for this chapter to this page..
File names and paths:
The filename can also be an absolute or relative path.
For Windows:
On Windows machines, all backslashes must be doubled.
For Unix / Unix-like (Linux, [Mac] OS X, etc)
On Unix / Unix-like machines, forward slashes don’t need any special treatment. However, you may notice the ~ shortcut doesn’t work. Actuallyrelative paths don’t work. You must write full path.
A solution:
fname = raw_input("Enter a file name: ") if fname[0:2] == "~/": #Check to see if it starts with a ~ and a slash #If it doesn't start with the ~/, then #the user could be referring to a valid file #like "~.py" (I checked: it is possible.) #notice below replace is valid on Mac OSX only (and not a good approach overall, cause not portable at all) fname = fname.replace('~',"/Users/"+raw_input("Enter your short user name: "),1) workingfname = fname.replace("\\",'') #This for proper escaping of a valid folder named '~' as '\~', you can also use './~' as Python automatically escapes for you. #go back to normal program now handle = open(workingfname,'r') # . . . for line in handle: print line print "\n"+("That was "+fname+".").center(40)
Naturally, no need to ask the user for their short name if there’s only one user, just replace the italicized code with the path to your home folder. You could even confuse things by redefining the ~ as a shortcut to the folder that has all your python code! (or select another letter to use as a wildcard.)
Using try, except, and open
Exemple (http://www.pythonlearn.com/html-270/book008.html – 7.7 Using try, except, and open)
fhand = raw_input('Enter the file name: ') while True: try: var_text = open('C:\\...path...\\%s.txt' % (fhand), 'r') for line in var_text: line = line.rstrip() if not '@uct.ac.za' in line: continue print line except: print 'Not Found' fhand = raw_input('Enter the file name: ') continue quit()