Pythonlearn:resources-week07

Lecture Notes

Working with files

Before Python can access the contents of a file, it must “open” the file. open() is the built-in function that does this. It takes two parameters, ‘filename’ and ‘mode’; if you leave the second one off, as in ‘open(filename)’, Python opens the file in read mode (‘r’). Opening a file does not read its data. Instead it creates a connection between Python and the file on the hard drive, referred to as a “handle”, through which the data can then be read.
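
A minimal sketch of the idea that open() only hands back a handle and that data is read on request. The file name handle_demo.txt and the temporary folder are assumptions made so the example is self-contained; it is written to run on both Python 2 and Python 3.

```python
import os
import tempfile

# Hypothetical sample file, created so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "handle_demo.txt")
whand = open(path, "w")        # 'w' mode creates the file for writing
whand.write("first line\nsecond line\n")
whand.close()

fhand = open(path)             # no mode given, so the file is opened for reading
mode = fhand.mode              # the mode the handle was opened with
first = fhand.readline()       # data is read only when we ask for it
fhand.close()

print(mode)
print(repr(first))
```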

If the file cannot be found, open() fails and Python stops with a traceback (an IOError in Python 2).
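
One way to avoid the traceback is to catch the error, sketched below. The helper name try_open and the file name are hypothetical, chosen only for illustration; catching IOError works on both Python 2 and Python 3.

```python
def try_open(filename):
    """Return a file handle, or None if the file cannot be opened."""
    try:
        return open(filename)
    except IOError:  # raised when the file is missing or unreadable
        return None

# Hypothetical name of a file that does not exist.
fhand = try_open("no-such-file-for-this-demo.txt")
print(fhand is None)
```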

The newline character

‘\n’ represents a newline. A newline is a single character, even though it is written with two characters (a backslash and an n) in Python source code.
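
A quick check of this claim, using a made-up string:

```python
text = "Hello\nWorld"

print(len("\n"))    # the newline counts as a single character
print(len(text))    # 5 letters + 1 newline + 5 letters
print(text)         # shown on two lines
print(repr(text))   # repr shows the escape sequence instead
```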

Code patterns

Counting lines in a file

fhand = open("words.txt")
count = 0
for line in fhand:
    count = count + 1
print "line count", count

Reading the whole file

fhand = open("words.txt")
inp = fhand.read()  # reads the whole file into memory as one string
print len(inp)  # prints the number of characters in the file
print inp  # prints the whole file

Searching through a file

fhand = open("mbox-short.txt")
for line in fhand:
    # line = line.rstrip()
    if line.startswith("From "):
        print line
  • Note that line.strip() or line.rstrip() should be used to get rid of the trailing \n (newline character). Without it, print adds a second newline and the output is double-spaced.
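
A small sketch of what rstrip() does to a line as it comes out of a file. The sample line is hypothetical.

```python
line = "From someone@example.com\n"   # hypothetical line as read from a file
stripped = line.rstrip()              # removes trailing whitespace, including \n

print(len(line) - len(stripped))      # how many characters were removed
print(line.endswith("\n"))
print(stripped.endswith("\n"))
```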

Chapter 7 Exercises

The above page is intended as a place for students to work out solutions and answers to the exercises from the textbook. Please do not post answers to exercises that are actual graded assignments.

The file exercises introduce the idea of opening a text file and assigning it to a handle.

In both tasks we only need to read. Therefore most of the exercises will start with an open/handle idiom:

Opening

inp = raw_input("Enter file name: ")  # asks for a file name
fhand = open(inp)  # opens the file and keeps the handle

A more robust idiom guards the call to open():

inp = raw_input("Enter file name: ")  # asks for a file name
try:
    fhand = open(inp)  # opens the file and keeps the handle
except IOError:  # the file is missing or cannot be read
    print "Invalid filename"
    exit()

The try/except tells Python: “try to open the file. If the file cannot be opened, print ‘Invalid filename’ and end the program”.

Working the data

Most of the exercises will ask to:

  • Count or print the lines
  • Find a line with specific data: (e-mail, server name, hours, dates, etc.)

To carry out these tasks we usually use a for loop:

fhand = open("words.txt")
# before the loop, set up any counters you need
for line in fhand:
    # inside the loop, say what needs to be done with each line
    pass
# after the loop, print the results

Some examples of complete loops are given above in the section “Code patterns”.

Create and write in a file

The lecture is all about how to open and read a file, so here is a very simple example of how to create a file and write something to it.

The code below reads the file ‘mbox.txt’, finds all the ‘From:’ lines whose address ends in ‘umich.edu’, and writes these email addresses into a file named ‘mailaddress.txt’.

fhand = open('mbox.txt','r')
whand = open('mailaddress.txt','w')
for line in fhand:
    # don't forget the newline '\n', otherwise nothing will be written to the new file
    if line.startswith('From:') and line.endswith('umich.edu\n'):
        # 'From: ' is 6 characters long, so the address starts at index 6;
        # you could also locate it with 'find'
        whand.write(line[6:])
fhand.close()
whand.close()
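
The comment in the code above hints that find could replace the hand-counted index. A minimal sketch of that idea follows; the function name extract_address and the sample lines are hypothetical, not part of the lecture code.

```python
def extract_address(line):
    """Return the text after the first colon of a header line, without the newline."""
    colon = line.find(":")            # index of the first colon, or -1 if absent
    if colon == -1:
        return None
    return line[colon + 1:].strip()   # drop the space after ':' and the trailing \n

print(extract_address("From: someone@umich.edu\n"))
print(extract_address("no colon here\n"))
```

Unlike line[6:], this does not depend on the header being exactly six characters long.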

Closing the file

Though not mentioned in the lecture, it is good practice to close the file handle you obtained from open() once you are done with the file. This may not make a big difference with the small files we read throughout the course, but it tells Python (and the operating system underneath) that they can release the resources allocated for the file. Opening a file typically involves allocating and managing buffers under the hood; closing it lets Python flush these buffers, so that any pending writes reach the disk, and give their resources back for use by other programs.

To close a file, simply call close on the file handle obtained from open.

fhand.close()

A complete scenario for dealing with a file would thus look like this:

fhand = open("words.txt")
# Do whatever you like with the file contents here, like reading them through a loop.
fhand.close()  # Close the file.
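
To see what close() changes, a handle's closed attribute can be inspected before and after the call. This is a sketch under the assumption that a throwaway file in a temporary folder is acceptable; the file name is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "close_demo.txt")

whand = open(path, "w")
whand.write("some text\n")
before = whand.closed      # the handle is still open here
whand.close()              # flushes buffered data and releases the file
after = whand.closed       # the handle now reports that it is closed

try:
    whand.write("more text\n")
    write_failed = False
except ValueError:         # writing to a closed file is an error
    write_failed = True

print(before)
print(after)
print(write_failed)
```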

An even more elegant way to write the previous scenario is to use the with statement:

with open("words.txt") as fhand:
    # Do whatever you want here with the file contents.
    pass

Note that there’s no need to explicitly close the file here: the with statement automatically calls close on the file handle when its block ends, even if the block ends because of an error. The name fhand still exists after the block, but it refers to a closed file, so the file can only be read inside the code block indented under with.
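
A short sketch that checks the automatic close, including the case where the block is left because of an error. The scratch file and the simulated RuntimeError are assumptions for illustration only.

```python
import os
import tempfile

# Hypothetical scratch file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

with open(path, "w") as whand:
    whand.write("line one\n")
closed_after_block = whand.closed      # the handle was closed for us

try:
    with open(path) as fhand:
        raise RuntimeError("simulated problem inside the block")
except RuntimeError:
    pass
closed_after_error = fhand.closed      # closed even though the block failed

print(closed_after_block)
print(closed_after_error)
```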

The with statement is a Python construct that also works with other types, wherever scoped resource management is desired. It can be used with any object that provides the __enter__ and __exit__ methods (a “context manager”), and file handles are such objects. Here it comes in handy for ensuring that the resources associated with a file are released correctly.
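
To make those requirements a little more concrete, here is a toy object that can be used with the with statement. The class name Announce is hypothetical and exists only to show when __enter__ and __exit__ are called; it is not how file handles are implemented.

```python
class Announce(object):
    """Toy context manager that records when the block is entered and left."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("enter")   # runs when the with block starts
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.log.append("exit")    # runs when the with block ends
        return False               # do not hide any exception

events = []
with Announce(events):
    events.append("body")

print(events)
```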

Printing the results

When the loop is finished, you should print your results (e.g. the whole file, the total number of lines, or all the lines containing e-mail addresses).

More Resource Topics

File names and paths:

The filename can also be an absolute or relative path.

For Windows:

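On Windows machines, every backslash in a path written as a Python string must be doubled (for example ‘C:\\folder\\file.txt’), because a single backslash starts an escape sequence such as \n.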
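
The sketch below shows why the doubling matters, and that a raw string (the r prefix) is an alternative to doubling. The paths are hypothetical and nothing is opened, so it runs on any system.

```python
doubled = "C:\\new\\table.txt"   # each \\ becomes one backslash in the string
raw = r"C:\new\table.txt"        # raw string: backslashes are taken literally
broken = "C:\new\table.txt"      # \n and \t turn into a newline and a tab

print(doubled == raw)            # both spell the same path
print("\n" in broken)            # the single-backslash version contains a newline
print(repr(broken))
```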

For Unix / Unix-like (Linux, [Mac] OS X, etc)

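On Unix / Unix-like machines, forward slashes don’t need any special treatment. However, you may notice the ~ shortcut doesn’t work, because ~ is expanded by the shell and not by open(). Relative paths do work, but they are resolved against the current working directory, which may not be the folder containing your script; if in doubt, write the full path.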

A solution:

fname = raw_input("Enter a file name: ")
if fname[0:2] == "~/":  # check whether it starts with a ~ and a slash
    # If it doesn't start with ~/, the user could be referring to
    # a valid file like "~.py" (I checked: it is possible).
    # Note that the replace below is valid on Mac OS X only
    # (and not a good approach overall, because it is not portable).
    fname = fname.replace('~', "/Users/" + raw_input("Enter your short user name: "), 1)
# Strip backslashes so that a folder named '~' can be entered as '\~';
# you can also type './~', which needs no escaping.
workingfname = fname.replace("\\", '')
# go back to the normal program now
handle = open(workingfname, 'r')  # . . .
for line in handle:
    print line
print "\n" + ("That was " + fname + ".").center(40)

Naturally, there is no need to ask the user for their short name if there’s only one user: just replace "/Users/" + raw_input(...) with the path to your home folder. You could even confuse things by redefining the ~ as a shortcut to the folder that has all your Python code (or select another character to use as a shortcut).
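
A more portable alternative to the manual replace, not the approach used above, is the standard library function os.path.expanduser, which expands a leading ~ to the current user's home folder. A minimal sketch, with a hypothetical helper name expand and made-up file names:

```python
import os.path

def expand(fname):
    """Expand a leading ~ to the current user's home folder, if there is one."""
    return os.path.expanduser(fname)

print(expand("~/words.txt"))   # home folder followed by words.txt
print(expand("./~"))           # a file literally named ~ is left alone
print(expand("words.txt"))     # ordinary names are unchanged
```

The result can be passed straight to open(), with no need to ask for a user name.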

Using try, except, and open

Example (http://www.pythonlearn.com/html-270/book008.html – 7.7 Using try, except, and open)

fhand = raw_input('Enter the file name: ')
while True:
    try:
        var_text = open('C:\\...path...\\%s.txt' % (fhand), 'r')
        for line in var_text:
            line = line.rstrip()
            if '@uct.ac.za' not in line:
                continue
            print line
    except IOError:  # the file could not be opened: ask again
        print 'Not Found'
        fhand = raw_input('Enter the file name: ')
        continue
    quit()  # the file was read successfully, so stop