Writing lists to a file

After creating my list in the previous post, I needed to save it to a text file to send back to the person who had requested the information. I just wrote a standard loop over the list, but it got me thinking about whether there was a better way to do this, as it’s a common task. As it turns out there isn’t really, but it did bring up a few titbits worth mentioning.

If you are from a Perl background you might be tempted to use a single write and a join to turn the list into a string, as follows:

file.write("\n".join(mylist))

Don’t. This builds a new string in memory from the whole list and then writes that. It is OK for a small list but a huge waste of memory for larger ones.
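For comparison, the standard loop mentioned at the start writes one item at a time, so no joined copy of the whole list is ever held in memory. A minimal sketch, assuming the items are already strings; the helper name write_lines and the temporary output path are made up for illustration:

```python
import os
import tempfile

def write_lines(path, items):
    # Write each item on its own line without building one big string first
    with open(path, 'w') as txtfile:
        for item in items:
            txtfile.write(item + "\n")

mylist = ['alpha', 'beta', 'gamma']
# Hypothetical location: a temp directory is used so the sketch runs anywhere
outpath = os.path.join(tempfile.mkdtemp(), 'file.txt')
write_lines(outpath, mylist)

# Read the file back to see what was written
with open(outpath) as txtfile:
    written = txtfile.read()
```

Note that, unlike the join version, every line here ends with a newline, including the last one.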

For the more experienced Python user, a generator expression arguably produces cleaner syntax and saves a line of typing. As with all generator expressions, you can make it conditional by adding an if clause to the end.

with open(r'x:\path\to\file.txt','w') as txtfile:
  txtfile.writelines( "%s\n" % item for item in mylist )
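As a sketch of the conditional form, the version below skips empty items by adding an if clause to the generator. The sample data and the temporary output path are assumptions for illustration only:

```python
import os
import tempfile

mylist = ['apple', '', 'banana', '', 'cherry']
# Hypothetical location: a temp directory stands in for the real path
outpath = os.path.join(tempfile.mkdtemp(), 'file.txt')

with open(outpath, 'w') as txtfile:
    # The trailing "if item" drops empty strings before they are written
    txtfile.writelines("%s\n" % item for item in mylist if item)

# Read the file back to see what was written
with open(outpath) as txtfile:
    written = txtfile.read()
```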

That covers the text file, but there may be a better option depending upon what is going to be done with the resulting file. I’ll finish with the following couple of examples.

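# if the file is going to Excel, maybe a CSV file
import csv
# Python 2: open with 'wb'; Python 3: keep 'w' and also pass newline=''
with open(r'x:\path\to\file.csv','w') as csvfile:
  writer = csv.writer(csvfile)  # not named csv, which would shadow the module
  for item in mylist:
    writer.writerow([item])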

# or how about json if it is going to another program
import json # Python 2.5 use simplejson
with open(r'x:\path\to\file.json','w') as jsonfile:
  json.dump(mylist,jsonfile)

See the Python docs for the csv module and json module.
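If you want to check that either file holds what you expect, it can be read back with the same module. A small round-trip sketch, assuming Python 3 (for the newline='' argument) and a temporary directory in place of the real path:

```python
import csv
import json
import os
import tempfile

mylist = ['1', '2', '2', '3']
tmpdir = tempfile.mkdtemp()  # hypothetical location for the sketch
csvpath = os.path.join(tmpdir, 'file.csv')
jsonpath = os.path.join(tmpdir, 'file.json')

# CSV: one item per row; newline='' is what the Python 3 csv docs recommend
with open(csvpath, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    for item in mylist:
        writer.writerow([item])
with open(csvpath, newline='') as csvfile:
    from_csv = [row[0] for row in csv.reader(csvfile)]

# JSON: the whole list is written and read back in one call each
with open(jsonpath, 'w') as jsonfile:
    json.dump(mylist, jsonfile)
with open(jsonpath) as jsonfile:
    from_json = json.load(jsonfile)
```

Bear in mind that CSV always gives you strings back, whereas JSON keeps numbers as numbers.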

Pre-processing a file with a generator

While answering a forum post about a function that processed a list, I got thinking about how it would run in a real-life situation. Rather than a list, it would probably be passed a file. This almost worked, except that the line endings were passed in too and I needed those stripped out. I was hoping to find an elegant solution and I did: a generator.

If you have not used generators before, this wiki post is a good starting point. If you have used list comprehensions then the syntax is the same, just with round brackets instead of square ones; the difference is that the items are produced one at a time rather than built into a list up front. I’ll use collections.Counter() in place of the function to demonstrate; those using a Python version earlier than 2.7 will need to create their own function.
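To make the comparison concrete, here is a small sketch of the two forms side by side. The sample lines are made up for illustration:

```python
lines = ['1\n', '2\n', '2\n']

# List comprehension: square brackets, the whole result is built at once
as_list = [line.strip() for line in lines]

# Generator expression: round brackets, items are produced on demand
as_gen = (line.strip() for line in lines)

first = next(as_gen)   # pulls only the first item
rest = list(as_gen)    # consumes whatever is left
```

Once a generator has been consumed it is empty, so it suits a single pass over the data, which is exactly what reading a file is.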

First an example with a list which acts as the starting point:

def basicCounter ( mylist ):
  # Python 2.7+ users could use collections.Counter instead
  retdic = dict()
  for item in mylist:
    retdic[item] = retdic.get(item,0) + 1
  return retdic

mylist = ['1','2','2','3','3','3']
counted = basicCounter(mylist)
print(counted)
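For those on Python 2.7 or later, a sketch of the collections.Counter equivalent is below. basicCounter is repeated only so that the block runs on its own and the two results can be compared:

```python
from collections import Counter

def basicCounter(mylist):
    # Same function as above, repeated so this sketch is self-contained
    retdic = dict()
    for item in mylist:
        retdic[item] = retdic.get(item, 0) + 1
    return retdic

mylist = ['1', '2', '2', '3', '3', '3']
counted = Counter(mylist)

# Counter is a dict subclass; converting to a plain dict makes the comparison explicit
same = (dict(counted) == basicCounter(mylist))
```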

Now let’s create a generator to process the lines in a file, removing the surrounding whitespace and line endings. The strip() method does this for a single string; we just need to apply it to every line in the file. This gives us our generator: (line.strip() for line in file).

Add a bit of code for opening the file and we have our version of the above which uses the contents of a file for the input instead.

#  basicCounter as before
# Python 2.5 users need the following line
# from __future__ import with_statement
with open(r'C:\path\to\file.txt') as myfile:
  counted = basicCounter(line.strip() for line in myfile)
print(counted)

There is nothing to stop you making the processing much more complex; simply create your function and replace line.strip() with yourfunction(line). You can also make the processing conditional by adding an if clause at the end.
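As a sketch of both ideas together, the example below swaps line.strip() for a custom function and adds an if clause to skip blank lines. The function name clean, the sample file contents and the temporary path are all assumptions for illustration:

```python
import os
import tempfile

def basicCounter(mylist):
    # Same counting function as before, repeated so the sketch is self-contained
    retdic = dict()
    for item in mylist:
        retdic[item] = retdic.get(item, 0) + 1
    return retdic

def clean(line):
    # Hypothetical pre-processing step: trim whitespace and lower-case the text
    return line.strip().lower()

# Build a small sample file in a temp directory so the sketch runs anywhere
path = os.path.join(tempfile.mkdtemp(), 'file.txt')
with open(path, 'w') as sample:
    sample.write("Apple\n\nbanana \nAPPLE\n  \n")

with open(path) as myfile:
    # The if clause skips lines that are empty once stripped
    counted = basicCounter(clean(line) for line in myfile if line.strip())
```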