Proxies

Accessing information via a URL is easily done using urllib, but it is not always able to detect proxy settings automatically. Setting a proxy manually is not difficult, and urllib even handles proxy authentication.

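# Python 2; in Python 3 these functions live in urllib.request
import urllib
from contextlib import closing

# Replace proxy.server:port with the address of your proxy
proxy = { 'http': 'http://proxy.server:port' }
conn = urllib.FancyURLopener(proxy)
# In Python 2 the response is not a context manager, so wrap it in closing()
with closing(conn.open("http://www.bbc.co.uk/")) as page:
    print "Header\n",page.info(),"\nPage\n",page.read()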

The proxy passed is a dictionary in which each key is a protocol and each value is the proxy to use for it. Different protocols can have different proxies, although this is unlikely in practice.
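For Python 3, a roughly equivalent setup can be sketched with urllib.request.ProxyHandler and build_opener. This is only a minimal sketch under stated assumptions: the helper name build_proxy_opener and the address proxy.example:8080 are placeholders invented for illustration, so substitute your own proxy before opening a URL.

```python
import urllib.request


def build_proxy_opener(proxies):
    """Return an opener that sends requests through the given proxies.

    `proxies` maps a protocol name to a proxy URL, in the same
    dictionary form described above.
    """
    handler = urllib.request.ProxyHandler(proxies)
    return urllib.request.build_opener(handler)


# Hypothetical proxy address; replace it with a real one.
proxies = {"http": "http://proxy.example:8080"}
opener = build_proxy_opener(proxies)

# Uncomment once the proxy address is real:
# with opener.open("http://www.bbc.co.uk/") as page:
#     print(page.info())
#     print(page.read().decode("utf-8", errors="replace"))
```

Nothing is sent over the network until opener.open is called, so the opener can be built once and reused for several requests.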

Once you have a connection, you can call its open method and use the result like a file. If the with statement seems unusual, see this post for an explanation.

Python 3 users should note that the various read methods return binary data (bytes), just as reading a file opened in binary mode would. You can convert this to a (unicode) string by calling bytes.decode().
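The conversion can be illustrated without a network connection. In this small sketch the bytes literal stands in for whatever page.read() would return, and the helper name decode_body and the choice of UTF-8 are assumptions; a real page may declare a different encoding in its headers.

```python
def decode_body(raw, encoding="utf-8"):
    """Convert the bytes returned by read() into a str."""
    return raw.decode(encoding)


raw = b"caf\xc3\xa9"      # stands in for page.read()
text = decode_body(raw)
print(type(raw).__name__, "->", type(text).__name__)  # bytes -> str
```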

If you are only using HTTP and need more control over what is sent and received, you can use httplib (http.client in Python 3). The added power brings added complexity, but adding a proxy is simple: connect to the proxy instead of the target server and request the full URL, as shown.

import httplib
conn = httplib.HTTPConnection("proxy.server:port") # or "proxy.server",port
conn.request("GET","http://www.bbc.co.uk/")
resp = conn.getresponse()
print resp.status
conn.close()
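In Python 3 the same idea might look like the sketch below, using http.client. The function names proxy_connection and fetch_status and the address proxy.example:8080 are hypothetical, chosen only for illustration. The connection object is created without contacting the proxy; traffic only starts when request is called.

```python
import http.client


def proxy_connection(proxy_address, timeout=10):
    """Create (but do not yet open) a connection to the proxy itself."""
    return http.client.HTTPConnection(proxy_address, timeout=timeout)


def fetch_status(proxy_address, url):
    """Ask the proxy for `url` and return the HTTP status code."""
    conn = proxy_connection(proxy_address)
    try:
        # The full URL, not just the path, tells the proxy where to go.
        conn.request("GET", url)
        return conn.getresponse().status
    finally:
        conn.close()


# Hypothetical usage; proxy.example:8080 is a placeholder address.
# print(fetch_status("proxy.example:8080", "http://www.bbc.co.uk/"))
```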

Using WITH

It is not obvious what benefit the with statement brings. For example, the preferred way of opening a file is to use the with statement rather than a plain assignment, as shown below.

# preferred method
with open('filename.ext') as myfile1:
  data1 = myfile1.read() # do something

# old method
myfile2 = open('filename.ext')
data2 = myfile2.read() # do something
myfile2.close()

In other languages, with is little more than a shortcut to save typing fully qualified names, but that is not what is happening here. So why is this preferred? Those paying attention will have noticed that I did not close myfile1. This is not a typo.

The reason for using the with statement is that it ensures the object is cleaned up once the block has finished with it. In the old method, if the read had raised an exception, myfile2 would have been left open. Using with, the cleanup is done automatically, whether the end of the block is reached or an exception occurs. For the file object returned by open, cleanup naturally means closing the file.
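A quick way to convince yourself of this is to raise an exception inside the block and then inspect the file object afterwards. The sketch below uses a throwaway temporary file, and the RuntimeError is simulated purely for the demonstration.

```python
import os
import tempfile

# Create a small throwaway file to read from.
fd, path = tempfile.mkstemp()
os.write(fd, b"some data")
os.close(fd)

captured = None
try:
    with open(path) as handle:
        captured = handle          # keep a reference for inspection
        handle.read()
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass

print(captured.closed)  # True: the file was closed despite the exception
os.remove(path)
```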

You can of course achieve the same result using try and finally, at the expense of several ugly lines that do nothing for readability. For those looking for a more technical explanation, and for how to make a class that can be used with the with statement, have a look at Fredrik Lundh’s post here.
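As a rough illustration of both points, the sketch below first writes the file read with try and finally, then defines a minimal class that works with the with statement by providing __enter__ and __exit__ methods. The names read_all and Resource are made up for this example and are not taken from the linked post.

```python
def read_all(path):
    """Equivalent of the with block, written with try/finally."""
    handle = open(path)
    try:
        return handle.read()
    finally:
        handle.close()  # runs on success and on exception


class Resource:
    """Hypothetical class that can be used with the with statement."""

    def __init__(self):
        self.is_open = False

    def __enter__(self):
        self.is_open = True
        return self          # the value bound by "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.is_open = False
        return False         # do not suppress exceptions


with Resource() as res:
    inside = res.is_open
print(inside, res.is_open)  # True False
```

Returning False from __exit__ lets any exception raised inside the block propagate after the cleanup has run, which mirrors how a file object behaves.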

One final note is for Python 2.5 users, who will not be able to run the code fragment above as it stands. Although the with statement was added in 2.5, it has to be enabled there by importing it from the __future__ module. From 2.6 onwards the statement is always available. A __future__ import is a compiler directive, so it must appear at the top of the module (only the docstring, comments and other __future__ imports may come before it) and cannot be placed inside an if block as a runtime version check. Fortunately the import is harmless on later versions, so code that may run on 2.5 can simply include it unconditionally.

# Must be at the top of the module: needed on Python 2.5, harmless on later versions
from __future__ import with_statement