
Proxies

Accessing information via a URL is easily done using urllib, but it is not always able to detect proxy settings automatically. Manually setting a proxy is not that difficult, and it even handles authentication.

# Python 2; for Python 3 use urllib.request instead
import urllib
from contextlib import closing

# The key is the protocol, the value is the proxy to use for it
proxy = { 'http': 'http://proxy.server:port' }
conn = urllib.FancyURLopener(proxy)

# closing() lets the response be used in a with statement on Python 2
with closing(conn.open("http://www.bbc.co.uk/")) as page:
    print "Header\n", page.info(), "\nPage\n", page.read()

The proxy passed is a dictionary where the key is the protocol and the value is the proxy to use for it. Different protocols can be given different proxies, although this is unlikely in practice.
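As a small sketch (the proxy addresses and ports here are made up), the dictionary might route HTTP and HTTPS through separate proxies, and credentials for an authenticating proxy can be embedded in the proxy URL:

# Hypothetical proxy addresses; user:password is used for proxy authentication
proxies = { 'http': 'http://user:password@proxy.server:3128',
            'https': 'http://other.proxy.server:8080' }
conn = urllib.FancyURLopener(proxies)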

Once you have the connection, you can call its open method and use the result like a file. If the with statement seems unusual, see this post for an explanation.

Python 3 users should note that the various read methods return binary data, just as a binary file read would. You can convert this to a (unicode) string by calling .decode() on the returned bytes.
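A rough Python 3 sketch of the same request (reusing the placeholder proxy address) goes through urllib.request with a ProxyHandler and decodes the result:

# Python 3; 'proxy.server:port' is still a placeholder
import urllib.request

opener = urllib.request.build_opener(
    urllib.request.ProxyHandler({ 'http': 'http://proxy.server:port' }))
with opener.open("http://www.bbc.co.uk/") as page:
    data = page.read()                      # bytes
    print("Header\n", page.info())
    print("Page\n", data.decode('utf-8'))   # decode bytes to a string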

If you are only using HTTP and you need more control over what is being sent and received, you can use httplib. The added power brings added complexity, but to use a proxy you simply pass its address to the connection and request the full URL, as shown

import httplib

# Connect to the proxy rather than the target host
conn = httplib.HTTPConnection("proxy.server:port") # or "proxy.server", port
# Request the full URL so the proxy knows where to forward it
conn.request("GET", "http://www.bbc.co.uk/")
resp = conn.getresponse()
print resp.status
conn.close()
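For Python 3 users, a rough equivalent (same placeholder proxy address) uses http.client, which replaced httplib:

# Python 3 sketch; the proxy address is a placeholder
import http.client

conn = http.client.HTTPConnection("proxy.server:port") # or "proxy.server", port
conn.request("GET", "http://www.bbc.co.uk/")           # full URL so the proxy can forward it
resp = conn.getresponse()
print(resp.status, resp.reason)
print(resp.read().decode('utf-8'))
conn.close()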