Proxies

Accessing information via a URL is easy with urllib, but it cannot always detect proxy settings automatically. Setting a proxy manually is not difficult, and it even handles authentication.
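Before setting a proxy by hand, it can help to see what urllib detected on its own. The following is a minimal sketch for Python 3, assuming a hypothetical proxy at proxy.example:3128; the environment variable is set inside the script only to simulate a configured shell.

```python
import os
import urllib.request


def detected_proxies():
    """Return the protocol -> proxy mapping that urllib found by itself."""
    return urllib.request.getproxies()


# Hypothetical proxy address, set here only to simulate a configured shell.
os.environ["http_proxy"] = "http://proxy.example:3128"

# Prints a dictionary such as {'http': 'http://proxy.example:3128', ...};
# other entries depend on the machine the script runs on.
print(detected_proxies())
```

If the dictionary comes back empty or wrong, that is the case where a manual proxy setting is needed.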

    # Python 3; in Python 2 these names live in the top-level urllib module.
    # FancyURLopener has been deprecated since Python 3.3.
    import urllib.request

    # Map each protocol to its proxy; replace server and port with real values.
    proxy = {'http': 'http://proxy.server:port'}
    conn = urllib.request.FancyURLopener(proxy)
    with conn.open("http://www.bbc.co.uk/") as page:
        print("Header\n", page.info(), "\nPage\n", page.read())

The proxy passed is a dictionary where each key is a protocol and each value is the proxy to use for it. Different protocols can have different proxies, although this is unlikely in practice.
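The same protocol-to-proxy dictionary can also be handed to urllib.request.ProxyHandler, which is the non-deprecated route in Python 3. This is a rough sketch rather than the method used above; make_proxy_opener is a hypothetical helper name and proxy.example:3128 is a placeholder address.

```python
import urllib.request


def make_proxy_opener(proxies):
    """Build an opener that sends each listed protocol through its proxy."""
    handler = urllib.request.ProxyHandler(proxies)
    return urllib.request.build_opener(handler)


# Hypothetical proxy address; substitute your own host and port.
proxies = {
    "http": "http://proxy.example:3128",
    "https": "http://proxy.example:3128",
}
opener = make_proxy_opener(proxies)

# Calling opener.open("http://www.bbc.co.uk/") would now go via the proxy.
```

Building the opener does not touch the network; the proxy is only contacted when open is called.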

Once you have a connection, you can call its open method and use the result like a file. If the with statement seems unusual, see this post for an explanation.

Python 3 users should note that the various read methods return binary data (bytes), just as reading a file opened in binary mode would. You can convert this to a (Unicode) string by calling bytes.decode().
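Reading and decoding can be combined in a small helper. The sketch below is for Python 3; fetch_text is a hypothetical name, and a data: URL stands in for a real page so the example runs without a network or proxy. It assumes the response carries a charset in its headers and falls back to UTF-8 otherwise.

```python
import urllib.request


def fetch_text(url, default_charset="utf-8"):
    """Open url, read the body as bytes and decode it into a str."""
    with urllib.request.urlopen(url) as page:
        raw = page.read()  # bytes in Python 3
        # Use the charset named in the headers, if the response has one.
        charset = page.headers.get_content_charset() or default_charset
    return raw.decode(charset)


# A data: URL keeps the sketch self-contained; a real URL works the same way.
print(fetch_text("data:text/plain;charset=utf-8,hello%20world"))
```

Guessing the wrong charset raises UnicodeDecodeError, so for untrusted pages passing errors="replace" to decode may be the safer choice.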

If you are only using HTTP and need more control over what is sent and received, you can use httplib (http.client in Python 3). The added power brings added complexity, but to use a proxy you simply pass the proxy address to the connection constructor, as shown:

    # Python 3; in Python 2 the module is named httplib.
    import http.client

    # Connect to the proxy itself, then request the full target URL through it.
    conn = http.client.HTTPConnection("proxy.server:port")  # or ("proxy.server", port)
    conn.request("GET", "http://www.bbc.co.uk/")
    resp = conn.getresponse()
    print(resp.status)
    conn.close()
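The example above works because the connection is made to the proxy while the request itself names the full target URL. urllib's Request.set_proxy performs the same split, which makes it easy to inspect without touching the network. A minimal sketch for Python 3, where split_for_proxy is a hypothetical helper and proxy.example:3128 a placeholder address:

```python
import urllib.request


def split_for_proxy(url, proxy_hostport, scheme="http"):
    """Return (host to connect to, target to request) for a proxied request."""
    req = urllib.request.Request(url)
    req.set_proxy(proxy_hostport, scheme)
    return req.host, req.selector


connect_to, request_target = split_for_proxy(
    "http://www.bbc.co.uk/", "proxy.example:3128"
)
print(connect_to)      # proxy.example:3128
print(request_target)  # http://www.bbc.co.uk/
```

These are the same two values the httplib example passes by hand: the first to the connection and the second to request.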