Recipe 14.8. Authenticating with a Proxy for HTTPS Navigation
Credit: John Nielsen
Problem
You need to use httplib for HTTPS navigation through a proxy that
requires basic authentication, but httplib out of the box supports
HTTPS only through proxies that do not require authentication.
Solution
Unfortunately, it takes a wafer-thin amount of trickery to achieve
this recipe's task. Here is a script that is just
tricky enough:
import httplib, base64, socket
# parameters for the script
user = 'proxy_login'; passwd = 'proxy_pass'
host = 'login.yahoo.com'; port = 443
phost = 'proxy_host'; pport = 80
# setup basic authentication
# b64encode, unlike encodestring, appends no newline to break the header
user_pass = base64.b64encode(user+':'+passwd)
proxy_authorization = 'Proxy-authorization: Basic '+user_pass+'\r\n'
proxy_connect = 'CONNECT %s:%s HTTP/1.0\r\n' % (host, port)
user_agent = 'User-Agent: python\r\n'
proxy_pieces = proxy_connect+proxy_authorization+user_agent+'\r\n'
# connect to the proxy
proxy_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
proxy_socket.connect((phost, pport))
# proxy_pieces already ends with the blank line that closes the request
proxy_socket.sendall(proxy_pieces)
response = proxy_socket.recv(8192)
status = response.split( )[1]
if status != '200':
    raise IOError, 'Connecting to proxy: status=%s' % status
# trivial setup for SSL socket
ssl = socket.ssl(proxy_socket, None, None)
sock = httplib.FakeSocket(proxy_socket, ssl)
# initialize httplib and replace the connection's socket with the SSL one
h = httplib.HTTPConnection('localhost')
h.sock = sock
# and finally, use the now-HTTPS httplib connection as you wish
h.request('GET', '/')
r = h.getresponse( )
print r.read( )
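The request-building and status-checking steps of the script can also be isolated into small helpers, which makes them easy to check without a live proxy. What follows is only a minimal sketch, not part of the original recipe: the names build_connect_request and parse_proxy_status are hypothetical, the code assumes ASCII credentials, and it assumes the proxy's reply has already been read and is available as text.

```python
import base64


def build_connect_request(user, passwd, host, port):
    """Return the CONNECT request text, including the blank line that ends it."""
    # b64encode adds no trailing newline, so the header line stays intact.
    credentials = (user + ':' + passwd).encode('ascii')
    token = base64.b64encode(credentials).decode('ascii')
    lines = [
        'CONNECT %s:%s HTTP/1.0' % (host, port),
        'Proxy-Authorization: Basic ' + token,
        'User-Agent: python',
    ]
    # Each header line ends with CRLF; one extra CRLF closes the request.
    return '\r\n'.join(lines) + '\r\n\r\n'


def parse_proxy_status(response):
    """Return the status code (as a string) from the proxy's first reply line."""
    status_line = response.split('\r\n', 1)[0]
    parts = status_line.split()
    if len(parts) < 2:
        raise IOError('Malformed proxy response: %r' % status_line)
    return parts[1]
```

With helpers like these, the script would send build_connect_request(user, passwd, host, port) over the proxy socket and go on to the SSL step only if parse_proxy_status returns '200'.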
Discussion
HTTPS is essentially HTTP spoken on top of an SSL connection rather
than a plain socket. So, this recipe connects to the proxy with basic
authentication at the very lowest level of Python socket programming,
wraps an SSL socket around the proxy connection thus obtained, and
finally plays a little trick under httplib's nose to use that
laboriously constructed SSL socket in place of the plain socket in an
HTTPConnection instance. From that point onwards, you can use the
normal httplib approach as you wish.
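In later Python versions, where httplib was renamed http.client, the connection classes offer a set_tunnel method that accepts extra headers for the CONNECT request, so the socket-level trick may not be needed there. The following is a hedged sketch assuming Python 3: the helper name make_tunneled_connection is hypothetical, and whether a particular proxy accepts credentials supplied this way has not been verified here.

```python
import base64
import http.client


def make_tunneled_connection(user, passwd, host, port, phost, pport):
    """Prepare an HTTPS connection to host:port tunneled through phost:pport."""
    credentials = (user + ':' + passwd).encode('ascii')
    token = base64.b64encode(credentials).decode('ascii')
    # The connection object itself is addressed at the proxy...
    conn = http.client.HTTPSConnection(phost, pport)
    # ...while set_tunnel records the real target and the extra headers
    # to send along with the CONNECT request.
    conn.set_tunnel(host, port,
                    headers={'Proxy-Authorization': 'Basic ' + token})
    # Nothing is sent over the network until a request is made.
    return conn
```

A caller would then use the returned object as usual, for example conn.request('GET', '/') followed by conn.getresponse().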
See Also
Documentation on the socket and httplib standard library modules in
the Library Reference and Python in a Nutshell.