Have you ever got an IncompleteRead exception while trying to fetch chunked data with urllib2? I have. Look at the snippet:
import urllib2, httplib

try:
    data = urllib2.urlopen('http://some.url/address').read()
except httplib.IncompleteRead:
    # Achtung! At this point you lose any fetched data except the last chunk.
    pass
In real life most broken servers do transmit all the data, but due to implementation errors they close the session incorrectly, so urllib raises the error and buries your precious bytes.
What can you do to handle such a situation?
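The fetched bytes are not irretrievably gone: IncompleteRead carries everything received so far in its partial attribute, so the simplest per-call fix is to catch the exception and keep that. A sketch in Python 3 terms, where urllib2 and httplib became urllib.request and http.client (the URL and the fetch helper are illustrative, not from any library):

```python
import http.client
import urllib.request

def fetch(url):
    # Return the full body, or whatever arrived before the broken close.
    try:
        return urllib.request.urlopen(url).read()
    except http.client.IncompleteRead as e:
        return e.partial  # bytes the server managed to send
```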
I don't like solutions that involve a manual data-reading loop, so I prefer to patch the read function.
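For reference, such a manual loop looks roughly like this (a Python 3 sketch, where httplib is http.client; read_all is a hypothetical helper, not a library function):

```python
import http.client

def read_all(response, chunk_size=8192):
    # Read the body chunk by chunk, keeping everything received so far.
    chunks = []
    try:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
    except http.client.IncompleteRead as e:
        chunks.append(e.partial)  # salvage the truncated final read
    return b''.join(chunks)
```

It works, but you have to remember to call it everywhere instead of a plain read().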
import httplib

def patch_http_response_read(func):
    def inner(*args):
        try:
            return func(*args)
        except httplib.IncompleteRead, e:
            # Return whatever was read before the connection broke.
            return e.partial
    return inner

httplib.HTTPResponse.read = patch_http_response_read(httplib.HTTPResponse.read)
It lets you deal with defective HTTP servers.
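For what it's worth, the same monkey-patch ported to Python 3, where httplib is http.client (a sketch; note that patching a stdlib method globally affects every consumer in the process):

```python
import http.client

def patch_http_response_read(func):
    def inner(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except http.client.IncompleteRead as e:
            return e.partial  # hand back the bytes that did arrive
    return inner

http.client.HTTPResponse.read = patch_http_response_read(
    http.client.HTTPResponse.read)
```

Since urllib.request reads responses through http.client.HTTPResponse, requests made with urlopen pick up the patched behavior too.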