Have you ever hit an IncompleteRead exception while trying to fetch chunked data with urllib2? I have. Look at this snippet:
    import urllib2, httplib

    try:
        data = urllib2.urlopen('http://some.url/address').read()
    except httplib.IncompleteRead:
        # Achtung! At this point you lose any fetched data except the last chunk.
        pass
In real life, most misbehaving servers do transmit all the data, but due to implementation errors they close the session incorrectly, so urllib2 raises an error and buries your precious bytes.
What can you do to handle this situation?
I don't like solutions that involve a manual read loop, so I prefer to patch the read function.
    import httplib

    def patch_http_response_read(func):
        def inner(*args):
            try:
                return func(*args)
            except httplib.IncompleteRead as e:
                # Return whatever was received before the connection broke.
                return e.partial
        return inner

    httplib.HTTPResponse.read = patch_http_response_read(httplib.HTTPResponse.read)
This lets you deal with defective HTTP servers.
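If the patching line runs more than once (say, on a module reload or inside a loop), the wrappers stack on top of each other. Here is a minimal sketch of a guarded variant, not a definitive implementation. The `apply_patch` helper and the `_incomplete_read_patched` marker attribute are names I made up for illustration, and the import fallback assumes that Python 3's `http.client` can stand in for `httplib` here.

```python
import functools

try:
    import httplib  # Python 2
except ImportError:
    import http.client as httplib  # Python 3 name for the same module


def patch_http_response_read(func):
    """Wrap a read function so IncompleteRead returns the partial data."""
    @functools.wraps(func)
    def inner(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except httplib.IncompleteRead as e:
            return e.partial
    # Hypothetical marker so we can tell the wrapper is already installed.
    inner._incomplete_read_patched = True
    return inner


def apply_patch():
    """Install the wrapper once; later calls do nothing."""
    read = httplib.HTTPResponse.read
    if not getattr(read, '_incomplete_read_patched', False):
        httplib.HTTPResponse.read = patch_http_response_read(read)


apply_patch()
apply_patch()  # second call is a no-op, so wrappers do not stack
```

Call `apply_patch()` before the first request. Keep in mind that the returned data may be truncated if the server really did fail mid-transfer.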
That is a nice tip. I have run into this issue quite a few times and couldn't find a way around it. Thanks for the post.
Not a good idea to use the word 'beaver' in a blog title. It has several meanings...
> It has several meanings...
I know, I know.
Thank you very much, you solved my problem.
Does this mean the client will be able to read only a portion of the data/webpage?
> Does this mean the client will be able to read only a portion of the data/webpage?
This is true in the case of a "real" read error (a protocol error, to be exact). However, I have only run into it when the server closes the connection after all the data has been sent.
This totally, 100% saved my day. +10 Karma. Object patching is awesome.
You should run the snippet before using urllib2 functions. In the simple case, copy and paste it into the module header.
Excellent. Just saved my laptop from taking a flight off the balcony! Thanks.
Awesome dude...felt the same !!
Haha... the same... thanks, baverman!
Workaround works with httplib2! Thanks!!!
maximum recursion depth exceeded for inner(*args) function