Convert Gzipped Data Fetched By Urllib2 To Html
I currently use mechanize to read a gzipped web page as below:
br = mechanize.Browser()
br.set_handle_gzip(True)
response = br.open(url)
data = response.read()
I wonder how to decom
Solution 1:
Try this:
import StringIO
data = StringIO.StringIO(data)
import gzip
gzipper = gzip.GzipFile(fileobj=data)
html = gzipper.read()
html should now hold the HTML (print it to see).
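The snippet above targets Python 2, where the StringIO module exists. As a rough sketch of the same idea on Python 3 (an assumption, since the question uses Python 2 libraries), io.BytesIO can play the role of StringIO. The helper name gunzip_bytes and the sample data are hypothetical and only stand in for a real response body:

```python
import gzip
import io


def gunzip_bytes(data):
    """Decompress a gzip-compressed byte string and return the raw bytes."""
    # Wrap the bytes in a file-like object, as Solution 1 does with StringIO.
    with gzip.GzipFile(fileobj=io.BytesIO(data)) as gzipper:
        return gzipper.read()


# Sample data standing in for a fetched, still-compressed response body.
sample = gzip.compress(b"<html><body>hello</body></html>")
html = gunzip_bytes(sample)
print(html.decode("utf-8"))
```

The result is bytes, so it still has to be decoded with the page's character set (UTF-8 is assumed here) before it can be treated as text.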
Solution 2:
def ungzip(r, b):
    headers = r.info()
    if ('Content-Encoding' in headers.keys() and headers['Content-Encoding'] == 'gzip') or \
       ('content-encoding' in headers.keys() and headers['content-encoding'] == 'gzip'):
        import gzip
        gz = gzip.GzipFile(fileobj=r, mode='rb')
        html = gz.read()
        gz.close()
        headers['Content-type'] = 'text/html; charset=utf-8'
        r.set_data(html)
        b.set_response(r)
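Solution 2 checks two spellings of the header name because HTTP header names are case-insensitive. A possible way to express that check more generally, sketched here for Python 3 with a plain dict standing in for the response headers (the function name maybe_gunzip and the dict-based headers are assumptions, not part of mechanize):

```python
import gzip


def maybe_gunzip(body, headers):
    """Return body decompressed if the headers declare gzip, else unchanged."""
    # Header names are case-insensitive, so normalize them before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    encoding = normalized.get("content-encoding", "").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    return body


# Hypothetical inputs: one compressed body and one plain body.
compressed = gzip.compress(b"<p>hi</p>")
print(maybe_gunzip(compressed, {"CONTENT-ENCODING": "gzip"}))
print(maybe_gunzip(b"<p>hi</p>", {}))
```

This only decides whether to decompress; putting the result back into the response, as set_data and set_response do above, is left to the calling code.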