I want to fetch an image (GIF format) from a website, so I use Tornado's built-in asynchronous HTTP client to do it. My code is as follows:

import tornado.httpclient
import tornado.ioloop
import tornado.gen
import tornado.web

tornado.httpclient.AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient")
http_client = tornado.httpclient.AsyncHTTPClient()

class test(tornado.web.RequestHandler):
    @tornado.gen.coroutine
    def get(self):
        content = yield http_client.fetch('http://www.baidu.com/img/bdlogo.gif')
        print('=====', type(content.body))

application = tornado.web.Application([
    (r'/', test)
    ])
application.listen(80)
tornado.ioloop.IOLoop.instance().start()

When I visit the server, it should fetch the GIF file. However, it raises an exception:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x81 in position 8: invalid start byte
ERROR:tornado.application:Uncaught exception GET / (127.0.0.1)
HTTPRequest(protocol='http', host='127.0.0.1', method='GET', uri='/', version='HTTP/1.1', remote_ip='127.0.0.1', headers={'Accept-Language': 'zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3', 'Accept-Encoding': 'gzip, deflate', 'Host': '127.0.0.1', 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 'User-Agent': 'Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130922 Firefox/17.0', 'Connection': 'keep-alive', 'Cache-Control': 'max-age=0', 'If-None-Match': '"da39a3ee5e6b4b0d3255bfef95601890afd80709"'})
Traceback (most recent call last):
  File "/usr/lib/python3.2/site-packages/tornado/web.py", line 1144, in _when_complete
    if result.result() is not None:
  File "/usr/lib/python3.2/site-packages/tornado/concurrent.py", line 129, in result
    raise_exc_info(self.__exc_info)
  File "<string>", line 3, in raise_exc_info
  File "/usr/lib/python3.2/site-packages/tornado/stack_context.py", line 302, in wrapped
    ret = fn(*args, **kwargs)
  File "/usr/lib/python3.2/site-packages/tornado/gen.py", line 550, in inner
    self.set_result(key, result)
  File "/usr/lib/python3.2/site-packages/tornado/gen.py", line 476, in set_result
    self.run()
  File "/usr/lib/python3.2/site-packages/tornado/gen.py", line 505, in run
    yielded = self.gen.throw(*exc_info)
  File "test.py", line 12, in get
    content = yield http_client.fetch('http://www.baidu.com/img/bdlogo.gif')
  File "/usr/lib/python3.2/site-packages/tornado/gen.py", line 496, in run
    next = self.yield_point.get_result()
  File "/usr/lib/python3.2/site-packages/tornado/gen.py", line 395, in get_result
    return self.runner.pop_result(self.key).result()
  File "/usr/lib/python3.2/concurrent/futures/_base.py", line 393, in result
    return self.__get_result()
  File "/usr/lib/python3.2/concurrent/futures/_base.py", line 352, in __get_result
    raise self._exception
tornado.curl_httpclient.CurlError: HTTP 599: Failed writing body (0 != 1024)
ERROR:tornado.access:500 GET / (127.0.0.1) 131.53ms

It seems to be trying to decode my binary file as UTF-8 text, which is unnecessary. If I comment out

tornado.httpclient.AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient")

so that Tornado uses the simple HTTP client instead of pycurl, it works well. (It tells me that the type of content.body is bytes.)

So if it returns a bytes object, why does it try to decode it? I think the problem is in pycurl or in Tornado's wrapper around pycurl, right?

My Python version is 3.2.5, with Tornado 3.1.1 and pycurl 7.19.

Thanks!

riaqn

1 Answer

pycurl 7.19 doesn't support Python 3. Ubuntu (and possibly other Linux distributions) ship a modified version of pycurl that partially works with Python 3, but it doesn't work with Tornado (https://github.com/facebook/tornado/issues/671), and fails with an exception that looks like the one you're seeing here.

Until there's a new version of pycurl that officially supports Python 3 (or you use the change suggested in that Tornado bug report), I'm afraid you'll need to either go back to Python 2.7 or use Tornado's simple_httpclient instead.
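
To illustrate why a decode step fails on a GIF in particular, here is a minimal sketch in plain Python (no Tornado or pycurl involved). It assumes the image starts with a standard GIF89a header and that the logo is 270x129 pixels. Those dimensions are a guess, chosen because a height of 129 puts byte 0x81 at offset 8, which matches the error message in the question. The variable names are made up for the illustration.

```python
import struct

# Hypothetical first 10 bytes of the GIF: the 6-byte signature followed by
# the little-endian width and height. The 270x129 size is an assumption.
header = b"GIF89a" + struct.pack("<HH", 270, 129)

# Try to treat the binary data as UTF-8 text and keep the error, if any.
try:
    header.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# The raw bytes are perfectly usable as long as nothing tries to decode them.
print(type(header), header[:6])
print(error)
```

Under these assumptions the decode should fail at position 8 on byte 0x81, while the bytes object itself stays intact. That is consistent with the body being fine as bytes and the failure coming from a layer that expects str.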

Ben Darnell
  • Thanks for the reply! I tried that, and pycurl seems to work. But Tornado expects pycurl to return the header and the body as strings (now they are bytes, as they should be!). So it looks like Tornado does not get along well with Python 3 and needs a few modifications to run under it. I am not a Python expert, so I chose to give up and use the simple HTTP client instead. Thank you anyway! – riaqn Sep 23 '13 at 11:26
  • The rest of Tornado works fine on Python 3; it's just curl_httpclient that has problems. These will be fixed after there is an official pycurl release that supports Python 3. – Ben Darnell Sep 23 '13 at 14:31
  • According to http://pycurl.sourceforge.net/doc/release-notes.html, it looks like pycurl has been updated to supposedly support Python 3. – proteneer Feb 12 '14 at 19:53