8

I'm using urlfetch in my app, and while everything works perfectly in the development environment, I'm finding urlfetch to be very unreliable once the app is deployed. Sometimes it works as it should and retrieves data, but a few minutes later it might return nothing, and a few minutes after that it works again. This is unacceptable. I've checked that the source URL (YQL) is not the problem and, again, everything works as it should in the development environment.

Are there any third-party libraries I could try?

Example code:

url = "http://query.yahooapis.com/v1/public/yql?q=%s&format=json" % urllib.quote_plus(query)
result = urlfetch.fetch(url, deadline=10)

if result.status_code == 200:
    r = json.loads(result.content)
else:
    return

a = r['query']['results']
# Do stuff with 'a'

Sometimes it works as it should, but other times, completely at random and with no code changes, I get this error:

a = r['query']['results']
TypeError: 'NoneType' object is unsubscriptable
Don

3 Answers

11

Sometimes it'll work as it should, but other times completely randomly with no code changes

This is a common symptom of your application's requests exceeding the Yahoo API rate limit.

Quoting the rate-limit section of the Yahoo developer documentation:

IP Based Limits

Our service rate limits are imposed as a limit on the number of API calls made per IP address during a specific time window. If your IP address changes during that time period, you may find yourself with more "credit" available. However, if someone else had been using the address and hit the limit, you'll need to wait until the end of the time period to be allowed to make more API calls.

Google App Engine uses a pool of IP addresses for outgoing urlfetch requests, and your application shares these addresses with other applications that call the same Yahoo endpoint. When the rate limit is exceeded, the endpoint replies with a limit-exceeded error, which causes urlfetch to fail.
Here is another case using the Twitter search API.

When you combine Google App Engine with third-party web APIs, make sure the API offers authenticated calls so that your application gets its own quota (the StackApps API, for example).
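Because the quoted limit resets at the end of a time window, one possible mitigation is to retry with a growing delay. The following is only a sketch under assumptions: `fetch_with_backoff`, its parameters, and the `is_ok` check are hypothetical names, and `fetch` stands for any callable (for instance a thin wrapper around `urlfetch.fetch`) that returns an object with a `status_code` attribute. It does not replace getting a per-application quota.

```python
import time


def fetch_with_backoff(fetch, url, attempts=4, base_delay=1.0,
                       is_ok=None, sleep=time.sleep):
    """Call fetch(url) until is_ok(response) is true or attempts run out.

    Hypothetical helper: `fetch` is any callable returning an object with
    a `status_code` attribute. Returns the last response received, or None
    if the final call raised an exception.
    """
    if is_ok is None:
        # Default success test: a plain HTTP 200.
        is_ok = lambda response: response.status_code == 200

    response = None
    for attempt in range(attempts):
        try:
            response = fetch(url)
            if is_ok(response):
                return response
        except Exception:
            # Catch-all for illustration; narrow it to the errors
            # your fetcher actually raises.
            response = None
        if attempt < attempts - 1:
            # Delays double each time: base_delay, 2x, 4x, ...
            sleep(base_delay * (2 ** attempt))
    return response
```

With urlfetch this might be called as `fetch_with_backoff(lambda u: urlfetch.fetch(u, deadline=10), url)`. Passing a custom `is_ok` lets the caller also treat a 200 response with empty results as a failure worth retrying.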

systempuntoout
  • I use gAppProxy on App Engine as a proxy server, but App Engine's outgoing IP changed 3 times in 10 minutes, so some websites drop my logged-in session – diyism Mar 30 '12 at 10:09
1
import urllib2
response = urllib2.urlopen('http://python.org/')
html = response.read()
Oded Breiner
0

This isn't an error in URLFetch; it's an issue with the JSON being returned. Either json.loads is returning None or r['query'] is, and I'm guessing it's the latter. Try logging result.content to see what the service is returning. You probably also want to check result.status_code.

One possibility is that your request is being denied or rate-limited by Yahoo in production, but not on your development machine.
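The logging advice above could look roughly like this. It is a minimal sketch, not the asker's code: `extract_results` is a hypothetical helper name, and it assumes only the `query` / `results` layout shown in the question. It logs the raw body whenever the expected data is missing instead of raising a TypeError.

```python
import json
import logging


def extract_results(status_code, content):
    """Return payload['query']['results'], or None if it is missing.

    Hypothetical helper: logs the raw body in every failure case so the
    service's actual reply can be inspected later.
    """
    if status_code != 200:
        logging.warning("Unexpected status %s, body: %r", status_code, content)
        return None
    try:
        payload = json.loads(content)
    except (TypeError, ValueError):
        logging.warning("Body is not valid JSON: %r", content)
        return None

    # Either level may be null or absent when the service sends no data.
    query = payload.get("query") if isinstance(payload, dict) else None
    results = query.get("results") if isinstance(query, dict) else None
    if results is None:
        logging.warning("No results in response: %r", content)
    return results
```

In the question's code this would replace the direct indexing, for example `a = extract_results(result.status_code, result.content)`, followed by a check for `a is None` before using it.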

Nick Johnson
  • Hm, it returns some JSON but with no data. I checked YQL's rate limits and it's 1,000 per hour; I'm not coming close to that. – Don Jan 28 '11 at 02:01
  • @Don Are you using any sort of API key? Yahoo may limit by IP, and App Engine apps share a pool of IPs for outgoing requests. – Nick Johnson Jan 28 '11 at 02:29
  • No, I'm using the public API. I'll sign up for a key and do some more testing. – Don Jan 28 '11 at 02:34
  • Got the key and updated my code, but I'm still getting the same problem. Sometimes it works as it should, other times it doesn't. – Don Jan 28 '11 at 04:48
  • What response body and status code does it return when it's not working? This is probably something you want to bring up with Yahoo - on the forums or via their support channels. – Nick Johnson Jan 28 '11 at 05:07