I'm using YouTube data API v3.

Is it possible to make a big BatchHttpRequest (e.g., see here) and also to use ETags for local caching at the httplib2 level (e.g., see here)?

ETags work fine for single queries, but I don't understand whether they are also useful for batch requests.

Cœur
floatingpurr

1 Answer

TL;DR:

  • BatchHttpRequest cannot be used with caching

Details:

First, let's see how to initialize a BatchHttpRequest:

import httplib2
from googleapiclient.discovery import build

def list_animals(request_id, response, exception):
  if exception is not None:
    # Do something with the exception
    pass
  else:
    # Do something with the response
    pass

def list_farmers(request_id, response, exception):
  """Do something with the farmers list response."""
  pass

http = httplib2.Http()
service = build('farm', 'v2')

# Queue both list calls and send them as a single batch
batch = service.new_batch_http_request()

batch.add(service.animals().list(), callback=list_animals)
batch.add(service.farmers().list(), callback=list_farmers)

batch.execute(http=http)

Second, let's see how ETags are used:

import httplib2
from google.appengine.api import memcache

# httplib2 stores responses (and their ETags) in the given cache
http = httplib2.Http(cache=memcache)

Now let's analyze:

Look at the last line of the BatchHttpRequest example: batch.execute(http=http). In the source code, execute calls _refresh_and_apply_credentials, which uses the http object we pass in:

def _refresh_and_apply_credentials(self, request, http):
    """Refresh the credentials and apply to the request.
    Args:
      request: HttpRequest, the request.
      http: httplib2.Http, the global http object for the batch.
    """
    # For the credentials to refresh, but only once per refresh_token
    # If there is no http per the request then refresh the http passed in
    # via execute()

This means the execute call, which takes an http argument, can be passed the ETag-caching http object you created:

http = httplib2.Http(cache=memcache)
# This would mean we would get the ETags cached http
batch.execute(http=http)

Update 1:

You could try a custom cache object as well:

import httplib2
from googleapiclient.discovery_cache import DISCOVERY_DOC_MAX_AGE
from googleapiclient.discovery_cache.file_cache import Cache as FileCache

custCache = FileCache(max_age=DISCOVERY_DOC_MAX_AGE)
http = httplib2.Http(cache=custCache)
# This would mean we would get the ETags cached http
batch.execute(http=http)

This is just a hunch, based on this comment in the httplib2 source:

"""If 'cache' is a string then it is used as a directory name for
        a disk cache. Otherwise it must be an object that supports the
        same interface as FileCache.
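
As a rough illustration of what "the same interface as FileCache" means: httplib2's FileCache exposes get(key), set(key, value) and delete(key). The class below is a hypothetical in-memory stand-in (the name DictCache is made up for this sketch, it is not part of httplib2), and per the discussion here it would only be expected to help for single GET requests.

```python
class DictCache:
    """Hypothetical in-memory cache with the get/set/delete methods
    that httplib2.FileCache provides."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        # Return None on a miss, as a file-based cache would
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value

    def delete(self, key):
        # Deleting a missing key is silently ignored
        self._store.pop(key, None)


cache = DictCache()
cache.set("https://example.invalid/resource", b"cached response")
print(cache.get("https://example.invalid/resource"))

# Assumed usage (requires httplib2, not executed here):
# http = httplib2.Http(cache=DictCache())
```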

Conclusion Update 2:

After checking the google-api-python-client source code again, I see that BatchHttpRequest always issues a 'POST' request with a content-type of multipart/mixed;.. - source code.

This hints that BatchHttpRequest exists to POST data that is then processed later on.
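
To make the POST + multipart/mixed idea more concrete, here is a minimal sketch using only the standard library's email package. It is not the client library's actual code; the helper name build_batch_body and the request paths are assumptions chosen for illustration. It shows how several GET requests can be wrapped as application/http parts inside one envelope, which is then sent with a single POST.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.nonmultipart import MIMENonMultipart


def build_batch_body(paths):
    """Wrap several GET requests into one multipart/mixed message."""
    message = MIMEMultipart("mixed")
    for index, path in enumerate(paths, start=1):
        part = MIMENonMultipart("application", "http")
        part["Content-ID"] = f"<item{index}>"
        # The inner request line still says GET; only the outer
        # envelope is sent as a POST.
        part.set_payload(f"GET {path} HTTP/1.1\r\n\r\n")
        message.attach(part)
    return message


# Hypothetical paths, echoing the 'farm' example API
batch_body = build_batch_body(["/farm/v2/animals", "/farm/v2/farmers"])
outer_method = "POST"  # this is the method the HTTP layer sees
print(outer_method, batch_body.get_content_type())
```

The HTTP layer only ever sees the outer POST, which is why the method of the inner requests does not matter for its caching decisions.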

With that in mind, look at httplib2's request method: it calls _updateCache only when one of the following criteria is met:

  1. The request method is in ["GET", "HEAD"], or response.status == 303, or it is a redirect request
  2. Else, response.status in [200, 203] and method in ["GET", "HEAD"]
  3. Or, response.status == 304 and method == "GET"

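A batch is always sent as a POST, so a normal (non-redirect) batch response never meets these criteria. This means BatchHttpRequest cannot be used with caching.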
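
The criteria can be paraphrased as a small predicate. This is a simplified model of the rule as summarized in this answer, not httplib2's actual code: the function name may_update_cache is made up, and redirect handling is deliberately left out.

```python
CACHEABLE_METHODS = ("GET", "HEAD")


def may_update_cache(method, status):
    """Simplified paraphrase of when a response gets written to the cache
    (redirect handling omitted)."""
    if status in (200, 203) and method in CACHEABLE_METHODS:
        return True
    if status == 304 and method == "GET":
        return True
    return False


print(may_update_cache("GET", 200))   # single list() call -> True
print(may_update_cache("POST", 200))  # batch request -> False
```

Under this model a single list() call (a GET) is cacheable, while the POST that carries a batch is not, which matches the empty .cache folder reported in the comments.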

Nagaraj Tantri
  • Thanks for the answer, but I had already tried to mix BatchHttpRequest and ETag in the way you suggested, and the memcache folder, with local copies, remains empty. This is very strange. I am not sure that ETag is compatible with BatchHttpRequest. I mean: probably ETag cannot cache batches... – floatingpurr Jul 04 '16 at 11:05
  • @superciccio14 then does `http = httplib2.Http(cache=".cache")` (according to the source: "In the simplest case, you can just pass in a directory name, and a cache will be built from that directory") not help? – Nagaraj Tantri Jul 04 '16 at 11:14
  • Just tried, but unfortunately didn't help. The `.cache` folder is properly created but it remains always empty. I think that ETag cannot work with batches. : ( – floatingpurr Jul 04 '16 at 11:18
  • So what happens if you run just a single query? Is the .cache folder created? – Nagaraj Tantri Jul 04 '16 at 11:20
  • Yes, the folder is created but this time it is also filled with the cached files. – floatingpurr Jul 04 '16 at 11:23
  • @superciccio14 I just updated the answer with something that I feel "can" be of help, because `google-api-python-client` has its own caching available, if we can just pass an object directly – Nagaraj Tantri Jul 04 '16 at 11:44
  • My problem is understanding if batches can be cached with Etag, regardless of whether we decide to store cache in an "object" or in a directory. It seems like no (ie my test with `http = httplib2.Http(cache=".cache")`). Probably Etags are designed to track the status of a resource but not of a collection of resources. – floatingpurr Jul 04 '16 at 12:08
  • @superciccio14 added the latest findings. – Nagaraj Tantri Jul 05 '16 at 08:26
  • Thanks, I read. I do not understand why `BatchHttpRequest` should be useful only with `POST`. In fact, I used it with `GET` and it worked. It batched a collection of many `list` requests in only one `HTTP/GET` request (without caching, of course). – floatingpurr Jul 05 '16 at 09:54
  • @superciccio14 because `BatchHttpRequest` will take your batch of `GET` http requests and keep track of all the requests via `POST` + `multipart/mixed`. They use `multipart/mixed` to create a batch of executions (like batch-executing email attachment downloads). So, in that sense, `BatchHttpRequest` is on the right track. – Nagaraj Tantri Jul 05 '16 at 11:59
  • Got it. Basically `list` is translated into `POST` + `multipart/mixed` but `httplib2`'s `_updateCache` does not work with `POST`. The answer is definitely: _no caching_. : ( right? – floatingpurr Jul 05 '16 at 12:09
  • @superciccio14 yes, definitely no caching for `POST` from httplib2, which seems correct. – Nagaraj Tantri Jul 05 '16 at 12:15