
When I request data from the Google Analytics Reporting API with a request that contains

1) the dimension ga:clientId

2) yesterday's date in the date range

the response consistently contains exactly 10,001 data rows, regardless of the rest of the date range. Below is an example request that should reproduce the result. I've tried this across four different Google accounts and multiple views within those accounts, and I get the same behaviour every time. Is this expected behaviour?

A few further details on the issue:

1) I am posting this on 2020-03-24, so the endDate in the request below is yesterday (2020-03-23).

2) The problem only arises when the query matches more than 10,000 rows. If fewer than 10,000 rows exist for the view in question, the API returns all the data as expected.

3) The problem only occurs if the request is run before approximately 3pm on the day after the endDate (2020-03-24 in my case). If it is run after approximately 3pm, yesterday appears to be sufficiently historical for the API to behave as expected.

request = {
    'viewId': '123456789',
    'dateRanges': [{'startDate': '2020-03-20', 'endDate': '2020-03-23'}], 
    'metrics': [
        {'expression': 'ga:users'}, 
        {'expression': 'ga:visitors'}, 
        {'expression': 'ga:newVisits'}, 
        {'expression': 'ga:sessions'}], 
    'dimensions': [
        {'name': 'ga:campaign'}, 
        {'name': 'ga:channelGrouping'}, 
        {'name': 'ga:sourceMedium'}, 
        {'name': 'ga:dateHourMinute'}, 
        {'name': 'ga:country'}, 
        {'name': 'ga:adGroup'}, 
        {'name': 'ga:clientId'}], 
    'segments': [], 
    'samplingLevel': 'LARGE', 
    'pageSize': 100000
}

This request is called like this:

self.batch_get_api(body={"reportRequests": [request]}).execute()

where batch_get_api is the callable:

self.reporting.reports().batchGet

self.reporting is defined as:

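self.reporting = self._build_resource("analyticsreporting", "v4", credentials)

with _build_resource and its imports being:

import google_auth_httplib2
from google.auth.credentials import Credentials
from googleapiclient.discovery import build
from httplib2 import Http

def _build_resource(service_name: str, version: str, credentials: Credentials):
    # REQUEST_TIMEOUT is a datetime.timedelta defined elsewhere.
    return build(
        service_name,
        version,
        http=google_auth_httplib2.AuthorizedHttp(
            credentials,
            http=Http(timeout=REQUEST_TIMEOUT.total_seconds()),
        ),
        cache_discovery=False,
    )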
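
For anyone trying to reproduce this, a minimal sketch of how the symptom can be checked on the response is shown here. It assumes the standard v4 batchGet response layout, where each report carries `data.rows`, `data.rowCount`, `data.isDataGolden` and an optional `nextPageToken`. The helper name `summarize_reports` and the `example_response` fragment are hypothetical and only for illustration; they are not part of my actual code or real data.

```python
from typing import Any, Dict, List


def summarize_reports(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Summarize each report in a Reporting API v4 batchGet response."""
    summaries = []
    for report in response.get("reports", []):
        data = report.get("data", {})
        summaries.append({
            # Number of rows actually present in this page of the response.
            "rows_returned": len(data.get("rows", [])),
            # Total number of rows the API says match the query.
            "row_count": data.get("rowCount"),
            # Whether the API considers the data final for this request.
            "is_data_golden": data.get("isDataGolden"),
            # Present when the API says there are more pages to fetch.
            "next_page_token": report.get("nextPageToken"),
        })
    return summaries


# Hypothetical response fragment, for illustration only (not real data).
example_response = {
    "reports": [{
        "data": {
            "rows": [{"dimensions": ["x"], "metrics": [{"values": ["1"]}]}],
            "rowCount": 3,
            "isDataGolden": False,
        },
        "nextPageToken": "1",
    }]
}

print(summarize_reports(example_response))
```

Comparing `rows_returned` with `row_count`, and looking at `is_data_golden` and `next_page_token`, should show whether the API itself reports more rows than it returns, and whether it considers the data for the requested range final.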
P Corr
  • Are you sure, you are receiving 10k rows only? Or is it 1k rows? Because by default API returns 1000 rows. – dikesh Mar 29 '20 at 05:58
  • I don't know why, but yes: if a query includes the clientId dimension and the date range includes today or yesterday, only 10,001 rows are returned. – Rob Flaherty Jul 07 '21 at 18:05

0 Answers