1

We've recently implemented Firebase in one of our games and have configured multiple custom events & parameters per event in the same. The structure of the events is the same as what has been defined/shared in Firebase BigQuery DB (Link)

Currently, I am using the below query to take out the data:

  SELECT
  user_dim.app_info.app_instance_id AS id,
  (
  SELECT
    param.value.int_value
  FROM
    UNNEST(event_dim.params) AS param
  WHERE
    param.key = 'x') AS x,
  (
  SELECT
    param.value.int_value
  FROM
    UNNEST(event_dim.params) AS param
  WHERE
    param.key = 'y') AS y,
  (
  SELECT
    param.value.string_value
  FROM
    UNNEST(event_dim.params) AS param
  WHERE
    param.key = 'z') AS z
FROM
  `firebase_dataset`
CROSS JOIN
  UNNEST(event_dim) AS event_dim
WHERE
  event_dim.name = 'event_name';

This gives a list of all the data for the parameters in the said event. For example,

id   x   y      z
1   100 1000    abc
1   200 1500    def
1   300 2000    ghi
1   400 2500    klm
2   100 1000    abc
2   200 1500    def
3   100 1000    abc
4   100 1000    abc
4   200 1500    def
4   300 2000    ghi

We then take the CSV and then take out the counts for x, y and z separately or unique counts of "id" where x, y or z occurred.

For example,

X   Count of id
100     4
200     3
300     2
400     1

I know that it's possible to get the unique count / over count values directly through queries but I haven't been able to figure out how for multiple parameters in bigquery.

I am assuming a better query would help decrease the time taken to run but won't change the data being queried as much as the dataset to get the end result won't change much.

Any help on how to make the result come through directly on BigQuery would be highly appreciated.

Thanks!

Regards, Tejas

Frank van Puffelen
  • 565,676
  • 79
  • 828
  • 807
Tejas Shah
  • 13
  • 4

1 Answers1

1

Below is for BigQuery Standard SQL

#standardSQL
SELECT 
  param.key key,
  IFNULL(CAST(param.value.int_value AS STRING), param.value.string_value) value,
  COUNT(IFNULL(CAST(param.value.int_value AS STRING), param.value.string_value)) cnt 
FROM `firebase_dataset`, 
UNNEST(event_dim) AS event_dim, UNNEST(event_dim.params) AS param
WHERE event_dim.name = 'event_name';
GROUP BY key, value
HAVING key IN ('x', 'y', 'z')   

You can test / play with it using public dataset that you referenced in your question as below

#standardSQL
SELECT 
  param.key key,
  IFNULL(CAST(param.value.int_value AS STRING), param.value.string_value) value,
  COUNT(IFNULL(CAST(param.value.int_value AS STRING), param.value.string_value)) cnt 
FROM `firebase-analytics-sample-data.ios_dataset.app_events_20160607`, 
UNNEST(event_dim) AS event_dim, UNNEST(event_dim.params) AS param
WHERE event_dim.name = 'select_content' 
GROUP BY key, value
HAVING key IN ('content_type', 'item_id', 'firebase_event_origin')
Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
  • Hi Mikhail, thank you for your response. I haven't been able to try this out properly yet. I tried it once on our own dataset but I wasn't able to get the desired result. I'll try a couple of iterations tomorrow and give an update on the solution you provided. Thank you once again. – Tejas Shah Oct 23 '17 at 16:54
  • Hi Mikhail, thank you for the response. I tried the query and it worked. However, there are a couple of different iterations that I might want to try. I'll post a different question regarding that later. Have chosen your answer as the best one in this case. – Tejas Shah Oct 26 '17 at 14:23
  • Have voted it. But since my reputation currently is low it doesn't show it publicly. :) (Long time viewer first time poster problems :)) – Tejas Shah Oct 26 '17 at 14:28
  • I see. no problem. keep posting :o) – Mikhail Berlyant Oct 26 '17 at 14:29
  • Hi Mikhail, would you be able to help answer the below question? It's a similar question however, it involves parameters as well vs just event names. https://stackoverflow.com/q/48544392/3898958 – Tejas Shah Jan 31 '18 at 19:16