1

I am working on the app which is connecting the Firebase database, My App is basically a marketplace app. I have logging all the user search history to the Firebase Analytics as a Custom events, Which I have linked to the Big query for the analytics or suggestion purpose.

Here is my Custom Events log example:

FirebaseAnalytics mFireBaseAnalytics = FirebaseAnalytics.getInstance(context);
        Bundle params = new Bundle();
        params.putString("SEARCHER_ID", "xxxx_id");
        params.putString("KEYWORDS", "Shoe");
        params.putDouble("LATITUDE", 12.00);
        params.putDouble("LONGITUDE", 80.00);
        mFireBaseAnalytics.logEvent("SEARCH_PRODUCT", params);

This data will be saved in the Biq Query in app_events table.

From this events I have to query the List of keywords search popular on my local area for example within 5 Km range. I am not really sure how to achieve this. Because see the result below how it's actually stored in the Big query.

"event_dim": [      
      {
        "date": "20170718",
        "name": "SEARCH_PRODUCT",
        "params": [

          {
            "key": "LATITUDE",
            "value": {
              "string_value": null,
              "int_value": null,
              "float_value": null,
              "double_value": "12.9632747"
            }
          },

          {
            "key": "SEARCHER_ID",
            "value": {
              "string_value": "-KkphUBI3_v3kdGq6qTa",
              "int_value": null,
              "float_value": null,
              "double_value": null
            }
          },

          {
            "key": "LONGITUDE",
            "value": {
              "string_value": null,
              "int_value": null,
              "float_value": null,
              "double_value": "80.2131347"
            }
          },

          {
            "key": "KEYWORDS",
            "value": {
              "string_value": "shoe cloths",
              "int_value": null,
              "float_value": null,
              "double_value": null
            }
          },

        ],

      }
      ]

For this format I made a query like below using standard SQL Query

SELECT
SQRT(POW(69.1*( (
  SELECT
    params.value.double_value
  FROM
    x.params
  WHERE
    params.key = 'LATITUDE') - 12.00), 2) + POW(69.1*(80.098 - (SELECT
    params.value.double_value
  FROM
    x.params
  WHERE
    params.key = 'LONGITUDE') ) * COS((SELECT
    params.value.double_value
  FROM
    x.params
  WHERE
    params.key = 'LATITUDE') / 57.3), 2)) AS distance,
  (
  SELECT
    params.value.string_value
  FROM
    x.params
  WHERE
    params.key = 'KEYWORDS') AS KEYWORDS,
  (
  SELECT
    params.value.string_value
  FROM
    x.params
  WHERE
    params.key = 'SEARCHER_ID') AS SEARCHER_ID,
  (
  SELECT
    params.value.string_value
  FROM
    x.params
  WHERE
    params.key = 'LATITUDE') AS LATITUDE,
  (
  SELECT
    params.value.string_value
  FROM
    x.params
  WHERE
    params.key = 'LONGITUDE') AS LONGITUDE
FROM
  `app_name.app_events_*`,
  UNNEST(event_dim) AS x
WHERE
  x.name = 'SEARCH_PRODUCT'
  AND (
  SELECT
    params.value.string_value As SearchId
  FROM
    x.params
  WHERE
    params.key = 'SEARCHER_ID' ) = '-xx_id'

With this query I am facing 2 issues. 1) I am getting distance as NULL 2) I am not sure how to filter this result within 5 KM Range.

Help me resolve this.

PS: I have another temporary Biqquery table which I am directly inserting data to that and from that I can get the expected result using below query.

SELECT
  SQRT(POW(111.1*( MAX(Lattitude) - 12.9632691), 2) + POW(111.1*(80.2140816 -  MAX(Longitude) ) * COS( MAX(Lattitude) / 57.3), 2)) AS distance,
  MAX(SearchKeyword) as SearchKeyword,
  SearcherArea,
  COUNT( SearcherArea) AS totalCount
FROM
  `HNF.LastFewHoursData`
WHERE
 SearchKeyword LIKE '%%'
GROUP BY
  SearcherArea

HAVING
  distance <= 5
ORDER BY
  totalCount DESC
LIMIT
  5

So I have only problem with query the Events table which are updated by the Firebase Analytics.

KENdi
  • 7,576
  • 2
  • 16
  • 31
Lakshmanan
  • 1,671
  • 2
  • 26
  • 42

1 Answers1

2

This query might work for you:

CREATE TEMP FUNCTION distance(lat1 FLOAT64, lat2 FLOAT64, lon1 FLOAT64, lon2 FLOAT64) AS((
WITH data AS(
SELECT POW(SIN((ACOS(-1) / 180 * (lat1 -lat2)) / 2), 2) + COS(ACOS(-1) / 180 * (lat1)) * COS(ACOS(-1) / 180 * (lat2)) * POW(SIN((ACOS(-1) / 180 * (lon1 -lon2)) / 2), 2) a
)
SELECT 6371 * 2 * ATAN2(SQRT((SELECT a FROM data)), SQRT(1 - (SELECT a FROM data)))
));

WITH data AS(
  SELECT ARRAY<STRUCT<name STRING, params ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64, float_value FLOAT64, double_value FLOAT64> > > > > [STRUCT("SEARCH_PRODUCT" AS name, [STRUCT("LATITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 12.9632747 AS double_value) AS value), STRUCT("LONGITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 80.2131347 AS double_value) AS value), STRUCT("SEARCHER_ID" AS key, STRUCT("-KkphUBI3_v3kdGq6qTa" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value), STRUCT("KEYWORDS" AS key, STRUCT("shoes cloths" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value)] AS params),
                                                                                                                                                                          STRUCT("SEARCH_RANDOM" AS name, [STRUCT("LATITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 12.9632747 AS double_value) AS value), STRUCT("LONGITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 80.2131347 AS double_value) AS value), STRUCT("SEARCHER_ID" AS key, STRUCT("-KkphUBI3_v3kdGq6qTa" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value), STRUCT("KEYWORDS" AS key, STRUCT("shoes cloths" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value)] AS params),
                                                                                                                                                                          STRUCT("SEARCH_PRODUCT" AS name, [STRUCT("LATITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 12.9632747 AS double_value) AS value), STRUCT("LONGITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 80.2131347 AS double_value) AS value), STRUCT("SEARCHER_ID" AS key, STRUCT("-xx_id" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value), STRUCT("KEYWORDS" AS key, STRUCT("shoes cloths" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value)] AS params)] event_dim UNION ALL

  SELECT ARRAY<STRUCT<name STRING, params ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64, float_value FLOAT64, double_value FLOAT64> > > > > [STRUCT("SEARCH_PRODUCT" AS name, [STRUCT("LATITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 12.9632747 AS double_value) AS value), STRUCT("LONGITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 80.2131347 AS double_value) AS value), STRUCT("SEARCHER_ID" AS key, STRUCT("-KkphUBI3_v3kdGq6qTa" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value), STRUCT("KEYWORDS" AS key, STRUCT("shoes cloths" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value)] AS params),
                                                                                                                                                                          STRUCT("SEARCH_RANDOM" AS name, [STRUCT("LATITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 12.9632747 AS double_value) AS value), STRUCT("LONGITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 80.2131347 AS double_value) AS value), STRUCT("SEARCHER_ID" AS key, STRUCT("-KkphUBI3_v3kdGq6qTa" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value), STRUCT("KEYWORDS" AS key, STRUCT("shoes cloths" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value)] AS params),
                                                                                                                                                                          STRUCT("SEARCH_PRODUCT" AS name, [STRUCT("LATITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 12.9632747 AS double_value) AS value), STRUCT("LONGITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 80.2131347 AS double_value) AS value), STRUCT("SEARCHER_ID" AS key, STRUCT("-xxx_id" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value), STRUCT("KEYWORDS" AS key, STRUCT("shoes cloths" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value)] AS params)] event_dim UNION ALL

  SELECT ARRAY<STRUCT<name STRING, params ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64, float_value FLOAT64, double_value FLOAT64> > > > > [STRUCT("SEARCH_PRODUCT" AS name, [STRUCT("LATITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 12.01 AS double_value) AS value), STRUCT("LONGITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 12.01 AS double_value) AS value), STRUCT("SEARCHER_ID" AS key, STRUCT("-xx_id" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value), STRUCT("KEYWORDS" AS key, STRUCT("pants" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value)] AS params),
                                                                                                                                                                          STRUCT("SEARCH_PRODUCT" AS name, [STRUCT("LATITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 12.01 AS double_value) AS value), STRUCT("LONGITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 12.01 AS double_value) AS value), STRUCT("SEARCHER_ID" AS key, STRUCT("-xx_id" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value), STRUCT("KEYWORDS" AS key, STRUCT("shoes cloths" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value)] AS params),
                                                                                                                                                                          STRUCT("SEARCH_PRODUCT" AS name, [STRUCT("LATITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 12.01 AS double_value) AS value), STRUCT("LONGITUDE" AS key, STRUCT("" as string_value, 0 AS int_value, 0.0 AS float_value, 12.01 AS double_value) AS value), STRUCT("SEARCHER_ID" AS key, STRUCT("-xx_id" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value), STRUCT("KEYWORDS" AS key, STRUCT("shoes cloths" as string_value, 0 AS int_value, 0.0 AS float_value, 0.0 AS double_value) AS value)] AS params)] event_dim



)

SELECT
  keyword,
  count(1) freq
FROM(
  SELECT
    ARRAY(SELECT AS STRUCT (SELECT value.string_value FROM UNNEST(params) WHERE key = 'KEYWORDS' AND EXISTS(SELECT 1 FROM UNNEST(params) WHERE key = 'SEARCHER_ID' and value.string_value = '-xx_id')) AS keyword, (SELECT value.double_value FROM UNNEST(params) WHERE key = 'LATITUDE' AND EXISTS(SELECT 1 FROM UNNEST(params) WHERE key = 'SEARCHER_ID' and value.string_value = '-xx_id')) latitude, (SELECT value.double_value FROM UNNEST(params) WHERE key = 'LONGITUDE' AND EXISTS(SELECT 1 FROM UNNEST(params) WHERE key = 'SEARCHER_ID' and value.string_value = '-xx_id')) longitude FROM UNNEST(event_dim) WHERE name = 'SEARCH_PRODUCT') data
  FROM data
  WHERE EXISTS(select 1 FROM UNNEST(event_dim) WHERE (SELECT COUNTIF(value.string_value = '-xx_id') FROM UNNEST(params) WHERE key = 'SEARCHER_ID' ) > 0)
),
UNNEST(data)
WHERE keyword IS NOT NULL AND latitude IS NOT NULL AND longitude IS NOT NULL
AND distance(latitude, X, longitude, Y) < 5 -- change X and Y to your desired location
GROUP BY keyword
ORDER BY freq DESC

What I did:

First we have the definition of the function distance that computes the distance in kilometers between two spatial points (given in degrees, for instance, (12° Lat, 13° Long)).

data is just a simulation of the same data you have in your Firebase base. When running your query, just replace data with your actual table.

Finally, the query itself first filters out events whose names are not SEARCH_PRODUCT while building the ARRAY with the keywords, their latitude and longitude (also there's a filter to remove SEARCHER_IDs that are not equal to -xx_id).

After that, there's just a select and a WHERE clause to select only keywords whose latitude and longitude is less then your desired place you are querying from (I called it as X and Y in the query, you can change it for your current location, such as 12.09°, 39.01° for instance).

Let me know if this works for you.

Willian Fuks
  • 11,259
  • 10
  • 50
  • 74
  • Unfortunately this query not run on the Firebase , I am getting error X is undefined, I am trying to correct it but since i could not understand the query its difficult for me to edit, Also the Query looks very big, Thanks for your input, If you could give easy and simple solution would be great. So basically my main issue is the value i have assigned in AS ( SELECT params.value.string_value FROM x.params WHERE params.key = 'KEYWORDS') AS KEYWORDS Kewords not directly used in the Where condition or having condition, do you know the reason for that ? – Lakshmanan Jul 24 '17 at 11:26
  • You should change the value of `X` and `Y` to the latitude and longitude where you want to analyze. If you are located for instance in latitude 10° and longitude 20° then replace X=10 and Y=20 and the query will return keywords frequency on a 5 km radius from X,Y (10,20). – Willian Fuks Jul 24 '17 at 14:35