I'm trying to pull data from the Facebook Graph API with the HTTP plugin in Google Data Fusion, convert the response into a readable form, and then load the results into Google BigQuery.

I've used this method before, but in this case I ran into an issue with a nested object in the JSON response, shown below:

{
    "data": [
        {
            "id": "23850681132290191"
        },
        {
            "id": "23850605381460191"
        },
        {
            "insights": {
                "data": [
                    {
                        "clicks": "13",
                        "date_start": "2022-07-18",
                        "date_stop": "2022-08-16"
                    }
                ]
            }
        }
    ]
}

The mapping for the 'id' field looks like this:

enter image description here

And the result is the following:

enter image description here

When I try to map another field, "clicks", the documentation suggests the path should look like this:

insights/data/clicks

assuming that the result path is "data".

enter image description here

Unfortunately, I'm getting an error:

Cannot convert line '{"clicks":[{"clicks":"66","date_start":"2022-07-19","date_stop":"2022-08-17"}],"id":"23850681132290191"}' to a record. Reason: 'java.io.IOException: No matching schema found for union type: ["string","null"] for token: BEGIN_ARRAY'.
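
For what it's worth, parsing the record quoted in the error with plain Python suggests that the mapped field arrives as an array rather than a string. This is only a quick check outside Data Fusion, not part of the pipeline, and the variable names are just for illustration:

```python
import json

# The record exactly as quoted in the error message above.
line = (
    '{"clicks":[{"clicks":"66","date_start":"2022-07-19",'
    '"date_stop":"2022-08-17"}],"id":"23850681132290191"}'
)

record = json.loads(line)

# The output field "clicks" holds a JSON array of objects, not a string.
# That matches the BEGIN_ARRAY token the ["string","null"] union rejects.
print(type(record["clicks"]).__name__)

# The actual click count sits one level deeper, inside the first element.
print(record["clicks"][0]["clicks"])
```

So the path seems to resolve to the whole "data" list under "insights" instead of the scalar value inside it.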

I don't know how to map this particular field following the documentation they provided. Any idea what I'm doing wrong here?

  • Can you provide the documentation which you are following? – Prajna Rai T Aug 19 '22 at 07:38
  • It's this one: https://cdap.atlassian.net/wiki/spaces/DOCS/pages/694190359/HTTP+Batch+Source I've seen that people struggle with Graph API because it uses weird arrays that are difficult to map. Also, I've tried the following: 1. insights/data/clicks 2. insights.data.clicks 3. $.data[*].insights.data[*].clicks 4. data.[insights].[data].[clicks] 5. $.data.[insights].[data].[clicks] – Marcin Stańczak Aug 19 '22 at 08:30
  • Hey @MarcinStańczak did you find any solution to this? – Taulant Racaj Mar 30 '23 at 08:14
  • @TaulantRacaj hey, my solution was to switch to Python and Airflow for such tasks. However, your solution probably works. – Marcin Stańczak Mar 31 '23 at 10:57

1 Answer

The issue seems to occur when a nested object contains a list: the mapped field resolves to an array, so it cannot be read as a string. You would need to define the output schema accordingly, as below.

enter image description here
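
In text form, one plausible reading of "a proper schema" is to declare the field as an array of records instead of a nullable string. The sketch below is an assumption, not the exact schema from the pipeline: the record names `output` and `insight` are hypothetical, the layout is Avro-style JSON written as a Python dict, and `matches` is a tiny hand-written checker that covers only the shapes used here (it is not a CDAP or Avro API).

```python
import json

# Hypothetical Avro-style schema: "clicks" is an array of records,
# each carrying the three string fields seen in the error message.
schema = {
    "type": "record",
    "name": "output",  # assumed name, for illustration only
    "fields": [
        {"name": "id", "type": ["string", "null"]},
        {"name": "clicks", "type": {
            "type": "array",
            "items": {
                "type": "record",
                "name": "insight",  # assumed name, for illustration only
                "fields": [
                    {"name": "clicks", "type": ["string", "null"]},
                    {"name": "date_start", "type": ["string", "null"]},
                    {"name": "date_stop", "type": ["string", "null"]},
                ],
            },
        }},
    ],
}


def matches(value, typ):
    """Check a parsed JSON value against the few schema shapes used above."""
    if isinstance(typ, list):  # union: any branch may match
        return any(matches(value, t) for t in typ)
    if isinstance(typ, str):  # primitive type names
        if typ == "string":
            return isinstance(value, str)
        if typ == "null":
            return value is None
        return False
    if typ["type"] == "array":
        return isinstance(value, list) and all(
            matches(v, typ["items"]) for v in value
        )
    if typ["type"] == "record":
        return isinstance(value, dict) and all(
            matches(value.get(f["name"]), f["type"]) for f in typ["fields"]
        )
    return False


# The record quoted in the error message from the question.
record = json.loads(
    '{"clicks":[{"clicks":"66","date_start":"2022-07-19",'
    '"date_stop":"2022-08-17"}],"id":"23850681132290191"}'
)

print(matches(record, schema))                         # nested schema
print(matches(record["clicks"], ["string", "null"]))   # original union
```

The first check passes and the second fails, which mirrors the error: the array value has no matching branch in a `["string","null"]` union but fits once the field is typed as an array of records.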

Hope this helps.

Taulant Racaj