
I am trying to process some uneven JSON files using PDI (Pentaho), and after a lot of experimenting with the native tools, I figured out that I need to parse the JSON files before they are processed. This is an example with just two rows:

[{  
  "UID": "34531513", 
  "identities": 
    [{
      "provider": "facebook",
      "providerUID": "123145517",
      "isLoginIdentity": true,
      "oldestDataUpdatedTimestamp": 145227161126
     },
     {
      "provider": "site",
      "providerUID": "321315415153",
      "isLoginIdentity": false,
      "oldestDataUpdated": "2015-07-14T13:37:43.682Z",
      "oldestDataUpdatedTimestamp": 1436881063682
      }]
},
{
 "UID": "1234155",
 "identities":
      [{
       "provider": "facebook",
       "providerUID": "123145517",
       "isLoginIdentity": true,
       "oldestDataUpdatedTimestamp": 145227161126
       }]
}]

The problem here is that the individual entries inside identities don't carry the key field (UID). But I would like to get a separate row for each identity without losing its UID. That way, the new key would be UID+provider (facebook, site or twitter).
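For the example above, the rows I'm after would look something like this (column names are just for illustration):

```
key                 provider   providerUID    isLoginIdentity
34531513+facebook   facebook   123145517      true
34531513+site       site       321315415153   false
1234155+facebook    facebook   123145517      true
```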

What would you recommend?

Thank you in advance,

Martin

1 Answer


To solve this in Pentaho you have to chain two JSON Input steps.

(screenshot: chained JSON Input steps)

In the first JSON Input step, get the UID and keep identities as a raw JSON fragment:

(screenshot: first JSON Input step configuration)
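A minimal sketch of what the first step's field grid might look like (the field names and JSONPath expressions are assumptions based on the data, not taken from the screenshot):

```
Name        Path             Type
UID         $..UID           String
identities  $..identities    String   (unparsed JSON, passed to the next step)
```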

And then decode identities in the second step:

(screenshot: second JSON Input step configuration)
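Outside PDI, the two chained steps can be sketched in plain Python (the data is abridged from the question; this is just to show the shape of the transformation, not the actual step configuration):

```python
import json

# Sample data from the question, abridged to the relevant fields.
raw = """
[{"UID": "34531513",
  "identities": [
    {"provider": "facebook", "providerUID": "123145517"},
    {"provider": "site", "providerUID": "321315415153"}]},
 {"UID": "1234155",
  "identities": [
    {"provider": "facebook", "providerUID": "123145517"}]}]
"""

# Step 1 (first JSON Input): one row per UID, with "identities"
# kept as an unparsed JSON fragment.
step1 = [{"UID": rec["UID"], "identities": json.dumps(rec["identities"])}
         for rec in json.loads(raw)]

# Step 2 (second JSON Input): decode the fragment and emit one row
# per identity, carrying the UID along to build the composite key.
rows = []
for rec in step1:
    for ident in json.loads(rec["identities"]):
        rows.append({"key": rec["UID"] + "+" + ident["provider"], **ident})

for row in rows:
    print(row["key"])
# 34531513+facebook
# 34531513+site
# 1234155+facebook
```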
