
I'm looking to convert JSON containing an array to CSV format. The number of elements inside the array is dynamic for each row. I tried the following flow (the flow file XML is attached to the post):

GetFile --> ConvertRecord --> UpdateAttribute --> PutFile

Are there any other alternatives?

JSON format:

{  "LogData": {
"Location": "APAC",
"product": "w1"  },  "Outcome": [
{
  "limit": "0",
  "pri": "3",
  "result": "pass"
},
{
  "limit": "1",
  "pri": "2",
  "result": "pass"
},
{
  "limit": "5",
  "priority": "1",
  "result": "fail"
}  ],  "attr": {
"vers": "1",
"datetime": "2018-01-10 00:36:00"  }}

Expected output in csv:

    location,product,limit,pri,result,vers,datetime
    APAC,w1,0,3,pass,1,2018-01-10 00:36:00
    APAC,w1,1,2,pass,1,2018-01-10 00:36:00
    APAC,w1,5,1,fail,1,2018-01-10 00:36:00

Output from the attached flow:

    LogData,Outcome,attr
    "MapRecord[{product=w1, Location=APAC}]","[MapRecord[{limit=0, result=pass, pri=3}], MapRecord[{limit=1, result=pass, pri=2}], MapRecord[{limit=5, result=fail}]]","MapRecord[{datetime=2018-01-10 00:36:00, vers=1}]"


ConvertRecord -- I am using a JsonTreeReader and a CSVRecordSetWriter, configured as below:

JsonTreeReader, CSVRecordSetWriter, and AvroSchemaRegistry controller service configs are shown in the screenshots attached to the post.
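In case the screenshots are unavailable, here is a minimal sketch of how these services are typically wired together, assuming the schema is registered in the AvroSchemaRegistry under the name myschema (the property names are standard NiFi ones; the specific values are assumptions about this flow):

    # JsonTreeReader (assumed settings)
    Schema Access Strategy = Use 'Schema Name' Property
    Schema Registry        = AvroSchemaRegistry
    Schema Name            = myschema

    # CSVRecordSetWriter (assumed settings)
    Schema Access Strategy = Use 'Schema Name' Property
    Schema Registry        = AvroSchemaRegistry
    Schema Name            = myschema
    Include Header Line    = true

    # AvroSchemaRegistry: one dynamic property per schema;
    # the property name is the schema name, the value is the schema text
    myschema = <the Avro schema below>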

Avro schema:

    {
      "name": "myschema",
      "type": "record",
      "namespace": "myschema",
      "fields": [
        {
          "name": "LogData",
          "type": {
            "name": "LogData",
            "type": "record",
            "fields": [
              { "name": "Location", "type": "string" },
              { "name": "product", "type": "string" }
            ]
          }
        },
        {
          "name": "Outcome",
          "type": {
            "type": "array",
            "items": {
              "name": "Outcome_record",
              "type": "record",
              "fields": [
                { "name": "limit", "type": "string" },
                { "name": "pri", "type": ["string", "null"] },
                { "name": "result", "type": "string" }
              ]
            }
          }
        },
        {
          "name": "attr",
          "type": {
            "name": "attr",
            "type": "record",
            "fields": [
              { "name": "vers", "type": "string" },
              { "name": "datetime", "type": "string" }
            ]
          }
        }
      ]
    }

2 Answers


It seems you need to perform a Jolt transform before converting to CSV; otherwise it is not going to work.

Óscar Andreu

Try this spec in JoltTransformJSON before ConvertRecord:

    [
      {
        "operation": "shift",
        "spec": {
          "Outcome": {
            "*": {
              "@(3,LogData.Location)": "[#2].location",
              "@(3,LogData.product)": "[#2].product",
              "@(3,attr.vers)": "[#2].vers",
              "@(3,attr.datetime)": "[#2].datetime",
              "*": "[#2].&"
            }
          }
        }
      }
    ]
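For reference, running the sample JSON above through this spec should yield an array of flat records along these lines (note that the third record carries priority rather than pri, because the catch-all "*" match copies input field names through unchanged):

    [
      {
        "location": "APAC",
        "product": "w1",
        "vers": "1",
        "datetime": "2018-01-10 00:36:00",
        "limit": "0",
        "pri": "3",
        "result": "pass"
      },
      {
        "location": "APAC",
        "product": "w1",
        "vers": "1",
        "datetime": "2018-01-10 00:36:00",
        "limit": "1",
        "pri": "2",
        "result": "pass"
      },
      {
        "location": "APAC",
        "product": "w1",
        "vers": "1",
        "datetime": "2018-01-10 00:36:00",
        "limit": "5",
        "priority": "1",
        "result": "fail"
      }
    ]

ConvertRecord can then write these flat records out as the seven-column CSV shown in the question, presumably with a flat schema describing the flattened record rather than the nested one.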
mattyb
  • Thanks a ton @mattyb for the help. This works like a charm. One question: if I have 1300+ array elements inside Outcome: [], will the Jolt transform work well, or will there be any overhead while converting 1 incoming JSON record to 1300+ JSON records? – vishwanatha handadi Mar 27 '19 at 13:58
  • Not sure what you mean by overhead, but I can tell you that all 1300+ flow files will be transferred at the same time, once they are all done processing and the session is committed. – mattyb Mar 29 '19 at 19:28