
I'm looking to convert JSON containing an array to CSV format. The number of elements inside the array is dynamic for each row. I tried the following flow (the flow file XML is attached to the post):

GetFile --> ConvertRecord --> UpdateAttribute --> PutFile

Are there any other alternatives?

JSON format:

{  "LogData": {
"Location": "APAC",
"product": "w1"  },  "Outcome": [
{
  "limit": "0",
  "pri": "3",
  "result": "pass"
},
{
  "limit": "1",
  "pri": "2",
  "result": "pass"
},
{
  "limit": "5",
  "priority": "1",
  "result": "fail"
}  ],  "attr": {
"vers": "1",
"datetime": "2018-01-10 00:36:00"  }}

Expected output in csv:

    location,product,limit,pri,result,vers,datetime
    APAC,w1,0,3,pass,1,2018-01-10 00:36:00
    APAC,w1,1,2,pass,1,2018-01-10 00:36:00
    APAC,w1,5,1,fail,1,2018-01-10 00:36:00

Output from the attached flow:

    LogData,Outcome,attr
    "MapRecord[{product=w1, Location=APAC}]","[MapRecord[{limit=0, result=pass, pri=3}], MapRecord[{limit=1, result=pass, pri=2}], MapRecord[{limit=5, result=fail}]]","MapRecord[{datetime=2018-01-10 00:36:00, vers=1}]"


ConvertRecord -- I am using a JsonTreeReader and a CSVRecordSetWriter, configured as below:

JsonTreeReader, CSVRecordSetWriter, and AvroSchemaRegistry controller service configs are shown in the screenshots attached to the post.
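In case the screenshots are unavailable, here is a minimal sketch of how these services are typically wired together, assuming the schema is registered in the AvroSchemaRegistry under the name myschema (the property names are standard NiFi ones; the specific values are assumptions about this flow):

    # JsonTreeReader (assumed settings)
    Schema Access Strategy = Use 'Schema Name' Property
    Schema Registry        = AvroSchemaRegistry
    Schema Name            = myschema

    # CSVRecordSetWriter (assumed settings)
    Schema Access Strategy = Use 'Schema Name' Property
    Schema Registry        = AvroSchemaRegistry
    Schema Name            = myschema
    Include Header Line    = true

    # AvroSchemaRegistry: one dynamic property per schema;
    # the property name is the schema name, the value is the schema text
    myschema = <the Avro schema below>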

Avro schema:

    {
      "name": "myschema",
      "type": "record",
      "namespace": "myschema",
      "fields": [
        {
          "name": "LogData",
          "type": {
            "name": "LogData",
            "type": "record",
            "fields": [
              { "name": "Location", "type": "string" },
              { "name": "product", "type": "string" }
            ]
          }
        },
        {
          "name": "Outcome",
          "type": {
            "type": "array",
            "items": {
              "name": "Outcome_record",
              "type": "record",
              "fields": [
                { "name": "limit", "type": "string" },
                { "name": "pri", "type": ["string", "null"] },
                { "name": "result", "type": "string" }
              ]
            }
          }
        },
        {
          "name": "attr",
          "type": {
            "name": "attr",
            "type": "record",
            "fields": [
              { "name": "vers", "type": "string" },
              { "name": "datetime", "type": "string" }
            ]
          }
        }
      ]
    }

2 Answers


It seems you need to perform a Jolt transform before converting to CSV; otherwise it is not going to work.

Óscar Andreu

Try this spec in JoltTransformJSON before ConvertRecord:

    [
      {
        "operation": "shift",
        "spec": {
          "Outcome": {
            "*": {
              "@(3,LogData.Location)": "[#2].location",
              "@(3,LogData.product)": "[#2].product",
              "@(3,attr.vers)": "[#2].vers",
              "@(3,attr.datetime)": "[#2].datetime",
              "*": "[#2].&"
            }
          }
        }
      }
    ]
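For reference, running the sample JSON above through this spec should yield an array of flat records along these lines (note that the third record carries priority rather than pri, because the catch-all "*" match copies input field names through unchanged):

    [
      {
        "location": "APAC",
        "product": "w1",
        "vers": "1",
        "datetime": "2018-01-10 00:36:00",
        "limit": "0",
        "pri": "3",
        "result": "pass"
      },
      {
        "location": "APAC",
        "product": "w1",
        "vers": "1",
        "datetime": "2018-01-10 00:36:00",
        "limit": "1",
        "pri": "2",
        "result": "pass"
      },
      {
        "location": "APAC",
        "product": "w1",
        "vers": "1",
        "datetime": "2018-01-10 00:36:00",
        "limit": "5",
        "priority": "1",
        "result": "fail"
      }
    ]

ConvertRecord can then write these flat records out as the seven-column CSV shown in the question, presumably with a flat schema describing the flattened record rather than the nested one.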
mattyb
  • Thanks a ton @mattyb for the help. This works like a charm. One question: if I have 1300+ array elements inside Outcome: [], will the Jolt transform work well, or will there be any overhead while converting 1 incoming JSON record to 1300+ JSON records? – vishwanatha handadi Mar 27 '19 at 13:58
  • Not sure what you mean by overhead, but I can tell you that all 1300+ flow files will be transferred at the same time, once they are all done processing and the session is committed. – mattyb Mar 29 '19 at 19:28