I have a case: in my flow, the content is always in JSON format and the data inside the JSON always changes (both keys and values). Is it possible to convert this flow content to CSV?

Please note that the keys in the JSON always change.
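
For example (these records are hypothetical, just to illustrate the changing keys, not my actual data), two incoming flowfiles might look like:

```
flowfile 1: [{"id": "1", "name": "foo"}]
flowfile 2: [{"code": "x", "ts": "2018-09-14 21:05:02.000Z", "flag": "true"}]
```

so the CSV header cannot be hard-coded in the flow.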

Many thanks,

Thuy Le
  • Yes, you can use these links: https://community.hortonworks.com/questions/63995/convert-json-to-csv-using-nifi.html and https://stackoverflow.com/questions/49145832/convert-json-to-csv-in-nifi – user4321 Sep 26 '18 at 20:56
  • Possible duplicate of [Convert JSON to CSV in nifi](https://stackoverflow.com/questions/49145832/convert-json-to-csv-in-nifi) – user4321 Sep 26 '18 at 20:57
  • Hi, in this example they specify the keys in the JSON. In my case, the keys in the JSON always change, so I can't apply it. – Thuy Le Sep 26 '18 at 21:02
  • Can you give more details? For example, will the CSV headers remain the same, or are they also dynamic? And please give a specific example. – user4321 Sep 26 '18 at 21:17

1 Answer


To achieve this use case we need to generate an Avro schema dynamically for each JSON record first, then convert to Avro, and finally convert the Avro to CSV.

Flow:

1. SplitJson // split the array of JSON records into individual records

2. InferAvroSchema // infer the Avro schema based on the JSON record and store it in an attribute

3. ConvertJSONToAvro // convert each JSON record into an Avro data file

4. ConvertRecord // read the Avro data file dynamically and convert it into CSV format

5. MergeContent (or) MergeRecord processor // merge the split flowfiles into one flowfile based on the Defragment merge strategy (a Python sketch of the overall idea follows this list).
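
Outside NiFi, a minimal Python sketch of the same idea (the function name and sample records below are mine, just for illustration) shows what the split / infer-schema / convert / merge chain achieves: each record gets its own dynamically derived header.

```python
import csv
import io
import json

def json_array_to_csv(flowfile_content: str) -> str:
    """Mirror the flow above: split the array, derive the column names
    per record, and emit CSV without hard-coding any keys."""
    records = json.loads(flowfile_content)            # 1. SplitJson
    output = io.StringIO()
    for record in records:
        header = list(record.keys())                  # 2. infer the schema dynamically
        writer = csv.DictWriter(output, fieldnames=header)
        writer.writeheader()                          # 3./4. convert the record to CSV
        writer.writerow(record)
    return output.getvalue()                          # 5. merge the pieces back together

# Hypothetical input with keys that differ between records:
print(json_array_to_csv('[{"a": "1", "b": "2"}, {"a": "1", "b_a2": "2", "c": "3"}]'))
```

The per-record header is exactly why the JSON has to be split first: a single schema cannot describe records whose keys differ.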

Save this XML template, upload it to your NiFi instance, and change it as per your requirements.

notNull
  • Thank you so much for your suggestion, that is very useful :). In my case, the JSON is an array like: [{"a":"IboECKV", "b":"2018-09-14 21:05:02.000Z", "c":"2132a4"}, {"a":"IboECKV", "b_a2":"2018-09-14 21:05:02.000Z_b2", "c":"2132a4_c2"}] with the same key names for each record. Is there any way to convert them to Avro without splitting the JSON? – Thuy Le Sep 27 '18 at 13:44
  • @ThuyLe, it's not possible because you have **different keys (b, b_a2) for the two JSON records**, and the **InferAvroSchema** processor works only on **one record** at a time. So to convert into CSV format you have to split the JSON, then dynamically generate a schema for **each record**, then convert to CSV format dynamically. – notNull Sep 27 '18 at 14:38
  • Hi @Shu, sorry, the right data is [{"a":"IboECKV", "b":"2018-09-14 21:05:02.000Z", "c":"2132a4"}, {"a":"IboECKV_a2", "b":"2018-09-14 21:05:02.000Z", "c":"2132a4_c2"}] – Thuy Le Sep 27 '18 at 17:44
  • Hi @Shu, in some cases we have a JSON array with 100M rows; if we split the JSON, convert it to Avro, then merge again, it will take time. I understand that the InferAvroSchema processor works only on one record at a time; does NiFi have another processor to do that? – Thuy Le Sep 27 '18 at 17:47
  • @ThuyLe, NiFi supports dynamically reading the schema for the **CSV and Avro** formats. But for the **JSON** format it is not yet possible to read the schema dynamically from an **array of JSON records**. That is why we need to split the JSON and convert to Avro. – notNull Sep 28 '18 at 01:39