I found a similar question on Stack Overflow. That approach worked fine with just a couple of columns, but I realised it is not practical for CSVs with a large number of columns.
I have a CSV with 75 columns, so I decided to follow the approach from that question (same link as above). As instructed there, I added the UpdateRecord processor and configured a CSVReader and a CSVRecordSetWriter. I then entered my Schema Text, which was quite long since it required defining all 75 columns. After that, the CSVRecordSetWriter was flagged as invalid. I realised that once the schema included more than a certain number of column definitions, it became invalid.
Part of my schema looks like this:
    {
      "type": "record",
      "name": "test2.csv",
      "namespace": "my.namespace",
      "fields": [
        {
          "name": "download",
          "type": "string"
        },
        {
          "name": "upload",
          "type": "string"
        },
        ...
        {
          "name": "operatorId",
          "type": "string"
        },
        {
          "name": "errorCode",
          "type": "string"
        }
      ]
    }
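As a side note, a Schema Text this long does not have to be typed by hand. Here is a minimal Python sketch (the file name and the assumption that every column is a string are mine) that builds the Avro schema JSON from the CSV header row. One thing I noticed while writing it: an Avro record name must match `[A-Za-z_][A-Za-z0-9_]*`, so a name containing a dot, like `test2.csv`, is not a legal Avro name; `test2` is used below instead.

```python
import csv
import json

def avro_schema_from_csv_header(csv_path, record_name, namespace):
    """Build an Avro record schema from a CSV header row,
    treating every column as a string."""
    with open(csv_path, newline="") as f:
        header = next(csv.reader(f))
    return {
        "type": "record",
        "name": record_name,
        "namespace": namespace,
        "fields": [{"name": col, "type": "string"} for col in header],
    }

# Hypothetical usage: paste the printed JSON into the
# reader/writer Schema Text property.
# print(json.dumps(
#     avro_schema_from_csv_header("test2.csv", "test2", "my.namespace"),
#     indent=2))
```

This way all 75 fields are generated mechanically instead of being written out one by one.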
My CSV also contains a header row.
Objective:
I need to map the data in the errorCode column to a new column named errorMean. I hope you can suggest a way to achieve this. Feel free to propose a solution that skips writing the Schema Text entirely.
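To make the objective concrete, here is a plain-Python sketch of the transformation I am after (the error-code-to-meaning lookup table is invented purely for illustration):

```python
import csv

# Hypothetical lookup from errorCode to its meaning.
ERROR_MEANINGS = {
    "E100": "timeout",
    "E200": "connection refused",
}

def add_error_mean(rows):
    """Yield each row with an extra errorMean column
    derived from its errorCode value."""
    for row in rows:
        row["errorMean"] = ERROR_MEANINGS.get(row["errorCode"], "unknown")
        yield row

# Usage with a CSV that has a header row:
# with open("test2.csv", newline="") as src:
#     transformed = list(add_error_mean(csv.DictReader(src)))
```

Ideally I would get the equivalent of this inside NiFi, without maintaining a 75-field schema by hand.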