I'm looking at Apache Camel as the best fit for an ETL process that starts with a CSV file.
The file will have millions of rows and an ungodly number of columns (~500).
So far I've looked at a couple of options - unmarshalling with the CSV data format, and with camel-bindy - but neither quite does what I need.
The CSV data format parses every row and then passes a list of lists to the next processor, so with millions of rows it blows up with an out-of-memory/heap-space error.
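For context, this is roughly the route I tried with the CSV data format (simplified; the endpoint names and file name are just placeholders):

```java
import org.apache.camel.builder.RouteBuilder;

public class CsvLoadRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("file:inbox?fileName=big-export.csv")   // placeholder endpoint
            .unmarshal().csv()                        // parses the whole body into a List<List<String>>
            .to("direct:transform");                  // by this point every row is held in memory
    }
}
```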
The bindy approach looked great, until I worked out that I need to map every column in the CSV to the POJO, 99% of which I'm not interested in.
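As far as I can tell, the bindy mapping would end up looking something like this (field names are made up), with a @DataField per position even for columns I'll never use:

```java
import org.apache.camel.dataformat.bindy.annotation.CsvRecord;
import org.apache.camel.dataformat.bindy.annotation.DataField;

@CsvRecord(separator = ",")
public class WideRecord {
    @DataField(pos = 1)
    private String customerId;      // one of the few columns I actually want

    @DataField(pos = 2)
    private String ignoredColumn2;  // ...but bindy still seems to want a field for every position

    // ...and so on for the remaining ~498 columns
}
```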
So the question is: do I need to write an explicit line-by-line processor or component that handles the transform per row and passes it along to the next to() in the route (a rough sketch of what I mean is below), or is there another option I've not come across yet?
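For reference, this is the kind of per-row processing I have in mind - just a sketch, with placeholder endpoint names and column indexes, and a naive split(",") that ignores quoted commas:

```java
import org.apache.camel.builder.RouteBuilder;

public class RowByRowRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("file:inbox?fileName=big-export.csv")         // placeholder endpoint
            .split(body().tokenize("\n")).streaming()       // hand one line at a time down the route
            .process(exchange -> {
                String line = exchange.getIn().getBody(String.class);
                String[] fields = line.split(",");          // naive: doesn't handle quoted commas
                // keep only the handful of columns I care about (indexes are placeholders)
                exchange.getIn().setBody(fields[0] + "," + fields[7] + "," + fields[42]);
            })
            .to("direct:load");                             // placeholder next step
    }
}
```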