We need to load several gigabytes of CSV files into Cassandra. We tried ingesting the data with the cqlsh SOURCE command, pointing it at text files of INSERT statements generated from the values in the CSV files.
With this approach, the data is not loaded correctly: the data from the first row is repeated in all subsequent rows. (I have checked the INSERT statements and they appear to contain the right values.)
What could be the issue? Am I seeing duplicate rows because it takes time for Cassandra to flush the data to disk? (nodetool shows no pending flushes, though.)
Would it be more efficient to keep the data as CSV files and ingest it with the COPY command instead? Please advise.
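
To make the first approach concrete, here is a minimal sketch of how a file of INSERT statements can be generated from a CSV file. It is an illustration only, not our actual script: the table name `demo.users`, the sample rows, and the helper names `cql_literal` and `csv_to_inserts` are hypothetical, and it assumes every column is a text column.

```python
import csv
import io


def cql_literal(value):
    """Quote a CSV field as a CQL text literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def csv_to_inserts(csv_file, table):
    """Yield one INSERT statement per data row.

    The first CSV row is treated as the header and supplies the column names.
    Assumption: all columns are text, so every value is quoted.
    """
    reader = csv.reader(csv_file)
    columns = next(reader)
    column_list = ", ".join(columns)
    for row in reader:
        # Build the VALUES list from this row only, so each statement
        # carries its own data rather than a value reused from an earlier row.
        values = ", ".join(cql_literal(field) for field in row)
        yield f"INSERT INTO {table} ({column_list}) VALUES ({values});"


# Hypothetical sample data standing in for one of the real CSV files.
sample = io.StringIO("id,name\n1,alice\n2,o'brien\n")
statements = list(csv_to_inserts(sample, "demo.users"))
for statement in statements:
    print(statement)
```

The printed statements would be written to a text file and run from cqlsh with SOURCE. The alternative asked about above would skip this step and load the CSV directly from cqlsh, with something like `COPY demo.users (id, name) FROM 'users.csv' WITH HEADER = TRUE;` (again using hypothetical names).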