
I'm migrating tables from Hive/HDFS to Cassandra v3.11.3, using Presto to speed up the migration. Is there an easier method I can use? I have little time and a lot of tables to move.

I have tried exporting tables from Hive as .csv files and importing them into Cassandra, but I'm running into other issues: when I run the COPY command, it does not import all the rows. It does not report any error, but not all the rows in the .csv file end up in the table.

For example, I have 1074 rows in the .csv file, but I see only 130 rows after running the COPY FROM command in Cassandra.

Can someone share examples of a better COPY command, one that imports every row from the .csv file?

I have tried the COPY command below. It runs without errors, but the table does not show all the records I need.


COPY table1 ("domainid","value","description","siteid","orgid","testid","valueid","rowstamp","pluspcustomer") FROM '/tmp/csv_files/csv_table1.csv' with HEADER = true AND DELIMITER = ',' ;
Using 7 child processes

Starting copy of test_db.table1 with columns [domainid, value, description, siteid, orgid, testid, valueid, rowstamp, pluspcustomer].
Processed: 1042 rows; Rate: 906 rows/s; Avg. rate: 1542 rows/s
1042 rows imported from 1 files in 0.676 seconds (0 skipped).


SELECT count(*) FROM table1 ;

 count
-------
   130

(1 rows)


Please help...

Erick Ramirez
Hareesha
My bad... the .csv file had duplicate entries in the column that I set as the partition key. I checked this in Excel using VLOOKUP. In this case it was easy to find because the number of rows is small. But if a file has many more entries, like 50 crore or more, then I can't use Excel. I read in some article that Notepad++ can handle files of up to 2 GB, but I have not tried it yet. I still have trouble importing these big .csv files into Cassandra; if someone can help, please do. – Hareesha Dec 14 '18 at 10:45
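For files too large for Excel, the same duplicate check can be done with a short script. Below is a minimal sketch in Python, not a definitive tool. It assumes the .csv has a header row (as with HEADER = true above). The function name `summarize_keys` is hypothetical, and `domainid` is used only as a placeholder, since the actual key columns of table1 are not shown. The check relies on Cassandra keeping one row per primary key: rows with the same key overwrite each other. If the columns passed in are the table's full primary key and the table was empty before the import, the distinct key count should roughly match what SELECT count(*) reports after COPY.

```python
import csv
from collections import Counter


def summarize_keys(csv_path, key_columns, delimiter=","):
    """Stream a CSV file and count how often each key value occurs.

    key_columns: header names that make up the table's primary key
    (an assumption supplied by the caller; the script cannot know the schema).
    Returns (total_rows, distinct_keys, duplicates), where duplicates maps
    each repeated key tuple to its number of occurrences.
    """
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        # DictReader takes the column names from the header row
        reader = csv.DictReader(handle, delimiter=delimiter)
        for row in reader:
            key = tuple(row[column] for column in key_columns)
            counts[key] += 1
    total_rows = sum(counts.values())
    duplicates = {key: n for key, n in counts.items() if n > 1}
    return total_rows, len(counts), duplicates


# Example call (path from the COPY command above; the key column is a guess):
# total, distinct, duplicates = summarize_keys(
#     "/tmp/csv_files/csv_table1.csv", ["domainid"])
# print(total, "rows in file,", distinct, "distinct keys")
```

The file is read one row at a time, so file size is not the limit. The counter does hold every distinct key in memory, which may become a problem for tables with hundreds of millions of distinct keys.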
