I use Python and the vertica-python library to COPY data into a Vertica database:

import vertica_python

# conn_info is a dict of connection parameters, defined elsewhere
connection = vertica_python.connect(**conn_info)
vsql_cur = connection.cursor()

with open("/tmp/vertica-test-insert", "rb") as fs:
    vsql_cur.copy( "COPY table FROM STDIN DELIMITER ',' ", fs, buffer_size=65536)
    connection.commit()

It inserts data, but only 5 rows, although the file contains more. Could this be related to database settings, or is it a client issue?

andylens

2 Answers

This code works for me:

For JSON

# for json file
with open("D:/SampleCSVFile_2kb/tweets.json", "rb") as fs:
    my_file = fs.read().decode('utf-8')
    cur.copy( "COPY STG.unstruc_data FROM STDIN parser fjsonparser()", my_file)
    connection.commit()

For CSV

# for csv file
with open("D:/SampleCSVFile_2kb/SampleCSVFile_2kb.csv", "rb") as fs:
    my_file = fs.read().decode('utf-8','ignore')
    cur.copy( "COPY STG.unstruc_data FROM STDIN PARSER FDELIMITEDPARSER (delimiter=',', header='false') ", my_file) # buffer_size=65536
    connection.commit()
Waqas Ali

It is very likely that some rows are being rejected. Assuming you are using Vertica 7.x, you can add this clause to the COPY statement to capture them:

[ REJECTED DATA {'path' [ ON nodename ] [, ...] | AS TABLE 'reject_table'} ]

You can also run this query after the COPY executes to see a summary of accepted and rejected rows:

SELECT GET_NUM_ACCEPTED_ROWS(), GET_NUM_REJECTED_ROWS();
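
The two suggestions above could be combined from Python roughly as follows. This is only a sketch, not tested against a live database: `build_copy_sql` and `copy_and_count` are hypothetical helper names, `my_table` and `my_table_rejects` are placeholder table names, and it assumes the counts are queried on the same session right after the COPY.

```python
def build_copy_sql(table, delimiter=",", reject_table=None):
    """Build a COPY ... FROM STDIN statement (hypothetical helper).

    If reject_table is given, rejected rows are saved to that table
    instead of being silently dropped. Table names are interpolated
    as-is, so only pass trusted identifiers.
    """
    sql = f"COPY {table} FROM STDIN DELIMITER '{delimiter}'"
    if reject_table:
        sql += f" REJECTED DATA AS TABLE {reject_table}"
    return sql


def copy_and_count(cursor, sql, data):
    """Run the COPY, then return (accepted, rejected) row counts.

    `cursor` is expected to behave like a vertica-python cursor:
    it needs copy(), execute() and fetchone().
    """
    cursor.copy(sql, data)
    # Must run in the same session as the COPY it reports on.
    cursor.execute("SELECT GET_NUM_ACCEPTED_ROWS(), GET_NUM_REJECTED_ROWS()")
    accepted, rejected = cursor.fetchone()
    return accepted, rejected
```

With a real connection, `cursor` would be `connection.cursor()` and `data` the open file from the question. If the rejected count is non-zero, querying `rejected_data` and `rejected_reason` from the reject table should show which rows failed and why.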

woot