6

If my table is schema_one.table_five and my file name is file_to_import.csv.gz, what args do I give the copy_expert() cmd in order to copy the file contents into the table?

Here's what I'm trying:

this_copy = '''COPY schema_one.table_five FROM STDIN with CSV'''
this_file = "file_to_import.csv.gz"
con = psycopg2.connect(dbname=dbname, host=host, port=port, user=user, password=password)
cur = con.cursor()

cur.copy_expert(this_copy, this_file)

This produces an error:

cur.copy_expert(this_copy, this_file) 
TypeError: file must be a readable file-like object for COPY FROM; a writable file-like object for COPY TO.

So how do I tell the command to first uncompress the file and then specify a delimiter (in this case '|') so that it can be processed.

Secondary question. If my file is in a directory called "files_to_import" i.e. /home/dir1/dir2/files_to_import/file_to_import.csv.gz, is there a way that I can specify just the directory and have the pgm copy in all the files in that dir (to the same table)? They would all be .csv.gz files.


Added 12-30-16 0940 MST -- In response to comment: Trying to get COPY statement right, but all these error ---

this_file = "staging.tbl_testcopy.csv.gz"
this_copy_01 = '''COPY staging.tbl_testcopy_tmp FROM STDIN'''
this_copy_02 = '''COPY staging.tbl_testcopy_tmp FROM %s'''
this_copy_03 = '''COPY staging.tbl_testcopy_tmp FROM (%s)'''
this_copy_04 = '''COPY staging.tbl_testcopy_tmp FROM f'''

with gzip.open(this_file, 'rb') as f:
    try:
        cur.copy_expert(this_copy_01, f)
    except Exception, e:
        print e
    try:
        cur.copy_expert(this_copy_02, f)
    except Exception, e:
        print e
    try:
        cur.copy_expert(this_copy_03, f)
    except Exception, e:
        print e
    try:
        cur.copy_expert(this_copy_04, f)
    except Exception, e:
        print e

All of these error, at the same place. So what should come after 'FROM' ?

syntax error at or near "STDIN"
LINE 1: COPY staging.tbl_testcopy_tmp FROM STDIN
                                           ^

syntax error at or near "%"
LINE 1: COPY staging.tbl_testcopy_tmp FROM %s
                                           ^

syntax error at or near "("
LINE 1: COPY staging.tbl_testcopy_tmp FROM (%s)
                                           ^

syntax error at or near "f"
LINE 1: COPY staging.tbl_testcopy_tmp FROM f
                                           ^
Thom Rogers
  • 1,385
  • 2
  • 20
  • 33

1 Answers1

6

The file argument to copy_expert should be a file like object, not the file name. For a regular csv file you could use:

with open("file_to_import.csv",  'rb') as this_file:
    cur.copy_expert(this_copy, this_file)

For a gzipped file you could use the gzip module to open the file:

import gzip
with gzip.open("file_to_import.csv.gz",  'rb') as this_file:
    cur.copy_expert(this_copy, this_file)

To change the separator, you'll have to change the COPY statement. See the COPY docs for more information. It might be easier to use copy_from (which has a optional sep argument) instead of copy_expert.

with gzip.open("file_to_import.csv.gz",  'rb') as this_file:
    cur.copy_from(this_file, 'staging.tbl_testcopy_tmp', sep='|')

There isn't a command to automatically import all the files in the directory, you'll have to get a listing of the directory's contents and loop through it.

Alasdair
  • 298,606
  • 55
  • 578
  • 516
  • Tried that but got error: `File "./trytest_copy.py", line 24, in cur.copy_expert(this_copy, sys.stdin) psycopg2.ProgrammingError: syntax error at or near "STDIN" LINE 1: COPY schema_one.table_five from STDIN WITH CSV HEADER` – Thom Rogers Dec 30 '16 at 08:08
  • You have used `sys.stdin` - that's not what I have in my answer. You need to pass the file handle. – Alasdair Dec 30 '16 at 10:17
  • Changed it to 'f' which is file handle and still erroring. Please see above edited OP for code and results. Looks like whatever comes after the FROM is causing errors (or at least everything I've tried). Any suggestions for proper construction of the COPY statement? All I'm trying to do is COPY into my table from my file. Also, should I be using copy_from() instead of copy_expert()? I fooled around w copy_from() for a bit, but got nowhere. – Thom Rogers Dec 30 '16 at 16:47
  • BTW, the destination table is an AWS RedShift table – Thom Rogers Dec 30 '16 at 19:30
  • Yes, as I said in my answer it might be possible to use `copy_from` instead of `copy_expert`. I can't spot a problem in the statement `COPY staging.tbl_testcopy_tmp FROM STDIN`. Perhaps `copy_from` and `copy_expert` do not work with redshift. – Alasdair Dec 30 '16 at 21:12