2

I have a .dat file consisting of thousands of lines, where each line is made of 4 sets of numbers, delimited by ::. It ends up looking something like this:

1234::482::4::1342.5321234
4342::532::1::1532.532431

I'm trying to copy this file into a Postgres table using COPY FROM PROGRAM. BTW I'm using PostreSQL 9.5. The reason I'm trying to use the PROGRAM option is because I only need the first 3 group of numbers from each line. I found that I can use the awk command to print what I want to copy in the terminal awk -F '::' '{print $1, $2, $3}' my_file.dat so I figured that I could pass that awk command to COPY FROM PROGRAM and have the first 3 group of numbers imported to my database table, which only has 3 columns.

However, when I try to use this command with COPY FROM, I keep getting errors. I try running the following in psql

COPY my_table FROM PROGRAM 'awk -F ''::'' ''{print $1''%''$2''%''$3}'' my_file.dat' delimiter '%';

but keep getting an error: ERROR: program "awk -F '::' '{print $1'%'$2'%'$3}' my_file.dat" failed DETAIL: child process exited with exit code 2. I've tried all kinds of different variations of this with different delimiters but I can't for the life of me figure out where I'm going wrong. Am I misunderstanding what COPY FROM PROGRAM can do? Or am I missing something?

I can't create a new file on my system, so I can't run a command to filter the file and create a new one with my desired format. I need to be able to write this in one line of SQL code, and thought that COPY FROM PROGRAM was actually a pretty cool way to do this, but I can't get it to work.

paul go
  • 27
  • 6
  • 1
    Replace `''{print $1''%''$2''%''$3}''` with `''{print $1"%"$2"%"$3}''` and give a try. From the error message you get, it seems that this was not the awk command you want, with single quotes inside the awk body. – thanasisp Jun 08 '20 at 04:56
  • @thanasisp Thanks for the reply. I've been messing around with they syntax and all kinds of single and double quote combinations. If I understand your suggestion, I replaced the two single quotes around each side of the % with double quotes. Got the error `ERROR: program "awk -F '::' '{print $1"%"$2"%"$3}' ratings_copy.dat" failed DETAIL: child process exited with exit code 2`. – paul go Jun 08 '20 at 05:09
  • Does this command work alone, outside psql? `awk -F "::" '{print $1"%"$2"%"$3}' ratings_copy.dat` – thanasisp Jun 08 '20 at 05:15
  • @thanasisp yup, works perfectly. It starts printing in the terminal all thousands of lines, each group of numbers separated by a % character. The only different thing I do when adding it to the Postgres command is to add more single quotes where appropriate – paul go Jun 08 '20 at 05:21

2 Answers2

2

If awk is returning a proper result set, you could try to import it using the STDIN (Standard Input) from COPY instead of executing awk from inside of PostgreSQL, e.g.

awk -F "::" '{print $1"%"$2"%"$3}' ratings_copy.dat | psql -d db "COPY my_table FROM STDIN"

This command takes the result of awk and pipes it to psql, which can then be read by COPY FROM. Another option is to use \copy ;-)

Jim Jones
  • 18,404
  • 3
  • 35
  • 44
2

In awk exit code 2 might mean it couldn't open the input file. Which makes sense, as COPY is looking for the file on the server side, in the server process's current working directory, which is probably not where the file is located. Use \copy instead to look for the file in psql's current working directory. The actual error message generated by awk should be found in the database server's log file (when used with COPY; when used with \copy it should show up on psql's stderr).

Why not run the awk command which you had already tested, rather than making up a new untested one?

\COPY my_table FROM PROGRAM 'awk -F ''::'' ''{print $1,$2,$3}'' my_file.dat' delimiter ' '
jjanes
  • 37,812
  • 5
  • 27
  • 34
  • Thanks so much. using \COPY did work. You were right, it was all about awk not finding the file. Also thanks for the explanation – paul go Jun 09 '20 at 02:06