
I am new to using SQL, so please bear with me.

I need to import several hundred csv files into PostgreSQL. My web searches have only turned up how to import many csv files into one table. However, most of the csv files have different column types (all have one-line headers). Is it possible to somehow run a loop, and have each csv imported into a table with the same name as the csv file? Creating each table manually and specifying its columns is not an option. I know that COPY alone will not work, as the table needs to already be specified.

Perhaps this is not feasible in PostgreSQL? I would like to accomplish this in pgAdmin III or the psql console, but I am open to other ideas (using something like R to change the csv files into a format more easily entered into PostgreSQL?).

I am using PostgreSQL on a Windows 7 computer. It was requested that I use PostgreSQL, thus the focus of the question.

The desired result is a database full of tables that I will then join with a spreadsheet that includes specific site data. Thanks!

  • It looks like something where you would want a separate process to iterate through the csv files, declaring tables with the appropriate names, then a process to iterate through and evaluate the types for each column that would be imported and insert them into the tables, then finally import the data. C# might be a good venue for this due to its extensive File.IO support and compatibility with SQL. This won't be a short answer. – CSS Sep 24 '15 at 20:53
  • Extracting the column names from the header names will be possible and could be done in any scripting language. Determining the column types could be difficult, it either needs to rely on a naming convention, or you'll have to add them manually (eg enter them into a name->type dict) – wildplasser Sep 24 '15 at 22:00
  • I'd script this, though some of the ETL tools can probably help with this too. – Craig Ringer Sep 25 '15 at 00:24
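
The scripted approach the comments describe can be sketched in bash alone, sidestepping type inference by declaring every column as text (values can be cast later once the data is in). This is only a sketch under assumptions: the database name mydb and the directory /path/to/csvs are placeholders, it assumes headers contain plain column names without commas-in-quotes or characters needing quoting, and it assumes passwordless local psql access.

```shell
#!/bin/bash
# Sketch: for each csv, build a CREATE TABLE from the header row
# (all columns typed as text, since real types are unknown), then
# bulk-load the file with \copy. "mydb" and the path are placeholders.
shopt -s nullglob   # if no files match, the loop simply does nothing
for f in /path/to/csvs/*.csv; do
  table=$(basename "$f" .csv)
  # Turn the header "a,b,c" into a column list "a text, b text, c text"
  cols=$(head -n 1 "$f" | tr -d '\r' | sed 's/,/ text, /g; s/$/ text/')
  psql -d mydb -c "CREATE TABLE \"$table\" ($cols)"
  psql -d mydb -c "\copy \"$table\" FROM '$f' WITH (FORMAT csv, HEADER true)"
done
```

HEADER true makes \copy skip the first line, since it was already consumed to name the columns.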

2 Answers


Use pgfutter.

The general syntax looks like this:

   pgfutter csv <file.csv>

In order to run this on all csv files in a directory from Windows Command Prompt, navigate to the desired directory and enter:

   for %f in (*.csv) do pgfutter csv "%f"

Note that the directory containing the downloaded pgfutter executable must be added to the PATH environment variable (or the command must be run with the executable's full path).

EDIT: Here is the command line code for Linux users

Run it as

   pgfutter csv *.csv

Or if that won't do

   find . -iname '*.csv' -exec pgfutter csv {} \;
  • Thank you Jakub for pointing out pgfutter. However, the syntax for running the program is incorrect. I will edit the answer and then mark it as the accepted one. – Sean Oct 06 '15 at 21:42
  • @Sean Yeah, I didn't notice you were using Windows not Linux, sorry. – Jakub Kania Oct 07 '15 at 07:57
  • No problem, I added your original code to the accepted answer for Linux users – Sean Oct 07 '15 at 21:45
  • pq: password authentication failed for user "postgres" I got this error. – TheDataGuy Apr 04 '17 at 16:01
  • I use OS X El Capitan 10.11.6. I tried the code for Linux on my mac but it did not work. Any idea? – Paul.j Apr 20 '17 at 20:45
  • @Paul.jl unfortunately no. "did not work" is too vague and I don't even know if find or pgfutter did not work for you. – Jakub Kania Apr 20 '17 at 21:30
  • @Bhuvanesh I believe I had to make my password for the database blank, as "". I could not get pgfutter to correctly apply any other password. – Sean Apr 27 '17 at 20:19
  • pgfutter does not work on Macs for quite a while now. – JohnAllen Jan 25 '23 at 09:22

In the terminal, use nano to create a shell script that loops through all the csv files in a directory and loads each one into the Postgres database:

   nano run_pgfutter.sh

The content of run_pgfutter.sh:

#!/bin/bash
for i in /mypath/*.csv
do
   ./pgfutter csv "${i}"
done

Then make the file executable:

chmod u+x run_pgfutter.sh 
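
When loading hundreds of files, a few imports may fail (bad encoding, malformed rows). A variant of the loop above can record which files failed so they can be retried, instead of the failures scrolling by unnoticed. This is a sketch: /mypath and the log file name are placeholders for your own setup.

```shell
#!/bin/bash
# Same loop as run_pgfutter.sh, but each failed import is appended
# to a log for later inspection. /mypath and the log name are
# placeholders.
shopt -s nullglob   # if no csv files match, the loop does nothing
for i in /mypath/*.csv
do
  if ! ./pgfutter csv "$i"; then
    echo "failed: $i" >> pgfutter_errors.log
  fi
done
```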