0

I want to import a CSV file into a PostgreSQL table with Ruby. I wish to do it automatically (with a task) because new files with the same structure will be imported monthly. It it first copied to a temporary table, in this case "test", and afterwards inserted into another table.

Here is what I want to do:

  • Some of the headers of the CSV file contain whitespace, and I wish to replace them with an underscore so I don't have to deal with those later. For example, "col 1" needs to become "col_1".
  • Some of the columns in the CSV file are useless and I don't wish to copy them in the database. For example, out of "col_1", "col_2" and "col_3", I only want to copy "col_1" and "col_3".

Here is the CSV content I'm working with:

col 1,col 2,col 3
r1c1,r1c2,r1c3
r2c1,r2c2,r2c3

From searching on stackoverflow and other places, this is the following code that I got.

task :insert_charge [:file] => :environment do |task,args|
  require 'csv'
  testfile = CSV.read(args[:file],
    :headers => true,
    :converters => :all,
    :header_converters => lambda { |h| h.gsub(' ', '_') }
  )

   ActiveRecord::Base.connection.execute("
    drop table if exists test;
    create table test (
      id serial primary key,
      col_1 varchar(4),
      col_3 varchar(4)
    );
  ")

  conn = PG::Connection.open(
    :user => Rails.configuration.database_configuration["development"]["username"],
    :dbname => Rails.configuration.database_configuration["development"]["database"],
    :password => Rails.configuration.database_configuration["development"]["password"]
  )

  conn.copy_data "copy test (col_1, col_3)
    from stdin csv header delimiter ',' null as '' encoding 'utf-8'" do
    conn.put_copy_data testfile
  end
end

I am successfully able to change the whitespace in the headers with an underscore. However, this changes the CSV into Table mode, which then cannot be copied in the database. How can I modify the headers and then copy the CSV into the database?

This is the error that I get when I do rake insert_charge [d:\\test.csv]: TypeError: wrong argument type CSV::Table (expected String)

Please take in consideration that I'm a real Ruby beginner. I've seen questions similar to mine, but no answers that I works for my problem.

Thanks for your help!

mikibok
  • 35
  • 4

1 Answers1

0

Maybe you can consider to fix headers rewriting the file, change the first line:

lines = File.readlines('test.csv')
new_header = lines[0].chomp.split(',').map{ |w| w.gsub(' ', '_')}.join(',')
lines[0] = new_header << $/
File.open('test.csv', 'w') { |f| f.write(lines.join) }

If you don't want to overwrite the file, just change the name of the output file.

(Stolen here: https://stackoverflow.com/a/35958805)

Then probably you don't need to read file with the csv library, to pass a string to the process, try just:

testfile = File.read('test.csv')
p testfile.class #=> String
iGian
  • 11,023
  • 3
  • 21
  • 36