
I have two CSV files. One has this header:

%w{ Name E-mail Job Phone Application_date } 

The other one has:

%w{ E-mail Note }

What I want is to merge the two into a single CSV with this header:

%w{ Name E-mail Job Phone Application_date Note }

In the process, I want to pair each value in the Note column with the matching E-mail in the first CSV; every e-mail in the second CSV is also present in the first. In other words, the Note data has to be joined to the candidates through the e-mail.

require 'csv'

desc "Import csv candidates into the database"

task candidates: :environment do
  filepath_candidates_csv = 'data/Import task - Candidates.csv'
  filepath_note_csv = 'data/Import task - Notes.csv'
  filepath_final_csv = 'data/Final.csv'

  #removing candidates duplicates from the csv
  candidates = CSV.read(filepath_candidates_csv)
  new_candidates = candidates.uniq {|x| x.first}

  # removing candidates notes from the csv
  notes = CSV.read(filepath_note_csv)
  new_notes = notes.uniq {|x| x.first}
  new_notes[0][0] = "E-mail"

  # generate new csv array with the updated fields
  hs = %w{ Name E-mail Phone Job Created_at Note }
  CSV.open(filepath_final_csv, "wb") do |csv|
    csv << hs
    CSV.parse_line(new_candidates) do |line|
      csv << line unless line.contain?("E-mail")
    end
  end
end

I get this error:

Running via Spring preloader in process 9372
rake aborted!
NoMethodError: private method `gets' called for #<Array:0x00005638b5452bc8>
/home/luis/code/levisn1/Import-Task/csv_Importer/lib/tasks/import.rake:23:in `block (2 levels) in <main>'
/home/luis/code/levisn1/Import-Task/csv_Importer/lib/tasks/import.rake:21:in `block in <main>'
-e:1:in `<main>'
Tasks: TOP => candidates
(See full trace by running task with --trace)
Kaido
  • This is a duplicate of [this one](https://stackoverflow.com/questions/7947000/merge-csv-files-on-a-common-field-with-ruby-fastercsv) – yorodm Feb 08 '19 at 15:09

2 Answers


First, you need to parse both files: you could save each row in a hash, or create a new class and save instances of that class. Second, you need to pair the entries that share the same email (if you create instances of your own class, you can assign each note to the right instance while you parse the second CSV). Finally, you write a CSV file again.
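
The three steps above, using plain hashes instead of a class, might look roughly like the sketch below. The inline sample data and the variable names (`candidates_text`, `notes_text`, `merged`, `output`) are made up for illustration; the column names are the ones from the question. `CSV.generate` builds the result as a string here, and `CSV.open` would write the same rows to a file.

```ruby
require 'csv'

# Hypothetical sample data standing in for the two files from the question.
candidates_text = <<~TEXT
  Name,E-mail,Job,Phone,Application_date
  John,john@john.us,police,112,21.02.
  Ivan,ivan@ivan.ru,kgb,02,23.02.
TEXT

notes_text = <<~TEXT
  E-mail,Note
  ivan@ivan.ru,some note
TEXT

# Step 1: parse both inputs; headers: true lets each row be read by column name.
candidates = CSV.parse(candidates_text, headers: true).map(&:to_h)
notes      = CSV.parse(notes_text, headers: true).map(&:to_h)

# Step 2: pair each candidate with the note that has the same e-mail (nil if none).
merged = candidates.map do |candidate|
  match = notes.find { |note| note['E-mail'] == candidate['E-mail'] }
  candidate.merge('Note' => match && match['Note'])
end

# Step 3: write the combined rows out again.
header = %w[Name E-mail Job Phone Application_date Note]
output = CSV.generate do |csv|
  csv << header
  merged.each { |row| csv << row.values_at(*header) }
end

puts output
```

Candidates without a matching note simply get an empty last column.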

Have a look at this gem, it might be helpful: https://github.com/ruby/csv

How does that sound?

EDIT: here is the code if you use a class to solve the problem

class Person
  attr_reader :name, :email, :phone, :job, :created_at
  attr_accessor :note
  #state
  # name,email,phone,job,created_at
  def initialize(name, email, phone, job, created_at, note)
    @name = name
    @email = email
    @phone = phone
    @job = job
    @created_at = created_at
    @note = note
  end
  #behaviour
end

#little test:
person_1 = Person.new("john", "john@john.us", "112", "police", "21.02.", nil)
p person_1

require 'csv'
csv_options = { headers: :first_row }
filepath    = 'persons.csv'
persons = []

# ** passes the options hash as keyword arguments (needed on Ruby 3+)
CSV.foreach(filepath, **csv_options) do |row|
  persons << Person.new(row["name"], row["email"], row["phone"], row["job"], row["created_at"], nil)
end

filepath_2 = "notes.csv"
CSV.foreach(filepath_2, **csv_options) do |row|
  persons.each do |person|
    if person.email == row["email"]
      person.note = row["note"]
    end
  end
end

p persons

csv_options = { col_sep: ',', force_quotes: true, quote_char: '"' }
filepath    = 'combined.csv'

CSV.open(filepath, 'wb', **csv_options) do |csv|
  csv << ['name', 'email', 'phone', 'job', 'created_at', "note"]
  persons.each do |person|
    csv << [person.name, person.email, person.phone, person.job, person.created_at, person.note]
  end
end

Clara
  • Sounds great, but I'm very much a beginner and writing my own class sounds difficult :/ Can you give me a visual example? Thank you. – Kaido Feb 08 '19 at 15:24
  • Are you a LW alumni too? I just saw your question in our slack :D – Clara Feb 09 '19 at 17:58

This is a naive implementation; you can improve it.

It is just an idea to get you started.

Here are the example CSV files:

$ cat first.csv
name,email,phone,job,created_at
John,john@john.us,112,police,21.02.
Jack,jack@jack.us,112,ambulance,22.02.
Ivan,ivan@ivan.ru,02,kgb,23.02.

$ cat second.csv
email,note
ivan@ivan.ru,some note

Naive script:

require 'csv'

first_csv = CSV.
              read('first.csv', headers: true).
              map { |value| { name:       value['name'],
                              email:      value['email'],
                              phone:      value['phone'],
                              job:        value['job'],
                              created_at: value['created_at'] } }

second_csv = CSV.
               read('second.csv', headers: true).
               map { |value| { email: value['email'],
                               note:  value['note'] } }

# The same email searching

first_csv.each do |f|
  second_csv.each do |s|
    f.merge! s if f[:email] == s[:email]
  end
end

# Write to new CSV

CSV.open('new.csv', 'w') do |csv|
  csv << %w(name email phone job created_at note)
  first_csv.each do |info|
    csv << info.values_at(:name, :email, :phone, :job, :created_at, :note)
  end
end

Checking the result:

$ cat new.csv
name,email,phone,job,created_at,note
John,john@john.us,112,police,21.02.,
Jack,jack@jack.us,112,ambulance,22.02.,
Ivan,ivan@ivan.ru,02,kgb,23.02.,some note
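
One possible improvement to the nested loop in the naive script, offered only as a sketch: build an email-to-note lookup hash once, so each row of the first file needs a single hash lookup instead of a scan over the whole second file. The data is inlined here so the snippet runs on its own; `notes_by_email` and `new_csv` are illustrative names, not part of the script above.

```ruby
require 'csv'

# Same sample data as first.csv and second.csv, inlined for a standalone run.
first_data = <<~TEXT
  name,email,phone,job,created_at
  John,john@john.us,112,police,21.02.
  Jack,jack@jack.us,112,ambulance,22.02.
  Ivan,ivan@ivan.ru,02,kgb,23.02.
TEXT

second_data = <<~TEXT
  email,note
  ivan@ivan.ru,some note
TEXT

# Build the email => note lookup once.
notes_by_email = CSV.parse(second_data, headers: true).each_with_object({}) do |row, lookup|
  lookup[row['email']] = row['note']
end

# Append the matching note (or nil) to every row of the first file.
new_csv = CSV.generate do |csv|
  csv << %w[name email phone job created_at note]
  CSV.parse(first_data, headers: true).each do |row|
    csv << row.fields + [notes_by_email[row['email']]]
  end
end

puts new_csv
```

With the sample data this yields the same four lines as new.csv. If the second file had several notes for one email, this lookup would keep only the last one, which is a design choice to revisit for real data.
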
mechnicov
  • I have some questions: you transform an array of arrays into hashes using map, why? And why do you use headers: true? – Kaido Feb 08 '19 at 19:57
  • `first_csv` and `second_csv` are arrays of hashes. `headers: true` lets you refer to columns by name and treats the first row as the header instead of data – mechnicov Feb 08 '19 at 20:09
  • Isn't a CSV read by Ruby as an array of arrays? When you call map, you populate a hash for each row; that is how it seems to me. Sorry if I don't understand. – Kaido Feb 08 '19 at 20:30
  • CSV uses arrays in Ruby. For example `CSV.read('second.csv')[1][0]` returns `"ivan@ivan.ru"`. Read more about CSV in the [official docs](https://ruby-doc.org/stdlib-2.5.1/libdoc/csv/rdoc/CSV.html) or in this [tutorial](https://www.rubyguides.com/2018/10/parse-csv-ruby/). – mechnicov Feb 08 '19 at 20:52
  • @LuisCarlosQuarta, I've improved the answer to demonstrate how `headers: true` works (all columns have their own names, and these names are in the first line of the CSV file). – mechnicov Feb 08 '19 at 21:35
  • Can you explain this to me: value['name']? Are you working with a sub-array? How do you access the value like that? Or is it CSV functionality related to the column? Thank you – Kaido Feb 09 '19 at 17:14
  • I'm testing your code with my CSV and I get this error on the line where I create the hash: TypeError: no implicit conversion of Array into String – Kaido Feb 09 '19 at 17:40
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/188153/discussion-between-mechnicov-and-kaido). – mechnicov Feb 09 '19 at 18:00
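
To make the `headers: true` discussion in the comments concrete, here is a small sketch that parses the same text both ways. The inline string mirrors second.csv from the answer; `plain`, `table` and `row` are illustrative names.

```ruby
require 'csv'

# Same content as second.csv, inlined so the snippet runs on its own.
data = "email,note\nivan@ivan.ru,some note\n"

# Without headers: an array of arrays, and the header line is just row 0.
plain = CSV.parse(data)
p plain[0]     # ["email", "note"]
p plain[1][0]  # "ivan@ivan.ru"

# With headers: true: a CSV::Table whose rows are CSV::Row objects.
# The first line becomes the column names and is no longer a data row.
table = CSV.parse(data, headers: true)
row = table[0]
p row['email'] # "ivan@ivan.ru"
p row.to_h     # {"email"=>"ivan@ivan.ru", "note"=>"some note"}
```

This is why `value['name']` works in the answer: with `headers: true` each row is a `CSV::Row`, which can be indexed by column name as well as by position.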