I have a CSV file that contains floating-point numbers written with commas instead of dots, like "34,21", and I need to parse it in my rake task. I have already tried some solutions, such as this one: Ruby on Rails - Import Data from a CSV file

But none of them seems to work properly; they just parse the value as two fields (34 and 21). Is there a way to fix this using the built-in CSV library?

I have already tried this:

require 'csv'

task :drugimport, [:filename, :model] => :environment do |task, args|
    # Read the file row by row; header names become symbol keys.
    CSV.foreach(args[:filename], encoding: "UTF-8", headers: true, header_converters: :symbol,
        converters: :all) do |row|
            Moulding.create!(row.to_hash)
        end
end

And this one:

require 'smarter_csv'
options = {}
SmarterCSV.process('input_file.csv', options) do |chunk|
   chunk.each do |data_hash|
       Moulding.create!( data_hash )
   end
end

They both look nice and elegant, except for the wrong parsing of fields containing commas.

Here are my rows (sorry, they are in Russian): http://pastebin.com/RbC4SVzz I didn't change anything in them, so I pasted them to pastebin; I guess that will be more useful than pasting them here.

Here is my import log: http://pastebin.com/rzC0h9rS

animekun
  • Here is an idea... Take your CSV rows and replace commas with dots, and after you parse them, reverse the change. `"foo,bar,foo,baz,foo,bar".gsub(",", ".")` – Tim Aug 02 '15 at 03:05
  • @TimKos thanks for the response, but I guess it is not a proper solution, because I would then need to manually detect opening and closing quotes, find the commas inside them, and replace those. I guess there is a more elegant and easy built-in solution; in PHP it's pretty easy to parse such files, so I guess Rails has a solution too – animekun Aug 02 '15 at 03:10
  • In that case, could you please update your question with the code that you use to parse your CSV files? Whilst I agree with you about the lack of "elegance" in the prior solution, I think with Rails some form of regex will inevitably be necessary. But let's have a look first – Tim Aug 02 '15 at 03:20
  • @TimKos yep, sure, check out the updates – animekun Aug 02 '15 at 03:27
  • It would be great to see what a couple of rows of your CSV look like. – Mark Swardstrom Aug 02 '15 at 03:46
  • @Swards yep, sorry, I've added them – animekun Aug 02 '15 at 03:56

2 Answers

In my opinion, you have three possible roads you could go:

1) work with the "bad" input and try to find a workaround

You could work line by line and try

line.split(" ,")

which would assume that there is a blank space before the comma. Another approach would be to identify the numerical values with a regex and replace the comma character (this might be easier to fix in the source data!)
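As a minimal sketch of the regex idea, assuming the decimal-comma values are quoted in the file (so the parser already sees them as one field), a custom converter can be passed to the built-in CSV library. The `decimal_comma` lambda and the sample data are made up for illustration:

```ruby
require 'csv'

# Hypothetical converter: turns strings such as "34,21" into the Float 34.21
# and leaves every other field (including nil) untouched.
decimal_comma = lambda do |field|
  if field.is_a?(String) && field.match?(/\A-?\d+,\d+\z/)
    field.tr(',', '.').to_f
  else
    field
  end
end

# Made-up sample; the decimal-comma value is quoted, so it stays one field.
sample = "name,price\nwidget,\"34,21\"\n"

rows = CSV.parse(sample, headers: true, header_converters: :symbol,
                         converters: [decimal_comma])
first_row = rows.first
price = first_row[:price]
```

If the values are not quoted, no converter can help, because the parser has already split them into two fields before any converter runs.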

2) try to export the CSV with another separator

This depends on where the data comes from. If you can re-export the data, maybe that's the easiest solution. In this case, of course, your data would technically not be CSV anymore, but, for example, SSV (semicolon-separated values).
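A short sketch of how such a re-exported file could be read with the built-in CSV library, assuming a semicolon separator; the sample data is made up for illustration:

```ruby
require 'csv'

# Made-up semicolon-separated sample; unquoted decimal commas no longer
# collide with the column separator.
sample = "name;price\nwidget;34,21\n"

table = CSV.parse(sample, col_sep: ';', headers: true, header_converters: :symbol)

raw_price = table.first[:price]      # still a String containing a decimal comma
price = raw_price.tr(',', '.').to_f  # convert it explicitly to a Float
```

The `col_sep` option is also accepted by `CSV.foreach`, so the same idea should carry over to a rake task that reads from a file.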

3) try other CSV parsers

I can definitely suggest you take a look at other CSV parsers, such as FasterCSV (which became Ruby's built-in CSV library in Ruby 1.9) and others (see a list of CSV parsers at ruby-toolbox)

I hope this is helpful advice. Sample CSV data would definitely make it easier to help you.

OpenGears

Right, so from what I am seeing, you are not passing any options to the parser, as you probably realize yourself. When row_sep or other options are not specified, smarter_csv will use the system newline separator, which is "\r\n" on Windows machines and "\n" on Unix machines.

That being said, try the following...

require 'smarter_csv'
SmarterCSV.process('input_file.csv', :row_sep => :auto, :col_sep => ",") do |chunk|
  chunk.each do |data_hash|
    Moulding.create!( data_hash )
  end
end

I agree with Swards. What I have done assumes quite a lot of things. A glance at some CSV data could be useful.

Tim
  • yep, you are right about "\r\n", but it is still using "\r\n" as separators :c – animekun Aug 02 '15 at 03:51
  • Hmm, try adding `row_sep: :auto` as well inside the options. I'll update my answer accordingly for you to see – Tim Aug 02 '15 at 03:57
  • I've added some rows from the CSV with headers, hope it helps – animekun Aug 02 '15 at 03:57
  • nope, it didn't help :( I've added the data import log; you can check where the import fails: in dates, in regnumbers, and in prices with commas too – animekun Aug 02 '15 at 04:01
  • well, you'll laugh, but both your code and mine from the question work. It's me, an idiot: I had been working for 20 hours and was absolutely out of my mind. I was changing the code in my local repo but running the import on the server through a terminal, and I didn't notice that the code on the server hadn't changed. I thought it was my local terminal, and I realised this only after taking a nap :D well, sorry, guys, you are all great, thanks so much to all of you! – animekun Aug 02 '15 at 13:41
  • :0 Ah, what a relief! I am happy it all went well. – Tim Aug 02 '15 at 14:42