4

Description

Reading a CSV file in Ruby.

  1. I have a CSV file with this content:

    longitude,latitude,phone
    13,139.7113134,35.56712836,0311112222
    
  2. I read the CSV file.

  3. The data I get in the phone column is not what I expect.

Code

uploaded_io = params[:rooms][:file]
rooms_table = CSV.table(uploaded_io.tempfile, encoding: "UTF-8")
rooms_table.each_with_index do |row, i|
  p row
end

Output of p row:

#<CSV::Row longitude:139.7113134 latitude:35.56712836 phone:52728978 >

I don't understand where the phone number value went. I expected the phone number to be 0311112222, not 52728978.

3limin4t0r
Akiko
  • Wrap that phone number in double-quotes in the file. – Sergio Tulentsev Aug 26 '18 at 11:44
  • 6
    @SergioTulentsev's comment is spot on, but just to add an explanation. The reason for this is that numbers with a leading zero are interpreted by Ruby as being in base 8 [octal](https://en.wikipedia.org/wiki/Octal) (311112222 in octal is 52728978 in [base 10](https://en.wikipedia.org/wiki/Decimal)). – mikej Aug 26 '18 at 11:48
  • 1
    @mikej That's a really interesting answer -- if you write it up as one, I'd upvote it. – thesecretmaster Aug 26 '18 at 12:32
  • Thank you, everyone. Is there a way to solve this without changing the CSV file? – Akiko Aug 26 '18 at 22:58
  • Wrapping the phone number in double quotes is not going to solve this problem. The purpose of double quotes in CSV is to wrap fields when the fields contain commas or newline characters. CSV data has no type and there is no way to specify type in the CSV file itself. (But this can be solved in Ruby—answer forthcoming.) – Jordan Running Aug 28 '18 at 15:14
  • Images might help, but for code or plain text it is usually better to provide a code block so people can copy its content. I submitted an edit (currently waiting for peer review) that adds this code block to your question. However, there are 3 headers and 4 values, which makes your example invalid. I suspect that the first value is an "id", but that info isn't provided. – 3limin4t0r Aug 28 '18 at 20:12
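
To make the octal explanation from the comments concrete, here is a minimal sketch. It assumes the :integer converter hands the field to Kernel#Integer, which is what the CSV::Converters source quoted in the answer below does. The variable names are only for illustration.

```ruby
phone = "0311112222"

# Kernel#Integer honours radix prefixes, so a leading "0" means octal.
as_octal = Integer(phone)          # => 52728978

# Forcing base 10 reads the digits as decimal, but the result is still
# a number, so the leading zero is gone.
as_decimal = Integer(phone, 10)    # => 311112222

# String#to_i assumes base 10 unless told otherwise.
via_to_i = phone.to_i              # => 311112222

# The same digits written as an explicit octal literal.
literal = 0o311112222              # => 52728978
```

Either way the leading zero is lost once the field becomes a number, which is why the phone column has to stay a String.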

2 Answers

2

The reason this is happening is that, per the docs, CSV.table is:

A shortcut for:

CSV.read( path, { headers:           true,
                  converters:        :numeric,
                  header_converters: :symbol }.merge(options) )

Note converters: :numeric, which tells it to automatically (attempt to) convert numeric-looking fields to Numbers. Phone numbers, of course, aren't really numbers, but rather strings of digits.

If you don't want any conversions, you could pass converters: nil as an option to CSV.table.

Assuming you do want the :numeric converter to still operate on the other fields, though, you need to define your own converter. A converter is a Proc that takes two arguments: A field value and an (optional) FieldInfo object. Your converter might look like this:

NUMERIC_EXCEPT_PHONE_CONVERTER = lambda do |value, field_info|
  if field_info.header == :phone
    value
  else
    CSV::Converters[:float].call(
      CSV::Converters[:integer].call(value))
  end
end

Then you would use it by passing it to CSV.table as the converters: option, which will override the default converters: :numeric:

rooms_table = CSV.table("data.csv", encoding: "UTF-8", converters: NUMERIC_EXCEPT_PHONE_CONVERTER)
p rooms_table[0]
# => #<CSV::Row longitude:139.7113134 latitude:35.56712836 phone:"0311112222">

As you can see, the phone value is now a string with the leading 0.

You can see this code in action on repl.it: https://repl.it/@jrunning/WellmadeFarflungCron
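
If it helps to see what a two-argument converter actually receives, here is a small sketch. The `recorder` lambda and the inline sample data (three headers, three values) are made up for illustration; it uses CSV.parse with the same headers: and header_converters: options that CSV.table sets.

```ruby
require "csv"

# Hypothetical sample data, not the original upload.
sample = "longitude,latitude,phone\n139.7113134,35.56712836,0311112222\n"

seen = []

# A converter that changes nothing and only records its arguments.
# Because it always returns the String it was given, the field stays a String.
recorder = lambda do |value, field_info|
  seen << [field_info.header, field_info.index, value]
  value
end

CSV.parse(sample,
          headers: true,
          header_converters: :symbol,
          converters: recorder)

seen.each { |header, index, value| puts "#{index}: #{header.inspect} -> #{value.inspect}" }
```

The header arrives already symbolized, which is why comparing `field_info.header` to `:phone` works in the converter above.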

Aside

Why, you might ask, is this bit so ugly?

CSV::Converters[:float].call(
  CSV::Converters[:integer].call(value))

It's because the CSV module defines CSV::Converters thusly:

Converters  = {
  integer:   lambda { |f|
    Integer(f.encode(ConverterEncoding)) rescue f
  },
  float:     lambda { |f|
    Float(f.encode(ConverterEncoding)) rescue f
  },
  numeric:   [:integer, :float],
  # ...
}

Since the :numeric converter is not specified as a lambda, but rather an array that indicates that it's really just a "chain" of the :integer and :float converters, we can't just do CSV::Converters[:numeric].call(value); we have to call the two converters manually. (If anybody knows something I'm missing, please leave a comment.)
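
One way to hide that ugliness is a small helper that resolves a converter name, following arrays of names until it reaches a lambda. This is only a sketch: `apply_builtin_converter` is a hypothetical name, not part of the CSV library, and it assumes the built-in lambdas accept a single argument, as in the source shown above.

```ruby
require "csv"

# Hypothetical helper: applies a built-in converter by name.
# A name may map to a lambda or to an Array of other names (a "chain").
def apply_builtin_converter(name, value)
  converter = CSV::Converters.fetch(name)

  if converter.respond_to?(:call)
    converter.call(value)
  else
    converter.reduce(value) do |current, inner_name|
      # Mirror CSV's own rule: stop once the field is no longer a String.
      break current unless current.is_a?(String)
      apply_builtin_converter(inner_name, current)
    end
  end
end

CSV::Converters[:numeric]                   # => [:integer, :float]
apply_builtin_converter(:numeric, "13")     # => 13
apply_builtin_converter(:numeric, "35.5")   # => 35.5
apply_builtin_converter(:numeric, "abc")    # => "abc"
```

With a helper like this, the else branch of the custom converter could call `apply_builtin_converter(:numeric, value)` instead of nesting the two calls by hand.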

Jordan Running
  • I'm not sure if the comparison is made on content or object level (`v == v` or `v.equal?(v)`) for the decision to move to the next converter. If it is done on object level (the second) one might try to pass `[->(v, fi) { fi.header == :phone ? v.dup : v }, :numeric]` as value for the `:converters` option. If the comparison is done on content level you are required to add, remove or change a character. – 3limin4t0r Aug 28 '18 at 20:33
  • 1
    @JohanWentholt Yeah, I had hoped for that, but alas, the comparison is [rather more naive](https://github.com/ruby/ruby/blob/8867f285da534970c98f8fd388ea4d92ca750a67/lib/csv.rb#L1620-L1624). "Any converter that changes the field into something other than a String halts the pipeline of conversion for that field. This is primarily an efficiency shortcut." – Jordan Running Aug 28 '18 at 22:17
  • I guess in that case returning a string wrapper could work. (`require 'delegate'`) Passing the following as `:converters` option: `[->(v, fi) { fi.header == :phone ? SimpleDelegator.new(v) : v }, :numeric]` (`SimpleDelegator.new('I am a string').is_a? String #=> false`) You can later convert it back to a string by calling `#to_s` if needed. Since all methods are delegated this shouldn't be needed most of the time. – 3limin4t0r Aug 29 '18 at 10:13
1

You can change:

rooms_table = CSV.table(uploaded_io.tempfile, encoding: "UTF-8")

to:

rooms_table = CSV.table(uploaded_io.tempfile, encoding: "UTF-8", converters: nil)

which will not convert/cast your fields (you will get strings). CSV.table's default converter is :numeric, which performs the conversion that you don't want.
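
As a rough illustration of the effect, here is a self-contained sketch. The temporary file and the sample row (three headers, three values) are made up for the example and stand in for the uploaded file.

```ruby
require "csv"
require "tempfile"

# Hypothetical sample data standing in for the upload.
sample = "longitude,latitude,phone\n139.7113134,35.56712836,0311112222\n"

row = Tempfile.create(["rooms", ".csv"]) do |file|
  file.write(sample)
  file.flush

  # converters: nil switches off the default :numeric converter.
  CSV.table(file.path, encoding: "UTF-8", converters: nil)[0]
end

row[:phone]      # => "0311112222"
row[:longitude]  # => "139.7113134" (a String now, too)

# Convert the columns that really are numbers yourself.
longitude = Float(row[:longitude])  # => 139.7113134
```

The trade-off is that every column comes back as a String, so the numeric columns need an explicit conversion where they are used.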

The built-in converters you can work with are listed here:

https://ruby-doc.org/stdlib-2.5.1/libdoc/csv/rdoc/CSV.html#Converters

Tarek N. Elsamni