
I'm trying to run a Ruby script, and I always get an error on this line:

file_content.gsub(/dr/i,'med')

I'm trying to replace "dr" with "med".

The error is:

program.rb:4:in `gsub': invalid byte sequence in UTF-8 (ArgumentError)

Why does this happen, and how can I fix it?

I'm working on a Mac OS X Yosemite machine, with Ruby 2.2.1p85.

  • From the variable name it looks like you are reading the data from a file – is that right? Where does the file come from and how are you reading it? Do you know the actual encoding of the file? – matt Apr 26 '15 at 13:18

1 Answer


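Your string probably contains bytes that are not valid UTF-8, so clean it up before calling gsub:

# Replace invalid bytes with "?" via a round trip through UTF-16BE.
s = file_content
unless s.valid_encoding?
  s = s.encode("UTF-16be", :invalid=>:replace, :replace=>"?").encode('UTF-8')
end
# gsub returns a new string, so keep the result.
s = s.gsub(/dr/i,'med')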

See "Ruby 2.0.0 String#Match ArgumentError: invalid byte sequence in UTF-8".
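To make the cause concrete, here is a minimal sketch that reproduces the situation and then applies the round trip from this answer. The sample string is hypothetical (in the question the data comes from IO.read), and the byte 0xE9 is just one example of a byte that is not valid UTF-8 in that position.

```ruby
# Hypothetical sample: the string is tagged as UTF-8 but contains the
# lone byte 0xE9, which is not a valid UTF-8 sequence here.
file_content = "Dr. Ren\xE9 and dr. Smith"

puts file_content.encoding         # the tag Ruby attached to the string
puts file_content.valid_encoding?  # false: the bytes do not match the tag

begin
  # Regexp matching refuses to scan a string with broken encoding.
  file_content.gsub(/dr/i, 'med')
rescue ArgumentError => e
  puts "gsub failed: #{e.message}"
end

# Round trip through UTF-16BE, replacing every invalid byte with "?".
cleaned = file_content
  .encode("UTF-16BE", invalid: :replace, replace: "?")
  .encode("UTF-8")

result = cleaned.gsub(/dr/i, 'med')
puts result
```

Note that the replaced bytes are gone for good: the original character behind 0xE9 cannot be recovered from the cleaned string.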

  • Thanks for your reply. How can I use the code snippet you provided? When I use it as-is in my program, I get: undefined local variable or method `s' for main:Object (NameError). Thanks – Simplicity Apr 26 '15 at 12:11
  • use `file_content` instead of `s` – jon snow Apr 26 '15 at 12:14
  • Before the line I show in my question, I have the following line of code: "file_content = IO.read(filename)". I placed your code after this line and before the line in my question, and I'm still having the same problem – Simplicity Apr 26 '15 at 12:26
  • replace `file_content.gsub(/dr/i,'med')` line with my block of code – jon snow Apr 26 '15 at 12:29
  • @Simplicity you’re using Ruby 2.2, so you can just use the [`scrub`](http://ruby-doc.org/core-2.2.2/String.html#method-i-scrub) method. This technique is really a workaround for older versions that don’t have `scrub`. (But really you should work out what the encoding actually is and convert it properly, otherwise you are losing data). – matt Apr 26 '15 at 16:06
  • For me, this answer saved me: https://stackoverflow.com/a/19103433/7365329 – Touqeer Nov 21 '19 at 01:41
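The two options matt describes in the comments could look roughly like this. It is only a sketch: the sample string is hypothetical, and ISO-8859-1 is an assumed source encoding chosen for illustration, since the question never states what the file's encoding really is.

```ruby
# Hypothetical sample: bytes that are invalid when read as UTF-8.
raw = "Dr. Ren\xE9 and dr. Smith"

# Option 1: String#scrub (Ruby 2.1+) replaces each invalid byte sequence.
# This is lossy: the original character is discarded.
scrubbed = raw.scrub("?")
puts scrubbed.gsub(/dr/i, 'med')

# Option 2: if the real encoding is known, relabel the bytes and transcode.
# ISO-8859-1 is an ASSUMPTION here; substitute the file's actual encoding.
converted = raw.dup.force_encoding("ISO-8859-1").encode("UTF-8")
puts converted.gsub(/dr/i, 'med')
```

Option 2 keeps the data intact because nothing is replaced. When the text comes from a file, the same conversion can be requested at read time, for example `File.read(filename, encoding: "ISO-8859-1:UTF-8")`, again assuming that is the file's true encoding.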