I am having trouble handling text files of tabulated data generated on a Windows machine. I'm working in Ruby 1.8. The following code raises an error ("\000" (Iconv::InvalidCharacter)) when processing the SECOND line of the file. The first line is converted properly.
require 'iconv'
conv = Iconv.new("UTF-8//IGNORE","UTF-16")
infile = File.open(tabfile, "r")
while (line = infile.gets)
  line = conv.iconv(line.strip) # FAILS HERE
  puts line
  # DO MORE STUFF HERE
end
The strange thing is that it reads and converts the first line of the file with no problem. I also pass the //IGNORE flag to the Iconv constructor, which I thought was supposed to suppress this kind of error.
I've been going in circles for a while. Any advice would be highly appreciated.
Thanks!
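EDIT: hobbs's solution fixes this. Thank you. The file is apparently little-endian UTF-16, so each newline is the two bytes 0x0A 0x00. The default gets splits right after the 0x0A byte and leaves the 0x00 at the start of the next line, which is the "\000" that Iconv rejects. Passing the two-byte separator to gets fixes it. Simply change the code to: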
require 'iconv'
conv = Iconv.new("UTF-8//IGNORE","UTF-16")
infile = File.open(tabfile, "r")
while (line = infile.gets("\x0a\x00"))
  line = conv.iconv(line.strip) # NO LONGER FAILS HERE
  # DOES MORE STUFF HERE
end
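To see why the separator matters, here is a minimal sketch that uses a hand-built UTF-16LE string instead of the real file. The sample data and the StringIO stand-in are assumptions for illustration only; they are not taken from the actual input.

```ruby
require 'stringio'

# Hypothetical sample: two ASCII lines widened to UTF-16LE by hand
# (every byte followed by 0x00), standing in for the real file.
sample = "a\tb\nc\td\n".unpack("C*").map { |b| [b, 0] }.flatten.pack("C*")

# Default separator: gets splits right after the 0x0A byte,
# so the 0x00 half of the newline starts the next "line".
io = StringIO.new(sample)
first_default  = io.gets
second_default = io.gets

# Two-byte separator: gets splits after the complete UTF-16LE newline.
io = StringIO.new(sample)
first_fixed  = io.gets("\x0a\x00")
second_fixed = io.gets("\x0a\x00")

p second_default.unpack("C*").first  # => 0  (stray null byte)
p second_fixed.unpack("C*").first    # => 99 (the byte for "c")
```

With the default separator, every line after the first begins with a leftover null byte, which matches the error reported on the second line.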
Now I'll just need to find a way to automatically determine which gets separator to use.
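One possible approach, sketched here under the assumption that the files start with a UTF-16 byte-order mark (BOM), is to read the first two bytes and choose the separator from them. The helper name `utf16_line_separator` is hypothetical, and files without a BOM are not handled beyond returning nil.

```ruby
# Hypothetical helper: choose the gets separator from the file's BOM.
# Returns nil when no UTF-16 BOM is found; the caller decides what to do then.
def utf16_line_separator(path)
  # Read the first two bytes in binary mode so nothing is translated.
  bom = File.open(path, "rb") { |f| f.read(2) }

  # Compare as byte values to avoid any string-encoding surprises.
  case (bom || "").unpack("C*")
  when [0xFF, 0xFE] then "\x0a\x00"  # UTF-16LE: newline is 0x0A 0x00
  when [0xFE, 0xFF] then "\x00\x0a"  # UTF-16BE: newline is 0x00 0x0A
  else nil
  end
end
```

It could then be used as `sep = utf16_line_separator(tabfile)` followed by `infile.gets(sep)`. Check for nil first, because `gets(nil)` reads the whole file in one go. Note also that this matches raw bytes, so in principle the separator can match across two adjacent characters (for example, a character whose high byte is 0x0A followed by one whose low byte is 0x00 in a little-endian file); for plain tabulated data that is unlikely, but treat this as a starting point rather than a complete solution.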