I just noticed some very strange behavior of the gsub method, and maybe somebody can explain it to me. I have a file which I read in the standard way:

f=File.read(filename)
puts f.gsub('xxxxx','a')

This works fine, and all xxxxx strings are replaced with a.

But if I read the same file, which was originally encoded in iso-8859-1, with an explicit encoding:

f=File.read(filename,:encoding => 'iso-8859-1')
puts f.gsub('xxxxx','a')

this doesn't work. There is no error; gsub simply does nothing and xxxxx is not replaced. With a single-character pattern, however, it works fine:

f=File.read(filename,:encoding => 'iso-8859-1')
puts f.gsub('x','a')

Is there a reason why?

Edit: I am adding an example file as a link to Google Drive. It is a default export from SQL Server: file_in_zip
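One possible explanation, offered only as an assumption because the thread never confirms the file's real encoding: if the export is actually UTF-16LE rather than iso-8859-1, every character is stored as two bytes, and reading those bytes as iso-8859-1 leaves a NUL character after each `x`. A minimal sketch that reproduces the symptom in memory, without the original file:

```ruby
# Assumption: the export is UTF-16LE; the thread does not confirm this.
utf16_bytes = "xxxxx".encode("UTF-16LE").b   # raw bytes: x\0 x\0 x\0 x\0 x\0

# Relabel the same bytes as ISO-8859-1, which is the effect of
# File.read(filename, :encoding => 'iso-8859-1') on such a file.
misread = utf16_bytes.dup.force_encoding("ISO-8859-1")

p misread                            # shows a NUL after every "x"

multi  = misread.gsub("xxxxx", "a")  # no match: the x's are not adjacent
single = misread.gsub("x", "a")      # matches each x on its own

p multi == misread                   # true when nothing was replaced
p single
```

If `f[0, 20].inspect` on the real file shows `\u0000` (or `\x00`) between letters, this is very likely what is happening.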

Mi Ro
  • Can you show your file content? I just created a file containing `this is a ruby file\n`, and `gsub` replaced `i` with `a` correctly! – spike 王建 Aug 10 '20 at 13:49
  • The iso-8859-1 file is output generated by SQL Server 2019 with the generate script function; it is a normal SQL stored procedure. – Mi Ro Aug 10 '20 at 14:15
  • Could there be some whitespace character hiding there somewhere? – Yule Aug 10 '20 at 15:36
  • Could you try to create a file with `string = "\u0078\u0078\u0078\u0078\u0078"; open("transcoded.txt", "w:ISO-8859-1") { |io| io.write(string) }`? When I tried `gsub` on the contents of `transcoded.txt`, it worked just fine. Could you try it and post the code and the result? – Giuseppe Schembri Aug 10 '20 at 19:01
  • As posted, your problem is not reproducible. Try looking at your file in a hex editor, or posting a String#inspect version of your file so we can see what Ruby sees. – Todd A. Jacobs Aug 10 '20 at 19:41
  • @ToddA.Jacobs I added a link to the file. It looks like the file might contain some strange character at the beginning, but the behavior of gsub and other string functions (like split) is really strange: they work only with a single character, not with multiple characters. – Mi Ro Aug 11 '20 at 15:35
  • What is the actual encoding of the data in the file? – D. SM Aug 14 '20 at 04:50
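Following up on the encoding question and the hex-editor suggestion above, here is a hedged sketch of how the file could be checked and decoded. It assumes, without confirmation from the thread, that the "strange character" at the start is a UTF-16LE byte order mark (bytes FF FE). `read_sql_export` is a hypothetical helper name, not part of Ruby or of the asker's code.

```ruby
require "tempfile"

# Hypothetical helper: decode a file to UTF-8. If it starts with the
# FF FE byte order mark it is treated as UTF-16LE (assumption);
# otherwise it is treated as ISO-8859-1, as the question expects.
def read_sql_export(path)
  raw = File.binread(path)
  if raw.start_with?("\xFF\xFE".b)
    # Drop the 2-byte BOM, then transcode UTF-16LE to UTF-8.
    raw.byteslice(2..-1).force_encoding("UTF-16LE").encode("UTF-8")
  else
    raw.force_encoding("ISO-8859-1").encode("UTF-8")
  end
end

# Usage with a stand-in file written as UTF-16LE with a BOM.
Tempfile.create("export") do |f|
  f.binmode
  f.write("\uFEFFxxxxx".encode("UTF-16LE"))
  f.flush
  puts File.binread(f.path, 4).inspect         # first bytes, as a hex editor would show
  puts read_sql_export(f.path).gsub("xxxxx", "a")
end
```

Once the text is decoded with its real encoding, multi-character patterns in `gsub` and `split` behave normally. The BOM check is deliberately simple and does not distinguish UTF-32LE, which begins with the same two bytes.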

0 Answers