35

I have a string in Ruby, say s, which might contain any of the standard line endings (\n, \r\n, \r). I want to convert all of them to \n. What's the best way?

This seems like a super-common problem, but there's not much documentation about it. Obviously there are easy crude solutions, but is there anything built in to handle this?

Elegant, idiomatic-Ruby solutions are best.

EDIT: I realized that ^M and \r are the same, but there are still three cases. (See Wikipedia.)

Guilherme Bernal
Peter

4 Answers

44

Since Ruby 1.9 you can use String#encode with universal_newline: true to convert all of your line endings to \n while keeping the string's encoding unchanged:

s.encode(s.encoding, universal_newline: true)
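
As a quick illustration (the sample string and variable names are made up for this example), here is the call above applied to a string that mixes all three endings:

```ruby
# Mixed line endings: CRLF, a lone CR, and LF in one string.
mixed = "one\r\ntwo\rthree\nfour"

# Passing the string's own encoding keeps it unchanged; only the newlines are rewritten.
normalized = mixed.encode(mixed.encoding, universal_newline: true)

p normalized                             # every ending should now be "\n"
p normalized.encoding == mixed.encoding  # the encoding is preserved
```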

Once the string is in a known newline state, you can freely convert it to CRLF using crlf_newline: true. For example, to convert a file with unknown (possibly mixed) line endings to CRLF, read it in binary mode, then:

s.encode(s.encoding, universal_newline: true).encode(s.encoding, crlf_newline: true)
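
A small sketch of the same two-step conversion, wrapped in a helper. The name to_crlf and the sample string are hypothetical, chosen only for this example:

```ruby
# Hypothetical helper: normalize everything to "\n" first,
# then expand every "\n" to "\r\n".
def to_crlf(text)
  text.encode(text.encoding, universal_newline: true)
      .encode(text.encoding, crlf_newline: true)
end

mixed = "one\r\ntwo\rthree\nfour"
converted = to_crlf(mixed)
p converted  # every ending should now be "\r\n"
```

The normalizing step matters because, as far as I can tell, crlf_newline simply expands each \n, so applying it directly to text that already contains \r\n would leave a doubled \r. For a file, the input would be something like the result of File.binread(path), matching the "read it in binary mode" advice above.
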
Greg
  • 7
    You don't need to include the first `s.encoding`, a simple `s.encode(universal_newline: true)` or `s.encode(crlf_newline: true)` does the trick. This helped me with a project today. – Donovan Dec 03 '14 at 19:56
  • 2
    @Donovan - You're _probably_ right, however the docs say that the version without an explicit encoding will transcode to `Encoding.default_internal`, which may or may not be what you want. My version will conservatively preserve your current encoding. – Greg Dec 04 '14 at 23:20
  • 2
    True, and you make a good point, but in most cases the default is fine; after all, that's what `String.new` uses. So in my case (and I could argue most cases), it would be redundant. – Donovan Dec 06 '14 at 02:12
  • 1
    This is apparently much faster than other methods (takes 40% less time than gsub method whereas split-join takes about 40% more time). I compared this to: `s.gsub(/\r\n?/, "\n")`, `s.gsub("\r\n", "\n").gsub("\r", "\n")` (about same speed), and `s.split(/\r\n?/).join("\n")` – Kanat Bolazar Jan 21 '16 at 22:57
41

Best is just to handle the two cases that you want to change specifically and not try to get too clever:

s.gsub(/\r\n?/, "\n")
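
To see why this one pattern covers both cases, here is a short sketch on a made-up sample string. It also shows the quoting pitfall raised in the comments below:

```ruby
mixed = "a\r\n\r\nb\rc\nd"

# \r\n? matches a CR plus an optional LF, so a CRLF pair is consumed as one
# match and becomes a single "\n" rather than two.
normalized = mixed.gsub(/\r\n?/, "\n")
p normalized

# Quoting matters: single quotes do not interpret the escape sequence.
p '\n'.length  # backslash followed by n
p "\n".length  # one newline character
```

Existing \n characters are never matched by the pattern, so they pass through untouched.
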
Josh Lee
  • 1
    Two things: You have to put \r\n first in the regex or else it will never match (because anything that could otherwise be matched by \r\n will be matched by \r first). And '\n' == "\\n", while what you want is "\n". – sepp2k Dec 02 '09 at 22:08
  • 1
    Change the single quotes to double quotes. Otherwise it doesn't work as intended. – Mikael S Dec 02 '09 at 22:09
  • It seems we're all on the same page :) – Josh Lee Dec 02 '09 at 22:10
  • Nicely done that you don't bother changing the default case (`\n` -> `\n` is unnecessary). I didn't quite realise this at first :) – Peter Dec 02 '09 at 22:13
  • 1
    Interesting answer; I wonder why Ruby doesn't have something like Python's `os.linesep`? –  Aug 12 '12 at 05:27
  • It would be cool to have an answer that used this line of code in context: all the details around File, readline, etc., and where this line fits in. – ahnbizcad Aug 31 '16 at 22:02
4

I think the cleanest solution would be to use a regular expression:

s.gsub!(/\r\n?/, "\n")
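
One thing worth knowing about the in-place variant: gsub! returns nil when nothing was replaced, so its return value shouldn't be chained or assigned blindly. A minimal sketch with made-up strings (the .dup calls are only there so the example also works where string literals are frozen):

```ruby
clean = "already\nclean".dup
dirty = "needs\r\nwork".dup

clean_result = clean.gsub!(/\r\n?/, "\n")  # nothing to replace
dirty_result = dirty.gsub!(/\r\n?/, "\n")  # one replacement made

p clean_result  # nil, because no substitution happened
p dirty_result  # the modified string itself
p dirty         # the receiver was changed in place
```
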
Mikael S
-9

Try opening them in the NetBeans IDE. It has asked me before, on a project I'd opened from elsewhere, whether I wanted to fix the line endings. I think there might be a menu option to do it too, but that would be the first thing I would try.

Ash
  • 2
    thanks, but this isn't a one-off; this is for processing data in Ruby, not processing Ruby files. – Peter Dec 02 '09 at 21:52