
I used a heredoc to make some simple sample data like:

> str = <<TEXT
  abc^M^M^M
  def^M^M^M^M
TEXT

Now look at the contents of this string:

> p str
"abc\r\r\ndef\r\r\r\n"

Note that the three carriage returns become two "\r"s and the four carriage returns become three: one "\r" is lost from each line. I've also tried the %q() syntax, with the same result. Putting the same data in a file and reading it back gives a correct string with the right number of "\r"s.

Maybe the issue is related to "How can I preserve / maintain consecutive newlines in Ruby here-document?".

P.S.: This happens both when running the script from a file and in irb, using Ruby 1.8.7-p371, 1.9.3-p392, and 2.0.0-p247.

Coderer

1 Answer


Here's what Ruby sees on a macOS system:

str = <<TEXT
  abc^M^M^M
  def^M^M^M^M
TEXT
str # => "  abc^M^M^M\n  def^M^M^M^M\n"

str = <<TEXT
  abc



  def




TEXT
str # => "  abc\n\n\n\n  def\n\n\n\n\n"

Ruby uses "\n" for line-ends and, if I remember right, converts to and from that internally. Ruby is aware of the line-ends needed for various OSes and tries to convert to them when reading from or writing to text files, so the translation might be occurring during the file I/O.

str = <<TEXT
  abc\n\n\n
  def\n\n\n\n
TEXT
str # => "  abc\n\n\n\n  def\n\n\n\n\n"
puts str
# >>   abc
# >> 
# >> 
# >> 
# >>   def
# >> 
# >> 
# >> 
# >> 

If you output the last str on Windows using IO, File as a "text" file, or puts, Ruby should output the correct line-ends, converting LF characters to CRLF combinations. In other words, you shouldn't have to jump through hoops to output line-ends specific to an OS or file type UNLESS you're writing binary data.
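If you do need byte-exact output, here is a minimal sketch of one way to get it, assuming the carriage returns are written as "\r" escapes rather than typed as raw control characters. The scratch file name "line-ends" is made up for the example, and File.binwrite and File.binread need Ruby 1.9.3 or later:

```ruby
require "tempfile"

# Escapes are interpreted inside a plain (interpolating) heredoc, so the
# number of "\r" characters no longer depends on invisible bytes in the
# source file.
data = <<TEXT
abc\r\r\r
def\r\r\r\r
TEXT

# Hypothetical scratch file, used only to get a writable path.
file = Tempfile.new("line-ends")
file.close

File.binwrite(file.path, data)        # binary write: no line-end translation
round_trip = File.binread(file.path)  # binary read: bytes come back untouched
file.unlink

p data                # "abc\r\r\r\ndef\r\r\r\r\n"
p round_trip == data  # true
```

Writing and reading in binary mode keeps Ruby from translating line-ends on any platform, so what lands on disk is exactly what is in the string.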

the Tin Man
  • I *am* trying to handle this as binary data, precisely because I'm trying to deliberately manipulate the line endings. Also, I don't think you're entering this correctly if you actually see "^M" in the response -- if you type it with `CTRL-V` `CTRL-M` then Ruby will show you a `\r` in `str.inspect`. – Coderer Jan 10 '14 at 16:35
  • I was working with what was in your example, by copying and pasting the content visible when editing the page, which is the raw source. If you are working with binary data, you should have said so in your question, because that's essential. A heredoc isn't really suitable for generating binary data. The conversion of CTRL-V CTRL-M to `"\r"` is exactly what I'd expect inside source code, because the characters are identical inside strings, which heredocs are. Instead, you should use `pack` if you want to guarantee no line-end conversion, along with the required "b" flag when opening files. – the Tin Man Jan 10 '14 at 22:54
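The `pack` suggestion in the last comment could look something like this. It is only a sketch: the variable names are made up for illustration, and the byte values 13 and 10 are the ASCII codes for CR and LF.

```ruby
cr = 13  # ASCII code for "\r"
lf = 10  # ASCII code for "\n"

# Build each line as an array of byte values, so the exact number of
# carriage returns is explicit and never passes through the Ruby parser.
line1 = "abc".unpack("C*") + [cr] * 3 + [lf]
line2 = "def".unpack("C*") + [cr] * 4 + [lf]

# "C*" packs every integer as one unsigned 8-bit byte.
bytes = (line1 + line2).pack("C*")

p bytes              # "abc\r\r\r\ndef\r\r\r\r\n"
p bytes.count("\r")  # 7
```

To write the result out unchanged, open the file with the "b" flag, for example `File.open(path, "wb") { |f| f.write(bytes) }`, where `path` is whatever file you are targeting.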