I'm trying to scan a file that has the DOS ^M as end-of-line using something like:

Scanner file = new Scanner(new File(saveToFilePath)).useDelimiter("(?=\^M)")

In other words, I want to read the text line by line but also keep the ^M that marks the end of the line. This would be easy with \n but I'm not good with regexes and the DOS end-of-line is driving me crazy.

davejal
Rick
  • Maybe you need to open the file in binary mode, so that you don't get automatic newline translation. – paddy Jan 27 '16 at 01:21
  • What is dos eol, and what control code is `^M` ? Also, looks like `(?=\^M)` is regex. Why not just use `"\\r?\\n"` ? –  Jan 27 '16 at 01:26
  • `^M` is just the way certain text editors show you there is a Windows eol (CRLF, or "\r\n"); it's not an actual character by itself. At least, it shouldn't be unless you tried manually copying text into a new file from an emacs terminal. – Riaz Jan 27 '16 at 01:53
  • `^M` is a way of representing `Ctrl-M`, which corresponds to Carriage Return. It is also written as \r, decimal value 13, or hex 0D. That was never the DOS end-of-line. It was used by TRS-80, Apple II, Mac OS, and OS-9. DOS always used a two-character EOL: \r\n (Carriage Return, Line Feed) – dbenham Jan 27 '16 at 03:36
  • It's a CSV file that is meant for Excel. I'm altering a column but want to be absolutely sure that the altered file is identical to the original apart from the change to that column's cells. But once I read the lines into an array and wrote them back to a file, the eol had disappeared when I opened the file in vi. I can easily replace the eol with "\n". I was just afraid that it might haunt me down the road :) – Rick Jan 27 '16 at 04:53

1 Answer

After some research I finally got it. The following regex finds the ^M and keeps it. I didn't know that ^M meant CTRL-M, so some of your responses helped with that. The "M" does not appear in the regex because \p{Cntrl} matches any control character, and CTRL-M (carriage return) is one of them. Since the pattern is a lookahead, the scanner splits in front of the elusive "^M" without consuming it, so the character stays in the text that is read.

Scanner file = new Scanner(source).useDelimiter("(?=\\p{Cntrl})");
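For anyone who wants to check what the scanner actually hands back, here is a minimal sketch. The class name `KeepLineEndings`, the helper `tokens`, and the two-line CRLF sample are made up for illustration and are not from the original file. It compares the delimiter above with a narrower `(?=\\r)`, which splits only in front of a carriage return. Note that `\p{Cntrl}` also matches `\n` and tab, so with that pattern a CRLF file is split in front of both the `\r` and the `\n`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class for illustration; not part of the original post.
class KeepLineEndings {

    // Splits text on a zero-width (lookahead) delimiter. Nothing is consumed,
    // so joining the returned tokens reproduces the input exactly.
    static List<String> tokens(String text, String zeroWidthDelimiter) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(text).useDelimiter(zeroWidthDelimiter)) {
            while (scanner.hasNext()) {
                out.add(scanner.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = "a,b\r\nc,d\r\n"; // made-up two-line CRLF sample

        // Split only in front of a carriage return.
        List<String> byCr = tokens(sample, "(?=\\r)");
        // Split in front of every control character, as in the answer.
        List<String> byCntrl = tokens(sample, "(?=\\p{Cntrl})");

        System.out.println("(?=\\r) tokens: " + byCr.size());
        System.out.println("(?=\\p{Cntrl}) tokens: " + byCntrl.size());
        System.out.println("round trip intact: "
                + String.join("", byCr).equals(sample));
    }
}
```

In both cases the line ending ends up at the start of the following token rather than at the end of the line it terminates. That is fine if the goal is only to write the tokens back out unchanged, but it matters if each token is processed as a "line".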

Thank you, everyone.

Rick