
I have files with every imaginable EOL. I want to normalize them all in one go rather than one by one, since we are talking about a few thousand files. I know how to do it manually, so please don't explain that.

I think the possible ones are, from most to least common: CRLF, LF, CR-CRLF, CRCR-CRLF, CR, LFLF, CRCR, CRLF-CRLF and CRCRCR-CRLF (yes, there is one such file).

Every file has a consistent EOL; none mixes EOL types. Some odd CR or LF might remain after fixing; those can be left alone.

I want all files to have just CRLF. Empty lines must remain intact.

First, I think I need reliable detection of which EOL each file uses. It could check that the EOL repeats at least 3 times, but some files have just one line.
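One way the detection could be sketched in Python, without requiring three repetitions: treat the shortest run of CR/LF bytes in a file as its EOL unit, then check that every other run is a whole multiple of it. The names `KNOWN_EOLS` and `detect_eol` are made up for this illustration, and the heuristic assumes the file has at least one line break that is not followed by an empty line; a CRLF file in which every line is followed by an empty line cannot be told apart from a CRLF-CRLF file.

```python
import re

# The nine EOL units listed in the question, as raw bytes.
KNOWN_EOLS = {
    b"\r\n", b"\n", b"\r\r\n", b"\r\r\r\n", b"\r",
    b"\n\n", b"\r\r", b"\r\n\r\n", b"\r\r\r\r\n",
}


def detect_eol(data: bytes):
    """Return the EOL unit used in `data`, or None if it cannot be determined."""
    # Every maximal run of CR/LF bytes is one or more line breaks in a row.
    runs = re.findall(rb"[\r\n]+", data)
    if not runs:
        return None  # no line break at all
    # Assumption: the shortest run is a single line break.
    unit = min(runs, key=len)
    if unit not in KNOWN_EOLS:
        return None
    # Every run must be the unit repeated a whole number of times.
    for run in runs:
        if run != unit * (len(run) // len(unit)):
            return None  # inconsistent file, leave it for manual review
    return unit
```

Files for which this returns `None` would have to be inspected by hand.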

Here are some scratch files I made; all should look like the CRLF one when done (there are only TXT files inside): https://www71.zippyshare.com/v/BNpRAijy/file.html

I Googled for the whole day and didn't find any good solution.

Examples

1. Just CRLF EOL, the result I want from all files:

line1CRLF

line2CRLF

CRLF

line3CRLF

line4CRLF

CRLF

CRLF

line5CRLF

CRLF

CRLF

CRLF

line6CRLF

CRLF

2. CRCRLF: Manually, I would replace CRCRLF with CRLF (\r\r\n with \r\n), then repeat for the files with CRCRCRLF, and once more for that lonely CRCRCRCRLF. The problem is that not all files have just this variant; there are 5 more to consider, which I listed above. Plain LF and plain CR are less of a problem, since Windows Notepad now supports Unix and Mac EOL, but it would still be nice to include them.

So the main problem remains LFLF, and then there are also those few CRCR and CRCR-CRLF files to consider. Ideally, all possibilities would be covered.

line1CR

CRLF

line2CR

CRLF

CR

CRLF

line3CR

CRLF

line4CR

CRLF

CR

CRLF

CR

CRLF

line5CR

CRLF

CR

CRLF

CR

CRLF

CR

CRLF

line6CR

CRLF

CR

CRLF
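Once the EOL unit of a file is known, the conversion itself could look roughly like the following Python sketch. The function name `to_crlf` is hypothetical, and the unit is passed in as an argument rather than detected here. Each run of CR/LF bytes made of N copies of the unit becomes N CRLFs, so empty lines survive; runs that do not fit the unit (the odd stray CR or LF) are left alone.

```python
import re


def to_crlf(data: bytes, unit: bytes) -> bytes:
    """Replace every occurrence of the EOL `unit` in `data` with CRLF."""

    def fix(match):
        run = match.group()
        count, rest = divmod(len(run), len(unit))
        # Only rewrite runs that are exact repetitions of the unit.
        if rest == 0 and run == unit * count:
            return b"\r\n" * count
        return run  # stray CR/LF bytes stay untouched

    return re.sub(rb"[\r\n]+", fix, data)
```

For the CRCRLF example above, `to_crlf(data, b"\r\r\n")` turns `line2`, CRCRLF, CRCRLF into `line2`, CRLF, CRLF, which keeps the empty line.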

  • I think you need to write a program that visits each file in turn. It would read the first few lines and determine the line break format of that file, then process it accordingly. This is not a job for Notepad++. – AdrianHHH Apr 20 '20 at 21:43
  • Thanks, wrote it in Python. I'm just kidding, a programmer friend already had it; I just wanted to know if there's a simple way in Notepad++. – GrimReaper Apr 20 '20 at 22:21
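The "visit each file in turn" part of that suggestion might be sketched in Python as below. This is not the script mentioned in the comment, only an illustration: `convert_tree` is a made-up name, and `convert` stands for any bytes-to-bytes function that normalizes one file's content. Files are read and written in binary mode so that Python does not translate line endings on its own.

```python
from pathlib import Path


def convert_tree(root, convert, pattern="*.txt"):
    """Apply `convert` to every file under `root` matching `pattern`.

    Returns the list of paths that were actually rewritten.
    """
    changed = []
    for path in sorted(Path(root).rglob(pattern)):
        original = path.read_bytes()   # binary mode: no newline translation
        fixed = convert(original)
        if fixed != original:          # skip files that are already fine
            path.write_bytes(fixed)
            changed.append(path)
    return changed
```

Since this overwrites files in place, running it on a copy of the folder first would be the cautious choice.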

1 Answer

With Notepad++, you can do:

  • Ctrl+Shift+F
  • Find what: \R+
  • Replace with: \r\n

Here, \R+ matches one or more line breaks of any kind.
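To see what this replacement does to a run of line breaks, here is a rough Python analogue. Python's `re` module has no `\R`, so the alternation below is an assumed approximation that covers only CR and LF, and `collapse_breaks` is a name invented for the example.

```python
import re

# Approximation of \R+ limited to CR, LF and CRLF.
LINEBREAKS = re.compile(r"(?:\r\n|\r|\n)+")


def collapse_breaks(text: str) -> str:
    """Replace each run of line breaks with a single CRLF."""
    return LINEBREAKS.sub("\r\n", text)
```

Because the `+` consumes the whole run, several consecutive breaks become a single CRLF, so any empty lines between two text lines are removed as well.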


Toto
  • Maybe it's a start, but this deletes all empty lines which is no good for me. Please, if you have time, try to get at least the 7 ones I listed to match CRC32 of CRLF one. I'm quite sure it must be way more complicated than this. – GrimReaper Apr 20 '20 at 15:39
  • @GrimReaper: Instead of giving a link to an unknown file, please [edit your question](https://stackoverflow.com/posts/61325134/edit) and add some sample lines and the expected result. Also add all constraints; you hadn't said you didn't want empty lines deleted. – Toto Apr 20 '20 at 16:14