Remove \r\n in awk

Question

I have a simple awk command that converts a date from MM/DD/YYYY to YYYY/MM/DD. However, the file I'm using has \r\n at the end of the lines, and sometimes the date is at the end of the line.

awk '
  BEGIN { FS = OFS = "|" }
  {
    split($27, date, /\//)
    $27 = date[3] "/" date[1] "/" date[2]

    print $0
  }
' file.txt

In this case, if the date is MM/DD/YYYY\r\n then I end up with this in the output:

YYYY
/MM/DD

What is the best way to get around this? Keep in mind, sometimes the input is simply \r\n in which case the output SHOULD be // but instead ends up as

/
/

Why don't you use one of the replacement functions on `date[3]` to replace `\r` with `""`? — Lars Fischer, Apr 09 '17 at 17:51

mklement0 · Accepted Answer · 2017-04-09T22:27:56.970

Given that the \r isn't always at the end of field $27, the simplest approach is to remove the \r from the entire line.

With GNU Awk or Mawk (one of which is typically the default awk on Linux platforms), you can simply define your input record separator, RS, accordingly:

awk -v RS='\r\n' ...

Or, if you want \r\n-terminated output lines too, set the output record separator, ORS, to the same value:

awk 'BEGIN { RS=ORS="\r\n"; ...
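As a minimal sketch, this is roughly how the RS approach could be combined with the script from the question. The sample input, the date sitting in field 3 rather than field 27, and the function name fix_dates are assumptions made only to keep the example short. It needs an Awk that accepts a multi-character RS, such as GNU Awk or Mawk.

```shell
# Hypothetical wrapper around the question's script; the date is assumed to be in field 3.
fix_dates() {
  # RS='\r\n' makes awk split records on CRLF, so no \r ever reaches the fields.
  awk -v RS='\r\n' '
    BEGIN { FS = OFS = "|" }
    {
      split($3, date, /\//)                  # MM/DD/YYYY -> date[1], date[2], date[3]
      $3 = date[3] "/" date[1] "/" date[2]   # rebuild as YYYY/MM/DD
      print                                  # output lines end in plain \n (default ORS)
    }
  ' "$@"
}

# Assumed sample input: CRLF-terminated lines, the second one with an empty date field.
printf 'a|b|04/09/2017\r\nc|d|\r\n' | fix_dates
```

The line with the empty date field should end in // on a single line, which is the behavior the question asks for.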

Optional reading: an aside for BSD/macOS Awk users:

BSD/macOS awk doesn't support multi-character RS values (in line with the POSIX Awk spec: "If RS contains more than one character, the results are unspecified").

Therefore, a sub call inside the Awk script is necessary to trim the \r instance from the end of each input line:

awk '{ sub("\r$", ""); ...

To also output \r\n-terminated lines, option -v ORS='\r\n' (or ORS="\r\n" inside the script's BEGIN block) will work fine, as with GNU Awk and Mawk.
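A hedged sketch of the portable variant, keeping \r\n line endings in the output: the sample input, the date position (field 3 instead of 27), and the function name fix_dates_crlf are assumptions for illustration only. It relies only on a single-character input separator, so it should not depend on multi-character RS support.

```shell
# Hypothetical wrapper; the date is assumed to be in field 3.
fix_dates_crlf() {
  # ORS may be any string, so '\r\n' as the output terminator does not need a multi-character RS.
  awk -v ORS='\r\n' '
    BEGIN { FS = OFS = "|" }
    {
      sub(/\r$/, "")                         # trim the trailing \r from $0; fields are re-split
      split($3, date, /\//)                  # MM/DD/YYYY -> date[1], date[2], date[3]
      $3 = date[3] "/" date[1] "/" date[2]   # rebuild as YYYY/MM/DD
      print                                  # each output line ends in \r\n
    }
  ' "$@"
}

# Assumed sample input; od -c makes the \r \n at the end of each output line visible.
printf 'a|b|04/09/2017\r\nc|d|\r\n' | fix_dates_crlf | od -c
```

Because sub is applied to the whole record before any field is read, it does not matter whether the date is the last field on the line or not.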

After four hours of madness trying to `print var_1 var_2`, where var_2 was overwriting var_1, it works. Thanks! — Rigoberta Raviolini, Mar 09 '23 at 17:18

score 0 · Answer 2 · answered Apr 09 '17 at 22:03

If you're on a system where \n by itself is the newline, you should remove the \r from the record. You could do it like:

$ awk '{sub(/\r/,"",$NF); ...}'

James Brown
