
Right now, I'm trying to fix an issue in my PrintBans.sh script.

The problem is that the program that generates this file saves it with \r\n line endings, so the while loop needs to be able to read \r\n lines. Otherwise an extra \r is left at the end of the line, which makes the arithmetic fail:

 - 621355968000000000")syntax error: invalid arithmetic operator (error token is "

I've tried these.

while read ban
do
    ...
done < dos2unix $file

while read ban
do
    ...
done < `dos2unix $file`

cat $file > dos2unix > while read ban
do
    ...
done

while read ban
do
    ...
done < dos2unix < $file

I also see that some people set IFS='\r\n', but this did not work for me.

Is it impossible to pipe files through dos2unix without overwriting the original file?

PatPeter
  • `dos2unix < $file | while ...` – Cyrus Oct 06 '18 at 20:09
  • BTW, `IFS=\r\n` doesn't work, but `IFS=$'\r'` would have. Quotes matter. – Charles Duffy Oct 06 '18 at 20:16
  • @Cyrus, ...when showcasing that, it's probably worth linking to [BashFAQ #24](http://mywiki.wooledge.org/BashFAQ/024) so folks who are setting variables in that `while` loop know what they need to adjust so those variables persist past the loop's exit. – Charles Duffy Oct 06 '18 at 20:22
  • @CharlesDuffy: Okay, I solved a problem and created a new one. ;-) That's better with bash: `while ...; do ... ; done < <(dos2unix < "$file")` – Cyrus Oct 06 '18 at 20:27
  • @Cyrus This is black magic. Thank you. – PatPeter Oct 06 '18 at 20:31
  • @CharlesDuffy Sorry, I tried both ' and " and forgot to quote them when I surrounded it with backticks. – PatPeter Oct 06 '18 at 20:32
  • @PatPeter, `IFS='\r'` wouldn't work either -- the `$` in `IFS=$'\r'` is important. BTW, `$( )` has been a mandatory part of the POSIX sh standard since its initial publication in 1992 and resolves the quoting problems introduced by backticks. – Charles Duffy Oct 06 '18 at 20:33
  • ...btw, you could use `IFS=$(printf '\r')` on all POSIX shells, though outside the few shells (like ksh93) that optimize away the fork cost it's a hefty performance cost to pay over `IFS=$'\r'` for that extra portability. – Charles Duffy Oct 06 '18 at 20:35
  • No need for dos2unix or cut: `while IFS=';' read name idip end_time reason admin start_time ...` The '\r' comes with the last value, start_time, so `start_time="${start_time%$'\r'}"`. – ctac_ Oct 06 '18 at 20:55

2 Answers


Literal Answer: Pipe Through!

If you don't tell dos2unix the name of the file it's working with, it can't modify that file in-place.

while IFS= read -r line; do
  echo "No carriage returns here: <$line>"
done < <(dos2unix <"$file")

Redirections are performed by the shell before a program is started. When you invoke dos2unix <input.txt, the shell replaces file descriptor 0 with a read handle on input.txt before invoking dos2unix with no arguments.

If you wanted to be really paranoid (and pay a performance cost for that paranoia), you could guard against a hypothetical (nonexistent) dos2unix that modifies, in place, a file it receives as a descriptor on stdin. Writing the redirection as <(cat <"$file" | dos2unix) means dos2unix reads from a FIFO connected to the separate executable cat, rather than straight from the input file. Needless to say, I don't ever advise this in practice.
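
As a rough check that feeding the file in on stdin leaves the original untouched, here is a minimal sketch. The temp file and its contents are made up for illustration, and the fallback to `tr -d '\r'` when dos2unix isn't installed is my own assumption, not part of the approach above.

```shell
#!/usr/bin/env bash
# Sketch only: the sample data and temp file are invented for illustration.
file=$(mktemp)
printf 'first\r\nsecond\r\n' >"$file"
before=$(cksum <"$file")

# Use dos2unix as a stdin-to-stdout filter when it is installed; otherwise
# fall back to tr (an assumption), which likewise only ever sees a file
# descriptor, never a file name.
if command -v dos2unix >/dev/null 2>&1; then
  strip_cr() { dos2unix; }
else
  strip_cr() { tr -d '\r'; }
fi

lines=()
while IFS= read -r line; do
  lines+=("$line")
done < <(strip_cr <"$file")

after=$(cksum <"$file")
printf 'read %d lines; original unchanged: %s\n' "${#lines[@]}" \
  "$([[ $before == "$after" ]] && echo yes || echo no)"
rm -f -- "$file"
```

Because the loop reads from a process substitution rather than sitting on the right-hand side of a pipe, the `lines` array is still populated after `done`, which is the BashFAQ #24 point raised in the comments.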


Better Answer: Don't

You don't need dos2unix (which -- with its default in-place modification behavior -- is meant for human interactive users, not scripts); the shell itself can strip carriage returns for you:

#!/usr/bin/env bash
#              ^^^^- not /bin/sh; needed for $'' syntax and [[ ]]

while IFS= read -r line || [[ $line ]]; do
  line=${line%$'\r'}
  echo "No carriage returns here: <$line>"
done <"$file"
  • ${var%expr} is a parameter expansion which strips any trailing instance of the glob expression expr from the contents of the variable var.
  • $'\r' is ANSI C-like string syntax for a carriage return. Using that syntax is important, because other things that look like they might refer to a carriage return don't.

    • \r outside any kind of quoting context is just the letter r.
    • "\r" or '\r' are two characters (a backslash and then the letter r), not a single carriage return.
  • [[ $line ]] is a ksh extension adopted by bash equivalent to [ -n "$line" ]; it checks whether the variable line is non-empty. It's possible for read to return false while still populating a line if you have a partial line without any terminator; as l0b0 points out, line separators rather than terminators are common on Windows. This ensures the last line of a file is processed even if it doesn't end in a newline.
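
To see the pieces above working together, here is a small self-contained sketch. The sample data is made up, and its last line deliberately has no terminator, so it exercises the `|| [[ $line ]]` clause.

```shell
#!/usr/bin/env bash
# Sketch only: sample data is invented; the last line has no trailing newline.
file=$(mktemp)
printf 'one\r\ntwo\r\nthree' >"$file"

cr=$'\r'        # a single carriage-return character
not_cr='\r'     # two characters: a backslash and the letter r
echo "lengths: ${#cr} ${#not_cr}"

result=()
while IFS= read -r line || [[ $line ]]; do
  line=${line%$'\r'}    # drop one trailing CR, if present
  result+=("$line")
done <"$file"

printf '<%s>\n' "${result[@]}"
rm -f -- "$file"
```

Dropping the `|| [[ $line ]]` part would leave `result` with two elements instead of three, because `read` returns false on the unterminated final line even though it fills in `line`.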
Charles Duffy
  • You probably want `|| [[ -n "$line" ]]` on the first line since Windows uses line *separators* rather than line *terminators.* – l0b0 Oct 06 '18 at 20:22
  • Wow, such a comprehensive answer! Thank you so much. I do have a few questions though, such as what's the difference between `#!/bin/bash` and `#!/usr/bin/env bash` (you said not to use #!/bin/sh which I agree with)? – PatPeter Oct 06 '18 at 22:19
  • `#!/usr/bin/env bash` uses the first version of bash in the PATH. If you're running MacOS, `/bin/bash` will be the ancient 3.2 release, whereas `#!/usr/bin/env bash` will run a modern release if it's been installed, as via [MacPorts](https://www.macports.org/) or [Homebrew](https://brew.sh/) or (my personal favorite) [Nix](https://nixos.org/nix/). – Charles Duffy Oct 06 '18 at 22:33
  • Re: suggested edit -- `<$file` may or may not work (depending on shell release) if the name contains spaces, can evaluate as a glob, etc; much safer to use `<"$file"`. – Charles Duffy Oct 06 '18 at 22:39

Considering the context of your script on GitHub, and assuming none of the fields of your CSV file contains a CR character, you just have to put CR in IFS.

Change:

while read ban; do ...

to:

while IFS=$'\r' read -r ban; do ...

For the same price, you can get the split of ban into six fields with:

while IFS=$';\r' read -r name idip end_time reason admin start_time remaining; do
    name=${name/,/\\,}
    ticksToDateString "$end_time"
    ...
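
As a rough, self-contained illustration of the field-splitting loop above: the record below is invented, the subtraction merely stands in for whatever arithmetic the real script does, and `last_start` is a hypothetical helper variable added so the value can be inspected after the loop.

```shell
#!/usr/bin/env bash
# Sketch only: the record is made up; the field names come from the loop above.
file=$(mktemp)
printf 'SomePlayer;1.2.3.4;636745000000000000;spam;admin1;636744000000000000\r\n' >"$file"

while IFS=$';\r' read -r name idip end_time reason admin start_time remaining; do
  # With CR in IFS, the trailing \r acts as a field delimiter instead of
  # sticking to start_time, so arithmetic on that field works.
  last_start=$start_time            # hypothetical: kept for inspection
  duration=$(( end_time - start_time ))
  echo "$name: $duration ticks between start and end"
done <"$file"
rm -f -- "$file"
```

Note that the variables named in `read` are emptied by the final, failing `read` at end of file, so anything needed after the loop has to be copied inside it.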
xhienne
  • The file is generated by a program running on mono, which results in `\r\n` line endings. When my while loop parses each line, there's an extra `\r` in the last cell that's ruining my arithmetic because it breaks it across two lines. – PatPeter Oct 06 '18 at 22:15
  • Is it another unrelated issue with your code, or do you mean there is something wrong with the code I gave you in my answer? I don't see anything in it that could add that `\n` since it is the line delimiter for `read`. – xhienne Oct 06 '18 at 22:27