Correct syntax and usage of `cat` command?

Question

(This question is a follow-up on this comment, in an answer about git hooks)

I'm far too unskilled in bash (so far) to understand fully the remark and how to act accordingly. More specifically, I've been advised to avoid using bash command cat this way :

echo "$current_branch" $(cat "$1") > "$1"

because the order of operations depends on the specific shell and it could end up destroying the contents of the passed argument, so the commit message itself if I got it right?

Also, how to "save the contents in a separate step"?

Would the following make any sense?

tmp = "$1"
echo "$current_branch" $(cat $tmp) > "$1"

score 4 · Accepted Answer · edited Jun 20 '20 at 09:12

The proposed issue is not about overwriting variables or arguments, but about the fact that both reading from and writing to a file at the same time is generally a bad idea.

For example, this command may look like it will just write a file to itself, but instead it truncates it:

cat myfile > myfile   # Truncates the file to size 0

However, this is not a problem in your specific command. It is guaranteed to work in a POSIX compliant shell because the order of operations specify that redirections will happen after expansions:

The words that are not variable assignments or redirections shall be expanded. If any fields remain following their expansion, the first field shall be considered the command name and remaining fields are the arguments for the command.

Redirections shall be performed as described in Redirection.

Double-however, it's still a bit fragile in the sense that seemingly harmless modifications may trigger the problem, such as if you wanted to run sed on the result. Since the redirection (> "$1") and command substitution $(cat "$1") are now in separate commands, the POSIX definition no longer saves you:

# Command may now randomly result in the original message being deleted
echo "$current_branch $(cat "$1")" | sed -e 's/(c)/©/g' > "$1"

Similarly, if you refactor it into a function, it will also suddenly stop working:

# Command will now always delete the original message
modify_message() {
  echo "$current_branch $(cat "$1")"
}
modify_message "$1" > "$1"

You can avoid this by writing to a temporary file, and then replace your original.

tmp=$(mktemp) || exit
echo "$current_branch $(cat "$1")" > "$tmp"
mv "$tmp" "$1"

Outstanding answer, thanks a lot! Follow-up question : would the temp file not be dangling afterwards? I mean, it should not be detected by git as an untracked file or hinder the process otherwise, so I should get rid of it in the process, right? — Romain Valeri, Jan 07 '19 at 18:44
@RomainValeri In this example `mv` is used to move your hook's temp file and replace git's temp file, so there's no file left over. If you had done e.g. `cat "$tmp" > "$1"` then you would be right: that would have needed a separate `rm "$tmp"` to clean up the temp file. — that other guy, Jan 07 '19 at 18:48

score 1 · Answer 2 · answered Jan 07 '19 at 18:27

In my opinion, it's better to save to another file.

You may try something like

echo "$current_branch" > tmp
cat "$1" >> tmp # merge these into 
# echo "$current_branch" $(cat "$1") > tmp
# may both OK
mv tmp "$1"

However I am not sure if my understanding is right, or there are some better solutions.

This is what I considered as the core of question. It is hard to decide the "precedence" of $() block and >. If > is executed "earlier", then echo "$current_branch" will rewrite "$1" file and drop the original content of "$1", which is a disaster. If $() is executed "earlier", then everything works as expected. However, there exists a risk, and we should avoid it.

I get it better now, thanks for the explanations. I'll try it out and fiddle with it. — Romain Valeri, Jan 07 '19 at 18:34

score 1 · Answer 3 · answered Jan 07 '19 at 19:07

1

A command group would be far better than a command substitution here. Note the similarity to Geno Chen's answer.

{
  echo "$current_branch"
  cat "$1"
} > tmp && mv tmp "$1"

answered Jan 07 '19 at 19:07

chepner

497,756
71
530
681

Can I just ask in which regards it would be better? I'm afraid I'm unsure of what you're calling command substitution here :-/ – Romain Valeri Jan 07 '19 at 19:11
1

`$(...)` is a command substitution, and it requires the entire contents of the file be read into memory first. (It also strips any trailing newlines, which requires a small amount of fiddling to work around.) This simply writes the output of the `echo` command, followed by the output of the `cat` command, to the temp file. (Semantically, it's the same as writing then appending to a file, but only needs to open `tmp` once and is syntactically cleaner.) – chepner Jan 07 '19 at 19:33

Correct syntax and usage of `cat` command?

3 Answers3

Linked