
I'm using the GNU versions of these tools. I'm trying to unzip an archive and transform a file.

The file is "myfile.txt" and it appears in multiple folders in the archive, so I thought passing the full path to xargs would transform all of the files:

mkdir temp
unzip mypackage.zip -d temp

find temp -iname "myfile.txt" | xargs -I FILE sh -c "sed -e 's/replacethis/withthis/g' -e 's/replacethistoo/withthisaswell/g' FILE | tee FILE"
# List the files
find temp -iname "myfile.txt" | xargs -I FILE ls -l FILE
# Cat the files
find temp -iname "myfile.txt" | xargs -I FILE cat FILE
# Clean up 
rm -Rf temp

I run this script multiple times and get different outcomes, which I don't understand.

Each time, a different "myfile.txt" is modified, and sometimes one of the "myfile.txt" files has 0 bytes.

Why is this happening? It should be the same every time, shouldn't it? Is find only passing one, random, "myfile.txt" path to xargs each time I run this script?

halfer
red888
    What are you actually trying to accomplish? In particular, why do you pipe the `sed` output to `tee`? I'm guessing you might be looking for the `-i` option to `sed` (and then you won't need `xargs` or the `sh -c` wrapper either): `find temp -iname "myfile.txt" -exec sed -i -e 's/replacethis/withthis/g' -e 's/replacethistoo/withthisaswell/g' {} \;` – tripleee Jul 14 '20 at 19:05

1 Answer


Why is this happening? It should be the same every time, shouldn't it?

This happens because of a race condition. The pipeline starts `sed` and `tee` at the same time, so these two operations run in parallel:

  • sed opening and reading the file
  • tee opening and truncating the file

If tee wins, the file is already empty when sed reads it, so sed produces no output and the file ends up at 0 bytes.

If sed wins, it'll read (at least parts of) the file and you'll get some data.

Since process scheduling is not predictable, you risk seeing different results each time.
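One way to avoid the race is to let `sed` edit each file in place, so that nothing reads a file another process has just truncated. The following is a minimal sketch, assuming GNU `sed` and GNU `find`; the `demo` directory and its contents are hypothetical stand-ins for the unzipped archive. Because `replacethis` is a prefix of `replacethistoo`, the sketch applies the longer pattern first; that ordering is a choice made for this example, not something taken from the question.

```shell
#!/bin/sh
# Hypothetical demo tree standing in for the unzipped archive.
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/b"
printf 'replacethis and replacethistoo\n' > "$demo/a/myfile.txt"
printf 'replacethis and replacethistoo\n' > "$demo/b/myfile.txt"

# GNU sed -i writes the result to a temporary file and renames it over
# the original, so the input is never truncated while it is being read.
# The longer pattern goes first: otherwise the shorter one would rewrite
# the "replacethis" prefix and the second expression would never match.
find "$demo" -iname "myfile.txt" -exec sed -i \
    -e 's/replacethistoo/withthisaswell/g' \
    -e 's/replacethis/withthis/g' {} +

# Capture the results, show them, and clean up.
result_a=$(cat "$demo/a/myfile.txt")
result_b=$(cat "$demo/b/myfile.txt")
printf '%s\n%s\n' "$result_a" "$result_b"
rm -rf "$demo"
```

Here `-exec ... {} +` hands all matching paths to `sed` directly, so neither `xargs` nor the `sh -c` wrapper is needed.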

that other guy
  • If I change `tee` to `>`, would that solve my problem, or is running `sed` with `xargs` still going to have a race condition? – red888 Jul 14 '20 at 18:56
    That will make the results more consistent, but not correct. The file will *always* be truncated before `sed` runs, because the shell opens the output file before starting the command. In general, you can't use the same file for input and output. That's why `sed` has the `-i` option to perform in-place replacement. – Barmar Jul 14 '20 at 19:01
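
The truncation described in that comment can be reproduced with a short script. This is a sketch only, using a hypothetical scratch directory `work` and a single test file. It first shows that redirecting to the input file leaves it empty, then shows the usual workaround of writing to a separate temporary file and renaming it, which also works where `sed -i` is not available.

```shell
#!/bin/sh
# Hypothetical scratch directory and test file.
work=$(mktemp -d)
f="$work/myfile.txt"
printf 'replacethis\n' > "$f"

# The shell opens and truncates "$f" for the redirection before sed
# starts, so sed reads an empty file and writes nothing.
sed -e 's/replacethis/withthis/g' "$f" > "$f"
size_after_redirect=$(wc -c < "$f")

# Workaround: write to a separate temporary file, then rename it over
# the original only if sed succeeded.
printf 'replacethis\n' > "$f"
tmp=$(mktemp "$work/tmp.XXXXXX")
sed -e 's/replacethis/withthis/g' "$f" > "$tmp" && mv "$tmp" "$f"
content_after_rename=$(cat "$f")

printf 'size after redirect: %s\n' "$size_after_redirect"
printf 'content after rename: %s\n' "$content_after_rename"
rm -rf "$work"
```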