8

I want to sort a bunch of files. I can do

sort file.txt > foo.txt
mv foo.txt file.txt

but do I need this second file?

(I tried sort file.txt > file.txt of course, but then I just ended up with an empty file.)

Ciro Santilli OurBigBook.com
Nagel

5 Answers

12

Try:

sort -o file.txt file.txt

See http://ss64.com/bash/sort.html

`-o OUTPUT-FILE'
     Write output to OUTPUT-FILE instead of standard output.  If
     OUTPUT-FILE is one of the input files, `sort' copies it to a
     temporary file before sorting and writing the output to
     OUTPUT-FILE.
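
Since the question mentions a bunch of files, here is a minimal sketch of applying `-o` in a loop. It runs in a scratch directory so nothing real is touched; `demo_dir` and the sample file names are made up for illustration.

```shell
# Scratch directory for the demo (hypothetical name, not from the question).
demo_dir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$demo_dir/a.txt"
printf '3\n1\n2\n' > "$demo_dir/b.txt"

# Sort each file onto itself; sort reads its input before writing the -o target.
for f in "$demo_dir"/*.txt; do
    sort -o "$f" "$f"
done

# Show one of the results.
cat "$demo_dir/a.txt"
```

Quoting `"$f"` keeps the loop safe for file names that contain spaces.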
Andrew Cooper
  • +1 ;-) Some details: `sort` will do the `mv foo.txt file.txt` itself. – oHo Feb 02 '12 at 17:58
  • Ah, there we go! I saw that flag in the man, but didn't notice the additional info. EDIT: Odd. The additional info is not in my man pages. Why would the man page you're linking to be more comprehensive? – Nagel Feb 02 '12 at 18:00
  • 1
    @Nagel - Don't know. It was just the first hit on Google when I searched for `man sort`. The Berkeley man page has similar info, but a little more brief. – Andrew Cooper Feb 02 '12 at 18:06
  • Strange. The man pages on both my MacBook and my research lab's unix server just say "`-o, --output=FILE write result to FILE instead of standard output`" – Nagel Feb 02 '12 at 18:17
1

The philosophy of classic Unix tools like sort includes that you can build a pipe with them. Every little tool reads from STDIN and writes to STDOUT. This way the next little tool down the pipe can read the output of the first as input and act on it.

So I'd say that this is a bug and not a feature.

Please also read about Pipes, Redirection, and Filters in the very nice book by ESR.

1

Because you're writing back to the same file you'll always end up with a problem of the redirect opening the output file before sort gets done loading the original. So yes, you need to use a separate file.

Now, having said that, there are ways to buffer the whole file into the pipe stream first, but generally you wouldn't want to do that. It is possible if you write something to do it, but you'd be inserting special tools at the beginning and the end to do the buffering. Bash, however, will open the output file too soon if you use its > redirect.
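
One way to picture the buffering idea is to hold the sorted output in a shell variable before the redirect runs. This is only a sketch under assumptions: `demo_file` is a made-up scratch file, the whole file must fit in memory, and command substitution strips trailing newlines and cannot hold NUL bytes, so it is not a general replacement for `sort -o`.

```shell
# Scratch file for the demo (hypothetical name).
demo_file=$(mktemp)
printf 'b\nc\na\n' > "$demo_file"

# The command substitution completes (sort has read everything) before
# the second command's > redirect truncates the file.
sorted=$(sort "$demo_file") && printf '%s\n' "$sorted" > "$demo_file"

cat "$demo_file"
```

Note that an empty input file would come out as a single blank line with this approach, which is one more reason to treat it as an illustration rather than a tool.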

Wes Hardaker
0

If you are dealing with sorting fixed-length records from a single file, then the sort algorithm can swap records within the file. There are a few algorithms available. Your choice would depend on how random the file's contents are. Generally, quicksort tends to swap the fewest records and is usually the sort that completes first when compared to other sorting algorithms.

0

Yes, you do need a second file! The command

sort file.txt > file.txt

causes bash to set up the redirection of stdout, truncating file.txt, before it starts executing sort. This is a certain way to clobber your input file.
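
The truncation is easy to reproduce on a throwaway file. A small sketch, where `demo_file` is a made-up scratch file rather than anything from the question:

```shell
# Scratch file for the demo (hypothetical name).
demo_file=$(mktemp)
printf 'b\na\n' > "$demo_file"

# The shell opens and truncates the target first, so sort sees empty input.
sort "$demo_file" > "$demo_file"

# Report the file size in bytes after the failed "in-place" sort.
wc -c < "$demo_file"
```

The byte count reported at the end should be zero, matching the empty file described in the question.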

If you want to sort many files, try:

cat *.txt | sort > result.txt
Mithrandir
  • I suspected as much. It just felt inelegant somehow :P Thanks! – Nagel Feb 02 '12 at 17:57
  • -1 There is a `-o` option that allows output to the same file – Andrew Cooper Feb 02 '12 at 17:59
  • @AndrewCooper: I know, but the case above used output redirection, and in that case you need the extra file! Even -o uses a temporary file; that you don't have to care about its creation doesn't mean it isn't there! – Mithrandir Feb 02 '12 at 18:01
  • 1
    Yes, but the OP wasn't asking how to do it with redirection, he was asking if there was a way to do it without having to handle the temporary file himself. – Andrew Cooper Feb 02 '12 at 18:03