I have a text file containing one record per line which I'd like to sort alphabetically, except that I want '-' to sort after '[' and ']'. (The natural sort order has '-' before the square brackets.) Is there a way to modify the collation that sort(1) uses in order to achieve this?
3 Answers
4
You will probably want to apply one of the suggested workarounds, but the answer to your question is no(t easily). If you want to change how sort sorts, and none of the special sort orders offered by the command-line options suits you, you will need to define your own locale. See localedef.

Peter Eisentraut
3
You could do it with perl. Note that this comparison only looks at the first character of each line:
perl -e 'print sort { (($a =~ /^-/ && $b =~ /^[\[\]]/) || ($a =~ /^[\[\]]/ && $b =~ /^-/)) ? ($b cmp $a) : ($a cmp $b) } (<>)' filename

Mark Wagner
a good example of executable line noise :-) – Javier Dec 30 '10 at 19:01
1
One way would be to substitute a character that doesn't appear in your data, but sorts after the brackets (in some locale).
sed 's/-/|/g' inputfile | LC_ALL=C sort | sed 's/|/-/g' > outputfile
This is obviously not an ideal solution.
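The main risk with the substitution is that the placeholder already occurs in the data, in which case the second sed would turn those characters into '-'. A minimal sketch of a guard against that, assuming a POSIX shell; the function name sort_dash_after_brackets is made up for illustration, and '|' is kept as the placeholder from the command above:

```shell
# Hypothetical helper: print FILE sorted with '-' ordered after '[' and ']'.
sort_dash_after_brackets() {
    file=$1
    # '|' stands in for '-' while sorting. If the data already contains '|',
    # the round trip would rewrite it as '-', so stop instead.
    if grep -qF '|' < "$file"; then
        echo "input already contains '|'; choose another placeholder" >&2
        return 1
    fi
    # In the C locale '|' (0x7C) sorts after '[' (0x5B) and ']' (0x5D).
    sed 's/-/|/g' < "$file" | LC_ALL=C sort | sed 's/|/-/g'
}
```

Usage would be something like: sort_dash_after_brackets inputfile > outputfile. Because '|' also sorts after the lowercase letters in the C locale, '-' ends up after those as well, not only after the brackets.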

Dennis Williamson