3

I have a text file containing one record per line which I'd like to sort alphabetically, except that I want '-' to sort after '[' and ']'. (The natural sort order has '-' before the square brackets.) Is there a way to modify the collation that sort(1) uses in order to achieve this?

uckelman

3 Answers

4

You will probably want to apply one of the suggested workarounds, but the answer to your question is no(t easily). If you want to change how sort sorts, and none of the special sort orders offered by the command-line options suit you, you will need to define your own locale. See localedef.

Peter Eisentraut
3

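You could do it with Perl by sorting on a key in which every '-' is replaced by '^', the ASCII character immediately after ']'. Ties between keys fall back to comparing the original lines:

perl -e 'print map { $_->[1] } sort { $a->[0] cmp $b->[0] or $a->[1] cmp $b->[1] } map { (my $k = $_) =~ tr/-/^/; [$k, $_] } <>' filename

Mark Wagner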
1

One way would be to substitute a character that doesn't appear in your data, but sorts after the brackets (in some locale).

sed 's/-/|/g' inputfile | LC_ALL=C sort | sed 's/|/-/g' > outputfile

This is obviously not an ideal solution.
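
If the data might already contain the stand-in character, a decorate-sort-undecorate variant avoids rewriting the records themselves. The following is only a minimal sketch, assuming the records contain no tab characters. The function name sort_dash_after_brackets is made up for illustration, and '^' is used as the stand-in because it is the ASCII character right after ']':

```shell
# Hypothetical helper: sort lines bytewise, but treat '-' as if it were '^'
# (0x5E, the ASCII character immediately after ']').
# Assumption: records contain no tab characters, since tab separates key and record.
sort_dash_after_brackets() {
  # Decorate: prefix each record with a sort key in which '-' becomes '^'
  awk '{ key = $0; gsub(/-/, "^", key); print key "\t" $0 }' "$@" |
    # Sort bytewise on the decorated line (key first, then the original record)
    LC_ALL=C sort |
    # Undecorate: drop the key, keep the untouched record
    cut -f2-
}

# Example: reads standard input when no file arguments are given
printf '%s\n' 'a-b' 'a[b' 'a]b' 'a^b' | sort_dash_after_brackets
```

With that sample input the records should come out as a[b, a]b, a-b, a^b. Because only the key is altered, a literal '^' in a record is preserved, and records that differ only in '-' versus '^' fall back to the bytewise order of the original text.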

Dennis Williamson