
uniq (GNU coreutils 8.5) does not seem to distinguish between em- and en-dashes:

$ echo -e "a–b\na—b" | uniq -c

  2 a–b

Is there any way to force this distinction? I've tried various settings for LC_COLLATE with no luck.

jub0bs

1 Answer


It works for me with `LC_COLLATE=C`:

echo -e "a–b\na—b" | LC_COLLATE=C uniq -c
      1 a–b
      1 a—b
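A likely explanation, offered as an assumption rather than something confirmed above: GNU `uniq` seems to compare lines using the locale's collation rules, and some locale definitions give the two dashes no distinct weight, whereas the C locale compares raw bytes. The sketch below writes the dashes as octal-escaped UTF-8 bytes so it does not depend on the terminal encoding, and it uses `LC_ALL=C` instead of `LC_COLLATE=C` because `LC_ALL`, if already set in the environment, overrides `LC_COLLATE`. The variable names `en`, `em`, and `result` are illustrative only.

```shell
# En dash U+2013 is E2 80 93 in UTF-8; em dash U+2014 is E2 80 94.
# Octal escapes keep the script independent of the terminal's encoding.
en=$(printf 'a\342\200\223b')
em=$(printf 'a\342\200\224b')

# LC_ALL=C forces byte-wise comparison for this one command only;
# it takes precedence over LC_COLLATE and LANG if those are set.
result=$(printf '%s\n%s\n' "$en" "$em" | LC_ALL=C uniq -c)

# Two lines are printed, each with a count of 1.
printf '%s\n' "$result"
```

Setting the variable on the command itself, as here, leaves the locale of the rest of the shell session unchanged.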
Michael Krelin - hacker
  • I see. I was trying various permutations of `LC_COLLATE=en_GB.utf8` assuming that it _must_ be `utf8` to work. `LC_COLLATE=C` produces the expected results. Cheers! –  Oct 28 '11 at 12:01