I have data in a file in the form:

Torch Lake township | Antrim | 1194

I'm able to use grep to look for keywords and pipe that into sort, but the sort isn't behaving how I intended.

This is what I have:

grep '| Saginaw' data | sort -k 5,5

I would like to sort by the numerical value in the last column, but the output isn't ordered that way and I'm unsure what I'm actually doing wrong.

jww
  • Welcome to Stack Overflow! Lots of good information & people here. – Scottie H Sep 09 '19 at 23:15
  • Your command is working as expected for me. Perhaps you need to provide a better example. – EternalHour Sep 10 '19 at 00:00
  • Please show the relevant code, the relevant data, and the expected and actual output. Also see [How to create a Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve). – jww Sep 10 '19 at 01:22

1 Answer

A few things seem to be bogging you down.

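First, the vertical bar can be a special character in grep, where it means OR. Ex:
A|B
could be interpreted as A or B, and not A vertical bar B.
Whether that happens depends on the regex mode. With plain grep (basic regular expressions) an unescaped | is literal, so your pattern is fine as written:
grep '| Saginaw' data
It only means OR with grep -E (egrep), and there you escape it as \| to make it literal. Don't escape it with plain grep: in GNU grep's basic mode, \| is the form that means OR.
Or simply remove it altogether, if your data format allows that.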
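
If you want to check how your own grep treats the bar, a small experiment along these lines may help. The rows and numbers are invented for illustration, and the `sample`, `literal`, `extended`, `fixed` and `count` names are just placeholders:

```shell
# Made-up rows in the question's format (hypothetical data).
sample='Albee township | Saginaw | 2468
Torch Lake township | Antrim | 1194
Saginaw city | Saginaw | 51508'

# Plain grep (basic regex): the bar is literal.
literal=$(printf '%s\n' "$sample" | grep '| Saginaw')

# grep -E (extended regex): the bar means OR, so escape it to keep it literal.
extended=$(printf '%s\n' "$sample" | grep -E '\| Saginaw')

# grep -F (fixed strings): nothing is special, no escaping needed.
fixed=$(printf '%s\n' "$sample" | grep -F '| Saginaw')

# Count the matching rows; the Antrim row should be excluded.
count=$(printf '%s\n' "$sample" | grep -c '| Saginaw')

printf '%s\n' "$literal"
```

All three variants should select the same two Saginaw rows.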

Second, the sort command needs to know what your column separator is. By default, it splits fields on white space (spaces and tabs), so sort -k 5,5 actually says "sort on the 5th word".
To specify that your column separator is actually the vertical bar, use the -t option:
sort -t'|' -k 5,5
or, with the GNU long option,
sort --field-separator='|' -k 5,5
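
As a rough illustration of what -t changes, the sketch below sorts three invented rows on the second bar-separated field (the county). The data and the variable names are assumptions for the example, and LC_ALL=C is only there to make the ordering predictable:

```shell
# Made-up rows in the question's format (hypothetical data).
sample='Zilwaukee township | Saginaw | 60
Torch Lake township | Antrim | 1194
Ada township | Kent | 300'

# With -t'|', field 2 is the text between the first and second bar.
by_county=$(printf '%s\n' "$sample" | LC_ALL=C sort -t'|' -k 2,2)

# cut uses the same delimiter idea; note the leading space kept in field 3.
third_fields=$(printf '%s\n' "$sample" | cut -d'|' -f3)

printf '%s\n' "$by_county"
```

The rows should come out in county order: Antrim, Kent, Saginaw.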

Third, you've got a bit of a sticky wicket now. Your data is formatted as:
Field1 | Field2 | Field3
...and not...
Field1|Field2|Field3
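You may have issues with that additional space. Or maybe not. If all of your data has EXACTLY the same white space, you'll be fine. If some lines have a single space, some have 2 spaces, and others have a tab, a text sort on that field will get jacked up, because the leading white space is part of the key. (The -b option tells sort to ignore leading blanks, and a numeric sort with -n skips them anyway.)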
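
If the spacing does turn out to be inconsistent, one possible approach is to normalize it before sorting. This is only a sketch: the rows are invented, `ragged` and `clean` are placeholder names, and it assumes you don't mind the output losing the spaces around the bars:

```shell
# Hypothetical rows with inconsistent spacing around the bars (the second row uses tabs).
ragged=$(printf 'Albee township | Saginaw | 2468\nBirch Run township\t|\tSaginaw\t|\t6033\nChapin township  |Saginaw|  1040\n')

# Collapse any white space on either side of each bar into a bare "|".
# Spaces inside a field (e.g. "Birch Run township") are left alone.
clean=$(printf '%s\n' "$ragged" | sed 's/[[:space:]]*|[[:space:]]*/|/g')

printf '%s\n' "$clean"
```

After this step every row uses the Field1|Field2|Field3 layout, so sort -t'|' sees the same key shape on every line.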

Fourth, sorting by numbers may not be intuitive for you. By default sort compares text character by character, so the number 10 comes after the number 1 and before the number 2.
To sort the way you think it ought to be, where 10 comes after 9, use the option -n for numeric sort.
grep '| Saginaw' data | sort -t'|' -n -k 5,5
The entire field #5 will be sorted. Thus, 10 Abbington will come before 10 Andover. Use whichever field number actually holds your numbers; for a three-field line like your sample, that is -k 3,3.
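
Putting the pieces together for data shaped like the sample line in the question, a minimal sketch could look like this. The rows and numbers are invented, the variable names are placeholders, and the key is field 3 on the assumption that the number is the third bar-separated field:

```shell
# Made-up rows in the question's format (hypothetical data).
sample='Albee township | Saginaw | 2468
Torch Lake township | Antrim | 1194
Saginaw city | Saginaw | 51508
Chapin township | Saginaw | 1040
Zilwaukee city | Saginaw | 999'

# Text sort on field 3: compares character by character, so 999 lands last.
as_text=$(printf '%s\n' "$sample" | grep '| Saginaw' | LC_ALL=C sort -t'|' -k 3,3)

# Numeric sort on field 3: 999 lands first, 51508 last.
as_number=$(printf '%s\n' "$sample" | grep '| Saginaw' | sort -t'|' -k 3,3n)

printf '%s\n' "$as_number"
```

Add -r to the key (for example -k 3,3nr) if you want the largest value first.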

Scottie H
  • Also be aware that a single quote `'` and a double quote `"` are not necessarily the same thing! – Scottie H Sep 09 '19 at 23:16
  • but if the OP used data like `Torch Lake township | Antrim | 1194`, they would rather want `sort -t'|' -k3n` (to do a numeric sort on the numeric field, 1194). –  Sep 09 '19 at 23:19
  • Agreed. A lot of the actuals are dependent on his data. Once he gets past his initial problem, he can work out the details of sorting his particular data. Hopefully, we've given him enough tools to figure out most of what he wants. – Scottie H Sep 09 '19 at 23:34
  • If you won't be looking for patterns you can use `fgrep` aka `grep -F` or `grep --fixed-strings` so you don't have to escape anything. fgrep is also faster than grep because it knows it's not looking for patterns (good if your data file is large) – Stephen P Sep 10 '19 at 00:45