38

Is there a way to sort a csv file based on the 1st column using some shell command?

I have a huge file with more than 150k lines, so I can't do it in Excel :( Is there an alternative way?

fiddle

3 Answers

76

sort -k1 -n -t, filename should do the trick.

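-k1 starts the sort key at column 1. Because no end field is given, the key runs to the end of the line; use -k1,1 to sort on column 1 only.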

-n sorts numerically instead of lexicographically (so "11" will not come before "2", "3", ...).

-t, sets the delimiter (what separates values in your file) to , since your file is comma-separated.
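
As a quick check of these flags, the sketch below builds a small hypothetical file (sample.csv, not from the question) and sorts it on the first column. It uses -k1,1 rather than -k1 so the key stops at the first field, and sets LC_ALL=C so locale rules do not affect how numbers are parsed; both are assumptions about what you want, not part of the original command.

```shell
# Hypothetical sample data: the first column is numeric and unsorted.
printf '%s\n' '10,apple' '2,banana' '1,cherry' '11,date' > sample.csv

# -t,    use a comma as the field delimiter
# -k1,1  restrict the sort key to field 1 only
# -n     compare that key numerically
sorted=$(LC_ALL=C sort -t, -k1,1 -n sample.csv)
printf '%s\n' "$sorted"
```

With these four rows, the numeric sort puts 1 and 2 ahead of 10 and 11, which a plain lexicographic sort would not do.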

Travis
  • Probably convenient to also set the field separator `-t` to `,` ? ;-) – thom Oct 08 '14 at 04:55
  • NOTE: This approach will **fail** on valid CSV files where entries contain newlines, so one CSV entry spans multiple lines in the file. – Peter V. Mørch Oct 11 '19 at 02:38
  • To sort by only one column, you should use `-k1,1` (see https://superuser.com/questions/33362/how-to-unix-sort-by-one-column-only) – Jack Valmadre Dec 22 '20 at 16:01
  • Comment to others who'd like to use this: If your separator is a *semicolon* `;` instead of a comma, then you might want to escape it: `-t\;` – MrSnrub May 25 '21 at 02:45
  • This did not work for me; using -k1,1 did, though. Only -k1 produced something like: 1,10551 1,19163 18,5718 2,16561 – Sybille Peters Mar 05 '23 at 09:10
12

I don't know why the above solution did not work in my case:

15,5
17,2
18,6
19,4
8,25
8,90
9,47
9,49
10,67
10,90
13,96
159,9

However, this command solved my problem:

sort -t"," -k1n,1 fileName
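
To try this on the rows listed above, the following sketch writes them to a hypothetical file (rows.csv) and runs the same command with the key limited to the first field. Setting LC_ALL=C is an added assumption, used here only to keep the result independent of locale settings.

```shell
# The twelve rows shown above, saved under a hypothetical file name.
printf '%s\n' 15,5 17,2 18,6 19,4 8,25 8,90 9,47 9,49 10,67 10,90 13,96 159,9 > rows.csv

# -k1n,1: the key starts and ends at field 1 and is compared numerically.
by_first=$(LC_ALL=C sort -t"," -k1n,1 rows.csv)
printf '%s\n' "$by_first"
# The first row printed is 8,25 and the last is 159,9.
```

Rows that share the same first column (for example 8,25 and 8,90) are ordered by sort's fallback comparison of the whole line.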
David Parks
Bharthan
12

Using csvsort.

  1. Install csvkit if not already installed.

    brew install csvkit
    
  2. Sort CSV by first column.

    csvsort -c 1 original.csv > sorted.csv
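
The main reason to reach for a CSV-aware tool is quoted fields that contain newlines or commas. The sketch below only illustrates that failure mode with plain sort; the file name multi.csv and its contents are hypothetical, and LC_ALL=C is an assumption to keep the ordering reproducible.

```shell
# Hypothetical CSV: the record with id 2 has a quoted field containing a newline.
printf 'id,note\n2,"two\nlines"\n1,one\n' > multi.csv

# sort treats every physical line as a record, so the quoted field is split in two.
broken=$(LC_ALL=C sort -t, -k1,1n multi.csv)
printf '%s\n' "$broken"
# The fragment lines" ends up right after the header, separated from 2,"two.
```

A tool that parses the file as CSV, such as csvsort, keeps such a record together, at the cost of installing an extra package.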
    
Joshua Pinter
  • @fiddle The existing solutions sorted my column of numbers incorrectly, however, `csvsort` worked perfectly with the defaults. That's a useful thing to me and it might be to others. Only time will tell. – Joshua Pinter Jan 23 '19 at 21:43
  • This is the only one of the answers that handles valid entries that span multiple lines. – Peter V. Mørch Oct 11 '19 at 02:37
  • This works well when you have string values surrounded by double quotes containing commas (",") – Aldo Canepa Jun 20 '22 at 23:42