
I am new to sed. I am trying to use it to clean up `uniq -c` output by removing the spaces before the counts, so that I can then convert the result to a usable .tsv.

The furthest I have gotten is to use:

$ sed 's|\([0-9].*$\)|\1|' comp-c.csv

With the input:

     8 Delayed speech and language development
    15 Developmental Delay and additional significant developmental and morphological phenotypes referred for genetic testing
     4 Developmental delay AND/OR other significant developmental or morphological phenotypes
     1 Diaphragmatic eventration
     3 Downslanted palpebral fissures

The output is identical to the input. The pattern does recognise the first number (I have tested this with a simple substitution), but for some reason the whitespace before it still ends up in the output.

To clarify, I would like to remove all spaces before the numbers. Hardcoding a fixed-width trim will not work, because some lines contain two- or three-digit numbers and so have a different amount of whitespace before the number.

Bonus points for some way to produce a usable uniq -c result without this faffing around with blank space.

janos
  • `s|\([0-9].*$\)|\1|` says `find the string from the first digit to the end of the line ([0-9].*$) and replace it with itself (\1)` which is why your output is identical to your input - it's not changing anything. I'd guess you were probably trying to write `s|^[^0-9]*\([0-9].*$\)|\1|` but [@janos's answer](https://stackoverflow.com/a/46695051/1745001) makes more sense. – Ed Morton Oct 12 '17 at 13:12

1 Answer


It's all about writing the correct regex:

sed 's/^ *//' comp-c.csv

That is, replace zero or more spaces at the start of lines (as many as there are) with nothing.
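
As a quick check, the same substitution can be run on lines shaped like `uniq -c` output. This is only a sketch: the two sample lines are copied from the question, and `strip_leading_spaces` is a hypothetical wrapper name, not part of sed.

```shell
# Hypothetical helper: remove every leading space from each line on stdin.
strip_leading_spaces() {
    sed 's/^ *//'
}

# Two sample lines from the question, padded with different numbers of spaces.
printf '%s\n' \
    '     8 Delayed speech and language development' \
    '    15 Developmental Delay and additional significant developmental and morphological phenotypes referred for genetic testing' |
    strip_leading_spaces
```

Because `^ *` matches however many spaces happen to be there, both lines come out starting at the count, regardless of how many digits it has.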

> Bonus points for some way to produce a usable uniq -c result without this faffing around with blank space.

The uniq command doesn't have a flag to print its output without the leading blanks, so there's no other way than to strip them yourself.
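
Since the stripping has to be done by hand anyway, one option is to chain a second substitution that turns the first remaining space (the one right after the count) into a tab, which gives a TSV in one pass. A minimal sketch, assuming a POSIX shell; `counts_to_tsv` is a hypothetical name, and the tab is produced with `printf` because `\t` in a sed replacement is not portable across sed implementations.

```shell
# Hypothetical helper: read raw lines on stdin, print "count<TAB>line" on stdout.
counts_to_tsv() {
    tab=$(printf '\t')    # a literal tab character
    # uniq -c only merges adjacent duplicates, so the input is sorted first.
    sort | uniq -c |
        sed -e 's/^ *//' -e "s/ /$tab/"
}

# Example with three made-up input lines, two of them identical.
printf '%s\n' 'b item' 'a item' 'b item' | counts_to_tsv
```

The second expression has no `g` flag, so only the first space on each line is replaced and any spaces inside the text itself are left alone.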

janos
  • Works perfectly, thank you very much. Anyone looking for a usable `uniq -c` result can use: `sort "$FILE" | uniq -c | sed -e 's/^ *//' -e 's/ /<TAB>/' > "$FILE".tsv`, where `<TAB>` is a literal tab character. – Rob Sellers Oct 12 '17 at 19:33