13

I am trying to extract digits with sed:

echo hgdfjg678gfdg kjg45nn | sed 's/.*\([0-9]\+\).*/\1/g'

but the result is: 5. How can I extract both 678 and 45? Thanks in advance!

Ned

4 Answers

22

The problem is that the . in .* matches digits as well as non-digits, and .* is greedy: it keeps matching as long as it can, that is, as long as one digit is left unconsumed for [0-9]\+ to match.
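A small sketch makes the greedy match visible by wrapping each part of the pattern in its own group and printing what it captured. The variable names are only for illustration, and `[0-9][0-9]*` is used in place of `[0-9]\+` so the command does not depend on GNU sed.

```shell
input='hgdfjg678gfdg kjg45nn'

# Wrap the three parts of the pattern in groups and print each capture
# in brackets: leading .*, the digit run, trailing .*
parts=$(printf '%s\n' "$input" |
    sed 's/\(.*\)\([0-9][0-9]*\)\(.*\)/[\1][\2][\3]/')

echo "$parts"   # prints [hgdfjg678gfdg kjg4][5][nn]
```

The leading `.*` takes everything up to and including the `4`, so only the final `5` is left for the digit group.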

Instead of extracting digits, just delete non-digits:

echo hgdfjg678gfdg kjg45nn | sed 's/[^0-9]//g'

or even

echo hgdfjg678gfdg kjg45nn | tr -d -c 0-9
hmakholm left over Monica
  • 3
    In sed, you probably want to replace non-digits with a single space, so that the original groupings of digits are maintained. – glenn jackman Sep 12 '11 at 19:43
  • Note that `sed 's/[^0-9]//g'` does not remove newline characters (important when filtering multiline strings), whereas `tr -d -c 0-9` does. – demmonico Nov 11 '22 at 11:11
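The two comments above can be sketched as follows. This is only an illustration under assumed sample inputs: the variable names and the multiline test string are made up for the example, and the trimming of the leading and trailing space is an extra step not mentioned in the comments.

```shell
input='hgdfjg678gfdg kjg45nn'

# Replace each run of non-digits with one space, then trim the ends,
# so the digit groups stay separated.
spaced=$(printf '%s\n' "$input" |
    sed 's/[^0-9][^0-9]*/ /g; s/^ //; s/ $//')
echo "$spaced"      # prints 678 45

# Multiline input: sed works line by line and keeps the newlines,
# while tr -d -c 0-9 deletes them along with everything else.
multi='a1b
c22d'
per_line=$(printf '%s\n' "$multi" | sed 's/[^0-9]//g')
joined=$(printf '%s\n' "$multi" | tr -d -c '0-9')
# Adding \n to the kept set makes tr preserve line breaks as well.
kept=$(printf '%s\n' "$multi" | tr -d -c '0-9\n')

echo "$joined"      # prints 122
```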
9

You can use grep with the -o option for this:

$ echo hgdfjg678gfdg kjg45nn | grep -E -o "[0-9]+"
678
45
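Because grep -o prints one match per line, the results are easy to consume in a loop. A minimal sketch, assuming the matches contain no whitespace (true for digit runs); the names `count` and `n` are only illustrative:

```shell
input='hgdfjg678gfdg kjg45nn'

count=0
# Word splitting on the command substitution yields one number per iteration.
for n in $(printf '%s\n' "$input" | grep -E -o '[0-9]+'); do
    count=$((count + 1))
    echo "match $count: $n"
done
```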
Victor Vasiliev
3

Or use tr:

$ echo hgdfjg678gfdg kjg45nn | tr -d 'a-z'
678 45
Fredrik Pihl
2

.* in sed is greedy, and there is no non-greedy option, AFAIK.
(You can use [^0-9]* in place of .* to get the effect of non-greedy matching here. But this works only once, so you will get only 678, without 45.)
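As a quick check of that claim, here is a sketch of the [^0-9]* variant. It uses `[0-9][0-9]*` rather than `[0-9]\+` so it does not rely on GNU sed, and the variable names are only for illustration.

```shell
input='hgdfjg678gfdg kjg45nn'

# [^0-9]* cannot run past the first digit, so the group captures the
# first full run; the trailing .* then consumes the rest of the line,
# including 45, which is why only one number comes out.
first_run=$(printf '%s\n' "$input" |
    sed 's/[^0-9]*\([0-9][0-9]*\).*/\1/')

echo "$first_run"   # prints 678
```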

If you must use only sed, getting the result is not easy.
I recommend using GNU grep:

$ echo hgdfjg678gfdg kjg45nn | grep -oP '\d+'
678
45

If you really want to stick to sed, this would be one of many possible answers.

$ echo hgdfjg678gfdg kjg45nn | \
sed -e 's/\([0-9]\)\([^0-9]\)/\1\n\2/g' | \
sed -n 's/[^0-9]*\([0-9]\+\).*/\1/p'
678
45
plhn