Finding it difficult to extract digits from string using sed

Question

I am trying to extract the version information from a string using sed as follows:

echo "A10.1.1-Vers8" | sed -n "s/^A\([0-9]+\)\.\([0-9]\)\.[0-9]+-.*/\1/p"

I want to extract '10' after 'A', but the above expression doesn't give the expected output. Could someone please explain why this statement doesn't work?

I tried the above command and changed the options of sed, but nothing works. I think this is a syntax error.

echo "A10.1.1-Vers10" | sed -n "s/^X\([0-9]+\)\.\([0-9]\)\.[0-9]+-.*/\1/p"

The expected result is '10'; the actual result is no output at all.

Sure, because `+` in a POSIX BRE pattern is treated as a literal `+` char. Use `sed -n "s/^A\([0-9]\{1,\}\).*/\1/p" <<< "A10.1.1-Vers10"` or `sed -n -E "s/^A([0-9]+).*/\1/p" <<< "A10.1.1-Vers10"` – Wiktor Stribiżew, Oct 17 '19 at 09:34
Possible duplicate of [sed plus sign doesn't work](https://stackoverflow.com/questions/22099623/sed-plus-sign-doesnt-work) – Wiktor Stribiżew, Oct 17 '19 at 09:36
Thanks for the information. Yes, this works. Is there any tool with which I can debug a sed command? – user2677279, Oct 17 '19 at 09:38
I know of no tool for debugging `sed` commands :( Only [`awk`](https://awk.js.org/). Just googled [a `sed` cheatsheet](http://anaturb.net/sed.htm). A generic shell [script check](https://www.shellcheck.net/). – Wiktor Stribiżew, Oct 17 '19 at 09:51

slayedbylucifer · Answer 1 · 2019-10-17T13:02:56.453

2

$ echo "A10.1.1-Vers8" | sed -r 's/^A([[:digit:]]+)\.(.*)$/\1/g'
10

Search for a string starting with A (^A), followed by one or more digits (I am using the POSIX character class [[:digit:]] with +), which are captured in a group (), followed by a literal dot \., followed by everything else (.*)$.

Finally, replace the whole thing with the content of the captured group, \1.
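The same capture-and-replace idea extends to more than one group. The following is only a sketch, assuming the input always has the shape A<major>.<minor>.<patch>-<suffix>; the variable names are made up for illustration:

```shell
version="A10.1.1-Vers8"

# Three capture groups: \1 = major, \2 = minor, \3 = patch.
# -r is used as in the command above.
parts=$(printf '%s\n' "$version" | sed -r 's/^A([0-9]+)\.([0-9]+)\.([0-9]+)-.*$/\1 \2 \3/')

echo "$parts"   # prints: 10 1 1
```

Any of the groups can be dropped from the replacement if only one component is wanted.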

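In GNU sed, -r enables extended regular expressions (ERE), where + is a quantifier and grouping parentheses need no backslashes; the man page lists it as --regexp-extended. -E is an equivalent spelling.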
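To see the difference between the two regex flavours side by side, here is a small sketch (variable names are illustrative only). It runs the same extraction three ways: basic syntax with a bare +, basic syntax with an interval expression, and extended syntax:

```shell
input="A10.1.1-Vers10"

# BRE (the default): a bare + is a literal plus sign, so nothing matches
# and sed -n prints nothing.
bre_plus=$(printf '%s\n' "$input" | sed -n 's/^A\([0-9]+\)\..*/\1/p')

# BRE with an interval expression: \{1,\} means "one or more".
bre_interval=$(printf '%s\n' "$input" | sed -n 's/^A\([0-9]\{1,\}\)\..*/\1/p')

# ERE: + is a quantifier and groups are written without backslashes.
ere_plus=$(printf '%s\n' "$input" | sed -n -E 's/^A([0-9]+)\..*/\1/p')

echo "BRE with +:      [$bre_plus]"
echo "BRE with {1,}:   [$bre_interval]"
echo "ERE with +:      [$ere_plus]"
```

The first result is empty, which is the behaviour described in the question; the other two yield 10.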

edited Oct 17 '19 at 13:02

answered Oct 17 '19 at 12:57

slayedbylucifer


Is that `/g` option needed? – Jon Oct 17 '19 at 13:33
g doesn't appear needed. Could also simply remove the A and everything after the period. `sed 's/A//;s/\..*//'` – stevesliva Oct 17 '19 at 13:43
Is there any way to get verbose or debug output from sed, so that I can see how sed treats the pattern and where things go wrong? – user2677279 Dec 05 '19 at 06:46

score 1 · Answer 2 · answered Oct 17 '19 at 13:40

GNU grep is an alternative to sed:

$ echo "A10.1.1-Vers10" | grep -oP '(?<=^A)[0-9]+'
10

The -o option tells grep to print only the matched characters.

The -P option tells grep to match Perl regular expressions, which enables the (?<= lookbehind zero-length assertion.

The lookbehind assertion (?<=^A) ensures there is an A at the beginning of the line, but doesn't include it as part of the match for output.
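Where -P is not available (it depends on grep being built with PCRE support), a similar result can be approximated without a lookbehind. This is a sketch, not a drop-in replacement: it lets grep -oE print the match including the leading A and then strips that first character with cut. The variable names are illustrative.

```shell
input="A10.1.1-Vers10"

# Without a lookbehind, -o prints the whole match, leading A included.
with_prefix=$(printf '%s\n' "$input" | grep -oE '^A[0-9]+')

# Drop the first character to keep only the digits.
major=$(printf '%s\n' "$with_prefix" | cut -c2-)

echo "$with_prefix"   # prints: A10
echo "$major"         # prints: 10
```

This also shows what the lookbehind buys you: the A is checked but never becomes part of the output.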

If you need to match more of the version string, you can use a lookahead assertion:

$ echo "A10.1.1-Vers10" | grep -oP '(?<=^A)[0-9]+(?=\.[0-9]+\.[0-9]+-.*)'
10
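One consequence of adding the trailing assertion is that input without the full version shape produces no output at all. A short sketch, assuming a GNU grep with -P support (the second input string is a made-up malformed example):

```shell
pattern='(?<=^A)[0-9]+(?=\.[0-9]+\.[0-9]+-.*)'

# Full major.minor.patch-suffix shape: the trailing assertion succeeds.
ok=$(printf '%s\n' "A10.1.1-Vers10" | grep -oP "$pattern" || true)

# Patch component missing: the assertion fails, so grep prints nothing
# and exits with status 1 (hence the "|| true").
bad=$(printf '%s\n' "A10.1-Vers10" | grep -oP "$pattern" || true)

echo "well-formed: [$ok]"
echo "malformed:   [$bad]"
```

The empty result for the second input can be used as a cheap validity check before trusting the extracted number.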
