Use sed to grab a string

Question

I'm using curl to fetch the HTML from a site, and I need one specific string from it: the text between 'standards.xml?revision=' and '&amp'. I'm using sed for this, but I can't get the regex right and need some help.

curl website.com | sed -r 's|.*standards\.xml\?revision=([0-9]+).*|\1|'

The output I'm getting is the full html--any help would be appreciated.

you should use grep... – Gregory Pakosz, Oct 30 '13 at 17:13
How would I use grep for this? – cakes88, Oct 30 '13 at 17:14

jkshah · Accepted Answer · 2013-10-30T17:28:34.300

5

You're almost there. Use the -n option so that sed does not print every line automatically, and add the p flag to s||| so that only the lines where the substitution succeeded are printed:

curl website.com | sed -n -r 's|.*standards\.xml\?revision=([0-9]+).*|\1|p'
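To see the effect of both pieces, here is a minimal sketch that stands in for the curl output with a made-up three-line HTML sample (the sample markup and the revision number 12345 are assumptions for illustration only). Without -n, every input line is printed whether or not it matched, which is why the original command returned the full HTML.

```shell
# Hypothetical sample standing in for the output of curl.
sample='<html><body>
<a href="standards.xml?revision=12345&amp;view=markup">standards</a>
</body></html>'

# Without -n: sed prints all lines, matched or not.
unfiltered_lines=$(printf '%s\n' "$sample" \
  | sed -r 's|.*standards\.xml\?revision=([0-9]+).*|\1|' | wc -l)

# With -n and the p flag: only lines where the substitution happened are printed.
result=$(printf '%s\n' "$sample" \
  | sed -n -r 's|.*standards\.xml\?revision=([0-9]+).*|\1|p')

echo "lines without -n: $unfiltered_lines"
echo "result with -n and p: $result"
```

GNU sed also accepts -E as a synonym for -r, which may be the safer spelling on systems that do not ship GNU sed.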

edited Oct 30 '13 at 17:28

answered Oct 30 '13 at 17:22

jkshah

@Konnor Welcome! It seems you're new to this site. If any answer works for you, consider accepting it by clicking the hollow green tick mark beside that answer. P.S. I noticed you haven't accepted any of your 3 answers. – jkshah, Oct 30 '13 at 18:47

score 2 · Answer 2 · answered Oct 30 '13 at 17:16

You can use grep -oP (-o prints only the matched text, -P enables PCRE):

grep -oP 'standards\.xml\?revision=\K[0-9]+'

\K resets the start of the reported match, so only the part that follows it, [0-9]+, is returned.
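As a minimal sketch of how this fits into a pipeline, the snippet below feeds a made-up HTML line to grep instead of real curl output (the sample line and the revision number are assumptions). It also assumes a grep built with PCRE support, such as GNU grep, since -P is not available everywhere.

```shell
# Hypothetical line standing in for part of the curl output.
line='<a href="standards.xml?revision=12345&amp;view=markup">standards</a>'

# Everything before \K must match but is dropped from the reported match.
revision=$(printf '%s\n' "$line" \
  | grep -oP 'standards\.xml\?revision=\K[0-9]+')

echo "$revision"
```

If the pattern occurs more than once in the page, -o prints each match on its own line; piping through head -n 1 keeps only the first.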

anubhava

score 1 · Answer 3 · answered Oct 30 '13 at 17:46

curl website.com | sed -n -r '/xml/ {s|.*standards\.xml\?revision=([^&]+).*|\1|p;q;}'

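Building on the previous sed answer: [0-9]+ only works if the revision consists of digits, so [^&]+ may be more appropriate, since it captures everything up to the next &. Using single quotes and | as the delimiter is a good way to avoid problems with backslashes, so I kept that :-)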
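To illustrate the difference between the two character classes, here is a small sketch using a made-up line whose revision is not purely numeric (the value r12a and the sample markup are assumptions for illustration). Extended regular expressions are enabled with -r so that the unescaped ( ) and + act as grouping and repetition.

```shell
# Hypothetical line with a non-numeric revision value.
line='<a href="standards.xml?revision=r12a&amp;view=markup">standards</a>'

# [0-9]+ needs a digit right after "revision=", so nothing matches here.
digits_only=$(printf '%s\n' "$line" \
  | sed -n -r 's|.*standards\.xml\?revision=([0-9]+).*|\1|p')

# [^&]+ captures everything up to the next ampersand.
up_to_amp=$(printf '%s\n' "$line" \
  | sed -n -r 's|.*standards\.xml\?revision=([^&]+).*|\1|p')

echo "digits only: '$digits_only'"
echo "up to ampersand: '$up_to_amp'"
```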

NeronLeVelu
