http://www.example.com/product/9896341.html?utm_source=google&utm_medium=VRM&utm_campaign=N&cid=vizuryjz&utm_content=&color=red&pid=9896341

In the URL above, I need to remove the utm_source parameter together with its value. For example, if the URL contains utm_source=text, I need to replace "utm_source=text" with an empty string.

Please help me with a regular expression for this.

James Donnelly
chethi

2 Answers

Instead of gawk, I would recommend using GNU sed for this:

$ s="http://www.example.com/product/9896341.html?utm_source=google&utm_medium=VRM&utm_campaign=N&cid=vizuryjz&utm_content=&color=red&pid=9896341"
$ sed -r 's/utm_source=[^&]+//' <<<"$s"
http://www.example.com/product/9896341.html?&utm_medium=VRM&utm_campaign=N&cid=vizuryjz&utm_content=&color=red&pid=9896341

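This deletes utm_source= followed by everything up to (but not including) the next ampersand, which is why the result still contains "?&". Because the pattern uses +, an empty value (utm_source= with nothing after it) is left untouched.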
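
If the leftover separator matters, the same idea can be extended to consume it as well. The following is only a sketch: the function name strip_utm_source is made up for illustration, and it assumes the URL has no #fragment and contains utm_source at most once.

```shell
# Hypothetical helper (not from the original answer): remove utm_source=<value>
# together with the ampersand that follows it, then drop a dangling "?" or "&"
# that may be left at the very end of the URL.
strip_utm_source() {
    printf '%s\n' "$1" | sed -E \
        -e 's/utm_source=[^&]*&?//' \
        -e 's/[?&]$//'
}

strip_utm_source "http://www.example.com/product/9896341.html?utm_source=google&utm_medium=VRM&utm_campaign=N&cid=vizuryjz&utm_content=&color=red&pid=9896341"
# prints http://www.example.com/product/9896341.html?utm_medium=VRM&utm_campaign=N&cid=vizuryjz&utm_content=&color=red&pid=9896341
```

Using * instead of + also removes the parameter when its value is empty.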

Tom Fenech

You can use this regex:

    utm_source=[^&?=]*

JavaScript

    your_url.replace(/utm_source=[^&?=]*/gi,"")

sed

    echo "http://www.example.com/product/9896341.html?utm_source=google&utm_medium=VRM&utm_campaign=N&cid=vizuryjz&utm_content=&color=red&pid=9896341" | sed 's/utm_source=[^&?=]*//g'
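
One caveat: the unanchored pattern also matches inside longer parameter names such as my_utm_source=. A possible variant, given here only as a sketch, anchors the match to the preceding "?" or "&". The function name strip_param is hypothetical, and the sketch assumes the parameter name contains no regex metacharacters, the URL has no #fragment, and only the first occurrence needs removing.

```shell
# Hypothetical helper: strip_param NAME URL
# Removes the first query parameter called NAME, keeping the "?" or "&"
# that preceded it, and then drops a separator left dangling at the end.
strip_param() {
    name=$1
    url=$2
    printf '%s\n' "$url" | sed -E \
        -e 's/([?&])'"$name"'=[^&]*&?/\1/' \
        -e 's/[?&]$//'
}

strip_param utm_source "http://www.example.com/p.html?my_utm_source=x&utm_source=google&color=red"
# prints http://www.example.com/p.html?my_utm_source=x&color=red
```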
alexanderlz