I have strings in Ruby that contain text like the examples given below.

ink = "<a href=\"https://www.abc.gov/asd/asdfg/bill/bill-a\">H.R.11461</a>"
ink = "<a href=\"https://www.abc.gov/asd/asdfg/bill/bill-b\">H.R.11461</a>"
ink = "<img class='image-small' src=\"https://www.abc.gov/asd/asdfg/bill/bill-a\">"
ink = "<img class='image-large' src=\"https://www.abc.gov/asd/asdfg/bill/bill-b\" id='image'>"
ink = "<iframe class='ifram' src=\"http://www.xyzabc.com\"></iframe>"
ink = "<iframe src=\"http://www.xyzabc.com\" id='fram-1'></iframe>"

I want to match only the source URL of each image, that is, the value of the src attribute of the img tag, using a regular expression.

The output should look like this:

https://www.abc.gov/asd/asdfg/bill/bill-a
https://www.abc.gov/asd/asdfg/bill/bill-b

Is there any way to get the expected output with a regex in Ruby?

Hiren Bhalani
    You shouldn't use a regexp to parse HTML (see [this answer to understand why](http://stackoverflow.com/a/1732454/2483313)). Use an XML and HTML library like [Nokogiri](http://www.nokogiri.org/). – spickermann Jun 23 '16 at 07:16

1 Answer

You can use an expression like this one:
<img.*?src=\\"(.+?)\\"

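To get all matches across several tags, the regex has to be applied globally. On regex101 that is the global modifier; Ruby has no such flag, and String#scan returns every match instead. If you are testing one string at a time, this shouldn't be necessary.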

Example at regex101
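
As a rough sketch of how this could look in Ruby itself: the strings in the question are double-quoted literals, so `\"` is just a plain `"` at runtime and the pattern does not need to match a backslash there. The version below is a variation on the expression above, not the exact same regex: it assumes the src value is wrapped in single or double quotes, and it requires whitespace before `src` so that attributes such as `data-src` are skipped. The method name `img_sources` and the `links` array are made up for illustration.

```ruby
# Sample inputs taken from the question. Inside a double-quoted Ruby
# string, \" is a plain " character, so the values hold no backslashes.
links = [
  "<a href=\"https://www.abc.gov/asd/asdfg/bill/bill-a\">H.R.11461</a>",
  "<img class='image-small' src=\"https://www.abc.gov/asd/asdfg/bill/bill-a\">",
  "<img class='image-large' src=\"https://www.abc.gov/asd/asdfg/bill/bill-b\" id='image'>",
  "<iframe class='ifram' src=\"http://www.xyzabc.com\"></iframe>"
]

# Hypothetical helper: returns the src value of every <img> tag in html.
# String#scan finds all matches; with one capture group it returns an
# array of one-element arrays, which flatten turns into a flat list.
def img_sources(html)
  html.scan(/<img[^>]*?\ssrc=["']([^"']+)["']/i).flatten
end

image_urls = links.flat_map { |link| img_sources(link) }
puts image_urls
```

For anything beyond simple, predictable markup like this, an HTML parser such as Nokogiri (as suggested in the comment on the question) is likely to be more robust than a regex.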

Niitaku