I have the HTML content below, and I want to extract the text inside every anchor tag:

<div class="row mb-xlg"><div class="col-md-12">
<div class="heading heading-border heading-middle-border"><h3>Compatible Models</h3></div>
<div class="row show-grid">
<div class="col-md-4"><a href="/model/SFSPC19S80/_/_/Sanyo/PC19S80/" title="Sanyo PC19S80 Remote Control (Pc-27s80)">PC19S80</a></div>
<div class="col-md-4"><a href="/model/SFSPC25580/_/_/Sanyo/PC25580/" title="Sanyo PC25580 Remote Control (Pc-27s80)">PC25580</a></div>
<div class="col-md-4"><a href="/model/SFSPC25S80/_/_/Sanyo/PC25S80/" title="Sanyo PC25S80 Remote Control (Pc-27s80)">PC25S80</a></div>
<div class="col-md-4"><a href="/model/SFSPC27S80/_/_/Sanyo/PC27S80/" title="Sanyo PC27S80 Remote Control (Pc-27s80)">PC27S80</a></div>
</div></div></div>

I have the following regular expression, which returns the text inside every anchor tag:

<a[^>]*>([^<]+)<\/a>+

Tested on this website

Result:

Full match  `<a href="/model/SFSPC25580/_/_/Sanyo/PC25580/" title="Sanyo PC25580 Remote Control (Pc-27s80)">PC25580</a>`
Group 1.    `PC25580`
Match 3
Full match  `<a href="/model/SFSPC25S80/_/_/Sanyo/PC25S80/" title="Sanyo PC25S80 Remote Control (Pc-27s80)">PC25S80</a>`
Group 1.    `PC25S80`
Match 4
Full match  `<a href="/model/SFSPC27S80/_/_/Sanyo/PC27S80/" title="Sanyo PC27S80 Remote Control (Pc-27s80)">PC27S80</a>`
Group 1.    `PC27S80`

But I want to add a condition on the words "Compatible Models", like this:

<h3>Compatible Models<\/h3>.*?<a[^>]*>([^<]+)<\/a>+

With this condition it returns only the first anchor tag. How can I get the text of all the anchor tags and store it in an array?

John

1 Answer

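Don't use regular expressions for this; use a DOM parser instead. Your second pattern begins with `<h3>Compatible Models<\/h3>`, which occurs only once in the document, so the whole pattern can match only once and you get only the first anchor.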
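
The question doesn't say which language is in use, so here is a minimal sketch in Python, assuming the standard library is all that's available. Note that `html.parser` is an event-based parser rather than a full DOM, swapped in here only to keep the example dependency-free. The class name `SectionLinkParser` and the variable `HTML` are made up for illustration, and the markup is the question's snippet with the `title` attributes trimmed. The parser starts collecting anchor text once it has seen an `<h3>` whose text equals the requested heading, and stops again if a different `<h3>` follows.

```python
from html.parser import HTMLParser

# Markup from the question, with the title attributes trimmed for brevity.
HTML = """
<div class="row mb-xlg"><div class="col-md-12">
<div class="heading heading-border heading-middle-border"><h3>Compatible Models</h3></div>
<div class="row show-grid">
<div class="col-md-4"><a href="/model/SFSPC19S80/_/_/Sanyo/PC19S80/">PC19S80</a></div>
<div class="col-md-4"><a href="/model/SFSPC25580/_/_/Sanyo/PC25580/">PC25580</a></div>
<div class="col-md-4"><a href="/model/SFSPC25S80/_/_/Sanyo/PC25S80/">PC25S80</a></div>
<div class="col-md-4"><a href="/model/SFSPC27S80/_/_/Sanyo/PC27S80/">PC27S80</a></div>
</div></div></div>
"""


class SectionLinkParser(HTMLParser):
    """Hypothetical helper: collect the text of <a> tags that follow a given <h3>."""

    def __init__(self, heading):
        super().__init__()
        self.heading = heading
        self.links = []            # collected anchor texts, in document order
        self._in_h3 = False
        self._in_a = False
        self._collecting = False   # True after the wanted <h3> has been seen
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._in_h3 = True
            self._buf = []
        elif tag == "a" and self._collecting:
            self._in_a = True
            self._buf = []

    def handle_data(self, data):
        # Only buffer text while inside an <h3> or a wanted <a>.
        if self._in_h3 or self._in_a:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "h3" and self._in_h3:
            self._in_h3 = False
            # Each <h3> switches collecting on or off depending on its text.
            self._collecting = "".join(self._buf).strip() == self.heading
        elif tag == "a" and self._in_a:
            self._in_a = False
            self.links.append("".join(self._buf).strip())


parser = SectionLinkParser("Compatible Models")
parser.feed(HTML)
parser.close()
print(parser.links)  # ['PC19S80', 'PC25580', 'PC25S80', 'PC27S80']
```

The result is an ordinary list, which covers the "store in an array" part. In other languages the same idea applies: find the heading node first, then walk the anchors that come after it, instead of trying to express both steps in one pattern.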

The next link contains an excellent answer on why you shouldn't use regex:

smiggle