XPath 1.0, which is the version supported by DOMXPath, has no regex functionality. You can, however, register a PHP function that runs a regular expression and call it from DOMXPath if you need one, as mentioned in this other answer.
There is a pure XPath 1.0 way to test whether a value is a number. Applied to the part of the href attribute value after the / character, it tests whether the attribute value follows the pattern /digits:
//a[number(substring-after(@href,'/')) = substring-after(@href,'/')]
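As a minimal sketch of how the expression above behaves, the snippet below runs it through DOMXPath against some sample markup. The markup and variable names are made up for illustration and are not from the question.

```php
<?php
// Hypothetical sample markup, used only to exercise the expression.
$html = <<<XML
<html>
<body>
<a href="/123">numeric</a>
<a href="/abc">not numeric</a>
<a href="/45x">mixed</a>
<a href="/1.5">decimal</a>
</body>
</html>
XML;

$doc = new DOMDocument;
$doc->loadXML($html);
$xpath = new DOMXPath($doc);

// number() yields NaN for a non-numeric string, and NaN is never equal to
// anything, so only links whose href tail parses as a number pass the predicate.
$expr = "//a[number(substring-after(@href,'/')) = substring-after(@href,'/')]";

$hrefs = [];
foreach ($xpath->query($expr) as $a) {
    $hrefs[] = $a->getAttribute('href');
}

print_r($hrefs);
```

Note that number() accepts any XPath number, so a value such as /1.5 also passes. If strictly digits are required, the regex approach is the tighter check.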
UPDATE: For the sake of completeness, here is a working example of calling the PHP function preg_match from DOMXPath::query() to accomplish the same task:
$raw_data = <<<XML
<html>
<body>
<div id="diva">
<a href="/123" >text2</a>
</div>
<div id="divb">
<a href="/345" >text1</a>
<a href="/678" >text2</a>
</div>
</body>
</html>
XML;
$doc = new DOMDocument;
$doc->loadXML($raw_data);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace("php", "http://php.net/xpath");
$xpath->registerPHPFunctions("preg_match");
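// The arguments passed to php:function below are:
// argument 1: the PHP function name
// argument 2: the PHP function's 1st parameter, the pattern
// argument 3: the PHP function's 2nd parameter, the string
// preg_match() returns an int rather than a boolean, and a non-boolean result may
// reach XPath as a string (where even "0" is truthy), so compare with 1 explicitly.
$gm = $xpath->query("//a[php:function('preg_match', '~^/\d+$~', string(@href)) = 1]");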
foreach ($gm as $a) {
    echo $a->getAttribute("href") . "\n";
}
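The "write your own PHP function" route mentioned at the top could look roughly like the sketch below. The helper name href_is_slash_digits and the sample markup are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper (not part of the original answer): returns true when
// the given href is a slash followed only by digits.
function href_is_slash_digits($href)
{
    return preg_match('~^/\d+$~', $href) === 1;
}

// Hypothetical sample markup with one link that should not match.
$doc = new DOMDocument;
$doc->loadXML('<div><a href="/123">a</a><a href="/abc">b</a><a href="/678">c</a></div>');

$xpath = new DOMXPath($doc);
$xpath->registerNamespace("php", "http://php.net/xpath");
// Allow only this one function to be called from XPath expressions.
$xpath->registerPHPFunctions("href_is_slash_digits");

// string(@href) passes the attribute value as a plain string instead of an
// array of attribute nodes.
$links = $xpath->query("//a[php:function('href_is_slash_digits', string(@href))]");

$matched = [];
foreach ($links as $a) {
    $matched[] = $a->getAttribute("href");
}

print_r($matched);
```

Returning a real boolean from the helper means the predicate does not depend on how an integer return value is converted on the XPath side.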