I'm still learning, and regex is mind-numbing stuff. But I have a working regex for preg_match in PHP that picks out product prices following a currency marker (£ or GBP). It may be helpful to others, as I couldn't find a working example that handles all the variants (thousands separators, decimals, etc.). Any improvements to the regex are welcome!
My question: why does the resulting array contain three instances of every number? And what is the meaning of the "2" that follows each one?
(?<=\£|GBP)((\d{1,6}(,\d{3})*)|(\d+))(\.\d{2})?
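To show which variants the pattern picks up, here is a minimal sketch that runs it over a few strings. The helper name first_price() and the sample strings are made up for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper (not part of the scraper): returns the first price
// found in $text, or null when the pattern does not match.
function first_price(string $text): ?string
{
    // The pattern from the post, unchanged.
    $pattern = '/(?<=\£|GBP)((\d{1,6}(,\d{3})*)|(\d+))(\.\d{2})?/';

    // $m[0] holds the whole match; the later entries are the capture groups.
    return preg_match($pattern, $text, $m) ? $m[0] : null;
}

// Made-up sample strings, not taken from any real page.
foreach (['£545', 'GBP1,234.56', 'Now only £19.99!', 'no price here'] as $sample) {
    echo $sample, ' => ', first_price($sample) ?? '(no match)', PHP_EOL;
}
```

The lookbehind means the currency marker itself is not part of the match, so only the digits, commas and decimals come back.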
Function:
function website($url) {
    // Bind the global before it is used, not after the first assignment.
    global $website_prices;
    $xml = new DOMDocument();
    // @ suppresses the warnings loadHTMLFile() raises on malformed HTML.
    if (@$xml->loadHTMLFile($url)) {
        $xpath = new DOMXPath($xml);
        // Select every text node in the document.
        $textNodes = $xpath->query('//text()');
        foreach ($textNodes as $textNode) {
            if (preg_match('/(?<=\£|GBP)((\d{1,6}(,\d{3})*)|(\d+))(\.\d{2})?/', $textNode->nodeValue, $matches, PREG_OFFSET_CAPTURE)) {
                $website_prices[] = $matches;
            }
        }
    }
}
print_r is dumping:
[3] => Array
    (
        [0] => Array
            (
                [0] => 545
                [1] => 2
            )
        [1] => Array
            (
                [0] => 545
                [1] => 2
            )
        [2] => Array
            (
                [0] => 545
                [1] => 2
            )
    )
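For reference, a minimal sketch that reproduces this shape outside the scraper. It assumes the text node was something like '£545' (a made-up stand-in) and that the file is saved as UTF-8, where "£" takes two bytes. As far as I can tell from the PHP manual, PREG_OFFSET_CAPTURE turns every entry into a [matched text, byte offset] pair, and entries 1 and up are the parenthesised groups rather than extra matches. The second pattern, $flat, is my own variation using non-capturing groups, not something from the original code.

```php
<?php
// Same pattern as in the post. Assumes UTF-8 source, so "£" is two bytes.
$pattern = '/(?<=\£|GBP)((\d{1,6}(,\d{3})*)|(\d+))(\.\d{2})?/';

// Hypothetical text node content standing in for what the page returned.
$text = '£545';

preg_match($pattern, $text, $matches, PREG_OFFSET_CAPTURE);

// Each entry is [matched text, byte offset into $text].
// $matches[0] is the whole match; $matches[1], [2], ... are the
// capture groups, numbered by the position of their opening bracket.
foreach ($matches as $index => [$matchedText, $byteOffset]) {
    printf("entry %d: '%s' at byte %d\n", $index, $matchedText, $byteOffset);
}

// Variation (an assumption, not the original): no offset flag and
// non-capturing groups (?:...), so only the whole match is stored.
$flat = '/(?<=\£|GBP)(?:\d{1,6}(?:,\d{3})*|\d+)(?:\.\d{2})?/';
preg_match($flat, $text, $flatMatches);
print_r($flatMatches);
```

With this input the groups that did not take part in the match (the comma group, the \d+ alternative and the decimals) sit at the end and are left out of the array, which would account for seeing three entries instead of six.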