
We get XML output like this:

      <phone>
        <countryCode>+45</countryCode>
        <areaCode>354</areaCode>
        <subscriberNumber>1631616</subscriberNumber>
      </phone>

How can I convert this into a whitespace-separated string, +45 354 1631616, as the result of preg_match_all?

I have an application that can only use preg_match_all (because of some other dependencies).

I tried the following:

/\<phone\>\s+\<\D+\>(\+\d+)\<\/\D+\>\s+\<\D+\>(\d+)\<\/\D+\>\s+\<\D+\>(\d+)\<\/\D+\>\s+\<\/phone\>/i

This produces 3 results:

Group 1: +45
Group 2: 354
Group 3: 1631616

because of the 3 pairs of parentheses '()' (I think we are talking about backreferences?).

But I need a result with only one group:

Group 1: +45 354 1631616

Separated by whitespace. Is this possible with preg_match_all (not preg_replace)? If it is not possible with the whitespace, that is also OK.

Thanks for your thoughts.

Sterconium
Marco A.
  • Why *must* you use a regex? Can't you parse each string individually? Also, you could simply use a regex that looks for digit groups (regardless of XML tag validation). – Sterconium Oct 04 '19 at 09:27
  • Thank you! But it is an app written in PHP, where I cannot change the program code. I only have the option to insert a regex into some fields to filter information from the XML result. – Marco A. Oct 04 '19 at 14:51

2 Answers


With preg_match() or preg_match_all() you cannot create a capturing group that contains only the phone number, because a group always captures one contiguous piece of the input and here the numbers are separated by tags. I think the best solution is the one you already wrote, with three separate groups.
But if you already have the <phone> block, you can use the strip_tags() function to get only the phone number (plus any whitespace characters between the tags).
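
A minimal sketch of the strip_tags() idea. It assumes the <phone> block is already available in a variable (the name $block is made up for this example) and that you can run your own PHP code on it, which may not be the case in the asker's application:

```php
<?php
// Hypothetical input: the <phone> block from the question.
$block = <<<'XML'
<phone>
  <countryCode>+45</countryCode>
  <areaCode>354</areaCode>
  <subscriberNumber>1631616</subscriberNumber>
</phone>
XML;

// strip_tags() removes the tags but keeps the whitespace between them.
$text = strip_tags($block);

// Collapse every run of whitespace into one space and trim both ends.
$phone = trim(preg_replace('/\s+/', ' ', $text));

var_dump($phone); // string(15) "+45 354 1631616"
```

The whitespace cleanup is still a preg_replace() call, so this only helps where a replacement step is available.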

hNczy
  • Thank you! But it is an app written in PHP, where I cannot change the program code. I only have the option to insert a regex into some fields to filter information from the XML result. – Marco A. Oct 04 '19 at 14:52
  • Then I don't know any technique to get a single group with the full phone number (and only the phone number), because any group that contains all the numbers also contains the XML tags between them. :( – hNczy Oct 04 '19 at 16:08

You're asking how to configure a specific piece of software, just without telling us which one. That is not a (PHP) programming question.

The difference between preg_match() and preg_match_all() is that the first will only return one match while the other returns all matches. A match includes the full match and all captured groups.
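
As a rough illustration of that difference, here is a small sketch. The shortened input string and the pattern are assumptions made for this example, not taken from the asker's application:

```php
<?php
// Shortened sample input, based on the XML in the question.
$data = '<countryCode>+45</countryCode><areaCode>354</areaCode>';

// One capturing group: the digits (with an optional +) between ">" and "</".
$pattern = '~>(\+?\d+)</~';

// preg_match() stops after the first match.
// $first[0] is the full match, $first[1] is group 1.
preg_match($pattern, $data, $first);
var_dump($first);

// preg_match_all() collects every match. With the default flag
// (PREG_PATTERN_ORDER), $all[0] holds all full matches and
// $all[1] holds all values of group 1.
preg_match_all($pattern, $data, $all);
var_dump($all[1]);
```

Each number still ends up as its own array entry; joining them into one string has to happen outside the regex match.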

If I understand correctly you would like to aggregate several captured groups into a single string result. That is very easy with PHP, but not possible with the RegEx match.

Here is a PHP example:

$data = <<<'XML'
<phone>
  <countryCode>+45</countryCode>
  <areaCode>354</areaCode>
  <subscriberNumber>1631616</subscriberNumber>
</phone>
XML;

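// The outer ( and ) act as the pattern delimiters, not as a capturing group,
// so the three numbers are groups 1 to 3.
$pattern = <<<'PATTERN'
(<phone>\s*<\D+>(\+\d+)<\/\D+>\s*<\D+>(\d+)<\/\D+>\s*<\D+>(\d+)<\/\D+>\s*<\/phone>)
PATTERN;

preg_match($pattern, $data, $match);
// Join the captured groups in PHP; the regex match itself cannot do this.
$result = $match[1].' '.$match[2].' '.$match[3];
var_dump($result);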

Output:

string(15) "+45 354 1631616"

In a replacement it would be part of the replacement string, not the pattern:

var_dump(preg_replace($pattern, '$1 $2 $3', $data));

Fun fact: XPath has some aggregation features, so it would be possible with a single expression:

$document = new DOMDocument();
$document->loadXML($data);
$xpath = new DOMXPath($document);

var_dump(
  $xpath->evaluate(
    'concat(/phone/countryCode, " ", /phone/areaCode, " ", /phone/subscriberNumber)'
  )
);

Output:

string(15) "+45 354 1631616"
ThW
  • Thanks a lot for your detailed answer. "but not possible with the RegEx match": OK, that was what I wanted to know. Regex is not my favorite hobby ;-). Everything else was already known. But the software, which is written in PHP, can only use preg_match_all at this point. Rewriting it would affect many other places, so that is too much just for parsing this one XML response. – Marco A. Oct 09 '19 at 11:59
  • You're thinking about it the wrong way. According to you, the software only allows a regex match to extract data. It is not relevant that it is written in PHP or that it uses `preg_match_all()`. It is a question about the features of the (still unknown) application. Changing the application itself would make it a programming question. But at that point you could add a feature like the replacement string or the XML match. – ThW Oct 09 '19 at 12:07