
I am new to using DOM with PHP and need help iterating over XPath results to build an array. The examples I found online were of little help.

This is the string content from my XML file:

    <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.2-c004 1.136881, 2010/06/10-18:11:35        "> 
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> 
        <rdf:Description 
            rdf:about="" 
            xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/" 
            xmlns:dc="http://purl.org/dc/elements/1.1/" 
            xmlns:tiff="http://ns.adobe.com/tiff/1.0/" 
            xmlns:exif="http://ns.adobe.com/exif/1.0/" 
            xmlns:xmp="http://ns.adobe.com/xap/1.0/" 
            xmlns:aux="http://ns.adobe.com/exif/1.0/aux/" 
            xmlns:crs="http://ns.adobe.com/camera-raw-settings/1.0/" 
            xmlns:Iptc4xmpCore="http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/" 
            xmlns:xmpRights="http://ns.adobe.com/xap/1.0/rights/" 
            photoshop:LegacyIPTCDigest="B0D1E9B9CFC1C774E7277517B04970DC" 
            photoshop:ColorMode="3" 
            photoshop:ICCProfile="sRGB IEC61966-2.1" 
            photoshop:AuthorsPosition="Tester" 
            photoshop:Headline="Big City Landscape" 
            photoshop:CaptionWriter="Freelancer" 
            photoshop:DateCreated="2016-08-05T02:16Z" 
            photoshop:City="NA" 
            photoshop:State="NA" 
            photoshop:Country="NA" 
            photoshop:TransmissionReference="2323" 
            photoshop:Instructions="set to landscape" 
            photoshop:Credit="Photographor: FirstName lastname" 
            photoshop:Source="Smart Phone Photo" 
            tiff:Make="Motorola" 
            tiff:Model="MB865" 
            tiff:Orientation="1" 
            tiff:ImageWidth="3264" 
            tiff:ImageLength="1840" 
            tiff:PhotometricInterpretation="2" 
            tiff:SamplesPerPixel="3" 
            tiff:XResolution="72/1" 
            tiff:YResolution="72/1" 
            tiff:ResolutionUnit="2" 
            exif:ExifVersion="0220" 
            exif:ExposureTime="1/11" 
            exif:ShutterSpeedValue="3459432/1000000" 
            exif:FNumber="24/10" 
            exif:ApertureValue="2526069/1000000" 
            exif:ExposureProgram="0" 
            exif:BrightnessValue="0/1" 
            exif:ExposureBiasValue="0/10" 
            exif:MaxApertureValue="3/1" 
            exif:SubjectDistance="0/1" 
            exif:MeteringMode="1" 
            exif:LightSource="4" 
            exif:FocalLength="460/100" 
            exif:SceneType="1" 
            exif:CustomRendered="1" 
            exif:ExposureMode="0" 
            exif:WhiteBalance="0" 
            exif:SceneCaptureType="0" 
            exif:GainControl="256" 
            exif:Contrast="0" 
            exif:Saturation="0" 
            exif:Sharpness="0" 
            exif:SubjectDistanceRange="0" 
            exif:DigitalZoomRatio="65536/65535" 
            exif:PixelXDimension="3264" 
            exif:PixelYDimension="1840" 
            exif:ColorSpace="1" 
            xmp:ModifyDate="2016-02-22T09:22:39-05:00" 
            xmp:MetadataDate="2016-08-05T02:21:35-04:00" 
            aux:ApproximateFocusDistance="0/1" 
            crs:AlreadyApplied="True" 
            Iptc4xmpCore:IntellectualGenre="NA" 
            Iptc4xmpCore:Location="NA" 
            Iptc4xmpCore:CountryCode="NA"> 
            <dc:rights> 
                <rdf:Alt> 
                    <rdf:li xml:lang="x-default">Copyright FirstName lastname</rdf:li> 
                </rdf:Alt> 
            </dc:rights> 
            <dc:creator> 
                <rdf:Seq> 
                    <rdf:li>FirstName lastname</rdf:li> 
                </rdf:Seq> 
            </dc:creator> 
            <dc:description> 
                <rdf:Alt> 
                    <rdf:li xml:lang="x-default">Jurks on the move</rdf:li> 
                </rdf:Alt> 
            </dc:description> 
            <dc:subject> 
                <rdf:Bag> 
                    <rdf:li>New Jurks in Town</rdf:li> 
                </rdf:Bag> 
            </dc:subject> 
            <dc:title> 
                <rdf:Alt> 
                    <rdf:li xml:lang="x-default">Big City Jurks</rdf:li> 
                </rdf:Alt> 
            </dc:title> 
            <tiff:BitsPerSample> 
                <rdf:Seq> 
                    <rdf:li>8</rdf:li> 
                    <rdf:li>8</rdf:li> 
                    <rdf:li>8</rdf:li> 
                </rdf:Seq> 
            </tiff:BitsPerSample> 
            <exif:ISOSpeedRatings> 
                <rdf:Seq> 
                    <rdf:li>107</rdf:li> 
                </rdf:Seq> 
            </exif:ISOSpeedRatings> 
            <exif:Flash exif:Fired="True" exif:Return="0" exif:Mode="1" exif:Function="False" exif:RedEyeMode="False"/> 
            <Iptc4xmpCore:CreatorContactInfo 
            Iptc4xmpCore:CiAdrExtadr="" 
            Iptc4xmpCore:CiAdrCity="" 
            Iptc4xmpCore:CiAdrRegion="NY" 
            Iptc4xmpCore:CiAdrPcode="" 
            Iptc4xmpCore:CiAdrCtry="USA" 
            Iptc4xmpCore:CiTelWork="" 
            Iptc4xmpCore:CiEmailWork="you@yourwebsite.com" 
            Iptc4xmpCore:CiUrlWork="www.yourwebsite.com"/> 
            <Iptc4xmpCore:SubjectCode> 
                <rdf:Bag> 
                    <rdf:li>Jurks</rdf:li> 
                </rdf:Bag> 
            </Iptc4xmpCore:SubjectCode> 
            <Iptc4xmpCore:Scene> 
                <rdf:Bag> 
                    <rdf:li>Big City</rdf:li> 
                </rdf:Bag> 
            </Iptc4xmpCore:Scene> 
            <xmpRights:UsageTerms> 
                <rdf:Alt> 
                    <rdf:li xml:lang="x-default">Free to use</rdf:li> 
                </rdf:Alt> 
            </xmpRights:UsageTerms> 
        </rdf:Description> 
        </rdf:RDF> 
    </x:xmpmeta>                                                                                  

This is how I approach the issue.

    $__data = "xmp-cache-test.xml";

    $content = file_get_contents('xmp-cache-test.xml');

    if(preg_match("/(\<x\:xmpmeta.*?\>.*?\<\/x\:xmpmeta\>)/s", $content, $matches))
        $data = "<?xml version='1.0'?>\n" . $matches[1];

    $myXmlString = $data ;
    $myXmlFilename = $__data;

    $doc = new DOMDocument();
    $doc->loadXML($myXmlString);
    $doc->documentURI = $myXmlFilename;
    $xpath = new DOMXpath($doc);

    $xpath->registerNamespace('x', 'adobe:ns:meta/');
    $xpath->registerNamespace('xmp', 'http://ns.adobe.com/xap/1.0/');
    $xpath->registerNamespace("Iptc4xmpCore", "http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/");
    $xpath->registerNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');

    $elements = $xpath->evaluate('//rdf:RDF/rdf:Description');
    $arr_xmp = iterator_to_array($elements);
    print_r($arr_xmp);

This is the printed result:

    Array ( 
        [0] => DOMElement Object ( 
            [tagName] => rdf:Description 
            [schemaTypeInfo] => 
            [nodeName] => rdf:Description 
            [nodeValue] => Copyright FirstName lastname FirstName lastname Jurks on the move 
            New Jurks in Town Big City Jurks 8 8 8 107 Jurks Big City Free to use 
            [nodeType] => 1 
            [parentNode] => (object value omitted) 
            [childNodes] => (object value omitted) 
            [firstChild] => (object value omitted) 
            [lastChild] => (object value omitted) 
            [previousSibling] => (object value omitted) 
            [nextSibling] => (object value omitted) 
            [attributes] => (object value omitted) 
            [ownerDocument] => (object value omitted) 
            [namespaceURI] => http://www.w3.org/1999/02/22-rdf-syntax-ns# 
            [prefix] => rdf 
            [localName] => Description 
            [baseURI] => xmp-cache-test.xml 
            [textContent] => Copyright FirstName lastname FirstName lastname Jurks on the move 
            New Jurks in Town Big City Jurks 8 8 8 107 Jurks Big City Free to use 
            ) ) 

The above result is not what I expected. I would rather have an array like the example below, and I also have a few related questions, listed after it:

    Array ( 
        [rdf:about] => 
        [xmlns:photoshop] => http://ns.adobe.com/photoshop/1.0/ 
        [xmlns:dc] => http://purl.org/dc/elements/1.1/ 
        [xmlns:tiff] => http://ns.adobe.com/tiff/1.0/ 
        [xmlns:exif] => http://ns.adobe.com/exif/1.0/ 
        [xmlns:xmp] => http://ns.adobe.com/xap/1.0/ 
        [xmlns:aux] => http://ns.adobe.com/exif/1.0/aux/ 
        [xmlns:crs] => http://ns.adobe.com/camera-raw-settings/1.0/ 
        [xmlns:Iptc4xmpCore] => http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/ 
        [xmlns:xmpRights] => http://ns.adobe.com/xap/1.0/rights/ 
        [photoshop:LegacyIPTCDigest] => B0D1E9B9CFC1C774E7277517B04970DC 
        [photoshop:ColorMode] => 3 
        [photoshop:ICCProfile] => sRGB IEC61966-2.1 
        [photoshop:AuthorsPosition] => Tester 
        [photoshop:Headline] => Big City Landscape 
        [photoshop:CaptionWriter] => Freelancer 
        [photoshop:DateCreated] => 2016-08-05T02:16Z 
        [photoshop:City] => NA 
        [photoshop:State] => NA 
        [photoshop:Country] => NA 
        [photoshop:TransmissionReference] => 2323 
        [photoshop:Instructions] => set to landscape 
        [photoshop:Credit] => Photographor: FirstName lastname 
        [photoshop:Source] => Smart Phone Photo 
        [tiff:Make] => Motorola 
        [tiff:Model] => MB865 
        [tiff:Orientation] => 1 

        ------------ // continue
        )
  1. Options (an example for each would be helpful):

    1. How should I approach building this array with DOM?
    2. If I need to remove, say, the "tiff" and "exif" entries from the array, what should the approach be?
    3. How can I use DOM to update a value such as "photoshop:Credit"?
    4. How can I use DOM to convert the array back into an XML string?

Carl Barrett

1 Answer


=============EDIT===================

For the XML-to-array part, there is an almost identical question here: What is the best php DOM 2 Array function?

I played with the code a bit and this is the result:

    function xml_to_array($root) {
        $result = array();

        // Collect the attributes. $attr->name omits the namespace prefix;
        // $attr->nodeName keeps it (e.g. "tiff:Make").
        if ($root->hasAttributes()) {
            $attrs = $root->attributes;
            foreach ($attrs as $attr) {
                $result['@attributes'][$attr->name] = $attr->value;
            }
        }

        if ($root->hasChildNodes()) {
            $children = $root->childNodes;
            // A single text child: return its value, alone or next to the attributes.
            if ($children->length == 1) {
                $child = $children->item(0);
                if ($child->nodeType == XML_TEXT_NODE) {
                    $result['_value'] = $child->nodeValue;
                    return count($result) == 1
                        ? $result['_value']
                        : $result;
                }
            }
            $groups = array();
            foreach ($children as $child) {
                // Skip whitespace-only text nodes. The strict comparison keeps
                // values such as "0", which empty() would treat as empty.
                if ($child->nodeType == XML_TEXT_NODE && trim($child->nodeValue) === '') continue;
                if (!isset($result[$child->nodeName])) {
                    $result[$child->nodeName] = xml_to_array($child);
                } else {
                    // Repeated element name: turn the entry into a list.
                    if (!isset($groups[$child->nodeName])) {
                        $result[$child->nodeName] = array($result[$child->nodeName]);
                        $groups[$child->nodeName] = 1;
                    }
                    $result[$child->nodeName][] = xml_to_array($child);
                }
            }
        }

        return $result;
    }


    // $content = your raw XML source

    // Extract the XMP packet and prepend an XML declaration.
    if (!preg_match("/(\<x\:xmpmeta.*?\>.*?\<\/x\:xmpmeta\>)/s", $content, $matches)) {
        die("No x:xmpmeta packet found\n");
    }
    $myXmlString = "<?xml version='1.0'?>\n" . $matches[1];

    $doc = new DOMDocument();
    $doc->loadXML($myXmlString);

    $array = xml_to_array($doc);
    print_r($array);

That is a very neat function someone wrote there: it iterates through the XML, collecting attributes and node values, and largely sidesteps the pain of dealing with namespaces.
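If only the flat array from the question is needed, reading the attributes of `rdf:Description` directly may be enough. The following is a minimal sketch, not part of the linked function: the helper name `description_attributes()` and the trimmed-down sample XML are made up for illustration. As far as I know, PHP's DOM extension does not list `xmlns:*` declarations among an element's attributes, so those keys from the desired output would need separate handling.

```php
<?php
// Hypothetical helper: collect the attributes of one element
// as "prefix:name" => value pairs.
function description_attributes(DOMElement $element): array
{
    $result = [];
    foreach ($element->attributes as $attr) {
        // nodeName keeps the namespace prefix (e.g. "tiff:Make").
        $result[$attr->nodeName] = $attr->nodeValue;
    }
    return $result;
}

// Trimmed-down sample standing in for the XMP packet from the question.
$xml = <<<'XML'
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/"
        xmlns:tiff="http://ns.adobe.com/tiff/1.0/"
        photoshop:Credit="Photographor: FirstName lastname"
        tiff:Make="Motorola"/>
  </rdf:RDF>
</x:xmpmeta>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');

// Take the first rdf:Description and flatten its attributes.
$description = $xpath->query('//rdf:RDF/rdf:Description')->item(0);
$arr_xmp = description_attributes($description);
print_r($arr_xmp);
```

The child elements (`dc:rights`, `dc:creator`, and so on) are not covered by this sketch; `xml_to_array()` above remains the way to pick those up.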

If you need to remove an item from the array, just use unset, as in:

    unset($array['x:xmpmeta']['rdf:RDF']['rdf:Description']['tiff:BitsPerSample']);
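To drop whole groups such as "tiff" and "exif" rather than a single key, one option is to filter the array by key prefix. This is only a sketch: `remove_prefixes()` is a hypothetical helper, and the sample array assumes flat `prefix:name` keys like the ones in the question's desired output.

```php
<?php
// Hypothetical helper: drop every entry whose key starts with one of
// the given namespace prefixes (for example "tiff" or "exif").
function remove_prefixes(array $data, array $prefixes): array
{
    return array_filter(
        $data,
        function ($key) use ($prefixes) {
            foreach ($prefixes as $prefix) {
                if (strpos($key, $prefix . ':') === 0) {
                    return false; // key belongs to an unwanted prefix
                }
            }
            return true;
        },
        ARRAY_FILTER_USE_KEY
    );
}

// Sample data using values from the question's XML.
$arr_xmp = [
    'photoshop:Credit' => 'Photographor: FirstName lastname',
    'tiff:Make'        => 'Motorola',
    'tiff:Model'       => 'MB865',
    'exif:FNumber'     => '24/10',
    'xmp:ModifyDate'   => '2016-02-22T09:22:39-05:00',
];

$filtered = remove_prefixes($arr_xmp, ['tiff', 'exif']);
print_r($filtered);
```

With the nested array produced by `xml_to_array()`, the same filter would have to be applied to the `@attributes` sub-array and to the element keys separately.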

As for how to update an attribute value, the exact same question was asked here: Change tag attribute value with PHP DOMDocument

    $dom = new DOMDocument();
    $dom->loadHTML('<a href="http://foo.bar/">Click here</a>');

    // Point every link at the new URL.
    foreach ($dom->getElementsByTagName('a') as $item) {
        $item->setAttribute('href', 'http://google.com/');
    }

    // Serialize once, after all changes have been made.
    echo $dom->saveHTML();
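Applied to the XMP data from the question, the same idea might look like the sketch below. It assumes a trimmed-down version of the XML and a placeholder replacement value; because `photoshop:Credit` is a namespaced attribute, it uses `hasAttributeNS()` and `setAttributeNS()` instead of `setAttribute()`.

```php
<?php
// Trimmed-down sample standing in for the XMP packet from the question.
$xml = <<<'XML'
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/"
      photoshop:Credit="Photographor: FirstName lastname"/>
</rdf:RDF>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');

$photoshopNs = 'http://ns.adobe.com/photoshop/1.0/';

foreach ($xpath->query('//rdf:Description') as $description) {
    // Only touch elements that already carry the attribute.
    if ($description->hasAttributeNS($photoshopNs, 'Credit')) {
        // 'New credit line' is a placeholder value for this example.
        $description->setAttributeNS($photoshopNs, 'photoshop:Credit', 'New credit line');
    }
}

$updated = $doc->saveXML();
echo $updated;
```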

And finally, how to go back from the array to DOM: there is no easy way. You would have to create a DOM object manually and add the nodes and attributes one by one.

Once it is populated, call DOMDocument::saveXML() (http://php.net/manual/en/domdocument.savexml.php) to get the XML string.

    <?php

    $doc = new DOMDocument('1.0');
    // we want a nice output
    $doc->formatOutput = true;

    $root = $doc->createElement('book');
    $root = $doc->appendChild($root);

    $title = $doc->createElement('title');
    $title = $root->appendChild($title);

    $text = $doc->createTextNode('This is the title');
    $text = $title->appendChild($text);

    echo "Saving all the document:\n";
    echo $doc->saveXML() . "\n";

    echo "Saving only the title part:\n";
    echo $doc->saveXML($title);

    ?>
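For the flat `prefix:name => value` array from the question, rebuilding an `rdf:Description` element could be sketched as follows. The prefix-to-URI map is an assumption that has to be maintained by hand (the URIs are the ones declared in the question's XML), and only attributes are handled, not child elements such as `dc:rights`.

```php
<?php
// Prefix => namespace URI map, copied from the declarations in the question's XML.
$namespaces = [
    'rdf'       => 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
    'photoshop' => 'http://ns.adobe.com/photoshop/1.0/',
    'tiff'      => 'http://ns.adobe.com/tiff/1.0/',
];

// Sample flat array using values from the question.
$arr_xmp = [
    'rdf:about'        => '',
    'photoshop:Credit' => 'Photographor: FirstName lastname',
    'tiff:Make'        => 'Motorola',
];

$doc = new DOMDocument('1.0');
$doc->formatOutput = true;

// Build <rdf:RDF><rdf:Description/></rdf:RDF>.
$rdf = $doc->appendChild($doc->createElementNS($namespaces['rdf'], 'rdf:RDF'));
$description = $rdf->appendChild($doc->createElementNS($namespaces['rdf'], 'rdf:Description'));

foreach ($arr_xmp as $qualifiedName => $value) {
    // The part before the colon selects the namespace.
    list($prefix) = explode(':', $qualifiedName, 2);
    if (!isset($namespaces[$prefix])) {
        continue; // unknown prefix: skipped in this sketch
    }
    $description->setAttributeNS($namespaces[$prefix], $qualifiedName, $value);
}

$xmlString = $doc->saveXML();
echo $xmlString;
```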

Hope this helps,

Community
  • You may be right about accessing attribute values, for example: ``, but I need to produce a tree structure with the parentNode `` as the key. For the child nodes, the attribute names should be the keys and the attribute values the values. I need to collect the data in an array dynamically. I have seen the example you provided before; it is straightforward for a simple XML file, but how would you go about creating the XPath to generate the array elements based on the parentNode from my XML? – Carl Barrett Oct 13 '16 at 20:54
  • Thanks very much for sharing the above resource, which points to the function `xml_to_array($root)`. I have tweaked the code a little by making the following changes: `$result['@attributes'][$attr->name] = $attr->value;` to `$result[$attr->name] = $attr->value;`, `$result['_value'] = $child->nodeValue;` to `$result[] = $child->nodeValue;` and `? $result['_value']` to `? $result[0]`. Now it works wonderfully! – Carl Barrett Oct 16 '16 at 19:22
  • Correction to the above comment: `$result['@attributes'][$attr->name] = $attr->value;` to `$result[$attr->nodeName] = $attr->nodeValue;` – Carl Barrett Oct 16 '16 at 21:41