
Whilst there are plenty of options for converting "normal" XML into an array, I'd dearly love to find a way of converting this data into an array that I can process with PHP (it's currently designed to be processed by jQuery).

<?xml version="1.0" encoding="ISO-8859-15"?><root><data><![CDATA[ [{title_id: "284270",
          track_id: "1548617",
          artist: [[20670, 1, "Matthias Vogt", "matthias-vogt"]],
          title: "The Wobble Track",
          title_url: "/title/284270/the-wobble-track",
          track_url: "/track/1548617/the-wobble-track",
          label: [88, "Large Music", "large-music"],
          genre: "Deep House",
          genre_url: "/genre/13/deep-house",
          catnumber: "LAR181",
          promo: false,
          duration: "5:54",
          r_date: "2014-02-17",
          price: {hbr: 1.99, wav: 2.74},
          bought: false,
          image: "http://static.traxsource.com/files/images/271306_large.jpg",
          thumb: "http://static.traxsource.com/scripts/image.php/44x44/271306.jpg",
          mp3: "http://preview.traxsource.com/files/previews/88/1324290-p.mp3",
          waveform: "http://static.traxsource.com/files/wf/1324290-wf.png",
          bpm: "120",
          keysig: "Bmin"}
] ]]></data></root>

There are about another 99 objects in this XML string, so I've only included one for simplicity.

I want to convert what appears to be an array into JSON or a PHP array. Thanks :)

Ollie Brooke

2 Answers


Don't. Just use XPath on the DOM to fetch the parts you need, in your case the JSON structure in the CDATA section.

The structure is not really JSON but JavaScript: the quotes around the property names are missing. Here is a nice regex from a user comment in the PHP Manual that repairs it.

$xml = <<<'XML'
<?xml version="1.0" encoding="ISO-8859-15"?><root><data><![CDATA[ [{title_id: "284270",
          track_id: "1548617",
          artist: [[20670, 1, "Matthias Vogt", "matthias-vogt"]],
          title: "The Wobble Track",
          title_url: "/title/284270/the-wobble-track",
          track_url: "/track/1548617/the-wobble-track",
          label: [88, "Large Music", "large-music"],
          genre: "Deep House",
          genre_url: "/genre/13/deep-house",
          catnumber: "LAR181",
          promo: false,
          duration: "5:54",
          r_date: "2014-02-17",
          price: {hbr: 1.99, wav: 2.74},
          bought: false,
          image: "http://static.traxsource.com/files/images/271306_large.jpg",
          thumb: "http://static.traxsource.com/scripts/image.php/44x44/271306.jpg",
          mp3: "http://preview.traxsource.com/files/previews/88/1324290-p.mp3",
          waveform: "http://static.traxsource.com/files/wf/1324290-wf.png",
          bpm: "120",
          keysig: "Bmin"}
] ]]></data></root>
XML;

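// Repairs a JavaScript literal with unquoted property names, then decodes it as JSON.
function javascript_decode($json, $assoc = false) {
  // Strip line breaks before patching the property names.
  $json = str_replace(array("\n", "\r"), "", $json);
  // Quote every name that follows "{" or ",". This assumes no string value
  // contains a comma or brace followed by a colon with no quote in between.
  $json = preg_replace('(([{,]+)(\s*)([^"]+?)\s*:)', '$1"$3":', $json);
  return json_decode($json, $assoc);
}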

$dom = new DOMDocument();
//$dom->load($xmlFile);
$dom->loadXml($xml);
$xpath = new DOMXpath($dom);

$json = javascript_decode($xpath->evaluate('string(/root/data)'));
var_dump($json);

Output https://eval.in/105256

array(1) {
  [0]=>
  object(stdClass)#3 (21) {
    ["title_id"]=>
    string(6) "284270"
    ["track_id"]=>
    string(7) "1548617"
    ["artist"]=>
    array(1) {
      [0]=>
      array(4) {
        [0]=>
        int(20670)
        [1]=>
        int(1)
        [2]=>
        string(13) "Matthias Vogt"
        [3]=>
        string(13) "matthias-vogt"
      }
    }
    ["title"]=>
    string(16) "The Wobble Track"
    ["title_url"]=>
    string(30) "/title/284270/the-wobble-track"
    ["track_url"]=>
    string(31) "/track/1548617/the-wobble-track"
    ["label"]=>
    array(3) {
      [0]=>
      int(88)
      [1]=>
      string(11) "Large Music"
      [2]=>
      string(11) "large-music"
    }
    ["genre"]=>
    string(10) "Deep House"
    ["genre_url"]=>
    string(20) "/genre/13/deep-house"
    ["catnumber"]=>
    string(6) "LAR181"
    ["promo"]=>
    bool(false)
    ["duration"]=>
    string(4) "5:54"
    ["r_date"]=>
    string(10) "2014-02-17"
    ["price"]=>
    object(stdClass)#4 (2) {
      ["hbr"]=>
      float(1.99)
      ["wav"]=>
      float(2.74)
    }
    ["bought"]=>
    bool(false)
    ["image"]=>
    string(58) "http://static.traxsource.com/files/images/271306_large.jpg"
    ["thumb"]=>
    string(63) "http://static.traxsource.com/scripts/image.php/44x44/271306.jpg"
    ["mp3"]=>
    string(61) "http://preview.traxsource.com/files/previews/88/1324290-p.mp3"
    ["waveform"]=>
    string(52) "http://static.traxsource.com/files/wf/1324290-wf.png"
    ["bpm"]=>
    string(3) "120"
    ["keysig"]=>
    string(4) "Bmin"
  }
}
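
If var_dump() prints NULL instead, json_decode() has rejected the input. The following is only a minimal sketch of how that can be checked, assuming a shortened, hypothetical sample string ($raw) in the same style as the feed. It uses json_last_error_msg(), which needs PHP 5.5 or later, and passes true as the second argument so the result is nested PHP arrays rather than stdClass objects.

```php
<?php
// Hypothetical, shortened sample in the style of the feed: property names are unquoted.
$raw = '[{title_id: "284270", promo: false, duration: "5:54", price: {hbr: 1.99, wav: 2.74}}]';

// Plain json_decode() rejects unquoted property names and returns NULL.
$direct = json_decode($raw, true);
$directError = json_last_error_msg();

// Same repair as in javascript_decode(): quote every name that follows "{" or ",".
$fixed = preg_replace('(([{,]+)(\s*)([^"]+?)\s*:)', '$1"$3":', $raw);
$tracks = json_decode($fixed, true);

if ($tracks === null) {
    // Reports why the repaired string still could not be decoded.
    echo 'Decoding failed: ', json_last_error_msg(), "\n";
} else {
    echo $tracks[0]['title_id'], ' costs ', $tracks[0]['price']['wav'], "\n";
}
```

With the associative flag set, values are read as $tracks[0]['price']['wav'] instead of $tracks[0]->price->wav.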
ThW
  • Thanks but the var_dump just gives me NULL, am I missing something? – Ollie Brooke Feb 19 '14 at 09:38
  • This could happen if the XML was not loaded, or the JSON is invalid. I edited the answer and added example data and a live link. – ThW Feb 19 '14 at 09:47
  • Were you able to make that work with the actual data included in my question? Or is that data not properly formatted? If not then do you have any ideas on how to solve my issue? – Ollie Brooke Feb 19 '14 at 10:45
  • The incomplete one? No, of course that JSON cannot be decoded; it is incomplete. The XML reading part works on it, however: https://eval.in/103112 – ThW Feb 19 '14 at 11:22
  • Ok but I remain unclear as to how your answer solves the problem I have...? – Ollie Brooke Feb 19 '14 at 22:16
  • The code shows you how to read a JSON-encoded data structure inside a CDATA section in an XML document. You did not provide the original, full JSON in the question, so I cannot test, reproduce or debug any problem you might have with that full JSON. You will need to check whether the XML reading works and, if it does, what is wrong with your JSON. – ThW Feb 20 '14 at 00:21
  • Yeah ok. Given that I couldn't solve the issue with your solution, I kept digging and did some tests. My solution is probably not pretty but is certainly functional. – Ollie Brooke Feb 20 '14 at 03:47
  • I've edited the code to make it a complete XML file with JSON data - I'd be very much obliged if you were able to revisit and ideally solve it more gracefully than I did :) – Ollie Brooke Feb 24 '14 at 04:40
  • Updated the answer to repair the javascript structure so it can be parsed as JSON. – ThW Feb 24 '14 at 23:09

Given that I couldn't solve the issue with any of the solutions, I kept digging and ran some tests. My solution is not pretty, but it is functional. I stripped out the XML wrapper, which left fairly raw JSON. The problem with that JSON is that the property names are not wrapped in double quotes, so I used str_replace to fix this and it all worked.

$content = str_replace('', '', str_replace("\'", "'", $content));

// Unquoted property names as they appear in the feed.
$keys = array("title_id:",
"track_id:",
"artist:",
"title:",
"title_url:",
"track_url:",
"label:",
"genre:",
"genre_url:",
"catnumber:",
"promo:",
"duration:",
"r_date:",
"price:",
"hbr:",
"wav:",
"bought:",
"image:",
"thumb:",
"mp3:",
"waveform:",
"bpm:",
"keysig:"
);
// Quoted replacements, in the same order as $keys. The bare literal false is
// already valid JSON, so it is left alone and decodes to a real boolean.
$newkeys = array("\"title_id\":",
"\"track_id\":",
"\"artist\":",
"\"title\":",
"\"title_url\":",
"\"track_url\":",
"\"label\":",
"\"genre\":",
"\"genre_url\":",
"\"catnumber\":",
"\"promo\":",
"\"duration\":",
"\"r_date\":",
"\"price\":",
"\"hbr\":",
"\"wav\":",
"\"bought\":",
"\"image\":",
"\"thumb\":",
"\"mp3\":",
"\"waveform\":",
"\"bpm\":",
"\"keysig\":"
);
$content=  str_replace($keys, $newkeys, $content);
$json = json_decode($content, true);

I was then able to loop through this and process normally.
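
For illustration, the loop might look roughly like the sketch below. It assumes $content already holds valid JSON with quoted keys; the sample string is a shortened copy of the entry from the question, and the output format is only an example, not what the original script did.

```php
<?php
// Shortened sample entry with quoted keys, standing in for the repaired feed.
$content = '[{"title": "The Wobble Track",
  "artist": [[20670, 1, "Matthias Vogt", "matthias-vogt"]],
  "duration": "5:54",
  "price": {"hbr": 1.99, "wav": 2.74}}]';

// true returns nested associative arrays instead of stdClass objects.
$tracks = json_decode($content, true);

$lines = array();
foreach ($tracks as $track) {
    // Each artist entry is a list; in the sample data index 2 holds the display name.
    $artists = array();
    foreach ($track['artist'] as $artist) {
        $artists[] = $artist[2];
    }
    $lines[] = sprintf(
        '%s - %s (%s, %.2f)',
        implode(', ', $artists),
        $track['title'],
        $track['duration'],
        $track['price']['hbr']
    );
}

echo implode("\n", $lines), "\n";
```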

Ollie Brooke