You can query data from an XML DOM using XPath. In PHP it is available through the DOMXPath::evaluate() method. The second argument is the context node, so your expressions can be relative to another node. Converting the XML to a list of records (for a database, CSV, ...) requires several steps. Start with some bootstrap:
$xml = <<<'XML'
<foods>
<food>
<name>ravioli 1</name>
<recipe>food.com/ravioli-1</recipe>
<time unit="minutes">10</time>
</food>
<food>
<name>ravioli 2</name>
<recipe>food.com/ravioli-2</recipe>
<time unit="minutes">11</time>
</food>
</foods>
XML;
$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
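To illustrate the context argument, here is a minimal sketch that runs on its own, independent of the bootstrap above. The variable names `$doc`, `$finder`, and `$names` are made up for this illustration. The expression `string(name)` is relative, so it is resolved against each `food` node in turn:

```php
<?php
$doc = new DOMDocument();
$doc->loadXML(
    '<foods><food><name>ravioli 1</name></food><food><name>ravioli 2</name></food></foods>'
);
$finder = new DOMXPath($doc);

$names = [];
// Absolute expression: selects every food element in the document.
foreach ($finder->evaluate('/foods/food') as $food) {
    // Relative expression: evaluated with $food as the context node.
    $names[] = $finder->evaluate('string(name)', $food);
}
var_dump($names);
```

Wrapping the expression in `string()` makes `evaluate()` return a scalar string instead of a node list, which is what the record building further down relies on.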
First we need to define which XML element represents a record, then which elements define the fields.
So let's build lists of possible record paths and field paths:
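$paths = [];
$leafs = [];
// Visit every element and attribute node in the document.
foreach ($xpath->evaluate('//*|//@*') as $node) {
    // A possible record path has attributes or child elements.
    $isPath = $xpath->evaluate('count(@*|*) > 0', $node);
    // A possible field path (leaf) has no child elements.
    $isLeaf = !($xpath->evaluate('count(*) > 0', $node));
    // Build the location path from the names of the ancestor elements.
    $path = '';
    foreach ($xpath->evaluate('ancestor::*', $node) as $parent) {
        $path .= '/'.$parent->nodeName;
    }
    $path .= '/'.($node instanceof DOMAttr ? '@' : '').$node->nodeName;
    // Use the path as array key so each path is collected only once.
    if ($isLeaf) {
        $leafs[$path] = TRUE;
    }
    if ($isPath) {
        $paths[$path] = TRUE;
    }
}
$paths = array_keys($paths);
$leafs = array_keys($leafs);
var_dump($paths, $leafs);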
Output:
array(3) {
[0] =>
string(6) "/foods"
[1] =>
string(11) "/foods/food"
[2] =>
string(16) "/foods/food/time"
}
array(4) {
[0] =>
string(16) "/foods/food/name"
[1] =>
string(18) "/foods/food/recipe"
[2] =>
string(16) "/foods/food/time"
[3] =>
string(22) "/foods/food/time/@unit"
}
Next, show the possible record paths to the user, who needs to select one. Knowing the record path, build a list of the possible field paths from the $leafs array:
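$path = '/foods/food';
$fieldLeafs = [];
// Length of the record path plus the separating slash.
$pathLength = strlen($path) + 1;
foreach ($leafs as $leaf) {
    // Keep only leafs below the record path, stored relative to it.
    if (0 === strpos($leaf, $path.'/')) {
        $fieldLeafs[] = substr($leaf, $pathLength);
    }
}
var_dump($fieldLeafs);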
Output:
array(4) {
[0] =>
string(4) "name"
[1] =>
string(6) "recipe"
[2] =>
string(4) "time"
[3] =>
string(10) "time/@unit"
}
Put up a dialog that lets the user select a path for each field. The result is a mapping of field names to paths relative to the record element:
$fieldDefinition = [
'title' => 'name',
'url' => 'recipe',
'needed_time' => 'time',
'time_unit' => 'time/@unit'
];
Now use the path and the mapping to build up the records array:
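$result = [];
// Each node matching the record path becomes one record.
foreach ($xpath->evaluate($path) as $node) {
    $record = [];
    foreach ($fieldDefinition as $field => $expression) {
        // Evaluate the field path relative to the record node, cast to string.
        $record[$field] = $xpath->evaluate(
            'string('.$expression.')',
            $node
        );
    }
    $result[] = $record;
}
var_dump($result);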
Output:
array(2) {
[0] =>
array(4) {
'title' =>
string(9) "ravioli 1"
'url' =>
string(18) "food.com/ravioli-1"
'needed_time' =>
string(2) "10"
'time_unit' =>
string(7) "minutes"
}
[1] =>
array(4) {
'title' =>
string(9) "ravioli 2"
'url' =>
string(18) "food.com/ravioli-2"
'needed_time' =>
string(2) "11"
'time_unit' =>
string(7) "minutes"
}
}
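Because every record is a flat array with the same keys, writing the records out as CSV is straightforward. A minimal sketch, assuming the `$result` structure shown above (repeated here so the snippet runs on its own) and using `php://memory` as a stand-in for a real file; the variable names `$stream` and `$csv` are made up for this illustration:

```php
<?php
$result = [
    ['title' => 'ravioli 1', 'url' => 'food.com/ravioli-1', 'needed_time' => '10', 'time_unit' => 'minutes'],
    ['title' => 'ravioli 2', 'url' => 'food.com/ravioli-2', 'needed_time' => '11', 'time_unit' => 'minutes'],
];

// In-memory stream; replace with fopen('some-file.csv', 'w') for a real file.
$stream = fopen('php://memory', 'w+');
// Header row taken from the field names of the first record.
fputcsv($stream, array_keys($result[0]), ',', '"', '\\');
foreach ($result as $record) {
    // fputcsv() writes the array values in order and ignores the keys.
    fputcsv($stream, $record, ',', '"', '\\');
}
rewind($stream);
$csv = stream_get_contents($stream);
fclose($stream);
echo $csv;
```

The same loop would work for a database insert: the keys of `$fieldDefinition` would then be the column names.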
The full example can be found at: https://eval.in/118012
The XML in the example is never converted to a generic array. Doing so would mean losing information and storing the data twice. So don't. Extract the structure information from the XML and let the user define the mapping. Then use XPath to extract the data and store it directly in the result format.