
Given the following code:

$flat = [
    [ '10', 'hoho'],
    [ '10', null],
    [ '13', null],
    [ '10', 'ahha']
];

//imperative, procedural approach
$hierarchical = [];
foreach ($flat  as $entry) {
    $id = $entry[0];

    $hierarchical[$id]['id'] = $id;
    $hierarchical[$id]['microtags'] = $hierarchical[$id]['microtags'] ?? [];
    // strict comparison, so that values such as '' or 0 are not mistaken for null
    if ($entry[1] !== null) {
        array_push($hierarchical[$id]['microtags'], $entry[1]);
    }
}

And its result ($hierarchical):

 array (
   10 => 
   array (
     'id' => '10',
     'microtags' => 
     array (
       0 => 'hoho',
       1 => 'ahha',
     ),
   ),
   13 => 
   array (
     'id' => '13',
     'microtags' => 
     array (
     ),
   ),
 )

Is it possible to refactor this into a reasonably efficient declarative/functional approach, for example by using array transformation functions (map, reduce, filter, etc.)? Ideally the solution would not change references or reassign the same variable. If so, how?

Jean Carlo Machado
  • What is stateless, for that matter? – Anthony Apr 12 '18 at 18:08
  • And what is `$flat`? – Anthony Apr 12 '18 at 18:08
  • Sorry, s/query/flat/ – Jean Carlo Machado Apr 12 '18 at 18:10
  • I hope I further clarified what I mean on the post itself. But in summary: to transform the data without overriding the same variable. – Jean Carlo Machado Apr 12 '18 at 18:20
  • Huh, avoiding visible mutations doesn't seem to be worth striving for in PHP?!? You usually do this with recursion and an accumulator. For performance reasons you can mutate the accumulator, because it is hidden inside the function scope. Unfortunately, I barely know PHP... –  Apr 12 '18 at 18:23
  • Can't you just make the transformation code a function and pass it the array you want to transform? If it isn't passed by reference, the returned array would be a separate variable than the passed-in array. If you are working with objects, however, you would want your function to use `clone` to get a new object instead of mutating the provided object. But it still isn't clear what you are actually striving for. – Anthony Apr 12 '18 at 18:33
  • How is the `foreach` in your example stateful? It doesn't update `$flat`, it creates a new array `$hierarchical`. This is what isn't clear (to me). – Anthony Apr 12 '18 at 18:34
  • The $hierarchical array changes at every iteration. This is what I meant by stateful, but probably it was not helpful. I'll remove the term. – Jean Carlo Machado Apr 12 '18 at 18:41

1 Answer

Creating and traversing trees of different shape is best accomplished by using functions. Below, we create functions node_create and node_add_child which encode our intention. Finally, we use array_reduce to complete the transformation. $flat remains untouched; our reducing operation only reads from the input data.

function node_create ($id, $children = []) {
  return [ "id" => $id, "children" => $children ];
}

function node_add_child ($node, $child) {
  return node_create ($node['id'], array_merge ($node['children'], [ $child ]));
}

$flat =
  [ [ '10', 'hoho' ]
  , [ '10', null ]
  , [ '13', null ]
  , [ '10', 'ahha' ]
  ];

$result =
  array_reduce ($flat, function ($acc, $item) {
    list ($id, $value) = $item;
    if (! array_key_exists ($id, $acc))
      $acc [$id] = node_create ($id);
    if (! is_null ($value))
      $acc [$id] = node_add_child ($acc [$id], $value);
    return $acc;
  }, []);

And the result

print_r ($result);
// Array
// (
//     [10] => Array
//         (
//             [id] => 10
//             [children] => Array
//                 (
//                     [0] => hoho
//                     [1] => ahha
//                 )
//         )
//     [13] => Array
//         (
//             [id] => 13
//             [children] => Array
//                 (
//                 )
//         )
// )

Above, we use an associative array for $acc which means we have to use PHP's built-in functions for interaction with associative arrays. We can abstract away PHP's ugly, non-functional interfaces for more favourable ones.

function has ($map, $key) {
  return array_key_exists ($key, $map);
}

function get ($map, $key) {
  return $map [$key];
}

function set ($map, $key, $value = null) {
  $map [$key] = $value;
  return $map;
}
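
PHP passes arrays by value (copy-on-write), so set only modifies its own local copy and the caller's array is left untouched. A minimal sketch of that behaviour, with the three helpers repeated so it runs on its own; $empty and $one are illustrative names, not part of the original code:

```php
<?php
// Helpers repeated from above so this snippet runs standalone.
function has ($map, $key) {
  return array_key_exists ($key, $map);
}

function get ($map, $key) {
  return $map [$key];
}

function set ($map, $key, $value = null) {
  $map [$key] = $value;   // modifies the local copy only
  return $map;
}

// $empty and $one are hypothetical names used for illustration.
$empty = [];
$one   = set ($empty, '10', 'hoho');

var_dump (has ($empty, '10')); // bool(false): the input array was not changed
var_dump (has ($one, '10'));   // bool(true)
var_dump (get ($one, '10'));   // string(4) "hoho"
```

This is what lets the reducer return a "new" accumulator on each step without any visible mutation outside the function.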

We move the null check into node_add_child, so that adding a null child simply returns the node unchanged

function node_create ($id, $children = []) {
  return [ "id" => $id, "children" => $children ];
}

function node_add_child ($node, $child = null) {
  if (is_null ($child))
    return $node;
  else
    return node_create ($node['id'], array_merge ($node['children'], [ $child ]));
}
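
As a quick check of the null handling, the following sketch calls node_add_child with and without a real child. The two functions are repeated so the snippet runs on its own; $leaf, $same and $grown are illustrative names only:

```php
<?php
// Node helpers repeated from above so this snippet runs standalone.
function node_create ($id, $children = []) {
  return [ "id" => $id, "children" => $children ];
}

function node_add_child ($node, $child = null) {
  if (is_null ($child))
    return $node;
  else
    return node_create ($node['id'], array_merge ($node['children'], [ $child ]));
}

// $leaf, $same and $grown are hypothetical names used for illustration.
$leaf  = node_create ('13');
$same  = node_add_child ($leaf, null);    // null child: an equal node comes back
$grown = node_add_child ($leaf, 'hoho');  // real child: a new node is returned

var_dump ($same === $leaf);               // bool(true)
var_dump ($grown['children'] === [ 'hoho' ]); // bool(true)
var_dump ($leaf['children'] === []);      // bool(true): $leaf itself was not modified
```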

Now we can see a much more declarative reduce

function make_tree ($flat = []) {
  return 
    array_reduce ($flat, function ($acc, $item) {
      list ($id, $value) = $item;
      return 
          set ( $acc
              , $id
              , has ($acc, $id)
                  ? node_add_child (get ($acc, $id), $value)
                  : node_add_child (node_create ($id), $value)
              );
    }, []);
}

print_r (make_tree ($flat));
// same output as above

Above, we see how has, get, and set can simplify our reduce operation. However, this kind of approach can lead to lots of small, separate functions. Another approach is to define your own data type that satisfies your needs. Below, we trade the standalone functions above for a class, MutableMap

class MutableMap {
  // declared explicitly; dynamically created properties are deprecated as of PHP 8.2
  private $data;
  public function __construct ($data = []) {
    $this->data = $data;
  }
  public function has ($key) {
    return array_key_exists ($key, $this->data);
  }
  public function get ($key) {
    return $this->has ($key)
      ? $this->data [$key]
      : null
    ;
  }
  public function set ($key, $value = null) {
    $this->data [$key] = $value;
    return $this;
  }
  public function to_assoc () {
    return $this->data;
  }
}
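
Unlike the array-based set function, MutableMap's set changes the object in place and returns the same instance, which is why the class is called mutable. A small sketch of that difference, with the class repeated in compact form so the snippet runs on its own; $map and $after are illustrative names:

```php
<?php
// MutableMap repeated in compact form so this snippet runs standalone.
class MutableMap {
  private $data;
  public function __construct ($data = []) { $this->data = $data; }
  public function has ($key) { return array_key_exists ($key, $this->data); }
  public function get ($key) { return $this->has ($key) ? $this->data [$key] : null; }
  public function set ($key, $value = null) { $this->data [$key] = $value; return $this; }
  public function to_assoc () { return $this->data; }
}

// $map and $after are hypothetical names used for illustration.
$map   = new MutableMap ();
$after = $map -> set ('10', 'hoho');

var_dump ($after === $map);     // bool(true): set returned the very same object
var_dump ($map -> has ('10'));  // bool(true): the original was updated in place
var_dump ($map -> get ('13'));  // NULL: a missing key yields null instead of a warning
```

When the map is created inside a function and never handed out before to_assoc is called, this mutation stays invisible to callers.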

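Now, instead of having to pass $acc to each function, we swap it out for $map, which is an instance of our new type. Note that set mutates $map in place, but the map is created inside make_tree and only leaves it through to_assoc, so the mutation is not visible to the caller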

function make_tree ($flat = []) {
  return 
    array_reduce ($flat, function ($map, $item) {
      list ($id, $value) = $item;
      return
        $map -> set ( $id
                    , $map -> has ($id)
                        ? node_add_child ($map -> get ($id), $value)
                        : node_add_child (node_create ($id), $value)
                    );
    }, new MutableMap ())
    -> to_assoc ();
}

Of course, you could also swap node_create and node_add_child for a class-based implementation, class Node { ... }. That exercise is left for the reader, but make_tree would then look like this:

function make_tree ($flat = []) {
  return 
    array_reduce ($flat, function ($map, $item) {
      list ($id, $value) = $item;
      return
        $map -> set ( $id
                    , $map -> has ($id)
                        ? $map -> get ($id) -> add_child ($value)
                        : (new Node ($id)) -> add_child ($value)
                    );
    }, new MutableMap ())
    -> to_assoc ();
}
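
One possible shape for such a Node class is sketched below. This is an assumption about how the exercise might be solved, not the author's implementation: the class body, the private properties and the to_assoc method are all hypothetical, chosen to mirror node_create and node_add_child.

```php
<?php
// Hypothetical Node class; the original leaves its implementation to the reader.
class Node {
  private $id;
  private $children;

  public function __construct ($id, $children = []) {
    $this->id = $id;
    $this->children = $children;
  }

  // Returns a new Node with $child appended; a null child returns $this unchanged.
  public function add_child ($child = null) {
    return is_null ($child)
      ? $this
      : new Node ($this->id, array_merge ($this->children, [ $child ]));
  }

  // Converts back to the plain array shape used by node_create.
  public function to_assoc () {
    return [ "id" => $this->id, "children" => $this->children ];
  }
}

$node = (new Node ('10'))
  -> add_child ('hoho')
  -> add_child (null)     // ignored
  -> add_child ('ahha');

print_r ($node -> to_assoc ());
```

With this sketch, the map returned by make_tree would hold Node objects rather than arrays; mapping each value through to_assoc (for example with array_map) would recover the plain array output shown earlier.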
Mulan
  • I don't understand how this is any different than what the OP already is doing. I mean, I get that you are doing it differently (using functions), but how is this more "stateless" than the original code? – Anthony Apr 12 '18 at 18:36
  • @Anthony our reducing operation does not mutate existing bindings or create new ones. It is a pure functional operation that takes `$flat` as input and produces output, `$result`. – Mulan Apr 12 '18 at 18:43
  • Its more declarative. This is what i really meant. – Jean Carlo Machado Apr 12 '18 at 18:44
  • The original doesn't create new bindings either? What is a binding? And how is this special? This is how all functions work. And your approach doesn't take `$flat`, it is an anonymous function that happens to call non-anonymous functions. All the real work is happening in that anonymous function. What would make this different from, say, `$result = myTransformFunction($flat);`? – Anthony Apr 12 '18 at 18:47
  • @JeanCarloMachado I made an update that shows how you could continue to create additional abstractions to clean up the code even more – Mulan Apr 12 '18 at 18:55
  • Thank you @user633183. Your answer is very educational. I much appreciate it. – Jean Carlo Machado Apr 12 '18 at 19:00
  • @Anthony the original code creates (leaks) a new binding, `$entry`. A binding is a value that has been *bound* (set) to an identifier, or variable. I can't tell you what makes it "special" because I'm not sure what you're asking there. How it differs from `myTransformFunction` is impossible to tell because you didn't share any code. – Mulan Apr 12 '18 at 19:02
  • @JeanCarloMachado I had some extra time so I showed you how you can make better data types that yield even better results :) – Mulan Apr 12 '18 at 19:21