I've spent hours trying to find the answer to this question, but I'm struggling. I'm reasonably familiar with PHP and the various in-built functions, and can build a complex foreach() loop to do this, but I thought I'd ask to see if anyone has a smarter solution to my problem.

I have the following simplified example array with three "rows" (the real array is usually a lot bigger and more complex, but the issue is the same).

$rows[] = [
    "widget_id" => "widget1",
    "size" => "large",
    "item" => [
        "item_id" => "item1",
        "shape" => "circle",
        "paint" => [
            "paint_id" => "paint1",
            "colour" => "red",
        ]
    ]
];

# Exactly the same as above, except the "paint" child array is different
$rows[] = [
    "widget_id" => "widget1",
    "size" => "large",
    "item" => [
        "item_id" => "item1",
        "shape" => "circle",
        "paint" => [
            "paint_id" => "paint2",
            "colour" => "green",
        ]
    ]
];

# Same children ("item" and "paint") as the first row, but different parents ("widget_id" is different)
$rows[] = [
    "widget_id" => "widget2",
    "size" => "medium",
    "item" => [
        "item_id" => "item1",
        "shape" => "circle",
        "paint" => [
            "paint_id" => "paint1",
            "colour" => "red",
        ]
    ]
];

What I'm trying to get to is the following output:

[[
    "widget_id" => "widget1",
    "size" => "large",
    "item" => [
        "item_id" => "item1",
        "shape" => "circle",
        "paint" => [[
            "paint_id" => "paint1",
            "colour" => "red",
        ],[
            "paint_id" => "paint2",
            "colour" => "green",
        ]]
    ]
],[
    "widget_id" => "widget2",
    "size" => "medium",
    "item" => [
        "item_id" => "item1",
        "shape" => "circle",
        "paint" => [
            "paint_id" => "paint1",
            "colour" => "red",
        ]
    ]
]]

Basically, when two rows share the same key and values, merge them. When the key is the same, but the value is different, keep both values and put them in a numerical array under the key (sort of like how array_merge_recursive does it).

The challenge is that the values can themselves be arrays and there is an unknown number of levels. Is there a smart and effective way of doing this, or do I have to resort to a heavy duty foreach loop?
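
To illustrate why array_merge_recursive alone doesn't get there, here is a minimal sketch using cut-down versions of the first two rows. The names $first, $second and $merged are made up for this example.

```php
<?php
// Two rows that differ only in the nested "paint" array
// (cut-down versions of the first two rows above).
$first = [
    "widget_id" => "widget1",
    "item" => [
        "item_id" => "item1",
        "paint" => ["paint_id" => "paint1", "colour" => "red"],
    ],
];
$second = [
    "widget_id" => "widget1",
    "item" => [
        "item_id" => "item1",
        "paint" => ["paint_id" => "paint2", "colour" => "green"],
    ],
];

$merged = array_merge_recursive($first, $second);

// Identical scalars are duplicated rather than collapsed:
//   $merged["widget_id"] is ["widget1", "widget1"]
// The paint rows are merged field by field, not kept as separate rows:
//   $merged["item"]["paint"] is
//   ["paint_id" => ["paint1", "paint2"], "colour" => ["red", "green"]]
print_r($merged);
```

So the built-in function keeps both values under the key, but it neither deduplicates matching values nor keeps each child array intact as its own row.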

Thank you for browsing, hope there are some people more clever than me reading this!

dearsina

2 Answers


I managed to get the expected array structure with the following function. I hope the comments make clear what each step does:

function complex_merge(array $arr): array
{
    // Grouped items
    $result = [];
    $iterationKey = 0;

    // Loop through every item
    while (($element = array_shift($arr)) !== null) {
        // Save scalar values as is
        $scalarValues = array_filter($element, 'is_scalar');

        // Save array values in an array
        $arrayValues = array_map(fn(array $arrVal) => [$arrVal], array_filter($element, 'is_array'));
        $arrayValuesKeys = array_keys($arrayValues);

        $result[$iterationKey] = array_merge($scalarValues, $arrayValues);

        // Compare with remaining items
        for ($i = 0; $i < count($arr); $i++) {
            $comparisonScalarValues = array_filter($arr[$i], 'is_scalar');

            // Scalar values are same, add the array values to the containing arrays
            if ($scalarValues === $comparisonScalarValues) {
                $comparisonArrayValues = array_filter($arr[$i], 'is_array');
                foreach ($arrayValuesKeys as $arrayKey) {
                    $result[$iterationKey][$arrayKey][] = $comparisonArrayValues[$arrayKey];
                }

                // Remove matching item
                array_splice($arr, $i, 1);
                $i--;
            }
        }

        // Merge array values
        foreach ($arrayValuesKeys as $arrayKey) {
            $result[$iterationKey][$arrayKey] = complex_merge($result[$iterationKey][$arrayKey]);

            // array key contains a single item, extract it
            if (count($result[$iterationKey][$arrayKey]) === 1) {
                $result[$iterationKey][$arrayKey] = $result[$iterationKey][$arrayKey][0];
            }
        }

        // Increment result key
        $iterationKey++;
    }
    return $result;
}

Just pass $rows to the function. A quick check of the output:

echo '<pre>' . print_r(complex_merge($rows), true) . '</pre>';

/*
Displays:
Array
(
    [0] => Array
        (
            [widget_id] => widget1
            [size] => large
            [item] => Array
                (
                    [item_id] => item1
                    [shape] => circle
                    [paint] => Array
                        (
                            [0] => Array
                                (
                                    [paint_id] => paint1
                                    [colour] => red
                                )

                            [1] => Array
                                (
                                    [paint_id] => paint2
                                    [colour] => green
                                )

                        )

                )

        )

    [1] => Array
        (
            [widget_id] => widget2
            [size] => medium
            [item] => Array
                (
                    [item_id] => item1
                    [shape] => circle
                    [paint] => Array
                        (
                            [paint_id] => paint1
                            [colour] => red
                        )

                )

        )

)
*/
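
One caveat worth checking against real data: is_scalar() returns false for null, so a null value ends up in neither the scalar nor the array bucket. The sketch below uses a made-up row ($row and the other variable names are only for illustration), and the workaround at the end is an assumption about what you might want, not something the function above does.

```php
<?php
// Hypothetical row containing a null value, for illustration only.
$row = [
    "widget_id" => "widget1",
    "size" => null,
    "item" => ["item_id" => "item1"],
];

// The same two filters that complex_merge() applies to each element.
$scalars = array_filter($row, 'is_scalar'); // ["widget_id" => "widget1"]
$arrays  = array_filter($row, 'is_array');  // ["item" => ["item_id" => "item1"]]

// "size" is in neither bucket, so it would be missing from the merged result.
$dropped = array_diff_key($row, $scalars, $arrays); // ["size" => null]

// One possible workaround (an assumption, not part of the function above):
// treat everything that is not an array as a plain value.
$nonArrays = array_filter($row, fn($value) => !is_array($value));
// $nonArrays is ["widget_id" => "widget1", "size" => null]
```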
AymDev
  • It could have better var names and could probably be optimized but it works and I had fun doing it :) – AymDev Nov 27 '20 at 18:20
  • Thank you for this great example. I posted my own attempt also, but I think yours is better. – dearsina Nov 28 '20 at 09:08
  • I did a quick benchmark, and mine seems to be faster by about 0.01 milliseconds. But an object-oriented approach can be easier to use :) – AymDev Nov 28 '20 at 12:54
  • Thanks. I'm actually learning a lot from your approach, you used some in-built methods I was not familiar (or comfortable!) with. Maybe I'll play around with a hybrid version when I have a moment, for now I've implemented your version as it's shorter! Thanks again for your help. – dearsina Nov 30 '20 at 10:27

Here's my own attempt. I think I prefer AymDev's version though, as it's a lot more succinct. I wonder which is faster.

class ComplexMerge{
    /**
     * Checks to see whether an array has sequential numerical keys (only),
     * starting from 0 to n, where n is the array count minus one.
     *
     * @link https://codereview.stackexchange.com/questions/201/is-numeric-array-is-missing/204
     *
     * @param $arr
     *
     * @return bool
     */
    private static function isNumericArray($arr)
    {
        if(!is_array($arr)){
            return false;
        }
        return array_keys($arr) === range(0, (count($arr) - 1));
    }

    /**
     * Given an array, separate out
     * array values that themselves are arrays
     * and those that are not.
     *
     * @param array $array
     *
     * @return array[]
     */
    private static function separateOutArrayValues(array $array): array
    {
        $valuesThatAreArrays = [];
        $valuesThatAreNotArrays = [];

        foreach($array as $key => $val){
            if(is_array($val)){
                $valuesThatAreArrays[$key] = $val;
            } else {
                $valuesThatAreNotArrays[$key] = $val;
            }
        }

        return [$valuesThatAreArrays, $valuesThatAreNotArrays];
    }

    /**
     * Groups row keys together that have the same non-array values.
     * If every row is already unique, returns NULL.
     *
     * @param $array
     *
     * @return array|null
     */
    private static function groupRowKeysWithSameNonArrayValues($array): ?array
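    {
        # Initialise so that an empty input does not hit undefined variables below
        $deduplicatedArray = [];
        $a = [];

        foreach($array as $key => $row){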
            # Separate out the values that are arrays and those that are not
            [$a, $v] = self::separateOutArrayValues($row);

            # Serialise the values that are not arrays and create a unique ID from them
            $uniqueRowId = md5(serialize($v));

            # Store all the original array keys under the unique ID
            $deduplicatedArray[$uniqueRowId][] = $key;
        }

        # If every row is unique, there are no more rows to combine, and our work is done
        if(!$a && count($array) == count($deduplicatedArray)){
            return NULL;
        }

        return $deduplicatedArray;
    }

    private static function mergeRows(array $array): array
    {
        # Get the grouped row keys
        if(!$groupedRowKeys = self::groupRowKeysWithSameNonArrayValues($array)){
            //If there are no more rows to merge
            return $array;
        }

        # Collects one merged row per unique ID
        $unique = [];

        foreach($groupedRowKeys as $uniqueRowId => $keys){

            foreach($keys as $id => $key){
                # Separate out the values that are arrays and those that are not
                [$valuesThatAreArrays, $valuesThatAreNotArrays] = self::separateOutArrayValues($array[$key]);
                //We're using the key from the grouped row keys array, but using it on the original array

                # If this is the first row from the group, throw in the non-array values
                if(!$id){
                    $unique[$uniqueRowId] = $valuesThatAreNotArrays;
                }

                # For each of the values that are arrays include them back in
                foreach($valuesThatAreArrays as $k => $childArray){
                    $unique[$uniqueRowId][$k][] = $childArray;
                    //Wrap them in a numerically keyed array so that only children and siblings have the same parent-child relationship
                }
            }
        }

        # Go deeper
        foreach($unique as $key => $val){
            foreach($val as $k => $childValue){
                # Only the wrapped child arrays (numerically keyed lists) need merging
                if(self::isNumericArray($childValue)){
                    $unique[$key][$k] = self::mergeRows($unique[$key][$k]);
                }
            }
        }

        # No need to include the unique row IDs
        return array_values($unique);
    }

    public static function normalise($array): ?array
    {
        $array = self::mergeRows($array);
        return $array;
    }
}

Usage:

$array = ComplexMerge::normalise($array);
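
A side note on the isNumericArray() helper: it returns false for an empty array, because range(0, -1) counts down and yields [0, -1]. On PHP 8.1 or later the built-in array_is_list() does a similar check but treats the empty array as a list. The sketch below copies the helper into a standalone function purely for comparison; $list, $map and $empty are made-up sample values.

```php
<?php
// The helper from the class above, copied as a standalone function for comparison.
function isNumericArray($arr): bool
{
    if (!is_array($arr)) {
        return false;
    }
    return array_keys($arr) === range(0, count($arr) - 1);
}

$list  = ["a", "b"];
$map   = ["x" => "a"];
$empty = [];

var_dump(isNumericArray($list));  // bool(true)
var_dump(isNumericArray($map));   // bool(false)
var_dump(isNumericArray($empty)); // bool(false): range(0, -1) is [0, -1], not []

// PHP 8.1+ only: array_is_list() treats the empty array as a list.
if (function_exists('array_is_list')) {
    var_dump(array_is_list($list));  // bool(true)
    var_dump(array_is_list($map));   // bool(false)
    var_dump(array_is_list($empty)); // bool(true)
}
```

Whether the difference matters depends on whether empty child arrays can occur in your data, so treat a swap to array_is_list() as something to test rather than a drop-in replacement.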


dearsina