
I need to split an array into several chunks, each with the same number of elements. Elements may be repeated across different chunks, and the repetition must be balanced across the output.

For example, starting from an array such as this:

$input = array(1,2,3,4,5,6,7,8,9,10,11);

I'm trying to develop a function that accepts the input array and the number of elements expected in each chunk. So, for example, [ balanced_chunk($input, 3) ]

should give me

0 => array(1,2,3)
1 => array(4,5,6)
2 => array(6,7,8)         #-- Value 6 is repeated
3 => array(9,10,11)

whereas [ balanced_chunk($input, 5) ]

should give me

0 => array(1,2,3,4,5)
1 => array(4,5,6,7,8)     #-- Values 4,5 are repeated
2 => array(7,8,9,10,11)   #-- Values 7,8 are repeated

and so on.

To start, I've developed this function:

function balanced_chunk($input, $size) {
    $step = ceil(count($input) / $size);
    $chunks = range(0, count($input), count($input) / $step);
    reset($chunks);
    while (list(,$start) = each($chunks)) {
            $start = (fmod($start, 1) <= 0.5 ? floor($start) : ceil($start));
            $clist[] = array_slice($input, $start, $size, true);
    }
    return($clist);
}

but, for a reason I can't figure out at the moment, it gives me this output:

[0] => Array (1,2,3,4,5)
[1] => Array (5,6,7,8,9)     #-- this element should instead start from 4...
[2] => Array (8,9,10,11)     #-- last element contains only 4 values

To give a better example, consider the input array [ a,b,c,d,e,f,g,h,i,l,m,n,o,p ]

A balanced split into chunks of 5 elements each has to be

[ a,b,c,d,e ]
          [ f,g,h,i,l ]
                  [ l,m,n,o,p ] #-- letter 'l' appears twice (2nd and 3rd result)

or (as a valid alternative)

[ a,b,c,d,e ] 
        [ e,f,g,h,i ]           #-- letter 'e' appears twice (1st and 2nd result)
                    [ l,m,n,o,p ]

A balanced split into chunks of 8 elements each has to be

[ a,b,c,d,e,f,g,h ] 
            [ g,h,i,l,m,n,o,p ]  #-- letters 'g','h' each appear twice

I'm stuck! After several attempts on my own, I haven't been able to work out how to solve this problem.

Stefano Radaelli
  • How do you decide what is valid for the 2nd array made from the sequence 1-11? Would `6,7,8,9,10` also be valid where 8,9 and 10 are all repeated? How about `5,6,7,8,9` where 5, 8 and 9 are all repeated elsewhere (5 in the 1st array and 8 and 9 in the second)? – madebydavid Apr 17 '14 at 16:45
  • Not exactly: for my purposes that sequence would not be balanced. Your example would give `[1,2,3,4,5][6,7,8,9,10][7,8,9,10,11]`, which means that 0 values of the 1st result are in common with the 2nd one, and 4 values of the 2nd result are in common with the last one. A "balanced" distribution instead has to produce `[1,2,3,4,5][4,5,6,7,8][7,8,9,10,11]`, where 2 values of the 1st result are in common with the 2nd one and 2 values of the 2nd one are in common with the 3rd. – Stefano Radaelli Apr 17 '14 at 20:28
  • Very interesting challenge. Where is it from? Why do you need this? – mzedeler Apr 18 '14 at 13:58
  • I'm trying to optimize the covering of a "father" matrix [ N,M ] with a smaller "child" matrix [ n,m ]. The challenge is to compute the smallest number of repetitions of the smaller matrix needed to cover the whole "father" matrix. To start, I've reduced the input matrix to a simple horizontal array of (for example) 75 elements, to be covered with a "child array" of just 10 elements. The child arrays can overlap, and where they do, the overlap must be balanced across the middle of the "father" array. – Stefano Radaelli Apr 18 '14 at 14:14
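
To make the arithmetic in the last comment concrete, here is a minimal sketch of the counts involved for the one-dimensional case. The function name `cover_counts` and its array keys are hypothetical, chosen only for illustration. It assumes the smallest number of windows is the length divided by the window size, rounded up, and that the surplus cells are shared out as evenly as possible between neighbouring windows.

```php
<?php
// Illustrative sketch: cover_counts() and its keys are assumptions,
// not part of the original post.
function cover_counts(int $length, int $size): array
{
    if ($length < 1 || $size < 1) {
        throw new InvalidArgumentException('Length and size must be at least 1.');
    }

    // Smallest number of windows of $size cells that can cover $length cells.
    $windows = intdiv($length + $size - 1, $size);

    // Cells covered more than once, in total, when the windows stay inside the array.
    $surplus = $windows * $size - $length;

    // Number of places where two neighbouring windows meet.
    $joins = $windows - 1;

    return [
        'windows'     => $windows,
        'surplus'     => $surplus,
        // An even spread gives every join either the floor or the ceiling
        // of surplus / joins overlapping cells.
        'min_overlap' => $joins > 0 ? intdiv($surplus, $joins) : 0,
        'max_overlap' => $joins > 0 ? intdiv($surplus + $joins - 1, $joins) : 0,
    ];
}

// 75 cells covered by windows of 10: 8 windows, 5 surplus cells over 7 joins.
print_r(cover_counts(75, 10));
```

For the 11-element, size-5 example from the question, the same function reports 3 windows with exactly 2 shared values at each join, which agrees with the balanced output described above.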

2 Answers

You have probably moved on to other things now! But here is my way of doing this.

function balanced_chunk($input, $size)
{
    $len = count($input);
    $chunkSize = ceil($len / $size);
    $o = [];
    $i = 1;
    $k = 0;
    foreach ($input as $elem)
    {
        $o[$k][$i] = $elem;
        $k = ($i % $chunkSize == 0) ? $k+1 : $k;
        $i++;
    }
    return $o;
}
kohloth

For the moment, while waiting to find a "smarter" approach, I've come up with this solution:

function balanced_chunk($input, $size) {
    $chunks   = ceil(count($input) / $size);
    $step     = count($input) / $chunks;
    $chunklist = array();
    for ($i = 0; $i < count($input); $i += $step) {
        $chunk = array_slice($input, floor($i), $size);
        if (count($chunk) < $size)  $chunk = array_slice($input, $size * -1, $size, true);
        $chunklist[] = $chunk;
    }
    return($chunklist);
}

Some examples of what it produces. Ex #1: 11 elements split into chunks of 3 each:

$split = balanced_chunk(range(1, 11), 3);
/*
[ 1,2,3 ]
    [ 3,4,5 ][ 6,7,8 ][ 9,10,11 ]

In this case the function needs better tuning, since a better balanced
output would instead be

[ 1,2,3 ][ 4,5,6 ]
             [ 6,7,8 ][ 9,10,11 ]
*/

Ex #2: 11 elements split into chunks of 4 each:

$split = balanced_chunk(range(1, 11), 4);
/*
[ 1,2,3,4 ]
      [ 4,5,6,7 ][ 8,9,10,11 ]

In this case the given output is equivalent to the alternative

[ 1,2,3,4 ][ 5,6,7,8 ]
                 [ 8,9,10,11 ]
*/

Ex #3: 11 elements split into chunks of 5 each:

$split = balanced_chunk(range(1, 11), 5);
/*
[ 1,2,3,4,5 ]
      [ 4,5,6,7,8 ]
            [ 7,8,9,10,11 ]
*/

Ex #4: 11 elements split into chunks of 7 each:

$split = balanced_chunk(range(1, 11), 7);
/*
[ 1,2,3,4,5,6,7 ]
        [ 5,6,7,8,9,10,11 ]
*/

I welcome tips or suggestions to improve this.
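
One possible tuning, offered as a sketch rather than a definitive fix: instead of stepping a float across the whole array, spread the start offsets evenly between 0 and count - size, so the last chunk always ends exactly at the end of the input. The name `balanced_chunk_even` is hypothetical, and the tie-breaking (an offset that falls exactly halfway is rounded up) is an assumption. The sketch uses integer arithmetic only, via `intdiv` (PHP 7+), to avoid accumulating floating-point error.

```php
<?php
// Sketch only: balanced_chunk_even() is a hypothetical name,
// not part of the original post.
function balanced_chunk_even(array $input, int $size): array
{
    if ($size < 1) {
        throw new InvalidArgumentException('Chunk size must be at least 1.');
    }

    $input = array_values($input); // work with 0-based positions
    $len   = count($input);

    if ($len === 0) {
        return [];
    }
    if ($len <= $size) {
        return [$input]; // a single (possibly short) chunk covers everything
    }

    $chunks = intdiv($len + $size - 1, $size); // ceil($len / $size)
    $span   = $len - $size;                    // start offset of the last chunk
    $gaps   = $chunks - 1;                     // intervals between start offsets

    $out = [];
    for ($k = 0; $k < $chunks; $k++) {
        // Integer form of round($k * $span / $gaps), with halves rounded up
        // (assumed tie-breaking rule).
        $start = intdiv(2 * $k * $span + $gaps, 2 * $gaps);
        $out[] = array_slice($input, $start, $size);
    }

    return $out;
}

print_r(balanced_chunk_even(range(1, 11), 3));
// [1,2,3] [4,5,6] [6,7,8] [9,10,11]
```

With this variant, Ex #1 yields the better-balanced output described in its comment, Ex #2 yields the alternative [ 1,2,3,4 ][ 5,6,7,8 ][ 8,9,10,11 ], and Ex #3 and Ex #4 are unchanged.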

Stefano Radaelli