Accumulating different sized column vectors stored as a cell array into a matrix padded with NaNs

Question

Imagine I have a series of different-sized column vectors inside a cell array and want to group them into a matrix, padding the empty spaces with NaN. How can I do this?

There is already an answer to a very similar problem (accumulate cells of different lengths into a matrix in MATLAB?), but that solution deals with row vectors and my problem is with column vectors. One possible solution would be to transpose each element of the cell array and then apply the above-mentioned solution. However, I have no idea how to do this.

Also, speed is a bit of an issue so if possible take that into consideration.

score 4 · Answer 1 · edited May 23 '17 at 12:29

You can just slightly tweak that answer you found to work for columns:

tcell = {[1,2,3]', [1,2,3,4,5]', [1,2,3,4,5,6]', [1]', []'};
maxSize = max(cellfun(@numel, tcell));            % length of the longest vector
fcn = @(x) [x; nan(maxSize - numel(x), 1)];       % pad a column vector with NaN up to maxSize
cmat = cellfun(fcn, tcell, 'UniformOutput', false);
cmat = horzcat(cmat{:})                           % place the padded columns side by side

cmat =

     1     1     1     1   NaN
     2     2     2   NaN   NaN
     3     3     3   NaN   NaN
   NaN     4     4   NaN   NaN
   NaN     5     5   NaN   NaN
   NaN   NaN     6   NaN   NaN

Or you could tweak this as an alternative:

cell2mat(cellfun(@(x)cat(1,x,NaN(maxSize-length(x),1)),tcell,'UniformOutput',false))
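Since speed was mentioned in the question, here is a minimal sketch of a third variant that preallocates the NaN matrix and fills it column by column. It is not taken from the linked answer, the variable names `lens` and `k` are only illustrative, and whether it is actually faster than the `cellfun` versions would need to be timed on your own data.

```matlab
% Same example input as above
tcell = {[1,2,3]', [1,2,3,4,5]', [1,2,3,4,5,6]', [1]', []'};

lens = cellfun(@numel, tcell);          % length of each vector
cmat = nan(max(lens), numel(tcell));    % preallocate the result, filled with NaN

for k = 1:numel(tcell)
    % (:) forces a column, so row-vector or empty inputs are handled as well
    cmat(1:lens(k), k) = tcell{k}(:);
end
```

The result should match the `cmat` shown above; the loop only overwrites the leading entries of each column and leaves the NaN padding untouched.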

Thanks, it worked. I can't upvote you because I don't have enough points, but as soon as I do I promise I will. Thank you for the quick answer! — worldexplorer95, Jul 31 '14 at 12:24

gire · Answer 2 · 2014-07-31T11:53:49.713

2

If you want speed, the cell data structure is your enemy. For this example I will assume you have these vectors stored in a structure called vector_holder:

elements = fieldnames(vector_holder);

% Per Dan's request: maximum_size is the length of the longest vector.
% numel gives each vector's length; max would give its largest value.
maximum_size = max(structfun(@numel, vector_holder));

% One row per vector, padded with NaN on the right
matrix = NaN(length(elements), maximum_size);

for i = 1:length(elements)
    current_vector = vector_holder.(elements{i});   % dynamic field access
    current_length = length(current_vector);
    matrix(i, 1:current_length) = current_vector;
end

Many Matlab functions are slower when dealing with cell variables. In addition, a cell array holding N double-precision elements requires more memory than a double-precision matrix with N elements.
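One rough way to check the memory claim yourself is to compare the sizes that `whos` reports. This is only a sketch: the variable names are made up for the illustration, and the exact byte counts depend on the Matlab version and platform (Octave may report them differently).

```matlab
N = 1000;
asMatrix = rand(N, 1);              % one N-by-1 double array
asCell   = num2cell(asMatrix);      % N-by-1 cell array, one double per cell

infoMatrix = whos('asMatrix');
infoCell   = whos('asCell');
fprintf('matrix: %d bytes\ncell:   %d bytes\n', infoMatrix.bytes, infoCell.bytes);
```

In Matlab the cell version reports noticeably more bytes, because every cell carries its own array header in addition to the 8 bytes of data.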

edited Jul 31 '14 at 11:53

answered Jul 31 '14 at 11:46


You should add how to find `maximum_size` from `vector_holder` – Dan Jul 31 '14 at 11:49
Have you tested `maximum_size = max(structfun(@max, vector_holder))`? On Octave Online it gives an error (the error btw leads me to believe that `structfun` is going to convert your struct to a `cellarray` internally therefore losing any memory advantage it may have had. Also adding `'uni',false` didn't help), but it may work in Matlab - please confirm. – Dan Jul 31 '14 at 11:56
@Dan it might be even possible to get rid of the for-loop, but the code will look awful. – gire Jul 31 '14 at 11:58
@Dan yes, it works in Matlab. I have no access to Octave, unfortunately. – gire Jul 31 '14 at 11:59
+1 then, I just don't have access to Matlab right now which is why I was checking on [Octave Online](http://octave-online.net/). I guess this is one of those cases where the two languages differ. – Dan Jul 31 '14 at 12:05
@Dan cannot say anything about Octave behavior, but in Matlab `structfun` is just a short notation to avoid for-loops. `structfun` will access the elements of the structure and apply the function to whatever is inside the struct (convenient when the applied function is simple). No conversion to cell involved at all. – gire Jul 31 '14 at 12:09
