
Assume the following hypothetical MATLAB data (stored as column vectors): 3 subjects (i = 1 to 3) each provide three measurements y1, y2, y3 at up to 5 time points (j = 1 to 5). The design is unbalanced, so some subjects have fewer time points. The original data set is bigger, so I need to use a cell array. In the end I need Y = cell(3,1) such that, for subject i, Y{i} is a matrix holding the repeated measures for that subject.

i   j   y1  y2  y3

1   1   1.0 0.6 0.8
1   2   0.8 0.7 0.2
1   3   1.0 0.7 0.9
1   4   1.0 0.8 0.7
1   5   0.7 0.8 0.9

2   1   0.5 0.7 0.8
2   2   0.4 0.7 0.6
2   3   0.4 0.5 0.8

3   1   0.4 0.5 0.7
3   2   0.5 0.6 0.8
3   3   0.5 0.6 0.8
3   4   0.6 0.6 0.8

So I need the result to look like this:

Y{1}=       
1.0 0.6 0.8
0.8 0.7 0.2
1.0 0.7 0.9
1.0 0.8 0.7
0.7 0.8 0.9

Y{2}=       
0.5 0.7 0.8
0.4 0.7 0.6
0.4 0.5 0.8

Y{3}=       
0.4 0.5 0.7
0.5 0.6 0.8
0.5 0.6 0.8
0.6 0.6 0.8

I also need to use i and j to help with the indexing.

N. I. ElZayat
  • The question is not clear. What does the original look like (i.e. what does *unbalanced* mean)? In what format is it (e.g. text file, Matlab array)? What do you expect as an output? A small example with a few lines of *unbalanced* input and the expected output will help to understand what you exactly want -- and get a useful answer. – Brice Nov 23 '18 at 08:46
  • Please edit the question, this is not readable in a comment – Brice Nov 23 '18 at 09:36
  • @Brice, I felt it was not readable, so I edited the question and tried to clarify. I hope it is clear now. Unbalanced means the matrices in the cells of Y are not all the same size: the first is 5x3, the second 3x3, and the last 4x3. – N. I. ElZayat Nov 23 '18 at 09:43

1 Answer


I'll assume that the input data is contained in 5 vectors called i, j, y1, y2 and y3. You may use the following code:

% Initialize Y
Y = cell(max(i),1);

% Loop to read
for kk=1:max(i)
    sel = (i==kk); % Logical array used to select the rows belonging to subject kk
    ind = j(sel);  % In case the input data is not sorted, this tells us which row of Y{kk} each measurement goes to
    % Start with the last column so that Y{kk} is allocated with the right size
    Y{kk}(ind,3) = y3(sel);
    Y{kk}(ind,2) = y2(sel);
    Y{kk}(ind,1) = y1(sel);
end

If all the data is already sorted by i and j, you could use mat2cell. A loop would still be needed to find the number of rows for each value of i:

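% Initialize the per-subject row counts
count = zeros(max(i),1);

% Loop to count
for kk=1:max(i)
    count(kk)=sum(i==kk); % Number of rows belonging to subject kk
end
% Split the stacked matrix into one block of count(kk) rows per subject
Y = mat2cell([y1,y2,y3] , count);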
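
The row counts for mat2cell can also be obtained without a loop. The following is only a minimal sketch, assuming the data is sorted by subject and the subject indices are positive integers 1..N. The name subj is a hypothetical replacement for i, and the values are the sample data from the question:

```matlab
% Sample data from the question; subj is a hypothetical stand-in for i
subj = [1 1 1 1 1 2 2 2 3 3 3 3]';
y1 = [1.0 0.8 1.0 1.0 0.7 0.5 0.4 0.4 0.4 0.5 0.5 0.6]';
y2 = [0.6 0.7 0.7 0.8 0.8 0.7 0.7 0.5 0.5 0.6 0.6 0.6]';
y3 = [0.8 0.2 0.9 0.7 0.9 0.8 0.6 0.8 0.7 0.8 0.8 0.8]';

% accumarray adds 1 for every occurrence of each subject index,
% which gives the number of rows per subject
count = accumarray(subj, 1);

% Split the stacked 12x3 matrix into blocks of count(k) rows, all 3 columns wide
Y = mat2cell([y1, y2, y3], count, 3);
```

This relies on the rows of each subject being contiguous; for unsorted data the selection loop above remains the safer choice.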

(As a side note, using i and j as variable names is not recommended, as it may cause confusion with the imaginary unit and may affect performance. MathWorks recommends using other variable names, and writing 1i or 1j when referring to the imaginary unit.)

Brice
  • Your first code results in an error. I modified two lines and then it worked, but I can't post the code neatly in a comment because Ctrl+K does not mark the selected line as code here. – N. I. ElZayat Nov 23 '18 at 13:47
  • You may use backquotes ` around code snippets in comments. This is also how you put code inline in questions & answers. – Brice Nov 23 '18 at 14:23
  • The modified code is: `N=max(i1); Y = cell(1,N); for kk=1:N sel = (i1==kk); Y{kk} = [y1(sel),y2(sel),y3(sel)]; end` I cannot find how to enter a new line in a comment. – N. I. ElZayat Nov 23 '18 at 14:27
  • I got it. When running the code from my last comment on the full data set, I needed to add another line inside the loop to remove the NaN rows for unbalanced subjects. The final code that works with my full data is `N=max(i1); Y = cell(1,N); for kk=1:N sel = (i1==kk); yy = [y1(sel),y2(sel),y3(sel)]; yy(any(isnan(yy), 2), :) = []; Y{kk} = yy; end` – N. I. ElZayat Nov 23 '18 at 15:41
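
For readability, the loop from the last comment can be laid out over several lines. This is only a sketch: the small data set below is hypothetical (it is not the asker's full data) and assumes that missing measurements are stored as NaN in equal-length column vectors i1, y1, y2 and y3.

```matlab
% Hypothetical sample data; NaN marks a missing time point for subject 1
i1 = [1 1 1 2 2 3]';
y1 = [1.0 0.8 NaN 0.5 0.4 0.4]';
y2 = [0.6 0.7 NaN 0.7 0.7 0.5]';
y3 = [0.8 0.2 NaN 0.8 0.6 0.7]';

N = max(i1);
Y = cell(1, N);
for kk = 1:N
    sel = (i1 == kk);                    % rows belonging to subject kk
    yy = [y1(sel), y2(sel), y3(sel)];    % stack the three measurements
    yy(any(isnan(yy), 2), :) = [];       % drop rows containing any NaN
    Y{kk} = yy;
end
```

Note that any(isnan(yy), 2) removes a row as soon as one of the three measurements is NaN, so partially observed time points are discarded as well.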