For each integer, take the mean of all the values

Question

I have a question for a problem I am trying to solve.

I have a rather large array. Column 1 holds a series of numbers such as 4, 4.2, 4.4 and 16, 16.5, 16.7 and so on, and column 2 holds a 0 or a 1 for each of those numbers, so that a row might have, say, 5 in column 1 and 0 in column 2. Below is a very small version of the matrix I am working with:

[5,0;5.10000000000000,0;5.20000000000000,0;5.25000000000000,0;5.30000000000000,0;5.35000000000000,0;5.45000000000000,0;5.50000000000000,0;5.55000000000000,0;5.60000000000000,0;14.2000000000000,0;5.70000000000000,0;5.80000000000000,0;5.90000000000000,0;14.0000000000000,0;14.9500000000000,1;14.8500000000000,1;14.6000000000000,1;14.3500000000000,1;14.3000000000000,1;14.2500000000000,1;14.3500000000000,1;14.2500000000000,1;14.1500000000000,1;14.0500000000000,1;]

What I want to do is write code that averages the 0s and 1s of column 2 for each integer in column 1. I honestly have no idea where to begin. I started to write a for loop, but was unsure how I would run a process on a group of rows in column 2 on the basis of a group of rows in column 1. Does anyone have any ideas? I apologize that I do not have any example code yet.

The question is not very clear to me. Can you please provide the desired output for your sample input? `What I want to do is write code that averages the 0s and 1s of column 2 for each integer in column 1` ?? Column 1 contains floating-point values, so how do you want to average? — jkshah, Oct 17 '13 at 19:32
You say "for each integer in column 1", but column 1 does not contain integers. Do you mean the integer part? Or do you mean "for each number in column 1"? — Luis Mendo, Oct 17 '13 at 22:36

craigim · Answer 1 · 2013-10-17T20:14:38.283

I would do something like this:

integers = floor(inputMatrix(:,1));
uniqueIntegers = unique(integers);
K = numel(uniqueIntegers);

outputMean = nan(1,K);
for k = 1:K
    outputMean(k) = mean(inputMatrix(integers==uniqueIntegers(k),2));
end

where inputMatrix is your matrix above. In plain English, convert the first column to integers with the floor function, pick out the unique values, and then loop through the unique values and find the mean using logical indexing. The two vectors uniqueIntegers and outputMean contain those integers and the averaged values of the second column, respectively.
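The same grouping can also be written without an explicit loop. The following is only a sketch of one possible vectorized variant, not part of the code above: it assumes inputMatrix has the two-column layout from the question (a shortened, made-up sample is used here) and relies on the third output of unique together with accumarray.

```matlab
% Shortened sample in the question's format: column 1 values, column 2 flags.
inputMatrix = [5 0; 5.1 0; 5.9 0; 14.2 0; 14 0; 14.95 1; 14.85 1; 14.6 1];

% Integer part of column 1; groupIdx maps every row to its entry in uniqueIntegers.
[uniqueIntegers, ~, groupIdx] = unique(floor(inputMatrix(:,1)));

% Average column 2 within each group; (:) forces a column of subscripts.
outputMean = accumarray(groupIdx(:), inputMatrix(:,2), [], @mean);
```

Note that outputMean comes out as a column vector here, whereas the loop above fills a row vector.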

As suggested in the comments, I think there are a few ways to read the question. First, as answered above, the OP wants to average together everything with, say, a 5 to the left of the decimal point.

If, however, the OP wants to average together only those rows whose first-column value is exactly 5 (or exactly some other integer), then replace the line inside the loop with:

outputMean(k) = mean(inputMatrix(inputMatrix(:,1)==uniqueIntegers(k),2));

If, however, the OP wants to average together all rows whose first-column value is a whole number, regardless of which number it is, then the entire code block can be reduced to:

integers = floor(inputMatrix(:,1)) == inputMatrix(:,1);
outputMean = mean(inputMatrix(integers,2));
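To make the difference between the three readings concrete, here is a small sketch that applies each of them to the integer 14. The sample matrix is made up for illustration and is not the OP's data, and the exact == comparisons assume that whole numbers in column 1 are stored exactly.

```matlab
% Made-up sample: column 1 values, column 2 flags.
inputMatrix = [5 0; 5.1 0; 5.9 1; 14.2 0; 14 1; 14.95 1];
intPart = floor(inputMatrix(:,1));

% Reading 1: all rows whose integer part is 14 (14.2, 14, 14.95).
mean1 = mean(inputMatrix(intPart == 14, 2));               % 2/3

% Reading 2: only rows whose value is exactly 14.
mean2 = mean(inputMatrix(inputMatrix(:,1) == 14, 2));      % 1

% Reading 3: all rows holding any whole number (here 5 and 14).
isWhole = intPart == inputMatrix(:,1);
mean3 = mean(inputMatrix(isWhole, 2));                     % 0.5
```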

Makes sense, but the posted matrix makes no sense since all values in column 1 are unique... — chappjc, Oct 17 '13 at 20:00
The OP said he wanted to average over each integer in column 1; that's why I included the `floor` statement to round all the values down. — craigim, Oct 17 '13 at 20:07
I see. I wouldn't have assumed floor, but it makes sense this way. +1 — chappjc, Oct 17 '13 at 20:14
There are at least three ways to read the question by my count. I've added answers to the other two. — craigim, Oct 17 '13 at 20:42

score 2 · Accepted Answer · edited May 23 '17 at 11:57

If I understand correctly, you want the average of all values of the second column that have the same integer part in the first column.

You can achieve this by a slight modification of the answer to a previous question. Let x be your data (2 columns, arbitrary number of rows). Then:

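x1_int = floor(x(:,1));                 % integer part of column 1
[value_sort, ind_sort] = sort(x1_int);  % bring equal integer parts together
[~, ii, jj] = unique(value_sort);       % ii: last row of each group (legacy behavior), jj: group of each sorted row
n = diff([0; ii]);                      % number of rows in each group
result = [ value_sort(ii) accumarray(jj,x(ind_sort,2))./n ];  % index value_sort, since ii refers to the sorted vector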

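If you use MATLAB R2013a or newer, replace the third line with the following. This is necessary because the unique function changed in R2013a: by default it now returns the index of the first occurrence of each value rather than the last, and the computation of n relies on the last occurrence: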

[~, ii, jj] = unique(value_sort,'legacy');

Column 1 of the variable result contains the integer parts of column 1 of x, and column 2 contains the corresponding averages of column 2 of x. With your example data:

x= [5.0000         0
    5.1000         0
    5.2000         0
    5.2500         0
    5.3000         0
    5.3500         0
    5.4500         0
    5.5000         0
    5.5500         0
    5.6000         0
   14.2000         0
    5.7000         0
    5.8000         0
    5.9000         0
   14.0000         0
   14.9500    1.0000
   14.8500    1.0000
   14.6000    1.0000
   14.3500    1.0000
   14.3000    1.0000
   14.2500    1.0000
   14.3500    1.0000
   14.2500    1.0000
   14.1500    1.0000
   14.0500    1.0000]

the result is

result =

    5.0000         0
   14.0000    0.8333
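
One way to avoid depending on which occurrence unique returns is to count the rows per group with accumarray instead of differencing the indices. This is only a sketch of that alternative, not the code from the answer above; the names values, groupSum and groupCount are introduced here for illustration, and x is the example data from the question.

```matlab
% Example data from the question: column 1 values, column 2 flags.
x = [5 0; 5.1 0; 5.2 0; 5.25 0; 5.3 0; 5.35 0; 5.45 0; 5.5 0; 5.55 0; 5.6 0; ...
     14.2 0; 5.7 0; 5.8 0; 5.9 0; 14 0; 14.95 1; 14.85 1; 14.6 1; 14.35 1; ...
     14.3 1; 14.25 1; 14.35 1; 14.25 1; 14.15 1; 14.05 1];

x1_int = floor(x(:,1));
% values: sorted unique integer parts; jj: group number of every row of x.
[values, ~, jj] = unique(x1_int);
jj = jj(:);

groupSum   = accumarray(jj, x(:,2));   % sum of column 2 per group
groupCount = accumarray(jj, 1);        % number of rows per group
result = [values, groupSum ./ groupCount];
```

For the 14 group this gives 10 ones out of 12 rows, which matches the 0.8333 shown above.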
