I'm plotting a box plot with overlaid data from the following concatenated matrix:

data = [10 16 24  31  12 26  23 33; ...
        11 15 27  27  12 24  22 36; ...
        12 15 24  25  14 25  22 37; ...
        10 16 27  24  14 27  23 41; ...
        12 15 NaN NaN 15 NaN 22 NaN; ...
        13 18 NaN NaN 16 NaN 22 NaN];

The code for this plot is:

datas=sort(data);
datainbox=datas(ceil(end/4)+1:floor(end*3/4),:);
[n1,n2]=size(datainbox);
dataoutbox=datas([1:ceil(end/4) floor(end*3/4)+1:end],:);
n3=size(dataoutbox,1);
% calculate quartiles
dataq=quantile(data,[.25 .5 .75]);
% calculate range between box and outliers = between 1.5*IQR from quartiles
dataiqr=iqr(data);
datar=[dataq(1,:)-dataiqr*1.5;dataq(3,:)+dataiqr*1.5];
dataoutbox(dataoutbox<ones(n3,1)*datar(1,:)|dataoutbox>ones(n3,1)*datar(2,:))=nan;

figure()
hold on
bp = boxplot(data);
plot(ones(n1,1)*[1 2 3 4 5 6 7 8]+.4*(rand(n1,n2)-.5),datainbox,'k.','MarkerSize',12)
plot(ones(n3,1)*[1 2 3 4 5 6 7 8]+.4*(rand(n3,n2)-.5),dataoutbox,'.','color',[1 1 1]*.5,'MarkerSize',12)
set(bp,'linewidth',1);

As indicated above, I am sorting the data into 'datainbox' and 'dataoutbox' based on the IQR. The code works as expected (credit to JJM Driesson) except for the data columns containing NaNs, where, as shown in the plot, the data is not sorted correctly. How should I modify the above code to exclude NaNs from the calculations and prevent them from influencing the plot?

Thank you for your time,

Laura


1 Answer

You should process every column separately. You can select the non-NaN values of column i as follows: col = data(~isnan(data(:, i)), i);

If you want all the boxplots in the same figure, you can try to use this answer.
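As a rough sketch of the per-column approach, the question's code could be restructured along the following lines. This is an illustration rather than a tested drop-in replacement: the names nCols, inBox, outBox, lo, hi, rest, q and fence are made up for this example, and the IQR is taken as the difference of the two quartiles returned by quantile so that it is computed on the NaN-free column only.

```matlab
data = [10 16 24  31  12 26  23 33; ...
        11 15 27  27  12 24  22 36; ...
        12 15 24  25  14 25  22 37; ...
        10 16 27  24  14 27  23 41; ...
        12 15 NaN NaN 15 NaN 22 NaN; ...
        13 18 NaN NaN 16 NaN 22 NaN];

nCols  = size(data, 2);
inBox  = cell(1, nCols);   % points in the middle ranks of each column
outBox = cell(1, nCols);   % remaining points that lie within 1.5*IQR of the quartiles

for i = 1:nCols
    % Drop the NaNs of this column first, then sort what is left
    col = sort(data(~isnan(data(:, i)), i));
    n   = numel(col);

    % Same rank-based split as in the question, using the NaN-free length
    lo = ceil(n/4);
    hi = floor(n*3/4);
    inBox{i} = col(lo+1:hi);
    rest     = col([1:lo, hi+1:n]);

    % Quartiles and 1.5*IQR limits of this column
    q     = quantile(col, [.25 .75]);
    fence = [q(1) - 1.5*(q(2) - q(1)), q(2) + 1.5*(q(2) - q(1))];

    % Leave points beyond the limits to boxplot's own outlier markers
    outBox{i} = rest(rest >= fence(1) & rest <= fence(2));
end

figure()
bp = boxplot(data);
hold on
for i = 1:nCols
    % Jitter the x positions around the column index, as in the question
    plot(i + .4*(rand(size(inBox{i})) - .5), inBox{i}, 'k.', 'MarkerSize', 12)
    plot(i + .4*(rand(size(outBox{i})) - .5), outBox{i}, '.', ...
         'Color', [1 1 1]*.5, 'MarkerSize', 12)
end
set(bp, 'linewidth', 1);
```

Cell arrays are used here because the columns no longer have the same number of points once the NaNs are removed, so the results cannot be stored in one rectangular matrix.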

m7913d
  • Here is [another related post](http://stackoverflow.com/questions/39708175/hierarchically-grouped-boxplot) for grouping boxplots – EBH Feb 28 '17 at 20:09
  • Hi both, thank you for your help, using both of your tips I've solved it! – Laura Mar 07 '17 at 12:02