
I have 3 vectors: Y=rand(1000,1), X=Y-rand(1000,1) and ACTid=randi(6,1000,1). I'd like to create boxplots by groups of Y and X corresponding to their group value 1:6 (from ACTid).

This is rather ad-hoc and looks nasty

for ii = 1:6   % one cell per group value in ACTid
dummyY(ii) = {Y(ACTid==ii)};
dummyX(ii) = {X(ACTid==ii)};
end

Now I have the data in a cell but can't work out how to group it in a boxplot. Any thoughts?

I've found the `aboxplot` function, which produces this kind of grouped plot, but I don't want that. I'd like to use the builtin `boxplot` function because I'm converting the figure with matlab2tikz, and it doesn't handle the `aboxplot` output well.


EDIT

Thanks to Oleg: we now have a grouped boxplot... but the labels are all skew-whiff.

xylabel = repmat({'Bleh','Blah'},1000,1); % need a legend instead, but doesn't appear possible
boxplot([Y(:,end); cfu], {repmat(ACTid,2,1), xylabel(:)} ,'factorgap',10,'color','rk')
set(gca,'xtick',1.5:3.2:50)
set(gca,'xticklabel',{'Direct care','Housekeeping','Mealtimes','Medication','Miscellaneous','Personal care'})
ylabel('Raw CFU counts (Y)')


How to add a legend?

HCAI
  • I have the outlier thing covered because I'm using `matlab2tikz`, where I can easily specify that. The legend I can put in artificially too, but it's not fun like that :S – HCAI Apr 12 '13 at 16:14

2 Answers


I had the same problem with grouping data in a box plot. A further constraint of mine was that different groups have different amounts of data points. Based on a tutorial I found, this seems to be a nice solution I wanted to share with you:

x = [1,2,3,4,5,1,2,3,4,6];
group = [1,1,2,2,2,3,3,3,4,4];
positions = [1 1.25 2 2.25];
boxplot(x,group, 'positions', positions);

set(gca,'xtick',[mean(positions(1:2)) mean(positions(3:4)) ])
set(gca,'xticklabel',{'Direct care','Housekeeping'})

color = ['c', 'y', 'c', 'y'];
h = findobj(gca,'Tag','Box');
for j=1:length(h)
   patch(get(h(j),'XData'),get(h(j),'YData'),color(j),'FaceAlpha',.5);
end

c = get(gca, 'Children');

hleg1 = legend(c(1:2), 'Feature1', 'Feature2' );

[Figure: colored grouped boxplot with varying group sizes]

Here is a link to the tutorial.
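For the six activity groups in the question, the same `'positions'` idea can be generated rather than typed by hand. This is only a sketch under a few assumptions: `boxId`, `nGroups` and `offset` are names made up here, the 0.25 spacing is arbitrary, and it relies on `boxplot` ordering numeric group labels numerically and on every one of the 12 X/Y-by-group combinations being present in the data.

```matlab
% Data as in the question
Y     = rand(1000,1);
X     = Y - rand(1000,1);
ACTid = randi(6,1000,1);

nGroups = 6;      % activity groups 1..6
offset  = 0.25;   % assumed gap between the two boxes of one group

% One numeric label per box: odd = X of group k, even = Y of group k
data  = [X; Y];
boxId = [2*ACTid - 1; 2*ACTid];

% Box positions: group k gets k (for X) and k + offset (for Y)
positions = reshape([1:nGroups; (1:nGroups) + offset], 1, []);

boxplot(data, boxId, 'positions', positions);

% One tick per group, centred between its two boxes
set(gca, 'xtick', (1:nGroups) + offset/2)
set(gca, 'xticklabel', {'Direct care','Housekeeping','Mealtimes', ...
                        'Medication','Miscellaneous','Personal care'})
```

Because the ticks are computed from the same `offset` as the positions, changing the spacing keeps the labels centred without retuning a hard-coded tick vector.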

Simeon
  • It seems that the handles are returned from the last to the first, so to match the order of the `color` map from left to right, I just changed the `for` loop into `patch(get(h(j),'XData'),get(h(j),'YData'),color(length(h)-j+1),'FaceAlpha',.5);` – George Aprilis Sep 09 '14 at 09:28

A two-line approach (although if you want to retain two-line x-labels and center those in the first line, it's going to be hackish):

Y     = rand(1000,1);
X     = Y-rand(1000,1);
ACTid = randi(6,1000,1);

xylabel = repmat('xy',1000,1);
boxplot([X; Y], {repmat(ACTid,2,1), xylabel(:)} ,'factorgap',10)

The result:


EDIT

To center labels...

% Retrieve handles to text labels
h = allchild(findall(gca,'type','hggroup'));

% Delete x, y labels
throw = findobj(h,'string','x','-or','string','y');
h     = setdiff(h,throw);
delete(throw);

% Center labels
mylbl  = {'this','is','a','pain','in...','guess!'};
hlbl   = findall(h,'type','text');
pos    = cell2mat(get(hlbl,'Position'));

% New centered position for first intra-group label
newPos = num2cell([mean(reshape(pos(:,1),2,[]))' pos(1:2:end,2:end)],2);
set(hlbl(1:2:end),{'Position'},newPos,{'String'},mylbl')

% delete second intra-group label
delete(hlbl(2:2:end))

Exporting as .png will cause problems...
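On the legend question: one possible sketch is to pass one box handle per colour to `legend`. It assumes that `'colors','rk'` alternates red and black across the boxes (as in the question's edit) and that the box outlines carry the tag `'Box'`; `hRed` and `hBlack` are names invented here. Whether `matlab2tikz` exports such a legend correctly is not something this sketch checks.

```matlab
Y     = rand(1000,1);
X     = Y - rand(1000,1);
ACTid = randi(6,1000,1);

xylabel = repmat('xy',1000,1);
boxplot([X; Y], {repmat(ACTid,2,1), xylabel(:)}, 'factorgap',10, 'colors','rk')

% One line object per box outline
hBox = findobj(gca,'Tag','Box');

% Pick a single representative box for each colour by reading its Color
isRed  = arrayfun(@(h) isequal(get(h,'Color'), [1 0 0]), hBox);
hRed   = hBox(find(isRed, 1));
hBlack = hBox(find(~isRed, 1));

% Assumed mapping: red boxes are X, black boxes are Y
legend([hRed hBlack], {'X','Y'})
```

Selecting the handles by colour rather than by index avoids depending on the order in which `findobj` returns the boxes.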

Oleg
  • You can add `colorgroup` and `colors` arguments to distinguish groups by color. – yuk Apr 12 '13 at 15:25
  • Thank you very much for this! I have two little questions. I'd like each group to be called something different, i.e. 1=Direct care, 2=Housekeeping, 3=Mealtimes, and so on. And instead of the x, y group labels, I'd like a legend for colour: `boxplot([Y; X], {repmat(ACTid,2,1), xylabel(:)} ,'factorgap',10,'color','rk')`. @yuk the colors argument changes the box but not the outliers; can these be changed too? – HCAI Apr 12 '13 at 15:46
  • It can be done but in a little hackish way. See this [question](http://stackoverflow.com/questions/15125314/colorfill-the-boxes-in-a-boxplot-in-matlab), for example. – yuk Apr 12 '13 at 16:22
  • As for labels, use the 'labels' argument in `boxplot`. See the [documentation](http://www.mathworks.com/help/stats/boxplot.html) for other options. – yuk Apr 12 '13 at 16:24
  • @yuk What about the legend though? That doesn't seem to work at all. – HCAI Apr 12 '13 at 16:32
  • @OlegKomarov Many thanks to both of you. I see, Oleg, that you are also quite active on MATLAB Answers :). Unfortunately I'm exporting to LaTeX via `matlab2tikz` and this does not capture the group labels correctly. I will stick with `set(gca,'xtick',1.5:3.2:50)` and `set(gca,'xticklabel',{'Direct care','Housekeeping','Mealtimes','Medication','Miscellaneous','Personal care'})` as an ad-hoc approach. – HCAI Apr 12 '13 at 20:05