
I am using `rose` in MATLAB and want to color red those triangles whose values lie above the 95th percentile (the maximum outliers). I used the following generic code:

clear all
close all
ncat = 180;
mydata = rand(360,1).*100;       % random vector
mydata = mydata./max(mydata).*100; % normalize to a max of 100
[tout, rout] = rose(mydata,ncat); % do rose plot with 180 categories

polar(tout, rout);   % getting coordinates
[xout, yout] = pol2cart(tout, rout);
set(gca, 'nextplot', 'add');
test = sum(reshape(mydata(1:360),360/ncat,[])); 
index = find( test >= prctile(test,95)); % get index of outliers
for cindex = index
    fill(xout((cindex-1)*4+1:cindex*4), yout((cindex-1)*4+1:cindex*4), 'r'); % fill outliers red
end
set(gca,'View',[-90 90],'YDir','reverse'); % put 0(360) to top

However, the filled triangles are not the ones with the maximum values, and I cannot figure out why. Any ideas?


Adding the solution suggested by @zeeMonkeez:

% as suggested in the answer
figure
[tout, rout] = rose(mydata,ncat); % do rose plot with 180 categories
polar(tout, rout);   % getting coordinates
[xout, yout] = pol2cart(tout, rout);
set(gca, 'nextplot', 'add');
test = rout(2:4:end);
index = find( test >= prctile(test,95)); % get index of outliers
for cindex = index
    fill(xout((cindex-1)*4+1:cindex*4), yout((cindex-1)*4+1:cindex*4), 'r'); % fill outliers red
end
set(gca,'View',[-90 90],'YDir','reverse'); % put 0(360) to top

This does mark the highest bins.


However, for the original aggregated data (the old `test`, called `test1` here) I get

test1( test1 >= prctile(test1,95))

180.8300 190.7822 190.6257 175.4790 183.1746 196.6801 181.4798 176.1298 198.9011

length(test1( test1 >= prctile(test1,95)))

9

whereas with `rout` I get

test( test >= prctile(test,95))

4 5 5 4 5 4 4 4 4 4 4 4 5 4 4 6 4 4 5 4 4

length(test( test >= prctile(test,95)))

22

Following up on the comments and answer below (thanks a lot to @zeeMonkeez), and now that I understand how `rose` works, here is one solution for those who might run into the same problem:

figure
catsize = 30;
counts_by_angle = round(rand(360,1).*100);

ncounts = sum(reshape(counts_by_angle(1:360),catsize,[]));
ncounts = ncounts ./max(ncounts);
bins = ((15:catsize:360)./360).*2.*pi;
cases = ones(1,round(ncounts(1).*100)).*round(bins(1),2);
for icat = 2:length(ncounts)
    cases = [cases  ones(1,round(ncounts(icat).*100)).*round(bins(icat),2)];
end
[tout, rout] = rose(cases,bins);
polar(tout, rout);
[xout, yout] = pol2cart(tout, rout);
set(gca, 'nextplot', 'add');
test = rout(2:4:end);
cindex = find( test > prctile(test,95));
for index = cindex
    fill(xout((index-1)*4+1:index*4), yout((index-1)*4+1:index*4), 'r');
end
set(gca,'View',[-90 90],'YDir','reverse');
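
The expansion loop above could probably be written more compactly with `repelem` (available from R2015a). This is only a sketch under assumptions: unlike the code above it uses the raw summed counts rather than counts normalized to 100, it does not round the bin centers, and `counts` is a name introduced here for illustration.

```matlab
catsize = 30;                                          % degrees per category
counts_by_angle = round(rand(360,1) .* 100);           % placeholder data, as above

% Aggregate the per-degree counts into 12 categories of 30 degrees each
ncounts = sum(reshape(counts_by_angle, catsize, []));  % 1x12 row vector

% Bin centers in radians (15, 45, ..., 345 degrees)
bins = ((catsize/2 : catsize : 360) ./ 360) .* 2 .* pi;

% One raw angle per observation: repeat each bin center ncounts(i) times
cases = repelem(bins, ncounts);

% rose now sees raw angles, so its bin counts reproduce the aggregated counts
[tout, rout] = rose(cases, bins);
counts = rout(2:4:end);
```

The plotting and `fill` steps would then follow unchanged.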
Trenton McKinney
horseshoe
  • Also note that the inputs to `rose` are expressed in radians, so your `mydata` variable will essentially be taken mod 2π. Not sure you are aware of this looking at your code. – zeeMonkeez Dec 03 '15 at 20:07
  • I think in your example you assume `rose` takes histogram counts as inputs. It does not. – zeeMonkeez Dec 03 '15 at 22:31
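
To illustrate the two comments above, here is a minimal sketch of what wrapping mod 2π does to values that were meant as magnitudes. The three input values are made up for illustration, and `wrapped` and `wrapped_deg` are names introduced here.

```matlab
mydata = [10 50 100];             % hypothetical values, intended as magnitudes

% rose interprets its input as angles in radians, so each value ends up
% at its position modulo 2*pi on the circle
wrapped = mod(mydata, 2*pi);      % angles in [0, 2*pi)
wrapped_deg = wrapped * 180/pi;   % the same positions in degrees

% Every input contributes exactly one count to some bin
[~, rout] = rose(mydata, 4);
total = sum(rout(2:4:end));
```

So a value of 100 does not produce a long triangle; it adds a single count to whichever bin contains `mod(100, 2*pi)`.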

1 Answer


Take a look at rout (returned by rose), which conveniently contains the bin counts in a pattern 0 c(i) c(i) 0, for bins i. If you set

test = rout(2:4:end);

you get the bin counts for all ncat bins. The rest of your code correctly draws the outlier bins.
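
As a quick check of that pattern, here is a small sketch with made-up angles (the values of `theta` and the names `lead`, `counts`, `repeat`, and `trail` are only for illustration):

```matlab
% Hypothetical angles in radians (illustrative values, not from the question)
theta = [0.1 0.2 0.3 2.0 2.1 4.0];
ncat = 4;

% With output arguments, rose returns the outline coordinates instead of plotting
[tout, rout] = rose(theta, ncat);

lead   = rout(1:4:end);   % first element of each group of four: zero
counts = rout(2:4:end);   % bin counts c(i)
repeat = rout(3:4:end);   % the same counts, repeated
trail  = rout(4:4:end);   % last element of each group of four: zero
```

Each bin occupies four consecutive entries of `tout` and `rout`, which is why the `fill` call in the question indexes `(cindex-1)*4+1:cindex*4`.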

zeeMonkeez
  • @zeeMonkeez: Thanks for that. It works! However, as `rout` only contains integers, identifying the highest 5% of `rout` will likely yield a different number of values than the aggregated `mydata` in the old `test` – horseshoe Dec 03 '15 at 21:29
  • I'm not quite sure I understand your problem. Do you want to find the outliers in the distribution of bin counts? That's what my answer does. If you want to find the outliers in the data, you have to be careful, because it will be mapped around the circle mod 2π. It's not even clear what outlier means on the circle. – zeeMonkeez Dec 03 '15 at 21:35
  • @zeeMonkeez: I added the output to the question so that it might be clearer what I mean – horseshoe Dec 03 '15 at 21:56
  • @zeeMonkeez: Concerning the radians: if I have 360 categories it shouldn't matter, right? The help says: "The vector theta, expressed in radians, determines the angle of each bin from the origin." So if I have 360 bins they should match 2π – horseshoe Dec 03 '15 at 22:05
  • @horseshoe Your data is interpreted as radians, so as I wrote above it will be taken mod 2π, and that determines where on the circle it falls. Essentially what `rose` does is `histogram(mod(mydata,2*pi), ncat)`, wrapped around a circle. So `rose` takes raw data as input, not histogram counts. Does that make sense? – zeeMonkeez Dec 03 '15 at 22:28
  • @zeeMonkeez: OK, I am a bit slow and it was late, but now I finally got it, thanks. I indeed assumed that each category holds the number of counts (so the output of `hist`) and not the angle itself. – horseshoe Dec 04 '15 at 08:20
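
On the earlier point in this thread that integer bin counts give a different number of outliers: with ties, a threshold test such as `test >= prctile(test,95)` selects every bin tied at the cutoff. If a fixed number of bins is wanted instead, one possible approach is to sort and take the top `k`. This is only a sketch with made-up counts; it is not what the answer above does, and all variable names are hypothetical.

```matlab
counts = [1 2 2 3 3 3 4 4 4 4];   % hypothetical integer bin counts with ties
k = ceil(0.05 * numel(counts));   % number of bins in the top 5% (at least one)

[sorted_counts, order] = sort(counts, 'descend');

% Threshold test: picks up every bin tied with the k-th largest count
threshold_idx = find(counts >= sorted_counts(k));

% Rank-based selection: exactly k bins, ties broken by position
top_idx = sort(order(1:k));
```

Here the threshold test returns four bins while the rank-based selection returns one, which is the same kind of mismatch as in the outputs shown in the question.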