I have one-dimensional data, t, the time each user spent completing a task. I applied the kernel density estimator from http://www.mathworks.com/matlabcentral/fileexchange/14034-kernel-density-estimator in order to remove outliers, i.e. users who spent an unreasonable amount of time. I used the following lines:

[bandwidth,density,xmesh]=kde(dur1);
plot(xmesh,density);

After applying KDE, I am having trouble identifying the local minima at which to split the data. The following link shows what the curve looks like: http://s23.postimg.org/6aa1748jf/kde.jpg I expected to see three clusters, with the middle one containing the reasonable completion times. However, the curve I got has only one peak.

Are the steps I am following correct?
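To make the splitting step concrete, here is a minimal sketch of how interior local minima could be located once the density curve does show several peaks. It assumes column vectors xmesh and density like those returned by kde above; the trimodal curve built here is a synthetic stand-in (not the real data) so that the snippet runs on its own, and the names gauss, minIdx and cuts are illustrative only.

```matlab
% Synthetic stand-in for the real (xmesh, density) pair: a curve with three peaks.
xmesh   = linspace(0, 30, 601)';
gauss   = @(x, m, s) exp(-0.5*((x - m)/s).^2) / (s*sqrt(2*pi));
density = 0.2*gauss(xmesh, 5, 1) + 0.6*gauss(xmesh, 15, 1.5) + 0.2*gauss(xmesh, 25, 1);

% An interior grid point that is strictly lower than both neighbours
% is treated as a local minimum.
mid    = density(2:end-1);
isMin  = mid < density(1:end-2) & mid < density(3:end);
minIdx = find(isMin) + 1;      % +1 because mid starts at the second grid point

% Candidate split points on the time axis.
cuts = xmesh(minIdx);
```

With two cut points, the middle cluster would be the observations strictly between cuts(1) and cuts(2). On a curve with a single peak, cuts comes back empty, which is the situation described above.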

Saeed
  • Try a smaller bandwidth! – Has QUIT--Anony-Mousse Apr 13 '16 at 20:35
  • Thanks, Anony-Mousse. I took your advice and it is working. Could you please tell me how to choose an appropriate bandwidth for my data? Also, if I locate the minima at both ends, do you think it is correct to split the data there and remove the outliers? – Saeed Apr 14 '16 at 13:28
  • There are only rules of thumb, such as the plug-in estimator. There is no such thing as a "correct way" IMHO if you are working with real data. – Has QUIT--Anony-Mousse Apr 14 '16 at 14:40
  • Thanks again, Anony-Mousse. Your answers are very valuable. – Saeed Apr 17 '16 at 18:21
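As an illustration of the rule-of-thumb idea raised in the comments, the sketch below uses Silverman's normal-reference rule, h = 0.9 * min(sigma, IQR/1.34) * n^(-1/5). This is a simpler rule than the plug-in estimator mentioned above and is swapped in here only because it fits in a few lines. The data vector t is a synthetic stand-in for dur1, the Gaussian KDE is hand-rolled so that the bandwidth can be set explicitly, and the 0.5 shrink factor is an arbitrary choice for demonstration, not a recommendation.

```matlab
% Synthetic stand-in for dur1 (hypothetical): three groups of completion times.
% z(n) returns n deterministic normal scores, so no random seed is needed.
z = @(n) sqrt(2) * erfinv(2*((1:n)' - 0.5)/n - 1);
t = [3 + 0.5*z(40); 15 + 2*z(200); 40 + 3*z(30)];
n = numel(t);

% Silverman's rule of thumb: h0 = 0.9 * min(std, IQR/1.34) * n^(-1/5).
ts = sort(t);
q  = @(p) interp1((1:n)', ts, 1 + p*(n - 1));   % simple linear-interpolation quantile
h0 = 0.9 * min(std(t), (q(0.75) - q(0.25))/1.34) * n^(-1/5);

% Gaussian KDE on a grid, with the bandwidth h passed in explicitly.
kdeAt = @(x, h) mean(exp(-0.5*(bsxfun(@minus, x(:), t')/h).^2), 2) / (h*sqrt(2*pi));
x  = linspace(min(t) - 3*h0, max(t) + 3*h0, 512)';
f0 = kdeAt(x, h0);          % rule-of-thumb bandwidth
f1 = kdeAt(x, 0.5*h0);      % smaller bandwidth, less smoothing
```

Plotting f0 and f1 against x shows how much structure each bandwidth preserves. Since there is no single correct value, comparing a few bandwidths by eye is a reasonable way to decide where splitting the data makes sense.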
