
The MATLAB documentation seems unclear about how to ignore missing data when using kruskalwallis (the Kruskal-Wallis test) or any related test. The same goes for unequal group sizes.

horchler

2 Answers


For numeric data, the standard missing data value in MATLAB is NaN. See ismissing. See also this article from The MathWorks. For tables, you might find standardizeMissing helpful, as well as replaceWithMissing for dataset objects. I can't say anything about group size.
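A minimal sketch of the NaN convention applied to kruskalwallis, assuming the function drops NaN entries before ranking, as most Statistics Toolbox functions do. The sample values and variable names are made up for illustration.

```matlab
% Hypothetical measurements; two observations are missing and marked NaN.
X = [2.1  3.9  1.2;
     3.4  4.4  2.0;
     NaN  5.1  1.7;
     4.0  4.8  NaN;
     2.8  5.6  2.5];

% Locate the missing entries (isnan works on any numeric array).
missingMask = isnan(X);
fprintf('%d missing value(s)\n', nnz(missingMask));

% Each column is one group. The empty second argument keeps the default
% column labels, and 'off' suppresses the ANOVA table and box plot.
p = kruskalwallis(X, [], 'off');
```

If the assumption holds, the result should match a call that removes the NaN entries by hand and passes the remaining values with an explicit grouping vector.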

horchler
  • Thank you for your answer, however I am aware of these functions. Instead I'm interested in statistical tests on data with missing values. – user3503398 May 17 '14 at 23:33
  • @user3503398: You asked "how to ignore missing data." You use `NaN`, as my answer indicates (almost all functions in the Statistics Toolbox use this convention). The rest of my answer is simply references and additional information. – horchler May 17 '14 at 23:38

Very late answer, but I ran into the same problem myself today, so I might as well help some future searcher.

The solution is pretty straightforward. kruskalwallis is primarily used on matrices and by default compares equal-sized columns, but it also lets you assign groups manually through the optional argument "group". Because group membership is then set per observation, the groups no longer need to be the same size. I was trying to check whether a single value was unlikely to belong to the distribution of a separate set, so this fit well. I appended the value I wanted to test to the end of the set I was testing against, then made "group" a vector of ones the same length as the set, with a "2" added to the end for the new value. It looks like it worked quite nicely.
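The steps above can be sketched roughly as follows. The data values and the names reference and newValue are hypothetical, chosen only to show the shape of the inputs.

```matlab
% Hypothetical reference set and the single value to test against it.
reference = [2.1; 3.4; 1.9; 4.0; 2.8; 3.1];
newValue  = 7.5;

% Stack all observations into one column vector.
x = [reference; newValue];

% Grouping vector: 1 for every reference observation, 2 for the new value.
group = [ones(numel(reference), 1); 2];

% 'off' suppresses the ANOVA table and box plot.
p = kruskalwallis(x, group, 'off');
```

The same pattern extends to any number of groups of unequal size: concatenate the samples into x and give each observation its group label in the matching position of group.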

Chris M.