How to fit a probability distribution to multivariate data in R or Python?

Asked May 30 '17 at 07:32

Active May 30 '17 at 12:19

I have a dataset of 250,000 points with 15 features. Each feature takes values from 0 to 100.

I want to fit a probability distribution to this dataset in order to identify outliers, such as data-entry errors.

For univariate data there is `fitdist` in R, but what about the multivariate case?

How can I do this effectively in R or Python?

edited May 30 '17 at 12:19

asked May 30 '17 at 07:32

curio17

Could you be more specific? With `R`, you could compute the mean (`mean()`) and the standard deviation (`sd()`) of each feature, check whether each entry lies within a chosen range (such as mean ± 1.5 × standard deviation), and then examine the flagged outliers more closely. – LAP May 30 '17 at 07:51
Most readers will not be familiar with lakh. Consider rephrasing if you want to communicate well. – Glen_b May 30 '17 at 09:22
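
The per-feature check suggested in the first comment could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name `flag_outliers`, the toy data, and the use of the standard-library `statistics` module are all hypothetical choices, and the factor 1.5 is simply the example value from the comment. Note that this is a univariate screen applied to each feature separately, not a fitted multivariate distribution.

```python
from statistics import mean, stdev


def flag_outliers(rows, k=1.5):
    """Return (row_index, feature_index, value) for every entry that lies
    outside mean ± k * sd of its own feature (hypothetical helper)."""
    columns = list(zip(*rows))  # transpose: one tuple per feature
    bounds = []
    for col in columns:
        m = mean(col)
        s = stdev(col)  # sample standard deviation, as R's sd() computes
        bounds.append((m - k * s, m + k * s))

    flagged = []
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            low, high = bounds[j]
            if value < low or value > high:
                flagged.append((i, j, value))
    return flagged


# Made-up toy data: 3 features on a 0-100 scale.
# The 95 in the last row imitates a data-entry error.
data = [
    [10, 50, 30],
    [12, 51, 31],
    [11, 49, 29],
    [13, 51, 31],
    [12, 49, 29],
    [95, 50, 30],
]

suspects = flag_outliers(data)
print(suspects)  # [(5, 0, 95)]
```

Because each feature is screened on its own, a check like this cannot catch rows whose individual values look normal but whose combination is unusual, which is what the question about a multivariate fit is really after.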

0 Answers