What is the best way to run a fit test / test for normality for each unique ilitm in the dataset below? Thanks
1 Answer
As you know (visible in the edit history), Oracle provides the Shapiro-Wilk test of normality (I link to the [R] documentation, as you will find much more reference material for that implementation).
The important thing to know is that the OUT parameter sig corresponds to what statisticians call the p-value.
Example
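-- Run SET SERVEROUTPUT ON first (SQL*Plus / SQL Developer) to see the output
DECLARE
   sig     NUMBER;        -- OUT: significance of the fit (the p-value)
   mean    NUMBER := 0;   -- mean to compare against
   stdev   NUMBER := 1;   -- standard deviation to compare against
BEGIN
   -- Arguments: owner, table, column, test type, mean, stdev, sig
   DBMS_STAT_FUNCS.normal_dist_fit (USER,
                                    'DIST',
                                    'DIST1',
                                    'SHAPIRO_WILKS',
                                    mean,
                                    stdev,
                                    sig);
   DBMS_OUTPUT.put_line (sig);
END;
/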
You get the following output (the second line is sig; the decimal comma comes from the session's NLS settings):
W value : ,9997023261540432791888281834378157820514
,7136528702727722659486194469256296703232
For comparison, the test in R with the same data:
> shapiro.test(df$DIST1)
Shapiro-Wilk normality test
data: df$DIST1
W = 0.9997, p-value = 0.7137
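Since the question asks about each unique ilitm, here is a minimal R sketch of running the same test per group. The data frame below is made up for illustration (the column names ilitm and DIST1 are assumed to match the real table), and note that R's shapiro.test accepts only between 3 and 5000 non-missing values per call.

```r
# Made-up data: two hypothetical items, one roughly normal, one skewed
set.seed(42)
df <- data.frame(
  ilitm = rep(c("A", "B"), each = 100),
  DIST1 = c(rnorm(100), rexp(100))
)

# One Shapiro-Wilk p-value per unique ilitm
p_values <- tapply(df$DIST1, df$ilitm, function(x) shapiro.test(x)$p.value)

# Items for which normality is rejected at the 5% level
rejected <- names(p_values)[p_values < 0.05]

print(p_values)
print(rejected)
```

Each entry of p_values plays the same role as sig in the Oracle call, just computed separately for each item.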
The rest is statistics:)
My interpretation: this test is useful if you need to weed out the coarsest deviations from the normal distribution.
If sig < .05, you may reject the data as not normally distributed, but a high value of sig doesn't prove the opposite. You only know that you can't reject normality.
Anyway, a plot of the distribution can provide better insight than a simple true/false test. R is a good resource here as well.
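As a rough illustration of the plotting idea, a normal Q-Q plot and a histogram are two common choices in base R. This is only a sketch on a made-up sample; x stands in for the values of one column (or one ilitm).

```r
# Made-up sample standing in for the real column values
set.seed(1)
x <- rnorm(200)

# Points close to the reference line suggest an approximately normal shape
qqnorm(x, main = "Normal Q-Q plot")
qqline(x)

# A histogram shows skew or multiple peaks that a single p-value hides
hist(x, breaks = 30, main = "Histogram")
```

Systematic curvature away from the line in the Q-Q plot, or a clearly lopsided histogram, tells you how the data deviates from normality, which the test alone does not.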
Some other useful discussions on this topic.
