I'm performing ANOVA on a pandas dataframe using statsmodels anova_lm.

The returned significance level PR(>F) is 0.0. I assume this is a rounded value, but rounded to how many decimal places?

Is there a way to specify the number of decimal places in statsmodels?

My code:

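import numpy as np  # needed because the formula calls np.power
import statsmodels.api as sm
from statsmodels.formula.api import ols

# `exp` (the exponent, 0.1 in the output below) and `bigdf` are defined elsewhere
formula = f'consensus_rate ~ C(strategy) + np.power(nr_clues, {exp}) + shared_ratio + primacy_weight + edges_per_node'
lm = ols(formula, data=bigdf).fit()
sm.stats.anova_lm(lm, typ=2)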

returns

                               sum_sq      df             F  PR(>F)
C(strategy)              1.909980e+06     3.0  15196.209763     0.0
np.power(nr_clues, 0.1)  5.159021e+05     1.0  12313.884367     0.0
shared_ratio             7.383109e+05     1.0  17622.480378     0.0
primacy_weight           2.099998e+05     1.0   5012.410347     0.0
edges_per_node           8.457493e+04     1.0   2018.689015     0.0
Residual                 3.013158e+05  7192.0           NaN     NaN
leermeester
    `anova_lm` returns a pandas DataFrame. So you have all the pandas options to display and work with the results. – Josef Nov 05 '19 at 13:11
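As a rough sketch of what the comment above suggests, the snippet below builds a small stand-in DataFrame (the name `table` and its numbers are illustrative assumptions, not output from the question's data) and shows three ordinary pandas ways to see more digits: formatting a column in scientific notation, temporarily raising `display.precision`, and reading a stored value directly.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by anova_lm;
# the numbers are illustrative, not results from the question's data.
table = pd.DataFrame(
    {"F": [16.961538, float("nan")], "PR(>F)": [4.1e-05, float("nan")]},
    index=["C(Fitness)", "Residual"],
)

# Format the p-value column in scientific notation. This returns strings;
# the stored floats are unchanged.
p_sci = table["PR(>F)"].map("{:.3e}".format)
print(p_sci)

# Temporarily raise the number of decimals pandas prints.
with pd.option_context("display.precision", 10):
    print(table)

# Read a single stored value directly, bypassing display formatting.
p_value = table.loc["C(Fitness)", "PR(>F)"]
print(repr(p_value))
```

With a real result, the same calls should work on whatever `sm.stats.anova_lm(...)` returns, since it is a regular DataFrame.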

1 Answer


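PR(>F) is probably too small to show at the displayed precision, so it prints as 0.0.

Looking at other statsmodels ANOVA tables, floats are displayed with 6 decimal places. Because `anova_lm` returns a pandas DataFrame, that is the pandas default display precision rather than rounding done by statsmodels.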

For example:

              df  sum_sq     mean_sq          F    PR(>F)
C(Fitness)   2.0   672.0  336.000000  16.961538  0.000041
Residual    21.0   416.0   19.809524        NaN       NaN

sourced from: https://www.statsmodels.org/stable/examples/notebooks/generated/interactions_anova.html
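To make the reasoning concrete, here is a minimal plain-Python sketch of two different ways a very small p-value can end up printing as zero. The value `3.2e-12` is a made-up example. Which case applies to the table in the question is an assumption that would need checking against the stored values.

```python
import sys

# Case 1: the value is stored correctly, but six fixed decimals hide it.
p_small = 4.1e-05
print(f"{p_small:.6f}")    # 0.000041

p_smaller = 3.2e-12        # hypothetical value, for illustration only
print(f"{p_smaller:.6f}")  # 0.000000
print(f"{p_smaller:.3e}")  # 3.200e-12

# Case 2: the value is too small for a 64-bit float and underflows to 0.0.
print(sys.float_info.min)  # smallest positive normalized float
underflowed = 1e-200 * 1e-200  # mathematically 1e-400, not representable
print(underflowed == 0.0)  # True
```

In the first case more display digits recover the value; in the second no formatting can, because the stored number really is 0.0.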

leermeester