Questions tagged [anova]

ANOVA is an acronym for "analysis of variance". It is a widely used statistical technique to analyze the source of variance within a data set.

Overview

Although ANOVA stands for ANalysis Of VAriance, its purpose is to compare the means of data from different groups; it does so by partitioning the variance in the data. It is a special case of the general linear model, which also includes linear regression and ANCOVA. In matrix algebra form, all three can be written as:

Y=XB+e

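where Y is a vector of values for the dependent variable (these must be numeric), X is a matrix of values for the independent variables, B is a vector of coefficients to be estimated, and e is a vector of errors. In ANOVA the independent variables are categorical, so the columns of X are indicator (dummy) variables that code group membership.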
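
As a rough illustration of the "comparing means" idea, the following Python sketch computes the one-way ANOVA F statistic by hand. The function name one_way_anova_f and the toy data are made up for this example. It assumes the model Y = XB + e with one indicator column per group and no intercept, in which case the least-squares estimates in B are the group means and the residual sum of squares equals the within-group sum of squares computed below.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups.

    Hypothetical helper for illustration only, not a library API.
    """
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of each observation around its own group mean
    # (the residuals e when B holds the group means).
    ss_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy data (invented): three groups of three observations each.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 6), df_b, df_w)
```

In practice a library routine such as scipy.stats.f_oneway (mentioned in several questions under this tag) would normally be used instead of hand-written code; the sketch is only meant to show what is being computed.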

Tag usage

Stack Overflow questions on ANOVA should be about implementation and programming problems, not about the statistical or theoretical properties of the technique.
Consider whether your ANOVA question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.

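In R, the software environment for statistical computing and graphics, the function aov fits ANOVA models. Note that the function anova does something different: it computes analysis of variance tables for one or more fitted model objects. See "When should I use aov() and when anova()?"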

1456 questions

0 answers

Statsmodels Anova for logistic regression

I found the statsmodels implementation of the anova testing for linear models to be very useful (http://www.statsmodels.org/dev/generated/statsmodels.stats.anova.anova_lm.html#statsmodels.stats.anova.anova_lm) but I was wondering, since it's not…

python logistic-regression statsmodels anova

asked Apr 13 '18 at 14:38

Asher11

1 answer

2-way anova on unbalanced dataset

Is aov appropriate for unbalanced datasets? According to the help, it "...provides a wrapper to lm for fitting linear models to balanced or unbalanced experimental designs." But later on it says aov is designed for balanced designs, and the results can be…

r anova

asked Apr 12 '10 at 19:15

Brani

1 answer

Error in TukeyHSD in R

I'm working on a mixed-design ANOVA and would like to run TukeyHSD as its post hoc test. I keep getting the error "Error in UseMethod("TukeyHSD") : no applicable method for 'TukeyHSD' applied to an object of class "c('aovlist', 'listof')"". I've…

r anova

asked Jul 08 '13 at 19:54

Rachel

1 answer

Categorical variables usage in pandas for ANOVA and regression?

To prepare a little toy example: import pandas as pd import numpy as np high, size = 100, 20 df = pd.DataFrame({'perception': np.random.randint(0, high, size), 'age': np.random.randint(0, high, size), …

pandas numpy scipy anova hypothesis-test

asked May 23 '19 at 01:52

A T

1 answer

Pass a named list of models to anova.merMod

I want to be able to pass a named list of models (merMod objects) to anova() and preserve the model names in the output. This is particularly useful in the context of using mclapply() to run a batch of slow models like glmers more efficiently in…

r lme4 anova do.call

asked Oct 18 '18 at 07:05

Dan Villarreal

3 answers

How to run ANOVA on a wide format data.frame?

I've been taught to run an ANOVA with the formula: aov(dependent variable~independent variable, dataset) but I am struggling with how to run an ANOVA for a particular dataset because it is broken up into three columns that each contain a value. The…

r dataframe statistics reshape anova

asked Apr 29 '18 at 23:03

Victoria Fletcher

2 answers

ANOVA with block design and repeated measures

I'm attempting to run some statistical analyses on a field trial that was constructed over 2 sites over the same growing season. At both sites (Site, levels: HF|NW) the experimental design was a RCBD with 4 (n=4) blocks (Block, levels: 1|2|3|4…

r statistics anova

asked Jan 23 '17 at 14:48

Rory Shaw

1 answer

TukeyHSD adjusted P value is 0.0000000

I just performed a factorial ANOVA, followed by the TukeyHSD post-test. Some of my adjusted P values from the TukeyHSD output are 0.0000000. Can these P values really be zero? Or is this a rounding situation, and my true P value might be…

r anova

asked May 09 '13 at 20:20

Todd

1 answer

How to perform single factor ANOVA in R with samples organized by column?

I have a data set where the samples are grouped by column. The following sample dataset is similar to my data's format: a = c(1,3,4,6,8) b = c(3,6,8,3,6) c = c(2,1,4,3,6) d = c(2,2,3,3,4) mydata = data.frame(cbind(a,b,c,d)) When I perform a…

r anova

asked Jan 07 '13 at 23:53

Borealis

2 answers

Apply function to each row in Pandas dataframe by group

I built a Pandas dataframe (example below) indexed by gene name that has sample names for columns and integers as cell values. What I want to do is run an ANOVA (f_oneway(), from scipy.stats) for lists of row values as defined by lists of the…

python pandas dataframe scipy anova

asked Sep 25 '20 at 13:34

André Soares

1 answer

Nested ANOVA unique factor levels

I'm running a nested ANOVA with the following setup: 2 areas, one is reference, one is exposure (column named CI = Control/Impact). Two time periods (before and after impact, column named BA), with 1 year in the before period and 3 years in the…

r anova

asked Nov 24 '17 at 13:12

user2602640

3 answers

How to compare 2 models in R using the plm package?

So I am running a fixed effects model using the plm package in R, and I am wondering how I can compare which of two models is more suitable. For example, here is the code for two models I have constructed: library(plm) eurofix <- plm(rlogmod ~…

r anova plm

asked Feb 05 '15 at 00:43

NuclearPenguins

1 answer

ANOVA: Degrees of freedom almost all equal 1

I have a data set that begins like this:

> d.weight
  R  N P  C D.weight
1 1  0 0 GO     45.3
2 2  0 0 GO     34.0
3 3  0 0 GO     19.1
4 4  0 0 GO     26.6
5 5  0 0 GO     23.5
6 1 45 0 GO     22.1
7 2 45 0…

r anova

asked Oct 13 '14 at 15:59

XGF

2 answers

How to extract a p-value when performing anova() between two glm models in R

So, I'm trying to compare two models, fit1 and fit2. Initially, I was just doing anova(fit1,fit2), and this yielded output that I understood (including a p-value). However, when I switched my models from lm()-based models to glm()-based models,…

r model regression glm anova

asked Nov 05 '12 at 18:06

Atticus29

1 answer

Running scipy's oneway anova in a script

I have a problem. I want to run the scipy.stats f_oneway() ANOVA in a script that loads a data-archive containing groups with numpy arrays like so: archive{'group1': array([ 1, 2, 3, ..., ]), 'group2': array([ 9, 8, 7, ..., ]), …

python scipy anova

asked Oct 02 '12 at 02:13

surchs
