Python: ANOVA with dictionaries of different lengths

Question

I have the following data:

data = {'treatment_1': [80, 0, 0, 8],
        'treatment_2': [78, 62],
        'treatment_3': [85, 62, 10, 3, 18, 18, 98, 71, 78, 12, 52, 39, 24, 13],
        'treatment_4': [78, 33, 78, 40, 47, 32]
       }

I am trying to run a one-way ANOVA comparing these four treatments. As you can see, each treatment has a different number of data points. In theory this shouldn't be a problem, because ANOVA does not assume equal sample sizes. First, I tried to create a DataFrame with this code:

import pandas as pd
df = pd.DataFrame(data)

Gives me the error message:

ValueError: All arrays must be of the same length

So this tells me that a DataFrame built this way will not work. But no matter how I search for "Anova with unequal sample sizes," all I find are examples that use lists (whose code does not work with dictionaries) and/or equal sample sizes (which do not explain how to adjust for unequal ones). How should I approach an ANOVA with dictionaries of different lengths? Or am I going about this wrong by using dictionaries in the first place?

Does this answer your question? [dictionary of nested variable length lists to pandas DF](https://stackoverflow.com/questions/34720539/dictionary-of-nested-variable-length-lists-to-pandas-df) — Алексей Р, Sep 03 '22 at 02:50
No. Those give the error message "TypeError: cannot convert dictionary update sequence element #0 to a sequence" — jason-hernandez-73, Sep 03 '22 at 03:00

score 0 · Accepted Answer · answered Sep 03 '22 at 03:01

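import pandas as pd

data = {'treatment_1': [80, 0, 0, 8],
        'treatment_2': [78, 62],
        'treatment_3': [85, 62, 10, 3, 18, 18, 98, 71, 78, 12, 52, 39, 24, 13],
        'treatment_4': [78, 33, 78, 40, 47, 32]
        }

# Wrapping each list in a Series lets pandas align the columns on the index
# and pad the shorter ones with NaN instead of raising a ValueError.
df = pd.DataFrame({k: pd.Series(v) for k, v in data.items()})
print(df)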

Prints:

    treatment_1  treatment_2  treatment_3  treatment_4
0          80.0         78.0           85         78.0
1           0.0         62.0           62         33.0
2           0.0          NaN           10         78.0
3           8.0          NaN            3         40.0
4           NaN          NaN           18         47.0
5           NaN          NaN           18         32.0
6           NaN          NaN           98          NaN
7           NaN          NaN           71          NaN
8           NaN          NaN           78          NaN
9           NaN          NaN           12          NaN
10          NaN          NaN           52          NaN
11          NaN          NaN           39          NaN
12          NaN          NaN           24          NaN
13          NaN          NaN           13          NaN
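
To get from here to the ANOVA itself, one option is `scipy.stats.f_oneway`, which takes each group as a separate argument and does not need the groups to be the same length. The sketch below is not part of the snippet above and assumes SciPy is installed; the names `groups`, `result`, and `direct` are only illustrative. It drops the NaN padding from each column before testing, and also shows that the original dictionary can be passed in directly without building a DataFrame at all.

```python
import pandas as pd
from scipy.stats import f_oneway

data = {'treatment_1': [80, 0, 0, 8],
        'treatment_2': [78, 62],
        'treatment_3': [85, 62, 10, 3, 18, 18, 98, 71, 78, 12, 52, 39, 24, 13],
        'treatment_4': [78, 33, 78, 40, 47, 32]
        }

# Same NaN-padded frame as above.
df = pd.DataFrame({k: pd.Series(v) for k, v in data.items()})

# Strip the NaN padding so each group holds only its real observations.
groups = [df[col].dropna() for col in df.columns]

# One-way ANOVA across the four treatments (unequal group sizes are fine).
result = f_oneway(*groups)
print(f"F = {result.statistic:.3f}, p = {result.pvalue:.3f}")

# Equivalent shortcut: unpack the dictionary's lists directly.
direct = f_oneway(*data.values())
print(f"F = {direct.statistic:.3f}, p = {direct.pvalue:.3f}")
```

Both calls should report the same F statistic and p-value, so the DataFrame is only needed if you want it for other reasons, such as plotting or summary tables.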
