I am trying to run a Kruskal-Wallis test on a dataframe. The rows hold 4 groups of data (4 disease diagnoses), with 6 patients in each group. The columns hold around 7000 genes.

I want to perform Kruskal-Wallis/ANOVA among the 4 groups for each gene.

I am able to do it for a single gene with this code: stats.kruskal(*[group["gene_1"].values for name, group in df.groupby("disease_diagnosis")])

This gives the p-value for gene 1.

I am trying to do the same for all 7000 genes (in columns). I tried this code:

for key, value in df.iteritems():
    kw_table = stats.kruskal(*[value.values for name, group in df.groupby("disease_diagnosis")])
print(kw_table)

This did not work, and neither did the other variants I tried. Any guidance would be helpful. Thanks.
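A likely cause, judging only from the snippet above: the list comprehension passes the whole column (`value.values`) once per group instead of each group's slice, the loop also visits the `disease_diagnosis` column itself, and `print` sits outside the loop. Note also that `DataFrame.iteritems` was removed in pandas 2.0 in favour of `items`. The following is a minimal sketch of a per-gene loop, not a definitive solution. The toy data, the diagnosis labels "A" to "D", and the names `genes`, `grouped` and `results` are assumptions made for illustration; only `gene_1` and `disease_diagnosis` come from the question.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical toy data: 4 diagnoses x 6 patients, 5 genes instead of ~7000.
rng = np.random.default_rng(0)
genes = [f"gene_{i}" for i in range(1, 6)]
df = pd.DataFrame(rng.normal(size=(24, 5)), columns=genes)
df["disease_diagnosis"] = np.repeat(["A", "B", "C", "D"], 6)

# Group once and reuse the grouping for every gene.
grouped = df.groupby("disease_diagnosis")

results = {}
for gene in genes:  # iterate over gene columns only, not the label column
    # One array per diagnosis group, taken from this gene's column.
    samples = [group[gene].to_numpy() for _, group in grouped]
    statistic, p_value = stats.kruskal(*samples)
    results[gene] = {"statistic": statistic, "p_value": p_value}

# One row per gene, with the H statistic and the p-value as columns.
kw_table = pd.DataFrame.from_dict(results, orient="index")
print(kw_table)
```

With real data, `genes` could be built as every column except `disease_diagnosis`, for example `df.columns.drop("disease_diagnosis")`.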

smad
  • What is going wrong? Can you present a [minimal, _reproducible_ example](https://stackoverflow.com/help/minimal-reproducible-example)? – Nelewout Jun 06 '21 at 15:59
  • 7,000 columns? Consider reshaping the wide-format data to long format. Doing so lets you avoid extensive code. – Parfait Jun 06 '21 at 16:49
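The long-format suggestion in the comment above could look roughly like the sketch below, which reshapes with `DataFrame.melt` and then runs one test per gene. This is one possible reading of that comment, not the commenter's own code. The toy data and the names `wide`, `long_df`, `expression` and `kruskal_p` are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical wide-format toy data: one row per patient, one column per gene.
rng = np.random.default_rng(1)
genes = [f"gene_{i}" for i in range(1, 6)]
wide = pd.DataFrame(rng.normal(size=(24, 5)), columns=genes)
wide["disease_diagnosis"] = np.repeat(["A", "B", "C", "D"], 6)

# Long format: one row per (patient, gene) measurement.
long_df = wide.melt(
    id_vars="disease_diagnosis", var_name="gene", value_name="expression"
)

def kruskal_p(one_gene: pd.DataFrame) -> float:
    """Kruskal-Wallis p-value across diagnosis groups for a single gene's rows."""
    samples = [
        g["expression"].to_numpy()
        for _, g in one_gene.groupby("disease_diagnosis")
    ]
    return stats.kruskal(*samples).pvalue

# One p-value per gene, indexed by gene name.
p_values = pd.Series(
    {gene: kruskal_p(rows) for gene, rows in long_df.groupby("gene")},
    name="p_value",
)
print(p_values)
```

The reshaped table has one row per measurement (24 × 5 = 120 rows here), so the same pattern applies unchanged however many gene columns the wide table has.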

0 Answers