Calculating means and sd for different groups

Question

I am trying to calculate means and standard deviations based on groups in a data.frame.

Sample	Width	Weight	Length
A1.1	3.5	6.7	5.8
8.3	4.2	6.3	5.5
A1.1	2.9	5.7	5.1
8.3	3.7	6.1	5.4

I have been trying to calculate the mean and standard deviation of each column, grouped by sample, with the code below. The real data frame has many more columns, but all of them should be summarized by the sample column.

    agdf <- aggregate(d.f, by=list(d.f$sample), function(x) c(mean = mean(x, na.rm=TRUE), sd = sd(x, na.rm=TRUE)))

When I run this command I get this error message:

    Error in var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm = na.rm) :
    Calling var(x) on a factor x is defunct.
    Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.

I have checked the class of each column: the "sample" column is a factor and the others are numeric. I am very new to R and I don't really understand what is wrong or how I could solve it. I would really appreciate some ideas or help. Thank you.

please provide `dput(d.f)` of your data frame for help for the helpers/answerers — Gwang-Jin Kim, Dec 11 '20 at 09:21
Note that in your code you are writing the `sample` column with lowercase "s", whereas in your screenshot it appears to be uppercase - that's a difference in R. — deschen, Dec 11 '20 at 09:26
`agdf <- aggregate(.~sample, d.f, function(x) c(mean = mean(x, na.rm = TRUE), sd = sd(x, na.rm = TRUE)))` — Ronak Shah, Dec 11 '20 at 09:41

score 1 · Answer 1 · answered Dec 11 '20 at 09:31

I always prefer the tidyverse way:

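    library(tidyverse)

    # across() skips the grouping column; na.rm is set inside the lambdas
    # because passing extra arguments through across() is deprecated in dplyr >= 1.1.0
    agdf <- d.f %>%
      group_by(sample) %>%
      summarize(across(everything(),
                       list(mean = ~ mean(.x, na.rm = TRUE),
                            sd = ~ sd(.x, na.rm = TRUE))))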

Here we assume that you want to aggregate all your columns except the grouping column. If you only want to summarize a few columns, you can adjust the across(...) part.
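As a rough sketch of adjusting the `across(...)` part, the example below summarizes only two columns. The data frame is a hypothetical reconstruction of the sample in the question, and the column names (`sample`, `Width`, `Weight`, `Length`) are assumptions; substitute the names in your own data.

```r
library(dplyr)

# Hypothetical data modeled on the question's sample (column names are assumed)
d.f <- data.frame(
  sample = factor(c("A1.1", "8.3", "A1.1", "8.3")),
  Width  = c(3.5, 4.2, 2.9, 3.7),
  Weight = c(6.7, 6.3, 5.7, 6.1),
  Length = c(5.8, 5.5, 5.1, 5.4)
)

# Summarize only Width and Weight per sample.
# Replace c(Width, Weight) with where(is.numeric) to take every numeric column.
agdf <- d.f %>%
  group_by(sample) %>%
  summarize(
    across(
      c(Width, Weight),
      list(mean = ~ mean(.x, na.rm = TRUE), sd = ~ sd(.x, na.rm = TRUE))
    ),
    .groups = "drop"
  )

print(agdf)
```

With the default naming, the result has one row per sample and columns such as `Width_mean` and `Width_sd`. The error in the question most likely arises because `aggregate(d.f, by = ...)` also applies the function to the grouping column itself, which is a factor, and `sd()` cannot handle a factor; passing only the numeric columns, or using the formula form suggested in the comments, avoids that.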
