I'm trying to get the minimum time for each row in a dataframe. I don't know the names of the columns that I will be choosing, but I do know they will be the first to fifth columns:

data <- structure(list(Sch1 = c(99, 1903, 367), 
               Sch2 = c(292,248, 446), 
               Sch3 = c(252, 267, 465), 
               Sch4 = c(859, 146,360), 
               Sch5 = c(360, 36, 243),
               Student.ID = c("Ben", "Bob", "Ali")),
          .Names = c("Sch1", "Sch2", "Sch3", "Sch4", "Sch5", "Student.ID"), row.names = c(NA, 3L), class = "data.frame")

# this returns the overall min across ALL rows (the same value repeated for every row)
data %>% rowwise() %>% mutate(min_time = min(.[[1]], .[[2]], .[[3]], .[[4]], .[[5]]))

# this returns the min for EACH row, but requires knowing the column names
data %>% rowwise() %>% mutate(min_time = min(Sch1, Sch2, Sch3, Sch4, Sch5))

Should the column notation .[[1]] return all values when in rowwise mode? I've also tried grouping on Student.ID instead of using rowwise(), but this doesn't make any difference.

zx8754
pluke

1 Answer

The reason the column notation .[[1]] returns all values, even under rowwise() or group_by(), is that . is not actually grouped. In a magrittr pipe, . stands for the whole object on the left-hand side, which is the full dataset you started with. So when you call .[[1]], you are accessing every value in the first column, not just the value for the current row.
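To see this for yourself, here is a minimal sketch (assuming dplyr is installed) that compares how many values each notation sees inside a rowwise mutate(). The two-column data frame and the names n_whole and n_row are made up for illustration:

```r
library(dplyr)

# Small illustrative data frame (hypothetical, two columns only)
data <- data.frame(Sch1 = c(99, 1903, 367), Sch5 = c(360, 36, 243))

res <- data %>%
    rowwise() %>%
    mutate(n_whole = length(.[[1]]),  # . is the whole piped-in data frame
           n_row   = length(Sch1))    # a bare column name is evaluated per row

res
```

Here n_whole should come out as 3 on every row, because .[[1]] is the full first column, while n_row should be 1, because Sch1 is sliced to the current row.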

One workaround is to add a row-number column with mutate() first. You can then index each full column at the current row's number. The following should do:

data %>%
    mutate(rn = row_number()) %>%
    rowwise() %>%
    mutate(min_time = min(.[[1]][rn], .[[2]][rn], .[[3]][rn], .[[4]][rn], .[[5]][rn])) %>%
    select(-rn)

This should yield:

#    Sch1  Sch2  Sch3  Sch4  Sch5 Student.ID min_time
#   <dbl> <dbl> <dbl> <dbl> <dbl>      <chr>    <dbl>
# 1    99   292   252   859   360        Ben       99
# 2  1903   248   267   146    36        Bob       36
# 3   367   446   465   360   243        Ali      243
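If you don't need to stay inside a dplyr pipeline, a base R alternative (not the approach above, just a sketch of a different technique) is to pass the first five columns to pmin() by position. pmin() and do.call() are base R functions; the data is the same as in the question:

```r
# Same data as in the question, rebuilt with data.frame()
data <- data.frame(Sch1 = c(99, 1903, 367),
                   Sch2 = c(292, 248, 446),
                   Sch3 = c(252, 267, 465),
                   Sch4 = c(859, 146, 360),
                   Sch5 = c(360, 36, 243),
                   Student.ID = c("Ben", "Bob", "Ali"))

# data[1:5] selects columns by position, so no column names are needed;
# do.call() hands each column to pmin() as a separate argument, and
# pmin() then takes the element-wise (i.e. per-row) minimum
data$min_time <- do.call(pmin, data[1:5])

data$min_time
# [1]  99  36 243
```

This avoids rowwise() entirely, since pmin() is already vectorised across rows.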
Abdou