How to most efficiently melt a data.frame with multiple measured column groups

Question

I have a data frame of the form:

df <- data.frame("id_1" = seq(1,100, 1),
                 "id_2" = seq(1,100, 1),
                 "id_3" = seq(1,100, 1),
                 "m1_g1" = seq(1,100, 1),
                 "m2_g1" = seq(2,200, 2),
                 "m3_g2" = seq(3,300, 3),
                 "m4_g2" = seq(4,400, 4))

I would like to melt this so that the id columns serve as id's, but then I effectively have 2 rows per original entry that show me the m1-m4 columns where m1-m2 are of the same measure type (g1) and m3-m4 are of another measure type (g2)

The final data frame would look like this:

   id_1    id_2    id_3   var   value_1    value_2
1    1       1       1     1       1           3
2    1       1       1     2       2           4
3    2       2       2     1       2           6
4    2       2       2     2       4           8

I've tried using melt to create separate melted data frames and then pasting over the column, but I feel there has got to be a better way.

Thank you all!

I don't understand what `var`, `value_1` and `value_2` represent, could you clarify? — Andrea M, Mar 22 '22 at 21:38
Assume that from the original data frame m1 and m2 are two time points of the measurement g1 and that m3-m4 are the same two time points but for a different measurement g2 I want to melt m1-m4 such that m1-m2 are grouped and m3-m4 are grouped. In doing so each original row in the data frame should become two rows The first would be var = 1 (the first time point) for measurements g1 and g2 The second would be the same as the prior row, but end in var = 2 (the second time point) for measurements g1 and g2) — wingsoficarus116, Mar 22 '22 at 21:53

score 1 · Accepted Answer · answered Mar 22 '22 at 21:50

Using tidyr, we can do it in one pivot, but we need to rename some of them first so that the leading m*_ is less different.

library(dplyr)
library(tidyr) # pivot_longer
rename(df, m1_g2 = m3_g2, m2_g2 = m4_g2) %>%
  pivot_longer(-starts_with("id"), names_pattern = "m(.*)_g(.*)", names_to = c("val", ".value"))
# # A tibble: 200 x 6
#     id_1  id_2  id_3 val     `1`   `2`
#    <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
#  1     1     1     1 1         1     3
#  2     1     1     1 2         2     4
#  3     2     2     2 1         2     6
#  4     2     2     2 2         4     8
#  5     3     3     3 1         3     9
#  6     3     3     3 2         6    12
#  7     4     4     4 1         4    12
#  8     4     4     4 2         8    16
#  9     5     5     5 1         5    15
# 10     5     5     5 2        10    20
# # ... with 190 more rows

How to most efficiently melt a data.frame with multiple measured column groups

1 Answers1