0

I need to reshape a wide-format dataframe into a long-format.

I am familiar with the melt() function and have used it multiple times.

df <- data.frame( Weekday = c( "Mon", "Tues", "Wed", "Thurs", "Fri" ),
                  Q1      = c(  9.9 ,  4.9  ,  8.8 ,   12.2 ,  12.2 ),
                  Q2      = c(  5.4 ,  9.7  , 11.1 ,   10.2 ,   8.1 ),
                  Q3      = c(  8.8 ,  7.9  , 10.2 ,   9.2  ,   7.9 ),
                  Q4      = c(  6.9 ,    5  ,  9.3 ,   9.7  ,   5.6 ) ) 

df_melt <- melt(df, id.vars=c("Weekday"),
           variable.name="Quarter",
           value.name="Delay")

The above function will give the following output:

Weekday  Quarter  Delay
Mon        Q1      9.9
Tues       Q1      4.9
Wed        Q1      8.8
Thurs      Q1     12.2
Fri        Q1     12.2
Mon        Q2      5.4
Tues       Q2      9.7
Wed        Q2     11.1
...       ...      ...

However, I would like my long-format to look like this:

Weekday  Quarter  Delay
Mon        Q1      9.9
Mon        Q2      5.4
Mon        Q3      8.8
Mon        Q4      6.9
Tues       Q1      4.9
Tues       Q2      9.7
...        ...      ...

Are there functions in R that allow me to do this?

user15141497
  • 73
  • 1
  • 6

3 Answers3

1

You can match and order the output with a vector of the order you want.

order_vec <- c('Mon', 'Tues', 'Wed', 'Thurs', 'Fri')
df_melt[order(match(df_melt$Weekday, order_vec)), ]

#   Weekday Quarter Delay
#1      Mon      Q1   9.9
#6      Mon      Q2   5.4
#11     Mon      Q3   8.8
#16     Mon      Q4   6.9
#2     Tues      Q1   4.9
#7     Tues      Q2   9.7
#...
#...

Instead of melt if you use tidyr::pivot_longer it will give you the order you want directly.

tidyr::pivot_longer(df, cols = -Weekday)

#   Weekday name  value
#   <chr>   <chr> <dbl>
# 1 Mon     Q1      9.9
# 2 Mon     Q2      5.4
# 3 Mon     Q3      8.8
# 4 Mon     Q4      6.9
# 5 Tues    Q1      4.9
# 6 Tues    Q2      9.7
# 7 Tues    Q3      7.9
#...
#...
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
1

We can melt the dataset and order on the factor converted 'Weekday' with the levels specified as the unique values

library(data.table)
melt(setDT(df), id.var = 'Weekday', variable.name = 'Quarter',
       value.name = 'Delay')[order(factor(Weekday, levels = unique(Weekday)))]

-output

  Weekday Quarter Delay
 1:     Mon      Q1   9.9
 2:     Mon      Q2   5.4
 3:     Mon      Q3   8.8
 4:     Mon      Q4   6.9
 5:    Tues      Q1   4.9
 6:    Tues      Q2   9.7
 7:    Tues      Q3   7.9
 8:    Tues      Q4   5.0
 9:     Wed      Q1   8.8
10:     Wed      Q2  11.1
11:     Wed      Q3  10.2
12:     Wed      Q4   9.3
13:   Thurs      Q1  12.2
14:   Thurs      Q2  10.2
15:   Thurs      Q3   9.2
16:   Thurs      Q4   9.7
17:     Fri      Q1  12.2
18:     Fri      Q2   8.1
19:     Fri      Q3   7.9
20:     Fri      Q4   5.6

Or this can be done with base R by replicating the 'Weekday' and the column names while transposing the values of the columns except the first

data.frame(Weekday = rep(df$Weekday, each = 4), 
    Quarter = rep(names(df)[-1], 5), Delay = c(t(df[-1])))
   Weekday Quarter Delay
1      Mon      Q1   9.9
2      Mon      Q2   5.4
3      Mon      Q3   8.8
4      Mon      Q4   6.9
5     Tues      Q1   4.9
6     Tues      Q2   9.7
7     Tues      Q3   7.9
8     Tues      Q4   5.0
9      Wed      Q1   8.8
10     Wed      Q2  11.1
11     Wed      Q3  10.2
12     Wed      Q4   9.3
13   Thurs      Q1  12.2
14   Thurs      Q2  10.2
15   Thurs      Q3   9.2
16   Thurs      Q4   9.7
17     Fri      Q1  12.2
18     Fri      Q2   8.1
19     Fri      Q3   7.9
20     Fri      Q4   5.6
akrun
  • 874,273
  • 37
  • 540
  • 662
0

For the sake of completeness, the as_factor() function from the package creates levels from a character vector in the order in which they appear. This can be utilized to order df_melt() as requested:

df_melt[order(forcats::as_factor(df_melt$Weekday)), ]
   Weekday Quarter Delay
1      Mon      Q1   9.9
6      Mon      Q2   5.4
11     Mon      Q3   8.8
16     Mon      Q4   6.9
2     Tues      Q1   4.9
7     Tues      Q2   9.7
12    Tues      Q3   7.9
17    Tues      Q4   5.0
3      Wed      Q1   8.8
8      Wed      Q2  11.1
13     Wed      Q3  10.2
18     Wed      Q4   9.3
4    Thurs      Q1  12.2
9    Thurs      Q2  10.2
14   Thurs      Q3   9.2
19   Thurs      Q4   9.7
5      Fri      Q1  12.2
10     Fri      Q2   8.1
15     Fri      Q3   7.9
20     Fri      Q4   5.6

This can be utilized as well to streamline akrun's data.table version:

library(data.table)
melt(setDT(df), id.var = 'Weekday', variable.name = 'Quarter',
     value.name = 'Delay')[order(forcats::as_factor(Weekday))]
    Weekday Quarter Delay
 1:     Mon      Q1   9.9
 2:     Mon      Q2   5.4
 3:     Mon      Q3   8.8
 4:     Mon      Q4   6.9
 5:    Tues      Q1   4.9
 6:    Tues      Q2   9.7
 7:    Tues      Q3   7.9
 8:    Tues      Q4   5.0
 9:     Wed      Q1   8.8
10:     Wed      Q2  11.1
11:     Wed      Q3  10.2
12:     Wed      Q4   9.3
13:   Thurs      Q1  12.2
14:   Thurs      Q2  10.2
15:   Thurs      Q3   9.2
16:   Thurs      Q4   9.7
17:     Fri      Q1  12.2
18:     Fri      Q2   8.1
19:     Fri      Q3   7.9
20:     Fri      Q4   5.6
Uwe
  • 41,420
  • 11
  • 90
  • 134