1

I have a data frame A that I would like to melt row wise instead of column wise to look like B (which also excludes observations with NA in them). Can this be accomplished with the "melt" function?

A <- read.table(text="  Id1 Id2 Var1    Var2    Var3    
            1   1   1   2   NA  
            1   2   NA  3   4   
            1   3   5   6   7   ", header=T)


B <- read.table(text="  Id1 Id2 NewVar  
    1   1   1   
    1   1   2   
    1   2   3   
    1   2   4   
    1   3   5   
    1   3   6   
    1   3   7   ", header=T)

I found an answer to a similar question but the function seems like overkill and is beyond my current R skills.

Community
  • 1
  • 1
RTrain3k
  • 845
  • 1
  • 13
  • 27

3 Answers3

5

From the tidyverse, melt -> summarise -> unnest

library(reshape2)
library(tidyr)
melt(A, id=c("Id1", "Id2")) %>% group_by(Id1, Id2) %>%
  summarise(value=list(value[!is.na(value)])) %>% unnest()
# Source: local data frame [7 x 3]
# Groups: Id1 [1]
# 
#     Id1   Id2 value
#   <int> <int> <int>
# 1     1     1     1
# 2     1     1     2
# 3     1     2     3
# 4     1     2     4
# 5     1     3     5
# 6     1     3     6
# 7     1     3     7

Edit

Since melt has an NA removal argument (credit: @Frank) , a one-liner will do. Note: this line is a reshape2 version of akrun's data.table approach:

melt(A, id=c("Id1", "Id2"), na.rm=TRUE)[-3]
Pierre L
  • 28,203
  • 6
  • 47
  • 69
4

We can order it after melting

library(data.table)
melt(setDT(A), id.var = c("Id1", "Id2"), na.rm = TRUE, value.name = "NewVar"
     )[order(Id1, Id2)][, variable := NULL][]
#   Id1 Id2 NewVar
#1:   1   1     1
#2:   1   1     2
#3:   1   2     3
#4:   1   2     4
#5:   1   3     5
#6:   1   3     6
#7:   1   3     7

Or we can do with base R by replicating to expanding the rows of 'A' for the 'Id' columns, and cbind with the transpose of non-Id columns, concatenated.

`row.names<-`(na.omit(data.frame(A[rep(1:nrow(A), each =3), 1:2], 
                    NewVar= c(t(A[-(1:2)])))), NULL)
#  Id1 Id2 NewVar
#1   1   1      1
#2   1   1      2
#3   1   2      3
#4   1   2      4
#5   1   3      5
#6   1   3      6
#7   1   3      7
akrun
  • 874,273
  • 37
  • 540
  • 662
  • 1
    I don't understand, isnt this question just a basic reshape with minor formatting? `reshape(A, dir = 'long', varying = 3:5, sep = '')` – rawr Aug 26 '16 at 15:20
2

You can use dplyr with gather from the tidyr package:

library(dplyr)
library(tidyr)
A %>% gather(Key,NewVar,Var1:Var3) %>% 
      filter(!is.na(NewVar)) %>% 
      arrange(Id1,Id2) %>% select(-Key)
##  Id1 Id2 NewVar
##1   1   1      1
##2   1   1      2
##3   1   2      3
##4   1   2      4
##5   1   3      5
##6   1   3      6
##7   1   3      7

This will gather the columns Var1:Var3 into rows in the NewVar column. The Key column will contain the names of the column that is gathered, which we don't need in the final result. After gathering, filter out the NA rows in NewVar. Then sort by ID1 followed by ID2 and remove the Key column.

aichao
  • 7,375
  • 3
  • 16
  • 18