1

I have a dataframe that looks like this enter image description here

I want the dataframe to be converted to long format like this

enter image description here

This is the code I use

long_ex <- melt(wide_ex, id.vars = 5, na.rm=TRUE) #wide_ex is wide format dataframe

However, my end result looks like this

enter image description here

Is there a way to use melt without extracting the column names? Will gladly accept alternative functions if melt is not best suited for this job

Edit: Data output from dput

structure(list(ï..Column1 = c(NA, NA, NA, NA), Column2 = c(NA, 
NA, NA, NA), Column3 = c(NA, NA, NA, NA), Column4 = c(NA, NA, 
NA, NA), Column5 = structure(c(2L, 1L, 4L, 3L), .Label = c("Eric ", 
"Jim", "Matt", "Tim"), class = "factor"), Column6 = c(NA, NA, 
NA, NA), Column7 = structure(c(1L, 3L, 2L, 3L), .Label = c("Eric", 
"Erica", "Mary "), class = "factor"), Column8 = structure(c(3L, 
2L, 1L, 3L), .Label = c("Beth", "Loranda", "Matt"), class = "factor"), 
    Column9 = structure(c(2L, 3L, 1L, 3L), .Label = c("Courtney ", 
    "Heather ", "Patrick"), class = "factor"), Column10 = structure(4:1, .Label = c("Beth", 
    "Heather", "John", "Loranda "), class = "factor"), Column11 = c(NA, 
    NA, NA, NA), Column12 = c(NA, NA, NA, NA), Column13 = c(NA, 
    NA, NA, NA), Column14 = c(NA, NA, NA, NA), Column15 = c(NA, 
    NA, NA, NA)), class = "data.frame", row.names = c(NA, -4L
))
Jin
  • 527
  • 6
  • 21
  • 1
    Why can you not delete the 'column column' afterwards? Anyhow, can you share your data using `dput` ? – markus Jan 04 '19 at 20:36
  • @markus I know that's a very easy solution, but I'm curious about alternative methods for my own edification – Jin Jan 04 '19 at 20:41

2 Answers2

3

If you want a base R solution:

data.frame(name_1 = rep(as.character(wide_ex$Column5), each=nrow(wide_ex)),
    name_2 = as.vector(t(wide_ex[, c("Column7", "Column8", "Column9", "Column10")])))

I'm still of the opinion the most concise method is melt with data.table:

library(data.table)
setDT(wide_ex)
melt(wide_ex, id.vars = c("Column5"), na.rm=TRUE)[,variable := NULL][]

It will also offer considerable speed improvements over reshape2 if speed is of concern.



(Additional explanation) ... What's with the use of additional []?

A) the use of additional [] in data.table is known as chaining. It allows you to perform more operations on preceding []'s.

As you originally indicated, the output of melt produces an unwanted column (variable). variable := NULL removes it. It's essentially the same as doing the following (on your original question):

 long_ex <- melt(wide_ex, id.vars = 5, na.rm=TRUE) 
 long_ex$variable <- NULL

However the use of := does it by reference (and on the same line).

Chaining can be super useful to keep your code nice and concise. Say you wanted to order your output by the first column (as you kind of indicated in your original question). You could do this like so:

melt(wide_ex, id.vars = c("Column5"), na.rm=TRUE)[, variable := NULL][order(Column5)][]

data.table really is an amazing package (especially if you're dealing with medium to large data). If you're interested, I'd suggest reading & learning more about it: https://github.com/Rdatatable/data.table/wiki

Khaynes
  • 1,976
  • 2
  • 15
  • 27
1

Similarly using melt for a tidyverse approach

library(tidyverse)
library(data.table)

df %>%
  melt(id.vars=5) %>%
  filter(complete.cases(.)) %>%
  select(c(1,3))

 Column5     value
1    Jim     Eric
2    Eric    Mary 
3    Tim     Erica
4    Matt    Mary 
5    Jim     Matt
6    Eric    Loranda
7    Tim     Beth
8    Matt    Matt
9    Jim     Heather 
10   Eric    Patrick
11   Tim     Courtney 
12   Matt    Patrick
13   Jim     Loranda 
14   Eric    John
15   Tim     Heather
16   Matt    Beth
NColl
  • 757
  • 5
  • 19