I have a large dataframe with 3 blocks of incomplete data. I want to convert this dataframe from a wide to long format using R.
Example
df <- structure(list(V1 = 1234:1240, V2 = structure(1:7, .Label = c("text1","text2", "text3", "text4", "text5", "text6", "text7"), class = "factor"), V3 = structure(c(1L, 1L, 1L, 1L, NA, NA, NA), .Label = "constant1", class = "factor"), V4 = structure(c(1L, 1L, 2L, 3L, NA, NA, NA), .Label = c("VariableA1", "VariableA2", "VariableA3"), class = "factor"), V5 = structure(c(1L, 2L, 1L, 2L, NA, NA, NA), .Label = c("VariableA4", "VariableA5"), class = "factor"), V6 = structure(c(NA, NA, NA, 1L, 1L,NA, NA), .Label = "constant2", class = "factor"), V7 = structure(c(NA, NA, NA, 1L, 2L, NA, NA), .Label = c("VariableB1", "VariableB2"), class = "factor"), V8 = structure(c(NA, NA, 1L, NA, NA,
1L, 1L), .Label = "constant3", class = "factor"), V9 = structure(c(NA, NA, 1L, NA, NA, 1L, 2L), .Label = c("VariableC1", "VariableC2"), class = "factor"), V10 = structure(c(NA, NA, 1L, NA, NA, 2L, 1L), .Label = c("VariableC3", "VariableC4"), class = "factor")), class = "data.frame", row.names = c(NA,-7L))
The data looks like this at present
1234 text1 constant1 VariableA1 VariableA4 NA NA NA NA NA
1235 text2 constant1 VariableA1 VariableA5 NA NA NA NA NA
1236 text3 constant1 VariableA2 VariableA4 NA NA constant3 VariableC1 VariableC3
1237 text4 constant1 VariableA3 VariableA5 constant2 VariableB1 NA NA NA
1238 text5 NA NA NA constant2 VariableB2 NA NA NA
1239 text6 NA NA NA NA NA constant3 VariableC1 VariableC4
1240 text7 NA NA NA NA NA constant3 VariableC2 VariableC3
What I want is
1234 text1 constant1 VariableA1 VariableA4
1235 text2 constant1 VariableA1 VariableA5
1236 text3 constant1 VariableA2 VariableA4
1236 text3 constant3 VariableC1 VariableC3
1237 text4 constant1 VariableA3 VariableA5
1237 text4 constant2 VariableB1 NA
1238 text5 constant2 VariableB2 NA
1239 text6 constant3 VariableC1 VariableC4
1240 text7 constant3 VariableC2 VariableC3
In the actual data the values in the column 1 and 2 are not as consistent as those included here. In columns 3 to 10 the constant and Variables could be 1 to 3 different character values.
This is the closest potential answer I can find so far
Edit The ultimate aim of the query is to consolidate a wide df into a long format. Columns 3:5, 6:7 and 8:10 could be viewed as blocks. The answer below works but its not elegant or consistent. I need to manually sort the column orders after each run.