I'm new to coding in general and have been through hundreds of Google searches and Stack Overflow threads, but nothing I've found really explains how my solution works. There are many ways to melt data, and most of them appear overly complex. I'm curious why my solution works when the other solutions are so much more complicated.

Original dataframe

df <- data.frame(
  ResponseID = c("123", "1234"),
  Q1_W1 = c("Agree", "strongly disagree"),
  Q1_W2 = c("disagree", "Agree"),
  Q2_W1 = c("Disagree", "Disagree"),
  Q2_W2 = c("Agree", "NA")
)

Desired output

ResponseID  variable  value     variable  value
123         Q1_W1     agree     Q2_W1     disagree
1234        Q1_W2     disagree  Q2_W2     agree

I was able to achieve this with:

nalh5 <- ALH %>%
  gather(question, response, Q1_W1:Q1_W7) %>%
  gather(q2, r2, Q2_W1:Q2_W3) %>%
  gather(q3, r3, Q3_W1:Q3_W5)

It works well, but are there more efficient ways to achieve this?

alphamelt
  • You did things in reverse; give us the original messy data, not the version you were able to tidy up – Bruno Dec 27 '19 at 14:56
  • Why is that the output you want? That format makes it less clear how variables and values match – camille Dec 27 '19 at 16:18

1 Answer

I guess this is cleaner, but in my opinion you are still butchering an already tidy dataset.

library(tidyr)

df %>%
  pivot_longer(cols = contains("Q1"), names_to = "Q1_questions", values_to = "Q1_answers") %>%
  pivot_longer(cols = contains("Q2"), names_to = "Q2_questions", values_to = "Q2_answers")
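
One thing worth checking with chained calls like these: each pivot_longer() works on the rows left by the previous one, so every Q1 row ends up paired with every Q2 row. The sketch below rebuilds the question's example df (copied as posted, including the literal "NA" string) and compares row counts. It assumes tidyr 1.0.0 or later is installed; the name chained is only for illustration.

```r
library(tidyr)

# Example data copied from the question as posted.
df <- data.frame(
  ResponseID = c("123", "1234"),
  Q1_W1 = c("Agree", "strongly disagree"),
  Q1_W2 = c("disagree", "Agree"),
  Q2_W1 = c("Disagree", "Disagree"),
  Q2_W2 = c("Agree", "NA")
)

# Two pivots in a row: the second one lengthens the output of the first.
chained <- df %>%
  pivot_longer(cols = contains("Q1"), names_to = "Q1_questions", values_to = "Q1_answers") %>%
  pivot_longer(cols = contains("Q2"), names_to = "Q2_questions", values_to = "Q2_answers")

nrow(df)       # 2 respondents
nrow(chained)  # 8 rows: 2 respondents x 2 Q1 columns x 2 Q2 columns
```

With more question blocks or more columns per block, the row count multiplies accordingly, which may or may not be what you want.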

You can even turn it into a function:

library(stringr)  # str_c() comes from stringr

# Pivot every column whose name contains Q into a pair of
# "<Q>_questions" / "<Q>_answers" columns.
butcher_function <- function(df, Q) {
  names_to_par <- str_c(Q, "questions", sep = "_")
  values_to_par <- str_c(Q, "answers", sep = "_")

  pivot_longer(data = df,
               names_to = names_to_par,
               values_to = values_to_par,
               cols = contains(Q))
}

df %>%
  butcher_function(Q = "Q1") %>%
  butcher_function(Q = "Q2")
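
If the goal is simply one row per respondent and question, a single pivot_longer() call may be enough. This is a sketch rather than a drop-in replacement: it assumes tidyr 1.0.0 or later, assumes every column other than ResponseID follows the Q<n>_W<m> naming pattern from the question, and the output names question, suffix, and response are made up for illustration.

```r
library(tidyr)

# Example data copied from the question as posted.
df <- data.frame(
  ResponseID = c("123", "1234"),
  Q1_W1 = c("Agree", "strongly disagree"),
  Q1_W2 = c("disagree", "Agree"),
  Q2_W1 = c("Disagree", "Disagree"),
  Q2_W2 = c("Agree", "NA")
)

# Fully long: split each column name at "_" into two key columns.
long <- pivot_longer(
  df,
  cols = -ResponseID,
  names_to = c("question", "suffix"),
  names_sep = "_",
  values_to = "response"
)

# Half long: the special ".value" keeps one column per question (Q1, Q2)
# and only stacks the part after the underscore.
by_suffix <- pivot_longer(
  df,
  cols = -ResponseID,
  names_to = c(".value", "suffix"),
  names_sep = "_"
)
```

Here long has one row per respondent and original column, while by_suffix has one row per respondent and suffix, with the Q1 and Q2 answers side by side.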
Bruno
  • Apologies, I did display the messy data incorrectly; the butcher function is a nice touch though, ha. What I should have shown is the general messy data that comes from surveys: when you have multiple questions, each with multiple possible answers, you want all of that data grouped together in long format along with the corresponding values. Thank you for the assist. I'm still working on getting devtools set up so I can get pivot_longer. – alphamelt Dec 27 '19 at 18:06