I have been stuck on this for a while. I am working on a complex survey with 26 questions, which were answered in randomised order. The problem is not about shifting NA data; it is about merging 2 columns without adding their contents together. I need to create 1 column containing all the rows from both columns, plus one more column showing the condition assigned to each id/row.

This is an image of how my data looks right now and how it should look. Var 1, Var 2, and Var 3 are identical variables, but they were assigned to different conditions.

Is there any way to do this in R? Otherwise I will do it manually (after 7 hours of researching how to do this in R, I think it is easier just to open Excel). Thank you!

Later edit: I managed to do this with coalesce:

library(dplyr)

# create sample data frame
df <- data.frame(id = c(1, 2, 3),
                 var1 = c("apple", NA, "banana"),
                 var2 = c("", "orange", "yellow"))

# use the value from var2 wherever var1 is NA
df$var1 <- coalesce(df$var1, df$var2)

# remove var2 column
df <- select(df, -var2)

# print the final data frame
print(df)
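
The coalesce step above merges the two columns but does not yet produce the extra condition column. A minimal sketch of one way to get both, assuming each row has an answer in exactly one of the two columns; the column names `var1_a` and `var1_b` and the labels "A" and "B" are made up for illustration:

```r
library(dplyr)

# hypothetical sample data: the same question asked under two conditions
df <- data.frame(id = c(1, 2, 3),
                 var1_a = c("apple", NA, "banana"),
                 var1_b = c(NA, "orange", NA))

result <- df %>%
  mutate(
    # record which column held the answer (NA if both are missing)
    condition = case_when(
      !is.na(var1_a) ~ "A",
      !is.na(var1_b) ~ "B"
    ),
    # merge the two columns into one
    var1 = coalesce(var1_a, var1_b)
  ) %>%
  select(id, condition, var1)

print(result)
```

If empty strings are used instead of NA for missing answers, they would need converting first, for example with `na_if(var1_b, "")`.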
Andreea

1 Answer

You could do that with Excel, but if for any reason you need to redo it, it's the same work all over again. Writing a script might take longer the first time, but it is much quicker every time after that.

Anyway, one solution in base R would simply be to run something like:

var1_new <- ifelse(randomiser == 1, var1_1, var1_2)

for each variable.

Please note that your variables are probably named differently than in your example table. By default, a data frame cannot have two variables both named Var 1, and spaces are not permitted in variable names either.
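
To make the ifelse approach concrete, here is a small sketch on made-up data. It assumes the dataset has a column recording which condition each respondent got (called `randomiser` here, as above) and that the two versions of the variable are named `var1_1` and `var1_2`; your actual names will differ.

```r
# hypothetical sample data: each respondent answered var1 under exactly one
# of two conditions; randomiser records which one (1 or 2)
df <- data.frame(
  id         = 1:4,
  randomiser = c(1, 2, 1, 2),
  var1_1     = c("apple", NA, "banana", NA),
  var1_2     = c(NA, "orange", NA, "pear"),
  stringsAsFactors = FALSE
)

# take the answer from whichever column matches the assigned condition
df$var1_new <- ifelse(df$randomiser == 1, df$var1_1, df$var1_2)

# keep the id, the condition, and the merged answer
result <- df[, c("id", "randomiser", "var1_new")]
print(result)
```

Since the `randomiser` column is kept, it already serves as the condition column asked for in the question.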

BoTz