I'm trying to merge four data frames whose columns overlap. However, in some of the data frames each cell holds a single value, while in others several values are combined into one cell, separated by semicolons.
I simplified the dataset to only two data frames for the example. Here's an example of the data and code I've tried:
x <- as.data.frame(cbind(c("1;2;3", "4", "5;6"), c("a;b;c", "d", "e;f")))
y <- as.data.frame(cbind(c(10:1), c("X", "Y", "Z;bird;goat", "A;hello", "B;cat", "C", "XX;what", "YY", "ZZ", "AA"), c(20:29), c(40:49)))
new_data <- c()
for (i in 1:nrow(x)) {
  if (length(unlist(strsplit(x$V1[i], ";"))) == 1) {
    foo.rows <- merge(x[i, ], y, by = "V1")
  }
  if (length(unlist(strsplit(x$V1[i], ";"))) > 1) {
    foo.ID <- unlist(strsplit(x$V1[i], ";"))
    foo.row <- c()
    for (k in 1:length(foo.ID)) {
      foo.row <- y[y$V1 == foo.ID[k], ]
      foo.rows <- paste0(foo.row, ";") # clearly doesn't do what I'm aiming for
    }
  }
  new_data <- rbind(new_data, foo.rows)
}
If it's unclear what the output should be, I can add it; just let me know. Basically, I want y merged/cbinded to x by V1. But if there are multiple codes in x$V1, the corresponding columns in y should be combined before merging/cbinding to x.
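To make the goal concrete, here is a rough base R sketch of the kind of result I have in mind. It assumes "combined" means pasting the matching y values together with ";" in the same order as the codes in x$V1. That is only my guess at a sensible output format, and the names y_collapsed, result and the "y." prefix are made up for illustration.

```r
# Example data from above; stringsAsFactors = FALSE keeps every column as character
x <- as.data.frame(cbind(c("1;2;3", "4", "5;6"), c("a;b;c", "d", "e;f")),
                   stringsAsFactors = FALSE)
y <- as.data.frame(cbind(c(10:1),
                         c("X", "Y", "Z;bird;goat", "A;hello", "B;cat",
                           "C", "XX;what", "YY", "ZZ", "AA"),
                         c(20:29), c(40:49)),
                   stringsAsFactors = FALSE)

# For each row of x: split its codes, look up the matching rows of y,
# and collapse every non-key column of y into one ";"-separated string
rows <- lapply(strsplit(x$V1, ";", fixed = TRUE), function(ids) {
  hits <- y[match(ids, y$V1), -1, drop = FALSE]  # rows of y, in code order
  as.data.frame(lapply(hits, paste, collapse = ";"), stringsAsFactors = FALSE)
})

# Stack the one-row results and bind them next to x
y_collapsed <- do.call(rbind, rows)
names(y_collapsed) <- paste0("y.", names(y_collapsed))  # avoid clashing with x$V2
result <- cbind(x, y_collapsed)

# result$y.V2 is c("AA;ZZ;YY", "XX;what", "C;B;cat")
```

One caveat with this sketch: a code in x$V1 that has no match in y$V1 ends up as the string "NA" in the pasted value, so that case would need separate handling.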