Below is the first dataframe where I want to remove the first 3 rows:
book1 <- structure(list(Instructions..xyz = c("Note: abc", "", "Set1",
"id", "632592651", "633322173", "634703802", "634927873", "635812953",
"636004739", "636101211", "636157799", "636263106", "636752420"
), X = c("", "", "", "title", "asdf", "cat", "dog", "mouse",
"elephant", "goose", "rat", "mice", "kitty", "kitten"), X.1 = c("",
"", "", "hazard", "y", "y", "y", "n", "n", "y", "y", "n", "n",
"y"), X.2 = c("", "", "Set2", "id", "632592651", "633322173",
"634703802", "634927873", "635812953", "636004739", "636101211",
"636157799", "636263106", "636752420"), X.3 = c("", "", "", "title",
"asdf2", "cat2", "dog2", "mouse2", "elephant2", "goose2", "rat2",
"mice2", "kitty2", "kitten2"), X.4 = c("", "", "", "index", "0.664883807",
"0.20089779", "0.752228086", "0.124729276", "0.626285086", "0.134537909",
"0.612526768", "0.769622463", "0.682532524", "0.819015658")), class = "data.frame", row.names = c(NA,
-14L))
I did book1 <- book1[-c(1:3),]
but I'm not sure how to make id, title, hazard, id, title, index as the column name instead of Instructions..xyz, etc. See image below for desired output
Then for the second dataframe,
book2 <- structure(list(identity = c(632592651L, 633322173L, 634703802L,
634927873L, 635812953L, 636004739L, 636101211L, 636157799L, 636263106L,
636752420L, 636809222L, 2004722036L, 2004894388L, 2005045755L,
2005535472L, 2005630542L, 2005788781L, 2005809679L, 2005838317L,
2005866692L), text = c("asdf_xyz", "cat", "dog", "mouse", "elephant",
"goose", "rat", "mice", "kitty", "kitten", "tiger_xyz", "lion",
"leopard", "ostrich", "kangaroo", "platypus", "fish", "reptile",
"mammals", "amphibians_xyz"), volume = c(1234L, 432L, 324L, 333L,
2223L, 412346L, 7456L, 3456L, 2345L, 2345L, 6L, 345L, 23L, 2L,
4778L, 234L, 8675L, 3459L, 8L, 9L)), class = "data.frame", row.names = c(NA,
-20L))
I then rename column 1 and 2 in book2 so that it matches that of book1 by names(book2)[1:2] <- c('id','title')
where I can later do inner_join. The desired output is shown in the image below by
library(dplyr)
book1 %>%
inner_join(book2, by = c("id", "title"))
This is taking quite a few steps and wondering if there's a simplified version to this?