I am trying to build a patent network. I have a sample data frame (aa) that contains an ID variable (Origin) and a space-separated string variable (Target). I want to split each string into its individual elements and return them in long format, one row per element, as a new data frame (ab). I've tried combining strsplit, do.call and reshape, but to no avail. I'd appreciate any help.

From

aa<-data.frame(Origin=c(1,2,3),Target=c('a b c','d e','f g a b'))
aa

to

ab<-data.frame(Origin=c(rep(1,3),rep(2,2),rep(3,4)), Target=c('a','b','c','d','e','f','g','a','b'))
ab 

1 Answer

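You can achieve this with a combination of strsplit, mutate and unnest: strsplit splits each string on spaces into a character vector, mutate stores those vectors as a list column, and unnest expands that column to one row per element. The as.character call guards against Target being a factor, which is the default for strings in data.frame before R 4.0.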

library(dplyr)
library(tidyr)

aa %>% mutate(Target = strsplit(as.character(Target), " ")) %>% unnest(Target)

#   Origin Target
# 1      1      a
# 2      1      b
# 3      1      c
# 4      2      d
# 5      2      e
# 6      3      f
# 7      3      g
# 8      3      a
# 9      3      b
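
If you would rather avoid the package dependencies, the same reshaping can be sketched in base R. This is only a minimal alternative, assuming the elements are always separated by a single space; the helper name `parts` is my own and not part of the question.

```r
# Sample data from the question
aa <- data.frame(Origin = c(1, 2, 3),
                 Target = c("a b c", "d e", "f g a b"),
                 stringsAsFactors = FALSE)

# Split each Target string on a single space (assumed separator);
# 'parts' is a list with one character vector per row of aa
parts <- strsplit(as.character(aa$Target), " ", fixed = TRUE)

# Repeat each Origin once per element of its split string,
# then flatten the list into a single Target column
ab <- data.frame(Origin = rep(aa$Origin, lengths(parts)),
                 Target = unlist(parts),
                 stringsAsFactors = FALSE)
ab
```

Here lengths(parts) gives the number of elements per row, so rep keeps every Origin aligned with the elements that came from it.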
Prradep