1

There is a string column like this, in my data frame.

str=as.character(c("M 12; M 13","M 24", NA, "C 12; C 50; C 78"))
no=seq(1:4)
data.frame(no,str)

  no              str
1  1       M 12; M 13
2  2             M 24
3  3             <NA>
4  4 C 12; C 50; C 78

It has multiple values, separated by ";" symbol. I need to split this into multiple columns (3 columns based on this example) as each column contains only one value of the string. Will this be possible using R?

deschen
  • 10,012
  • 3
  • 27
  • 50
student_R123
  • 962
  • 11
  • 30

2 Answers2

1

You can use str_split. Note, I assume you want to give the new columns a meaningful name, so I added a rename_with, but you can just cut it if you don't need:

library(tidyverse)

df <- data.frame(str = c("M 12; M 13","M 24", NA, "C 12; C 50; C 78"),
                 no = seq(1:4))

df %>%
  mutate(splits = str_split(str, "; ")) %>%
  unnest_wider(splits) %>%
  rename_with(.cols = starts_with("..."),
              .fn   = ~paste0("split_", 1:length(which(str_detect(., "...")))))

which gives:

# A tibble: 4 x 5
  str                 no split_1 split_2 split_3
  <chr>            <int> <chr>   <chr>   <chr>  
1 M 12; M 13           1 M 12    M 13    <NA>   
2 M 24                 2 M 24    <NA>    <NA>   
3 <NA>                 3 <NA>    <NA>    <NA>   
4 C 12; C 50; C 78     4 C 12    C 50    C 78  
deschen
  • 10,012
  • 3
  • 27
  • 50
1

This is a good occasion to make use of extra = merge argument of separate:

library(dplyr)
df %>% 
  separate(str, c('A', 'B', 'C'), sep= ";", extra = 'merge')
  no    A     B     C
1  1 M 12  M 13  <NA>
2  2 M 24  <NA>  <NA>
3  3 <NA>  <NA>  <NA>
4  4 C 12  C 50  C 78
TarJae
  • 72,363
  • 6
  • 19
  • 66