
How do I split the first column into 2 components (e.g., 01 & run1) and create 2 other columns to store that information?

P = c('01_run1', '01_run2', '02_run1', '02_run2')
Score = c(1, 2, 3, 4)
df = data.frame(P, Score)

        P Score 
1 01_run1     1
2 01_run2     2
3 02_run1     3
4 02_run2     4

End Product

            P Score Number  Run
    1 01_run1     1     01 run1
    2 01_run2     2     01 run2
    3 02_run1     3     02 run1
    4 02_run2     4     02 run2

I can use strsplit() with split = '_' to separate the 2 components, but is there a way to create the 2 columns without using loops (which many advise against in R)?

TYL

2 Answers


We can try using sub here, for one base R option:

df$Number <- sub("_.*$", "", df$P)
df$Run    <- sub("^.*_", "", df$P)

The first call to sub uses the pattern _.*$ and replaces with empty string (i.e. deletes what is matched). What this would match is everything from underscore until the end of the string. Similarly, the second call to sub uses the pattern ^.*_, which would remove everything before and including the underscore. In both cases, it would leave us with the data we want.
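
To check the two patterns end to end, here is a minimal, self-contained sketch that rebuilds the data frame from the question and applies both calls. The last two lines are an extra illustration, not part of the original answer: they assume a hypothetical value with more than one underscore, to show that .* is greedy and the second pattern therefore strips through the last underscore.

```r
# Rebuild the example data from the question
df <- data.frame(
  P = c("01_run1", "01_run2", "02_run1", "02_run2"),
  Score = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Delete from the first underscore to the end of the string: keeps the prefix
df$Number <- sub("_.*$", "", df$P)

# Delete from the start through the underscore: keeps the suffix
df$Run <- sub("^.*_", "", df$P)

print(df)

# Hypothetical input with two underscores (not in the question's data)
sub("_.*$", "", "01_run_1")  # "01": cut at the first underscore
sub("^.*_", "", "01_run_1")  # "1": greedy match cuts through the last underscore
```

If the real data can contain more than one underscore per value, the two new columns will not cover the whole string between them, so it is worth checking the input format first.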

Tim Biegeleisen

How about this:

library(tidyr)  # separate()
library(dplyr)  # %>% and select()

df <- df %>%
  separate(P, c("Number", "Run"), sep = "_", remove = FALSE) %>%
  select(P, Score, Number, Run)
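
For comparison, the strsplit() idea from the question can also be made loop-free in base R. This is only a sketch, and it assumes every value of P contains exactly one underscore; the names parts and mat are introduced here for illustration.

```r
# Rebuild the example data from the question
df <- data.frame(
  P = c("01_run1", "01_run2", "02_run1", "02_run2"),
  Score = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# strsplit() returns a list with one character vector per row
parts <- strsplit(as.character(df$P), "_", fixed = TRUE)

# Stack the pieces row-wise into a two-column character matrix
# (assumes each element of parts has exactly two pieces)
mat <- do.call(rbind, parts)

df$Number <- mat[, 1]
df$Run    <- mat[, 2]

print(df)
```

The do.call(rbind, ...) step is what replaces the explicit loop: it binds all the split pieces in a single call instead of filling the columns row by row.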