I have an example data frame like the one below.
ID | File |
---|---|
1 | 11_213.csv |
2 | 13_256.csv |
3 | 11_223.csv |
4 | 12_389.csv |
5 | 14_456.csv |
6 | 12_345.csv |
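For reproducibility, here is a minimal sketch that builds the example data in base R. The name `df` matches the code further down; the column types (integer `ID`, character `File`) are an assumption on my part.

```r
# Build the example data frame shown in the table above.
# Assumption: ID is an integer column and File is a character column.
df <- data.frame(
  ID = 1:6,
  File = c("11_213.csv", "13_256.csv", "11_223.csv",
           "12_389.csv", "14_456.csv", "12_345.csv"),
  stringsAsFactors = FALSE
)
```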
I want to add another column containing the string between the underscore and the period, so that the data frame looks like this:
ID | File | Group |
---|---|---|
1 | 11_213.csv | 213 |
2 | 13_256.csv | 256 |
3 | 11_223.csv | 223 |
4 | 12_389.csv | 389 |
5 | 14_456.csv | 456 |
6 | 12_345.csv | 345 |
I think I need to use the `str_extract()` function from stringr, but I am not sure what notation to use for my pattern. For example, when I use:
    library(dplyr)
    library(stringr)
    df <- df %>%
      mutate(Group = str_extract(File, "[^_]+"))
I get everything before the underscore, like this:
ID | File | Group |
---|---|---|
1 | 11_213.csv | 11 |
2 | 13_256.csv | 13 |
3 | 11_223.csv | 11 |
4 | 12_389.csv | 12 |
5 | 14_456.csv | 14 |
6 | 12_345.csv | 12 |
But that is not what I want. What should I use instead of `"[^_]+"` to get just the part between the underscore and the period? Thanks!