I have a dataframe that looks like this:

df = data.frame(V1= c(1:3), V2 = c("abc-1-10", "def-2-19", "ghi-3-937"))

I would like to be able to add a column V3 to this dataframe such that the column V3 contains the string before the first hyphen ("-") in column V2. So it should look like this:

dfDesired = data.frame(V1= c(1:3), V2 = c("abc-1-10", "def-2-19", "ghi-3-937"), V3 = c("abc", "def","ghi"))

When I try using strsplit, it gives me a list of vectors, each holding the three parts of one string, rather than a single column.
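
A minimal reproduction of that behaviour, assuming the split is on a literal hyphen. The name `parts` is only illustrative, and `as.character()` is there in case V2 is stored as a factor, which was the default before R 4.0:

```r
df <- data.frame(V1 = 1:3, V2 = c("abc-1-10", "def-2-19", "ghi-3-937"))

# fixed = TRUE treats "-" as a literal character, not a regular expression
parts <- strsplit(as.character(df$V2), "-", fixed = TRUE)

# Show the structure: a list with one character vector per row of df
str(parts)
```

Each list element holds all the pieces of one string, so the result cannot be assigned to a new column directly.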

ag14

2 Answers

One way that avoids writing a regular expression yourself is tidyr::separate:

tidyr::separate(df, V2, into = "V3", extra = "drop", remove = FALSE, sep = "-")

  V1        V2  V3
1  1  abc-1-10 abc
2  2  def-2-19 def
3  3 ghi-3-937 ghi
Maël

You can drop everything from the first hyphen onward.

In base R:

transform(df, V3 = sub('-.*', '', V2))

#  V1        V2  V3
#1  1  abc-1-10 abc
#2  2  def-2-19 def
#3  3 ghi-3-937 ghi
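
If staying with strsplit from the question is preferred, the list it returns can be reduced to the first piece of each element. This is a minimal sketch rather than part of the answer above. The name `parts` is only illustrative, and `as.character()` is an assumption to cover V2 being a factor in R versions before 4.0:

```r
df <- data.frame(V1 = 1:3, V2 = c("abc-1-10", "def-2-19", "ghi-3-937"))

# Split each string on a literal hyphen; the result is a list of character vectors
parts <- strsplit(as.character(df$V2), "-", fixed = TRUE)

# Take element 1 of every vector; vapply checks each result is one string
df$V3 <- vapply(parts, `[`, character(1), 1)
```

vapply is used instead of sapply so that an unexpected result shape raises an error instead of silently returning a list.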
Ronak Shah