I need to split a string and keep all the characters before ^.

Example: I have a column in a data frame that reads:

2567543^ABC 
7545435^J 
8934939^XY

and the result column in the same dataframe should read:

2567543
7545435
8934939

I tried using stringr, strsub{base}, stringi, and gsubfn, but they all give weird results because of the ^. I cannot replace the ^ because the table is simply huge.

1 Answer

Just remove everything from ^ to the end of the string using the sub function. Since ^ is a regex metacharacter that matches the start of a line, you need to escape it in order to match a literal ^ symbol.

sub("\\^.*", "", df$x)

Example:

> df <- data.frame(x=c("2567543^ABC", "7545435^J", "8934939^XY"))
> df$x <- sub("\\^.*", "", df$x)
> df
        x
1 2567543
2 7545435
3 8934939
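To see why the escape matters, here is a minimal sketch in base R comparing the unescaped and escaped patterns on the same sample values. The variable names `unescaped` and `escaped` are made up for illustration.

```r
x <- c("2567543^ABC", "7545435^J", "8934939^XY")

# Unescaped: "^" anchors at the start of the string, so "^.*" matches
# the whole string and everything is removed.
unescaped <- sub("^.*", "", x)

# Escaped: "\\^" matches a literal caret, so only the caret and the
# text after it are removed.
escaped <- sub("\\^.*", "", x)

print(unescaped)
print(escaped)
```

The unescaped pattern wipes out each value entirely, which may be the kind of "weird result" the question describes.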

OR

> df <- data.frame(x=c("2567543^ABC", "7545435^J", "8934939^XY"))
> # take the first piece of every row, not only of row 1
> df$x <- sapply(strsplit(as.character(df$x), "\\^"), `[`, 1)
> df
        x
1 2567543
2 7545435
3 8934939

OR

Use the fixed=TRUE argument in strsplit so that ^ is treated as a literal character instead of a regex metacharacter.

> df <- data.frame(x=c("2567543^ABC", "7545435^J", "8934939^XY"))
> df$x <- sapply(strsplit(as.character(df$x), "^", fixed=TRUE), `[`, 1)
> df
        x
1 2567543
2 7545435
3 8934939
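Since the table is described as huge, a fully vectorized variant that avoids building a list of pieces may also be worth trying. The following is only a sketch under assumptions: the column is called `x` as in the examples above, the output column name `result` is hypothetical, and a fourth value without a caret is added to show how that case is handled.

```r
df <- data.frame(x = c("2567543^ABC", "7545435^J", "8934939^XY", "1234567"),
                 stringsAsFactors = FALSE)

# Position of the first literal caret in each value; -1 where there is none.
# as.integer() drops the extra attributes that regexpr() attaches.
pos <- as.integer(regexpr("^", df$x, fixed = TRUE))

# Keep the text before the caret; leave values without a caret unchanged.
df$result <- ifelse(pos > 0, substr(df$x, 1, pos - 1), df$x)

print(df)
```

Writing to a separate column keeps the original values intact, which matches the "result column in the same dataframe" wording of the question.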
Avinash Raj