
I would like to keep only the string after the last | sign in my rownames, which look like this:

in:

"d__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Chromatiales|f__Woeseiaceae|g__Woeseia"

out:

g__Woeseia

I have this code, which keeps everything from the last . onward (it replaces everything up to and including the last . with a single .):

gsub("^.*\\.",".",x)
user2300940

1 Answer


We can do this with a capture group. Using sub, match any characters (.*) up to the last | (\\|), capture the zero or more characters that are not a | (([^|]*)) up to the end ($) of the string, and replace the whole match with the backreference (\\1) to the captured group.

sub(".*\\|([^|]*)$", "\\1", str1)
#[1] "g__Woeseia"

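Or match all characters up to and including the last | and replace them with an empty string (""). Because .* is greedy, the match extends to the last | rather than stopping at the first one.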

sub(".*\\|", "", str1)
#[1] "g__Woeseia"
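
Since sub is vectorised, the same call should work on a whole character vector. Below is a minimal sketch under that assumption; the vector x is made up for illustration (only its first element comes from the question), and it includes a string with no | to show that such input is returned unchanged, because sub only replaces where the pattern matches.

```r
# Hypothetical input: only the first element is taken from the question.
x <- c("d__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Chromatiales|f__Woeseiaceae|g__Woeseia",
       "d__Bacteria|p__Proteobacteria",
       "g__Woeseia")

# Remove everything up to and including the last | in each element.
res <- sub(".*\\|", "", x)
res
#[1] "g__Woeseia"        "p__Proteobacteria" "g__Woeseia"
```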

data

str1 <- "d__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Chromatiales|f__Woeseiaceae|g__Woeseia"
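
The question asks about row names, so here is a hedged sketch of applying the same pattern to them. The data frame df and its count column are hypothetical, not from the question. A data frame does not allow duplicate row names, so the sketch passes the shortened names through make.unique; with a matrix that step may not be needed.

```r
# Hypothetical data frame; 'df' and 'count' are illustrative only.
df <- data.frame(
  count = c(10, 3),
  row.names = c("d__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Chromatiales|f__Woeseiaceae|g__Woeseia",
                "d__Bacteria|p__Proteobacteria|c__Gammaproteobacteria")
)

# Keep only the part after the last | in each row name.
new_names <- sub(".*\\|", "", rownames(df))

# make.unique() appends .1, .2, ... if shortening produced duplicates.
rownames(df) <- make.unique(new_names)
rownames(df)
#[1] "g__Woeseia"             "c__Gammaproteobacteria"
```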
akrun