I have data like this:

0266-VN22.5P-AC
0292-VN22.6P-BC
0300-VN22.7P-CC
0316-VN22.8P-DC

I want to remove everything up to and including the first hyphen, so that the result looks like this:

VN22.5P-AC
VN22.6P-BC
VN22.7P-CC
VN22.8P-DC

Any help would be much appreciated.

zx8754
Vinay
  • Split on hyphen `-`, then paste back last 2? – zx8754 Aug 26 '16 at 13:03
  • `sub( "....-", "", c( "0266-VN22.5P-AC", "0292-VN22.6P-BC", "0300-VN22.7P-CC", "0316-VN22.8P-DC" ))` – Daniel Wisehart Aug 26 '16 at 13:33
  • with package `stringr` and if indeed the numbers at the beginning of string don't have the same number of digits, you can do `str_replace(x, str_extract(x, "\\d+"), str_pad(str_extract(x, "\\d+"), 6, pad = "0"))` to get what I think you want (if I'm not mistaken) – Cath Aug 26 '16 at 13:36
  • 1
    No Cath, the number of digits varies from one to six. That's why I am splitting the string, padding the number to six digits, and adding it back – Vinay Aug 26 '16 at 14:18
  • then the solution in my comment should work, preferably assigning the result of `str_extract` to avoid the double call. (*just a remark, you should state that in the question, or make the number of digits vary in your example to make it representative and avoid answers that work on example data but not on the real ones*) – Cath Aug 26 '16 at 14:21
  • 1
    That was awesome Cath, It was good learning from you. Thank You – Vinay Aug 26 '16 at 14:33
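The zero-padding discussed in the comments above can also be sketched in base R. This is only an illustration: the sample vector `ids` (with a varying number of leading digits) and the variable names are made up here, and it assumes the part before the first hyphen is always a plain integer.

```r
# Hypothetical sample data: the leading number has a varying number of digits
ids <- c("7-VN22.5P-AC", "0292-VN22.6P-BC", "12345-VN22.7P-CC")

# Everything before the first hyphen (assumed to consist of digits only)
num <- sub("-.*", "", ids)

# Everything after the first hyphen
rest <- sub("^[^-]+-", "", ids)

# Left-pad the number with zeros to six digits and glue the parts back together
padded <- sprintf("%06d-%s", as.integer(num), rest)
print(padded)
```

`sprintf` is vectorised, so no loop is needed. If the prefix can contain non-digit characters, `as.integer` would return `NA` and this sketch would need adjusting.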

2 Answers

2

Try:

gsub("^[^\\-]+\\-", "", "0266-VN22.5P-AC")
NJBurgo
  • Thank you, NJBurgo ... That is what I was looking for. May I know how to concatenate it again? – Vinay Aug 26 '16 at 12:47
  • 2
    I don't know much about regex, but why are you escaping the `-` sign? Is there anything wrong with `gsub("^[^-]+-", "", "0266-VN22.5P-AC")`? Anyway, +1. – RHertel Aug 26 '16 at 12:49
  • 1
    `gsub` is vectorised, so it takes a vector of inputs and returns the corrected values. With `s <- c("0266-VN22.5P-AC", "0292-VN22.6P-BC", "0300-VN22.7P-CC")`, `gsub("^[^\\-]+\\-", "", s)` will work – NJBurgo Aug 26 '16 at 12:50
  • 2
    note that `sub("[^-]+-", "", "0266-VN22.5P-AC")` does the same... no need for `gsub` and a more complicated `regex`, `sub` will take only the first match into consideration, which is what OP wants – Cath Aug 26 '16 at 12:51
  • @RHertel, I am never sure about the `-` sign: inside `[..]` it can define a range, so it sometimes needs to be escaped there. Therefore I always just escape it. – NJBurgo Aug 26 '16 at 12:52
  • 1
    As it is the only thing in between brackets there, `-` doesn't need to be escaped in the first part. And definitely no need to escape it in the second part – Cath Aug 26 '16 at 12:53
  • 1
    I want to left-pad the numbers with zeros to six digits and then concatenate everything again so that I can match against my original data. Hence I am extracting the numbers, padding them to six digits regardless of how many digits they have, and then concatenating. – Vinay Aug 26 '16 at 12:53
  • 1
    @Vinay you should ask the problem you really have instead of what you think is a part of it. I believe you can directly do what you need with `sprintf` – Cath Aug 26 '16 at 12:55
  • 1
    Also `gsub(".*?-(.*)", "\\1", "0266-VN22.5P-AC")` – David Arenburg Aug 26 '16 at 13:02
  • 1
    @NJBurgo No need to escape it anywhere — see http://stackoverflow.com/a/4068725/1968 – Konrad Rudolph Aug 26 '16 at 13:15
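To make the comments above concrete, here is a small check that the variants suggested in this thread give the same result on the sample data. The names `s`, `escaped`, `plain` and `captured` are only illustrative.

```r
s <- c("0266-VN22.5P-AC", "0292-VN22.6P-BC", "0300-VN22.7P-CC", "0316-VN22.8P-DC")

# Original answer: hyphen escaped both inside and outside the bracket expression
escaped <- gsub("^[^\\-]+\\-", "", s)

# Cath's suggestion: no escaping, and sub() since only the first match matters
plain <- sub("[^-]+-", "", s)

# David Arenburg's suggestion: keep what follows the first hyphen via a capture group
captured <- gsub(".*?-(.*)", "\\1", s)

print(plain)
print(identical(escaped, plain))
print(identical(captured, plain))
```

On this data the choice between them is mostly a matter of readability. The unescaped `sub` form is the shortest.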
2

Use a non-greedy match. The question mark after `.*` is the magic: it makes the match stop at the first hyphen instead of the last:

x <- c("0266-VN22.5P-AC", "0292-VN22.6P-BC", "0300-VN22.7P-CC", "0316-VN22.8P-DC")
sub(".*?-", "", x)
#[1] "VN22.5P-AC" "VN22.6P-BC" "VN22.7P-CC" "VN22.8P-DC"
Pierre L
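To see why the question mark matters, it may help to compare the non-greedy pattern with its greedy counterpart on the question's data. This is a small sketch; the names `lazy` and `greedy` are just for illustration.

```r
x <- c("0266-VN22.5P-AC", "0292-VN22.6P-BC", "0300-VN22.7P-CC", "0316-VN22.8P-DC")

# Non-greedy: ".*?" matches as little as possible, so the match ends at the first hyphen
lazy <- sub(".*?-", "", x)

# Greedy: ".*" matches as much as possible, so the match ends at the last hyphen
greedy <- sub(".*-", "", x)

print(lazy)
print(greedy)
```

The greedy version strips everything through the last hyphen and leaves only the trailing two letters, which is not what the question asks for.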