Given some data:

test <- data.frame(strings = c('a;b;c;;;;;;;', 'd;e;f;g;h;i;j;k;l;m', 'n;o;p;q;r;;;;;', ';;;;;;;;;' ))

How do I remove all trailing semicolons to get:

test <- data.frame(strings = c('a;b;c', 'd;e;f;g;h;i;j;k;l;m', 'n;o;p;q;r', '' ))

Features of this dataframe:

  1. each row has at most 10 characters, separated by at most 9 semicolons
  2. rows contain differing numbers of characters, but every row contains exactly 9 semicolons in total
  3. a row with no characters consists of 9 semicolons only.
Rich Pauloo

2 Answers

I think the regex you want, in words, is "one or more semicolons followed by the end of the string". So this works:

library(dplyr)
test %>% 
  mutate(newstrings = gsub(";{1,}$", "", strings))

              strings          newstrings
1        a;b;c;;;;;;;               a;b;c
2 d;e;f;g;h;i;j;k;l;m d;e;f;g;h;i;j;k;l;m
3      n;o;p;q;r;;;;;           n;o;p;q;r
4           ;;;;;;;;; 
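
As a quick sanity check, the same pattern can be compared against the desired output from the question. This is only a sketch in base R (no dplyr); the names `expected` and `cleaned` are made up for illustration, and `stringsAsFactors = FALSE` is an assumption added so the column is character on R versions before 4.0 too.

```r
# Sample data from the question.
test <- data.frame(
  strings = c("a;b;c;;;;;;;", "d;e;f;g;h;i;j;k;l;m", "n;o;p;q;r;;;;;", ";;;;;;;;;"),
  stringsAsFactors = FALSE
)

# Desired result, as given in the question (hypothetical variable name).
expected <- c("a;b;c", "d;e;f;g;h;i;j;k;l;m", "n;o;p;q;r", "")

# Same pattern as above, applied directly to the column.
cleaned <- gsub(";{1,}$", "", test$strings)

# Compare the cleaned column with the desired output, row by row.
identical(cleaned, expected)
```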
neilfws

You can use the following regex to match a sequence of one or more semicolons at the end of your string, then replace it with '' to trim them.

;+$
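
One way this regex might be applied in base R is sketched below. The vector `x` and the result names `trimmed` and `stripped` are hypothetical; `sub()` is used on the assumption that a single replacement per string is enough, since the `$` anchor allows at most one match. The unanchored call is included only to show what the anchor protects against.

```r
# Strings from the question, as a plain character vector.
x <- c("a;b;c;;;;;;;", "d;e;f;g;h;i;j;k;l;m", "n;o;p;q;r;;;;;", ";;;;;;;;;")

# Anchored: only the run of semicolons touching the end of the string is removed.
trimmed <- sub(";+$", "", x)

# Unanchored, for contrast: gsub() strips every semicolon, separators included.
stripped <- gsub(";+", "", x)

print(trimmed)
print(stripped)
```

Without the `$`, the separators between characters are lost as well, so the anchor is what restricts the replacement to trailing semicolons.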
Allan