I have a question that has confused me for a long time: how do I remove the part of a string that starts with a question mark?

For example:

## dataframe named test
x y
1 gffsd?lang=dfs
2 sdldfsd?lang=gsd
3 eoriwesd?lang=fh
4 eriywo?lang=asd

What I want is:

x y
1 gffsd
2 sdldfsd
3 eoriwesd
4 eriywo

I tried several methods, including:

test$y = sapply(strsplit(test$y, '?'), head, 1)
test$y = sapply(strsplit(test$y, '?lang='), head, 1)
gsub("?",NA, test$y, fixed = TRUE)

Unfortunately all of them failed.

Thanks in advance!

BTW, does anybody know how to replace "®" with "-"?

BigD
  • Did you conduct some research? So many examples on SO. [this](https://stackoverflow.com/questions/10617702/remove-part-of-string-after), [this](https://stackoverflow.com/questions/31836750/removing-everything-after-a-character-in-a-column-in-r), [this](https://stackoverflow.com/questions/26611922/remove-everything-after-a-string-in-a-data-frame-column-with-missing-values), etc., etc.. StackOverflow isn't a Googling service. – David Arenburg Jun 28 '17 at 21:28
  • I did, and the methods I tried came from the search results; as you can see, they did not work. – BigD Jun 28 '17 at 21:31
  • I don't see anything related to `sapply` in the search results. – David Arenburg Jun 28 '17 at 21:32
  • When I googled it, most of the questions involved just a vector, not a data frame, so I modified them a little to use `sapply`. – BigD Jun 28 '17 at 21:35
  • I assume `test$y = sapply(strsplit(test$y, '?', fixed = TRUE), head, 1)` would work – Dirk Horsten Jun 28 '17 at 21:39

1 Answer

gsub can work with the right regular expression.

test$y = gsub("\\?.*", "", test$y)
test
  x        y
1 1    gffsd
2 2  sdldfsd
3 3 eoriwesd
4 4   eriywo

In a regular expression the question mark is a quantifier, so you need to escape it as "\\?" to match a literal "?". The ".*" that follows matches everything after the question mark, so the whole tail is removed along with it.
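
As a rough sketch of the same idea, here are three ways to treat "?" as a literal character. The data frame is rebuilt from the example in the question; the names `by_regex`, `by_class`, and `by_split` are made up for illustration, and `y` is assumed to be a character column rather than a factor.

```r
# Rebuild the example data frame from the question.
# stringsAsFactors = FALSE keeps y as character, which strsplit() needs.
test <- data.frame(
  x = 1:4,
  y = c("gffsd?lang=dfs", "sdldfsd?lang=gsd",
        "eoriwesd?lang=fh", "eriywo?lang=asd"),
  stringsAsFactors = FALSE
)

# Option 1: escape the question mark in the regex.
# sub() is enough here because ".*" already consumes the rest of the string.
by_regex <- sub("\\?.*", "", test$y)

# Option 2: a character class also makes "?" literal.
by_class <- sub("[?].*", "", test$y)

# Option 3: split on a literal "?" (fixed = TRUE) and keep the first piece.
by_split <- vapply(strsplit(test$y, "?", fixed = TRUE), `[`, character(1), 1)

test$y <- by_regex
print(test)
```

All three should give the same vector for this data; which one reads best is mostly a matter of taste.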

Your second question can be solved with gsub as well.

string = 'anybody knows how to replace ® to -'
gsub("®",  "-", string)
[1] "anybody knows how to replace - to -"
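
If typing the symbol directly causes encoding trouble in a script, one possible variant is to write it as the Unicode escape "\u00AE" and match it literally with `fixed = TRUE`. This is only a sketch; the variable name `result` is made up for illustration.

```r
# "\u00AE" is the Unicode escape for the registered-trademark sign.
string <- "anybody knows how to replace \u00AE to -"

# fixed = TRUE treats the pattern as a literal string, not a regex.
result <- gsub("\u00AE", "-", string, fixed = TRUE)
print(result)
```
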
G5W