
I have a data frame with a factor column whose values contain $ signs, something like Revenue: $450, $550, $650, etc. I'd like to strip the $ and convert the factor to numeric.

I tried methods found on Stack Overflow, but they return an error message. Is $ a special symbol?

Here's what I've tried:

str_replace(df$Revenue, "$", "") #error message
as.numeric(gsub("$", "", df$Revenue)) #Similar error message

These work for removing symbols like "%", but for some reason not for "$". Any reason why?

D500

1 Answer


You could try:

myvec <- c("$450", "$550", "$650")
as.numeric(gsub('\\$', '', myvec))
#[1] 450 550 650

Or as an alternative:

as.numeric(gsub('$', '', myvec, fixed = TRUE))
#[1] 450 550 650

You need to escape $ for it to work with regex (in a regex, $ is a special character that anchors the end of the string), or set fixed = TRUE so the pattern is matched literally.
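Since the question is about a factor column in a data frame, here is a minimal sketch of the same idea applied there. The data frame `df` below is a made-up stand-in for the asker's data, and the explicit `as.character()` call is only there to make the factor-to-character step visible.

```r
# Hypothetical data frame standing in for the asker's df
df <- data.frame(Revenue = factor(c("$450", "$550", "$650")))

# Convert the factor to character first, so the labels are used
# rather than the underlying integer codes
revenue_chr <- as.character(df$Revenue)

# fixed = TRUE treats "$" as a literal string, not a regex anchor
df$Revenue <- as.numeric(gsub("$", "", revenue_chr, fixed = TRUE))

print(df$Revenue)
#[1] 450 550 650
```

Calling `as.numeric()` directly on a factor would return its integer codes, which is why the conversion goes through character first.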

LyzandeR
  • Awesome. I didn't realize $ was a special character. What else are special characters in R? – D500 Nov 18 '17 at 20:33
  • They are special characters in regex, not in R. `. \ | ( ) [ { ^ $ * + ?` are the main ones. You can read more about them by typing `?regex` in your console. – LyzandeR Nov 18 '17 at 20:47
  • I have found this to be a great summary: http://stat545.com/block022_regular-expression.html – leerssej Nov 19 '17 at 02:58
  • This is exactly what we use in a production system which collects financial data. :) – Adam Bethke Nov 21 '17 at 00:31
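
To illustrate the point made in the comments about regex special characters, here is a small sketch comparing unescaped, escaped, and fixed patterns. The variable names are made up for the example; only base R's `gsub()` is used.

```r
x <- "$450"

# Unescaped: "$" anchors the end of the string, so it matches the empty
# position after the last character and the dollar sign stays put
unescaped <- gsub("$", "", x)

# Escaped: "\\$" matches a literal dollar sign
escaped <- gsub("\\$", "", x)

# Other metacharacters behave similarly: "." matches any character,
# so the unescaped pattern removes everything
dot_regex <- gsub(".", "", "1.5")
dot_fixed <- gsub(".", "", "1.5", fixed = TRUE)

print(c(unescaped, escaped, dot_regex, dot_fixed))
#[1] "$450" "450"  ""     "15"
```

This is why the original attempt appeared to do nothing to the strings: the pattern matched, just not the character it was meant to.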