I'm importing sales data that needs to be converted from character strings to numeric.

I'm trying to use parse_number from readr to do this, but it reports a parsing failure for negative values and coerces them to NA.

As an example:

library(readr)

x <- c("$1,000.00", "$500.00", "-$200.00")

y <- parse_number(x)

Warning: 1 parsing failure.
row col expected actual
  3  NA a number -

y

[1] 1000 500 NA

Does parse_number or readr have functionality that allows me to keep "-" for negative currency values?

(I'm not asking for an as.numeric(gsub()) solution.)

Ash Levitt
  • I think if your data is already in this format, with negative before the currency symbol, it would be best to read in as character, remove the symbols and then use parse_number. Not sure why you are opposed to `as.numeric(gsub())` though – Calum You May 03 '18 at 22:04
  • Yes, see my comment below. I'm trying to keep our code consistent using tidyverse functions for readability and wanted to see if there was a `readr` solution, which it seems like there should be. Thanks. – Ash Levitt May 03 '18 at 22:07

2 Answers

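If you want to stay with tidyverse functions, as per your comment, you can use stringr functions instead of gsub. Either move the minus sign so it sits after the currency symbol, or remove the currency symbol entirely, before calling parse_number: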

library(tidyverse)
x <- c("$1,000.00", "$500.00", "-$200.00")
x %>%
  str_replace("^-\\$(.*)$", "$-\\1") %>%
  parse_number()
#> [1] 1000  500 -200

x %>%
  str_remove("\\$") %>%
  parse_number()
#> [1] 1000  500 -200

Created on 2018-05-03 by the reprex package (v0.2.0).

Calum You

The currency symbol is in the wrong position in your example. With the minus sign placed after the symbol, parse_number works. Try:

library(readr)
x <- c("$1,000.00", "$500.00", "$-200.00")
parse_number(x)
#[1] 1000  500 -200

Since the problem is known, a simple solution is to strip the currency symbol with gsub first:

parse_number(gsub("\\$","",x))
#[1] 1000  500 -200
MKR
  • Thanks. Unfortunately, the accounting program that we use outputs sales data in the way I used it in my example "-$200.00", which is obviously the root of the problem. Apparently, I'll need to modify the vector first if I want to use `parse_number`. It would be great if `parse_number` could handle both instances. – Ash Levitt May 03 '18 at 22:04
  • @AshLevitt I agree with your comments, but we need to check what the convention is in other languages. As you have already mentioned, as a workaround you can replace `-$` with `$-` and then convert. – MKR May 03 '18 at 22:07
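
The workaround discussed in these comments can be wrapped in a small helper so that both sign placements ("-$200.00" and "$-200.00") parse the same way. This is only a minimal sketch: the name `parse_currency` and its `symbol` argument are hypothetical and not part of readr. It assumes each value has at most one currency symbol and uses "," as the grouping mark.

```r
library(readr)
library(stringr)

# Hypothetical helper (not a readr function): strip the currency symbol
# first, so a leading minus sign ends up directly before the digits.
parse_currency <- function(x, symbol = "$") {
  # fixed() matches the symbol literally, so "$" needs no regex escaping
  cleaned <- str_remove(x, fixed(symbol))
  parse_number(cleaned)
}

x <- c("$1,000.00", "$500.00", "-$200.00", "$-200.00")
parse_currency(x)
#> [1] 1000  500 -200 -200
```

Removing the symbol rather than swapping `-$` for `$-` keeps the helper independent of where the accounting program puts the sign.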