
I have a dataset of macroeconomic data (GDP, inflation, etc.) in which the rows are different macroeconomic indicators and the columns are years.

Since some values are missing (e.g. the GDP of a given country in a given year), they are coded as "NA".

When I perform these operations:

#
data = read.table("14varnumeros.txt", header = FALSE, sep = "", na.strings = "NA", dec = ".", strip.white = TRUE)

benford(data, number.of.digits = 1, sign = "both", discrete=TRUE, round=3)
#

It gives me this error:

Error in extract.digits(data, number.of.digits, sign, second.order, discrete = discrete, :
Data must be a numeric vector

I assume that this is because of the NA strings, but I do not know how to solve it.

Pablo
  • Which package are you getting the `benford` function from? I see at least two candidates on CRAN. – kdopen Mar 13 '15 at 15:30
  • I'm using the benford.analysis package, NOT the Benford.Tests package. I think the problem is that "data" is not numeric because it is a list (a data frame). – Pablo Mar 13 '15 at 16:07

1 Answer


I came across this issue, too. In my case the cause wasn't missing data but a quirk in the extract.digits() function of the benford.analysis package. The function checks whether the data supplied to it is numeric, but it does so with class(dat) != "numeric" instead of the is.numeric() function. An integer vector has class "integer", so it fails that check even though is.numeric() would return TRUE for it.

This produces unexpected errors. Consider the code below:

library(benford.analysis)

dat <- data.frame(v1 = 1:5, v2 = c(1, 2, 3, 4, 5))

benford(dat$v1)          # produces error
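The difference between the two checks can be seen in base R alone, without loading the package. This is only a small sketch; the variable names `x_int` and `x_dbl` are made up for illustration, and it mimics the comparison described above rather than reproducing the package's source.

```r
# Two vectors holding the same values, stored as different types
x_int <- 1:5                # integer vector, like dat$v1
x_dbl <- c(1, 2, 3, 4, 5)   # double vector, like dat$v2

# class() distinguishes integer from double storage
class_int <- class(x_int)   # "integer"
class_dbl <- class(x_dbl)   # "numeric"

# is.numeric() accepts both
ok_int <- is.numeric(x_int) # TRUE
ok_dbl <- is.numeric(x_dbl) # TRUE

# The comparison described above rejects the integer vector
rejected_int <- class(x_int) != "numeric"   # TRUE
rejected_dbl <- class(x_dbl) != "numeric"   # FALSE

# Converting to double makes the integer vector pass the same comparison
fixed <- as.numeric(x_int)
rejected_fixed <- class(fixed) != "numeric" # FALSE
```

So a vector such as 1:5 is numeric in every practical sense, yet a strict class comparison treats it as non-numeric.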

I've submitted an issue on GitHub, but in the meantime you can simply wrap your data in as.numeric(), e.g. benford(as.numeric(dat$v1)), and you should be fine.
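In the question, the object passed to benford() is a whole data frame rather than a single column, and calling as.numeric() directly on a data frame with several columns fails because a data frame is a list. One possible way around that, sketched here with a small made-up data frame standing in for the file from the question, is to flatten it with unlist() first. Dropping the NA values explicitly is my own precaution, not something the package is known to require.

```r
# Hypothetical stand-in for the table read from "14varnumeros.txt"
data <- data.frame(
  V1 = c(1200, NA, 310),
  V2 = c(45.2, 18, NA),
  V3 = c(7L, 92L, 150L)     # integer column, to show the coercion
)

# Flatten all columns into one vector and force double storage
values <- as.numeric(unlist(data, use.names = FALSE))

# Drop missing values (a precaution, see the note above)
values <- values[!is.na(values)]

class(values)    # "numeric"
length(values)   # 7

# With benford.analysis installed, the call from the question would then be:
# library(benford.analysis)
# benford(values, number.of.digits = 1, sign = "both", discrete = TRUE, round = 3)
```

Note that this pools every indicator and year into a single vector; if each indicator should be analysed separately, the same conversion can be applied to one row or column at a time.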

paulstey