@Zheyuan Li 's answer solved your problem of using lapply
. Though I'd like to add several points:
Vectorize
is just a wrapper with mapply
, which doesn't give you the performance of vectorization.
The function itself can be improved for much better readability:
see
digitsum <- function(x) sum(floor(x / 10^(0:(nchar(as.character(x)) - 1))) %% 10)
vec_digitsum <- Vectorize(digitsum)
sumdigits <- function(x){
digits <- strsplit(as.character(x), "")[[1]]
sum(as.numeric(digits))
}
vec_sumdigits <- Vectorize(sumdigits)
microbenchmark::microbenchmark(digitsum(12324255231323),
sumdigits(12324255231323), times = 100)
Unit: microseconds
expr min lq mean median uq max neval cld
digitsum(12324255231323) 12.223 12.712 14.50613 13.201 13.690 96.801 100 a
sumdigits(12324255231323) 13.689 14.667 15.32743 14.668 15.157 38.134 100 a
The performance of two versions are similar, but the 2nd one is much easier to understand.
Interestingly, the Vectorize
wrapper add considerable overhead for single input:
microbenchmark::microbenchmark(vec_digitsum(12324255231323),
vec_sumdigits(12324255231323), times = 100)
Unit: microseconds
expr min lq mean median uq max neval cld
vec_digitsum(12324255231323) 92.890 96.801 267.2665 100.223 108.045 16387.07 100 a
vec_sumdigits(12324255231323) 94.357 98.757 106.2705 101.445 107.556 286.00 100 a
Another advantage of this function is that if you have really big numbers in string format, it will still work (with small modification of removing the as.character
). While the first version function will have problem with big numbers or may introduce errors.
Note: At first my benchmark was comparing the vectorized version of OP function and non-vectorized version of my function, that gave me the wrong impression of my function is much faster. Turned out that was caused by Vectorize
overhead.