As the question title says, I'd like to know why the byte-compiled R code (using compiler::cmpfun) is faster than equivalent Rcpp code for the following mathematical function:
func1 <- function(alpha, tau, rho, phi) {
  abs((alpha + 1)^(tau) * phi - rho * (1 - (1 + alpha)^(tau)) / (1 - (1 + alpha)))
}
Since this is a simple numerical operation, I would have expected Rcpp (funcCpp and funcCpp2) to be much faster than the byte-compiled R (func1c and func2c), especially since R would have more overhead for storing (1+alpha)**tau or would need to recompute it. In fact, computing this exponent twice seems faster than the memory allocation in R (func1c vs func2c), which seems especially counterintuitive, since n is large. My other guess is that maybe compiler::cmpfun is pulling off some magic, but I'd like to know if that is indeed the case.
So really, the two things I'd like to know are:
Why are funcCpp and funcCpp2 slower than func1c and func2c? (Rcpp slower than compiled R functions)
Why is funcCpp slower than func2? (Rcpp code slower than pure R)
FWIW, here's my C++ and R version data
user% g++ --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.0 (clang-700.0.72)
Target: x86_64-apple-darwin14.3.0
Thread model: posix
user% R --version
R version 3.2.2 (2015-08-14) -- "Fire Safety"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin14.5.0 (64-bit)
And here's the R and Rcpp code:
library(Rcpp)
library(rbenchmark)
# Pure R, computing the power twice
func1 <- function(alpha, tau, rho, phi) {
  abs((1 + alpha)^(tau) * phi - rho * (1 - (1 + alpha)^(tau)) / (1 - (1 + alpha)))
}
# Pure R, storing the power in a temporary
func2 <- function(alpha, tau, rho, phi) {
  pval <- (alpha + 1)^(tau)
  abs(pval * phi - rho * (1 - pval) / (1 - (1 + alpha)))
}
# Byte-compiled versions of the two R functions
func1c <- compiler::cmpfun(func1)
func2c <- compiler::cmpfun(func2)
# Rcpp version computing the power as exp(tau * log(1 + alpha)).
# cppFunction() returns the R wrapper and also defines funcCpp in the
# calling environment, so func3c and funcCpp are the same compiled function.
func3c <- Rcpp::cppFunction('
  double funcCpp(double alpha, int tau, double rho, double phi) {
    double pow_val = std::exp(tau * std::log(alpha + 1.0));
    double pAg = rho/alpha;
    return std::abs(pow_val * (phi - pAg) + pAg);
  }')
# Rcpp version computing the power with pow(); likewise, func4c and
# funcCpp2 are the same compiled function.
func4c <- Rcpp::cppFunction('
  double funcCpp2(double alpha, int tau, double rho, double phi) {
    double pow_val = pow(alpha + 1.0, tau);
    double pAg = rho/alpha;
    return std::abs(pow_val * (phi - pAg) + pAg);
  }')
res <- benchmark(
  func1(0.01, 200, 100, 1000000),
  func1c(0.01, 200, 100, 1000000),
  func2(0.01, 200, 100, 1000000),
  func2c(0.01, 200, 100, 1000000),
  func3c(0.01, 200, 100, 1000000),
  func4c(0.01, 200, 100, 1000000),
  funcCpp(0.01, 200, 100, 1000000),
  funcCpp2(0.01, 200, 100, 1000000),
  replications = 100000,
  order = 'relative',
  columns = c("test", "replications", "elapsed", "relative"))
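For reference, the C++ versions are not a literal transcription of func1: they use an algebraically simplified form. A short derivation, writing p for (1 + alpha)^tau and assuming alpha is nonzero (which the R expression also requires), suggests the two forms should agree:

```latex
\begin{aligned}
\left| p\,\phi - \rho\,\frac{1 - p}{1 - (1 + \alpha)} \right|
  &= \left| p\,\phi + \rho\,\frac{1 - p}{\alpha} \right|
     && \text{since } 1 - (1 + \alpha) = -\alpha \\
  &= \left| p\,\phi + \frac{\rho}{\alpha} - p\,\frac{\rho}{\alpha} \right| \\
  &= \left| p\left(\phi - \frac{\rho}{\alpha}\right) + \frac{\rho}{\alpha} \right|,
  \qquad p = (1 + \alpha)^{\tau}.
\end{aligned}
```

Here rho/alpha corresponds to pAg and p to pow_val in the C++ code, so the C++ functions do slightly less arithmetic than the R ones.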
And here's the output of rbenchmark:
test                              replications  elapsed  relative
func1c(0.01, 200, 100, 1e+06)           100000    0.349     1.000
func2c(0.01, 200, 100, 1e+06)           100000    0.372     1.066
funcCpp2(0.01, 200, 100, 1e+06)         100000    0.483     1.384
func4c(0.01, 200, 100, 1e+06)           100000    0.509     1.458
func2(0.01, 200, 100, 1e+06)            100000    0.510     1.461
funcCpp(0.01, 200, 100, 1e+06)          100000    0.524     1.501
func3c(0.01, 200, 100, 1e+06)           100000    0.546     1.564
func1(0.01, 200, 100, 1e+06)            100000    0.549     1.573
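As a sanity check that the comparison is like for like, here is a minimal standalone C++ sketch (outside of R and Rcpp) that evaluates both the expression as written in func1 and the simplified expression as written in funcCpp2, and checks that they agree up to floating-point rounding. The helper names direct_form, simplified_form, relative_difference, forms_agree and check_benchmark_inputs, and the tolerance of 1e-9, are my own choices for illustration and are not part of the benchmark.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Direct transcription of the R expression in func1 (hypothetical helper).
double direct_form(double alpha, int tau, double rho, double phi) {
    const double p = std::pow(1.0 + alpha, tau);
    return std::abs(p * phi - rho * (1.0 - p) / (1.0 - (1.0 + alpha)));
}

// Simplified expression as written in funcCpp2 (hypothetical helper).
double simplified_form(double alpha, int tau, double rho, double phi) {
    const double p = std::pow(alpha + 1.0, tau);
    const double pAg = rho / alpha;
    return std::abs(p * (phi - pAg) + pAg);
}

// Relative difference between two values, scaled by the larger magnitude.
double relative_difference(double a, double b) {
    const double scale = std::max(std::abs(a), std::abs(b));
    return scale == 0.0 ? 0.0 : std::abs(a - b) / scale;
}

// True when both forms agree within a relative tolerance.
// The default tolerance is an assumption chosen for illustration.
bool forms_agree(double alpha, int tau, double rho, double phi,
                 double tol = 1e-9) {
    return relative_difference(direct_form(alpha, tau, rho, phi),
                               simplified_form(alpha, tau, rho, phi)) <= tol;
}

// Checks the exact arguments used in the benchmark calls.
void check_benchmark_inputs() {
    assert(forms_agree(0.01, 200, 100.0, 1000000.0));
}
```

Calling check_benchmark_inputs() from a small main() and compiling with any C++11 compiler should be enough to run the check; it only confirms that the functions compute the same quantity and says nothing about timing.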