I am trying to run a QR decomposition (LAPACKE_dgeqrf) from R on a Linux machine (CentOS), using a C++ function interfaced through Rcpp. Unfortunately, top shows only 100% CPU usage, i.e. only a single core is used. The same happens on a Red Hat Enterprise Linux Server. However, the same C++ code (with LAPACKE_dgeqrf) runs at nthreads * 100% when started as a standalone program from the terminal, outside of R. I compiled OpenBLAS with
NO_AFFINITY=1
and tried
export OPENBLAS_NUM_THREADS=4
export GOTO_NUM_THREADS=4
export OMP_NUM_THREADS=4
export OPENBLAS_MAIN_FREE=1
None of these settings changes the behaviour. On a Mac, everything works as expected. mcaffinity() from R's parallel package returns NULL. I configured R using
configure 'CFLAGS=-g -O3 -Wall -pedantic' 'CXXFLAGS=-g -O3 -Wall -pedantic' 'FCFLAGS=-g -O3' 'F77FLAGS=-g -O3' '--with-system-zlib' '--enable-memory-profiling'
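Since mcaffinity() returns NULL, the affinity mask could also be read directly from C++. The following is only a minimal sketch, assuming Linux with glibc and g++ (which defines _GNU_SOURCE by default, so that CPU_COUNT is available); the function name allowed_cpu_count is hypothetical and not part of my program.

```cpp
#include <sched.h>  // sched_getaffinity, cpu_set_t, CPU_ZERO, CPU_COUNT (glibc)

// Hypothetical helper: number of CPUs the calling thread is allowed to run on,
// or -1 if the affinity mask cannot be read.
int allowed_cpu_count()
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    // A pid of 0 refers to the calling thread.
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    return CPU_COUNT(&mask);
}
```

If such a function returned 1 when called from inside the R process but more than 1 in the standalone program, that would suggest the R process is restricted to a single core.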
My C++ code:
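#include <Rcpp.h>
#include <lapacke.h>
#include <algorithm>  // std::min

// [[Rcpp::export]]
Rcpp::NumericMatrix QRopenblas(Rcpp::NumericMatrix X)
{
    // Matrix dimensions; tau holds the min(m, n) Householder scalar factors
    int n_rows = X.nrow(), n_cols = X.ncol(), min_mn = std::min(n_rows, n_cols);
    Rcpp::NumericVector tau(min_mn);

    // QR decomposition in place: X is overwritten with R and the reflectors
    // (X shares memory with the R matrix, so the R object is modified as well)
    int info = LAPACKE_dgeqrf(LAPACK_COL_MAJOR, n_rows, n_cols, X.begin(), n_rows, tau.begin());
    if (info != 0) Rcpp::stop("LAPACKE_dgeqrf failed with info = %d", info);
    return X;
}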
My R code:
PKG_LIBS <- '/pathto/openblas/lib/libopenblas.a'
PKG_CPPFLAGS <- '-I/pathto/openblas/include'
Sys.setenv(PKG_LIBS = PKG_LIBS, PKG_CPPFLAGS = PKG_CPPFLAGS)
Rcpp::sourceCpp('/pathto/QRopenblas.cpp', rebuild = TRUE)
n_row <- 4000
n_col <- 4000
A <- matrix(rnorm(n_row * n_col), n_row, n_col)
res <- QRopenblas(A)
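One thing that could still be checked is whether the exported variables are visible inside the R process at all. The following is a minimal sketch using only the C++ standard library; the helper name env_or_unset is hypothetical, and it would still have to be wrapped (for example with Rcpp) to be callable from R.

```cpp
#include <cstdlib>  // std::getenv
#include <string>

// Hypothetical helper: value of an environment variable as seen by the
// current process, or "<unset>" if it is not defined.
std::string env_or_unset(const char* name)
{
    const char* value = std::getenv(name);
    return value ? std::string(value) : std::string("<unset>");
}
```

Calling it with "OPENBLAS_NUM_THREADS" or "OMP_NUM_THREADS" from the compiled code would show whether the settings from the shell actually reach the process that runs LAPACKE_dgeqrf.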