I wrote a cpp source with pari.h header file included:

#include<stdio.h>
#include<numeric> // for "accumulate"
#include<iostream>
#include<string>
#include<vector>
#include<algorithm>
#include<cmath>
#include<stdlib.h>
#include<time.h>
#include<iterator> // for ostream_iterator
#include<strings.h>
#include<string.h>
#include<sstream>
#include <pari/pari.h> // for PARI/GP library
#include<Rcpp.h> // for sourceCpp to work, this line must be uncommented

// Enable C++11 via this plugin (Rcpp 0.10.3 or later)
// [[Rcpp::plugins(cpp11)]]

using namespace std;
using namespace Rcpp; // for sourceCpp to work, this line must be uncommented

// [[Rcpp::export]]
int main() {

    long maxp = 1000000; // max value
    pari_init(500000,2); // initiate pari
    size_t rsize = 500000; // set stack size variables
    size_t vsize = 100000000;

    void paristack_setsize(size_t rsize, size_t vsize); // declare stack function
    paristack_setsize(rsize, vsize); // set stack size
    gp_allocatemem(stoi(100000000)); // allocate memory
    GEN p1; // declare PARI variable

    p1 = cgetg(maxp + 1, t_VEC); // allocate a PARI vector with maxp components (word 0 is the header)
    for (long i = 1; i <= maxp; ++i) { // PARI vector components are indexed from 1
        gel(p1, i) = sumdiv(stoi(i)); // calculate the sum of divisors of i and update the vector
    }

    vector<long> p2(maxp); // vector of native type, zero-initialized
    GEN x; // declare a PARI variable to subset PARI vector
    for (long i = 1; i <= maxp; i++) { // for2, across PARI vector indices
        x = gel(p1, i); // subset one item of vector
        p2[i - 1] = gtolong(x); // convert PARI to native long integer; p2 is 0-based
    } // close for2

    for (long z = 0; z < maxp; z++) { // for3, to iterate for stdout
        cout << p2[z] << '\n'; // return the result. the vector items are printed separately
    } // close for3

} // close function

(Note that there may be unnecessary headers; I usually copy all of them across sources, but that is not the issue here.) Similar source files without pari.h header compile well with Rcpp with the necessary parts included (such as header, namespace, export line, etc).

With the Rcpp-related lines commented out, the source compiles and runs without problems when built directly with g++ using the following flags:

g++ -lpari -fpermissive -Wall -Wextra -lm -fno-strict-aliasing -fomit-frame-pointer -o sumdivisors.o sumdivisors.cpp

I imported those flags to R also:

Sys.setenv("PKG_CXXFLAGS"="-lpari -fpermissive -Wall -Wextra -lm -fno-strict-aliasing -fomit-frame-pointer")

I also created a symlink at /usr/local/lib64/R/library/Rcpp/include/ to pari directory under /usr/include.

However, the output of the sourceCpp command is as follows:

> sourceCpp("sumdivisors.cpp")
In file included from /usr/local/lib64/R/library/Rcpp/include/Rcpp/r/headers.h:48:0,
                 from /usr/local/lib64/R/library/Rcpp/include/RcppCommon.h:29,
                 from /usr/local/lib64/R/library/Rcpp/include/Rcpp.h:27,
                 from sumdivisors.cpp:15:
/usr/local/lib64/R/library/Rcpp/include/Rcpp/platform/compiler.h:47:0: warning: "GCC_VERSION" redefined
     #define GCC_VERSION (__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)                                                                                                      

In file included from /usr/local/lib64/R/library/Rcpp/include/pari/pari.h:16:0,
                 from sumdivisors.cpp:14:
/usr/local/lib64/R/library/Rcpp/include/pari/paricfg.h:19:0: note: this is the location of the previous definition
 #define GCC_VERSION "gcc version 6.2.1 20160830 (GCC)"                                                                                                                                       

Error in dyn.load("/tmp/Rtmpc9edZe/sourceCpp-x86_64-pc-linux-gnu-0.12.8/sourcecpp_188e46b44088/sourceCpp_2.so") :                                                                             
  unable to load shared object '/tmp/Rtmpc9edZe/sourceCpp-x86_64-pc-linux-gnu-0.12.8/sourcecpp_188e46b44088/sourceCpp_2.so':                                                                  
  /tmp/Rtmpc9edZe/sourceCpp-x86_64-pc-linux-gnu-0.12.8/sourcecpp_188e46b44088/sourceCpp_2.so: undefined symbol: pari_mainstack  

I repeated the steps with and without the C++11 plugin line, and nothing changes. I also changed the gcc flags, with no result. There seem to be two problems: the conflicting GCC_VERSION definitions and the undefined symbol pari_mainstack.

I believe the problem is not about how the source is written. Below are two examples. In the first, the above cpp code is converted into a function that is not main and returns a vector. The second is a similar, simple source that compiles well with Rcpp:

#include<stdio.h>
#include<numeric> // for "accumulate"
#include<iostream>
#include<string>
#include<vector>
#include<algorithm>
#include<cmath>
#include<stdlib.h>
#include<time.h>
#include<iterator> // for ostream_iterator
#include<strings.h>
#include<string.h>
#include<sstream>
#include <pari/pari.h> // for PARI/GP library
#include<Rcpp.h> // for sourceCpp to work, this line must be uncommented

// Enable C++11 via this plugin (Rcpp 0.10.3 or later)
// [[Rcpp::plugins(cpp11)]]

using namespace std;
using namespace Rcpp; // for sourceCpp to work, this line must be uncommented

// [[Rcpp::export]]
vector<long> sumdivisors() {

    long maxp = 1000000; // max value
    pari_init(500000,2); // initiate pari
    size_t rsize = 500000; // set stack size variables
    size_t vsize = 100000000;

    void paristack_setsize(size_t rsize, size_t vsize); // declare stack function
    paristack_setsize(rsize, vsize); // set stack size
    gp_allocatemem(stoi(100000000)); // allocate memory
    GEN p1; // declare PARI variable

    p1 = cgetg(maxp + 1, t_VEC); // allocate a PARI vector with maxp components (word 0 is the header)
    for (long i = 1; i <= maxp; ++i) { // PARI vector components are indexed from 1
        gel(p1, i) = sumdiv(stoi(i)); // calculate the sum of divisors of i and update the vector
    }

    vector<long> p2(maxp); // vector of native type, zero-initialized
    GEN x; // declare a PARI variable to subset PARI vector
    for (long i = 1; i <= maxp; i++) { // for2, across PARI vector indices
        x = gel(p1, i); // subset one item of vector
        p2[i - 1] = gtolong(x); // convert PARI to native long integer; p2 is 0-based
    } // close for2

    return(p2);
    /*
    for (long z = 0; z < maxp; z++) { // for3, to iterate for stdout
        cout << p2[z] << '\n'; // return the result. the vector items are printed separately
    } // close for3
    */

} // close function

.

#include<stdio.h>
#include<iostream>
#include<string>
#include<vector>
#include<algorithm>
#include<cmath>
#include<math.h>
#include<time.h>
#include<Rcpp.h>

using namespace std;
using namespace Rcpp;

//#include "std_lib_facilities.h"

// [[Rcpp::export]]
int pe001Cpp(int x) { // define a function pe001 with one integer input
    int sum35 = 0; // define a scalar for the sum. start value is 0
    for (int i=1; i<x; ++i) { // for 1 loop for counting up to x
        if (i % 3 == 0 || i % 5 == 0) { // if 1, divisible by 3 or 5
            sum35 += i; // update sum
        } // close if 1
    } // close for 1
    return sum35; // return the final value
} // close function

// [[Rcpp::export]]
int pe001Cppb(int x) { // efficient method
    int sumdivisible(int x, int y); // declare the below function in this scope
    return sumdivisible(x, 3) + sumdivisible(x, 5) - sumdivisible(x, 15); // return the total sum
} // close function pe001Cppb

int sumdivisible(int x, int y) { // sum of terms divisible by y
    int ny = floor ((x-1) / y); // number of terms less than x and divisible by y
    return ny * (ny + 1) / 2 * y; // return the sum
} // close function sumdivisible
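
As a sanity check on the values that sumdivisors() is meant to return, the same divisor sums can be computed without PARI using a simple sieve. This is only a minimal sketch: the helper name sigma_sieve is an assumption introduced here for illustration and is not part of the original sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): returns sigma(1..limit),
// where sigma(n) is the sum of all positive divisors of n.
std::vector<long> sigma_sieve(std::size_t limit) {
    assert(limit >= 1);                               // at least sigma(1) is requested
    std::vector<long> sigma(limit, 0);                // sigma[n - 1] holds sigma(n)
    for (std::size_t d = 1; d <= limit; ++d) {        // every candidate divisor d
        for (std::size_t m = d; m <= limit; m += d) { // every multiple m of d
            sigma[m - 1] += static_cast<long>(d);     // d divides m, so add it
        }
    }
    return sigma;
}
```

Called as sigma_sieve(1000000), the result should agree element for element with the vector from sumdivisors(), since entry i of that vector is meant to hold the divisor sum of i + 1.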

The filtered strace output from the execution of the directly compiled binary is as follows:

open("/usr/lib/libpari-gmp-tls.so.5", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libgmp.so.10", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libnss_compat.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libnsl.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libnss_nis.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = 3

As seen in https://github.com/rstats-db/RPostgres/issues/80, the problem may be a wrong version of a linked library, which can be solved by symlinking. So I need to know which library files Rcpp tries to link against.

Update:

Scanelf output shows that the problematic symbol is in /usr/lib/libpari-gmp-tls.so.2.9.1 .

[s@SS ~]$ scanelf -l -s pari_mainstack | grep pari_mainstack
ET_DYN pari_mainstack /usr/lib/libpari-gmp-tls.so.2.9.1 

The strace output of the g++-compiled file shows that the executable is linked to /usr/lib/libpari-gmp-tls.so.5, which is itself a symlink to the 2.9.1 version:

[s@SS library]$ strace ./sumdivisors3.o |& grep so | grep -v "No such file"
execve("./sumdivisors3.o", ["./sumdivisors3.o"], [/* 79 vars */]) = 0
open("/usr/lib/libpari-gmp-tls.so.5", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libgmp.so.10", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
open("/usr/lib/libnss_compat.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libnsl.so.1", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libnss_nis.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = 3

The ldd output from the sourceCpp_4.so file created by sourceCpp command is as follows:

[s@SS library]$ ldd /tmp/Rtmpau9YqY/sourceCpp-x86_64-pc-linux-gnu-0.12.8/sourcecpp_3a105ad2bdba/sourceCpp_4.so
        linux-vdso.so.1 (0x00007ffc28f9d000)
        libR.so => not found
        libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007f5077111000)
        libm.so.6 => /usr/lib/libm.so.6 (0x00007f5076e0d000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007f5076bf6000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007f5076858000)
        /usr/lib64/ld-linux-x86-64.so.2 (0x0000564489276000)

I tracked all those files with ldd and there is no link to /usr/lib/libpari-gmp-tls.so.2.9.1 or /usr/lib/libpari-gmp-tls.so.5 libraries. So the question is why can't sourceCpp link to those files given the necessary headers included (and while g++ can)?

Update:

Verbose output of sourceCpp shows the following command:

g++  -I/usr/local/lib64/R/include -DNDEBUG  -I/usr/local/include  -I"/usr/local/lib64/R/library/Rcpp/include" -I"/home/s/codes/cpp/projecteuler/library"  -lpari -fpic  -g -O2 -c sumdivisors2.cpp -o sumdivisors2.o
g++ -shared -L/usr/local/lib64/R/lib -L/usr/local/lib64 -o sourceCpp_5.so sumdivisors2.o -L/usr/local/lib64/R/lib -lR

I set the flags with the following (in fact, -lpari alone suffices):

Sys.setenv("PKG_CXXFLAGS"="-lpari")

According to the gp2c output, the -lpari flag should also be passed at the linking stage, but here the linking command does not have it. Can this be the source of the problem? Or, before that, why is the sourceCpp_5.so file not linked to the necessary pari library?

And the finale:

The dependent libraries for the linking should also be explicitly declared via:

Sys.setenv("PKG_LIBS"="-lm -lpari -lc")

The library flags are those given by the gp2c output. By the way, to get around the GCC_VERSION redefinition, instead of creating a symlink to the original pari headers directory, I created a copy inside the R library path and commented out this line in paricfg.h:

//#define GCC_VERSION "gcc version 6.2.1 20160830 (GCC)"

Now the compilation is successful, and R can enjoy PARI/GP speed in number-theoretic computations, thanks to Rcpp!
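
An alternative that leaves the installed pari headers untouched might be to #undef GCC_VERSION between the pari.h and Rcpp.h includes. The sketch below only demonstrates the preprocessor mechanics. It uses inline stand-ins for the two header definitions quoted in the compiler output, so it builds without PARI or Rcpp, and the helper names are hypothetical. Whether either library needs its own definition later in a real translation unit is an assumption that has not been checked here.

```cpp
#include <cassert>
#include <string>

// Stand-in for pari/paricfg.h: the definition quoted in the compiler note above.
#define GCC_VERSION "gcc version 6.2.1 20160830 (GCC)"

// Hypothetical helper: reads the macro while it still has PARI's string meaning.
inline std::string pari_style_banner() { return GCC_VERSION; }

// The proposed workaround: drop PARI's definition before the next header.
#undef GCC_VERSION

// Stand-in for Rcpp/platform/compiler.h: the numeric definition from the warning.
#if defined(__GNUC__)
#define GCC_VERSION (__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
#else
#define GCC_VERSION 0  // fallback so the sketch also builds on non-GNU compilers
#endif

// Hypothetical helper: reads the macro with its numeric meaning.
inline long rcpp_style_version() { return GCC_VERSION; }

// Both meanings are usable, and no "redefined" warning is triggered.
inline void check_no_clash() {
    assert(pari_style_banner().find("gcc version") == 0);
    assert(rcpp_style_version() >= 0);
}
```

In the real source this would amount to a single #undef GCC_VERSION line placed after #include <pari/pari.h> and before #include<Rcpp.h>.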

Serhat Cevikel
  • So, the main issue that you are running into is `undefined symbol: pari_mainstack`? Correct? – coatless Jan 13 '17 at 08:30
  • Also, this is _not_ an Rcpp bug. Very bad style to cross post this minutes after posting to the official bug tracker. – coatless Jan 13 '17 at 08:33
  • If it is not an Rcpp bug, why can I compile it with g++? – Serhat Cevikel Jan 13 '17 at 08:47
  • By the way, the other way around. I first posted it here, and then opened an issue at Github. The more couples of eyes, the higher the chances to find a solution. – Serhat Cevikel Jan 13 '17 at 08:51
  • By the way, I believe this is an important issue for HPC with R, since PARI/GP is very critical in number-theory intensive tasks and there is no direct interface from R to PARI/GP. So the only way is to use PARI/GP in library mode from c++ and compile with Rcpp into R. – Serhat Cevikel Jan 13 '17 at 08:53
  • First, this question is targeted upon using an external library that is _not_ package bound and the method chosen to embed said library alongside _Rcpp_ is highly problematic. The fact that you can compile your code standalone and then expect it to instantly work, especially given an `int main()` directive, indicates that you are unaware of a few idioms. Therefore, I stand by my previous remark that this is _not_ an Rcpp bug but a user issue that should _not_ have been posted to the official bug tracker. – coatless Jan 13 '17 at 08:55
  • It is not about how my code is written. I changed the code and also presented a similar one which compiles well with Rcpp, above – Serhat Cevikel Jan 13 '17 at 09:08
  • Okay, help me help you. Right now, it seems as if your issue is related to multiple possibilities (e.g. improper use of `sourceCpp()`, direct embedding of `pari` with Rcpp headers, the presence of `int main()`, bad flags being set, and -- the primary culprit -- R and pari disagree on compilers). If the two code snippets just added to your question _work_, then I'm confused as to what the issue is because `vector sumdivisors()` is literally the previous code snippet sans `int main()`. Please emphasize what exactly is problematic within your question. I'll be awake for the next 30 minutes. – coatless Jan 13 '17 at 09:31
  • Thanks @coatless for your interest. I just made that change in order to show that, the issue is not about the "main" function. Now the two snippets are the same in terms of construction. The second one works with Rcpp and the first one does not. Furthermore, the first code runs with g++ with flags I provided (and these are the flags given by the output of gp2c while converting gp script to c code). – Serhat Cevikel Jan 13 '17 at 10:24
  • It is quite probable that I am making a mistake and this is not a bug of Rcpp, since there are some cran packages that have pari as dependencies. What may be the reason that doesn't pose a problem for g++ but for Rcpp, maybe their differences, etc.? Maybe there is a flag that I am missing? – Serhat Cevikel Jan 13 '17 at 10:24
  • By the way I think I provided all material necessary and sufficient to replicate the situation, maybe you can give it a try when you have the chance. – Serhat Cevikel Jan 13 '17 at 10:25
  • You should put this into a package instead of using `sourceCpp`. Once you put this in a package, you should link to pari using `PKG_LIBS` in `src/Makevars` instead of `PKG_CXXFLAGS`. Linker flags go into `PKG_LIBS`. – jtilly Jan 13 '17 at 14:42
  • Good idea, I'll work on that – Serhat Cevikel Jan 13 '17 at 16:41

1 Answer

Similar source files without pari.h header compile well with Rcpp with the necessary parts included (such as header, namespace, export line, etc)

When using sourceCpp(), the general use case does not involve using system installed libraries like pari without tapping into an R package that has previously established the appropriate linking flags. As a result, the separate package would handle surfacing pari to the R session and, subsequently, registering a plugin for the Rcpp plugin manager so that // [[Rcpp::depends(pkgname)]] could be included after the header file to set the appropriate link statements.

Having said this, the first step toward getting pari to work with Rcpp is to establish an RcppPari package preferably in the mold of RcppGSL.

To do so, you'll probably need:

  • configure.ac: To verify that the library does exist on the user's system and the system configuration is well-suited (e.g. compiler is valid, etc.)
  • src/Makevars.in: To set the appropriate preprocessor flags to handle how the code embedded in the package should be both compiled and linked. (See below for links.)
  • R/inline.R: To establish the plugins necessary to set up scaffolding for using sourceCpp() and cppFunction() (discussed above as the "ideal" solution).

I tracked all those files with ldd and there is no link to /usr/lib/libpari-gmp-tls.so.2.9.1 or /usr/lib/libpari-gmp-tls.so.5 libraries. So the question is why can't sourceCpp link to those files given the necessary headers included (and while g++ can)?

As you found out, the -lm -lpari -lc flags placed in PKG_CXXFLAGS were passed only to the compile step and never reached the linking stage. Instead, the flags needed to be set in the PKG_LIBS option. This was emphasized by @jtilly in the comments.

Before you dive into writing a package, you might want to look at the different meanings behind preprocessor flags in R with regard to the Makevars file:

coatless