
In brief, my question is about compiling and linking files (using libraries) with two different compilers when the source files use OpenACC constructs.

I have a C source file (libmyacc.c) that uses an OpenACC construct. It contains a single simple function that computes the sum of an array:

#include <stdio.h>
#include <stdlib.h>
#include <openacc.h>

double calculate_sum(int n, double *a) {
    double sum = 0;
    int i;

    printf("Num devices: %d\n", acc_get_num_devices(acc_device_nvidia));

    #pragma acc parallel copyin(a[0:n])
    #pragma acc loop reduction(+:sum)
    for(i=0;i<n;i++) {
        sum += a[i];
    }

    return sum;
}

I can compile it with the following command:

pgcc -acc -ta=nvidia -c libmyacc.c

Then I create a static library with the following command:

ar -cvq libmyacc.a libmyacc.o

To use my library, I wrote the following code (f1.c):

#include <stdio.h>
#include <stdlib.h>

#define N 1000

extern double calculate_sum(int n, double *a);

int main() {
    printf("Hello --- Start of the main.\n");
    double *a = (double*) malloc(sizeof(double) * N);
    int i;
    for(i=0;i<N;i++) {
        a[i] = (i+1) * 1.0;
    }

    double sum = 0.0;
    for(i=0;i<N;i++) {
        sum += a[i];
    }
    printf("Sum: %.3f\n", sum);


    double sum2 = -1;
    sum2 = calculate_sum(N, a);
    printf("Sum2: %.3f\n", sum2);

    return 0;
}

I can link this static library with the PGI compiler itself when compiling the source above (f1.c):

pgcc -acc -ta=nvidia f1.c libmyacc.a

The resulting executable runs flawlessly. With gcc, however, it behaves differently, and this is where my question lies: how can I build it properly with gcc?

Thanks to Jeff's comment on this question ("linking pgi compiled library with gcc linker"), I can now build my source file (f1.c) without errors, but the executable fails with a fatal error.

This is the command I use to compile my source file (f1.c) with gcc:

gcc f1.c -L/opt/pgi/linux86-64/16.5/lib -L/usr/lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -L. -laccapi -laccg -laccn -laccg2 -ldl -lcudadevice -lpgmp -lnuma -lpthread -lnspgc -lpgc -lm -lgcc -lc -lgcc -lmyacc

This is the output, including the error:

Num devices: 2
Accelerator Fatal Error: No CUDA device code available

Using the -v option when compiling f1.c with the PGI compiler, I can see that the compiler invokes many other tools from PGI and NVIDIA (like pgacclnk and nvlink).


My questions:

  1. Am I on the wrong path? Can I call functions in PGI-compiled libraries from GCC and use OpenACC within those functions?
  2. If the answer to the above is yes, can I still link without the steps that PGI takes (calling pgacclnk and nvlink)?
  3. If the answer to that is also yes, what should I do?
mgNobody

1 Answer


Add "-ta=tesla:nordc" to your pgcc compilation. By default, PGI generates relocatable device code (RDC) for the GPU. However, RDC requires an extra device link step (with nvlink) that gcc does not perform. The "nordc" sub-option disables RDC, so you can use OpenACC code in a library that is linked by gcc. The trade-off is that, with RDC disabled, you can no longer call external device routines from a compute region.

% pgcc -acc -ta=tesla:nordc -c libmyacc.c
% ar -cvq libmyacc.a libmyacc.o
a - libmyacc.o
% gcc f1.c -L/proj/pgi/linux86-64/16.5/lib -L/usr/lib64 -L/usr/lib/gcc/x86_64-redhat-linux/4.8.5 -L. -laccapi -laccg -laccn -laccg2 -ldl -lcudadevice -lpgmp -lnuma -lpthread -lnspgc -lpgc -lm -lgcc -lc -lgcc -lmyacc
% a.out
Hello --- Start of the main.
Sum: 500500.000
Num devices: 8
Sum2: 500500.000
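
To make the restriction on external device routines a bit more concrete, here is a rough sketch of a library source that should stay compatible with "nordc", on the assumption that a device routine defined in the same source file as the compute region needs no device link step. The names `square` and `sum_of_squares` are hypothetical and not part of the tested example above; I have not run this variant through pgcc and gcc together.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper that runs on the device. It is defined in the SAME
 * source file as the compute region that calls it, so (by assumption) the
 * compiler can resolve or inline it without relocatable device code.
 * Moving it to a different object file would make it an "external device
 * routine", which needs the nvlink step that nordc gives up. */
#pragma acc routine seq
double square(double x)
{
    return x * x;
}

/* Sums the squares of a[0..n-1]. The reduction clause is spelled out so
 * the result does not depend on the compiler detecting it automatically. */
double sum_of_squares(int n, const double *a)
{
    double sum = 0.0;
    int i;

    assert(n >= 0 && (n == 0 || a != NULL));

    #pragma acc parallel loop copyin(a[0:n]) reduction(+:sum)
    for (i = 0; i < n; i++) {
        sum += square(a[i]);
    }

    return sum;
}
```

A compiler without OpenACC enabled simply ignores the pragmas and runs the loop on the host, which is a convenient way to check the logic before building with pgcc.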

Hope this helps, Mat

Mat Colgrove
  • Thanks Mat. Definitely what I was looking for. – mgNobody Jul 05 '16 at 23:24
  • Can I ask what you mean by "external device routines" in the last sentence? Which functions in OpenACC do you mean? – mgNobody Jul 05 '16 at 23:33
  • Meaning calling subroutines from device code where the subroutine is found in a different object. If the subroutine is in the same source, it can be inlined, but external routines need to be linked. – Mat Colgrove Jul 06 '16 at 15:00
  • Dear @MatColgrove, now that the NVHPC SDK is replacing the old PGI compilers, the shared libraries seem to have changed. I tried the flags from your example, but several libraries are reported as not found. Would you please update that line? I tried to do it myself, but the result is missing the managed memory feature and I see error 700 at runtime. – Sanhu Li Sep 09 '20 at 22:06
  • Yes, all the library names changed with the re-branding. In general, though, the libraries can change from release to release, so it's best to look at the compiler's verbose "dryrun" output to see what we're passing to the linker. Using the same flags that you'll use to link, run "nvc -dryrun x.o" and look for the link (ld) command. – Mat Colgrove Sep 10 '20 at 16:47
  • Thank you @MatColgrove. I've successfully linked my test program, but we still have runtime issues, such as error 700 occurring suddenly. I'll keep trying to solve it and will probably ping you if we fail. – Sanhu Li Sep 10 '20 at 22:16