Questions tagged [offloading]

This tag is for questions about software that uses mechanisms to reduce the workload on the CPU. This can be done by aggregating work before further processing and/or by processing part of the workload in dedicated hardware.

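Common offloads include network stack offloads such as LRO (Large Receive Offload), GRO (Generic Receive Offload), and TSO (TCP Segmentation Offload). Others are CPU instruction-set offloads, such as Intel's AES-NI, which can accelerate encryption for IPsec.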
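As a rough illustration of what a segmentation offload such as TSO saves, the sketch below computes how one large send would be cut into MSS-sized segments. With the offload enabled, the network stack is traversed once for the whole buffer and the NIC does the cutting; without it, the CPU builds every segment itself. The function names `tso_segment_count` and `tso_segment_len` are hypothetical and only model the arithmetic; they are not part of any driver or kernel API.

```c
#include <stddef.h>

/* Hypothetical helpers: they model the splitting a TSO-capable NIC
 * performs, not any real driver interface. */

/* Number of wire segments needed for payload_len bytes at a given MSS. */
size_t tso_segment_count(size_t payload_len, size_t mss)
{
    if (mss == 0)
        return 0;                          /* invalid MSS: nothing to send */
    return (payload_len + mss - 1) / mss;  /* ceiling division */
}

/* Length of segment `index` (0-based); only the last one may be short. */
size_t tso_segment_len(size_t payload_len, size_t mss, size_t index)
{
    size_t count = tso_segment_count(payload_len, mss);
    if (index >= count)
        return 0;                          /* no such segment */
    size_t remaining = payload_len - index * mss;
    return remaining < mss ? remaining : mss;
}
```

For a 64 KiB send with an MSS of 1460 bytes, `tso_segment_count(65536, 1460)` gives the number of packets the hardware would emit for a single pass through the stack.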

Offloads can be multi-layered. For example, OVS (Open vSwitch) has a user space service that identifies packets and creates steering rules for them. The user space service offloads the steering to the kernel datapath. Some hardware can perform the steering itself, so the kernel may in turn offload to the hardware.
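The layering described above can be sketched as a chain of lookups, where each miss falls through to the next, slower layer and a hit gets pushed one layer down. This is a toy model under stated assumptions, not how OVS is implemented: the `flow_table` type, the `steer_packet` function, and the `userspace_decide` policy are all hypothetical names invented for the example.

```c
#include <stddef.h>

#define TABLE_SIZE 8

/* Which layer ended up making the steering decision. */
enum layer { LAYER_HARDWARE, LAYER_KERNEL, LAYER_USERSPACE };

/* Toy rule table: maps a flow key to an output port. */
struct flow_table {
    unsigned int keys[TABLE_SIZE];
    int ports[TABLE_SIZE];
    size_t used;
};

/* Returns 1 and fills *port on a hit, 0 on a miss. */
static int table_lookup(const struct flow_table *t, unsigned int key, int *port)
{
    for (size_t i = 0; i < t->used; i++) {
        if (t->keys[i] == key) {
            *port = t->ports[i];
            return 1;
        }
    }
    return 0;
}

/* Installs a rule; when the table is full the rule is simply not offloaded. */
static void table_install(struct flow_table *t, unsigned int key, int port)
{
    if (t->used < TABLE_SIZE) {
        t->keys[t->used] = key;
        t->ports[t->used] = port;
        t->used++;
    }
}

/* Hypothetical slow-path policy standing in for the user space service. */
static int userspace_decide(unsigned int flow_key)
{
    return (int)(flow_key % 4);
}

/* Try the fastest layer first; on a hit in a slower layer, push the rule
 * one layer down so later packets of the same flow are handled there. */
enum layer steer_packet(struct flow_table *hw, struct flow_table *kernel,
                        unsigned int flow_key, int *out_port)
{
    if (table_lookup(hw, flow_key, out_port))
        return LAYER_HARDWARE;

    if (table_lookup(kernel, flow_key, out_port)) {
        table_install(hw, flow_key, *out_port);   /* kernel offloads to hardware */
        return LAYER_KERNEL;
    }

    *out_port = userspace_decide(flow_key);
    table_install(kernel, flow_key, *out_port);   /* user space offloads to kernel */
    return LAYER_USERSPACE;
}
```

In this model the first packet of a flow is decided in user space, the second in the kernel, and every later one in the hardware table, which is the progression the multi-layered description implies.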

Common questions about offloads

  • How do specific offloads work?
  • How can offloads be enabled for specific cases?
  • What is the benefit of using a specific offload?
111 questions
1
vote
1 answer

Declare all Fortran module variables target OpenMP 4.5+

I have a Fortran 90 code that uses chemical species properties (i.e. molecular weight, viscosity, etc.) for calculations. To easily swap in and out groups of chemical species, we keep module files that store all the relevant data in 1D arrays. I.e. we…
Kschau
1
vote
0 answers

pyMIC Offload MLPACK Code -> Error: Could not load library on device 0

We are trying to compile some code (a modified mlpack knn_example.cpp) that uses the mlpack and Armadillo C++ libraries. Compilation is successful, but when running the pymic code we are getting an error: pymic.offload_error.OffloadError: Could…
Chris Njuguna
1
vote
1 answer

OpenMP offloading

I have an array of struct A that has arrays and int variables. How can I map them to the target? struct A{ int **a; int *x; int *y; int ny; int nx; }A; A arrayA = (A*)malloc(sizeof(A)*MaxSize); for(int i=0; i…
Taghreed
1
vote
0 answers

Xeon-Phi vs. Xeon Unexplained Overhead

I am trying to run the following code with different n sizes on a Xeon Phi KNC (61 cores, 4 threads per core) and a Xeon (2 sockets of Xeon E5-2660 v2). I am getting the timings shown in the tables below. However, I am trying to understand why…
1
vote
1 answer

Unexplained Xeon-Phi Overhead

I am trying to run this code with these different n sizes on a Xeon Phi KNC. I am getting the timings shown in the table, but I have no idea why I am experiencing those fluctuations. Can you please guide me through it? Thanks in…
1
vote
1 answer

OpenMP offloaded target region executed in both host and target-device

I'm working on a project which requires OpenMP offloading to Nvidia GPUs using Clang. I was able to install Clang to support offloading by following instructions mentioned here. System specification OS - Ubuntu 16.04 LTS Clang -version…
piyumi_rameshka
1
vote
0 answers

Build Nvptx-tools on Windows

I tried to install nvptx-tools on Windows in order to enable GCC offload tools, but I couldn't do it. I made a copy of nvptx-tools from here and to build it I followed this tutorial (the part with "build nvptx-tools"). If I put all the commands in a…
1
vote
1 answer

Problems with gcc 7 and 8 (debian) in OpenMP offloading to nvptx

I installed gcc-7, gcc-8, gcc-7-offload-nvptx and gcc-8-offload-nvptx. I tried with both to compile a simple OpenMP code with offloading: #include #include int main(){ #pragma omp target #pragma omp teams distribute…
648trindade
1
vote
2 answers

Can I use `omp_get_thread_num()` on the GPU?

I have OpenMP code which works on the CPU by having each thread manage memory addressed by the thread's id number, accessible via omp_get_thread_num(). This works well on the CPU, but can it work on the GPU? A MWE is: #include #include…
Richard
1
vote
0 answers

Can OpenMP 4 run target regions in parallel?

Reading some tutorials from OpenMP 4, I found that target regions can participate in the same dependency graph of CPU tasks, using the depend clause. When programming OpenMP tasks, we know they can be run concurrently. But is this possible on GPUs?…
648trindade
1
vote
0 answers

OpenMP target (update) into

I am working with the OpenMP 4.5 Accelerator Model on a platform equipped with Intel Xeon Phi coprocessors. I would like to use CPU + MIC for joint problem solving. I need to use a mechanism similar to Intel Offload into. I would like to transfer immediately…
JudgeDeath
1
vote
0 answers

How to pass user-defined struct in C with OpenMP to accelerator?

I have a struct defined, in which I have a dynamically allocated array, and I need to transfer this struct from the host to the accelerator (in my case it would be some nvidia GPU) through some OpenMP directives (in a C-code). The struct looks as…
Alf
1
vote
1 answer

Does GCC feature a similar parameter to pgcc's -Minfo=accel?

I'm trying to compile code on GCC that uses OpenACC to offload to an NVIDIA GPU but I haven't been able to find a similar compiler option to the one mentioned above. Is there a way to tell GCC to be more verbose on all operations related to…
1
vote
2 answers

Media Player as a Webservice

I am new to Android development. I need to get an idea of how to deploy a video/audio file as a web service on a GlassFish server and then call the web service from a client device (mobile). Please help me to get a very basic idea. I have already run many…
Mushtaq
1
vote
0 answers

OpenMP offloading tasks to Intel MIC

I am trying to offload an expensive loop in my program to Intel MIC. The part of the code is: !$omp target map(to:coor,sigma_const,clase) map(tofrom:ener1,ener2) !$omp parallel private(i,j,fdummy1,k,l,fdummy2,fdummy3,fdummy4,fdummy5,dist) !$omp do…
armando