Questions tagged [offloading]

This tag is for questions about software that uses mechanisms to reduce the workload on the CPU. This can be done by aggregating work before further processing and/or by handing part of the workload to dedicated hardware.

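Common offloads include network stack offloads such as LRO (Large Receive Offload), GRO (Generic Receive Offload), and TSO (TCP Segmentation Offload). Others are CPU instruction-set offloads, such as Intel's AES-NI, which can accelerate encryption for IPsec.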
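As a rough illustration of the aggregation idea behind segmentation and receive offloads, here is a minimal Python sketch. It is a toy model only, not how any NIC or kernel implements TSO or GRO; the function names `segment` and `coalesce` and the `(offset, bytes)` representation are assumptions made for the example.

```python
def segment(payload, mss):
    """TSO-like split: cut one large send into (offset, chunk) pieces of at most mss bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [(off, payload[off:off + mss]) for off in range(0, len(payload), mss)]


def coalesce(segments):
    """GRO-like merge: join in-order, contiguous (offset, chunk) pieces into larger buffers."""
    merged = []
    for off, data in segments:
        if merged and merged[-1][0] + len(merged[-1][1]) == off:
            # This piece starts exactly where the previous buffer ends, so extend it.
            prev_off, prev_data = merged[-1]
            merged[-1] = (prev_off, prev_data + data)
        else:
            # A gap or out-of-order piece starts a new buffer.
            merged.append((off, data))
    return merged
```

The point of the sketch is that the upper layers handle one large buffer while the per-segment work happens elsewhere; real offloads also deal with headers, checksums, and flow matching, which are left out here.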

Offloads can be multi-layered. For example, OVS (Open vSwitch) has a user-space service that identifies flows and creates steering rules for packets. The user-space service offloads the steering to the kernel software. Where specific hardware can process the steering itself, the kernel may in turn offload it to the hardware.
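The layering described above can be sketched as a toy three-tier lookup in Python. This is not the OVS API or its actual data structures; the class `Datapath`, the `policy` callback, and the `hw_capacity` limit are all hypothetical names chosen for illustration.

```python
class Datapath:
    """Toy model: hardware table first, then kernel table, then the user-space slow path."""

    def __init__(self, policy, hw_capacity=2):
        self.policy = policy            # hypothetical user-space decision: flow -> action
        self.hw_capacity = hw_capacity  # assumed limit on hardware rule slots
        self.kernel_rules = {}
        self.hw_rules = {}

    def handle(self, flow):
        """Return (tier, action), where tier names the layer that steered the packet."""
        if flow in self.hw_rules:
            return "hardware", self.hw_rules[flow]
        if flow in self.kernel_rules:
            return "kernel", self.kernel_rules[flow]
        # Miss in both tables: user space decides and offloads the rule to the kernel.
        action = self.policy(flow)
        self.kernel_rules[flow] = action
        # The kernel layer offloads further only while the hardware has room.
        if len(self.hw_rules) < self.hw_capacity:
            self.hw_rules[flow] = action
        return "userspace", action
```

In this model only the first packet of a flow reaches user space; later packets are steered by the lowest layer that holds a matching rule.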

Common questions about offloads

  • How do specific offloads work?
  • How can offloads be enabled for specific cases?
  • What is the benefit of using a specific offload?
111 questions
0
votes
1 answer

Why Doesn't OpenMP Offload Arrays to the GPU?

I am currently writing some code in C and want to use GPUs to do the calculation. My code has a test function like this: void test_func(int *x, int N){ // x is allocated using x = malloc(N*(sizeof *x)) elsewhere. It's done on the CPU. // N is a…
0
votes
2 answers

DPDK 19.11.10: HW offload for IPv4 with VLAN tag is not working properly

I am using DPDK 19.11.10 on CentOS. The application works fine with HW offloading if I send only the IPv4 packet without the VLAN header. If I add the VLAN header to the IPv4 packet, HW offloading does not work. If I capture the pcap on the Ubuntu gateway the…
0
votes
1 answer

Offloading a triple pointer (C) to an NVIDIA GPU with OpenMP

I've been working with a heat transfer code. This code basically establishes the initial conditions for a cube and all of its faces. The six faces start at different temperatures, and then the code calculates how the temperature changes in…
DrewHdz
  • 3
  • 2
0
votes
0 answers

Trying to load xdp on nic (offloaded)

I am trying to load my XDP program directly on the NIC (offloaded XDP). According to this answer, I need to specify the device to the following functions: BPF() and load_func(). I did it like so: b = BPF(src_file="file.c", cflags=["-w"],…
Lidorelias3
  • 27
  • 1
  • 10
0
votes
0 answers

GCC fails to compile OpenMP offloading to GPU

I'm using GCC 9.3 on Ubuntu 20.04. I want to offload the famous SAXPY example to the GPU using OpenMP. I installed GCC's offloading capabilities with sudo apt install gcc-9-offload-nvptx. Then I compiled the following code with g++ -fopenmp main.cpp: int…
0
votes
2 answers

Do I have to build clang-11 from source on Ubuntu 18.04 to have OpenMP GPU target offload?

I installed clang-11 from https://apt.llvm.org/ on Ubuntu 18.04 and I have OpenMP host device functionality working in my C++ test project, but omp_get_num_devices() returns 0, even though I have an NVIDIA GPU and a working CUDA 11 toolkit. Do I have to…
Paul Jurczak
  • 7,008
  • 3
  • 47
  • 72
0
votes
1 answer

Intel Advisor beta offloading analysis: No execution count

I am trying to use the Intel oneAPI Advisor beta to do a GPU offloading analysis (via analyze.py and collect.py). My problem is that all non-offloaded regions show Cannot be modelled: No Execution Count. Furthermore, I get the warning advixe:…
lm1909
  • 35
  • 7
0
votes
1 answer

OpenMP Offloading with Private Arrays of Unknown Size at Compile Time

I am trying to offload several nested for loops in Fortran using OpenMP with the XL compiler suite. 90% of the routines are straightforward, but a handful of the loops involve private 1D arrays that are of unknown size at compile time, but will always be…
Kschau
  • 145
  • 1
  • 12
0
votes
2 answers

Difference between offloading decision and task scheduling in the context of fog computing

In the context of fog computing, the computation offloading decision determines where to offload a task (to the cloud or to the fog) or whether to execute it locally. Task scheduling also decides where to execute a task, on the fog or on the cloud. Then what is the difference between…
0
votes
0 answers

Tomcat MAX Thread VS sessions

We are using protocol="org.apache.coyote.http11.Http11NioProtocol" and I have a question: how many concurrent sessions can Tomcat handle at most? As per my understanding, with http11.Http11NioProtocol Tomcat can handle 10000 connections with 200…
0
votes
1 answer

Azure App Gateway SSL Offloading to a Datacentre server?

So I am looking at using Azure App Gateway to overcome a set of legacy servers (Win2003) that will not support TLS 1.2 and therefore come March+ 2020 the client browsers will not be able to access the site. So my question is can I use AZ App Gateway…
Rusty
  • 113
  • 1
  • 7
0
votes
1 answer

Debug OpenMP Python C extension offloading

I am using the modeling toolbox Anuga and have set it up to run with parallel support. To my current knowledge, the mechanism behind it is that NumPy is extended by modules in C, which are exposed to OpenMP through extra_args = ['-fopenmp']. I have…
Sebastian
  • 1
  • 2
0
votes
1 answer

Private Variables in Offloaded Fortran Parallel Loop

I am offloading code to a GPU using OpenMP 4.5. So far everything is working on the GPU, except when I try to make parallel sections with private variables that are allocated before I offload. I am using gcc 7.2.0 and cuda 9.2.88. I am running on…
Jared
  • 77
  • 6
0
votes
1 answer

libcoi_device.so.0 Not found Compiling Error with Intel 19.0.4 OpenMP 5.0 Offloading

I just installed Intel® Parallel Studio XE Cluster Edition for Linux* 2019 and am trying to use OpenMP to offload to a Xeon Phi accelerator. I am using cmake, with flags CC=~/intel/bin/icc CXX=~/intel/bin/icpc CMAKE_CXX_FLAGS="-qopenmp-offload"…
Jared
  • 77
  • 6
0
votes
1 answer

Using OpenMP target offloading in llvm-8.0.0

When trying to use OpenMP target offloading with LLVM I get the following error: $ cat offload.cpp #include int main() { #pragma omp target teams distribute parallel for for(int i=0; i<100; i++); return 0; } $ clang++ -fopenmp…
Alok
  • 33
  • 6