I am trying to parallelize the computation of a metric on the nodes of a graph.
My approach is to have each thread calculate the metric for one node, since the calculation for each node is independent.
Each thread has to find the neighbors at distance one of its node and store them in an array whose size is not known in advance and differs from node to node.
I can't use an extern __shared__ array, because each thread has to build its own array, which can't be shared with the other threads.
I can't declare an array with a fixed (maximum) size either, because that would be very inefficient for my task.
Is there any other way to handle this array, or some other dynamic data structure I could use?
This is an extract of the kernel function:
__global__ void expectedForce(int* IR_vec, int* IC_vec, int n_IR)
{
    double ExF = 0;
    int seed = blockDim.x * blockIdx.x + threadIdx.x + 1;
    if (seed < n_IR) {
        int valRiga = IR_vec[seed];
        int distOne[]; // that's the array I have to handle
        ...
    }
}
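For context, one direction that avoids both shared memory and a fixed per-thread maximum is a two-pass scheme: count each node's neighbors, turn the counts into offsets with a prefix sum, then let each thread write into its own slice of one global buffer. The following is only a minimal sketch under assumptions that are not confirmed by the kernel above: the graph is assumed to be stored in CSR form, and rowPtr, colIdx, countDistOne, fillDistOne and buildDistOne are hypothetical names. Error checking is omitted for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Pass 1: each thread records how many distance-one neighbors its node has.
// ASSUMPTION: rowPtr is a CSR row-offset array with n + 1 entries.
__global__ void countDistOne(const int* rowPtr, int n, int* counts)
{
    int node = blockDim.x * blockIdx.x + threadIdx.x;
    if (node < n)
        counts[node] = rowPtr[node + 1] - rowPtr[node];
}

// Pass 2: each thread fills its private slice of the shared global buffer.
// The slice for a node is distOneAll[offsets[node] .. offsets[node + 1]).
__global__ void fillDistOne(const int* rowPtr, const int* colIdx,
                            const int* offsets, int n, int* distOneAll)
{
    int node = blockDim.x * blockIdx.x + threadIdx.x;
    if (node >= n) return;
    int* distOne = distOneAll + offsets[node];   // this thread's "array"
    int count = offsets[node + 1] - offsets[node];
    for (int k = 0; k < count; ++k)
        distOne[k] = colIdx[rowPtr[node] + k];
}

// Hypothetical host helper: returns the concatenated neighbor lists and
// writes the per-node offsets (n + 1 entries) into `offsets`.
// ASSUMPTION: the graph has at least one node.
std::vector<int> buildDistOne(const std::vector<int>& rowPtr,
                              const std::vector<int>& colIdx,
                              std::vector<int>& offsets)
{
    int n = static_cast<int>(rowPtr.size()) - 1;
    int threads = 128;
    int blocks = (n + threads - 1) / threads;

    int *dRowPtr, *dColIdx, *dCounts, *dOffsets, *dAll;
    cudaMalloc(&dRowPtr, rowPtr.size() * sizeof(int));
    cudaMalloc(&dColIdx, colIdx.size() * sizeof(int));
    cudaMalloc(&dCounts, n * sizeof(int));
    cudaMalloc(&dOffsets, (n + 1) * sizeof(int));
    cudaMemcpy(dRowPtr, rowPtr.data(), rowPtr.size() * sizeof(int),
               cudaMemcpyHostToDevice);
    cudaMemcpy(dColIdx, colIdx.data(), colIdx.size() * sizeof(int),
               cudaMemcpyHostToDevice);

    countDistOne<<<blocks, threads>>>(dRowPtr, n, dCounts);

    // Exclusive prefix sum on the host, kept simple for the sketch.
    std::vector<int> counts(n);
    cudaMemcpy(counts.data(), dCounts, n * sizeof(int),
               cudaMemcpyDeviceToHost);
    offsets.assign(n + 1, 0);
    for (int i = 0; i < n; ++i)
        offsets[i + 1] = offsets[i] + counts[i];
    cudaMemcpy(dOffsets, offsets.data(), (n + 1) * sizeof(int),
               cudaMemcpyHostToDevice);

    // One allocation sized exactly for all neighbor lists together.
    std::vector<int> all(offsets[n]);
    cudaMalloc(&dAll, all.size() * sizeof(int));
    fillDistOne<<<blocks, threads>>>(dRowPtr, dColIdx, dOffsets, n, dAll);
    cudaMemcpy(all.data(), dAll, all.size() * sizeof(int),
               cudaMemcpyDeviceToHost);

    cudaFree(dRowPtr); cudaFree(dColIdx); cudaFree(dCounts);
    cudaFree(dOffsets); cudaFree(dAll);
    return all;
}
```

With a layout like this, a metric kernel could treat distOneAll + offsets[node] as that thread's distOne array, with no per-thread allocation inside the kernel. Whether it fits depends on how the real neighbor computation works, which the extract above does not show.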