Questions tagged [reduction]

481 questions
6
votes
4 answers

OpenCL float sum reduction

I would like to apply a reduce on this piece of my kernel code (1 dimensional data): __local float sum = 0; int i; for(i = 0; i < length; i++) sum += //some operation depending on i here; Instead of having just 1 thread that performs this…
Kami
  • 1,079
  • 2
  • 13
  • 28
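
A minimal sketch of the pairwise (tree) reduction pattern this question is asking about, written as plain host-side C++ rather than an OpenCL kernel. The function name `tree_sum` is hypothetical, and the inner loop only simulates what separate work-items would do in parallel between barriers.

```cpp
#include <cstddef>
#include <vector>

// Pairwise (tree) sum. Each pass folds the upper half of the active range
// into the lower half, so the number of passes grows with log2(length).
// Takes the vector by value because the passes overwrite it.
float tree_sum(std::vector<float> data) {
    if (data.empty()) return 0.0f;
    std::size_t n = data.size();
    while (n > 1) {
        // Round up so that an odd-length range keeps its middle element.
        std::size_t half = (n + 1) / 2;
        for (std::size_t i = 0; i + half < n; ++i) {
            // On a GPU, each value of i would be handled by its own work-item.
            data[i] += data[i + half];
        }
        // On a GPU, a work-group barrier would be needed before the next pass.
        n = half;
    }
    return data[0];
}
```

In a real kernel the buffer would typically live in local memory, and each work-group would write one partial sum for a second pass to combine.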
5
votes
1 answer

Reduction operator using user-defined function error

The Raku documentation says that an extra pair of brackets should be used for user-defined functions within the reduction operator: https://docs.raku.org/language/operators#Reduction_metaoperators However, I am getting errors when I pass the function as a variable (I am…
lisprogtor
  • 5,677
  • 11
  • 17
5
votes
1 answer

Clean way to loop over a masked list in Julia

In Julia, I have a list of neighbors of a location stored in all_neighbors[loc]. This allows me to quickly loop over these neighbors conveniently with the syntax for neighbor in all_neighbors[loc]. This leads to readable code such as the…
NoseKnowsAll
  • 4,593
  • 2
  • 23
  • 44
5
votes
1 answer

Why does g++ use movabs, and with a weird constant, for a simple reduction?

I'm compiling this simple program: #include <numeric> int main() { int numbers[] = {1, 2, 3, 4, 5}; auto num_numbers = sizeof(numbers)/sizeof(numbers[0]); return std::accumulate(numbers, numbers + num_numbers, 0); } which sums up the…
einpoklum
  • 118,144
  • 57
  • 340
  • 684
5
votes
4 answers

Finding max value in CUDA

I am trying to write a code in CUDA for finding the max value for the given set of numbers. Assume you have 20 numbers, and the kernel is running on 2 blocks of 5 threads. Now assume the 10 threads compare the first 10 values at the same time, and…
kar
  • 2,505
  • 9
  • 30
  • 32
5
votes
1 answer

how to parallelize this for-loop using reduction?

I am trying to parallelize this for-loop using OpenMP. I recognized that there is a reduction in this loop, so I added "#pragma omp parallel for reduction(+,ftab)", but it did not work and gave me this error: error: user defined reduction not…
elias rizik
  • 263
  • 2
  • 11
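
For reference, a hedged sketch of the OpenMP `reduction` clause on a scalar accumulator. The clause separates the operator from the variable list with a colon, as in `reduction(+ : total)`. The function `sum_of_squares` and its loop body are hypothetical stand-ins, since the loop from the question is not shown here, and `ftab` in the question may be an array, which this scalar example does not cover.

```cpp
// Sums i*i for i in [0, n) with an OpenMP sum reduction.
// If the compiler is run without OpenMP support, the pragma is ignored
// and the loop runs serially with the same result.
double sum_of_squares(int n) {
    double total = 0.0;
    // Each thread accumulates into a private copy of total;
    // the copies are combined with + when the loop ends.
    #pragma omp parallel for reduction(+ : total)
    for (int i = 0; i < n; ++i) {
        total += static_cast<double>(i) * i;
    }
    return total;
}
```

With GCC, OpenMP is enabled by passing `-fopenmp` at compile and link time.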
5
votes
2 answers

OpenMP - critical section + reduction

I'm currently learning parallel programming using C and OpenMP. I wanted to write simple code where two shared values are being incremented by multiple threads. First I used the reduction directive and it worked as intended. Then I switched to…
user4433856
5
votes
3 answers

Flow Shop to Boolean satisfiability [Polynomial-time reduction]

I am looking for ideas on how to transform a flow shop scheduling problem into a Boolean satisfiability problem. I have already done such a reduction for an N*N Sudoku, N-queens, and a class scheduling problem, but I have some issues with how to…
Valentin Montmirail
  • 2,594
  • 1
  • 25
  • 53
5
votes
1 answer

segmented reduction with scattered segments

I have to solve a pretty standard problem on the GPU, but I'm quite new to practical GPGPU, so I'm looking for ideas on how to approach this problem. I have many points in 3-space which are assigned to a very small number of groups (each point belongs to…
Christian Rau
  • 45,360
  • 10
  • 108
  • 185
4
votes
1 answer

thrust::reduce_by_key performance with few key repetitions

I have to do keyed reductions of arrays with many different keys that repeat only once in a while: keys = {1,2,3,3,4,5,6,7,7, 8, 9, 9,10,11,...} array = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,...} // after reduction result =…
bbtrb
  • 4,065
  • 2
  • 25
  • 30
4
votes
1 answer

Algorithm to maximize profit: ways to solve/approach? (Advanced NP-Complete)

This one's hard, so all help really appreciated! I know it is NP-Complete and thus cannot be solved in polynomial time, but looking for help in analysis, what type of NP-Complete problem it reduces to, similar problems it reminds you of, etc. The…
Jason
  • 13,563
  • 15
  • 74
  • 125
4
votes
1 answer

How to get unique elements and the indices of their first occurrence in a PyTorch tensor?

Assume a 2*X (always 2 rows) PyTorch tensor: A = tensor([[ 1., 2., 2., 3., 3., 3., 4., 4., 4.], [43., 33., 43., 76., 33., 76., 55., 55., 55.]]) torch.unique(A, dim=1) will return: tensor([[ 1., 2., 2., 3., 3., 4.], …
ojipadeson
  • 129
  • 1
  • 9
4
votes
1 answer

Why is OpenMP reduction slower than MPI on a shared-memory architecture?

I have tested OpenMP and MPI parallel implementations of the inner product of two vectors (element values are computed on the fly) and found that OpenMP is slower than MPI. The MPI code I am using is as follows: #include #include…
Yunlin Xu
  • 49
  • 2
4
votes
2 answers

Using R for variable/dimension reduction on large data set

I have some data in R with various variables for my cases: B T H G S Z Golf 1 1 1 0 1 0 Football 0 0 0 1 1 0 Hockey 1 0 0 1 0 0 Golf2 1 1 1 1 1 0 Snooker 1 0 1 0 1 1 I also have a vector of my expected output per case: 1,…
Paul
  • 1,874
  • 1
  • 19
  • 26
4
votes
2 answers

Collectors.reducing method is updating the same identity when used as downstream for Collectors.partitioningBy

I have a class similar to the below MyObject. public class MyObject { private String key; // not unique. multiple objects can have the same key. private boolean isPermanent; private double value1; private double value2; …
Gautham M
  • 4,816
  • 3
  • 15
  • 37