
I have a large array input_array and an array of offsets take_array, and I want to return the elements at those offsets very fast. Can I vectorize this operation for an Arrow array? If so, how?

arrow::compute::Take(input_array, take_array)
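For reference, here is a minimal C++ sketch of how that call is typically wrapped; the helper name TakeSubset and the std::shared_ptr<arrow::Array> inputs are assumptions for illustration, not part of the original snippet:

    #include <memory>

    #include <arrow/api.h>
    #include <arrow/compute/api.h>

    // Gather input_array[i] for every index i in take_array.
    arrow::Result<std::shared_ptr<arrow::Array>> TakeSubset(
        const std::shared_ptr<arrow::Array>& input_array,
        const std::shared_ptr<arrow::Array>& take_array) {
      // Take returns a Datum; unwrap it back into an Array.
      ARROW_ASSIGN_OR_RAISE(arrow::Datum out,
                            arrow::compute::Take(input_array, take_array));
      return out.make_array();
    }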

Use Case: I am taking a subset of a really large input_array. This runs in code that already uses OpenMP-like and MPI-like parallelism, so vectorization seems to be the next low-hanging fruit.

Example: https://arrow.apache.org/docs/python/generated/pyarrow.compute.take.html

The example above uses Apache Arrow. I am also open to Gandiva, Velox, LLVM, or Intel MKL if there is a better way.

https://www.intel.com/content/www/us/en/developer/articles/technical/vectorization-llvm-gcc-cpus-gpus.html#gs.3vy461

https://llvm.org/docs/Vectorizers.html

https://www.dremio.com/blog/gandiva-performance-improvements-production-query/

cpchung
  • One way I am trying to approach this problem is to think of the `take_array` as a filter. In this case, there might be a way to vectorize the filtering using AVX through Apache Gandiva, which uses xsimd behind the scenes. – cpchung Jun 23 '22 at 19:45
  • quoting a reply from Velox: "In Velox you could do this zero-copy by wrapping input_array and take_array and creating a DictionaryVector". Not sure whether there is an Arrow equivalent; a sketch of one possible analogue follows these comments. – cpchung Jun 24 '22 at 18:51
  • > "Can I vectorize it for the arrow array? If so, how?" Could you please elaborate a bit on this? – Shanmukh-Intel Jul 07 '22 at 05:05
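Following up on the Velox comment above: one possible Arrow-side analogue of that zero-copy wrapping is a dictionary-encoded view, with take_array as the indices and input_array as the dictionary, so no values are copied until the view is materialized. This is only a sketch of the idea, not a confirmed equivalent; the helper name DictionaryView is made up here:

    #include <memory>

    #include <arrow/api.h>

    // Wrap take_array (indices) and input_array (values) into a DictionaryArray
    // view without copying the underlying value buffers.
    arrow::Result<std::shared_ptr<arrow::Array>> DictionaryView(
        const std::shared_ptr<arrow::Array>& input_array,
        const std::shared_ptr<arrow::Array>& take_array) {
      auto dict_type = arrow::dictionary(take_array->type(), input_array->type());
      return arrow::DictionaryArray::FromArrays(dict_type, take_array, input_array);
    }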

0 Answers