
I'm trying to do vector math using ArrayFire.jl, but the vector cross product is not implemented in ArrayFire. Is there a performant workaround for calculating it through Julia's ArrayFire.jl wrapper? Defining the function naively, as below, is really slow because every indexing operation transfers data between the device and the host, and I don't understand the wrapper functions well enough to figure out how to avoid this.

cross(a::ArrayFire.AFArray, b::ArrayFire.AFArray) = ArrayFire.AFArray([a[2]*b[3]-a[3]*b[2]; a[3]*b[1]-a[1]*b[3]; a[1]*b[2]-a[2]*b[1]]);
Boxed
  • So do you want to do this using ArrayFire, or just using Julia's linalg (which has a built-in cross function that is pretty fast...)? – Alexander Morley Jan 12 '17 at 16:19
  • I want to do it using ArrayFire so I can offload the calculations to the GPU and speed up my code. – Boxed Jan 12 '17 at 17:43
  • Can you write a version that takes 3 AFArrays and sets the elements of the first using the definition you gave? – David P. Sanders Jan 12 '17 at 18:16
  • If I understood the lower-level wrapping functions of the ArrayFire.jl package, maybe, but indexing operations like a[i] are slow because they transfer data to and from the GPU. – Boxed Jan 12 '17 at 21:07
  • Ah, hadn't realised that about indexing – David P. Sanders Jan 13 '17 at 00:20
  • To piggyback on @AlexanderMorley: are you sure that the GPU would offer a substantial speedup over Julia's CPU cross function? You have already observed the data transfer bottleneck between host and device. If your compute burden is not beefy, then porting to the GPU may offer disappointing performance gains. – Kevin L. Keys Jan 13 '17 at 01:02
  • Yes, it should offer a substantial speedup. I have around 175 000 instances I need to calculate, and the majority of the code can take advantage of parallelization. I had good success in speeding up the code, but have now run into this problem, and unfortunately I can't get rid of the cross products. – Boxed Jan 13 '17 at 08:57

2 Answers


To answer my own question: the cross product can be computed by using the circshift() function to create cyclically shifted copies of the vectors on the GPU, and then combining them with element-wise multiplication and subtraction. It's not the most elegant way, but it works.

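function cross(a::ArrayFire.AFArray{Float32,1}, b::ArrayFire.AFArray{Float32,1})
    # Cyclically shifted copies, built without scalar indexing
    ashift = circshift(a, [-1])    # (a2, a3, a1)
    ashift2 = circshift(a, [-2])   # (a3, a1, a2)
    bshift = circshift(b, [-2])    # (b3, b1, b2)
    bshift2 = circshift(b, [-1])   # (b2, b3, b1)
    # Element-wise: (a2*b3 - a3*b2, a3*b1 - a1*b3, a1*b2 - a2*b1)
    c::ArrayFire.AFArray{Float32,1} = ashift.*bshift - ashift2.*bshift2
end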
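
The shift identity behind this can be checked on the CPU with plain Julia arrays, as in the minimal sketch below. The names cross_shift and cross_shift_batched are hypothetical helpers introduced only for illustration. The batched variant treats each column of a 3×N matrix as one vector, which might suit the case of many instances; whether ArrayFire.jl's circshift accepts the same tuple-style shifts on matrices is an assumption that has not been verified here.

```julia
using LinearAlgebra  # provides the reference `cross` for 3-vectors

# Cross product of two 3-vectors via cyclic shifts (no scalar indexing).
function cross_shift(a::AbstractVector, b::AbstractVector)
    # circshift(a, -1) == [a[2], a[3], a[1]]
    # circshift(a, -2) == [a[3], a[1], a[2]]
    return circshift(a, -1) .* circshift(b, -2) .-
           circshift(a, -2) .* circshift(b, -1)
end

# Batched variant (assumption: data laid out as 3×N, one vector per column).
# Shifting only along dimension 1 rotates the components of every column at once.
function cross_shift_batched(A::AbstractMatrix, B::AbstractMatrix)
    return circshift(A, (-1, 0)) .* circshift(B, (-2, 0)) .-
           circshift(A, (-2, 0)) .* circshift(B, (-1, 0))
end

a = [1.0, 2.0, 3.0]
b = [3.0, 4.0, 5.0]
c = cross_shift(a, b)

A = rand(3, 1000)
B = rand(3, 1000)
C = cross_shift_batched(A, B)
```

Comparing c against LinearAlgebra's cross(a, b), and each column of C against cross(A[:, j], B[:, j]), confirms that the shifts reproduce the component formula from the question.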
Boxed

I think the following should work:

function cross!(c::AFArray, a::AFArray, b::AFArray)
    c[1] = a[2]*b[3]-a[3]*b[2]
    c[2] = a[3]*b[1]-a[1]*b[3]
    c[3] = a[1]*b[2]-a[2]*b[1]
end

c = AFArray(zeros(3))
a = AFArray([1.0, 2, 3])
b = AFArray([3.0, 4, 5])

cross!(c, a, b)
David P. Sanders