7

What's the efficient way to multiply two arrays and get sum of multiply values in Ruby? I have two arrays in Ruby:

array_A = [1, 2, 1, 4, 5, 3, 2, 6, 5, 8, 9]
array_B = [3, 2, 4, 2, 5, 1, 3, 3, 7, 5, 4]

My aim is to get the sum value of array_A * array_B, i.e., 1*3 + 2*2 + 1*4 + ... + 8*5 + 9*4.

Because I need to calculate them million times in my apps, what's the most efficient way to do such calculations?

It's just like a matrix calcuation: 1* N matrix * N*1 matrix or a vector dot product.

mu is too short
  • 426,620
  • 70
  • 833
  • 800
Kevin Hua
  • 93
  • 1
  • 1
  • 8
  • 3
    Im not a rails/ruby user but would this not be a general Ruby question rather than something specific to the Rails framework? – KillerX Sep 10 '11 at 14:52
  • These calculations will be used in rails apps. But I think it's general ruby question. I don't want to include matrix lib and just want to find a easier and more efficient way to do such calculations. – Kevin Hua Sep 10 '11 at 15:03
  • Very similar problem http://stackoverflow.com/questions/1009280/how-do-i-perform-vector-addition-in-ruby – dfens Sep 10 '11 at 19:55

9 Answers9

22

Update

I've just updated benchmarks according to new comments. Following Joshua's comment, the inject method will gain a 25% speedup, see array walking without to_a in the table below.

However since speed is the primary goal for the OP we have a new winner for the contest which reduces runtime from .34 to .22 in my benchmarks.

I still prefer inject method because it's more ruby-ish, but if speed matters then the while loop seems to be the way.

New Answer

You can always benchmark all these answers, I did it for curiosity:

> ./matrix.rb 
Rehearsal --------------------------------------------------------------
matrix method                1.500000   0.000000   1.500000 (  1.510685)
array walking                0.470000   0.010000   0.480000 (  0.475307)
array walking without to_a   0.340000   0.000000   0.340000 (  0.337244)
array zip                    0.590000   0.000000   0.590000 (  0.594954)
array zip 2                  0.500000   0.000000   0.500000 (  0.509500)
while loop                   0.220000   0.000000   0.220000 (  0.219851)
----------------------------------------------------- total: 3.630000sec

                                 user     system      total        real
matrix method                1.500000   0.000000   1.500000 (  1.501340)
array walking                0.480000   0.000000   0.480000 (  0.480052)
array walking without to_a   0.340000   0.000000   0.340000 (  0.338614)
array zip                    0.610000   0.010000   0.620000 (  0.625805)
array zip 2                  0.510000   0.000000   0.510000 (  0.506430)
while loop                   0.220000   0.000000   0.220000 (  0.220873)

Simple array walking wins, Matrix method is worse because it includes object instantiation. I think that if you want to beat the inject while method (to beat here means an order of magnitude fastest) you need to implement a C extension and bind it in your ruby program.

Here it's the script I've used

#!/usr/bin/env ruby

require 'benchmark'
require 'matrix'

array_A = [1, 2, 1, 4, 5, 3, 2, 6, 5, 8, 9]
array_B = [3, 2, 4, 2, 5, 1, 3, 3, 7, 5, 4]

def matrix_method a1, a2
  (Matrix.row_vector(a1) * Matrix.column_vector(a2)).element(0,0)
end

n = 100000

Benchmark.bmbm do |b|
  b.report('matrix method') { n.times { matrix_method(array_A, array_B) } }
  b.report('array walking') { n.times { (0...array_A.count).to_a.inject(0) {|r, i| r + array_A[i]*array_B[i]} } }
  b.report('array walking without to_a') { n.times { (0...array_A.count).inject(0) {|r, i| r + array_A[i]*array_B[i]} } }
  b.report('array zip') { n.times { array_A.zip(array_B).map{|i,j| i*j }.inject(:+) } }  
  b.report('array zip 2') { n.times { array_A.zip(array_B).inject(0) {|r, (a, b)| r + (a * b)} } }
  b.report('while loop') do
    n.times do
      sum, i, size = 0, 0, array_A.size
      while i < size
        sum += array_A[i] * array_B[i]
        i += 1
      end
      sum
    end
  end
end
Community
  • 1
  • 1
Fabio
  • 18,856
  • 9
  • 82
  • 114
  • 2
    I tried increasing the array size up to 10,000 elements and the results stay in the same order. Good stuff. – spike Sep 10 '11 at 20:12
7

Walking through each element should be a must

(0...array_A.count).inject(0) {|r, i| r + array_A[i]*array_B[i]}
PeterWong
  • 15,951
  • 9
  • 59
  • 68
  • 1
    +1 for your answer, you won the [benchmark](http://stackoverflow.com/questions/7372489/whats-the-efficient-way-to-multiply-two-arrays-and-get-sum-of-multiplied-values/7372927#7372927) – Fabio Sep 10 '11 at 16:18
  • 2
    The to_a is unnecessary and adds about 40% more time to your solution, according to Peter's benchmark. Any enumerable objects have inject defined on them, not just arrays. – Joshua Cheek Sep 10 '11 at 17:23
3

I would start simple and use the Ruby matrix class:

require 'matrix'

a = Matrix.row_vector( [1, 2, 1, 4, 5, 3, 2, 6, 5, 8, 9])
b = Matrix.column_vector([3, 2, 4, 2, 5, 1, 3, 3, 7, 5, 4])

result= a * b
puts result.element(0,0)

If this turns out to be too slow, then do the exact same method but with an external math library.

spike
  • 9,794
  • 9
  • 54
  • 85
3

This is how I would do it

array_A.zip(array_B).map{|i,j| i*j }.inject(:+)
christianblais
  • 2,448
  • 17
  • 14
  • 2
    +1 I don't know how will this perform a "million of times", but definitely the most idiomatic way to do it. Maybe .inject(0, :+) in case of empty arrays. – tokland Sep 10 '11 at 16:57
1

Try the NMatrix gem. It is a numerical computation library. I think it uses the same C and C++ libraries that Octave and Matlab uses.

You would be able to do the matrix multiplication like this:

require 'nmatrix'

array_A = [1, 2, 1, 4, 5, 3, 2, 6, 5, 8, 9]
array_B = [3, 2, 4, 2, 5, 1, 3, 3, 7, 5, 4]

vec_a = array_A.to_nm([1,array_A.length])    # create an NMatrix
vec_b = array_B.to_nm([1,array_B.length])

sum = vec_a.dot(vec_b.transpose)

I am not sure how the speed will compare using pure Ruby but I imagine it to be faster, especially for large and sparse vectors.

Henry Chinner
  • 429
  • 3
  • 9
1

This is another way:

array_A.zip(array_B).inject(0) {|r, (a, b)| r + (a * b)}
rubyprince
  • 17,559
  • 11
  • 64
  • 104
1

Since speed is our primary criterion, I'm going to submit this method as it's fastest according to Peter's benchmarks.

sum, i, size = 0, 0, a1.size
while i < size
  sum += a1[i] * a2[i]
  i += 1
end
sum
Joshua Cheek
  • 30,436
  • 16
  • 74
  • 83
0
array1.zip(array2).map{|x| x.inject(&:*)}.sum
Infinite Recursion
  • 6,511
  • 28
  • 39
  • 51
Assaf Shomer
  • 1,429
  • 1
  • 15
  • 21
0

EDIT: Vector is not fastest (Marc Bollinger is totally right).

Here is the modified code with vector and n-times:

require 'benchmark'
require 'matrix'

array_A = [1, 2, 1, 4, 5, 3, 2, 6, 5, 8, 9]
array_B = [3, 2, 4, 2, 5, 1, 3, 3, 7, 5, 4]

vector_A = Vector[*array_A]
vector_B = Vector[*array_B]

def matrix_method a1, a2
  (Matrix.row_vector(a1) * Matrix.column_vector(a2)).element(0,0)
end

def vector_method a1, a2
  a1.inner_product(a2)
end

n = 100000

Benchmark.bmbm do |b|
  b.report('matrix method') { n.times { matrix_method(array_A, array_B) } }
  b.report('array walking') { n.times { (0...array_A.count).to_a.inject(0) {|r, i| r + array_A[i]*array_B[i]} } }
  b.report('array walking without to_a') { n.times { (0...array_A.count).inject(0) {|r, i| r + array_A[i]*array_B[i]} } }
  b.report('array zip') { n.times { array_A.zip(array_B).map{|i,j| i*j }.inject(:+) } }
  b.report('array zip 2') { n.times { array_A.zip(array_B).inject(0) {|r, (a, b)| r + (a * b)} } }
  b.report('while loop') do
    n.times do
      sum, i, size = 0, 0, array_A.size
      while i < size
        sum += array_A[i] * array_B[i]
        i += 1
      end
      sum
    end
  end
  b.report('vector') { n.times { vector_method(vector_A, vector_B) } }
end

And the results:

Rehearsal --------------------------------------------------------------
matrix method                0.860000   0.010000   0.870000 (  0.911755)
array walking                0.290000   0.000000   0.290000 (  0.294779)
array walking without to_a   0.190000   0.000000   0.190000 (  0.215780)
array zip                    0.420000   0.010000   0.430000 (  0.441830)
array zip 2                  0.340000   0.000000   0.340000 (  0.352058)
while loop                   0.080000   0.000000   0.080000 (  0.085314)
vector                       0.310000   0.000000   0.310000 (  0.325498)
----------------------------------------------------- total: 2.510000sec

                                 user     system      total        real
matrix method                0.870000   0.020000   0.890000 (  0.952630)
array walking                0.290000   0.000000   0.290000 (  0.340443)
array walking without to_a   0.220000   0.000000   0.220000 (  0.240651)
array zip                    0.400000   0.010000   0.410000 (  0.441829)
array zip 2                  0.330000   0.000000   0.330000 (  0.359365)
while loop                   0.080000   0.000000   0.080000 (  0.090099)
vector                       0.300000   0.010000   0.310000 (  0.325903)
------

Too bad. :(

Colin Curtin
  • 2,093
  • 15
  • 17
  • Vector is fastest on that list, in that example, because you're not running it n.times :) When updating to run `n.times`, I get a result of 0.37 on my machine, which places is [just slightly] second-last. I'm actually fairly surprised it's as slow as it is, given that it's [functionally very similar](https://github.com/ruby/ruby/blob/trunk/lib/matrix.rb#L2007-L2010) to the while loop implementation, but with a `yield` for each element in both vectors – Marc Bollinger Jun 01 '16 at 23:36
  • Updated my solution - thanks! It seems like it would be an easy fix to make it more performant, then. – Colin Curtin Jun 08 '16 at 18:19