
Let's say that I need to do complex calculations for 100 users. My current configuration looks like this:

producer

require 'bunny'
require 'json'

class Producer
  class << self
    def publish(target, options = {})
      # A new connection and channel are opened for every publish call.
      connection = Bunny.new(some_params).start
      channel    = connection.create_channel
      exchange   = channel.fanout("#{target}_exchange", durable: true)

      exchange.publish(options.to_json)
    end
  end
end

MassComplexCalculations worker

module UsersWorkers
  class MassComplexCalculations
    include Sneakers::Worker

    from_queue "#{ENV['RAILS_ENV']}.users.mass_complex_calculations_queue",
               exchange: "#{ENV['RAILS_ENV']}.users.mass_complex_calculations_exchange"

    def work(options)
      parsed_options = JSON.parse(options)

      # All the ids arrive in a single message, so the calculations below
      # run one after another inside a single consumer thread.
      ActiveRecord::Base.connection_pool.with_connection do
        User.where(id: parsed_options['ids']).each do |user|
          ::Services::Users::ComplexCalculations.call(user)
        end
        end
      end
      ack!
    end
  end
end

trigger the worker

Producer.publish("#{ENV['RAILS_ENV']}.users.mass_complex_calculations", ids: User.limit(100).ids)

I do not quite understand how AMQP allocates resources to perform tasks, or how I can influence this. Am I right that it would be better to run each calculation in a separate worker? For example:

CHANGED MassComplexCalculations worker

module UsersWorkers
  class MassComplexCalculations
    include Sneakers::Worker

    from_queue "#{ENV['RAILS_ENV']}.users.mass_complex_calculations_queue",
               exchange: "#{ENV['RAILS_ENV']}.users.mass_complex_calculations_exchange"

    def work(options)
      parsed_options = JSON.parse(options)

      # Fan the batch out: republish each id as its own message so that
      # another worker can pick it up independently.
      ActiveRecord::Base.connection_pool.with_connection do
        parsed_options['ids'].each do |id|
          Producer.publish("#{ENV['RAILS_ENV']}.users.personal_complex_calculations", id: id)
        end
      end
      ack!
    end
  end
end

NEW PersonalComplexCalculations worker

module UsersWorkers
  class PersonalComplexCalculations
    include Sneakers::Worker

    from_queue "#{ENV['RAILS_ENV']}.users.personal_complex_calculations_queue",
               exchange: "#{ENV['RAILS_ENV']}.users.personal_complex_calculations_exchange"

    def work(options)
      parsed_options = JSON.parse(options)

      ActiveRecord::Base.connection_pool.with_connection do
        # The lookup needs a database connection too, so it belongs
        # inside the with_connection block.
        user = User.find(parsed_options['id'])
        ::Services::Users::ComplexCalculations.call(user)
      end
      ack!
    end
  end
end

In my understanding, there may be two options:

  1. the first implementation may be slower because it calls the service sequentially for each user, while in the second variant 100 workers would be doing the job in parallel at the same time
  2. there is no difference

So which approach is better? Or maybe even one of them is completely wrong?

Thanks in advance.

taras

1 Answer


Neither of your assumptions holds. You are not guaranteed to have 100 parallel workers, because Sneakers has a default thread pool size that you are not necessarily overriding:

https://github.com/jondot/sneakers/blob/master/lib/sneakers/worker.rb#L20
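
For comparison, here is a minimal sketch of how that concurrency is usually tuned; the file path and the numbers are illustrative, not recommendations:

# config/initializers/sneakers.rb (illustrative values)
require 'sneakers'

Sneakers.configure(
  amqp:     ENV['RABBITMQ_URL'], # assumed env var for the broker URL
  workers:  2,                   # worker processes started by the runner
  threads:  10,                  # consumer threads per process
  prefetch: 10                   # messages delivered per consumer before ack
)

# ...the same threads/prefetch options can also be passed per worker
# through from_queue.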

And if you do not have an ActiveRecord connection pool of at least 100 connections configured, your code will also block and wait at the with_connection call because of resource starvation.
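
As a rough illustration of that relationship (a sketch of my own, assuming Sneakers exposes its settings via Sneakers::CONFIG; the pool itself is set with the pool: key in config/database.yml):

# Illustrative only: compare the ActiveRecord pool size with the
# number of consumer threads in this process.
pool_size = ActiveRecord::Base.connection_pool.size
threads   = Sneakers::CONFIG[:threads]

if pool_size < threads
  warn "AR pool (#{pool_size}) < Sneakers threads (#{threads}): " \
       "with_connection will block until a connection is freed"
end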

In general, doing this sort of task in parallel is likely to be faster most of the time, but this is not guaranteed.

mcfinnigan