
I'm looking for a good, idiomatic Rails pattern or gem to handle the problem of inefficient after_commit model callbacks. We want to stick with a callback to guarantee data integrity, but we'd like it to run once whether it's for one record or for a whole batch of records wrapped in a transaction.

Here's a use-case:

A Portfolio has many positions.

On Position there is an after_commit hook to re-calculate numbers in reference to its sibling positions in the portfolio.
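For context, the setup looks roughly like this (model and method names are illustrative, not our actual code):

class Position < ActiveRecord::Base
  belongs_to :portfolio

  # Fires once for every saved Position, even when thousands are saved in one transaction.
  after_commit :recalculate_portfolio_figures

  private

  # Reads all sibling positions in the same portfolio and rewrites the derived numbers.
  def recalculate_portfolio_figures
    portfolio.recalculate_figures!(portfolio.positions) # illustrative method
  end
end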

That's fine for directly editing one position.

However...

We now have an importer that brings in lots of positions spanning many portfolios in one big INSERT. Each invocation of this callback queries all siblings, and it's invoked once per sibling, so reads are O(n**2) instead of O(n) and writes are O(n) where they should be O(1).

'Why not just put the callback on the parent portfolio?' Because the parent doesn't necessarily get touched during a relevant update. We can't risk the kind of inconsistent state that could result from leaving a gap like that.

Is there anything out there which can leverage the fact that we're committing all the records at once in a transaction? In principle it shouldn't be too hard to figure out which records changed.

A nice interface would be something like after_batch_commit which might provide a light object with all the changed data or at least the ids of affected rows.
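For instance, something along these lines (after_batch_commit and the batch object are hypothetical; nothing like this exists in Rails as far as I know):

class Position < ActiveRecord::Base
  belongs_to :portfolio

  # Hypothetical: runs once per committed transaction rather than once per record.
  after_batch_commit do |batch|
    # batch.ids => ids of every Position committed in this transaction.
    # One recalculation per affected portfolio instead of one per position.
    portfolio_ids = Position.where(id: batch.ids).pluck(:portfolio_id).uniq
    Portfolio.where(id: portfolio_ids).find_each do |portfolio|
      portfolio.recalculate_figures!(portfolio.positions) # illustrative method
    end
  end
end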

There are lots of unrelated parts of our app that are asking for a solution like this.

Adamantish

1 Answer


One solution could be to insert them all in one SQL statement and then validate them afterwards.

Possible ways of inserting them in a single statement are suggested in this post:

INSERT multiple records using ruby on rails active record

Or you could even build the SQL yourself to insert all the records in one trip to the database.

The code could look something like this:

max_id = Position.maximum(:id) || 0
Position.insert_many(data) # not actual code - use whatever bulk-insert mechanism you like
# Load only the newly inserted rows and run their validations after the fact.
faulty_positions = Position.where("id > ?", max_id).reject(&:valid?)
remove_and_or_log_faulty_positions(faulty_positions)

This way you only have to touch the database three times per N entries in your data. If the data set is large, it might be good to do it in batches, as you mention.
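For example, the post-insert pass could be batched with find_in_batches (a rough sketch, still assuming a placeholder insert_many and a remove_and_or_log_faulty_positions helper):

max_id = Position.maximum(:id) || 0

Position.insert_many(data) # not actual code - any bulk-insert mechanism works here

# Walk the newly inserted rows in fixed-size batches to keep memory flat on big imports.
Position.where("id > ?", max_id).find_in_batches(batch_size: 1_000) do |batch|
  faulty = batch.reject(&:valid?)
  remove_and_or_log_faulty_positions(faulty) if faulty.any?
end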

Albin
  • What I'm talking about is analogous to validation but is more about doing than checking. We're already doing one big insert with the bulk_import gem. The point is that we already have code to guarantee this work gets done in an `after_commit` callback. Writing that code again, slightly differently, for the batch-insert case seems too ugly for there to be no better solution. Also, there is yet another case where positions are batch-edited through a form. – Adamantish Jul 04 '16 at 17:38
  • But wouldn't my suggestion work for that case as well? Just change `.reject(&:valid?)` to `.each(&:doing_method)` and skip the remove-and-log step? – Albin Jul 05 '16 at 07:00
  • This isn't a question about how to make sure, ad hoc, that work is done before or after a batch job. We're already doing that. This is about hardening the procedure into a model callback to guarantee data integrity and DRY it up, but without massive inefficiency. – Adamantish Jul 05 '16 at 10:29