In Rails, both find_each and where retrieve data from a database through ActiveRecord.

You can pass your query condition to where, like:

c = Category.where(:name => 'Ruby', :position => 1)

And you can pass batch size to find_each, like:

Hedgehog.find_each(batch_size: 50).map{ |p| p.to_json }

But what's the difference between the following two snippets?

# code 1
Person.where("age > 21").find_each(batch_size: 50) do |person|
  # processing
end

# code 2
Person.where("age > 21").each do |person|
  # processing
end

Does code 1 retrieve 50 tuples per batch, while code 2 retrieves all tuples at once? A more detailed explanation is welcome.

My understanding is:

  1. Both where and find_each can be used for batch retrieval, but the user can define the batch size when using find_each.
  2. find_each does not support passing a query condition.

Please correct me if my understanding is wrong.

coderz

2 Answers

An ActiveRecord relation does not automatically load all records into memory.

When you call #each, all records will be loaded into memory. When you call #find_each, records will be loaded into memory in batches of the given batch size.

So when your query returns more records than the server's available memory can hold, #find_each is a great choice.
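To make the batching concrete, here is a minimal plain-Ruby sketch of how batch-by-batch retrieval can work. It is a simulation only, not ActiveRecord's actual implementation: the `Row` struct, the in-memory `TABLE`, and the helpers `fetch_batch` and `each_in_batches` are all hypothetical names invented for illustration, and each call to `fetch_batch` stands in for one SQL query.

```ruby
# Plain-Ruby simulation; no database or ActiveRecord is involved.
Row = Struct.new(:id, :age)

# Hypothetical in-memory "table" with 120 rows, ids 1..120.
TABLE = (1..120).map { |id| Row.new(id, 18 + id % 10) }

# Hypothetical stand-in for a single query of the form:
#   SELECT * FROM rows WHERE id > after_id ORDER BY id LIMIT limit
def fetch_batch(after_id, limit)
  TABLE.select { |row| row.id > after_id }.sort_by(&:id).first(limit)
end

# Yields every row while holding at most batch_size rows at a time.
# Returns the number of simulated queries that were issued.
def each_in_batches(batch_size)
  queries = 0
  last_id = 0
  loop do
    batch = fetch_batch(last_id, batch_size)
    queries += 1
    batch.each { |row| yield row }
    break if batch.size < batch_size # a short batch means no rows remain
    last_id = batch.last.id
  end
  queries
end

seen = 0
query_count = each_in_batches(50) { |_row| seen += 1 }
puts "rows seen: #{seen}, queries: #{query_count}"
```

With 120 rows and a batch size of 50, the sketch visits all 120 rows using three simulated queries (50, 50, and 20 rows), whereas a plain #each would correspond to one query that returns everything at once.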

#find_each works much like combining Ruby's lazy enumeration (#to_enum and #lazy) with #each_slice and then #each (very convenient).
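The analogy can be tried out in plain Ruby. The sketch below uses only core Enumerable methods; the `records` source and the `loaded` counter are hypothetical stand-ins for rows coming from a database, so this shows the enumeration pattern rather than anything ActiveRecord does internally.

```ruby
# Counts how many ids the lazy source has actually produced.
loaded = 0

# Hypothetical record source: an id is "loaded" only when it is requested.
records = (1..120).lazy.map { |id| loaded += 1; id }

# Walk the source in slices of 50, one slice in memory at a time.
batch_sizes = []
records.each_slice(50).each { |batch| batch_sizes << batch.size }
ids_for_full_run = loaded
puts "batch sizes: #{batch_sizes.inspect}, ids produced: #{ids_for_full_run}"

# Stopping early: asking for only the first slice produces only 50 ids.
loaded = 0
first_batch = records.each_slice(50).first
puts "first batch size: #{first_batch.size}, ids produced: #{loaded}"
```

The full run yields slices of 50, 50, and 20 ids, while the early stop produces just the first 50, which is the property that keeps memory use bounded.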

  • So code 1 may execute SQL multiple times (depending on the number of records and the batch size), while code 2 executes SQL only once? – coderz May 03 '15 at 08:50
  • Yes, that is my understanding. If you look at your development log or at the SQL output in the Rails console, you'll see something like `User.find_each(batch_size: 50) { |u| }` `SELECT "users".* FROM "users" ORDER BY "users"."id" ASC LIMIT 50` `SELECT "users".* FROM "users" WHERE ("users"."id" > 50) ORDER BY "users"."id" ASC LIMIT 50` ... and so on –  May 03 '15 at 09:18
  • `users = User.where("birth_day < ?", Date.today)` If we run this line, we haven't called `#each`. But are you sure that all the data isn't loaded into the `users` variable? – Jin Lim Feb 15 '22 at 17:27

Answering Jin Lim's question: the statement users = User.where("birth_day < ?", Date.today) does not load any data into the users variable, because the query has not been executed yet. Rails relations are lazily loaded, so where only builds the query. When you call each on users, the query runs and all matching records are loaded into memory.
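The deferred execution can be illustrated with a small plain-Ruby sketch. The `LazyRelation` class below is a hypothetical toy written for this answer, not ActiveRecord's relation class: it only records conditions in `where` and does the filtering (the simulated query) when `each` is called.

```ruby
# Hypothetical toy relation; it mimics deferred execution only.
class LazyRelation
  include Enumerable

  attr_reader :query_count

  def initialize(rows, conditions = [])
    @rows = rows
    @conditions = conditions
    @query_count = 0
  end

  # Returns a new relation with one more condition; nothing is filtered yet.
  def where(&condition)
    LazyRelation.new(@rows, @conditions + [condition])
  end

  # The simulated query runs here, when the records are actually needed.
  def each(&block)
    @query_count += 1
    @rows.select { |row| @conditions.all? { |c| c.call(row) } }.each(&block)
  end
end

people = LazyRelation.new([{ name: "Ann", age: 30 }, { name: "Bob", age: 19 }])
adults = people.where { |person| person[:age] > 21 }
queries_before = adults.query_count

names = adults.map { |person| person[:name] } # map calls each, so the query runs
queries_after = adults.query_count
puts "before: #{queries_before}, after: #{queries_after}, names: #{names.inspect}"
```

Building `adults` issues no simulated query; the count goes from 0 to 1 only once `map` (which relies on `each`) asks for the records.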