I'm trying to export a large amount of data from a database to a CSV file, but it is taking a very long time and I fear I'll run into major memory issues.

Does anyone know of a better way to export a CSV without the memory build-up? If so, can you show me how? Thanks.

Here's my controller:

def users_export
  @todays_date = Time.now.strftime("%m-%d-%Y")
  @outfile = @todays_date + ".csv"

  @users = User.select('id, login, email, last_login, created_at, updated_at')

  FasterCSV.open("users_export.csv", "w+") do |csv|
    csv << [ @todays_date ]

    csv << [ "id","login","email","last_login", "created_at", "updated_at" ]
    @users.find_each do |u|
      csv << [ u.id, u.login, u.email, u.last_login, u.created_at, u.updated_at ]
    end
  end

  send_file "users_export.csv",
    :type => 'text/csv; charset=iso-8859-1; header=present',
    :disposition => "attachment",
    :filename => @outfile
end

2 Answers

You're building up one giant string, so the entire CSV file has to stay in memory. You're also loading all of your users at once, which takes up a lot of memory too. It won't make any difference if you only have a few hundred or a few thousand users, but at some point you will probably need to do two things.

Use

User.find_each do |user|
  csv << [...]
end

This loads users in batches (1000 by default) rather than all of them.
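To make the batching idea concrete, here is a rough plain-Ruby sketch of what loading in batches amounts to. It is not ActiveRecord's actual implementation: TABLE, FETCHES, fetch_batch and each_in_batches are hypothetical names standing in for the users table and the queries find_each would issue.

```ruby
# Hypothetical in-memory "table"; a real app would query the database instead.
TABLE = (1..2500).map { |i| { :id => i, :login => "user#{i}" } }

# Records the size of every batch fetched, so the batching is visible.
FETCHES = []

# Stand-in for one query: at most `limit` rows with id greater than `after_id`.
def fetch_batch(after_id, limit)
  batch = TABLE.select { |row| row[:id] > after_id }.first(limit)
  FETCHES << batch.size
  batch
end

# Yields every row, but only holds one batch in memory at a time.
def each_in_batches(batch_size = 1000)
  last_id = 0
  loop do
    batch = fetch_batch(last_id, batch_size)
    batch.each { |row| yield row }
    break if batch.size < batch_size # a short batch means nothing is left
    last_id = batch.last[:id]
  end
end

count = 0
each_in_batches { |_row| count += 1 }
puts "rows seen: #{count}, batch sizes: #{FETCHES.inspect}"
```

Each pass only needs the last id it saw to find the next batch, which is why the rows have to be walked in primary key order.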

You should also look at writing the CSV to a file rather than buffering the entire thing in memory. Assuming you have created a temporary file, this will write your CSV to it:

FasterCSV.open('/path/to/file','w') do |csv|
  ...
end

You can then use send_file to send it. If you already have a file open, FasterCSV.new(io) should work too.
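As a minimal sketch of the write-to-a-tempfile approach, the snippet below uses the standard library's CSV class (on Ruby 1.9 and later, FasterCSV was merged into the standard library as CSV, with the same open/<< interface). The each_user and export_users methods and the sample records are hypothetical stand-ins; in the real controller the rows would come from User.find_each and the path would be handed to send_file.

```ruby
require 'csv'
require 'tempfile'

HEADERS = %w[id login email]

# Hypothetical stand-in for User.find_each: yields one record at a time.
def each_user
  1.upto(5) do |i|
    yield({ :id => i, :login => "user#{i}", :email => "user#{i}@example.com" })
  end
end

# Writes the header and then one row per record straight to the file,
# so the full CSV is never assembled as a single string in memory.
def export_users(path)
  CSV.open(path, 'w') do |csv|
    csv << HEADERS
    each_user { |u| csv << [u[:id], u[:login], u[:email]] }
  end
  path
end

tempfile = Tempfile.new(['users_export', '.csv'])
tempfile.close                  # CSV.open reopens the file by path
export_users(tempfile.path)
puts File.read(tempfile.path)   # in a controller, send_file would go here
tempfile.unlink
```

Using a Tempfile rather than a fixed name such as users_export.csv also avoids two simultaneous requests writing to the same file.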

Lastly, on Rails 3.1 and higher you might be able to stream the CSV file as you create it, but that isn't something I've tried before.
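For what it's worth, the core of a streaming approach would be an object that produces the CSV one line at a time instead of all at once. The sketch below shows only that part, in plain Ruby with the standard library's CSV class; csv_line_stream and SAMPLE_USERS are hypothetical names. Handing such an object to Rails as the response body is an assumption about how the controller would be wired up and is not shown or tested here.

```ruby
require 'csv'

# Hypothetical sample data standing in for records loaded with find_each.
SAMPLE_USERS = [
  { 'id' => 1, 'login' => 'alice' },
  { 'id' => 2, 'login' => 'bob' }
]

# Returns an Enumerator that produces one CSV line per call to the block.
# Nothing is generated until a consumer starts iterating.
def csv_line_stream(records, headers)
  Enumerator.new do |out|
    out << CSV.generate_line(headers)
    records.each do |record|
      out << CSV.generate_line(headers.map { |h| record[h] })
    end
  end
end

stream = csv_line_stream(SAMPLE_USERS, %w[id login])
stream.each { |line| print line } # a web server would write each line to the client
```

Because each line is generated on demand, memory use stays roughly constant no matter how many records are exported, provided the records themselves are also loaded in batches.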

Frederick Cheung
  • I changed my code according to the suggested solutions (let me know if it looks like what you had in mind please) but it still seems to take an extremely long time. Any more ideas? Also do you know how I could stream the csv file as I create it? Can't seem to find anything on how to do that.. thanks! –  Jun 29 '12 at 07:19
  • Note: I'm using rails version 3.0.9 –  Jun 29 '12 at 08:14
In addition to the tips on CSV generation, be sure to optimize the call to the database as well. Select only the columns you need.

# No .order here: find_each always iterates in primary key order and ignores any other ordering.
@users = User.select('id, login, email, last_login, created_at, updated_at')
@users.find_each do |user|
   ...
end

If you have, for example, 1000 users, and each has password, password_salt, city, country, ... columns, then several thousand fewer values are transferred from the database, created as Ruby objects, and finally garbage collected.

Meier