I've been playing with Ruby's EventMachine for some time now and I think I understand its basics.
However, I am not sure how to read a large file (120 MB) efficiently. My goal is to read a file line by line and write every line into a Cassandra database (the same should apply to MySQL, PostgreSQL, MongoDB, etc.; I'm using Cassandra because its client supports EM explicitly). The simple snippet below blocks the reactor, right?
require 'rubygems'
require 'cassandra'
require 'thrift_client/event_machine'

EM.run do
  Fiber.new do
    rm = Cassandra.new('RankMetrics', "127.0.0.1:9160", :transport => Thrift::EventMachineTransport, :transport_wrapper => nil)
    rm.clear_keyspace!
    begin
      file = File.new("us_100000.txt", "r")
      # file.gets is a plain blocking read; nothing here yields between lines.
      while (line = file.gets)
        domain = line.chomp # strip the trailing newline before storing
        rm.insert(:Domains, domain.downcase, {'domain' => domain})
      end
    rescue => err
      puts "Exception: #{err}"
    ensure
      file.close if file # close the file even if an insert raises
    end
    EM.stop
  end.resume
end
But what's the right way to read a file asynchronously?
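One direction I could imagine, as a rough sketch rather than a confirmed answer: read the file in small batches and hand control back to the reactor between batches. The helper name `each_line_in_batches` and its `schedule:` and `on_done:` parameters are made up for illustration; they are not part of EventMachine. Each individual `file.gets` is still a blocking call, so this only limits how long any single callback holds the reactor.

```ruby
# Hypothetical helper (not an EventMachine API): reads `path` in batches of
# `batch_size` lines and asks `schedule` to run the next batch later.
def each_line_in_batches(path, batch_size: 1000, schedule:, on_done: -> {}, &on_line)
  file = File.open(path, "r")
  step = lambda do
    begin
      # Read at most batch_size lines in this callback.
      lines = []
      while lines.size < batch_size && (line = file.gets)
        lines << line.chomp
      end
      lines.each(&on_line)

      if file.eof?
        file.close
        on_done.call
      else
        schedule.call(step) # more lines left: continue on a later tick
      end
    rescue StandardError
      file.close unless file.closed? # do not leak the handle on errors
      raise
    end
  end
  step.call
end

# Assumed usage inside EM.run (requires the eventmachine gem):
#
#   each_line_in_batches("us_100000.txt",
#                        schedule: ->(step) { EM.next_tick(step) },
#                        on_done:  -> { EM.stop }) do |domain|
#     # insert `domain` into the database here
#   end
```

Since `schedule` is just a callable, the batching logic can be tried without a running reactor. With the fiber-based Thrift transport, I assume each batch would additionally have to run inside its own Fiber.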