
I am seeing a lot of exceptions in my log:

A Mongo::OperationFailure occurred in foo#bar:

Mongo::OperationFailure
mongo (1.6.2) lib/mongo/util/tcp_socket.rb:76:in `read'

I am using Mongoid as my Ruby driver.

I'm not sure whether this is related to connection pooling, but in case it is, this is my mongoid.yml:

production:
  host: xxx
  port: 27017
  username: xxx
  password: xxx
  database: foo
  logger: false
  pool_size: 200
  max_retries_on_connection_failure: 5

I understand EC2 can have transient network issues, but this is almost becoming the norm. What's the best way to solve this problem?

Just for background information, I'm running JRuby 1.6.7.

randombits

2 Answers

  • What kind of EC2 instance are you running MongoDB on? It should be at least an m1.large.
  • How many servers in your MongoDB cluster? There should be at least 2 plus one arbiter. How are they configured?
  • Have you set your TCP keepalive timeout to 300 seconds?
  • Have you examined the basic stats on the database server using top and mongostat?
  • Have you installed and used the free MongoDB Monitoring Service from 10gen?

If you have used some monitoring tools, what have they told you? If you haven't, well, then use them and report back what you find.
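As a starting point for the keepalive question, here is a minimal Ruby sketch that reads the current keepalive idle time. It assumes a Linux host, where the value lives in /proc/sys/net/ipv4/tcp_keepalive_time; the helper names tcp_keepalive_seconds and keepalive_ok? are made up for this example and are not part of Mongoid or the mongo gem.

```ruby
# Assumption: Linux exposes the keepalive idle time (in seconds) at this path.
KEEPALIVE_PATH = '/proc/sys/net/ipv4/tcp_keepalive_time'

# Hypothetical helper: returns the keepalive idle time in seconds,
# or nil when the file is not readable (for example on non-Linux hosts).
def tcp_keepalive_seconds(path = KEEPALIVE_PATH)
  return nil unless File.readable?(path)
  Integer(File.read(path).strip)
end

# Hypothetical helper: true when the value is known and at or below the
# 300-second limit suggested in the list above.
def keepalive_ok?(seconds, limit = 300)
  !seconds.nil? && seconds <= limit
end
```

Calling keepalive_ok?(tcp_keepalive_seconds) on both the application host and the database host shows whether either side is still on a longer setting.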

Old Pro

This is probably related to the following comment in the connect method of lib/mongo/util/tcp_socket.rb:

# Connect nonblock is broken in current versions of JRuby

Here is the method:

def connect
  # Connect nonblock is broken in current versions of JRuby
  if RUBY_PLATFORM == 'java'
    require 'timeout'
    if @connect_timeout
      Timeout::timeout(@connect_timeout, OperationTimeout) do
        @socket.connect(@socket_address)
      end
    else
      @socket.connect(@socket_address)
    end
  else
    ... # nonblocking connect
  end
end
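The JRuby branch quoted above amounts to wrapping a blocking call in Timeout.timeout only when a timeout is configured. A minimal standalone sketch of that pattern follows; SketchOperationTimeout and blocking_call_with_timeout are names invented for this example, standing in for the driver's own classes, and this is not the driver's actual implementation.

```ruby
require 'timeout'

# Stand-in for the driver's timeout error; the name is an assumption.
class SketchOperationTimeout < StandardError; end

# Runs a blocking operation, bounding it with a timeout only when one is set.
# With no timeout, the call blocks for as long as the operation takes.
def blocking_call_with_timeout(timeout_seconds)
  if timeout_seconds
    Timeout.timeout(timeout_seconds, SketchOperationTimeout) { yield }
  else
    yield
  end
end
```

For example, blocking_call_with_timeout(0.05) { sleep 1 } raises SketchOperationTimeout, while passing nil lets the block run to completion without any bound.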

The error occurred here:

rescue Errno::EINTR, Errno::EIO, IOError 
  raise OperationFailure 
end

so it's probably an EIO/IOError.
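Because the rescue clause above collapses three different low-level errors into one OperationFailure, the log no longer says which one actually happened. The sketch below shows the same translation while keeping the original error class in the message, plus a small retry wrapper for transient failures. SketchOperationFailure, translate_io_errors, and with_retries are hypothetical names for illustration only; they are not part of the mongo gem.

```ruby
# Stand-in for Mongo::OperationFailure so the sketch runs without the driver.
class SketchOperationFailure < StandardError; end

# Translates low-level I/O errors into a single error type, like the rescue
# clause above, but records the original error class in the message.
def translate_io_errors
  yield
rescue Errno::EINTR, Errno::EIO, IOError => e
  raise SketchOperationFailure, "#{e.class}: #{e.message}"
end

# Hypothetical retry helper: runs the block up to max_attempts times,
# re-raising the last failure once the attempts are used up.
def with_retries(max_attempts, delay = 0)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue SketchOperationFailure
    raise if attempts >= max_attempts
    sleep(delay)
    retry
  end
end
```

Wrapping a read as with_retries(3) { translate_io_errors { ... } } would ride out a couple of transient failures, and the message of any error that still escapes would indicate whether it began as an EINTR, EIO, or IOError.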

Maybe try using MRI (the standard C Ruby) instead of JRuby?

Hope this helps.

(If I had to make an uneducated guess, which I will probably regret making at all, it would be that since JRuby has to use a blocking socket connect instead of a non-blocking one, an EIO/IOError occurs during read under a high volume of reads/connections.)

K Z