I have a Rails application with a Document model that has an available flag. Each document is uploaded to an external server, where it is not immediately available (it takes time to propagate). I'd like to poll for availability and update the model once the document is available.

I'm looking for the most performant solution for this process (the service does not offer callbacks):

  1. Document is uploaded to app
  2. app uploads to external server
  3. app polls url (http://external.server.com/document.pdf) until available
  4. app updates model Document.available = true

I'm stuck on 3. I'm already using Sidekiq in my project. Is that an option, or should I use a completely different approach, such as a cron job?

Documents are uploaded continuously, so it seems sensible to first query the database or Redis for Documents that are not yet available.

Damien Roche

1 Answer

See this answer: Making HTTP HEAD request with timeout in Ruby

Basically, you issue a HEAD request for the known URL and re-check asynchronously until you get a 200 back (with a 5-second delay between attempts, or whatever suits you).

Do this from your controller after the document is uploaded:

Document.delay.poll_for_finished(@document.id)

And then in your document model:

def self.poll_for_finished(document_id)
  # find_by_id returns nil (instead of raising) if the record has been deleted
  document = Document.find_by_id(document_id)
  # make sure the document exists and should still be polled
  return unless document && document.continue_polling?

  if document.remote_document_exists?
    document.available = true
  else
    document.poll_attempts += 1 # assumes you care how many times you've checked; could be ignored
    # re-enqueue the check instead of sleeping, so the worker is free in between
    Document.delay_for(5.seconds).poll_for_finished(document.id)
  end
  document.save
end

def continue_polling?
  # this can be more or less sophisticated:
  # keep polling only while the document is unavailable and attempts remain
  !available && poll_attempts < 5
end

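# needs: require 'net/http'
def remote_document_exists?
  # Net::HTTP.start takes a host name, not a URL. The timeouts are passed as
  # options so they are already in effect when the connection is opened.
  Net::HTTP.start('external.server.com', 80, open_timeout: 2, read_timeout: 2) do |http|
    # path is assumed to be an attribute on Document, e.g. "/document.pdf"
    http.head(path).code == '200'
  end
end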

This is still a blocking operation: opening the Net::HTTP connection blocks the worker if the server you're contacting is slow or unresponsive, until the request times out. If you're worried about that, use Typhoeus. See this answer for details: What is the preferred way of performing non blocking I/O in Ruby?
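
As a rough sketch of bounding that blocking time with the standard library alone: the helper below passes both timeouts to Net::HTTP.start and treats timeouts and connection errors as "not available yet". The name remote_file_available? and the 2-second default are assumptions for illustration, not part of the answer's code.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: true only if a HEAD request to +url+ returns 200
# within the given timeouts; any timeout or connection failure counts as false.
def remote_file_available?(url, timeout: 2)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == 'https',
                  open_timeout: timeout,   # max seconds to establish the connection
                  read_timeout: timeout) do |http| # max seconds to wait for the response
    http.head(uri.request_uri).code == '200'
  end
rescue Timeout::Error, SystemCallError, SocketError
  # Net::OpenTimeout and Net::ReadTimeout are Timeout::Error subclasses;
  # SystemCallError covers refused/reset connections, SocketError covers DNS failures.
  false
end
```

With these settings a single check can still hold the worker for roughly the open timeout plus the read timeout, so shorter values mean a worker is released sooner when the external server is slow.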

olamork
  • This seems like a waste of a Sidekiq worker. It would be better to fail the job if the response isn't 200 and retry the job later (say, after one minute). The worker can move on quickly and try a different file. – davidcelis Jul 25 '13 at 22:59
  • @davidcelis: I updated my answer based on your feedback. Further refinements could include shortening the timeouts or switching to Typhoeus and batching several checks at once. – olamork Jul 26 '13 at 14:07
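
The fail-and-retry variant davidcelis describes could look roughly like the sketch below, which leans on Sidekiq's built-in retry mechanism instead of re-enqueueing by hand. DocumentPollWorker, DocumentNotReady, and the retry limit of 10 are hypothetical names and values chosen for illustration. The one-minute interval comes from the comment, and remote_document_exists? is the method from the answer.

```ruby
# Raised to signal "not there yet". Sidekiq treats any exception raised in
# perform as a failed job and schedules a retry.
class DocumentNotReady < StandardError; end

class DocumentPollWorker
  # Guarded so this sketch also loads in plain Ruby without the sidekiq gem.
  if defined?(Sidekiq)
    include Sidekiq::Worker
    sidekiq_options retry: 10        # assumed limit: give up after 10 retries
    sidekiq_retry_in { |_count| 60 } # wait about a minute between attempts
  end

  def perform(document_id)
    document = Document.find(document_id)
    return if document.available # nothing left to do

    unless document.remote_document_exists?
      # Fail fast so the worker can move on to another job; Sidekiq retries later.
      raise DocumentNotReady, "document #{document_id} is not available yet"
    end

    document.available = true
    document.save
  end
end
```

From the controller this would be enqueued with DocumentPollWorker.perform_async(@document.id). One trade-off to be aware of: every failed check shows up as a job failure in Sidekiq's retry queue and error reporting, which may or may not be desirable.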