
I want to start a task that connects to an IMAP server and fetches email data to store in a database.

Such a job would need to handle a large volume, since it must support many IMAP accounts. I want to use Akka's cluster capabilities to run these jobs on a predefined set of machines in a network and to retry when errors occur while fetching data from IMAP hosts.

I want to create an Akka cluster that starts the IMAP fetch jobs (via an actor?).

IMAP is a tricky protocol, and a connection to the remote host may fail. In that case the actor should retry connecting and fetching a configurable number of times.

Eventually it should act as my IMAP fetch back end.

How should I go about this?

Rakesh KR
Rakesh Waghela

2 Answers


We use Akka heavily, including to connect to and process new messages from users' email accounts using JavaMail/IMAP. Fault tolerance is an important part of the puzzle. Here's roughly how our backend is set up:

  1. Supervisor node has an actor that selects users out of the DB for processing
  2. IMAP worker actors notify supervisor when they're ready for work (for more on this "work pulling" architecture, see my colleague Ryan Tanner's blog post: http://blog.goconspire.com/post/64901258135/akka-at-conspire-part-5-the-importance-of-pulling)
  3. Supervisor sends a ProcessAccount message--a custom object including a Gmail OAuth token (you could use traditional username and password credentials too)--to the idle IMAP worker.
  4. The IMAP worker uses JavaMail to read and process new messages. On an error, it sends a FailedProcessing message--a custom object including an error code and human-readable string--back to the supervisor actor. On success, it sends CompletedProcessing.
  5. The supervisor updates the user record in the database, including setting an error code if processing failed.
  6. In addition to periodically processing healthy accounts, the supervisor retries processing for failed accounts. Our use case is such that we only attempt re-processing once a day, but you could do that much more frequently.
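The message flow in the steps above can be sketched without any Akka dependency. What follows is only a rough, single-threaded illustration in plain Java, not the code described in this answer: method calls stand in for actor messages, a map stands in for the database, and the names `WorkPullingSketch`, `workerReady`, `requeueFailed` and `maxAttempts` are all made up for the example. The `maxAttempts` limit is an assumption added to cover the "configurable number of retries" asked for in the question.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class WorkPullingSketch {

    /** Step 3: the job handed to an idle worker (fields are illustrative). */
    static final class ProcessAccount {
        final String user;
        final String token;
        int attempts = 0; // how many times this account has been tried
        ProcessAccount(String user, String token) { this.user = user; this.token = token; }
    }

    /** Step 4: the worker's reply; errorCode 0 plays the role of CompletedProcessing. */
    static final class Result {
        final String user;
        final int errorCode;
        final String reason;
        private Result(String user, int errorCode, String reason) {
            this.user = user; this.errorCode = errorCode; this.reason = reason;
        }
        static Result completed(String user) { return new Result(user, 0, "ok"); }
        static Result failed(String user, int errorCode, String reason) {
            return new Result(user, errorCode, reason);
        }
    }

    /** A real worker would open the mailbox with JavaMail here. */
    interface ImapWorker {
        Result process(ProcessAccount job);
    }

    static final class Supervisor {
        private final Queue<ProcessAccount> pending = new ArrayDeque<>();
        private final List<ProcessAccount> failed = new ArrayList<>();
        private final Map<String, Integer> errorCodes = new HashMap<>(); // stand-in for the DB
        private final int maxAttempts; // assumed knob, not part of the original answer

        Supervisor(List<ProcessAccount> accounts, int maxAttempts) {
            pending.addAll(accounts);
            this.maxAttempts = maxAttempts;
        }

        /** Step 2: a worker asks for work. Returns false when nothing is pending. */
        boolean workerReady(ImapWorker worker) {
            ProcessAccount job = pending.poll();
            if (job == null) return false;
            job.attempts++;
            Result result;
            try {
                result = worker.process(job);
            } catch (RuntimeException e) {
                // A crashing worker is recorded as a failure instead of taking the supervisor down.
                result = Result.failed(job.user, -1, String.valueOf(e.getMessage()));
            }
            errorCodes.put(job.user, result.errorCode); // step 5: update the user record
            if (result.errorCode != 0 && job.attempts < maxAttempts) failed.add(job);
            return true;
        }

        /** Step 6: put failed accounts back in the queue; returns how many were re-queued. */
        int requeueFailed() {
            int n = failed.size();
            pending.addAll(failed);
            failed.clear();
            return n;
        }

        Integer errorCodeFor(String user) { return errorCodes.get(user); }
    }

    public static void main(String[] args) {
        Supervisor supervisor = new Supervisor(Arrays.asList(
                new ProcessAccount("alice", "token-a"),
                new ProcessAccount("bob", "token-b")), 3);
        // Hypothetical worker: bob's mailbox fails on the first attempt only.
        ImapWorker worker = job -> job.user.equals("bob") && job.attempts == 1
                ? Result.failed(job.user, 1, "connection refused")
                : Result.completed(job.user);
        while (supervisor.workerReady(worker)) { /* first pass over all accounts */ }
        System.out.println("bob after first pass: " + supervisor.errorCodeFor("bob"));
        supervisor.requeueFailed();
        while (supervisor.workerReady(worker)) { /* retry pass */ }
        System.out.println("bob after retry: " + supervisor.errorCodeFor("bob"));
    }
}
```

In an actual Akka setup each of these method calls would be an asynchronous message between actors on different nodes, and the retry pass would run on a schedule rather than immediately.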

Using Akka clustering, we keep workers separate from the supervisor. Combining this approach with the work pulling mechanism described above keeps us relatively tolerant to unrecoverable errors, e.g. OutOfMemoryErrors, in the workers.

pauljm
  • Thanks for outlining the approach, since you have solved a similar problem. Do you have some code I could refer to? I am an Akka noob :( – Rakesh Waghela Nov 17 '13 at 04:57
  • Please be a bit more respectful: Paul spent quite some time for a very detailed write-up. If you want someone to do your work for you then you’ll have to pay them for that. – Roland Kuhn Nov 17 '13 at 16:58
  • @RolandKuhn I understand what you mean; I never wanted the full source code, just an open source repository with something similar to refer to. I am certainly grateful for the detailed answer given by Paul. I have up-voted it already to comply with SO rules. As I can see the lead of the Akka project himself commenting beneath the answer, I have to accept it now :) Thanks for noticing this question. – Rakesh Waghela Nov 17 '13 at 17:05
  • Thanks Roland. Rakesh, I'm afraid I can't share more code than Ryan already posted on the blog, but there's some good stuff there. (Note that the link above is the fifth of five posts, most Akka-related.) – pauljm Nov 18 '13 at 18:16

Use JavaMail. Read the JavaMail FAQ.

Bill Shannon
  • Bill! Thanks for the pointers. I have seen your prompt replies to my earlier JavaMail questions. I already know how to fetch emails using the JavaMail API; my question was more about using it with Akka. :) – Rakesh Waghela Nov 17 '13 at 17:09