
I currently have a Windows instance on AWS which runs a Windows scheduled task to execute a .NET script that processes the day's orders.

I have recently load-balanced a few instances using ELB, and this is all working fine.

The question is: how do I set up the scheduled task so that not all the instances run it? I've looked into OpsWorks, Simple Workflow, etc. on AWS, but it is confusing which one I should be focusing on for this relatively simple task.

Thanks

Raj
  • Hi Raj, I was going through your query and I'm currently stuck too. I'm going to have an ASP.NET page on Amazon, and I need to call it every hour for sending emails. How can I proceed with that? I have an open question on this; you can reply to me there. Thanks. – Sunil Nov 03 '14 at 13:11

2 Answers


You might be able to use Amazon SWF, but that seems like overkill, and it is complex.

There is also AWS Data Pipeline. It's probably the right answer, but it requires a bit of reading to set up.

The simple thing to do is to schedule your job on all the boxes, and have a database declare the winner.

A) You can have the job "lock" the unprocessed orders in the DB, then process them. As long as you lock in a single transaction, the other workers will get zero orders (or only a few new ones) to process.
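Option A can be sketched with a tiny in-memory database. The `orders` schema and worker names below are hypothetical; the point is that one `UPDATE` in a single transaction claims all unlocked rows, so a second worker running the same statement claims nothing:

```python
import sqlite3

# Hypothetical orders table; worker = '' means the order is unclaimed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, worker TEXT DEFAULT '')")
conn.executemany("INSERT INTO orders (id) VALUES (?)", [(1,), (2,), (3,)])
conn.commit()

def claim_orders(conn, worker):
    # A single UPDATE is a single transaction: only unclaimed rows are
    # tagged, so concurrent workers cannot claim the same order twice.
    cur = conn.execute("UPDATE orders SET worker = ? WHERE worker = ''", (worker,))
    conn.commit()
    return cur.rowcount  # number of orders this worker locked

first = claim_orders(conn, "box-a")   # claims all 3 orders
second = claim_orders(conn, "box-b")  # claims 0: box-a already locked them
```

Each worker then processes only the rows tagged with its own name.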

B) You can create a special table with a single row that the job will lock. Something like "update work_table set worker='mybox', work_start = now() where worker = ''".
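Option B, the single-row lock table, looks like this as a sketch (table and worker names are hypothetical; the `UPDATE` mirrors the one quoted above):

```python
import sqlite3
import time

# Hypothetical lock table with exactly one row; worker = '' means free.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE work_table (worker TEXT, work_start REAL)")
conn.execute("INSERT INTO work_table VALUES ('', NULL)")
conn.commit()

def try_acquire(conn, worker):
    # The UPDATE only matches while no one holds the lock, so exactly
    # one of the competing scheduled tasks wins the race.
    cur = conn.execute(
        "UPDATE work_table SET worker = ?, work_start = ? WHERE worker = ''",
        (worker, time.time()),
    )
    conn.commit()
    return cur.rowcount == 1

won_a = try_acquire(conn, "box-a")  # True: lock acquired
won_b = try_acquire(conn, "box-b")  # False: box-a holds the lock
```

Whichever box gets `True` runs the job; the others simply exit.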

If you want to be robust to the worker box dying, you have to create more complex rules: the other workers can hang around until the first worker marks its job as complete. If the job doesn't complete in a timely manner, they can assume the first worker died, steal the lock from it, and run the job themselves.
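A minimal sketch of that lock-stealing rule, extending the single-row lock table with a `done` flag (the timeout value and schema are assumptions for illustration):

```python
import sqlite3
import time

TIMEOUT = 60  # seconds before a running job is presumed dead (hypothetical)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE work_table (worker TEXT, work_start REAL, done INTEGER)")
# Simulate a worker that grabbed the lock an hour ago and never finished.
conn.execute("INSERT INTO work_table VALUES ('dead-box', ?, 0)", (time.time() - 3600,))
conn.commit()

def steal_if_stale(conn, worker):
    # Steal the lock only when the previous run is both unfinished
    # (done = 0) and stale (started more than TIMEOUT seconds ago).
    now = time.time()
    cur = conn.execute(
        "UPDATE work_table SET worker = ?, work_start = ? "
        "WHERE done = 0 AND work_start < ?",
        (worker, now, now - TIMEOUT),
    )
    conn.commit()
    return cur.rowcount == 1

stolen = steal_if_stale(conn, "box-b")  # True: dead-box's lock was stale
```

A worker that finishes normally would set `done = 1` so its lock is never stolen mid-run on the next day.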

If you don't have a database, you can always use SimpleDB or DynamoDB. Hitting the DB a few times per day will definitely fit in the free tier.

The whole thing will only be 20-50 lines of code if you do it right.

BraveNewCurrency
  • Yes, I did look into AWS Data Pipeline, but it didn't seem to allow me to just execute a script/URL. It seemed to be more related to database operations, backups, etc., and there is very little online help around it. The database route you suggest is a fair option and will probably be the way forward if I can't find any more help on AWS in the next week! – Raj Dec 30 '13 at 08:31

Assuming you need to run the workflow once a day, you can generate a workflow ID containing the date; SWF's workflow-ID uniqueness will ensure only one instance ever gets triggered.

If you want n instances, you can have a parent workflow (a singleton, ensured by the workflow ID) which spawns n child workflows.
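Generating the date-based ID is trivial; the sketch below assumes a hypothetical `process-orders` prefix. SWF rejects a second StartWorkflowExecution call with the same workflow ID while one is already open, so only the first scheduled task to fire actually starts the daily run:

```python
from datetime import date

def daily_workflow_id(prefix="process-orders"):
    # Every box computes the same ID on the same day, e.g.
    # "process-orders-2013-12-30", so SWF deduplicates the starts.
    return f"{prefix}-{date.today().isoformat()}"
```

Each instance's scheduled task calls StartWorkflowExecution with this ID and simply ignores the "already started" error if it loses the race.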