I was recently forced to move my app to Amazon and use auto-scaling, and I have stumbled onto an issue with cron jobs and automatic scaling.

I have a cron job running every 15 minutes which checks whether subscriptions should be charged. The query selects all subscriptions that are past due and attempts to charge them. It changes their status once processed, but they are fetched in a batch, and the process takes 1-3 minutes.

If I have multiple instances with the same cron job, the jobs could fire simultaneously and charge the subscriptions multiple times. This has actually happened once.

What is the best approach here? Somehow locking the table?

I am using Amazon Elastic Beanstalk and Symfony 3.

starball
  • Best approach is a queue, with one queue job per subscription charge request. See RabbitMQ, AMQP, Pheanstalk, etc. Otherwise, set up a tmp lock file with a unique path/name: if the file exists, your other cron doesn't start; otherwise it starts, creates the tmp file (`touch('/tmp/uniquely.lock')`), and removes it at the end or when an exception occurs or the script exits. The problem with the latter is that you need to monitor whether it's running, otherwise it may not run at all. – Jared Farrish Feb 22 '17 at 23:37
  • http://queues.io/ You might also add some Amazon-related tags to the question. – Jared Farrish Feb 22 '17 at 23:44
  • But how would the temp file be shared across Amazon instances? These are separate servers –  Feb 22 '17 at 23:45
  • Oh right. Yeah, I would use a worker queue. Although really all you need is a common data store for the "tmp file" method, which should be easy enough with AWS. – Jared Farrish Feb 22 '17 at 23:45
  • I will check out queues. I have worked a little with the JMSQueueBundle, but it also seems to depend on a single instance of it running via supervisord –  Feb 22 '17 at 23:48
  • Amazon has [SQS](https://aws.amazon.com/sqs/). – Jared Farrish Feb 22 '17 at 23:49
  • Thanks, looking into it now –  Feb 22 '17 at 23:49
  • Don't know if you're using commands per se, but [Queue-ing Symfony commands via Amazon Sqs](http://branchbit.github.io/SqsCommandQueueBundle/) (it's a Symfony Bundle). – Jared Farrish Feb 22 '17 at 23:52
  • I'm a step further and I'm having problems, Please have a look at my question [AWS Autoscaling Group EC2 instances go down during cron jobs](https://stackoverflow.com/questions/66271688/aws-autoscaling-group-ec2-instances-go-down-during-cron-jobs) – Yevgeniy Afanasyev Feb 23 '21 at 03:31
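
The "common data store" variant of the lock-file idea from the comments could be sketched roughly as below. This is only an illustration under stated assumptions: `SharedStore` is a hypothetical in-memory stand-in for whatever store all instances can reach, and `run_exclusively` is a made-up helper name. A real setup would need the store to offer an atomic "take the lock if nobody holds it" operation. The expiry time addresses the concern raised above that a crashed job could leave the lock behind and block every later run.

```ruby
require 'monitor'

# Hypothetical in-memory stand-in for a data store shared by all instances.
# A real deployment would need a store with an atomic "insert if absent"
# operation (for example a uniquely keyed database row).
class SharedStore
  def initialize
    @locks = {}
    @monitor = Monitor.new
  end

  # Atomically take the lock unless an unexpired holder exists.
  # Returns true if the caller now owns the lock.
  def acquire(name, owner, ttl_seconds, now = Time.now)
    @monitor.synchronize do
      current = @locks[name]
      return false if current && current[:expires_at] > now

      @locks[name] = { owner: owner, expires_at: now + ttl_seconds }
      true
    end
  end

  # Release the lock only if the caller still owns it.
  def release(name, owner)
    @monitor.synchronize do
      @locks.delete(name) if @locks[name] && @locks[name][:owner] == owner
    end
  end
end

# Runs the block only if this instance wins the lock.
# Returns true if the block ran, false if another instance held the lock.
def run_exclusively(store, name, owner, ttl_seconds)
  return false unless store.acquire(name, owner, ttl_seconds)

  begin
    yield
  ensure
    store.release(name, owner)
  end
  true
end
```

Each instance's cron job would wrap the charging batch in `run_exclusively`, passing its own instance ID as `owner` and a TTL comfortably longer than the 1-3 minute run time.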

2 Answers

At the very least, you can use a dedicated micro instance for subscription charging (not auto-scaled, of course) that runs only the cron jobs. This is the simplest approach and also the safest: you move the subscription handling logic off the front-end servers, which could potentially be hacked, onto a server in a VPC subnet that isn't reachable from the public internet.

If you don't want to do that, there is another approach. You mentioned you use Beanstalk. The beanstalkd work queue (a separate project from Amazon Elastic Beanstalk, despite the similar name) supports delayed jobs.

So possible approach is:

1) When you create a subscription, calculate when it should be charged, then push a job with the calculated delay to a Beanstalk tube.

2) A worker then receives the job (with the subscription) on time. Only one worker will get any particular job, so this works with autoscaling.

3) In the worker, check the subscription (it may have been deleted or become inactive, etc.) and, if it is ready to be charged, run the charging code. Then calculate the next charging time and push a new delayed job (with the subscription) to the queue.

Beanstalk has a Symfony bundle and a powerful PHP library.
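
The three steps above can be sketched with a small in-memory simulation. This is not a real queue client: `DelayedTube`, `work_once`, `PERIOD`, and the shape of the `subscriptions` hash are all hypothetical names chosen for illustration. The simulation also removes a job as soon as it is reserved, whereas a real work queue would typically keep it reserved until the worker explicitly deletes it.

```ruby
# In-memory simulation of a delayed-job tube (not a real queue client).
class DelayedTube
  Job = Struct.new(:id, :payload, :ready_at)

  def initialize
    @jobs = []
    @next_id = 0
    @mutex = Mutex.new
  end

  # Step 1: queue a payload that becomes visible `delay` seconds after `now`.
  def put(payload, delay:, now:)
    @mutex.synchronize do
      @next_id += 1
      @jobs << Job.new(@next_id, payload, now + delay)
      @next_id
    end
  end

  # Step 2: hand the earliest ready job to exactly one caller,
  # or return nil if no job is due yet.
  def reserve(now:)
    @mutex.synchronize do
      job = @jobs.select { |j| j.ready_at <= now }.min_by(&:ready_at)
      @jobs.delete(job) if job
      job
    end
  end
end

PERIOD = 30 * 24 * 3600 # hypothetical billing period, in seconds

# Step 3: re-check the subscription, charge it, and schedule the next charge.
# `subscriptions` maps id => { active: true/false } and `charge` is any
# callable taking a subscription id (both are assumptions for this sketch).
def work_once(tube, subscriptions, charge, now:)
  job = tube.reserve(now: now)
  return nil unless job

  sub = subscriptions[job.payload]
  return :skipped unless sub && sub[:active]

  charge.call(job.payload)
  tube.put(job.payload, delay: PERIOD, now: now)
  :charged
end
```

Because `reserve` hands each job to a single caller, two workers polling at the same moment cannot both charge the same subscription, which is the property the answer relies on.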

E.K.

You can make the job run on only one instance, i.e. have the charge-subscription functionality execute on a single instance. Use the AWS API to fetch all running instances, then compare them with the ID of the instance the code is currently running on.

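require 'aws-sdk-ec2'
require 'net/http'

# 'region', 'IAM_KEY' and 'IAM_SECRET' are placeholders for your own values.
ec2 = Aws::EC2::Resource.new(
  region: 'region',
  credentials: Aws::Credentials.new('IAM_KEY', 'IAM_SECRET')
)

# Ask the instance metadata service for this server's instance ID.
metadata_endpoint = 'http://169.254.169.254/latest/meta-data/'
current_server_id = Net::HTTP.get(URI.parse(metadata_endpoint + 'instance-id'))

# Collect the IDs of all running instances.
instances = []
ec2.instances.each do |i|
  instances << i.id if i.state.name == 'running'
end

# Only one instance runs the job; taking the smallest ID means every
# instance picks the same one regardless of the order the API returns.
if instances.min == current_server_id
  # your functionality
end
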
fixatd
Deeksha Bilochi