
I use a cron task to send emails to my app's users every Saturday at 9:00. If there are a lot of users, is that a problem? If it is, what can I do to improve my code? Can I specify a cron task like "from 9:00 to 23:00" to be sure that all users receive the email? I have heard about Task Queues but I don't know how to use them. Do I really need them?

EDIT

I finally managed to make Task Queue work with this code.

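from google.appengine.api import mail, taskqueue
from google.appengine.ext import webapp

# Members is the app's datastore model. sender_address, subject and html are
# assumed to be defined elsewhere; user_address is presumably read from each
# member (it is not set anywhere in this snippet).

class SendMailHandler(webapp.RequestHandler):
    # Cron target: enqueues one task per member.
    def get(self):
        members = Members.all()
        for member in members:
            taskqueue.add(url='/send', params={'sender_address': sender_address,
                                               'user_address': user_address,
                                               'subject': subject,
                                               'html': html})

class SendMail(webapp.RequestHandler):
    # Task handler: sends a single email from the task's parameters.
    def post(self):
        sender_address = self.request.get('sender_address')
        user_address = self.request.get('user_address')
        subject = self.request.get('subject')
        html = self.request.get('html')

        mail.send_mail(sender=sender_address, to=user_address, subject=subject, body='', html=html)

application = webapp.WSGIApplication([('/sendmail', SendMailHandler),
                                      ('/send', SendMail)], debug=True)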
tsil

2 Answers


App Engine cron tasks, like other tasks in App Engine, have 10 minutes to complete. If you need more time than that, you could use a backend or you could split up your sending into chunks across task queues.

EDIT: Here are the docs for task queues: https://developers.google.com/appengine/docs/python/taskqueue/

What I would do, if I were going to write this code (which, for reasons Nick detailed, I won't), is decide on some sort of sharding. Say you have a 'to' field in the 'members' db model: make 26 tasks, each of which handles all the email addresses that start with one letter ('a', 'b', etc.).

You may find that particular scheme results in a cruddy distribution -- maybe one task ends up doing 50% of the work, because for some reason most of your users have an email address that starts with 'm'. If this happens, you could instead shard based on a hash of the 'to' address. The point is to break your members up somehow and launch a task to deal with each chunk, with some identifier for the chunk as a parameter to the task. Writing the code and optimizing the sharding is left, as the saying goes, as an exercise for the reader. (though of course if you have specific questions about implementation, please ask!)
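
As a rough illustration of the hash-based variant, here is a minimal sketch in plain Python. The names `NUM_SHARDS`, `shard_for` and `group_by_shard` are made up for this example, and SHA-256 is just one convenient stable hash, not something the answer prescribes. On App Engine you would probably need to store the shard index on each member entity so that a task can query for only its own chunk; that part is an assumption and is not shown.

```python
import hashlib

NUM_SHARDS = 16  # assumed shard count; pick whatever suits your user base


def shard_for(address, num_shards=NUM_SHARDS):
    """Map an email address to a stable shard index in [0, num_shards)."""
    # Normalize so that 'A@Example.com ' and 'a@example.com' land together.
    normalized = address.strip().lower().encode("utf-8")
    digest = hashlib.sha256(normalized).hexdigest()
    return int(digest, 16) % num_shards


def group_by_shard(addresses, num_shards=NUM_SHARDS):
    """Group addresses by shard index; each group would become one task."""
    groups = {}
    for address in addresses:
        groups.setdefault(shard_for(address, num_shards), []).append(address)
    return groups
```

Each key of the dict returned by `group_by_shard` is the chunk identifier you would pass as a parameter to the task that handles that chunk.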

Moishe Lettvin
  • Can you please give a code sample with Task Queues based on my code? – tsil Apr 26 '12 at 19:37
  • @IsmaelToé Have you read the docs on Task Queues? They have several excellent getting-started examples. We're not your research assistant or your code monkey - it's expected that you take some time to try before asking for help. – Nick Johnson Apr 27 '12 at 03:27

I actually JUST had this exact problem. I've never submitted an answer to Stack Overflow before, so be kind.

What I do is use a cron job (scheduled at 3:30 AM Eastern) to kick off the work of adding tasks to the task queue. When you create each task, you can give it an ETA, which accepts a Python datetime value. That lets you add logic to decide whether the task should run now or whether to calculate a later ETA based on a timezone, etc.

Here is a chunk of code where I do that. This code is executed from my cron handler; the task it enqueues then sends out an email similar to how you are doing it. Note that I have timezone info stored in a dict, and I compare that to info I get from stored data. This might not be super flexible in the future, but for now it works for what I need to do. If you have other ways of getting that info, you could do this differently.

# Hours to delay each calendar's email relative to the 3:30 AM Eastern cron run.
bigtimezonedic = {
    'America/New_York': 0, 'America/Detroit': 0,
    'America/Kentucky/Louisville': 0, 'America/Kentucky/Monticello': 0,
    'America/Indiana/Indianapolis': 0, 'America/Indiana/Vincennes': 0,
    'America/Indiana/Winamac': 0, 'America/Indiana/Marengo': 0,
    'America/Indiana/Petersburg': 0, 'America/Indiana/Vevay': 0,
    'America/Chicago': 1, 'America/Indiana/Tell_City': 1,
    'America/Indiana/Knox': 1, 'America/Menominee': 1,
    'America/North_Dakota/Center': 1, 'America/North_Dakota/New_Salem': 1,
    'America/Denver': 2, 'America/Boise': 2, 'America/Shiprock': 2,
    'America/Phoenix': 2,
    'America/Los_Angeles': 3,
    'America/Anchorage': 4, 'America/Juneau': 4, 'America/Yakutat': 4,
    'America/Nome': 4,
    'America/Adak': 5, 'Pacific/Honolulu': 5,
}

import datetime
import logging

from google.appengine.api import taskqueue

calendar_list = calendarHTML.all()
for calendar in calendar_list:
    delay_hours = bigtimezonedic.get(calendar.calendarTimeZone)
    if delay_hours is None:
        logging.info('There is an unsupported timezone!')
        continue
    params = dict(calendaruserid=calendar.userid,
                  calendarid=calendar.calendarId)
    if delay_hours == 0:
        # Eastern calendars: run the task as soon as possible.
        taskqueue.add(url='/emailsender', params=params)
    else:
        # Other zones: push the ETA back by the zone's offset in hours.
        taskqueue.add(url='/emailsender', params=params,
                      eta=datetime.datetime.now() +
                      datetime.timedelta(hours=delay_hours))
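
If it helps to test the timing logic away from App Engine, the ETA calculation can be pulled out into a small pure function. This is only a sketch: `eta_for` and the shortened `DELAY_HOURS` table are hypothetical names for this example, and the result is what you would pass as the `eta` argument when adding the task.

```python
import datetime

# Hypothetical, shortened version of the offsets table: hours to delay each
# zone's email relative to the Eastern cron run.
DELAY_HOURS = {
    "America/New_York": 0,
    "America/Chicago": 1,
    "America/Denver": 2,
    "America/Los_Angeles": 3,
}


def eta_for(timezone_name, now, delay_hours=DELAY_HOURS):
    """Return when the task should run, or None for an unsupported zone."""
    if timezone_name not in delay_hours:
        return None
    return now + datetime.timedelta(hours=delay_hours[timezone_name])
```

Passing `now` in explicitly, rather than calling `datetime.datetime.now()` inside the function, makes it easy to check the offsets with fixed timestamps.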
dsimandl