I migrated our production website to Google App Engine last night, and now the app starts returning 504 Gateway Timeout errors after about 10-15 minutes of inactivity. But I've been running a test version of the new site on GAE, in a different project, for months, and it has never had this problem. I don't understand what is going on.
Below is the GAE dashboard graph, which shows that pretty much all requests were returning 5XX errors until 7am. I wasn't sure what was happening, so I decided to re-deploy the app to see if that would kick things into working again... and for some reason, it seemed to do so.
After I re-deployed, the site started responding again. I clicked around for a while, fixed a bug, re-deployed... everything was good.
Then at 7:20, all of the requests turned into 504 Gateway Timeout errors again.
I noticed at 7:40 and re-deployed without any changes, and the site is back up and working again.
What is going on?
P.S. I know that a dynamic GAE instance will go idle and need time to spin back up, but that's not what is happening here. I see that delay on my dev project, but there the page comes up after about 10 seconds. On my production project, my browser spins for a full minute, then gives me a 504 error, and it keeps doing that on every request until I re-deploy.
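To pin down exactly when the failures start, and to tell a slow cold start apart from a real 504, a small probe along these lines could be left running against both projects. This is only a sketch: the URL, the 5-second "cold start" threshold, the 90-second client timeout, and the function names are all my own assumptions, not anything GAE provides.

```python
import time
import urllib.error
import urllib.request

# Assumed threshold: a 2xx/3xx response slower than this is treated as a cold start.
COLD_START_SECONDS = 5.0


def classify(status, elapsed):
    """Label one probe result from its HTTP status and elapsed seconds."""
    if status is None:
        return "no-response"      # connection failed or the client timed out
    if status == 504:
        return "gateway-timeout"  # the failure described above
    if status >= 500:
        return "server-error"
    if status >= 400:
        return "client-error"
    if elapsed >= COLD_START_SECONDS:
        return "slow-ok"          # likely an idle instance spinning up
    return "ok"


def probe(url, timeout=90.0):
    """Request the URL once and return (status or None, elapsed seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code         # 4xx/5xx responses are raised as HTTPError
    except (urllib.error.URLError, OSError):
        status = None
    return status, time.monotonic() - start


def watch(url, idle_seconds=900, rounds=4):
    """Probe, log the result, then stay idle long enough for instances to shut down."""
    for _ in range(rounds):
        status, elapsed = probe(url)
        print(time.strftime("%H:%M:%S"), status, f"{elapsed:.1f}s", classify(status, elapsed))
        time.sleep(idle_seconds)


# Example (hypothetical URL): watch("https://your-project.appspot.com/")
```

Running it against the dev project and the production project side by side should show whether production ever gets a "slow-ok" after going idle, or goes straight to "gateway-timeout".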