
My GAE front-end instance hours keep increasing even though the application is not being used and there are no instances running (because I manually shut them down!). My question has several sub-questions; here they are:

  1. What is the major component that affects front-end instance hours? In my current implementation of the system, I am making extensive use of the following Google resources: Memcache, the task queue and the NDB datastore. I know for a fact that the datastore doesn't have much to do with front-end instances (or maybe I am wrong), but does Memcache consume front-end instance hours? Up until yesterday, my application was running perfectly fine and instance hours were being used (as you would expect) as the task queue was being used. That led me to believe that the main factor was using the task queue and sending multiple requests in a short period of time. But this morning I added some extra Memcache usage and it started acting up. Also, do static resources affect instance hours?

  2. How do I optimize the application, and what are the main things to consider? Google API calls, URL calls in general, and again, Memcache?

Application Information:

On my app.yaml file here is some of the configuration information:

  • default_expiration: 15m (I had it 1h before, but changed for testing purposes)
  • instance_class: F2 (I need to have it as F4 for some processing, but changed it to F2 again for testing purposes)
  • threadsafe: yes
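For reference, the settings above correspond to an app.yaml fragment along these lines (the runtime, api_version and handler entries are my assumptions for illustration; only the three listed values come from my actual file):

```yaml
# Python 2.7 standard environment is assumed here for illustration.
runtime: python27
api_version: 1
threadsafe: yes            # one instance may serve concurrent requests
instance_class: F2         # F2 bills at 2x the F1 rate per hour of uptime
default_expiration: "15m"  # cache lifetime for static files (was 1h)

handlers:
- url: /.*
  script: main.app
```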

I need a clear understanding of instance hours; the posts I have read here are not clear enough! If you have a deep understanding of how Google calculates front-end instance hours, please let me know: what causes them to go up, how to manage them, and so on.

Some extra visual context:

Process summary

As you can see in this picture, there are no instances deployed (i.e. no instances running), yet billing keeps going up and the summary chart is going crazy! Look at all those spikes!

Anything would be greatly appreciated!

  • Does the answer provided address your questions about instance hour calculation and optimization potential? If so, please select it so the question may be considered resolved. If not, feel free to elaborate on the details you think it is missing so that it can be improved accordingly. – Nicholas Jun 20 '16 at 16:31
  • Yes, this is a satisfactory answer. My apologies for the delay in marking it as correct. Again, thank you for your contribution. – Jeury Mejia Jul 22 '16 at 15:04

1 Answer


Instance hour calculation

As written in Instance Billing, any instance is billed for a minimum of 15 minutes even if it serves only a single request. This is partly to avoid having many end users experience the latency of spinning up an instance; instead, an instance stays idle and ready to receive requests immediately.

As for how the instance hours are calculated, this is based on actual operation time. If an F1 instance is operational for 2.5 hours, it will have consumed around 2.5 instance hours. The number of instance hours is multiplied by the class number as documented in the Important note in App Engine Pricing. Therefore, an F2 instance operating for 2 hours would consume 4 instance hours.
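The calculation described above can be sketched as simple arithmetic (the class multipliers below follow the F1/F2/F4 ratios documented in App Engine Pricing):

```python
# Billed instance hours = uptime in hours x instance class multiplier.
# Multipliers per App Engine Pricing: F1 = 1x, F2 = 2x, F4 = 4x.
CLASS_MULTIPLIER = {"F1": 1, "F2": 2, "F4": 4}

def billed_instance_hours(instance_class, uptime_hours):
    """Return the instance hours billed for one instance's uptime."""
    return CLASS_MULTIPLIER[instance_class] * uptime_hours

print(billed_instance_hours("F1", 2.5))  # 2.5
print(billed_instance_hours("F2", 2))    # 4
```

This is why dropping from F4 to F2, as in the question, halves the billed hours for the same uptime, while the actual uptime is what the optimization advice below targets.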

Optimization with concurrency and asynchronicity

When it comes to optimization, your best bet is to delegate where appropriate and not waste any time. For example, assume your handler needs to get 3 entities containing URLs from the datastore and then fetch those URLs for some JSON data. Getting entities from the datastore takes some time. If using the synchronous version of the get calls, your handler is essentially blocked while getting the entities. Using the asynchronous get_async version of the same API call will allow the handler to continue with other tasks while the entities are retrieved.

Similarly, fetching URLs can take a long time. Doing so synchronously may freeze your handler while it waits for those requests to return, and the latency will be even worse if they never return or only fail after the default timeout. The urlfetch service supports asynchronous requests (via create_rpc and make_fetch_call), and many third-party libraries also lend themselves well to threading in Python. In Go, you can fetch these URLs in goroutines, which execute concurrently. This allows the handler not only to continue with other tasks but also to fetch those URLs in parallel.
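The win from overlapping those waits can be shown with a runnable stand-in. The sketch below uses stdlib threads and a simulated latency instead of real NDB/urlfetch calls (which only work inside App Engine), but the shape is the same: kick off all the slow operations, then collect the results.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    """Simulated URL fetch; real App Engine code could start the call
    with urlfetch.make_fetch_call and collect it later for the same
    overlap. The 0.1 s sleep stands in for network latency."""
    time.sleep(0.1)
    return {"url": url, "status": 200}

urls = ["https://example.com/a", "https://example.com/b",
        "https://example.com/c"]

# Sequential: the latencies add up (~0.3 s here).
start = time.time()
for u in urls:
    fetch(u)
sequential = time.time() - start

# Concurrent: the three waits overlap (~0.1 s here).
start = time.time()
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(fetch, urls))
concurrent = time.time() - start

print(sequential > concurrent)  # True: overlapping I/O cuts handler time
```

Less time spent blocked per request means the instance finishes sooner and can spin down earlier, which is exactly what reduces instance hours.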

Identifying bottlenecks

Depending on your application needs, there may be many opportunities for concurrent design. This can greatly reduce response times for your instances allowing them to handle more requests or spin down sooner, both of which will consume fewer instance hours. I would suggest using Stackdriver Trace to study the lifespan of a request. It may be helpful in identifying bottlenecks. With this information, you should have an idea where to prioritize optimization.

Nicholas