I have built a simple online ordering application. It probably handles 25 hours a week, most of those on Mondays and Tuesdays.

Looking at the dashboard I see:

Billing Status: Free. Quotas reset every 24 hours. Next reset: 7 hrs

Resource                     Usage
Frontend Instance Hours      16%     4.53 of 28.00 Instance Hours

4.53 hours seems insanely high for the number of users I have.

Some of my pages make calls to a FileMaker database hosted on another service and have latencies like:

URI         Reqs        MCycles     Latencies          
/profile    50          74          1241 ms
/order      49          130         3157 ms

My authentication pages also have high latencies, as they call out to third parties:

/auth/google/callback 9  51  2399 ms

I still don't see how they could add up to 4.53 hours though?

Can anyone explain?

deltanine

2 Answers

You're charged for 15 minutes every time an instance spins up.

If you have only a few requests but they are spaced far apart, the instance shuts down between them, and you incur the 15-minute charge again the next time an instance spins up.

You could easily rack up 4.5 instance hours with 18 HTTP requests.
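
The arithmetic behind that figure can be sketched in a few lines of Python. This is a deliberately simplified model, not Google's actual billing logic: it assumes every cold start is billed a flat 15 minutes, and it uses a hypothetical `IDLE_SHUTDOWN_MIN` value for how long an instance stays up after its last request. The function name `billed_instance_hours` is made up for the illustration.

```python
# Simplified model of the "15 minutes per spin-up" charge described above.
# Not Google's real billing algorithm; the idle shutdown delay is an assumption.
MIN_CHARGE_MIN = 15      # flat charge per spin-up, in minutes (from the answer)
IDLE_SHUTDOWN_MIN = 15   # hypothetical: idle minutes before the instance stops


def billed_instance_hours(request_times_min, idle_shutdown_min=IDLE_SHUTDOWN_MIN):
    """Count cold starts for request times (in minutes) and bill 15 minutes each."""
    cold_starts = 0
    last_request = None
    for t in sorted(request_times_min):
        # A request that arrives after the instance has gone idle forces a new spin-up.
        if last_request is None or t - last_request > idle_shutdown_min:
            cold_starts += 1
        last_request = t
    return cold_starts * MIN_CHARGE_MIN / 60


# 18 requests, one per hour: every request triggers a fresh spin-up.
spaced_out = [i * 60 for i in range(18)]
print(billed_instance_hours(spaced_out))  # 4.5

# The same 18 requests, one per minute: a single spin-up serves them all.
burst = list(range(18))
print(billed_instance_hours(burst))       # 0.25
```

Under these assumptions, the spacing of the requests matters far more than their number: 18 spin-ups at 15 minutes each is 270 minutes, or 4.5 instance hours.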

dragonx
  • Also, note that *every time you deploy* your service (without using an explicit `--version` value) gcloud will launch a new version of the service but will also keep *all your previous versions running*, each with a distinct instance attached. Normally this would render those (old) instances idle, since they don't receive traffic, but in some cases they can still generate costs. – Valeriu Paloş Jul 18 '18 at 03:55
  • @dragonx, I'm in exactly the situation you describe. Would you mind sharing how long an instance has to be idle before it shuts down? Thanks! – mgouin Mar 05 '19 at 02:47
  • @mgouin if you are using basic scaling it defaults to 5 min, but you can adjust `idle_timeout`. Not sure about the automatic scaling mode. You can probably put in a shutdown hook and log when it's called. – dragonx May 20 '19 at 23:59

In addition to the previous answer, here is a bit more about your billing, which might be what has you confused. Google gives you 28 hours of free instance time for each 24-hour billing period.

Ideally you always have one instance running so that calls to your app never have to wait for an instance to spin up. One instance can handle a pretty decent volume of calls each minute, so a lot can be accomplished with those free 28 hours.

You have a lot of zero-instance time: you consumed less than 5 instance hours in seventeen hours of potential billing. You should worry more about getting this number higher, not lower, because most of the calls to your app are undoubtedly waiting for spin-up latency on top of actual execution latency. If you are running a Go app, spin-up is likely not an issue. Python, likely a small-to-moderate issue, Java...

So think instead about keeping your instance alive, and consume 100% of your free instance quota. Alternatively, be sure to use Go, or Python (with good design). Do not use Java.
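
As a rough check on those numbers, here is a small Python sketch using only the figures from the question's dashboard (28 free hours, 4.53 used, 7 hours until reset). The variable names are made up for illustration, and the calculation assumes a single instance that is billed for every hour it stays up.

```python
# Figures taken from the dashboard in the question.
FREE_QUOTA_HOURS = 28.0   # free frontend instance hours per billing period
USED_HOURS = 4.53         # instance hours consumed so far
PERIOD_HOURS = 24         # quotas reset every 24 hours
HOURS_UNTIL_RESET = 7     # "Next reset: 7 hrs"

# Hours of the billing period that have already elapsed.
elapsed_hours = PERIOD_HOURS - HOURS_UNTIL_RESET

# Fraction of the elapsed time during which an instance was being billed.
billed_fraction = USED_HOURS / elapsed_hours

# Assumption: one instance kept alive for the whole period bills 24 instance hours.
always_on_hours = 1 * PERIOD_HOURS
headroom_hours = FREE_QUOTA_HOURS - always_on_hours

print(elapsed_hours)              # 17
print(round(billed_fraction, 2))  # 0.27
print(headroom_hours)             # 4.0
```

In other words, an instance was billed for only about a quarter of the elapsed time, and under this assumption one instance kept up around the clock would still leave about 4 hours of the free quota unused.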

stevep
  • When you recommend Python instead of Java, is it just because GAE is not optimized (in spin-up terms) for Java, or is there another significant reason? – Ricardo Dec 24 '13 at 19:29
  • Java instances very often take quite a lot longer to load vs. Go or Python. There have been many long threads about this, and the ways to optimize Java for faster instance loading. I am not a Java person, so am unable to speak to this effectively. Search the Google App Engine group and you should find lots of information and *heated* debate. – stevep Dec 25 '13 at 19:43
  • I know you answered this question a while ago, but what actually are "Front-End Instance Hours" and "Back-End Instance Hours"? I know the difference between frontend and backend, just not in the context of GAE. Also, can you force GAE to use only one instance per hour, so as to never leave the free tier, and how many users would you say one instance can handle? Thanks. –  Dec 13 '18 at 01:45
  • The link below pretty clearly differentiates FE vs BE. You can control how FE instances are evicted using the `idle_timeout` variable. Using a cron to check things every few minutes (e.g. any tasks to do in a low-priority queue) will keep an instance active. How many users an instance can handle depends very heavily on how long it takes to handle a call. There are many decisions you can make that will make things either fast or slow. The scope of this is far beyond this thread. Link: https://cloud.google.com/appengine/docs/standard/python/backends/ – stevep Dec 14 '18 at 07:01