How to deploy a simple app to GCP with minimal costs (or how to disable autoscaling after deploy)?

Question

In my first attempt at using Cloud to deploy an app...

The problem: unexpected instance hour usage (Frontend Instance Hours) on GCP (Google Cloud Platform). High traffic was not the cause; for some reason the autoscaling feature created a bunch of "instances" and "versions".

The solution they suggested: disable autoscaling and stop serving previously deployed versions of the app. I still need one version/instance running, but in the console I have not found where it shows how many versions/instances are running or where to stop them (while verifying that at least one instance keeps working, so that my app does not break).

My app is a simple app that was developed by Google developers and is recommended by them for dynamically rendering a JS SPA (it lets search engines and crawlers see fully rendered HTML).
My actual website, together with a Node app that points crawlers to GCP, is hosted elsewhere (on GoDaddy), and the two work together nicely.

The app I deployed to GCP is called Rendertron (https://github.com/GoogleChrome/rendertron)

Google also recommends deploying to GCP (most of the documentation covers that form of deployment). I tried deploying to my GoDaddy shared hosting, but it was not straightforward to make it work, so I created a GCP project and deployed there instead. All worked great!

Since the app has almost no traffic yet, I expected zero costs after deploying it to GCP, or at most something under a dollar.

Unfortunately, I received a bill for more than $150 for the month with approx the same projected for the next month.

Without paying an additional $150 for tech support, I was able to contact GCP billing by phone. They are great in that they are willing to reimburse the charges, but only after I resolve the problem myself.

They are generous with documentation links (common causes of unexpected instance hour usage) but can't help beyond that.

After many Google searches, reading through documentation, and paying for and watching gcloud tutorials on pluralsight.com, this is what I have (or have not) understood so far:

Almost all documentation, videos and tutorials talk about managing or turning off autoscaling using Compute Engine Instance Groups.
It is not clear whether Instance Groups are another paid service, a hole I will fall into and be charged more than necessary.
Instance Groups seem like overkill for a simple app that needs only one instance running at minimal cost.
Documentation on running a very small-scale app at minimal cost with minimal resources is scarce or hard to find.
I have not yet read or watched anything on how to use the config .yaml file (the one initially deployed) to make sure the app does not autoscale. Even if I find that, it seems I would still need to delete the versions or instances that have already been started, and it is not clear how to do that either.
The Google console does not make clear how many instances and versions are running; I still have not found where it would show multiple instances/versions running.

I could use some direction for continuing to investigate how to resolve the issue.

Is creating an Instance Group (so I can turn off autoscaling from there) the way to go, and where I should focus my efforts?
Or should I focus on learning how to update the .yaml config to prevent scaling (for example, by setting both min_instances and max_instances to 1), together with learning how to manually stop, directly from the GCP console, any extra instances/versions that are currently running?
Or is there a third option?

As a side note, autoscaling with GCP does not seem very intelligent.
Why would my app that has almost no traffic run into an issue that multiple instances were created?

Any insight will be greatly appreciated.

**** Update **** platform info

My app is deployed to Google App Engine (GAE) (deployed code, not a container)

Steps taken for Deploy:

git clone https://github.com/GoogleChrome/rendertron.git
cd rendertron
npm install && npm run build
gcloud app deploy app.yaml --project MY_PROJECT_ID

I simply followed the steps above, my app has been working great, and I have not touched a thing since deployment.

The config (app.yaml) originally deployed was:
(I made no changes to the one in the Rendertron repo)

runtime: nodejs12
instance_class: F4_1G
automatic_scaling:
  min_instances: 1
env_variables:
  DISABLE_LEGACY_METADATA_SERVER_ENDPOINTS: "true"

-- Google Cloud Console Info

under App Engine --> Versions
There is 1 item listed with the following values:

Instances: 1
Runtime: nodejs12

Environment: Standard

Size: 392.7 MB
Deployed: Feb 23, 2021
Config:
  runtime: nodejs12
  env: standard
  instance_class: F4_1G
  handlers:
    url: .*
    script: auto
  env_variables:
    DISABLE_LEGACY_METADATA_SERVER_ENDPOINTS: 'true'
  automatic_scaling:
    min_idle_instances: automatic
    max_idle_instances: automatic
    min_pending_latency: automatic
    max_pending_latency: automatic
    min_instances: 1
  network: {}

**** Solution ****
I uploaded a new app.yaml file in which I changed min_instances: 1 to max_instances: 1 (I had to redeploy the entire project with the updated app.yaml).

At first I also changed instance_class from F4_1G to F1 to save money, but the app then reported that there was not enough memory and crashed with a 500 server error (the Rendertron app came up but crashed when trying to render something). I changed it back to F4_1G and the app seems to work properly.
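Putting the description above together, the redeployed app.yaml would look roughly like the sketch below. This is a reconstruction from the text, assuming every line other than the scaling setting was left exactly as in the originally deployed file.

```yaml
runtime: nodejs12
# Kept at F4_1G: per the description above, F1 ran out of memory while rendering.
instance_class: F4_1G
automatic_scaling:
  # Replaces the original "min_instances: 1". Caps autoscaling at one instance
  # and no longer forces an instance to stay up when there is no traffic.
  max_instances: 1
env_variables:
  DISABLE_LEGACY_METADATA_SERVER_ENDPOINTS: "true"
```

It would be redeployed with the same command used for the first deployment (gcloud app deploy app.yaml --project MY_PROJECT_ID).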

If I see charges again when my traffic goes up, I will check whether an instance class between F1 and F4_1G has enough memory for my app while keeping charges to a minimum.

Below you can see that from the Friday when I made the change until the following Sunday the costs dropped to 0, while the app kept running properly:
Screenshot showing GCP billing report costs dropped after change
**** Solution ****

Is your App deployed to Google App Engine (GAE)? If so, is it standard or flexible environment and what is your runtime? — NoCommandLine, May 02 '21 at 16:28
Your environment sounds like App Engine flexible, or a standard environment with a manual scaling. You need to share more about your platform, your deployment and your config. Can you also share if you deploy your code or a container? — guillaume blaquiere, May 02 '21 at 18:24
I made an *Update* to the question to answer your questions. Thanks for your interest :) — Jimmy Levy, May 03 '21 at 08:02

DazWilkin · Accepted Answer · 2021-05-03T16:44:02.610

The rendertron repo suggests using App Engine standard (app.yaml) and so I assume that's what you're using.

If you are using App Engine standard then:

you're not using Compute Engine [Instance Groups] as these resources are used by App Engine flexible (not standard);
managing multiple deployments should not be an issue as standard does not charge (!?) for maintaining multiple, non-traffic-receiving versions and should automatically migrate traffic for you from the current version to the new version.

There are at least 2 critical variables with App Engine standard: the size of the App Engine instances you're using and the number of them:

You may wish to use a (cheaper) instance class (link).
You can set max_instances: 1 to limit the number of instances (link).

It appears your bandwidth use is low (and will be constrained by the above to a large extent) but bear this in mind too, as well as the fact that...

Your app is likely exposed on the public Internet and so could quite easily be consuming traffic from scrapers and other "actors" who stumble upon your endpoint and GET it.

As you've seen, it's quite easy to over-consume (cloud-based) resources and face larger-than-anticipated bills. There are some controls in GCP that permit you to monitor (not necessarily quench) big bills (link).

The only real solution is to become as familiar as you can with the platform and how its resources are priced.

Update #1

My preference is to use gcloud (CLI) for managing services but I think your preference is the Console.

When you deploy an "app" to App Engine, it comprises one or more services (there is always a default service). I've deployed the simplest "Hello World!" app, comprising a single default service (Node.js):

https://console.cloud.google.com/appengine/services?serviceId=default&project=[[YOUR-PROJECT-ID]]

I deployed it multiple (3) times as if I were evolving the app. On the "Versions" page, 3 versions are listed:

https://console.cloud.google.com/appengine/versions?serviceId=default&project=[[YOUR-PROJECT-ID]]

NOTE There are multiple versions stored on the platform but only the latest is serving traffic (100% of it). IIRC App Engine standard does not charge to store multiple versions.

I tweaked the configuration (app.yaml) to specify instance_class (F1) and to limit max_instances: 1:

app.yaml:

runtime: nodejs14
instance_class: F1
automatic_scaling:
  max_instances: 1

And, this is reflected in the deployed app's config:

Update #2

If you can encourage someone to write a Dockerfile and contribute it to the rendertron repo, you could then deploy the container to various alternative services (both Google and non-Google).

A curious fact with App Engine standard is that, while you deploy 'code' to the platform, it creates a container image from your artifacts and this is what gets deployed to App Engine. You can prove this to yourself by viewing the Container Registry (service) in your project:

https://console.cloud.google.com/gcr/images/dazwilkin-210503-67357098?project=[[YOUR-PROJECT-ID]]

And, if you wish, you could reuse that image elsewhere.

Google Cloud Run is probably your best option on Google. Cloud Run lets you restrict the number of instances you run and makes it easier to limit access to the deployed app to authenticated users.

With a container, you can deploy rendertron anywhere that runs containers as a service.
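As a rough illustration of the Cloud Run suggestion, a service manifest capped at one instance might look like the sketch below. The service name, the image path and the memory limit are hypothetical placeholders (this thread does not establish that a rendertron container image exists), so treat it as a starting point rather than a tested configuration.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: rendertron                 # hypothetical service name
spec:
  template:
    metadata:
      annotations:
        # Upper bound on the number of container instances Cloud Run may start.
        autoscaling.knative.dev/maxScale: "1"
    spec:
      containers:
        # Hypothetical image path: replace with an image you have built and pushed.
        - image: gcr.io/YOUR-PROJECT-ID/rendertron
          resources:
            limits:
              # Assumption: the question reports that the smallest App Engine
              # instance class ran out of memory, so start with more headroom.
              memory: 1Gi
```

A manifest like this can be applied with gcloud run services replace, pointing it at the file and at a region of your choice.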

I made an *Update* to the question. Thanks for your answer :) While I continue to ponder items in your answer, please let me know if my update sparks a thought for a more specific direction I could take. — Jimmy Levy, May 03 '21 at 08:05
Please consider "accepting" my answer if it is one. Folks on Stackoverflow contribute to help others but we're measured by accepted answers and upvotes. — DazWilkin, May 03 '21 at 15:54
For future questions, I recommend you consider breaking multi-part questions into multiple questions. These are more quickly (and more likely) answered by contributors here. Extensive questions like this are daunting and tend to discourage responses. — DazWilkin, May 03 '21 at 15:55
I've updated an update to my answer that explains how deployments work and can be viewed on GCP. A second update proposes considering using containers as a more 'portable' solution. — DazWilkin, May 03 '21 at 16:45

Martin Zeitler · Answer 2 · 2021-05-03T17:00:24.043

There's automatic_scaling configured, which is an optional configuration.
The other options available would be basic_scaling and manual_scaling.

The least expensive configuration might be manual_scaling with a single B1 instance:

instance_class: B1
manual_scaling:
  instances: 1

These two configuration parameters directly affect the pricing; it's a kind of "pay as configured".

This might not be the suggested configuration for production, though.
See scaling elements and pricing for more information.
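For comparison, a sketch of the basic_scaling alternative mentioned above, which shuts the instance down after a period without requests instead of keeping it running permanently. The runtime is taken from the question; the instance class and the idle_timeout value are illustrative assumptions, not recommendations from this thread.

```yaml
runtime: nodejs12
# Basic and manual scaling use the B instance classes.
# Assumption: a larger class than B1, because the question reports that
# the smallest F class did not have enough memory for rendering.
instance_class: B4_1G
basic_scaling:
  max_instances: 1      # never start more than one instance
  idle_timeout: 10m     # illustrative value: stop the instance after 10 idle minutes
```

The trade-off is a cold start on the first request after an idle period, which may or may not be acceptable for crawler traffic.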
