
I have a Google App Engine Flex app that receives POSTs with JSON data, then performs an INSERT using SQLAlchemy to write the JSON data to a Cloud SQL database. The latency between the POSTing device (in Dallas) and App Engine (us-east) is well above 300 ms, sometimes spiking over 1500 ms. I believe it may be due to poor INSERT write speed, although the load is hardly 'ridiculous' or anything: roughly 5-8 reads/sec and 150-300 writes/sec.

Using Flask, SQLAlchemy, MySQL 5.6 (on Cloud SQL), and a 2nd generation Google App Engine Flex instance.

My problem is that I need to reduce the latency, and I can't figure out what is causing it. I have run traceroute and ping from the device that POSTs the JSON data to the Flask app hosted on Google App Engine. The ping looks fine; there is a hiccup at one hop, but only in one place. If the latency gets too high, the device may drop the POST entirely.

from api import app
from sqlalchemy import create_engine
from datetime import datetime, timedelta

engine = create_engine(app.config.get("database_uri"))

class Events(object):
    @staticmethod
    def add(event):
        connection = engine.connect()
        for observation in event["observations"]:
            try:
                connection.execute("""INSERT INTO events (
                    ...
                ) VALUES (
                    %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s
                )""", (
                    ...
                ))
            except Exception as ex:
                print(ex)
        connection.close()

This is triggered by:

from flask import request

@app.route('/', methods=['POST'])
def events_post():
    data = request.json
    if data["secret"] != app.config.get("secret"):
        return "Forbidden", 403  # a bare return makes Flask raise a TypeError
    Events.add(data["data"])
    return "Success"

I have tried with and without connection.close() at the end, and it doesn't seem to matter. Since this is a steady stream of data, closing and re-opening the connection for every request seemed resource-heavy for the job at hand.
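For context, here is a simplified sketch of the batched alternative I'm considering: one executemany-style INSERT per event instead of one round trip per observation. The table and column names below are illustrative (not my real schema), and an in-memory SQLite engine stands in for the Cloud SQL URI:

```python
from sqlalchemy import create_engine, text

engine = create_engine("sqlite://")  # stand-in for the Cloud SQL URI

with engine.begin() as conn:
    conn.execute(text("CREATE TABLE events (device_id TEXT, value INTEGER)"))

def add_event(event):
    # One execute() with a list of dicts becomes a single executemany on the
    # driver: one round trip per event instead of one per observation.
    rows = [{"device_id": o["device_id"], "value": o["value"]}
            for o in event["observations"]]
    with engine.begin() as conn:  # pooled connection, committed on exit
        conn.execute(
            text("INSERT INTO events (device_id, value) "
                 "VALUES (:device_id, :value)"),
            rows,
        )

add_event({"observations": [{"device_id": "d1", "value": 1},
                            {"device_id": "d1", "value": 2}]})

with engine.connect() as conn:
    inserted = conn.execute(text("SELECT COUNT(*) FROM events")).scalar()
```

With engine.begin() the connection is borrowed from SQLAlchemy's pool and returned automatically, so there is no per-request open/close cost.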

Thanks in advance.

jhomr
  • Add some logging and check timestamp in Stackdriver Logging to see where you use the time: https://cloud.google.com/appengine/docs/standard/python/logs/ – FelixEnescu Jun 14 '18 at 17:59
  • There isn't enough information here for anyone to do more than guess. As @blueCat suggests, you can add logging to try to identify delays. The `sqlalchemy.engine` logger emits a lot of information even at INFO level. Maybe use a connection pool to avoid recreating connections on each call, otherwise your code seems ok. It could be the network or the DB server. For the network, you need to speak to Google support. For the DB, do have lot of indexes etc that would slow down writes? You could try swapping to a new server, otherwise again contact Google support. – snakecharmerb Jun 17 '18 at 08:39
  • Sorry for no updates. For now this has been shelved to work on something else related to the project at hand and I'll be switching over to Datastore in the future. I'll update later if I ever figure it out. – jhomr Jun 18 '18 at 01:13

1 Answer


As already suggested in the comments, you need to view the logs in Stackdriver, particularly the Cloud SQL instance operation logs or the MySQL error logs. This will give you an idea of what needs fine-tuning to improve the performance of your queries. Since you suspect poor INSERT write speed, the only way to confirm that is to filter through the logs.
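One low-effort way to get per-statement timestamps into those logs is to enable SQLAlchemy's own `sqlalchemy.engine` logger, as the comments suggest. A minimal sketch follows; an in-memory SQLite engine stands in for your Cloud SQL URI, and the log output is captured in a string here purely for illustration (on App Engine you would log to stdout, which Stackdriver picks up):

```python
import io
import logging

from sqlalchemy import create_engine, text

# Capture engine log output with timestamps; swap the StringIO handler for a
# plain StreamHandler() on App Engine so Stackdriver ingests it.
log_stream = io.StringIO()
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))

logger = logging.getLogger("sqlalchemy.engine")
logger.setLevel(logging.INFO)  # INFO logs every SQL statement as it executes
logger.addHandler(handler)

engine = create_engine("sqlite://")  # stand-in for the Cloud SQL URI
with engine.connect() as conn:
    conn.execute(text("SELECT 1"))

captured = log_stream.getvalue()  # each statement appears with a timestamp
```

Comparing the timestamps around each INSERT with the request timestamps will tell you whether the time is being spent in the database or on the network.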

For 2nd generation App Engine Flex instances, latency is strongly influenced by the distance between the writer and the database: the farther your App Engine application is from your Cloud SQL instance, the greater the latency on every connection and query.

Have a look at this documentation, which can help you configure your app for better scaling when connecting to Cloud SQL from App Engine Flex for Python apps.
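As a starting point, you can also size the connection pool explicitly when creating the engine, so connections are reused rather than recreated per request. The numbers below are assumptions to tune against your instance class and Cloud SQL connection limits, and SQLite again stands in for the real URI:

```python
from sqlalchemy import create_engine, text
from sqlalchemy.pool import QueuePool

engine = create_engine(
    "sqlite://",          # replace with your Cloud SQL connection URI
    poolclass=QueuePool,
    pool_size=5,          # persistent connections kept open
    max_overflow=2,       # extra connections allowed under bursts
    pool_timeout=30,      # seconds to wait for a free connection
    pool_recycle=1800,    # recycle before the server drops idle connections
)

configured_size = engine.pool.size()  # QueuePool reports its configured size

with engine.connect() as conn:        # borrows a pooled connection
    ok = conn.execute(text("SELECT 1")).scalar()
```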

I hope this helps answer your question.

Samuel N