3

I recently switched over an API from an EC2 instance to API Gateway with lambda. I noticed that my CPU usage on the database then shot up quite rapidly.

Below is an image of the active connections on the database, the red arrow is when I switched over to Lambda from an EC2. There where no new customers added so the actual throughput of data remained the same.

enter image description here

I am using the same code that I am using as when I was hosting on the Ec2 instance. This code uses a connection pool to connect to the database which I am not sure is the best idea. Basically I call the query method when I want to query the database in my main script.

A connection is retrieved from a connection pool for the lambda function to run the query and then released at the end of the query.

Is this a good approach or should one connection be constantly persisted throughout the execution of the lambda function? The CPU usage on a T3 medium is nearly maxed out since I moved to Lambda and I think this is the problem.

// This file is used to get a connection from the mysql connection pool 
// The pool is contained in this file and the function to get a connection is exported 

// Imports //
const mysql     = require('mysql')

// Define the connection pool //
const pool = mysql.createPool({
    host:       process.env.DBHOST,
    user:       process.env.DBUSER,
    password:   process.env.DBPASSWORD,
    port:       process.env.DBPORT,
    database:   process.env.DBNAME
})

/// getConnection ///
// Call this function whereever you need a connection
// It gets a connection from the connection pool 
function getConnection(){
    return new Promise((resolve,reject) => {
    pool.getConnection((err,con) => {
        if(err){
            reject(`Connection Failed\n${err}`);
        }else{
            console.info(`Successful connection, id ${con.threadId}`)
            resolve(con)
        }
    })
})
}

/// query ///
// This function makes a query on a database 
// It accepts and sql statement and a list of args if used in the statemens
// Inputs:
//  sql  - The sql statement to execute on the database 
//  args - The arguments to use in the sql statament  
// Returns:
//  resolve - The result of the sql insert statemen 
//  reject  - An error
async function query(sql,args){
    return new Promise(async(resolve,reject) => {
        // try/catch is to handle the await for the connection 
        try{
            let con = await getConnection() // If there is an error here it will be caught
            // No error this query will execute 
            con.query(sql,args, function(err,result) {
                // If error returned from query will be caught here  
                if (err){
                    reject(err);// Reject the promise 
                }else{
                    resolve(result);// Resolve the promise 
                }
                // Always release
                con.release() 
            })
        }catch(error){
            // Error in getting connection reject the error 
            reject(error)
        }
    })    
}

module.exports = {query}

Jordan Arsenault
  • 7,100
  • 8
  • 53
  • 96
WK123
  • 620
  • 7
  • 18
  • 2
    The big difference is probably that your lambda workers may run a lot more in parallel than your single instance did, and/or that lambda workers are necessarily connecting and disconnecting all the time. If you want a connection pool, you need it as a separate instance between lambda and the database, since individual lambda instances can’t share a resource like a connection pool internally. – deceze May 19 '21 at 09:33
  • 1
    I’d say spikes of ~50% usage looks like you’re using the available database resources more effectively. You should take care not to let it increase much more than that though. Did your API response time or processing time improve at the same time compared to the single instance…? – deceze May 19 '21 at 09:36
  • Yes so at a max there could be 40/50 concurrent instances running looking at the lambda console, do you think I should be using connection pools or limit it to one connection per instance? The CPU usage has shot up to 99% which worries me. The response time has gone up and data is being logged much faster but the database is getting stressed. – WK123 May 19 '21 at 09:57
  • Sorry the first part of that last comment does not make sense, 40/50 concurrent instances using a connection pool or not will still result in 40/50 connections the way I have it implemented. – WK123 May 19 '21 at 09:59
  • 2
    Yeah, again, even if you pool connections internally in your Lambda worker, several parallel Lambda workers will still create many connections to the database. You'll want to consider a connection pool between your Lambda workers and your database to reduce the database stress, and/or limit your Lambda concurrency. – deceze May 19 '21 at 10:00
  • 3
    https://aws.amazon.com/rds/proxy/ – deceze May 19 '21 at 10:01
  • Ah ok I think I get you, so the proxy sits in between the database and my code, it is no longer the application controlling connection pooling it is the proxy. In your experience in your experience have you seen an increase in database efficiency/database stress go down using this method? – WK123 May 19 '21 at 10:16
  • 1
    I haven't had to use an RDS proxy yet, but sure, it caps the maximum parallel connections and through that also the maximum parallel running queries, so that should certainly cap CPU usage. It also means your Lambda workers may back up a bit waiting for connections/cursors to become available, which ultimately limits how many requests you can handle in parallel. You'll need to figure out your own specifics bottlenecks and tweak things as necessary. – deceze May 19 '21 at 10:18
  • Yeah I understand, I think though even just having the connections managed in one central place instead of connection pools in scripts is already a benefit, it seems yes you can cap them too so I can play around with what I can get away with. Thanks again! – WK123 May 19 '21 at 10:21
  • @deceze another problem I ran into if you have any knowledge on the subject https://stackoverflow.com/questions/67602603/migrating-mysql-8-0-20-database-to-aws-aurora – WK123 May 19 '21 at 11:44
  • Yeah, but what if you're using a DB that doesn't support RDS Proxy (e.g. Oracle RDS)? – O.O Jul 06 '23 at 21:11

0 Answers0