10

When I run my tests on CircleCI, it logs the following message a many times and eventually the tests fail because none of the database methods can retrieve the data due to the deadlocks:

{
  "message": "Error running raw sql query in pool.",
  "stack": "error: deadlock detected\n    at Connection.Object.<anonymous>.Connection.parseE (/home/circleci/backend/node_modules/pg/lib/connection.js:567:11)\n    at Connection.Object.<anonymous>.Connection.parseMessage (/home/circleci/-backend/node_modules/pg/lib/connection.js:391:17)\n    at Socket.<anonymous> (/home/circleci/backend/node_modules/pg/lib/connection.js:129:22)\n    at emitOne (events.js:116:13)\n    at Socket.emit (events.js:211:7)\n    at addChunk (_stream_readable.js:263:12)\n    at readableAddChunk (_stream_readable.js:250:11)\n    at Socket.Readable.push (_stream_readable.js:208:10)\n    at TCP.onread (net.js:597:20)",
  "name": "error",
  "length": 316,
  "severity": "ERROR",
  "code": "40P01",
  "detail": "Process 1000 waits for AccessExclusiveLock on relation 17925 of database 16384; blocked by process 986.\nProcess 986 waits for RowShareLock on relation 17870 of database 16384; blocked by process 1000.",
  "hint": "See server log for query details.",
  "file": "deadlock.c",
  "line": "1140",
  "routine": "DeadLockReport",
  "level": "error",
  "timestamp": "2018-10-15T20:54:29.221Z"
}

This is the test command I run: jest --logHeapUsage --forceExit --runInBand

  • I also tried this: jest --logHeapUsage --forceExit --maxWorkers=2

Pretty much all of the tests run some sort of database function. This issue only started to occur when we added more tests. Has anyone else had this same issue?

skyboyer
  • 22,209
  • 7
  • 57
  • 64
James111
  • 15,378
  • 15
  • 78
  • 121
  • I have the same issue while doing GraphQL Snapshots-based Testing, lot of queries/mutations to send/expect the response and then PostgreSQL starts to deadlock, did you found a solution to this already? – SkyzohKey Nov 17 '21 at 13:22

1 Answers1

1

Based on the error message we got Deadlock because of RowShareLock;

This means that two transactions (lets call them transactionOne and transactionTwo) have locked resurce which the other transaction requires

Example:

transactionOne locks record in UserTable with userId = 1
transactionTwo locks record in UserTable with userId = 2

transactionOne attempts to update in UserTable for userId = 2, but since it is locked by another transaction - it waits for the lock to be released

transactionTwo attempts to update in UserTable for userId = 1, but since it is locked by another transaction - it waits for the lock to be released

Now the SQL engine detects that there is a deadlock and randomly picks one of the transactions and terminates it.

Lets say the SQL engine picks transactionOne and terminates it. This will result in the exception that is posted in the question.

transactionTwo is now allowed to perform an update in UserTable for user with userId = 1. 
transactionTwo completes with success

SQL engines are pretty fast in detecting deadlocks, and the exception will be instant.

This is the reason for the deadlocks. Deadlocks can have different root causes.

I see you use the pg plugin. Make sure you use it right with the transactions: pg node-postgres transactions

I would suspect a few different root causes and their solutions:

Cause 1: Multiple tests are running against the same database instance

It may be different ci pipelines executing the same test against the same Postgres instance

Solution:

This is the least probable situation, but the CI pipeline should provision its own separate Postgres instance on each run.

Cause 2: Transactions are not handled with appropriate catch("ROLLBACK")

This means that some transactions may stay alive and block others.

Solution: All transactions should have appropriate error handling.


  const client = await pool.connect()
  try {
    await client.query('BEGIN')

    //do what you have to do

    await client.query('COMMIT')
  } catch (e) {
    await client.query('ROLLBACK')
    throw e
  } finally {
    client.release()
  }

Cause 3: Concurrency. For example: Tests are running in parallel, and they cause deadlocks.

We are writing scalable apps. This means that the deadlocks are inevitable. We have to be prepared for them and handle those appropriately.

Solution: Use the strategy "Let's try again". When we detect in our code that there is a deadlock exception, we just retry finite times. This approach has been proven with all my production apps for more than a decade.

Solution with helper func :

//Sample deadlock wrapper
const handleDeadLocks = async (action, currentAttepmt = 1 , maxAttepmts = 3) {
  try {
    return await action();
  } catch (e) {
    //detect it is a deadlock. Not 100% sure whether this is deterministic enough
    const isDeadlock = e.stack?.includes("deadlock detected");
    const nextAttempt = currentAttepmt + 1;
    if (isDeadlock && nextAttempt <= maxAttepmts) {
      //try again
      return await handleDeadLocks(action, nextAttempt, maxAttepmts); 
    } else {
      throw e;
    }

  }
}

//our db access functions
const updateUserProfile = async (input) => {
  return handleDeadLocks(async () => {
    //do our db calls
  });
};

If the code becomes to complex/ nested. We can try to do it with another solution using High order function

const handleDeadLocksHOF = (funcRef, maxAttepmts = 3) {
  return async (...args) {
    const currentAttepmt = 1;
    while (currentAttepmt <= maxAttepmts) {
      try {
        await funcRef(...args);
      } catch (e) {
        const isDeadlock = e.stack?.includes("deadlock detected");
        if (isDeadlock && currentAttepmt + 1 < maxAttepmts) {
          //try again
          currentAttepmt += 1;
        } else {
           throw e;
        }
      }
    }
  }
}



// instead of exporting the updateUserProfile we should export the decorated func, we can control how many retries we want or keep the default

// old code:

export const updateUserProfile = (input) => {
  //out legacy already implemented data access code
}

// new code
const updateUserProfileLegacy = (input) => {
  //out legacy already implemented data access code
}

export const updateUserProfile = handleDeadLocksHOF(updateUserProfile)


Svetoslav Petkov
  • 1,117
  • 5
  • 13