-2

I have some TypeScript code in a project that does a number of native DynamoDB update operations:

import { nativeDocumentClient, nativeDynamo } from '../utils/aws';

// snipped code

// updatedProducts is of type `{ id: string; siteId: string; state: ProductState }[]`

await Promise.all(
  updatedProducts.map(({ id, siteId, state }) =>
    nativeDocumentClient
      .update({
        TableName,
        Key: {
          id,
          siteId,
        },
        ExpressionAttributeNames: {
          '#s': 'state',
        },
        ExpressionAttributeValues: {
          ':state': state,
          ':updatedAt': new Date().toISOString(),
        },
        UpdateExpression: 'SET #s = :state, updatedAt = :updatedAt',
        ReturnValues: 'ALL_NEW',
      })
      .promise()
  )
);

Now we would always expect this record (with its composite Key) to exist, but we have found a rare situation where this is not the case (it's probably just poor data in a nonprod environment rather than a bug specifically). Unfortunately it looks like the underlying UpdateItemCommand does an upsert and not an update, despite its name. From the official docs:

Edits an existing item's attributes, or adds a new item to the table if it does not already exist

We can do a guard clause that does a get on the record, and skips the upsert if it does not exist. However that feels like an invitation for a race condition - is there an option on .update() that would get us what we want without separate conditionals?


Update - a guard clause might look like this in pseudocode:

object = get(TableName, Key: {
  id,
  siteId,
})
// Do update only if the document exists
if (object) {
    update(
      TableName,
      Key: {
        id,
        siteId,
      },
      ...
    )
 }

I think this could be susceptible to a race condition because the read and write operations are not wrapped in any sort of transaction or lock.

halfer
  • 19,824
  • 17
  • 99
  • 186
  • Ah, I wonder if we could do `ConditionExpression: 'attribute_exists(id)'` or something like that - basically if the old object has one of its keys set, then this is an update and not an upsert, and the operation can go ahead. – halfer Jan 17 '23 at 18:03
  • This idea is a bit mucky, but for the sake of completeness: set `ReturnValues: 'ALL_OLD'` and delete the new record if the old returned one was empty. – halfer Jan 17 '23 at 18:05
  • See [Confirming existence or non-existence of an item](https://www.alexdebrie.com/posts/dynamodb-condition-expressions/#1-confirming-existence-or-non-existence-of-an-item). – jarmod Jan 17 '23 at 18:51

2 Answers2

1

All writes/updates to DynamoDB are strongly serializable.

Using a ConditionExpression for attribute_exists() would ensure you only update the item if it exists, if it does not exist the update will fail.

aws dynamodb update-item \
    --table-name MyTable \
    --key file://config.json \
    --update-expression "SET #Y = :y, #AT = :t" \
    --expression-attribute-names file://expression-attribute-names.json \
    --expression-attribute-values file://expression-attribute-values.json  \
    --condition-expression "attribute_exists(#PK)"
Leeroy Hannigan
  • 11,409
  • 3
  • 14
  • 31
  • Doing the Get then the Update can be a race condition unless you add more conditional update clauses. If you only care about the item existing before you call Update (rather than upserting the item), then Lee's suggestion to add "attribute_exists(#PK)" to your Update logic is the simplest way to achieve that. – Chris Anderson Jan 17 '23 at 19:09
  • Although this answer looked very promising, I am seeing a case of an AWS exception where the record does not exist, defeating the purpose of adding the clause. We are trying the code-level conditional, which as @ChrisAnderson rightly says would theoretically have a race condition, but we don't delete records - just mark them as unpublished. So that might be easier overall - easier to grok, and no unexpected side-effects. See: https://stackoverflow.com/questions/38733363/dynamodb-put-item-conditionalcheckfailedexception – halfer Jan 20 '23 at 12:56
  • I'd expect an exception when the record doesn't exist, because the condition expression requires it to exist. If you share the specific exception message, I can confirm that's expected. – Chris Anderson Jan 20 '23 at 18:33
  • @ChrisAnderson, thanks! I have added another answer, which was the solution we went with. This contains the error we saw. I didn't read Lee's caveat carefully enough - apologies - in which it was said that the operation will fail in the case of non-existence. – halfer Jan 20 '23 at 19:10
  • I've transferred the acceptance mark to my answer, to let readers accurately know what solution we went with. I've added an upvote to make up for the lost rep – halfer Feb 03 '23 at 22:37
-1

I am adding a self-answer, as the other existing answer didn't work. We did try it, but it crashed in the case of the record not existing:

2023-01-20T11:00:00.000Z    c9175s91-8471-9834-cd43-89dc62ac7381    ERROR   Gh [Error]: ConditionalCheckFailedException: The conditional request failed
    at Mt._assembleError (/var/task/src/importers/product/productImportHandler.js:71:62727)
    at Mt.feed (/var/task/src/importers/product/productImportHandler.js:71:62071)
    at TLSSocket.<anonymous> (/var/task/src/importers/product/productImportHandler.js:71:27931)
    at TLSSocket.emit (events.js:400:28)
    at TLSSocket.emit (domain.js:475:12)
    at addChunk (internal/streams/readable.js:293:12)
    at readableAddChunk (internal/streams/readable.js:267:9)
    at TLSSocket.Readable.push (internal/streams/readable.js:206:10)
    at TLSWrap.onStreamRead (internal/stream_base_commons.js:188:23)
    at TLSWrap.callbackTrampoline (internal/async_hooks.js:130:17) {
  time: 1674212125481,
  code: 'ConditionalCheckFailedException',
  retryable: false,
  requestId: 'xxxxxxxxxx',
  statusCode: 400,
  _tubeInvalid: false,
  waitForRecoveryBeforeRetrying: false,
  _message: 'The conditional request failed',
  codeSeq: [ 4, 37, 38, 39, 43 ],
  cancellationReasons: undefined
}

We solved it with a ternary short-circuit:

const product = await detailsRepository.get(id, siteId);

// Only update when product exists
return (
  product &&
  nativeDocumentClient
    .update({
      TableName,
      Key: {
        id,
        siteId,
      },
      ExpressionAttributeNames: {
        '#s': 'state',
      },
      ExpressionAttributeValues: {
        ':state': state,
        ':updatedAt': new Date().toISOString(),
      },
      UpdateExpression: 'SET #s = :state, updatedAt = :updatedAt',
      ReturnValues: 'ALL_NEW',
    })
    .promise()
);

It has been correctly pointed out on this page that this could theoretically create a race condition, between detecting the presence of the object and it being deleted by another process; however, we don't delete this kind of document in our app. This solution doesn't crash in the case of an object not existing, but moreover we like it because it is easier to understand.

As an alternative, perhaps we could have stuck with the condition expression in Dynamo, but used a try catch to ignore ConditionalCheckFailedException throws.

halfer
  • 19,824
  • 17
  • 99
  • 186