
I have the following Cloud Function and want to find out whether I should use batched writes or a transaction:

const admin = require('firebase-admin')
admin.initializeApp()

const firestore = admin.firestore()
// The following two queries potentially return hundreds of documents.
const queryA = firestore.collectionGroup('a').where('b', '==', 'c'),
  queryB = firestore.collection('b').where('b', '==', 'c')

const snapshotA = await queryA.get(), snapshotB = await queryB.get()
const batch = firestore.batch()
for (const documentSnapshot of snapshotA.docs.concat(snapshotB.docs)) {
  batch.update(documentSnapshot.ref, { 'b': 'd' })
}
return batch.commit()

I require this operation to never fail; however, I do not see any case in which it could fail.

Is there any reason to use a transaction instead in this case?
Conversely, is there any reason not to use a transaction here?

1 Answer

You are only writing to Firestore (and not reading), so at first sight there is no need to use a transaction, since "transactions are useful when you want to update a field's value based on its current value, or the value of some other field" (see the doc). You should use a batched write.
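
For contrast, a transaction would only be warranted if the new value depended on a value read in the same operation. A minimal sketch of that read-then-write case (the `count` field here is hypothetical, purely for illustration):

// A transaction is only needed when the write depends on a value
// read within the same transaction, e.g. incrementing a counter.
const docRef = firestore.collection('b').doc('someDoc')
await firestore.runTransaction(async (transaction) => {
  const snapshot = await transaction.get(docRef)
  const currentCount = snapshot.get('count') || 0
  transaction.update(docRef, { count: currentCount + 1 })
})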

However, you should note that "each transaction or batch of writes can write to a maximum of 500 documents" (see the same doc). Since you mention that your queries "potentially return hundreds of documents", you may run into this limit and have to split the work into several batches, as sketched below.
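
A minimal sketch of that splitting, reusing `firestore`, `snapshotA` and `snapshotB` from your snippet (note that each batch is atomic on its own, but the group of batches is not):

const MAX_BATCH_SIZE = 500
const docs = snapshotA.docs.concat(snapshotB.docs)
const batchPromises = []
for (let i = 0; i < docs.length; i += MAX_BATCH_SIZE) {
  // One batch per slice of at most 500 documents.
  const batch = firestore.batch()
  for (const documentSnapshot of docs.slice(i, i + MAX_BATCH_SIZE)) {
    batch.update(documentSnapshot.ref, { 'b': 'd' })
  }
  batchPromises.push(batch.commit())
}
// Rejects if any commit fails, even though other commits may have succeeded.
return Promise.all(batchPromises)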


Another point: you say "I do require this operation to never fail, however, I do not see any case this would ever fail". You cannot be sure that it will never fail: as explained in the documentation, a Cloud Function "might exit prematurely due to an internal error" (see the points below for more details). For example, there might be a connectivity problem between the Cloud Functions platform and Firestore, and your batched write would fail (independently of your code).

To fulfill this requirement (i.e. "I do require this operation to never fail") you should take advantage of the possibility to retry a background Cloud Function, and of the fact that a batch of writes completes atomically. For retrying background Cloud Functions, see this doc, which explains:

  1. Why this might happen ("On rare occasions, a function might exit prematurely due to an internal error, and by default the function might or might not be automatically retried"), and
  2. How to deal with this situation (in two words: enable retries and make your background Cloud Functions idempotent, as sketched after this list).
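
As an illustration, here is a minimal sketch of such an idempotent background function. The function name and trigger path are hypothetical, and retries themselves are enabled on the function's deployment configuration (e.g. in the Google Cloud console):

const functions = require('firebase-functions')
const admin = require('firebase-admin')
admin.initializeApp()
const firestore = admin.firestore()

exports.propagateUpdate = functions.firestore
  .document('b/{docId}')
  .onUpdate(async (change, context) => {
    const snapshot = await firestore.collectionGroup('a').where('b', '==', 'c').get()
    const batch = firestore.batch()
    for (const documentSnapshot of snapshot.docs) {
      // Writing a fixed value is idempotent: a retried invocation
      // leaves the documents in the same final state.
      batch.update(documentSnapshot.ref, { 'b': 'd' })
    }
    // If commit() throws, the invocation fails and, with retries
    // enabled, the function runs again with the same event.
    return batch.commit()
  })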
  • This helped to clear up my confusion a lot. I can just take my array of document snapshots, divide it into bundles of 500, run batched writes for all of these bundles, and return `Promise.all`. – creativecreatorormaybenot Jun 18 '19 at 07:59
  • Not sure that by using `Promise.all()` with batched writes you will keep the globally atomic character of the write... See the doc https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/all, which says "If any of the promises reject, Promise.all asynchronously rejects with the value of the promise that rejected, **whether or not the other promises have resolved.**" So you might end up in a situation where (in the same Cloud Function) some of the batched writes are committed while some others are not, and then it might be difficult to ensure idempotency... – Renaud Tarnec Jun 18 '19 at 08:04
  • The operation is idempotent, which means that this should be fine in combination with retry on failure (assuming that all batched writes will eventually succeed together). Also, is there another way? If you know an alternative, that would be great; however, I do not see any way to ensure atomic behavior with the limit of 500 field transformations. – creativecreatorormaybenot Jun 18 '19 at 08:26
  • You're right, it's idempotent (you just update the field `b`), so you should be OK with this approach! – Renaud Tarnec Jun 18 '19 at 08:30