6

I'm using the beta version of Firestore with Cloud Functions. In my app I need to trigger a function that listens for an onCreate event at /companies/{companyId}/points/{pointId} and performs an insert (collection('events').add({...})).

My problem is: Cloud Functions triggered by Firestore must be idempotent. I don't know how to ensure that, if my function is triggered twice in a row with the same event, I won't add two documents with the same data.

I've found that context.eventId could handle that problem, but I don't see how to use it.

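// Assumed setup (not shown in the original snippet)
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();
const db = admin.firestore();

exports.creatingEvents = functions.firestore
  .document('/companies/{companyId}/points/{pointId}')
  .onCreate((snap, context) => {

    // some logic that builds `data`...

    return db.doc(`eventlog/${context.params.companyId}`).collection('events').add(data);
  });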
brabster
João Victor

2 Answers

10

Two things:

  1. First check your collection to see if a document has a property with the value of context.eventId in it. If it exists, do nothing in the function.
  2. If a document with the event id doesn't already exist, put the value of context.eventId in a property in the document that you add.

This should prevent multiple invocations of the function from adding more than one document for a given event id.
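
The two steps above could be sketched roughly as follows. This is only an illustration: `addEventOnce` is a hypothetical helper name (not part of the Firebase SDK), `eventsRef` is assumed to be the Firestore collection reference from the question, and the field name `eventId` is an arbitrary choice.

```javascript
// Hypothetical helper; `eventsRef` is assumed to be a Firestore CollectionReference,
// e.g. db.doc(`eventlog/${context.params.companyId}`).collection('events').
async function addEventOnce(eventsRef, eventId, data) {
  // Step 1: look for a document that already carries this event ID.
  const existing = await eventsRef.where('eventId', '==', eventId).limit(1).get();
  if (!existing.empty) {
    return null; // this event was already handled, so do nothing
  }
  // Step 2: store the event ID in a property of the new document.
  return eventsRef.add({ ...data, eventId });
}

// Inside the trigger from the question this might be called as:
//   .onCreate((snap, context) => addEventOnce(eventsRef, context.eventId, data))
```

Note that the check and the write are separate operations here, so this sketch relies on the same event not being processed by two invocations at exactly the same time.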

Doug Stevenson
  • Good answer - and probably as good a guarantee of correctness as you'll get with the `set` approach I outlined if Cloud Functions guarantees no concurrent execution of the same event. The `set` approach saves a network round-trip though, which can matter under load. – brabster Apr 26 '18 at 16:33
  • 1
    The `set()` approach demands that the body of the function have no variance between executions, else you end up with two different writes. – Doug Stevenson Apr 26 '18 at 16:48
  • Definitely worth pointing out, but I would assume by default that we're interested in the payload that was sent to the function by whatever invoked it, rather than details of the execution context (other than a unique ID for the payload). I'd typically just store the invocation payload rather than the context details unless I had a reason to do otherwise. – brabster Apr 26 '18 at 16:50
  • 1
    Will context.eventId always be unique? Using both approaches I could handle my problem :) – João Victor Apr 26 '18 at 16:51
  • I'm checking with the Cloud Functions team on how exactly eventId is unique. I think for your particular case it'll be OK, but there may be some other cases where there's a dup (for example, Firestore ids may not share the same space as Realtime Database ids). – Doug Stevenson Apr 26 '18 at 17:33
  • 2
    The event ID should always be unique for a given event provider. So, all your events coming from Firestore will be unique among themselves, but not necessarily unique when taken in combination with other event providers, such as Realtime Database. – Doug Stevenson Apr 29 '18 at 15:52
  • 1
    So this events collection will grow forever... Is there a good way to clean it out as you go without requiring another function? – Jason Scott Aug 02 '18 at 08:43
3

Why not set the document (indexing by the event id from your context) instead of creating it? This way if you write it twice, you'll just overwrite rather than create a new record.

https://firebase.google.com/docs/firestore/manage-data/add-data

This approach makes the write operation idempotent.
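
A minimal sketch of that idea, under a few assumptions: `setEventById` is a hypothetical helper name, `eventsRef` is the Firestore collection reference from the question, the event ID is usable as a document ID (for instance, it contains no forward slashes), and the function produces the same `data` on every execution for a given event.

```javascript
// Hypothetical helper; `eventsRef` is assumed to be a Firestore CollectionReference.
// The event ID becomes the document ID, so a repeated invocation for the same
// event overwrites the same document instead of creating a second one.
function setEventById(eventsRef, eventId, data) {
  return eventsRef.doc(eventId).set(data);
}

// Inside the trigger from the question this might be called as:
//   .onCreate((snap, context) => setEventById(eventsRef, context.eventId, data))
```

No read is needed before the write, which is where the saved round-trip comes from. If `data` differs between executions, the later write replaces the earlier one.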

brabster
  • 1
    Hmm, but with set I need to know the document ID, and currently Firestore generates it for me. I think if I merge your solution with @doug-stevenson's and use context.eventId as the document ID, I can use "set" and avoid an extra query to check whether a document already has a property with the value of context.eventId. – João Victor Apr 26 '18 at 16:38
  • 1
    There's not enough information in the body of the given function to know if two executions will generate the exact same document data. (If not guaranteed to be exactly the same, then it wouldn't be idempotent.) – Doug Stevenson Apr 26 '18 at 16:42
  • 1
    You are correct, but in my case, two executions will generate the same data – João Victor Apr 26 '18 at 16:44
  • 1
    Sounds like you have a plan. Happy we could help :) – brabster Apr 26 '18 at 16:51