
I have a Firebase Cloud Function that monitors changes to my Realtime Database based on a sample provided in Firebase's documentation.

My function is working correctly and executes for every change as it was written to do.

That said, Firebase's documentation recommends the following:

• Debouncing - when listening to realtime changes in Cloud Firestore, this solution is likely to trigger multiple changes. If these changes trigger more events than you want, manually debounce the Cloud Firestore events.

I would like to do just that.

Can anyone offer a good approach?

If we look at this function based on Firebase's example:

exports.onUserStatusChanged = functions.database.ref('/status/{uid}').onUpdate(
            async (change, context) => {

  // Get the data written to Realtime Database
  const eventStatus = change.after.val();

  // Create a reference to the corresponding Firestore document
  const userStatusFirestoreRef = firestore.doc(`status/${context.params.uid}`);

  // re-read the current data and compare the timestamps.

  const statusSnapshot = await change.after.ref.once('value');
  const status = statusSnapshot.val();

  // If the current timestamp for this data is newer than
  // the data that triggered this event, we exit this function.

  if (status.last_changed > eventStatus.last_changed) {
    return null;
  }

  // Otherwise, we convert the last_changed field to a Date

  eventStatus.last_changed = new Date(eventStatus.last_changed);

  // Write it to Firestore, returning the promise so the function
  // does not terminate before the write completes.

  return userStatusFirestoreRef.set(eventStatus, { merge: true });
});

How should I debounce its execution?

Can I debounce the .onUpdate() event itself?

I originally thought the following would suffice:

functions.database.ref('/status/{uid}').onUpdate(
  debounce(async(change:any, context:any) => {
    ...
  }, 10000, {
    leading: true,
    trailing: false
  })
);

However, @doug-stevenson pointed out that debouncing the onUpdate event in this way will not work, for the following reason:

"That's not going to work because each invocation of the function might happen in a completely different server instance with no shared context."
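
To illustrate that point, here is a minimal simulation (plain TypeScript, not Cloud Functions code). Each call to the hypothetical `makeInstance` stands in for a separate server instance that holds its own leading-edge debounce state in a closure, much as each instance would hold its own copy of a debounce wrapper. The names `makeInstance`, `handled`, `instanceA` and `instanceB` are made up for this sketch.

```typescript
// Simulation only: each makeInstance() call represents one server instance
// with its own private memory. All names here are hypothetical.
type Handler = (uid: string, nowMs: number) => void;

const handled: string[] = [];

function makeInstance(label: string, waitMs: number): Handler {
  // Leading-edge debounce state, visible only to this instance.
  let lastCallMs = -Infinity;
  return (uid, nowMs) => {
    const quietLongEnough = nowMs - lastCallMs >= waitMs;
    lastCallMs = nowMs; // every call restarts the quiet period
    if (quietLongEnough) {
      handled.push(`${label}:${uid}`);
    }
  };
}

const instanceA = makeInstance("A", 10000);
const instanceB = makeInstance("B", 10000);

// Three updates for the same user arrive within 10 seconds, but the
// platform happens to route them to different instances.
instanceA("uid1", 0);    // runs: first call seen by A
instanceB("uid1", 1000); // runs too: B knows nothing about A's call
instanceA("uid1", 2000); // suppressed: A remembers its own earlier call

// handled is now ["A:uid1", "B:uid1"]: two executions instead of one.
```

The debounce only suppresses calls that land on the same instance, which is why the state has to live somewhere shared.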

DevMike

2 Answers


One way is to use a task scheduler (e.g., Google Cloud Tasks). Rather than handling the event directly in the cloud function itself, you'll use the task scheduler to control when the event is handled.

I've included two approaches: one for a debounce and one for a delayed throttle.

Debounce

The idea is to enqueue a task for each cloud function invocation. If there is already a task scheduled for that entity, the existing task should be canceled.

For example, if your debounce interval is 5 minutes, schedule each task 5 minutes into the future. When each task is scheduled, cancel the previous task for that entity, if there is one. When the task finally runs, it means there were no other cloud invocations within 5 minutes (i.e., a successful debounce).
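
The cancel-and-reschedule flow might be sketched as follows. This is only an outline under stated assumptions: `TaskScheduler`, `PendingTaskStore` and `debounceEvent` are hypothetical names, not part of any Google library. In a real setup the scheduler would presumably wrap Cloud Tasks (creating and deleting tasks) and the store would be a database that remembers the pending task per entity. The in-memory classes exist only to make the sketch runnable.

```typescript
// Hypothetical interfaces: in practice these might wrap Cloud Tasks and a
// database that records the pending task for each entity.
interface TaskScheduler {
  schedule(runAtMs: number, entityId: string): Promise<string>; // resolves to a task id
  cancel(taskId: string): Promise<void>;
}

interface PendingTaskStore {
  get(entityId: string): Promise<string | undefined>;
  set(entityId: string, taskId: string): Promise<void>;
}

// Called once per cloud function invocation: replaces any pending task for
// the entity with a new one scheduled delayMs into the future.
async function debounceEvent(
  scheduler: TaskScheduler,
  pending: PendingTaskStore,
  entityId: string,
  delayMs: number,
  nowMs: number = Date.now(),
): Promise<string> {
  const previousTaskId = await pending.get(entityId);
  if (previousTaskId !== undefined) {
    await scheduler.cancel(previousTaskId);
  }
  const taskId = await scheduler.schedule(nowMs + delayMs, entityId);
  await pending.set(entityId, taskId);
  return taskId;
}

// In-memory stand-ins, for illustration only.
class InMemoryScheduler implements TaskScheduler {
  readonly tasks = new Map<string, { runAtMs: number; entityId: string }>();
  private nextId = 0;

  async schedule(runAtMs: number, entityId: string): Promise<string> {
    const taskId = `task-${this.nextId++}`;
    this.tasks.set(taskId, { runAtMs, entityId });
    return taskId;
  }

  async cancel(taskId: string): Promise<void> {
    this.tasks.delete(taskId);
  }
}

class InMemoryPendingStore implements PendingTaskStore {
  private readonly taskByEntity = new Map<string, string>();

  async get(entityId: string): Promise<string | undefined> {
    return this.taskByEntity.get(entityId);
  }

  async set(entityId: string, taskId: string): Promise<void> {
    this.taskByEntity.set(entityId, taskId);
  }
}
```

Calling `debounceEvent` twice for the same entity leaves only the second task pending. Note that the read, cancel and write here are not atomic, so two simultaneous invocations could still race. A real implementation would also need to tolerate canceling a task that has already run.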

Delayed Throttle

A delayed throttle means that your event will be handled at most once per interval, at the end of the interval.

The idea is: each time the cloud function runs, enqueue a task only if it's not a duplicate. You'll want to come up with a task naming convention that lets you de-dupe redundant tasks.

For example, you could append the entity id with the scheduled execution time. When your cloud function runs, if there is already a scheduled task for that id and time, you can safely ignore that event.

Here's a code sample that handles events at most once per minute.

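// Assumes `client` is a CloudTasksClient from @google-cloud/tasks and that
// `id`, `project`, `location` and `queue` are defined elsewhere.
// Round up to the next whole minute (Unix time in minutes).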
const scheduleTimeUnixMinutes = Math.ceil(new Date().getTime() / 1000 / 60);
const taskName = id + scheduleTimeUnixMinutes.toString();
const taskPath = client.taskPath(project, location, queue, taskName);

// if there's already a task scheduled for the next minute, we have nothing
// to do.  Google's client library throws an error if the task does not exist.
try {
  await client.getTask({ name: taskPath });
  return;
} catch (e) {
  // NOT_FOUND === 5.  If the error code is anything else, bail.
  if (e.code !== 5) {
    throw e;
  }
}

// TODO: create task here
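
One way the TODO above might be filled in is sketched below, with the queue hidden behind a hypothetical `ThrottleQueue` interface so the example runs on its own. It assumes the queue rejects a duplicate task name with gRPC code 6 (ALREADY_EXISTS), which is my understanding of how Cloud Tasks behaves. Under that assumption the create call itself can serve as the de-dupe check. `throttledTaskName`, `enqueueThrottled` and `InMemoryQueue` are made-up names, and the hyphen in the task name is my addition to keep the id and the time slot unambiguous.

```typescript
const ALREADY_EXISTS = 6; // gRPC status code

// Hypothetical wrapper around the task queue. Assumed to reject with an
// error carrying { code: 6 } when a task with that name already exists.
interface ThrottleQueue {
  createNamedTask(taskName: string, runAtMs: number): Promise<void>;
}

// Builds a task name that is identical for every event in the same interval.
function throttledTaskName(
  id: string,
  nowMs: number,
  intervalMs: number,
): { taskName: string; runAtMs: number } {
  const slot = Math.ceil(nowMs / intervalMs); // round up to the end of the interval
  return { taskName: `${id}-${slot}`, runAtMs: slot * intervalMs };
}

// Resolves true if a task was enqueued, false if one was already scheduled.
async function enqueueThrottled(
  queue: ThrottleQueue,
  id: string,
  nowMs: number,
  intervalMs: number,
): Promise<boolean> {
  const { taskName, runAtMs } = throttledTaskName(id, nowMs, intervalMs);
  try {
    await queue.createNamedTask(taskName, runAtMs);
    return true;
  } catch (e) {
    const code = typeof e === "object" && e !== null ? (e as { code?: unknown }).code : undefined;
    if (code === ALREADY_EXISTS) {
      return false; // duplicate within this interval: nothing to do
    }
    throw e;
  }
}

// In-memory stand-in, for illustration only.
class InMemoryQueue implements ThrottleQueue {
  readonly tasks = new Map<string, number>();

  async createNamedTask(taskName: string, runAtMs: number): Promise<void> {
    if (this.tasks.has(taskName)) {
      throw Object.assign(new Error("task already exists"), { code: ALREADY_EXISTS });
    }
    this.tasks.set(taskName, runAtMs);
  }
}
```

With a one-minute interval, events at 61 s and 90 s map to the same task name, so only the first is enqueued. An event at 121 s falls into the next interval and gets its own task.
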
chessdork

Since each event might be delivered more than once, you must track the event ID provided in context.eventId. If you see an ID you have already handled, you know the event is being redelivered.

There are many strategies for this, and no single right way to do it. You could store the processed ID in a database or some other persistent storage, but you can't just store it in memory, since each function invocation can run in complete isolation from the others.
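
As one possible shape for this, here is a minimal sketch with the storage hidden behind a hypothetical `ProcessedEventStore` interface. `handleOnce` and `InMemoryEventStore` are made-up names, and the in-memory store is only there so the example runs. A real store would have to be a database or similar, for the reason given above.

```typescript
// Hypothetical interface: in practice backed by persistent storage,
// never by memory inside the function instance.
interface ProcessedEventStore {
  has(eventId: string): Promise<boolean>;
  add(eventId: string): Promise<void>;
}

// Runs work() unless eventId was already handled successfully.
// Resolves true if the work ran, false if the event was a duplicate.
async function handleOnce(
  store: ProcessedEventStore,
  eventId: string,
  work: () => Promise<void>,
): Promise<boolean> {
  if (await store.has(eventId)) {
    return false; // duplicate delivery: already handled
  }
  await work();
  // Record only after success, so a failed attempt can be retried.
  await store.add(eventId);
  return true;
}

// In-memory stand-in, for illustration only.
class InMemoryEventStore implements ProcessedEventStore {
  private readonly ids = new Set<string>();

  async has(eventId: string): Promise<boolean> {
    return this.ids.has(eventId);
  }

  async add(eventId: string): Promise<void> {
    this.ids.add(eventId);
  }
}
```

In a function handler, `eventId` would be `context.eventId`. The check and the write here are separate steps, so two concurrent deliveries of the same event could both run. With a transactional database the two steps could be combined.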

Also read about "idempotence", the property of functions that behave the same way with each invocation.

https://firebase.google.com/docs/functions/tips#write_idempotent_functions

https://cloud.google.com/blog/products/serverless/cloud-functions-pro-tips-building-idempotent-functions

Doug Stevenson
  • When you say a duplicate context.eventId constitutes a repeated event, what exactly is repeated? An execution for the same resolved path such as ref('/status/UID12345'), the updated/changed data, or must both be true (same resolved path, same attempted data change) for an event ID to repeat? – DevMike Oct 31 '19 at 15:42
  • The entire event that was delivered to the function. Individual has nothing to do with it. – Doug Stevenson Oct 31 '19 at 15:56
  • Sorry, that comment should have said "individual path has nothing to do with it". – Doug Stevenson Oct 31 '19 at 16:10
  • OK, so if I understand correctly, I need to take into account that it is _possible_ for an event to repeat, which can be detected via matching eventIds, and thus in my case I'd need to ensure that I debounce execution only for unique eventIds that have not yet been 'processed'? By 'processed' I am referring to your linked video tutorial, where I can store the eventId along with my data as a way of tracking whether or not an event has completed. – DevMike Oct 31 '19 at 16:12
  • Yes, you will want to ignore invocations where you have previously recorded a successful handling of the event. – Doug Stevenson Oct 31 '19 at 16:23
  • Thank you! I think I get it now. – DevMike Oct 31 '19 at 16:35