
I'm attempting to listen for new commits to a public GitHub repository that I do not own, and would like to push new-commit events to GCP Pub/Sub so that a function can consume them further. I know that GitHub exposes RSS/Atom feeds for different event types in a given repository, and I'm curious how we can process changes efficiently in near real time.

Here are two similar workflows that I thought about:

  1. Routinely schedule a job to pull the RSS feed, check for new changes against what we've already seen and processed, and enqueue whatever is fresh.

  2. Fork the repository and connect it to Cloud Source Repositories, so that we can enqueue notifications on changes. We can then set up a Fork Sync action to routinely sync the fork itself (say, every 2-5 minutes).
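For concreteness, approach 1 could look roughly like this. The commits Atom feed URL pattern is GitHub's public one; the dedupe store is an in-memory set and the Pub/Sub publish step is stubbed out with a callback, so this is a simplified sketch rather than a production job:

```python
# Sketch of approach 1: poll a repo's public commits Atom feed, dedupe
# against commit IDs we have already processed, and hand fresh entries
# to a publish callback (which would wrap a Pub/Sub publisher).
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

def parse_feed(xml_text):
    """Return (id, title) pairs for each <entry> in an Atom feed."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall(f"{ATOM_NS}entry"):
        entry_id = entry.find(f"{ATOM_NS}id").text
        title = entry.find(f"{ATOM_NS}title").text.strip()
        entries.append((entry_id, title))
    return entries

def publish_fresh(entries, seen_ids, publish):
    """Publish entries whose IDs we have not seen before, and record them."""
    fresh = [(i, t) for i, t in entries if i not in seen_ids]
    for entry_id, title in fresh:
        publish({"commit_id": entry_id, "title": title})
        seen_ids.add(entry_id)
    return fresh

def poll_once(feed_url, seen_ids, publish):
    """One polling cycle: fetch the Atom feed and enqueue fresh commits.

    feed_url would be e.g.
    https://github.com/{owner}/{repo}/commits/{branch}.atom
    """
    with urllib.request.urlopen(feed_url) as resp:
        return publish_fresh(parse_feed(resp.read()), seen_ids, publish)
```

The `seen_ids` set would in practice have to live in durable storage (e.g. Firestore or a GCS object) so state survives between scheduled runs, which is part of what makes this approach feel heavier than it should be.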

My worry is that both approaches rely on scheduled jobs, which may be unnecessary and potentially expensive during periods with no changes at all. Is there a more effective approach that triggers processing without continuously polling for changes?

Kabilan Mohanraj
alcao758
    Have you explored webhooks? https://docs.github.com/en/developers/webhooks-and-events/webhooks/about-webhooks – Gourav B Sep 17 '21 at 19:34

1 Answer


You can first mirror the repository into Cloud Source Repositories. By doing that, "Cloud Source Repositories automatically syncs your repository with the mirrored repository when a user commits a change". Then you can set up Cloud Pub/Sub notifications on the repository and subscribe to the "RefUpdate" event. Finally, your function can consume the Pub/Sub messages that are generated.
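The consuming side could be a background Cloud Function triggered by that Pub/Sub topic. This is a minimal sketch; the payload field names (`refUpdateEvent`, `refUpdates`, `newTargetId`) are my reading of the Cloud Source Repositories notification format, so verify them against the current documentation:

```python
# Sketch of a Pub/Sub-triggered background Cloud Function that extracts
# the updated refs and new commit SHAs from a Cloud Source Repositories
# "RefUpdate" notification. Payload field names are assumptions drawn
# from the documented notification format; verify against current docs.
import base64
import json

def handle_ref_update(event, context=None):
    """Triggered by a Pub/Sub message; returns (ref, new SHA) pairs."""
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    ref_event = payload.get("refUpdateEvent", {})
    commits = []
    for ref_name, update in ref_event.get("refUpdates", {}).items():
        # newTargetId is the SHA the ref points at after the push.
        commits.append((ref_name, update.get("newTargetId")))
    return commits
```

Since the mirror syncs on every upstream change, this path is push-driven end to end and avoids the scheduled polling you were worried about (apart from any Fork Sync interval on the fork itself).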

Sri