
I want to store the configuration of a no-code tool, as a JSON file, in a Git repository (on GitHub).

I'd get this "config" object (JSON), which contains the configuration of all my database tables and columns, and I want to push it into a Git repository at a regular interval (whenever there are changes).

The goal is to keep track of that file across time, to see what has changed and when, as well as having a backup of the DB structure.

Getting the data is not the issue; I'm more interested in how Git could help. I'm concerned about automatically committing/pushing that file at a regular interval (I don't know what issues I might run into).

Also, and that's probably the hardest part, I want to generate files based on the "config" object: basically, re-creating the whole database structure as folders and files, so that every "column" of the DB has its own text file describing its configuration.
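
For illustration, here is roughly what I mean by that generation step. This is only a sketch: it assumes the config object looks like { tables: [{ name, columns: [{ name, ...settings }] }] }, which is a made-up shape (the real one is in the gist linked below).

const fs = require('fs');
const path = require('path');

// Sketch only. Assumed (hypothetical) config shape:
// { tables: [{ name, columns: [{ name, ...settings }] }] }
function generateFiles(config, outDir) {
  for (const table of config.tables) {
    // One folder per table.
    const tableDir = path.join(outDir, table.name);
    fs.mkdirSync(tableDir, { recursive: true });
    for (const column of table.columns) {
      // One text file per column, describing its configuration.
      const target = path.join(tableDir, `${column.name}.json`);
      fs.writeFileSync(target, JSON.stringify(column, null, 2));
    }
  }
}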

So, at a regular interval, I'd need to:

  • Get the newest version of the config file (using JS)
  • Commit the config file (I'm not sure how to do this automatically; I've only ever used Git manually, not programmatically, so I guess I need something like https://github.com/nodegit/nodegit; see the sketch after this list)
  • Generate new files (or update existing ones) based on what's changed (I'm thinking of using GitHub Actions to generate the files and commit/push them after the config file has changed)
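
For the commit step specifically, I imagine something along these lines with nodegit. This is an untested sketch: it assumes the repo is already cloned locally, that the branch is main, and that a token is available in a GITHUB_TOKEN environment variable (commitConfig and config-bot are names I made up).

const Git = require('nodegit');

// Untested sketch. Assumes a local clone at repoPath, a "main" branch,
// and a token in process.env.GITHUB_TOKEN.
async function commitConfig(repoPath, filePath, message) {
  const repo = await Git.Repository.open(repoPath);

  // Stage the file and write the resulting tree.
  const index = await repo.refreshIndex();
  await index.addByPath(filePath);
  await index.write();
  const treeOid = await index.writeTree();

  // Create a commit on HEAD, with the current HEAD as parent.
  const headOid = await Git.Reference.nameToId(repo, 'HEAD');
  const parent = await repo.getCommit(headOid);
  const sig = Git.Signature.now('config-bot', 'config-bot@example.com');
  await repo.createCommit('HEAD', sig, sig, message, treeOid, [parent]);

  // Push to origin/main, authenticating with the token.
  const remote = await repo.getRemote('origin');
  await remote.push(['refs/heads/main:refs/heads/main'], {
    callbacks: {
      // GitHub accepts a token as the username with 'x-oauth-basic' as password.
      credentials: () =>
        Git.Cred.userpassPlaintextNew(process.env.GITHUB_TOKEN, 'x-oauth-basic'),
    },
  });
}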

I wonder whether Git conflicts could arise and how to avoid them. I'm thinking of force-pushing all the time to sidestep the problem, but I'm not sure that's a great idea either.

Here is an example of the "config" I mentioned: https://gist.github.com/Vadorequest/6972b9cc91b36e4273bd80eaa28a84cd

I wonder if I could use something like Vercel to host the API that would commit the config file to my Git repository, or if I could do that directly from a browser (committing from a Chrome extension, though that's probably not secure, hence the need for an API).

I guess I could then simply react to "changes" in the GitHub repository to trigger a GitHub Action that would detect which files should be re-generated, generate them, and commit/push them to the repository.
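
For example, a trigger along these lines (a sketch; it assumes the config is committed as config.json at the repository root, so adjust the path) would run the generation workflow only when the config file changes:

on:
  push:
    branches: [main]
    paths:
      - 'config.json'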

One of my main concerns with this design is Git conflicts: if I keep making programmatic commits/pushes from different sources, I'll probably run into conflicts, and I'm not sure how to resolve those programmatically.

Vadorequest
  • As a general rule, if a file is generated, it's not source code, and shouldn't be in a Git repository in the first place. Git isn't a build or deployment tool and trying to force it to act as one leads to grief. Store the JSON source, generate the derived file in a build process, and then deploy that, with the generated file never appearing in Git, and things will be much happier. – torek Jun 12 '22 at 00:09
  • Yeah, I also thought about that (using a DB instead of Git), but the goal is to have a GitHub repository that contains the files. What I'm trying to do is convert a manual process into an automated process. Currently, I do all that manually (creating folders/files and copying their content into my GitHub repo), and I want to automate it while not changing my workflow and way of doing things. I love the fact that I can simply do a mass-search for a keyword across all my databases in my current local Git copy; I'm just tired of keeping it up-to-date manually. – Vadorequest Jun 12 '22 at 07:10
  • Offering a counter-opinion, I think this workflow is something you could do in Git. It's not the typical use of Git, but I think Git can be part of your solution. GitHub Actions are a great place to run the automated scripts. Saving large binary files might not be a good idea, but saving generated text files would be fine, as it would give you the ability to see their history, as you say. – joanis Jun 12 '22 at 19:19
  • Conflicts: if you do your workflow right, you should not have conflicts. Conflicts arise from merging branches that have diverged, but if you're pushing changes and that triggers regenerating the files, the commit on that should work without conflicts. Force pushing: I advise against that, strongly, that's going to give you lots of conflicts to resolve. Create an incremental workflow, not a force push workflow! – joanis Jun 12 '22 at 19:19
  • We use workflows similar to what you describe for some of our projects, and it's very useful. When we push the config changes, derived files are generated and committed right in the GitHub Actions workflow. A specific example: Docusaurus web page generation, the generated static web page is automatically saved to a branch in the Git repo and GitHub Pages serves the site directly from that branch. I think there are many workflows that follow the general approach you're asking about. – joanis Jun 12 '22 at 19:21
  • Thanks @joanis for the feedback, it's great to hear! I've discovered a whole new programming world since asking the above question, https://github.com/nodegit/nodegit in particular, which has led me to asking other questions, like https://stackoverflow.com/questions/72595392/nodegit-how-to-know-if-a-commit-is-empty-to-avoid-pushing-empty-commits - If you have any tip regarding programmatic Git I'm very much interested to hear them! :) So far things are working out, and I haven't needed to force push anything! – Vadorequest Jun 12 '22 at 19:52
  • @joanis: I agree that there are some subsets of problems where this can work fine. In particular, if you generate an entire tree and then simply write that tree to a new Git snapshot on a branch that simply accumulates releases (and never has any work done directly on it), that should work without any snags. The *potential* snag comes from someone thinking "oh, I should work on *this* branch" - how to stop that becomes the main problem. – torek Jun 12 '22 at 20:19
  • There is another potential issue, related to the concurrency of changes. Because it's a server that creates the commits, I suspect it could happen that if there were a lot of concurrent calls, there could be multiple commits created at the same time using the same parent commit, and trying to push at the same time. One of them would pass through, while the other would be stuck. But that's an advanced scenario and I would simply ignore the missed commit for my needs. – Vadorequest Jun 12 '22 at 20:25
  • @torek Yeah, that can be an issue, but I would solve that by saying, e.g., only a push to release, or maybe main, triggers a build to update, e.g., the publish branch. GitHub Actions CI workflows can be as specific as needed, and will work well with a carefully thought out overall workflow. – joanis Jun 12 '22 at 20:26
  • @Vadorequest if you write your script carefully, you can detect that the push got rejected (that'll happen by default if it's not a fast-forward push). In your other question, you're talking about nodegit, but keep in mind that in a GitHub Actions yaml script, each `run:` block is a full shell script where it's easy to use the Git CLI commands. That's how I do all my automation and the route I would recommend. – joanis Jun 12 '22 at 20:28

1 Answer


You can use GitHub Actions to do this easily.

You can write a workflow trigger to run a workflow at scheduled intervals. The following will run your workflow every 30 minutes. In my experience, GitHub Actions cron does not fire at exactly the scheduled time; it's hit or miss.

on:
  schedule:
    - cron: '*/30 * * * *'

You can then write a step that generates your config file.

- name: Generate config file
  run: |
    # Generate your config file

Then, commit your config file. Note that this assumes the repository was checked out earlier in the job (with actions/checkout), as shown in the full workflow below.

- name: Commit to git
  run: |
    git config user.name "Vadorequest"
    git config user.email "vadorequest@earth.com"
    git remote set-url origin https://${{github.actor}}:${{github.token}}@github.com/${{github.repository}}.git
    git add config.json
    git commit -m "Committing back my config file"
    git push origin main

This will commit config.json to your repository.

See the contexts documentation for details on github.actor, github.token, and github.repository: https://docs.github.com/en/actions/learn-github-actions/contexts

This is the full workflow.

name: Generate config files and commit to git
on:
  schedule:
    - cron: '*/30 * * * *'

jobs:
  config:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Generate config file
        run: |
          # Generate your config file
      - name: Commit to git
        run: |
          git config user.name "Vadorequest"
          git config user.email "vadorequest@earth.com"
          git remote set-url origin https://${{github.actor}}:${{github.token}}@github.com/${{github.repository}}.git
          git add config.json
          git commit -m "Committing back my config file"
          git push origin main
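
One caveat worth noting: if a scheduled run produces no changes, git commit exits non-zero and fails the job, and a concurrent push can cause git push to be rejected as non-fast-forward. A variant of the commit step that guards against both (a sketch, not the only way to do it):

- name: Commit to git
  run: |
    git add config.json
    # Only commit when something actually changed in the index.
    if ! git diff --cached --quiet; then
      git commit -m "Committing back my config file"
      # Integrate any commits pushed in the meantime before pushing,
      # so the push isn't rejected as non-fast-forward.
      git pull --rebase origin main
      git push origin main
    fi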
Haunted
  • This is an interesting start, but it's not really what I aim to do (I'd rather use an HTTP-based trigger, or even a file-changed trigger, which is better in my particular case). I think you thought my issue was with "Commit the config file (I'm not sure how to do this automatically, I've only ever used git manually, not programmatically, I guess I need to use something like)", which I solved using nodegit, see https://stackoverflow.com/questions/72595392/nodegit-how-to-know-if-a-commit-is-empty-to-avoid-pushing-empty-commits – Vadorequest Jun 15 '22 at 19:50
  • My next challenge is "Generate new files (or update existing ones) based on what's changed (I'm thinking of using GitHub Actions to generate the files and commit/push them after the config file has changed)", which I would trigger whenever the config file has changed. Then it gets complicated: I need to diff the old config file against the new one, understand what's been added/modified/removed, and modify files accordingly before committing and pushing those changes. Upvoting because it answers part of the question; the question contains several issues and they're not all easy to address. – Vadorequest Jun 15 '22 at 19:51