GitHub dev/stg/prod workflow: Which branch should I create feature branches from?

Question

I'm creating a CI workflow for a 50+ Looker developer team. I have read all models in this page but none really fits my requirements. The current flow I have in mind is:

The GitHub repo has three branches: dev, stg and prod. We also have three Looker instances, each connected to one of them;
dev is essentially a volatile sandbox and can be re-created rather quickly from prod if any critical file has been deleted;
We cannot afford a release branch method because merging the release branch into prod is a nightmare. Basically we cannot afford any model that requires a big merge from somewhere to stg or prd (in a team of 50+ developers, even a daily merge is too problematic);
stg is regarded as a stable environment where developers need to prove that their code works in such an environment before being merged into prod, a second stable environment. A merge into stg should be guarded by automated tests, but not by humans;
prd is final and should be guarded by a human reviewer;
Step 1: Developer A creates a feature branch from dev, say, A_feature_A1, eventually it gets merged into dev;
Step 2: A checks that everything is fine in dev and creates a PR to merge the feature branch into stg. This time a GitHub Action is triggered to perform some automated validation and tests;
Step 3: A checks that everything works as expected in stg, and creates a PR to merge the feature branch into prd. This time there is a human approver to approve the PR.

However, I have a big issue in Step 1: should I create feature branches from dev or from prd/stg? I have tried both and neither works perfectly. If I create feature branches from dev, then because dev is always messed up, it's very difficult to merge the feature branch into either of the two stable branches. If I create feature branches from prd, Looker simply tells me that remote and local are not synced, which makes sense because dev is always different from prd.

How can I resolve this issue? There are a few other restrictions compared to software engineering work:

Looker developers prefer to develop in Looker UI, which puts some restrictions on git operations (e.g. Say developers only have access to Looker-dev instance, which points to the dev branch, they simply CANNOT create feature branches from other branches);
The workflow has to guarantee CI/CD, that is, 1) expect a lot of commits every day, and they are expected to be moved into production ASAP, and 2) expect that the dev branch is messed up

This sounds like gitflow, but you're trying to re-use a single `stg` branch for every release instead of creating a new release branch? You say `because merging the release branch into prod is a nightmare.`, but in your flow, it seems like you'd have the same problem, or worse, with merging `stg` to `prd`? You also say ` because dev is always messed up` -- that seems like the problem to fix to me. Why are devs merging broken feature branches into `dev`? — tconbeer, Jul 20 '22 at 17:28
Hi @tconbeer sorry for the confusion. The model does not merge `dev` into `stg` nor does it merge `stg` into `prd`. It tries to merge `feature branch` into `dev` and the same `feature branch` into `stg` and `prd` -- thus the issue of -- from which branch should I create the feature branch from? `dev` is messed up because Looker developers face business directly and they have to deliver as quickly as possible, sometimes with dirty results. I wish I could explain better but `dev` has to be a sandbox. Maybe `dev` is not the best word? — Marcus Anthony, Jul 20 '22 at 17:40
BTW it's different from gitflow or gitlab flow AFAIK. Gitflow needs to merge a large number of commits from a release branch into `prd`, as per this webpage: https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow However we cannot afford using a release branch because business cannot wait for say a week/sprint to see results in production. — Marcus Anthony, Jul 20 '22 at 17:43
Gitflow doesn't require a "large" release. You could cut a release branch for every feature branch if you want. But at some point (ideally after every new feature branch is merged) you have to merge prod back into stg and dev, or neither of those branches reflect what's in prod — tconbeer, Jul 20 '22 at 19:39

score 3 · Accepted Answer · answered Jul 21 '22 at 12:39

You currently seem to have a flow that's similar to the flow used by the git project or a slightly modified GitLab flow model. Also, you seem to have the following constraints:

Looker developers prefer to develop in Looker UI, which puts some restrictions on git operations (e.g. Say developers only have access to Looker-dev instance, which points to the dev branch, they simply CANNOT create feature branches from other branches);

The workflow has to guarantee CI/CD that is, 1) expect a lot of commits everyday and they are expected to be moved into production ASAP, and 2) expect that the dev branch is messed up.

Basically we cannot afford any model that requires a big merge from somewhere to stg or prd (in a team of 50+ developers, even a daily merge is too problematic).

You also mention in one of your comments that

Looker developers face business directly and they have to deliver as quickly as possible, sometimes with dirty results.

and

However we cannot afford using a release branch because business cannot wait for say a week/sprint to see results in production.

Approach 1: Adapting your current model

What should you branch from?

I would say you should always branch your feature branches off of prd and then merge into dev when you're ready. I say this because a pull request made in GitHub/GitLab/etc. brings in every commit from the common ancestor of the feature branch and the target branch up to the current HEAD of the feature branch. So if I branch off of dev or stg and then try to merge my feature into prd, I can potentially pull unstable or not-yet-reviewed code into prd. There are ways around this issue when branching off of other branches, such as rebasing or adopting a patch-set workflow with Stacked Git (if you don't want to use the traditional email workflow). However, as I've never heard of Looker and don't know the full scope of its restrictions, I'm going to assume such limitations are in place if it restricts which branches people can branch from (though I am assuming people can branch off of branches they've created).
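To make the common-ancestor point concrete, here is a minimal sketch in Python of a toy commit graph. It is not how git is implemented, and every commit name in it is hypothetical; it only shows which commits a PR would carry into prd depending on where the feature branch started.

```python
from typing import Dict, List, Set

# Toy commit graph: each commit maps to its parent commits.
# All commit names are hypothetical and exist only for illustration.
PARENTS: Dict[str, List[str]] = {
    "p1": [],          # prd history
    "p2": ["p1"],      # current tip of prd
    "d1": ["p2"],      # unreviewed work that only exists on dev
    "d2": ["d1"],      # current tip of dev
    "f_dev": ["d2"],   # feature commit on a branch created from dev
    "f_prd": ["p2"],   # the same feature on a branch created from prd
}


def reachable(tip: str) -> Set[str]:
    """Return every commit reachable from tip, including tip itself."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


def commits_in_pr(feature_tip: str, target_tip: str) -> Set[str]:
    """Commits a PR would add: reachable from the feature but not from the target."""
    return reachable(feature_tip) - reachable(target_tip)


# Branched from dev: the PR into prd drags dev-only commits along.
print(sorted(commits_in_pr("f_dev", "p2")))  # ['d1', 'd2', 'f_dev']
# Branched from prd: the PR into prd contains only the feature commit.
print(sorted(commits_in_pr("f_prd", "p2")))  # ['f_prd']
```

In this toy model the branch created from dev brings d1 and d2 into prd along with the feature, which is the unreviewed code the paragraph above warns about.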

Avoiding merge conflicts

Also, git is exceptional at performing merges. Use that to your advantage. You mention that daily merges are too problematic but also that the business can't wait a week to see things in production. Taken together, these excerpts from your post and comments suggest you're afraid of frequent merges yet still need to meet the business's expectation of rapid releases. As long as your developers communicate with one another during development and avoid massive changes, git should be capable of handling many merges per day. If there are merge conflicts, fix them. For advice on how to avoid merge conflicts see Open Bank Project's article and Gehsan's blog post.

Modifying the existing model

If I were to modify your branching model, I would define the branches as follows:

prd: the current state of production
stg: the stable testing branch (contains all changes in prd)
dev: your primary integration branch (contains all changes in prd)

You should be frequently overwriting dev with prd. This way developers are integrating into a semi-stable environment, rather than an always broken environment. Maybe start with doing this weekly and see where it goes. Yes, developers who have not had their changes graduate to the stg branch will need to re-merge into dev, but this should not be a major issue. Just be sure to announce when you overwrite so developers can check if their branches need to be re-merged. Doing this should give developers a more stable integration branch.
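One possible way to script the overwrite is sketched below in Python, as a dry run by default. The remote name origin and the helper names are assumptions, and a force push like this may be rejected by branch protection rules, so treat it as a starting point rather than a finished job.

```python
import subprocess
from typing import List


def reset_commands(source: str = "prd", target: str = "dev",
                   remote: str = "origin") -> List[List[str]]:
    """Build the git commands that make the remote target branch match source.

    The remote name and branch names are assumptions; adjust them to your repo.
    """
    return [
        # Refresh the remote-tracking refs so refs/remotes/<remote>/<source> is current.
        ["git", "fetch", remote],
        # The leading "+" forces the update even when it is not a fast-forward.
        ["git", "push", remote,
         f"+refs/remotes/{remote}/{source}:refs/heads/{target}"],
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command and execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


run(reset_commands())  # dry run: prints the commands without touching the repo
```

Keeping the dry run as the default makes it easy to announce the exact commands to the team before anything is overwritten.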

Addressing dirty results

You mention developers sometimes have to merge with "dirty results" to please business. This is just all around a bad practice. Even dev is intended for code that's been locally (or in a separate environment) tested by the individual developer and refactored, not code that's hot off the keyboard. If this is something business needs to see, you should have a frank conversation with them. If they're unwilling to budge, I would suggest having your developers create throwaway branches. The idea behind throwaway branches is you create them, maybe merge some other branches into them for testing, and then purge them when you're done. For example a developer working on a feature could do the following:

Create branch feature_A
Develop feature
Create thrw_feature_A from feature_A
Optionally merge stg into thrw_feature_A to ensure they're grabbing other stable updates.
Show the feature to whoever needs to see it on the business side
Delete the thrw_feature_A branch and either continue developing or merge into dev if done.

If that's not an option, you can also adopt feature flags (see the section on Trunk-based development below).

Approach 2: Trunk-based Development

I don't see any examples of trunk-based development in the links you shared. It's common in large companies and scales very well. Atlassian has a good article about it. In summary, all changes are merged directly into a single trunk branch (prd in your setup, often called master or main). Features that are incomplete or buggy are kept switched off through the use of feature flags. In this way, small updates to the code can be merged into the trunk and then immediately tested and, if stable, deployed by your CI/CD pipeline.
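As a rough illustration of the feature-flag idea, here is a minimal sketch in Python. The FEATURE_ environment-variable naming scheme, the flag name and the dashboard function are all hypothetical, and how a flag would be expressed inside a Looker project is not something this sketch covers.

```python
import os


def flag_enabled(name: str, default: bool = False) -> bool:
    """Read a feature flag from an environment variable (hypothetical naming scheme)."""
    raw = os.environ.get(f"FEATURE_{name.upper()}")
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def dashboard_title() -> str:
    """Return the title to show; the unfinished variant stays off unless flagged on."""
    if flag_enabled("beta_layout"):
        # Merged into the trunk, but invisible until the flag is switched on.
        return "Revenue (beta layout)"
    return "Revenue"


print(dashboard_title())
```

The unfinished code path ships with every deploy but only runs where the flag is set, which is what lets small changes go straight to the trunk.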

With this approach everyone could simply branch off of and merge into the trunk (prd in your setup). That would solve your issue with Looker's restrictions and keep a steady flow of changes going out to ensure business is happy. The downside is this flow can take some time to perfect, and involves a lot of effort spent on automation.

Thanks, we actually used Approach 2 a while ago, i.e. everyone branches from production and creates a PR to merge into it. There are a few problems probably unique to the BI world (compared to the engineering world) that forced us to seek a new model (the model I'm thinking about right now). For example, business sometimes requires dirty commits to be merged into the `production` branch, so we want to use `develop` and `staging` to at least protect `production` from rush merges (so if someone rush merges into `dev`, `production` is still intact). — Marcus Anthony, Jul 21 '22 at 22:24
1) "So if I branch off of dev or stg and then try to merge my feature into prd, I can potentially pull unstable or not-yet-reviewed code into prd. " -- Thanks, we found out about this yesterday so now feature branches are based on `production`; 2) "You should be frequently overwriting dev with prd." 100%, I found out I had to reset `dev` and `staging` based on `main` after EVERY merge into `main`. Git regards `feature->dev` and `feature->main` as different merges and assigns them different hashes. This creates a conflict for each `feature->dev/stg`. — Marcus Anthony, Jul 21 '22 at 22:34
@MarcusAnthony git can usually handle differences in merge commits if the commits introduced by them are the same. It seems your business really doesn't want to wait for sprints/release cycles to conclude. I still stand by my trunk-based suggestion. With that said, there are other variants like Microsoft's release flow. Google uses pair programming with its trunk-based approach to do coding and code review in parallel, which might help the review process. — ElderFuthark, Jul 23 '22 at 03:33
Links that were too long to fit in my last comment. Microsoft Release Flow: https://devblogs.microsoft.com/devops/release-flow-how-we-do-branching-on-the-vsts-team/ Google Dev-Ops Trunk Development: https://cloud.google.com/architecture/devops/devops-tech-trunk-based-development — ElderFuthark, Jul 23 '22 at 03:33