Blog George Gritsouk

I’m a web developer in Toronto, Canada. I have a degree in Nanotechnology Engineering from the University of Waterloo.

Human Git Aliases
Stubbornly Refusing to Speak The Computer’s Language

The most common .gitconfig I see is blank except for setting a username. The second most common is this:

ci = commit
cia = commit -a
cam = commit --amend
cama = commit --amend -a

cl = clean
cldf = clean -df

res = reset
resa = reset HEAD

# 82 more 4-character aliases

This config basically trades space in your head for keystrokes. Save on typing by remembering short command aliases. I don’t love that. I make typos, and sometimes I don’t get enough sleep, and generally this is just going to make life harder on me. I shouldn’t be bending to suit the computer’s language, the computer should learn mine. I don’t care so much about having short commands, I have a shell with autocomplete that works. Instead, I use real words and try to make the whole thing more human.

My goals with git aliases are:

smooth out git’s unwieldy UI
make a few common workflows faster

For example, in git, trying to just get a list of something in the repository is insanely inconsistent. I fix it like so:

branches = branch -a
tags = tag
stashes = stash list
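These live under the [alias] section of ~/.gitconfig; you can also register them from the shell (a sketch writing to your global config):

```shell
# Register the "plural" listing aliases in ~/.gitconfig
git config --global alias.branches 'branch -a'
git config --global alias.tags 'tag'
git config --global alias.stashes 'stash list'

# Confirm what got recorded
git config --global --get alias.branches   # prints: branch -a
```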

How about common operations for undoing work? I never want to Google “how to unstage a file”, there should just be a %$&#ing command to unstage a file.

unstage = reset -q HEAD --
discard = checkout --
uncommit = reset --mixed HEAD~
amend = commit --amend

I even have a nuclear version:

nevermind = !git reset --hard HEAD && git clean -d -f

which unstages changes in the index, discards changes in the working directory, and removes any new files.
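A quick sanity check of what nevermind does, sketched in a throwaway repository:

```shell
# Throwaway repo demonstrating the effect of `nevermind`
cd "$(mktemp -d)"
git init -q
git config user.name demo && git config user.email demo@example.com
git commit -q --allow-empty -m "initial"

echo "staged change" > tracked.txt && git add tracked.txt   # staged in the index
echo "junk" > untracked.txt                                 # brand-new file

git reset --hard HEAD && git clean -d -f    # what `nevermind` expands to

git status --porcelain    # prints nothing: index and working directory are clean
```

Note that this really is nuclear: the staged-but-uncommitted file is destroyed, not just unstaged.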

I also really like having

graph = log --graph -10 --branches --remotes --tags --format=format:'%Cgreen%h %Creset• %<(75,trunc)%s (%cN, %cr) %Cred%d' --date-order

to see a real timeline of who is working on what and when. Another good example:

precommit = diff --cached --diff-algorithm=minimal -w

This is a key part of my workflow. I run this before every commit to make sure I don’t need to use the undo commands.

Bend the aliases to how you think and work, not the other way around. Let your aliases reflect your values, instead of just saving you keystrokes.

I got a few great suggestions from Reddit comments on this post:

unmerged = diff --name-only --diff-filter=U by kasbah and remotes = remote -v by WrongSubreddit are my favourites. Thank you!

A full list of my Git aliases is in my dotfiles repo.



Introduction to GitSlave

Gitslave creates a group of related repositories—a superproject repository and a number of slave repositories—all of which are concurrently developed on and on which all git operations should normally operate; so when you branch, each repository in the project is branched in turn. Similarly when you commit, push, pull, merge, tag, checkout, status, log, etc; each git command will run on the superproject and all slave repositories in turn. This sort of activity may be very familiar to CVS and (to a lesser extent) Subversion users. Gitslave’s design is for simplicity for normal git operations.

Gitslave has been used for mid-sized product development with many slave repositories (representing different programs and plugins), branches, tags, and developers; and for single-person repositories tracking groups of .emacs and .vim repositories (in the latter case, it is basically used to keep the slave repositories up to date via a single command).

The gits wrapper typically runs the indicated git command on each repository in the project and combines (and occasionally post-processes for some special commands) the output from the individual git commands to make everything clearer, which is very useful when you have a few dozen slaves—looking at a concatenation of normally identical output for each git command would lose the wheat in the chaff.

Gitslave does not take over your repository. You may continue to use legacy git commands both inside of a gits cloned repository and outside in a privately git-cloned repository. Gitslave is a value added supplement designed to accelerate performing identical git actions over all linked repositories and aside from one new file in the superproject, adjustments to .gitignore, and perhaps a few private config variables, does not otherwise affect your repositories.

Other options

git-submodules is the legacy solution for this sort of activity. Submodules went a different way: you have a submodule at a semi-fixed commit. It is a little annoying to make changes to the submodule due to the requirement to check out the correct submodule branch, make the change, commit, and then go into the superproject and commit the commit (or at least record the new location of the submodule). It was originally designed for third-party projects on which you typically do not do active development (it works the other way with a little inconvenience). Most git commands performed on the superproject will not recurse down into the submodules. As suggested above, submodules give you a tight mapping between subproject commits and superproject commits (you always know which commit a subproject was at for any given superproject commit).

Another option is to stick everything in one giant repository (either natively or via the git subtree merge strategy). This might make your repository annoyingly large, and it is usually a bad idea to aggregate multiple concepts in the same repository. It also doesn’t work conveniently (or at least efficiently) if the subsets are shared with other super-projects, or if your changes need to be shared with the other super-projects or back upstream.

Another option is repo from Google, used with Android. Repo seems to work much like gitslave from a high-level perspective, but I’ve not seen a lot of documentation on using it for other projects. Gitslave also came first.

Still another option is kitenet’s mr which supports multiple repository types (CVS, SVN, git, etc). It is absolutely the solution for multi-SCM projects, but since it works on the lowest common denominator you would lose much of the expressive power of git.

Gitslave is not perfect

Gitslave is imperfect in a few ways. It can complicate forensic archeology, it may need special care and feeding if one or more of the repositories are third-party repositories, you can have partial success and partial failure (no atomic cross-repository actions), not every git command that needs specific support in gits has it, and things can get a little squirrelly if different branches/tags have different attached slave repositories. However, we have not had any significant problems in over two years of intensive work on a project using this script, nor has anyone else reported anything—do not mistake that for a warranty or a guarantee, for there is none.

Gitslave complicates forensic archeology in two ways. Most obviously you cannot have gitk (or something similar) show the complete history of all projects in all linked repositories. Less obviously, there is a very loose relationship between commits in different repositories. You cannot easily and precisely determine what commit/SHA any other repository was at when a particular commit was made (though you can approximate and assume pretty easily). Only tags provide exact synchronization between different repositories. Thus, gitslave may not be appropriate for blame-based debugging or egofull programming.

Your setup may need special care and feeding if one or more of the repositories is a third party repository. If you blindly attached the true upstream master to your local repository, you are at the mercy of the upstream commits to master. If there is a change which is not fully baked, you cannot refuse to accept it. Also you cannot easily use public branches since you probably will be unable to push those branches to the third party repository. The solution is to:

1. Consider using a unique naming system for branches and tags. This allows you to keep your branches and tags separate from the upstream branches and tags. This might even go as far as ditching master as your normal branch for your project-specific repositories (`git symbolic-ref HEAD refs/heads/mymaster` can change the default branch when cloning from a bare clone).
2. Choose one of the following schemes for updating:
   - Keep a project-local master mirror repository for the third-party package as your project’s upstream (`git clone --mirror --shared URL mydir`). Periodically fetch in the bare repository. When you are ready to bring in some or all changes, you can `git merge` from the corresponding remote branch into your project branch. This has the disadvantage of requiring server-side git commands (the fetch) to be executed, of requiring a strict separation of reference namespace, and requires that you remember which upstream branches correspond to which project branches, but at least you can see (via gitk) those merges with the correct names.
   - A slight variant on the above is to have a normal bare repository as the project-local master, and use a bare mirrored client repository (with the projectmaster as a remote) as a proxy to avoid having to run commands on the project repository server. Fetch on origin and (metaphorically) `git push --all --tags projectmaster`. You then can have a normal clone do the merge of origin/master into mymaster. As long as you keep all local changes off the upstream branch, your transfer repository can happily import changes from the true upstream to the projectmaster, and a normal clone can merge as necessary. It still requires a strict separation of reference namespace, and you still have to remember which upstream branches correspond to which project branches, but at least you can see (via gitk) those merges with the correct names.
   - The next variant gets rid of the requirement to have a strict separation of upstream namespace and your project namespace (except for the namespaceless tags). You create a normal project-master bare repository and have a normal clone of it. That clone adds a remote for the true upstream. That transfer clone then merges between the upstream remote branch and the project branch and pushes the result to origin as normal. This still has the problem that there is no memorized mapping between the upstream and project branches. Even worse, no one except this repository (or any repository with upstream as a remote) will be able to see (via gitk) the mapping. They will just see the merge from an anonymous branch.
   - Finally we have the punting option. Have a normal bare repo as a local master and create a vendor branch in the repository. When you want to update, check out the vendor branch and replace the working directory with the most recent checkout/tarball from the appropriate upstream release/commit. Then merge the changes in. You lose the detailed history of the upstream changes, but this is a very easy and traditional method of importing changes. There is no question of namespace contamination, but you must manually figure out what to merge where in a normal checkout from your local project master (though gitk can help you see what you did in the past). This doesn’t work at all conveniently if different local-project release branches are tracking different upstream-project release branches—creating multiple vendor branches loses the simplicity which makes this option attractive.
Some git subcommands need special support from gitslave because they deal with (typically) repository URLs. For instance, `gits remote add NAME URL` is special-cased because it has to figure out the correct URL for each of the submodules based on the superrepository URL and the subproject information. However, not all git commands have been specially modified when run with gits. See the manual page for the list of the ones which have, but specifically `gits remote set-url` and `gits branch --set-upstream` are two which have not been specially supported yet.

Even less perfect is the full and complete project documentation on what gitslave does, how it does it, and the various features and tweaks it might have. Gitslave isn’t all that complex, so the hope is that it doesn’t need a lot. We have an extensive manual page, which is a good first step, and there is a lengthy tutorial on basic gitslave operations.

Summary: gitslave is a powerful tool when used for good

When you have a problem which calls for easy multirepository management without lots of synchronization, where you typically might want to run the same git command over every repository in your project, gitslave is the solution for you.

Merging vs. Rebasing

The git rebase command has a reputation for being magical Git voodoo that beginners should stay away from, but it can actually make life much easier for a development team when used with care. In this article, we’ll compare git rebase with the related git merge command and identify all of the potential opportunities to incorporate rebasing into the typical Git workflow.

Conceptual Overview

The first thing to understand about git rebase is that it solves the same problem as git merge. Both of these commands are designed to integrate changes from one branch into another branch—they just do it in very different ways.

Consider what happens when you start working on a new feature in a dedicated branch, then another team member updates the master branch with new commits. This results in a forked history, which should be familiar to anyone who has used Git as a collaboration tool.

Now, let’s say that the new commits in master are relevant to the feature that you’re working on. To incorporate the new commits into your feature branch, you have two options: merging or rebasing.

The Merge Option
The easiest option is to merge the master branch into the feature branch using something like the following:

git checkout feature
git merge master
Or, you can condense this to a one-liner:

git merge master feature
This creates a new “merge commit” in the feature branch that ties together the histories of both branches, giving you a branch structure that looks like this:

Merging is nice because it’s a non-destructive operation. The existing branches are not changed in any way. This avoids all of the potential pitfalls of rebasing (discussed below).

On the other hand, this also means that the feature branch will have an extraneous merge commit every time you need to incorporate upstream changes. If master is very active, this can pollute your feature branch’s history quite a bit. While it’s possible to mitigate this issue with advanced git log options, it can make it hard for other developers to understand the history of the project.
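The merge option can be sketched end-to-end in a scratch repository (branch names as in the article; `checkout -B` simply pins the branch name to master regardless of Git’s configured default):

```shell
cd "$(mktemp -d)" && git init -q
git config user.name demo && git config user.email demo@example.com

echo base > base.txt && git add base.txt && git commit -q -m "base"
git checkout -qB master                      # ensure the branch is named master

git checkout -qb feature
echo feature > feature.txt && git add feature.txt && git commit -q -m "feature work"

git checkout -q master
echo upstream > upstream.txt && git add upstream.txt && git commit -q -m "upstream work"

git checkout -q feature
git merge --no-edit master                   # ties the two histories together

git rev-list --merges --count HEAD           # prints 1: the new merge commit
```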

The Rebase Option
As an alternative to merging, you can rebase the feature branch onto the master branch using the following commands:

git checkout feature
git rebase master
This moves the entire feature branch to begin on the tip of the master branch, effectively incorporating all of the new commits in master. But, instead of using a merge commit, rebasing re-writes the project history by creating brand new commits for each commit in the original branch.

The major benefit of rebasing is that you get a much cleaner project history. First, it eliminates the unnecessary merge commits required by git merge. Second, as you can see in the above diagram, rebasing also results in a perfectly linear project history—you can follow the tip of feature all the way to the beginning of the project without any forks. This makes it easier to navigate your project with commands like git log, git bisect, and gitk.
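The same scratch-repository setup shows the rebase option producing a linear history (a sketch; branch names as in the article):

```shell
cd "$(mktemp -d)" && git init -q
git config user.name demo && git config user.email demo@example.com

echo base > base.txt && git add base.txt && git commit -q -m "base"
git checkout -qB master                # ensure the branch is named master

git checkout -qb feature
echo feature > feature.txt && git add feature.txt && git commit -q -m "feature work"

git checkout -q master
echo upstream > upstream.txt && git add upstream.txt && git commit -q -m "upstream work"

git checkout -q feature
git rebase -q master                   # "feature work" is re-created on top of master

git rev-list --merges --count HEAD     # prints 0: no merge commits, linear history
```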

But, there are two trade-offs for this pristine commit history: safety and traceability. If you don’t follow the Golden Rule of Rebasing, re-writing project history can be potentially catastrophic for your collaboration workflow. And, less importantly, rebasing loses the context provided by a merge commit—you can’t see when upstream changes were incorporated into the feature.

Interactive Rebasing
Interactive rebasing gives you the opportunity to alter commits as they are moved to the new branch. This is even more powerful than an automated rebase, since it offers complete control over the branch’s commit history. Typically, this is used to clean up a messy history before merging a feature branch into master.

To begin an interactive rebasing session, pass the -i option to the git rebase command:

git checkout feature
git rebase -i master
This will open a text editor listing all of the commits that are about to be moved:

pick 33d5b7a Message for commit #1
pick 9480b3d Message for commit #2
pick 5c67e61 Message for commit #3
This listing defines exactly what the branch will look like after the rebase is performed. By changing the pick command and/or re-ordering the entries, you can make the branch’s history look like whatever you want. For example, if the 2nd commit fixes a small problem in the 1st commit, you can condense them into a single commit with the fixup command:

pick 33d5b7a Message for commit #1
fixup 9480b3d Message for commit #2
pick 5c67e61 Message for commit #3
When you save and close the file, Git will perform the rebase according to your instructions, resulting in project history that looks like the following:

Eliminating insignificant commits like this makes your feature’s history much easier to understand. This is something that git merge simply cannot do.
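For illustration, the fixup above can even be scripted by swapping in a one-line sequence editor (GIT_SEQUENCE_EDITOR is standard Git; the sed invocation assumes GNU sed):

```shell
cd "$(mktemp -d)" && git init -q
git config user.name demo && git config user.email demo@example.com

echo one   > f.txt && git add f.txt && git commit -q -m "Message for commit #1"
echo two   > f.txt && git commit -qam "Message for commit #2"
echo three > f.txt && git commit -qam "Message for commit #3"

# Turn line 2 of the todo list ("pick ... commit #2") into a fixup
GIT_SEQUENCE_EDITOR='sed -i 2s/^pick/fixup/' git rebase -i --root

git rev-list --count HEAD   # prints 2: commits #1 and #2 are now a single commit
```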

The Golden Rule of Rebasing

Once you understand what rebasing is, the most important thing to learn is when not to do it. The golden rule of git rebase is to never use it on public branches.

For example, think about what would happen if you rebased master onto your feature branch:

The rebase moves all of the commits in master onto the tip of feature. The problem is that this only happened in your repository. All of the other developers are still working with the original master. Since rebasing results in brand new commits, Git will think that your master branch’s history has diverged from everybody else’s.

The only way to synchronize the two master branches is to merge them back together, resulting in an extra merge commit and two sets of commits that contain the same changes (the original ones, and the ones from your rebased branch). Needless to say, this is a very confusing situation.

So, before you run git rebase, always ask yourself, “Is anyone else looking at this branch?” If the answer is yes, take your hands off the keyboard and start thinking about a non-destructive way to make your changes (e.g., the git revert command). Otherwise, you’re safe to re-write history as much as you like.

If you try to push the rebased master branch back to a remote repository, Git will prevent you from doing so because it conflicts with the remote master branch. But, you can force the push to go through by passing the --force flag, like so:

# Be very careful with this command!
git push --force
This overwrites the remote master branch to match the rebased one from your repository and makes things very confusing for the rest of your team. So, be very careful to use this command only when you know exactly what you’re doing.

One of the only times you should be force-pushing is when you’ve performed a local cleanup after you’ve pushed a private feature branch to a remote repository (e.g., for backup purposes). This is like saying, “Oops, I didn’t really want to push that original version of the feature branch. Take the current one instead.” Again, it’s important that nobody is working off of the commits from the original version of the feature branch.

Workflow Walkthrough

Rebasing can be incorporated into your existing Git workflow as much or as little as your team is comfortable with. In this section, we’ll take a look at the benefits that rebasing can offer at the various stages of a feature’s development.

The first step in any workflow that leverages git rebase is to create a dedicated branch for each feature. This gives you the necessary branch structure to safely utilize rebasing:

Local Cleanup
One of the best ways to incorporate rebasing into your workflow is to clean up local, in-progress features. By periodically performing an interactive rebase, you can make sure each commit in your feature is focused and meaningful. This lets you write your code without worrying about breaking it up into isolated commits—you can fix it up after the fact.

When calling git rebase, you have two options for the new base: The feature’s parent branch (e.g., master), or an earlier commit in your feature. We saw an example of the first option in the Interactive Rebasing section. The latter option is nice when you only need to fix up the last few commits. For example, the following command begins an interactive rebase of only the last 3 commits.

git checkout feature
git rebase -i HEAD~3
By specifying HEAD~3 as the new base, you’re not actually moving the branch—you’re just interactively re-writing the 3 commits that follow it. Note that this will not incorporate upstream changes into the feature branch.

If you want to re-write the entire feature using this method, the git merge-base command can be useful to find the original base of the feature branch. The following returns the commit ID of the original base, which you can then pass to git rebase:

git merge-base feature master
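Combining the two, this re-writes every commit that is unique to feature while leaving its base where it is (a sketch; assumes the feature/master layout used throughout the article):

```shell
git checkout feature
git rebase -i "$(git merge-base feature master)"
```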
This use of interactive rebasing is a great way to introduce git rebase into your workflow, as it only affects local branches. The only thing other developers will see is your finished product, which should be a clean, easy-to-follow feature branch history.

But again, this only works for private feature branches. If you’re collaborating with other developers via the same feature branch, that branch is public, and you’re not allowed to re-write its history.

There is no git merge alternative for cleaning up local commits with an interactive rebase.

Incorporating Upstream Changes Into a Feature
In the Conceptual Overview section, we saw how a feature branch can incorporate upstream changes from master using either git merge or git rebase. Merging is a safe option that preserves the entire history of your repository, while rebasing creates a linear history by moving your feature branch onto the tip of master.

This use of git rebase is similar to a local cleanup (and can be performed simultaneously), but in the process it incorporates those upstream commits from master.

Keep in mind that it’s perfectly legal to rebase onto a remote branch instead of master. This can happen when collaborating on the same feature with another developer and you need to incorporate their changes into your repository.

For example, if you and another developer named John added commits to the feature branch, your repository might look like the following after fetching the remote feature branch from John’s repository:

You can resolve this fork the exact same way as you integrate upstream changes from master: either merge your local feature with john/feature, or rebase your local feature onto the tip of john/feature.

Note that this rebase doesn’t violate the Golden Rule of Rebasing because only your local feature commits are being moved—everything before that is untouched. This is like saying, “add my changes to what John has already done.” In most circumstances, this is more intuitive than synchronizing with the remote branch via a merge commit.

By default, the git pull command performs a merge, but you can force it to integrate the remote branch with a rebase by passing it the --rebase option.
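If you always want that behavior, pull.rebase is a standard config knob (a sketch; set it per repository or globally):

```shell
# Make every `git pull` in this repository rebase instead of merge
git config pull.rebase true

# ...or for all of your repositories
git config --global pull.rebase true

git config --global --get pull.rebase   # prints: true
```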

Reviewing a Feature With a Pull Request
If you use pull requests as part of your code review process, you need to avoid using git rebase after creating the pull request. As soon as you make the pull request, other developers will be looking at your commits, which means that it’s a public branch. Re-writing its history will make it impossible for Git and your teammates to track any follow-up commits added to the feature.

Any changes from other developers need to be incorporated with git merge instead of git rebase.

For this reason, it’s usually a good idea to clean up your code with an interactive rebase before submitting your pull request.

Integrating an Approved Feature
After a feature has been approved by your team, you have the option of rebasing the feature onto the tip of the master branch before using git merge to integrate the feature into the main code base.

This is a similar situation to incorporating upstream changes into a feature branch, but since you’re not allowed to re-write commits in the master branch, you have to eventually use git merge to integrate the feature. However, by performing a rebase before the merge, you’re assured that the merge will be fast-forwarded, resulting in a perfectly linear history. This also gives you the chance to squash any follow-up commits added during a pull request.

If you’re not entirely comfortable with git rebase, you can always perform the rebase in a temporary branch. That way, if you accidentally mess up your feature’s history, you can check out the original branch and try again. For example:

git checkout feature
git checkout -b temporary-branch
git rebase -i master
# [Clean up the history]
git checkout master
git merge temporary-branch

And that’s all you really need to know to start rebasing your branches. If you would prefer a clean, linear history free of unnecessary merge commits, you should reach for git rebase instead of git merge when integrating changes from another branch.

On the other hand, if you want to preserve the complete history of your project and avoid the risk of re-writing public commits, you can stick with git merge. Either option is perfectly valid, but at least now you have the option of leveraging the benefits of git rebase.

How to version large files with Git LFS

Versioning large files (such as audio samples, videos, datasets, and graphics) can be difficult when working with distributed version control systems like Git. Fortunately, a new extension to Git makes handling of large files easier: Git Large File Storage (LFS) is an open-source project that replaces large files with text pointers inside Git, while storing the contents of the files on a remote server like GitHub or an AWS bucket.

After running the installation script, set up LFS via the following command:

$ git lfs install
Tracking file types
All you need to do now is to tell Git LFS which file types to track. Navigate to your Git repository, and issue a git lfs track command. For example, if you want Git LFS to automatically handle all .mat files in your repository (although it’s rarely a smart idea to have binaries under version control), you would call:

$ git lfs track "*.mat"
If your Git repository has subdirectories, you can use globbing to track all .mat files in all subdirectories:

$ git lfs track "**/*.mat"
Or you can track single files:

$ git lfs track myLargeFile.mat
That’s it! Continue your work using git commit and git push as usual.
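Under the hood, each git lfs track call simply appends a filter line to .gitattributes; after the three commands above it should contain something like:

```
*.mat filter=lfs diff=lfs merge=lfs -text
**/*.mat filter=lfs diff=lfs merge=lfs -text
myLargeFile.mat filter=lfs diff=lfs merge=lfs -text
```

Commit .gitattributes along with your files so that collaborators’ clones apply the same filters.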

Storing large files
If you have tried uploading large files to the remote repository before, you might have noticed a warning popping up telling you that GitHub does not recommend uploading files larger than 50 MB. You won’t even be able to upload files larger than 100 MB. With Git LFS installed, the file will instead be uploaded to a dedicated remote host that is different from your remote repository, and the git push command will go through as usual:

$ git commit -am "add large file"
$ git push origin master

Instead of storing the file in the remote repository, Git LFS will upload only a small file reference. If you try to inspect the file on GitHub, you will only find the following note:

Back in the local repository, you will notice that the file is still accessible, until you switch branches.

Retrieving large files
As soon as you switch branches, the locally stored binaries will be gone. If you now inspect the file controlled by Git LFS, all you will find is a tiny text file that might look something like this:

oid sha256:d63d7c81d9191f17263b0c65f97101083dade9637e069aea23c6be778cbf89bdf
size 68536835
So where did your file go, you might ask? It is still on the LFS remote host. To download the file from the remote host, use the following command:

$ git lfs fetch
To see a list of all LFS-related commands, simply type:

$ git lfs

Storing large binary files in git repositories

git-annex
Git-annex works by storing the contents of tracked files in a separate location. What is stored in the repository is a symlink to the key under that separate location. In order to share the large binary files within a team, for example, the tracked files need to be stored in a different backend. At the time of writing (23rd of July 2015), the following backends were available: S3 (Amazon S3, and other compatible services), Amazon Glacier, bup, ddar, gcrypt, directory, rsync, WebDAV, tahoe, web, bittorrent, and XMPP. The ability to store contents in a remote of your own devising via hooks is also supported.

Git-annex uses separate commands for checking out and committing files, which makes its learning curve a bit steeper than that of alternatives that rely on filters. Git-annex is written in Haskell, and the majority of it is licensed under the GPL, version 3 or higher. Because git-annex uses symlinks, Windows users are forced to use a special direct mode that makes usage less intuitive.

The latest version of git-annex at the time of writing is 5.20150710, released on the 10th of July 2015, and the earliest article I found on their website was dated 2010. Both facts suggest that the project is quite mature.

Git Large File Storage (Git LFS)
The core Git LFS idea is that instead of writing large blobs to a Git repository, only a pointer file is written. The blobs are written to a separate server using the Git LFS HTTP API. The API endpoint can be configured per remote, which allows multiple Git LFS servers to be used. Git LFS requires a specific server implementation to communicate with. An open-source reference server implementation is available, as well as at least one other server implementation. The storage can be offloaded by the Git LFS server to cloud services such as S3, or to pretty much anything else if you implement the server yourself.

Git LFS uses a filter-based approach, meaning that you only need to specify the tracked files with one command, and it handles the rest invisibly. The good side of this approach is ease of use; however, there is currently a performance penalty because of how Git works internally. Git LFS is licensed under the MIT license, is written in Go, and binaries are available for Mac, FreeBSD, Linux, and Windows. The version of Git LFS is 0.5.2 at the time of writing, which suggests it’s still at quite an early stage; however, at the time of writing there were 36 contributors to the project. As the version number is still below 1, changes to the API, for example, can be expected.

git-bigfiles – Git for big files
The goals of git-bigfiles are pretty noble: making life bearable for people using Git on projects hosting very large files, and merging back as many changes as possible into upstream Git once they’re of acceptable quality. Git-bigfiles is a fork of Git; however, the project seems to have been dead for some time. Git-bigfiles is developed using the same technology stack as Git and is licensed under the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2).

git-fat
Git-fat works in a similar manner to Git LFS. Large files can be tracked using filters in the .gitattributes file. The large files are stored in any remote that can be reached through rsync. Git-fat is licensed under the BSD 2-clause license. Git-fat is developed in Python, which creates more dependencies for Windows users to install; however, the installation itself is straightforward with pip. At the time of writing, git-fat has 13 contributors, and the latest commit was made on the 25th of March 2015.

git-media

Licensed under the MIT license and supporting a similar workflow to the above-mentioned alternatives git-lfs and git-fat, git-media is probably the oldest of the available solutions. Git-media uses the same filter approach, and it supports Amazon S3, a local filesystem path, SCP, Atmos and WebDAV as backends for storing large files. Git-media is written in Ruby, which makes installation on Windows not so straightforward. The project has 9 contributors on GitHub, but the latest activity was nearly a year ago at the time of writing.

git-bigstore

Git-bigstore was initially implemented as an alternative to git-media. It works similarly to the others above, storing a filter property in .gitattributes for certain types of files. It supports Amazon S3, Google Cloud Storage or a Rackspace Cloud account as backends for storing binary files. Git-bigstore claims to improve stability when collaborating between multiple people. It is licensed under the Apache 2.0 license. As git-bigstore does not use symlinks, it should be more compatible with Windows. It is written in Python and requires Python 2.7+, which means Windows users might need an extra step during installation. The latest commit to the project's GitHub repository at the time of writing was made on April 20th, 2015, and the project has one contributor.

git-sym

Git-sym is the newest player in the field, offering an alternative to how large files are stored and linked in git-lfs, git-annex, git-fat and git-media. Instead of calculating checksums of the tracked large files, git-sym relies on URIs. As opposed to its rivals, which store the checksum as well, git-sym stores only symlinks in the Git repository. The benefits of git-sym are thus performance, as well as the ability to symlink whole directories. Because of its nature, its main downfall is that it does not guarantee data integrity. Git-sym is driven by its own separate commands. It also requires Ruby, which makes it more tedious to install on Windows. The project has one contributor according to its project home page.

GitFlow considered harmful

GitFlow is probably the most popular Git branching model in use today. It seems to be everywhere. It certainly is everywhere for me personally – practically every project at my current job uses it, and often it’s the clients themselves who have chosen it.

I remember reading the original GitFlow article back when it first came out. I was deeply unimpressed – I thought it was a weird, over-engineered solution to a non-existent problem. I couldn’t see a single benefit of using such a heavy approach. I quickly dismissed the article and continued to use Git the way I always did (I’ll describe that way later in the article). Now, after having some hands-on experience with GitFlow, and based on my observations of others using (or, should I say more precisely, trying to use) it, that initial, intuitive dislike has grown into a well-founded, experienced distaste. In this article I want to explain precisely the reasons for that distaste, and present an alternative way of branching which is superior, at least in my opinion, to GitFlow in every way.

But mistakes aren’t even the worst part. What I consider the biggest failure of GitFlow is that it doesn’t give people a clear vision of a versioning scheme. This is especially true if you have any deviations from the standard workflow that GitFlow forces on you (for example, a long release cycle with a lot of back and forth between QA and development). All of the mistake examples I gave above really stem from the fact that people are confused about what actually represents the current state of the project. Since they don’t really understand what that state is, it’s no wonder they make mistakes when they try to change it (as that is what publishing their work is actually meant to accomplish).

I want to describe an alternative method that I’ve used myself successfully on a number of projects (to be clear, I’m not talking about one-person projects). I believe it fulfills all of the goals that GitFlow set out to accomplish, and does so in a much simpler, clearer and more lightweight way which scales to any number of developers. You can call it “Anti-gitflow”, as it keeps the many parts of GitFlow that don’t need any change, but does the exact opposite wherever GitFlow falls short, as I’ve described above.

Here it is:

  • There is only one eternal branch – you can call it master, develop, current, next – whatever. I personally like “master”, and that’s the name I’ll use in the rest of the description, as it’s convention by now in the Git world and immediately conveys its purpose.
  • All other branches (feature, release, hotfix, and whatever else you need) are temporary and only used as a convenience to share code with other developers and as a backup measure. They are always removed once the changes present on them land on master.
  • Features are integrated onto the master branch primarily in a way which keeps the history linear. You have a lot of leeway in how you want to enforce this. You can make it simply a convention that developers are encouraged, but not forced, to follow. On the other side of the spectrum, if you use something like Gerrit to manage your Git repositories (which I recommend, even if you don’t practice code reviews – the permission system is fantastic, and if you ever decide you want code reviews, it’ll be very easy to start doing them), you can set up permissions in such a way that actually forbids pushing merge commits to master, and that way ensure linear history.
  • Releases are done similarly to how they are in GitFlow. You create a new branch for the release, branching off at the point in master that you decide has all the necessary features. From then on, new work, aimed for the next release, is pushed to master as always, and any necessary changes are pushed to the release branch (in my opinion, it’s an anti-pattern and a huge red flag if your release requires separate commits to work, but that’s a topic for another article – for simplicity, let’s assume you can’t or don’t want to change that). Finally, once the release is ready, you tag the top of the release branch. Then, because there is one eternal branch, there is only one way to get your release to be versioned permanently – and that is to merge the release branch into master and push that changed master. After that, all the changes that were made during the release are now part of master, and the release branch is deleted.
  • Hotfixes are very similar to releases, except you don’t branch from an arbitrary commit on master, but from the release tag that you want to make the fix in. Again, work on master continues as always, and the necessary fixes are pushed to the hotfix branch. Once the fix is ready, the procedure is exactly the same as for a release – tag the top of the branch creating a new release, merge it into master, then delete the hotfix branch.
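The whole cycle above can be sketched in a throwaway repository (all branch names, version numbers and file contents here are made up):

```shell
set -e
cd "$(mktemp -d)"

# Toy repository; the identity is a throwaway for the demo.
git init -q
git config user.email dev@example.com
git config user.name  dev
echo base > app.txt && git add app.txt && git commit -qm "base work"
git branch -M master                     # the single eternal branch

# A feature lands with linear history: rebase, then fast-forward only.
git checkout -qb feature
echo feature >> app.txt && git commit -qam "feature work"
git rebase -q master feature             # replay on top of master (a no-op here)
git checkout -q master
git merge --ff-only feature              # refuses to create a merge commit
git branch -qd feature                   # temporary branches are always removed

# A release: branch off master, fix, tag the top, merge back, delete.
git checkout -qb release-1.0 master
echo fix >> app.txt && git commit -qam "release-only fix"
git tag -a v1.0 -m "release 1.0"

git checkout -q master                   # work for the next release continues
echo next > next.txt && git add next.txt && git commit -qm "work for 1.1"

git merge -q --no-edit release-1.0       # the one place a merge commit is expected
git branch -qd release-1.0               # branch gone, tag v1.0 stays

git log --oneline --graph
```

A hotfix would look the same, except the branch starts from the release tag rather than an arbitrary commit on master, e.g. `git checkout -b hotfix-1.0.1 v1.0`.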