GitHub Flow: The GitHub Workflow

Published on August 05, 2013

Original author: Scott Chacon
Brief introduction by the translator.
This fascinating article by one of the developers at GitHub Inc., about the workflow adopted inside the company, required a couple of special terms in translation.

The concept that English covers with the single word "workflow" has to be rendered in Russian as a two-word phrase, «рабочий процесс» ("working process"). I know of nothing better, either on my own or with the help of Google Translate, so both the readers and I will have to put up with it, however reluctantly.

Another concept, "deploy", is often translated into Russian as «развёртывание» ("rollout"), but in my translation I decided to recall a phrase from Soviet office language, «внедрение инноваций на производстве» ("introduction of innovations into production"), and I will speak of the «внедрение» ("introduction") of new features. The reason is that the workflow described below has no "releases", which makes it rather awkward to talk about "rolling them out".

Unfortunately, some translators tend to crudely kill the juicy metaphor of "injection" contained in the term "code injection", so that it too gets translated as «внедрение кода». This confusion saddens me, but there is nothing I can do about it. Just keep in mind that here «внедрение кода» means introducing code into production, not into someone else's code.

I tried to use the phrase «в Гитхабе» to mean "at the company GitHub Inc." and «на Гитхабе» to mean "on the site GitHub.com". Admittedly, it is sometimes hard to keep the two apart.

Git-flow issues


I travel all over the place teaching Git to people, and in nearly every class and workshop I have given recently I have been asked what I think about git-flow. I always answer that I think it is really great: it took a system (Git) that has a million possible workflows and documented one well-tested, flexible workflow that works for many developers in a fairly straightforward way. It has also become something of a standard, so developers can move between projects or companies and already be familiar with this standardized workflow.

However, git-flow also has its problems. I have repeatedly heard from people who dislike that feature branches are started from develop rather than master, or who dislike the way it handles hotfixes, but those problems are fairly minor.

For me, one of the biggest problems with git-flow is that it is more complicated than most developers and teams actually need. It is complicated enough that a helper script was developed to help enforce the flow. That is cool in itself, but the helper cannot be driven from a Git GUI, only from the command line. So the only people who have to learn the complex workflow really well, because they must perform every step by hand, are the same people who are not comfortable enough with the system to use it from the command line. This can be a huge problem.

All of these problems can be solved easily by following a much simpler process. At GitHub, we do not use git-flow. Our workflow is, and always has been, based on a simpler approach to Git.

Its simplicity has several advantages. First, it is easy for people to understand, so they pick it up quickly and rarely, if ever, make mistakes that have to be rolled back. In addition, no wrapper script is needed to help enforce the process, so using a GUI (or any other tool) is not a problem.

GitHub Flow


So, why don't we use git-flow at GitHub? The main issue is that we deploy all the time. The git-flow process is designed largely around the "release". We don't really have "releases", because we deploy to production (the main live server) every day, often several times a day. We can do so by sending commands to a bot in the same chat room where our CI (integration testing) results are displayed. We try to make testing and shipping as simple as possible so that every employee feels comfortable doing it.

Deploying this often has a number of advantages. If you deploy every few hours, it is almost impossible to introduce a large number of big bugs. Small issues can creep in, but they can be fixed and redeployed very quickly. Normally you would have to do a "hotfix" or otherwise step outside the normal process, but for us it is simply part of the normal process: in GitHub Flow there is no difference between a hotfix and a very small feature.

Another advantage of deploying all the time is the ability to address issues of all kinds quickly. We can respond to security issues that are brought to our attention, or implement small but interesting feature requests, using the very same process we use for normal or even large feature development. It is all the same process, and it is all very simple.

How do we do it


So, what is GitHub Flow?

  • The contents of the master branch are always deployable.
     
  • To work on something new, create a descriptively named branch off of master (for example, new-oauth2-scopes).
     
  • Commit to that branch locally and regularly push your work to the same-named branch on the server.
     
  • When you need feedback or help, or when you think the branch is ready for merging, open a pull request.
     
  • After someone else has reviewed and signed off on the feature, you can merge your branch into master.
     
  • Once it is merged and pushed to master, you can and should deploy immediately.
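
The steps above can be sketched as a shell session. This is only an illustration under stated assumptions, not GitHub's own tooling: it assumes Git is installed, uses a bare repository in a temporary directory as a stand-in for the server, and reuses the example branch name new-oauth2-scopes. The file names, commit messages, and identity values are invented. The pull request, review, and deployment steps happen outside of Git, so they appear only as comments.

```shell
set -e
work=$(mktemp -d)                        # scratch area, safe to delete afterwards
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

git init -q --bare "$work/server.git"    # hypothetical stand-in for the shared server
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master  # make sure the first branch is called master
git remote add origin "$work/server.git"

echo "app" > app.txt
git add app.txt
git commit -q -m "Initial deployable state"
git push -q -u origin master

# Branch off master with a descriptive name.
git checkout -q -b new-oauth2-scopes

# Commit locally and push regularly to the same-named branch on the server.
echo "scopes" > scopes.txt
git add scopes.txt
git commit -q -m "Add OAuth2 scopes"
git push -q -u origin new-oauth2-scopes

# Open a pull request on GitHub and wait for review (happens on the site).

# After sign-off, merge into master and push.
git checkout -q master
git merge -q --no-ff -m "Merge new-oauth2-scopes" new-oauth2-scopes
git push -q origin master

# Deploy master (site-specific, not shown).
```

The --no-ff flag is a choice made for this sketch so that the merge stays visible in history; the article does not say which merge style GitHub uses.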

That's the whole workflow. It is very simple and very effective, and it works for fairly large teams: GitHub has 35 employees now, of whom maybe fifteen to twenty work on the same project (github.com) at the same time. I think most development teams (groups working on the same logical code at the same time, where conflicts can arise) are about this size or smaller. That is especially true of teams progressive enough to be doing rapid and consistent deployments.

So, let's take a look at each step in order.

The contents of the master branch are always deployable.


This is basically the only hard rule of the whole system. There is only one branch that always has a special meaning, and we named it master. For us, this means that the code on that branch has been deployed to production or, at worst, will be within a few hours. It is very rare for this branch to be rewound by a few commits to undo work: if there is a problem, the commits are reverted or entirely new commits fix it, but the branch itself almost never moves backward.
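
One way to undo a bad change without moving master backward is git revert, which records a new commit that undoes an earlier one. The sketch below runs in a throwaway repository; the file name, commit messages, and identity values are invented for illustration, and it is not meant to describe GitHub's exact procedure.

```shell
set -e
repo=$(mktemp -d)                        # throwaway repository
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master  # make sure the branch is called master

echo "stable" > app.txt
git add app.txt
git commit -q -m "Stable version"

echo "broken" > app.txt
git commit -q -a -m "Change that turns out to be bad"

# Undo the bad commit by adding a new commit on top; history is not rewritten.
git revert --no-edit HEAD > /dev/null

commits=$(git rev-list --count master)   # commits now on master
content=$(cat app.txt)                   # current content of the file
echo "$commits commits, app.txt contains: $content"
```

The branch ends up one commit longer rather than shorter, so anyone who already based work on the bad commit can still merge cleanly.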

The master branch is stable, and it is always, always safe to deploy from it or to create new branches off of it. If you push something to master that is untested or that breaks the build, you break the "social contract" of the development team, and you normally feel pretty bad about it. Every branch we push has tests run on it, and the results are reported into the chat room, so if you haven't run the tests locally, you can push the branch (even one with a single commit) to the server and wait for Jenkins to report whether all the tests passed.

Create descriptively named branches off of master


When you want to start work on something new, create a descriptively named branch off of the stable master branch. (Examples in the GitHub codebase right now are user-content-cache-key, submodules-init-task, and redis2-transition.) Naming branches this way has several advantages. For one, running a fetch is enough to see the topics everyone else has been working on. For another, if you leave a branch for a while and come back to it later, the name makes it easier to remember what it was about.

This is also nice because when we go to the branch list page on GitHub, it is easy to see which branches have been worked on recently and roughly how much work they contain.

[screenshot]

It is almost like a list of upcoming features with a rough estimate of their current status. If you are not using this page, you should know that it has some cool features: it shows only the branches that have unique work on them relative to your currently selected branch, and it sorts them so that the most recently worked on are at the top. If I get curious, I can click the "Compare" button to see the actual unified diff and the list of commits unique to that branch.

As of this writing, we have 44 branches with unmerged code in our repository, but it is also clear that only nine or ten of them have had code pushed in the last week.
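
A rough command-line counterpart to that branch list is git branch --no-merged, which shows branches holding commits that master does not yet contain, together with git log master..branch, which lists those commits. The sketch below builds a throwaway repository to demonstrate this; the branch names are borrowed from the examples above, and the files, commit messages, and identity values are made up.

```shell
set -e
repo=$(mktemp -d)                        # throwaway repository
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master  # make sure the branch is called master
echo "base" > app.txt
git add app.txt
git commit -q -m "Base"

# One branch that has already been merged into master...
git checkout -q -b submodules-init-task
echo "task" > task.txt
git add task.txt
git commit -q -m "Add init task"
git checkout -q master
git merge -q --no-ff -m "Merge submodules-init-task" submodules-init-task

# ...and one that still carries unmerged work.
git checkout -q -b redis2-transition
echo "redis" > redis.txt
git add redis.txt
git commit -q -m "Start redis2 transition"
git checkout -q master

# Branches with commits that master lacks, most recently committed first.
unmerged=$(git branch --sort=-committerdate --format='%(refname:short)' --no-merged master)
echo "$unmerged"

# The commits unique to one of those branches.
git log --oneline master..redis2-transition
```

Only redis2-transition is listed, because the other branch has nothing left that master lacks.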

Push to named branches constantly


Another big difference from git-flow is that we push named branches to the server constantly. From a deployment standpoint, master is the only branch you really have to worry about, so a push will not confuse anyone or break anything: everything that is not master is simply code that is being worked on.

This also keeps our work backed up in case of a lost laptop or a hard drive failure. More importantly, it keeps everyone in constant communication. A simple "git fetch" gives you a TODO list of what everyone is currently working on.

$ git fetch
remote: Counting objects: 3032, done.
remote: Compressing objects: 100% (947/947), done.
remote: Total 2672 (delta 1993), reused 2328 (delta 1689)
Receiving objects: 100% (2672/2672), 16.45 MiB | 1.04 MiB/s, done.
Resolving deltas: 100% (1993/1993), completed with 213 local objects.
From github.com:github/github
 * [new branch]      charlock-linguist -> origin/charlock-linguist
 * [new branch]      enterprise-non-config -> origin/enterprise-non-config
 * [new branch]      fi-signup  -> origin/fi-signup
   2647a42..4d6d2c2  git-http-server -> origin/git-http-server
 * [new branch]      knyle-style-commits -> origin/knyle-style-commits
   157d2b0..d33e00d  master     -> origin/master
 * [new branch]      menu-behavior-act-i -> origin/menu-behavior-act-i
   ea1c5e2..dfd315a  no-inline-js-config -> origin/no-inline-js-config
 * [new branch]      svg-tests  -> origin/svg-tests
   87bb870..9da23f3  view-modes -> origin/view-modes
 * [new branch]      wild-renaming -> origin/wild-renaming

It also lets everyone see, on the GitHub branch list page, what everyone else is working on, so they can inspect the code and decide whether they want to help with something.

Open a pull request at any time


GitHub has an amazing code review system called pull requests, and I fear not enough developers are fully aware of it. Many people use it for ordinary open source work: fork a project, update the code, send a pull request to the maintainer. However, it can just as easily be used as an internal code review system, which is how we use it.

In fact, we use it more as a way of viewing and discussing branches than as a request to merge. GitHub supports sending a pull request from one branch to another within the same project (public or private), so you can use it to say "I need help or review on this code" and not just "please merge this code".

[screenshot]

In this illustration, you can see Josh asking Brian to review the code, and Brian offering advice on one of the lines of code. Further down you can see Josh acknowledging Brian's concerns and pushing more code to address them.

[screenshot]

Finally, you can see that the code is still in the trial phase: this is not yet a branch that is ready for deployment. We use pull requests to review code long before we actually want to merge it into master and deploy it.

If you are stuck on a feature or branch and need help or advice, or if you are a developer and need a designer to review your work (or vice versa), or even if you have little or no code but some screenshot comps and general ideas, you open a pull request. GitHub lets you bring people into the discussion with an @mention, so if you need a review or feedback from a specific person, you can mention them in the pull request (you saw Josh do this above).

This is cool because pull requests let you comment on individual lines of the unified diff, on single commits, or on the pull request as a whole, and all of the comments are pulled into a single conversation. You can also keep pushing code to the branch, so if someone points out a mistake or something you forgot to handle, you can push the fix to the same branch. GitHub shows the new commits in the conversation, and you can keep iterating on the branch that way.

If the branch has been open for too long and you feel it is getting out of sync with master, you can merge master into your branch and keep working. In the pull request discussion or in the commit list, it is easy to see when the branch was last brought up to date with master.
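
Bringing a long-lived branch up to date is an ordinary merge of master into the branch. Below is a minimal sketch in a throwaway repository; it reuses the example branch name redis2-transition, while the file names, commit messages, and identity values are invented.

```shell
set -e
repo=$(mktemp -d)                        # throwaway repository
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master  # make sure the branch is called master
echo "base" > app.txt
git add app.txt
git commit -q -m "Base"

# Work starts on a topic branch.
git checkout -q -b redis2-transition
echo "redis" > redis.txt
git add redis.txt
git commit -q -m "Work on the branch"

# Meanwhile master moves on.
git checkout -q master
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Unrelated fix on master"

# Bring the latest master into the topic branch and keep working.
git checkout -q redis2-transition
git merge -q -m "Merge master into redis2-transition" master

# The merge commit in the branch history marks when it was last synced.
git log --oneline --merges -1
```

After the merge, the topic branch contains both its own work and everything on master, so a later merge back into master has less to reconcile.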

[screenshot]

When the work on the branch is really and truly finished and you feel it is ready to deploy, you can move on to the next step.

Merge only after pull request review


We do not work directly on the master branch, but we also do not merge a named branch the moment we consider it finished. First we try to get sign-off from someone else in the company. It usually takes the form of a "+1", an emoji, or a ":shipit:" comment, but we do try to get someone else to look at the branch.

[screenshot]

Once we have that approval and the branch has passed CI, we can merge it into master for deployment; at that point the pull request is closed automatically.

Deploy immediately after review


Finally, your work is finished and merged into the master branch. This means that even if you do not deploy it right now, other people will base new branches on it, and the next deployment (which will likely happen within a few hours) will push it out. Since it is very unpleasant to find that someone else deployed your code and suffered for it because it broke something, people tend to check carefully that their merged work is stable, and to deploy their changes themselves.

Our Campfire bot, named hubot, can do deployments for any of the employees. It is enough to send the command "hubot deploy github to production" in chat, and the code goes to production, where all the necessary processes are restarted with zero downtime. You can judge for yourself how often this happens at GitHub:

[screenshot]

As you can see, six different people (including a support person and a designer) deployed more than two dozen times in one day.

I have done all of the above for branches with a single commit containing a one-line change. The process is simple, straightforward, scalable, and powerful. You can do the same thing with a feature branch of fifty commits that took two weeks of work, or with a single commit that took about ten minutes. The process is simple and light enough that following it is not annoying even for a single commit, so people rarely skip or bypass steps, unless the change is so small and insignificant that it does not matter.

Our workflow is both powerful and incredibly simple. I think many will agree that GitHub is a very stable platform, that problems are addressed quickly if they arise at all, and that new features are introduced at a rapid pace. No compromise in quality or stability is made to gain speed, simplicity, or fewer process steps.

Conclusion


Git itself is fairly hard to understand. Using it in a workflow that is more complex than necessary only adds mental overhead to everybody's day. I will always advocate using the simplest possible system that works for your team, for as long as it keeps working, and adding complexity only when you cannot avoid it.

For teams that have to prepare formal releases at long intervals (from a few weeks to a few months between releases), create hotfixes and maintenance branches, and do the other things that arise from shipping so infrequently, git-flow makes sense, and I would highly recommend it.

For teams that have built a culture of shipping, that push to production every day, and that are constantly testing and deploying, I would recommend a simpler workflow, such as GitHub Flow.