Help me, because I think Martin Fowler has a Merge Paranoia

There must be something very wrong with me: for the first time in my life I think that Martin Fowler is wrong on a specific topic. And, since Martin is Martin and I’m just a humble developer (Arialdo who?), I’m likely to be the one who’s completely off track. Nevertheless, no matter how much I dig into the topic, I’m not able to convince myself that Martin Fowler’s arguments about Feature Branching, Continuous Integration and Feature Toggling are right.

Please, help me to understand what I’m missing.

Continuous Integration is a weak solution to a wrong problem

I read the latest, amazing article by Martin Fowler, OpportunisticRefactoring. It is, as usual, a very inspiring post.

When Martin comes to suggest what not to do in order to achieve a good and proper refactoring, he writes (again) against Feature Branching. Uh? Is Feature Branching something I should avoid if I want to do a proper refactoring? Why, Martin, why?

There’s no Continuous Integration vs Feature Branching dichotomy

Chris Birmele, in the MSDN Branching and Merging Primer, and Jeff Atwood, in his famous Coding Horror post Software Branching And Parallel Universes, described a set of Branching Anti-patterns. I think that this one describes how I see Martin Fowler when he writes about Feature Branching:

Merge Paranoia: Merging is avoided at all cost, due to a fear of the consequences.

Martin Fowler wrote against Feature Branching here and in other articles about Continuous Integration. So did Sarah Taraporewalla in Experience Report: Branch by Feature.

To me, Continuous Integration is a weak solution to a wrong problem. To Martin Fowler, the problem is the branch length: it’s better to merge on the trunk, frequently, in order to minimize merge problems. To me the problem is the version control system you are using: if you are afraid to merge, you should change your versioning tool because VCS are meant to do merges and to help you in your refactoring activities.

Well, I think Martin Fowler could have a fear of merges because he’s using the wrong tool. He likes Subversion and, actually, branching and merging in SVN are a nightmare. Is it the same with modern DVCS, like git?

I completely agree with James McKay and what he wrote in Why Does Martin Fowler Not Understand Feature Branches:

It seems that all the FUD about feature branches boils down to one thing: we should restrict ourselves to trunk-based development because Continuous Integration is the One True Way to do configuration management. But let’s just take a step back and ask ourselves why we do Continuous Integration in the first place? Largely because we were restricted to trunk-only development. If you check in code that breaks the build, then go home, and then someone else checks it out, they can’t get anything done till you return and fix it. You constantly need to have a ready-to-deploy version of your code in case of security vulnerabilities. While this isn’t the whole picture, and there are other arguments in favour of Continuous Integration, it is at least partly a hack to work around the restrictions of trunk-based development.

The vagueness of Martin Fowler’s “cool rule of thumb”—that you should check in “at least once a day” is testimony to this. Cargo culting your way through advice like that will lead to you checking in incomplete, buggy or even downright broken code, and the need for high-maintenance hacks such as feature toggles to compensate.
A hack to work around the limitations of another workaround for the limitations of our tooling.

In another excellent post about this topic, McKay proposes a smart question:

If DVCS had come first, would Continuous Integration ever have been invented?

I’m pretty sure the answer is no.

One of the techniques Martin Fowler proposes as a good alternative to Feature Branching is FeatureToggle: don’t branch; always commit on the trunk; if your feature is buggy or incomplete or not yet planned, disable it with some configuration. OK, I’m oversimplifying. Yet I think it’s one of the weirdest and weakest approaches I’ve ever heard of. Again, I agree with the arguments of James McKay: believe me, read Why Does Martin Fowler Not Understand Feature Branches and Feature Branches Versus Continuous Integration; they are absolutely clarifying. To McKay, Feature Toggling is much, much more dangerous than Feature Branching.
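
For readers who have never met the technique, a feature toggle can be sketched roughly as follows. This is only a minimal illustration in Python, not Fowler’s code: the names FEATURE_FLAGS, is_enabled and checkout_total, and the 10% discount, are hypothetical.

```python
from typing import Iterable

# Hypothetical configuration, committed to the trunk together with the code.
FEATURE_FLAGS = {"new_discount_engine": False}  # unfinished, so switched off


def is_enabled(name: str) -> bool:
    """Return True only for features that are explicitly switched on."""
    return FEATURE_FLAGS.get(name, False)


def checkout_total(prices: Iterable[float]) -> float:
    """Total a basket, taking the unfinished code path only when toggled on."""
    total = sum(prices)
    if is_enabled("new_discount_engine"):
        # Incomplete feature: already on the trunk, but dormant by default.
        total *= 0.9
    return total
```

Both code paths ship in the same build; only the configuration decides which one runs.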

The real world

Anyway, I’m still confused: how could Martin be wrong? My doubt grows when I think about Feature Branching in the real world.

Linux is probably one of the biggest software developer communities ever.
Linus Torvalds recently posted this comment on his Google+ page:

Merging, merging, merging. The linux-next tree has something like 8600 commits, and I’ve merged about 5600 in the last few days. So more than halfway done. And not everything from linux-next tends to get merged.

Yep. Linux is using Feature Branching. And a single man merged thousands of branches. These are two tiny samples of how Linux Kernel git tree appears:

And you know what? No one merges to the trunk. Martin Fowler wrote about DVCS and centralized VCS, but strangely did not mention the fact that with DVCS developers don’t push their code to the trunk. They have their own repository, and the release manager fetches from them. That’s why DVCS are distributed. It’s not just a matter of speed, as Martin writes.

Linux Kernel developers use Feature Branching. Martin, face it: Feature Branch can work. And it works, actually.

Jilles van Gurp noticed in his post Using Git and feature branches effectively:

It’s not the practice of feature branching that is the problem but the fact that testing and continuous integration are not decentralized in a lot of organizations. In other words until your changes land on the central branch, you are not doing the due diligence of testing. Even worse, you are not making sure you have tested your changes before you add them to the main branch.

You can’t do decentralized versioning unless you also decentralize your testing and integration. Git has value when used as a SVN replacement. Git has more value when used as a DVCS. There is no good reason why you can’t do decentralized testing and integration with git. Rather the opposite: it has been designed with exactly this in mind. The whole point of git is divide and conquer. Break the changes up: decentralize the testing and integration work and solve the vast majority of problems before change is pushed upstream. If you push your problems along with your changes you are doing it wrong. Decentralized integration is a very clever strategy that is based on the notion that the effort involved with testing and integration scales exponentially rather than linearly with the amount of change. By decentralizing you have many people working on smaller sets of changes that are much easier to deal with: the collaborative effort on testing decreases and when it all comes together you have a much smaller set of problems to deal with.

Again: I agree.

Feature Branching (on a private repository) and merging (with a smart, modern tool like git, not with an old CVS-like tool such as SVN), to me, means the freedom to create dozens of branches each day, experimenting and feeling safe. I understand that if I were forced to use SVN or TFS (I am, these days) I would feel the need for some other kind of safeguard (like Continuous Integration).

Hence, I think I’ll try to practice Opportunistic Refactoring and I’ll be heavily using Feature Branching, because I think it can help me refactor.

What am I missing?

22 thoughts on “Help me, because I think Martin Fowler has a Merge Paranoia”

  1. Excellent post. Don’t just take the guru’s word for it, keep challenging. Commit to trunk+with feature toggle is not wrong, it’s a solution to a combination of problems: svn+easily breaking code+reliance on one authoritative build (ci).

    If you don’t have that situation, it just doesn’t compute. But Fowler and TW are marketing this as the golden solution, that could be about profits more than professionalism.

    I like the suggestion to go the dvcs way and decentralize completely.

    1. I strongly disagree. Challenging authority is always good, but in this case he is simply wrong. Feature branches and continuous integration are solutions to different problems and in some situations (Linux kernel development) feature branches are much preferable, while in others CI is superior.

      It is absolutely untrue to say that CI is a weak solution to the problem of working with a weak VCS. What is true is that SVN sucks for the workflow used on the Linux kernel and it is also true that this workflow is correct for this project. So yes, SVN sucks for Linux kernel development. It doesn’t follow that CI is a weak solution to using a weak tool.

      Having extensive experience with CVS, SVN and Mercurial I can tell you SVN is an *excellent* tool for colocated development, almost as good as Mercurial. If you disagree, then you are probably misunderstanding what CI means. No shame in that, but it is better to increase your understanding.

      And the suggestion that TW’s and Fowler’s advocacy are based on money is beneath you Machiel.

      To get back to the weak solution thing: I’d say it’s almost the other way around.

      It is true that DVCSs solve many problems associated with having branches, the book-keeping aspect as I like to call it, which mainly involves repeated merges. There they are clearly superior to SVN as it was when I last used it in anger. Since then SVN has added merge tracking, but I don’t know how well that works in practice, since I’m using Mercurial nowadays. DVCSs aren’t perfect though, for instance Mercurial still has problems with criss-cross merges.

      But that’s only a minor issue. The real problem is the other half of the problem that DVCSs do not solve and that is refactoring. Remember the early days of XP when we used to talk about Merciless Refactoring and Continuous Integration? We weren’t kidding, we really meant the Merciless and Continuous part. CI means merging *all the time*, both ways, and many times a day. If you do merciless refactoring on separate branches, then you *will* get into trouble with separate branches no matter how good your tool is. If you’re splitting up a long file, discovering composite objects etc and I’m adding new functionality, then we will get conflicts.

      When we’re practicing CI however, we find that we can easily interleave small commits so that in most cases conflicts simply evaporate because the second person to make a change adds it on top of the first change that’s already been made. This is an enormously natural workflow if you are using the red-green-refactor cycle of TDD. You should be able to merge every time you are at green.

      And this approach solves both the book-keeping associated with repeated merges and the refactoring issues, whereas DVCSs by themselves only solve the former. Because of that I consider it a vastly superior solution to using feature branches, unless you are in a situation like that of the Linux kernel developers where you can’t do CI because the team simply isn’t and cannot be that closely integrated. It is better to be in a situation where you can do CI than in one where you can’t, but sometimes you don’t have a choice.

      If you can do CI, you should, because it will be vastly superior. The tools have very little to do with it, as I said SVN is almost as good for this as Mercurial. If you don’t like this style of working, fine, nobody is holding a gun to your head. But if you are going to criticise the approach, at least make sure you understand it. And Machiel, frankly I’m disappointed in your comments, because from our previous interactions I got the mistaken impression that you did understand TDD and CI.

      1. SVN is a terrible, terrible tool. It may be the best of the CVS-style tools, but CVS-style tools are a plague.

        There’s nothing that you can do with SVN which you can’t with Git or Mercurial. However, there’s plenty of stuff you can do with Git that you cannot with SVN. I’ve seen “merge paranoia” for real in companies and it’s a terrible thing to witness. People were locking the mainline for weeks to merge branches, and they kept multiple .old copies of their folders because they were not allowed to commit while someone had locked their files.
        There’s nothing less efficient than an SVN based system in our current world.

        People who claim that SVN is a good tool should read the “HgInit: SVN Re-Education” blog.

        Where I work, we have a combination of both techniques. We use git, we are decentralized, and we perform decentralized testing. We only work on features branches. When everything is reviewed, tested, validated, and the production branch is integrated in the feature branch, we do the opposite move and send the feature branch to the production branch. We then have a CI server that will process tests, and allow the build to be released or not.

        Best of the two worlds in my opinion.

      2. Sorry, if you think SVN is a bad tool for CI, then you simply don’t understand what CI is. SVN is excellent for CI, so is Mercurial and so is Git. You’re using feature branches for your work, which is fine if it works for you. But that by definition isn’t CI, which means everybody commits to trunk a couple of times a day, every commit triggers a build that gives a warning if it fails (‘breaks the build’), and every time a build breaks it is fixed within ten minutes.

        Many people confuse CI with continuously building a deployable unit, and running a suite of tests and metrics against it (which you might call CIT or Continuous Integration Testing). Both are excellent practices, and they can be used together very effectively, but they can also be used independently and only one of them is called CI.

        If you use feature branches, then I agree SVN probably isn’t the tool you want to be using, though I haven’t used SVN in quite a while and it apparently now has much better merge tracking than it used to. You can still use something like Jenkins to great effect, but what you’re doing then just isn’t called CI.

        That’s basically the whole misunderstanding underlying this post and the subsequent discussion: the technical term CI doesn’t mean ‘using something like Jenkins’, that’s called CIT, while CI is a form of trunk-based development. Almost everybody likes CIT, hardly anyone uses, likes or even understands CI (properly defined).

  2. Good post. I also think that feature toggle is weak solution and actually increases risks and testing efforts.

  3. Old post I realize, but I’d like to ask a few questions.
    How many conflicts did Torvalds resolve while merging those branches? How different is conflict resolution on git than svn? If your feature branch merges clean without conflicts, what was the need for your branch? For an organization with a development team larger than half a dozen people, what is the cost, both dollars and hours, of maintaining a test environment for every developer? What’s the difference between a ‘feature branch’ and the local copy of the repo I have on my dev machine, other than the length of time I can be out in the woods for? If you have 10 feature branches a day on your local repo, how many different tasks are you working on concurrently? How does this affect your productivity?

    What it boils down to is this: If your feature branch is completely independent of all other work in your system, including the integration points into the existing larger system, then it will work great. If you have to resolve conflicts then this is hard. The difficulty of merging conflicts grows exponentially with the size of the conflict, not with the number of merges. I have used git, svn, and perforce, and my experience resolving conflicts has been about the same with each of these VCS. I’m curious what size groups you’ve worked in that you’ve never experienced someone breaking your code by merging out your work, or just plain breaking your code by merging incorrectly. Git doesn’t solve this problem, chunk based management makes changes across file renames better, but does nothing to address truly conflicting changes. Continuous integration simply encourages more frequent integrations which means less complicated merges, which has to do with conflict resolution, not how easy it is to ask git or perforce to merge for you. Personally I have had no trouble googling the svn merge commands and using them to merge changelists. You can create a branching structure in svn that merges cleanly, sure you have to be more careful setting it up than git, but most of the time I see people bitching about merges in svn are because they didn’t merge properly in the first place and then it is hard.

    I like git as much as the next guy, I just don’t buy into the idea that because merging conflictless changes is easier is a reason to branch for longer periods of time. I use feature branches all the time, main lives on the perforce server and my branch lives on my dev machine. When I complete a unit of work I merge my feature branch into the trunk by submitting my changelist. The longer I wait to merge the more difficult it becomes, when conflicts exist. If there are no conflicts, then it truly doesn’t matter.

    “In general, these studies have disclosed that people show severe interference when even very simple tasks are performed at the same time”
    http://en.wikipedia.org/wiki/Human_multitasking

    1. Hi Chris.
      Yes, it’s an old post, yet this is a topic I still love very much.
      Thanks for your interesting comments. They made me think.

      A necessary preliminary remark: Feature Branching does not mean producing long-lived branches. On the contrary, one of the best practices in Feature Branching is keeping branches as short as possible. That is, integrating very often. In other words: there’s no dichotomy between Feature Branching and integrating often. Whoever says “Feature Branching is evil because integration is important” is missing the point. As a matter of fact, Feature Branching is a means to a better and easier integration of code. A trunk-based (non) branching strategy just seems simpler: in fact, it hides a problem, pretending it does not exist.

      Also: I use Continuous Integration together with Feature Branching. There’s no dichotomy. Continuous Integration is not an excuse to commit to trunk or to avoid developing features in parallel. I like that Continuous Integration encourages me to integrate often with others; on the other hand, I like that Feature Branching lets me integrate with other developers without forcing them to take my incomplete code, which a trunk-based strategy would not allow.

      I’ll try to explain my point of view.

      You asked how many conflicts Torvalds resolved while merging his branches.
      Well, actually, I think: 0. He would be a fool if he decided to take on the useless and dangerous role of branch integrator.

      As a project maintainer, with git (not with svn) you have your own repository. No one but you can push on it. Instead, you can pull from other repositories, inspect new features and fixes and decide what to integrate. Yet, you are not called to resolve integration problems.
      How is it possible?

      A rule you might set (and, in fact, one I used when I had to coordinate several developers) is: only accept contributions that merge with a fast-forward merge. That is: only integrate already integrated contributions.

      Please, stop for a while and think about the implications of this seemingly silly and contradictory rule.

      The basic idea is: decentralize the integration. Ask your fellows to preemptively integrate their contributions, so that you will be able to review them and merge them with a fast-forward merge. No conflicts, by design. Simply: as a project maintainer or release manager, strictly refuse to do the dirty job of integrator.
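
      To make the rule concrete, here is a minimal sketch in Python, under the assumption that history is modelled as a plain dictionary from each commit to its parents. The commit names and both function names are made up for illustration; they are not git’s API. A contribution can be fast-forwarded exactly when the maintainer’s current head is already part of the contribution’s history:

      ```python
      def is_ancestor(parents, ancestor, commit):
          """Return True if `ancestor` is reachable from `commit` via parent links."""
          stack, seen = [commit], set()
          while stack:
              current = stack.pop()
              if current == ancestor:
                  return True
              if current in seen:
                  continue
              seen.add(current)
              stack.extend(parents.get(current, []))
          return False


      def can_fast_forward(parents, mainline_head, contribution_head):
          """A merge is a fast-forward when the mainline head is already
          contained in the history of the contribution."""
          return is_ancestor(parents, mainline_head, contribution_head)


      # Hypothetical history: A <- B is the mainline; two features start from A.
      history = {
          "A": [],
          "B": ["A"],         # current mainline head
          "F1": ["A"],        # feature that was never integrated with B
          "F2": ["A"],        # another feature built on A...
          "M2": ["F2", "B"],  # ...whose author merged the mainline head into it
      }

      print(can_fast_forward(history, "B", "F1"))  # False: the maintainer would have to integrate
      print(can_fast_forward(history, "B", "M2"))  # True: already integrated by its author
      ```

      With git itself, a maintainer gets the same effect by merging with `git merge --ff-only`, which refuses any merge that is not a fast-forward.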

      As a DVCS, git enforces this policy. Subversion, which is centralized, allows it but does not enforce it.

      Why?
      Not because git has a better merge algorithm (which it has, by the way): but because of its inherently distributed nature.
      Subversion forces people to push and integrate their contributions to a main repository. Distributed VCS like git allow very different workflows.
      For example: a team can manage to have the integration phase on a specific repository, adopting this workflow:


      Or, integration of components and features could be organized in layers, so that several Release Managers can maintain specific subsets of the system, coordinating the integration of the modules, while another release manager integrates at a higher level, like in


      Here other workflows are described.
      The idea is very simple: as a developer, you start a feature branch from dev; when you are done and you are asked to integrate the feature into a release, you must merge your feature branch to the release branch. Note: not the release branch to the feature branch. The opposite. Doing this, you must resolve the conflicts. The other developers will resolve their conflicts. You will also integrate functionally, eventually producing an end-to-end test demonstrating that your feature branch does not break the release. Then you push.

      You wrote: “When I complete a unit of work I merge my feature branch into the trunk by submitting my changelist. The longer I wait to merge the more difficult it becomes, when conflicts exist.”

      I don’t. I don’t merge to the trunk when I have finished. I first get the trunk’s code, before merging, so that the final merge will be smooth. Since merging this way doesn’t alter the trunk, I can merge long before I finish my job. Sometimes I merge just to verify I’m not diverging too much. Then I can even carry on with my feature branch excluding the trunk again, that is, omitting the merge.

      The release branch is still pristine: since I cannot push to the Integration Repository, I have no control, and no possibility to break the system.

      The Release Manager can pull my branch, and merge it to the release branch. Since my branch has already been merged with the release branch, the opposite will work too: the release branch will surely integrate smoothly with the new feature. If not, my branch will be rejected long before breaking the system.

      Why am I using a feature branch, after all? For two reasons: first, I don’t want the broken code or incomplete features partly developed by other programmers (and I guess they don’t want to be bothered with my spikes and incomplete features either) and, second, because my manager must have the freedom to include or exclude any feature for the next release and decide which functionality to include at the last responsible moment. Always merging to the trunk is like deciding from the beginning what to include. That means: should a single feature not be ready (or should the customer decide to wait), the deployment can be compromised, since the trunk cannot be de-integrated.

      But the key reason for Feature Branching is in the fact that distributed VCSs allow distribution of tests and integration.

      I wonder why git fans don’t point at this as the greatest of git‘s features. To me, it is.

      Let’s start from a fact: Feature Branching can be dramatically hard with huge projects.
      What I claim is: at least, it allows you to face and manage the problem. As a matter of fact, Feature Branching is a tool for tackling the integration problem at its heart.

      You asked: “For an organization with a development team larger than half a dozen people, what is the cost, both dollars and hours, of maintaining a test environment for every developer?”

      A lot, surely.
      Nevertheless, you must compare it with the cost of a centralized, all-inclusive test environment.
      Please, accept this analogy: think of a car production line. What is the cost of having a specialized test environment for the lighting equipment, another for the shock absorber system, and 100 others for each of the 100 single parts of the engine? Why don’t they have a single, central point where everything is tested after the car has been built, that is, at the integration phase?

      A DVCS is dramatically effective because it allows the distribution of the testing and the integration of the code.

      I think the core of your comment is

      What it boils down to is this: If your feature branch is completely independent of all other work in your system, including the integration points into the existing larger system, then it will work great. If you have to resolve conflicts then this is hard. The difficulty of merging conflicts grows exponentially with the size of the conflict, not with the number of merges.

      Yes, I agree: the difficulty of merging conflicts grows exponentially with the size of the conflict. And this, in turn, grows with the size of the project. This is completely independent of the use of git or Subversion or whatever.

      A team has few tools to fight this reality. One of them is the divide et impera principle: huge problems must be divided into several smaller problems, which are easier to manage.

      This, of course, moves most of the difficulties into managing the interaction and integration between the parts. A huge system may be organized in smaller components, but most of the hard work is in organizing the architecture for the integration of those components.

      Conway’s Law states that “organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations”; even without Conway’s Law, it’s likely that a complex system will be developed by several independent teams and organized into several interoperating modules or layers.

      You need to produce code with high cohesion and low coupling, that is a lot of small, independent and highly specialized modules, if you want to build a huge system.

      Feature Branching can be a nightmare if two contributions violate the single responsibility principle and, even though they concern two different topics, they break the system. But if this happens, it means your system has low cohesion (the responsibility for the feature developed in a feature branch is scattered in files also present in another, ideally independent, feature branch) and high coupling (you developed a feature and you broke another).
      It’s not Feature Branching that requires high cohesion and low coupling: it’s Software Engineering.
      If someone avoids Feature Branching because it requires that independent features be independently developed, it may be a smell of a poor architecture, or a trick he’s proposing to hide the difficulty of making independent features independently developable.

      Trunk Based development, that is, Spaghetti merging, is just a means to hide the problem.

      Feature Branching is surely a hard technique to manage, especially with huge projects. On the other hand, it is huge projects themselves that are hard. Feature Branching is like n-tier architecture: both are difficult to understand, use and manage; yet, both allow a very hard problem to be managed.
      Avoiding Feature Branching and preferring a trunk-based workflow so that one can evade the problem of integrating code is like avoiding n-tier architecture and preferring a monolithic structure in order to be spared the need to manage the integration and communication between the many modules and layers.

      A last note on how many feature branches I open: I try to avoid multitasking. Once I begin a task (that is, a feature branch), I try to deliver as fast as possible. No, I don’t open another feature branch. Other developers in the same team work on parallel features, hence in parallel branches.

      I try to organize my commits around TDD/BDD tests, and those around feature branches, and those around User Stories.
      Since I work with a limited WIP, I usually work with a User Story at a time, that is, I usually don’t open a new feature branch before having delivered the previous one.

      Cheers

      1. Thanks for the response Arialdo. I totally agree that you can practice many of the principles of Continuous Integration while working on a feature branch (as you pointed out, this is possible in centralized VCS as well). I agree also that doing so pushes the burden of integration onto the developer who would like to have his stuff eventually pulled back into the mainline, which is healthy.

        The description of how to manage each developer’s feature branch also feels very sensible. In response to my comment about submitting my ‘feature branch’ every time I check in, you responded with:

        ‘I don’t. I don’t merge to the trunk when I have finished. I first get the trunk’s code, before merging, so that the final merge will be smooth.’ (I apologize for my poor formatting, I don’t know wordpress markup).

        To clarify, before submitting my changelist, I also sync my workspace to the latest, rebuild and retest locally, then submit. I still feel that this reflects the same workflow even if the toolset isn’t as nice. You follow up by stating that the merge must come in clean or it is rejected. This is valid, and to me it gets at the root of my concern of the philosophy of feature branching.

        I played soccer with some engineers at Microsoft, and one of them described the feature team model his team used for their product (I won’t name it, but one of their larger products on a 3 year release cycle). He explained that there were about 120 feature teams, most with only 1 developer. The planning phase was completed during the testing/finishing phase of the prior release cycle, so after a release everybody began to code. Everyone worked in their own feature branch and after signoff from test and stakeholders they were allowed to merge into the mainline for release. He explained that they had 5 months’ worth of ‘feature’ development, and the next 2.5 years were spent testing, bugfixing, and merging. In order to merge you had to have sign off; the first team to merge got in free because it had nothing to merge with. The team after that had to then incorporate the first team’s change into their own feature branch before achieving the clean merge back into the mainline you describe, and so on. As you can imagine, the merge problem explodes for each team down the road as each team before them is allowed to incorporate their changes into the mainline. It’s not until somebody else’s changes are brought into main that you have the opportunity to merge. This is not true of git, which is truly an advantage, but the thought of 120 teams coordinating each with each other to merge amongst themselves is hard for me to imagine in reality. The communication challenge alone is tremendous, not to mention electing which team does the dirt work, as you refer to it :).

        This is clearly a worst case scenario, but it’s also one from the most ‘successful’ software company of all time (measured in dollars, not quality or functionally or anything else measuring value for users). My point is, it’s fine to require everyone to merge in their own feature branch before having theirs pulled into the mainline, but if everyone hangs on to their feature branch until just prior to release, then you go from having nothing new to merge from the mainline, into a race to see who can merge first and seeing everyone down the line suffer.

        This leads nicely into your point about the system architecture itself. Right out of the gate I’ll say that everyone would love a well designed architecture, it just seems awfully hard to find well designed enterprise systems in real life :D. I wholeheartedly agree that if you find yourself colliding on a large scale with other feature work, you have clear cohesion/coupling issues and serious problems with your design. The point you make is that if your merge is difficult, then doing it frequently masks the fact that your poor design requires the need to resolve conflicts regularly. I didn’t fully appreciate your point until I reread it just now, and I think your point is very well made. The question then is, in the event I have a poorly designed system, what is the healthiest way to integrate into it. I have unfortunately worked in 3 poorly designed legacy systems, and only 1 greenfield project. The greenfield project was built with TDD, constant refactoring, and driving the system to be Open/Closed to the actual feature needs rather than trying to over generalize up front. I digress, back to the legacy systems. I think my attachment to Continuous Integration and working directly in main has been driven by my experience with poorly designed systems. One example within the last 6 months cropped up where a team decided to ‘feature branch’ as an excuse to ignore the challenge of working with our poorly designed system. After 5 months in which they did not merge the mainline into their branch, they attempted (and failed miserably) to merge their changes back into our system. As far as I know the feature has been scrapped because the complexity of incorporating their changes was more risk than the potential value added by the feature.

        You’re right that the root cause is that the system is poorly designed; the problem is that you can’t rewrite the system and you still want to add incremental value. In this environment, the feature throttle is the mechanism that, by design, forces you to create isolation from the existing legacy system and carve out a space for yourself that is testable, well designed, well encapsulated, and doesn’t degrade the system further. Personally I don’t put up with duplication, therefore my feature needs to throttle on and off with exactly one conditional statement in my code (managed by an abstract factory producing enabled/disabled instances of my feature work). Integrating my feature into the code from day one and incrementally checking in functionality even while the rest of the product keeps shipping is the most reliable, healthy way to ensure isolation of my feature and drive better design into the system.
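
        To make the "exactly one conditional" idea concrete, here is a minimal sketch of how such a throttle might look. The names (Recommendations, make_recommendations and the two variants) are hypothetical and invented for illustration; the real feature and factory could look quite different.

```python
from abc import ABC, abstractmethod


class Recommendations(ABC):
    """Hypothetical seam between the legacy system and the new feature."""

    @abstractmethod
    def suggest(self, cart):
        """Return a list of suggested items for the given cart."""


class DisabledRecommendations(Recommendations):
    """Variant used while the feature is throttled off: it does nothing."""

    def suggest(self, cart):
        return []


class EnabledRecommendations(Recommendations):
    """The new feature work, checked into mainline but dormant until enabled."""

    def suggest(self, cart):
        return ["accessory for " + item for item in cart]


def make_recommendations(enabled):
    """Factory holding the single conditional that knows about the throttle."""
    return EnabledRecommendations() if enabled else DisabledRecommendations()


# Callers never test the flag themselves; they only talk to the interface.
feature = make_recommendations(enabled=False)
print(feature.suggest(["phone"]))
```

        The rest of the code base only ever sees the Recommendations interface, so flipping the throttle touches one line and the unfinished work can ship dark.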

        The picture in my head that’s beginning to form is that we always want to continuously integrate our changes with other changes to the system, but the timeline and implementation of that depend on the quality of the design of your system. If your system is well designed, the complexity of feature throttles is unnecessary and adds risk that you don’t need to take on. Adding code to a Big Ball of Mud, on the other hand, is going to be painful regardless of what you do, and integrating into a system in which you are guaranteed to conflict regularly should be done early and often, Martin Fowler style, to drive isolated features that prevent degradation of the existing system and ensure healthier design in new modules.

        Thanks for this post. To be honest I was initially skeptical of your conclusions but I now have a much better appreciation of your perspective. In particular I understand why you (rightly) accuse spaghetti merging of hiding the real problem, which is poor system design. It’s just the world I’ve lived in for much of my career so I have no trouble sympathizing with Fowler’s advice :D.

      2. Hi Chris.
        I admit I have never worked in software shops as huge as Microsoft: even though I work for a company shipping worldwide, my experience has always been with projects not even comparable in size to a new release of Microsoft Windows.

        Nevertheless, I find it very astonishing that a single developer can be kept isolated for 5 months, working on a feature without sharing code with other developers for the whole cycle.
        And, even more astonishing, that the company still demands that he smoothly integrate his work with the output of 120 other developers who have been working for the same 5 months in the same isolated situation.

        To me, this is not Feature Branching: this is shooting themselves in the foot.
        This is not developing in parallel: this is deliberately looking for conflicts.

        Should my Company use this definition for Feature Branching, I’d say: “Ok, I quit. I completely agree with Martin Fowler; let’s use a trunk based branching strategy and rely on a centralized integration testing environment.”

        But DVCS should be a means for communication, and Feature Branching a tool to aid developers to share code.

        Right now, I follow these principles:

        * I open a Feature Branch for each User Story.
        * While working on that feature, I don’t multitask; I don’t open other branches, I don’t commit on other stories
        * Other developers should be working on stories about the same topic, that is, about the same epic(1). We must communicate, because we work in temporary sandboxes, but we have to integrate very often.
        * I begin the branch with an end to end test demonstrating that the feature is not present; the test will remain red and eventually become green at the end of the activity
        * I use TDD while developing
        * Now and then, I merge from the dev branch (the branch used to communicate) and I keep in touch with the project evolution
        * As soon as the user story is completed, I merge, I integrate and I carry on with another user story

        (1) Rather than having 120 teams working for 5 months on 120 features, I’d rather have 120 developers working on several user stories about the same feature, so that this single (big) feature could be delivered in much less than 5 months; in the meantime, the 120 developers have an interest in integrating often, since their single user stories will be much shorter (<1 week) and they are working on the same topic.

        Now, since each User Story is by design sized in terms of hours or days (not months!) it's implicit that a feature branch should not live more than a few days.
        In my current team, we raise an alarm when a User Story/Feature Branch is more than 2 days old. 1 day is already a smell.
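
        A rule like this is easy to check mechanically. The following is only a sketch of the thresholds described above; the function name, the status labels and the idea of feeding it the branch's opening time are assumptions made for illustration, not a description of the team's actual tooling.

```python
from datetime import datetime, timedelta

SMELL_AFTER = timedelta(days=1)   # "1 day is already a smell"
ALARM_AFTER = timedelta(days=2)   # "more than 2 days old" raises an alarm


def branch_status(opened_at, now):
    """Classify a feature branch by its age (hypothetical helper)."""
    age = now - opened_at
    if age > ALARM_AFTER:
        return "alarm"
    if age >= SMELL_AFTER:
        return "smell"
    return "ok"


# Example: a branch opened 30 hours before the check.
now = datetime(2024, 1, 10, 12, 0)
print(branch_status(now - timedelta(hours=30), now))
```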

        This is what I call Feature Branching.

        I'm aware that other companies could decide to keep a topic open for 5+ months: in this case, I think that no techniques can help; surely, Feature Branching wouldn't be the solution.

        In conclusion, I think the response to your question “in the event I have a poorly designed system, what is the healthiest way to integrate into it?” could be:

        adopt an agile, iterative, incremental strategy, and act accordingly; that is: a) use feature branches, mapping each branch to a single user story; b) let your developers work in parallel, but on the same epic/topic; c) prefer communication to isolation and fast feedback to slow feedback; that is, let each feature branch be as short as possible (merge often) and self-consistent (merge only when concluded); d) if the project is huge, map your company hierarchy onto repositories (i.e. use a Dictator Workflow, like the one I linked you before).

        Cheers!

      3. Also: I use Continuous Integration together with Feature Branching. There’s no dichotomy. Continuous Integration is not an excuse to commit to trunk or to avoid parallel developing of features.

        And here we get to the root of the problem: you are using the term Continuous Integration incorrectly. By definition you cannot be doing CI and FB together because they are mutually exclusive. There *is* in fact a dichotomy. You can do neither, or one or the other, but not both at the same time.

        That’s not to say there isn’t a place for both, there is, just that the words have conflicting definitions. Continuous Integration means what it says: integrating all the time, that is many times a day. That doesn’t mean it is the One True Way, just that it is a technical term with a very clear and unambiguous meaning. That’s a good thing. There are other approaches that could also work, and that may be preferable in some circumstances. FB for the Linux kernel is one example, and no doubt there are others too.

        Frequent Integration, or maybe Frequent Controlled Integration, could be good in some circumstances too, but it wouldn’t be Continuous Integration, simply because it wouldn’t be continuous. It doesn’t help to stretch the definition, because that would obscure discussion, as it did in this blog post. Opinions differ on whether CI is a good idea, or whether something a bit less continuous wouldn’t be better. CI isn’t a synonym for good; it doesn’t mean you’re not hip or not a coding ninja if you aren’t using it.

        CI also isn’t synonymous with using a build server or even a build server that runs unit and integration tests and reports the results. Not even if it blocks commits / pushes if it breaks. These practices are synergistic with CI, but they are not the same thing.

        The reason you don’t understand why Fowler advocates what he advocates appears to be that you misunderstand what it is he is proposing or possibly because you cannot accurately imagine what it’s like. Again, no shame in that, and maybe you still won’t like it after you’ve seen it. I must say I found it very counterintuitive when I first heard about it, but now it’s become second nature.

      4. When merging some thousands of branches, certainly not all of them can be fast-forward merges – except when the authors of those branches merged each other’s branches already, such that you are effectively only merging one branch. (Or after each merge, the author of the next branch integrates your merge result – but then I doubt you would get to 5000 branches in some days.)

        I guess each of the 5000 branches could be a ff-change from the master/trunk/mainline/… before the big merge session starts, but not anymore after the first of them is merged.
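
        The reasoning above can be sketched on a toy commit graph. A merge is a fast-forward only when the current mainline tip is an ancestor of the branch tip (git exposes the same test as git merge-base --is-ancestor). The graph, the commit names and the helper functions below are made up for illustration.

```python
def is_ancestor(parents, ancestor, commit):
    """Return True if `ancestor` is reachable from `commit` via parent links."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


def can_fast_forward(parents, mainline_tip, branch_tip):
    """A merge is a fast-forward iff the mainline tip is an ancestor of the branch tip."""
    return is_ancestor(parents, mainline_tip, branch_tip)


# Two branches, each one commit ahead of the same base commit.
parents = {"base": [], "a1": ["base"], "b1": ["base"]}

# Before the merge session starts, both branches are fast-forwardable.
print(can_fast_forward(parents, "base", "a1"), can_fast_forward(parents, "base", "b1"))

# After fast-forwarding mainline to a1, branch b no longer qualifies.
print(can_fast_forward(parents, "a1", "b1"))
```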

  4. A couple of observations:
    1. As far as I understand the Linux model forces the contributors to resolve the merges so the pain is pushed out. That doesn’t mean this pain does not exist. Linus may not feel it but it still exists. Linus only sees a filtered version of what really goes on.
    2. When looking at this you need to consider the rate of change vs. the size of the code base. When the change is small relative to the code base the chance of collisions (e.g. merge issues) becomes smaller. You also need to look at how localized the changes are. This is partly why one size does not fit all.
    3. The point Martin Fowler makes and I think a lot of people are missing or ignoring is that merging, in most complex software, is not simply a mechanical act of combining the source code. There is a lot of cross-talk between different parts of the system. Integration is the process of getting it all to work together, not the process of pulling in different lines of code from different places.
    4. Because there is a lot of testing going on in the feature branch there is a lot of waste in the process. An open source project with hundreds of contributors may not care but a small team definitely cares. When you integrate continuously you are also testing the integrated code base continuously (manually and automatically).
    5. The economics of open source development are completely different than the economics of commercial software development. If some guy somewhere spends a month developing something that is never merged in no one cares. Time those people spend merging or testing is not accounted at all.
    6. Open source development is usually less of a team activity.
    7. There’s really nothing atomic about a feature. With almost any feature you can keep slicing it. Finish the slice, check it in, other team members can see it, use it, improve on it, test it. You need a new class for your feature, implement it, check it in. You need to refactor another class, refactor it, check it in. This is a fluid/malleable medium, take advantage of that.
    8. Branches can be useful under some circumstances but they’re easy to misuse: features keep growing and timelines keep extending, panic ensues. Like many other techniques in software development you start with best practices and adjust based on your experience.
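
    Observation 2 can be illustrated with a deliberately crude model. Assume two independent changes each touch the same number of files, picked uniformly at random from the code base; that uniformity is an assumption of this sketch and it ignores exactly the locality that observation 2 warns about. The function name is hypothetical.

```python
from math import comb


def overlap_probability(total_files, files_per_change):
    """Chance that two random changes touch at least one common file.

    Toy model: each change picks `files_per_change` distinct files
    uniformly at random out of `total_files`.
    """
    n, k = total_files, files_per_change
    if k < 0 or k > n:
        raise ValueError("files_per_change must be between 0 and total_files")
    # Ways for the second change to avoid every file of the first,
    # divided by all the ways it could pick its files.
    return 1 - comb(n - k, k) / comb(n, k)


# Same size of change, growing code base: collisions become rarer.
for size in (50, 500, 5000):
    print(size, overlap_probability(size, 5))
```

    Real changes cluster around hot spots, so actual collision rates can be far higher than this model suggests.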

    1. Very smart and inspiring observations. Thanks very much, Guy.

      1. As far as I understand the Linux model forces the contributors to resolve the merges so the pain is pushed out. That doesn’t mean this pain does not exist. Linus may not feel it but it still exists. Linus only sees a filtered version of what really goes on.

      Exactly!
      The key here is: Fowler, suggesting a trunk based branching approach, suggests distributing the effort required by Integration over a time period; conversely, Feature Branching suggests distributing the Integration Phase among all developers (Thanks Martijn).

      The key is the sentence

      You can’t do decentralized versioning unless you also decentralize your testing and integration

      that I found in this excellent article by Jilles van Gurp about this topic.

      I agree with most of the arguments you are raising.
      Thanks!

      1. Feature Branching suggests to distribute the Integration Phase among more developers.

        CI distributes it among *all* developers. Everybody integrates, all the time. It wasn’t clear from your posting whether you understood this to be the case. Could you clear this up?

  5. Hi Chris,

    Great post – I have long thought about this topic myself. I think some of the arguments here are around the context of how the development team is operating.

    The merge tool to me is insignificant – it’s not how you merge, it’s when – I think your focus is here as well because you talk of “isolation” which I think is exactly the point.

    The teams I have run have always done scrum sprints – mostly 2 week iterations – and work very closely together; we actually break stories down into subtasks and make sure developers commit against those even smaller tasks.

    I like feature toggles but use them for epics only – this is when a story on its own doesn’t actually add up to enough business value to release but it is a self contained shippable piece of functionality.

    For developers the feature toggle is neither here nor there – it probably adds complexity they would rather do without but for QA it is a godsend. It means they can test the stories on staging, integrated with the rest of the product, and in the same sprint as they are developed.

    Without feature toggles and instead keeping the stories on the branch we end up putting extra load on QA when we finally reintegrate once enough stories are done to deem the epic releasable.

    It sounds like you are breaking stories down in such a way that your branches are so short lived I would ask why are you bothering with them in the first place? If the lifetime of the branch really is only a few days why not just commit to the mainline?

    The Linux examples mentioned in previous posts are a very, very different use case I think, given the nature of open-source contributions. For me feature toggling works only for a tightly knit agile team.

    Rob.

  6. I am glad to see that not everyone listens blindly to what other ‘famous’ people say.

    On the other hand, it does make sense to merge to mainline as soon as practical. Consider the extreme alternative, which some software houses try to practice, which is to merge everything at the end in a big bang. It doesn’t need any version control knowledge to realise that if you have two documents that are changing in different places (branches), eventually it will be very hard to reconcile them. Management often somehow seems to fail to grasp this relatively simple concept.

    The problem with SVN is that it *still* does not do proper merge tracking, and so you sometimes have to do a merry dance to get the correct merge, and this quite rightly scares developers away because they don’t trust it, or understand why it does the things it does.

    The main benefit of distributed systems like git, mercurial, bazaar (and other proper source management tools) is that the merges are tracked properly; it’s a fundamental thing that any source control system should be able to do. Their distributed nature is just the extra topping. SVN is a slightly better CVS, but it’s 2015, and it’s extremely wise to use the best tools for the job, and SVN just isn’t it.
