Preemptive commit comments

tl;dr version

Rule #1: write commit comments before coding
Rule #2: write what the software is supposed to do, not what you did

Long version

Dan North changed my life

You should read his epic post Introducing BDD. I don’t want to belittle his message when I say that Behavior Driven Development (which has been a real revolution in the IT industry) began with the simple attempt to replace the word “test” with the word “should”.

After being introduced to BDD (and to the fantastic chapter of Clean Code about method names), I started taking great care with the words I used in comments, commits, test names and the names of classes, variables and methods.

Since I was used to writing tests with names like

When_foo_happens_Method_bar_should_do_this()

I started thinking that I could use the same criterion for my commit comments as well.

Tell me what the software does (not what you did)

When I check out a commit, I’m interested in what I will find.

I’m aware that someone worked on the code in order to make the software provide some features (and I’m very grateful to the authors), but, ehm…, I’m not really that interested in what they did: I’m interested in what the software does in that specific snapshot.

I want an answer to the question

What is the project behavior, in this snapshot?

rather than an answer to the question

What did the programmers do in order to produce this snapshot?

Hence, a comment like

Fixed Gitk and Git Gui [spdr870]

(I took a random commit comment from the fantastic GitExtensions project)

is much less significant than something like

When the history is over 10k commits, it does not crash trying to display it with Gitk

Welcome back

After 1 week of holidays, my colleague came back. I needed to update her with all the changes we made to the code base.

I could have shown her something like:

Try reading this list of commit comments and tell me if I’m wrong: isn’t it an annoying list of what we did in her absence?
I’m sure we could do better.

Actually, I thought that a list like the following would read better, like a “what’s new” post:

Dan North, again

In other words, we began writing commit comments that describe the behavior of the software, rather than the implementation or what we did.
And we found that it led to a more readable history of the project.

My commit comments started to look like BDD method names: descriptions of a behavior.

The principles my colleagues and I started to follow were:

  • Talk about the feature, not about yourself: don’t write what you did; describe what the software does (thanks to what you did).
  • Don’t refer to the past: don’t describe what the state of the software was; tell what the state is right now. Use the present tense.
  • I know it’s now: the commit has a timestamp, so you don’t need to specify “now” or any other time reference.
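These three principles can be checked mechanically, at least roughly. What follows is a minimal sketch in Python, not a tool we actually used: the function name `check_commit_comment` and both word lists are assumptions made up for illustration, and a serious check would need a far richer vocabulary.

```python
import re

# Hypothetical word lists, chosen only for illustration.
PAST_TENSE_OPENERS = {"fixed", "added", "removed", "changed", "converted", "refactored", "updated"}
TIME_REFERENCES = {"now", "currently", "today", "yesterday"}


def check_commit_comment(comment: str) -> list[str]:
    """Return the warnings raised by a commit comment (empty list: no rule violated)."""
    words = re.findall(r"[a-z']+", comment.lower())
    if not words:
        return ["the comment is empty"]
    warnings = []
    # Principle 1: talk about the feature, not about yourself.
    if words[0] in {"i", "we"}:
        warnings.append("talks about the author, not about the software")
    # Principle 2: don't refer to the past.
    if words[0] in PAST_TENSE_OPENERS:
        warnings.append("describes what was done, not what the software does")
    # Principle 3: the commit already carries its own date.
    if any(word in TIME_REFERENCES for word in words):
        warnings.append("contains a time reference")
    return warnings
```

With these lists, `check_commit_comment("Fixed Gitk and Git Gui")` returns one warning (the comment describes what was done), while the Gitk comment about histories of over 10k commits returns an empty list.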

Here comes Eric Willeke

Then I read a tweet by Eric Willeke. He wrote

 

“Wow”, I thought, “this is what my idea was missing, this is what BDD and TDD prescribe: first assert with a test, then code. If my commit comments are like BDD assertions, they should be written before coding. Eric must be right. Let’s try his challenge.”

We tried.
Know what? It works.

Write preemptive comments

Just as you write tests before you implement a feature, you should write commit comments before you start coding.
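One way to make the rule hard to skip is to store the comment before touching the code and to reuse it verbatim when committing. The Python sketch below is only an illustration under stated assumptions: the helper names `declare`, `commit_command` and `fulfil` and the file name `PREEMPTIVE_COMMENT` are hypothetical, and the default path assumes the script runs from the root of a git repository. The only real git feature it relies on is `git commit -F <file>`, which takes the commit message from a file.

```python
import subprocess
from pathlib import Path

# Hypothetical location: inside .git, so the file is never committed by accident.
PENDING = Path(".git") / "PREEMPTIVE_COMMENT"


def declare(comment: str, pending: Path = PENDING) -> None:
    """Write the commit comment down before any code is written."""
    if pending.exists():
        # One goal at a time: finish (or drop) the pending one first.
        raise RuntimeError(f"a goal is already pending: {pending.read_text().strip()}")
    pending.write_text(comment.strip() + "\n")


def commit_command(pending: Path = PENDING) -> list[str]:
    """Build the git command that commits with the comment declared earlier."""
    if not pending.exists():
        raise RuntimeError("no preemptive comment was declared")
    return ["git", "commit", "-F", str(pending)]


def fulfil(pending: Path = PENDING) -> None:
    """Commit the staged changes with the declared comment, then clear it."""
    subprocess.run(commit_command(pending), check=True)
    pending.unlink()
```

A session would start with `declare("PDFExporter sends a copy of the document by email")` and end, once the tests are green and the changes are staged, with `fulfil()`. Refusing a second `declare` while a goal is pending is a design choice of this sketch: it keeps one goal per commit.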

Here’s a list of benefits I noticed following this rule:

  • More focus while developing: I love pairing. I like that the navigator can take a note on a to-do list whenever the pair finds a reason to digress; the navigator just takes the note and brings the focus back to the feature being implemented. The navigator’s notes may eventually become the next technical tasks. I found that, just like a good BDD test, a preemptive commit comment helps the pair focus on the specific task at hand. A preemptive commit comment acts just like a preemptive test. It’s just much lighter to write.

    With preemptive commits a pair is less prone to digressions.

  • Commit review is much easier: I’m used to carefully reviewing each diff before committing. I discovered that, since I started writing commit comments before coding, code review has become much simpler.

    Commit review becomes a let’s-confirm-we-did-what-we-promised task rather than a let’s-discover-what-we-did activity.

  • Less cognitive load: I am used to writing a failing test before leaving the office. It helps me quickly switch back to the work context the day after. As a matter of fact, one of TDD’s nice side effects is the lower cognitive load on the team (this is one of Alberto Brandolini’s ideas, and it’s dramatically true): after an interruption, a programmer gets back into the workflow more easily if a failing test tells them what the next step is. I found that a preemptive commit comment does the same, with even greater granularity.

    Every time I come back from lunch, I just read my preemptive commit comment, and my brain quickly switches to what I was doing 1 hour before. When I don’t remember in enough detail, I launch the test suite, looking for red tests.

  • You learn to comment much more precisely: since you have to write the comments before coding, you want to be just specific enough; a commit comment becomes a declaration of intent, just like a BDD method name.

    Think about it: in Git and Mercurial a check-in is called a “commit”: you are, in fact, making a commitment, an engagement. You are making a declaration of intent. That’s why it is so natural to anticipate what you’re going to do.

    Once you have written your declaration of intent, you just want to develop code that adheres to what the comment claims; hence you don’t want to be too specific or too generic. Comments naturally tend to be much more precise.

    No more “Just a fix”, “Improvements” or “I made this, this, this and also this” comments.

  • Each preemptive comment triggers a micro design session: I know you don’t believe it. But after a while, before writing a commit comment you will want to know the exact name a class will have, or the exact word to describe a specific behavior. You will want to know which module is responsible for sending the PDF file by email in your gorgeous ERP plugin, since you have to decide whether it’s better to write “PDFExporter sends a copy of the document by email” or “EmailNotifier detects files exported by PDFExporter and sends them by email”. Before writing a commit comment, you must be able to describe the feature. And if you try to comply with the “write a preemptive commit comment” practice, you will just stop for a while and do a nice, light, old-fashioned upfront design session.
  • A preemptive comment sets a micro goal: the mere act of writing a commit comment helps to bring the goal into focus. Since the goal the pair is committing to has been defined from the beginning, it’s simpler for the pair to realize when the goal has been reached and the job is finished.

    Without preemptive comments, I often went on coding, always asking myself: “Should I commit now? Have I reached a stable state which I could consider a good commit?”

    I found that I can define micro-goals through preemptive comments, and a macro-goal through the feature branch name (which is also preemptive by design).

  • A preemptive comment creates a little timebox: I found that, with practice, my work (or, at least, the work related to a feature branch) gets divided into several micro-timeboxes. This is because each one starts from a stable state and has a clear goal to reach. I am encouraged and pushed to reach the goal.

    Can you believe it? Just because of a stupid comment. The human brain is weird…

  • Writing comments preemptively puts the agreement between the pair members to the test: driver and navigator must agree on the comment, that is, on the goal to reach. I found that very often the mere fact that we must write the goal before coding requires us to discuss and share our points of view. And it’s awesome. The moment you start coding, you are much more in accord with your pair on what you want to achieve.
  • The commit history gains a very balanced granularity: each commit has a precise goal. A feature branch becomes a collection of evolutionary commits, each of which usually has a 1:1 binding with a test; each test has its commit. It’s also very easy to find which commit introduced a bug, since each commit is related to a single new goal/feature. You could say that each commit honors the Single Responsibility Principle.

The experiment was originally conducted by Arialdo Martini, Mattia Piccinetti, Gian Marco Gherardi, Guglielmo Brasile, Francesco Pichierri and Giuseppe Mariano.

We are now using this technique on a regular basis.

(Should you decide to give this practice a try, please write me a note with your outcomes: I would love to share your opinions and ideas.)

Cheers

34 thoughts on “Preemptive commit comments”

  1. Really smart! Just wondering how to handle those scenarios where you have to change more than one type atomically to keep the code buildable. Indeed, I can’t think of a commit that intentionally breaks the build… In such cases preemptive commit messages are still valuable, but they don’t map to a single feature/behaviour/test.

  2. I’m incredibly appreciative to Arialdo for trying this out, and excited about his results!

    The unexpected killer aspect to this one for me was how it triggered the just-in-time design sessions. I’ve always valued the emergent architecture/design aspect of Agile quite a bit, and as a result my teams are always in the habit of sketching before coding, but I’ve not seen that behavior very consistently as a coach. For that to naturally emerge from a different practice is incredibly exciting!

  3. Wow. I can’t believe anybody thinks commit messages like that are a good idea. This is completely the wrong way to track this information. Your issue tracking system should contain the information about what the change does (in plain language) and be linked from the commit message, while the commit message documents what the programmer did. The commit history should be a tool for the programmers, and the issue tracking system should have more general visibility to stakeholders. Take just your first commit message as an example: “Converted from SaveFileDialog to SmartFileSelector” vs “When the user saves the file, a Preview is shown, thanks to SmartFileSelector”. When you have a bug in SmartFileSelector and you review your git history, the first message shows I can ignore that commit, whereas the second means I have to read the diffs to find out whether you changed the code of SmartFileSelector. Also, by committing at the granularity of “the change does X” you force a large commit, as the entire change X must be completed in one commit, whereas a pointer to an issue can be carried through many smaller commits, which are easier to review/revert/cherry-pick/merge. All of your benefits can be gained by putting that effort into writing your issues, and you can pair to write your tickets if that is beneficial to you. And by having your commit message summarise what you did, that information is captured; under your system that information is lost.

    • You may be right. Actually, we are exploring and experimenting with this approach, and I’m sure it can be improved. And yes: it may be completely wrong.

      About the granularity: inside a feature branch, I try to act like this:

      * I write an end-to-end test describing the whole feature to be implemented. This test will stay red, and it will eventually turn green at the end of the feature branch.
      * I commit: this is the goal to be reached. When the e2e test is green, some business value is added to the project. The commit comment says the project is lacking a feature.
      * I start with TDD, with a long series of comment+test/code/refactor. Each commit comment describes a behavior. At this level, the comment is very, very developer-centric. It’s still about a behavior, but usually it’s about a class behavior, not about what the end user sees. Hence, I feel the git history can still be used as an effective tool for developers.

      The post shows an example with end-user-centric commits; I should have included a whole feature branch too, since you’re right: the example given has comments describing end-user features. Here are 2 (real) examples of feature branches



      As you can see, these are much more low-level. They talk about classes’ behavior.

      I would also be very worried if this technique led to what you describe as “X must be completed in one commit”.
      No, actually it may even increase the number of commits.

      About issue tracking, what you say is very interesting. I also feel the need to link the git history with issue tracking. My feeling is that the git history should be self-contained. The issue tracker is a communication tool; the git history is the project. In other words, ideally, I wish I could drop the issue tracker completely without losing information.

      Thanks for your (negative) feedback: I feel it will be useful for reviewing what we did.

      • Interesting that you feel you need to be able to drop the issue tracker. I’m curious as to why. I feel much happier with your explanation that it increases your number of commits. I’d be interested in your thoughts on changes that cross multiple repositories. I hope to follow the changes to your approach (or why you choose not to make any).

      • Well, I didn’t mean that I want to drop my issue tracker, or that I want to replace it with git.

        I just meant that I don’t want to force a developer to access the bug tracker in order to understand the project history.

        In other words, I try to avoid commit comments like

        Fixed the bug #652 (see BugZilla for more information)

        The history should be self explanatory.

        About the multiple repositories, it’s an interesting and not trivial question. Let me chew the cud…

    • IMHO, in git, issues map to feature branches, not to commits. Thus, if you name the branch after the issue, the issue number in each commit is almost useless.
      Moreover, if you need to know which files a commit changed you should use “git show --name-only COMMITHASH”, and if you need to know which commits have changed a file you should use “git log --follow Path/To/File”. Replicating such info in the commit message is actually useless in most scenarios.

      • Definitely, you are right. This is what I tried to say with “I can define micro-goals through pre-emptive comments, and a macro-goal through the feature branch name (which is also pre-emptive by design)”.

        Anyway, this requires the use of feature branching (which I love).

        Should you wish to avoid feature branching (for example Martin Fowler thinks Feature Branching is evil, see Why Does Martin Fowler Not Understand Feature Branches and Martin Fowler Has A Merge Paranoia), you could map issues to a commit comment, rather than a branch name.

        Personally, I disagree with Fowler: I don’t see any reason to avoid feature branching, and I completely agree with your comment above.

  4. Just a note: I’ve just fixed some documentation in the code and… there’s no better commit message than “fix doc”. This is quite interesting, since comments are first-class citizens of the contracts that the code expresses (e.g. .NET doesn’t support checked exceptions, thus good documentation completes the contract).

    However, if I fix the doc by adding a new tag I should use a commit message like “MyDumbMethod throws MyException”. I still fixed only the doc, but in this case “fix doc” is not enough, since clients should check whether they already catch the exception or not.

  6. First of all, I like the basic idea: as with test-first, I write a description of what I’m going to do in my commit comment before I actually do the work. Here are my 2 cents:

    I see a commit as a change set, as opposed to a “snapshot” of the behaviour, which I think is better reflected in the tests/specs in your codebase. And there are different kinds of changes to the system:
    - adding a feature: these are the cases you described above, where you added a new behaviour to the system
    - removing an obsolete feature: here you can only describe what behaviour the system will no longer support, and why (there should be a user story regarding the removal!)
    - refactoring (no changes in behaviour): in this case as no behaviour is changed the comment would be empty or only “refactoring”. But I think that is not enough: I would add also the rationale / the why behind this design work e.g. “refactored to remove duplication”, “refactoring: split up class A into class A and B to enforce Single Responsibility Principle”.

    • Absolutely inspiring. Thank you.

      I like your analysis: adding, removing and refactoring seem to complete the set of types of changes that can be applied to a system. Maybe we could also include “trying a spike” or “prototyping” (that is: commits containing code that, by design, will be discarded), but I’m not sure, it’s just an idea.

      I absolutely agree on your note about removing an obsolete feature.

      Talking about what to write in refactoring commit comments with Dan Bergh Johnsson, he wrote this inspiring tweet:

      I completely agree with him.
      These days, while commenting on a feature branch, I think of myself as talking to a developer; when I merge the feature branch to dev, that is, when I’m adding a feature which is visible to the end user, I try to commit using a language meaningful to the end user.

      Hence, I’m not sure I agree with you. I’m not that sure that one should just write “refactoring” as you are suggesting.

      Also, about your examples “refactored to remove duplication” and “refactoring: split up class A into class A and B to enforce Single Responsibility Principle”, I think that the use of the past participle hides the fact that you are talking about an action you did (“[I] refactored to remove duplication”), not a state of the software. In other words, I think this rule of thumb could be valid:

      If a comment explicitly or implicitly begins with “I”, it’s wrong. You should describe what the program does or how the code is, not what you did.

      Also, those comments somehow refer to the past. Maybe I’m too strict on this, but I think that a commit is always a collection of diffs from another commit, hence it may be really useless to point out that “this commit introduces a difference from the past”. I mean, it’s rather obvious information.

      Again, about time references: I aim to find an absolute description for the state of the software or the code. The future reader of the commit is interested in what she will find in that commit. Anything existing before that commit is past. Did you fix a bug? Well, that bug is a thing of the past. In a few commits that bug will be forgotten. Or, at least, no one will have any interest in knowing that before your fixing commit there was a bug. What’s important is the current behavior.

      In the future, that commit could be cherry-picked by someone, and seen (and, somehow, correctly interpreted) as an “add-a-feature” commit, not as a “fix-a-bug” commit.

      The same, I think, applies to refactoring.

      For example, rather than your

      refactoring: split up class A into class A and B to enforce Single Responsibility Principle

      (really? How were A and B classes before? Before when?)

      I would write

      class A has responsibility X, class B has responsibility C

      • I’ve got your point, but I still see the commit log more as a history of changes than as your history of snapshots of what the software does. A change has a “before” (the past, which you don’t want to read) and an “after” (the state / behaviour you want to read). For added features the past is not worth mentioning (the feature was not implemented). IMO for those commits your model works perfectly. But for the other kinds of changes (bugfixing, refactoring and removing a feature) I think the “before” is equally or even more interesting for the reader of the log, because as a reader of the log I also want to know WHY the change was necessary. In these cases the unintended “before” state is actually the reason for the change:
        - removing: not the functionality “before”
        - bugfixing: not the bug that existed “before”
        - refactoring: not the design flaw that existed “before”
        Combining the what and the why could sound like
        - bugfixing: “affected feature XY now works correctly without bug 1234”
        - refactoring: “a centralized constant MAX instead of dispersed literals across the class”
        Aren’t you missing the WHY? And if so, how would you write the bugfixing example then?

      • The “why” argument is really a good point.

        It reminds me of the typical “In order to do X, As a Y, I want the software to do Z” sentence, where the “in order to do X” captures the reason why someone wishes the feature.

        In a random chat, Marco Pegolo suggested that I use exactly that sentence, at least in commits that will be merged to the release branch, that is, in commits that will eventually be read by business users. I think I agree with him, but I also think that your examples achieve the same effect.

        Especially in the case of bug fixing I think a reference to the previous (bugged) state could be useful. I can agree with you.
        Let me ruminate on this for a while :)

      • I think there’s a lot of value in following the pattern for the refactoring/removal cases, because I pretty strongly believe it’s the current reality (as of the point of the commit) that’s valuable… including the “why” if it’s not obvious.

        Examples include “New subscriptions can no longer access the ‘program’ menu” (as an example of a removed feature, and would be attached to a story like “In order to reduce our overall interface complexity, as portfolio owner I no longer want users to utilize the legacy ‘program’ capability”)

        For refactoring activity, I might see:
        “CommentRepository has ownership of serialization of comments (instead of PostRepository)”

        In the former, I needed to elaborate on the “why”, while in the latter it should be generally obvious (especially if the next commit is “CommentRepository supports JSON output in addition to XML” or some such)

      • @Arialdo
        Stating the “why” (BDD’s “In order to do X”) delivers two benefits depending on the point in time:
        - before performing the changes of a commit, it reminds the programmer to think about the value of the following efforts in advance
        - reading the comment after the commit tells the history of changes including the motivations for them back in time

        @Eric
        +1 for “why on demand” depending on the context
