One Pull Request. One Concern

Git has a command called git-request-pull. Its purpose is to prepare a summary of a set of commits so that the owner of one copy of a project can ask the owner of another copy to land a small set of changes. The idea is to share a piece of functionality with the owner of a more trusted version of the project without requiring additional write access. This opens a wide range of opportunities for collaboration: anyone can clone a trusted project, make changes, and then ask the original author to land those changes.
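To make the command less abstract, here is a minimal Python sketch of how it could be invoked. It assumes the documented form `git request-pull <start> <URL> [<end>]`, which prints the request to standard output. The helper names (`build_request_pull_args`, `request_pull_summary`) and the tag, URL and branch values are hypothetical and only there for illustration.

```python
import subprocess
from typing import List, Optional


def build_request_pull_args(start: str, url: str, end: Optional[str] = None) -> List[str]:
    """Build the argument list for `git request-pull <start> <URL> [<end>]`.

    start: commit (or tag) the upstream owner already has.
    url:   public repository the changes can be pulled from.
    end:   optional commit to pull up to; Git defaults to HEAD when omitted.
    """
    args = ["git", "request-pull", start, url]
    if end is not None:
        args.append(end)
    return args


def request_pull_summary(start: str, url: str, end: Optional[str] = None) -> str:
    """Run the command inside the current repository and return its output.

    Raises subprocess.CalledProcessError if Git exits with an error.
    """
    result = subprocess.run(
        build_request_pull_args(start, url, end),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical values: the tag, URL and branch below are placeholders.
example_args = build_request_pull_args(
    "v1.0", "https://example.com/project.git", "my-feature"
)
print(" ".join(example_args))
```

The text returned by `request_pull_summary` is what would traditionally be pasted into an e-mail to the owner of the other copy of the project.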

When GitHub came out, it leveraged the philosophy of Git and "forked" the idea behind git-request-pull. The purpose was to create an abstraction on top of it so that those who are not familiar with Git could understand and apply the philosophy of collective collaboration without having to set up mailing lists.

They called it Pull Request.

A Pull Request serves a similar underlying purpose to git-request-pull, with the difference that it deeply integrates something that was one of the main reasons for the success of GitHub:


A Pull Request is a successful attempt to deeply integrate collaboration into the developer's workflow

A Pull Request is not just a way to "prepare" a set of commits so that it can be sent to the owner of another copy of the project through external means (such as e-mail). A Pull Request is the set of commits together with the technical history of everything related to that Pull Request, all in a single place.

There are strong objections to some fundamental decisions and trade-offs of the "Pull Request" approach. What can't be denied is that GitHub successfully accomplished its goal, which was to make text collaboration (not just code) more accessible to everyone by providing an interface so simple that even those who can't program can still collaborate.

Last time, I wrote an article called "One Commit. One Change.". It explained that a commit represents a single atomic change, an indivisible change. It can succeed entirely or it can fail entirely, but it cannot partly succeed.

In a Git commit, we measure "success" as the ability to deliver value to the application. "Value" is not just business value: it can be the payment of technical debt, legibility fixes, or internal interface changes. A commit should not, however, contain refactoring or whitespace changes that have no clear purpose, because the commit could still succeed with that part omitted, which means it is not atomic.

A GitHub Pull Request (from now on just "Pull Request") is more than just a set of commits. While a commit can only contain a single change, a Pull Request can contain one or more changes that together form a high-level concern.

A Pull Request represents a way to deliver value to the application in the form of a set of changes that together form a high-level concern

It is also important for a Pull Request to be atomic. With a Pull Request, however, we measure "success" as the ability to deliver the smallest possible piece of functionality, which can be composed of one or many atomic commits.

One of the bad practices in a Pull Request is changing things that are unrelated to the functionality being addressed, such as whitespace changes, typo fixes, or variable renames. If those changes are not related to the concern of the Pull Request, they should probably go in a different one.

One might argue that not mixing different concerns and small fixes in the same Pull Request violates the Boy Scout Rule because it doesn't allow frequent cleanup. However, cleanup doesn't need to be done in the same Pull Request. The important thing is not to leave the codebase in a bad state after finishing the functionality. If you must refactor, do it in a separate Pull Request, preferably before the functionality itself is developed: if the feature Pull Request later needs to be reverted, the likelihood of conflicts will be lower.

It is important to note that, on the Linux kernel mailing list, the git-request-pull command is used by maintainers of some copies of the kernel to sync their trees into larger ones and eventually into the mainline, and such a request might not address a single concern. For a single concern they use a "patch set", or "patch series":

On the Linux kernel mailing list, patch sets are used for grouping atomic changes and should have what the author calls one concern. Pull requests are usually used by subtree maintainers to sync their trees into larger trees and eventually into the mainline. In this case, pull requests can contain patches from many community members that the maintainer has applied to his/her tree in which case pull requests do not address a single concern […]

— User mdmd136 on Reddit

But what does "patch set" mean in the context of the Linux kernel development mailing list?

[…] The cover letter does not contain a patch, but describes the theme of the entire patch series and usually has a diff stat of the set […]

— Excerpt from mdmd136's comment

A Pull Request shares the same fundamental purpose as git-request-pull, with the difference that it is useful for expressing the intent of a single atomic concern, something that most of the time shouldn't be done using only a single commit.

Most of the atomicity advantages of a commit are also advantages of a Pull Request, but at a higher level.

Whichever workflow you choose, using Pull Requests that each contain a single atomic concern can help scale the codebase through collaboration.

