Linux development is distributed - profoundly so September 2, 2020 on Drew DeVault's blog

The standard introduction to git starts with an explanation of what it means to use a “distributed” version control system. It’s pointed out that every developer has a complete local copy of the repository and can work independently and offline, often contrasting this design with systems like SVN and CVS. The explanation usually stops here. If you want to learn more, consider git’s roots: it is the version control system purpose-built for Linux, the largest and most active open source project in the world. To learn more about the true nature of distributed development, we should observe Linux.

Pull up your local copy of the Linux source code (you have one of those, right?1) and open the MAINTAINERS file. Scroll down to line 150 or so and let’s start reading some of these entries.

Each of these represents a different individual or group which has some interest in the Linux kernel, often a particular driver. Most of them have an “F” entry, which indicates which files they’re responsible for in the source code. Most have an “L” entry, which has a mailing list you can post questions, bug reports, and patches to, as well as an individual maintainer (“M”) or maintainers who are known to have expertise and autonomy over this part of the kernel. Many of them — but, hmm, not all — also have a tree (“T”), which is a dedicated git repo with their copy of Linux, for staging changes to the kernel. This is common with larger drivers or with “meta” organizations, which oversee development of entire subsystems.

However, this presents a simplified view. Look carefully at the “DRM” drivers (Direct Rendering Manager); a group of drivers and maintainers who are collectively responsible for graphics on Linux. There are many drivers and many maintainers, but a careful eye will notice that there are many similarities as well. A lot of them use the same mailing list, dri-devel@lists.freedesktop.org, and many of them use the same git repository: git://anongit.freedesktop.org/drm/drm-misc. It’s not mentioned in this file, but many of them also shared the FreeDesktop bugzilla until recently, then moved to the FreeDesktop GitLab; and many of them share the #dri-devel IRC channel on Freenode. And again I’m simplifying — there are also many related IRC channels and git repos, and some larger drivers like AMDGPU have dedicated mailing lists and trees.

There’s more complexity to this system still. For example, not all of these subsystems are using git. The Intel TXT subsystem uses Mercurial. The Device Mapper team (one of the largest and most important Linux subsystems) uses Quilt. And like Linux DRM is a meta-project for many DRM-related subsystems & drivers, there are higher-level meta projects still, such as driver-core, which manages code and subsystems common to all I/O drivers. There are also cross-cutting concerns, such as the interaction between linux-usb and various network driver teams.

Patches to any particular driver could first end up on a domain-specific mailing list, with a particular maintainer being responsible for reviewing and integrating the patch, with their own policies and workflows and tooling. Then it might flow upwards towards another subsystem with its own similar features, and then up again towards meta-meta trees like linux-staging, and eventually to Linus' tree2. Along the way it might receive feedback from other projects if it has cross-cutting concerns, tracing out an ever growing and shrinking bubble of inclusion among the trees, ultimately ending up in every tree. And that’s still a simplification — for example, an important bug fix may sidestep all of this entirely and get applied on top of a downstream distribution kernel, ending up on end-user machines before it’s made much progress upstream at all.

This complex graph of Linux development has code flowing smoothly between hundreds of repositories, emails exchanging between hundreds of mailing lists, passing through the hands of dozens of maintainers, several bug trackers, various CI systems, all day, every day, ten-thousand fold. This is truly illustrative of distributed software development, well above and beyond the typical explanation given to a new git user. The profound potential of the distributed git system can be plainly seen in the project for which it was principally designed. It’s also plain to see how difficult it would be to adapt this system to something like GitHub pull requests, despite how easy many who are perplexed by the email-driven workflow wish it to be3. As a matter of fact, several Linux teams are already using GitHub and GitLab and even pull or merge requests on their respective platforms. However, scaling this system up to the entire kernel would be a great challenge indeed.

By the way — that MAINTAINERS file? Scroll to the bottom. My copy is 19,000 lines long.


  1. Okay, just in case: git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ↩︎

  2. That’s not the only destination; for example, some patches will end up in the LTS kernels as well. ↩︎

  3. If you are among the perplexed, my interactive git send-email tutorial takes about 10 minutes and is often recommended to new developers by Greg KH himself. ↩︎

Have a comment on one of my posts? Start a discussion in my public inbox by sending an email to ~sircmpwn/public-inbox@lists.sr.ht [mailing list etiquette]

Articles from blogs I read Generated by openring

Status update, August 2020

Hi! Regardless of the intense heat I’ve been exposed to this last month, I’ve still been able to get some stuff done (although having to move out to another room which isn’t right under the roof). I’ve worked a lot on IRC-related projects. I’ve added a znc-i…

via emersion 2020-08-19 00:00:00 +0200 +0200

What's cooking on Sourcehut? August 2020

Another month passes and we find ourselves writing (or reading) this status update on a quiet, rainy Sunday morning. Today our userbase numbers 16,683 members strong, up 580 from last month. Please extend a kind welcome to our new colleagues! Thanks for read…

via Blogs on Sourcehut 2020-08-16 00:00:00 +0000 +0000

Go 1.15 is released

Today the Go team is very happy to announce the release of Go 1.15. You can get it from the download page. Some of the highlights include: Substantial improvements to the Go linker Improved allocation for small objects at high core coun…

via The Go Programming Language Blog 2020-08-11 11:00:00 +0000 +0000

North Pacific Logbook

The passage from Japan (Shimoda) to Canada (Victoria) took 51 days, and it was the hardest thing we've ever done. We decided to keep a logbook, to better remember it and so it can help others who wish to make this trip.Continue Reading

via Hundred Rabbits 2020-07-31 00:00:00 +0000 GMT