
In this post, I’m going to answer a seemingly simple question: Why do neither Debian or Fedora have a well defined and reliable process to rebuild everything from source?
I’m often surprised by how many people I encounter in the FOSS world, even experienced developers, that are either intimidated by the idea of building “everything” from source, or think it’s crazy, or just not worth it. Let’s just assume for the purposes of this discussion that rebuilding from source is valuable. I mean, after all it’s Free Software, not Free Binaries Wrapped With Some Metadata.
First, let me define “everything”: the goal here is to construct a basic Linux-based system, bootable in qemu, and you can log in as root. A perfect example of our goal here is to build the source code that comprises JS/Linux. That means the kernel, bash, glibc, gcc, etc. If you read the tech notes, you’ll see it’s built using Buildroot.
The first thing to observe here is not only do multiple projects to accomplish this goal exist, they do so with a high degree of reliably, and solve real-world needs. For example, the Yocto project’s “core-image-minimal” target gets you basically this same thing, and you just run bitbake core-image-minimal, and everything else is done for you. Likewise, a quick read of the Buildroot manual will show you just how little needs configuration or manual intervention.
The second thing to note about these systems like Buildroot, Yocto, Baserock and others – they are not (by default), self hosting. The host and target systems need not be the same. For example, you can use Yocto to build “core-image-minimal” from a Red Hat Enterprise Linux 6 system, an Ubuntu 12.04 system, and a variety of others. In fact, you can even do full cross builds from x86_64 to ARM. Now, interestingly Yocto can generate self-hosting systems, but it’s not the default.
We’re getting closer to answering our original question. Let’s further observe that both Debian and Fedora are defined to be self hosting systems. Why is self hosting a problem? It’s because of circular build dependencies. The classic example of this is gcc, which is written in the C programming language. In order to build it, you need a C compiler already. The Yocto/Buildroot type build systems get out of this problem in a simple way – they assume you already have a functioning gcc on the host system.
But in Debian and Fedora, in order to build the gcc package, you need gcc already built as a package – the build system won’t accept just having a “gcc” binary in the $PATH. That’s how the build systems work because again, that’s how the projects are defined.
If you haven’t done this recently, grab a mirror you can hold in your hand, and go into your bathroom, and point the hand mirror at the wall mirror. You’ll get an infinite recursion. It’s really quite beautiful and fun to do, but since I’m sure many of you won’t, there’s a good picture here.
This infinite recursion resulting from self-hosting is the reason there isn’t one reliable command to rebuild all of Debian or Fedora from source.
One question you might have – would it make sense to have a well-defined process for bootstrapping a self-hosting system like Debian? Some of the developers think so, and the DebianBootstrap wiki page describes the thoughts so far. Personally though, I think it’s both too complex and too vague. A much simpler, and ultimately more reliable goal would be to ensure that version N of the system can be built by N-1. So Fedora 17 can be built on a Fedora 16 system, Debian Wheezy can be built from Squeeze, Red Hat Enterprise Linux 6 can be built from Red Hat Entrerprise Linux 5, etc. Eventually this is a goal I’d like to achieve for Red Hat Enterprise Linux at least. There’d be some cost to packages with circular build dependencies, but having a well defined, reliable process for building from source: priceless.
