4.6 Minix, Linux and Hurd

In January 1991 a graduate student at the University of Helsinki bought himself a PC with an 80386 processor. It came with MS-DOS installed, as did most PCs at the time. This student, Linus Torvalds, preferred the Unix type of operating system that he had learned about at the university. His apartment was a good distance from the university and student terminals were not always available, so he wanted to run a Unix-like OS on his PC. Searching for one, he found Minix.

Minix is a small Unix clone made by Andrew S. Tanenbaum for teaching purposes. Tanenbaum, a professor at the Vrije Universiteit in Amsterdam, made Minix because AT&T had decided to forbid the teaching of Unix internals. The source code for Minix was published as an appendix to the first edition of Operating Systems: Design and Implementation in 1987.

At this time the development of the Hurd kernel had started, but no one knew when a runnable version of Hurd would be available. Torvalds installed Minix on his PC, and in April 1991 he started to experiment with building an operating system of his own. At the end of August he posted to the comp.os.minix newsgroup, stating: “I’m doing a (free) operating system (just a hobby, won’t be big and professional like gnu) for 386(486) AT clones.”. At the beginning of October he posted to the newsgroup again, this time inviting people to experiment with and improve Linux. In this post he also explained his reason for making a new kernel:

I can (well, almost) hear you asking yourselves “why?”. Hurd will be out in a year (or two, or next month, who knows), and I’ve already got minix. This is a program for hackers by a hacker. I’ve enjoyed doing it, and somebody might enjoy looking at it and even modifying it for their own needs. It is still small enough to understand, use and modify, and I’m looking forward to any comments you might have.

It is worth noting that, as mentioned in section 4.3, work to make a free Unix clone and port it to the 386 processor was already being done at Berkeley. Jolitz was working on 386BSD at about the same time as Torvalds was working on Linux. Torvalds would later say that had he known about the availability of 386BSD, he would probably have worked on it rather than starting his own kernel.

For various reasons Linux attracted a larger following than 386BSD and its derivatives FreeBSD, NetBSD and later OpenBSD. In 1992 AT&T’s Unix System Laboratories (USL) filed suit, first against a company named Berkeley Software Design Incorporated (BSDI), which sold a proprietary offspring of NET/2 called BSD/386. Later that year USL refiled the suit against both BSDI and UC Berkeley. This created uncertainty around the code on which all the BSD offshoots were based. In 1993 FreeBSD and NetBSD were started, based on 386BSD. In 1995 OpenBSD was forked off from NetBSD.

In the early nineties monolithic kernels, where different components of the kernel such as memory management and file systems are all compiled into a single binary, were out of fashion among operating system theorists. One of these theorists was Tanenbaum, who built Minix with a microkernel. In a microkernel, subcomponents such as memory management are isolated from a small kernel core. Because discussions about Linux were taking place in a newsgroup devoted to Minix, Tanenbaum posted a message challenging the choice of a monolithic kernel in Linux. This spun off a lengthy debate that is still available on the Web today. As Torvalds explained in the previously mentioned e-mail, he designed Linux to be a program for hackers by a hacker. Torvalds figured that a monolithic kernel would be easier for hackers to tweak. The choice of a monolithic kernel was also the quickest route to a working kernel. Monolithic kernels are simpler to build in the first versions, but have a tendency to grow into a big, hard-to-debug mess.

Unlike GNU’s Hurd kernel, which was being designed around a microkernel, Linux was available. Simple as Linux still was, it held the promise that it could be made into something great. Torvalds welcomed contributions and was good at responding to interested people, making contributors feel their efforts were appreciated and not just thrown away.

By the end of 1991 Torvalds no longer portrayed Linux as a hobby for himself. Many who wanted the Linux source code did not have access to the Internet, and therefore could not get it from the FTP site where it was distributed. For these people to get the Linux sources, someone had to copy them to disk and send them. The original homegrown license for Linux did not permit making money from Linux, and this included distribution fees. Sending floppy disks costs money, so many developers asked Torvalds to permit a small copying fee. From version 0.12, released in January 1992, Linux has been licensed under the GPL. This license allows a distribution fee, and guarantees that Linux will stay free. In an interview Torvalds said of this decision: “Making Linux GPL’d was definitely the best thing I ever did”.

The pace of Linux development grew rapidly from early 1992 onwards. In early 1992 Orest Zborowski took it upon himself to port the X Window System, the windowing system most common on Unix systems, to Linux. X used a lot of system libraries not implemented in Linux, so the work was more about making Linux fit X than the other way around. The work of porting X expanded the functionality of the Linux kernel, and it exposed many deep bugs.

The next big step was to make a TCP/IP stack for Linux. Because the BSD code was still being disputed in court, the Linux community started from scratch. First Ross Biro made some crude code, which was taken over by Fred van Kempen. van Kempen had a vision to “throw away the old and write it all from the bottom up for a perfect grand vision”.

This effort was taking a long time, and van Kempen kept his work close to his chest. People in the Linux community were getting impatient, and van Kempen failed to produce interim code that worked in 80% of the cases. Torvalds sanctioned a parallel coding effort by Alan Cox. Cox’s vision was to “make it work first, then make it better”. Cox took van Kempen’s early code and made it into something useful.

Torvalds chose to include Cox’s code in the official version. People started to send networking code to Cox, making Cox the semi-official “networking guy”. Torvalds legitimised this by sending networking patches to Cox before looking at them himself. This was the first sign of the “lieutenant model” used for decision making in Linux: a hierarchy where code is sent to lieutenants, but where Torvalds has the final say on what is included.

The first stable version of Linux, version 1.0, was released in March 1994. In 1992 Tanenbaum had criticised Linux for being too tightly tied to the 386 processor, making it difficult to port Linux to other hardware platforms. In 1994 a Unix programmer at DEC, Jon “maddog” Hall, met Torvalds at a DEC user meeting. Hall was impressed with Linux, and later convinced Torvalds to port Linux, with help from DEC, to DEC’s 64-bit Alpha processor. The Linux kernel was re-engineered to make it more portable across platforms. The most recent Linux kernel release (2.6) supports at least 17 general-purpose architectures, and Linux has in addition been ported to a number of embedded systems. It is interesting to contrast Tanenbaum’s criticism that Linux was too tied to the x86 architecture with the number of architectures Linux supports today.

Many people think that Linux is an entire operating system, but it is not. Linux is a kernel. A kernel is a relatively small part of what people commonly think of as an operating system, which also includes things like the command interpreter, the visual display and tools. The kernel is, however, the most important piece of an operating system: it hides all the tricky hardware details of a computer from the programs running on it. Because many of the core utilities commonly used in Linux systems today were made by GNU, the FSF claims that Linux should be called GNU/Linux.

Getting Linux running on a machine, with all the programs you wanted, was difficult. This gave rise to what are today called Linux distributions. Distributions conveniently package a lot of useful software together, and make it relatively easy to get a Linux-based system running. Two identifiable strains of Linux distributions were visible from the start.

The first strain is based on a community model, as Linux itself is. Examples are Slackware and Debian, both started in 1993. Debian is the most used distribution, according to the Linux Counter. The name Debian comes from the names of its founder Ian Murdock and his wife Debra. To stir up interest in his plans to make a Linux distribution, Murdock posted his intentions on Usenet’s comp.os.linux and on different Internet sites. This caught Stallman’s attention, leading to the FSF supporting Debian for one year. The Debian project is known, apart from being a good distribution, for the Debian Social Contract. This contract is the basis of The Open Source Definition, which we will look into later.

The other strain of Linux distributions, the commercial distributions, was pioneered by a company calling itself Yggdrasil. Yggdrasil began to sell its distribution on CD-ROM bundled with non-free software in binary form. In the eyes of the FSF this was a sin; Yggdrasil wanted to include the non-free software because they found it useful. Other distributions made with the commercial market in mind are the European SuSE, which started in 1994, the American Red Hat, which started in 1993, and the Japanese Turbolinux, which started in 1992. SuSE was bought in 2004 by Novell, a company that specialises in network operating systems. Red Hat now has two distributions: one based on a community model, named Fedora, and another for the business market, named Red Hat Enterprise Linux.

The pace of Linux development grew rapidly during the nineties, and is still strong today in 2006. The credits file in the Linux kernel sources, where important contributors are mentioned, has grown from 80 contributors in version 1.0 to 472 in version 2.6.15. The code base for the kernel has also grown manifold, from 176,250 lines of code in version 1.0 to 5,929,913 lines of code in version 2.6.

Up until version 2.2 Linux was a pure monolithic kernel. You had to compile all the features you wanted in the kernel, and all the device drivers you wanted to have, into one binary. If you added some hardware to your system, you had to compile a new kernel with the device driver for the new hardware included. In version 2.2 Loadable Kernel Modules (LKMs) were introduced. With LKMs, device drivers and extended functionality can be loaded into the kernel at run-time. This makes it easier to extend and test parts of the kernel, and is a sort of middle ground between a monolithic kernel and a microkernel.
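To give a flavour of what a loadable kernel module looks like, here is a minimal sketch in C. The file name hello.c and the printed messages are hypothetical; the sketch builds against the Linux kernel headers via the kernel’s own build system, not as an ordinary userspace program.

```c
/* hello.c -- a minimal sketch of a Loadable Kernel Module (LKM).
 * Hypothetical example: it is compiled against the kernel headers
 * with kbuild and loaded into a running kernel, not run directly. */
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Minimal example module");

/* Called when the module is loaded into the running kernel,
 * e.g. with `insmod hello.ko`. */
static int __init hello_init(void)
{
	printk(KERN_INFO "hello: loaded into the running kernel\n");
	return 0;
}

/* Called when the module is unloaded, e.g. with `rmmod hello`. */
static void __exit hello_exit(void)
{
	printk(KERN_INFO "hello: unloaded\n");
}

module_init(hello_init);
module_exit(hello_exit);
```

With a one-line kbuild makefile (`obj-m += hello.o`), building against the installed kernel sources produces hello.ko, which can be inserted into and removed from the kernel at run-time without recompiling or rebooting the kernel itself — exactly the flexibility the LKM mechanism added.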