Thursday, June 19, 2008

Nerd Food: On the UbuntuBox


The Regular User


As with many other geeks, I find myself in the unofficial position of "computer guy" for family and friends (F&F). This entails sorting out broken computers, providing advice on new purchases for a variety of devices and software packages, installing said devices and packages, doing security updates and giving security advice, providing mini-tutorials on applications, teaching basic programming - the list goes on and on. The funny thing is, much like all other nerds, I may moan about my duties but secretly I enjoy performing them. Sometimes I get to set people up with cheap Ubuntu boxen, which they may complain about a little in the beginning but eventually use in extremely productive ways; and even on Windows setups, I get to understand what drives people to Microsoft and what its weak and strong points are. It's a very instructive job.

Things got even more interesting when I bought my eee PC, a device that is selling like fire in the polar winter. The eee PC helped me understand a bit better how most people think. All along we, the Linux community, have focused on providing a user experience that is very similar to Windows: you may not have a Start Menu but you have the Ubuntu logo; toolbars and menus are very similar and so on. Even the newest eye-candy is similar to the Mac and Vista ways of doing things (although there's always the chicken-and-egg problem). The end result is interesting: developers and regular Linux users are now convinced that no one should have any difficulties at all moving from Windows to KDE or Gnome; on the other hand, as soon as you sit a user down in front of a Linux box, he or she immediately tells you that something is not right. Objectively, the average user will probably not be able to point out what's wrong, if anything at all, other than "this is not Windows, can we not have Windows please".

The fantastic thing about the eee PC was that, of all the people I showed it to, not a single one said: bah, this is not Windows. Most of them got on with the user interface immediately, and found it really intuitive. In all my years of advocating Linux I had never before seen a reaction like this. I did absolutely no advocating whatsoever, no mention of freedom or the superiority of free software. Just letting them play with it was enough. As an example, my girlfriend has been using Linux for over 5 years, and I get the periodic complaints of "why can't we just use Windows" whenever I have some difficulties installing a device, or I break the world on a dist-upgrade. But within minutes of playing with the eee, her reaction was: "I want one of these!!".

The reactions I've seen towards the eee PC are almost the opposite of the few Vista users I've spoken to. Sure, Vista looks nice, but have you tried installing a one-click wireless router? That's when F&F call me out, when it all goes wrong with the "one-click" cheap product they bought. But the thing is, I can't say much in Ubuntu's defense either. For example, I spent several days installing a Huawei E220 modem to provide 3G Internet access to my nephews, and let me just tell you, trivial would not have been a word one could apply to any part of the process. Vodafone's new clever GUI may be good for Vodafone users but I never got the damn thing to cooperate. True, the whole exercise wasn't taxing for a nerd - hey, it's fun to look at AT commands now and then - but there is no way, just noooo waaaay a regular user would have gone through the pain, even with the brilliant Ubuntu forums to hand.
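For the curious, the sort of low-level conversation involved looks roughly like the sketch below (Python with pyserial); the device node and APN are placeholders for illustration, not a recipe:

    # Rough sketch of the AT-command dance a 3G modem needs before PPP can
    # be started. The device node and APN below are placeholders.
    import serial  # pyserial

    PORT = "/dev/ttyUSB0"     # control port typically exposed by the modem
    APN = "internet.example"  # placeholder operator APN

    def send(ser, command):
        """Send one AT command and return the modem's raw reply."""
        ser.write((command + "\r").encode("ascii"))
        return ser.read(256).decode("ascii", errors="replace")

    ser = serial.Serial(PORT, 115200, timeout=2)
    try:
        print(send(ser, "AT"))                            # anyone there?
        print(send(ser, "AT+CPIN?"))                      # SIM / PIN status
        print(send(ser, 'AT+CGDCONT=1,"IP","%s"' % APN))  # define the data context
        # A PPP daemon would then dial ATD*99***1# on this port.
    finally:
        ser.close()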

Now, before we go any further, I can already hear the complaints: "so you've chatted to what, twenty people, and now you think you understand the market?". Well, that much is true: I cannot claim any statistical significance for my conclusions. These are my opinions; the entire article is based on empiricism and small samples. However, if my line of argument is sound and correctly interprets the success of the eee, then there must be some truth to my views, because I have tried to align them with what the eee actually delivers. The market has given its verdict on this gadget, loudly and unequivocally.

The eee PC is also a brilliant illustration of the huge divide between regular users and the developers who are tasked with providing software for them. At a time when the Gnome community is yet again rethinking the future of Gnome, not a single regular user would find this debate interesting. This should send all the alarm bells ringing, but unfortunately that doesn't seem to be the case. The truth is, regular users don't want flashy 3D desktops, although they can eventually cope with them; they don't need spinning cubes, although they may start using them once they understand them. What they really want is simplicity. They have a simple set of tasks to perform, and they want to do so cheaply and reliably, and they truly do not understand why everything has to be so complicated and why computers have to change so much, so often.

So what made the eee popular? In my opinion, there are two key points:
  • It's cheap. No one would even have a look at it if it cost 400 GBP;
  • It's easy.
These are the key selling points to a regular user. To illustrate the second point, when I said to my girlfriend I was thinking about installing Ubuntu Hardy on the eee, she replied in dismay: "Why would you do that??".

The Regular User Use Cases

The key thing to notice about the eee is that most users don't even know it's not running Windows. It's just an appliance, a bit like a PlayStation, and thus there is no need to enquire about its operating system. Like an appliance, it is also expected to be switched on and just work - the fast boot reinforces this idea. The interface provided is also designed for the tasks common to the vast majority of regular computer users, and allows them to find things fast. But, looking at the wider problem, what do our regular users do with their computers? I compiled a list of all the use cases I found in my user base:
  • Internet: email, browsing, playing on-line games and youtube;
  • Listen to music, sync with their music player;
  • Watch local video content;
  • Talk with their friends: IM, VOIP;
  • Play (basic) games: in all cases, real gaming is done on the PlayStation;
  • Work: word-processor by far, some spreadsheet use but "it's quite hard";
  • Burning and ripping;
  • Downloading: torrents, etc. Not very popular because "it's complicated";
  • Digital photo management: storage, some very basic manipulation (make it smaller for emailing);
  • Printing: mainly for school/University; pictures in very few cases.
In addition to these, some additional requirements crop up:
  • Windows users all have proprietary firewalls and virus scanners;
  • All machines are multi-user, and data must be kept private - especially with the youngsters;
  • Machines must withstand battering: switched off at any point, banged about, dropped, etc;
  • Internet connectivity is vital, ADSL, cable and 3G are used. Computers are useless without the Internet;
  • Wireless around the house is vital. External wireless is nice, but not frequently used because "it's too complicated";
  • Costs must be kept exceedingly low, as the IT budget is normally very small.
That's it. You'd be amazed at the percentage of the market one covers with only these use cases; not just doing them, but doing them well, like a PlayStation plays games.

And what are the biggest complaints about computers?
  • They're really hard. Installing hardware and software is a nightmare, and they'd be stuffed without the local nerd;
  • They break easily. One of my Vista users is still in disbelief that installing wireless drivers could cause the DVD drive to stop working;
  • They're expensive. Sure you can get a cheap'ish box but then everything else is expensive (software, peripherals, etc);
  • They change far too frequently. Most users had just about got to grips with XP's user interface, only to see it all change again;
  • They're insecure. They don't know how or why but that's what they've heard. That and the constant popups that look like viruses.
On one hand, the regular user is quite advanced, making multi-user and networking a central part of their computing experience. On the other hand, they are very naive: the vast majority of computing power goes under-utilised - the OS gobbling most of the resources for no good reason - and the majority of software expenses are easily avoidable by using freely available applications. Regular users haven't got anywhere near using Media Centres, "clever" media management software, or even connecting their PCs to their TVs. All these things they consider "advanced", and yet nerds and more savvy users have been doing them for years. One cannot help but feel that there is a massive market out there for the taking - a market that Vista cannot aim to grab because it is diametrically opposed to those users' needs - and yet no one else seems to find the path to its door.

UbuntuBox: The Hardware Platform

The rest of this article is an Ubuntero Gedankenexperiment: if I were a manufacturer, what sort of box would I like my F&F to have? What would make my life and their life easier? The short answer to that question is a PlayStation 2-like box with PC-like functionality. The long answer is, well, long.

I'm not going to bother with engineering reality here - I'm sure some requirements will be so conflicting they cannot possibly be implemented. However, I've got zero experience in hardware manufacturing, weights, cooling, large scale deployment and so on - so much so that I'm not even going to bother pretending; any assumptions I'd make would be wrong anyway. So, to make matters easy, I'll just ask for it all - impossible or not - and wait for the reality check to come in.

The first, very different thing about our box is that it's not a computer. Well, inside it is a regular PC of course, but it doesn't look like one. It is designed to look exactly like a DVD player, and to fit your living room. A bog-standard black-box with a basic LED display would do. Inside, it has:
  • Multiple cores: four would be ideal, but at least two. They don't have to be particularly fast (1.x GHz would do, but I guess 2 GHz would be easier to find);
  • 4 GB of RAM: can be the slowest around, but we need at least 4; the more the merrier, of course;
  • 250 to 500 GB hard drive: the more the merrier. Doesn't have to be fast, we just need the space;
  • Average video card: key things are RGB/HDMI and TV out; resolution decent enough to play most games (not the latest);
  • Loads of USB ports;
  • RW DVD drive;
  • Analog TV + DVB card (for FreeView in England);
  • Wired and Wireless Ethernet;
  • Sound card with 5.1 surround sound: doesn't have to be a super card, just an entry level one would do;
  • SD card, compact flash readers;
  • Ability to control the box with a remote control;
And now the key limiting factor:
  • The overall cost of the box must not exceed 200 GBP. This may require some tweaking, e.g. if raising it to 299 means we can put all features in, it may be worthwhile.
Notice that all the hardware will be standard on all boxes of the same generation. This is all commodity hardware - certainly nothing proprietary - but without the heterogeneity that is associated with it. Note that control is a key feature - the limiting of user and vendor freedom to swap things at will. We'll return to the topic later on, as I'm sure it will prove controversial.

Now, how does the box behave for the regular use case? Well, you buy it, plug it in, set all the cables up and start it up. You will see only two things on boot: the logo (say the Ubuntu logo) fading in and out, and the console password prompt. That's it. No BIOS, no flashing X-Server, nothing else. Within a few seconds you'll be prompted for the console password and given the option of not needing a password in the future (note: console is _not_ root). Let's leave the desktop at that for the moment, as we'll cover it properly in the next section.

What about Internet access, you ask? Well, you will need to buy one of the available modems:
  • 3G;
  • ADSL;
  • Cable.
Each of these modems is made available at market prices (i.e. as cheap as possible); however, they will have been officially and exhaustively tested and stamped with an "UbuntuBox compliant vX" or some such, where vX is the box's generation. To be compliant means that your hardware has been thoroughly tested and is known to work with the hardware and software in a given generation. When you plug in any of these devices after console login, a simple wizard will appear asking you to choose a provider. Each provider will also have been part of a certification program before inclusion.
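To make the idea concrete, here is a toy sketch of the compliance lookup (Python); the USB IDs and wizard names are invented for illustration only:

    # Toy sketch: the box ships with a whitelist of certified modems for its
    # generation, and only those trigger a provider wizard. IDs/names invented.
    CERTIFIED_MODEMS_V1 = {
        ("12d1", "1003"): "wizard-3g",    # a Huawei-style 3G dongle
        ("0000", "0001"): "wizard-adsl",  # placeholder ADSL modem
    }

    def on_device_plugged(vendor_id, product_id):
        wizard = CERTIFIED_MODEMS_V1.get((vendor_id, product_id))
        if wizard is None:
            print("Device is not UbuntuBox compliant v1; no support contract applies.")
        else:
            print("Certified device detected; launching %s to choose a provider." % wizard)

    on_device_plugged("12d1", "1003")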

The other networking device is an Ethernet switch. This is only required if your modem does not come with switching abilities (maybe in the 3G case). Network Manager already does a pretty good job of this, so all you'll need to do is set up the network on your console session (SSID, etc). You can use a USB keyboard for this or just endure typing from the remote control.

Note that the certification requirement is extended to all hardware used with the box. In other words, there is pretty draconian control of the hardware platform. Users are, of course, free to do as they wish with the device they bought, but if they go down the uncertified route, all support contracts are rendered void (more on this later). The truth is, it's impossible to provide cost-effective support for all possible permutations of off-the-shelf hardware - a fact all Linux and Windows nerds are all too aware of, as are Mac engineers. There will always be some weird combination that makes things break, and it can take many, many man-days to fix it; when you have 1M boxen out there, this cost would be prohibitive. The only way is to control the standard platform.

For all of its closedness, the certification process is actually open when compared with other companies'. All the criteria involved are made available on public websites, the APIs with all the hooks required to extend wizards are public (with examples), companies are free to do public dry runs and any company can request a slot for validation. Perhaps some cost needs to be associated with the process (time is money after all, and we must discourage the less serious companies), but in general the process is fair and public. The tests, however, are stringent; hardware that passes _cannot_ fail when deployed in the wild.

One final note with regards to entry-level hardware. Some people may not be aware, but the computing power available as standard today is incredibly high. For example, one of the PCs I maintain has a 1 GHz CPU, 512 MB of RAM, a 10 GB hard drive and an average ATI card; I bought it for 60 GBP. This machine runs Ubuntu Hardy and sometimes has to cope with as many as 3 users logged on. It doesn't do any of the 3D Compiz special effects due to the dodgy ATI card, but it does pretty much everything else. You'd be surprised at what you can do with the slowest RAM, cheapest sound card and so on.

UbuntuBox: The Software Platform

By now you must have guessed that the box would be running Ubuntu; but this is not your average Ubuntu. Using an interface along the lines of Remix, we would make a clear statement that this is an appliance - not a PC. As the eee has demonstrated, perceptions matter the most. Remix's interface will remind no one of Windows, whilst at the same time making the most common tasks really easy to locate.

In addition to regular Ubuntu, the software platform would provide, out of the box, complete media support. This entails having GStreamer with all the proprietary plugins, Adobe's Flash and any other plug-ins that may be required for it to play all the media one can throw at it.

The UbuntuBox is mainly a clever Media Centre, and, as such, applications such as Elisa, Rhythmbox/Banshee, F-Spot, etc are at the core of the user experience. These applications would need to be modified slightly to allow for a better multi-user experience (e.g. shared photo/music collections, good PVR and DVB support, etc), but on the whole the functionality they already provide is more than sufficient for most users.

As with the hardware side, the software platform is tightly controlled. Only official Ubuntu repositories are allowed, and all software is tested and known to work with the current generation of boxen. And, as with hardware, the software platform is made available to third parties who want to deploy their wares. An apt interface similar to Click'N'Run is made available so that commercial companies can sell their wares on the platform and charge for them. They would have to go through compliance first, of course, but if the number of boxes out there is large enough, there will be companies interested in doing so. This would mean, for example, that a games market could begin to emerge based on Wine; instead of having each user test each Windows application for their particular setup, with many users having mixed results, this would put the onus of the testing on the company owning the platform and on the software vendor. Games would have to be repackaged as debs and be made installable just like any other Debian package. Of course, the same logic could be applied to any Windows application.
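A sketch of what the "one-click" storefront could look like underneath, using python-apt (one real way of driving apt programmatically); the package name below is hypothetical:

    # Minimal sketch of a one-click install on top of python-apt. The
    # repository is assumed to be one of the official, certified ones and the
    # package name is hypothetical.
    import apt

    def one_click_install(package_name):
        cache = apt.Cache()
        cache.update()   # refresh the package lists (requires root)
        cache.open()
        pkg = cache[package_name]
        if not pkg.is_installed:
            pkg.mark_install()
            cache.commit()  # download and install, pulling in dependencies

    one_click_install("some-certified-game")  # hypothetical repackaged game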

As I mentioned previously, boxen come with support contracts. A standard support contract should provide:
  • Access to all security fixes;
  • Troubleshooting of problems, including someone remotely accessing your machine to help you sort it out.
Due to its homogeneity, UbuntuBox is very vulnerable to attacks. If an exploit is out in the wild, a large number of boxen can be compromised very quickly. To make things a bit safer, the platform has the following features (a small sketch of the on-demand remote access idea follows the list):
  • SELinux is used throughout;
  • All remote access is done via SSH and is only enabled on demand (e.g. when tech support needs access);
  • All users have passwords and must change them regularly;
  • There is an encrypted folder (or vault) for important documents, available from each user's desktop.
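As an illustration of the second point, "remote access only on demand" could be as simple as the sketch below (Python, using the init script path of Ubuntu of the era); this is an illustration, not the actual platform code:

    # Sketch: the SSH daemon is normally stopped; when the user approves a
    # support session it is started and automatically stopped again later.
    import subprocess
    import threading

    def enable_remote_support(minutes=30):
        subprocess.check_call(["/etc/init.d/ssh", "start"])
        timer = threading.Timer(minutes * 60,
                                subprocess.check_call,
                                [["/etc/init.d/ssh", "stop"]])
        timer.start()
        print("Remote support window open for %d minutes." % minutes)

    # enable_remote_support() would be called when the user accepts a
    # support request from their desktop.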
Finally, notice that binary drivers and proprietary applications are avoided when possible - e.g. Intel drivers would be preferable to nVidia, provided they have the same feature-set. However, where the proprietary solutions are technically superior, they should be used. Skype springs to mind.

UbuntuTerm

Readers may be left wondering, "this is all very nice and dandy, but am I supposed to do my word processing using a TV?". Well, not quite. Whilst the TV is central, its use is focused on the gaming and Media Centre aspects of the box. If you want to use UbuntuBox as a regular PC, you will need to buy a UbuntuTerm. Just what is a UbuntuTerm? It is a dumb terminal of "old" in disguise (e.g. LTSP). It is nothing but an LCD display of a moderately decent size (19" say), with an attached PC - the back of the monitor or the base would do, as the hardware is minimal. The PC has a basic single-core chip with low power consumption (to avoid fans), plus on-board video, sound and wireless Ethernet. It is designed to boot off the network if BOOTP can be used over wireless; if not, from flash. Whichever way it boots, it's configured to find the mothership and start an XDMCP session on it. Its price should hover around the 100 GBP mark.

As with any decent terminal these days, the UbuntuTerm is designed to fool you into believing you are sitting at the server. X already does most of the magic required, but we need to take it one level further: if you start playing music, the audio will come out of your local speakers via PulseAudio; if you plug in your iPod via its USB port, the device will show up on your desktop; if you start playing a game, the FPS you get remotely will be comparable to playing it on the server. As with everything else mentioned in this article, all of these technologies are readily available in the wider community; it's a matter of packaging them in a format that regular users can digest (see Dave's blog for example).
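In other words, the server just needs to start the user's session pointing at the terminal's X display and sound daemon. A rough sketch of the idea (the hostname is made up):

    # Sketch: applications run on the UbuntuBox but their display and audio
    # are directed at the terminal the user is sitting at. Hostname invented.
    import os
    import subprocess

    TERMINAL_HOST = "ubuntuterm-livingroom"

    def launch_on_terminal(command):
        env = dict(os.environ)
        env["DISPLAY"] = TERMINAL_HOST + ":0"  # X display served by the terminal
        env["PULSE_SERVER"] = TERMINAL_HOST    # audio comes out of its speakers
        return subprocess.Popen(command, env=env)

    launch_on_terminal(["rhythmbox"])  # music plays through the terminal's speakers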

The standard hardware on a UbuntuTerm is as follows:
  • Low RAM, basic video card;
  • Speakers attached to monitor;
  • SD Card, compact flash readers;
  • WebCam, headset;
  • Lots of USB ports
A house can have as many UbuntuTerms as required, and the server should comfortably cope with at least six of them, depending on what sort of activities the users get up to.

Finally, in addition to the UbuntuTerm in hardware, there is also a UbuntuSoftTerm. This is nothing but a basic Cygwin install with X.org, allowing owners of PCs to connect to their UbuntuBox without having to buy an entire UbuntuTerm.
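Under the hood, the soft terminal boils down to starting a local X server that asks the box for an XDMCP session, roughly like this (the hostname is made up, and under Cygwin the X server binary is usually called XWin rather than X):

    # Sketch: start a spare local display and request an XDMCP login session
    # from the UbuntuBox. Hostname is a placeholder.
    import subprocess

    UBUNTUBOX_HOST = "ubuntubox.local"

    def start_soft_terminal(display=":1"):
        # The box's display manager then shows the normal login screen.
        return subprocess.Popen(["X", display, "-query", UBUNTUBOX_HOST])

    start_soft_terminal()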

Conclusions

UbuntuBox is an attempt to ride the wave of netbooks; it also tries to make strengths out of Linux's weaknesses. The box may not live up to everyone's ideals of Free Software, but its main objective is to increase Ubuntu's installed base, allowing us to start applying leverage against the hardware and software manufacturers. The design of the box takes into account the needs of a very large segment of the market - people who have basic computing needs but don't want to become experts - just like a PlayStation owner does not want to know the ins-and-outs of the PowerPC chips.

The UbuntuBox is an appliance, and as such is designed to be used in a fairly rigid set of ways, but that cannot be avoided if one wants to stay true to its nature. The more freedom one gives to users, the worse the end product will be for the Regular User, who cares not for intricate technical detail.

Note also that I haven't spent much time talking about business models for the company providing UbuntuBoxen. The opportunities should be there to create a sustainable business, based on revenue streams such as monthly payments for support, fees from OEMs, and payments to access the platform (content providers). However, I don't know too much about making money so I leave that as an exercise for the reader. The other interesting aspect is community leverage. If managed properly, a project of this nature could enjoy large amounts of community participation: in testing, packaging, marketing, support - in fact, pretty much all areas can be shared with the community, reducing costs greatly.

All in all, if there was an UbuntuBox out there for sale, I'd buy it. I think such a device would have a good chance of capturing this elusive segment of the market, giving Linux a foothold, however small, on the desktop.

Wednesday, June 11, 2008

Nerd Food: On Evolutionary Methodology

Unix's durability and adaptability have been nothing short of astonishing. Other technologies have come and gone like mayflies. Machines have increased a thousand-fold in power, languages have mutated, industry practice has gone through multiple revolutions - and Unix hangs in there, still producing, still paying the bills, and still commanding loyalty from many of the best and brightest software technologists on the planet. -- ESR

Unix...is not so much a product as it is a painstakingly compiled oral history of the hacker subculture. -- Neal Stephenson

The Impossibly Scalable System

If development in general is an art or a craft, its finest hour is perhaps the maintenance of existing systems which have high availability requirements but are still experiencing high rates of change. As we covered previously, maintenance in general is a task much neglected in the majority of commercial shops, and many products suffer from entropic development; that is, the piling on of changes which continuously raise the complexity bar, up to a point where it is no longer cost-effective to continue running the existing system. The word "legacy" is in itself filled with predestination, implying old systems cannot avoid time-decay and will eventually rot into oblivion.

The story is rather different when one looks at a few successful Free and Open Source Software (FOSS) systems out there. For starters, "legacy" is not something one often hears on that side of the fence; projects are either maintained or not maintained, and can freely flip from one state to the other. Age is not only _not_ a bad thing, but, in many cases, it is a remarkable advantage. Many projects that survived their first decade are now stronger than ever: the Linux kernel, x.org, Samba, Postgresql, Apache, gcc, gdb, subversion, GTK, and many, many others. Some, like Wine, took a decade to mature and are now showing great promise.

Each of these old timers has its fair share of lessons to teach, all of them incredibly valuable; but the project I'm particularly interested in is the Linux kernel. I'll abbreviate it to Linux or "the kernel" from now on.

As published recently in a study by Kroah-Hartman, Corbet and McPherson, the kernel suffers a daily onslaught of unimaginable proportions. Recent kernels are a joint effort of thousands of kernel hackers in dozens of countries, a fair portion of whom work for well over 100 companies. On average, these developers added or modified around 5K lines per day during the 2.6.24 release cycle and, crucially, removed some 1.5K lines per day - and "day" here includes weekends too. Kernel development is carried out in hundreds of different kernel trees, and the merge paths between these trees obey no strictly enforced rules - they do follow convention, but rules get bent when the situation requires it.

It is incredibly difficult to convey in words just how much of a technical and social achievement the kernel is, but one is still compelled to try. The absolute master of scalability, it ranges from the tiniest embedded processor with no MMU to the largest of the large systems - some spanning as many as 4096 processors - and covers pretty much everything else in between: mobile phones, Set-Top Boxes (STBs), game consoles, PCs, large servers, supercomputers. It supports more hardware architectures than any other kernel ever engineered, a number which seemingly keeps on growing at the same rate new hardware is being invented. Linux is increasingly the kernel of choice for new architectures, mainly because it is extremely easy to port. Even real time - long considered the unassailable domain of special-purpose systems - is beginning to cave in, unable to resist the relentless march of the penguin. And the same is happening in many other niches.

The most amazing thing about Linux may not even be its current state, but its pace, as clearly demonstrated by Kroah-Hartman, Corbet and McPherson's analysis of kernel source size: it has displayed a near-constant growth rate between 2.6.11 and 2.6.24, hovering at around 10% a year. Figures on this scale can only be supported by a catalytic development process. And in effect, that is what Linux provides: by getting better it implicitly lowers the entry barrier for new adopters, who find it closer and closer to their needs; thus more and more people join in and fix what they perceive to be the limitations of the kernel, making it even more accessible to the next batch of adopters.

Although some won't admit it now, the truth is none of the practitioners or academics believed that such a system could ever be delivered. After all, Linux commits every single schoolboy error: started by an "inexperienced" undergrad, it did not have much of an upfront design, architecture or purpose; it originally had the firm objective of supporting only a single processor on x86; it follows the age-old monolithic approach rather than the "established" micro-kernel; it is written in C instead of a modern, object-oriented language; its processes appear to be haphazard, including a clear disregard for Brooks's law; it lacks a rigorous QA process and until very recently even a basic kernel debugger; version control was first introduced over a decade after the project was started; there is no clear commercial (or even centralised) ownership; there is no "vision" and no centralised decision making (Linus may be the final arbiter, but he relies on the opinions of a lot of people). The list continues ad infinitum.

And yet, against all expert advice, against all odds, Linux is the little kernel that could. If one were to write a spec covering the capabilities of vanilla 2.6.25, it would run to thousands of pages; its cost would be monstrous; and no company or government department would dare to take on such an immense undertaking. Whichever way you look at it, Linux is a software engineering singularity.

But how on earth can Linux work at all, and how did it make it thus far?

Linus' Way

I'm basically a very lazy person who likes to get credit for things other people actually do. -- Linus Torvalds

The engine of Linux's growth is deeply rooted in the kernel's methodology of software development, but it manifests itself as a set of core values - a culture. As with any other school of thought, not all kernel hackers share all values, but the group as a whole displays some obvious homogeneous characteristics. These we shall call Linus' Way, and are loosely summarised below (apologies for some redundancy, but some aspects are very interrelated).

Small is beautiful
  • Design is only useful on the small scale; there is no need to worry about the big picture - if anything, worrying about the big picture is considered harmful. Focus on the little decisions and ensure they are done correctly. From these, a system will emerge that _appears_ to have had a grand design and purpose.
  • At a small scale, do not spend too long designing and do not be overambitious. Rapid prototyping is the key. Think simple and do not over-design. If you spend too much time thinking about all the possible permutations and solutions, you will create messy and unmaintainable code which is very likely to be wrong. It is best to implement a small subset of functionality that works well, is easy to understand and can be evolved over time to cover any additional requirements.

Show me the Code
  • Experimentation is much more important than theory by several orders of magnitude. You may know everything there is to know about coding practice and theory, but your opinion will only be heard if you have solid code in the wild to back it up.
  • Specifications and class diagrams are frowned upon; you can do them for your own benefit, but they won't sell any ideas by themselves.
  • Coding is a messy business and is full of compromises. Accept that and get on with it. Do not search for perfection before showing code to a wider audience. Better to have a crap system (sub-system, module, algorithm, etc.) that works somewhat today than a perfect one in a year or two. Crap systems can be made slightly less crappy; vapourware has no redeeming features.
  • Merit is important, and merit is measured by code. Your ability to do boring tasks well can also earn a lot of brownie points (testing, documentation, bug hunting, etc.) and will have a large positive impact on your status. The more you are known and trusted in the community, the easier it will be for you to merge new code in and the more responsibilities you will end up having. Nothing is more important than merit as gauged by the previous indicators; it matters not what position you hold in your company, how important your company is or how many billions of dollars are at stake - nor does it matter how many academic titles you hold. However, past actions do not last forever: you must continue to talk sense to retain the support of the community.
  • Testing is crucial, but not just in the conventional sense. The key is to release things into a wider population ("Release early, release often"). The more exposure code has, the more likely bugs will be found and fixed. As ESR put it, "Given enough eyeballs, all bugs are shallow" (dubbed Linus' law). Conventional testing is also welcome (the more the merrier), but it's no substitute for releasing into the wild.
  • Read the source, Luke. The latest code is the only authoritative and unambiguous source of understanding. This attitude does not in any way devalue additional documentation; it just means that the kernel's source code overrides any such document. Thus there is a great impetus in making code readable, easy to understand and conformant to standards. It is also very much in line with Jack Reeves's view that source code is the only real specification a software system has.
  • Make it work first, then make it better. When taking on existing code, one should always first make it work as intended by the original coders; then a set of cleanup patches can be written to make it better. Never start by rewriting existing code.
No sacred cows
  • _anything_ related to the kernel can change, including processes, code, tools, fundamental algorithms, interfaces, people. Nothing is done "just because". Everything can be improved, and no change is deemed too risky. It may have to be scheduled, and it may take a long time to be merged in; but if a change is of "good taste" and, when required, the originator displays the traits of a good maintainer, it will eventually be accepted. Nothing can stand in the way of progress.
  • As a kernel hacker, you have no doubts that you are right - but you actively encourage others to prove you wrong, and accept their findings once they have been a) implemented (a prototype would do, as long as it is complete enough for the purpose) and b) peer reviewed and validated. In the majority of cases you gracefully accept defeat. This may imply a 180-degree turnaround; Linus has done this on many occasions.
  • Processes are made to serve development. When a process is found wanting - regardless of how ingrained it is or how useful it has been in the past - it can and will be changed. This is often done very aggressively. Processes only exist while they provide visible benefits to developers or, in very few cases, due to external requirements (ownership attribution comes to mind). Processes are continuously fine-tuned so that they add the smallest possible amount of overhead to real work. A process that improves things dramatically but adds a large overhead is not accepted until the overhead is shaved off to the bare bone.
Tools
  • Must fit the development model - the development model should not have to change to fit tools;
  • Must not dumb down developers (e.g. debuggers); a tool must be an aid and never a replacement for hard thinking;
  • Must be incredibly flexible; ease of use can never come at the expense of raw, unadulterated power;
  • Must not force everyone else to use that tool; some exceptions can be made, but on the whole a tool should not add dependencies. Developers should be free to develop with whatever tools they know best.
The Lieutenants

One may come up with clever ways of doing things, and even provide conclusive experimental evidence on how a change would improve matters; however, if one's change will disrupt existing code and requires specialised knowledge, then it is important to display the characteristics of a good maintainer in order to get the changes merged in. Some of these traits are:
  • Good understanding of kernel's processes;
  • Good social interaction: an ability to listen to other kernel hackers, and be ready to change your code;
  • An ability to do boring tasks well, such as patch reviews and integration work;
  • An understanding of how to implement disruptive changes, striving to contain disruption to the absolute minimum, and a deep understanding of fault isolation.
Patches

Patches have been used for eons. However, the kernel fine-tuned the notion to the extreme, putting it at the very core of software development. Thus all changes to be merged in are split into patches, and each patch has a fairly concise objective against which a review can be performed. This has forced all kernel hackers to _think_ in terms of patches, making changes smaller and more concise, splitting out scaffolding and clean-up work, and decoupling features from each other. The end result is a ridiculously large amount of positive externalities - unanticipated side-effects - such as technologies developed for one purpose finding uses that were never dreamt of by their creators. The benefits of this approach are far too great to discuss here, but hopefully we'll have a dedicated article on the subject.

Other
  • Keep politics out. The vast majority of decisions are taken on technical merits alone, and very rarely for political reasons. Sometimes the two coincide (such as the dislike for binary modules in the kernel), but one must not forget that the key driver is always the technical reasoning. For instance, the kernel uses the GNU GPL v2 purely because it's the best way to ensure its openness, a key building block of the development process.
  • Experience trumps fashion. Whenever choosing an approach or a technology, kernel hackers tend to go for the beaten track rather than new and exciting ones. This is not to say there is no innovation in the kernel; but innovators have the onus of proving that their approach is better. After all, there is a solid body of over 30 years of experience in developing UNIX kernels; it's best to stand on the shoulders of giants whenever possible.
  • An aggressive attitude towards bad code, or code that does not follow the standards. People attempting to add bad code are told so in no uncertain terms, in full public view. This discourages many a developer, but also ensures that the entry bar is raised to avoid lowering the signal-to-noise (S/N) ratio.

If there ever was a single word that could describe a kernel hacker, that word would have to be "pragmatic". A kernel hacker sees development as a hard activity that should remain hard. Any other view of the world would result in lower quality code.

Navigating Complexity

Linus has stated on many occasions that he is a big believer in development by evolution rather than the more traditional methodologies. In a way, he is the father of the evolutionary approach when applied to software design and maintenance. I'll just call this the evolutionary methodology (EM), for want of a better name. EM's properties make it strikingly different from everything that has preceded it. In particular, it appears to remove most forms of centralised control. For instance:

  • It does not allow you to know where you're heading in the long run; all it can tell you is that if you're currently in a favourable state, a small, gradual increment is _likely_ to take you to another, slightly more favourable state. When measured over a large timescale it will appear as if you have designed the system as a whole with a clear direction; in reality, this "clearness" is an emergent property (a side-effect) of thousands of small decisions.
  • It exploits parallelism by trying lots of different gradual increments in lots of members of its population and selecting the ones which appear to be the most promising.
  • It favours promiscuity (or diversity): code coming from anywhere can intermix with any other code.

But how exactly does EM work? And why does it seem to be better than the traditional approaches? The search for these answers takes us right back to the fundamentals. And by "fundamentals", I really mean the absolute fundamentals - you'll have to grin and bear it, I'm afraid. I'll attempt to borrow some ideas from Popper, Taleb, and Dawkins to make the argument less nonsensical.

That which we call reality can be imagined as a space with a really, really large number of variables. Just how large one cannot know, as the number of variables is unknowable - it could even be infinite - and it is subject to change (new variables can be created; existing ones can be destroyed, and so on). As for the variables themselves, they change value every so often, but this frequency varies; some change so slowly they could be better described as constants, others so rapidly they cannot be measured. And the frequency itself can be subject to change.

When seen over time, these variables are curves, and reality is the space where all these curves live. To make matters more interesting, changes on one variable can cause changes to other variables, which in turn can also change other variables and so on. The changes can take many forms and display subtle correlations.

As you can see, reality is the stuff of pure, unadulterated complexity and thus, by definition, any attempt to describe it in its entirety cannot be accurate. However, this simple view suffices for the purposes of our exercise.

Now imagine, if you will, a model. A model is effectively a) the grabbing of a small subset of variables detected in reality; b) the analysis of the behaviour of these variables over time; c) the issuing of statements regarding their behaviour - statements which have not been proven to be false during the analysis period; d) the validation of the model's predictions against past events (calibration). Where the model is found wanting, it needs to be changed to accommodate the new data. This may mean adding new variables, removing existing ones that were not found useful, tweaking variables, and so on. Rinse, repeat. These are very much the basics of the scientific method.
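For the programmers in the audience, here is a deliberately trivial toy of that loop; the "model" is just a running mean, standing in for a real system:

    # Toy illustration of the fit / validate / recalibrate cycle. The model
    # here is a trivial running mean; reality is a drifting random variable.
    import random

    def observe(n=5, drift=0.0):
        """Stand-in for reality: a variable whose behaviour changes over time."""
        return [random.gauss(10.0 + drift, 1.0) for _ in range(n)]

    def fit(observations):
        """Issue a statement about the variable: 'its value is about this'."""
        return sum(observations) / len(observations)

    def still_holds(model, new_data, tolerance=3.0):
        """A statement is held to be possibly true until proven false."""
        return all(abs(x - model) <= tolerance for x in new_data)

    history = observe()
    model = fit(history)
    for step in range(10):
        new_data = observe(drift=step * 0.5)  # reality slowly shifts
        if not still_holds(model, new_data):
            history.extend(new_data)          # accommodate the new data...
            model = fit(history)              # ...and recalibrate. Rinse, repeat.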

Models are rather fragile things, and it's easy to demonstrate empirically why. First and foremost, they will always be incomplete; exactly how incomplete one cannot know. You never know when you are going to end up outside the model until you are there, so it must be treated with distrust. Second, the longer it takes you to create a model - a period during which validation is severely impaired - the higher the likelihood of it being wrong when it's "finished". For very much the same reasons, the larger the changes you make in one go, the higher the likelihood of breaking the model. Third, the longer a model has been producing correct results, the higher the probability that the next result will be correct. But the exact probability cannot be known. Finally, a model must endure constant change to remain useful - it may have to change as frequently as the behaviour of the variables it models.

In such an environment, one has no option but to leave certainty and absolutes behind. It is just not possible to "prove" anything, because there is a large component of randomness and unknowability that cannot be removed. Reality is a messy affair. The only certainty one can hold on to is that of fallibility: a statement is held to be possibly true until proven false. Nothing else can be said. In addition, empiricism is highly favoured here; that is, the ability to look at the data, formulate a hypothesis without too much theoretical background and put it to the test in the wild.

So how does this relate to code? Well, every software system ever designed is a model. Source code is nothing but a set of statements regarding variables and the rules and relationships that bind them. It may model conceptual things or physical things - but they all inhabit a reality similar to the one described above. Software systems have become increasingly complex over time - in other words, they take on more and more variables. An operating system such as Multics, deemed phenomenally complex for its time, would be considered normal by today's standards - even taking into account the difficult environment at the time, with non-standard hardware, lack of experience in that problem domain, and so on.

In effect, it is this increase in complexity that breaks down older software development methodologies. For example, the waterfall method is not "wrong" per se; it can work extremely well in a problem domain that covers a small number of variables which are not expected to change very often. You can still use it today to create perfectly valid systems, just as long as these caveats apply. The same can be said for the iterative model, with its focus on rapid cycles of design, implementation and testing. It certainly copes with much larger (and faster moving) problem domains than the waterfall model, but it too breaks down as we start cranking up the complexity dial. There is a point where your development cycles cannot be made any smaller, testers cannot augment their coverage, etc. EM, however, is at its best in absurdly complex problem domains - places where no other methodology could aim to go.

In short, EM's greatest advantages in taming complexity are as follows:
  • Move from one known good point to another known good point. Patches are the key here, since they provide us with small units of reviewable code that can be checked by any experienced developer with a bit of time. By forcing all changes to be split into manageable patches, developers are forced to think in terms of small, incremental changes. This is precisely the sort of behaviour one would want in a complex environment.
  • Validate, validate and then validate some more. In other words, Release Early, Release Often. Whilst Linus has allowed testing and QA infrastructure to be put in place by interested parties, the main emphasis has always been placed on putting code out there in the wild as quickly as possible. The incredibly diverse environments in which the kernel runs provide a very harsh and unforgiving validation that brings out a great number of bugs that could not possibly have been found otherwise.
  • No one knows what the right thing is, so try as many possible avenues as possible simultaneously. Diversity is the key, not only in terms of hardware (number of architectures, endless permutations within the same architecture, etc.), but also in terms of agendas. Everyone involved in Linux development has their own agenda and is working towards their own goal. These individual requirements, many times conflicting, go through the kernel development process and end up being converted into a number of fundamental architectural changes (in the design sense, not the hardware sense) that effectively are the superset of all requirements, and provide the building blocks needed to implement them. The process of integrating a large change to the kernel can take a very long time, and be broken into a sequence of never ending patches; but many a time it has been found that one patch that adds infrastructure for a given feature also provides a much better way of doing things in parts of the kernel that are entirely unrelated.

Not only does EM manage complexity really well, it actually thrives on it. The pulling of the code base in multiple directions makes it stronger, because it forces it to be really plastic and maintainable. It should also be quite clear by now that EM can only be deployed successfully under somewhat limited (but well defined) circumstances, and it requires a very strong commitment to openness. It is important to build a community to generate the diversity that propels development; otherwise it's nothing but the iterative method in disguise, done out in the open. And building a community entails relinquishing the traditional notions of ownership; people have to feel empowered if one is to maximise their contributions. Furthermore, it is almost impossible to direct this engine to attain specific goals - conventional software companies would struggle to understand this way of thinking.

Just to be clear, I would like to stress the point: it is not right to say that the methodologies that put emphasis on design and centralised control are wrong, just like a hammer is not a bad tool. Moreover, it's futile to promote one programming paradigm over another, such as Object-Orientation over Procedural programming; one may be superior to the other in the small, but in the large - the real world - they cannot by themselves make any significant difference (class libraries, however, are an entirely different beast).

I'm not sure if there was ever any doubt; but to me, the kernel proves conclusively that the human factor dwarfs any other in the production of large scale software.