Clean Rooms and Reverse-Engineering

by Sean

The IBM PC - we all know it, we all (mostly) love it.  Since 1981, the PC - or rather its underlying architecture - has become the dominant computing platform.  Part of its rise to ubiquity was the fact that it was easily copied.  Clones of the PC flooded the market with cheap computers that were, for the most part, 100 percent compatible with one another.  And while there is a lot to be said about the rise of the PC and the ensuing clone wars, I want to focus on just one factor: how IBM's BIOS was reverse-engineered and what we can learn from the story.

First, some background.  Big Blue has always been known for its penchant for proprietary systems.  In a weird twist, the PC - its most recognizable creation - broke that streak.  On release day, you could buy an IBM PC technical reference manual that contained every single technical detail of the computer.  It covered everything from how the system functioned to a bill of parts, right down to the schematics for the whole computer.  In a very real way, the technical reference manual was a guide to building your own PC from off-the-shelf parts.  Besides the IBM logos plastered everywhere, nothing about the PC hardware was proprietary or protected by copyright or patent.  However, IBM wasn't dumb here.  They weren't giving away everything that made the PC work.

Enter the ROM BIOS.  This was a small piece of firmware that was responsible for managing the PC hardware.  It managed booting the system and provided a low-level interface for programmers to use.  The BIOS only amounted to a few kilobytes of code, but it was what made the PC work.  And, most importantly, IBM held the copyright to the BIOS.  In other words, no one but IBM could use the PC BIOS, and you couldn't make a working PC without a BIOS.

This is where reverse-engineering came into play.  The first company to successfully create a PC clone (at least one that was legal to sell) was Compaq, the first clone being the Compaq Portable.  The hardware was just a rip from the IBM technical reference manual, mainly since there are a lot of ways to skin a cat, but only a few ways to build a PC.  The BIOS, however, was a totally new firmware written in-house at Compaq.  So how did they get around IBM's copyright?

The tricky part here was that there was a copy of the BIOS in every PC just waiting to be disassembled.  But if Compaq used any code derived through those means, they were bound to get sued into oblivion.  They had to have plausible deniability that every byte of their new BIOS was free from IBM code.  To ensure this, Compaq used a method of reverse-engineering called clean room design (sometimes called the "Chinese wall" technique).

This method uses two teams of programmers to create the final product.  In the case of Compaq, one team scoured the PC technical reference manual, disassembled BIOS code, and PC programming references.  This "dirty" team then created a specification for their new BIOS implementation, basically just a technical explanation of the PC BIOS as a black box.  Their spec was then cleared by legal to make sure it didn't contain any IBM code or IBM-specific fingerprints.  Then the second team, on the other side of the "wall" so to speak, used the specification to write the new Compaq BIOS.  The second team was the "clean room" since they had no experience with the PC hardware or software and had no contact with the "dirty" team outside of the spec sheets.  By keeping a meticulous paper trail, Compaq had a pre-built legal case.  If IBM sued, then Compaq could easily show that its new BIOS had no copyrighted code in it.  In the end, the trick worked.  Compaq had a legally reverse-engineered BIOS implementation and was able to sell PC clones using it.

So how can we adapt techniques from early PC cloning?  Obviously, most of us aren't creating sellable products in any sense.  It's rare that a hobbyist needs to defend themselves in court, but it does happen.  For a lot of projects, using clean room design may be overkill, but if your project turns into a product, then this method may provide a legal safeguard.  However, I think there is a more general takeaway from this method: we should share more of our own "specification."

Trying to reverse-engineer something, especially older technology, can be fun but can also really suck.  This is even more true if you are working with limited resources.  It is rewarding to hack together a solution using trial and error, but it can take a lot of time and effort and cause a lot of frustration.  Having access to a specification, or something close, can turn an impossible task into a fun afternoon of tinkering.  The hacking community is already really good at transparency; a lot of us share our projects and findings online or in person.  That being said, most of what gets passed around online is already complete, whereas partly finished or failed attempts are less often shared.  And a lot of half-done or dropped projects can have a solution hidden in them that other people sorely need.  The start of a reverse-engineering job can easily serve as a specification to help someone else finish the job.

So share more of your "dirty" work.  Even if you don't get to the final steps, anything you find can be useful for the community.  It's O.K. to let someone else be the "clean room."
