Interviewee: Mitch Bradley
In this interview we talk with Mitch Bradley, who was responsible for developing the firmware for the OLPC project. Specifically, we talk about:
- Mitch’s history with Sun, FirmWorks, and elsewhere
- Evolution of One Laptop Per Child
- Novel hardware in OLPC laptops
- Propagating technology from OLPC to the mainstream
- Open-source intersections with the commercial sector
- Open-source firmware and community
- History and future of firmware’s place on computers
Sean Campbell: Mitch, tell us a little bit about your background, both as it relates to One Laptop Per Child and outside of that.
Mitch Bradley: I spent a year-and-a-half out of ‘Uni’ working as an analog circuit designer at Rolm Corporation, which some people even remember. And then I moved to Sun when it was about six months old.
I was at Sun for 10 years; that was a pretty formative part of my career. I started out as a hardware designer doing things like SCSI and Ethernet interfaces.
I ended up being the guy that could debug the system problems when other people couldn’t as a result of my sort of refusing to specialize in either hardware or software and also as a result of this Forth diagnostics environment that I had put together over the years.
I had gotten interested in Forth at some point and started building little diagnostics. Then, about 1987, Andy Bechtolsheim–who was the hardware wizard at Sun–approached me to do some firmware for his new design, which was ultimately going to be the SPARCstation 1. He needed the firmware to be smaller than the prevailing firmware size on the Sun product line.
I jumped at the chance because I had always wanted to rewrite the firmware in Forth.
And thus was born OpenBoot, which still ships on SPARC machines. It was taken to the IEEE, became an IEEE standard, and was renamed OpenFirmware at that time. Apple picked that up for their PowerPC Macintoshes, as did IBM for some of their PowerPC servers.
And I have been doing Open Firmware not quite ever since, but off and on ever since. I left Sun after 10 years there and started my own company, FirmWorks, in order to provide support services for OpenFirmware.
As an employee of Sun, I couldn’t really support firmware on third-party platforms at the level that they needed to be supported, so I started FirmWorks to do that.
And FirmWorks is still a going concern, although it has been on and off the back burner several times. I took a hiatus and built high-speed cameras to watch people hit golf balls.
Sean: That’s a nice, pleasant diversion.
Mitch: It was kind of like my midlife crisis. I had to do something different so I wouldn’t go crazy. I like to play golf so that was kind of fun.
Sean: What happened after that?
Mitch: That company cratered in the dot-com crash, and then I worked for a company that was doing voice control for cable television. You would talk to the control and tell it which channel you wanted to watch, or tell it to go find The West Wing or some movie that you like, or to show what sports were on, and stuff like that.
We got that working pretty well, but we couldn’t get the cable companies to deploy it, because it wasn’t going to get them enough monthly incremental revenue to meet their threshold of interest.
I started out doing Open Firmware for that company. We were building a supercomputer because we were going to need a lot of compute cycles in the cable TV head-end, where rack space was at a premium and power was at a premium.
So, we had a MIPS-based, 128-processor mesh machine in a small box that could do a lot of speech recognitions per second, but that eventually died on the vine because they couldn’t fabricate the chips properly.
I was doing the firmware for that, and I did some DSP stuff and various odds and ends. Then I got involved with OLPC about a year and a half ago and have been doing firmware for them. So, that’s the short answer.
Sean: Tell me a little bit about your role with OLPC.
Mitch: I pretty much did the firmware. When I signed on at OLPC, the idea was to use LinuxBIOS (which is now called coreboot). LinuxBIOS started out as a firmware where the idea was really not to have any more firmware than was absolutely necessary–the smallest amount of start-up that you could do and jump to a Linux kernel.
It kind of grew so that it acquired various payloads like GRUB and things like that, because in a real production environment it just doesn’t fly to have the firmware not be able to do anything other than jump to the OS.
There are lots of diagnostics and system maintenance and system upgrade issues that really need to be addressed, so you need something there other than just initialize the memory and jump.
The idea at OLPC was to use LinuxBIOS along with a stripped-down Linux kernel as the payload, and that would be the boot loader for the real Linux kernel, which would be loaded off the main larger flash.
We were running into problems with that, because even when we stripped the kernel down as much as we reasonably could and compressed it with industrial strength compressors, it still wouldn’t quite fit into flash with all the stuff we needed.
I thought of putting in Open Firmware, but the problem at the time was that there were some encumbrances on my Open Firmware implementation as a result of the fact that I did it when I was at Sun, and Sun has some copyrights on it.
But it turns out that shortly after I started at OLPC, Sun open sourced their OpenBoot implementation as part of the OpenSPARC initiative. So that removed the encumbrance on the portions of my code base that I didn’t own outright.
FirmWorks then open sourced their part of the deal and so we had a complete OpenFirmware implementation that met OLPC’s openness requirement.
As a result, I set about replacing LinuxBIOS–or at least the Linux kernel payload portion of LinuxBIOS–with OpenFirmware. I eventually replaced everything because I realized that what was left of LinuxBIOS could be replaced with basically a register initialization table and about 50 lines of assembly language.
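[Editor's note: the "register initialization table" idea can be sketched roughly as follows. This is an illustration only, not the OLPC code, which the interview describes as a table plus about 50 lines of assembly. The register addresses, values, and the FakeHardware class are all invented for the example.]

```python
# Table-driven hardware initialization, sketched in Python.
# Every address and value below is hypothetical and chosen only for illustration.
INIT_TABLE = [
    (0x1000, 0x00000001),  # hypothetical: enable a memory controller
    (0x1004, 0x000000FF),  # hypothetical: set a timing parameter
    (0x2000, 0x80000000),  # hypothetical: turn on a clock
]


def apply_init_table(table, write_register):
    """Walk the table in order, issuing one register write per entry."""
    for address, value in table:
        write_register(address, value)


class FakeHardware:
    """Stand-in for real hardware so the sketch can run anywhere."""

    def __init__(self):
        self.registers = {}
        self.write_log = []

    def write(self, address, value):
        self.registers[address] = value
        self.write_log.append((address, value))


hw = FakeHardware()
apply_init_table(INIT_TABLE, hw.write)
```

The appeal of this layout is that the start-up logic stays a tiny fixed loop, while board-specific details live in data that is easy to read and change.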
So over the course of the first four, five months that I was at OLPC, we came out with a coherent firmware strategy based on OpenFirmware. That got us down to the point where we could put a lot more functionality in the amount of space that we had, and it kind of went on from there. A lot of the rest of the time I spent using OpenFirmware to diagnose hardware problems.
OLPC has a lot of innovative hardware, which means that there were a lot of things that went wrong and had to be fixed.
Scott Swigart: I did a tiny bit of firmware work way back in the day on some Motorola UNIX systems. And I remember that the firmware was really beneficial because if the answer was, “Well, boot the OS and run these diagnostics,” the next problem tended to be, “Well, if I could boot the OS, I wouldn’t have to run those diagnostics.”
I am guessing that was the nature of some of the problems, things that might be preventing the OS from booting or preventing some fairly fundamental things from working once it booted, and the firmware just kind of gets rid of a lot of stuff in the middle.
Mitch: A lot of times, we would have a situation where the OS in fact would boot, but some peripheral would be behaving in a flaky way. And the difficulty with trying to debug it under the OS is, the OS is doing so many things autonomously that it is really hard to zero in on the details.
Sean: There is just a lot less going on when it is just the firmware and the hardware basically.
Mitch: Exactly. So you can narrow the problem down. And Open Firmware has a remarkable ability to let you play with things because it has got a live Forth interpreter. So you can script anything just by typing a few characters on the serial port.
Over the years, I have developed a fairly substantial array of diagnostics and drivers for all kinds of devices. And also as a result of my history as the debugging guy at Sun, I have a pretty good war-chest of strategies for throwing scripts at hardware and seeing what happens.
In principle, you can do that with shell scripts and various things at the Linux level, but in practice, the productivity is probably two orders of magnitude less, because there is just so much happening, and it takes a long time to recompile a kernel and re-download it.
Then, the answers you get are inconclusive, whereas with a firmware you start seeing some effect. You can fairly quickly change your tests to eliminate possibilities and eventually isolate the problem and prove that the hardware is doing ‘X’ when it should be doing ‘Y.’
Sean: Can you talk for a minute about some of the novel hardware that is in the laptop?
Mitch: The display is the most obvious novelty. One aspect of the display is that instead of being only backlit and color or only reflective and monochrome, it will do either, so it is sort of a combination of transmissive and reflective.
It will display color when you are inside and can put light through the back, or it will display monochrome when you are outside and it would be too bright to see the backlight in any case.
It is a very low-cost display due to some novel pixel layout characteristics, and as a result, we needed a special chip to drive it. You can’t just feed it the same kind of video stream with the same pixel order that comes right out of graphics processors these days.
This display controller chip does pixel swizzling among other things–it was a new chip that was done on a very aggressive schedule, and it had some problems, so we had to work those out.
In addition to doing its pixel swizzling thing, this display controller chip also has one frame worth of memory, so that when the processor doesn’t have anything to do, the processor can tell this display controller to grab the last frame and keep it on the screen, and the processor can go to sleep without losing the display.
That lets us do aggressive power management when the child is, for example, just reading a book or something like that and not generating a lot of interaction, or the interaction is happening on second granularity. During intervening times, the processor power can be turned off and the user doesn’t even see it.
The switching from the autonomous display refresh mode to the processor-mediated refresh mode required a lot of debugging.
Sean: Tell us a little bit about the mesh networking.
Mitch: For our mesh networking, we want to be able to maintain the integrity of the mesh while the processor is turned off. So the radio has its own processor. There is a little ARM chip on the radio module that does the core of the mesh stuff, and that processor remains powered while the main X86 CPU is turned off.
This mesh module is on a USB port, and that was actually a lot of trouble too. We had all kinds of problems with things like, when you turn off the power to the processor, does the USB interface connecting it to the ARM processor freak out?
There are all sorts of issues having to do with power management, too. General power management is a very, very difficult problem because modern computers have a lot of different power domains, and they’re communicating over buses that may cross these domains.
Sean: In terms of One Laptop Per Child, how much time needed to be expended compared to more of a typical commercial effort, where Dell is rolling out a new line of laptops and they want to increase battery life by half an hour, but they are not terribly concerned about having massive leaps forward in power management?
How much more effort do you think the firmware development effort took for all these issues around power management? I have to believe it is pretty huge.
Mitch: I would say it probably is, but I can’t really give you direct comparison because I haven’t been involved in the PC industry at the same level that I have been involved in OLPC.
Sean: Well, that is fair. What percentage of your team’s effort do you think it was, then?
Was it the major thing you were dealing with every time you tried to integrate with a new component or you got a new version of a new component?
Mitch: Power management has been a constant battle for the last year, and we are still not done. I started work in earnest on power management around January of last year and did the early startup code necessary to wake up the processor as fast as I possibly could.
I put that code to bed, but then ever since we have been fighting, fighting, fighting with all sorts of issues like when you turn something back on, it glitches the power rail and mostly works, but if you try to suspend and resume 10,000 times, it will crash.
On a conventional PC, if you could do 10,000 suspend/resume cycles, you would declare victory in some cases.
Scott: So, your PC, if it sits idle for five minutes, will essentially suspend, and then, when it resumes, which takes seconds, you reconnect to the network and a bunch of stuff at a hardware level re-initializes.
You are telling the processor to go to sleep the same way my desktop will, if it just sits idle for long enough or if I tell it to suspend, but you are keeping everything else around it alive. Like you said, you are using an ARM processor in a mesh network to keep it alive. You just put a frame in the display so that it essentially just keeps showing that same frame while the processor went to sleep.
You took a novel approach to extend battery life by essentially turning off the processor, which is one of the things that is going to draw the most power.
Mitch: We certainly had an aggressive goal that nobody had ever contemplated seriously or gone after in a really big way. The goal was basically to provide computers to people that are poorest of the poor. And in fact, right now, we are deploying in Peru, in places where they don’t have any electricity, so we have to roll out solar power along with everything.
Sean: To throw out an example from a completely different direction, Linksys has this 802.11g router, the one that is stacked up to the sky when you walk into a Best Buy. It’s a standard model they have been selling for three years.
The reason I bring it up, is that apparently, the firmware that Linksys uses on that particular router actually had an open source lineage. And so, what happened is there is a whole cottage industry in the last three years that they have built up around, essentially people adding functionality to this router.
Given that open source projects like to borrow, consume, get inspired by, or build on top of other things, how much do you think of what you have done is actually transferable?
Mitch: I’m pretty confident that much of the technology that came out of OLPC is going to be part of the landscape for many years to come. It won’t be a slam dunk in many cases to just pick up piece ‘A’ and deploy it in product ‘B,’ but, for example, the basic display technology, that’s going to happen.
It is a great display and there is no reason that it couldn’t be deployed in any number of things. And the company that manufactures the display has some time ago spent a very large amount of money building a factory to produce these displays in enormous quantity, fully expecting that it wasn’t just going to be for OLPC.
802.11s is a happening thing and the firmware improvements that are being made to make the mesh networking robust are going to appear in other products. The firmware that we use for wireless networking is not open sourced actually; it is done under contract to Marvell, which is the company that manufactures the chip.
And we really wish it were open source and there is a small group that is trying to make an open source replacement for it, but that is not very far along, as far as I can tell.
The elements of this system that are not open source, including that mesh firmware, the firmware that runs the touchpad, and the firmware that runs the embedded controller that does the very lowest level of power management, those are not open sourced, and they are some of our biggest headaches.
We have had endless fights trying to solve difficult problems in those areas where we can’t see the code.
So I would say that one of our biggest development challenges has actually been dealing with the bits of firmware that were closed source, because we just didn’t have enough clout to force them to be open.
Sean: You have been working on OpenFirmware for quite some time, and it didn’t necessarily start as OpenFirmware when you were working on it back at Sun.
From your perspective, what are some of the advantages you see in open source software at this really low hardware and firmware level, versus proprietary closed source?
Mitch: I can sort of summarize the advantages of open source things in sort of two statements. Statement one is that if you can see the source, you can debug it. Closed software is great when it works, but when it doesn’t work, you are pretty much stuck.
And the other advantage to open source is that a lawyer is not going to tell you that you can’t use what you did.
Sean: What is your impression about why, at the firmware level, more of it isn’t open source? Is there a feeling that by open sourcing the firmware, it makes it easier to reverse engineer the hardware itself? What do you think are some of the barriers at the firmware level that make a lot of companies resistant?
Mitch: Certainly, hardware vendors are very much in the mode of trying to protect their intellectual property. They have a lot of clearly definable investment in their designs. They particularly want to sell it, and they’re particularly concerned about some factory cloning their device and selling it more cheaply.
I understand that companies have a legitimate reason for wanting to make a profit, and many of the really successful companies in the past have been fairly secretive.
Certainly, IBM was not in the business of giving away all of the source code for their VM operating system, and Microsoft is not in the business of giving away source code for Windows, and Intel is not in the business of giving away masks for their silicon.
The only two games in town as far as graphics chips go–Nvidia and ATI–are very, very closed about their software, their drivers, and a lot of their details about how their hardware works.
It’s hard to argue with the guys who are raking in the dough, while everybody else is sort of scrabbling around for the scraps. There are arguments in favor of opening things up. I’m not going to repeat them–they’re pretty easy to find–but unfortunately I don’t see that a lot of the people making those arguments are as wealthy as the other people.
Sean: Dozens of news articles say that Sun acquiring MySQL proves the business model of open source.
I think there’s an understanding that at some point, open source companies have to be able to get similar levels of return, similar levels of revenue and profitability, and things like that, that closed source companies have done in the past.
Sun to me is sort of a fascinating company in a Petri dish. They are taking stuff that five or 10 years ago they would have argued to be intellectual property that they could never expose to the world, and they are just putting the source code out there and it will remain to be seen what happens.
I think they are doing it even down to some of their chip masks, in addition to their software side. And so, there is this argument that really the value is in the experience and the talent pool that knows that code base and knows that architecture so well–it isn’t necessarily in the code itself.
Somebody could fork MySQL and run off and make their own version of it, but for some reason nobody does. And I think it is because it would be really hard to just staff 200 people and just keep moving the product forward and try to get everybody to convert over to your version instead of the other one.
What are your thoughts on that? You are a lot closer to it, especially where hardware meets software, which is an interesting place where this plays out.
Mitch: You mentioned Sun and openness. It is kind of interesting to me how much distrust there is between the Linux community and Sun or vice versa. It seems like the Linux community really distrusts Sun.
Sean: I think that’s a fair assessment.
Mitch: Sun was really a pioneer at making things available. If you go back to NFS, Sun made NFS available and the Linux community would argue that, “Well they didn’t make it available according to our terms of open source,” but that’s rewriting history, because those terms of open source didn’t exist at that time.
And similarly with the NeWS windowing environment, Sun promoted and made any number of technologies available under very, very generous terms by the standards of the time, but they don’t meet the current standards of open source as articulated by the Free Software Foundation, for example.
People are retroactively applying their particular worldview to situations that happened when things were very different.
Sean: What do you think are some interesting directions that someone could take the firmware for One Laptop per Child? Obviously, you can imagine ways that they could take the software stack running on it and interesting ways to do things at a software level.
But given that it is open source, where do you think somebody could take the firmware? I could see scenarios like if it was deployed in a country, maybe they eventually take on a project themselves to modify the firmware to support some local need or some balance, maybe the power management versus another activity that they consider more important or something like that.
Mitch: That brings up kind of an interesting conundrum that I have always had as a firmware person. If I do my job really, really well, nobody knows that I have done anything.
Sean: Right–if it works well, no one ever enters the BIOS.
But assuming people looked at it as a target to move forward, what do you think the areas for improvement are? If somebody was to say, “Well I love firmware and I want to tackle this and I want to move the One Laptop per Child platform forward in a given direction.”
What would they be able to do or what do you think they should be inspired to do?
Mitch: I really don’t think there is very much opportunity in that area, because the functionality of the OLPC firmware right now is pretty complete. There are a few little things I need to do, like make it a little bit more robust to avoid getting killed if you try to update it at the wrong time.
But in terms of functionality, there is not a lot to be said, particularly in light of the fact that the OLPC security model makes it impossible to access the firmware unless you have a special developer key.
So, by and large, the firmware is kind of under lock and key, because if you can get access to the firmware, you can own the machine in a very profound way, and it pretty much violates every security assertion you can think of.
Now, you can get a developer key just by asking for it, and our security model is not intended to prevent people from doing things, but rather to prevent bad people from stealing the machines. Nevertheless, most people are not going to bother.
Most people are going to be working at a much higher level and doing Python scripting and this and that and the other.
I think it is more interesting to contemplate where OpenFirmware could go in other platforms. One of the things that I worked on recently was the ability to boot Windows XP from OpenFirmware. That’s kind of working, so that maybe makes it plausible to use OpenFirmware in some embedded platforms based on X86 processors, where you previously could only really consider a conventional BIOS, because you were unwilling to give up the possibility of maybe sometime running Windows.
Sean: I am curious about what that entails at a firmware level to boot XP versus boot Linux or something like that?
Mitch: Well, to boot XP you have to emulate a list of BIOS services that are packaged as real mode INTs of a traditional kind. This is maybe changing with EFI, but as of right now, the historical legacy real mode BIOS INTs are the name of the game.
Basically, what I had to do was go through and build a call gateway, so that I could bounce a real mode INT into OpenFirmware, and then I could just write Forth scripts to implement the functionality.
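[Editor's note: a rough sketch of the dispatch idea behind such a call gateway. The real gateway bounces a real mode INT into OpenFirmware, with the services written in Forth; here Python functions stand in for those scripts. The register dictionary, the handler names, and the choice of services are assumptions made for illustration. INT 10h and INT 13h are the conventional BIOS video and disk interrupts.]

```python
# Sketch: route a software interrupt number to a handler that implements
# the corresponding BIOS service. Registers are modeled as a plain dict.

def int10_video(regs, state):
    # AH=0x0E is the conventional "teletype output" function: print the char in AL.
    if regs["ah"] == 0x0E:
        state["console"].append(chr(regs["al"]))
    return regs


def int13_disk(regs, state):
    if regs["ah"] == 0x00:
        # AH=0x00 is "reset disk system": report success (AH=0, carry clear).
        regs["ah"] = 0x00
        regs["cf"] = 0
    else:
        # Anything else is unimplemented in this sketch: report an error.
        regs["ah"] = 0x01
        regs["cf"] = 1
    return regs


# Hypothetical handler table; a real implementation would cover many more services.
HANDLERS = {0x10: int10_video, 0x13: int13_disk}


def gateway(int_number, regs, state):
    """Look up the handler for an INT number and run it on a copy of the registers."""
    regs = dict(regs)
    handler = HANDLERS.get(int_number)
    if handler is None:
        regs["cf"] = 1  # signal failure for services this sketch does not provide
        return regs
    return handler(regs, state)


state = {"console": []}
gateway(0x10, {"ah": 0x0E, "al": ord("A"), "cf": 0}, state)
reset_result = gateway(0x13, {"ah": 0x00, "al": 0, "cf": 1}, state)
unknown_result = gateway(0x99, {"ah": 0x00, "al": 0, "cf": 0}, state)
```

Once the gateway exists, adding a service means adding one more table entry, which matches the workflow described above of writing a short script per BIOS function.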
Sean: How much of a community is there around OpenFirmware? Are there dozens of participants, or hundreds, or what?
Mitch: There are probably dozens, but not very many dozens. We have given OpenFirmware classes to any number of people inside IBM, and there are quite a few people inside Sun that know it, but the difficulty is that most of the people that know it are inside some company.
Sean: So you are one of the few independent people who knows OpenFirmware?
Mitch: Yeah. I have a few colleagues that I call on from time to time when there is work to be done, and there are actually quite a few people that have done OpenFirmware in the past that would very much like to be doing it now, but since the industry has kind of collapsed into PCs, they just really haven’t had an opportunity.
So if the OpenFirmware thing were to take off again–and I am not really expecting it to in any big way–I know some people I could probably reengage.
Sean: This has been great. We usually turn the microphone back to the person we are interviewing and say, “Hey, did we not cover something you wish we brought up, is there something unique about the subject you’d want to just get on the record or just kind of a closing thought or two from your end?”
Mitch: Yeah, actually, there is one observation I would like to make. In the time that I have been in the industry, the speed and capacity of most aspects of the computer have changed by a factor of several thousand, but the space available for firmware has changed by a factor of at most 10.
So I have sort of maintained this frugality mindset over the years that seems to be going away in a big way and people don’t think anything about throwing 10 megabytes of code at something.
Sean: Well, it is funny you mention that, because I think eventually we will be on the ninth generation language and it won’t even look like a computer language. Our grandkids or even our kids will have a concept of programming that won’t be anything like what we have.
It seems like the present time will be kind of the dark hallway of computing, and you will never get the same largesse that people who hang out in the OS actually get, unless something changes, but it doesn’t sound like from your perspective it is really going to change in the near future.
I mean, maybe there will be some growth vectors, but firmware will always be kind of tied into this really tight box while still trying to do a lot of things.
Mitch: Never say never. You don’t know what is going to happen with hardware, but it is certainly the case that over these 25, 30 years, they are still putting in ROMs that are about a megabyte.
When I was first starting up in firmware at Sun, we had a 128K ROM that quickly bumped up to 256K and now we have grown by a factor of four, but in the meantime disk sizes have changed by a factor of 1000.
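[Editor's note: the arithmetic behind these figures, as a minimal sketch. It takes "about a megabyte" to mean exactly 1,024 KiB, which is an assumption, and the variable names are ours.]

```python
KIB = 1024

rom_first = 128 * KIB    # the first ROM size mentioned
rom_second = 256 * KIB   # the size it "quickly bumped up to"
rom_now = 1024 * KIB     # assumed reading of "about a megabyte"

growth_from_first = rom_now // rom_first    # 8
growth_from_second = rom_now // rom_second  # 4

# The disk growth factor is the one quoted in the interview.
disk_growth = 1000
gap = disk_growth // growth_from_second     # 250
```

Measured from 256K, the ROM has grown by the factor of four mentioned above. Measured from 128K, it is a factor of eight, which is still within the "at most 10" quoted earlier.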
Sean: And you have also got the multitude of devices that are in the computer, just in terms of the different ways to connect, although I would assume that kind of collapses and expands. At one point, your wireless controller is separate from your Ethernet controller, which is separate from your EVDO controller, and eventually they are all in one device that talks on a standard that can communicate to all of it I guess.
Mitch: Well, they certainly collapse in terms of what the hardware looks like, but the software doesn’t collapse at all, because they still retain their old programming models.
Sean: That’s true, because you still have to interface with them, so even if it is all sandwiched on one little mini-PCI card, from your perspective, while they saved on space, they didn’t really save on the amount of stuff that you have to deal with to make the device perform the actions that it is designed to do.
Mitch: In fact, in many cases it gets worse, because in order to preserve the old programming model, but also acquire the new functionality that they want to add, they hide things behind a bunch of layers.
So especially at the firmware level, you have to deal not only with the original programming model, but with all of the layers of modalities that let you put it into that model.
Sean: We both appreciate you taking the time to do this. Thank you.
Mitch: Thank you.