Interviewee: Mark Gross
In this interview, Richard Bowler and Scott Swigart talk with Mark Gross of Intel. Mark is an engineer in the Open Source Technology Center at Intel Corporation. He primarily works on Linux OS support for Intel’s telecommunication computing platforms, along with some additional activities. Mark is also the chair of the Power Management Working Group for the Consumer Electronics Linux Forum. Mark is a robotics hobbyist and participates in PARTS (http://portlandrobotics.org) and the occasional Dorkbot get-together.
This interview covered a variety of topics, some of which follow:
- The role of the Open Source Technology Center at Intel
- Device Drivers and Open Source distros
- Who determines what gets into the Kernel
- Turf battles with Hibernate
- Security and Open Source
- Critical Mass and Buy Off in the Open Source community
- Peer Review of Submissions
Richard Bowler: Hi Mark, thanks for taking the time to talk to us. I understand that you work at Intel in the Open Source Technology Center. What’s the mission of the Open Source Technology Center?
Mark Gross: The primary mission is platform and hardware enablement. Whenever we [at Intel] come up with a new chipset, process, or instruction, our job is to enable it, first internally, and then for the community at large and the distros. In support of that, we employ a lot of kernel developers, and we maintain a significant presence in the Linux kernel community.
Richard: Can you tell us what your role in the group is?
Mark: I provide an outward face to the OSV community in the telecommunication space. Intel has some telecommunication businesses, and we need to make sure that the telco-centric OSVs enable their OSs for our new platforms as they come out.
For example, I work with MontaVista, Wind River, and Red Hat. I also work with the embedded Linux community on power management. My role isn’t a very hands-on one there, but I’m the chair of the Power Management Working Group for the Consumer Electronics Linux Forum.
I’m also a minor contributor to the Linux kernel.
Scott Swigart: Let me ask you about certain Linux distros being more open than others to things that aren’t 100% open source. Device drivers, for example, aren’t always able to be completely open source, because the companies writing the drivers just aren’t going to do that. For instance, Broadcom has a wireless adaptor that the FCC only allows to operate in certain frequencies, but that’s controlled by software.
And so they don’t open source the code for the driver, because they don’t want somebody to be able to change the operation of their device in such a way that their device no longer complies with the FCC. Can you speak to that a little bit?
Mark: Sure. Regulatory issues do come up every once in a while, and can be used as the justification for not open sourcing a driver. That argument isn’t usually accepted well by the Linux community.
Intel’s wireless Ethernet is an example. It had a similar problem, and we worked around it.
The first way we addressed it was a regulatory daemon. The regulatory component was done in software, which on Windows was implemented as a user-mode module. It was ported to Linux as a user-mode daemon. The closed-source daemon talked through an interface to the driver, which controlled the hardware, enforcing regulatory behavior.
The community did not like that, and it was problematic for us. Some key OSVs wouldn’t ship it. The solution was to modify the firmware loaded on the wireless card. We rewrote the firmware to ensure regulatory compliance, and the daemon was no longer needed.
The open source community hates it when an IHV hides behind regulations as a justification to not make a GPL driver. They push back and they say, "Why can’t you just do what Intel did, and rewrite your wireless firmware so that regulatory compliance is enforced on the device itself?"
Scott: It seems, then, that there are at least some examples out there of companies that have figured out a way to stay in regulatory compliance and still open source their driver. And so the community looks at those examples, and pushes other hardware vendors to follow those examples.
Mark: Or they call their bluff and say, "Look, you know we don’t believe your excuse. You just don’t want to do it." [laughs] That’s usually what it comes down to.
Scott: So, the open source community isn’t really very accepting of any sort of excuse for why something they need, like a driver, can’t be open sourced.
From their perspective there’s probably a way to do it, and the company, to some degree, is trying to pull it over on them by giving an excuse for why they don’t want to open source something.
Mark: That’s probably correct. But at the same time, it’s not like they won’t use closed source drivers. Display drivers are a good example. Given the choice, they will not use a closed source driver, but if they don’t have a choice, they will.
Mark: By definition, all the kernel code is written by the community.
Scott: Well, I mean it might be written by the community, but it might be a person who’s employed by Red Hat, IBM, or Intel, versus somebody in their garage in Liechtenstein, who puts code in.
On the one hand, you hear sometimes that it’s open source, and everybody contributes. The flipside of that is we hear sometimes, "No, it’s really only 10 people who work on this, and they all work for companies."
What’s your impression on that issue?
Mark: As far as drivers go, if it’s a supported driver, it should be almost universally written by a company. Either an employee of a company or a contractor hired by the company to write it.
Every once in a while you’ll find a case where someone has tried to reverse engineer a closed source driver and make it work, but those are in the minority.
Scott: And things like the task scheduler?
Mark: As far as sub-systems go — and major blocks of functionality — there’s a somewhat higher proportion of independents working in the kernel space, at least historically.
A lot of times, people will start hacking on it as a hobby, and they’ll get good at it. If they get notoriety they may get hired, funded, or get a stipend of some sort from some company like Red Hat, Intel, or someone. For some people, that’s their goal when they contribute. Others contribute because they need the functionality themselves, and others because it’s innately rewarding.
Scott: Talk to me a little bit about the gate keeping process. Who determines what gets into the kernel? How much of your ability to contribute is reputation based? In other words, what is the difference in the experience of someone with established credibility versus someone no one has ever heard of?
Mark: The answer is in part that it depends on what type of code you’re trying to get in. If it’s a driver for a new device that hasn’t been supported before, you provide a driver that has a clean certificate of origination. People look at it, they test it, and it goes through a code-review cycle. If everyone feels good about it after that, then it gets merged into the kernel tree and you’re the maintainer of the driver.
Getting a new driver into the kernel is actually fairly easy. I wrote a driver. I’m the maintainer of a driver called the Telco clock driver, for example. It was about a three-month process. I wrote the driver in a week, I posted my first version of the driver to the mailing list, and I got a bunch of code-review feedback.
Most of the feedback is cosmetic in nature, initially. And then eventually you get some feedback that’s more substantive. You make the changes, and you post the update back. You do this about three or four times. It’s a fairly lengthy process. Eventually, after a month or two of going back and forth once or twice a week, it gets accepted into what’s called the "-mm tree." Once it’s in the -mm tree, it stays there for a cycle and then it’s tucked into the mainline kernel. That’s how you get a driver in.
If you’re going to tackle some major change, like a re-write to the scheduler or power management behavior, then it gets a little more political. The memory manager? Oh my gosh, you touched the memory manager, you better be ready.
Scott: In your opinion, what leads to that politicization of certain sub-systems?
Mark: Sometimes it’s a turf battle. A good example is the Hibernate feature. That’s been plagued with political drama ever since it came out. This is basically the equivalent of Hibernate for Windows. The maintainer exerts tight control over changes, and a maintainer has a lot of discretion over what is accepted or rejected. Sometimes it’s for technical reasons, but not everything that people do is logical. Additionally, the large companies have not gotten to the point where they will step in and work to fix things the way they need them.
Scott: So there are gatekeepers that have to approve changes to different pieces, because they’re the maintainer of those pieces. And if they don’t approve it, it doesn’t make it in. And the only thing you can do, if you really want your thing in, is just fork your own entirely different Linux.
Mark: Few people do that. You just have to keep trying, and sometimes there are good reasons for it, other times there are just petty reasons that just don’t make any sense.
Scott: It seems like that could stifle innovation and slow down progress in certain areas, but talk to me about the flipside. Open source has obviously built the operating system and the web server that the majority of the Internet runs on, so it was successful somehow.
What’s the counterbalance to some of those issues? What do you think really makes open source work?
Mark: Large companies like Intel stepping up and saying, "Hey, this is crap. Let’s fix it." [laughs]
Scott: Do you really see the sponsorship of IBM and Intel and Sun or whoever as being really key to the success?
Mark: Yes, I do. Absolutely. I have no doubts about that.
Scott: They see a problem and they throw resources at it to make it right, and they put it out there for the community at large?
Mark: They fund the development and work through the process to get it accepted. They do what they have to, to placate any community concerns. They work closely with the community big shots to address problems.
Richard: There’s a sense that process development and enforcement are more difficult in an open-source development model as compared to a closed-source model, just based on geographic dispersion and difficulty in day-to-day control over individual contributors.
Can you talk about how the open source community deals with that? Because, obviously, you have to have good process to have quality code, and it’s more difficult to enforce that in an open source model. How does the open source community deal with that issue?
Mark: It’s on a project-by-project basis. The Linux kernel has a process, Mozilla has a different process, Fedora has a different process. One Laptop Per Child has yet another process. They all have different processes and different behaviors.
It’s important to set up your project so it doesn’t have too many problems with personality or process-related issues. When someone turns toward those things, it’s because they have a personal agenda.
Scott: One of the people that we interviewed at Microsoft is Michael Howard, who heads up their security development lifecycle. He said, effectively, that there was a time when Microsoft didn’t focus a ton on security. They had a series of high-profile vulnerabilities, like Nimda, Code Red, Slammer and Blaster.
And so Microsoft, because they’re a closed-source company, just decided, "We are going to change the way we build software." And his joke was, "If you want to participate in the Microsoft salary plan, there’s just certain things you will do as a developer," in terms of threat-modeling and security reviews.
When they go into testing, they do "fuzz" testing and all of this kind of stuff to ensure security in their code. And now, they can point at things like SQL Server, which with the latest service pack hasn’t had a vulnerability in two years. They can point at IIS 6 and a number of products that have been through this process, and they can quantitatively show how their process has increased the security of their code.
One question that comes out of that is, what are some of the things you see in the open-source community that help to ensure security in the product. I know it’s project-by-project in the open source world, but I would imagine, with the kernel especially, there’s got to be a high focus on security, preventing buffer overruns and other vulnerabilities.
Mark: As far as security goes, it’s mostly because you have more eyes. It’s a transparent process. That’s really where you get the security from. You get people testing it.
The security is evolutionary, or reactive. When a security issue is identified, it is fixed. It’s not necessarily designed for security up front, although sometimes it is. Some things are. People will look at new functions or new interfaces and identify vulnerabilities for denial of service attacks, for example. But in general, the security is only in code reviews and people looking for security issues, there isn’t really a design for security.
Scott: Would you say that one of the strengths of open source is that when a vulnerability is found, the vulnerability can be addressed within days or even hours?
Mark: That’s true in theory, but in practice, people are using an OSV’s kernel and it’s going to take the OSV a while to qualify the fix and roll the fix out. If the fix is in the kernel, I can guarantee you it’s going to take time. If the fix is something like a utility or some small program that runs in user space, then the fix will roll out much sooner.
Scott: So, in other words, there might be a fix available in two hours, but it’s not the kind of fix that people are going to distribute throughout their data center, because it hasn’t gone through the process.
Mark: Right. You’d be crazy to do it.
Scott: Talk about some of the other advantages you see in open source. What are some of the big strengths?
Mark: Many eyeballs looking at the code, and version support.
Let’s say a company wants to write a driver. If you don’t open source it, you get stuck in this nasty rat race of having to support many different kernel versions across many different major and minor revisions. It’s intractable.
Open source allows you to release a driver, and only support it in the upstream version of the kernel. Some people are on an older version of the kernel because that’s what’s in their distro. Say someone is using Red Hat Enterprise Linux, which is based off kernel version 2.6.9, I think. I maintain my Telco clock driver on 2.6.20, or 21. If I fix a bug, then in theory, if Red Hat thinks that’s an important bug to them, they will backport it, not me. And that’s a pretty handy thing. That’s a strength, the ability to support so many versions.
Richard: All right. Every community has a lot of diversity, right? Every project I’ve ever been a part of has some struggle among the team over which features will be extended or added over the life of the product.
In closed source, that’s pretty easy. The guys with the power, the technical managers and the product managers, they negotiate until they reach some sort of conclusion.
Open source is much freer, at least in the sense that anybody can derive and redistribute. But if you’re under a single banner that’s going forward, a single distro, then you need some sort of coherent agreement about what’s going to go in the next release. How do you guys come to those decisions?
Mark: Well, I can’t speak for the distros, but I can speak to how the kernel community deals with this issue. It’s a struggle. If somebody wants to put in a new feature — let’s say the notion of operating points for controlling power management — right now there’s friction between the embedded Linux guys and desktop/laptop Enterprise Linux guys.
The embedded Linux guys would like to have some power management features that the desktop guys really don’t care about. There’s a pushing back and forth trying to reach some middle ground, and it’s been a hassle.
Richard: How is the final decision negotiated? Is there some council that says, "OK, we’ve talked about it long enough, we’ll take a vote"?
Mark: No, nothing that formal. Really, what you try to do is get it into that -mm tree I told you about. Somebody just writes it and tries to get it checked in to that tree. If you can get buy-in from the major guys, you can get it into the -mm tree.
Richard: So, it’s like a critical mass kind of deal, you get enough support in the community and that momentum will carry it into the kernel.
Mark: Yes. And if you don’t get the critical mass, then it kind of sits there and doesn’t go anywhere. If it’s important to you, you just keep trying. And eventually, you might catch people on a good day and you might get it in. Or you change it. You placate whatever issue they’ve raised and you get it in that way.
Richard: Considering development methods — like waterfall or spiral or agile or whatever — is there any specific group of development methods that work best in open source? Or is it just a mish-mash, just like closed source?
It seems intuitively that waterfall wouldn’t work; you’re not going to start with it before you implement anything. I assume that it’s more agile, but it sounds more informal than any formal process. As in, "I really want this to go in, so I’ll try to get some guys behind me and we’ll see if we can push it in."
Richard: On the peer review on the mailing list, do they have something like a pocket veto? If you don’t get much response, then does it just die by atrophy?
Mark: Sometimes yes, sometimes no — it depends on the feature. And there’s also significant backchannel communication, so it’s not all just on the LKML (Linux Kernel Mailing List).
Richard: OK. You post it and then you make some phone calls and say, "This is what I did, it’s really cool."
Mark: You usually don’t have to do that. You post it, and you say, "I think this is cool. I’d like to get it in there. What do you think?" And you’re almost guaranteed to get comments, at least the first time.
If no one is paying attention, you need to check to see that you’re following the standards and practices. There are a number of documents in the documentation directory of the Linux kernel that prescribe how to submit a patch, and if you violate those guidelines, you might not get a response.
But assuming you’re doing all those things correctly, and no one responds, then you post it again. And you just keep posting it until you get a response, positive or negative. And when you get one, you react to it.
Scott: I would guess that people on the mailing list, though not necessarily in an obstructionist way, probably throw a lot of stones, saying, "Hey, this is cool, but did you think about this?" Or, "What happens in this scenario?" Or, "How does that affect this other thing?"
Is that the way a lot of the communication works? People trying to poke at the edges of it and make sure it’s really well thought out?
Mark: Yes, after the first set of cosmetic complaints, you get a bunch of non-cosmetic things. Those are the interesting ones.
Richard: Cosmetic, as in, "I don’t like a brace here, I like it there?"
Mark: Right. "You have a white space at the end of a line. Please remove that white space."
Scott: Well, those are probably the fastest things. Those are probably the lowest hanging fruit that would let someone like me contribute to the feedback without really looking very deep.
Mark: That’s true. Anybody can pipe up and say something.
Richard: Was there a lot of struggle about coding standards in the community? With C, for instance, you have the Kernighan and Ritchie bracing style, and then you have the popular style where the brace goes on the next line down, tabbed in. Do they go back and forth with that, or is there sort of an accepted coding standard that the community tries to adhere to?
Mark: The Linux community does have a coding standard, and people try to adhere to it. And periodically people try to push back on it, and change it. The coding standard calls for not having lines that go past column 80, and that’s a pain in the neck, especially with eight-space tabs. If they’re going to tab in too far, that makes it a pain. People periodically push back on that, but it’s still there.
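To make the cosmetic complaints concrete, here is a minimal Python sketch of the two checks mentioned in this exchange: trailing white space at the end of a line, and lines that run past column 80 when tabs are counted as eight columns. It is an illustration only, not the kernel community’s actual tooling; the names check_line, check_source, TAB_WIDTH, and MAX_COLUMNS are hypothetical and chosen for this example.

```python
TAB_WIDTH = 8      # assumption: tabs counted as eight columns, as Mark describes
MAX_COLUMNS = 80   # the column limit mentioned in the interview


def check_line(line):
    """Return a list of cosmetic complaints for one line of source text."""
    problems = []
    text = line.rstrip("\n")  # ignore the newline itself
    if text != text.rstrip():
        problems.append("trailing whitespace")
    # Measure the width as displayed, with tabs expanded to tab stops.
    if len(text.expandtabs(TAB_WIDTH)) > MAX_COLUMNS:
        problems.append("line exceeds column 80")
    return problems


def check_source(source):
    """Yield (line_number, complaint) pairs for every problem found."""
    for number, line in enumerate(source.splitlines(), start=1):
        for problem in check_line(line):
            yield number, problem
```

Calling list(check_source(text)) on a patch body gives one entry per complaint, which is roughly the kind of first-round feedback a reviewer can give without compiling anything. Because width is measured after tab expansion, ten levels of tab indentation already use all 80 columns, which is why deep nesting makes the limit painful.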
Scott: It seems to me that one of the things that differentiates open source and closed source is that in closed source, there might be a lot more focus on the methodology. There might be a lot more focus on whether it’s done using agile development, for example. In closed source there are different milestones, and gates, and sign-offs.
Mark: Yeah, no milestones. I’d say it’s probably agile programming, because they re-factor things periodically.
Scott: In open source, it seems like it’s really 100% about the code. That there isn’t really a lot of discussion about how the code got there or the methodology that somebody used to write it. It’s really that the only thing that you have to look at, and judge, and evaluate is the code itself. Is that a fair statement?
Mark: I’d say that’s pretty much fair. You have two things, the code and bugs; those are the two things people talk about.
Richard: With something really complex like an operating system, it seems like you need some sort of overarching architectural view that you implement too. Is that true in open source projects?
Mark: I can only speak to the kernel, but I’d have to say it’s not true. They don’t really have an overarching architecture. It’s an evolving design. Some parts are good, some are not so good, but all are subject to change.
Richard: If someone’s going to put a new feature in, they say, "Well, this is the right place to hook the existing code," and then they do that and try to sell it?
Mark: Yeah, they do that, they try to sell it. A good example was the high-resolution timers that finally got into the Linux kernel. A lot of people have been wanting high-resolution timers in the Linux kernel for a good five years now. It’s been pushed back by various people who don’t want to change the complex pieces of code to support it.
Finally, some person of sufficient stature took interest in the problem and started working on it. After that, the process of implementing the high-resolution timers and getting them into the kernel followed a fairly methodical approach that was mapped out. It was a two-year process.
Richard: Then, do you get a microcosm of that, depending upon what someone’s adding, and depending upon their own preference for development methodologies?
Mark: That’s right, if the feature’s significantly complicated, and it ripples through a lot of other code, or has a number of dependencies. For instance, the timer has a lot of dependencies all around the kernel to make the timer sub-system.
First, you had the timer sub-system, and then you had to bolt in the high-resolution timers. And then you had to fiddle around with it some more, there was lots of testing, people had to verify that it worked on different architectures, and then finally it got in. It just depends on the feature.
Scott: I can see people getting excited about working on the kernel, or working on a patch, things like that. I mean, programmers like to write code.
But how are people motivated to do the things that aren’t fun? What motivates people to review somebody else’s code? What motivates somebody to build test suites for somebody else’s code? Or what motivates a developer to write good documentation when they’re done with the code, instead of just working on version two?
Mark: First off, no one’s motivated to write documentation. For reviewing the code, there’s motivation there, because when you find issues, that’s one of the ways you start developing a reputation on the mailing list.
If you want to grow a career in working with this community, one way to get into it is to start by finding issues, and working with whoever owns the issue to get the issue solved. There’s a career path actually doing that, so that’s some motivation.
Another motivation is that some people just like pointing out other people’s flaws, especially if you come from a notable company.
Scott: How deep does it go? Is it sort of a code review where you look for things in the code itself, or do people actually go off and build a test suite that really tests somebody else’s code, or API, or whatever?
Mark: I have not seen that done in the Linux kernel yet.
Richard: When people are checking the code on the mailing list, I assume they’re just tracing it mentally. They’re not plugging it in and compiling it, and running it, or are they?
Mark: You get a mixture. Some people do actually run it, and if it doesn’t compile, they’ll let you know. And other people just trace it mentally. The people that complain about the formatting, they just kind of trace through it. They don’t really even compile it, but other people do.
Scott: When people check in code, or put code out there for review, do they often put their own test suites and stuff out there alongside of it, or is it just the function of the code?
Mark: Sometimes they do. Sometimes they publish test results. For example, if they made a change to a performance-critical code path, they’ll provide benchmark results showing that they didn’t hurt anything.
Sometimes they’ll even post tests. It really depends on the feature. I’ve seen them do unit tests, and I’ve seen them not do it.
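As a rough illustration of the "benchmark results showing that they didn’t hurt anything" idea, the Python sketch below times a before and an after version of the same routine and first checks that both return the same answer. Everything in it is hypothetical: old_path and new_path stand in for the code before and after a change, and benchmark and report are names invented for this example. Real kernel benchmarks measure actual kernel code paths with very different tools.

```python
import timeit


def old_path(values):
    """Hypothetical 'before' version: sums with an explicit loop."""
    total = 0
    for v in values:
        total += v
    return total


def new_path(values):
    """Hypothetical 'after' version: same result via the built-in sum()."""
    return sum(values)


def benchmark(func, data, repeat=5, number=200):
    """Return the best per-call time in seconds over several repeats."""
    timings = timeit.repeat(lambda: func(data), repeat=repeat, number=number)
    return min(timings) / number


def report(data):
    """Check that both versions agree, then return timings and their ratio."""
    assert old_path(data) == new_path(data), "change altered behavior"
    before = benchmark(old_path, data)
    after = benchmark(new_path, data)
    return {"before_s": before, "after_s": after, "ratio": after / before}
```

Running report(list(range(1000))) returns the two timings and their ratio; a ratio near or below 1.0 would be the sort of number a submitter might quote alongside a patch. Taking the minimum over several repeats is a common way to reduce noise from other activity on the machine.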
Scott: Thanks. This has been really fantastic information.