How Software is Built

A blog forum to provide deep dive analysis and community conversations about software development models.

Interviewers: Scott Swigart and Sean Campbell

Interviewee: Bjorn Freeman-Benson

In this interview we talk with Bjorn Freeman-Benson of the Eclipse Foundation.

Scott Swigart: Bjorn, could you please introduce yourself and your work with the Eclipse Foundation?

Bjorn Freeman-Benson: Of course. I’m the Director of Committer Community for the Eclipse Foundation, which means that I’m responsible for making sure that the hundred-plus projects at the Eclipse Foundation follow open source rules of engagement and remain open and transparent.

The Eclipse Foundation itself (being a member-funded organization) has another whole side related to member company activities, marketing, and so on, but I’m not involved with that–I’m just concerned with the open source project side of things.

Scott: You mentioned that Eclipse has a hundred projects, and I think that a lot of people think of Eclipse as simply an IDE. Likewise, people tend to think of Apache as a web server, even though the Apache Foundation has 60 or so projects under the governance of their foundation.

Tell us a little bit more about the breadth of those Eclipse projects, as well as what a project has to do in order to become an Eclipse project, and even, what the benefits are once they do.

Bjorn: You’re right that, like Apache, we have the problem that when people hear Eclipse, they think of just the Java IDE. Of course, we started as a Java IDE–and a pretty good one, I might add–but over time, the things that happen at Eclipse have expanded.

We moved into C/C++, PHP, Fortran, Ada, and a whole bunch of languages. And then two or three years ago, a bunch of developers said, “Hey, you know, this pluggable structure that you’ve got for Eclipse is good for building more than just IDEs. You could also build rich client applications with this structure.” And so they started building client applications that weren’t IDEs at all, using the same infrastructure.

Next, some people said, “Why are we stuck with just client applications? This stuff works great on a server, too.” And so now there’s an aspect of Eclipse projects where some of them run strictly on the server side, but using the same plug-in mechanism and OSGi runtime underneath, and so on. All of that has led to a huge breadth of projects, although when some new people come to Eclipse, they still just want the Java IDE.

One other thing–at Eclipse, we say that we have a hundred projects, and Apache says they have 60, but we don’t really have more projects than Apache. We probably just segment them differently.

Scott: You mentioned a core pluggable infrastructure that people have found to be a good foundation for building a lot of different things. How is that structured, and how do projects that utilize it take advantage of the advances made to the core? How is all of that maintained?

Bjorn: There are really two aspects to the core. The first part, which is really not the part you’re mainly asking about, is the infrastructure that the Foundation supplies to the projects to help them run, like the source code system, Bugzilla, Wiki, all the download servers, and things like that.

Of course, we also supply tools on the server side that are nicely integrated with the Eclipse IDE, and since we all use the Eclipse IDE, that all works really well.

The other thing that the Foundation provides is a well-defined development process and IP policy. That’s an important consideration for people deciding whether to come to the Eclipse Foundation instead of going to Apache or storing their code at SourceForge, or Google Code, or something like that.

Many of the projects at Eclipse are sponsored by one corporate entity or another, and the basic idea behind Eclipse is that the projects build extensible frameworks on which the Eclipse members can build commercially successful products and services.

We’re not just building free software–we’re building an ecosystem that actually makes money for everybody in the long run. And then some of that money comes back around and sponsors the Foundation and pays for things like download servers and so on.

But in order to do that, the companies have to feel comfortable that they can adopt code without having to worry about issues like being sued for various IP issues. We spend a fair amount of time and money on making sure that we have clean IP. That means having a carefully defined process about how that IP gets into the projects, as well as a traceable trail for where everything came from.

That is ultimately one of the things that makes Eclipse what it is–carefully documented IP gives everyone confidence in grabbing code and using it.

But all the behind-the-scenes stuff is not what you probably were asking about. What you’re probably asking about is the core platform level of Eclipse, which was originally just the Eclipse IDE and is now the platform that other people build on top of.

The core platform at Eclipse consists of (depending how you measure) between three and six different Eclipse projects. They have a set of committers like any other open source project, and the main Eclipse projects have together come up with an annual release cycle. We originally called it a train, but now we name them after the moons of Jupiter. This year’s release is Ganymede.

This coordination of all our major projects on this yearly cycle is how the new innovations that get into the platform are consumed by the other projects. A series of milestones work their way up to that annual release, and as of milestone three, the platform says, “We’re going to put in this feature.”

That lets the C++ IDE put in that feature, and therefore, the modeling tools that consume both the platform and the C++ tools see that, as of M3, they can use a new API, and so they can also implement a new feature. Everybody coordinates that way, and we move our development to completion on this annual release cycle.

Scott: You mentioned that a lot of the projects have primary corporate sponsors, and of course, those sponsors might want to get stuff into the core infrastructure.

How are conflicts resolved? How does the architecture itself support it if somebody needs something, but not everybody agrees that it’s necessarily a good thing to put into the core?

Bjorn: Let’s say a project is sponsored by a corporate entity, so the people who are working on that project have sort of a dual allegiance, split between the Eclipse open source project and of course the entity that pays them and tells them that they need to do something else.

To date, we haven’t had a lot of conflicts, because the people who are elected to the Eclipse projects have been very careful about having a clear distinction between the IP that they’re putting into the Eclipse project versus things they’re working on for their corporate masters.

We haven’t had a lot of cases where a company wants to put in a feature just for themselves, that nobody else wants. The other type of problem is where somebody in the community wants a feature, but the corporate single-entity people who are working on it don’t put that feature in.

This has caused some conflicts in our core platform team, mostly in cases where people make single-solution requests. For example, I personally want something that fixes a font problem for my particular Linux installation, but the people who are working on the platform are very careful to put in only general solutions.

That’s where conflict tends to occur, when people who are asking for something don’t get it and become frustrated.

Scott: What other options are there? Can they ship a different version of the core with their particular project that’s basically a fork? Or does that not really work because of the way the core is designed?

Bjorn: One of the great things about the Eclipse platform’s original architecture is that everything was built as a plug-in. That means that if you want to ship a unique version for personal or corporate use, including a version of the whole thing that has a different version of just one of the plug-ins, the infrastructure supports that.

That’s definitely possible, and in fact, some of our member companies do that. It’s one of the things we identify as a strength–unlike some IDEs that were built as a single application and then plug-ins were sort of an afterthought, Eclipse is built based on plug-ins from the beginning.

In fact, the original Eclipse team worked this way inside IBM, before it was even spun off as a series of open source projects. They worked at, I believe, seven different physical locations in North America and Europe, and each location built plug-ins that worked with those from the other locations, so they had to agree on APIs.

The APIs that they built into the original platform were really rock solid, because they had to use them themselves. There was none of this back door scheming that you could get if you all worked at the same company and just said, oh well, build me this little thing because I need it.

As a result of those solid APIs, you can go and replace any one of the plug-ins with your own version, and it will work just like the original one plus your changes, and there’s no secret sauce that you have to learn to make that work. We’ve tried to push that same architecture through the other Eclipse projects, and for the most part I’d say we’re successful there.

It’s a great design. We’ve found over time (and Eclipse has been seven years in the public now) that the original plug-ins were a little bit too large to make this as common as you’d like it to be. In other words, conceiving of the environment as 88 plug-ins instead of one monolithic application is good, but maybe if we had 180 plug-ins instead, the granularity would have been small enough that people would have replaced them more often.

With the new Eclipse 4.0 effort, we’re looking to have the plug-ins a little bit more fine grained, so we can get a bit more of that mix and match behavior.
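
As an editorial aside, the replace-one-plug-in idea Bjorn describes can be illustrated with a toy registry. This is a minimal sketch under stated assumptions, not the Eclipse or OSGi API: `FontResolver`, `PluginRegistry`, and the `example.fonts` ID are hypothetical names invented for illustration. The only point is that callers depend on an agreed interface and an ID, so a replacement that honors the same interface can be dropped in.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical API contract that both the original and the replacement honor.
interface FontResolver {
    String resolve(String requestedFont);
}

// Hypothetical registry: callers look plug-ins up by ID, never by concrete class.
class PluginRegistry {
    private final Map<String, FontResolver> plugins = new HashMap<>();

    // Registering under an ID that is already taken replaces the earlier plug-in.
    void register(String id, FontResolver plugin) {
        plugins.put(id, plugin);
    }

    FontResolver lookup(String id) {
        FontResolver plugin = plugins.get(id);
        if (plugin == null) {
            throw new IllegalArgumentException("No plug-in registered for " + id);
        }
        return plugin;
    }
}

class PluginSwapDemo {
    public static void main(String[] args) {
        PluginRegistry registry = new PluginRegistry();
        // The stock plug-in passes the font name through unchanged.
        registry.register("example.fonts", name -> name);
        // A custom build swaps in its own version of just this one plug-in.
        registry.register("example.fonts",
                name -> name.equals("Sans") ? "Fallback Sans" : name);
        // Callers are unchanged; they still ask for the same ID.
        System.out.println(registry.lookup("example.fonts").resolve("Sans"));
    }
}
```

Because the caller only knows the ID and the interface, nothing else has to change when the second registration replaces the first, which is the property the solid APIs are meant to guarantee.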

Sean Campbell: Plug-ins are obviously one of the key ways that people extend and customize open source projects. On the other hand, with a development environment, I would think that the plug-in architecture needs to be concerned with how deeply integrated those plug-ins can be into the IDE, in terms of scenarios where one might crash another plug-in.

How do you track down the source of a problem for a developer who’s using tens of plug-ins that they’re updating on a fairly regular basis?

Bjorn: I don’t do that kind of troubleshooting on a daily basis, but I personally do use tens of plug-ins, and I don’t necessarily update them regularly. I’ve discovered, and this is just my personal style, that when things start to conflict, it’s actually fairly easy for me to figure out which plug-in is causing the problem and to disable it or to roll back to a previous version.

The other thing that goes along with that is that many of the successful plug-in authors build their applications as a set of plug-ins. For example, the Mylyn project, which is great stuff, builds not only their core plug-ins but also a bunch of adaptors that hook Mylyn functionality into other plug-ins that you may or may not have loaded.

Therefore, if you load Mylyn on top of the base Eclipse IDE, you get a certain set of adaptors that adapt Mylyn to the Java tooling, to the PDE tooling, and so on. If you also have the C/C++ tooling loaded when you load up Mylyn, you get the adaptor to that. If you had the Subversion tooling loaded, you’d get the adaptor to that.

All of that means you get a kind of customized version for the specific plug-ins that you’ve currently got. And that’s the way they segment out the code so there’s not just one big module that breaks when you upgrade.
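
As an editorial aside, the adaptor pattern described here, where an adaptor is only activated if the plug-in it hooks into is present, can be sketched roughly as follows. This is not Mylyn's actual code or the Eclipse plug-in mechanism; `AdaptorSelector` and every adaptor and plug-in name below are hypothetical, chosen only to mirror the Java, C/C++, and Subversion examples in the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AdaptorSelector {
    // Maps each adaptor to the plug-in it hooks into; insertion order is kept.
    private final Map<String, String> hostByAdaptor = new LinkedHashMap<>();

    void declare(String adaptor, String hostPlugin) {
        hostByAdaptor.put(adaptor, hostPlugin);
    }

    // Returns only the adaptors whose host plug-in is actually installed.
    List<String> activeAdaptors(Set<String> installedPlugins) {
        List<String> active = new ArrayList<>();
        for (Map.Entry<String, String> entry : hostByAdaptor.entrySet()) {
            if (installedPlugins.contains(entry.getValue())) {
                active.add(entry.getKey());
            }
        }
        return active;
    }
}

class AdaptorDemo {
    public static void main(String[] args) {
        AdaptorSelector selector = new AdaptorSelector();
        selector.declare("java-adaptor", "java-tooling");
        selector.declare("cpp-adaptor", "cpp-tooling");
        selector.declare("svn-adaptor", "subversion-tooling");

        // Only Java and C/C++ tooling are installed, so the Subversion adaptor stays off.
        Set<String> installed = new HashSet<>(Arrays.asList("java-tooling", "cpp-tooling"));
        System.out.println(selector.activeAdaptors(installed)); // [java-adaptor, cpp-adaptor]
    }
}
```

Keeping each adaptor separate from the core means an upgrade to one host plug-in can only affect the adaptor that targets it.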

We have found that time-based development works pretty well, but then, of course, it might be a self selection thing, where the people who come to join the club are those who like this strategy and those who don’t like it go away.

We’ve noticed some interesting things as a result of the time-based development method.

The first is that we’ve discovered that the forcing function of the schedule causes people to make constrained decisions that produce better code, in the end. Of course, if you let somebody have unconstrained, full creative control over everything, they often get stuck on the possibility that they could do anything at all.

We have found that it works very well if you put the constraint of six-week milestones and annual releases on people, telling them to do something great in a given area, but to make sure they have something to release every six weeks and a product by the end of the year.

That tends to drive more functionality into the projects than you might otherwise expect simply by reducing the set of all possibilities down to things that people can actually implement. So, that’s point one, and again, self selection may well be at work there.

Point two is that as Eclipse has become more and more popular and successful, other open source projects have started to align their schedules around ours.

The great story on this one is around the Ant team at Apache, which was at one time releasing after the Eclipse release. They would put out a new feature, but it wouldn’t be supported in Eclipse until something like 10 months later.

Eclipse and the Ant team got together and the Ant team said, “Well, it’s easier for us to change our schedule than for you guys to change your schedule. So, we’ll just move our releases to a place where you can adopt them.” And so, they moved their releases to something like February, which let Eclipse adopt the new features and support them in time for the annual release in June.

That was really a great collaboration. I mean, it was fantastic that the Ant team was able and willing to adapt their schedule like that, and as a result, it’s better for the users, and it’s better for both teams. It was just a matter of talking between the teams, without anyone coming out from on high and telling anyone else what to do.

Sean: As opposed to something like Firefox, which has a target audience of general end users, Eclipse is primarily used by developers. What do you find is different about the developer audience than some of those other audiences, and how does that influence what the different Eclipse projects do?

Bjorn: I would actually say it’s a lot more similar to Firefox than you think. It turns out that most Eclipse users–not the people who buy the products from the companies, but the ones who just go and get the free Eclipse products and use them–just want to download the bits.

They don’t want to contribute to the code base. They just want all their features. Maybe they are a little bit more technical when they provide bug reports, because they are developers themselves and so they can look at a stack trace and make comments, but for the most part, we’re not seeing that.

That could be the fault of the way that we have structured some of the Eclipse infrastructure. Consider an example. Right now, if you wanted to build your own Linux distro, it’s really easy. You get all the source code, you edit a few configuration files, you type make, and then you get your own Linux distro you can fiddle with.

Building your own Eclipse distro from source is currently not that easy. You could build one from all the plug-ins, but from raw source we don’t have it quite that easy. Maybe as a result, we’re not getting the feedback from our users that we would get if it were that easy.

Sean: On the other hand, I think it’s a certain sign of success when you don’t have to be a developer to use an open source product, because there’s an order of magnitude more people who are going to want to use a product that way, even though many of them are not in a position to contribute back.

Bjorn: That’s an excellent point, and it is also probably true that a lot of our users are writing applications that are complex within their own domains, such as financial services. That software may be very complex, but not in the same way that the Eclipse framework is.

In order to be built out of plug-ins on top of straight Java, with all that dynamic dispatch and so on, the Eclipse framework uses fairly complex APIs internally. Maybe there are people who don’t want to take the time to learn how to do that but are still perfectly competent programmers. They are users of Eclipse, but they wouldn’t be programmers of plug-ins.

Sean: You mentioned being very schedule driven, so if a feature is ready, and it can be incorporated, and the quality is high, it will make it into the release. Otherwise, they can just keep working on it for the next one. How do you define that cut? What’s the process for deciding whether it’s going to be in or not in?

Bjorn: Each of the projects does it slightly differently, but I’ll give you a generic explanation. The one-year release cycle consists of seven six-week milestones, plus five or so release-candidate builds. A few of the major milestones have particular labels attached to them, like feature complete, API freeze, and so forth.
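
As an editorial aside, the arithmetic of that cycle is easy to lay out: seven six-week milestones cover 42 weeks, which leaves roughly ten weeks of a one-year cycle for the release-candidate builds. The sketch below computes milestone dates from a cycle start date. The start date used in the demo is a hypothetical value picked for illustration, not an actual Eclipse schedule, and `MilestoneSchedule` is an invented name.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

class MilestoneSchedule {
    static final int MILESTONES = 7;          // "seven six-week milestones"
    static final int WEEKS_PER_MILESTONE = 6;

    // Returns the end date of each milestone, M1 through M7, in order.
    static List<LocalDate> milestoneDates(LocalDate cycleStart) {
        List<LocalDate> dates = new ArrayList<>();
        for (int m = 1; m <= MILESTONES; m++) {
            dates.add(cycleStart.plusWeeks((long) m * WEEKS_PER_MILESTONE));
        }
        return dates;
    }

    public static void main(String[] args) {
        // Hypothetical cycle start, assumed only for this example.
        LocalDate start = LocalDate.of(2007, 7, 2);
        List<LocalDate> dates = milestoneDates(start);
        for (int i = 0; i < dates.size(); i++) {
            System.out.println("M" + (i + 1) + ": " + dates.get(i));
        }
    }
}
```

With the assumed start of 2 July 2007, M1 falls on 13 August 2007 and M7 on 21 April 2008, leaving the remaining weeks before a late-June release for release candidates.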

As you are working your way toward getting your feature ready, you have to have your APIs ready before the API freeze date, or else you have to go and convince the project leadership that they should still incorporate them.

Projects have a series of stricter and stricter rules as we move through the final milestones and into each of the release candidates, in terms of what’s going to be allowed in or not. For example, early on, a committer can commit code, because everybody trusts that their colleagues are responsible and capable.

After a while, the rule is that they can commit code if they can convince another committer that it’s a good idea. Later in the cycle, it requires two committers plus the project lead. Then it’s two committers plus two project leads. Next, it’s two committers, plus two project leads, and one of the project management committee people. Finally, near the end of the release cycle, making any changes requires buy-off from the entire project management committee.

Having the rules get stricter and stricter as we get closer and closer to release is how we filter down the steady onslaught of an immense amount of work, into a final set of bits that has run through all the tests and is consistent and so on.
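
As an editorial aside, the escalating sign-off ladder Bjorn outlines can be written down as a small table of requirements per phase. This is a sketch under assumptions, not any Eclipse project's real tooling: the phase names are invented, each "committer" in the interview is counted as one approval, and the final phase is modeled as needing every member of the project management committee (PMC) and nothing else.

```java
// Hypothetical phase names; the sign-off counts follow the interview's description.
enum ReleasePhase {
    OPEN(0, 0, 0, false),           // any committer can commit
    PEER_APPROVAL(1, 0, 0, false),  // must convince another committer
    LEAD_APPROVAL(2, 1, 0, false),  // two committers plus the project lead
    TWO_LEADS(2, 2, 0, false),      // two committers plus two project leads
    PMC_MEMBER(2, 2, 1, false),     // ...plus one PMC member
    FULL_PMC(0, 0, 0, true);        // buy-off from the entire PMC

    private final int committers;
    private final int projectLeads;
    private final int pmcMembers;
    private final boolean needsWholePmc;

    ReleasePhase(int committers, int projectLeads, int pmcMembers, boolean needsWholePmc) {
        this.committers = committers;
        this.projectLeads = projectLeads;
        this.pmcMembers = pmcMembers;
        this.needsWholePmc = needsWholePmc;
    }

    // True if the collected approvals are enough to commit during this phase.
    boolean allows(int committerApprovals, int leadApprovals, int pmcApprovals, int pmcSize) {
        if (needsWholePmc) {
            return pmcSize > 0 && pmcApprovals >= pmcSize;
        }
        return committerApprovals >= committers
                && leadApprovals >= projectLeads
                && pmcApprovals >= pmcMembers;
    }
}

class ApprovalDemo {
    public static void main(String[] args) {
        // The same change (two committers, one lead) passes mid-cycle but not later.
        System.out.println(ReleasePhase.LEAD_APPROVAL.allows(2, 1, 0, 5)); // true
        System.out.println(ReleasePhase.TWO_LEADS.allows(2, 1, 0, 5));     // false
    }
}
```

Encoding the ladder as data makes the filtering effect visible: the same change needs progressively more people to agree as the release approaches.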

Scott: That’s interesting to compare with the way Microsoft builds software. When their process gets to a certain point, the product is supposed to be feature complete, and developers are just fixing bugs and things like that. They are required to tell a release committee about changes they are making to the code. Then it flips over to a mode where they actually have to get permission to make changes.

As a result, there is a point where the bar to even fixing certain bugs becomes really high. As anybody who has written code knows, any time you make a change, there’s a non-trivial chance that you are going to introduce other problems. Therefore, in the late stages of the process, they are actually weighing the severity of the bug, weighing the risk versus the benefit of going in and making any changes.

Do you run into the same kind of thing? What are some of the criteria that those approvers you mentioned use to decide whether a given change is worthwhile at a certain stage?

Bjorn: One of the interesting things about Eclipse is that each of the projects is amazingly independent. Even though I and some other senior people in the Eclipse organization try to provide guidance to the projects, we actually don’t have that many set rules about how they have to make decisions.

Really, what I’m telling you is that the successful Eclipse projects adopt rules like this. We have found that we should advise projects when they start to adopt rules like this. We tell them that if they don’t do things like having an increasingly difficult set of hurdles that contributors have to get over to get their code in, they are likely to end up with buggy, crappy code. We’ve been around the block a few times, and we’ve seen that happen. But again, there’s not one strict set of rules.

For instance, the Test and Performance Tools Platform project has a very strict set of rules for introducing bug fixes after a certain date.

They even have a little form that they require you to fill out that justifies your fix, and you have to answer a whole bunch of questions. You have to mail it off to the project management committee, and they either approve it or they disapprove it.

Other projects may not have a form, but they do still have a set of checkpoints, so if you want to change after a certain time, you have to go through certain hoops, like talking to the project lead, having someone else sign off on your code change, and so on.

Some of the more junior projects are much weaker about it. They just check code in, and as a result, the quality of their code overall is not as high.

Some sort of model like this, including the one you mentioned at Microsoft, is essential if you want to release anything that’s stable. Whatever model you choose, at some point you have to stop writing code.

If you have a bunch of people involved, you have to have some process by which people agree to stop writing code. That can’t be as simple as saying that everybody stops on Friday and we ship it on Monday, because you know there are going to be bugs in there.

You have to have some process, and the process that we’ve adopted around Eclipse projects is one where we incrementally make it more difficult to put code in, so you have to justify more and more about why it’s a good idea to put it in.

Scott: I think we’re nearing the end of our time, so is there anything else you’d like to add to the interview before we button it up?

Bjorn: I actually brought it up already, but I’d like to reiterate a few points about the granularity of building plug-ins. Every project, whether it’s open source or closed source, has to choose a different granularity. I think it’s an interesting point as we continue to evolve, to consider how to build software of the right granularity to encourage people to contribute to an open source project without increasing the complexity of the code.

Because, of course, the more plug-ins you have, or the more modules, or whatever you want to call them, the more APIs you have and therefore the more points of discussion you have and the slower your development is going to be. If you built just one big monolithic application with one guy, you could crank it out faster than you could with 88 or 100 different plug-ins.

But on the other hand, the more monolithic it is, the less likely it is for someone to help you and contribute. There’s an ideal point somewhere in there for us to continue to home in on.

Scott: Thanks, Bjorn. I appreciate your time.

Bjorn: Thank you.

Posted by campsean on Friday, June 27th, 2008

