In this interview, we chat with Jason van Zyl about the Apache Maven project. Maven is designed for build automation. Unlike well-known predecessors such as Make, Maven uses a project object model (POM) to describe the software being built.
- Building limitations into Maven that enhance consistency and efficiency
- Encouraging a paradigm shift in place of incremental change with new tools
- Fostering collaboration, even among competing organizations
- Avoiding the temptation of trying to fill every role at a young company
- Bridging the gap between various programming languages with future versions of Maven
Scott Swigart: To start, could you take a minute to introduce yourself?
Jason van Zyl: I come from a physics background, but I’ve been developing software for almost 20 years. When I was working in printing and publishing in New York, about 14 years ago, I started getting into open source, and I’ve been making my living from open source for almost 15 years now. It’s culminated in working on Maven and turning everything Maven-related into a product line, and that’s where I am today.
I’ve been part of the Apache Software Foundation for almost 10 years now, and I started the Apache Maven project in 2001. The first version was released in 2002, and then Maven 2 was released in 2005. In 2007, I started Sonatype to help invest in the community. In 2008, we became a product-based company and got venture funding.
Scott: My understanding of Maven is that it’s a build system. Talk a little bit about Maven, what it does well, and your thinking about the need to build a better mousetrap in this area.
Jason: To start, I would say that the largest competitors to Maven are large ad hoc systems that are created inside organizations. The primary difference between Maven and something like Ant or Make is that, while those are great tools, there’s no pattern or structure to how you actually use them.
Maven goes a step further, by bringing standards and patterns to a build system: most organizations tend to fall into a set of patterns for developing applications. For .NET, there are standard patterns from Microsoft for building applications. If you’re using Java, there are standard J2EE patterns and there are even design patterns for creating applications.
Maven takes that concept and applies it at the development infrastructure level. As a result, if you were to look at 100 projects built with Make or 100 projects built with Ant, you would find that they are very different. If you look at 100 Maven projects, they’re almost entirely identical in structure. While there might be differences in how they are customized, at a very basic level, if you know how to build one Maven project, you can build all Maven projects. Look at 100 different Ant projects, and each project is a unique “creation”. Over time, these 100 Ant projects will diverge from one another.
Scott: Talk a little about how that plays out, in terms of value in day-to-day operations.
Jason: If you were to take someone off the street without any experience and show them your Maven build internally, 99 times out of 100, they’d be able to build it on the first try.
Maven’s approach addresses some of the problems that came from the explosion of tools that were used in development infrastructure. It had become very hard, even within the same organization, to have a standard, cohesive structure shared among all projects. Even within a single company, you would have inefficiencies. Differences would start to emerge over time with one group using one project structure and build tool and another group using a different structure.
If you expand the perspective, and look at collaboration between different organizations, these inefficiencies become even more apparent. We have more than 2,000 very large clients who talk to one another as part of our technical advisory board. They use the same terms about how to build, test, and deploy software, because Maven provides a platform upon which other tools can build.
It’s impossible to build a platform on top of an ad hoc structure, whereas it’s very easy to build a platform on top of something that has a consistent model. That’s where Maven is different from most other build frameworks or build tools. It has a standard set of lifecycle phases that it passes through. It isn’t just a random assortment of tools; it is an entire factory, and our products and approach provide an end-to-end solution for software engineering.
Every component that is made with Maven is made in the same way, whether it’s a .NET assembly, a .jar file, or a .war file. You step through the same phases of compiling, testing, gathering resources, doing integration testing, and installing or deploying into a repository.
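As a rough illustration of the fixed phase ordering described above, here is a minimal Java sketch. It is not Maven's own code: the class name `LifecycleSketch` and the method `phasesUpTo` are hypothetical, and the enum lists only a simplified subset of the phases in Maven's default lifecycle. The point it tries to show is that asking for a later phase implies running every earlier phase in the same order, whatever the artifact type.

```java
import java.util.ArrayList;
import java.util.List;

public class LifecycleSketch {
    // Simplified subset of Maven's default lifecycle phases, in execution order.
    // The real lifecycle contains additional intermediate phases.
    enum Phase { VALIDATE, COMPILE, TEST, PACKAGE, INTEGRATION_TEST, VERIFY, INSTALL, DEPLOY }

    // Hypothetical helper: returns every phase that runs when 'target' is requested.
    static List<Phase> phasesUpTo(Phase target) {
        List<Phase> plan = new ArrayList<>();
        for (Phase p : Phase.values()) {
            plan.add(p);
            if (p == target) {
                break; // stop once the requested phase has been included
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        // Requesting PACKAGE implies validate, compile, and test first.
        System.out.println(phasesUpTo(Phase.PACKAGE));
    }
}
```

Because the ordering lives in one place rather than in each project's build script, every project that follows it steps through the same sequence.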
Many of the core ideas for Maven come from “A Pattern Language” by Christopher Alexander. He was an architect who tried to capture the patterns that made cities productive – patterns that enabled a healthy city infrastructure, where people can start businesses, communicate efficiently, and so on. I took Alexander’s ideas and applied them to questions about how you construct software and how you develop at an infrastructure level.
When I arrived at Apache 10 years ago, I worked on five or six different projects, and we were finding that people weren’t volunteering to build the software. They would show up, and then they couldn’t build it. We kept losing volunteers, because they didn’t want to spend their time messing around with the build. They wanted to devote their energy to actually creating code.
That’s when I took the version of Maven that I had had for a couple of years before that and checked it into the Apache Software Foundation. It started in the Turbine project, which was probably the first Java framework that was ever created, long before Struts. It graduated into its own top-level project at Apache in 2002.
Scott: The first time I bumped into design patterns, which seem to be at the core of what you were talking about, I was at a software conference down in San Francisco, and one of the Gang of Four was giving a presentation on design patterns.
He was talking about how they were sitting around lamenting the notion that Smalltalk developers seem to solve the same problem the same way, because there were design patterns baked into the language, whereas C++ developers might solve the same problems in a whole bunch of different (and often horrific) ways, because they could go off in any direction they want.
As a result, there was a big push about factoring problems in terms of the patterns that you were looking at. It sounds like that’s really entrenched in Maven: that when you’re building software, there are certain patterns that repeatedly arise, and if they’re codified in the build system, then the same things are done the same way, even across a wide group of people who never talk to each other.
Jason: Maven is opinionated, and I designed it to be this way. I don’t think it’s possible to have a tool work for such a large number of people unless you restrict some of the options and provide some guidelines.
Some people at first railed against Maven conventions and the fact that you have to do things a certain way, but if you follow those patterns, you generally can get a software project up and running in a day.
That’s evidenced by the traffic from Maven Central, which has become the de facto standard central repository for Maven artifacts in the open source community. We get on the order of 250 or 300 million hits a month against it. In 2008, we had about 1.9 million unique visitors to Maven Central. In 2009, we had almost four million.
So, even though Maven’s opinionated, it seems to work for a lot of people. Even though you’re restricted in some of the things you do, it provides more value by limiting what you can do than it would in providing ultimate freedom to give you as much rope as you want.
Cities that are designed well usually have structural patterns in common. Look at cities that don’t have much infrastructure or don’t copy other cities. Lots of cities in North America do that, and they have horrible infrastructures. A lot of developers and organizations tend not to look at what’s happened in the past, and they often repeat mistakes over and over again.
We have worked with very large organizations to address this issue. For example, Intuit, SAP, and Cisco work openly with us, and they’ve adopted Maven. We’ve actually had people from these organizations talk to one another about what their infrastructures look like, and really, they’re not all that different when it comes down to it.
We’ve been able to collect a lot of large companies because they want consistent patterns. They want to be able to hire resources off the street. They want to be able to move their developers from project to project without forcing them to relearn things over and over again.
That’s the tradeoff for what’s perceived at first as a limited set of freedoms. You actually get a whole tool chain that can help you produce a piece of software from end to end, and you can generally get a team up and running and have continuous integration, testing, and deployment all working in a few days.
Scott: The challenge is always when there’s a paradigm shift, because you end up with people who are simply trying to use the new tool that is designed to operate under a different paradigm, and they’re trying to use it the old way.
People try to use C++ without embracing the object-oriented model, or they move from a pointer-based style to a garbage-collected kind of development. If someone is coming from Ant, Make, or some home-brewed build system into Maven, I assume it’s challenging for them to step back and take a few days and learn how to work within its flow, instead of in spite of it.
It sounds like you have found a way to help people be successful at that. The shoulder on the learning curve is low enough that people are willingly adopting it, which helps to create its own momentum.
Jason: We recommend that people approach Maven initially with a smaller project to understand the basic principles. People do run into problems when, as you say, they try to wedge Maven into an old model.
That’s why, if someone’s first attempt with Maven is a conversion of another project, they’re generally not successful.
Scott: There’s almost a pattern around learning a new paradigm. I remember when .NET came out, a lot of organizations tried to learn the new technology by taking their largest, most complex piece of Java software and porting it to .NET.
That’s actually the worst way to go about learning the new technology. The best way is to choose a new, low-priority project and build it from scratch. Doing that a number of times before taking on a bigger project will save a lot of heartache and make the whole experience more successful.
Jason: We’ve adopted certain strategies with the larger companies we work with. You can’t do a big-bang conversion on an organization that has 4,000 developers; it’s just not possible. Instead, we’ve developed a pattern over time to encompass how projects using different build systems can interoperate at the Maven repository level.
So, even if you’re not building with Maven, as long as your project can publish artifacts to a Maven repository and other projects can consume those artifacts from the Maven repository, you can make it work. There are Ant tasks for dealing with Maven, as well as Ivy, and there are other build systems like Gradle and Buildr that all talk to Maven repositories.
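To make the idea of interoperating at the repository level a little more concrete, here is a minimal Java sketch of how artifact coordinates map to a path in the standard Maven 2 repository layout. The class name `RepoPathSketch`, the method `artifactPath`, and the coordinates `com.example:billing-core:1.2.0` are hypothetical, chosen only for illustration, and the sketch ignores classifiers and snapshot timestamps. Any build tool that writes files to paths of this shape and reads them back is, in this simplified view, speaking the same language as Maven.

```java
public class RepoPathSketch {
    // Hypothetical helper: maps groupId, artifactId, version, and packaging
    // to a relative path in the standard Maven 2 repository layout.
    // Classifiers and snapshot timestamps are deliberately left out.
    static String artifactPath(String groupId, String artifactId, String version, String packaging) {
        return groupId.replace('.', '/')          // dots in the group become directories
                + "/" + artifactId
                + "/" + version
                + "/" + artifactId + "-" + version + "." + packaging;
    }

    public static void main(String[] args) {
        // Illustrative coordinates only; prints
        // com/example/billing-core/1.2.0/billing-core-1.2.0.jar
        System.out.println(artifactPath("com.example", "billing-core", "1.2.0", "jar"));
    }
}
```

Since the location is derived purely from the coordinates, a project built with Ant, Ivy, Gradle, or Buildr can publish and consume artifacts without the consumer needing to know which tool produced them.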
Usually, the companies that we go into are interested in how components are being promoted around various systems. For example, how does a development team “promote” a particular build to a release repository, and how does development interact with quality assurance and an operations team that is responsible for supporting an application in production? These organizations tend to move slowly: they leave projects alone that are delivering, and they publish and consume to Maven repositories.
We try to help companies adopt that strategy. Even if they don’t adopt Maven as a build tool at first, they can still immediately benefit from repository management.
Scott: I see more of that sort of model in the business world, and I think that’s a good thing in general. Instead of organizations dictating to others, they adopt a stance more like they’re offering a service, even though there’s also an element of competition to it. After all, they’re competing with the existing build system.
I think it’s interesting to see different parts of a company organized around service delivery. There may still be lots of fiefdoms of power, but there’s also an impetus toward cooperative interaction.
Jason: There’s often a consistent solution, but generally what happens is a pattern. Someone likes the idea of Maven, and they want a common reporting and repository infrastructure. They will often become a spearhead within the organization.
People see how they build their projects, and almost every continuous integration tool and every testing framework integrates with it, and you get all of this at a very low cost. You have to learn something, but essentially, every other organization in the world can contribute to Maven, and thousands of corporations have contributed.
Eventually, people within the organization see this, and they set up a small Maven build. Once it starts within an organization, it usually spreads like wildfire.
Scott: You also mentioned that there’s the opportunity for a lot of cross-company collaboration. There’s an opportunity for Cisco to collaborate with SAP, in terms of using Maven, because in general, companies don’t see the build system itself as a competitive differentiator, even though the thing they’re building might be.
I see that as a prime area for open source to fly. It’s kind of like in the software-as-a-service model, where the intellectual property is their competitive advantage, but the underlying web server isn’t, whether it’s Apache or IIS or whatever.
It’s the code that they’re running on top of it that differentiates the companies, and so Maven makes perfect sense. Like you said, most companies don’t view their make files or their Ant files as the differentiator.
Still, I’m sure you’ve run into some companies that are resistant to that idea. I remember working with a company that had built its own remote object protocol; they reinvented COM objects and that sort of thing.
They really felt like it made sense for them to maintain this large, proprietary system of plumbing. They couldn’t hire anybody off the street who knew it, of course, but certain people within the company were so invested in it that they were really reluctant to let it go.
Do you run into that? You mentioned that it’s common for companies to have their own home-grown build systems. Talk a little bit about the politics involved there.
Jason: We’ve gone into organizations where someone asked for our help, and we immediately get into arguments with others who are very resistant to our being there. It’s one of the last frontiers in an organization, because there are existing patterns for almost every other task. A lot of times, with J2EE applications, for example, there aren’t too many other tasks that vary. A lot of good engineers like making infrastructure, because it’s hard to do. They often get very attached to the systems they create, and I’ve been in places where people think of them as their babies. It’s really hard to dislodge them.
In some places, it’s taken a year to get through all of the politics, but we try not to be antagonistic. We tell them that they can integrate into the process slowly, and we try to show them the benefits. We often try to convince them that contributing to Maven is a better avenue for working on this stuff than trying to keep their own system around. There’s generally resistance at first, but there are good, logically sound arguments for that approach. If people are going to spend their energy developing infrastructure, they’ll probably like having people using the things they make.
Ultimately, if you contribute to Maven, you’re going to have two million people using your stuff, as opposed to maybe a hundred people using an internal system. We often say that writing build tools is fun, until people other than your friends start to use it. Then it becomes a drag, because you end up having to support 1,000 people.
One problem that we run into in almost all of these organizations is that, if your build system doesn’t support the functionality required to code something in your application, you wind up having to change your build infrastructure at the same time you change your application. If both of those things change at the same time, and something goes wrong, it’s just a big finger pointing session. One of the points we often make is that by using a framework that many other people are also using, if something goes wrong in your application and tests fail while the build infrastructure remains the same, you can be pretty certain that it’s your application that’s wrong.
The cost of maintaining your own build infrastructure can be substantial. One company we work with has estimated conservatively that over five years, failed projects and lost developer productivity because of poorly executed infrastructure have cost them $400 million USD. Now that is real money, and I think companies are starting to appreciate not just the dollar cost of maintaining custom development infrastructure but the lost opportunity for collaborating with a large community using a standard process.
Scott: That does make sense when you consider that code is inventory, and as in the retail business, it’s best to minimize the amount of inventory you keep around, because you always incur costs just from having it around.
The ideal for a development organization is to maintain only the code that is core to your business. The build system just isn’t the type of code you want to maintain, because it simply amounts to overhead.
Jason: A lot of people think it looks easy, but I would draw an analogy to how, as a developer, I used to regard sales people. It looked simple enough to me, until I knew more about what their job actually entails, and it turns out that it’s a very hard, complex job.
Some developers think that, in terms of infrastructure, they can just whack something out that’s good enough to meet their needs in a day or two. As soon as they start needing other features, they see that it’s just as complicated a task as writing an application.
People often underestimate this complexity and really get themselves into trouble, especially because the patterns for developing infrastructure are not the same as for developing an application.
In a lot of large organizations, it’s not developers who run the builds. It’s done by people who have a lot of scripting experience or come from a configuration management background. They’re aggregating a lot of components that come in from different places, making sure configurations work, and validating deployments, and it’s just not the same type of thing.
This is where we’re almost coming full circle, now back to a lot of these dynamic language based build tools. They’re using Groovy and Ruby, and a build system created by a developer ends up being a 4,000 line Ruby script, which doesn’t really help anybody. You have to continuously invest in maintaining this 4,000 line Ruby script if that is the path you choose.
I would point anyone interested in creating build systems to read a little bit of history about designing cities. These things all collapse when you don’t invest in them, because infrastructure requires investment and conscious planning if you want it to work. Fully healthy, functioning infrastructures do not just emerge. They have large degrees of entropy, and you have to take care of them.
Scott: Let’s switch over for a second and talk about the Sonatype company. You spend a certain amount of time consulting with Fortune 500s who, like you said, may have 40 different build systems. There’s a lot of opportunity for building efficiency there.
It sounds like a lot of what you do is to coach those companies through adopting things like Maven. But talk a little more about Sonatype and what it does on a day-to-day basis.
Jason: Prior to 2008, we did a lot of consulting and training. When Brian Fox, who is the first employee of Sonatype, and I decided we wanted to pursue making Maven-based products, we soon realized it was very hard to actually do system training at the same time as trying to develop the products.
In 2008, we took venture capital, and we moved away from consulting at that point. We still do strategic consulting for the largest companies in the world, who basically help us develop the software for others. If the software works for Cisco, then we know it’s going to work for other people.
But we soon turned into a product company. We invest a lot of our time, energy, and money on the open source side, including the Maven project itself; m2eclipse, which is our Maven integration for the Eclipse platform; and Nexus, which is our Maven repository manager.
The strategy for the company is that we have an essentially open core, so we support all the open source versions of these tools. On top of that, we have a commercial stack of products that we’ve been selling for two years, and that’s what we actually make our money on.
We have commercial versions of Nexus, like Nexus Professional, and we have the commercial version of our Eclipse integration called Maven Studio for Eclipse. So, we are primarily in the business now of shepherding the open source projects and creating commercial extensions to them.
We have a lot of consulting partners now, and we funnel 99 percent of the consulting work we get asked for onto our partners. We’re very focused on keeping the open source projects healthy and then developing products on top of them.
Scott: In a lot of open source projects, things get built because there’s somebody who’s scratched their own itch. They developed something in open source that was novel and had real benefits over the things that came before it.
When functionality like that starts to develop a following, the people who wrote some of the initial code often wrap a company around it. That provides some stability for the project, because if the company’s profitable, you can bet that the underlying project is going to get development effort.
At the same time, there are a lot of different skills required to run and grow a business that are not necessarily natural to the software engineer. What do you say to the 20 year old out there in college right now who’s got some open source code they are thinking of growing a business around?
Jason: I would say to find someone who has a lot of experience, which I happened to do by chance. I took funding from Hummer Winblad, solely because I met a guy named Will Price, who is now the CEO of Widgetbox.
I was lucky that I found him. He gave me a tremendous amount of advice related to starting and structuring a business. He told me that at some point in my life, I may be able to run the company by myself, but that I should not make the mistake of thinking that I understood what I needed to know initially.
If you’re good at technology, you probably can learn how to run a technology company, but if you’re going to start a business, you need to find someone who understands how to grow it and how to have it function at a management level.
We are just at the point now where we’re hiring a new CEO. We hired a great board that has helped me manage the company to this point. Really, a key requirement is figuring out what you’re not good at, and most technical people are not good at running companies.
Having an under-appreciation for the other parts of the organization is the easiest way to run it into the ground. We were very lucky that our board of directors found great salespeople and marketing people for us.
I learned to defer to those people in situations where I don’t have any experience. Recognize where you have no experience, and don’t try to think you can learn it on the fly, because you can’t.
Scott: In my experience growing a business, there’s the tendency to want to guard your baby, since you as the founder have put your heart and soul into the company. I’ve also noticed some small business owners who are afraid to hire somebody smarter than them.
They feel like they need to be the expert at the technology and at understanding the customers, and they may extend that to think they need to be the expert at employee benefits, employee agreements, and running projects.
They’re a little bit afraid to hire a project manager who’s smarter than them, or a sales guy who might make more than the business owner by generating an enormous amount of sales revenue. Many people, especially at the beginning, lack the perspective that it’s not a detriment to the business if you hire a sales guy who is way better than you ever imagined. [laughs]
Jason: We’ve been lucky there. I’ve had mentors at VCs who are great. We have Mark Gorenberg from Hummer Winblad on our board, and Paul Levine, who’s from Morgenthaler and was the guy who started Atria Software.
They’ve been great mentors, and they have always said that you need to find the right people for the work that needs to be done. Nothing significant is accomplished by one person; it’s always a team.
They drilled that into me, although it didn’t take long, since I had been working with open source for a long time. Having a successful community and project takes hundreds of contributors, and it wasn’t that far of a stretch to use that analogy in a business.
Still, I was categorically lucky to have these people on the board who have lifetimes of experience. That’s the single most important piece of advice I would give anyone starting a company: don’t assume you know anything, and look for people who can help you almost immediately if you want your company to work.
Scott: If you’re an engineering person, another thing to realize is that you don’t necessarily want to do some of the jobs that are needed to run a business well. You have to ask yourself whether you want to spend your time thinking about how you could do automated deployments to Amazon’s EC2 cluster, or whether you want to review a new version of the 35-page employee handbook.
There are people who really enjoy that and are good at that, so get them to free you up to do the things on the engineering side that are really going to move the company forward and that, frankly, you may like a whole lot better.
Jason: You also have to consider that doing things outside your sphere of expertise means that it will take you far longer than someone who specializes in that area, and the result will probably not be as good, either.
Small business owners need to learn that pretty quickly, or they put their companies in jeopardy. You have to be very cognizant of cash burn when you’re starting a company, and if you pay attention, it immediately becomes apparent that there are lots of things you shouldn’t be spending your time doing.
Scott: Well, we’re coming up on the hour, so I’d like to hand the microphone to you and ask what interesting things are in the future of Maven and Sonatype?
Jason: The polyglot Maven project is an attempt to enable people to use dynamic languages with Maven. Normally, you would use an XML file to describe your Maven project, but we have some very interesting collaborations with people like Charles Nutter from the JRuby project. About a month ago, we took Maven Central, which had about 100,000 data artifacts in it, and we did something in our Nexus project that dynamically turns a Maven repository into a RubyGems repository. So, we turned Maven Central into the largest RubyGems repository on the face of the planet.
That was fairly interesting work, in terms of allowing JRuby developers writing applications in Ruby for the JVM to leverage every Java library that exists in Maven Central. We’re trying to work on some of these polyglot approaches to bridge the gap between people using Ruby, Clojure, and Scala, so that people writing applications in these languages can benefit from Maven and repository managers like Nexus. We’ve created a facility where they can use their favorite language as a front end to Maven.
We’re also working a lot on holistic solutions. We plan to release a stack based on Maven, Eclipse, Nexus, and Hudson. Most of our clients ask us for end-to-end solutions. They don’t want point solutions, and if they’re paying us for support, if something goes wrong in a part of the system, they want help with the whole system.
We’re actually going to be releasing a commercial development-infrastructure stack, and that’s really exciting for us. That’s being driven almost entirely by the needs of our clients. Some of the largest companies you can think of are helping us drive the features that are being put into the stack.
Maven 3 is about a month away from being released. We’ve gone to great lengths to make it fully backward compatible, which will allow users to easily pick up new functionality if they need it. There’s a lot of stuff going on.
Scott: Awesome. Thanks for talking today.
Jason: Great. Thanks a lot, Scott.