Interviewee: Dries Buytaert
In this interview we talk with Dries. Specifically, we talk about:
- Building specialized commercial support for open source technology
- Making cloud computing more viable for widespread use
- The relationship between Drupal and Linux distros
- Lowering the barriers to the deployment of Drupal
- Integrating multiple functionalities and components into one solution
Scott Swigart: Dries, could you lay out some of your background for us, including your experience with Drupal and the new work you’ve been doing with Acquia?
Dries Buytaert: Sure. I’m based in Belgium, and I am the founder and project lead of the Drupal project.
Drupal is an open source content management system that combines traditional features like workflow and versioning with social media features, like blogging, forums, RSS feeds, tagging, and wikis. That combination, I think, makes Drupal a unique platform.
We estimate that there are more than a quarter million Drupal sites out there today. Some of these are operated by large organizations like Sony, Forbes, Warner Brothers, and Amnesty International, but many are also operated by individuals.
One of the goals of the Drupal project is to make it very easy for anyone to express themselves online, to get conversations going, and to connect with people.
I started Drupal in 2001, so we have been around for about seven years, during which time our growth has been rather organic. Today, we have more than two and a half million downloads of Drupal Core, which is the main version of Drupal. We have a couple thousand people contributing code, and many more people contributing with support, documentation, and things like that.
More than 700 people contributed to Drupal 6, which is our current main version of the core, and in addition, we have two to three thousand modules that have been contributed by people in the community. It’s a large community of active developers.
I’ve always done Drupal as a hobby, but around a year ago, I co-founded Acquia, which is a venture-backed startup whose purpose is to provide commercial-grade support to Drupal users. The company is based in Boston, U.S., and it consists of about 25 people now.
We’re building a couple of things; the first is a Drupal distribution that we call Acquia Drupal, which consists of Drupal Core and a selection of some of the best modules from Drupal. The second thing we’re building is called the Acquia network, which is a set of electronically delivered services that help you with the operation and maintenance of your site, as well as access to our technical support team.
Scott: I’ve heard the direction Acquia is heading compared to what Red Hat did with Linux, in terms of providing commercial support and enterprise capability for Drupal. Is that an accurate representation?
Dries: Yes, it is true that we are trying to be for the Drupal community what Red Hat was for the Linux community in the sense that both companies offer a suite of services to software projects developed by organic open source communities. Another class of commercial open source vendors could be described as non-organic. They are less focused on the community development model, but instead use open source software licensing as a component of their business strategy–to upsell proprietary versions or SaaS offerings, for example.
Scott: What do you think are the strengths and weaknesses of each approach? What is the significance of the distinction between arising out of a long running open source effort the way Ubuntu did with Debian, or Red Hat with Linux, as opposed to just having open source as part of a strategy?
Dries: To make sure the terms are clear, when I say non-organic open source, I am referring to a case where the majority of the software is built by employees of a company, rather than developed externally by community members. This isn’t to say that these vendors don’t have communities–they just don’t depend on the community for their primary software development efforts.
As a result, I think organic communities generally tend to be bigger, largely because of the tendency to remove some of the control around the project. For example, Acquia is not in charge of Drupal–the Drupal community is. I think because of that, the project might have the opportunity to attract more people.
I think the big advantage of an organic open source community like the Drupal community is that it’s very active and extremely passionate. As I mentioned, we have literally thousands of people contributing code changes to the project.
Scott: So that contrasts with something like MySQL, which was very non-organic, with almost all of the code written by MySQL employees. The company basically controlled the direction of it, and while it developed a very good community of users, that community did not control how the code and features should evolve.
The big switch they made when version five came out to add a lot of the "enterprise" features is interesting, and now, there’s a fork into Drizzle. It still remains to be seen how it takes off, but it looks to some degree as if Drizzle represents the community reclaiming MySQL, although obviously, that fork was started by what are now Sun employees.
Still, there’s the notion of opening it up so people can really contribute code and invest in the future direction of the database.
Of course, both models can be successful–both MySQL and Drupal are doing quite well. In terms of having a community that’s really invested in the software and feels like they have control over the direction, though, it seems like it’s a little harder for a project to go off in the wrong direction if thousands of people with different viewpoints are actively contributing to it.
Dries: I agree. As I said, there are thousands of modules in the Drupal community, and each of these contributed modules is like a small open source project on its own. Some are maintained by a single person, while others are maintained by a team of maybe five people or more, but each of these projects has its own road map and release cycles. In other words, there are a lot of moving pieces in the Drupal community, and they are all moving at their own pace and in their own direction.
Part of the value that Acquia provides is very similar to Red Hat–we help manage that complexity for our customers. They no longer have to keep track of all the moving parts; we track and improve the stability, security, and scalability of their modules, we do integration testing to make sure they work well with other modules, we provide configuration support, module selection advice, and so on. In other words, we make sure that our customers are set up for long-term success with Drupal.
In a non-organic open source community, that’s often different, because the software is controlled by a single company. They tend to use different models, such as having a community version and an enterprise version, for example, which may have differences in terms of features or stability.
Scott: It seems that what you guys put out is basically a distro where you’ve done the work to determine which version of which modules are ready for general inclusion, in much the same way as a Linux vendor decides what’s going to be packaged in with their distro. They make the decisions about what’s stable and ready to ship across all those thousands of modules.
Sean Campbell: Let me ask you a question in another area. What do you think are the lessons learned, and what do you think about Red Hat’s current direction? For example, bringing in Whitehurst as the CEO surprised a lot of people. Notwithstanding the fact that he’s obviously a believer in open source, airline executive to open source company head wasn’t necessarily what people expected.
More generally, what do you think of their recent evolution in different technology areas, such as the way they’re moving to the cloud and going after the small-to-medium business market?
Dries: Honestly, I’m not really following Red Hat that closely, but I do pick up their news statements and so forth, and one of the things they recently launched, which I’m excited about, is a cloud-based version of their offering.
That’s pretty interesting, in the sense that from Drupal’s point of view, we see a lot of people that want to move to the cloud, but it’s still not trivial. You still have to do all the work like set up load balancers, MySQL servers, master/slave modes, and web servers, and you have to get PHP working.
As a next step, what I would really like to see them do is move up a level in the stack, meaning that they would help with actually setting up the MySQL servers, installing the load balancers, and doing all the rest of the hard work. I don’t know if that’s in Red Hat’s roadmap, to be honest, but it’s probably more in Amazon’s roadmap than it is in Red Hat’s. The more vendors move to the cloud, the more likely it might be to happen, which is the exciting part for me.
Sean: We’ve talked to some other cloud providers who have expressed the same thing. The consensus seems to be that what Amazon provides is a fairly bare-metal environment that you can do whatever you want in, but a lot of people are going to want a lot more stuff preconfigured, supported, and backed up–kind of a ready-to-run environment.
I don’t know if it’s on Amazon’s roadmap, but there are certainly a lot of other providers looking at providing that layer of services on top of the cloud.
Dries: That’s true, and bear in mind that cloud computing doesn’t make it easier to scale a Drupal site; all it does is make it easier for people to add capacity. It’s easy to spin up a new instance and to distribute the load, but that doesn’t actually remove all the configuration work that it takes to make it scale.
Scott: Right. You would want the cloud providers essentially to centralize the knowledge about how to do those things, so that every customer of the cloud provider doesn’t have to figure those best practices out for themselves. It certainly seems that there is a market opportunity there for people to provide more than just bare-metal services.
Dries: I agree.
Sean: To set aside cloud computing for a moment in favor of something a bit more mundane, can you characterize the relationship between Drupal and the Linux distros a bit?
Dries: The relationship is somewhat different from a simple desktop or server application. Upgrading a Drupal web site is more of a challenge than upgrading a simple application. The reason for that is that every site is unique. People want to have a unique brand for their site, so they have a custom template or theme, custom features, and so forth. Because every site typically has at least some amount of customization, something like apt-get doesn’t usually work well to upgrade Drupal sites.
Although Drupal is included in these distributions, and the package is ideal for installing the software from scratch, it’s not necessarily the best way to do upgrades. A lot of people just do an install and have the distribution worry about the dependencies and getting the right version of PHP, Apache, MySQL, or whatever database they’re using. Once the site is in production, though, it tends to get a little more tricky.
Or imagine having a Drupal site running on a four-server cluster–upgrades become more complicated.
Scott: Is work on the Linux Standard Base at a low enough level that it doesn’t really affect what you guys do, or does work on the Linux Standard Base in any way help ease compatibility problems between one distro and another?
Dries: We’re not affected by them too much, to be honest, because we run on top of the LAMP stack. The LAMP stack runs on pretty much every platform.
Scott: I know that Microsoft made an effort with Windows Server 2008 to have PHP applications run on Windows. Do you see any interest in running Drupal on Windows, or is it pretty much 100% Linux in terms of the install base?
Dries: There are a lot of people who evaluate and install Drupal on Windows machines, and Microsoft’s IIS web server has a pretty big market share. I don’t know the exact numbers, but it’s somewhere between 30 and 40% of all the web sites in the world that are run from Microsoft servers, so making Drupal work on them is certainly something we care about.
Sean: I’m curious about how you deal with user-facing requests in terms of the project’s evolution. To use another Red Hat example, they don’t see a consumer model in the desktop, mostly because they don’t see a way to charge money for it. On the other hand, Ubuntu seems to see opportunity there, at least in terms of leading them into other offerings.
By their very nature, portal products have two sides that are pretty well walled off from one another. To the IT guy who rolls it out, it’s just another way of displaying files and various elements of a company’s information architecture and data. To another set of users, it’s an entirely user-driven model, where they can make changes and do what they want.
What is your process for engaging with a more end-user-facing community, as opposed to projects that are geared to deal with work by developers, for developers?
Dries: I started working on Drupal as a way to explore web technologies back in 1999, before I released the first version, so the first set of users were pretty much techies and developers. For a long time, the Drupal community continued to be very developer focused; and, to some extent, it still is today.
In the last four or five years, though, we have been working to open the platform more to non-developer end users. We have started to put more and more time and effort into enhancing usability, for example, to support our vision to make it easy for people to express themselves.
There is an increasing number of people and organizations contributing to improve Drupal’s usability.
Scott: I want to spin back around to the cloud, since that’s such an intense area of interest at the moment. What do you think are some of the opportunities for Drupal in a cloud environment?
Dries: I think that, for Drupal to succeed, we have to lower the barriers for adoption. We have spoken about usability issues, as well as the support issues that are addressed by Acquia. In addition to those, scalability is very important; scaling Drupal is currently a non-trivial proposition, because of the complexities of scaling all the layers of the LAMP stack.
When I say "all the layers of the LAMP stack," I am including things like load balancing, MySQL, master/slave stuff, PHP opcode caching, as well as things like memcache and, potentially, integration with content-delivery networks.
On the other hand, because Drupal is written in PHP, it is very accessible to people, as compared to a Java-based solution, for example. People can get started with PHP on a $10 hosting account, and in some cases, those sites can become massively successful. When that happens, we often tend to see people who are not necessarily technical gurus suddenly having to operate and maintain very large sites.
Sean: To state that a different way, one of the challenges of building an easy-to-use framework is that it’s an easy-to-use framework. In other words, people can get themselves into the position fairly quickly of having a site that they will have a lot of trouble maintaining.
In contrast, to consider the commercial model, if you built the site on a Windows box running IIS and couldn’t scale it, you might tend to point the finger at Microsoft, rightly or wrongly. With open source solutions, though, it’s more viable to say, "Well, that really wasn’t our component, or our project. You need to go learn how to optimize Apache or MySQL."
At the same time, it might make it harder to get people past that barrier, because you can’t own all the pieces and the conversations around them. That seems to be a tremendous challenge; if a site is amazingly successful, even if it’s not Drupal that creates their scalability problem, that might be the way they see it, which gives you ownership of issues that you can’t necessarily control.
Dries: That’s a fair characterization, I think. Typically, as a site gets bigger, it goes through a number of phases. One of the first things that will arise is that the site will encounter bottlenecks with MySQL.
You could argue that it’s Drupal’s fault because it is doing too many queries [laughs], but ultimately, you need to scale your MySQL server, because Drupal does what it needs to do to run your site. Of course, scaling that MySQL server has nothing to do with Drupal.
While scaling MySQL is well understood–there have been books written about it–site builders don’t necessarily know how to do it upfront.
Scott: Do you think there is an opportunity, then, for someone to produce a Drupal virtual appliance for the cloud that already has a lot of that configuration done, which people could just push into a fairly bare-metal cloud like Amazon EC2?
Dries: I do think that’s a big opportunity, and some companies in the Drupal world are exploring that area.
Sean: A couple of months ago, Mark Shuttleworth said he feels there should be more synchronization between various projects. He is trying, in some ways, to blur the distinction among the individual elements that make up his project, in an effort to make it more consumable by an end user community.
The effect is similar in some respects to the fact that when an organization buys SharePoint, they don’t necessarily think of themselves as buying Volume Shadow Copy and Windows–they buy the solution.
Do you guys feel you are trying to carve out a place in the market for yourselves around portal solutions in the same way that Ubuntu is trying to do it on the desktop and Red Hat did it on the server?
Dries: I wouldn’t say portal solutions, but solutions in general, yes. That’s our goal with our distribution, for example. Instead of having users assemble raw bits from different places, we do the assembling and we productize it as a solution for them.
We bundle it in one physical package, write documentation for it, and pre-configure it so it works right out of the box. We are certainly in the business of building solutions for people to simplify their effort to build web sites.
Scott: You’ve been involved in content management systems for a long time, and you mentioned that one of Drupal’s strengths is that it understands social networking, blogs, wikis, and that kind of stuff. What really interesting things do you see on the horizon for where content management systems and social networking are headed?
Dries: A lot of different stars are aligning, if you will. A lot of specs and standards are being defined or getting traction: things like OpenID, RDFa, Microformats, OpenSocial, OAuth, PoCo, and more.
One of the things I recently blogged about is what it could mean to have semantic technologies in Drupal. For example, major search engines like Google and Yahoo are moving aggressively toward trying to capture more structured data, because that allows them to build better, more specialized search functionality, like search for products that can filter by price, availability, color, shipping costs, etc.
Today, if you build a product with Drupal, we could have definitions of products in our database, but they are being translated to HTML pages without any semantic markup, so a lot of the structure we have available is lost. I think we have an opportunity to emit structured data, which then in turn will enable a lot of new things to happen.
To support that, we basically have to add some semantic technologies into Drupal. Specifically, as we generate HTML, we should mark it up using RDFa, so that search engines like Yahoo! SearchMonkey can pick up that data and do more meaningful things with it.
Scott: We were at the Gartner conference last week, and there was a hilarious presentation that pointed out the ways in which search hasn’t really changed in 14 years. The presenter said it’s the equivalent of walking into a library where the librarian is behind a wall with a little hole in it.
You would walk up to the hole and yell "Penguins!" The librarian doesn’t know who you are; she doesn’t know if you’re a researcher or a five-year-old. And so, you look through the hole and she just shows books sequentially and says, "This one? This one? This one?"
That’s basically what search has been. And this presentation said that the direction search has to go to get better is federation and conversation, to give better context. Thus, the user’s request about penguins is shown in light of information clusters that associate the request with products, or services, or research, or children’s stories, or whatever.
Federation is also interesting, and it seems to apply to Drupal as well. It’s this notion that you can have content that’s in Drupal, content that’s in your mail servers, inside of an organization. People want to be able to search, and to search across all of that through a single interface.
So, talk about those two concepts a little bit and how you see them from the perspective of Drupal.
Dries: There used to be a lot of focus on functionality in the Web 2.0 world, a lot of new features, a lot of new ideas of doing things and moving things online to the web.
The next generation will be more about interoperability and data portability. It’s essentially concerned with being able to take your data in and out of your site, and supporting the next level of mashups enabled by semantic technologies and new protocols.
Instead of generating pages with very few semantic clues, I do think we have an opportunity to turn the Internet into what behaves essentially as a very large database that is very easy to access and parse using crawlers, search engines, and so forth. That’s a tremendous opportunity.
I hope Drupal can help make that happen.
Sean: Thanks, Dries. I see that we’ve run out of time, and I think that’s a good place to close.
Dries: Thanks. It’s been a pleasure.