<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
		xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
	xmlns:media="http://search.yahoo.com/mrss/"
>

<channel>
	<title>How Software is Built</title>
	<atom:link href="http://howsoftwareisbuilt.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://howsoftwareisbuilt.com</link>
	<description></description>
	<lastBuildDate>Fri, 25 Jun 2010 19:53:36 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
	<copyright>2006-2007 </copyright>
	<managingEditor>scottswigart@technologyevangelism.com (How Software is Built)</managingEditor>
	<webMaster>scottswigart@technologyevangelism.com (How Software is Built)</webMaster>
	<ttl>1440</ttl>
	<image>
		<url>http://howsoftwareisbuilt.com/wp-content/plugins/podpress/images/powered_by_podpress.jpg</url>
		<title>How Software is Built</title>
		<link>http://howsoftwareisbuilt.com</link>
		<width>144</width>
		<height>144</height>
	</image>
	<itunes:subtitle></itunes:subtitle>
	<itunes:summary></itunes:summary>
	<itunes:keywords></itunes:keywords>
	<itunes:category text="Society &#38; Culture" />
	<itunes:author>How Software is Built</itunes:author>
	<itunes:owner>
		<itunes:name>How Software is Built</itunes:name>
		<itunes:email>scottswigart@technologyevangelism.com</itunes:email>
	</itunes:owner>
	<itunes:block>no</itunes:block>
	<itunes:explicit>no</itunes:explicit>
	<itunes:image href="http://howsoftwareisbuilt.com/wp-content/plugins/podpress/images/powered_by_podpress_large.jpg" />
		<item>
		<title>Jason van Zyl &#8211; Apache Maven</title>
		<link>http://howsoftwareisbuilt.com/2010/06/25/jason-van-zyl-apache-maven/</link>
		<comments>http://howsoftwareisbuilt.com/2010/06/25/jason-van-zyl-apache-maven/#comments</comments>
		<pubDate>Fri, 25 Jun 2010 19:52:41 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=333</guid>
		<description><![CDATA[In this interview, we chat with Jason van Zyl about the Apache Maven project. Maven is designed for build automation. Unlike well-known predecessors such as Make, Maven uses a project object model (POM) to describe the software being built. Building limitations into Maven that enhance consistency and efficiency Encouraging a paradigm shift in place [...]]]></description>
			<content:encoded><![CDATA[<p>In this interview, we chat with Jason van Zyl about the Apache Maven project.  Maven is designed for build automation.  Unlike well-known predecessors such as Make, Maven uses a project object model (POM) to describe the software being built.</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2010/06/25/jason-van-zyl-apache-maven/#limit">Building limitations into Maven that enhance consistency and efficiency</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/06/25/jason-van-zyl-apache-maven/#paradigm">Encouraging a paradigm shift in place of incremental change with new tools</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/06/25/jason-van-zyl-apache-maven/#foster">Fostering collaboration, even among competing organizations</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/06/25/jason-van-zyl-apache-maven/#avoid">Avoiding the temptation of trying to fill every role at a young company</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/06/25/jason-van-zyl-apache-maven/#bridge">Bridging the gap between various programming languages with future versions of Maven</a></li>
</ul>
<p><span id="more-333"></span></p>
<p><b>Scott Swigart:</b> To start, could you take a minute to introduce yourself? </p>
<p><b>Jason van Zyl:</b> I come from a physics background, but I&#8217;ve been developing software for almost 20 years. When I was working in printing and publishing in New York, about 14 years ago, I started getting into open source, and I&#8217;ve been making my living from open source for almost 15 years now. It&#8217;s culminated in working on Maven and turning everything Maven-related into a product line, and that&#8217;s where I am today. </p>
<p>I&#8217;ve been part of the Apache Software Foundation for almost 10 years now, and I started the Apache Maven project in 2001. The first version was released in 2002, and then Maven 2 was released in 2005. In 2007, I started Sonatype to help invest in the community. In 2008, we became a product-based company and got venture funding. </p>
<p><b>Scott:</b> My understanding of Maven is that it&#8217;s a build system. Talk a little bit about Maven, what it does well, and your thinking about the need to build a better mousetrap in this area. </p>
<p><a name="limit"></a></p>
<p><b>Jason:</b> To start, I would say that the largest competitors to Maven are large ad hoc systems that are created inside organizations. The primary difference between Maven and something like Ant or Make is that, while those are great tools, there&#8217;s no pattern or structure to how you actually use them. </p>
<p>Maven goes a step further, by bringing standards and patterns to a build system: most organizations tend to fall into a set of patterns for developing applications. For .NET, there are standard patterns from Microsoft for building applications. If you&#8217;re using Java, there are standard J2EE patterns and there are even design patterns for creating applications. </p>
<p>Maven takes that concept and applies it at the development infrastructure level. As a result, if you were to look at 100 projects built with Make or 100 projects built with Ant, you would find that they are very different. If you look at 100 Maven projects, they&#8217;re almost entirely identical in structure. While there might be differences in how they are customized, at a very basic level, if you know how to build one Maven project, you can build all Maven projects. Look at 100 different Ant projects, and each project is a unique &#8220;creation&#8221;. Over time, these 100 Ant projects will diverge from one another. </p>
<p><b>Scott:</b> Talk a little about how that plays out, in terms of value in day-to-day operations. </p>
<p><b>Jason:</b> If you were to take someone off the street without any experience and show them your Maven build internally, 99 times out of 100, they&#8217;d be able to build it on the first try. </p>
<p>Maven&#8217;s approach addresses some of the problems that came from the explosion of tools that were used in development infrastructure. It had become very hard, even within the same organization, to have a standard, cohesive structure shared among all projects. Even within a single company, you would have inefficiencies. Differences would start to emerge over time with one group using one project structure and build tool and another group using a different structure. </p>
<p>If you expand the perspective, and look at collaboration between different organizations, these inefficiencies become even more apparent. We have more than 2,000 very large clients who talk to one another as part of our technical advisory board. They use the same terms about how to build, test, and deploy software, because Maven provides a platform upon which other tools can build. </p>
<p>It&#8217;s impossible to build a platform on top of an ad hoc structure, whereas it&#8217;s very easy to build a platform on top of something that has a consistent model. That&#8217;s where Maven is different from most other build frameworks or build tools. It has a standard set of lifecycle phases that it passes through. It isn&#8217;t just a random assortment of tools; it is an entire factory, and our products and approach provide an end-to-end solution for software engineering. </p>
<p>Every component that is made with Maven is made in the same way, whether it&#8217;s a .NET assembly, a .jar file, or a .war file. You step through the same phases of compiling, testing, gathering resources, doing integration testing, and installing or deploying into a repository. </p>
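<p>As a rough illustration of the model and lifecycle described above, here is a minimal sketch of a POM for a .jar project. The <code>groupId</code>, <code>artifactId</code>, and <code>version</code> values are placeholders invented for this example; they do not come from any project mentioned in the interview. </p>
<pre><code>&lt;!-- pom.xml: a minimal, hypothetical project description --&gt;
&lt;project xmlns="http://maven.apache.org/POM/4.0.0"&gt;
  &lt;modelVersion&gt;4.0.0&lt;/modelVersion&gt;

  &lt;!-- Placeholder coordinates that identify the artifact in a repository --&gt;
  &lt;groupId&gt;com.example&lt;/groupId&gt;
  &lt;artifactId&gt;example-app&lt;/artifactId&gt;
  &lt;version&gt;1.0-SNAPSHOT&lt;/version&gt;

  &lt;!-- The packaging type selects which plugins run in each lifecycle phase --&gt;
  &lt;packaging&gt;jar&lt;/packaging&gt;
&lt;/project&gt;
</code></pre>
<p>With a file like this in place, running <code>mvn install</code> should step through the default lifecycle in order (validate, compile, test, package, verify, install), and <code>mvn deploy</code> adds the final step of publishing to a remote repository. Changing the packaging to <code>war</code> would keep the same phases and change only what each phase does. </p>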
<p>Many of the core ideas for Maven come from &#8220;A Pattern Language&#8221; by Christopher Alexander. He was an architect who tried to capture the patterns that made cities productive – patterns that enabled a healthy city infrastructure, where people can start businesses, communicate efficiently, and so on. I took Alexander&#8217;s ideas and applied them to questions about how you construct software and how you develop at an infrastructure level. </p>
<p>When I arrived at Apache 10 years ago, I worked on five or six different projects, and we were finding that people weren&#8217;t volunteering to build the software. They would show up, and then they couldn&#8217;t build it. We kept losing volunteers, because they didn&#8217;t want to spend their time messing around with the build. They wanted to devote their energy to actually creating code. </p>
<p>That&#8217;s when I took the version of Maven that I had had for a couple of years before that and checked it into the Apache Software Foundation. It started in the Turbine project, which was probably the first Java framework that was ever created, long before Struts. It graduated into its own project at Apache in 2002. </p>
<p><b>Scott:</b> The first time I bumped into design patterns, which seem to be at the core of what you were talking about, I was at a software conference down in San Francisco, and one of the gang of four was giving a presentation on design patterns. </p>
<p>He was talking about how they were sitting around lamenting the notion that Smalltalk developers seem to solve the same problem the same way, because there were design patterns baked into the language, whereas C++ developers might solve the same problems in a whole bunch of different (and often horrific) ways, because they could go off in any direction they want. </p>
<p>As a result, there was a big push about factoring problems in terms of the patterns that you were looking at. It sounds like that&#8217;s really entrenched in Maven: that when you&#8217;re building software, there are certain patterns that repeatedly arise, and if they&#8217;re codified in the build system, then the same things are done the same way, even across a wide group of people who never talk to each other. </p>
<p><b>Jason:</b> Maven is opinionated, and I designed it to be this way. I don&#8217;t think it&#8217;s possible to have a tool work for such a large number of people unless you restrict some of the options and provide some guidelines. </p>
<p>Some people at first railed against Maven conventions and the fact that you have to do things a certain way, but if you follow those patterns, you generally can get a software project up and running in a day. </p>
<p>That&#8217;s evidenced by the traffic to Maven Central, which has become the de facto standard central repository for Maven artifacts in the open source community. We get on the order of 250 or 300 million hits a month against it. In 2008, we had about 1.9 million unique visitors to Maven Central. In 2009, we had almost four million. </p>
<p>So, even though Maven&#8217;s opinionated, it seems to work for a lot of people. Even though you&#8217;re restricted in some of the things you do, it provides more value by limiting what you can do than it would by providing ultimate freedom and giving you as much rope as you want. </p>
<p>Cities that are designed well usually have structural patterns in common. Look at cities that don&#8217;t have much infrastructure or don&#8217;t copy other cities. Lots of cities in North America do that, and they have horrible infrastructures. A lot of developers and organizations tend not to look at what&#8217;s happened in the past, and they often repeat mistakes over and over again. </p>
<p>We have worked with very large organizations to address this issue. For example, Intuit, SAP, and Cisco work openly with us, and they&#8217;ve adopted Maven. We&#8217;ve actually had people from these organizations talk to one another about what their infrastructures look like, and really, they&#8217;re not all that different when it comes down to it. </p>
<p>We&#8217;ve been able to collect a lot of large companies because they want consistent patterns. They want to be able to hire resources off the street. They want to be able to move their developers from project to project without forcing them to relearn things over and over again. </p>
<p>That&#8217;s the tradeoff for what&#8217;s perceived at first as a limited set of freedoms. You actually get a whole tool chain that can help you produce a piece of software from end to end, and you can generally get a team up and running and have continuous integration, testing, and deployment all working in a few days. </p>
<p><a name="paradigm"></a></p>
<p><b>Scott:</b> The challenge is always when there&#8217;s a paradigm shift, because you end up with people who are simply trying to use the new tool that is designed to operate under a different paradigm, and they&#8217;re trying to use it the old way. </p>
<p>People try to use C++ without embracing the object-oriented model, or they move from a pointer-based style to a garbage-collected style of development. If someone is coming from Ant, Make, or some home-brewed build system into Maven, I assume it&#8217;s challenging for them to step back and take a few days and learn how to work within its flow, instead of in spite of it. </p>
<p>It sounds like you have found a way to help people be successful at that. The shoulder on the learning curve is low enough that people are willingly adopting it, which helps to create its own momentum. </p>
<p><b>Jason:</b> We recommend that people approach Maven initially with a smaller project to understand the basic principles. People do run into problems when, as you say, they try to wedge Maven into an old model. </p>
<p>That&#8217;s why, if someone&#8217;s first attempt with Maven is a conversion of another project, they&#8217;re generally not successful. </p>
<p><b>Scott:</b> There&#8217;s almost a pattern around learning a new paradigm. I remember when .NET came out, a lot of organizations tried to learn the new technology by taking their largest, most complex piece of Java software and porting it to .NET. </p>
<p>That&#8217;s actually the worst way to go about learning the new technology. The best way is to choose a new, low-priority project and build it from scratch. Doing that a number of times before taking on a bigger project will save a lot of heartache and make the whole experience more successful. </p>
<p><b>Jason:</b> We&#8217;ve adopted certain strategies with the larger companies we work with. You can&#8217;t do a big-bang conversion on an organization that has 4,000 developers; it&#8217;s just not possible. Instead, we&#8217;ve developed a pattern over time to encompass how projects using different build systems can interoperate at the Maven repository level. </p>
<p>So, even if you&#8217;re not building with Maven, as long as your project can publish artifacts to a Maven repository and other projects can consume those artifacts from the Maven repository, you can make it work. There are Ant tasks for dealing with Maven, as well as Ivy, and there are other build systems like Gradle and Buildr that all talk to Maven repositories. </p>
<p>Usually, the companies that we go into are interested in how components are being promoted around various systems. For example, how does a development team &#8220;promote&#8221; a particular build to a release repository, and how does development interact with quality assurance and an operations team that is responsible for supporting an application in production? These organizations tend to move slowly, they leave projects alone that are delivering, and they publish and consume to Maven repositories. </p>
<p>We try to help companies adopt that strategy. Even if they don&#8217;t adopt Maven as a build tool at first, they can still immediately benefit from repository management. </p>
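<p>To make the repository-level interoperation above more concrete, the fragment below sketches how a Maven project might declare where it publishes its artifacts. The repository IDs and URLs are hypothetical placeholders; a real setup would point at the organization&#8217;s own repository manager. </p>
<pre><code>&lt;!-- Goes inside the &lt;project&gt; element of a pom.xml --&gt;
&lt;distributionManagement&gt;
  &lt;!-- Hypothetical repository for released versions --&gt;
  &lt;repository&gt;
    &lt;id&gt;internal-releases&lt;/id&gt;
    &lt;url&gt;https://repo.example.com/releases&lt;/url&gt;
  &lt;/repository&gt;
  &lt;!-- Hypothetical repository for in-progress -SNAPSHOT versions --&gt;
  &lt;snapshotRepository&gt;
    &lt;id&gt;internal-snapshots&lt;/id&gt;
    &lt;url&gt;https://repo.example.com/snapshots&lt;/url&gt;
  &lt;/snapshotRepository&gt;
&lt;/distributionManagement&gt;
</code></pre>
<p>Under this configuration, <code>mvn deploy</code> should upload the built artifact to one of the two repositories depending on whether the version ends in <code>-SNAPSHOT</code>. Credentials, if needed, are typically supplied in <code>settings.xml</code> under a server entry whose <code>id</code> matches the repository <code>id</code>. </p>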
<p><a name="foster"></a></p>
<p><b>Scott:</b> I see more of that sort of model in the business world, and I think that&#8217;s a good thing in general. Instead of organizations dictating to others, they adopt a stance more like they&#8217;re offering a service, even though there&#8217;s also an element of competition to it. After all, they&#8217;re competing with the existing build system. </p>
<p>I think it&#8217;s interesting to see different parts of a company organized around service delivery. There may still be lots of fiefdoms of power, but there&#8217;s also an impetus toward cooperative interaction. </p>
<p><b>Jason:</b> There&#8217;s often a consistent solution, but generally what happens is a pattern. Someone likes the idea of Maven, and they want a common reporting and repository infrastructure. They will often become a spearhead within the organization. </p>
<p>People see how they build their projects, and almost every continuous integration tool and every testing framework integrates with it, and you get all of this at a very low cost. You have to learn something, but essentially, every other organization in the world can contribute to Maven, and thousands of corporations have contributed. </p>
<p>Eventually, people within the organization see this, and they set up a small Maven build. Once it starts within an organization, it usually spreads like wildfire. </p>
<p><b>Scott:</b> You also mentioned that there&#8217;s the opportunity for a lot of cross-company collaboration. There&#8217;s an opportunity for Cisco to collaborate with SAP, in terms of using Maven, because in general, companies don&#8217;t see the build system itself as a competitive differentiator, even though the thing they&#8217;re building might be. </p>
<p>I see that as a prime area for open source to fly. It&#8217;s kind of like in the software-as-a-service model, where the intellectual property is their competitive advantage, but the underlying web server isn&#8217;t, whether it&#8217;s Apache or IIS or whatever. </p>
<p>It&#8217;s the code that they&#8217;re running on top of it that differentiates the companies, and so Maven makes perfect sense. Like you said, most companies don&#8217;t view their make files or their Ant files as the differentiator. </p>
<p>Still, I&#8217;m sure you&#8217;ve run into some companies that are resistant to that idea. I remember working with a company that had built its own remote object protocol; they reinvented COM objects and that sort of thing. </p>
<p>They really felt like it made sense for them to maintain this large, proprietary system of plumbing. They couldn&#8217;t hire anybody off the street who knew it, of course, but certain people within the company were so invested in it that they were really reluctant to let it go. </p>
<p>Do you run into that? You mentioned that it&#8217;s common for companies to have their own home-grown build systems. Talk a little bit about the politics involved there. </p>
<p><b>Jason:</b> We&#8217;ve gone into organizations where someone asked for our help, and we immediately get into arguments with others who are very resistant to our being there. It&#8217;s one of the last frontiers in an organization, because there are existing patterns for almost every other task. A lot of times, with J2EE applications, for example, there aren&#8217;t too many other tasks that vary. A lot of good engineers like making infrastructure, because it&#8217;s hard to do. They often get very attached to the systems they create, and I&#8217;ve been in places where people think of them as their babies. It&#8217;s really hard to dislodge them. </p>
<p>In some places, it&#8217;s taken a year to get through all of the politics, but we try not to be antagonistic. We tell them that they can integrate into the process slowly, and we try to show them the benefits. We often try to convince them that contributing to Maven is a better avenue for working on this stuff than trying to keep their own system around. There&#8217;s generally resistance at first, but there are good, logically sound arguments for that approach. If people are going to spend their energy developing infrastructure, they&#8217;ll probably like having people using the things they make. </p>
<p>Ultimately, if you contribute to Maven, you&#8217;re going to have two million people using your stuff, as opposed to maybe a hundred people using an internal system. We often say that writing build tools is fun, until people other than your friends start to use it. Then it becomes a drag, because you end up having to support 1,000 people. </p>
<p>One problem that we run into in almost all of these organizations is that, if your build system doesn&#8217;t support the functionality required to code something in your application, you wind up having to change your build infrastructure at the same time you change your application. If both of those things change at the same time, and something goes wrong, it&#8217;s just a big finger pointing session. One of the points we often make is that by using a framework that many other people are also using, if something goes wrong in your application and tests fail while the build infrastructure remains the same, you can be pretty certain that it&#8217;s your application that&#8217;s wrong. </p>
<p>The cost of maintaining your own build infrastructure can be substantial. One company we work with has estimated conservatively that over five years, failed projects and lost developer productivity because of poorly executed infrastructure have cost them $400 million USD. Now that is real money, and I think companies are starting to appreciate not just the dollar cost of maintaining custom development infrastructure but the lost opportunity for collaborating with a large community using a standard process. </p>
<p><b>Scott:</b> That does make sense when you consider that code is inventory, and as in the retail business, it&#8217;s best to minimize the amount of inventory you keep around, because you always incur costs just from having it around. </p>
<p>The ideal for a development organization is to maintain only the code that is core to your business. The build system just isn&#8217;t the type of code you want to maintain, because it simply amounts to overhead. </p>
<p><b>Jason:</b> A lot of people think it looks easy, but I would draw an analogy to how, as a developer, I used to regard sales people. It looked simple enough to me, until I knew more about what their job actually entails, and it turns out that it&#8217;s a very hard, complex job. </p>
<p>Some developers think that, in terms of infrastructure, they can just whack something out that&#8217;s good enough to meet their needs in a day or two. As soon as they start needing other features, they see that it&#8217;s just as complicated a task as writing an application. </p>
<p>People often underestimate this complexity and really get themselves into trouble, especially because the patterns for developing infrastructure are not the same as for developing an application. </p>
<p>In a lot of large organizations, it&#8217;s not developers who run the builds. It&#8217;s done by people who have a lot of scripting experience or come from a configuration management background. They&#8217;re aggregating a lot of components that come in from different places, making sure configurations work, and validating deployments, and it&#8217;s just not the same type of thing. </p>
<p>This is where we&#8217;re almost coming full circle, now back to a lot of these dynamic language based build tools. They&#8217;re using Groovy and Ruby, and a build system created by a developer ends up being a 4,000 line Ruby script, which doesn&#8217;t really help anybody. You have to continuously invest in maintaining this 4,000 line Ruby script if that is the path you choose. </p>
<p>I would encourage anyone interested in creating build systems to read a little bit of history about designing cities. These things all collapse when you don&#8217;t invest in them, because infrastructure requires investment and conscious planning if you want it to work. Fully healthy, functioning infrastructures do not just emerge. They have large degrees of entropy, and you have to take care of them. </p>
<p><a name="avoid"></a></p>
<p><b>Scott:</b> Let&#8217;s switch over for a second and talk about the Sonatype company. You spend a certain amount of time consulting with Fortune 500s who, like you said, may have 40 different build systems. There&#8217;s a lot of opportunity for building efficiency there. </p>
<p>It sounds like a lot of what you do is to coach those companies through adopting things like Maven. But talk a little more about Sonatype and what it does on a day-to-day basis. </p>
<p><b>Jason:</b> Prior to 2008, we did a lot of consulting and training. When Brian Fox, who is the first employee of Sonatype, and I decided we wanted to pursue making Maven-based products, we soon realized it was very hard to actually do system training at the same time as trying to develop the products. </p>
<p>In 2008, we took venture capital, and we moved away from consulting at that point. We still do strategic consulting for the largest companies in the world, who basically help us develop the software for others. If the software works for Cisco, then we know it&#8217;s going to work for other people. </p>
<p>But we soon turned into a product company. We invest a lot of our time, energy, and money on the open source side, including the Maven project itself; m2eclipse, which is our Maven integration for the Eclipse platform; and Nexus, which is our Maven repository manager. </p>
<p>The strategy for the company is that we have an essentially open core, so we support all the open source versions of these tools. On top of that, we have a commercial stack of products that we&#8217;ve been selling for two years, and that&#8217;s what we actually make our money on. </p>
<p>We have commercial versions of Nexus, like Nexus Professional, and we have the commercial version of our Eclipse integration called Maven Studio for Eclipse. So, we are primarily in the business now of shepherding the open source projects and creating commercial extensions to them. </p>
<p>We have a lot of consulting partners now, and we funnel 99 percent of the consulting work we get asked for onto our partners. We&#8217;re very focused on keeping the open source projects healthy and then developing products on top of them. </p>
<p><b>Scott:</b> In a lot of open source projects, things get built because there&#8217;s somebody who&#8217;s scratched their own itch. They developed something in open source that was novel and had real benefits over the things that came before it. </p>
<p>When functionality like that starts to develop a following, the people who wrote some of the initial code often wrap a company around it. That provides some stability for the project, because if the company&#8217;s profitable, you can bet that the underlying project is going to get development effort. </p>
<p>At the same time, there are a lot of different skills required to run and grow a business that are not necessarily natural to the software engineer. What do you say to the 20 year old out there in college right now who&#8217;s got some open source code they are thinking of growing a business around? </p>
<p><b>Jason:</b> I would say to find someone who has a lot of experience, which I happened to do by chance. I took funding from Hummer Winblad, solely because I met a guy named Will Price, who is now the CEO of Widgetbox. </p>
<p>I was lucky that I found him. He gave me a tremendous amount of advice related to starting and structuring a business. He told me that at some point in my life, I may be able to run the company by myself, but that I should not make the mistake of thinking that I understood what I needed to know initially. </p>
<p>If you&#8217;re good at technology, you probably can learn how to run a technology company, but if you&#8217;re going to start a business, you need to find someone who understands how to grow it and how to have it function at a management level. </p>
<p>We are just at the point now where we&#8217;re hiring a new CEO. We hired a great board that has helped me manage the company to this point. Really, a key requirement is figuring out what you&#8217;re not good at, and most technical people are not good at running companies. </p>
<p>Having an under-appreciation for the other parts of the organization is the easiest way to run it into the ground. We were very lucky that our board of directors found great salespeople and marketing people for us. </p>
<p>I learned to defer to those people in situations where I don&#8217;t have any experience. Recognize where you have no experience, and don&#8217;t try to think you can learn it on the fly, because you can&#8217;t. </p>
<p><b>Scott:</b> In my experience growing a business, there&#8217;s the tendency to want to guard your baby, since you as the founder have put your heart and soul into the company. I&#8217;ve also noticed some small business owners who are afraid to hire somebody smarter than them. </p>
<p>They feel like they need to be the expert at the technology and at understanding the customers, and they may extend that to think they need to be the expert at employee benefits, employee agreements, and running projects. </p>
<p>They&#8217;re a little bit afraid to hire a project manager who&#8217;s smarter than them, or a sales guy who might make more than the business owner by generating an enormous amount of sales revenue. Many people, especially at the beginning, lack the perspective that it&#8217;s not a detriment to the business if you hire a sales guy who is way better than you ever imagined. [laughs] </p>
<p><b>Jason:</b> We&#8217;ve been lucky there. I&#8217;ve had mentors at VCs who are great. We have Mark Gorenberg from Hummer Winblad on our board, and Paul Levine, who&#8217;s from Morgenthaler and was the guy who started Atria software. </p>
<p>They&#8217;ve been great mentors, and they have always said that you need to find the right people for the work that needs to be done. Nothing significant is accomplished by one person; it&#8217;s always a team. </p>
<p>They drilled that into me, although it didn&#8217;t take long, since I had been working with open source for a long time. Having a successful community and project takes hundreds of contributors, and it wasn&#8217;t much of a stretch to use that analogy in a business. </p>
<p>Still, I was categorically lucky to have these people on the board who have lifetimes of experience. That&#8217;s the single most important piece of advice I would give anyone starting a company: don&#8217;t assume you know anything, and look for people who can help you almost immediately if you want your company to work. </p>
<p><b>Scott:</b> If you&#8217;re an engineering person, another thing to realize is that you don&#8217;t necessarily want to do some of the jobs that are needed to run a business well. You have to ask yourself whether you want to spend your time thinking about how you could do automated deployments to Amazon&#8217;s EC2 cluster, or whether you want to review a new version of the 35-page employee handbook. </p>
<p>There are people who really enjoy that and are good at that, so get them to free you up to do the things on the engineering side that are really going to move the company forward and that, frankly, you may like a whole lot better. </p>
<p><b>Jason:</b> You also have to consider that doing things outside your sphere of expertise means that it will take you far longer than someone who specializes in that area, and the result will probably not be as good, either. </p>
<p>Small business owners need to learn that pretty quickly, or they put their companies in jeopardy. You have to be very cognizant of cash burn when you&#8217;re starting a company, and if you pay attention, it immediately becomes apparent that there are lots of things you shouldn&#8217;t be spending your time doing. </p>
<p><a name="bridge"></a></p>
<p><b>Scott:</b> Well, we&#8217;re coming up on the hour, so I&#8217;d like to hand the microphone to you and ask what interesting things are in the future of Maven and Sonatype? </p>
<p><b>Jason:</b> The polyglot Maven project is an attempt to enable people to use dynamic languages with Maven. Normally, you would use an XML file to describe your Maven project, but we have some very interesting collaborations with people like Charles Nutter from the JRuby project. About a month ago, we took Maven Central, which had about 100,000 data artifacts in it, and we did something in our Nexus project that dynamically turns a Maven repository into a RubyGems repository. So, we turned Maven Central into the largest RubyGems repository on the face of the planet. </p>
<p>That was fairly interesting work, in terms of allowing JRuby developers writing applications in Ruby for the JVM to leverage every Java library that exists in Maven Central. We&#8217;re trying to work on some of these polyglot approaches to bridge the gap for people using Ruby, Clojure, and Scala, so that people writing applications in these languages can benefit from Maven and repository managers like Nexus. We&#8217;ve created a facility where they can use their favorite language as a front end to Maven. </p>
<p>We&#8217;re also working a lot on holistic solutions. We plan to release a stack based on Maven, Eclipse, Nexus, and Hudson. Most of our clients ask us for end-to-end solutions. They don&#8217;t want point solutions, and if they&#8217;re paying us for support and something goes wrong in one part of the system, they want help with the whole system. </p>
<p>We&#8217;re actually going to be releasing a commercial development-infrastructure stack, and that&#8217;s really exciting for us. That&#8217;s being driven almost entirely by the needs of our clients. Some of the largest companies you can think of are helping us drive the features that are being put into the stack. </p>
<p>Maven 3 is about a month away from being released. We&#8217;ve gone to great lengths to make it fully backward compatible, which will allow users to easily pick up new functionality if they need it. There&#8217;s a lot of stuff going on. </p>
<p><b>Scott:</b> Awesome. Thanks for talking today. </p>
<p><b>Jason:</b> Great. Thanks a lot, Scott. </p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/06/25/jason-van-zyl-apache-maven/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interview with Damien Katz &#8211; Apache CouchDB</title>
		<link>http://howsoftwareisbuilt.com/2010/06/18/interview-with-damien-katz-apache-couchdb/</link>
		<comments>http://howsoftwareisbuilt.com/2010/06/18/interview-with-damien-katz-apache-couchdb/#comments</comments>
		<pubDate>Fri, 18 Jun 2010 16:59:57 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=329</guid>
		<description><![CDATA[In this interview, we chat with Damien Katz, founder of Apache CouchDB.&#160; CouchDB is a schema-less distributed document-oriented database that can be queried over HTTP using a RESTful JSON API. Meeting the challenges associated with building a database and storage engine Contrasting document-oriented databases with relational ones Trade-offs relative to working with relational databases Smoothing [...]]]></description>
			<content:encoded><![CDATA[<p>
    In this interview, we chat with Damien Katz, founder of Apache CouchDB.&nbsp;<br />
    CouchDB is a schema-less distributed document-oriented database that can be<br />
    queried over HTTP using a RESTful JSON API.</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2010/06/18/interview-with-damien-katz-apache-couchdb/#challenges">Meeting the challenges associated with building a database and storage engine</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/06/18/interview-with-damien-katz-apache-couchdb/#contrast">Contrasting document-oriented databases with relational ones</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/06/18/interview-with-damien-katz-apache-couchdb/#trade">Trade-offs relative to working with relational databases</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/06/18/interview-with-damien-katz-apache-couchdb/#smooth">Smoothing the paradigm shift to schema-less and document-oriented databases</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/06/18/interview-with-damien-katz-apache-couchdb/#couchio">The role of CouchApp and the Couchio company in the world of CouchDB</a></li>
</ul>
<p><span id="more-329"></span></p>
<p><b>Scott Swigart:</b> To start, please introduce yourself and talk a little bit about the CouchDB project and the Couchio company.</p>
<p><a name="challenges"></a></p>
<p><b>Damien Katz:</b> I am the founder of the CouchDB project, and I&#8217;ve been in the industry now for 15 years. I started in the software industry on Lotus Notes, the back-end database model of which is largely the inspiration for CouchDB, although Lotus Notes was a much larger project than CouchDB is. </p>
<p>About five years ago, I was done with a startup and decided I didn&#8217;t want to take a normal job. Relational databases were everywhere at that point, and MySQL was getting insanely popular. I had a lot of fondness for the Lotus Notes platform, although it was very much the ugly duckling. </p>
<p>I saw all of its warts, but I also saw that it had some beauty, and I wanted to take that beautiful part and bring that into the modern open source web. I quit my paid software career and moved my family from Massachusetts to North Carolina and lived off of savings and occasional contract work while I started the CouchDB project.</p>
<p>I started it in C++ and XML; there was no Erlang or anything like that. Eventually, we threw away the C++ codebase and moved to JSON, JavaScript, and Erlang. We&#8217;ve gone through quite a lot of iterations to get to where we are.</p>
<p>We just founded Couchio, which is the company wrapped around CouchDB, last December. I moved my family out to California in January, and we&#8217;re doing the whole startup thing in Silicon Valley. We&#8217;re focusing on services and support and hosting right now.</p>
<p><b>Scott:</b> Taking on developing a database and a storage engine is no small task. In fact, it&#8217;s one of the hardest things to do in computer science. In addition to considering the functionality, there are also performance and scalability considerations that become really important. Talk a little bit about taking on that challenge.</p>
<p><b>Damien:</b> You’re right that there are a lot of issues to worry about with a database server, such as performance, concurrency, and reliability. When somebody writes something, does it actually get written? And when somebody reads something, was it supposed to be read? Was it actually committed by someone else? </p>
<p>At the root of these concerns are the ACID properties: atomic, consistent, isolated, and durable. It&#8217;s very difficult to accomplish that while also achieving the performance and reliability characteristics you want. It takes a lot of careful planning.</p>
<p>Doing it improperly can easily produce results that seem to perform well until you hit some sort of threshold where you run out of resources on the computer and everything sort of falls off. Or the computer doesn&#8217;t shut down properly and you might corrupt all your data.</p>
<p>There are a whole lot of challenges there that you have to really think about, and there are a surprisingly large number of ways that you can address those issues. CouchDB takes a fairly novel approach compared to most database engines.</p>
<p>Actually, not all database engines address those issues. Some just decide to punt by saying that you need to run on a very reliable server or on multiple machines for failover.</p>
<p>There are all sorts of different ways to directly solve these issues, as long as you really think about it carefully early on. Early in the CouchDB development cycle, it seems like I spent weeks or months just thinking about the design from a very high-level point of view. </p>
<p>I spent huge amounts of time planning how the software would interact with the clients and the disk, how multiple clients would interact with all that, and coming up with a simple design that would be reliable and provide good performance.</p>
<p><a name="contrast"></a></p>
<p><b>Scott:</b> Another point of interest is that CouchDB is a document-oriented database. It’s far more common to be familiar with relational databases, so talk a little bit about this kind of database and the tradeoffs between it and your traditional relational database.</p>
<p><b>Damien:</b> The traditional relational database, based on tables and rigid schemas, is an old design that’s very well studied and optimized. When you can break your design out into schemas and carefully size each column and table to match your data, you can make amazingly fast solutions. </p>
<p>But we often don&#8217;t have that sort of insight into our data model ahead of time when we&#8217;re developing an application. And frequently our data model just won&#8217;t fit into that kind of schema. The document model is an alternative, where the data is self-contained, and documents fit naturally into it. </p>
<p>To understand the model, it helps to think of business cards. Most business cards contain the same information: a person&#8217;s name, phone number, email address, possibly their actual physical mailing address, and other bits of information such as a professional title or the company name.</p>
<p>These cards are all documents, and they have close similarities as well as differences. This is an example of something that doesn&#8217;t necessarily fit very well into a relational database. Of course, business cards are relatively simple, and you might be able to figure out all the possible iterations and put them into a relational database.</p>
<p>But there are other types of documents where you don&#8217;t have that luxury ahead of time, and it can be especially difficult with something like content. That sort of data is ideal for a document database, where you&#8217;re going to have fields and attributes that vary broadly from document to document.</p>
<p>It doesn&#8217;t mean the documents themselves are completely different from one another, and usually they actually have a great deal in common. Usually, all the documents have a common set of attributes, but they also have unique attributes that vary document to document. Those are the types of things that relational databases have a difficult time with, but that document databases are ideal for.</p>
<p><b>Scott:</b> From a programming standpoint, how do developers deal with that? To consider your business card example, it’s obviously simple to code against the stuff you’re expecting, but like you said, the whole point of a document-centered database is that it can deal with the more freeform stuff as well.</p>
<p>How do you see application developers dealing with unknown, free-form sets of fields?</p>
<p><b>Damien:</b> Frequently, in a document-oriented app, you just want to capture the information at first, and you don&#8217;t necessarily even know how you want to report on it. The simplest approach is just to add the fields that you want to capture into the documents, and then later on figure out what to do with them.</p>
<p>For example, you might have captured a large set of foreign tax IDs, and then six months later, you decide how to create a report on all those content records. Therefore, it’s important to make it easy to get the data into the database, so you don&#8217;t have to adjust the schema, you don&#8217;t have to do an alter table or anything. You just stick the data in there and it&#8217;s captured.</p>
<p>Later on, you can create a new view that shows you all the documents that have these foreign tax IDs, without having to alter any of the data in the database or any of the existing logic.</p>
<p>That&#8217;s an example of something that can be kind of difficult to do in a relational database. With a document database, the application doesn&#8217;t actually have to modify a schema somewhere in order to do this; it just basically sticks the attributes right in the document.</p>
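<p>As a rough illustration of the capture-now, report-later idea, here is a minimal Python sketch. It uses an in-memory list as a stand-in for the database and is not the CouchDB API; the names <code>documents</code>, <code>view</code>, and <code>by_foreign_tax_id</code> are hypothetical and chosen only for this example.</p>

```python
# In-memory stand-in for a document store; this is NOT the CouchDB API.
# Documents share some fields (name) and differ in others.
documents = [
    {"_id": "card-1", "name": "Ada Example", "phone": "555-0100"},
    {"_id": "card-2", "name": "Bo Example", "email": "bo@example.com",
     "foreign_tax_id": "DE123456789"},
    {"_id": "card-3", "name": "Cy Example", "title": "Engineer",
     "foreign_tax_id": "FR987654321"},
]


def view(docs, map_fn):
    """Run map_fn over every document and return the emitted rows, sorted by key."""
    rows = []
    for doc in docs:
        rows.extend(map_fn(doc))
    return sorted(rows)


def by_foreign_tax_id(doc):
    """Hypothetical map function: emit a row only for documents that carry the field."""
    if "foreign_tax_id" in doc:
        yield (doc["foreign_tax_id"], doc["_id"])


# The view is defined after the data was captured; no document is modified.
tax_rows = view(documents, by_foreign_tax_id)
print(tax_rows)
```

<p>The point of the sketch is that the report is added later as a separate function, while the stored documents and the code that wrote them stay untouched.</p>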
<p><a name="trade"></a></p>
<p><b>Scott:</b> Obviously, relational databases can do joins, and there are other things that they&#8217;re good at, but there are trade-offs. You have to consider what your data looks like to decide which model fits best with your application. </p>
<p>It also seems that document-oriented databases have strengths when it comes to distributed processing in the cloud. Parallelism used to mean multiple threads inside one computer, but more and more, it refers to nodes. </p>
<p>Tens, hundreds, or thousands of nodes each crunch on a piece of data, and then the results are put back together on the back end. Talk a little bit about how you see development changing as it moves to the cloud and where CouchDB fits in.</p>
<p><b>Damien:</b> With relational databases, one of the difficulties is that, with a fixed schema, it&#8217;s difficult to have multiple instances of the same database in different places, particularly with a master/master type of setup. </p>
<p>A master/slave type of setup is the standard for relational databases across multiple machines. All the changes go into a single master, and you have multiple slaves to alleviate the read load. That works fine with relational databases, and you can also do that in CouchDB and other document databases.</p>
<p>On the other hand, in document databases, you can also have a master/master model, with geographically distributed databases that can be independently updated and can replicate their updates with each other.</p>
<p>With relational databases, you can do master/master setups with very careful planning of your data model ahead of time. You have to use a subset of the relational model to make sure that you don&#8217;t get into an invariant condition, and then you can do master/master.</p>
<p>Using relational databases, though, you can&#8217;t change the schema on one side in a master/master setup without a tremendous amount of planning and some downtime on each side. When you change the schema on one node, you create a conflict of schemas and data. You have to figure out how to translate the differences.</p>
<p>With a document-oriented database, each document contains its own schema, so those conflicts don’t arise. Instead, if two people edit the same document, that conflict is detected automatically by the system and can be resolved by the user or on an automated basis.</p>
<p>Relational technologies are designed to have a single master at any given time that is the authoritative master state of the data. Without a whole lot of planning, you have to build something on top of the relational model in order to get around that.</p>
<p>In the document model, it’s simple to open it up to multiple machines and move the data around. The data is independent rather than interrelated.</p>
<p>Document databases allow data to move between machines, within a single cluster or across clusters, much more easily. Of course, that makes cloud computing much easier than with relational databases. It’s easier to partition new machines, bring new machines online, or move them across data centers.</p>
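<p>To make the master/master idea concrete, here is a small Python sketch of two replicas that are edited independently and then replicated, with conflicting edits detected rather than silently overwritten. It is a simplified model under stated assumptions, not CouchDB&#8217;s actual replication algorithm: each document is represented by a plain revision history, and the function <code>replicate</code> and the revision labels are hypothetical.</p>

```python
def replicate(source, target):
    """Copy documents from source to target; return ids whose edits conflict.

    Each replica maps a document id to its revision history, a list of
    (revision_label, body) pairs, oldest first.
    """
    conflicts = []
    for doc_id, src_history in source.items():
        tgt_history = target.get(doc_id, [])
        if src_history[:len(tgt_history)] == tgt_history:
            # Target is behind (or missing the document): fast-forward it.
            target[doc_id] = list(src_history)
        elif tgt_history[:len(src_history)] == src_history:
            # Target is already ahead of the source; nothing to copy.
            continue
        else:
            # Histories diverged: both sides edited the document independently.
            conflicts.append(doc_id)
    return conflicts


# Both replicas start from the same first revision of the "store" document.
base = [("1-a", {"name": "Acme", "phone": "555-0100"})]
east = {
    "store": base + [("2-east", {"name": "Acme", "phone": "555-0199"})],
    "note": [("1-n", {"text": "only on east"})],
}
west = {
    "store": base + [("2-west", {"name": "Acme Corp", "phone": "555-0100"})],
}

conflicts = replicate(east, west)
print(conflicts)      # ids that need resolution by a user or by a rule
print(sorted(west))   # ids present on the west replica after replication
```

<p>In this model the new document moves across untouched, while the document edited on both sides is flagged so that a person or an automated rule can decide which version wins.</p>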
<p><a name="smooth"></a></p>
<p><b>Scott:</b> Schema-less databases and document-oriented databases are a paradigm shift, and they present bigger challenges for programmers than something like picking up a new language. I would liken it to the transition from procedural to object-oriented programming.</p>
<p>When people try to pick up a functional programming language for the first time, they have to grasp an entirely new way of approaching the problem. I would imagine that there is a tendency by some people to try to force it to work like a relational database.</p>
<p>What are some of the key concepts that people need to wrap their heads around in order to use CouchDB the way it&#8217;s meant to be used, rather than shoehorning it into working the old way?</p>
<p><b>Damien:</b> The first thing to consider is the tendency to normalize your data excessively. While we aren&#8217;t a relational database, it is possible to use a customer ID instead of putting a customer&#8217;s name into an order record, and that might actually be a better way in CouchDB. </p>
<p>Still, you could go too far and make everything relational. For example, when a customer buys something from a store, the receipt doesn&#8217;t have the store&#8217;s tax ID number on it, right? They give you the actual name of the store and the address and the phone number.</p>
<p>Those things could change later on. The store could change names, it could change ownership, it could move, or it could change phone numbers, and any piece of information on that receipt could potentially become out of date.</p>
<p>That receipt still has value to us, as a record of a transaction at a point in time. The relational paradigm says don&#8217;t do that, we need to normalize everything. The document paradigm says a receipt is perfectly fine, and we can put the name and phone number right on the receipt.</p>
<p>CouchDB encourages you to think more in terms of the receipt, but it can also allow you to be more relational. You don&#8217;t have to be entirely non-relational and denormalized. The challenge for developers to wrap their heads around is where to find that sweet spot, so they can decide optimally where to normalize and denormalize in CouchDB. </p>
<p>That’s actually the challenge in a relational database, too. It is very easy to over-normalize a relational database. As a result, it can become difficult to work with, all the queries can become unwieldy because you&#8217;re doing way too many joins, and everything can become more obfuscated than it should be.</p>
<p>There is a similar issue with CouchDB, in the sense that you can go over-normalized or under-normalized, although in CouchDB, you generally want to lean more toward a de-normalized form.</p>
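<p>The receipt example can be sketched in a few lines of Python. This is only an illustration with plain dictionaries, not CouchDB code, and the store and receipt records are made up for the example. It contrasts a receipt that points at a store record with one that carries a snapshot of the store details as they were at the time of the sale.</p>

```python
import copy

# Hypothetical store record, keyed by id.
stores = {"store-7": {"name": "Corner Books", "phone": "555-0100"}}

# Normalized: the receipt only points at the store record.
receipt_by_reference = {"_id": "receipt-1", "store_id": "store-7", "total": 12.50}

# Denormalized: the receipt carries a snapshot of the store details at sale time.
receipt_snapshot = {
    "_id": "receipt-2",
    "store": copy.deepcopy(stores["store-7"]),
    "total": 12.50,
}

# Later the store is renamed and changes its phone number.
stores["store-7"].update(name="Corner Books and Cafe", phone="555-0199")

# Following the reference shows today's name; the snapshot keeps the original.
name_via_reference = stores[receipt_by_reference["store_id"]]["name"]
name_on_snapshot = receipt_snapshot["store"]["name"]
print(name_via_reference)
print(name_on_snapshot)
```

<p>Which form is right depends on whether the document is meant to track the current state of the store or to record a transaction at a point in time.</p>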
<p><b>Scott:</b> I think that data warehousing foreshadowed this transition to some extent, in the sense that, in certain scenarios you have to embrace denormalization. On one hand, you have to be concerned with breaking things down granularly and having tables and sub-tables, and joining all kinds of stuff together. </p>
<p>But to actually get performance, the data warehouse has to have a pre-denormalized view, to let you do that sort of stuff. </p>
<p>A document-centric database in some ways maps to object orientation a bit better, right? A document is sort of a hierarchical object; if you think of an HTML page, it&#8217;s got a title, it&#8217;s got a body, and it&#8217;s got a bunch of elements in it that are defined on one hand, but on the other hand they can be in any order and so forth.</p>
<p>Talk a little bit about how a document-centric database is a more natural fit for an object-oriented programmer than this sort of data-mapping layer where we&#8217;ve always had to take our objects, shatter them into tables, push them out into a relational database, and then re-form the hierarchy as we pull stuff out and get it back into objects.</p>
<p><b>Damien:</b> I&#8217;ve always felt that, at its heart, CouchDB is really an object database. The reason I don&#8217;t call it an object database is that it&#8217;s a very overloaded term in the industry, and people have a set of expectations that go along with object databases. </p>
<p>Nowadays, there really aren’t that many popular object databases, but back in the day, Gemstone had some popularity. Maybe it&#8217;s still popular these days, but I haven&#8217;t heard much about it. This was the idea of a transparent object-persistence layer for a programming language. Gemstone was for Smalltalk, there was Zope for Python, and I know there were others for other languages.</p>
<p>These are essentially databases to store programming objects natively right into a database without any sort of real translation layer. CouchDB, on the other hand, is not a seamless persistence layer where you get your objects and then you have different methods and everything also gets stored in there. </p>
<p>That&#8217;s the biggest difference, but there are still objects. They just don&#8217;t have the methods, and they&#8217;re not in an object-oriented programming database.</p>
<p>I think one of the big reasons object databases didn&#8217;t catch on is that they&#8217;re married too closely to the languages they are designed to work with. That&#8217;s convenient when you&#8217;re working with the language, but the data frequently outlives the application or the project, and you also need to share the data with other applications or other projects. </p>
<p>With CouchDB and JSON, you can use it from any language, but if you&#8217;re using something like Gemstone from Smalltalk to instantiate one of those objects, you&#8217;re going to have to call it from Smalltalk. If you&#8217;re going to access it from something else, you are going to have to write an access wrapper.</p>
<p>That becomes a big obstacle to re-using that data from another application. I think that&#8217;s one of the big reasons that object databases never really caught on and never had a chance against SQL, which is its own simple language and so didn&#8217;t have that problem.</p>
<p>It also plays much more easily with a variety of programming languages than any single object database does. CouchDB&#8217;s popularity owes in part to the fact that it can play very easily with a wide range of programming languages. And it opens up your database to a lot more applications and to sharing the data, which is very important.</p>
<p><b>Scott:</b> As cloud development matures and evolves, what do you see as the challenges, opportunities and future direction for CouchDB?</p>
<p><b>Damien:</b> The large data stuff hasn&#8217;t really been our primary focus at CouchDB. That’s not to say that we haven&#8217;t focused there at all, and we have always designed the CouchDB data model so it can scale up. </p>
<p>But our focus has been primarily on making sure that we can build applications very easily, very rapidly, so they can replicate. Building really large scale applications has always been a secondary goal for CouchDB.</p>
<p>We make it very easy to launch web apps, scale them up or scale them down, and not worry too much about their reliability. You can just bang them out, get them working, and deploy them. That&#8217;s primarily our interest.</p>
<p><a name="couchio"></a></p>
<p><b>Scott:</b> Talk a little bit about Couchio and what the company does.</p>
<p><b>Damien:</b> Right now we&#8217;re providing services and support. Our first customer that we signed up for support is Canonical, and we’re helping them with their Ubuntu One service. We are just about to launch our hosting service. We are in private beta, and we hope to initiate public beta soon. </p>
<p>Our focus with hosting is going to be on low-friction hosting, to get people started with as little work as possible. We aren&#8217;t focused on large-scale hosting, for people with huge data sets and things like that.</p>
<p>We eventually want to get into application hosting. There&#8217;s a thing called CouchApps. CouchDB, being an HTTP server, can host applications directly, so you can write applications and forward the HTML, CSS, and JavaScript through CouchDB.</p>
<p>When you point your browser at it, the browser comes alive and starts the JavaScript. It becomes interactive as you query and update the server and everything. When everything is served from CouchDB, that’s a CouchApp and you can replicate it around, just like the data.</p>
<p>We’re building these CouchApps for things like mail, timelines, scheduling, and document management. We want to provide hosting and support for these CouchApps, so users don&#8217;t necessarily even know that these CouchApps are built on top of CouchDB, they just know that their applications are being made to run their business.</p>
<p>And the benefit of these CouchApps is that they can run them in the cloud, they can run them in the browser, and they can run them on a small server in their office. The whole time, we have them backed up and they&#8217;re available in the cloud via replication. </p>
<p>Their data&#8217;s always with them, it&#8217;s installed locally on their laptops or on their phones, and it&#8217;s always available using CouchDB for replication. So if their network connection goes down, or it&#8217;s really slow, or if they&#8217;re traveling on an airplane, they always have access to their data.</p>
<p>Moreover, they’re always accessing it through the same browser interface, and regardless of the network conditions, it&#8217;s always a fast, low-latency connection, because it&#8217;s local.</p>
<p><b>Scott:</b> This notion of a CouchApp, where the app is stored in the database, is fascinating, and it makes sense because the database can store anything. And it&#8217;s automatically replicated, so if you needed to stand up more nodes or whatever behind your load balancer or those kinds of things, it’s a piece of cake. </p>
<p><b>Damien:</b> CouchDB databases have this concept of design documents, which are like regular documents, except they contain information about how to index the database. The MapReduce JavaScript indexes are specified in these design documents, and they can also contain the HTML, the CSS, the images, and whatever else is needed for the applications. </p>
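<p>For readers who have not seen one, the following Python sketch builds the JSON body of a small design document with one view and one inline attachment. The overall shape (an <code>_id</code> starting with <code>_design/</code>, a <code>views</code> object holding JavaScript map functions as strings, and base64-encoded <code>_attachments</code>) follows CouchDB&#8217;s documented layout, but the design document name, the view name, and the file contents are hypothetical, and details may vary between CouchDB versions.</p>

```python
import base64
import json

# JavaScript map function, stored as a string inside the design document.
map_by_tax_id = (
    "function (doc) {"
    " if (doc.foreign_tax_id) { emit(doc.foreign_tax_id, null); }"
    " }"
)

# Static file for the app, encoded as base64 for an inline attachment.
stylesheet = base64.b64encode(b"body { margin: 0; }").decode("ascii")

design_doc = {
    "_id": "_design/contacts",  # hypothetical design document name
    "views": {
        "by_foreign_tax_id": {"map": map_by_tax_id},  # hypothetical view name
    },
    "_attachments": {
        "style.css": {"content_type": "text/css", "data": stylesheet},
    },
}

# The JSON text is what would be stored in the database like any other document.
body = json.dumps(design_doc, indent=2)
print(body)
```

<p>Because the result is an ordinary JSON document, it can be copied between databases in the same way as the data it indexes.</p>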
<p>All of those can be stored in a single design document, and these replicate around with the data. You can also use apps to put them up on the server cloud, so you point your browser at it and it&#8217;s just like a regular web browser app. We have several examples of these.</p>
<p>Mozilla has written one called Raindrop that’s basically a mail-style app. We&#8217;ve written a chat app and a calendar app.</p>
<p>You run these in your browser, and they&#8217;re just like regular web apps. The really cool thing is that once you install CouchDB locally, you can replicate these down and still run them even if you&#8217;re disconnected from the network. You have the exact same experience.</p>
<p>We want to get the browser plugins, so you can integrate them straight in the browser. In the CouchApps, there would be the option to use it offline, which would cause it to replicate locally. </p>
<p>The browser would still deliver the same user experience, like a local proxy. They would still see in the URL that they&#8217;re hitting the remote, but it would intercept all those requests and hit the local replica. </p>
<p>Any changes they make to the local replica immediately get replicated up to the cloud, and any changes somebody makes to the cloud version are replicated down to the local one, so you always have the most up to date version locally.</p>
<p>It could also be an appliance that’s rented to somebody. There’s this thing called the SheevaPlug, which is a really small Linux plug computer that&#8217;s about the size of an AirPort Express. It runs Linux, and it has a hard drive and a little web server. We&#8217;re going to put CouchDB on those types of devices.</p>
<p>Then people can rent these plugs that have the server and their application on there. Their cloud provider will mail it to them, they plug it into the wall, and it automatically connects to the cloud CouchDB instance.</p>
<p>They would get a really high-bandwidth, low-latency experience for these apps, because they would be served straight from the plug, which is also constantly replicating with the cloud. This solves problems for small businesses that are running off of a cable modem when their telco has oversold its capacity.</p>
<p>A lot of people run off of DSL or cable modem, and frequently throughout the day, it gets slow or stops working altogether. The customers don&#8217;t have access to the network for whatever reason, so they’re completely in a holding pattern until the network comes back up. The plug makes it so the data is always right there locally.</p>
<p><b>Scott:</b> That’s a great explanation, because it makes it clear to people how this technology can actually help them in their business.</p>
<p><b>Damien:</b> The idea behind that little device is that now you have a little tiny server that&#8217;s always tethered to the cloud. You can serve applications to everybody in the office, and nobody has to install anything. </p>
<p>You&#8217;ve essentially replaced the whole Microsoft Office package. Previously, you&#8217;d have to buy Microsoft Office and install Office Server and Exchange and all sorts of different things in order to deal with the fact that you can&#8217;t have everything in the cloud because your network connection is just too unreliable.</p>
<p>This bridges that world, so now you&#8217;ve got this low-cost tiny plug and everything&#8217;s higher speed because it’s local, and then you&#8217;ve also got everything in the cloud, so that everything&#8217;s backed up, professionally administered, and up to date. And if the plug dies, everybody in the office just accesses the cloud version while you wait for FedEx to deliver you a brand new plug.</p>
<p><b>Scott:</b> I want to be sensitive to the time. Are there any closing thoughts from your side?</p>
<p><b>Damien:</b> We&#8217;ve just gone to version 0.11, which is our release candidate for 1.0. We&#8217;ve already been very stable as a database engine, and right now what we&#8217;re actually trying to stabilize on is our program interface, and we&#8217;re about ready to go.</p>
<p><b>Scott:</b> Thanks for taking the time to talk today.</p>
<p><b>Damien:</b> Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/06/18/interview-with-damien-katz-apache-couchdb/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Interview with Edd Dumbill &#8211; OSCON Co-Chair</title>
		<link>http://howsoftwareisbuilt.com/2010/05/31/interview-with-edd-dumbill-oscon-co-chair/</link>
		<comments>http://howsoftwareisbuilt.com/2010/05/31/interview-with-edd-dumbill-oscon-co-chair/#comments</comments>
		<pubDate>Mon, 31 May 2010 16:48:19 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=325</guid>
		<description><![CDATA[OSCON is just around the corner, and in this conversation, we chat with true open-sourcer Edd Dumbill &#8211; co-chair of the conference and hacker behind the scenes. Determining what content to feature at OSCON Transitions in the open source community as a whole Business-strategy considerations associated with open source Selecting participants to present at OSCON [...]]]></description>
			<content:encoded><![CDATA[<p>
    OSCON is just around the corner, and in this conversation, we chat with true open-sourcer Edd Dumbill &#8211; co-chair of the conference and hacker behind the scenes.</p>
<ul>
<li><a href="#determine">Determining what content to feature at OSCON</a></li>
<li><a href="#transitions">Transitions in the open source community as a whole</a></li>
<li><a href="#business">Business-strategy considerations associated with open source</a></li>
<li><a href="#select">Selecting participants to present at OSCON</a></li>
<li><a href="#future">Interrelationships between NoSQL, cloud, and mobile shape the future</a></li>
<li><a href="#role">The role of OSCON in the larger sphere of technology</a></li>
</ul>
<p><span id="more-325"></span></p>
<p><b>Scott Swigart:</b> To start us off, please take a second to introduce yourself and your involvement with OSCON.</p>
<p><b>Edd Dumbill:</b> Sure. I have now been the program co-chair of OSCON for three years, and my colleague Allison Randal is the other program co-chair. I took over from the legendary Nat Torkington. </p>
<p>I wear two hats at O&#8217;Reilly. In addition to being program co-chair for OSCON, I&#8217;m also responsible for the software that runs the conferences, so I kind of get to hack on both sides of the fence. </p>
<p>Before I started working with O&#8217;Reilly, I had done a couple of startups and been an open source hacker for a while. I contributed to the GNOME project and was a Debian developer, so I&#8217;ve been in and around the open source world for quite a long time.</p>
<p><a name="determine"></a></p>
<p><b>Scott:</b> How many years has OSCON been going now?</p>
<p><b>Edd:</b> This is the twelfth year, I believe.</p>
<p><b>Scott:</b> Obviously, you want to keep it fresh every year, so how do you track what the trends and interesting movements are in open source?</p>
<p><b>Edd:</b> That&#8217;s a great question, and it&#8217;s actually the part of the job I enjoy most. We go about it in two ways. The first is we&#8217;re lucky to have a great response to our call for participation. This year we had well over 700 proposals. </p>
<p>Considering that we could only accept around 120 of those, we unfortunately had to turn away a lot of really great stuff, but the spectrum of responses paints a pretty good picture of what&#8217;s trending and what&#8217;s not.</p>
<p>The other thing is that O&#8217;Reilly tracks projects and the industry as a whole day in, day out. An example of that effect is the health IT track, which we’re doing at OSCON this year. We identified health technology as an area that&#8217;s ripe for the kind of collaboration that comes with open source.</p>
<p>We&#8217;re betting on that, and we opened up a separate call for participation for health IT, which we&#8217;ve had a great response to. We hope not only to cover trends there, but also to be part of getting new conversations and collaborations going.</p>
<p>In a similar vein, we&#8217;ve also set up two summit days in parallel with our tutorials. The first summit is an open source and cloud summit, where we&#8217;re gathering a lot of leaders to discuss the place of open source in the cloud, as well as what the future holds for emerging standards and collaboration.</p>
<p>At the hacker level, we&#8217;re doing a summit on the Scala language, which is an up-and-coming JVM language that has both enterprise and hacker credibility. It&#8217;s been used in places like Twitter and Foursquare, places where there are a lot of scalability concerns. We&#8217;re organizing a half-day Scala tutorial on Monday, and then a full day of Scala-related talks on Tuesday.</p>
<p>To come back to your question, we observe and try to shape some trends. NoSQL databases will be featured strongly this year. There are also things we identify as up and coming, and we try to give them space to grow and provide a forum for our community to get to know each other.</p>
<p><a name="transitions"></a></p>
<p><b>Scott:</b> Five or ten years ago, it felt like open source was trending toward a separate, parallel universe. We would have an open source operating system with an open source web server on it. There would be open source office productivity tools, open source development environments, open source compilers, and it was all very distinct from the proprietary side of the industry.</p>
<p>Looking at OSCON now, there&#8217;s stuff about Android and iPhone, and even though Android is open source, it&#8217;s not really community developed. There’s also a lot of material at OSCON about cloud infrastructure, and that area involves a lot of proprietary players, whether it&#8217;s VMware or Amazon. Amazon built an infrastructure using open source, but they&#8217;re not exactly open.</p>
<p>It seems that a lot of proprietary stuff shows up in direct conjunction with open source now. There&#8217;s even beginning to be a thriving open source ecosystem around those proprietary offerings, whether it&#8217;s iPhone or whether it&#8217;s a certain cloud provider or something like that.</p>
<p>You mentioned that your involvement with open source stretches back a ways, into some Debian development and that kind of stuff. How have you seen the perception of open source and proprietary software evolving over that time?</p>
<p><b>Edd:</b> I think there&#8217;s always been a spectrum, and that spectrum still exists today. At one end, you have the free software movement, which wants a complete stack of completely free software. At the other end is corporate open source, where the control is still very much with the organization, but they&#8217;re using open source as a tool. And there&#8217;s a spectrum that lies in between.</p>
<p>O&#8217;Reilly has always been pragmatic, recognizing that while the free software movement is a very important thing, you don&#8217;t have to subscribe to it to get value from open source. The key to open source and sharing source, and also really open data and open standards, is that there’s a lot of value to be had without subscribing to either extreme.</p>
<p><b>Scott:</b> How does that play out with transitions like the current one toward the cloud, which is really causing some new types of convergence between the open source and proprietary worlds?</p>
<p><b>Edd:</b> One of the really important questions facing open source right now is this: with software as a service, what is the place of open source? Previously, if my text editor was broken, I could just develop it and hack it, add a new feature, and submit the patch. What happens when I can’t do that?</p>
<p>You&#8217;re seeing a slow uptake in the popularity of the Affero GPL license. Not only does it require you to publish the source if you make any changes to it, but if you are running it as a service, then you must also publish your source.</p>
<p>A poster child project for this is the StatusNet project that Identi.ca runs on top of, but we also saw an announcement this week that SugarCRM v6 will be AGPL. There is a free software reaction to web services, and it will be interesting to see how that develops.</p>
<p>We also have Stormy Peters giving a keynote on that topic, and she’s also hosting a Birds of a Feather session and a bunch of lightning talks on it. I expect OSCON will be a forum this year where we can talk about our reaction to cloud and open source.</p>
<p>With regard to what you&#8217;re saying about how open source has transitioned away from free software idealism and so on, corporations are catching on more and more to the fact that open source is a very useful business tool. Tim O&#8217;Reilly made the point that you can see open source as a tool used by the second player in the market.</p>
<p>Taking phones as an example, the iPhone has a lot of dominance with a completely closed platform. The key for Android is that it&#8217;s open source, so they&#8217;re getting a competitive advantage by opening up. We are seeing that particularly in the mobile segment, but in other places too.</p>
<p>Open source is becoming a business tactic, and OSCON will have a lot of content to help guide businesses.</p>
<p><a name="business"></a></p>
<p><b>Scott:</b> We&#8217;ve interviewed people from a lot of companies that are wrapped around open source projects. We have found it to be quite common that somebody had an idea, built an open source project, and did the care and feeding of it for a while.</p>
<p>Then, when they got a lot of people using it, they launched a company around it to provide professional services, customization, support, and so on. It may be a bit of a stretch to say that most popular open source projects have some kind of company wrapped around them, contributing code, but sometimes it seems that way.</p>
<p>What are your thoughts on that? Can you give us a little bit of a preview of the bullet in the conference description, “use open source effectively as part of your business strategy”?</p>
<p><b>Edd:</b> In the business track, we&#8217;re targeting two kinds of audiences. The first is corporations that are thinking about open sourcing their code bases. OSCON is a great place for people to get oriented into the open source platform.</p>
<p>The other audience is businesses who are using open source. You can leave yourself open to a number of problems if you&#8217;re not actively tracking your use of open source and the associated licensing. Those issues can be especially complex in acquisition scenarios.</p>
<p>In general, open source developers need to pay more attention to matters of law. We have a session from Bradley Kuhn and Karen Sandler, which is proving very popular already, called “The Bare Essentials of Legal Issues for Developers.”</p>
<p>The goal of that session is to teach what developers need to know about trademarks, copyright, and patents. People can’t just release open source code and then forget about it. The future of their project is vulnerable on all three of those fronts, and people must not be ignorant about it.</p>
<p><b>Scott:</b> At this point, it seems to be inevitable for a software company to run into open source, whereas ten years ago, everything they were using might have been proprietary. </p>
<p>To look at the really big examples, IBM obviously interacts with open source and contributes to it a lot. Oracle just acquired Sun, so there&#8217;s a ton of open source in their portfolio. Even Microsoft now contributes to the Linux kernel and has a whole bunch of little projects they&#8217;ve released as open source.</p>
<p>If your company gets really huge, it will almost certainly interact with projects and contribute code, but more along the lines of what you&#8217;re alluding to, your developers are just going to want to use it.</p>
<p>The difference between an Apache license and a GPL license can have huge ramifications if you&#8217;re bundling the code with something you&#8217;re distributing.</p>
<p><b>Edd:</b> Absolutely, and companies need to be aware of that. There&#8217;s another aspect that we&#8217;re also covering. I think business and community in the open source world are very much interlinked. You can&#8217;t really use open source in isolation from its communities. </p>
<p>We have a whole community track that discusses things like how to manage a community full of contributors, how to be an effective leader in those places, and how to foster diversity.</p>
<p>We have a fascinating panel that Leslie Hawthorn&#8217;s put together about financial incentives in open source. Leslie ran the Google Summer of Code Project, and she just recently left Google. I&#8217;m really looking forward to that, just to hear the panel talk about what works and what doesn&#8217;t work and how you can effectively use money to accelerate and encourage open source projects.</p>
<p><b>Scott:</b> It&#8217;s a fascinating topic, and a lot of open source projects have failed because a company thought it was going to be easy. They just threw the code out there, but the community didn&#8217;t come, or maybe it fractured. There are a lot of very smart people working with open source, and a certain percentage are very sure that what they believe is absolutely right.</p>
<p>Community managers need to address the issues that come along with strong personalities, and they need to incent people to do work that nobody really feels like doing, even though it&#8217;s critical. Getting community contributions requires a different mindset than when you have a hundred developers on staff. </p>
<p>There’s also a whole domain of expertise associated with deciding strategic issues like whether to follow the aquarium model or the community model.</p>
<p><b>Edd:</b> That’s true, and the community as a whole is only now finding out a lot of those things. There&#8217;s only a handful of projects that you can look to for a ten-year span of experience, and some of them have gone through pretty radical overhauls. </p>
<p>There&#8217;s another presentation from Dave Beckett, who works for Yahoo! now. He&#8217;s been maintaining infrastructural projects like the PNG since 1994. His presentation is called &#8220;Mature Open Source: Making Your Software Last a Decade.&#8221;</p>
<p>What&#8217;s fascinating about him is that he&#8217;s just one guy. As opposed to the corporate side with incentives and whatnot, he started off more or less as a solo hacker and kept his projects going. As time has gone on, he has slowly expanded the pool of participants and so on.</p>
<p>I&#8217;m glad that we&#8217;re hearing the story from both ends. The corporate side is talking about taking their projects open or funding projects, and the lone hacker side is talking about how to stick with something and slowly build participation without going crazy.</p>
<p><a name="select"></a></p>
<p><b>Scott:</b> OSCON is in a fairly unique position to really understand open source developers, because you&#8217;ve been running this conference for 12 years now. </p>
<p>Open source has evolved with corporations getting more involved, but there’s still that core constituency of open source developers. Are you guys ever surprised by what resonates or doesn’t resonate with those developers?</p>
<p><b>Edd:</b> When you put together a conference, you&#8217;ve got to have several things in mind. First and foremost, people are paying to come there, so it&#8217;s got to be useful and relevant. Over and above that, it has to educate and challenge as well. </p>
<p>As program chairs, we play our part both in recognizing people in the community who&#8217;ve got something different and challenging to say and inviting people who we think will stretch people&#8217;s minds and thinking in our keynotes.</p>
<p><b>Scott:</b> So you&#8217;re not shying away from controversy, in other words. If you think somebody has a really intriguing point of view, you&#8217;re OK with it being somewhat controversial?</p>
<p><b>Edd:</b> I think so. For example, we have been sponsored by Microsoft, and they used our conference to announce their open source initiatives. When they first did that, they got the frosty, hostile reception that they had probably feared.</p>
<p>[laughter]</p>
<p>Still, although there was controversy, the OSCON crowd is a civilized and human one. In fact, I&#8217;ve never known a warmer bunch of people to work with, and to be among for a week. Nevertheless, because we&#8217;re geeks, we debate with a certain degree of ferocity. </p>
<p>To their credit, Microsoft stuck with it, and by year’s end, they started to deliver on a variety of fronts. </p>
<p><b>Sean Campbell:</b> You obviously have a defined process that involves committees and so on to determine which people are asked to present at OSCON, and I imagine that there are certain best practices to get invited as part of that process.</p>
<p>Without naming names, obviously, are there two or three things that you see a lot of people do that either help or hurt their efforts to appear as speakers or panelists at the conference?</p>
<p><b>Edd:</b> First, we certainly like to hear from the principals themselves. It&#8217;s generally a disadvantage if a PR company submits a proposal on your behalf, for instance. When we deal with the actual speaker, that person has much more of a personal investment, and we have a clearer understanding of what&#8217;s going on. </p>
<p>We do sometimes accept things that are submitted by PR, and that’s the way certain companies prefer to operate. The reason that it typically puts them at a disadvantage is that when a geek writes for other geeks, it tends to speak more directly to the audience. If it&#8217;s written by somebody else, it doesn&#8217;t quite grab you in the same way.</p>
<p>The second thing is that we like to see a fair amount of detail. We have 700-plus proposals, and part of the way we can be persuaded that you know what you&#8217;re talking about is that you&#8217;re able to write in a coherent manner and you give detail about the subject matter. </p>
<p>Not every superstar hacker is a superstar presenter, so the ability to present your thoughts clearly, even in just a description and abstract, is a pretty key indicator for us. We also look for the subject matter to be fresh and concrete, rather than being based on buzzwords.</p>
<p>For established projects, we tend to get a lot of proposals that describe &#8220;the state of X&#8221; or &#8220;what&#8217;s new with X in 2010.” That’s all well and good, but frankly, you can read that information off a blog, and we tend to de-emphasize those.</p>
<p>In the past, we&#8217;ve put them in lightning talks, where maybe 16 projects got five minutes each to say what was up and coming, but we&#8217;re not even doing that this year. These days everyone can read a blog or a changelog or a tweet or whatever. We&#8217;re looking for a lot more practical experience, or somebody to put forward an interesting new spin or viewpoint.</p>
<p><b>Sean:</b> Someone who&#8217;s used to giving monologues or who is just too sensitive may not be a good panelist. For a panel, you may want someone who is a bit of a contrarian to make things interesting, without being so far in that direction that the conversation seems fractured.</p>
<p>How do you balance looking for someone to participate on a panel versus an individual speaking slot?</p>
<p><b>Edd:</b> I think you said it pretty well, and to be honest, we don&#8217;t actually accept a huge number of panels. It&#8217;s very difficult to do a good panel, and I think some of it comes down to the personal preferences of the chairs.</p>
<p>Neither Allison nor I like to do too many panels, largely because if they don&#8217;t go well, they don&#8217;t tend to be a very useful means of communicating to people.</p>
<p><b>Sean:</b> How do people get involved in the actual planning side of the conference? Particularly because O&#8217;Reilly is a for-profit corporation, what is the experience like for someone who is not an O’Reilly employee and wants to participate? What channels do you direct that kind of feedback to?</p>
<p><b>Edd:</b> We have a variety of channels, and we do rely very heavily on community contributions, with the first point of contributing obviously being through the call for proposals. A lot of longstanding friends of the conference contribute in that way.</p>
<p>We try to listen to the largest possible swath of people in and around the community, including our company. We do have a formal program committee made up of 20 to 30 folks active in the various orbits that OSCON covers.</p>
<p>We&#8217;re always looking for people to invite onto that committee who are active in their particular part of the open source world or interested in how to make OSCON better.</p>
<p>It&#8217;s a broad conference, so Allison and I as chairs need to rely heavily on the expertise of people in various areas. We have a Perl sub-committee, and people in Python, PHP, Postgres, and so on. We do some degree of self organizing, and they have little committees of reviewers who review all the talks.</p>
<p>We try to make sure that every single proposal is seen by at least two people, and very often a lot more than that. Everything is read and considered and voted on.</p>
<p><a name="future"></a></p>
<p><b>Scott:</b> To call out a technical point, I see that you&#8217;ve included NoSQL as a topic in cloud. How interrelated do you feel those two are? In other words, would we have a big interest in NoSQL if it weren&#8217;t for the cloud?</p>
<p><b>Edd:</b> That&#8217;s an interesting point. We have three broad topic areas that we see as the growth areas this year: NoSQL, cloud, and mobile. All of them are interrelated, and to understand that interrelation, you need to dial back a few years, when we were talking about the LAMP stack.</p>
<p>Then we were talking about web applications, and the web sews all these topics together. The question of whether we would be as interested in NoSQL if we didn&#8217;t have the cloud is interesting from the standpoint that they both stem from the massively distributed and popular information network that we all use, called the web.</p>
<p>Perhaps the aspect that brought NoSQL to the biggest audience is the need to scale web applications and to be able to build inverted indices from regular SQL databases for performance reasons. </p>
<p>The vast majority of NoSQL applications sit side-by-side with an SQL database. There’s a collaboration and a cooperation going on there, and it&#8217;s a hybrid architecture.</p>
<p>Another interesting spin to NoSQL concerns graph databases. They have been around for a good while, but the rise of social networking on the web has finally given us a graph a lot of people care about. Again, you&#8217;ve got web applications driving that side of it, too.</p>
<p>Mobile is now quite interlinked with everything else. Adding a mobile interface to your application is no longer optional. It&#8217;s kind of expected, rather than a value-add, if you&#8217;re launching any kind of commercial service.</p>
<p>Looking at the overall package as a bunch of thin clients and a bunch of chunky servers at the back end, it’s clear that mobile and cloud go hand in hand. If the client can do less, then the server has to do more.</p>
<p>All these issues are growing together, and the other very popular thin client, of course, is the web browser. One must not underestimate the importance of the HTML5 feature set. You&#8217;ve basically got feature-equivalent thin clients in the mobile world and the actual desktop.</p>
<p>One of the bets we&#8217;re making this year is that if you&#8217;re looking to target everywhere, including Android, iPhone, and the rest, then web technologies are the sane choice, as well as the most open route. </p>
<p>A lot of people are rightly worried about developing solely for an Apple platform, when Apple can change the terms and conditions and blow what you&#8217;re doing out of the water.</p>
<p><b>Scott:</b> In another conversation earlier today, someone observed that people get too hung up about building iPhone applications with Apple&#8217;s proprietary SDK, Objective-C, and that kind of stuff. Most of those applications could be built as web applications, and they can even run locally on the phone, with local storage.</p>
<p>There seems to be an awareness problem, where people aren’t really considering that option. Do you think that web-based approach is going to catch on?</p>
<p><b>Edd:</b> I think this is the year when people are going to figure it out, and we&#8217;re actually working on that internally at O&#8217;Reilly at the moment. We develop a bunch of apps, and a lot of our books have been successful in that format on the App Store. </p>
<p>We think we need to provide more mobile interfaces to the things we do, and personally, I think the conference needs to be one of those. </p>
<p>Developers can either put up with a lot of risk and delay from the App Store, or they can do a perfectly great app using just the mobile browser. The poster child for this is Google&#8217;s mail, Gmail, which delivers a great experience on the iPhone.</p>
<p><b>Scott:</b> PhoneGap lets a web app run locally on Android, iPhone, and all that kind of stuff. It looks just like an app, but the whole thing&#8217;s built using HTML5.</p>
<p><b>Edd:</b> We have folks presenting at OSCON on that, and I mentioned several of the cross-platform talks about that sort of thing.</p>
<p><a name="role"></a></p>
<p><b>Scott:</b> You mentioned that this is going to be the year when this model takes off. Since O&#8217;Reilly has the ability to put a spotlight on things, talk a little about the notion that, “With great power comes great responsibility.”</p>
<p>You have observed that using proprietary SDKs and so on can require developers to start over from scratch in order to go from iPhone to Android, with very little ability for code reuse. How concerned do you think you need to be with regard to companies’ terms of service and so on, when you advocate for a certain development approach?</p>
<p><b>Edd:</b> Honestly, a lot of those things come down to personal values. The broad picture is that we&#8217;re obviously very much in favor of openness, transparency, and so on, but we&#8217;re not breaking with vendors consistently. </p>
<p>We will stand up and make a big deal of it when it matters, like Tim O&#8217;Reilly&#8217;s famous spat with Jeff Bezos over the one-click patent, for instance. We also stood up and spoke out directly in favor of Microsoft when they were being hit with a silly patent dispute concerning their web browser.</p>
<p>I don&#8217;t know that the conflict between various mobile platforms is necessarily any different from Windows versus Linux versus Mac, in the sense that all these different platforms have different terms of engagement, and open source plays a role in each.</p>
<p>As far as mobile&#8217;s concerned, we&#8217;re here to help equip developers, so we have a great range of books on iPhone, Android, web standards in HTML5, and so on. Our role is really one of educating, rather than taking a stand on how people should develop.</p>
<p><b>Scott:</b> I want to be sensitive to the time. Any final thoughts about OSCON that you&#8217;d like people to be aware of?</p>
<p><b>Edd:</b> I&#8217;d encourage everyone who reads this to look through the OSCON program. There&#8217;s a wealth of topics that we&#8217;ve not been able to touch on here. For instance, we have a fantastic hardware track, covering Arduino, things like the Beagle Board, and plug computing apps. </p>
<p>It&#8217;s become very easy to prototype, and there are a lot of fun things we can put out in the world to process data, connect to the cloud, and monitor and augment reality. That&#8217;s a fascinating area for us.</p>
<p>We have an operations track, which is very much following the rise of developer operations as a hybrid discipline, and something else the cloud brings with it.</p>
<p>You can&#8217;t really run software for the cloud without taking its architecture into account. When you have ten thousand instances of a machine, you need to allow for that, so the Dev Ops discipline is very much growing.</p>
<p>There’s so much there, I could go on forever.</p>
<p><b>Scott:</b> It&#8217;s obviously a great conference. These are technical people, but at the same time, OSCON really tracks trends, conversations, and debates. Having Microsoft come in was a controversial thing to do, but you guys felt that it was important. It seems that you really live by a philosophy of not shying away from controversial topics.</p>
<p><b>Edd:</b> That&#8217;s right. Community means a great deal to us, and we regard it as an enormous privilege to have this conference. The attendees and speakers each year are what makes it a success, and it’s absurdly gratifying to stand on stage and welcome people.</p>
<p>It&#8217;s a fantastic collection of people who are inspiring, entertaining, and controversial. I really couldn&#8217;t think of a better job.</p>
<p><b>Scott:</b> Thanks for taking the time to chat.</p>
<p><b>Edd:</b> Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/05/31/interview-with-edd-dumbill-oscon-co-chair/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Interview with Shay David &#8211; Co-founder of Kaltura</title>
		<link>http://howsoftwareisbuilt.com/2010/05/24/interview-with-david-shay-co-founder-of-kaltura/</link>
		<comments>http://howsoftwareisbuilt.com/2010/05/24/interview-with-david-shay-co-founder-of-kaltura/#comments</comments>
		<pubDate>Mon, 24 May 2010 23:54:51 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=320</guid>
		<description><![CDATA[Video streaming has grown way beyond YouTube and Hulu. Today, broadcasters such as ABC, NBC, and the Comedy Channel are streaming full shows straight to viewers. Netflix is living up to its name, and Amazon offers a huge library of movies on-demand. And open source has moved well beyond the infrastructure layer, with open source video [...]]]></description>
			<content:encoded><![CDATA[<p>Video streaming has grown way beyond YouTube and Hulu. Today, broadcasters such as ABC, NBC, and the Comedy Channel are streaming full shows straight to viewers. Netflix is living up to its name, and Amazon offers a huge library of movies on-demand. And open source has moved well beyond the infrastructure layer, with open source video platforms like Kaltura. In this interview, we chat with Shay David, co-founder of Kaltura.</p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/24/interview-with-david-shay-co-founder-of-kaltura/#rise">The rise of video as a key form of web content</a></p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/24/interview-with-david-shay-co-founder-of-kaltura/#open">The open source projects and business model behind Kaltura</a></p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/24/interview-with-david-shay-co-founder-of-kaltura/#cloud">The role of the cloud in the future of video editing</a></p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/24/interview-with-david-shay-co-founder-of-kaltura/#content">Key vectors for the spread of video content on the web</a></p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/24/interview-with-david-shay-co-founder-of-kaltura/#formats">Solving the format-selection and openness problems of web video</a></p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/24/interview-with-david-shay-co-founder-of-kaltura/#future">What the future holds for IP-based video</a></p>
<p><span id="more-320"></span></p>
<p><b>Scott Swigart:</b> To get us started, could you introduce yourself and Kaltura?</p>
<p><b>Shay David:</b> I am one of the co-founders of Kaltura, and I run business and community development. Kaltura is the first open source video platform, meaning that we have a set of tools, technologies, best practices, and services that allow people to do anything and everything with video, including adding video to any site. </p>
<p>We empower the entire video stack with a set of open source building blocks, including ingestion, metadata normalization, player development, playlist development, content moderation, scheduling (for both live and on-demand video feeds), monetization, and analytics. Those building blocks are the foundations of video applications.</p>
<p>People are using Kaltura for video publishing, goods and asset management, distance education, and enterprise collaboration, among many other uses. Pretty much across every industry, we feel that we are becoming part of every web experience.</p>
<p><a name="rise"></a></p>
<p><b>Scott:</b> Why do you think video is becoming so important right now, whereas it wasn&#8217;t as widespread in the past?</p>
<p><b>Shay:</b> One of the reasons is that the means of production are so available. If you look five years back, it took a lot more to be a media company or a studio. People needed to have a satellite van, $60,000 cameras, and whatnot. </p>
<p>Today, two billion people carry the means of production in their pockets with smart phones, and more and more of the population are equipped with broadband. The capacity to produce and to consume video is very prevalent. </p>
<p>This creates a whole paradigmatic shift in the set of technologies that are necessary, because, unlike text, video is complicated. A lot of things that we know how to do with text pretty well are not well understood for video; we need the means to handle, store, and stream very large files, transcode one format to another, and manage advertising and analytics.</p>
<p><b>Scott:</b> Do you feel that there&#8217;s more opportunity for analytics in video? For example, if somebody goes to a web page, you can determine how long they spent on the web page, but you don&#8217;t necessarily know what was interesting to them. </p>
<p>With video, on the other hand, I imagine you might be able to know what portions they watched, what they skipped, and so on. How are the opportunities around analytics different for video than other types of content?</p>
<p><b>Shay:</b> Analytics actually encompass two very different types of information. The first is audience measurement, such as who is looking at the content, which countries and demographics they come from, and so on. That type is well understood.</p>
<p>The second set of data is media analytics, which consists of the types of things you were mentioning, like drop-off points, popular pieces of content, level of engagement, etc., which is a much less understood science. Part of what we bring to the table is a whole set of technologies around that.</p>
<p>Using this information begins with identifying the data you actually want to collect and the events on the player side that are associated with such data. Next, you have to decide how to close the loop back into your production systems and determine how you will use the information. Just measuring it without acting on it is only half the task.</p>
<p><b>Scott:</b> The first time I got a sense for video analytics was the whole Janet Jackson Super Bowl thing. With TiVo, you can just hit a button and go back eight seconds, and the company reported that the eight-second rewind function was used at that point more than at any other time in the history of their service.</p>
<p>I thought, “Oh my goodness, TiVo knows every time I back up eight seconds. I had no idea they collected that kind of telemetry information.” That opened my eyes to some of the things that you can track with video.</p>
<p><b>Shay:</b> Absolutely, and that&#8217;s just the beginning. When we talk about social media and interactive media around video, we can go well beyond basic audience measurement, but we have to be sensitive to privacy concerns too. </p>
<p>The Nielsen paradigm is about people-metering, and making some gross assumptions about who&#8217;s watching each channel in very large categories. And that was fine when we were talking about a broadcast paradigm, but who are the people watching? How many of them are there? How many couch potatoes are in front of the screen? </p>
<p>When you&#8217;re dealing with interactivity, there’s a whole new set of measurements to be concerned with, because that&#8217;s what the advertisers or publishers ultimately care about.</p>
<p><a name="open"></a></p>
<p><b>Scott:</b> Talk a little bit about the open source projects that are underneath what you do as a company, how you engage, and how the development of that works.</p>
<p><b>Shay:</b> Kaltura is open source through and through, and we run everything on the LAMP stack. So underneath, for production, we use Linux, Apache, MySQL, and PHP. Our entire backend is in PHP, and the front end is a combination of technologies. </p>
<p>We want to maximize availability to publishers, so we provide a set of code in Flash, a set of code in HTML 5, and more recently, a Silverlight front end. It&#8217;s completely multi-platform; I run it on my Mac, our developers sometimes develop on Windows, and production is based on Linux. The code itself is a set of extensions on top of Apache.</p>
<p><b>Scott:</b> Did you choose PHP for the presentation layer just because you wanted your customers to be able to easily modify it?</p>
<p><b>Shay:</b> Absolutely, and PHP is also the back end. A lot of the presentation layer, for the most part, is the graphics hub. You could use a variety of libraries when presenting content in PHP, .NET, or Java. Ruby, too, is becoming more and more popular. The idea is just to maximize publisher freedom. </p>
<p>A publisher can take the entire set of functionality and just plug it into whatever kind of system they like. One of the things that we&#8217;re proud of is a lot of pre-made integration kits, so if you want Kaltura within whatever context, you can most probably go to our newly released Application Exchange, which is a marketplace for all the extensions, plugins, and applications that Kaltura, our partners, and our developer community create.</p>
<p>You can find kits that we have already pre-integrated with Drupal, Joomla, MediaWiki, WordPress, and many other systems like that, regardless of the operating system or the language you&#8217;re using for the front end. That takes the pain out of the integration.</p>
<p><b>Sean Campbell:</b> Walk me through how you make money. When I look at the way you describe the five layers of your open source stack, you&#8217;re leveraging a lot of open source projects, but you don’t seem to have a fully packaged, open source version of the solution.</p>
<p><b>Shay:</b> We make money in three ways. We bring to market the best-of-breed open source platform, where people who understand open source and have the development wherewithal and capacity to do it can go to our community site, kaltura.org, download the software, install it on their own, and never buy a single service from us.</p>
<p>In exchange for that, we expect them to play nicely with the open source rules. Any derivative work has to be contributed to the community, and they have to follow rules about attribution. The code is licensed under the Affero GPL, so if they want to produce web services, they&#8217;re still regulated by the open source licenses.</p>
<p>From that group of people, we may never make any money. We have 60,000 publishers in our network, and the majority of them never pay us license fees. They use the software and contribute code or bug reports, or just become part of an active community. Some of them opt in to receive ads. </p>
<p>On the other hand, we make money from that group by selling them professional services, such as maintenance, support, installation, application development, and things like that. That&#8217;s one revenue bucket.</p>
<p>The second revenue bucket is a service business for people who understand the open source game but don&#8217;t have the capacity to run it themselves, so they pay us to provide them software and services.</p>
<p>Using the same stack of code, we offer a hosted service where we host their files, we run the transcoding of the files, and we run the applications that are going to be integrated into their stack in a hosted manner. </p>
<p>The third revenue bucket comes from service providers. To hosting companies, cloud companies, telcos, system integrators and the like, the platform is also available as a self-hosted solution, under a commercial license with full support and maintenance from Kaltura.</p>
<p><a name="cloud"></a></p>
<p><b>Sean:</b> I&#8217;ve heard a lot of people say that there are certain things they wouldn&#8217;t do with a cloud-based app, and a common thing people tend to bring up is heavy multimedia editing. It requires significant CPU, and you want to have full-fidelity solutions like Final Cut Pro or even key solutions like iMovie or Final Cut Express or Pic Premiere on Windows.</p>
<p>On top of that, the open source community has at times struggled with desktop Linux OSs like Canonical to really get good embedded tools that would rival some of those toolsets. You guys have obviously spent some time on the editing side, and it looks like you have done really great work. </p>
<p>Talk a bit about the limits you have to deal with and work around, in terms of people&#8217;s user experience with uploading and editing.</p>
<p><b>Shay:</b> I think that&#8217;s challenging, but we have a few solutions for that. One is that we do offer what we call the Self-hosted Edition, which has the capacity to run the hardened version locally. If you&#8217;re a service provider, an enterprise, or a university and you want to run this locally, you don&#8217;t necessarily have to run it from the cloud. </p>
<p>We give you a hardened version that you can install on your premises, and you own root. You can control it, and the files are close to the destination so you don&#8217;t have to operate over VPN. </p>
<p>Depending on the details of the deployment, you might be able to enjoy economies of scale without having to have the complexity of cloud. We see a lot of adoption of this model in higher education, where many institutions have robust on-campus infrastructure.</p>
<p>I think you&#8217;re correct to point out that it&#8217;s only a matter of time until you can pretty much do everything remotely. People like Sun have been saying for ages that the “network is the computer,” meaning that, at some point in time, if the network is fast enough, you don&#8217;t really care where the CPU is.</p>
<p>In that sense, I think that what we have is a beginning. It&#8217;s not ready to replace high-end, broadcast-grade editing solutions any time soon, but people are using it for production, and you can run really complex workflows around it.</p>
<p>Where cloud-based editing becomes essential is when you&#8217;re starting to think about collaboration. On one hand, if I&#8217;m editing my own files that I just uploaded from my camera to my computer, I have little motivation to run everything on the cloud, other than the benefits of automatic updates, auto-backup, and so on.</p>
<p>On the other hand, a group of students working on a class project doesn&#8217;t really have the option of doing it offline. In the context of enterprise collaboration or training, the need arises to gather information from multiple sources, and potentially from multiple geographical locations.</p>
<p>Therefore, you are almost mandated to have a network solution. For a network solution like that, the first tier replicates the broadcast model online. A provider has a library of content and wants to capture data out there. But that is just the starting point.</p>
<p>As we&#8217;re moving into the new realm of video, it&#8217;s about interactive media; it’s about collecting information from multiple sources and being able to transmit it. That&#8217;s the place where I think we are seeing a lot of adoption of the more interesting concepts around editing, remixing, annotation, content manipulation, and metadata extraction. </p>
<p>In my mind, all those fall in the same category of advanced network services that have to do with collaboration.</p>
<p><a name="content"></a></p>
<p><b>Sean:</b> People obviously love social media streams and the collaboration associated with them, but it&#8217;s somewhat uni-directional. Normally, I upload my video, and people watch it. It gets more interesting when they remix it or do something interactive with it. Some musicians actually like that their fans are doing that. </p>
<p>That creates an interactive world where, by lifting all that weight off of physical drives in everybody&#8217;s house and office, we enable new use cases. If it really is all up there and you&#8217;ve got good editing tools, you can blend and create environments and communities.</p>
<p>There&#8217;s plenty of video on the Web, but it&#8217;s mostly a broadcasting-type model. I&#8217;m hearing you say that you might be able to enable a certain level of community around video as a content stream, given the tools and the environment you provide.</p>
<p><b>Shay:</b> Absolutely. I think that revolution is impending, and I think the first step we&#8217;re going to see at a large scale is on Wikipedia. Using this technology, we&#8217;re providing Wikipedia the same type of media collaboration that they know from text. </p>
<p>Some people are going to upload, some people are going to edit, some people are going to clip, and some people are going to transmit. It&#8217;s a completely open system that allows people to collect multiple media files, sequence them, edit them, and translate them, all in an open way.</p>
<p>It&#8217;s very exciting stuff, and two years of work are coming to fruition. The first phase just went live a couple of weeks ago. People are contributing content, and then later in the year, all the sequencing and remixing software is going to go live. That&#8217;s going to be very significant.</p>
<p>It&#8217;s been a great year for HTML5 video. Safari, Chrome, and Firefox already support it, and a few weeks ago, Microsoft announced that a new version of IE is going to support it. Google announced that it has open-sourced the VP8 codec. That kind of opens the way to creating a truly open HTML5 solution.</p>
<p><b>Sean:</b> It seems that one of your potential business development strategies would be to look at properties on the Web that are already successful but also text-heavy, which would naturally lend themselves to a video overlay.</p>
<p>I imagine you also have your eyes on other text-heavy properties that are driven by community contributions.</p>
<p><b>Shay:</b> True, but I think the opportunity goes beyond publishers like Wikipedia in the classic sense of publishing. It&#8217;s not only text-heavy content that is aimed at end users; it&#8217;s also other places where there is mainly text. </p>
<p>Think about university learning-management systems (LMSs) that are 95 percent text right now and maybe a PDF and an image here and there. Think about Enterprise Resource Planning (ERP) or Customer Relationship Management (CRM) software, and all the other places that couldn&#8217;t use rich media because it was too complicated. Now we can change all that. </p>
<p>We are giving the market a set of technologies that are enabled for rich media and video. I’ve seen statistics that show that video is going to be more than 90 percent of the traffic on the web. I think that&#8217;s a conservative estimate; I think it&#8217;s going to be somewhere like 95 to 99 percent of the traffic in the world in terms of number of bits. </p>
<p>For large education institutions or enterprises, this requires a huge shift in thinking. For these guys, it&#8217;s not about all the libraries, but being able to integrate into the platform that&#8217;s going to give them all the tools, because now, if you don&#8217;t do it properly, all of a sudden you&#8217;re going to choke on it.</p>
<p>Imagine you&#8217;re a university campus of 50,000 students and you have an LMS that’s running great. You have a new film professor who assigns his 300 students homework to upload movies to a campus web site. If they do that before you provide for it, 300 students each uploading half-gigabyte files will cause the system to collapse. True story.</p>
<p>In scenarios like that, you really have to think about the infrastructure. You have to use a new set of tools. That&#8217;s just one example.</p>
<p><a name="formats"></a></p>
<p><b>Scott:</b> You mentioned HTML5. For the longest time, as videos started to emerge on the Web, it all gravitated toward Flash, but now, the industry seems to be moving toward HTML5. Google is pushing it, Microsoft&#8217;s going to support it, and of course, Apple has declared war on Flash. </p>
<p>I can’t play at least 85 percent of the videos on the web using my iPad, because they&#8217;re all Flash-based. What are you seeing in terms of emerging video formats, and what opportunities do you think that creates?</p>
<p><b>Shay:</b> I think a lot of the war is driven by adoption and devices. There&#8217;s no question that Flash is installed on a large majority of PCs and Macs, but that’s not the case on mobile devices. Flash doesn’t have any adoption on IP-enabled TVs, set-top boxes, Windows Mobile phones, Android phones, iPhones, iPads, and iPod Touches. </p>
<p>Only a relatively small portion of devices can support Flash, and while its absolute adoption numbers might be increasing because the PC market is still growing, it has declined as a percentage of the overall ‘screens’ out there. </p>
<p>I think there is an opportunity for open formats to take Flash’s place, and HTML5 is one of those formats. HTML5 has always been a codeword in that sense, because it does not define the codec or the transfer protocols for video, so we&#8217;re going to see a lot of bifurcation in the market moving forward.</p>
<p>The promise of HTML5 is that it allows developers to write once and publish anywhere. You can include video tags on your page, and it’s the device&#8217;s responsibility and the browser’s responsibility to figure out the codec and the transfers that render the video properly. In that sense it&#8217;s a huge step forward, but I don&#8217;t think Flash is going to disappear overnight.</p>
<p>It&#8217;s a paradigm shift that is going to take a few years, and I think Flash has a lot of things going for it. We work very closely with both Microsoft and Adobe to maximize the benefits to publishers. The biggest opportunity it presents for us is that today, publishers are faced with a bifurcated audience, because they have to address the iPhone, the iPad, Android, set-top boxes, and PCs.</p>
<p>They need a set of tools that allows them to simplify writing and publishing their application without having to think about all the devices, and that represents to us a big opportunity to provide an infrastructure that allows developers to write just once. Our libraries are going to do the heavy lifting for them in order to deploy the app with the appropriate protocols.</p>
<p><b>Sean:</b> Anybody who&#8217;s produced video for the Web, even to put a video on their blog, is immediately faced with choosing a bitrate, encoding, and whether to put it up as an MOV file, Flash, a WMV file, or whatever. </p>
<p>People trying to create interesting content don&#8217;t really want to worry about all those technical details, and there is no right choice anyway. It really depends on what a particular user has on their machine. </p>
<p>Like you said, if they&#8217;re using a mobile phone, one format&#8217;s right, and if they&#8217;re using a Mac or a Windows PC, a different format might be optimal. I think you&#8217;re right that there&#8217;s a huge opportunity to deal with those issues and relieve the content developer of having to worry about them.</p>
<p><b>Shay:</b> Precisely. I think that no single company can solve the challenge alone. We&#8217;re big believers in creating standards in the market, interoperability, and distributed control. </p>
<p>We’re a founding member of an organization called the Open Video Alliance. That&#8217;s an umbrella organization that is bringing together all the people who care about open video. We had a wonderful Open Video Conference last year where more than 900 people showed up to talk about these issues, and 4,000 more were on the live feed. </p>
<p>This year it is going to happen again on October 1 and 2. I&#8217;m inviting the developer community and the people who care about it to actually come and participate and just vote with their feet, helping solve that problem on the ground.</p>
<p>We are also involved with an industry resource website called HTML5Video.org, which allows interested people to showcase the technology, explore the libraries, play around with it, and experience the potential of HTML5.</p>
<p><a name="future"></a></p>
<p><b>Scott:</b> That&#8217;s great. Any closing thoughts from your end?</p>
<p><b>Shay:</b> It&#8217;s an exciting time for this market, and we are maybe at the end of the first inning, with eight more innings to go. I think this is the same order of explosion that we saw with the first wave of the Web 10-12 years ago, with text. </p>
<p>I think next year, more than half of the TVs are going to be IP enabled, and half of the homes are connected to broadband. Two billion people have access to the production process, and imagination is their only limitation. </p>
<p>I think 2010 is the year of HTML5, and we are going to see massive adoption. 2011 and 2012 are going to be all about productization and adoption of new business models around video. We&#8217;re nearing a state where the plumbing of video, so to speak, is done, and the beautiful gardens will soon spring into full bloom.</p>
<p>Publishers no longer have to spend human resources on the infrastructure. The problems are beginning to be well understood—and solved with technology like what Kaltura brings to the market. Now people can start focusing on innovation and actually building the applications and what follows around it. </p>
<p>The Application Exchange we just launched recently is an excellent example of that, which allows people to think about higher-level applications, instead of plumbing.</p>
<p>How could video drive better distance education experiences? How could it drive better enterprise collaboration? How could it drive better publishing and media experiences? I think that&#8217;s the future we&#8217;re looking at.</p>
<p><b>Scott:</b> Thanks so much for taking the time to chat.</p>
<p><b>Shay:</b> Thank you. For more information check out:</p>
<ul>
<li><a href="http://www.kaltura.com">www.kaltura.com</a></li>
<li><a href="http://www.kaltura.org">www.kaltura.org</a></li>
<li><a href="http://openvideoalliance.org">openvideoalliance.org</a></li>
<li><a href="http://html5video.org">html5video.org</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/05/24/interview-with-david-shay-co-founder-of-kaltura/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Interview with Jonathan Lindo &#8211; CEO Replay Solutions</title>
		<link>http://howsoftwareisbuilt.com/2010/05/17/interview-with-jonathan-lindo-ceo-replay-solutions/</link>
		<comments>http://howsoftwareisbuilt.com/2010/05/17/interview-with-jonathan-lindo-ceo-replay-solutions/#comments</comments>
		<pubDate>Mon, 17 May 2010 19:01:41 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=317</guid>
		<description><![CDATA[Replay Solutions makes innovative software that helps developers troubleshoot and debug applications. Basically, its technology lets you record interactions that are happening in production, play them back in the lab, and figure out what&#39;s going on.&#160; In this interview, we chat with Jonathan Lindo, CEO of Replay Solutions. The genesis of ReplayDIRECTOR: recording and playback for [...]]]></description>
			<content:encoded><![CDATA[<p>Replay Solutions makes innovative software that helps developers troubleshoot and debug applications. Basically, its technology lets you record interactions that are happening in production, play them back in the lab, and figure out what&#39;s going on. In this interview, we chat with Jonathan Lindo, CEO of Replay Solutions.</p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/17/interview-with-jonathan-lindo-ceo-replay-solutions/#genesis">The genesis of ReplayDIRECTOR: recording and playback for software</a></p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/17/interview-with-jonathan-lindo-ceo-replay-solutions/#architecture">Managing the architectural complexity of simulating multi-level applications</a></p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/17/interview-with-jonathan-lindo-ceo-replay-solutions/#implement">Implementing the test and debug environment, and the results that follow</a></p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/17/interview-with-jonathan-lindo-ceo-replay-solutions/#economy">Handling the economic pressures of being a startup, especially in a down economy</a></p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/17/interview-with-jonathan-lindo-ceo-replay-solutions/#open">The value of open source in an entrepreneurial setting</a></p>
<p><span id="more-317"></span></p>
<p><b>Scott Swigart:</b> To start us off, take a second to introduce yourself and talk a little bit about Replay Solutions.</p>
<p><b>Jonathan Lindo:</b> I&#8217;m one of the co-founders and CEO of Replay Solutions. We&#8217;re a software company that has developed a unique technology that helps to automate a large portion of the software life cycle for anybody who is building or managing software applications. </p>
<p>My background is technical. I&#8217;ve been a software engineer for over 15 years, before founding this company in 2004. Throughout my own career, I&#8217;ve felt the pains and the need for a technology like ReplayDIRECTOR, which is our core product. </p>
<p>We&#8217;ve experienced a lot of the challenges and difficulties that folks are facing even more today, in terms of delivering high quality applications for the new types of environments in which people are now deploying applications.</p>
<p><a name ="genesis"></a></p>
<p><b>Scott:</b> Unpack that a little bit. Explain what ReplayDIRECTOR does and a little bit about how it does it.</p>
<p><b>Jonathan:</b> Let me start off with a bit of a story to tell you how we got to where we are as a company. My co-founder, Jeff Daudel, who is our CTO, and I have worked together for over a decade, most recently at a company called Muse Corporation. </p>
<p>That company was building a very ambitious software project to deliver the next generation of the Internet, including a lot of new technologies like 3D graphics, multi-user browsing, spatialized audio, and video, in a 3D environment. </p>
<p>The idea was that companies like Paramount and Sony that have a lot of content could develop very rich 3D websites, while making them so simple from a user perspective that your mom could very easily access that content in a very compelling way.</p>
<p>Building that required a lot of different components. There was a highly distributed, heterogeneous back end, with a lot of databases talking to application servers.</p>
<p>What we found as we were trying to bring that product to market was that, with our large group of QA testers and beta testers, we were drowning in the sea of bugs and issues that were being reported. We were spending almost all of our time simply trying to reproduce and fix these problems that were being reported, as opposed to actually improving and shipping the product.</p>
<p>This was in the late &#8217;90s, and TiVo had just started to become popular. Both Jeff and I were early adopters and big fans of that technology, and we still are. We both thought it would be great if we could bring a concept similar to TiVo&#8217;s recording and replay capability to software development.</p>
<p>When a problem was experienced in the field or during testing, we would have a recording of it as it first occurred, which would allow us to easily trace back to the exact root cause. That was the genesis of the idea, and how it works is actually pretty simple. We basically hook into the software application as it&#8217;s running and record all of the sources of input that could potentially affect the software.</p>
<p>By doing so, we provide just enough data to allow the software to replay and actually reproduce, at the source code level, the exact root cause of any error. We reproduce the error in such a way that it&#8217;s actually debuggable. You can attach your debugger, like Visual Studio or Eclipse, step through the code, and see the root cause of the problem, right down to the responsible line of code.</p>
<p><b>Scott:</b> It seems that there would be a lot of technical challenges to building that, in terms of capturing all those events, all those inputs, into the software, and being able to play it back. A number of things spring to mind.</p>
<p>Let&#8217;s say that a user logs into their online banking site, tries to pay a bill, and gets some kind of an error. You can&#8217;t literally just replay that transaction over and over, taking money out of their account and giving it to the power company each time. How do you address cases like that, where an error can&#8217;t be replayed in the production environment?</p>
<p><b>Jonathan:</b> In your online banking example, a browser talks to a web server, and there&#8217;s also an application server, a database, and probably some authentication servers running. All of that interaction will result in a lot of different transactions taking place between the various components of your application. </p>
<p>That set of transactions speaks to the incredible complexity that now exists in application-deployment environments, which becomes even more pronounced with deployment into the very distributed environments associated with cloud infrastructures.</p>
<p>Our technology basically isolates each of those components as they&#8217;re running in the system. Let&#8217;s take the example of the application server. What we do is actually monitor the activity on that application server in a very lightweight, seamless way, and allow just those actions that occurred on the application server to be recorded and replayed.</p>
<p>The result of that approach is you can take the recording of the problem that we just described, an error in an online banking system, and you can bring that problem out of the actual production environment and put it onto a developer&#8217;s or a QA person&#8217;s workstation.</p>
<p>Without needing access to the database, the authentication servers, or any of the other components, including the client itself, you can actually replay the problem and see it occur at a source code level. It&#8217;s not necessary to actually execute the transaction again.</p>
<p>When you replay, you&#8217;re actually running in a virtualized environment, so part of what we actually do is application virtualization. VMware and Microsoft are very popular for doing OS virtualization, and we bring that up the stack and virtualize at the application layer. That allows us to run software in a virtualized mode without impacting any of the surrounding components that existed when the recording took place. </p>
<p><a name ="architecture"></a></p>
<p><b>Scott:</b> Do I understand correctly, then, that you&#8217;re creating a virtualized test instance of the application server? When input comes in, the application server makes a call out to try to talk to the database and retrieve records. The replay is basically just replaying back the data that it got when it did that in the production environment. </p>
<p>From the app server&#8217;s perspective, it thinks it called to the database, and it thinks it got data back, but it&#8217;s really just getting a replay of that. It&#8217;s been transplanted into your test environment, but it still thinks that it&#8217;s in the production system and is behaving as though it were.</p>
<p><b>Jonathan:</b> That&#8217;s a perfect way to describe it. We inject the same inputs back into the application and simulate the surrounding components such as the database and any clients that existed when that issue took place. </p>
<p>You can imagine that, if you had 1,000 clients connected to your application, one of the big benefits of a solution like ReplayDIRECTOR is that you don&#8217;t need to generate that load again. The replay will have captured all of that activity from those 1,000 clients, and you simply need to press &#8220;play.&#8221;</p>
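<p>The record-and-replay idea described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not ReplayDIRECTOR&#8217;s actual implementation: the class names <code>RecordingProxy</code> and <code>ReplayProxy</code>, the <code>handle_payment</code> function, and the sample account data are all hypothetical. The recorder logs each response the application receives from a dependency, and the replayer later feeds those same responses back, so the application code runs again without the database being present.</p>

```python
import json


class RecordingProxy:
    """Wraps a real dependency and logs every call and its response."""

    def __init__(self, real_call):
        self.real_call = real_call
        self.log = []

    def __call__(self, *args):
        result = self.real_call(*args)
        self.log.append({"args": list(args), "result": result})
        return result


class ReplayProxy:
    """Stands in for the dependency, answering from a recorded log."""

    def __init__(self, log):
        self.pending = list(log)

    def __call__(self, *args):
        entry = self.pending.pop(0)
        if entry["args"] != list(args):
            raise RuntimeError("replay diverged from the recording")
        return entry["result"]


def handle_payment(lookup_balance, account, amount):
    """Hypothetical application code under test."""
    balance = lookup_balance(account)
    if amount > balance:
        return "declined"
    return "paid"


def production_database(account):
    # Stand-in for a real database query; the data is made up.
    return {"alice": 120, "bob": 15}[account]


# In production: run the request with recording switched on.
recorder = RecordingProxy(production_database)
live_result = handle_payment(recorder, "bob", 40)

# Serialize the recording so it can leave the production environment.
recording = json.dumps(recorder.log)

# In the lab: no database at all, only the recording.
replayer = ReplayProxy(json.loads(recording))
replayed_result = handle_payment(replayer, "bob", 40)
```

<p>Because the replayed run sees exactly the responses the live run saw, it follows the same code path, which is what makes it possible to attach a debugger and step through the failure.</p>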
<p><b>Scott:</b> Do you run into issues when something needs to authenticate? My understanding is that some security systems are specifically designed not to let you replay a response of an access token, since that could allow a malicious user to intercept an authentication request as a means to spoof or hack the system. </p>
<p>I&#8217;m just kind of trying to poke around the edges, but do you occasionally run into some scenarios that you can&#8217;t replay, because of issues like that?</p>
<p><b>Jonathan:</b> If you think about your earlier description, where the database is not required on replay, it&#8217;s the same for any other server, such as an authentication server. We are actually simulating and making the application think that there&#8217;s an authentication server present during replay. </p>
<p>When that request goes out to the authentication server to authenticate, our system will respond, in place of the authentication server, to say, &#8220;Yes, you&#8217;re OK. Keep going.&#8221; That&#8217;s what allows you to take that recording out of the environment in which it occurred and replay it anywhere.</p>
<p><b>Scott:</b> I&#8217;m picturing little agents or something similar that do this lightweight monitoring. They are able to capture these inputs and outputs in your production environment, so you can take them back to the lab. </p>
<p>How does your hosted solution plug into that? That is, what&#8217;s the difference between doing this on premise versus hosted?</p>
<p><b>Jonathan:</b> We&#8217;re really excited to have announced with Version 3.0 of ReplayDIRECTOR that we are now offering cloud hosting as part of our solution. What that means is, as you described, you&#8217;ve got agents that run on your application servers that are recording, either 24 by 7 or however you&#8217;d like to set that up. </p>
<p>The data those recordings generate is being sent to the ReplayDIRECTOR server, which can be run behind your firewall and is used to store and manage those recordings. It also provides a web-based interface that lets people share, collaborate, and access those recordings.</p>
<p>In the cloud-hosted solution we&#8217;re now offering, the ReplayDIRECTOR servers will run on our premises. We provide that as a hosted service, so the recordings will be stored securely on our servers. That&#8217;s the component that allows folks to get up and running really quickly, without having to download and install a server on their own.</p>
<p><b>Scott:</b> I imagine that these recordings could result in extremely large amounts of data, depending on what&#8217;s flowing through the system and what&#8217;s required to service a request. Sometimes applications aren&#8217;t very efficient, and they&#8217;ll make a request to a database that returns a lot more data than it actually needs, and the application will parse through it, rather than correctly pushing the work to the back end.</p>
<p>Especially if you&#8217;re going to be pushing this data up to a hosted service, it seems like there must be some kind of tuning you can do to control what&#8217;s captured, how long it&#8217;s kept, and those sorts of things.</p>
<p><a name ="implement"></a></p>
<p><b>Jonathan:</b> One of the nice things about our technology is that it&#8217;s very simple to get, configure, install, and start to use. It&#8217;s basically a very binary, on-or-off type of tool, and there&#8217;s not a lot of tuning required. We see that as a very good thing. </p>
<p>We&#8217;re able to do that because the recording system itself is extremely lightweight. It can be run all the time, even in a production environment, and it&#8217;s there when you need it.</p>
<p>We&#8217;ve gone through a lot of research and development effort over the last six years, building up this technology so that it&#8217;s extremely lightweight and records as little data as humanly possible. The data it does record is actually compressed and encrypted as well, which further reduces the data size that&#8217;s generated.</p>
<p>How long the recordings are kept is also tunable. If you want to allocate 500 gigabytes of disk for your recordings, that&#8217;s often going to be enough to store more than 30 days&#8217; worth of recordings. You can also tune it to keep a rolling window of the last seven days, for example. </p>
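<p>A retention policy of the kind described here, a disk budget combined with a rolling window, might look roughly like the following Python sketch. The function name <code>prune_recordings</code>, its parameters, and the sample sizes are assumptions made for illustration; the interview does not describe how the product implements retention.</p>

```python
from datetime import datetime, timedelta


def prune_recordings(recordings, now, max_age_days, max_total_bytes):
    """Return the recordings to keep, newest first.

    recordings: list of (timestamp, size_in_bytes) tuples.
    """
    cutoff = now - timedelta(days=max_age_days)
    kept = []
    used = 0
    for stamp, size in sorted(recordings, reverse=True):
        if cutoff > stamp:
            break  # this and all older recordings fall outside the window
        if used + size > max_total_bytes:
            break  # disk budget exhausted; drop this and anything older
        kept.append((stamp, size))
        used += size
    return kept


# Ten hypothetical recordings, one per day, 100 bytes each.
now = datetime(2010, 5, 17)
recordings = [(now - timedelta(days=d), 100) for d in range(10)]

# Seven-day rolling window with a tight 500-byte budget.
kept_tight = prune_recordings(recordings, now, 7, 500)

# Same window with a budget large enough that only age matters.
kept_roomy = prune_recordings(recordings, now, 7, 10_000)
```

<p>Whichever limit is reached first decides what is discarded, so a generous disk allocation lets the time window govern, while a small one trims the history sooner.</p>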
<p><b>Scott:</b> Obviously, the primary scenario is that an error happens in production, or there&#8217;s a potential bug, and you need to track it down. Have you had people use it for other things? I could imagine, for example, that people could use it forensically to find out what happened when their web server got hacked.</p>
<p><b>Jonathan:</b> There are really four areas where this technology has been used successfully. The first is in problem reproduction and resolution. The second is in security, as you mentioned, which allows people to go back and look at any security issue, as it occurred, with as much detail as they would like. </p>
<p>Third, this is a very powerful technology to analyze and resolve performance issues. Finally, it allows any kind of compliance-related issue to be verified, reproduced, and audited. So our technology has been applied well beyond problem resolution, and with good success.</p>
<p><b>Scott:</b> Without divulging anything about your customers, obviously, what can you tell us about the spectrum of uses for this technology, maybe from small startups to relatively massive implementations that require a lot more scale?</p>
<p><b>Jonathan:</b> Let me give you an example of a very large software company that has a lot of products out in the field with thousands of customers. They had a product that was used on a daily basis by large software teams, and every couple of months, they&#8217;d get a report of a bug that was corrupting a database. </p>
<p>It was so bad that people would have to go back and actually restore the database from the last backup that they&#8217;d made. This was the kind of problem that they could never reproduce in their pre-production environment. They ended up making seven attempted fixes to this problem, and one after another, all seven failed in the field, and another problem report would come in with a corrupted database.</p>
<p>When they started using Replay, they captured the problem with a recording, which allowed them to reproduce the issue in literally about 10 minutes. All they had to do was to access the recording, press &#8220;play,&#8221; and watch the problem happen again. Their next fix nailed that bug, which it turned out had been in their system for over two years.</p>
<p><b>Scott:</b> Talk a little bit about the business side of the company. These are interesting times, as they say, economically. What can you tell some of our readers who may aspire to starting a company about your lessons learned and that sort of thing?</p>
<p><b>Jonathan:</b> In 2004, having identified this as a major pain point for us and for most people in our industry, we decided to found a company. We went out with a prototype of the technology, early on, and we started working with companies directly. </p>
<p>We were fortunate enough to be able to use some of our contacts to start working directly with companies before we even considered raising money or building up the company. At that time, it was just me and my business partner, Jeff.</p>
<p>We built up a small stable of pretty notable customers, including Microsoft, NVIDIA, and Electronic Arts. We were able to use those relationships to bootstrap the company and continue our research and development.</p>
<p>A lot of man years of research and development have gone into this technology, including about a dozen patents, two of which have now been issued. We&#8217;re very excited to have reached this point, but it certainly was a bit of a road.</p>
<p>In 2006, we had built the company up to a handful of folks. We had reached profitability and then decided to go and look for an opportunity to grow the company. We did so by partnering with two venture capital firms in Silicon Valley. One was Hummer Winblad, and the second was Partech.</p>
<p>With them, we were able to grow the company and really expand the breadth of our technology, to build a more ubiquitous solution that could be used by even bigger teams than in the past. We&#8217;ve continued that research, and we actually raised our second round of funding in 2008, with Sigma Partners and UV Partners.</p>
<p>Since then, the economy has really started to rebound, and we&#8217;re seeing the benefits of that. We had a record 2009, and we have started off very well in 2010. We&#8217;re hiring, we&#8217;ve expanded our office space, and we intend to meet or exceed our targets for the year.</p>
<p>We&#8217;re certainly seeing some good signs in the economy. People are coming back and looking at investing in technology that will help them be more efficient in the new environment in which we find ourselves. </p>
<p>People are looking for solutions to help them with the present complexity of the software environment, which includes cloud computing, and we&#8217;re one of the new players on the scene to really address those kinds of issues. </p>
<p>From a business standpoint, people are looking to be lean, mean, and efficient, but they are also looking to be strategic, and we&#8217;re benefiting from that as well. Some of our large customers in the financial services space are really taking that attitude, and it&#8217;s been good.</p>
<p><a name ="economy"></a></p>
<p><b>Scott:</b> Looking back at the very beginning of your company, there are a couple of tensions that I think any startup faces. That&#8217;s especially true if you&#8217;re going after a lot of large enterprise customers, and I&#8217;d be interested to hear how you dealt with them. </p>
<p>The first of these problems I&#8217;ve seen is that a big company might be very interested in your technology, but they could be afraid that your little company won&#8217;t be around in a year. They may refuse to make a big investment in it that includes implementation, training people, and so on, because you&#8217;re simply not big enough yet.</p>
<p>The other common problem is that they may think the base technology is great, but your product may be missing one or more features that they really need. At that point, you have a conflict between staying true to your vision for the product and servicing those kinds of one-off feature requests. They may take your product in a direction that&#8217;s not strategically beneficial to your company as a whole.</p>
<p><b>Jonathan:</b> There&#8217;s no question that we have felt those pressures during the course of our company&#8217;s lifetime. We&#8217;ve approached that first question, about convincing large organizations to make a financial and workflow investment, by building our software to be extremely seamless and lightweight.</p>
<p>That means that people don&#8217;t really have to make significant changes, if they have to make any at all, to the way they build and manage their software. We&#8217;re very complementary to the existing workflows, technologies, and tools that folks are using today.</p>
<p>We have painstakingly taken the approach to keep our technology as transparent as possible so it fits into existing workflows. That means that, if they did need to remove our technology from their workflow, it would not cause things to break.</p>
<p>Obviously, it would make their jobs more difficult not having Replay, but it&#8217;s not something that they&#8217;re betting the farm on, in terms of the way that they go about their daily workflows.</p>
<p>Your second point concerns how to react and adjust to pressure for custom development or feature requests that might not be strategic for our company. We&#8217;ve approached that, again, by building our technology to be lightweight and seamless, but also to have a low entry point.</p>
<p>By having a broad appeal, and by virtue of our technology being scalable and robust, we can afford to take on smaller customers, rather than just the $500,000 deals. Our low entry point in terms of price means that we win out on volume and don&#8217;t necessarily have to bend over backwards to win specific deals.</p>
<p>We take a long-term view in terms of how we engage with our customers. People might start off by deploying to their pre-production teams in one or two departments, and then they might get comfortable with it and grow with the product.</p>
<p><b>Scott:</b> I&#8217;d also like to talk a bit about your approach of bootstrapping the company for a couple of years, and then going out and looking for venture funding. It seems to me that I have heard a lot of people talking about VC funding really having dried up in the 2007-2008 timeframe.</p>
<p>I think there are a lot of people out there who think that they can walk into a VC with a set of slides to show off a great idea and fairly easily get funding. What advice can you share about walking down that path of bootstrapping versus VC, as well as the right point to switch from one to the other?</p>
<p><b>Jonathan:</b> Walking into a venture capital office with a slide deck is a very rough hill to climb, unless your last three companies were extraordinarily successful. It&#8217;s also the case that today, people can get very substantial infrastructures online with very little capital investment. </p>
<p>It is really a new world in terms of being able to build substantial modern technology, get it online, and start building a customer base or a community around it with very few resources. Then you can start thinking about venture capital or other forms of investments, or not.</p>
<p>There are options in 2010 that didn&#8217;t exist in 2004, in terms of your ability to go out with a small team and do incredible things. Our premise was to prove to ourselves that our idea was really viable before anything else. We did that by building up a base of about 15 customers that were really getting value from our technology.</p>
<p>Once we had done that, we felt like we had a proof of concept on which to build a profitable company, so we decided to see if we could expand at a more rapid pace. At some point, you do have to be concerned about other companies coming up and chasing down those technologies.</p>
<p>It is a bit of a balance, and I think today, VCs are more risk-averse than they have ever been before. It is a little bit more difficult to go and attract venture financing, but on the flip side, it is also much easier to do more with less.</p>
<p>I think it is a great time to be thinking like an entrepreneur, and if you have a great idea, go out there and try to make something happen. If you look at what has happened with the iPhone App Store, the Google Apps Store, and soon Microsoft&#8217;s Windows Mobile 7, there are many opportunities to go and build new technology and bring it to market.</p>
<p><a name ="open"></a></p>
<p><b>Scott:</b> The majority of people we talk to are involved with open source projects, or their business is wrapped around an open source project. There is the perception in some quarters that open source is trendy, and there is more interest in a product that has an open source component to it, versus something fully proprietary. </p>
<p>Obviously, you have been successful during a pretty tough economic time, so what is your view about the perceived necessity to have some piece of the puzzle be open?</p>
<p><b>Jonathan:</b> I think open source makes a tremendous amount of sense for certain sectors, although I am not convinced that the problem of how to monetize successful open source projects has been completely solved. </p>
<p>I think that is an ongoing area of experimentation and research. There certainly have been some good successes with open source companies that have been acquired, but in terms of really generating a lot of revenue in that space, I think there is still a lot of work to be done.</p>
<p>Because our technology is so new, and there really isn&#8217;t a lot to compare it to out there, we have decided so far not to open source the core component, although we do open source pieces of it, such as our plug-in.</p>
<p>That is certainly an area we are looking at constantly, and we are big fans of the open source community. We actually are providing our software free of charge to many open source groups, and certainly we will continue to do that and expand our engagement with the open source community. In terms of open sourcing our technology, I am still looking for some good business reasons and benefits to the company to do so.</p>
<p><b>Scott:</b> Well, we have covered a lot of great ground. Are there any closing thoughts from your end?</p>
<p><b>Jonathan:</b> It is an exciting time to be developing software and building new technologies. We like to think of ourselves as a great example of a company that is taking advantage of the new paradigm in software. Our technology, we think, is really helping folks to capitalize on that, so we&#8217;re hoping that people will check out ReplayDIRECTOR and see what it can do for them.</p>
<p><b>Scott:</b> Great. Thanks for taking some time to chat. </p>
<p><b>Jonathan:</b> Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/05/17/interview-with-jonathan-lindo-ceo-replay-solutions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interview with Andy Chou, Coverity Chief Architect</title>
		<link>http://howsoftwareisbuilt.com/2010/05/03/interview-with-andy-chou-coverity-chief-architect/</link>
		<comments>http://howsoftwareisbuilt.com/2010/05/03/interview-with-andy-chou-coverity-chief-architect/#comments</comments>
		<pubDate>Mon, 03 May 2010 20:18:47 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=313</guid>
		<description><![CDATA[In this interview, we chat with Andy Chou, the chief architect at Coverity, a company that produces proprietary software analysis products.&#160; In addition, Coverity has been scanning over 300 open source projects, resulting in more than 11,000 defects that have been fixed by the open source community. Defining the role of static analysis in the [...]]]></description>
			<content:encoded><![CDATA[<p>
    In this interview, we chat with Andy Chou, the chief architect at Coverity, a<br />
    company that produces proprietary software analysis products.&nbsp; In addition,<br />
    Coverity has been scanning over 300 open source projects, resulting in more than<br />
    11,000 defects that have been fixed by the open source community.
</p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/03/interview-with-andy-chou-coverity-chief-architect/#define">Defining the role of static analysis in the software industry</a></p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/03/interview-with-andy-chou-coverity-chief-architect/#target">Targeting static analysis where it can have the most benefit</a></p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/03/interview-with-andy-chou-coverity-chief-architect/#interact">Interacting with the open source community</a></p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/03/interview-with-andy-chou-coverity-chief-architect/#commercial">The role of automated defect analysis in commercial software development</a></p>
<p><a href="http://howsoftwareisbuilt.com/2010/05/03/interview-with-andy-chou-coverity-chief-architect/#future">Areas for future study and development</a></p>
<p><span id="more-313"></span></p>
<p><b>Scott Swigart:</b> Why don&#8217;t you introduce yourself and your role at Coverity?</p>
<p><a name="define"></a></p>
<p><b>Andy Chou:</b> I&#8217;m chief architect at Coverity. I really have many different hats, but basically, I&#8217;m responsible for the technology direction of our products. Coverity was founded seven years ago, and we build products that help companies improve their software integrity. Our flagship product performs static analysis to find defects in source code automatically.</p>
<p>This is an interesting space, because we do software development ourselves, but we also analyze other people&#8217;s software. We get to see a wide variety of processes, tools, and products.</p>
<p>We started out as a research project at Stanford University. The potential of static analysis technology was clear after we spent one weekend in 2000 and discovered hundreds of defects in the Linux kernel. </p>
<p>In 2003, we commercialized the technology and bootstrapped the company for four years without much external funding. We have since received our first round of VC funding, and we have been expanding internationally and growing our product line.</p>
<p><b>Scott:</b> Various open source projects use static analysis, and I know that Coverity has scanned something like 280 open source projects and provided those results back. Some projects more than others have worked to root out the issues found. </p>
<p>Before we go there, though, talk a little more about integrity scanning and higher-level issues related to software quality and design.</p>
<p><b>Andy:</b> One of the things we discovered as we started to expand is that companies are interested in more than just the code in terms of software integrity. They&#8217;re interested in how the code is designed. They’re interested in how it’s assembled by the build process. They&#8217;re interested in how it’s tested and whether there&#8217;s anything that can be found by automatic tools during QA.</p>
<p>We have different modules that together form the Coverity Integrity Center. It&#8217;s our combined offering that addresses different parts of the development life cycle. </p>
<p>There are a lot of tools in the Application Lifecycle Management space that focus on streamlining and automating processes in software development, like tracking bug tickets that come in or doing source control management. </p>
<p>We consider ourselves to be in an adjacent space because we analyze the artifacts of software development and really dig into how they are put together and what they are doing. For example, a source control system treats source code as nothing more than a pile of text. We treat source code at a more semantic level, by parsing it and analyzing its behavior on specific paths. </p>
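<p>As a rough illustration of the difference between treating code as text and parsing it, the sketch below uses Python&#8217;s standard <code>ast</code> module to find <code>except:</code> handlers that name no exception type. It is a toy example written for this transcript, not Coverity&#8217;s analysis; the sample source and the function name <code>bare_except_lines</code> are assumptions made for illustration.</p>

```python
import ast

# Hypothetical sample: the text "except:" appears twice,
# but only one occurrence is a real exception handler.
SAMPLE = '''
def load(path):
    note = "a bare except: hides errors"
    try:
        return open(path).read()
    except:
        return None
'''

def bare_except_lines(source):
    """Return line numbers of except handlers that name no exception type."""
    tree = ast.parse(source)
    return [node.lineno for node in ast.walk(tree)
            if isinstance(node, ast.ExceptHandler) and node.type is None]

print(SAMPLE.count("except:"))    # text search sees 2 matches
print(bare_except_lines(SAMPLE))  # the parser reports only the real handler
```

<p>A plain text search counts the string literal as a match; the parsed tree does not, because it knows which tokens are handlers and which are data.</p>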
<p><b>Scott:</b> When you&#8217;re talking about analyzing code, you have to consider the broad range of programming languages out there: C#, Java, PHP, and all kinds of other stuff. When you&#8217;re getting more into the design and build process, even though there are artifacts, there are also build instruction files, for lack of a better term, and things like that to consider.</p>
<p>How did you choose where to invest? Back in the day, organizations used a fairly discrete set of technologies, whether it was .NET, Java, or C++. As time went on, we got Ruby on Rails, different kinds of cloud platforms and services, development for handhelds, and whole ecosystems growing up around those.</p>
<p>I assume that to support each one of those requires a separate investment from you, so how do you decide where you&#8217;re going to focus your resources? It seems like it would take a crystal ball to determine what&#8217;s going to be around for a long time.</p>
<p><a name="target"></a></p>
<p><b>Andy:</b> We focus primarily on areas where software integrity is essential, such as safety-critical or business-critical systems. Often, this means embedded software: software that’s going into a device, into a plane or car, or into a router, where it can be very, very expensive to fix a problem (think of a recall, or flying an engineer on-site to debug a problem).</p>
<p>The amount of software that&#8217;s in these devices is growing tremendously. I think it&#8217;s not really widely understood that software systems have become truly enormous over the past decade or so. For our customer base, a million lines of code is not considered very large.</p>
<p>If you printed out a million lines of code, it would be a six foot high stack of paper. It&#8217;s far too much for a single person to truly comprehend all the details. It’s hard to imagine any human product on that scale that has no errors in it. Some of the larger individual code bases we work with run into the tens of millions of lines of code. </p>
<p>And by the way, that&#8217;s just one variant of that code base. We all know that in real software development, there are many branches of development that share a great deal of the code. We also have products put together by sharing different pieces of software across teams, and the complexity grows tremendously.</p>
<p>It&#8217;s always a challenge to understand where to invest in order to maximize your impact, but we believe that listening to our customers is of paramount importance. They tell us which types of their software are most critical, and the most critical stuff is where automated scanning is most valuable.</p>
<p><b>Scott:</b> You mentioned embedded devices, which makes a lot of sense, since they take a long time to develop, and they are very difficult to patch. You can push a patch to some types of embedded devices, but there are others that don&#8217;t have a network connection, and it basically takes a product recall to make changes.</p>
<p>Cars are a great example that I think brings home to the general public the fact that software is everywhere.</p>
<p><b>Andy:</b> The average person probably thinks of software as something they interact with when they turn on their PC, but the reality is that most of the software we all interact with on a day-to-day basis is not on a desktop computer. </p>
<p>Some of it is in your car, your mobile phone has tons of software on it, and you use software almost every time you purchase something. Furthermore, when you get a piece of information off the Internet, you use many layers of software. It&#8217;s not just on your computer and the website you&#8217;re going to, but also on all the infrastructure in between – the routers, switches, wireless access points, firewalls, and database servers. There’s an incredible amount of software that supports mundane everyday activities, and that&#8217;s not really well understood.</p>
<p>At Coverity, we think in terms of the software supply chain. When companies put together applications or devices, they&#8217;re no longer developing all of it themselves. So much software development is about pulling together components from different suppliers (including open source), integrating them, and customizing them.</p>
<p><a name="interact"></a></p>
<p><b>Scott:</b> Let&#8217;s switch over and talk a little bit about the work you do with open source projects.</p>
<p><b>Andy:</b> We worked with the Department of Homeland Security (DHS) putting together a scanning service for open source projects called Coverity Scan. That effort began in 2006, and today (early 2010) we scan over 300 different open source packages and have found over 11,000 defects that have been fixed by the open source community. </p>
<p>The DHS was looking for a way to have a big impact on software integrity across a spectrum of different kinds of open source projects. They saw that static analysis technology could apply to millions of lines of code and provide a quick way to harden open source infrastructure. Open source is used by virtually every company and government agency. </p>
<p>The economics of using shared code is compelling. But the downside is that defects in the shared infrastructure propagate everywhere. I think the DHS was concerned about this growing ubiquity, and what could be done to accelerate fixing more defects earlier. </p>
<p>Scan has been really successful in terms of raising the awareness of what static analysis can do. </p>
<p><b>Scott:</b> You bring up a good point about evangelizing the benefits of static analysis. I think for a long time in the open source community, there was the notion that once code has been posted to a mailing list and looked at by a lot of people, that scrutiny will find all of the bugs.</p>
<p>When Coverity started scanning projects and turning these reports over to the maintainers, some projects probably really appreciated it, while others may have been more inclined to dismiss the results as a bunch of false positives. Some people would hesitate to believe that a program could really understand their code. </p>
<p>How have you seen the thinking about the benefits of static analysis evolve among open source projects over the past three years or so?</p>
<p><b>Andy:</b> There has been a greater embrace of static analysis, both within the open source community and among commercial software makers. Part of the reason is that static analyzers have gotten better. They&#8217;re starting to think about software more like a person would and less like a machine enforcing rules blindly. </p>
<p>Part of why we think static analysis is being adopted more is that it has begun to provide enough information to understand the problems that could arise if you don&#8217;t fix a particular bug. That makes it more valuable and clearer that there is a real issue.</p>
<p>Some open source communities have really taken the results and run with them, fixing virtually everything that&#8217;s found by our tool, whereas others are a lot slower to adopt it. We&#8217;ve found a high correlation between projects that do static analysis and projects that do other things, like extensive unit testing, that also help with quality.</p>
<p><b>Scott:</b> In other words, certain projects broadly adopt tools and techniques to raise quality as part of their culture. Those measures might include static analysis, unit testing, and maybe other things.</p>
<p><b>Andy:</b> And they see static analysis as a useful technique that can address some of the gaps left by other techniques, in terms of covering all of the code without necessarily having test cases for everything, and catching things that a developer might overlook.</p>
<p>Static analysis is good at looking at all of the different paths in the software, which people are not so good at.</p>
<p>Still, it&#8217;s been very difficult to get companies and open source adopters to really incorporate static analysis deeply into their daily workflows. We have found that, if you incorporate testing for defects into a nightly build process, for example, it really helps. </p>
<p>In that scenario, the developers are talking about defects that are fresh and about code that they just wrote, which really brings up the relevance level. Even for commercial software development, that matters quite a lot. </p>
<p><a name="commercial"></a></p>
<p>For a lot of companies that use our software, although not all, they see the initial set of defects that come out of the tool and they say, &#8220;Well, it&#8217;s too much work to fix all of those right away, so we&#8217;ll baseline that and fix everything that comes in new after today.&#8221;</p>
<p><b>Scott:</b> That makes sense. Like you said, if they&#8217;re running it over a million or ten million lines of code, and thousands and thousands of defects come out, they might say, &#8220;Well, the software works today, and we&#8217;re not getting a lot of complaints about the quality, but we want to continue to raise the bar.&#8221; </p>
<p>It also seems like a pretty good educational tool. You check your code in before you go home at night, and in the morning, there are some defects waiting for you. That could very quickly help you change your own development practices to avoid defects.</p>
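<p>The baselining approach described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular tool stores results: the defect records, the checker names, and the choice to identify a defect by checker, file, and function are all hypothetical.</p>

```python
def fingerprint(defect):
    """Identify a defect by checker, file, and function rather than line number,
    so edits that merely shift lines do not make an old defect look new."""
    return (defect["checker"], defect["file"], defect["function"])

def new_since_baseline(baseline, nightly):
    """Return the defects in the nightly scan that were not in the baseline."""
    known = {fingerprint(d) for d in baseline}
    return [d for d in nightly if fingerprint(d) not in known]

# Hypothetical scan results: the first defect moved from line 120 to 131.
baseline = [
    {"checker": "NULL_DEREF", "file": "net.c", "function": "recv_packet", "line": 120},
    {"checker": "LEAK", "file": "cache.c", "function": "cache_put", "line": 48},
]
nightly = [
    {"checker": "NULL_DEREF", "file": "net.c", "function": "recv_packet", "line": 131},
    {"checker": "LEAK", "file": "cache.c", "function": "cache_put", "line": 48},
    {"checker": "LEAK", "file": "parse.c", "function": "parse_header", "line": 77},
]

for d in new_since_baseline(baseline, nightly):
    print(d["checker"], d["file"], d["function"])
```

<p>Only the leak in <code>parse.c</code> is reported, which is the defect a developer would find waiting the morning after checking that code in.</p>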
<p><b>Andy:</b> We recently published a paper “A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World” in the Communications of the ACM that described some of the lessons we&#8217;ve learned over the past few years in terms of making static analysis work in a commercial setting. </p>
<p>One of the continuing lessons we&#8217;ve learned is that incorporating a new technology like static analysis into real, commercial software development really requires you to understand how people work and some of the complexities of their environment.</p>
<p>One example is that in our forthcoming Coverity 5 version, there&#8217;s going to be support for multiple code branches and identifying defects in shared code, which I think almost every software development shop has. They have a release version, a development version, perhaps a branch where they&#8217;re doing advanced work, and perhaps branches for customers that have special sets of requirements.</p>
<p>That is a very common scenario, but all of those different branches are variants of the same code base. It&#8217;s the same when you have different targets. You may have a Windows version, a Linux version, and versions for other target platforms.</p>
<p>Even in embedded applications, each device may have a different CPU or set of devices and, therefore, a different layer on the bottom to interface with the low level components. Some phones have cameras and some have Bluetooth, but not all of them. The variants between these different products and branches are part of the real life situation in software development.</p>
<p>Supporting that well requires more than scanning a code base and finding a bunch of bugs. It means digging into where else the same bug may arise. If I understand your branching structure and the ways in which you put together all of the software components into different products, I can tell you which products might be impacted by this bug.</p>
<p>That can be really useful. If you have a bug and you fix it in one branch, you may not realize that it&#8217;s also present in others. If you don&#8217;t also find it there and tell the developer about it when it&#8217;s timely, you may miss out on an opportunity to save quite a lot of time.</p>
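<p>One simple way to picture this, assuming a defect is tied to the exact contents of a file, is to compare content hashes across branches. The sketch below is a deliberately simplified illustration, not Coverity&#8217;s implementation; the branch names, the file contents, and the function <code>branches_sharing_defect</code> are hypothetical.</p>

```python
import hashlib

def digest(text):
    """Hash file contents so identical copies compare equal across branches."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def branches_sharing_defect(branches, found_in, path):
    """Return other branches whose copy of `path` matches the defective copy."""
    target = digest(branches[found_in][path])
    return sorted(name for name, files in branches.items()
                  if name != found_in
                  and path in files
                  and digest(files[path]) == target)

# Hypothetical branches: two share the defective file, one has diverged.
branches = {
    "release-1.0": {"buf.c": "copy(dst, src, n + 1);"},
    "development": {"buf.c": "copy(dst, src, n + 1);"},
    "customer-x": {"buf.c": "copy(dst, src, n);"},
}

print(branches_sharing_defect(branches, "development", "buf.c"))
```

<p>A defect found on the development branch is flagged for <code>release-1.0</code> as well, while <code>customer-x</code>, whose copy differs, is left out. Matching whole files is the crudest possible criterion; a real tool would need something finer-grained to follow a defect through code that has partly diverged.</p>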
<p><b>Scott:</b> You mentioned that you&#8217;ve got an upcoming release, Coverity 5. Give us a little bit of a preview, in terms of what is coming in that version, and even some of the interesting problems that you&#8217;re thinking about beyond that.</p>
<p><b>Andy:</b> One of the features we’re excited about in Coverity 5 is defect impact mapping. It gives developers a quick way to see what the most likely high-impact defects are. Instead of thousands of defects in a flat list, you can focus in on those that are most likely to be critical, such as those that involve things like memory corruption and memory leaks, which can be very expensive to find and to fix. </p>
<p>When you combine insight into which defects are critical with which defects occur in multiple code branches or streams, it&#8217;s a huge step forward in understanding the impact of defects on the business.</p>
<p>A third feature is that we&#8217;ve mapped all the results that we find to the Common Weakness Enumeration, the CWE, which is an industry standard way of talking about software vulnerabilities. It&#8217;s something that our customers have been asking for, because it allows them to understand defects not only in terms of how Coverity classifies them, but also in terms of how the industry is talking about them.</p>
<p><a name="future"></a></p>
<p><b>Scott:</b> You&#8217;ve talked about a lot of ways of finding and analyzing defects. What are some of the hard problems that are left? I think that, in terms of compute problems, this is probably right up there with speech recognition and automatic language translation, in terms of being a really hard computer science problem.</p>
<p><b>Andy:</b> In our space, every time you turn around you face an NP-complete or undecidable problem. What we&#8217;ve found is that you really need to be careful about the trade-offs made when finding defects. You simply can’t find all defects with a low false positive rate while scaling to millions of lines of code and remaining almost completely automatic.</p>
<p>The balance between these many factors is something that you have to strike very, very carefully, because a tool needs to work for very large code bases as well as very small code bases, and on code bases written using different styles. </p>
<p>One of the challenges we continually face is how to make software analysis something that really customizes itself for the particular code base that&#8217;s being scanned. You should treat code bases like patients, in the sense that you don&#8217;t treat each patient exactly the same way. </p>
<p>If you look at their history, the processes used to develop them, and the idioms used in the code base, you may come up with a different set of diagnoses about what&#8217;s wrong. That&#8217;s a relatively new field. We&#8217;ve done some of the work in our products, but I think there&#8217;s a lot of room to grow in that dimension. </p>
<p>If you&#8217;re talking about a web application, for example, security may be of paramount importance, but if you&#8217;re talking about an embedded device, you may have a different set of concerns. The blind scanners that just look for a list of problems don&#8217;t necessarily have that context, so they may look for things that are a lot less relevant, just because they don&#8217;t understand what you&#8217;re trying to do with this application, how you developed the software, and what idioms you&#8217;re using.</p>
<p>That&#8217;s one area for future work. From a top-down point of view, I am interested in pushing software analysis more deeply into the workflow and incorporating information from different sources. </p>
<p>Correlating the information between different systems that track bugs and track changes in source code is another interesting area for the future. There&#8217;s tremendous value in converging information from many different sources and using that to better determine whether software is ready to release, to gauge its quality, and to determine which teams are doing a good job.</p>
<p><b>Scott:</b> This is definitely an interesting space, and it&#8217;s good to see it becoming more mainstream. We&#8217;ve covered most of the stuff I had on my agenda, so do you have any closing thoughts?</p>
<p><b>Andy:</b> Despite the growth in the past few years, this space is still kind of nascent. I think the market is large for tools that are starting to integrate information from different software components together. </p>
<p>One of the things that we&#8217;re looking at is how to increase the quality of the supply chain, in addition to just individual software projects. That&#8217;s an area that we think will require some automated analysis. It&#8217;s too expensive to manually look at code bases that are being produced by third parties or by suppliers.</p>
<p>There&#8217;s no common regulation or standard by which software quality is measured today, so going beyond just finding defects, how do you determine what is high quality software? I think there&#8217;s an open question out there, because today, there is no real metric for quality.</p>
<p>There&#8217;s a lot of work to be done in that area, and that&#8217;s going to make for an interesting future.</p>
<p><b>Scott:</b> Great. Thanks for taking time to chat. This has been a great interview.</p>
<p><b>Andy:</b> Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/05/03/interview-with-andy-chou-coverity-chief-architect/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interview with Brian Gentile &#8211; CEO of Jaspersoft</title>
		<link>http://howsoftwareisbuilt.com/2010/04/21/interview-with-brian-gentile-ceo-of-jaspersoft/</link>
		<comments>http://howsoftwareisbuilt.com/2010/04/21/interview-with-brian-gentile-ceo-of-jaspersoft/#comments</comments>
		<pubDate>Wed, 21 Apr 2010 18:46:23 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=310</guid>
		<description><![CDATA[Open source has evolved beyond simple infrastructure and is steadily working its way up the stack.&#160; A prime example is Jaspersoft, which supports the world&#39;s most popular open source business intelligence software.&#160; In this interview we chat with Brian Gentile, Jaspersoft CEO. Breadth of open source business intelligence offerings from Jaspersoft The rise of information [...]]]></description>
			<content:encoded><![CDATA[<p>
    Open source has evolved beyond simple infrastructure and is steadily working<br />
    its way up the stack.&nbsp; A prime example is Jaspersoft, which supports the<br />
    world&#39;s most popular open source business intelligence software.&nbsp; In this<br />
    interview we chat with Brian Gentile, Jaspersoft CEO.</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2010/04/21/interview-with-brian-gentile-ceo-of-jaspersoft/#breadth">Breadth of open source business intelligence offerings from Jaspersoft</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/04/21/interview-with-brian-gentile-ceo-of-jaspersoft/#rise">The rise of information as the key asset owned by many businesses</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/04/21/interview-with-brian-gentile-ceo-of-jaspersoft/#role">The role of the open source model in Jaspersoft&#8217;s success</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/04/21/interview-with-brian-gentile-ceo-of-jaspersoft/#transition">Transition in business intelligence toward smaller companies and less expensive solutions</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/04/21/interview-with-brian-gentile-ceo-of-jaspersoft/#cloud">The role of cloud computing in the future of business and government</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/04/21/interview-with-brian-gentile-ceo-of-jaspersoft/#layer">How the business intelligence layer fits into the larger cloud computing model</a></li>
</ul>
<p><span id="more-310"></span></p>
<p><b>Scott Swigart:</b> Please introduce yourself and Jaspersoft, as well as the company&#8217;s relationship with open source.</p>
<p><a name="breadth"></a></p>
<p><b>Brian Gentile:</b> I&#8217;m President and Chief Executive Officer of Jaspersoft, which, we’re happy to say, has emerged as the world&#8217;s most widely used business intelligence software. We&#8217;ve very successfully used the open source model of development and distribution to reach this point. It allows our software to be, in a sense, downloaded and put into use by virtually anyone who has the acumen or even interest to solve some data problem using a BI tool for reporting and analysis. </p>
<p>Today we sponsor and promote five major business intelligence projects that are also signature products. JasperReports is the world&#8217;s most widely used reporting engine and library, and iReport is a companion report design and development product that sits on top of JasperReports.</p>
<p>JasperServer is a repository-based report and analysis server that provides scheduling, authentication, and all sorts of server-based business intelligence functions. JasperAnalysis is a full relational OLAP environment that works side-by-side with JasperServer and provides multi-dimensional capabilities through the ROLAP engine. </p>
<p>Lastly, we sponsor the JasperETL Project, which allows for very sophisticated data integration, enabling extract, transform, and load functions in creating a data warehouse.</p>
<p>The combination of all five projects we refer to as the Jaspersoft Business Intelligence Suite, and we take it to market first in community releases, which means freely downloadable, mostly GPL-licensed releases, and second, in the form of commercial editions, which we refer to as the Professional Edition and the Enterprise Edition. </p>
<p>These two commercial editions have features and capabilities that go above and beyond the community-oriented project releases. So for our commercial customers, there is a feature-based reason to potentially subscribe to and pay for the professional or enterprise editions of our suite. </p>
<p><b>Scott:</b> In its early days, a lot of open source was infrastructure-oriented, and in many cases, it was written for developers, by developers, such as the Linux kernel and the Apache web server. Now, we&#8217;re seeing success with companies like yours that offer software that operates at a higher level, along with open source ERP systems, CRM systems, content-management systems, and those kinds of things.</p>
<p>Talk a little bit about the trajectory of open source evolution. Why is now a good time in history for these higher-level products to emerge in open source, where it used to be more infrastructure?</p>
<p><b>Brian:</b> Let me use some numbers that provide clear testimony that the world and its development model have surely changed. With our business intelligence platform, we are now approaching volumes that are close to those of the infrastructure open source projects that preceded us. For instance, we surpassed the 10 million download mark in January 2010, and we&#8217;re closing in on 10.5 million downloads today. </p>
<p>Right now, we have a new community member registering at the Jasper website about every five minutes. There are more than 125,000 registered members, and we have more than 70,000 forum posts. Forum posts are where our community interacts, so they&#8217;re helping one another every time they post something to solve one another&#8217;s problems, answer each other&#8217;s questions, and so on. </p>
<p>We now have what is known clearly as the largest ecosystem in all of BI, because of the partners, contractors, community members, and developers that are participating. It really is fundamentally changing the way software is built.</p>
<p>Just as the Spring Framework, Hibernate, and a variety of other development tools have completely transformed and enabled the LAMP stack and the new world of web development, Jaspersoft now is the BI layer in that world. </p>
<p>The volumes and the kind of development that is occurring, the kind of people that are developing and creating applications for which BI now becomes a web-based layer, has expanded dramatically over the last few years. And it has expanded because of the existence of all these open source products and different layers.</p>
<p>Therefore, it makes all the sense in the world that open source software has marched up the stack. Jaspersoft dwells at a level in between middleware and applications. We&#8217;re not quite an application, but we&#8217;re certainly more than middleware. Our providing the business intelligence layer within a web application frees the developer to focus on his core competencies at the application level. And, assembling applications rapidly based on the new web model and using open source tools whenever they&#8217;re available is clearly the new design point.</p>
<p><a name="rise"></a></p>
<p><b>Scott:</b> Historically, the need for content management systems, CRM systems, and ERP systems goes back a long way, so it seems obvious in hindsight that open source alternatives have become available for them. There&#8217;s a relatively longstanding need that is relatively well understood. </p>
<p>CMS and CRM are slightly newer, but even those go back a decade in terms of proprietary solutions. Business intelligence, on the other hand, seems to be driven by a different force. It seems like we&#8217;re more addicted to data than we&#8217;ve ever been.</p>
<p>More and more, the standard seems to be just to save everything. Save every search query that anyone enters, and save every detail about every interaction with the customer. Just as an oilfield or a super collider produces terabytes of data in a really short time, now even mundane business tasks produce terabytes of data, too.</p>
<p>This addiction to data has driven the need to have really good business intelligence, because it&#8217;s just too much to look at and make sense out of any other way.</p>
<p><b>Brian:</b> For the last 20 or so years, information and its use have constituted a large percentage of a company&#8217;s relative value. 100 years ago, perhaps 90 percent of the company&#8217;s value was based on its ability to manipulate land, labor, and capital&#8211;the fundamental factors of production. </p>
<p>As we moved into the late last century, information technology began to drive an important part of the economy as we became so much more capable of using it effectively. Today, the management of information makes up the vast majority of the value of many companies. </p>
<p>For example, one could argue that nearly 100 percent of the value of a financial services firm is in the information that they command and maintain. Vast amounts of data exist because that data is increasingly valuable, even for an old guard steel mill or manufacturing company.</p>
<p>That trend makes it very important to efficiently manage and manipulate that information. Value is created from insights gleaned from that information, by increasingly large numbers of people. The data needs to be distributed, decentralized, and democratized so it can be used in the largest possible variety of ways, as quickly as possible.</p>
<p>That is a fundamental reason why open source and lighter weight, faster tools for business intelligence have become so popular, and their popularity will go nowhere but up over the course of the next decade.</p>
<p><b>Scott:</b> Concerning the democratization of data, not that long ago, companies had canned reports that they printed on a big line printer on a weekly, monthly, or quarterly basis. Someone had determined ahead of time what data and reports were useful, in a fairly inflexible way, and they went into a binder on a shelf.</p>
<p>Today, there&#8217;s the understanding that the company can&#8217;t fully predict how that data might be used. Individuals may have very specific problems that they need to solve, and to do so, they may use data in a unique way that is not apparent or useful to anyone else.</p>
<p><b>Brian:</b> That phenomenon is exactly why Gartner estimates that about 40 percent of the budgets for business intelligence are now pushed out to the business units, either consciously or unconsciously. </p>
<p>Those who really understand the data and need to make use of it for better business decisions and faster insights are not in IT, and they&#8217;re not at headquarters. They need to be able to make decisions about which tools they&#8217;re going to use, how they&#8217;re going to put them in place, and which data they need to access and manipulate.</p>
<p>We&#8217;re seeing a fundamental shift toward working with departments and enterprises who are now creating software and using the tools inside of that software at a faster pace than we&#8217;ve ever seen before. </p>
<p>It&#8217;s all being done using web techniques and thin-client tools, and they&#8217;re doing it in days and weeks rather than months and quarters, for tens of thousands of dollars rather than millions of dollars. That trend is the next stage of the democratization of data, and it is largely enabled by open source.</p>
<p><a name="role"></a></p>
<p><b>Scott:</b> Open source has clearly been successful for Jaspersoft, and as you said, you are getting millions of downloads. What is it about the open source nature of the foundation that has made your approach so successful?</p>
<p><b>Brian:</b> Based on surveys that I&#8217;ve participated in and read, first, the innovation and the capabilities available in open source software are simply not found in their proprietary alternatives, if those proprietary alternatives even exist.</p>
<p>Second, open source affords companies flexibility. Some enjoy the fact that they have the source code, some enjoy the fact that they have access to lots of other people who are using it and can share ideas and improve it, and all of them like the idea that they have an alternative to a proprietary player, again, if one exists in the category.</p>
<p>Many of the senior people in these organizations that have been successful in using open source would go on to say that it prevents them from feeling locked in to a proprietary vendor who can constantly increase prices or put pressure on them to do things that they might not otherwise want to do. If there is any lock-in in open source, it is surely at a much lower cost and a much lower threshold level. </p>
<p>Today, because of those factors, we&#8217;re seeing what I&#8217;ve referred to as the mainstreaming of or the mainstream use of open source. As a result, we are seeing real depth in the use of open source, even among those who are not on the bleeding edge of technology.</p>
<p>One of the best examples is the public sector&#8217;s adoption of open source. Governments around the world, by definition and by charter, have never really been on the edge or in the vanguard in their use of information technologies. And now public sector entities are adopting and using open source in big numbers.</p>
<p>The clear advantages occur in terms of innovation, flexibility, and avoiding lock-in. Notice that I didn&#8217;t even talk about low costs, which would probably come fourth or fifth in importance for those enterprises that have been most successful with open source.</p>
<p><b>Scott:</b> My experience with government customers has been that one of the reasons they love open source is that their procurement process is so onerous. The ability to use a free, open source tool that doesn&#8217;t require approval, a competitive bidding process, RFPs, and all that kind of stuff is very attractive to them. They can have an idea this morning, download some stuff this afternoon, and have a prototype mocked up by this evening.</p>
<p>Obviously, though, you guys have got to make money, and obviously the community of users is going to be larger than the community of paying users. But talk about how a government user translates from free to fee, in a way that kind of keeps your business going and lets you keep contributing back to the underlying projects.</p>
<p><b>Brian:</b> Their use of the free, open source tools is no different than in private enterprise, in the sense that they can, just as you said, download something and get started on a prototype or a proof of concept on their own time. There&#8217;s nothing about the licensing techniques that would restrict a public sector employee any differently than a private sector one. </p>
<p>The good news is that the public sector&#8217;s big enough that there are plenty of employees willing to do what they need to do to get their jobs done better, and that&#8217;s a good thing for taxpayers all around.</p>
<p>But then your second point is the procurement process, which I cannot deny is certainly more laborious than in a private enterprise, although I can point to some of our most difficult contract discussions and lengthy procurement processes actually coming from the private world and not the public sector. There are many private-sector companies that would give any government a run for its money in terms of ridiculous procurement processes.</p>
<p>Your point is accurate, though; it does take longer on average in the public sector. As an open source company, we have to have the ability and patience to work through those issues so we can get on purchasing schedules and so forth. It&#8217;s all about making the purchase process easier for them. It takes time, but it&#8217;s absolutely worthwhile.</p>
<p>And, that&#8217;s the right thing to do for public sector spending in general. I&#8217;ve written plenty about the need for governments to embrace and use open source wherever it exists. Using our taxpayer dollars more efficiently than they would be otherwise is definitely worth the effort, even though it can be difficult.</p>
<p><b>Scott:</b> Different companies that have an open source foundation unpack the difference between the community version and the paid version differently. Do your customers find value mostly in features they want that are only available in the enterprise version, or is it more the support peace of mind? </p>
<p><b>Brian:</b> There are three or four major reasons that a customer would choose to move to a commercial, paid version instead of the community version, and I&#8217;ll list them in no particular order, because it&#8217;s going to vary by customer.</p>
<p>The first reason is that there are features and capabilities in the commercial versions that are not in the community version, and those features and capabilities may make the paid product better aligned with what they need to do. </p>
<p>Our community versions are clearly capable products, so what we&#8217;ve chosen to put in the professional or commercial versions is very significant. Any commercial company, I think, would look at our commercial products and say, &#8220;Yeah, that&#8217;s absolutely worth paying for.&#8221;</p>
<p>The second reason is the license. Some customers are not comfortable with the license that governs the open source code. Usually, it ranges from a relatively permissive license, like the LGPL or Apache or something like that, to a relatively restrictive one, like the GPL.</p>
<p>Each of those licenses comes with terms that make it less interesting for some companies to put the software into use. They would rather have standard, commercial terms around the software and its use. By going to a commercial version, customers have a commercial license which includes the indemnification and protection they want to go along with it.</p>
<p>The third reason is professional support. The forums and the support that you get from working with the Jaspersoft community are pretty robust, but if your use of Jaspersoft is relatively mission critical, you probably want the comfort of knowing that you have a support team that&#8217;s very intimate with the code and available to you 24/7.</p>
<p><a name="transition"></a></p>
<p><b>Scott:</b> Another area that&#8217;s interesting to me is the notion that business intelligence used to be used by the largest companies, and it was sold to them by very large software companies. These were very expensive solutions and implementations. </p>
<p>Talk a little bit about the need for BI in the small and medium size business sector. What kinds of examples have you seen where BI has become increasingly critical to relatively small companies?</p>
<p><b>Brian:</b> This is one of the most important and fundamental shifts that has occurred in the last 10 years: the size of the company now has almost nothing to do with the size of the data problems. We have many small to mid-size companies that need to analyze enormous volumes of data, in very sophisticated ways and in almost real time. </p>
<p>One of my favorite examples in our portfolio of customers is a small company called CrowdStar, which creates Facebook games.</p>
<p>They need to analyze, in almost real-time, 20 million records of data at least every day to understand patterns of movement and player activity within their games. They can modify those games during the course of a day to enable more activity, which for them equates to more use and more payments.</p>
<p>They analyze all that data using Jaspersoft in the cloud, which they pay an incredibly small amount of money for. CrowdStar is not a very big company, but they use big volumes of data, and there is almost no way they could afford to do this without Jaspersoft in the cloud. </p>
<p>I think that&#8217;s a quintessential example. There are others that are less dramatic, perhaps, but they are equally important in the magnitude of data that needs to be analyzed and the size of the company that is behind it. </p>
<p>This trend is causing all kinds of technological shifts, and it has triggered what I refer to as &#8220;therefores&#8221; in the world of business and technology. By “therefores”, I simply mean that they are a second order effect, caused by the sea of data so commonly barraging any size enterprise. Examples of these &#8220;therefores&#8221; are increased interest in columnar orientation analytic data stores, proactive analysis, and predictive-like analysis. There&#8217;s much more interest in elegant design of responsive mechanisms for data analysis, such as dashboarding methods of quickly drilling into data, even through multiple dimensions.</p>
<p>It&#8217;s created a need for fast, successful small projects. It has pushed, as we described earlier, more of the emphasis out to the business unit levels, so they can have these small projects around where the data is best known and understood, and they can accomplish things more quickly.</p>
<p>So this host of business and technology &#8220;therefores&#8221; has come from the enormous amount of data now that is being analyzed by almost everyone.</p>
<p><a name="cloud"></a></p>
<p><b>Scott:</b> There are so many different trends converging that let an entrepreneur go from an idea to a product in less time, with less capital expense than ever before. It used to be that, if you had an idea for software, step one was finding an office. Step two was buying servers. Step three was hiring a bunch of people. </p>
<p>That all burned through a lot of cash trying to get a product built, before you even had something that your first customer could look at. Today, you don&#8217;t have to buy a server. Even before the cloud, hosting providers offered dedicated servers, and then they started offering virtual servers.</p>
<p>Then clouds came online like Amazon EC2, and you started to see the platform as a service plays like Google App Engine and Azure, and you see a lot of the traditional hosting providers also becoming cloud providers to some degree, building on top of a virtualization infrastructure.</p>
<p>You can put a $2,000, or $5,000, or $10,000 project out on Elance or something like that, or build it yourself if you&#8217;ve got programming capability. You can be a company of five people, 10 people, 20 people who relatively quickly layer on top of another infrastructure like Facebook or something that has millions of users. Your user base can grow at a fairly steep exponential curve, having millions of users and enormous amounts of data.</p>
<p>There is also a huge amount of open source data, or data that you can subscribe to. There is a website built on Google App Engine called Walk Score, where you put in an address and it tells you how walkable a neighborhood is. It taps into all kinds of different streams of data that they don&#8217;t own. </p>
<p>It&#8217;s not their data, and they didn&#8217;t have to collect it, but it&#8217;s syndicated and accessible data. Whether that&#8217;s baseball stats, map and geo-location data, or the data that you might need to use business intelligence on, it increasingly might not even be your data. You might be looking for correlations or you might be relating data in novel ways that nobody else has bothered to put together.</p>
<p>What do you see as the future of these trends?</p>
<p><b>Brian:</b> Some of the changes in the last few years have been profound, and the great amplifier of this phenomenon, going forward, is of course the cloud. If we fast forward five or seven years, it will look ridiculous that we built all these data centers. Everybody having their own data center will seem as absurd as everybody having their own power plant.</p>
<p>The ability to find tremendous efficiency, and deliver computing power on demand, instant on, as it&#8217;s needed, is one of the most important things happening right now. With so many transformational technologies, we tend to overestimate their impact in the near term and underestimate their impact in the long term.</p>
<p>All that is why Jaspersoft has been working in the cloud for more than a year. It&#8217;s not because there is yet a huge demand for business intelligence in the cloud. It&#8217;s because I know fundamentally that we have to be there so we can learn, adapt, and ensure that our BI platform can be delivered as a service through the cloud even more easily than it can be downloaded today.</p>
<p>My favorite example of all this is the U.S. federal government, which manages slightly more than 1,000 data centers today. If you&#8217;re a U.S. taxpayer, your money is going in a big way to help continue the running and management of more than 1,000 data centers.</p>
<p>Let&#8217;s just say, for fun, they could cut that number in half, which should be, by the way, an incredibly modest goal. It should actually come down to about a couple of dozen data centers. But let&#8217;s just say they could cut it in half by virtualizing and taking into the cloud a substantial percentage of the infrastructure platforms (hardware and software) that is common, or should be common, across every government agency. </p>
<p>Fortunately, that is a goal that the new federal government CIO and CTO are working on right now. It&#8217;s a high priority on their agenda, and I don&#8217;t think it can happen fast enough. I&#8217;m hoping that they do it in a way that is highly successful, and I&#8217;m hoping it results in a 90 percent reduction in those data centers in the course of the next decade or so.</p>
<p>You could use that same example across much of the private-sector landscape. Private clouds are going to have a huge impact. I think the public cloud has probably been overrated, but it will have an impact over the longer term. The methods of building and creating private clouds are going to be among the most important things to come along since the invention of the personal computer and the Internet.</p>
<p><b>Scott:</b> When I think of private clouds, I envision a company that still has a data center, but it&#8217;s just providing cloud-like services to the organization. A guy I talked to one time said, &#8220;A cloud is basically a data center with an API.&#8221; </p>
<p>You can run, start, stop, and clone instances, and there&#8217;s the notion of IT being more of a service provider to the organization. How does that square with the way you&#8217;re using the term &#8220;private cloud?&#8221;</p>
<p><b>Brian:</b> Back to the government example, that&#8217;s all true, but what Vivek Kundra should be creating is a single, enormous private cloud for the U.S. government that should allow the reduction of 1,000 data centers to, say, 50.  This would allow agencies to share these resources simply and capably.</p>
<p>And, this Government cloud should be private, so nobody unauthorized is able to tap into it. It should be a U.S. government agency cloud-based infrastructure that is shareable and provisionable for each agency, but it&#8217;s managed by some central authority, and enormous in scale.</p>
<p><b>Scott:</b> The idea being that, even if you look at the government doing a census and tax time for the IRS and all of these peak loads, the entire capacity of the thousand data centers is nowhere near fully utilized.</p>
<p>So you could consolidate that all down to 50 data centers and still have lots of headroom, but without the risk that if a public cloud provider gets compromised, Los Alamos would be potentially compromised.</p>
<p><b>Brian:</b> That&#8217;s exactly right. The present duplication of resources is remarkable, and cutting a lot of that out is where the cost savings would come from, more so than the virtualized hardware and software, and it would allow for the fungible provisioning of computer resources based upon different agencies&#8217; needs.</p>
<p><b>Scott:</b> Certain things are holding organizations back from going to the cloud. Part of the hesitation is that they just don&#8217;t trust the technology yet, but there are other factors as well. Some organizations are held back by the difficulty of extending their identity and security boundaries or issues around compliance requirements. </p>
<p>How does that impact Jaspersoft and business intelligence, given your stated goal of having it run as well in the cloud as it does if you download it and run it on premises? What are some of the specific things that you are tackling to make sure that you&#8217;re ready, as some of the network security trust issues get solved, and people start adopting the cloud in a more mainstream way?</p>
<p><b>Brian:</b> We&#8217;re avoiding those use cases where those preventatives are in place, to be honest. It&#8217;s one of those things where gravity finds its own level or water runs downhill. The use cases come to us. The customers that don&#8217;t have those obstacles in their way and that have problems that can be solved with BI in the cloud are the ones that are finding us now. </p>
<p>The kinds of things we&#8217;re seeing come to us now in the cloud pretty commonly occur around three or four major project types. You are right that those project types do not involve industries or processes where there are major restrictions of security or data flow, or authentication issues like HIPAA-based requirements, and so on. Those are not the use cases being solved for now.</p>
<p>Our logic is to solve the green field problems where the marriage of cloud and BI makes good sense today, and CrowdStar is a great example (one I offered earlier). Over time, the benefits of using instant-on, highly provisionable cloud-based services like BI are going to become even more efficient.</p>
<p>Even those industries and sectors that are highly regulated or highly concerned with security will find ways eventually to adapt and provide secure provisioning and whatever&#8217;s necessary to make those services available for these more difficult use cases.</p>
<p>If we were to target these security-minded use cases now, we would be foolish. Instead, we&#8217;re going to target and work with those entities, companies, and sectors that don&#8217;t have those restrictions or problems, solving mostly green field problems today. Then, over time, as the benefits emerge more clearly, we&#8217;ll surely see even the laggard use by those industries that are highly regulated.</p>
<p><a name="layer"></a></p>
<p><b>Scott:</b> As everybody works on that problem and they figure out a way to do HIPAA in the cloud, what are some of the things that Jaspersoft needs to do, or is doing and needs to continue to invest in to be a great BI platform in the cloud?</p>
<p><b>Brian:</b> Truthfully, I don&#8217;t think we need to do anything differently than we&#8217;re doing today as a great BI platform. I think the issues that will need to be solved with the cloud are going to be separate from the BI layer.</p>
<p>Some things we&#8217;re doing in general to help us to be a better participant in highly regulated urgent care industries include things like marrying BI to processes and collaboration environments. </p>
<p>Those are probably the two most requested emerging trends that we&#8217;re seeing, where we have partners that are building Jaspersoft into industry-specific BPM software, to take some time out of processing and add analytics to it, so that there are huge gains in the output and the results.</p>
<p>One of our good partners is named HandySoft. It’s created a remarkable environment using Jaspersoft and its own BPM expertise to fully automate the process flow of secure documents that need to travel inside the Nuclear Regulatory Commission for the maintenance and management of nuclear materials all around the world.</p>
<p>Of course, that&#8217;s a very secure, highly authenticated environment. We&#8217;re solving the challenges there today with HandySoft for the NRC. They&#8217;re actually providing a hosting service for the NRC, very cost effectively.</p>
<p>I think someday that will be available generically in the cloud. We have a bunch of hurdles to cross before that can happen, but there&#8217;s no change needed to the BI software, really. What&#8217;s needed will be changes to the cloud infrastructure, the perception of the cloud, and how it behaves in order to satisfy uses beyond what I described for the NRC.</p>
<p><b>Scott:</b> There are a couple of different cloud models, and I would imagine that infrastructure as a service is particularly interesting to you. I&#8217;m referring to the model where somebody essentially has a virtual machine, and they can put whatever they want on it in terms of an OS and a stack, and they can put Jaspersoft&#8217;s stuff on it. </p>
<p>On the other end of the spectrum, there are things like Google App Engine, where the developer is really insulated from even knowing that there&#8217;s a machine underneath, an OS, a file system, and that kind of stuff. An application there can call out to something running on another cloud provider, but it seems like there wouldn&#8217;t be a lot of partner opportunity for a company like yours.</p>
<p>And then in the middle, there are things like Engine Yard, where companies are making things like Ruby on Rails or these other development frameworks into platform as a service offerings that can run on an arbitrary infrastructure as a service.</p>
<p>What do you see as the opportunities associated with some of these different models?</p>
<p><b>Brian:</b> I think that many of these emerging application infrastructures will include Jaspersoft as the BI reporting and analytics layer, so that if you&#8217;re choosing from a smorgasbord of functionality in your platform as a service environment, you could very well choose Jaspersoft as the reporting and analytics layer to become part of your solution.</p>
<p>Our job is to ensure that the Jaspersoft platform is inside of all these environments, which is precisely why we&#8217;re doing what we&#8217;re doing with both the platform fundamentally and the cloud work that we&#8217;ve done.</p>
<p><b>Scott:</b> In some ways, it&#8217;s almost like an OEM relationship, where you make the case to the platform or the service providers why their customers may want Jaspersoft as a BI solution they can choose from. </p>
<p><b>Brian:</b> In fact, a little more than 60 percent of our business is in arrangements that are OEM-embedded uses of our tool, for a lot of reasons. Having that substantial base of customers who are embedding Jaspersoft makes me want not to compete with them, but to help them succeed by attracting more customers.</p>
<p>I don&#8217;t want to, for instance, host Jaspersoft and try to compete with the very customers I&#8217;m selling to; that is, our OEM customers. I would much rather allow them to do it in a way that is probably application- or domain-specific, and which adds their expertise to our BI platform – and includes our expertise, which is all about building a great BI platform.</p>
<p><b>Scott:</b> This has been a great interview. We&#8217;re running short of time, so are there any closing thoughts you&#8217;d like to add?</p>
<p><b>Brian:</b> The kind of innovation that we&#8217;ve represented in the last few years and the substantial community that we&#8217;ve built around our open source heritage is about to take some important leaps forward. </p>
<p>We have two major releases planned this calendar year, and we have significant announcements with our community that will show just how big and broad it is. I think you&#8217;ll see a variety of projects that use Jaspersoft as the open source engine across standalone and embedded implementations that will prove we really are the alternative now to the aged, proprietary vendors of the past.</p>
<p>Our ability to innovate rapidly and steadily, and our ability to work with the community is going to continue to be a substantial differentiator for us. It&#8217;s going to be a lot of fun, and a year from now, our progress is going to be shocking. There are plenty of surprises ahead over the course of the next 12 months.</p>
<p><b>Scott:</b> Thanks for taking the time to talk today.</p>
<p><b>Brian:</b> Scott, thank you so much. This has been great.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/04/21/interview-with-brian-gentile-ceo-of-jaspersoft/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Interview with Zeev Suraski &#8211; CTO Zend, Co-creator PHP</title>
		<link>http://howsoftwareisbuilt.com/2010/04/07/interview-with-zeev-suraski-cto-zend-co-creator-php/</link>
		<comments>http://howsoftwareisbuilt.com/2010/04/07/interview-with-zeev-suraski-cto-zend-co-creator-php/#comments</comments>
		<pubDate>Wed, 07 Apr 2010 20:09:28 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=303</guid>
		<description><![CDATA[Clearly, PHP is one of the most important open-source technologies on the planet.&#160; In this interview, we talk with Zeev Suraski, CTO of Zend (twitter) and co-creator of PHP about where PHP has come from, and where it&#39;s heading. The development of the Zend Framework and Zend the company On the rise in popularity and [...]]]></description>
			<content:encoded><![CDATA[<p>
    Clearly, PHP is one of the most important open-source technologies on the<br />
    planet.&nbsp; In this interview, we talk with Zeev Suraski, CTO of <a href="http://www.zend.com">Zend</a> (<a href="http://twitter.com/Zend">Twitter</a>) and<br />
    co-creator of PHP about where PHP has come from, and where it&#39;s heading.</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2010/04/07/interview-with-zeev-suraski-cto-zend-co-creator-php/#development">The development of the Zend Framework and Zend the company</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/04/07/interview-with-zeev-suraski-cto-zend-co-creator-php/#php">On the rise in popularity and the lasting success of PHP</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/04/07/interview-with-zeev-suraski-cto-zend-co-creator-php/#history">The history and future of PHP adoption and innovation</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/04/07/interview-with-zeev-suraski-cto-zend-co-creator-php/#microsoft">The evolution of the relationship between PHP and Microsoft</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/04/07/interview-with-zeev-suraski-cto-zend-co-creator-php/#cloud">The future of cloud computing as it intersects with PHP</a></li>
</ul>
<p><span id="more-303"></span></p>
<p><b>Scott Swigart:</b> First, could you please introduce yourself, and tell us a little about Zend Framework? </p>
<p><a name="development"></a></p>
<p><b>Zeev Suraski:</b> Sure. I&#8217;m co-founder and CTO for Zend Technologies, one of the principal contributors to PHP. I&#8217;m also responsible for the Zend Framework project, and I lead the R&#038;D organization at Zend. </p>
<p>PHP is obviously at the base of our technology stack. My colleague Andi Gutmans and I became involved with PHP as students back in 1997. We actually used PHP/FI v2 beta, which was written by Rasmus Lerdorf.</p>
<p>We ended up rewriting it from scratch, and that was the beginning of the PHP project in a form similar to what it is today: a community project with a lot of different contributors at many different levels. We&#8217;ve been involved ever since.</p>
<p>I spearheaded the development of PHP 4, and Andi spearheaded the development of PHP 5. And five is where we are today, with six on the horizon but not quite baked yet.</p>
<p>Zend Framework is an open source project that we started in 2005 with a mandate to provide infrastructure classes for PHP applications. Zend Framework was definitely not the first framework to exist for PHP, but in a relatively short amount of time it became one of the most popular, and it’s probably the most popular in commercial environments.</p>
<p><b>Scott:</b> What are some of the factors that have helped make Zend Framework popular?</p>
<p><b>Zeev:</b> Developers appreciate the fact that Zend Framework is designed to work well with other frameworks so they can use it for the parts of a project that they want to, and they are free to use other frameworks or custom code for other parts of that same project.</p>
<p>It’s an easy choice to make, because a developer can get started with Zend Framework and then use it more extensively as he or she sees fit.</p>
<p>Interestingly, just last week, there was a conference in France called Symfony Live, which is the yearly conference of Symfony, another very popular PHP framework. They announced that, in the next major version of Symfony, they&#8217;re going to use parts of Zend Framework as the basis, instead of reinventing the wheel again.</p>
<p>I think, in a nutshell, that gives a pretty good indication of how far Zend Framework has come in terms of becoming somewhat of a standard for building PHP applications, so that even other frameworks sometimes decide to rely on it.</p>
<p><b>Scott:</b> How does Zend the company fit into the equation?</p>
<p><b>Zeev:</b> Andi and I founded the company in 1999, with a broad mandate to provide solutions around PHP. That&#8217;s kind of the mile-high view of the company.</p>
<p>At ground level, there are several interesting facets of the business. First, we are very deeply involved in the open source projects that I mentioned. That includes PHP itself, to which we&#8217;re still major contributors, and we&#8217;re probably the biggest contributor to and maintainer of the engine, called the Zend Engine.</p>
<p>In addition to that, there&#8217;s the Zend Framework, a community project for which we are leading the way and contributing alongside countless other contributors outside the company.</p>
<p>There are some other open source projects we&#8217;re involved with, like PDT, the Eclipse PHP Development Tools project that we’re leading, and a couple of others, but those are the major ones.</p>
<p>Zend also provides commercial software on top of the open source software. For instance, for development, we offer Zend Studio, which is an advanced version of PDT with a broader set of features.</p>
<p>On the server side, we offer Zend Server, which provides an easy-to-use, easy-to-install version of PHP, plus quite a few additional features around performance optimization, proactive monitoring and root-cause analysis, job-queuing, scalability, and so on. We offer both a free and a commercial version of Zend Server for Windows, Linux and IBM i. The free version is not entirely open source, though, so it&#8217;s not in the open source line.</p>
<p><a name="php"></a></p>
<p><b>Scott:</b> Can you tell the story of how you got involved in PHP originally?</p>
<p><b>Zeev:</b> OK. In 1997, Andi and I were students at the Technion, a technology university in Israel. While in school, both of us were also working part-time at Internet service providers. Early on, we found that we preferred the practical aspects of our work at the ISPs over our more theoretical studies.</p>
<p>Obviously, as part of the curriculum, we had to develop several projects, and one of the projects we decided to take on was to develop a shopping cart application. Today, one can choose from various shopping carts available to just download and maybe hack a bit to make it one’s own; but back in 1997, this kind of work was considered cutting-edge technology.</p>
<p>We were the only two students who asked our instructor for permission to use PHP/FI as our language for developing the project. All of the other students asked for Perl, which was the undisputed king of the web at that time.</p>
<p>I was familiar with PHP/FI through my part-time job, and thought it would be easier than Perl with its headaches, so Andi and I decided to use it. It wasn&#8217;t easy to convince our instructor to let us use this thing called PHP that she&#8217;d never heard of. Eventually, she agreed, but with a warning to us that, &#8220;If you bump into issues, because this is not very standard, then it&#8217;s your problem.&#8221;</p>
<p>We started implementing the project, and about 30 or 45 minutes into it, we started bumping into the kinds of issues that forced us to eyeball the source code to figure out where things were going wrong.</p>
<p>We were looking at parse errors, syntax errors and such, from all different sides, and trying to trim and cut the code and see how it behaved. After a while, we concluded that we weren&#8217;t doing anything wrong; there were parser issues that made it reject perfectly valid code.</p>
<p>We sent an email to Rasmus Lerdorf, because it had become clear to us that the language didn&#8217;t understand its own basic syntax. He came back to us and said, &#8220;Look, guys, I&#8217;m not a CS major. I just hack this around. Feel free to propose patches.&#8221; We decided to give that a try. We dove into the source code, trying to understand it more deeply, and we were pretty stunned by what we saw.</p>
<p>On one hand it was pretty amazing that Rasmus managed to create something that worked as well as it did, given the lack of architecture. On the other hand, the experience was like opening the hood of one’s car and, instead of an engine, there were mice running to move the wheels. Very impressive if those mice could make the car drive reasonably well, but still not the kind of engine one would choose to power the car on the highway…</p>
<p>Since we preferred the practical elements of computer science to the theoretical ones, we decided to try to implement it from scratch, to see how far we could go. We managed to write it from scratch, and during that time, we not only changed the implementation; we also turned PHP/FI, which was sort of a partially defined pseudo language, into a much more well-defined one. By the way, Rasmus did fix the parse error we bumped into very shortly after we reported it, but the seeds were already sown.</p>
<p>We literally eliminated the possibility for the language not to understand its own syntax, by using the compilation tools the right way. I want to add here that our compilation techniques professor, Michael Rodeh, pushed us to touch base with Rasmus and convince him to move over to our implementation. That in itself is interesting.</p>
<p>Rodeh had a firm view that there are too many development languages in the world, and he didn&#8217;t want to be in any way responsible for the creation of yet another one. He hoped we’d convince Rasmus to replace his version of the language with ours, and join us in working on the new implementation.</p>
<p>We weren’t sure that Rasmus would do it, but he was actually quite happy to, and that&#8217;s the story of how the modern incarnation of the PHP project was born. Rasmus had been working almost solo on PHP/FI 1 and 2 between &#8217;95 and &#8217;97.</p>
<p><b>Scott:</b> It&#8217;s interesting, because PHP is such a long-lived language, whereas so many other web-programming languages have sort of risen and fallen, although PHP has obviously gone through a lot of evolution, as well. </p>
<p>I worked as a developer for a long time, and like many developers, I often found that I just wanted to use a new programming language or a new framework on the next project, so I could learn more. And so, over the years I have explored all kinds of UNIX shell commands, scripting languages like Tcl/Tk, web-development languages, low-level C++ and more.</p>
<p>PHP is really as much here to stay as it&#8217;s ever been. Why do you think it has been so successful and long lived?</p>
<p><b>Zeev:</b> I don&#8217;t pretend to know the complete answer to that question, because at the end of the day, a lot of this has to do with dynamics that are too big to analyze.</p>
<p>But, if we put aside a strictly scientific answer, I think that the key ingredient from a technology standpoint is the focus on simplicity. Trying to really solve the web problem in the simplest possible way, and not necessarily always the most ‘elegant’ way.</p>
<p>I think that some people might look down on PHP, because it might not always do things in the elegant way that they expect. Another way to put it is that I think Perl prides itself on the ability to do one thing in many different ways, and to do so in a way that looks very elegant.</p>
<p>At the same time, some people would look at that same piece of code and while they might find it elegant, they might also find it very difficult to comprehend.</p>
<p>With PHP, we have avoided that and focused instead on keeping things simple. We have not tried to create a general-purpose language, but rather a web-specific language. From the beginning, the PHP community started to thrive, and we had a self-nurturing cycle.</p>
<p>In addition to the language itself being simple and easy to use, and very fit for web applications, developers have been very enthusiastic about it and helped others move into PHP. Those were the two key factors for success: a self-nurturing community and PHP’s simplicity. There are many other factors, but honestly, they&#8217;re not all that unique to PHP, such as good support for databases. We had very good support for MySQL very early on.</p>
<p>Probably, the reverse is also true, and MySQL may owe some of its success to PHP in much the same way. It&#8217;s difficult to say which project contributed to which other project more.</p>
<p><a name="history"></a></p>
<p><b>Scott:</b> I saw something similar happen with Visual Basic, which also set out to be simple. Its initial goals were to be very approachable and learnable, helping people become better programmers and get more done.</p>
<p>I would imagine that PHP probably faced a little bit of tension because of its focus on simplicity and giving people an easy way to get things done. Specifically, it was relatively easy for people who didn&#8217;t have computer science degrees to pick it up and be productive with it.</p>
<p>I imagine that PHP really had to evolve to try to be very approachable and learnable by amateur and hobbyist developers, while also meeting the needs of some of the best developers. Is that where something like the Zend Framework comes in? Talk a little bit about how PHP has addressed that tension over the years.</p>
<p><b>Zeev:</b> PHP 3 became very popular very quickly.  We designed it as a powerful language, but with simple, small apps in mind.  In practice, though, more and more people were using it for fairly complex applications with growing needs for performance.  It quickly became apparent that Andi and I weren’t two geniuses that got it right the first time around.</p>
<p>Almost immediately after PHP 3 was out, Andi and I realized that we could have done things differently. Developers were beginning to use PHP for really complex and demanding scenarios. Basically, the language was playing catch up between functionality and performance. It was capable, but slow, so with PHP 4, we made a lot of progress to solve that problem.  </p>
<p>In PHP 5, which was the next step, we really re-invented and re-implemented the whole object-oriented model, almost from scratch. That was perhaps the biggest backward-compatibility break that we’ve ever made in PHP. PHP 5 brought the language to new realms. Beforehand, it was largely ignored by the architects of the world; after the release of PHP 5, some of the mega-vendors, including IBM, Oracle and Microsoft, took notice and have become very much involved with PHP.</p>
<p>The next step in the evolution was the frameworks, specifically Zend Framework, where we went one level above the model and introduced a class library that provides the building blocks and design-pattern implementations to help developers get productive quickly and create very maintainable applications.</p>
<p>Every step of the way, we kept existing users and existing types of code in the front of our minds. Even though it&#8217;s becoming less popular, one can still write a PHP procedural application today. My gut-feeling estimate is that it still accounts for at least half of the lines of PHP code in use.</p>
<p>People often ask me whether I envision the procedural support of PHP ever being deprecated in favor of a purer object-oriented code, and the answer is no. I don&#8217;t expect it to ever happen, and not only because I think we would be breaking God knows how many billions of lines of code, but also because I think that for many developers, that&#8217;s actually quite a nice way to start in scripting.</p>
<p><b>Scott:</b> Yeah, exactly.</p>
<p><b>Zeev:</b> I think we will generally move to the framework level as the place where innovation happens. The language has all the essential building blocks that frameworks need to do most of the innovating.</p>
<p>We will continue to introduce major new language features, as we did with PHP 5, yet that is clearly not where most of the innovation happens today. I think taking a look at frameworks, and maybe extensions that connect PHP to all sorts of data sources and portals, is where the vast majority of innovation is going to happen going forward.</p>
<p><b>Scott:</b> If you feel that a lot of the innovation is happening outside the core, that is a good thing in a lot of ways. It provides reassurance to the market place so people know they can write code on it and it&#8217;s likely to continue to run into the future.</p>
<p>I think you&#8217;re exactly right about keeping procedural support. You end up with different factions, and there are always people who will say that if somebody can&#8217;t do object-oriented programming, then they have no business writing code.</p>
<p>On the other hand, you&#8217;ve got the entrepreneur in a small business who just wants to modify the company web site, and is able to get it done without hiring an expensive consultant. I think that if some people want to do everything with object-oriented code, that&#8217;s great. But there is a legitimate community out there using it as a procedural scripting language that makes a lot of sense to them, and enabling them to get their job done is important too.</p>
<p>I guess another testament to the success of PHP is as the &#8220;P&#8221; in the LAMP stack, even though it&#8217;s not just a Linux technology. Probably a lot of PHP code ran on UNIX back in the day and still does.</p>
<p>Microsoft has obviously also taken a pretty strong interest in it, doing real engineering work in Windows 7 to make PHP run better and supporting PHP up on Windows Azure. Are you involved in any of that? And what do you think of PHP on platforms that aren&#8217;t Linux?</p>
<p><a name="microsoft"></a></p>
<p><b>Zeev:</b> We are very involved with bringing PHP to non-Linux systems. Initially, we developed PHP on Linux, but then moved to Windows when we started working on PHP 4. I think it was Windows 2000 and Microsoft Visual C++ 6. That was in 2000, or maybe 1999, so the relationship between PHP and Windows started a long time ago. </p>
<p>It really materialized relatively recently, in the last three years or so. In 2001, Andi and I were invited to Redmond by Microsoft to see if we could significantly improve the performance and reliability of PHP on Windows.</p>
<p>We were invited by a team called the &#8220;Dotcom SWAT Team,&#8221; which was formed to convince Dotcoms using Linux to move to Windows Server. They were bumping into a lot of people who were very open to Windows Server but wanted also to use PHP.</p>
<p>We stayed there one week and made some progress, but the next step actually took a few years to complete. I think real progress started when Microsoft and some other big organizations felt that it was really important to support PHP, and to make PHP run well under Windows.</p>
<p>Microsoft has changed in the way it perceives its involvement with open source projects, and has started putting lots of resources into making open source projects run well on Windows.</p>
<p><b>Scott:</b> How did your involvement with Microsoft evolve after that initial contact?</p>
<p><b>Zeev:</b> We at Zend worked together with Microsoft in 2006 and 2007 to really improve the stability and performance of PHP on Windows. To deploy PHP on Windows, we introduced the use of FastCGI, an architecture that wasn&#8217;t new but wasn&#8217;t very common.</p>
<p>We changed the PHP build and the way it&#8217;s actually running on Windows to run as a single server application with multiple processes, which is more similar to what you find on Linux, as opposed to a single process with multiple threads. It goes much faster, and more importantly, it has reached production quality in terms of reliability. We have been working with Microsoft ever since, and today Windows is a viable deployment platform for PHP apps.</p>
<p>The Zend product line evolved from a situation where Linux was perhaps our main supported operating system&#8211;along with Solaris and a couple of others&#8211;to one where today both Windows and Linux, as well as Mac OS X on the development side, are first-class citizens.</p>
<p>We&#8217;ve seen quite a lot of uptake of PHP on Windows, and we are very happy about it. Zend has always been very pragmatic, and we wanted to see PHP running on as many systems as possible. </p>
<p><a name="cloud"></a></p>
<p><b>Scott:</b> PHP is a higher-level language, as opposed to something like C, so it insulates the developer in a lot of ways from OS-specific APIs and that kind of stuff. I mean, you don&#8217;t have to know a lot about POSIX-compliant C libraries to be a PHP developer.</p>
<p>Another factor is that companies are now starting to look at what percentage of their applications make sense to run in the cloud. That&#8217;s kind of an extension to thinking that they want to partner with a hoster, rather than necessarily wanting to run every application in their own data centers, so the hoster can manage and maintain the system, guarantee uptime, and that sort of thing.</p>
<p>To me, cloud is an extension of that, but with some important differences. For instance, cloud basically gives you an API that lets you remotely stop and start instances. Or if it&#8217;s more of a Platform-as-a-Service like Google or Azure, it does scaling silently and automatically.</p>
<p>As you consider cloud providers, at one end there is Google App Engine, and it seems like the fact that they didn&#8217;t support PHP was an enormous hole, since a lot of things like Drupal, Joomla, and WordPress are written in PHP.</p>
<p>For a lot of people, that&#8217;s the starting point, and obviously with Amazon Web Services, PHP runs just fine, because it is kind of an Infrastructure-as-a-Service that can run whatever they want. What are the opportunities and challenges around cloud development and PHP&#8217;s place in the cloud?</p>
<p><b>Zeev:</b> As you said, the basic stuff already works, because in the most naive way, one can look at cloud as just another way to host applications. Obviously, that works pretty well already, except for Google App Engine, which doesn&#8217;t support PHP.</p>
<p>If we look a little deeper and try to see the opportunity here, I think we have done some interesting work around cloud to make PHP even more attractive for developing web applications. We started an initiative called the Simple Cloud API several months ago, partnering with IBM, Microsoft, and several other cloud vendors, with the goal of creating an abstraction layer on top of cloud services.</p>
<p>I&#8217;m not talking about the Infrastructure-as-a-Service APIs, but about accessing the cloud services that developers typically have access to for an application that runs inside the cloud, like knowledge tables and things like that. The goal is to create an abstraction layer on top of those services, something developers could look at as a sort of ODBC for cloud services.</p>
<p>We are half-way to implementing it, and taking a look at simplecloudAPI.org will show where we are in the process. When we finish and push it out, it&#8217;s probably going to be part of the Zend Framework. The implementations will be open, and developers will be able to implement it in other frameworks, as well as in other languages.</p>
<p>I think that introducing something like this into PHP, and the Zend Framework, is going to be a first, and it may help spur companies and ISVs not only to use the cloud but actually to develop for the cloud. </p>
<p>It allows developers to create applications in a slightly different way. If they need to create very scalable applications, things like big tables are very attractive, probably more so than relational databases.</p>
<p>When ISVs and other companies start developing for the cloud and not just deploying on the cloud, I think that PHP, together with the Simple Cloud API, is going to be a very effective choice.</p>
<p>Other than that, I think that the more obvious approaches are also going to happen. For example, Zend software could be provided as Amazon or Azure instances that developers would be able to deploy quickly, instead of having to provision a server and install Zend Server or PHP on it. They would be able to just go to the cloud, start up an instance, and boot an application there. Some of this functionality is already available.</p>
<p>We&#8217;re also thinking about how to bring forth innovation and change the way people develop applications once they have access to cloud services. And that is, perhaps, one not-so-obvious direction that we&#8217;re taking.</p>
<p><b>Scott:</b> With the Simple Cloud API, is the idea partly to abstract the fact that different cloud providers have different ways of providing, for example, a table-driven distributed data store that&#8217;s non-relational? That abstraction layer would make it so programmers wouldn&#8217;t really have to code to the differences. They would be kind of coding to a common API.</p>
<p>Is the goal to make Zend PHP code more portable, or is it just to simplify development? </p>
<p><b>Zeev:</b> It&#8217;s actually both. Look at the cloud APIs today. Simplicity is not the number one goal, and sometimes vendors may indeed want to differentiate.</p>
<p>So, our goal is really twofold: both to abstract and to make applications portable. I think both are important before using those services can become more mainstream. Having abstraction, so developers can take their applications easily from one cloud vendor to another, is an important step. </p>
<p>Equally important, making these services easier to use is definitely also a goal of the Simple Cloud API, hence the word &#8220;Simple&#8221; in the name.</p>
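<p>The portability idea discussed above can be sketched in a few lines. This is only an illustration of the general adapter pattern, written in Python for brevity; it is not the Simple Cloud API itself. The names <code>TableStore</code>, <code>VendorAStore</code>, <code>VendorBStore</code>, and <code>save_and_load_profile</code> are hypothetical, and the two &#8220;vendors&#8221; are in-memory stand-ins rather than real cloud services.</p>

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Tuple


class TableStore(ABC):
    """Hypothetical common interface that application code targets."""

    @abstractmethod
    def put(self, table: str, key: str, item: dict) -> None:
        ...

    @abstractmethod
    def get(self, table: str, key: str) -> Optional[dict]:
        ...


class VendorAStore(TableStore):
    """Stand-in for a vendor whose native store is keyed by (table, key) pairs."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], dict] = {}

    def put(self, table: str, key: str, item: dict) -> None:
        self._rows[(table, key)] = dict(item)

    def get(self, table: str, key: str) -> Optional[dict]:
        row = self._rows.get((table, key))
        return dict(row) if row is not None else None


class VendorBStore(TableStore):
    """Stand-in for a vendor whose native store nests rows under each table."""

    def __init__(self) -> None:
        self._tables: Dict[str, Dict[str, dict]] = {}

    def put(self, table: str, key: str, item: dict) -> None:
        self._tables.setdefault(table, {})[key] = dict(item)

    def get(self, table: str, key: str) -> Optional[dict]:
        row = self._tables.get(table, {}).get(key)
        return dict(row) if row is not None else None


def save_and_load_profile(store: TableStore) -> Optional[dict]:
    """Application code: depends only on TableStore, never on a vendor class."""
    store.put("profiles", "zeev", {"role": "CTO"})
    return store.get("profiles", "zeev")
```

<p>Because <code>save_and_load_profile</code> only sees the shared interface, swapping <code>VendorAStore()</code> for <code>VendorBStore()</code> requires no change to the application code, which is the kind of portability described in the answer above.</p>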
<p><b>Scott:</b> Right now, there&#8217;s a lot of interest in the cloud as a paradigm shift. It can be used for simple hosting, but it&#8217;s a lot more interesting when used around applications that really require a lot of elasticity.</p>
<p>People talk about the fact that there aren&#8217;t a lot of standards out there, and different cloud providers do things in different ways, whether it&#8217;s a VMware vCloud partner, Amazon, Azure, or Google App Engine.</p>
<p>At the language level, or at the library level, it seems like a lot of people are tackling the problem of how to standardize and achieve portability. The angle Zend takes is interesting, because it sounds like a PHP developer wouldn&#8217;t really have to learn a lot of non-PHP technologies to get the portability they&#8217;re looking for.</p>
<p><b>Zeev:</b> Most of the standardization efforts that I&#8217;ve come across around the cloud have to do with standardizing the Infrastructure-as-a-Service part of it: managing instances, starting and stopping instances, and so on. I haven&#8217;t come across a standardization effort for cloud services per se, regarding those services that one gains access to once inside the cloud.</p>
<p>That&#8217;s where we decided to focus, and it makes sense for a library to tackle that, for the framework.</p>
<p><b>Scott:</b> This has been great, but I want to be sensitive to the time. Any closing thoughts from your end?</p>
<p><b>Zeev:</b> On a personal note, it has been and still is an amazing ride. As you can understand from the origins of how our involvement began, neither Andi nor I had any clue whatsoever that PHP would be as popular as it is today. We would never have expected to see the White House website, Facebook, or Wikipedia standing on top of the technology that we helped get its start.</p>
<p>I&#8217;m confident that PHP has a bright future. Its popularity has never been higher, and it&#8217;s growing all the time, moving out into the mainstream. Many Java developers are now looking into scripting with PHP as a better way to tackle web applications.</p>
<p>Overall, I think we already are in a good position, and the road ahead looks very positive.</p>
<p><b>Scott:</b> Thank you very much for taking the time to chat.</p>
<p><b>Zeev:</b> You bet. Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/04/07/interview-with-zeev-suraski-cto-zend-co-creator-php/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Interview with Mihir Shukla &#8211; CEO Automation Anywhere</title>
		<link>http://howsoftwareisbuilt.com/2010/03/15/interview-with-mihir-shukla-ceo-automation-anywhere/</link>
		<comments>http://howsoftwareisbuilt.com/2010/03/15/interview-with-mihir-shukla-ceo-automation-anywhere/#comments</comments>
		<pubDate>Mon, 15 Mar 2010 17:29:15 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=300</guid>
		<description><![CDATA[In this interview, Mihir talks about how he built a company to bring business process and IT automation down from the expensive enterprise sphere, and out to the everyman.&#160; Along the way, Mihir shares thoughts on open source and proprietary development approaches.&#160; Re-inventing the field of business-process automation Applying business-process automation principles to software testing [...]]]></description>
			<content:encoded><![CDATA[<p>
    In this interview, Mihir talks about how he built a company to bring business<br />
    process and IT automation down from the expensive enterprise sphere, and out to<br />
    the everyman.&nbsp; Along the way, Mihir shares thoughts on open source and<br />
    proprietary development approaches.&nbsp;
</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2010/03/15/interview-with-mihir-shukla-ceo-automation-anywhere/#invent">Re-inventing the field of business-process automation</a>
    </li>
<li><a href="http://howsoftwareisbuilt.com/2010/03/15/interview-with-mihir-shukla-ceo-automation-anywhere/#apply">Applying business-process automation principles to software testing</a>
    </li>
<li><a href="http://howsoftwareisbuilt.com/2010/03/15/interview-with-mihir-shukla-ceo-automation-anywhere/#create">Creating software-testing routines that are resilient when application code changes</a>
    </li>
<li><a href="http://howsoftwareisbuilt.com/2010/03/15/interview-with-mihir-shukla-ceo-automation-anywhere/#relation">The nature of innovation in proprietary and open source approaches</a>
    </li>
<li><a href="http://howsoftwareisbuilt.com/2010/03/15/interview-with-mihir-shukla-ceo-automation-anywhere/#impact">Positive and negative impacts of a down economy on software transitions</a>
    </li>
</ul>
<p><span id="more-300"></span></p>
<p><b>Scott Swigart:</b> To start, please take a moment to introduce yourself, your company, and its key products.</p>
<p><a name="invent"></a></p>
<p><b>Mihir Shukla:</b> I&#8217;m the CEO of Automation Anywhere, which is a software automation company. Our flagship product, Automation Anywhere, has been a leader in the space of business and IT process automation. We started about six years ago, with the goal of shaking up the market.</p>
<p>Before we entered the market, products for business and IT process automation were in the cost range of $1 million to $2 million. Even though just about every company needs business and IT process automation, the entry point was too high for most businesses to afford it. Our vision was that we wanted to do business and IT process automation 10 times faster, for one tenth of the cost.</p>
<p>To put that in real-world terms, a project that used to require $1 million should be able to happen with a budget of $100,000. And if it took 10 months to do a business and IT process automation project, it should happen in about a month. That was a pretty lofty goal, so we had to innovate in many areas and change the rules of the game. Today, we have more than 25,000 customers in more than 90 countries.</p>
<p>By shaking up the market in business and IT process automation, we found that thousands of users are using it for test automation purposes. Last year, we decided to more formally enter into that market, and we launched the product Testing Anywhere.</p>
<p><b>Scott:</b> For a bit of level-setting, how would you define business process automation and IT automation, just to provide a sense of exactly what those two things look like?</p>
<p><b>Mihir:</b> Every business has many, many business processes, which could include generating reports, or entering patient data into a health care system if you&#8217;re a hospital. If you&#8217;re a retail company, your business processes would include checking inventory and reordering whatever is in short supply. If you are a dentist, a relevant business process could be to remind people of their upcoming appointments. </p>
<p>And to give an example of an IT process, you may want to install a new operating system patch on various computers. You may want to monitor your data center and find out which hardware is performing below expected levels. You may want to restart various servers routinely.</p>
<p>Those are some examples. We have learned that, across our more than 25,000 customers, there are 20,000 unique business processes that businesses rely on.</p>
<p><b>Scott:</b> And then on the testing automation side, is that automated QA and software testing?</p>
<p><b>Mihir:</b> Right. That is a very specific example of a process that needs to be automated, but yes, test automation is part of automating a QA process.</p>
<p><b>Scott:</b> You mentioned that your goal when you set out was to complete implementations for a tenth of the cost, in a tenth of the time. What makes your business process, IT automation, and test automation different? </p>
<p>These are definitely spaces where there&#8217;ve been a lot of products in the marketplace for a long time, with really large vendors involved with patch management, systems management, and so on. Talk a little bit about your approach and what makes it different and less expensive and more productive.</p>
<p><b>Mihir:</b> We decided that if we are to improve it by a factor of ten, we could no longer do what has been done before, or we would be limited to perhaps a 10 or 20 percent improvement. We looked at the space and where the costs involved come from. </p>
<p>We found that the dominant approach to business and IT process automation was very programming centric and involved a lot of consulting requirements. There was not a lot of focus on putting intelligence in the system itself.</p>
<p>Let me give you an example. In business and IT process automation, we created a drag and drop environment and intelligent automation that automatically figures out what it is that you intend to do.</p>
<p>So, for example, consider a business process where I log into my ERP system, type in my dates, generate a report, and email that report to three executives. When I perform that business process, behind the scenes, our software would automatically create the entire automated business process. </p>
<p>It is done in a very intelligent fashion, so it figures out the intent of what you want to do, rather than focusing on your actions themselves. For instance, if you double click on your desktop, we know that you intend to start an application, so we focus on that, rather than the actual fact that you double-clicked.</p>
<p>We developed some technologies that were groundbreaking in terms of how we integrate with various systems for automation. Instead of creating specific adaptors for various systems, we came out with an on-demand integration adaptor for any Windows application. So, suddenly you could integrate with any of the 20 million applications out there.</p>
<p>We also implemented a visualization technology, so the entire automation became visual. At every step of the way, you have a visual picture of what part you are automating. </p>
<p><a name="apply"></a></p>
<p><b>Scott:</b> How are you extending that to the test automation space?</p>
<p><b>Mihir:</b> We intend to bring the same spirit of innovation that we did with Automation Anywhere. </p>
<p>Our viewpoint toward the test automation market is that in the last 10 years or so, there hasn&#8217;t been any true innovation. Many of the tools out there, even if they have more bells and whistles, more or less work the same way they used to. If anything, it takes longer to learn those tools, and even longer to automate some of the new technologies.</p>
<p>In the test automation space, we brought forward quite a few innovations. For example, one of the problems we saw in the test automation space is the problem of test portability. You may have a test case that works in your environment, but not elsewhere, such as in a production environment, a partner&#8217;s site, or a customer&#8217;s site. In cases like that, I basically cannot take my test and run it in another environment.</p>
<p>We came out with a very intelligent solution, where you can export a test into a standalone executable that has an intelligent Testing Anywhere agent inside it.</p>
<p>Consider a very simple example of a communication between a QA person and a developer. If a QA person finds a problem, they could report it to the developer, and the developer could come back and say, &#8220;I cannot reproduce your issue.&#8221;</p>
<p>In a situation like that, the conversation could go on back and forth for a while, in a very unproductive fashion.</p>
<p><b>Scott:</b> That&#8217;s certainly a familiar experience to a lot of people in the software industry. How does your approach change that dynamic?</p>
<p><b>Mihir:</b> With Testing Anywhere, the QA person could take the failing test case, export it as an executable, and attach it along with the bug report to the developer. The developer could run this test case, standalone, without installing Testing Anywhere, and instantly be able to reproduce the problem.</p>
<p>This innovation alone has opened up many possibilities to our customers. We see customers who are using the test to exe capability to do compatibility testing, for example, where you have a test case on one version of the operating system and you want to try it out on a different version to see if the same test case could work. </p>
<p>One of the areas where it&#8217;s used significantly is in Agile methodology, where some of our customers have two-week release cycles. That gives you about a week&#8217;s time for a feedback loop between a developer and QA to validate the feature, communicate it back to the developer, reproduce it, get it fixed, revalidate it again, and make a release. </p>
<p>They use the test to exe capability for communication between developers and QA. Many of our customers are using it to reproduce problems from a customer&#8217;s site or a partner&#8217;s site. Problems could be because of user error, or they could be genuine bugs based on certain values. </p>
<p>Issues that were taking an enormous amount of time are being solved now at a fraction of the cost and time.</p>
<p>One of our customers summarized using the test to exe capability well. He said, &#8220;15 years ago, when there weren&#8217;t mobile phones, life was just fine, but today, if I forget my mobile phone at home, I&#8217;ll miss it 15 times during the day.&#8221; Once you have it, you cannot imagine going back.</p>
<p><a name="create"></a></p>
<p><b>Scott:</b> When I think about the test tools that I&#8217;ve used over the last decade or so in software development, many of them seem brittle. Even when you make relatively small changes to an application, you may have to recreate all of the tests, because a button isn&#8217;t at the same pixel location, or the output that shows up on the screen isn&#8217;t exactly the same, and so it doesn&#8217;t key off of it quite right.</p>
<p>Talk a little bit about that notion of making tests more resilient, or the automation more resilient so that the amount of time that you have to go back and recreate them is minimized.</p>
<p><b>Mihir:</b> The argument for test automation is strongest in Agile methodology, where if you&#8217;re going to run through all the tests every two weeks, you&#8217;d better automate. The first issue there is that, as you very correctly pointed out, if you change an application, you shouldn&#8217;t automatically have to go back and change things. In today&#8217;s day and age, you don&#8217;t have months and months of product cycles to do that. </p>
<p>Another issue is that even if you have to change it, it shouldn&#8217;t take you too long to change it. We know of examples where it takes you 10 minutes to make those changes with Testing Anywhere, and it takes you two days to do it with some of the other software. The difference can be enormous.</p>
<p>The third point is that you want a process that is collaborative, especially in Agile methodology. Some of the older testing tools are designed for waterfall models or others where you have a year or two to develop and maybe six to eight months to do testing. In that methodology, you are OK, because you&#8217;re going through a systematic process. Well, in Agile there&#8217;s also a systematic process, but it&#8217;s a lot more collaborative.</p>
<p>With Testing Anywhere, collaboration is built in. Whether it&#8217;s two QA guys, a QA guy and a developer, a QA guy and a product manager, or a QA guy and outward-facing business users, you can easily collaborate on various aspects. </p>
<p><b>Scott:</b> What about documentation? How does that fit into the use of Testing Anywhere?</p>
<p><b>Mihir:</b> You don&#8217;t want to make Agile into a cowboy methodology, right? Testing Anywhere creates documentation on the fly that is lightweight but effective, and that helps us create a lot of traction in the market.</p>
<p>I&#8217;d like to come back to the issue of what you can do so that if things change you don&#8217;t have to come back to it to change your tests again. One of the things that Testing Anywhere does is that, as you create any test case, it doesn&#8217;t leave it to the user to figure out and implement what I call the plumbing logic.</p>
<p>The product is intelligent enough to figure out that this is what you intend to do, and it will create an automatic test routine by default. For example, if you double click on a desktop, it automatically figures out that you intend to start an application, and internally, it will just launch that application. </p>
<p>If you take that test case to any other machine, it will continue to work. Whether it&#8217;s a Windows application or a web application, it automatically figures out all the controls and it figures out what is the optimal way of matching those controls. It allows you the flexibility to change anything if you like, and the intelligence comes automatically.</p>
<p><b>Scott:</b> In other words, if you&#8217;re double-clicking an application on the desktop, and on a different machine, that icon is in a different location, that doesn&#8217;t matter, because the test has figured out the intent. As long as the application is installed, the test is still going to launch the app, even if something&#8217;s slightly different from one environment to another.</p>
<p><b>Mihir:</b> Absolutely. Testing Anywhere does about 100 things like that intelligently and automatically, so when you use the tool, the difference is quite obvious. Relative to using some of the older tools, you would go about 200 percent to 300 percent faster in creating, developing, editing, and managing your test cases. The effect is very measurable.</p>
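<p>The intent-versus-action distinction discussed above can be illustrated with a small sketch. It assumes nothing about how Testing Anywhere works internally; the names <code>RawClick</code>, <code>LaunchApp</code>, <code>record</code>, and <code>replay</code>, the icon coordinates, and the install path are all hypothetical, and Python is used only for brevity.</p>

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class RawClick:
    """What a naive recorder stores: a double-click at screen coordinates."""
    x: int
    y: int


@dataclass(frozen=True)
class LaunchApp:
    """What an intent-based recorder stores: which application to start."""
    app_name: str


def record(click: RawClick, desktop_icons: Dict[Tuple[int, int], str]) -> LaunchApp:
    """Translate a raw double-click into the intent behind it.

    desktop_icons maps icon positions on the recording machine to app names.
    """
    return LaunchApp(desktop_icons[(click.x, click.y)])


def replay(step: LaunchApp, installed_apps: Dict[str, str]) -> str:
    """Resolve the intent on the replay machine and return the path to launch.

    installed_apps maps app names to install paths on the replay machine;
    icon positions are never consulted.
    """
    return installed_apps[step.app_name]
```

<p>For example, recording a double-click at <code>(40, 80)</code> on a machine where that position holds a &#8220;Calculator&#8221; icon yields <code>LaunchApp("Calculator")</code>. Replaying that step on another machine needs only the application to be installed there, regardless of where its icon sits.</p>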
<p><a name="relation"></a></p>
<p><b>Scott:</b> To shift the conversation over a bit to the business side, just today, Red Hat said on a community open source site that, with the exception of Apple, all the innovation that&#8217;s happening today is in open source.</p>
<p>In reality, though, we come across a lot of proprietary software companies that are doing very interesting things. Talk a little bit about your decision to build intellectual property internally and not develop this as an open source product.</p>
<p><b>Mihir:</b> As a company, we are founded on the principle of innovation. We believe that innovation is not a property of any particular model, but a result of what you believe in. We&#8217;ve seen innovation across all kinds of models.</p>
<p>Our goal was to provide a very reliable but at the same time breakthrough technology that has never been seen before, and based on our experience in the industry for the last 15 or 20 years using many tools and technologies, we knew we had ideas that would shake the market up. </p>
<p>It made sense for us to come out with these technologies using a very fast-paced model. We internally use Agile methodology ourselves, and we incorporate many of these new breakthrough features. This made more sense for us than open source.</p>
<p>We also have a Team Anywhere program, which is a program where we invite an elite group of our customers to influence our product direction and provide feedback. In a sense, that creates a collaborative effort that has some things in common with open source. </p>
<p>The collective experience of our 25,000 customers is enormous. These are the people who live the day-to-day life of business process automation or test automation, and they have become a driving force for our innovation as well.</p>
<p><b>Scott:</b> In other words, whether your software&#8217;s being built by the community or the community&#8217;s just providing feedback, collecting that input from the actual users of the product is really important. I would imagine, too, that you&#8217;ve probably got some stories about people starting to use your products in ways that you never really imagined.</p>
<p><b>Mihir:</b> Absolutely.</p>
<p><b>Scott:</b> Talk a little bit about some of those cases.</p>
<p><b>Mihir:</b> We believe in innovation with the customer, and we have about six or seven different channels through which customers can talk to us. If a customer wants to tell us of a problem, something good, something that could be improved, or something that&#8217;s awesome, we always want to hear from them. </p>
<p>Within the product, we have various ways customers can provide us feedback. On our website, we offer a free service where we have a team of experts who can provide customers with assistance. We have learned so much from our users, across the board, in business and IT automation as well as test automation.</p>
<p>Let&#8217;s take an example. When we created the test to exe feature, we understood that test portability could open up many avenues, and we had ideas about a few. We saw enormous breadth in the ways customers incorporated it into their communication between QA and development.</p>
<p>At the site of one of the customers we visited recently, when a bug is identified, it&#8217;s back with the developer within about five minutes, and in the next five or ten minutes after that, the developer has reproduced it. That was amazing, and it&#8217;s just one example.</p>
<p><a name="impact"></a></p>
<p><b>Scott:</b> When you launched the company six years ago, the global economy was growing rather nicely. A year and a half or two years ago, the global economy pointed its nose at the ground and started to descend pretty rapidly. </p>
<p>What&#8217;s your advice to somebody who&#8217;s thinking about being a software entrepreneur, in terms of navigating the inevitable economic ups and downs?</p>
<p><b>Mihir:</b> I think one silver bullet, even if it&#8217;s not necessarily a novel thought, is that if you are very close to your customers, you can navigate through any economic ups and downs. Trouble happens when you are too far from the customer and you cannot adjust fast enough to the changing times. </p>
<p>For us, 2009 was pretty good; we had about 50 percent growth. We faced issues just like most people did, but you want to stay close to the customer, understand their pain points, listen to them, and use that as your inspiration for innovation.</p>
<p>When you do that, success is inevitable. When we launched the product Automation Anywhere, I wanted to be the first one to buy the software, just as a matter of pride. The whole company got together, we launched the product, took a coffee break, and gathered around a table. A customer in Australia beat us to it. That&#8217;s the story.</p>
<p><b>Scott:</b> You were doing something in the marketplace that you planned from the beginning was going to be disruptive. You were going not for incremental improvement, but really for an order of magnitude improvement. </p>
<p>Maybe during good times, a lot of companies might have said, &#8220;Well, we have testing tools and automation tools. We know them, we&#8217;ve bought them, they&#8217;re expensive, and it&#8217;s not necessarily strategic for us to look at replacing or augmenting those.&#8221;</p>
<p>In times of crisis, companies go out and really look at how maybe they could do things differently, and they&#8217;re motivated to go look for savings under every rock. Did you find that the cost and time savings of your product positioned you well for the downturn in terms of the interest customers may have shown?</p>
<p><b>Mihir:</b> You&#8217;re absolutely right; disruptive technologies have a huge upside in tough economic times, across all segments, including small businesses, medium businesses, and especially Fortune 500 customers. </p>
<p>Typically, Fortune 500 customers have more budget at their disposal. Of the Fortune 100 companies, 70-plus are our customers. Many of them just came to us and said, &#8220;We heard you do things differently.&#8221;</p>
<p>We heard that in tough economic times, it was hard to sell the old test automation tools, not just because they&#8217;re expensive, but because of the high cost of ownership and how long they take to ramp up, learn, modify, and manage.</p>
<p>It was taking so long that revenues of these companies were affected. I was very surprised at how open they were to accepting a new, disruptive tool in this space.</p>
<p><b>Scott:</b> We&#8217;re not very far into 2010, but I know that some software and IT companies are starting to feel like things are improving. As you look forward to 2010 and beyond, how are you feeling about growth potential?</p>
<p><b>Mihir:</b> 2010 definitely looks much better than 2009, that&#8217;s for sure. We see a significant improvement in the market, and a lot more budgets have opened up. We see a significant movement in both private and government sectors.</p>
<p><b>Scott:</b> This has been great, and we&#8217;ve covered a lot of ground. Do you have any closing thoughts?</p>
<p><b>Mihir:</b> In my experience with SIs, I found that only about eight percent of the total testing market is currently automated. They see significant potential for that to go up to about 30 or 35 percent, and everyone feels the stagnancy in the tool space in terms of the tools that are out there.</p>
<p>For this new market, they&#8217;ll look for new tools that can offer value, and we plan to fit in that space.</p>
<p><b>Scott:</b> Thanks for taking the time to talk today.</p>
<p><b>Mihir:</b> Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/03/15/interview-with-mihir-shukla-ceo-automation-anywhere/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interview with Chris Keene &#8211; CEO WaveMaker</title>
		<link>http://howsoftwareisbuilt.com/2010/03/05/interview-with-chris-keene-ceo-wavemaker/</link>
		<comments>http://howsoftwareisbuilt.com/2010/03/05/interview-with-chris-keene-ceo-wavemaker/#comments</comments>
		<pubDate>Fri, 05 Mar 2010 19:41:04 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=297</guid>
		<description><![CDATA[WaveMaker is gaining prominence, and showing up in the Gartner Magic Quadrants, to simplify cloud development.&#160; In this interview, we chat with WaveMaker CEO Chris Keene. Deciding on the open source approach to development and choosing a license Simplifying the creation of multi-tiered applications for the cloud Fitting WaveMaker into the broader open source and [...]]]></description>
			<content:encoded><![CDATA[<p>
    WaveMaker, which aims to simplify cloud development, is gaining prominence<br />
    and showing up in the Gartner Magic Quadrants.&nbsp; In this interview, we chat with WaveMaker<br />
    CEO Chris Keene.</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2010/03/05/interview-with-chris-keene-ceo-wavemaker/#decide">Deciding on the open source approach to development and choosing a license</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/03/05/interview-with-chris-keene-ceo-wavemaker/#simple">Simplifying the creation of multi-tiered applications for the cloud</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/03/05/interview-with-chris-keene-ceo-wavemaker/#fitting">Fitting WaveMaker into the broader open source and proprietary ecosystem</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/03/05/interview-with-chris-keene-ceo-wavemaker/#engage">Engaging with the community and deciding what goes into the paid edition</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/03/05/interview-with-chris-keene-ceo-wavemaker/#important">On the importance of community managers, and how to choose a good one</a></li>
</ul>
<p><span id="more-297"></span></p>
<p><b>Sean Campbell:</b> To start, why don&#8217;t you take a minute and introduce yourself and your company?</p>
<p><b>Chris Keene:</b> I&#8217;m the CEO of WaveMaker. Our company makes an open source cloud development platform, which you can think of as being like force.com without the proprietary lock in. </p>
<p>We introduced WaveMaker about two years ago as an open source Apache project, and there&#8217;s also a commercial version of that same product that has additional features, around security primarily. </p>
<p>At the end of last year, we moved that development platform into the cloud and really started to realize the full vision, similar to PowerBuilder or Microsoft Access for the cloud.</p>
<p><a name="decide"></a></p>
<p><b>Sean:</b> Tell us a little bit about the open source project and the choice to develop using community involvement.</p>
<p><b>Chris:</b> I should say first that my last company, Persistence Software, was a proprietary company, and we took that public and then sold it, so I kind of know the drill on the proprietary side. When I came to the open source world, I found that everything I thought I knew was wrong. And so the main way we&#8217;ve figured out how to do things in open source has been by doing them wrong first and then fixing our mistakes as we went along. </p>
<p>I think the upside is that if you&#8217;re willing to fix your mistakes, open source has an incredible adoption curve. In a little over a year, we&#8217;ve gotten more than 15,000 registered developers in our community. For comparison, it took the SpringSource community over six years to get roughly three times as many developers.</p>
<p>We&#8217;ve had hundreds of thousands of downloads, but what we really care about is how many people have downloaded, installed, come back to sign up for our community, and then started to participate. To build that kind of a user base in a little over a year is just incredible.</p>
<p>Still, along the way, we made a lot of mistakes. We started out, for example, with an AGPL license, and we found that the open source community didn&#8217;t really trust that license for software that was going to be embedded into their own applications, which caused us to switch over to the Apache license. </p>
<p>We&#8217;ve ended up tweaking and changing almost every element of our business model, because we&#8217;re constantly getting feedback from our community about what they&#8217;d like to see and what&#8217;s most valuable for them.</p>
<p><b>Sean:</b> With the Apache license, people can embed it in proprietary software. So the GPL license makes things compatible with bundling with a Linux distro, for example, but the Apache license makes it possible to carry open source in a proprietary product without having to make the whole thing open source. Is that the motivation?</p>
<p><b>Chris:</b> That&#8217;s exactly right. Almost by definition, anything that somebody builds with a cloud development platform is going to embed, at a minimum, your libraries. Because of that, people didn&#8217;t feel comfortable under the AGPL license. </p>
<p>Part of the reason for the difference in the community&#8217;s reaction is that AGPL is fairly new, whereas Apache has been around for a while, and people really understand and trust it. I think if you are an open source company, you need to be willing to follow the lead of your own community; otherwise your community is basically just a marketing exercise.</p>
<p><b>Sean:</b> That makes sense, and in general, open source communities vote with their feet fairly rapidly. Generally, they have a certain number of choices in any given space, and it&#8217;s instructive to consider, for example, that MySQL and Apache took off even though they obviously weren&#8217;t the only options available. </p>
<p>It seems like there is a sense of getting things just right in order to have one project versus another one become a center of gravity that attracts a lot of people.</p>
<p><b>Chris:</b> I think that&#8217;s true, even if a feature-by-feature comparison might suggest that it&#8217;s a toss-up. One thing I will say for MySQL, though, is that they absolutely got the install right, and it&#8217;s still much easier, in my opinion, to install MySQL than many of the other open source databases. </p>
<p>I think that ease of use and approachability really help drive rapid adoption. That&#8217;s a key characteristic of MySQL, Ruby on Rails, and frankly of WaveMaker as well. We think that it really helps drive success when people don&#8217;t have to invest months of their lives to figure out whether your stuff works or not.</p>
<p><a name="simple"></a></p>
<p><b>Sean:</b> Before the cloud, people were using hosting providers to build websites. Hosting providers started to get into virtual private servers, so instead of having a dedicated server that was co-located, they could spin up a virtual instance. Amazon took that to a different level, with utility pricing and letting you spin up hundreds of servers with a command line or an API call. </p>
<p>How does WaveMaker fit into that spectrum, from the dedicated, co-located server to the virtual private server with Rackspace or whoever, to running in a cloud where you&#8217;ve got options for real elasticity?</p>
<p><b>Chris:</b> The cloud itself is just a concept from my perspective; it&#8217;s really just a data center with an API. If I&#8217;ve got an application and I need to figure out where to put it, the question is, &#8220;What steps do I need to go through to get my application running someplace where people can get at it with their browsers?&#8221;</p>
<p>Of course, in the co-lo world and the hosting world, I have to call people, I have to put contracts in place, I have to pick up the phone and send emails to make something happen. In the cloud space, I just make a set of API calls. </p>
<p>Not only has the cost and complexity of making something available on the Web gone down dramatically, but I can now start to imagine businesses that previously wouldn&#8217;t have tried to put things up on the Web wanting to do so.</p>
<p>Cloud computing really lowers the bar for deployment, in terms of both systems administration and economics. But cloud computing on its own doesn&#8217;t really do anything to lower the difficulty of creating that cloud application in the first place. That&#8217;s really where WaveMaker comes in.</p>
<p>If cloud computing is all about democratization, utility computing, and allowing more people to put applications on the Web, then we also need a new set of development platforms to facilitate that. In fact, we&#8217;d argue the need not just for development platforms but actually for application templates that people can take down and customize for their own purposes.</p>
<p>That set of templates has an intent very similar to what SharePoint templates have provided: a simple means of managing projects, time, assets, and issues to allow a whole new class of developers to build and deploy web-based applications.</p>
<p><b>Sean:</b> A lot of open source development, on core products, involves very low-level C programming, or those kinds of things. Then there&#8217;s another level where people are doing Java, .NET, or PHP programming. </p>
<p>Those who are using a higher level language might be using open source libraries, test and development tools, or development environments.</p>
<p>How receptive have you found the open source community to be to the notion of 4GL languages in general? Also, do the people who are actually building stuff with WaveMaker even really know or care that it&#8217;s open source, or to them is it just a really productive way to get stuff built?</p>
<p><b>Chris:</b> I think WaveMaker has pushed the envelope in terms of the type of developers we&#8217;re trying to reach out to. The stereotypical open source developer is an alpha geek who has never gotten past Emacs as their development environment and eats other coders for breakfast. </p>
<p>The reality is that there are very few people, even very sophisticated developers, who have the full range of skills needed to build a web application. For example, a person might need to build a Java application, put an Ajax front end on it, build a database and database schema, and then deploy that into some sort of a hosting environment.</p>
<p>We&#8217;ve found that the target user base for WaveMaker consists of developers who may be very skilled but don&#8217;t have full teams where they can afford to put four to six developers with different skill sets on a project to get it through. </p>
<p>We tend to work with very sophisticated developers, but oftentimes they&#8217;re in the business units. Gartner calls these people &#8220;citizen developers&#8221; or &#8220;non-expert&#8221; developers, but it&#8217;s really the notion of people trying to get a lot done on their own.</p>
<p>Frankly, I think that&#8217;s the same crew that Ruby on Rails has appealed to. One of the interesting things for WaveMaker is that our biggest core set of customers is Macintosh developers, because they really like visual development. They tend to be very sophisticated, and they&#8217;re incredibly pragmatic; they just want to get the job done, and they want the best and most productive tool for getting that to happen.</p>
<p><b>Sean:</b> My understanding is that WaveMaker is really optimized for the &#8220;forms over data&#8221; types of applications. In other words, when NBC is trying to figure out how to build its site for streaming the Olympics, WaveMaker is maybe not the best choice. On the other hand, if somebody is building an expense-reporting application, or maybe even a whole CRM system from scratch, that&#8217;s more the niche that WaveMaker fits in. Does that ring true?</p>
<p><b>Chris:</b> Absolutely. WaveMaker really fits into the classic category of enterprise solutions that help business developers build basic business applications. We&#8217;re not trying to focus on the multimedia side of the Web, nor are we trying to focus on the complex transaction processing part of core IT. </p>
<p>We&#8217;re really about the broad middle: the 80 percent of applications in the business that are building similar functionality over and over again but customizing it slightly, which is where we think that combining a development platform with application templates is an incredibly powerful way to drive productivity broadly throughout the organization.</p>
<p><a name="fitting"></a></p>
<p><b>Sean:</b> So, if it&#8217;s running up on Amazon or you&#8217;re running it in your own local data center, it&#8217;s running on a server that has some kind of an OS, and it&#8217;s running on some kind of a Web server. What is the stack under WaveMaker? Is this a Linux/Apache sort of thing? Does it run on Windows?</p>
<p><b>Chris:</b> The stack underneath WaveMaker is a Spring/Hibernate stack. We made a decision very early on that not only were we going to release our software under open source, but we were also going to eat our own dog food, in the sense that we were going to actually build our stack on top of an open source stack. </p>
<p>That means Spring and Hibernate on the back end, and it means JSON and Dojo on the front end. What&#8217;s great about building a platform on that stack is that it&#8217;s incredibly flexible. We can run in any Java server, work with any database through Hibernate, and work with any security mechanism through Acegi. We can talk to any Web service through JAX-WS. </p>
<p>On the front end, we&#8217;re able to support accessibility and internationalization requirements through Dojo. So effectively, we&#8217;re standing on the shoulders of giants.</p>
<p>It&#8217;s very interesting that in the cloud world, and in the cloud development platform world in particular, companies have gone back to inventing their own wheels. For example, with force.com, every single element of that architecture is proprietary. </p>
<p>Not only do they have to invest a tremendous amount of resources to build the same quality of tools, but by definition, they don&#8217;t have the breadth of integration points they would get working with an open source stack.</p>
<p><b>Sean:</b> If you&#8217;re dealing with a Java shop, this stuff probably fits right in. For more of a Microsoft .NET shop, it sounds like this would still integrate pretty well, because of the technologies that you&#8217;ve talked about. Do you have any examples you can cite about this being used in a heterogeneous environment where interoperability was important?</p>
<p><b>Chris:</b> Absolutely. WaveMaker as a platform talks to any kind of a Web service, whether it&#8217;s SOAP or RESTful or RSS. We have customers who very easily integrate functionality from WaveMaker with functionality from the .NET environment. </p>
<p>That&#8217;s a big part of what&#8217;s out there, so we have to be able to build on whatever intellectual property the enterprise has already developed, and that includes .NET.</p>
<p><b>Sean:</b> Can you talk a little more about the corresponding story for shops that do Java, PHP, and that kind of stuff?</p>
<p><b>Chris:</b> We&#8217;ve got a number of examples where companies have been able to effect a real transformation. For example, KANA is a company that produces call center software. They supply call centers to about half of the Fortune 100, and they have effectively rebuilt their entire stack based on DB2, WebSphere, and WaveMaker. </p>
<p>They created an SOA bus where all of the functionality around these very complex multi-channel call centers is wrapped up into a series of web services. WaveMaker calls into those web services and provides the user interface, the ability to customize that user interface, and the ability to make changes to the back-end system.</p>
<p><b>Sean:</b> If people are using WaveMaker to build their web front-end, how does it scale up and scale out?</p>
<p><b>Chris:</b> The work we&#8217;ve done with KANA has been targeted toward some of the largest enterprises on the planet, and they&#8217;re looking at things like 20,000 concurrent users and sub-second response rates, so we feel very confident that our solution scales. </p>
<p>Here, again, there&#8217;s a huge benefit from open source. For example, if we found that our back-end wasn&#8217;t scaling the way that we wanted it to, there are a lot of Spring developers out there who could help us make it better. We&#8217;ve been doing work with KANA to make our Dojo front end perform better, and again, there are many people with expertise in building very high-performance Dojo clients.</p>
<p>So we&#8217;re able to leverage not only our own community of 15,000 developers, but the Spring community, the Dojo community, and the Hibernate community to tweak out the performance and basically learn from best practices. </p>
<p>Open source gives us a huge benefit in this area, as opposed to building our own proprietary back-end and then having to go through all the painful learning process of how to get it to perform without the community&#8217;s help.</p>
<p><a name="engage"></a></p>
<p><b>Sean:</b> I&#8217;d like to hear more about how you guys engage with the open source community, at a tactical or business level. You have obviously articulated a lot of good, strong benefits for having an open source development model, but sometimes inherent tensions can creep in. </p>
<p>For example, there are issues around having X features in the community edition and Y features in the paid edition. You don&#8217;t have a static product (it&#8217;s obviously evolving), so how do you determine what goes into the full-blown edition versus the community edition? How do you determine the levels of support?</p>
<p>These questions seem to especially get complicated when the community would really like to see something in the community edition but you&#8217;re not yet offering it.</p>
<p><b>Chris:</b> Those are hard questions, and I don&#8217;t think you ever get them completely right, even though you need to get them right enough. One thing that probably doesn&#8217;t get enough attention is that the open source community is not just developers. There&#8217;s also a very interesting level of cooperation and collaboration between open source CEOs. </p>
<p>I&#8217;ve had the benefit of being able to spend a lot of time with Marten Mickos of MySQL, Brian Gentile of JasperSoft, Rod Johnson of SpringSource, and a number of other CEOs. Those relationships have convinced me that open source is as much an attitude as it is a business model. If you really embrace the open source approach, what you get is a set of CEOs and companies that are wired to collaborate and cooperate and share best practices and learnings. </p>
<p>We found some rules of thumb for how to separate our open source and enterprise products. I think that the most important one is that the commercial features should have more to do with robustness and scalability of deployment than they should with fundamental things that you can or cannot do within development.</p>
<p>If you look across commercial open source companies, including WaveMaker, SpringSource, MySQL, and others, you&#8217;ll find trends in the things that tend to separate the commercial versions from the community versions. They tend to be clustered around management, monitoring, scalability, security, and other things that you really start to be concerned about when you take an application and deploy it into a robust enterprise deployment.</p>
<p>I think the most important thing is that the point of having a community edition versus an enterprise edition is not to make as many people as possible pay for your product. The point, to me, is that you&#8217;re trying to make the people who are willing to pay want to.</p>
<p>Therefore, you don&#8217;t pick a feature that could trip people up so that they get started with your product and then you somehow catch them because they want to do something else. The process involves sitting down in advance to determine who would never pay for the product anyway. The first step is to figure out a product that will make them happy and make them good marketers and developers for the project. </p>
<p>Next, you need to determine who would be likely to pay for the product, because they&#8217;re using commercial security, commercial databases, or commercial Web servers, or because they need higher levels of security and performance. You need to make sure that those people fall into the bucket where, in fact, there is a payment required.</p>
<p>If you think about it that way, it makes sense, and in fact, I would go one step further. You want to engage with that group of people who are never going to pay for your product and encourage them to invest time in your community. That&#8217;s the payment that you want from them. </p>
<p>For the group of people who are paying for your product, you want to provide as close to a commercial software experience as possible.</p>
<p><b>Sean:</b> Is tracking sales and trying to convert those community members over to paid members even a part of your business model?</p>
<p><b>Chris:</b> Certainly we track sales, but I have to say (and a number of other open source CEOs are starting to feel more this way as well) that people either have money in their pocket when they come into your community or they don&#8217;t. </p>
<p>If somebody doesn&#8217;t have money in their pocket, you&#8217;re not ever going to convert them. It doesn&#8217;t matter how compelling your for-pay product is, because they don&#8217;t have any money. The more you try to put the screws to them, the more you fritter away the goodwill that you would otherwise have with them. </p>
<p>Similarly, there are people that come into your community with money in their pocket, and if they find that they&#8217;re more productive, they&#8217;re getting better results, or their boss thinks they&#8217;re doing a great job, then they&#8217;ll give you some of that money relatively happily. </p>
<p>I think the whole notion is wrong that you should regard every single person in your community as a candidate for giving you their credit card.</p>
<p><b>Sean:</b> I agree. At the same time, how do you go about encouraging the non-buyers to incent other potential paying customers? I realize there&#8217;s an ethos around it, but at the end of the day there are lights to keep on, bills to pay, and things like that. </p>
<p><b>Chris:</b> From those people, we&#8217;d like to get good karma, because they like our stuff and benefit from it. Then we need to find ways in which they can give back that are non-monetary. I&#8217;ll admit that we&#8217;re not fabulous at this, although certainly the community is one place to do that. </p>
<p>Whenever we help somebody out in the community, we try to back-channel to them to make sure that things went well. If they&#8217;re excited, we usually suggest to them that, if they feel like the community helped them out, they should consider helping somebody else out by giving back to the community.</p>
<p>It&#8217;s basically karmic marketing. As people in the community get more confident in their skills, just as in any open source community, people get more involved. In addition to contributing code, for example, we&#8217;ve had people create communities in various different languages.</p>
<p>We want everybody to pay for the value that they receive, but some of the people pay in money and other people pay with their time. At some level, we&#8217;re indifferent to which they choose, because we think the more people who pay with their time, the more people they&#8217;re going to influence. Some percentage of the people they influence, as you said, are going to be looking for a commercial application of what we&#8217;ve got.</p>
<p><a name="important"></a></p>
<p><b>Sean:</b> Because of the size and the acceleration rate you guys have, do you have a community head or liaison who deals with issues like that as their full-time role? If not, do you have some kind of plan to staff up that area?</p>
<p><b>Chris:</b> I think open source is definitely a business model where you go hard or you go home, so if you&#8217;re not willing to pursue all the implications of open source, you really shouldn&#8217;t start down the path. And to me, the core of open source is the community, in the sense that if you don&#8217;t have a community, you really have nothing. </p>
<p>If I had only two employees in an open-source company, one of those people would be the community manager. That&#8217;s how important I think that position is. </p>
<p>In our company, we have a full-time person who manages the community, and his background is technical. My view is that if you have a marketing person managing your community, you&#8217;ve said something pretty profound to the community, and I think there&#8217;s a spectacular mismatch between somebody with a marketing background and the kinds of questions and concerns that the community needs to have addressed.</p>
<p>As your community grows, you may need to put a layer of management up there, but I think first and foremost, a community is about dialogue, and you need somebody who has the technical depth to be able to answer the questions of the people who come to the community. Otherwise, guess what? They&#8217;re not going to come.</p>
<p>So day one, we put a community manager in place. That is absolutely what drove us to 15,000 registered developers in our community in a little over a year. That&#8217;s absolutely what got the attention of companies like IBM, Amazon, Rackspace, and all the others we&#8217;re working with, and our community is our crown jewel.</p>
<p><b>Sean:</b> How do you go about finding somebody to be your community evangelist? In a lot of ways, you can&#8217;t measure them using standard marketing metrics. They may even be out of the building, virtually or physically, more often than a traditional marketing manager would be. On top of it, by design they&#8217;re bringing you feedback that runs, sometimes, upstream of where you want to go, but by definition you have to listen to it.</p>
<p>Your story might be a personal one, for all I know; maybe it&#8217;s someone you&#8217;ve known for years. But if you could extrapolate, what should other projects look for when they&#8217;re about to hire a community manager?</p>
<p><b>Chris:</b> The first key is that they need whatever technical depth is required to engage with specific questions in their communities. Second, you need somebody with some kind of management skills. Maybe they&#8217;re not going to be managing lots of people, but they need to be able to manage their time and their communities. </p>
<p>They need to be able to ask for help when it&#8217;s necessary. They need to be able to recognize when somebody&#8217;s really angry and be able to handle those situations. At the same time, they need to be able to recognize that somebody is really happy and take advantage of that as well, for instance by taking the situation to marketing to see what they can do with it.</p>
<p>You also need somebody with an incredible customer focus, because day in and day out, they will need to defuse people who have been disappointed in some way. People are bound to use your product for a while and then have it cause problems for them (or give them that perception). Some of those people will inevitably come back to your community yelling and screaming.</p>
<p>I tend to think that the combination of those skills is most likely to be found in a customer-support operation. People who have done technical support for software companies for a while know the job and how to keep their cool, and they tend to have a good perspective where they can see trouble brewing, and similarly they can see opportunities.</p>
<p>That, to me, is the ideal. I think you could get that person from a number of different places, but technical customer support is probably the best place to look.</p>
<p><b>Sean:</b> Well, we&#8217;re almost out of time, so are there any closing thoughts from your end?</p>
<p><b>Chris:</b> I think we&#8217;re very early in the trajectory of the open source industry, and it&#8217;s been amazing to me to see, over just the last two years, the kinds of changes in attitudes and understanding of what it takes to build a real ecosystem.</p>
<p>I think cloud computing is going to take that up another order of magnitude. Ecosystem is the killer app for the cloud, and companies that have learned the lessons of partnering and working with communities are going to be at a huge advantage as the cloud computing revolution takes off.</p>
<p><b>Sean:</b> That&#8217;s a nice place to wrap up. Thanks for taking the time to chat.</p>
<p><b>Chris:</b> Absolutely. Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/03/05/interview-with-chris-keene-ceo-wavemaker/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Interview with Jason Huggins &#8211; CTO of Sauce Labs / Selenium</title>
		<link>http://howsoftwareisbuilt.com/2010/02/22/interview-with-jason-huggins-cto-of-sauce-labs-selenium/</link>
		<comments>http://howsoftwareisbuilt.com/2010/02/22/interview-with-jason-huggins-cto-of-sauce-labs-selenium/#comments</comments>
		<pubDate>Mon, 22 Feb 2010 18:58:20 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=293</guid>
		<description><![CDATA[Testing a web site against multiple browsers has been a long-time challenge of Web development.&#160; In this interview, Jason Huggins, CTO of Sauce Labs, explains how the Selenium project makes multi-browser testing much faster. The creation of the Selenium web-testing tool The Sauce Labs startup around Selenium to run testing on cloud systems Automated web [...]]]></description>
			<content:encoded><![CDATA[<p>
    Testing a web site against multiple browsers has been a long-time challenge of<br />
    Web development.&nbsp; In this interview, Jason Huggins, CTO of Sauce Labs,<br />
    explains how the Selenium project makes multi-browser testing much faster.</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2010/02/22/interview-with-jason-huggins-cto-of-sauce-labs-selenium/#selenium">The creation of the Selenium web-testing tool</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/22/interview-with-jason-huggins-cto-of-sauce-labs-selenium/#sauce">The Sauce Labs startup around Selenium to run testing on cloud systems</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/22/interview-with-jason-huggins-cto-of-sauce-labs-selenium/#robot">Automated web testing with Selenium</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/22/interview-with-jason-huggins-cto-of-sauce-labs-selenium/#beyond">Taking Selenium beyond testing into software documentation and marketing</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/22/interview-with-jason-huggins-cto-of-sauce-labs-selenium/#revenue">The revenue and business models of Sauce Labs</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/22/interview-with-jason-huggins-cto-of-sauce-labs-selenium/#expand">The maturation of Selenium beyond predecessors and into the emerging cloud ecosystem</a></li>
</ul>
<p><span id="more-293"></span></p>
<p><b>Scott Swigart:</b> To start us off, could you introduce yourself and your company?</p>
<p><a name="selenium"></a></p>
<p><b>Jason Huggins:</b> I&#8217;m the founder of Sauce Labs. I think my unofficial title is CTO, but the official one on my business card is &#8220;Executive Software Chef.&#8221; I&#8217;m a fan of silly titles, so that&#8217;s my silly title. I&#8217;m also the creator of the Selenium web-testing tool, and Sauce Labs is built around the things that we wanted to do, as a startup, with Selenium.</p>
<p>A lot of interesting things happened in mid-2004: Django and Rails came out. And a little bit later in 2005, Jesse James Garrett coined the word AJAX; he looked at what Google was doing with Gmail and Maps, and how they were using JavaScript to create a tremendously dynamic website. </p>
<p>In what seems like overnight, JavaScript as a technology to use in web-app development went from being taboo to becoming something that everyone had to do. Right around that time, I was also coming to the conclusion that JavaScript was cool and useful, independently of the AJAX hype wave.</p>
<p>I was the tech lead of a team at ThoughtWorks that was writing an in-house time-expense system. Unlike most corporate entities, we couldn&#8217;t dictate by fiat that everyone should use the same browser. We had a bunch of alpha geeks there, and people used whatever they wanted to.</p>
<p>The classic thing you were supposed to do in web applications at the time was to have everything done on the server and present a kind of static HTML page to the client. This JavaScript stuff made the site more lively and more fun, though, so we decided to use that. The problem was that, even as JavaScript was getting better and Google was showing ways to help it work across different browsers, there were still some rough edges there.</p>
<p>As the project continued, the same series of events would keep happening. We would have a feature implemented in JavaScript that would be working in IE but broken in Mozilla. We would file a bug, and our developer would fix it in Mozilla, but forget to test it in IE. So the next week, the bug was still there, but now it would be broken in IE, instead.</p>
<p>That was frustrating, and I decided that we needed to put a stop to it. We decided to come up with a way to write one test and run it across all of the browsers we cared about. That was the genesis of the Selenium tool. </p>
<p><b>Scott:</b> How did the tool progress from that point to become an open source project?</p>
<p><b>Jason:</b> I thought I was going to be famous with this time-expenses thing we were building. It turned out, though, that as I was showing people what we had done to test these two browsers, they said they wanted to use it on their own projects.</p>
<p>Those developers wanted access to the code, so we made a formal request to open-source it. One thing led to another, and it kind of took off. We never actually open-sourced the time-expense system, but Selenium definitely took off.</p>
<p><b>Scott:</b> What was the trajectory from that point? Did the project grow primarily by word of mouth?</p>
<p><b>Jason:</b> It grew by word of mouth, one project at a time. On one of the last projects I worked on at ThoughtWorks, the developers were frustrated that running tests with Selenium was taking too long&#8211;about 30 minutes to run all tests end-to-end. In a classic agile project, you want to run everything often&#8211;ideally, at least every time you check a file in, with everything done in 10 minutes or less. If the whole process of build and test takes too long, you&#8217;ll skip running those tests.</p>
<p>Every project has a different conception of how long is too long, but usually five to 10 minutes is kind of the ideal. On this particular project, total test time was hovering at about 25 minutes. Mutiny was afoot, and I had the idea of getting a couple of extra machines. Using four machines, I wrote some code to split the tests across the four machines, running concurrently, and we got the testing time back to a good level.</p>
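<p>The test-splitting idea described here can be sketched roughly as follows. This is a minimal illustration in Python, not the code that was written at ThoughtWorks: the function names, the (name, callable) test format, and the use of local threads as a stand-in for separate machines are all assumptions made for the example.</p>

```python
from concurrent.futures import ThreadPoolExecutor


def split_round_robin(tests, machine_count):
    """Deal the tests out to the machines one at a time, like cards."""
    return [tests[i::machine_count] for i in range(machine_count)]


def run_shard(shard):
    """Stand-in for one machine running its share of the tests.

    Hypothetical format: each test is a (name, callable) pair, and the
    callable returns a truthy value when the test passes.
    """
    return {name: bool(check()) for name, check in shard}


def run_in_parallel(tests, machine_count=4):
    """Run every shard concurrently and merge the per-shard results."""
    shards = split_round_robin(tests, machine_count)
    results = {}
    with ThreadPoolExecutor(max_workers=machine_count) as pool:
        for shard_result in pool.map(run_shard, shards):
            results.update(shard_result)
    return results


if __name__ == "__main__":
    # Ten toy tests; the ones whose number is a multiple of five fail.
    suite = [("test_%d" % n, (lambda n=n: n % 5 != 0)) for n in range(10)]
    outcome = run_in_parallel(suite, machine_count=4)
    print(sorted(name for name, passed in outcome.items() if not passed))
```

<p>Dealing tests out round-robin keeps the shards close to the same size, so with four workers the wall-clock time approaches a quarter of the serial time when the tests take similar amounts of time.</p>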
<p>A couple of months later, I heard through the grapevine that Google was also using Selenium, and that same contact asked me if I had an updated resume. Of course, that&#8217;s the offer you can&#8217;t refuse, to be courted by Google, so I applied. It turns out they were building a Selenium farm, and I arrived on the scene right when they were doing that.</p>
<p>It turns out that the Writely team, which is now Google Docs, had the same kind of an epiphany, if you will. They were also using Selenium at the time, and it was taking an hour to run their tests. They ran the testing on four machines, and the time went down to 15 minutes.</p>
<p>They started telling other groups at Google, and it became a virtuous cycle where more teams started using this farm, which enabled them to get approval to buy more machines. Having more machines let them divide all of their tests across those machines and make everything go faster. Selenium eventually started being used for Google Apps and Gmail, Maps, Calendar, and lots of apps that I can&#8217;t talk about.</p>
<p><a name="sauce"></a></p>
<p><b>Scott:</b> How did Sauce Labs grow out of that experience?</p>
<p><b>Jason:</b> I was at Google for about a year and a half, but I was looking to get back to Chicago. Meanwhile, I met John Dunham, who is now Sauce Labs CEO, through Elisabeth Hendrickson, an expert in agile testing, and a mutual friend of ours. John wanted to get back into the testing space, and he sought me out just to have lunch with a guy interested in the same field.</p>
<p>I revealed to him I might be looking to do a startup, too. I was the idea guy, but I didn&#8217;t have all the business sense to be able to actually run a company. The initial “cocktail napkin” business plan was to prove the viability of throwing hardware at the software testing problem.</p>
<p>We had proved it on a small scale with my little project at ThoughtWorks and also on a big scale at Google. The next question was to determine whether there was an actual market for it, and we started Sauce Labs to answer that question. </p>
<p>It didn&#8217;t hurt that cloud computing was the most hyped story of 2009. The type of testing we&#8217;re doing at Sauce is the perfect use of cloud technology. We use disposable machines to run a whole bunch of stuff in parallel and run web browsers on all those cloud machines. Customers then drive those machines through our API to execute their tests.</p>
<p>It&#8217;s a classic problem where a long, slow process needs to go faster, and we just need a lot of disposable machines to run it against. Sauce Labs wouldn&#8217;t be really feasible without cloud technology.</p>
<p><a name="robot"></a></p>
<p><b>Scott:</b> Do I understand correctly that Selenium provides a plug-in that you use with your browser of choice that lets you record what you do? My understanding is that it emits a script that can then drive the other browsers through those same actions. </p>
<p>It can actually video record what&#8217;s on the screen as it&#8217;s driving them through those tasks, and it&#8217;s got logic to verify that the right thing shows up on the screen. Do I have that right?</p>
<p><b>Jason:</b> There are several tools in the Selenium toolbox, but the tool you’re thinking of is a Firefox extension called “Selenium IDE”. The video recording is not included in Selenium out of the box&#8211;that’s something Sauce Labs has added to our online-based service to extend the capability and usefulness of running Selenium tests in the cloud.</p>
<p>Taking a step back, though, Selenium specifically is a functional testing tool (also called “acceptance testing”) that can verify features by interacting with a website exactly as a user would. Think of Selenium as a programmable software robot that knows how to use a web browser, and does so in all browsers, not just in IE or Firefox. We use IE, Firefox, Safari, Opera, and Google Chrome, on Windows, Linux, and Mac OS. We have APIs for Java, C#, Python, Ruby, Perl, and PHP.</p>
<p>The core minimum Selenium API is a collection of simple commands: “open”, “type”, “click”, and maybe “verify”. Open a web page, type text into a field, click a couple of buttons, and then ask the page if some text or another element is present or whether some state has changed. That&#8217;s the minimum essence of what Selenium does.</p>
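<p>The open, type, click, verify sequence can be illustrated with a toy example. The sketch below deliberately does not use the Selenium API: FakeBrowser, SITE, and login_test are hypothetical names, and the "browser" is just an in-memory stand-in so that the example runs anywhere. It only shows the shape of a test written once and run against each browser in turn.</p>

```python
class FakeBrowser:
    """Toy in-memory stand-in for a browser session (not Selenium itself)."""

    def __init__(self, pages):
        # pages maps a URL to a dict with "fields", "buttons" and "text".
        self.pages = pages
        self.current = None
        self.typed = {}

    def open(self, url):
        self.current = self.pages[url]
        self.typed = {}

    def type(self, field, value):
        if field not in self.current["fields"]:
            raise KeyError("no such field: " + field)
        self.typed[field] = value

    def click(self, button):
        # A button is modeled as a function from the typed values to the next URL.
        next_url = self.current["buttons"][button](self.typed)
        self.open(next_url)

    def verify_text_present(self, text):
        return text in self.current["text"]


# A two-page site: the login form leads to the home page once a name is typed.
SITE = {
    "/login": {
        "fields": {"username"},
        "buttons": {
            "submit": lambda typed: "/home" if typed.get("username") else "/login"
        },
        "text": "Please sign in",
    },
    "/home": {"fields": set(), "buttons": {}, "text": "Welcome back"},
}


def login_test(browser):
    """One test, written once: open, type, click, then verify."""
    browser.open("/login")
    browser.type("username", "alice")
    browser.click("submit")
    return browser.verify_text_present("Welcome")


if __name__ == "__main__":
    for name in ("IE", "Firefox", "Safari", "Opera", "Google Chrome"):
        print(name, "passed" if login_test(FakeBrowser(SITE)) else "failed")
```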
<p><b>Scott:</b> My sense, too, is that it&#8217;s not just simulating the browser. It looks like one of the reasons it seems suited to the cloud is that it&#8217;s basically starting up a VM with specific OS and browser builds, driving it through certain things, collecting screen shots, video, and whatever, and then tearing it down.</p>
<p><b>Jason:</b> That&#8217;s actually what makes us unique as a cloud vendor. Everyone else is putting servers in the clouds, but we&#8217;re actually booting up the desktop environments and are running what you would otherwise call &#8220;client applications&#8221; in the cloud on those servers.</p>
<p>One of the more cryptic tag lines we had was, &#8220;Web browsers as a service.&#8221; That probably doesn&#8217;t play in Peoria, but it kind of gives you a sense of what we&#8217;re doing. We&#8217;re launching these desktops that you can connect to remotely and see that they are starting up IE or Firefox. It really is starting up the browser and you get to see everything a user would see.</p>
<p>Since you mentioned video, I will add that it&#8217;s been really important to me to add that functionality, to see directly what happens when a browser running in a cloud crashes, for example.</p>
<p>It was important to me in creating Selenium, since I am very visually oriented, to see what the browser was doing. I want to see whether it looks good or ugly, and I don&#8217;t really believe the application&#8217;s working until it actually runs in IE and Firefox. And by default, when you run things in the cloud, you can’t see what’s going on, so we put in the effort to add back that visual feedback when customers use our service. Video recording of every test has been a pretty popular feature with our users. They don&#8217;t just have some log files to look at. They can actually view the video, share it with their coworkers, and so on.</p>
<p><b>Scott:</b> I would imagine that as the web becomes more interactive, testing gets a different or expanded focus, as well. For example, with a traditional shopping cart app, you need to make sure data is handled the right way, and that at the end, the &#8220;your purchase has been completed&#8221; string shows up on the final page.</p>
<p>On the other hand, to test the functionality of dragging a map or something like that, there&#8217;s not a string to test for. In some of these cases, the only way you&#8217;re actually going to know if it really worked is to look and see whether the map has been dragged to the right position or whatever.</p>
<p>That need for video suggests that you are always stuck with a certain number of tests that require some kind of human subjectivity, even if you can automate it to some degree.</p>
<p><a name="beyond"></a></p>
<p><b>Jason:</b> That&#8217;s true, although one of the &#8220;maybe someday,&#8221; crazy ideas I have is to add in some kind of &#8220;Mechanical Turk&#8221; functionality. For instance, once the video is created, perhaps you can automate answering the question of whether the functionality worked, even if it takes a human to answer the follow-up question of whether it looked ugly.</p>
<p>In other words, even if there are still questions that only a human could answer, it would be easier for them to do that by watching a movie and answering a couple of multiple-choice questions, instead of requiring them to actually perform the entire workflow manually themselves.</p>
<p>I read a <a href="http://gapingvoid.com/2007/06/17/but-what-if-i-fail">blog post</a> recently by Hugh MacLeod, where the author suggested that, if you&#8217;re doing a start-up for any kind of company, there needs to be some kind of social object: some conversational aspect of what you do or produce.</p>
<p>For us, that&#8217;s this video that we can pass around, which can include assertions about whether it is an example of a passing test or failing test. If the test failed, it becomes a social thing that you can pass around to the other developers to say, &#8220;Look at the one-minute mark; you can see where the bug appears and the test fails.&#8221;</p>
<p>Ideally, you have a test for every feature in a website. That might be sending a message in Gmail, doing a product search on eBay, or buying something on Amazon. </p>
<p><b>Scott:</b> It seems like that could have implications beyond app testing. What kinds of things have you considered elsewhere in the life cycle?</p>
<p><b>Jason:</b> I would like to see some of the lines erased between documentation, marketing, and functional testing. My best example is an iPhone commercial, or even an Android one these days. All they do is zoom in on the phone and show you opening an app, clicking buttons, and seeing how cool it is.</p>
<p>They just show one feature, and how well it&#8217;s working, which is actually an ideal use case for Selenium and Sauce Labs. People write these tests so they show off one little feature, then there&#8217;s a video for it. Ideally, if you are a software developer, and you&#8217;re coding up a project, you add more features over time to the thing you are building.</p>
<p>You could have this continually growing little check-list of the features that you have written, and you can drill down and look at the recorded video of that feature working on the latest version of your code.</p>
<p>You can see the history of that feature working, but specifically you can actually see it working over the life of the project. People are doing screencasts now to help sell their software, putting up little YouTube videos on their site.</p>
<p>The old approach was to let people try before they buy, downloading a trial before they bought it. With the unfocused attention of most people these days, now they want to watch the video before they even try it.</p>
<p>I&#8217;d like to make it so these videos get recorded all the time with all of your little features. You can use them either as-is or as your prototype for making little feature screencasts you can put up on your website.</p>
<p><b>Scott:</b> It also potentially solves the problem of keeping the docs in sync, especially with software as a service solutions that might roll out little changes daily, weekly, or every couple of weeks.</p>
<p>Keeping the videos in sync could become fairly automatic, since if it&#8217;s been tested it&#8217;s also documented, and the only stuff that&#8217;s undocumented is the stuff that&#8217;s untested.</p>
<p><b>Jason:</b> Actually, I&#8217;d like to see testing move away from being so focused on the tests themselves, and toward looking at the tests as more like screenplays, with the real focus being on the videos they produce. Those would be the actual artifacts that you care about.</p>
<p>As recently as five years ago, project testing was usually kind of a late process activity, done by one or two people in the corner. You called on testers and they used really expensive commercial tools, in isolation from everybody else.</p>
<p>Testing with Selenium follows the growing adoption of agile methodologies where testing is something that developers do all the time. By analogy, I see a lot of those marketing screencasts being done very late in the process, and not by the developers. Screencasting today is where testing was five years ago.</p>
<p>I believe screencasting could benefit greatly by being pulled into the development process using the video capture functionality from Sauce Labs or similar tools. I think that the key glue there is automation, as well as encouraging testers and developers to think of themselves as screenwriters showing off their applications, rather than being so focused on acquiring the pass or fail data point.</p>
<p><a name="revenue"></a></p>
<p><b>Scott:</b> There&#8217;s Selenium, which is an open source project, and then as you said, wrapped around it is Sauce Labs. Talk a bit about what Sauce Labs does. What do you add beyond the open source project? What&#8217;s your revenue model?</p>
<p><b>Jason:</b> We started with the single focus of Selenium and cloud computing. As a request comes in, we either route that request to an existing running machine, or we boot more machines from a cloud vendor. There&#8217;s a lot of code needed that&#8217;s not in Selenium to manage this cloud environment and scale the infrastructure.</p>
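<p>The routing decision described here, reuse a running machine if one is free and otherwise boot another, can be sketched as a small pool. This is an illustrative guess at the shape of such code, not the Sauce Labs implementation: MachinePool and its methods are hypothetical names, and boot_machine stands in for whatever call a cloud vendor would actually provide.</p>

```python
import itertools


class MachinePool:
    """Route each request to an idle machine, booting a new one only when none is free."""

    def __init__(self, boot_machine):
        # boot_machine is a hypothetical callable that would wrap a cloud vendor's API.
        self.boot_machine = boot_machine
        self.idle = []
        self.busy = set()

    def acquire(self):
        machine = self.idle.pop() if self.idle else self.boot_machine()
        self.busy.add(machine)
        return machine

    def release(self, machine):
        # A real service would also clean the machine between customers here.
        self.busy.remove(machine)
        self.idle.append(machine)


if __name__ == "__main__":
    ids = itertools.count(1)
    pool = MachinePool(boot_machine=lambda: "vm-%d" % next(ids))
    first = pool.acquire()   # nothing idle yet, so a machine is booted
    second = pool.acquire()  # still nothing idle, so another is booted
    pool.release(first)
    third = pool.acquire()   # the released machine is reused
    print(first, second, third)
```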
<p>That&#8217;s the bulk of what we do, but we also collect log files, record video of each test, and make all that data accessible through our website. We also add support for this notion of multi-tenancy, in the sense that we have to make it work with lots of customers, so we have to set up and tear down cleanly between customer uses. That&#8217;s more code we have to write and maintain.</p>
<p>Along the lines of those security considerations, that’s how I found my other co-founder, Steve Hazel. He created a site called codepad.org, and you actually can paste in code from pretty much any language, and it&#8217;ll run it on his servers in EC2.</p>
<p>In addition to the actual security code itself, and managing how to properly sandbox running code in the cloud, Steve also provides that mindset of being security conscious and vigilant about how people can use and abuse your site.</p>
<p>Along those lines, we&#8217;ve expanded our mindset beyond just providing cloud execution to become the Selenium support company more generally. Because Selenium&#8217;s already been out there for five and a half years, a lot of people who have a pretty big installed base may not want to use our cloud service, but they do want a phone number or an email address where they can get support, bug fixes, updates for the latest browsers, and things like that.</p>
<p>We&#8217;ve expanded the scope of the company to a classic open source model of being the obvious company that you go to for help when you use this particular tool. Blended with that, we also buy cloud resources at wholesale rates by the hour and sell them by the minute.</p>
<p><a name="expand"></a></p>
<p><b>Scott:</b> I think you mentioned that the cloud provider behind this is Amazon EC2. It would seem that there are going to be a lot more choices as time goes on. I could envision you using multiple cloud providers simultaneously, transparently to the user, because it&#8217;s at a lower abstraction level where the VM is actually running, versus the data that the user&#8217;s getting back.</p>
<p><b>Jason:</b> Right. There&#8217;s also an argument to be made that, because we&#8217;re telling browsers to go to some website, ideally those browsers should be as close as possible to the active servers that we&#8217;re testing.</p>
<p>It&#8217;s a simple matter of the speed of light. If the server we’re hitting is in Japan, and our test lab is in Virginia, latency is going to be a problem. Therefore, we&#8217;d like to be wherever our customers are, in the sense that if they&#8217;re inclined to host their application on EC2, our browsers will be there, too, so the network obviously is really fast.</p>
<p>There are lots of good reasons to be cloud-vendor neutral.</p>
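<p>The speed-of-light point can be checked with back-of-the-envelope arithmetic. The figures below are assumptions for illustration, not numbers from the interview: roughly 11,000 km between Japan and Virginia, light traveling through optical fiber at about 200,000 km per second, and a test that makes 100 sequential round trips.</p>

```python
FIBER_KM_PER_SECOND = 200000  # approximate: light in glass travels at about two-thirds of c


def round_trip_ms(distance_km):
    """Lower bound on round-trip time over fiber, ignoring routing and queuing delays."""
    return 2 * distance_km / FIBER_KM_PER_SECOND * 1000


if __name__ == "__main__":
    far = round_trip_ms(11000)  # assumed Japan-to-Virginia distance
    print(round(far), "ms per round trip, at best")
    # Assumed workload: a test that makes 100 round trips one after another.
    print(round(far * 100 / 1000, 1), "s of pure waiting per test")
```

<p>Under those assumptions the physical floor is on the order of a tenth of a second per round trip, before any real-world network overhead is added, which is why placing the browsers near the servers under test matters.</p>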
<p><b>Scott:</b> Obviously, if you&#8217;re in the same geography as somebody, that helps, but if you&#8217;re in the same data center, that&#8217;s even better.</p>
<p><b>Jason:</b> If nothing else, your bandwidth is free when it&#8217;s inside the data center.</p>
<p><b>Scott:</b> On another topic, in getting ready to chat with you, we went out and did a little bit of homework and came across Adobe BrowserLab, which seems to provide some services that are similar to Selenium. Can you talk a little bit about some of things that are unique about Selenium?</p>
<p><b>Jason:</b> I&#8217;m glad you asked that, because it&#8217;s where I think the original synapses fired and I was like, &#8220;Hey, maybe there&#8217;s something here with Selenium as a service.&#8221; The word &#8220;cloud&#8221; didn&#8217;t exist a couple of years ago, but the phrase &#8220;as a service&#8221; was in fairly broad usage.</p>
<p>What was out there was very similar to that Adobe site. There was something called BrowserCam, which sounds shady, but it&#8217;s not.</p>
<p>[laughter]</p>
<p><b>Jason:</b> Basically, <a href="http://browsercam.com">BrowserCam</a>’s a screen-shotting service, and there are a bunch of others, like <a href="http://browsershots.org">BrowserShots</a> and <a href="http://litmusapp.com">Litmus</a>.</p>
<p>The limitation there was that all of them were what I might characterize as just single-page snapshot services. They specifically market themselves to designers who just want to see whether a page that they&#8217;re developing renders right in all these different browsers. They don&#8217;t really target developers who need deeper functionality than that.</p>
<p>Also, at the same time, five years ago, virtual machines were not as commonly used, so if you only had a Windows machine and you wanted to see what a site looked like on Safari on Mac, you would submit your URL and get a screen shot back.</p>
<p>The most obvious limitation is that you can only give it one URL, and that&#8217;s it. They can&#8217;t test a multi-step workflow behind a login. So if you had, let&#8217;s say, a shopping cart, what you might really be interested in is to go from the front page, add something to the shopping cart, click &#8220;submit&#8221; and get all the way to the end of the purchasing process, then take a screen shot. </p>
<p>None of those screen-shot services could do that, unless you add some kind of automation mechanism to be able to navigate through the site to get to that point and then take a screen shot.</p>
<p>The genesis of what I originally wanted to do in the cloud was to take a classic screen-shot service and combine it with Selenium. Of course, I kind of took it to 11 and said, &#8220;Hey, why not, instead of just screen shots, record a movie?&#8221; And then you&#8217;ve got, theoretically, 30 screen shots per second that you can play with.</p>
<p><b>Scott:</b> Is the size of the movies prohibitive at all? Does it take screen shots at set points, such as when it knows that a response is finished rendering, or is it really full-motion video of the browser?</p>
<p><b>Jason:</b> Right now, it&#8217;s full-motion, in the sense that we&#8217;re running VNC as the protocol, remote desktop. VNC and RDP are the two main industry-standard protocols. VNC is kind of the open-source lowest common denominator, and RDP, which is the default one for Windows, is the next step up.</p>
<p>We just have a VNC server running on all of our test machines, and instead of connecting as a client application, we have a software library that I added some improvements to. The open-source version of that library is called <a href="http://github.com/hugs/castro">Castro</a>. Instead of a user watching the remote screen, you dump the VNC traffic to what ultimately becomes a Flash video file.</p>
<p>We have done some tweaking here and there, but it is still inefficient if you do 30 frames per second. At five or even three frames per second, it actually works out well enough, though.</p>
<p>We&#8217;ve made some other improvements as well, although there are still more to be made. Specifically, I would love to bookmark the moments in the video where the interesting things happen, synchronizing it with the command you&#8217;re giving to the browser.</p>
<p>To your point, I could see where, instead of watching a video to get to those little bookmarked items, you could end up just getting something that&#8217;s a little bit more static. Using the raw video, for instance, you could go to each point of interest, pull out the video frame, and produce a sort of photo book or slideshow of just the relevant frames.</p>
<p><b>Scott:</b> It would be similar in some ways to apps that detect when a sufficient number of pixels change, and so they know that some kind of transition happened.</p>
<p><b>Jason:</b> Right. Perhaps it could be streamlined down to a slide show of only the key parts of the work flow, making reviewing the results go faster. Since we have recorded all of the video, we have all of the data, and we can then edit it in any number of ways.</p>
<p>Of course, people still come to us just expecting screen shots, which we will do for them. I think the next step is to synchronize the API traffic, the clicking, the typing, and so on, to make it easier to watch that flow of pictures, rather than having to explicitly take a screen shot of it the whole time.</p>
<p><b>Scott:</b> Well, I think we&#8217;ve covered most of the stuff that I had planned to talk about. Any closing thoughts on your end?</p>
<p><b>Jason:</b> The only thing that comes to mind is that we didn&#8217;t really get into the differences between Selenium user groups. There are developers on the one hand, using programming languages like Java, C#, Python, Perl, and Ruby. On the other hand, there are other Selenium users who don&#8217;t really do a lot of programming, but they use Selenium IDE specifically. </p>
<p>Sauce originally started out just to target those people who are using Selenium Remote Control. As we become more of a general Selenium company, we&#8217;re also embracing those users who are a little bit newer to automated testing, who are using Selenium IDE. We are branching out to do more things for those people as well. For example, we have a version of Selenium IDE, called Sauce IDE, that integrates with our Sauce OnDemand cloud service. So within a few seconds of creating a test, you can start using our cloud service from inside the IDE.</p>
<p><b>Scott:</b> Thanks for taking the time to chat.</p>
<p><b>Jason:</b> Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/02/22/interview-with-jason-huggins-cto-of-sauce-labs-selenium/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Interview with Eliot Horowitz &#8211; CTO of 10gen / MongoDB</title>
		<link>http://howsoftwareisbuilt.com/2010/02/13/interview-with-eliot-horowitz-cto-of-10gen-mongodb/</link>
		<comments>http://howsoftwareisbuilt.com/2010/02/13/interview-with-eliot-horowitz-cto-of-10gen-mongodb/#comments</comments>
		<pubDate>Sat, 13 Feb 2010 01:10:52 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=289</guid>
		<description><![CDATA[In this interview, we talk with Eliot Horowitz, founder of 10gen, which is the company that created MongoDB.&#160; What&#39;s MongoDB? Well, it&#39;s an open source, document oriented, schema free database designed for massive scale and performance. If that got your attention, then I invite you to read on&#8230; The evolution of various data models in [...]]]></description>
			<content:encoded><![CDATA[<p>
    In this interview, we talk with Eliot Horowitz, founder of 10gen, which is the<br />
    company that created MongoDB.&nbsp; What&#39;s MongoDB? Well, it&#39;s an open-source,<br />
    document-oriented, schema-free database designed for massive scale and<br />
    performance. If that got your attention, then I invite you to read on&#8230;
</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2010/02/13/interview-with-eliot-horowitz-cto-of-10gen-mongodb/#evolution">The evolution of various data models in databases</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/13/interview-with-eliot-horowitz-cto-of-10gen-mongodb/#scaling">Advantages of object-oriented databases in terms of horizontal scaling</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/13/interview-with-eliot-horowitz-cto-of-10gen-mongodb/#version">Handling issues when columns are added or deleted between application versions</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/13/interview-with-eliot-horowitz-cto-of-10gen-mongodb/#cloud">Suitability of MongoDB for the cloud</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/13/interview-with-eliot-horowitz-cto-of-10gen-mongodb/#OS">OS support and usage with MongoDB</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/13/interview-with-eliot-horowitz-cto-of-10gen-mongodb/#company">The company (10gen) wrapped around MongoDB</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/13/interview-with-eliot-horowitz-cto-of-10gen-mongodb/#community">Community involvement in the development of MongoDB</a></li>
</ul>
<p><span id="more-289"></span></p>
<p><b>Scott Swigart:</b> To start, can you introduce yourself, 10gen, and the role of open source behind it?</p>
<p><b>Eliot Horowitz:</b> Sure. I&#8217;m the CTO and co-founder of 10gen, which was really created to build MongoDB, an open-source, document-based database. We started 10gen and started working on MongoDB mostly out of our own frustrations after building scalable web infrastructure for a dozen years. </p>
<p>It really just came to a point where we were constantly fighting the same battles and building one-off solutions to handle functional gaps in relational databases. We started building MongoDB about two years ago, and we launched it earlier this year. People are getting very excited about it. </p>
<p>It&#8217;s very much targeted toward the standard web developer&#8217;s needs, anywhere from a simple one-guy website, all the way up to a Facebook-level, highly scalable, highly distributed system.</p>
<p><a name="evolution"></a></p>
<p><b>Scott:</b> Building a storage engine is no trivial task, and beyond that, for a long time the bread and butter way of doing it was with relational databases. There was Oracle on one end of the spectrum and MySQL on the other, with a lot of other stuff in between, but people were used to throwing select statements at a database, joining tables together. </p>
<p>Then companies like Google came along, and over time, they&#8217;ve started to store data differently. It&#8217;s sharded; it&#8217;s a schema-less store. You see things like SimpleDB up on Amazon, and I think Microsoft&#8217;s doing something similar and offering a data store. Talk a little bit about where MongoDB fits in the spectrum of databases.</p>
<p><b>Eliot:</b> I think the first question you have to ask about any database these days is, &#8220;What&#8217;s the data model?&#8221; First, there&#8217;s the relational data model, which includes Oracle, MySQL, and all those. Then you have key-value, which has been around for a while and includes things like memcached and Dynamo from Amazon, where it&#8217;s really just &#8220;get&#8221; and &#8220;put&#8221; type operations. </p>
<p>Obviously, those have been used for a long time, and they&#8217;re great for many cases. When you need simple gets and puts, they&#8217;re highly scalable, highly distributable, and very fast. They also have a very simple data model, although on the farthest possible extreme from SQL.</p>
<p>And then you&#8217;ve got a bunch of other data models in the middle: graph databases, tabular databases like BigTable, and document databases like Mongo.</p>
<p>There are a few different ways that you can think about document databases. One of the nice things about document databases is that they&#8217;re closely mapped to how most developers are writing code, whereas SQL databases were designed for accounting and banking 30 or 40 years ago, prior to the advent of web applications and the rise of object-oriented programming.</p>
<p>Now people are writing in PHP, Ruby, or Python, and they&#8217;re working with data objects (like user profiles) instead of rich classes. Trying to map that to relational data models is complex and slow. You can see the evidence of that in the number of different object-relational mapping libraries (ORMs) out there.</p>
<p>There are probably 1,000 different ORMs in the world, so clearly, it&#8217;s an unsolved problem, and every week you hear about a new one coming out. That whole class of problems exists because there&#8217;s a very clunky mapping from objects to relational databases. With document databases, that mapping becomes much simpler. </p>
<p>In many cases, it enables a direct mapping. You can take your object and save it directly to the database. And when you do need to do a mapping, the whole class of problems is massively simplified.</p>
<p>Another aspect of the same issue is that, when you loosen some of the relational constraints, you don&#8217;t have joins, because the data model is more suited to rich objects. For example, if you want to store tags for a blog post, you can store them as an embedded array rather than having a separate table that you have to join against. Once you get rid of joins, scaling horizontally becomes much simpler, which is a great benefit.</p>
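<p><i>Editor&#8217;s sketch, not part of the interview: the blog-post tags example can be illustrated with plain Python dictionaries standing in for rows and documents. The table names, field names, and helper functions below are hypothetical, and no database driver is involved.</i></p>
<pre><code>
# Relational shape: posts and tags live in separate "tables",
# linked by post_id, so reading a post's tags requires a join.
posts_table = [{"id": 1, "title": "Hello"}]
tags_table = [
    {"post_id": 1, "tag": "mongodb"},
    {"post_id": 1, "tag": "nosql"},
]


def tags_via_join(post_id):
    """Emulate the join by scanning the tags table for matching rows."""
    return [row["tag"] for row in tags_table if row["post_id"] == post_id]


# Document shape: the tags are an embedded array inside the post itself.
post_document = {"_id": 1, "title": "Hello", "tags": ["mongodb", "nosql"]}


def tags_via_document(doc):
    """No join: the tags travel with the document."""
    return doc["tags"]
</code></pre>
<p><i>Both functions return the same list; the difference is that the document version reads one record, which is what makes it easier to keep on a single machine.</i></p>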
<p>A bunch of the new databases coming out right now are trying to completely change the database paradigms, in terms of issues like how replication works or how you query the database. One of the things we&#8217;re trying to do with MongoDB is to keep it close to what people are used to. So, when moving from a relational database to MongoDB, there aren&#8217;t too many surprises. </p>
<p>You still have collections that are basically equivalent to tables, you still have different databases, and you still create indexes. You can still do an &#8216;explain&#8217; to see what kind of query plan you&#8217;re going to get, and you can do ad hoc queries.</p>
<p>That means that if you&#8217;re coming from MySQL, it&#8217;s a pretty easy transition to MongoDB. The data model changes, but a lot of the things you&#8217;re used to are still going to be the same, mostly because we don&#8217;t think there&#8217;s any need to change them.</p>
<p><b>Scott:</b> There was a guy a number of years ago who said that object-relational mapping is the Vietnam of our industry. It&#8217;s the thing that people just keep trying over and over to do, and it always ends up being excessively complicated. Back in my coding days, I spent as much time writing the mapping as I would have spent just writing the code to hydrate and dehydrate the objects.</p>
<p>Still, there have been object-oriented databases in the past. Talk to me about some of the things that make MongoDB unique or interesting compared to preceding object-oriented databases.</p>
<p><b>Eliot:</b> The first object databases came out in the early &#8217;90s, and there were a bunch of problems. For one thing, a lot of code was being written in C++ then, where they sped up the types, and the mapping layer really didn&#8217;t work all that well. </p>
<p>You could take a C++ object and save it, but not all the people were doing that. And then you had other languages where you didn&#8217;t really have objects the same way you do now. Remember that it was still very early in the trajectory of object-oriented programming at that point, so in some ways it was ahead of the curve.</p>
<p>The applications people were writing then were often more suited to relational databases, I think. Now, web scalability is a lot more important, because things can get big pretty quickly. I think people want much richer information from their databases, and they&#8217;re trying to use them in very different ways than they used to.</p>
<p>If you&#8217;re building a banking system or an e-commerce system, relational databases work pretty well. It&#8217;s really when you get into Web 2.0 sites that relational databases really start having troubles. If you look at Twitter, Facebook, or those kinds of websites, the needs have changed.</p>
<p>Really, though, I think the biggest thing is that the programming languages have changed a lot. No one&#8217;s writing web applications in C. Most web applications are now being written in PHP, Ruby, Python, and some Java, all of which map very nicely to a document database. You can interact with any one of the languages and it all works pretty well.</p>
<p>I also think, going back to what we mentioned before, that the object databases before were actually more closely related to current graph databases than to document databases. The document database is really just taking MySQL, and instead of having a row, you have a document. So I think it&#8217;s a much simpler transition and it&#8217;s actually much closer to MySQL than a lot of people might think.</p>
<p><a name="scaling"></a></p>
<p><b>Scott:</b> I&#8217;m calling it object-oriented, and you&#8217;re calling it document-oriented. Can you unpack that for me a little bit?</p>
<p><b>Eliot:</b> In our case, a document is a BSON document, which you can think of as similar to a JSON document. You can have numbers, strings, dates, embedded arrays, and embedded objects. </p>
<p>If you have a user profile, instead of having first name, last name, then street, city, zip, you can have an address sub-object; you can have an address, and that can have fields. Then you can have multiple addresses, so instead of having address1 and address2 fields, you can have an array of addresses, and it&#8217;s already much simpler.</p>
<p>In your code, you can map that very cleanly. You have an address object that maps to this object, and you&#8217;ve got an array of them. You could have none, or you can have as many as you want. The database doesn&#8217;t really have a clunky interface, and you&#8217;re not joining, so there&#8217;s no huge performance hit. It&#8217;s all stored in the same place on disk.</p>
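<p><i>Editor&#8217;s sketch, not part of the interview: one way the user-profile mapping described above might look in Python. The class names, field names, and the sample document are assumptions made for illustration; a real application would receive the document from a MongoDB driver rather than a literal.</i></p>
<pre><code>
from dataclasses import asdict, dataclass, field


@dataclass
class Address:
    street: str
    city: str
    zip: str


@dataclass
class UserProfile:
    first_name: str
    last_name: str
    # Zero or more addresses, instead of address1/address2 columns.
    addresses: list = field(default_factory=list)


def profile_from_document(doc):
    """Build a UserProfile from a JSON-like document (a plain dict)."""
    addresses = [Address(**a) for a in doc.get("addresses", [])]
    return UserProfile(doc["first_name"], doc["last_name"], addresses)


# Hypothetical document with an embedded array of address sub-objects.
document = {
    "first_name": "Ada",
    "last_name": "Example",
    "addresses": [
        {"street": "1 Main St", "city": "Hartford", "zip": "06101"},
        {"street": "2 Side Ave", "city": "Boston", "zip": "02101"},
    ],
}

profile = profile_from_document(document)
# asdict() turns the object graph back into the same nested dict shape.
round_tripped = asdict(profile)
</code></pre>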
<p><b>Scott:</b> Is a document basically a language-agnostic object, for lack of a better term? In other words, it&#8217;s not tied to PHP or Ruby or any particular programming language, but it basically has object concepts?</p>
<p><b>Eliot:</b> Right; it&#8217;s basically like JSON, which is used pretty heavily in transporting data across the web now. We use BSON, which is basically like JSON except that it&#8217;s binary and has more types. JSON doesn&#8217;t have a native date type, and there are more types you need for web infrastructure. </p>
<p>Rather than hacking JSON, we just created a binary version that&#8217;s both easier for computers to read and extensible. It has more types, so you can have as many types as you need for real web applications.</p>
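<p><i>Editor&#8217;s sketch, not part of the interview: the missing date type is easy to see with Python&#8217;s standard <code>json</code> module. The <code>"$date"</code> wrapper used as a workaround here is only an illustrative convention chosen for this sketch; it is not presented as the BSON wire format, which is binary.</i></p>
<pre><code>
import json
from datetime import datetime, timezone

created = datetime(2010, 2, 13, 1, 10, 52, tzinfo=timezone.utc)


def json_can_encode(value):
    """Return True if the standard json module can serialize the value."""
    try:
        json.dumps(value)
        return True
    except TypeError:
        return False


def encode_with_dates(obj):
    """Workaround: wrap datetimes in a tagged object that both sides agree on."""
    def default(value):
        if isinstance(value, datetime):
            return {"$date": value.isoformat()}
        raise TypeError("cannot serialize " + type(value).__name__)
    return json.dumps(obj, default=default)


def decode_with_dates(text):
    """Reverse the workaround by unwrapping tagged date objects."""
    def hook(d):
        if set(d) == {"$date"}:
            return datetime.fromisoformat(d["$date"])
        return d
    return json.loads(text, object_hook=hook)
</code></pre>
<p><i>Plain JSON refuses the datetime, and the workaround only works if every reader and writer knows the convention. A format with a native date type avoids that agreement step.</i></p>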
<p><b>Scott:</b> Talk a little bit about scaling. You mentioned that because you don&#8217;t have joins and those sorts of things, and the data is all stored in one place, it scales horizontally very well. What does that actually look like, and how does it put the data back together, find which nodes different pieces of data are stored on, and resolve those kind of issues?</p>
<p><b>Eliot:</b> The simple case is user profiles. Let&#8217;s say you have a billion user profiles and you needed 10 machines to store them all. Basically what you would do in this case is to say, &#8220;OK, I want to shard my users collection,&#8221; and you could shard it on anything you want. You could shard it on email address or first name, or you could shard it geographically. You could shard it by country and then by state, depending on what kinds of queries you do.</p>
<p>Let&#8217;s say you sharded by using email addresses. Looking for a particular user is very fast. Now let&#8217;s say I want to find all people who live in Connecticut. It basically does a merge sort across all of the different shards. </p>
<p>It will go to each one and say, &#8220;OK, give me your users from Connecticut and get all of them,&#8221; and then it will aggregate them for you on the fly. From a driver/client perspective, it&#8217;s just the same if you&#8217;re sharded or if you&#8217;re not sharded. This is what people have been doing manually for years.</p>
<p>I&#8217;ve built many systems in the past where I needed 11 different databases to store something, and I did the sharding myself, which is a lot of manual work. What we&#8217;re doing is basically just generalizing what developers have been doing themselves in one-off solutions for years. It&#8217;s much simpler, because you don&#8217;t have to do joins. </p>
<p>A distributed join is very, very hard, but when your user profile is one object, it makes the problem much simpler.</p>
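<p><i>Editor&#8217;s sketch, not part of the interview: a toy, in-memory version of the two query paths described above. Choosing a shard by hashing the email address is an assumption made to keep the example short and is not a claim about how MongoDB partitions data; all function names and the sample users are hypothetical.</i></p>
<pre><code>
import hashlib
import heapq

NUM_SHARDS = 4
shards = [[] for _ in range(NUM_SHARDS)]  # each list stands in for one machine


def shard_for(email):
    """Pick a shard from the shard key (here, a stable hash of the email)."""
    digest = hashlib.md5(email.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def insert_user(user):
    shards[shard_for(user["email"])].append(user)


def find_by_email(email):
    """Targeted query: the shard key is known, so only one shard is asked."""
    for user in shards[shard_for(email)]:
        if user["email"] == email:
            return user
    return None


def find_by_state(state):
    """Scatter-gather: every shard filters and sorts locally, then the
    sorted partial results are merged into one sorted stream."""
    per_shard = [
        sorted((u for u in shard if u["state"] == state), key=lambda u: u["email"])
        for shard in shards
    ]
    return list(heapq.merge(*per_shard, key=lambda u: u["email"]))


for sample in [
    {"email": "carol@example.com", "state": "NY"},
    {"email": "bob@example.com", "state": "CT"},
    {"email": "alice@example.com", "state": "CT"},
]:
    insert_user(sample)
</code></pre>
<p><i>The caller uses <code>find_by_email</code> and <code>find_by_state</code> the same way whether there is one shard or many, which mirrors the point that the client does not need to know about the sharding.</i></p>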
<p><b>Scott:</b> It&#8217;s always interesting for me to ask where the edges are. What are some scenarios where a relational database may still be a better choice? Are there other cases where, maybe, a BigTable-style store is a better choice?</p>
<p><b>Eliot:</b> I think BigTable solves the same general problems, although we have some advantages over it. I think the document model is a little bit better than the tabular model, and there are a lot of subtleties there.</p>
<p>The big reason you use a relational database is if you need multi-object transactions. We don&#8217;t support multi-object transactions because of the performance overhead and complexity of doing distributed joins.</p>
<p>If you&#8217;re writing a banking system and you need to debit my account, credit your account, and do it completely atomically, we don&#8217;t support that. However, you <i>can</i> act atomically on any single object.</p>
<p>For example, you could say, &#8220;Increment this field on this object by five.&#8221; You can have it so lots of threads can do it, lots of different machines can do it, and it&#8217;s all atomic and safe.</p>
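<p><i>Editor&#8217;s sketch, not part of the interview: what &#8220;atomic on a single object&#8221; means, simulated in memory with a lock per document. MongoDB exposes this kind of update through its <code>$inc</code> operator; the <code>Document</code> class below is a hypothetical stand-in and does not use a driver or reflect the server&#8217;s implementation.</i></p>
<pre><code>
import threading


class Document:
    """A single record whose updates are applied one at a time."""

    def __init__(self, fields):
        self._fields = dict(fields)
        self._lock = threading.Lock()

    def inc(self, name, amount):
        # Read-modify-write happens under the lock, so concurrent
        # increments cannot overwrite each other.
        with self._lock:
            self._fields[name] = self._fields.get(name, 0) + amount

    def get(self, name):
        with self._lock:
            return self._fields.get(name)


counter = Document({"_id": "page-1", "views": 0})


def worker():
    for _ in range(1000):
        counter.inc("views", 5)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
</code></pre>
<p><i>Eight threads each add 5 a thousand times, and no increment is lost. What the sketch does not offer, just as described above, is a way to change two different documents as one atomic unit.</i></p>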
<p><a name="version"></a></p>
<p><b>Scott:</b> Another issue that&#8217;s interesting is dealing with versioning. For example, in the course of versioning an application, you may change your objects, add properties to them, or various other things. With a relational database, it&#8217;s a relatively painful process up front, where when you change the schema, you&#8217;re changing the schema for the database. </p>
<p>That can mean that you have to populate a whole bunch of new columns with default values or do other things to migrate the database to the new version. With Google App Engine, it doesn&#8217;t work that way. You&#8217;re never really rolling the whole database forward to a new version, but you have to do more defensive programming.</p>
<p>For example, let&#8217;s say you pull a user out of the database and that user hasn&#8217;t been touched for a version or two of your app. You&#8217;re pulling out the data the way it was stored the last time that user was saved, and so you may have to do some stuff in your code to map that older user data to your newer object. How do you guys handle that?</p>
<p><b>Eliot:</b> You could migrate the entire database, but you don&#8217;t have to, which is the big thing. The simplest case is where you want to add a field. Here, the nice thing is that you can just start using the field. If an old object doesn&#8217;t have it, it will just return null, and it&#8217;s pretty easy and clean. </p>
<p>If you want to really change the format, you could actually have a migration script that migrates all your data, but that would require you to bring your site offline for an hour or so.</p>
<p>That&#8217;s another big change over the last ten years: people are no longer willing to have maintenance windows. I remember working on systems where it was totally acceptable to have a two-hour maintenance window on a Saturday morning.</p>
<p>If Twitter went down for a two-hour maintenance window once a month, I don&#8217;t think people would be too happy, and I think databases have to handle that. In any case, databases typically require you to do a little bit more work on the client side to make sure that you know what you&#8217;re getting out and you know what you&#8217;re putting back in.</p>
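<p><i>Editor&#8217;s sketch, not part of the interview: the &#8220;just start using the field&#8221; approach and the extra client-side care it implies, shown with plain dictionaries. The <code>nickname</code> field, its default rule, and the function names are hypothetical.</i></p>
<pre><code>
# Saved by version 1 of the application, before "nickname" existed.
old_user = {"_id": 1, "name": "Ada Example"}
# Saved by version 2, which writes the new field.
new_user = {"_id": 2, "name": "Bob Example", "nickname": "bob"}


def nickname_of(user):
    """A field missing from an old document simply reads as None."""
    return user.get("nickname")


def upgrade_on_read(user):
    """Defensive, lazy migration: fill in a default when an old document
    is loaded, without touching the rest of the collection."""
    upgraded = dict(user)
    upgraded.setdefault("nickname", upgraded["name"].split()[0].lower())
    return upgraded
</code></pre>
<p><i>Documents are upgraded one at a time as they are read, so no maintenance window is needed; the cost is that the code has to tolerate both shapes for as long as old documents exist.</i></p>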
<p><a name="cloud"></a></p>
<p><b>Scott:</b> On another topic, the cloud is a very important subject these days, and it seems like MongoDB is very well suited for something like Amazon EC2 and infrastructure on demand, where you&#8217;re able to install your own software and those kinds of things. </p>
<p>Obviously, the things that are more &#8220;platform as a service,&#8221; whether it&#8217;s Azure or Google App Engine, have their own store, and to some degree you get what you get, which is what they provide. Talk a little bit about where you&#8217;re seeing usage of MongoDB in the data center versus in the cloud and some of the interesting uses you&#8217;re seeing for it.</p>
<p><b>Eliot:</b> It does work well in the cloud, because it&#8217;s horizontally scalable, although it&#8217;s certainly not dependent on that usage. We see a lot of people running it in their own data centers, on their own hardware. </p>
<p>If you look at the startups that are using Mongo, a huge percentage of them are running on EC2, and I think that has less to do with MongoDB than with the fact that EC2 is a popular place to host sites.</p>
<p>We&#8217;re perfectly happy to run anywhere, but if you look at where people are building websites right now, it&#8217;s very heavily EC2 and other cloud-based solutions. You do not see very many startups buying and racking hardware to go on the Internet, which I did six years ago.</p>
<p>Horizontal scalability is a great thing for the cloud, because if you do grow quickly, you can add capacity very quickly. If you&#8217;re not horizontally scalable, the cloud doesn&#8217;t really help you that much; it&#8217;s actually worse, because you can&#8217;t buy or rent a $2 million Sun box on EC2.</p>
<p><b>Scott:</b> If you start out with a small instance and you want to upgrade, and it&#8217;s your own hardware and your own data center, you can put more memory and CPU in it. With a cloud provider, you tend to get the type of instance you said you wanted, unless you want to shut it down and migrate the whole thing over, which causes that downtime.</p>
<p><a name="OS"></a></p>
<p>Is this basically a Linux technology, or do you have significant Windows users?</p>
<p><b>Eliot:</b> We have more Windows users than I would ever have guessed. If you asked me nine months ago if we were going to have anyone running in production with Windows, I would have said, &#8220;Maybe one or two.&#8221; </p>
<p>It&#8217;s a low percentage, but there are definitely many people running on Windows in production. We fully support Windows, OS X, Linux, and Solaris.</p>
<p><b>Scott:</b> That&#8217;s interesting for the Windows developers. Is it sort of like PHP on Windows, or are people actually using it as a back end for .NET apps?</p>
<p><b>Eliot:</b> PHP and .NET; not so much anything else. There&#8217;s a C# driver, and someone is working on a pure .NET driver. My knowledge of that right now is a little weak, but there are a bunch of different .NET things going on. I think there is a LINQ adapter, also.</p>
<p><a name="company"></a></p>
<p><b>Scott:</b> We&#8217;ve talked a lot about the database and the project, so can you talk a little bit about 10gen and the company wrapped around it? What do you do around MongoDB, for instance?</p>
<p><b>Eliot:</b> All the core MongoDB developers and the main developers of all the official drivers are 10gen employees. We basically sponsor the projects, and we are doing most of the development work. We also offer commercial support and training. </p>
<p>There are other interesting revenue opportunities for us, but those are the main things right now.</p>
<p><b>Scott:</b> Talk a little bit about where it&#8217;s going. What are some of the top feature requests, or what do you see as some of the opportunities and interesting directions?</p>
<p><b>Eliot:</b> If you look at our road map for this year, there&#8217;s no one big feature. I think the only big thing we&#8217;re doing right now is getting the auto-sharding to be fully bulletproof in production. Traditional master-slave replication and all that kind of stuff are pretty much bulletproof at this point. </p>
<p>The rest of it is just a lot of features. We are sort of similar to MySQL or PostgreSQL in terms of how you could use us, and people want all the features that they&#8217;re used to in MySQL and PostgreSQL. These include things like full-text search, SNMP, and all the assorted add-ons providing special indexing.</p>
<p><b>Scott:</b> That must be an interesting thing to balance, because for example, MySQL&#8217;s original claim to fame was that it was simple, it was fast, and it wasn&#8217;t trying to be your enterprise relational database. </p>
<p>Then it just sort of kept growing and growing, because people kept asking for more things, until people looked at it and said, &#8220;What happened to my simple, lightweight, fast database?&#8221; And then you saw people going off and starting to work on Drizzle and those sorts of things.</p>
<p>How do you balance that request for one more thing versus wanting not to lose sight of what you set out to build and the scenarios that you really wanted to nail?</p>
<p><b>Eliot:</b> I think it comes down to a few key things. One is making sure that we don&#8217;t regress on performance, which is one thing we&#8217;re very careful about. We constantly monitor our performance and make sure we&#8217;re not doing things that hurt it. </p>
<p>If there were a feature that would hurt our performance, we would think long and hard about implementing it, and we are definitely more interested in making the basics work than we are in adding more features. </p>
<p>We&#8217;re also careful about which features we let in, emphasizing common web infrastructure needs. It&#8217;s not useful for us to be simple and fast if you also need another database because you can&#8217;t do x, y, and z.</p>
<p>It&#8217;s very much the 80/20 rule. We want to cover the most general, wide-reaching things that people need, but it really comes down to good software engineering and good product management to make sure that we don&#8217;t screw up the basics. You don&#8217;t break the simple stuff by adding extra features. Another important aspect is good documentation.</p>
<p><b>Scott:</b> I don&#8217;t want to throw too many software companies under the bus, but there is often a willingness to regress on performance, because they figure that hardware has gotten faster, so the user won&#8217;t be affected, even if, in terms of CPU cycles, it&#8217;s taking longer to do stuff. </p>
<p>How hard and fast is your policy of comparing the old version and the new one on the same box to make sure it didn&#8217;t get any slower?</p>
<p><b>Eliot:</b> The next version coming out in a couple of months will be significantly faster than the previous version on the same hardware. We&#8217;re constantly working on that, so I&#8217;d like to think that that&#8217;s a pretty hard rule.</p>
<p><b>Scott:</b> As time goes on, ultimately you want people to be using it, and you want it to meet people&#8217;s needs. Can you talk a bit about cases where, even if a change might hurt performance but really helps you nail a usage scenario, implementing it might be the right choice?</p>
<p><b>Eliot:</b> I think a lot of that really comes back down to software engineering. For example, you can have such a feature but make sure that people can turn it off if they&#8217;re not using it, and that it won&#8217;t hurt their performance if they do turn it off.</p>
<p>We&#8217;re really trying to be very good on that level.</p>
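<p>The same-box comparison Scott asks about can be sketched as a small timing check. This is only an illustration, not 10gen&#8217;s actual process: the two <code>lookup_*</code> functions are hypothetical stand-ins for the same operation in an old and a new release, and the 5% tolerance is an assumed threshold.</p>

```python
import timeit


def median_seconds(func, repeats=5, number=1000):
    """Median wall-clock time for `number` calls of func, over `repeats` runs."""
    runs = sorted(timeit.repeat(func, repeat=repeats, number=number))
    return runs[len(runs) // 2]


def is_regression(old_seconds, new_seconds, tolerance=0.05):
    """True when the new timing exceeds the old one by more than `tolerance`."""
    return new_seconds > old_seconds * (1.0 + tolerance)


# Hypothetical stand-ins for one operation in two releases (assumption).
def lookup_old():
    return sorted(range(200))[10]


def lookup_new():
    return list(range(200))[10]


if __name__ == "__main__":
    old = median_seconds(lookup_old)
    new = median_seconds(lookup_new)
    # Both timings come from the same machine, so hardware speed cancels out.
    print("regression" if is_regression(old, new) else "ok")
```

<p>Taking the median of several runs is one simple way to damp noise from other processes on the box; a real benchmark suite would likely need more care than this.</p>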
<p><a name="community"></a></p>
<p><b>Scott:</b> You mentioned that the people who are writing MongoDB mostly work for 10gen. Talk a little bit about the community story that lives around that.</p>
<p><b>Eliot:</b> We&#8217;re seeing very few contributions to the core server code, mostly because it&#8217;s big, it&#8217;s flawed, and it&#8217;s complicated. There is a huge community, and it is unbelievably helpful in terms of the drivers. We&#8217;re getting lots and lots of great patches and contributors to the official drivers, so if you look at the PHP, Perl, Ruby, and Python drivers, there are tons of contributions.</p>
<p>The community has also been building a lot of great tools that sit on top of the drivers. One of the things we&#8217;re trying to do is keep the drivers pretty low level, very simple, and as fast as possible. If you want to hook into a login service, or if you want to hook up Apache to MongoDB for logging, we&#8217;ve seen a lot of great projects like that.</p>
<p><b>Scott:</b> That&#8217;s interesting, in the sense that it&#8217;s a relatively common thread across a lot of successful open source projects that the number of people who contribute to the core might be really small, but the number of people who write add-ons, for example, might be very large.</p>
<p>It sounds like the ancillary components, for lack of a better term, related to MongoDB largely consist of those drivers, as well as administration and reporting tools that layer on top and add a lot of value.</p>
<p><b>Eliot:</b> Right, and for example, there&#8217;s also a pretty standard adapter for Apache to log to MySQL. People are going to naturally want that to go to MongoDB, and there are going to be a lot of other different things that people will want. </p>
<p>That&#8217;s where the community has been very helpful, because we can&#8217;t be experts on a thousand different technologies, so it really makes sense for the experts in those fields to build them. It really works better for everyone.</p>
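<p>As a rough picture of what an Apache-to-MongoDB logging adapter has to do, the sketch below turns one access-log line into a document-shaped dictionary. It assumes the Apache Common Log Format; the field names in the result are made up for illustration, and the final step of handing the dictionary to a MongoDB driver is left out.</p>

```python
import re
from datetime import datetime

# Apache Common Log Format: host ident user [time] "request" status size
CLF = re.compile(r'(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)')


def to_document(line):
    """Turn one access-log line into a dict, or None if it does not match."""
    m = CLF.match(line)
    if m is None:
        return None
    host, _ident, user, when, request, status, size = m.groups()
    return {
        # Field names are illustrative assumptions, not a fixed schema.
        "host": host,
        "user": None if user == "-" else user,
        "time": datetime.strptime(when, "%d/%b/%Y:%H:%M:%S %z"),
        "request": request,
        "status": int(status),
        "bytes": 0 if size == "-" else int(size),
    }


if __name__ == "__main__":
    sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
              '"GET /apache_pb.gif HTTP/1.0" 200 2326')
    print(to_document(sample))
```

<p>Because each line becomes a self-describing document, fields can be added later without a schema migration, which is part of why a document store is a natural target for logs.</p>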
<p><b>Scott:</b> Well, we&#8217;ve gotten to the end of our time, and I think we&#8217;ve covered everything I wanted to. Thanks for taking the time to talk to me today.</p>
<p><b>Eliot:</b> Thank you. I&#8217;ve enjoyed it.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/02/13/interview-with-eliot-horowitz-cto-of-10gen-mongodb/feed/</wfw:commentRss>
		<slash:comments>35</slash:comments>
		</item>
		<item>
		<title>Interview with Shaun Walker &#8211; DotNetNuke Co-Founder</title>
		<link>http://howsoftwareisbuilt.com/2010/02/08/interview-with-shaun-walker-dotnetnuke-co-founder/</link>
		<comments>http://howsoftwareisbuilt.com/2010/02/08/interview-with-shaun-walker-dotnetnuke-co-founder/#comments</comments>
		<pubDate>Mon, 08 Feb 2010 00:54:22 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=284</guid>
		<description><![CDATA[DotNetNuke is an extremely successful open source project that targets the Windows platform and .NET Framework.&#160; In this interview, we talk with Shaun Walker about bringing DotNetNuke forward from its humble beginnings as a .NET sample project, to the premier open source content management system for Windows. The beginnings of DotNetNuke and engaging with Microsoft [...]]]></description>
			<content:encoded><![CDATA[<p>DotNetNuke is an extremely successful open source project that targets the Windows platform and .NET Framework. In this interview, we talk with Shaun Walker about bringing DotNetNuke forward from its humble beginnings as a .NET sample project, to the premier open source content management system for Windows.</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2010/02/08/interview-with-shaun-walker-dotnetnuke-co-founder/#beginnings">The beginnings of DotNetNuke and engaging with Microsoft on open source</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/08/interview-with-shaun-walker-dotnetnuke-co-founder/#paradigm">Open source development paradigms interacting with proprietary software</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/08/interview-with-shaun-walker-dotnetnuke-co-founder/#monetize">Seeking venture capital and monetizing DotNetNuke</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/08/interview-with-shaun-walker-dotnetnuke-co-founder/#response">Response by the community to creating a commercial version</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/08/interview-with-shaun-walker-dotnetnuke-co-founder/#overcome">Overcoming the pressures of a down economy</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/02/08/interview-with-shaun-walker-dotnetnuke-co-founder/#future">Future directions: cloud platforms and improved distribution channels</a></li>
</ul>
<p><span id="more-284"></span></p>
<p><b>Scott Swigart:</b> Shaun, why don&#8217;t you take a second to introduce yourself and DotNetNuke?</p>
<p><a name="beginnings"></a></p>
<p><b>Shaun Walker:</b> I&#8217;m the original creator of DotNetNuke, which I started back in 2001, based on a reference implementation that Microsoft had released for the .NET Framework 1.0 Beta, an application called the IBuySpy Portal.</p>
<p>It was a sample application, released under a fairly liberal license, that allowed developers to come up to speed on the new .NET Framework. The application had some interesting characteristics, in that there was this notion of extensibility, although it wasn&#8217;t a fully baked solution.</p>
<p>I spent about a year on my own toying with that application and enhancing it. Really, what I had in mind at that point was building out a solution for amateur sports teams, so they could manage a website presence for themselves.</p>
<p>After about a year of doing that, I realized I really didn&#8217;t have the resources to convert what I had into a commercial venture. I had been active in the Microsoft developer community, specifically the forums on the www.asp.net site, where there was a small group of developers sharing their enhancements around that application.</p>
<p>It was actually Christmas Eve, 2002 that I decided to release my source code as an open source application. That initial release I called the IBuySpy Workshop, because I wanted it to have some connection to the original IBuySpy Portal.</p>
<p>I just announced it in that discussion forum on the ASP.NET site, and immediately, there were a lot of downloads and excitement around it, so I worked feverishly for about three months as people downloaded it and sent feedback. I think I was releasing every couple weeks at that point.</p>
<p>In March, Microsoft reached out to me because they had recognized that there was a bit of a community building around the application I had created, and they didn&#8217;t have any plans for enhancing the original IBuySpy Portal or building a community around it on their own.</p>
<p>Scott Guthrie reached out to me and invited me to come down to Redmond. We chatted about the vision for ASP.NET 2.0, because that&#8217;s what his team was working on at the time. We also discussed how DotNetNuke could be aligned with that vision and the importance of community, which I found to be quite refreshing, because it was not apparent to me that folks within Microsoft were thinking seriously about the benefits of open source.</p>
<p>We worked out an arrangement where Microsoft actually sponsored me for one year, which allowed me to leave my full-time job and focus my full efforts on building the DotNetNuke product and managing the ecosystem that built up around the project.</p>
<p>After that one year sponsorship, I had to find my own source of revenue to sustain the ongoing development. I relied mostly on advertising and sponsorship in the early years, then took on additional full-time resources to help manage the project. </p>
<p>It was in late 2006 that we actually took the bigger leap and formed DotNetNuke Corporation, which was a more suitable vehicle for governing the ongoing needs of the project.</p>
<p>At that point, I brought aboard as co-founders three of the contributors to the open source project who had been committed and loyal to the project. We worked under that arrangement for about a year before we got serious about looking for venture capital. We were successful in getting a Series A round of venture capital in late 2008 from August Capital and Sierra Ventures.</p>
<p><a name="paradigm"></a></p>
<p><b>Scott:</b> We&#8217;ve talked to people who work on a lot of projects, from smaller ones to the largest ones like the Linux kernel and the Apache Foundation. Even though every open source project obviously evolves a little differently, what you&#8217;ve laid out there is almost a textbook story.</p>
<p>You started out scratching your own itch, like you said, with the IBuySpy sample that Microsoft put together, and it turned out that what you did was interesting to a lot of other people. A community formed around it, and so you decided to really invest in it and made it your day job.</p>
<p>The other people who came on board were basically people from the community who had made a name for themselves working on it. If you look at other open source communities, they&#8217;re also very reputation-based and reward people who actually get things done. Having an idea is great, but the ideas that tend to win out are those where someone has the idea and then rolls up their sleeves and implements it.</p>
<p>It&#8217;s really interesting to hear that basically that same road map was followed here as for the LAMP stack. This is built on top of Windows and the .NET Framework, completely proprietary technologies, but you&#8217;ve built an open source project and an open source business following the same trajectory you&#8217;d see elsewhere.</p>
<p><b>Shaun:</b> I think that the open source paradigm transcends platform. It&#8217;s not necessarily a Windows versus Linux debate. The whole notion of open source is just built on openness, transparency, the willingness to share, and open protocols. All of that can live in either environment very comfortably. </p>
<p>It turned out that the solution that I&#8217;d come up with in those early stages actually related to a very large problem. That&#8217;s one of the other fundamentals of starting an open source project: you have to be scratching an itch that a lot of people are feeling.</p>
<p>The problem has to be large enough that it&#8217;s interesting for the initial person to solve, and it also has to be such that other people are willing to get involved in it and get excited about it. Without the passion, that organic growth really can&#8217;t happen.</p>
<p>DotNetNuke, right from Day One, enjoyed a very passionate developer community, because they were all working toward solving a fairly large problem: both individuals and businesses need website presences.</p>
<p>Some of the early tools that were available for entities to get online were really quite limited in their functionality, and so with the emergence of content management systems and web application frameworks, the sophistication of those online properties has gone up significantly. It&#8217;s really been software applications like DotNetNuke that made that happen.</p>
<p><a name="monetize"></a></p>
<p><b>Sean Campbell:</b> I imagine that a lot of people, especially given the last year or so in the economy, are curious about that journey, going through and getting VC funding. What challenges would you reflect upon and say, &#8220;Well, if I&#8217;d known this before, I would have done something a bit differently during that process&#8221;?</p>
<p><b>Shaun:</b> We spent a considerable amount of time doing fundraising in 2008, and I was surprised to learn the extent to which it&#8217;s not easy to raise money.</p>
<p>The fact that we had a very large and vibrant open source community wasn&#8217;t nearly enough for a lot of investors to show commitment. Really what they were looking for was some commercial aspect of the project as evidence that people were willing to spend money on this solution and not just download and use it for free.</p>
<p>Prior to this year, we didn&#8217;t really have much of a commercial component to the project. We could point to huge download numbers and a huge number of extensions that were available for the platform, but we really didn&#8217;t have any evidence of business traction. That&#8217;s really what the venture capitalists are looking for.</p>
<p>They felt like they were taking a significant risk because it was an unproven model. In the end, we did find a couple of investors who were willing to take that risk and it&#8217;s worked out very well. We had a great year in 2009.</p>
<p>We offer support and other proprietary features around the platform and we&#8217;ve grown from 0 to nearly 400 paying customers in year one. We&#8217;ve hit all of our revenue goals and that wouldn&#8217;t have been possible without the injection of capital at the end of 2008. That allowed us to add more full-time resources to the project which really started to accelerate its growth.</p>
<p><b>Sean:</b> How did you make the decisions around which aspects of the project you thought people would be willing to pay for and how did that guide the development of the project as a whole?</p>
<p><b>Shaun:</b> In the very early stages of the project when we didn&#8217;t have a commercial version, we would implement pretty much any feature that came to the surface from the community as being desirable. Over time, we found that a lot of those features that we were adding really only catered to a small subset of the community, largely the more serious business users.</p>
<p>When it came time for us to introduce the commercial versions of the product, it was actually fairly simple to dissect the needs of the different constituents. We could look at a typical community user as really only running a single website off of a single install, with fairly limited content management needs.</p>
<p>On the other hand, a more serious business user might be using our platform to run multiple websites off of a single install, and they might have way more advanced content management and scalability needs, as well as demands around performance and security. So, if you looked at those different vectors, especially performance, security, and advanced content management, it became fairly obvious where different features should reside across the different product lines.</p>
<p>The other thing that&#8217;s important to note is that, right from Day One, the DotNetNuke application was architected in a way that&#8217;s very modular. So introducing proprietary functionality around the open core was fairly straightforward for us to do because the application had always been designed that way.</p>
<p>We have a very vibrant extensions ecosystem for both what we call skins, which are the look and feel, as well as mini-applications that can be embedded in a DotNetNuke framework, which we call modules. There are more than 6,000 different extensions available today on our marketplace at www.Snowcovered.com. </p>
<p>With that vibrant ecosystem already there and people already used to paying money for extensions, it was a fairly easy transition for people to adjust to the fact that there was going to be a commercial version of the core application as well.</p>
<p><a name="response"></a></p>
<p><b>Sean:</b> Did you have any resistance from users who had been getting the application for free for years, when all of a sudden you started telling them that they would have to pay for certain features? Sometimes it&#8217;s just a vocal minority, but there&#8217;s generally a group like that.</p>
<p><b>Shaun:</b> That aspect of open source in general speaks to the transparency aspect. Basically, anyone who&#8217;s involved in an open source project can get up and stand on their soapbox and vocalize what their concerns are. Unfortunately, their concerns can be quite self-serving, rather than considering the broader community or the broader context.</p>
<p>As we made the transition from being a pure, free, open source product by adding a commercial version, there was certainly a vocal minority in the community that expressed their concerns through our public discussion forums.</p>
<p>As a project owner, you have to stay on top of those concerns rather than ignore them, and speak to them so that people understand the broader context, because often they haven&#8217;t considered a different perspective.</p>
<p>It was surprising, but a lot of people in the community who were vocally opposed to us having a commercial version didn&#8217;t consider the fact that we were actually getting requests from many businesses that wished there was commercial support available for the product.</p>
<p>Some people have the perspective that if you use it for free, then support should be free, and they hadn&#8217;t considered the broader business implications of that.</p>
<p>Once you start explaining those things, people can start to get a bit more comfortable with it. You&#8217;re never going to please everyone all the time, though. There&#8217;s always going to be a certain subset of people that are not going to buy into your arguments, and they&#8217;re going to continue to be vocal.</p>
<p>That becomes a challenge to deal with on an ongoing basis but I guess we&#8217;ve gotten pretty savvy at dealing with those things over the last seven years. It&#8217;s no different than any other open source project.</p>
<p>To be honest, that friction is a good thing because it keeps everybody a bit more accountable. Even if certain people have a very extreme point of view, being forced to listen to that point of view keeps you from straying from what your true goals should be.</p>
<p><a name="overcome"></a></p>
<p><b>Scott:</b> It&#8217;s become almost cliché, I guess, to talk about the economy, but you guys were going for VC funding during a pretty tough economic time. It&#8217;s always interesting to hear stories about people who have thrived in those circumstances.</p>
<p>Talk about some of the implications of seeking funding during this climate.</p>
<p><b>Shaun:</b> We signed the term sheet in September of 2008 with August and Sierra, and a couple weeks later, the bottom fell out of the economy. At that point, we had the term sheet in hand and we were going through due diligence. Due diligence took much longer than expected, because we had some clean up to do in terms of making sure everything was in order from a legal perspective to satisfy the investors&#8217; needs.</p>
<p>But the investors were committed at that point to doing the deal, so we worked through due diligence and were successfully funded and then they gave us a runway, which a lot of companies didn&#8217;t have, to focus on introducing some new products in 2009, adding a support component, and staffing up the business.</p>
<p>Where a lot of businesses were laying people off, we had the ability to actually hire people. We got to hire some really great people, because a lot of very talented people were out of work.</p>
<p>The other thing that worked in our favor, obviously, is that open source products are traditionally less expensive than proprietary offerings. So when we look at some of the competitors that exist in the proprietary space, DotNetNuke is a more economical, more competitive solution in that regard.</p>
<p>I think that we picked up a lot of business this year based on the fact that we are a more economical offering for putting up a website.</p>
<p>The combination of those factors really helped us in this down economy, and as we&#8217;ve come out of this, hopefully 2010 is going to be a much better year for the global economy. We&#8217;re hoping that it helps us gain some additional momentum.</p>
<p><a name="future"></a></p>
<p><b>Scott:</b> What do you see as the future of DotNetNuke?</p>
<p><b>Shaun:</b> In 2010, we have some high-level objectives in terms of things we want to deliver. One area, in particular, is the marketplace where commercial extensions are sold for the platform. Like I said, there are about 6,000 extensions in that marketplace today, but it&#8217;s not really a seamless experience. </p>
<p>A customer has to install DotNetNuke and then they have to browse through our marketplace, make a purchase, and download a product. Then they have to upload it to their site.</p>
<p>As a software marketplace, the iPhone&#8217;s App Store provides a much more seamless experience, where you can view the different extensions that are available for the iPhone and choose to purchase them immediately. They download seamlessly to the device and they&#8217;re installed.</p>
<p>We want to build that type of tight integration between the DotNetNuke platform and the DotNetNuke marketplace in 2010. We believe that will really help consumers and businesses manage their websites more seamlessly, with fewer obstacles.</p>
<p>Another area where we see a tremendous amount of interest is the cloud. It&#8217;s still challenging today if a business wants to put up a website. Although there are over 50 hosting partners today that offer a one-click install of DotNetNuke, if somebody comes directly to our site to download the software, they have to figure out what the next logical step would be to get it up and running. Basically, there are a number of obstacles in the way of a person accomplishing their goal.</p>
<p>This whole notion of cloud takes that complexity out of the equation as it provides a turnkey, instant gratification installation process that enables a customer to get up and running immediately in a suitable environment with all the necessary resources in place.</p>
<p><b>Scott:</b> What&#8217;s your story around the Windows Azure platform? That seems like a logical cloud for you to land on because it is .NET based, and it&#8217;s not the most painful thing to migrate an ASP.NET application over to that.</p>
<p><b>Shaun:</b> We are definitely talking to the Windows Azure team and it&#8217;s one of the cloud platforms that we would like DotNetNuke to run on. We are also talking to the Amazon EC2 team and, in fact, we have DotNetNuke running on EC2 already.</p>
<p>Like you said, Windows Azure is a natural choice because it&#8217;s native Microsoft technology but, at least in the early releases of Azure, it&#8217;s not a no-brainer in terms of getting a .NET application running on Azure. You have to do some re-architecture and, at that point, I don&#8217;t know if you&#8217;re tied too closely to the Windows Azure platform.</p>
<p>Microsoft has recently come out with some announcements that they are going to be offering some Windows virtual-machine-type deployment scenarios, much like those Amazon provides. That might be an interesting cloud option for DotNetNuke, as well.</p>
<p>You&#8217;ve also got to consider that those are the two predominant cloud players but there are a lot of smaller players, as well, like Rackspace and even smaller hosting providers that are offering cloud type offerings that give you a lot more control.</p>
<p>One thing that I&#8217;ve noticed with some of the larger players is that, in order to commoditize the service and allow it to be offered at a low price point, they actually take away a lot of features and a lot of flexibility that you would traditionally get in a typical web-hosting environment. </p>
<p>That&#8217;s where a lot of the smaller players are able to compete, because they can provide cloud-type services but with greater flexibility. So, we&#8217;ve got to look at all of these things when we look at running in a cloud.</p>
<p><b>Scott:</b> It sounds like you are definitely rapidly pursuing the cloud, in a provider-agnostic way, and you&#8217;d like to be an option for as many clouds as reasonably makes sense.</p>
<p><b>Shaun:</b> Right; we are avoiding lock-in, which aligns with our open source ideals.</p>
<p><b>Scott:</b> Thanks for taking the time to talk today.</p>
<p><b>Shaun:</b> My pleasure.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/02/08/interview-with-shaun-walker-dotnetnuke-co-founder/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Interview with Ian Clarke &#8211; Luminary and Freenet Creator</title>
		<link>http://howsoftwareisbuilt.com/2010/01/20/interview-with-ian-clarke-luminary-and-freenet-creator/</link>
		<comments>http://howsoftwareisbuilt.com/2010/01/20/interview-with-ian-clarke-luminary-and-freenet-creator/#comments</comments>
		<pubDate>Wed, 20 Jan 2010 03:41:25 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=279</guid>
		<description><![CDATA[It&#8217;s no exaggeration to say that Ian Clarke takes on some of the hardest problems in computing. Ian invented Freenet, an open-source network that remained uncracked, and is today contemplating Swarm, which has ambitious goals about moving computing closer to distributed data. But Ian isn&#8217;t just a geek&#8217;s geek. In this interview, he throws in [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s no exaggeration to say that Ian Clarke takes on some of the hardest problems in computing. Ian invented Freenet, an open-source network that remained uncracked, and is today contemplating Swarm, which has ambitious goals about moving computing closer to distributed data.  But Ian isn&#8217;t just a geek&#8217;s geek.  In this interview, he throws in some thoughts about the business side of software development too.</p>
<ul>
<li><a href="#origins">The origins of Freenet and anonymous peer-to-peer architecture</a></li>
<li><a href="#bacon">On the mathematics behind &#8220;Six Degrees of Kevin Bacon&#8221;</a></li>
<li><a href="#manage">Observations about optimal strategies for managing an open source community</a></li>
<li><a href="#swarm">The Swarm project&#8217;s approach to distributed data</a></li>
<li><a href="#implement">Implementing fork/join and related concepts in distributed systems</a></li>
<li><a href="#relative">Relative benefits of open source and proprietary approaches to development</a></li>
<li><a href="#incorporate">Incorporating open source into an overall business strategy</a></li>
</ul>
<p><span id="more-279"></span></p>
<p><b>Scott Swigart:</b> To start off our conversation, could you please take a moment to introduce yourself and talk a bit about the different efforts you divide your time between?</p>
<p><a name="origins"></a></p>
<p><b>Ian Clarke:</b> Sure. I have a degree in computer science and artificial intelligence from the University of Edinburgh in Scotland. I started my career as a subcontractor to the European Space Agency.</p>
<p>Prior to graduation, I did some work on emergent systems and anonymity on the Internet. That grew into an open source project for which I act as founder and coordinator called Freenet, which started back in 1999 and has been ongoing ever since. </p>
<p>When I made Freenet publicly available after I graduated in 1999, it ended up getting quite a bit of publicity, and we released our first version in March 2000. In fact, a paper we released in 2000 describing Freenet ended up being the most cited computer science paper of that year. </p>
<p>Freenet was kind of novel, in that it was the first design that allowed an anonymous peer-to-peer architecture. Freenet was also a precursor to the concept of the distributed hash table, which is a technique to efficiently store and retrieve data from distributed computers.</p>
<p>I moved to California and started a company, raising $4 million in venture capital, primarily from Intel. We developed a peer-to-peer content distribution technology, which we sold in 2003. All this time, Freenet was my hobby going on in the background. </p>
<p>I then did some consulting, and as part of that, I designed a peer-to-peer video distribution system for Janus Friis and Niklas Zennstrom, who I&#8217;m sure you know are the creators of the software you&#8217;re currently using.</p>
<p>I handed that off to their team in 2006, and it became a project known as Joost, which you can look at as the next big thing after Skype. I then co-founded a company called Revver, which was one of the early entrants into the short-form video-content sharing space. Revver is unique in that we were advertising against video back in 2005 and sharing the revenue with content creators.</p>
<p><b>Scott:</b> How did that set the stage for the projects you&#8217;re working on now?</p>
<p><b>Ian:</b> One of the problems I was working on as Chief Scientist at Revver was how to target individual videos with specific ads to maximize revenue. I became very interested in the whole field of collaborative filtering, predictive analytics, and recommender systems.</p>
<p>My current day job, which is a project called SenseArray, grew out of that. One area I learned a lot about with Revver concerns the challenges of scaling up a website, scaling up databases, and worrying about what data goes where and what data is cached where.</p>
<p>I believe that a lot of the architectural problems inherent in building a website and having a bunch of people actually start using it should not be the concern of the software engineers. Those issues should be handled by the framework in a mostly transparent way.</p>
<p>Those thought processes gave rise to a project called Swarm, which I spoke publicly about only a couple of months ago for the first time. Right now, it&#8217;s at a very early conceptual prototype stage, because I&#8217;m trying to get people interested in helping me work on it. Swarm is extremely ambitious and encompasses some very difficult problems.</p>
<p><b>Scott:</b> Freenet sounds like the kind of thing that people might have been more interested in 2002 or later, in the sense that it predates the Patriot Act and widespread eavesdropping over global communications. Freenet provides an anonymized way to get out content and communicate with other people.</p>
<p>What do you think made Freenet interesting at the time you launched it, and how did it become more interesting because of changes culturally and geopolitically between 2000 and today?</p>
<p><b>Ian:</b> When we first launched Freenet, it really generated two kinds of interest which were almost orthogonal to each other. One could have easily occurred without the other, but they both happened at the same time. The first kind was academic interest in what Freenet was doing and how it was doing it.</p>
<p>Freenet did a couple of things that had never really been done before. One of the core problems that Freenet tried to solve was to find information quickly from data that is distributed across a large network of computers. Many of those computers are unreliable, and some could be operated by your enemies.</p>
<p>You cannot rely on any form of centralization in order to do that. You probably remember Napster, which was a peer-to-peer network that maintained a central directory recording where everything was. Of course, that central directory can be shut down, and that really was not an option for Freenet. </p>
<p>At the time, Freenet was designed to be used in countries like China, Saudi Arabia, and Iran, where governments are not constrained by constitutions and laws in the same way that they are in this country.</p>
<p>Freenet was intended to operate in a very hostile environment, which creates some difficult problems. How do you find a piece of information, without consulting some centralized entity that keeps track of where it is? I developed a technique for doing this that relates somewhat to small world theory. </p>
<p><a name="bacon"></a></p>
<p><b>Scott:</b> That&#8217;s the same theoretical basis as the Six Degrees of Kevin Bacon game, right? [laughs]</p>
<p><b>Ian:</b> Exactly. The whole notion that you can connect between any two people through about six other people actually has a lot of mathematics behind it. </p>
<p>It turns out that if you just randomly create a network of things, you cannot so easily get from any one point to any other in just a few hops. For that to work, it must be a small world network, and it just so happens that human relationships do form a small world network, whereas that&#8217;s not true of any arbitrary network of things.</p>
<p>One of the novel things in Freenet was exploiting small world theory to solve the problem of searching and retrieving data in a decentralized, efficient, scalable way. I think that&#8217;s part of what generated the academic interest, and some of the stuff we were doing with cryptography was also quite new.</p>
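<p>The idea that a handful of long-range links lets purely local decisions find a short path can be sketched in a few lines of Python. This is a deliberately simplified illustration, not Freenet&#8217;s actual routing algorithm: the ring layout, the power-of-two shortcuts, and the names <code>build_links</code> and <code>greedy_route</code> are all assumptions made for the example.</p>

```python
# Hypothetical sketch: greedy routing on a ring with long-range shortcuts.
# The topology is an assumption for illustration, not Freenet's design.

def clockwise_distance(a, b, size):
    """Number of clockwise steps from position a to position b on the ring."""
    return (b - a) % size


def build_links(size, shortcuts=True):
    """Give every node a next-door link, plus optional links 2, 4, 8, ... away."""
    steps = [2 ** k for k in range(size.bit_length() - 1)] if shortcuts else [1]
    return {node: [(node + s) % size for s in steps] for node in range(size)}


def greedy_route(links, source, target):
    """Each node forwards to whichever neighbour is closest to the target.

    No node consults a central directory; it only looks at its own links.
    """
    size = len(links)
    path = [source]
    current = source
    while current != target:
        current = min(
            links[current],
            key=lambda n: clockwise_distance(n, target, size),
        )
        path.append(current)
    return path


print(greedy_route(build_links(64), 0, 37))                        # [0, 32, 36, 37]
print(len(greedy_route(build_links(64, shortcuts=False), 0, 37)))  # 38 nodes visited
```

<p>With the shortcuts, the route from node 0 to node 37 takes three hops; with only next-door links, it takes 37.</p>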
<p>We certainly did not invent any new cryptographic algorithms, but we were using existing algorithms in some novel ways. That generated interest in the academic communities, computer security communities, cryptography communities, and what later became the peer-to-peer communities, although Freenet slightly predates the term &#8220;peer-to-peer.&#8221;</p>
<p>The other type of interest we were getting was mainstream interest from media like the New York Times and 60 Minutes, which related primarily to the copyright implications of something like Freenet. </p>
<p>Around about that time, of course, Napster was getting a huge amount of publicity, but a lot of people realized that Napster would probably get shut down, and here was Freenet, which could not be shut down. That generated a lot of mainstream media interest that really didn&#8217;t have much to do with the academic interest.</p>
<p>In some ways, they conflicted with each other, especially since some academics dismissed Freenet because it was getting so much mainstream publicity, which is kind of a weird effect.</p>
<p><a name="manage"></a></p>
<p><b>Scott:</b> What did you learn from Freenet in terms of managing an open source community? A lot of people have talked about why communities succeed or fail, but what are some of your lessons learned from that experience?</p>
<p><b>Ian:</b> I have made a lot of mistakes, and I think I will continue to, so I certainly do not really see myself as being an authority. I think what has worked quite well for us has been being very open in terms of allowing people to join the development effort. I know that in some open source projects, it is incredibly difficult to gain access to source control, and it is a big deal when you do.</p>
<p>We take the opposite tack. If somebody emails us and says they have an idea about how to do something, we give them access. If they do something wrong, we can revert it, which is the beauty of source control. We do everything we can to lower the barriers to entry for people to contribute to the project.</p>
<p>Another lesson has been that it&#8217;s very important to explain to people that they shouldn&#8217;t wait for someone else to tell them what to do. I find that a lot of people send messages to our mailing list, saying that they want to be part of the project and asking for something to do. Invariably, nobody will respond to them.</p>
<p>People need to know that nobody is going to laugh at them or shout at them or get mad at them if they just try to improve something. They can just find something that needs doing and then go do it.</p>
<p>One of our most prolific contributors over the years was Oskar Sandberg, who actually ended up doing a Ph.D. on stuff related to Freenet. He initially joined the project by writing a very simple Perl utility. To be honest with you, I can&#8217;t even remember exactly what it did, but it was something extremely simple, and from there he got more and more involved.</p>
<p>It&#8217;s really about lowering the barrier to entry, but also making it clear to people that they have to take stuff. They can&#8217;t wait for it to be given to them.</p>
<p><b>Scott:</b> In other words, it&#8217;s better if someone comes to the project with their own itch that they want to scratch rather than &#8220;Hey, I just want to be part of this; put me to work.&#8221; We hear from a lot of people that the best developers are often the people who are really using it all the time, and so they know, &#8220;Hey, this is cool, but it would be even cooler if it could do X.&#8221;</p>
<p>To return to Swarm for a moment, one of the things you mentioned that got my attention is that this is an incredibly ambitious undertaking. What is it that makes Swarm so ambitious, and how is that likely to impact the management of the community around it?</p>
<p><a name="swarm"></a></p>
<p><b>Ian:</b> To put that more in context, I should first talk a little bit about what Swarm is. The basic premise is to build an environment where computer programs can be written so that they can either run on one computer just like a normal program or be distributed across many computers in a way that is mostly transparent to the programmer. The fundamental concept behind Swarm is to move the computation, not the data.</p>
<p>All computer programs are essentially a collection of code that operates on data, and the data can potentially be distributed across multiple computers. The concept with Swarm is that, if code that is executing on one computer needs to do something to data that is on a remote computer, you should move the computation to where the data is.</p>
<p>The conventional approach would be to move the data to the computer where the code is executing, and the Swarm approach is actually a little bit like the MapReduce concept that you may be familiar with.</p>
<p><b>Scott:</b> In fact, I was just thinking that I should ask you how this approach is similar or different from Hadoop&#8217;s approach.</p>
<p><b>Ian:</b> They both share the maxim &#8220;Move the computation, not the data,&#8221; but Swarm is really a lot more general. In order for a problem to be solvable through MapReduce, it needs to be possible to break it down into a map operation and a reduce operation. </p>
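<p>As a rough sketch of that decomposition, here is a word count written as a map step and a reduce step in plain Python. The function names are hypothetical, and the &#8220;shuffle&#8221; step that a framework such as Hadoop performs between the two phases is simulated with an in-memory dictionary.</p>

```python
# Hypothetical single-process sketch of the map/reduce decomposition.
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between the phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(values):
    """Reduce: combine all values for one key into a single count."""
    return reduce(lambda a, b: a + b, values)


def word_count(documents):
    """Run map over every document, then reduce each group of values."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return {word: reduce_phase(values) for word, values in shuffle(pairs).items()}


print(word_count(["move the computation", "not the data"]))
```

<p>Each call to <code>map_phase</code> touches only one document, which is what lets a real framework run the map step on the machine that already holds that document.</p>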
<p>Some problems are reducible in that way, especially stuff like indexing web pages, but many problems are not. In particular, anything that requires the equivalent of a database Join between two tables simply can&#8217;t be broken down into a MapReduce in any kind of efficient way.</p>
<p>In short, then, MapReduce is the same general concept but only applicable to a specific class of problems, whereas Swarm is a lot more general. Really, you can write any computer program, and with Swarm in particular you can write your code in Scala.</p>
<p>I&#8217;m not sure how familiar you are with Scala, but I consider it to be the successor to Java. It has a lot of benefits of Java but without a lot of its annoyances and with some neat features that Java may never get, like closures, type inference, and that sort of thing.</p>
<p>You write a computer program in Scala, and it can really do anything. There is no requirement that you can break it down into MapReduce operations. It can just be an arbitrary computer program.</p>
<p><b>Scott:</b> Can you share a few particulars about the Swarm approach to handling distributed data?</p>
<p><b>Ian:</b> Let&#8217;s say an object on the computer that is executing happens to have a pointer to an object on a remote computer. At that point, Swarm essentially freezes your computer program, serializes it through a mechanism called continuations, and migrates the computation to the remote computer, where it starts up again.</p>
<p>From the perspective of the programmer, you don&#8217;t even need to know that this is happening or where your code is executing. Swarm can actually move the execution state of your code around between multiple computers in a way that the programmer doesn&#8217;t really have to think about it.</p>
<p>Now obviously, it&#8217;s not inherently very efficient to serialize a computer program&#8217;s state and migrate it across the network, so the second component of Swarm figures out how to arrange the data to minimize the number of times that the program state has to migrate from one computer to another.</p>
<p>Those are essentially the two components of Swarm. Firstly, allow a computer program&#8217;s state to jump around to follow data, so that you can just distribute data across multiple computers and then the computer program just follows it where it needs to go. The second component helps us be smart about where we put the data to maximize efficiency.</p>
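<p>A toy model of those two components might look like the following. It assumes an explicit state dictionary standing in for a continuation (real portable continuations capture the call stack automatically), uses <code>pickle</code> to mimic freezing and resuming, and simulates the nodes inside one process. The name <code>run_task</code> and the data layout are hypothetical, not part of Swarm.</p>

```python
# Hypothetical sketch: a task whose frozen state follows the data between nodes.
import pickle


def run_task(placement, keys, start):
    """Sum the values stored under keys, migrating the task to each key's node.

    placement maps a node name to the {key: value} data stored on that node.
    Returns (total, number_of_migrations).
    """
    owner = {key: node for node, data in placement.items() for key in data}
    state = {"remaining": list(keys), "total": 0}
    current, migrations = start, 0
    while state["remaining"]:
        key = state["remaining"][0]
        if owner[key] != current:
            frozen = pickle.dumps(state)   # freeze the task on the current node
            current = owner[key]           # "send" the bytes to the data's node
            state = pickle.loads(frozen)   # resume where the task left off
            migrations += 1
        state["total"] += placement[current][key]
        state["remaining"].pop(0)
    return state["total"], migrations


# Same data, two placements: the second keeps keys that are used together
# on the same node, which is the job of the second component.
scattered = {"A": {"a": 1, "c": 3}, "B": {"b": 2, "d": 4}}
grouped = {"A": {"a": 1, "b": 2}, "B": {"c": 3, "d": 4}}

print(run_task(scattered, "abcd", "A"))  # (10, 3)
print(run_task(grouped, "abcd", "A"))    # (10, 1)
```

<p>The result is the same in both runs, but the better placement cuts the number of migrations from three to one.</p>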
<p>In terms of what makes this hard, the continuation mechanism I mentioned is something called portable continuations, which are supported by very few programming languages. Even Haskell cannot support portable continuations, and you know, if you can think of an obscure capability for a programming language, chances are Haskell can do it.</p>
<p>[laughter]</p>
<p><b>Ian:</b> It just so happens that in Scala 2.8, which is the upcoming version, somebody implemented portable continuations. In many ways, this is the kind of thing that makes Swarm even possible.</p>
<p>In other areas of difficulty, there are all kinds of concurrency issues when code is acting on data that can be distributed across multiple computers. You have to deal with persistence and consistency, and all of that.</p>
<p>Problems of consistency and concurrent access to data are part of a very active research area at the moment, and there are a lot of hard problems there. How do you know when it&#8217;s safe to delete data so you can efficiently implement garbage collection across many computers? How do you implement stuff like atomic transactions when your data is distributed across multiple computers?</p>
<p>All the time, you&#8217;ve got a separate overseer process that&#8217;s watching how your computer program is jumping between computers and trying to move data in order to optimize that. I&#8217;ve got a lot on my plate, and I cannot follow all of these problems myself. At the moment, I&#8217;m trying to attract people with expertise in areas like handling concurrency and software transactional memory to the Swarm project.</p>
<p><a name="implement"></a></p>
<p><b>Scott:</b> I am thinking about the approach you describe and comparing it to a more traditional environment, where 100 web servers are talking to a handful of middle-tier servers, which are talking to one huge honking database. You scale that thing up as large as you possibly can, and if you can&#8217;t scale it up any larger, you fail. </p>
<p>I&#8217;m also thinking about the Bigtable approach, where data is sharded across lots of nodes. With something like Google App Engine, you get scalability, but you lose some relational database operations like Joins. You may be limited in terms of not being able to do the equivalent of a select that returns a million rows. You&#8217;re limited to a thousand results or something like that. </p>
<p>You talked about Joins and a lot of these concepts, and it sounds like what you are doing with Swarm is envisioning a novel data store. The data is distributed across a lot of nodes, and there&#8217;s an optimizer that&#8217;s always positioning the data so it will be the most efficient for the code to come to it. </p>
<p>At the same time, though, you&#8217;re still supporting what people tend to think of as relational operations, like Joins and those types of things. Am I envisioning it correctly?</p>
<p><b>Ian:</b> I think you&#8217;re absolutely right that you could do things like Join using Swarm, although that would really be at a higher level of abstraction. So, for example, somebody could implement a relational database on top of Swarm, but Swarm is not itself a relational database. Really, Swarm&#8217;s representation of data is as an object graph, much as data is represented in memory.</p>
<p><b>Scott:</b> It seems like the concept of moving the execution to the data becomes more and more important as everything now just generates terabytes of data, from super colliders to oil fields.</p>
<p>It&#8217;s no longer practical to move the data to the computation, and you talked about your code running in one place for a little while and then having its state packaged up and running somewhere else.</p>
<p>Still, a lot of times what you want is for your code to sort of fan out. How does Swarm deal with the notion of making the logic fan out and go parallel when it can across nodes? And does it take that responsibility off of the developer, or is that something they would have to be conscious of?</p>
<p><b>Ian:</b> This is something I&#8217;m thinking about at the moment. For example, one thing you could do is implement something like a fork/join approach to parallelization. I know there is a fork/join framework that will probably be part of the next major version of Java. You could absolutely implement something like that on top of Swarm, where you explicitly say to it, you can parallelize this if you want.</p>
<p>I wouldn&#8217;t say the parallelization is completely transparent to the programmer. It&#8217;s not like the system is examining the code and being very clever about figuring out what can we do in parallel and what we need to do serially.</p>
<p>What you can do is implement something like fork/join so the programmer can explicitly give the system permission to parallelize something. Then Swarm will decide whether it makes sense to run on one thread or to fork it, send different parts of the execution to different machines, and then recombine it all at the end.</p>
<p>In short, the answer to your question is that you can do parallel processing with Swarm, but it probably won&#8217;t be completely transparent to the programmer.</p>
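<p>One way to picture that explicit opt-in is the sketch below, which uses Python&#8217;s standard <code>concurrent.futures</code> thread pool as a stand-in for remote machines. The <code>fork_join</code> helper and its <code>allow_parallel</code> flag are hypothetical; they are not Swarm primitives.</p>

```python
# Hypothetical sketch of programmer-permitted fork/join parallelism.
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(numbers):
    """The unit of work; it behaves the same on a whole list or on one chunk."""
    return sum(n * n for n in numbers)


def fork_join(work, combine, data, chunk_size, allow_parallel=True):
    """Run work over data, forking into chunks only when that is permitted.

    The caller grants permission with allow_parallel; the helper then decides
    whether forking is worthwhile (here: only if there is more than one chunk).
    """
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not allow_parallel or len(chunks) in (0, 1):
        return work(data)                                         # serial path
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(work, chunk) for chunk in chunks]  # fork
        return combine(f.result() for f in futures)               # join


numbers = list(range(1, 101))
print(fork_join(sum_of_squares, sum, numbers, chunk_size=25))  # 338350
```

<p>The calling code reads the same whether or not the work is split, which matches the idea that the programmer grants permission and the system decides how to use it.</p>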
<p><b>Scott:</b> Still, it sounds like it would be implemented from the programming standpoint in more of a declarative way, rather than with the imperative fork/join syntax that you would have in other languages.</p>
<p><b>Ian:</b> I think that is almost certainly true, although I should say that a lot of this really is very much still on the table. You could implement fork/join as a library on top of Swarm. Swarm gives you primitives such as fork a process or communicate between processes, and you have a lot of flexibility about what you build on top of that.</p>
<p><a name="relative"></a></p>
<p><b>Scott:</b> Taking the conversation in a slightly different direction, with Freenet, you manage the community, and with Swarm, you&#8217;re trying to attract people to tackle some nearly intractable problems. Then you also do proprietary software development. </p>
<p>Talk a little bit about the difference in the way you see things working in the proprietary world versus the open source world and where you see advantages or disadvantages to either approach for certain kinds of problems.</p>
<p><b>Ian:</b> There is the obvious benefit in the proprietary world that when you employ people, they more or less have to do what you say, which is certainly not the case in the open source world, where you have to persuade people.</p>
<p>Open source can make it a lot more difficult to put processes in place, because programmers by nature often resent processes unless they strongly believe in them, although even in the commercial world, if you impose something and nobody believes it&#8217;s any good, then it&#8217;s probably not going to work very well either. You can implement structure more easily in the proprietary world though, I think. </p>
<p>The proprietary world also tends to present fairly clear goals. If you&#8217;re going in the right direction, people will buy it and keep paying for it, and if you&#8217;re going in the wrong direction, they won&#8217;t buy it, and you don&#8217;t have a business.</p>
<p>With something like Freenet, it is hard to know what exactly you&#8217;re trying to be. For example, are we trying to service people living in countries like the U.S., people living in China, or both? The goal of proprietary software is to make money, and hopefully you can translate that back to how you need to prioritize different feature-functionality today.</p>
<p><b>Scott:</b> I&#8217;ve never heard anyone call that out as an advantage, although it makes perfect sense. To put it another way, in proprietary software, there&#8217;s very low latency between the moves you make as a software company and the signals the marketplace sends back to you. </p>
<p>On the other hand, a lot of open source projects don&#8217;t even know how much of it is out there. It&#8217;s very nebulous how much &#8220;share&#8221; it has, so there&#8217;s a little more latency between the time a project starts to head in the wrong direction and the point where it&#8217;s really receiving clear information that it is negatively affecting the use of that project.</p>
<p>Flip it around and talk about some of the advantages of open source that aren&#8217;t really present in proprietary software. For example, if I am a Fortune 500 company, I&#8217;m not in the business of building a web server, but I need one. It&#8217;s a lot more beneficial for me to devote a handful of programmers to Apache so we get the things we want, rather than building one from scratch.</p>
<p><b>Ian:</b> That is certainly very true. From the perspective of the industry in general, open source permits people to create and collaborate on aspects of infrastructure that don&#8217;t need competitive advantages, but just need to work.</p>
<p>Because of that factor, you can have Microsoft and other companies, whether they like it or not, wind up cooperating with each other to the benefit of all. If you look at it from a game theory point of view, proprietary software can be more of a zero-sum game that tends to discourage collaboration, whereas open source tends to encourage or even enforce collaboration if it&#8217;s a viral open source license.</p>
<p>I certainly believe that the software industry is much better off with open source software. But on the other side, in many situations in my career, I&#8217;ve been building proprietary software, and people have said, &#8220;Oh, you should open source it. Wouldn&#8217;t that be great?&#8221; </p>
<p>It would be great for people in general, but it would remove the incentive to create the software in the first place. Because generally speaking, open source business models tend to be services-based business models, which do not scale as well as product-based business models. </p>
<p>To return to your original question, the advantages of open source versus proprietary are different, depending on whose perspective you take. You can build a business creating open source software, but I think it&#8217;s more difficult.</p>
<p><a name="incorporate"></a></p>
<p><b>Scott:</b> We talked to a VC, and his comment was &#8220;Sometimes we give open source credit for being more different than it is.&#8221; There are a lot of companies that go with kind of a &#8220;freemium&#8221; model, where they have a free version of the product, and then they try to up-sell you to the enterprise version. It&#8217;s the equivalent of a loss-leader, which isn&#8217;t new in business, or the modern equivalent of the 30-day evaluation.</p>
<p>Do you think it&#8217;s true that we sometimes highlight more differences between open source development and proprietary development than there really are?</p>
<p><b>Ian:</b> It all depends on your strategy. I think there are many different reasons for open source. For example IDEA, the Java IDE, recently released a community version. I certainly hear a lot about companies that build open source community versions and then try to up-sell people to a commercial version.</p>
<p>I have not seen a lot of solid data demonstrating the success or otherwise of that approach. I think there are some situations where a company has a wider strategic interest in creating a platform, such as the Google Android platform or Google Wave.</p>
<p>But, in those cases, a much longer-term strategic interest informs those high-risk projects. I think a lot of Google&#8217;s open source projects do not make money, and will not make money, even in the longer term. But luckily enough, they&#8217;ve got a cash cow in the form of advertising that they can use to subsidize all of these things.</p>
<p><b>Scott:</b> You see a lot of experimenting, but it seems that notable successes are not all that prevalent. It&#8217;s also hard to determine whether success is because they went with an open source/enterprise business model, or whether something else contributed to their success.</p>
<p>Just to be sensitive to the time, can you add any closing thoughts?</p>
<p><b>Ian:</b> I&#8217;m a huge user of open source, and I have several hats on my head. I&#8217;ve got a software engineering hat, where I just want there to be cool stuff, and anything that helps there be cool stuff is all right by me. I think almost all open source falls into that category. I think open source&#8217;s existence benefits the software ecosystem and innovation in general.</p>
<p>Then I&#8217;ve got my entrepreneur&#8217;s hat on, where I&#8217;m thinking in terms of whether being a producer of open source can make me money. The software I write is predictive analytic software, and I typically sell it to CTO-level people. </p>
<p>It&#8217;s quite a technical sale, so one of the challenges is to find customers and get them to pay attention for long enough that they actually understand what the software does and whether it can help them.</p>
<p>I have a blog where I will release small, relatively self-contained snippets of code that do stuff that relate, in some way or another, to what my product does. I know that the type of people who are going to find these snippets of code may well be the type of people who work for a company that could use the software I create.</p>
<p>It&#8217;s not so much a loss-leader as it is a publicity tool, and that is definitely one way in which producing open source is in the business interests of even a small software company.</p>
<p><b>Scott:</b> Thanks for taking the time to chat.</p>
<p><b>Ian:</b> Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/01/20/interview-with-ian-clarke-luminary-and-freenet-creator/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interview with Rich Wolski &#8211; Eucalyptus CTO</title>
		<link>http://howsoftwareisbuilt.com/2010/01/11/interview-with-rich-wolski-eucalyptus-cto/</link>
		<comments>http://howsoftwareisbuilt.com/2010/01/11/interview-with-rich-wolski-eucalyptus-cto/#comments</comments>
		<pubDate>Mon, 11 Jan 2010 23:40:29 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=276</guid>
		<description><![CDATA[Eucalyptus is an open-source project that gives you the Amazon cloud API for non-Amazon clouds.&#160; In this interview, we chat with Rich Wolski, CTO of Eucalyptus Systems, about the project and the company. Eucalyptus as a technology-neutral abstraction layer for cloud computing Enabling utility pricing models for compute resources Toward the ability to mix and [...]]]></description>
			<content:encoded><![CDATA[<p>
    Eucalyptus is an open-source project that gives you the Amazon cloud API for non-Amazon clouds. In this interview, we chat with Rich Wolski, CTO of Eucalyptus Systems, about the project and the company.</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2010/01/11/interview-with-rich-wolski-eucalyptus-cto/#eucalyptus">Eucalyptus as a technology-neutral abstraction layer for cloud computing</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/01/11/interview-with-rich-wolski-eucalyptus-cto/#utility">Enabling utility pricing models for compute resources</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/01/11/interview-with-rich-wolski-eucalyptus-cto/#mix">Toward the ability to mix and match platform services with infrastructure services</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/01/11/interview-with-rich-wolski-eucalyptus-cto/#open">Open source significance and strategy in Eucalyptus</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/01/11/interview-with-rich-wolski-eucalyptus-cto/#community">The evolving role of community in the Eucalyptus project</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/01/11/interview-with-rich-wolski-eucalyptus-cto/#world">Real-world implementations and usage scenarios</a></li>
</ul>
<p><span id="more-276"></span></p>
<p><b>Scott Swigart:</b> Could you start off by introducing yourself and Eucalyptus?</p>
<p><a name="eucalyptus"></a></p>
<p><b>Rich Wolski:</b> Sure. I&#8217;m the Chief Technology Officer, and I&#8217;m the college professor whose research group did the original work on Eucalyptus at the University of California, Santa Barbara.</p>
<p>The project began as a computer science research investigation. We were looking at how to combine large-scale HPC programs in fields like chemistry, physics, weather forecasting, and biology with the emergence of utility computing offered by third parties, which is to say, cloud computing.</p>
<p>We started looking at that question in an empirical way, using a driver application of weather forecasting code. The goal was to combine Amazon&#8217;s AWS, a number of university data centers, and the NSF supercomputer center site in one large execution platform for this legacy weather code. To do that, we actually wound up having to build an interface layer for that code that could run in university data centers.</p>
<p>To make our programming job easier, we made that interface layer look just like Amazon AWS. We were faced with the problem, from the application perspective, of porting the code to Amazon. That interface layer&#8211;that emulation, if you will, for university data centers&#8211;is what we released as open source, which became Eucalyptus. That&#8217;s how we got ourselves into commercialization.</p>
<p>It&#8217;s an open-source platform that&#8217;s actually architected to support a bunch of different APIs, with flexibility in terms of virtualization, operating systems, and hardware. It implements cloud computing with a sufficiently high-fidelity imitation of Amazon that the weather forecasting code we were working with couldn&#8217;t really tell the difference between when it was running at Amazon and when it was running at the universities.</p>
<p><b>Scott:</b> When I think of things like the Amazon API, there are two things that spring to mind. The first is the API that you access, generally remotely. From my machine here I could spin off a hundred instances and provision them with Amazon machine images.</p>
<p>I could shut them down, I could take snapshots of images or take snapshots of the block store. All of the admin stuff that you can do through the management console is what I think of as the Amazon API.</p>
<p>If I understand what you just said, you guys replicated that API so that if I&#8217;ve got code locally that&#8217;s spinning up instances and kicking off jobs on Amazon, it could also kick off jobs through, say, a third-party vCloud provider, as long as they&#8217;ve bolted on Eucalyptus.</p>
<p>It&#8217;s exposing that same API, so I can federate my clouds and do the same logical operations even though, for example, starting an instance on VMware&#8217;s infrastructure is different from doing so on Amazon&#8217;s. Do I understand correctly that you have built a sort of abstraction layer so that I don&#8217;t really see that as a user?</p>
<p><b>Rich:</b> Yeah, although I think it&#8217;s important to understand that that abstraction layer is more than just the API. Really, what Eucalyptus does is implement the cloud abstractions in an essentially technology-neutral way.</p>
<p>There is a set of abstractions that all successful clouds today support, and we&#8217;ve figured out how to implement those abstractions independently of the API that accesses them and independently of the virtualization technology that is used to implement them.</p>
<p>Eucalyptus is in the middle, as a generic cloud infrastructure. It is built so that you can manipulate that cloud infrastructure using a number of different APIs, although the one that has been most successful for us is Amazon. That&#8217;s the one that we ship with open source, but we&#8217;ve done others.</p>
<p>On the bottom part of it, there are lots of different virtualization technologies that can be used to realize those cloud abstractions, both in open source and in proprietary forms. Eucalyptus is built so that it could use any of those virtualization technologies to implement the abstraction.</p>
<p><b>Scott:</b> So, in other words, you are positioning Eucalyptus to be the lingua franca of the clouds, in terms of starting instances, stopping instances, and that sort of thing.</p>
<p><b>Rich:</b> Right. There really are three sets of abstractions that most clouds support. There are VM abstractions for starting, stopping, and manipulating virtual machines. There are abstractions having to do with storage, such as what persists, what does not, how it is accessed, and what the range of access is. Finally, there are abstractions having to do with networking: how your VMs are named, how they can communicate with each other internally and externally, and those types of things.</p>
<p>We have an internal framework that is able to combine different virtualization technologies to implement those abstractions and then export them via different APIs.</p>
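<p>The layering Rich describes can be pictured with a short Python sketch. Everything in it is hypothetical: the class names, the method names, and the toy in-memory backend are invented for illustration and are not taken from the Eucalyptus code base. It covers only the VM abstraction and leaves out storage and networking.</p>
<pre>
```python
from __future__ import annotations

from abc import ABC, abstractmethod


class VirtualizationBackend(ABC):
    """Hypothetical interface for one virtualization technology."""

    @abstractmethod
    def start_vm(self, image: str) -> str:
        """Start a VM from an image and return an instance identifier."""

    @abstractmethod
    def stop_vm(self, instance_id: str) -> None:
        """Stop a running VM."""


class InMemoryBackend(VirtualizationBackend):
    """Toy backend that only records which instances are running."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.running: dict[str, str] = {}
        self._counter = 0

    def start_vm(self, image: str) -> str:
        self._counter += 1
        instance_id = f"{self.prefix}-{self._counter}"
        self.running[instance_id] = image
        return instance_id

    def stop_vm(self, instance_id: str) -> None:
        self.running.pop(instance_id, None)


class AwsStyleApi:
    """Hypothetical front end that exposes AWS-like operation names."""

    def __init__(self, backend: VirtualizationBackend) -> None:
        self.backend = backend

    def run_instances(self, image_id: str, count: int) -> list[str]:
        # The caller never sees which backend actually starts the VMs.
        return [self.backend.start_vm(image_id) for _ in range(count)]

    def terminate_instances(self, instance_ids: list[str]) -> None:
        for instance_id in instance_ids:
            self.backend.stop_vm(instance_id)
```
</pre>
<p>In this sketch, swapping <code>InMemoryBackend</code> for another implementation of <code>VirtualizationBackend</code> leaves the calling code untouched, and a second front end with different operation names could wrap the same backend.</p>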
<p><b>Scott:</b> When an instance is running on Amazon, the instance itself doesn&#8217;t really persist data. So if you shut that instance down, its &#8220;local drive&#8221; basically gets blown away, whereas with other providers, that may not necessarily be the case. Or you may set up a portion of your data center so the instances are persistent.</p>
<p>How do you deal with those underlying differences that surface up through whatever kind of cloud or virtualization architecture somebody is using internally?</p>
<p><b>Rich:</b> That&#8217;s a great question, and the answer is that the Amazon API, and in fact all the other APIs we&#8217;ve looked at, have really well-developed notions of Quality of Service attribution. I&#8217;ll just speak about Amazon concretely, because that&#8217;s the one that seems to be motivating a lot of others.</p>
<p>In Amazon, there is this notion of VM type, and there is also a notion of availability zone. Availability zones in Amazon are there specifically to give you some idea of independent failure probability. Two different VMs working in different availability zones have independent failure probabilities, so you can calculate the likelihood that you&#8217;ll lose both of them simply by calculating the joint probability.</p>
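<p>As a rough illustration of that calculation, the Python sketch below multiplies per-zone failure probabilities. The function name and the example probabilities are hypothetical, and the result is only valid under the independence assumption Rich mentions.</p>
<pre>
```python
import math


def joint_failure_probability(zone_failure_probs):
    """Probability that every zone fails, assuming independent failures."""
    return math.prod(zone_failure_probs)


# Hypothetical per-zone failure probabilities over some fixed period.
p_both = joint_failure_probability([0.01, 0.02])
print(f"{p_both:.6f}")  # prints 0.000200
```
</pre>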
<p>We actually co-opt that a little bit and use the notion of availability zones as a collection of common QoS characteristics. One of those could be &#8220;is it failure-independent?&#8221; but that is not necessarily the only one. </p>
<p>For example, you could have two different availability zones in a Eucalyptus cloud, where the underlying infrastructure is implemented in two completely different ways. And what you want to export to your users is the notion that each specific availability zone has a particular set of QoS attributes.</p>
<p>Therefore, if VM persistence is an attribute that you want to export in one or more of your availability zones, you can do so. Often, there are capacity-allocation issues associated with that. If you&#8217;re going to do VM persistence, you have to have more of a persistent store available, and the Eucalyptus administrator is responsible for doing that capacity matching, so that the size of the availability zone, the things that can go in it, the users that are entitled to request image starts from it, and so on and so forth, are properly controlled.</p>
<p><a name="utility"></a></p>
<p><b>Scott:</b> Does costing show up in the API at all, in terms of VM, storage, and network resources? In other words, is there a way through the API to know what an instance is costing you, or is that something that you need to know in advance?</p>
<p><b>Rich:</b> That&#8217;s actually something we think about a lot. There&#8217;s no problem in putting the cost into the API, but it&#8217;s difficult in many cases to compute exactly what that cost should be automatically. If you&#8217;re charging against a credit card, you can calculate the occupancy cost or the number of bytes or whatever. There&#8217;s some monitoring that&#8217;s necessary to do that, but you can do it.</p>
<p>On the other hand, if you&#8217;re making calculations based, for instance, on the financial costs of running a VM, it&#8217;s far more complex. There are factors based on the cost of the hardware, the operating costs, and a specific life span of that hardware in a specific company. It&#8217;s a very difficult problem, because of all the factors that go into the cost benefit associated with owning that machine.</p>
<p>Of course, you can use average costs. Rather than calculating the operating cost on a particular machine, you can assume a reasonable rate and feed that to the system as a parameter. It would be no problem to export that through the API, although I&#8217;m not sure you get much utility out of calculations based on a static number like that.</p>
<p>On the other hand, Amazon has just announced spot pricing, which changes the equation. Now the price is going to be fluctuating dynamically, and you don&#8217;t so much want to know what something cost you in the past as what it&#8217;s going to cost you in the future, and that&#8217;s a whole different ball of wax.</p>
<p>I do think that almost all of the APIs at some point will include some notion of cost, but the utility of that is really going to be predicated on how predictive it is. I don&#8217;t think the providers have gotten to a point yet where they can reliably produce those kinds of predictions, so I suspect we&#8217;re a little ways off before we see it.</p>
<p><b>Scott:</b> I was thinking more in terms of the customer-facing price showing up in the API, rather than the internal costs. That seems to be a simpler matter, since as you said, there are a million things that factor into internal costs, many of which aren&#8217;t really programmatically discernable, such as what your people are costing you and those sorts of things.</p>
<p>The notion of spot pricing leads me to another thought. Today, an IT manager may have a workload that they think may be well suited to the cloud, because their demand is fairly elastic. Maybe the organization needs 100 nodes for just a week, or two hours, or whatever.</p>
<p>Customers only want to pay for peak capacity when they need it, so they&#8217;re manually looking at various cloud providers and trying to find a fit for their needs. They may also be considering an internal cloud that bursts to a public cloud as needed.</p>
<p>Of necessity, they&#8217;re looking at all these relationships very individually, in terms of the characteristics of each provider and so on.</p>
<p>I think about when web services were just catching on, as the trendy technology of the moment. There was a lot of talk at the time about web services marketplaces and directory services, where if you wanted a weather service, you&#8217;d go to a directory service and find one. The stuff you&#8217;re talking about seems to lend itself to the evolution of a kind of cloud marketplace, especially if cloud capacity really is a commodity.</p>
<p>In fact, that&#8217;s the direction that people like Amazon seem to be pushing it, where there are spot prices for instances, for CPU time, and that sort of thing. It would allow you to monitor a marketplace of many providers and place workloads dynamically based on changing prices. Am I thinking too sci-fi here, or do you actually see this happening?</p>
<p><b>Rich:</b> I actually agree with everything you said, and I would enhance your description in the following way.</p>
<p>There are business uses for which commodity markets with fluctuating prices make a lot of sense, assuming that the offerings are completely interchangeable. Typically, customers buy the lowest common denominator, because it&#8217;s a volume business.</p>
<p>At the same time, there are real advantages in certainty and long-term contracts, and in being able to repurpose things you buy based on unforeseen events. Spot pricing can be either a hindrance or a boon, since if the spot price happens to be poor when you really need to move on something, you might have been better off with a longer-term, fixed-price fee for service.</p>
<p>There was a study at UC Berkeley, I believe, of network bandwidth. They tried all sorts of pricing models for people in their homes, and they found that, in fact, spot pricing for network bandwidth at home is a really bad idea. People liked it the least of the various pricing options that were made available to them.</p>
<p>I think there&#8217;s going to be a computational commodities market, and I think the key there is going to be interchangeability. You have to know that if you buy it from one provider or another, or if you run it in-house, you really are getting the equivalent service. And in computing that&#8217;s tricky, because it&#8217;s a multidimensional quality of service bundle, and you have to match on all dimensions. Multidimensional matching is a hard problem, particularly if things are statistical.</p>
<p>It&#8217;s kind of hard to do that equivalence relationship, but you can do it, at least to some approximation. And for the things where you can deal with spot pricing and make those kinds of decisions, I think it&#8217;s going to work just fine. </p>
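<p>A minimal sketch of that equivalence check, in Python, might look like the following. The <code>Offer</code> type, the attribute names, and the prices are all hypothetical. It also treats each quality-of-service dimension as a fixed threshold, which sidesteps the statistical difficulty Rich points out.</p>
<pre>
```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """A hypothetical provider offer: an hourly price plus QoS attributes."""

    provider: str
    price_per_hour: float
    qos: dict[str, float]


def meets(offer: Offer, required: dict[str, float]) -> bool:
    """True only if the offer matches or exceeds every required dimension."""
    return all(offer.qos.get(name, 0.0) >= level for name, level in required.items())


def cheapest_equivalent(offers: list[Offer], required: dict[str, float]) -> Offer | None:
    """Cheapest offer among those that qualify on all dimensions, or None."""
    candidates = [offer for offer in offers if meets(offer, required)]
    return min(candidates, key=lambda offer: offer.price_per_hour, default=None)
```
</pre>
<p>An offer that is cheaper but falls short on even one dimension is excluded before prices are compared, which is the sense in which the offerings have to be interchangeable.</p>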
<p>I don&#8217;t see that as the only way that public cloud providers are going to offer their services, though, and I also think that independent of whether the market is commoditized or not, the on-premise cloud is going to play a role, acting as a hybrid.</p>
<p><b>Scott:</b> That makes sense relative to the electricity market, too. Governments, agencies, and companies enter into long-term contracts with each other to get determinism around price, but there&#8217;s also a spot market for power.</p>
<p><a name="mix"></a></p>
<p>To turn to a different subject area, Eucalyptus lets you get to the notion of arbitrary infrastructure as a service. While each service provider may have its own unique set of value adds, broad availability facilitates federating workloads across providers.</p>
<p>It seems like platform as a service started to evolve almost completely separately, but they seem to be coming together a little bit more now. Google App Engine was probably the first platform as a service provider that a lot of people really noticed.</p>
<p>The first option was to go the infrastructure as a service route with Amazon, and then you were your own administrator. You had to figure out what OS to load, patch your own boxes, and so on, but you could basically do whatever you could with a virtual machine. On the other hand, you could go with Google, which required a bit of a rewrite, but there were all kinds of things you didn&#8217;t have to worry about.</p>
<p>Now there seems to be work happening toward being able to layer platform-as-a-service frameworks on top of infrastructure-as-a-service offerings. I can envision a future where you can sort of mix and match those.</p>
<p>How does Eucalyptus fit into that world, where you&#8217;ve kind of got arbitrary platforms as a service on top of arbitrary infrastructures as a service?</p>
<p><b>Rich:</b> Let me start off by saying that when Eucalyptus was a research project, we teamed with another research group at UCSB that was interested in doing for Google&#8217;s App Engine what we did for Amazon with Eucalyptus, namely building an open source version of GAE. They layered it on top of Eucalyptus, and it&#8217;s called AppScale. It&#8217;s still a research project today. You can get AppScale and Eucalyptus, run them in your data center, and have GAE and Amazon all to yourself.</p>
<p>From the beginning, we really thought that people would want to mix and match platform services and infrastructure services. I think it will be interesting to see how applications work at different levels in the stack. One of the key questions with regard to this layering is how much of the inter-relationship between the application and its data can be explicitly represented.</p>
<p>With platform as a service, you wind up uploading into the cloud the data you wish to process and the code you wish to process it with. In infrastructure as a service, you have to upload those two things and you have to upload the environment in which that code is going to run.</p>
<p>I think you can characterize software as a service in these terms: you upload your data, but you upload no code. With platform as a service, you upload your data and you upload your code, but you upload no environment. With infrastructure as a service, you upload the data, the code, and the environment.</p>
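<p>That schema is small enough to write down as a lookup table. The Python sketch below does so; the dictionary and function names are hypothetical and exist only to restate the three cases.</p>
<pre>
```python
# What the customer uploads under each service model, per the schema above.
UPLOADS = {
    "software as a service": {"data"},
    "platform as a service": {"data", "code"},
    "infrastructure as a service": {"data", "code", "environment"},
}


def additional_uploads(model, compared_to):
    """What `model` requires you to upload beyond what `compared_to` requires."""
    return UPLOADS[model] - UPLOADS[compared_to]


# Moving from platform to infrastructure as a service adds the environment.
print(additional_uploads("infrastructure as a service", "platform as a service"))
```
</pre>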
<p>Given that schema, what really differentiates platform as a service from infrastructure as a service in terms of the power-benefit trade off is how much you can tease apart your application code and data from the ecosystem of software in which it lives. </p>
<p>That&#8217;s a harder job than it might sound. We all think our code lives in isolation, but it&#8217;s been a long time since people have written anything of substance that lives completely by itself.</p>
<p>If you want to use platform as a service, you have to get back to that sort of software engineering purity, where the environment in which the code runs, including the libraries, operating system version, security policies, and so on, is really kind of divorced from the code.</p>
<p>Particularly complex, scalable applications consist of multiple components, some of which are perfectly suited to platform as a service deployment, and some of which are not for whatever reason.</p>
<p>Because of that, we think the key is this layering, which allows you to use platform as a service for the application components where it makes the most sense and infrastructure as a service for the rest. If that&#8217;s the premise, we hope you use Eucalyptus, but if not, we still think that layering makes a lot of sense.</p>
<p><b>Scott:</b> I talked to some guys who were doing Google App Engine development, and one of the things they said that was really sort of eye opening to me was that platform as a service is fantastic if you can live inside of that sandbox. So, with Google, for example, everything is a request and a response, and you have a certain amount of time to process the request; otherwise it just basically kills it. But, you get infinite scale, you don&#8217;t have to patch your OS, and those kinds of things. </p>
<p>That worked for 80-90 percent of the project they were working on, but for the other parts, they would have Google App Engine call out to other stuff running over on Amazon Web Services, for example, that would do some of the longer-running kinds of things and generate results.</p>
<p>The picture you just painted allows that whole thing to exist inside of that data center. You can have the infrastructure-as-a-service piece and the platform-as-a-service piece talking to each other, relatively seamlessly. Maybe you even run them on the same servers, and to accommodate service peaks, things move around relatively transparently, and from an internal IT perspective, that&#8217;s going to be the norm. </p>
<p>Platform as a service is going to be great, infrastructure as a service is going to be great, and they&#8217;re going to have to communicate with each other, because sometimes one of them is going to be a better fit than the other, and large complex apps are frankly going to require both to some degree.</p>
<p><b>Rich:</b> That&#8217;s right, and I think you see in Azure that kind of acknowledgment. I know this sounds strange, but when I look at the Azure architecture, I see those platform-as-a-service types of restrictions built in. Still, you could imagine exporting in Azure more infrastructure-as-a-service type functionality, and then that would have to be coordinated in a reasonable way.</p>
<p>I think one advantage of Eucalyptus is that as you were saying, if you want a Ruby platform, a Java platform, a Python platform, or some other platform of your choosing, all of those things can happily coexist over the same infrastructure as a service framework.</p>
<p><a name="open"></a></p>
<p><b>Scott:</b> Let&#8217;s flip over and talk a little bit about the open source nature of Eucalyptus and its significance.</p>
<p><b>Rich:</b> There are a couple of ways to view the presence of Eucalyptus as an open source project. The easy one is the motivation for us to do it as open source, which is that we as researchers have really spent a lot of time working with and benefiting from other open source efforts.</p>
<p>If you are working in the university environment today, doing empirical system science, you are almost assuredly using somebody&#8217;s open source tool kit or platform or software infrastructure to do your work.</p>
<p>And so we felt like it was paramount that if we built something that might have utility for other researchers&#8217; investigative activities, we should make it available as open source. And frankly speaking, before we even thought of commercializing Eucalyptus, we knew we were going to make it open source. Having now gone down this path and having become a commercial entity, we still feel very strongly about this.</p>
<p>Even from a personal experience perspective, we feel that open source has a lot of value that can be passed on to the community, as long as you can support that notion commercially. From a business perspective, I think there are some fairly good reasons to do it as well. </p>
<p>The first is that we get access to a lot of information very quickly about what Eucalyptus does and does not do. That information is freely given to us as people take the software and try things with it. We can see, almost instantly, which things we have done that have been successful, where we need to work on the quality of the system, where we need to work on its stability, what features might be missing, and those kinds of things. The open source community is very good about providing that information.</p>
<p>A second aspect that was a little less clear to us when we started this, but which is actually turning out to be very valuable, is that Eucalyptus is built to run in pretty much whatever environment you have. </p>
<p>We consider Linux environments primarily, but we really spent a lot of time when we were building it initially thinking about university data centers where we don&#8217;t know what version of Linux is there, we don&#8217;t know what the vintage of the hardware is, we don&#8217;t know well how it&#8217;s administered, and those kinds of things. As a result, we&#8217;ve architected the system to be flexible in this way.</p>
<p>What we&#8217;ve learned from the open source community, though, is just how widely varied those environments can be. We&#8217;re constantly hearing from people who say, &#8220;This is what my data center looks like, and Eucalyptus doesn&#8217;t do X, Y, or Z. Please help me.&#8221; </p>
<p>That type of interaction is not really feedback about the code or what Eucalyptus does, but rather it&#8217;s feedback about how Eucalyptus can be configured. Through that information, we&#8217;ve been able to make Eucalyptus extremely configurable, so that it really does allow itself to be molded into whatever environment you want to put it in. And that, I think, is part of its value proposition in a commercial sense that we didn&#8217;t anticipate.</p>
<p><b>Scott:</b> Just to drill down a little bit, remind me what license Eucalyptus is under. Is it an Apache license, BSD, GPL, or something else?</p>
<p><b>Rich:</b> It&#8217;s actually under two licenses, which has to do with the university and commercialization phases we&#8217;ve been through. The university license was a BSD-style license, modified in a way that the University of California, Santa Barbara absolutely had to have in order to allow us to release it from the university. </p>
<p>We had to modify that license when we commercialized, because it&#8217;s a UCSB-specific license. At that point, we took a look at it, thought about what the rest of the community was doing, and switched to GPL. It&#8217;s actually GPLv3 going forward, and as time goes on, less and less of the university code is present in the actual open source code base. At some point, it will be all GPLv3, but for now it&#8217;s both.</p>
<p><b>Scott:</b> That makes sense. In terms of community involvement, you mentioned that you get a lot of feedback from the community about how it&#8217;s working in a certain environment, specific issues, and things like that.</p>
<p><a name="community"></a></p>
<p>With some open source companies, the majority of the people working on the project work for the company. That was sort of the way MySQL was originally structured.</p>
<p><b>Rich:</b> Right.</p>
<p><b>Scott:</b> Others really want the community contributing code. Where do you see yourselves on that spectrum, in terms of where the code comes from?</p>
<p><b>Rich:</b> We&#8217;re actually moving away from one extreme. When we started out and looked at taking community contributions, we found that we were rolling the code forward so fast that people were contributing code that was absolutely against our internal code base. And so they would get upset because they put a lot of energy into building code that we couldn&#8217;t take.</p>
<p>What we decided was that, in the first phase of the project, we would really limit code contribution to things that talk to the Eucalyptus API, meaning things on top of Eucalyptus or attached to it or communicating through the API, and then to take patches. We would take bug fixes, as long as they were not obsoleted by a bug fix we had internally or whatever.</p>
<p>What has happened is that we were trying as hard as possible to do a complete implementation of AWS while AWS was adding features. As a result, we decided to finish off the initial AWS API for open source, do a set of bug fix releases to refactor some things internally for performance and stability reasons, and then adopt a much less aggressive release schedule.</p>
<p>Moving to a status where we&#8217;re releasing every six to twelve months instead of every six or eight weeks makes it more feasible for us to entertain contributions to the core. It&#8217;s important for us that the QA be done well. </p>
<p>We&#8217;ve put a lot of energy into making Eucalyptus extremely concurrent, which is a big differentiating factor between Eucalyptus and some other open source offerings. Eucalyptus is a very, very concurrent system, and what we find is that when people add things to it, in order to get stability, they often put locks in, because it&#8217;s easy.</p>
<p>That approach gets their thing to work, but at the cost of reducing the concurrency. We probably will look very long and hard at those contributions to make sure that we preserve the quality of what&#8217;s there, but I imagine that in the next year or so we&#8217;ll see community contributions to the core as well.</p>
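<p><i>The trade-off Rich describes can be illustrated with a minimal Python sketch. This is not Eucalyptus code; the function name, the worker count, and the timeout values are assumptions made for illustration. It checks whether all workers can be inside their critical sections at the same moment: with one coarse lock shared by everyone they cannot, while with one lock per resource they can.</i></p>
<pre><code class="language-python">import threading

def all_workers_overlap(locks, timeout=2.0):
    """Return True if every worker can hold its lock at the same time.

    locks: one lock per worker. Passing the same lock object repeatedly
    models a single coarse lock added "because it is easy".
    """
    barrier = threading.Barrier(len(locks))
    results = []

    def worker(lock):
        with lock:
            try:
                # Passes only if all workers are inside their locks at once.
                barrier.wait(timeout=timeout)
                results.append(True)
            except threading.BrokenBarrierError:
                results.append(False)

    threads = [threading.Thread(target=worker, args=(lock,)) for lock in locks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(results)

# One shared lock: workers are serialized, so they never overlap.
coarse = threading.Lock()
coarse_result = all_workers_overlap([coarse] * 4, timeout=0.2)

# One lock per worker: every worker can proceed concurrently.
fine_result = all_workers_overlap([threading.Lock() for _ in range(4)])
</code></pre>
<p><i>Here <code>coarse_result</code> is <code>False</code> and <code>fine_result</code> is <code>True</code>. Both versions are correct, but only the second preserves concurrency, which is the property a reviewer would be protecting when evaluating a contribution.</i></p>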
<p><b>Scott:</b> I remember seeing news articles that mentioned, for example, Ubuntu and Eucalyptus in the same paragraph. What are your relationships to distros or other open source projects?</p>
<p><b>Rich:</b> We&#8217;ve always wanted to make Eucalyptus as widely valuable as possible in the open source community. The Ubuntu partnership has been fantastic in that way, as they have really put a great deal of energy into adding things that they believe their distribution community really wants. </p>
<p>They put a lot of energy, for example, into improving the installation experience in a way that is much more compatible with what the Ubuntu community is accustomed to seeing. And we think that&#8217;s great.</p>
<p>By the same token, we are open to working with other projects or companies that have different needs. We do work with the Debian community, we&#8217;re actively working to package it for Squeeze as part of their open source offering, and we&#8217;ve talked with some of the other distros about whether it makes sense for us to be part of the distro or just to remain compatible.</p>
<p>We have agreed to continue to package Eucalyptus for the Linux distros that people ask us to package it for, and we explore ways that we can have closer partnerships. That&#8217;s ongoing, but so far, in the distro space, our closest partnership is with Ubuntu.</p>
<p>There is a lot of synergy between what Eucalyptus does and what other projects do. So again, in this platform, there is this sort of layering capacity. We have ongoing compatibility and synergy discussions with a number of different partners. </p>
<p>There is also an ecosystem building up for tooling. RightScale is a good example of a company that provides powerful tools for using Amazon. You can program your Amazon allocation much more effectively through RightScale than you can by hand.</p>
<p>Because we&#8217;re Amazon-compatible, RightScale works with Eucalyptus, so that&#8217;s a nice partnership. If you want to use the same kind of dashboard to manage your Amazon and Eucalyptus clouds, you can, and there are a number of companies that provide that kind of tooling, backup services, and other things that complement what we do.</p>
<p>We think of those partnerships as being really important, particularly since we&#8217;re open source, because we think that the success of cloud computing today is going to come from an active ecosystem and not the success of any one single technology.</p>
<p><b>Scott:</b> Talk a little bit about your business model. How do you guys keep the lights on around this open-source project?</p>
<p><b>Rich:</b> We&#8217;re practicing what I think is commonly referred to as an open-core model, where we provide the core of the system as open source, and then there are value added components that can be purchased from us under license. The bright line differentiator for us is whether the feature or component is really tailored towards the notion of commerce or not.</p>
<p>If you want to experiment with cloud computing, if you want to understand it better, or if you want to see what it does and what it doesn&#8217;t do, then open source Eucalyptus is for you. If you want to make money with it and start a business using Eucalyptus, then there are things we think you need that we feel like you should, in some measure, pay for, because you&#8217;re going to go off and resell the value that comes from them.</p>
<p>So, for us, it&#8217;s actually pretty easy to see what goes in open source and what doesn&#8217;t. We have professional engagements that surround our proprietary products. We also offer support, under contract, for the open source code, but that&#8217;s not really our main business. We&#8217;re really a software development company, and we&#8217;re leveraging the open source community and the open source project for the proprietary products we ship.</p>
<p><a name="world"></a></p>
<p><b>Scott:</b> Obviously, people are using Eucalyptus internally for things that are kind of confidential and not public, but what are some of the really cool stories you can talk about where you&#8217;ve seen Eucalyptus put into production?</p>
<p><b>Rich:</b> The best example of where it&#8217;s in production is actually something we can talk about. NASA took Eucalyptus and added Lustre, which is a high performance shared file system, and some other things, and they&#8217;re running it in production at NASA Ames.</p>
<p>They offer cloud services to other agencies within the government, and we know it&#8217;s in production, because they call us and they yell at us about it. We think that&#8217;s pretty exciting.</p>
<p>The other stories around Eucalyptus are really more usage scenarios. We hear a lot from the open source community about what people are doing with it. We can sort of talk about that, in the sense that they only tell us so much. </p>
<p>There are interesting use cases that have emerged from Eucalyptus. One is the cloud bursting or disaster recovery model. People think of that one almost immediately. A lot of people are also looking at the idea of augmenting their on-premises infrastructure with public clouds.</p>
<p>But then there&#8217;s also this funny phenomenon, where developers become frustrated with the pace at which they can get resources dedicated to their projects, and they pull out their own credit cards and start developing in a public cloud. Often that code is very valuable, and when that comes to life, the company for whom that developer is working often wishes to pull that stuff back into the company and run it on premises until it can be properly vetted, software engineering can be done on it, and so forth.</p>
<p>So, we see this going both ways. We see businesses that want to burst from their on-premises IT into the public clouds, and then we see development taking place in one venue, such as on the public clouds, and for whatever reason, the people who own the development want to move it back into the private space, perhaps only to move it forward.</p>
<p>The result is a kind of homogenization of the on-premises IT with public utility providers. We think that represents a really exciting future usage scenario for Eucalyptus.</p>
<p><b>Scott:</b> It&#8217;s a slow moving trend, because it&#8217;s difficult for companies to get there, but there&#8217;s pressure for internal IT organizations to restructure themselves essentially as service-delivery arms. </p>
<p>We&#8217;ve talked to people in several companies who have told us, &#8220;I have to compete with everything out there. Any business unit that comes to me is free to buy from Amazon, they&#8217;re free to buy from Google, they&#8217;re free to buy from whoever they want, and I&#8217;ve basically got to make them a deal. I&#8217;ve got to deliver a better service at lower cost.&#8221;</p>
<p><b>Rich:</b> We hear this idea that internal IT finds itself competing with commodity providers, but I predict that will prove to be a short-term phenomenon. Whenever a commodity emerges, it becomes inexpensive as well as very common. In the technology business, there&#8217;s often an advantage in using something that&#8217;s more rare, before your competition and before it becomes a commodity.</p>
<p>Where your business has to innovate, that&#8217;s really not necessarily a commodity business. I think what we&#8217;re going to see in the future is that on-premises IT will really be the engine of innovation, experimenting and coming up with the best ideas before the competition can do them. </p>
<p>And then as those advances commoditize, becoming the sorts of things where you&#8217;re going to compete on price and really try to drive the margins down, there will be pressure to move them from your internal infrastructure, out to the utilities, to make space for the next innovation.</p>
<p>I think long-term, that&#8217;s really what IT is going to look like. I don&#8217;t think it&#8217;s all going to be public, and it&#8217;s certainly not all going to be private. I think the business model that rationalizes how to partition resources between public and private is going to be driven by that innovation cycle.</p>
<p><b>Scott:</b> That&#8217;s great. Looking at the time, I think that&#8217;s a good place to wrap up the interview. Thanks very much.</p>
<p><b>Rich:</b> Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/01/11/interview-with-rich-wolski-eucalyptus-cto/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Interview with Amr Awadallah &#8211; Cloudera CTO</title>
		<link>http://howsoftwareisbuilt.com/2010/01/06/interview-with-amr-awadallah-cloudera-cto/</link>
		<comments>http://howsoftwareisbuilt.com/2010/01/06/interview-with-amr-awadallah-cloudera-cto/#comments</comments>
		<pubDate>Wed, 06 Jan 2010 19:05:00 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=272</guid>
		<description><![CDATA[Cloudera has built a business around the open-source Hadoop project, and in this interview, we chat with Cloudera CTO and cofounder Amr Awadallah about all things Hadoop. Background on Hadoop and its origins Implementing support for MapReduce to handle distributed data Suitability of Hadoop for cloud computing The relationship between Cloudera&#8217;s Hadoop version and the [...]]]></description>
			<content:encoded><![CDATA[<p>
    Cloudera has built a business around the open-source Hadoop project, and in this<br />
    interview, we chat with Cloudera CTO and cofounder Amr Awadallah about all<br />
    things Hadoop.</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2010/01/06/interview-with-amr-awadallah-cloudera-cto/#hadoop">Background on Hadoop and its origins</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/01/06/interview-with-amr-awadallah-cloudera-cto/#mapreduce">Implementing support for MapReduce to handle distributed data</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/01/06/interview-with-amr-awadallah-cloudera-cto/#cloud">Suitability of Hadoop for cloud computing</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/01/06/interview-with-amr-awadallah-cloudera-cto/#relation">The relationship between Cloudera&#8217;s Hadoop version and the Apache version</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/01/06/interview-with-amr-awadallah-cloudera-cto/#business">Cloudera&#8217;s business and monetization model around Hadoop</a></li>
<li><a href="http://howsoftwareisbuilt.com/2010/01/06/interview-with-amr-awadallah-cloudera-cto/#license">Licensing models as they apply to Cloudera and Hadoop</a></li>
</ul>
<p><span id="more-272"></span></p>
<p><b>Scott Swigart:</b> Could you introduce yourself and Cloudera?</p>
<p><b>Amr Awadallah:</b> I&#8217;m CTO, VP of engineering, and cofounder of Cloudera. I was previously VP of engineering for data systems at Yahoo!. That&#8217;s where I first got exposure to Hadoop and recognized the need to form a company to commercialize it. The company is just 13 months old now.</p>
<p><a name="hadoop"></a></p>
<p><b>Scott:</b> Can you fill in some of the blanks for our readers who may not be entirely familiar with Hadoop?</p>
<p><b>Amr:</b> Sure. With traditional data systems, you have a bunch of filers where you store your data. Connected to these filers is a compute grid, which is essentially a bunch of computers that fetch the data over the network, do computations on it, then store the data back.</p>
<p>That model worked fine, but it started to break when data sizes got too big and moving the data between the storage network and the compute resources became a big stress on the network infrastructure. </p>
<p>There was also a big stress on the filer heads. A filer consists of a bunch of disks, and all these disks are attached to a head, which is really what does all the serving. The filer head itself starts to become a bottleneck when you&#8217;re running a job that accesses a large proportion of your data. If you&#8217;re just searching a file or two within your data, it&#8217;s still OK.</p>
<p>Those factors are what drove the need for Hadoop. We had massive amounts of data to process, and existing systems were not capable of scaling sufficiently. Hadoop itself came out of the open source community, though Yahoo! played a very big role, as well.</p>
<p><b>Scott:</b> I&#8217;m always interested in the actual mechanics of how something as complex as Hadoop comes to be. Can you speak to that specifically?</p>
<p><b>Amr:</b> Two developers named Doug Cutting and Mike Cafarella were trying to build a web-scale, distributed search engine. They built a system called Nutch about seven years ago.</p>
<p>Obviously, the system needed to be massively scalable to index the whole web, so they were spending a lot of energy in that area. Around that time, Google published two key papers, on Google MapReduce and the Google File System, GFS. It&#8217;s the marriage of those two technologies that enables the large scale data processing that we can do today. </p>
<p>GFS allows linking a lot of commodity hardware together to present one large, unified file system. MapReduce has two parts: the programming model and the job execution framework.</p>
<p>The MapReduce job execution framework lets you spread your job over multiple nodes, so that the first level of mapping, where you&#8217;re reading your data in and doing the first level of aggregation, is all local. All the communication is purely local between the computation resources and local disks. Therefore, you don&#8217;t take a performance hit or stress the network from passing data over it.</p>
<p>In the second phase, which is called the reduce phase, some inter-node communication takes place, which does necessarily stress the network infrastructure to some degree. Usually by that stage, though, the data has already been reduced in size to the subsets that you care about, so the stress on the network infrastructure is dramatically reduced.</p>
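<p><i>The two phases can be sketched in a few lines of Python. This is a toy word count, not Hadoop&#8217;s actual API: the function names, the two-node input, and the byte-sum partitioning rule are assumptions made for illustration. Each node first aggregates its own lines locally, and only the much smaller per-node counts are routed to reducers.</i></p>
<pre><code class="language-python">from collections import Counter, defaultdict

def map_phase(local_lines):
    """Runs on one node against its local data and pre-aggregates counts."""
    counts = Counter()
    for line in local_lines:
        counts.update(line.lower().split())
    return counts

def shuffle(per_node_counts, num_reducers):
    """Route each word to one reducer; this is the step that crosses the network."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for counts in per_node_counts:
        for word, n in counts.items():
            # Deterministic partitioning so a given word always meets one reducer.
            partitions[sum(word.encode()) % num_reducers][word].append(n)
    return partitions

def reduce_phase(partition):
    """Combine the partial counts that arrived for each word."""
    return {word: sum(values) for word, values in partition.items()}

def word_count(node_data, num_reducers=2):
    mapped = [map_phase(lines) for lines in node_data]
    totals = {}
    for partition in shuffle(mapped, num_reducers):
        totals.update(reduce_phase(partition))
    return totals

# Each inner list stands in for the lines stored on one node's local disks.
node_data = [
    ["the quick brown fox", "the lazy dog"],
    ["the dog barks", "quick quick fox"],
]
result = word_count(node_data)
</code></pre>
<p><i>Only the output of <code>map_phase</code> is passed to <code>shuffle</code>, which mirrors the point that the data has already shrunk by the time it has to move between nodes.</i></p>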
<p>When Doug and Mike saw these papers, they had a &#8220;Eureka&#8221; moment and started to integrate these technologies into their web-scale search technology with Nutch.</p>
<p><b>Scott:</b> How did the work at Yahoo! coincide with what Doug and Mike were doing with Nutch? I assume that Yahoo! must have adopted an open source approach in place of some of their proprietary algorithms. Is that true, and what went on behind the scenes with those decisions?</p>
<p><b>Amr:</b> At the time, Yahoo! was also doing a lot of work internally on their infrastructure for serving search. As the web grew exponentially, the cost of serving a search was getting prohibitively high, and Yahoo! needed a more scalable system to cost-effectively build the web index.</p>
<p>They saw the MapReduce and GFS papers from Google as well, and in light of the work being done by then on the open source Hadoop project, they recognized that they had a choice. They could either build proprietary MapReduce and GFS technology from scratch, or they could adopt Hadoop internally and support the open source project.</p>
<p>A very big debate went on inside Yahoo! about the right direction to go. The proponents of the open source approach made the very good argument that supporting Hadoop would take away one of the key advantages that Google had. Namely, only Google had the systems to really take advantage of MapReduce and GFS, which locked in a tremendous advantage for them.</p>
<p>They argued successfully that it was in Yahoo!&#8217;s best interest to unlock that technology, so other startups could start adopting it and building ideas around it. The idea was that this approach would shift the power from Google.</p>
<p>In response to those arguments and others like them, Yahoo! brought Doug Cutting on staff and invested very heavily in building a large Hadoop team. In fact, the vast majority of the code base in Hadoop today came from Yahoo!, so they certainly deserve a lot of credit for that.</p>
<p><b>Scott:</b> How did Hadoop move beyond its foundations in search?</p>
<p><b>Amr:</b> After Yahoo! started to integrate the technology within its web infrastructure initially, Hadoop started penetrating a lot of other areas of Yahoo! as well. Now it&#8217;s being used in Yahoo! mail for spam filtering, as well as on the Yahoo! front page to model which stories to display to which users based on their interests. It&#8217;s being used for ad modeling and ad targeting, as well. Facebook was another early adopter and supporter of Hadoop.</p>
<p>Around that time, which was maybe two years ago, it was starting to be clear that Hadoop had implications for more than just the big web properties. The technology is useful to many other domains as well. In fact, this notion of moving computation to run as close as possible to the data is something that everybody should be doing, given how big data is growing these days.</p>
<p>This trend is being reinforced by the fact that commodity servers are becoming very powerful in terms of the number of cores and the number of disks you can have in them. Now you can have eight cores very economically, and within the next year you&#8217;re going to have 32 or even 64. You can have anywhere from four terabytes to as much as 48 terabytes of disk space on one of these standard server nodes.</p>
<p>It has become very clear that the trend is toward large hardware topologies, where systems will need to be very resilient to failure. They will need to know how to shift workloads and data around when things fail, so the system just heals itself.</p>
<p><a name="mapreduce"></a></p>
<p><b>Scott:</b> One of the things that this model depends on is moving away from a traditional relational database that doesn&#8217;t really fragment the data across lots of nodes. Instead, those traditional systems are really based around the idea of a single, monolithic body of data. </p>
<p>That seems to be one of the big paradigm shifts for developers. When they start using something like BigTable where the data is distributed across a lot of nodes, they have to think a little differently about the way they build applications.</p>
<p>Working our way from the bottom up, talk a little bit about thinking differently in how you store your data so that you can use a MapReduce process on top of that.</p>
<p><b>Amr:</b> First, it&#8217;s worth making the important clarifying point that Hadoop is not a database. Hadoop is a data processing system, and in fact, I would even go as far as saying Hadoop is an operating system. The core of an operating system boils down to a file system, the storage of files, and a process scheduling system that runs applications on top of these files.</p>
<p>There are many other components that help with devices, credentials and user access, and so on, but that is the core. Hadoop is exactly the same thing. The core of Hadoop is the Hadoop Distributed File System, which is a file system that runs across many nodes. It links together the file systems on many local nodes to make them into one big file system. Hadoop MapReduce is really the job scheduling system that takes care of scheduling jobs on top of all those nodes.</p>
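<p><i>As a rough picture of those two pieces working together, the following Python sketch splits a file into blocks spread over several nodes, then schedules each block&#8217;s task on the node that already holds it. It is a deliberately simplified model, not how HDFS or the Hadoop scheduler is implemented: the block size, the round-robin placement, the node names, and the absence of replication are all assumptions.</i></p>
<pre><code class="language-python">def place_blocks(data, block_size, nodes):
    """Split data into blocks and assign them to nodes round-robin.

    Returns (block_index, node, block_bytes) tuples, standing in for the
    metadata a distributed file system would keep about a file.
    """
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    return [(i, nodes[i % len(nodes)], block) for i, block in enumerate(blocks)]

def schedule(placement):
    """Assign each block's task to the node that already stores the block."""
    plan = {}
    for index, node, _ in placement:
        plan.setdefault(node, []).append(index)
    return plan

def read_file(placement):
    """Reassemble the blocks so callers see one file, not scattered pieces."""
    ordered = sorted(placement, key=lambda entry: entry[0])
    return b"".join(block for _, _, block in ordered)

data = b"abcdefghijklmnopqrstuvwxyz"
placement = place_blocks(data, block_size=8, nodes=["node-a", "node-b", "node-c"])
plan = schedule(placement)
</code></pre>
<p><i>With these inputs the 26 bytes become four blocks, <code>plan</code> sends blocks 0 and 3 to <code>node-a</code>, and <code>read_file(placement)</code> returns the original bytes, so the many local disks behave like one file system from the outside.</i></p>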
<p>That is the key distinction between Hadoop&#8217;s approach and that of database systems. Hadoop, at its heart, does not require any structure to your data. You can just upload files from anywhere, like a web server, RFID device, or mobile phone, directly into Hadoop.</p>
<p>They could be images, videos, or just a bunch of bits. They don&#8217;t have to have a schema with column types and so on, which gives you tremendous agility and flexibility. </p>
<p>Hadoop has a very nice model that I sometimes refer to as schema on read. Defining your schema as you&#8217;re writing the data in limits what you can store, because everything has to conform to the schema you created. Hadoop instead allows you to define the schema as you&#8217;re reading the data out.</p>
<p>That gives you a lot of flexibility and agility, since you can add files that have dynamic parts like JSON or new standards coming up like Avro, which is a very good project coming out of the Hadoop project that&#8217;s similar to protocol buffers from Google and Thrift from Facebook. Avro makes files have a schema around them as well, but these schemas are semi-structured, rather than conforming to a strict relational model.</p>
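<p><i>A small Python sketch shows what schema on read looks like in practice. The record fields, the sample lines, and the <code>read_with_schema</code> helper are hypothetical. The stored JSON lines are left untouched, and two different readers each impose their own schema on them.</i></p>
<pre><code class="language-python">import json

# Raw records as they might have been uploaded; no schema was enforced on write.
RAW_LINES = [
    '{"user": "ann", "ms": "120", "path": "/home"}',
    '{"user": "bob", "ms": "85"}',
    '{"user": "cat", "ms": "310", "path": "/search", "referrer": "mail"}',
]

def read_with_schema(lines, schema):
    """Apply a schema while reading; the stored lines are never modified.

    schema maps a field name to (converter, default). Fields missing from
    a record get the default, and fields not in the schema are ignored.
    """
    rows = []
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, (convert, default) in schema.items():
            row[field] = convert(record[field]) if field in record else default
        rows.append(row)
    return rows

# Two consumers read the same files with different schemas.
latency_view = read_with_schema(RAW_LINES, {"user": (str, ""), "ms": (int, 0)})
path_view = read_with_schema(RAW_LINES, {"path": (str, "unknown")})
</code></pre>
<p><i>The second and third records differ in shape from the first, yet both could be stored; each view decides at read time which fields matter and how to type them.</i></p>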
<p>That said, it&#8217;s also important to point out that structured stuff is a subset of unstructured stuff. The fact that Hadoop at its heart is a file system doesn&#8217;t mean that it can&#8217;t do database relational stuff. It does actually, in the same way that Windows at its heart is a file system, but you can run SQL Server on top of it to get the relational services, schemas, column types, and so on.</p>
<p>One of the key projects on top of Hadoop is Hive, which actually came out of Facebook. Hive essentially provides a relational database on top of Hadoop that utilizes the underlying file system but has a metastore that keeps the schema of the files.</p>
<p>It knows that a given file is tab delimited or whatever, it knows the column type for these files, and Hive allows you to write SQL against these files. It will look up the schema and then it will write for you the MapReduce jobs so that you don&#8217;t have to go and learn MapReduce from scratch.</p>
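<p><i>The idea of a metastore plus generated MapReduce steps can be sketched as follows. This is an illustration in plain Python, not Hive&#8217;s implementation or API: the <code>hits</code> table, its columns, and the helper functions are invented for the example, and no SQL is actually parsed. The function corresponds roughly to the query <code>SELECT status, SUM(bytes) FROM hits GROUP BY status</code>.</i></p>
<pre><code class="language-python">from collections import defaultdict

# Hypothetical metastore entry: how to interpret a plain tab-delimited file.
METASTORE = {
    "hits": {
        "delimiter": "\t",
        "columns": [("url", str), ("status", int), ("bytes", int)],
    }
}

def parse_row(table, line):
    """Use the metastore to turn one raw line into a typed row."""
    meta = METASTORE[table]
    values = line.rstrip("\n").split(meta["delimiter"])
    return {name: cast(value) for (name, cast), value in zip(meta["columns"], values)}

def group_by_sum(table, lines, key, value):
    """Rough equivalent of SELECT key, SUM(value) FROM table GROUP BY key."""
    # Map: emit a (key, value) pair from each parsed row.
    pairs = [(row[key], row[value]) for row in (parse_row(table, l) for l in lines)]
    # Shuffle: bring together all values that share a key.
    grouped = defaultdict(list)
    for k, v in pairs:
        grouped[k].append(v)
    # Reduce: collapse each group to a single sum.
    return {k: sum(vs) for k, vs in grouped.items()}

lines = [
    "/index.html\t200\t512",
    "/about.html\t200\t256",
    "/missing\t404\t64",
]
bytes_by_status = group_by_sum("hits", lines, key="status", value="bytes")
</code></pre>
<p><i>The caller states only which table and columns are involved; the file format and column types come from the metastore entry, which is the division of labor described above.</i></p>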
<p>Now you have the flexibility of going either way. One approach is to get at the core of the MapReduce framework using Java MapReduce, which we sometimes refer to as being like assembly language for Hadoop. It gives you the most flexibility and performance, but it is fairly complex and difficult to learn.</p>
<p>Alternately, you can go in with a high level language like Hive. In this case, you can just use SQL, if that&#8217;s what you&#8217;re used to, to write your job. Hive itself has lots of optimizations. It understands the underlying MapReduce framework, so it can properly map your problem on top of your data.</p>
<p><b>Scott:</b> Obviously, search is a logical use of MapReduce, but I&#8217;m sure there are many others. For example, in the oil and gas industry, a single oil derrick might produce terabytes of data a day. In other industries, there might be huge amounts of financial data to plow through.</p>
<p>It seems that every industry is faced with figuring out ways to deal with these large bodies of data. Given that, what are some of the more interesting uses or problems you&#8217;ve seen thrown at Hadoop?</p>
<p><b>Amr:</b> The group at Yahoo! that I came from was using Hadoop for data analytics and data warehousing. We had something like 100,000 web servers across the world, and once we collected data from across all these servers, we dumped it into Hadoop, which became the place where we stored all of the data, instead of traditional network storage.</p>
<p>Our reasoning for doing that was a matter of economics, given the quantity of hardware. Hadoop let us scalably process that data, clean it up, and normalize it so we could pass it along to the systems that needed it.</p>
<p>Hadoop is getting very wide adoption in the data warehousing and business intelligence domains. One of the biggest uses within Yahoo! right now is dealing with all of the log information from servers. Analyzing that information allows for better spam filtering, ad targeting, content targeting, A/B testing for new features, et cetera.</p>
<p>It&#8217;s not web-specific. For example, everybody does data warehousing, and we see very strong adoption there.</p>
<p>Separate from that, your example of oil companies is a very good one, as is the financial sector. Right now, we do have a couple of very large financial institutions working with us on these exact problems, taking huge amounts of data from domains like credit card processing and building predictive models for fraud that enable better decisions, for example, about whether to block or allow a given transaction.</p>
<p>In the stock market, Hadoop is being used to do simulations that help predict option pricing and related problems. That&#8217;s another very healthy market that we&#8217;ve seen growth in.</p>
<p>Another interesting market is genomics, specifically for doing gene sequence mapping and protein sequence alignment. In many ways, that is a fuzzy string matching problem, where you have a string of DNA that you need to match to a template string that has certain diseases identified in it. </p>
<p>Any individual task is not that big, but there are so many of them that you need to split the work over many nodes and do the matching in parallel. Government agencies have a number of implementations that involve massive bodies of data as well.</p>
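<p><i>A minimal Python sketch of that pattern follows. The sequences, the marker names, and the choice of edit distance as the fuzzy measure are assumptions for illustration, and real sequence alignment uses more specialized algorithms. A thread pool stands in for the cluster here; the point is only that each read is an independent task that can be handed to a separate worker.</i></p>
<pre><code class="language-python">from concurrent.futures import ThreadPoolExecutor

def edit_distance(a, b):
    """Levenshtein distance, keeping only one previous row of the table."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def best_match(read, templates):
    """Return (template_name, distance) for the closest template."""
    scored = ((name, edit_distance(read, seq)) for name, seq in templates.items())
    return min(scored, key=lambda pair: pair[1])

def match_all(reads, templates, workers=4):
    """Match every read independently, spreading the tasks over workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda read: best_match(read, templates), reads))

# Hypothetical template strings and reads.
templates = {"marker_a": "GATTACA", "marker_b": "CCGGTTAA"}
reads = ["GATTACA", "GATTTCA", "CCGGTAAA"]
matches = match_all(reads, templates)
</code></pre>
<p><i>Because no read depends on another, the same structure scales by adding workers, which is what makes this kind of workload a natural fit for many nodes.</i></p>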
<p><a name="cloud"></a></p>
<p><b>Scott:</b> Let me turn our line of conversation in a slightly different direction. Cloudera obviously has the word &#8220;cloud&#8221; in it, and cloud providers like Amazon Web Services offer Hadoop implementations.</p>
<p>Hadoop seems to intersect with the cloud fairly well. When I think of the cloud, I think about having an API to start up and shut down nodes, utility computing, and being able to provision and shut down relatively large numbers of nodes remotely, programmatically.</p>
<p>What makes the cloud and Hadoop a good fit for each other?</p>
<p><b>Amr:</b> I completely agree with you that the cloud is essentially a set of resources that you can scale up or down with a nice API, based on dynamic demand. There are many services that fit that description, of course, and I should add that a cloud is not always external to your organization.</p>
<p>Even though Cloudera has the word &#8220;cloud&#8221; prominently in our name, we are not building a cloud or a cloud infrastructure like Amazon AWS or Microsoft Azure. Rather, what we&#8217;re building is software that enables cloud computing. Hadoop is at the heart of that, although we&#8217;re also building other components around Hadoop that strengthen that kind of operating proposition.</p>
<p>Hadoop fits very well with the cloud model, because in many ways, Hadoop is like a virtual machine, albeit one that actually takes the inverse approach to what one generally associates with virtual machines. </p>
<p>That is, a virtual machine hypervisor allows you to partition a single server to look like many small servers. On the other hand, Hadoop is about linking together a bunch of physical servers to look like one big server, which starts looking like a cloud service. </p>
<p>You can have many users using this resource, and the resource can be scaled up or down, depending on the demand it&#8217;s being hit with. That&#8217;s exactly the sweet spot for Hadoop.</p>
<p>We&#8217;re working very closely with Amazon, and their Amazon Elastic MapReduce service gives you the option to use the Cloudera-specific distribution of Hadoop, although Amazon has the generic Apache distribution of Hadoop as well. If you want to get a maintenance support contract, you obviously need to use our distribution.</p>
<p>Amazon&#8217;s business is about providing the cloud service itself, and not about building Hadoop. Our business is about building Hadoop itself, so that&#8217;s the synergy there.</p>
<p><a name="relation"></a></p>
<p><b>Scott:</b> As you mentioned, there is an Apache version of Hadoop and then there&#8217;s the Cloudera version. As different companies wrap themselves around different open source projects, they&#8217;re structured in different ways. Talk a little bit about Cloudera and what you add to the public open source version of Hadoop, in terms of additional software, support, or services.</p>
<p><b>Amr:</b> I should start by saying that Cloudera is an enterprise software company. Open source is an enabler for us, and it&#8217;s part of what we do, but our mission is about building enterprise software for large-scale data processing in internal or external clouds.</p>
<p>In terms of how we go about working with the community, Hadoop is an Apache licensed product, and the Apache license is one of the most consumer-friendly licenses out there.</p>
<p>The Apache license allows anybody to take the code and do whatever they want to do with it, whether that means using it internally, redistributing it, even packaging and selling it. That flexibility makes it a very nice license, but of course, it means that you have to add value in order to make money from the code.</p>
<p>The way we make money is a bit similar to Red Hat, although Red Hat does charge for the distribution, whereas we don&#8217;t. You can get our distribution directly from our site without having to pay us anything.</p>
<p>There are many differences between our distribution and the Apache distribution. Primarily, our distribution is well packaged, as RPM packages for Red Hat and as Deb packages for Ubuntu or Debian. That makes it much easier to install using standard tools, and in fact, we are talking right now with Microsoft about how we can package it properly to run on top of the Microsoft Windows platform.</p>
<p>The other distinction is that the Apache organization can be a bit slow in terms of launching new Hadoop releases. We release new distributions almost every two weeks, with the latest and greatest features.</p>
<p>Of course, we release them as test distributions, and we tell people to be careful how they use them. As with other fast-moving distributions, the bleeding-edge stuff can have problems as well. Yahoo! also has a distribution, which is essentially the Apache Hadoop code that has been tested on the Yahoo! internal infrastructure. Everything that Yahoo! has in their Hadoop distribution makes it to the Apache Hadoop release, and the same goes for any contributions that Cloudera makes to our distribution. We aren’t forking the code base; it’s just a matter of the Apache Hadoop official release lagging behind a bit.</p>
<p><b>Scott:</b> What is the ongoing relationship between Yahoo! and Cloudera?</p>
<p><b>Amr:</b> As I said before, Hadoop owes a lot to Yahoo!, in terms of how they made it into what it is today, and they continue to contribute a lot. Yahoo! has a very big Hadoop team&#8211;I&#8217;d say almost 100 people right now. They have lots of open positions as well, for new people to join that team.</p>
<p>A lot of the Hadoop patches that we pick up were developed and tested by Yahoo!, and that&#8217;s what we include in our distribution. We do our own development and testing as well, in addition to what Yahoo! does. We listen to our customers to decide which features and improvements to start working on next. Yahoo!’s development roadmap is obviously a function of their internal needs.</p>
<p><a name="business"></a></p>
<p><b>Scott:</b> How does Cloudera monetize their work with Hadoop?</p>
<p><b>Amr:</b> Right now, the only reason you would need to pay us is if you need our maintenance service. Think about software where there&#8217;s no upfront licensing fee but there is a maintenance fee. The maintenance fee buys you access to our team, and you can open tickets and ask questions.</p>
<p>Companies using Hadoop in a production environment need a partner they can fall back on when they have problems, and we are that partner. We can also provide advice about what network hardware to use, how to properly balance the number of hard disks and cores on your machines, and how to configure Hadoop itself. </p>
<p>Hadoop has something like 200 different system parameters that you need to properly configure depending on the infrastructure you have and the type of job that you&#8217;re running. Writing code using Java MapReduce is a bit of an art right now, and we can help with that to make sure your code is properly written with respect to the MapReduce model.</p>
<p>These are the kinds of services that we offer. We also offer quick patches. If you encounter a bug specific to your problem, and you want to make sure you get the patch for it right away so you can continue operating your cluster, we&#8217;ll get the patch to you very quickly. We have a number of Apache Hadoop committers on staff.</p>
<p>I forgot to mention that Doug Cutting, the inventor of Hadoop, left Yahoo! in September and joined our team. Tom White is another key committer we have at Cloudera. He is also the author of the O’Reilly “Hadoop: The Definitive Guide” book, which is the “bible” for Hadoop right now.</p>
<p>The other thing we do is training, in terms of learning this new technology, how to work with it, and how to frame problems within this new scalable way of thinking. We provide training from the very basic, fundamental kind of stuff all the way to writing algorithms specific to the problem that you are trying to solve.</p>
<p>That&#8217;s how a lot of open source companies start, where their main revenue stream is maintenance fees; this allows them to make healthy revenues in their early years.</p>
<p>Most open source companies evolve to an open core strategy, where the core platform is free, but functionality around the core is not necessarily free. That is the direction that we will be moving in.</p>
<p>We&#8217;ll keep the Hadoop core free, because that&#8217;s Apache, and we&#8217;ll continue contributing to that. Indeed, more than half our engineering team is focused on making the Hadoop platform better, but at the same time, we&#8217;re now investing in building infrastructure around Hadoop to strengthen the platform. </p>
<p><b>Scott:</b> It&#8217;s interesting to consider that dynamic among companies that are wrapped around open source projects like Joomla, WordPress, or MySQL. Every one is different, even with their similarities.</p>
<p>It&#8217;s interesting to me to understand how different companies are structured, because it seems like early on, there were lots and lots of different models people were trying around open source. Now the market seems to be settling out a little bit.</p>
<p>One common approach is a sort of aquarium model, like Google or MySQL, where it&#8217;s being written by people who all work for a company, and they put the source code out there for all to see. There may be a patch from outside the company here or there, but most of the development is being done by people who all work for the same company.</p>
<p>On the other hand, there&#8217;s the example of the Red Hat model, where there&#8217;s a pretty vibrant community and the product draws code from a lot of different places. They do a lot of value-added stuff around it, such as providing an enterprise-grade version, fast patching, training, and consulting services.</p>
<p>These two seem to be the predominant models at the moment. Do you see it in a similar way, or do you see other ways companies are structuring around open source?</p>
<p><b>Amr:</b> I completely agree. The Apache Software Foundation follows the latter model. In fact, they only allow projects to be “official projects” within Apache if the committers in the project are diversified over a number of companies.</p>
<p>If just one company controls the full code base, Apache will not let it become a core project, which I think is a very healthy approach. It does lead to certain logistics and coordination problems, but it allows the software to evolve much more quickly.</p>
<p>It also has sort of a Darwinian advantage, in the sense that when a number of companies are all working on a problem and there is more than one solution, the community can just take the best one. That said, you do lose control a bit, which might slow things down, even though it helps foster innovation, so it is certainly a balancing act.</p>
<p><a name="license"></a></p>
<p><b>Scott:</b> Maybe a year ago, we interviewed Justin Erenkrantz of the Apache Foundation, and that was one of the things we discussed&#8211;that requirement for a project to have maintainers from at least three different companies to make it out of the incubator and become an official Apache project.</p>
<p>In another area, the Apache license is very different from the GPL, for example. How does that fit into the strategy of Yahoo! or Cloudera? Do you think the choice of the Apache license over the GPL affects the evolution of Hadoop?</p>
<p><b>Amr:</b> It does play a very big role. As I said earlier, the GPL license has a viral aspect in terms of distribution. Whenever you redistribute the code, you are giving up some rights back to the owners of the code. On the other hand, the Apache license doesn&#8217;t have that viral nature to it, which makes it a much friendlier license, in terms of adoption.</p>
<p>We have to work harder to make our distribution, training, and maintenance services really stand out, and to provide additional value on top of the vanilla Apache Hadoop codebase. After all, customers can just go and get the code directly from Apache, without involving us at all. </p>
<p><b>Scott:</b> There are obviously differing viewpoints on licensing, with some people who really like the GPL for what it does, and some people who really like the Apache license for what it doesn&#8217;t do, or what it doesn&#8217;t make you do. People argue about which is really the most free.</p>
<p><b>Amr:</b> That&#8217;s a good point, and of course, I have my own opinions about it. Our CEO, Mike Olson, led the open source revolution from the beginning. Mike was the CEO of Sleepycat, one of the first open source companies, which commercialized BerkeleyDB.</p>
<p><b>Scott:</b> We are running out of time, so I don&#8217;t want to miss the chance to ask you whether there&#8217;s anything you&#8217;d like to bring up that we haven&#8217;t discussed yet.</p>
<p><b>Amr:</b> Hadoop has been around for a few years, but it’s still very young, relatively speaking. There is still a lot of maturing for it to do, and that&#8217;s what we are spending most of our energy on right now. We, with the Apache Hadoop community, are doing some great stuff in areas like performance, usability, availability, security, auditing, interoperability, and so on.</p>
<p>We are also building new data system products that augment the Hadoop platform. For example, a big problem that many companies have today is how to reliably collect data from across hundreds or thousands of servers, and bring it into a central data repository. That&#8217;s a problem that we are very excited about.</p>
<p><b>Scott:</b> Cool. We&#8217;ll keep paying attention to what comes next. Thanks for taking the time to talk today.</p>
<p><b>Amr:</b> Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2010/01/06/interview-with-amr-awadallah-cloudera-cto/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Interview with James Vasile &#8211; Software Freedom Law Center</title>
		<link>http://howsoftwareisbuilt.com/2009/12/15/interview-with-james-vasile-software-freedom-law-center/</link>
		<comments>http://howsoftwareisbuilt.com/2009/12/15/interview-with-james-vasile-software-freedom-law-center/#comments</comments>
		<pubDate>Tue, 15 Dec 2009 22:48:29 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=266</guid>
		<description><![CDATA[We had the chance to interview James Vasile, an attorney with the Software Freedom Law Center (SFLC).&#160; Now people often think of the SFLC as the group that &#8220;goes after&#8221; people who violate the GPL, and that&#8217;s certainly the aspect that the press focuses on, but James points out that enforcement is a very small [...]]]></description>
			<content:encoded><![CDATA[<p>
    We had the chance to interview James Vasile, an attorney with the Software Freedom Law Center (SFLC).&nbsp; Now people often think of the SFLC as the group that &#8220;goes after&#8221; people who violate the GPL, and that&#8217;s certainly the aspect that the press focuses on, but James points out that enforcement is a very small percentage of what they do.&nbsp; More interesting were James&#8217;s insights into how businesses learn that their secret sauce has very little to do with locking up the code.  And if you&#8217;re looking for more, James maintains his own blog at <a href="http://hackervisions.org">hackervisions.org</a>.</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2009/12/15/interview-with-james-vasile-software-freedom-law-center/#role">The role of the Software Freedom Law Center in the world of free software</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/12/15/interview-with-james-vasile-software-freedom-law-center/#foster">Fostering voluntary compliance with the GPL</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/12/15/interview-with-james-vasile-software-freedom-law-center/#handle">Handling violations without putting people on the defensive</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/12/15/interview-with-james-vasile-software-freedom-law-center/#community">How communities protect themselves with self-regulation</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/12/15/interview-with-james-vasile-software-freedom-law-center/#negotiating">Negotiating the urge for proprietary &#8220;secret sauce&#8221; in the marketplace</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/12/15/interview-with-james-vasile-software-freedom-law-center/#niche">Niche markets where open source has not had much impact</a></li>
</ul>
<p><span id="more-266"></span></p>
<p><b>Scott Swigart:</b> Could you take a minute to introduce yourself and tell us a little bit about what you do, as it relates to open source and free software?</p>
<p><a name="role"></a></p>
<p><b>James Vasile:</b> Sure. Primarily, I work for the Software Freedom Law Center, doing legal work for people that make and support free software.</p>
<p>My client list includes the GPLed content management systems, like Joomla and Drupal. I do work for WordPress as well, and I also end up doing work with organizations like GNOME, Samba, and KDE.</p>
<p>Just about any large free software project is going to come through the Law Center at some point. We try to give them whatever help we can, and that help varies quite a bit from client to client.</p>
<p>The legal challenges that free software projects face are sometimes unique, and sometimes mundane. The unique challenges are often about making sure that these projects can continue to operate as free software projects, in a world that is occasionally hostile to free and open source software.</p>
<p>More mundane legal issues also arise, because it turns out that running a medium or large free software project is an awful lot like running a small business. The main difference is that nobody gets paid. But you have a lot of the same legal issues trying to get out there and distribute your software all over the world and manage the wealth of resources that it takes to make world class software.</p>
<p>The other side of my work is that I help free software projects with managerial stuff, trying to help them make decisions that put them on the best footing for managing their project, creating new strategies, and getting the word out.</p>
<p>It turns out that we at the Law Center talk to more people in the free software world than almost anyone else. That gives us a pretty good point of view to tell them how a given path has or has not worked well in the past.</p>
<p>We offer that experience to people not as a legal service, but just as an ear to listen, and we offer advice where we can. In that capacity, I sit on the board of Open Source Matters and on the advisory board of GNOME. We give a lot of off-the-cuff advice to software projects.</p>
<p><b>Scott:</b> We&#8217;ve talked to a lot of people about the different ways of building open source software, whether it&#8217;s corporate built or community built. We have also had a lot of conversations about choosing between different licenses, such as deciding between a GPL license versus an Apache license, or something like that.</p>
<p>One aspect of how I view your organization is as an entity involved with enforcing the GPL. Is that an accurate characterization? And what are some of the issues that people bring to you, and how do you help resolve those issues?</p>
<p><a name="foster"></a></p>
<p><b>James:</b> We do a lot of the GPL enforcement, at least in America, but that&#8217;s definitely not the majority of what we do. Even though we do a certain amount of enforcement, most notably for BusyBox and Samba, I wouldn&#8217;t say that we think of ourselves as primarily GPL enforcers.</p>
<p>More often, we&#8217;re trying to help people to find reasons to comply. I think the GPLed CMS world is a really good example of that, because there, as in most of the rest of the GPLed world and the free software world, enforcement is a non-starter. I mean, the vast majority of licenses are respected, because people voluntarily respect them.</p>
<p>Most people know that if they violate a license, the chances of them actually getting rapped on the knuckles for it, or even called out on it, are pretty low. People comply to avoid the risk of the embarrassment at having to retract some code, but more importantly, there are social norms at play that help them decide this is the right thing to do. There are structures that incentivize compliance that have nothing to do with lawsuits or lawyers.</p>
<p>For example, we just had a really good discussion about this in the WordPress community. It was recently decided that WordPress themes have to comply with the GPL.</p>
<p>Themes are extensions to WordPress that include everything from images, to CSS files, to JavaScript, to PHP code. Some of these components are reached by the GPL, which is why themes, to some degree, have to comply with the GPL.</p>
<p>I won&#8217;t say that this was a new development in the WordPress community, but there were a lot of theme distributors and theme publishers in the WordPress community that were not compliant with the GPL. This is a shift in strategy for WordPress, in the sense that they&#8217;ve decided they want to encourage people to comply.</p>
<p>Obviously, this can be controversial when people are making a living selling themes, but most of the people who decided to come under compliance were fully aware that WordPress was never going to launch a bunch of lawsuits trying to get people to comply. Rather, the reason to comply was to get along well with the rest of the community, rather than thumbing their nose at the project that is producing their livelihood.</p>
<p>More to the point, there are reputational benefits to complying with the GPL in some of these communities. If you work with the project, as opposed to against it, the project is happy to link to you, for example, include you in their extensions directory, their theme directory, and that kind of thing.</p>
<p>There are some practical benefits to complying, but for the most part, it has nothing to do with a legal analysis. Our work in GPL enforcement is not really about trying to whack people upside the head. Sometimes we do that, but for the most part, that&#8217;s the option of last resort.</p>
<p><b>Scott:</b> A lot of times, projects grow up kind of isolated, and then they might reach critical mass, like WordPress or Drupal. On the way to becoming practically household words in the developer and IT community, I imagine that they go through a learning process as they decide what kind of a cultural identity they will have.</p>
<p>I could see it being a fairly seismic shift in something like the WordPress community for them to decide that enforcement is important to them. That seems to fit with the notion that the goal is not to file a bunch of lawsuits, but rather to create the conditions where complying with the GPL is the logical thing to do.</p>
<p>As you&#8217;ve worked with different projects, have you observed a set of conditions that need to be present for compliance with the GPL to become self-enforced, in a manner of speaking?</p>
<p><b>James:</b> It&#8217;s actually extremely varied. Every project has its own culture associated with it and its own values, even. The things that different projects think are important vary hugely, and so there is no mental checklist.</p>
<p>It really just comes down to looking at a project and the structures and values they have in place, as well as their history. Based on that, one can make a case to people that the license they&#8217;re using should be respected for the good of the community.</p>
<p>We point out the ways in which the community benefits as a whole and the way the individuals benefit within that community. Eventually people start talking to each other about how much they care about licensing.</p>
<p>A really good example of that is in the Joomla world, because Joomla started out as a project called Mambo, and Mambo was released by a software company in Australia under the GPL. A bunch of people flocked to the product and started making changes to it. The company said, &#8220;Thank you very much for the changes, but now we&#8217;re going to close the code and release it under proprietary licenses, and basically take this code private.&#8221;</p>
<p>It was the GPL that protected them from having this happen, protected their investment in the code, and kept the code free. Even then, though, many people within the Mambo community did not think the GPL was that important, and they did not have a community norm of respecting the GPL. </p>
<p>They sort of treated the code base as quasi-open for most people and then occasionally closed when they needed it to be closed. Certainly, they weren&#8217;t requiring third-party extension developers to comply with the GPL.</p>
<p>The shift from that world to the current world where the license is a big part of every conversation about what to do with the project has been one of pointing out to them how much the protections of the GPL mattered in the past. That becomes a reason for maintaining those protections in the future.</p>
<p>But obviously with a project like WordPress, which has never had a crisis of who can access the code base and who can make derivative works of it, that sort of argument isn&#8217;t going to be available. So every project, having its own history and culture, is going to have its own reasons for caring about and wanting to respect the GPL.</p>
<p><a name="handle"></a></p>
<p><b>Scott:</b> I would imagine that it can be an interesting challenge if someone starts out small with a great theme, starts welding more functionality into it, and eventually starts to sell a significant amount. In a case like that, how do you go about making that initial communication?</p>
<p>I picture it being somewhat like music downloads. In that case, of course, the industry decided they needed to bring it under control, so they picked a bunch of people and said, &#8220;We&#8217;re going to go after you,&#8221; and then they created a culture. You are fortunate that you don&#8217;t need to take an aggressive approach like that, but how do you educate people who may not even really suspect that they are violating a license?</p>
<p><b>James:</b> The initial contact can be very difficult, because we need to convey to people that they&#8217;re probably violating a license. That rings alarm bells for anybody, especially someone running a small business.</p>
<p>We want to communicate that to them, but in a non-threatening manner. The challenge is to reach them without making them sort of shut down, call their lawyers, and never talk to you again. We want to engage them to talk about ways they can come in from out of the cold and come into compliance. </p>
<p>Having that conversation with people and letting them know, &#8220;I&#8217;m not here to knock you about, I&#8217;m not here to extract money from you, and I&#8217;m not here to change anything you&#8217;re doing except for this one tiny thing of you agreeing to freely license your product,&#8221; is very difficult, because people have a great deal of fear about what is going to happen if they comply.</p>
<p><b>Scott:</b> Sure. The small business owner probably hasn&#8217;t ever talked to a lawyer except when they originally set up their articles of incorporation, if they even have done that. How do you typically play it out from there?</p>
<p><b>James:</b> Sometimes we do it well, and I&#8217;m the first to admit that sometimes that interaction doesn&#8217;t necessarily go well. Usually by the time a case gets to me, the project has already contacted this person, so the discussion has happened outside of my view and certainly outside of my control.</p>
<p>Even though we never know what that is going to look like, obviously there are best practices here. It typically works best to send an initial polite email saying, &#8220;It looks like you might be violating. We should have a discussion about that. We&#8217;re not here to ask you for money, we&#8217;re not here to give you bad publicity, and we&#8217;re not here to do anything but discuss ways that we can convince you to come into compliance.&#8221;</p>
<p>I think that phrasing of &#8220;We would like to have the opportunity to convince you,&#8221; is probably the most important aspect, because it can help people realize that force isn&#8217;t really what&#8217;s on the table. After all, WordPress isn&#8217;t going to waste its precious resources suing a bunch of theme designers.</p>
<p><b>Scott:</b> What&#8217;s it like for you to educate the projects themselves on how to engage? I imagine that a project owner who has invested their heart and soul in building an open source project might be less than diplomatic in approaching someone they feel has violated the license. How do you help them be constructive?</p>
<p>Conversely, how do you educate something like a three-man design team that it&#8217;s in their best interest to rewire how they&#8217;re going to market? How do you educate them about the fact that doing so will ultimately benefit them? </p>
<p><b>James:</b> When projects make the first contact, the reason it can be a little bit dicey is that sometimes the people involved truly feel wronged. When they contact the people who seem to be violating the license, some of that unhappiness often shows through as impatience and negativity. </p>
<p>That&#8217;s probably not the most auspicious beginning, especially since the theme designer probably doesn&#8217;t agree with that position. From the very first sentence of the email, they tend to have diverging viewpoints, and the conversation can go awry.</p>
<p>Projects get better at it just by doing it more often, either in tandem with my office or just by thinking about it a lot more. They can generally benefit from consulting with us before they make the effort and running the emails past us for the first couple times they do it.</p>
<p>Honestly, though, once they&#8217;ve done it a couple of times, anyone who regularly does compliance work gets into a rhythm about what the techniques are for getting people to actually talk to you as opposed to shutting you out.</p>
<p>It really is just a question of experience and reminding yourself that the person on the other end of that email has a very different point of view than you do. That&#8217;s how we address the initial contact, but what we actually say to people to convince them to comply in the theme context has a lot to do with introducing them to theme designers that are complying and making money.</p>
<p>It turns out that the way people find themes is not very GPL-aware. In the same way that designers don&#8217;t care much about the GPL, customers don&#8217;t care much about the GPL. When people grab a theme off the Internet, whether it&#8217;s GPLed or not, it&#8217;s probably available from both its authoritative official source and some FTP server or BitTorrent site somewhere. </p>
<p>And the GPL is not going to change that fact. In the same way that Joomla is not going to run around wasting money suing theme designers, theme designers aren&#8217;t going to run around wasting money suing FTP sites.</p>
<p>That means that the legal rights they are giving up by GPLing their themes are rights that they were probably never going to exercise in the first place. It turns out that the protections that theme designers hold onto are not worth a whole lot, and giving them up does not cost them anything. GPL compliance is a zero cost proposition in 99.9% of cases.</p>
<p>In the theme and extension world, that makes the decision a lot easier, and it helps people realize that they can comply with the community standards without it costing them a dime. Most people truly want to comply, and they don&#8217;t want to be in violation of the standards of the community in which they operate.</p>
<p><a name="community"></a></p>
<p><b>Sean Campbell:</b> In view of the fact that most people want to follow the rules, what do you think the projects themselves can do to facilitate that? What should they be putting on their own sites to help educate people before they download source and get started building with it?</p>
<p>Do you think they should be doing more educating?</p>
<p><b>James:</b> I think that, in the same way that GPL compliance is less about the license and more about the community, open source is less about the openness of the source and more about the community.</p>
<p>The openness of the source code creates great communities, and I think when you&#8217;re talking to potential clients, customers, downloaders, and users, the thing to lead with is not the fact that they can look at the source code, or that there is a license that they probably don&#8217;t care much about anyway.</p>
<p>Projects should explain to them all the benefits of the community, and why they would want to download software that has a community behind it as opposed to just a company behind it.</p>
<p>They should also talk about what it means to join that community. They should explain that extending the software or building on top of it happens in this community context, where there are certain norms.</p>
<p>I think that&#8217;s probably the best approach, because that gives people what they really want to know, which focuses on the social context, rather than legal obligations. It also helps them understand what is so great about the world of free software, which is that we are a collection of people who provide support in this mode for the software that we build.</p>
<p><b>Scott:</b> You said something very interesting about the idea that following the GPL is essentially a zero-cost proposition. Most people getting paid to write code are working for companies that may not have a strong participatory connection to open source, and that has been even more true looking back some number of years.</p>
<p>In companies large and small, I think there is the perception that releasing their code under the GPL could compromise their business position. They are often concerned that they would be ceding control of their product.</p>
<p>That position, of course, isn&#8217;t supported by something like MySQL. No one is going to take over MySQL, because the company building it has all of this expertise around that code base, as well as enhancing it, supporting it, maintaining it, and providing services around it. And when you get down to consumer-level products, that&#8217;s even more true. My mother isn&#8217;t going to want to go search for the source code. She&#8217;ll pay $15 for a theme because it will just work.</p>
<p>In light of that, it was fascinating to me the way you phrased it, that 99% of the time there is no cost to using the GPL. The fear people have that if they go the GPL route, they&#8217;re out of business because they&#8217;re forced to basically give products away instead of selling them isn&#8217;t really well-founded in most cases.</p>
<p>That&#8217;s a message that I don&#8217;t see out there a lot. What success stories can you point to, where people have gone the GPL route and it didn&#8217;t cost them anything, and in fact may have given them benefits they would otherwise not have had?</p>
<p><b>James:</b> I think you give some good examples. Nobody is going to buy MySQL from anybody except MySQL, at least not currently. The people within MySQL over at Sun have concentrated all the expertise in deploying MySQL and using it. It&#8217;s possible that competitors could spring up, but on top of having the expertise in deploying it, the people at MySQL and Sun have the apparatus to support it long term.</p>
<p>It turns out that the people who actually make the software have a huge competitive advantage over everybody else in monetizing that software. No reputable company is going to take the cut-rate deal and compromise on the quality when what they really want is the best support they can get for a reasonable price.</p>
<p>As the people who make the software, you are the best positioned to do other things related to it, and so you become the best source when it comes to servicing that software.</p>
<p>That&#8217;s the big reason that it&#8217;s a zero-cost proposition. At the end of the day, no matter who might want to go into competition with you using the GPL rights you give them, you are going to out-compete them every day of the week. </p>
<p>The only business they would take from you is business that you wouldn&#8217;t get anyway, because you would be pricing too high for that customer. Anyone who could afford to pay your rates would pay your rates, because they want the best that they can get, and you are the best.</p>
<p>Therefore, by definition, anyone who is making GPL-ed software and selling it is in the best possible market position to monetize that software, no matter what the business model ends up being. That&#8217;s definitely true both at the high end and at the consumer end for things like games, which are a highly proprietary market where very few companies allow free redistribution of their games.</p>
<p>It turns out that even though you don&#8217;t allow free redistribution of your games, there is free redistribution of your games. There is rampant pirating of gaming software for both console games and PC games, and despite that, the game manufacturers continue to sell lots and lots of copies of their games.</p>
<p>The reason for that is that at the low end, at the retail end of things, just like you were saying, your mother doesn&#8217;t necessarily want to troll through FTP and BitTorrent sites to try to find a theme or to download the source code for the theme and try to figure out how to install it. </p>
<p>People want one-click convenience and easy, supported deployment of software, and companies that GPL their software are able to do that just as well as companies that don&#8217;t. In fact, sometimes they can do it better.</p>
<p>Whatever the mechanism by which you are making money and being &#8220;the source&#8221; for this software, none of that changes when you go GPL. We used to say that business models might have to be adjusted for people that sell GPL software or distribute GPL software, but it turns out that that&#8217;s not really the case. In most cases, you make money off your proprietary software in the exact same ways you make money off your GPL software, and we&#8217;ve seen that time and time again.</p>
<p>I think you&#8217;re right that we need to start putting up more examples of that and start showing people the examples of companies that have been successfully GPLing software and people that are successfully servicing GPL software that they make. </p>
<p>That&#8217;s probably the biggest thing we could do to alleviate the fear that people have, which is that once they GPL it, the door is open and they can never go back, and all their profits will disappear.</p>
<p><a name="negotiating"></a></p>
<p><b>Scott:</b> What is the corresponding message in the most extreme cases, where somebody feels like they have an algorithm that really is their competitive differentiator? I&#8217;m thinking of something like a gaming engine, a compression algorithm, or a market-analysis algorithm where they know they&#8217;re doing things that nobody else is doing.</p>
<p>I guess, what&#8217;s your message to those companies that are convinced&#8211;perhaps correctly&#8211;that they&#8217;ve got a relatively small amount of code that gives them a huge competitive differentiation?</p>
<p><b>James:</b> First, in most cases they&#8217;re probably wrong, because most of the time, people who cook their own algorithms are not going to beat the combined wisdom of everyone else trying to solve the same problem cooperatively. </p>
<p>That said, it might be true that in some cases, a private algorithm could beat whatever the current state of the art is. Still, they&#8217;re probably not going to beat it for long, and moreover if what you&#8217;ve done is truly novel, there are other mechanisms for protecting it, such as patents and whatnot, that come into play. </p>
<p>Most of those people are probably mistaken about their unique competitive advantage, and a lot of what&#8217;s left would probably benefit from having everybody work to produce a better model or algorithm, to combine the advantages of many different services. </p>
<p>If you have five different companies, each with their own private competitive advantage, imagine what they could do if they got together and gave everybody the benefits of all their advantages in one package.</p>
<p>I think that there are still a lot of reasons to go the free software route, even at that point. At the same time, your expertise matters, and your awesome algorithm is best used by you. Even if you free it and let other people improve upon it, you are in the best position to use those improvements. You still have the competitive advantage there and it&#8217;s quite a large one. </p>
<p>Imagine if Google were to release the source code behind Gmail. Nobody else on the planet would be able to deploy Gmail, because none of them have access to the technology that Google has, meaning the entire stack that is so uniquely developed for them and their unique purposes. </p>
<p>I&#8217;m sure Gmail connects to tons and tons of applications and pieces of infrastructure that don&#8217;t exist anywhere but within Google, so even if Google were to hand me the source code to Gmail tomorrow, it would be almost no good to me at all.</p>
<p><b>Scott:</b> We see that sort of effect with the company I believe is called Acquia that&#8217;s wrapped around Drupal. Likewise, Cloudera is wrapped around Hadoop. They would have to do really badly by the project for a long time before somebody could fork it and get the community of users to move over to them. </p>
<p>It sort of happened with Netscape and Firefox. You can see examples, and you mentioned Mambo becoming Joomla.</p>
<p><b>James:</b> Even in the case of Mambo and Joomla, when the company tried to close source Mambo, that wasn&#8217;t even enough to cause a fork. It was a couple years later that the fork happened, after the community tried very hard for a long time to work with this company. And the company tried hard to work with the community as well, and finally it wasn&#8217;t working to anyone&#8217;s satisfaction, so there was a fork. </p>
<p>Experience shows us that every time people are in a situation where forking might be good, they choose not to do it, because it&#8217;s tremendously costly to try to split a community in half and figure out who&#8217;s going to do what.</p>
<p><a name="niche"></a></p>
<p><b>Scott:</b> You mentioned games as being an area that&#8217;s very closed and proprietary. Search is another of those areas, as are financial analysis, oil and gas, seismic analysis, and others.</p>
<p>Are you seeing any areas where a credible GPL-licensed option is starting to emerge, with a lot of community effort around it, that may be a game changer a year or two or three from now in some of these areas that have traditionally been dominated by closed source?</p>
<p><b>James:</b> I wish I could say that I did see a lot of these imminent game changes attracting large communities, but unfortunately I haven&#8217;t. The thing about free software currently is that it usually starts with somebody scratching an itch or somebody trying to solve a fairly widespread problem.</p>
<p>In these niche markets where there isn&#8217;t a lot of free software adherence already, there aren&#8217;t a lot of coders sitting around who are directly going to benefit from a problem being solved in that area, and it breaks down pretty quickly. One place we see that quite clearly is in the small business application market.</p>
<p>Think about medical billing, for example. Medical offices spend tens of thousands of dollars on software that is by any measure extremely rudimentary, and all it does is help them manage appointments and then submit invoices to insurance companies.</p>
<p>That software is ripe for being knocked off by a free software project, and yet I&#8217;m not aware of anybody that has cobbled together more than one or two programmers committed to working on it, giving it any significant amount of attention or time.</p>
<p>It turns out that the free software community has been relatively bad at supporting these niche areas. There are business models that would allow easy entry into these areas and make it possible for free software developers to make the software and make a good living supporting it, and yet still nobody is taking that opportunity. </p>
<p>That failure is one of the things that I&#8217;m curious about, and one that I&#8217;m seeking solutions to over time.</p>
<p><b>Scott:</b> This has been a great conversation. One of the things we usually do at the end is to ask whether there&#8217;s anything interesting that we didn&#8217;t ask about.</p>
<p><b>James:</b> I think the most interesting thing about the work that I do is really the community-building aspect of it. I touched on that earlier when I said that it&#8217;s not about the legal structures or the legal norms, but rather about the social and community norms.</p>
<p>That&#8217;s true across the board, and one of the things that I see on project after project is people confusing the value of what they do as license-based as opposed to community-based.</p>
<p>You see people with religious allegiances to certain licenses or even people denigrating other licenses or worrying about who&#8217;s using what license in other projects and things like that. What they miss is that the value of the code they&#8217;re writing isn&#8217;t just based on what license it&#8217;s under.</p>
<p><b>Scott:</b> Thank you very much for taking the time to chat with us.</p>
<p><b>James:</b> Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2009/12/15/interview-with-james-vasile-software-freedom-law-center/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Interview with Louis Landry &#8211; Joomla Development Lead</title>
		<link>http://howsoftwareisbuilt.com/2009/12/01/interview-with-louis-landry-joomla-development-lead/</link>
		<comments>http://howsoftwareisbuilt.com/2009/12/01/interview-with-louis-landry-joomla-development-lead/#comments</comments>
		<pubDate>Tue, 01 Dec 2009 22:40:45 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=261</guid>
		<description><![CDATA[If you look at any list of open source content management systems, you&#39;ll find Joomla right at the top.&#160; In this interview, we talk with Louis Landry, who started as the principal architect of Joomla 1.5 and the Joomla Framework and today acts as the development coordinator.&#160; If you&#39;re interested in where Joomla&#39;s at, and where [...]]]></description>
			<content:encoded><![CDATA[<p>If you look at any list of open source content management systems, you&#39;ll find Joomla right at the top. In this interview, we talk with Louis Landry, who started as the principal architect of Joomla 1.5 and the Joomla Framework and today acts as the development coordinator. If you&#39;re interested in where Joomla&#39;s at, and where it&#39;s going, read on&#8230;</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2009/12/01/interview-with-louis-landry-joomla-development-lead/#joomla">Joomla and its relationship to website owners</a> </li>
<li><a href="http://howsoftwareisbuilt.com/2009/12/01/interview-with-louis-landry-joomla-development-lead/#stats">The difficulty of creating success metrics for free software</a>
    </li>
<li><a href="http://howsoftwareisbuilt.com/2009/12/01/interview-with-louis-landry-joomla-development-lead/#relation">The relationship between the core Joomla environment and extensions</a>
    </li>
<li><a href="http://howsoftwareisbuilt.com/2009/12/01/interview-with-louis-landry-joomla-development-lead/#cloud">Cloud computing and platform independence</a> </li>
<li><a href="http://howsoftwareisbuilt.com/2009/12/01/interview-with-louis-landry-joomla-development-lead/#transition">Transitioning the web to new platforms and the semantic web</a>
    </li>
<li><a href="http://howsoftwareisbuilt.com/2009/12/01/interview-with-louis-landry-joomla-development-lead/#future">The future of rich applications in a web browser</a> </li>
</ul>
<p><span id="more-261"></span></p>
<p><b>Scott Swigart:</b> To start, could you introduce yourself and tell us a bit about your relationship to Joomla?</p>
<p><b>Louis Landry:</b> Sure. I first used the predecessor to Joomla, Mambo, in 2001 or 2002. I was a computer science student at the time, and I was building a website for my friend. I sort of fell in love with the interface, and honestly, I was shocked to see some of the things that they had pulled off on the web.</p>
<p>Over time, I continued to build more and more sites and get more and more involved. At the time that Joomla became Joomla, I was in contact with several of the key members who started a fork to do some development patches and things into the core.</p>
<p>After I was asked to join the core team, I eventually served as project manager, communications team leader, foundation working group coordinator, and development coordinator, and I worked on pretty much every aspect of the project. I am now development coordinator, working on future research and development, infrastructure, and things like that.</p>
<p><b>Scott:</b> Talk a little bit about Joomla itself, the strengths of its architecture, and so on.</p>
<p><a name="joomla"></a></p>
<p><b>Louis:</b> Joomla&#8217;s an open source content management system, designed to help you build and maintain your web presence. It doesn&#8217;t take the same approach as some of the older, larger-scaled content managers like SharePoint do.</p>
<p>It&#8217;s not designed to be a full-on enterprise thing. That doesn&#8217;t mean it can&#8217;t be used in enterprises, but it&#8217;s more of a presentation and content organization system than some of the more document-centric systems. Our main target market has been websites for small businesses and other small organizations.</p>
<p>The last reasonable numbers we saw showed maybe one or two million websites out there running Joomla. Most of those are individuals or small organizations putting up sites to share a voice related to some things they care about.</p>
<p>Joomla itself is really a platform with something like 4,000 extensions. It&#8217;s much more of a product-based system than other applications like Drupal, where all of the extensions tend to be much more tightly integrated with the core system. With Joomla, you find a lot more third-party developers building product-based things. It&#8217;s much more packaged up, and you just sort of upload a Zip file and everything installs itself.</p>
<p><b>Scott:</b> How is the Joomla organization structured? How do you make money, or do you?</p>
<p><b>Louis:</b> The non-profit organization that holds assets for the Joomla project is called Open Source Matters. It&#8217;s incorporated in New York, and it holds the trademarks, copyrights, and those sorts of things. Most of its income comes from Google Ads, I believe.</p>
<p>It&#8217;s not really a money-making corporation. I am an owner in a company called JXtended, which sells third-party components, extensions to Joomla, and that sort of pays my bills, along with consulting work that I do around Joomla. That tends to be the trend with people around the Joomla project, if they&#8217;re making money here.</p>
<p><b>Scott:</b> That makes sense. I would imagine that a lot of people are looking for certain functionality, and they may not even know that Joomla&#8217;s the solution. Then a consultant can come in and relatively quickly have a site up that meets their needs and just so happens to be based on Joomla.</p>
<p><b>Louis:</b> That&#8217;s actually the case more often than not. People will hire a firm to put together a website for them, and they don&#8217;t really realize that Joomla is the platform on which it&#8217;s being built. A very large proportion of the people on our support forums and engaging in the Joomla project are small web design shops that are pumping out websites for different clients.</p>
<p>We&#8217;ve had more than one case of a client seeing the Joomla offline screen and thinking that we somehow hacked their site. Something went wrong with their MySQL server, and when the little screen comes up and says, &#8220;This site&#8217;s offline,&#8221; and it&#8217;s got the Joomla logo, they&#8217;ll email us and say, &#8220;Why did you hack my site?&#8221;</p>
<p>[laughter]</p>
<p>Then we have to explain to them, &#8220;No, really that&#8217;s not what happened.&#8221; So it&#8217;s funny, but it&#8217;s happened quite a few times. That&#8217;s indicative of the fact that a lot of people using Joomla don&#8217;t really think of it in those terms, since the site was put together for them, and Joomla is transparent to them.</p>
<p><b>Scott:</b> That&#8217;s an interesting aspect of that space. There are obviously a lot of people who need a web presence but are not technical.</p>
<p>You&#8217;ve also mentioned the really rich ecosystem, for lack of a better term, around Joomla: 4,000 plug-ins, consultants, developers, web design firms, and so on. One to two million sites must rank as a highly successful open source project.</p>
<p><a name="stats"></a></p>
<p><b>Louis:</b> It&#8217;s a pretty sizeable chunk. I think since July, we&#8217;ve had three million downloads of Joomla off of our sites. I&#8217;d have to double-check the numbers, but I think in the last two years, it&#8217;s in the range of 14 or 15 million. It&#8217;s been pretty astonishing to see the downloads crank up.</p>
<p>That&#8217;s also just the English version. There are plenty of versions that are distributed by our language partners or translation partners all over the world in something like 30 or 35 languages.</p>
<p><b>Scott:</b> That&#8217;s an interesting difference between success metrics with proprietary software as opposed to open source. With proprietary solutions, you can point to an exact number of licenses sold, whereas with open source projects, you often have to rely on the number of downloads. That makes it somewhat more difficult to get an accurate picture of adoption.</p>
<p><b>Louis:</b> That&#8217;s true. It&#8217;s difficult to gauge the ratio of deployed sites versus the number of download packages, so it&#8217;s really hard for us to know what the actual market share is.</p>
<p>There were some numbers at one point from a company that I believe was called Netcraft, and some people in our community have built crawlers to try to figure out market penetration. It&#8217;s been interesting to see these various numbers, and they vary widely.</p>
<p><b>Scott:</b> It kind of goes against the ethos of open source, but open source projects could easily embed something in them that by default makes the thing phone home, when it&#8217;s up and running.</p>
<p><b>Louis:</b> There&#8217;s been a lot of talk about that over the years. A few open source projects have tried it, but those efforts tend to backfire, because a lot of users don&#8217;t like it.</p>
<p>We&#8217;ve talked about the concept of anonymous usage statistics and those sorts of things several times, but it&#8217;s just never happened. I am actually not sure it would be a good idea, simply on the basis of the fact that I have no idea what sort of traffic that would generate for the server collecting the data.</p>
<p><b>Scott:</b> Right; a little Google App Engine app could just shove the statistics into a big table data store.</p>
<p><b>Louis:</b> Yeah, let Google suffer. [laughs]</p>
<p>It&#8217;s something that we&#8217;ve talked about, like having some sort of optional switch upon install, but even there you run into trouble, because one of the biggest ways Joomla is installed is via cPanel and Fantastico, which are server scripts to auto install web applications.</p>
<p>They don&#8217;t even run through the Joomla installer, so we wouldn&#8217;t see it. That&#8217;s another example of how our download statistics would be completely irrelevant, because that doesn&#8217;t get downloaded through us&#8211;it gets deployed through a completely separate system.</p>
<p><a name="relation"></a></p>
<p><b>Scott:</b> On the topic of the cPanel and Fantastico installers, talk a little bit about how Joomla updates. There&#8217;s always a tension between a really vibrant ecosystem of plug ins and the necessity of keeping the main project moving forward.</p>
<p><b>Louis:</b> That&#8217;s been a real area of balance for us as well. Anyone who follows Joomla knows that we don&#8217;t release very often, as far as feature incremental releases, and that&#8217;s not always by design. Version 1.5 came out, I think, two years after 1.0, and 1.6 is set to come out probably toward the end of this year, which is about two years or so, as well.</p>
<p>It moves rather slowly, but the wealth of functionality created through the third-party extensions allows us to do that. And because it&#8217;s almost entirely volunteer driven, and we&#8217;ve been going through growing pains since this is still a relatively new project, we&#8217;re still trying to find our footing in a lot of different areas.</p>
<p>We don&#8217;t really have an update mechanism in the core quite yet, although that is something that is being actively worked on for Version 1.6. As far as extensions are concerned, there isn&#8217;t a real updating mechanism for that either.</p>
<p>Historically, it&#8217;s been either a case of uninstall and install the new version, or else install a new version on top of it. With PHP, it&#8217;s generally as simple as FTP-ing files up, if you want to do it manually. It has not been painless, but it has also not been a huge problem for us.</p>
<p><b>Scott:</b> I imagine that the 4,000 or so plug ins are in various different states at any given time, from having been abandoned to having multiple people working on them full time, and everything in between. It&#8217;s kind of like when you upgrade WordPress, you just kind of cross your fingers and push the button, and hope all your plug ins work afterwards.</p>
<p><b>Louis:</b> [laughs] That&#8217;s the thing with the slower release cycle. We put maintenance releases out to fix bugs and things like that pretty regularly, but those updates don&#8217;t change things in such a way that any of the third-party stuff would break, really.</p>
<p>There have been a couple of cases where we&#8217;ve had to change things, but we got out in front of it pretty well, and it didn&#8217;t really cause a whole lot of issues. Still, it&#8217;s true that when you&#8217;ve got a large third-party developer community building external extensions, there&#8217;s an awful lot of inertia.</p>
<p>If we go in and change a system in the core that&#8217;s going to have a cascade effect on how everybody uses the API, or the systems, or something like that, that&#8217;s a lot of ramp up. What we saw with our huge changes in 1.0 to 1.5 was that it took a while for the third-party development community to catch up.</p>
<p>In a lot of cases, they did abandon things, and other people picked it up. That&#8217;s part of the beauty of the open source community, right? Somebody says &#8220;Oh, well I&#8217;m not going to bother with this,&#8221; and somebody else comes behind them and says &#8220;Oh, I&#8217;ll do it.&#8221;</p>
<p><b>Scott:</b> Or, in some cases, there are just a lot of options. When something gets abandoned, there may be four other projects that do something pretty similar.</p>
<p><b>Louis:</b> That&#8217;s certainly true. For example, in just the category on our extensions directory for adding comments to articles, there are pages of different options. Some are commercial, some are not, they have different features, and so on. The richness of choice is really great.</p>
<p><b>Scott:</b> That brings up an interesting point. I&#8217;ve talked to people in corporate environments who are considering using open source software about their methodology for evaluating the safety of a certain project.</p>
<p>They will look at all those stats to see if it&#8217;s being actively maintained. They will see if the contributions are just coming from a few people, or from a lot of people. They often have a specific set of criteria they use to judge whether they want to make a bet on it.</p>
<p>Do you have any advice for people as they look at those several pages of options for similar functionality? Obviously, the one that shows up at the top of the list isn&#8217;t always the safest choice.</p>
<p><b>Louis:</b> That&#8217;s a great note. I find that this is one of the areas where the commercial side of open source really flourishes.</p>
<p>If I were going out to build somebody&#8217;s website, and I was presented with a commercial choice and a non-commercial one where they both met my needs, I would probably choose the commercial one and pay for it. That would give me higher confidence that somebody would be backing it.</p>
<p>That&#8217;s not to say that I don&#8217;t appreciate non-commercial open source software, because obviously I do. Still, in a mission-critical situation, that&#8217;s likely the path I would take. That said, I also tend to look at how active the people are who are running the project.</p>
<p>If I&#8217;m looking at those same two options, and the commercial people are nowhere to be found but the people running the non-commercial project are very active, then that tells me a lot. I feel better about their commitment to the platform, as well as their knowledge of how everything works.</p>
<p>I know that they&#8217;re more able to integrate certain things. They&#8217;d probably be better suited to finding answers to problems, if I ran across them, and that sort of thing.</p>
<p>I think there&#8217;s a balance to be struck there. I really like to be able to know that whoever I&#8217;m getting my software from is active in the open-source community.</p>
<p><a name="cloud"></a></p>
<p><b>Scott:</b> What&#8217;s the Windows story for Joomla? Is it common to run it on Windows? Is it one of those situations where some people do development on Windows but they pretty much always deploy on Linux?</p>
<p><b>Louis:</b> I think that since the majority of Internet servers are Linux, it&#8217;s probably safe to say that the majority of Joomla installs are on Linux. And since the majority of end users use Windows, it&#8217;s probably safe to say that most people who test it and build up sites and all those sorts of things are doing it on Windows.</p>
<p>There are certainly people using Joomla on Windows servers. We officially support Apache, and we don&#8217;t support IIS. We know it generally works on IIS, but there have been some issues. We don&#8217;t have access to those servers and the software on a reliable basis to catch all of the problems that come up.</p>
<p>We&#8217;re actually in talks with Microsoft now to figure out ways in which we can advance our Windows support and make sure that we have the tools that we need so that everybody can more officially assert that Joomla works in a Windows environment.</p>
<p><b>Scott:</b> What&#8217;s in that for you? Do you feel like that will open up a significant new population to becoming Joomla users?</p>
<p><b>Louis:</b> I&#8217;m sure it does, even though on a personal level, I don&#8217;t really care that much because I don&#8217;t use Windows. I&#8217;m not one of those hate Microsoft guys, but I just use Mac OS and Ubuntu.</p>
<p>Still, our mission is to provide a flexible platform for digital publishing online. And what matters there is that I would like it to be available for as many platforms, OSs, and databases as possible.</p>
<p>Whatever we can do to make it easier for people to deploy Joomla, within reason, is something that we will want to pursue.</p>
<p><b>Scott:</b> Cloud computing is obviously very fashionable at the moment. What interest have you seen around running Joomla on a cloud provider? Is there any work or any thinking about enhancing Joomla to take advantage of some of the attributes of the cloud, like being able to dynamically scale up and scale down and that kind of stuff?</p>
<p><b>Louis:</b> There are a lot of people in the Joomla sphere looking into playing with the cloud, and there are a few companies I know of in the Joomla community who are actively building things on the cloud and trying to provide integrations there.</p>
<p>The EC2 thing at Amazon is interesting to me, and I&#8217;m also very interested in the Eucalyptus project that Ubuntu is starting with, I think, this next version of Ubuntu server. I&#8217;ve been talking with some friends in the FreeBSD project about cloud solutions and what a path might be, as well as a bunch of other places.</p>
<p>I think one of the things that I will focus my research on for the next generation of Joomla will be ways we can restructure Joomla from an architectural standpoint to better live in that space and be able to really reap the benefits of the cloud.</p>
<p><b>Scott:</b> You mentioned Eucalyptus. I&#8217;ve heard that name, but I don&#8217;t know much about it. Can you talk about it?</p>
<p><b>Louis:</b> I don&#8217;t know an awful lot of the technical details behind it, but Eucalyptus is an open source project that Ubuntu is adopting as sort of their cloud platform. It&#8217;s got a bunch of functions, such as the ability to do node requisitioning and various things, and I believe that it&#8217;s the software that would tie the servers in the cloud together.</p>
<p><b>Scott:</b> What I&#8217;ve heard, if it&#8217;s the same thing, is that it also exposes an API that&#8217;s compatible with Amazon&#8217;s API. So even though you&#8217;re not running on Amazon&#8217;s infrastructure, I could write code that could spin up and down instances over in Amazon, but I could point that same code at something Eucalyptus-based and it would do effectively the same thing, even though the whole underlying virtual infrastructure is completely different.</p>
<p><b>Louis:</b> That&#8217;s certainly an aspect of it, and that&#8217;s one of the reasons why it was really interesting to me. I also am really excited about the stuff I&#8217;m hearing coming out of the Ubuntu server crowd.</p>
<p><b>Scott:</b> Talk a little bit about that, because it&#8217;s not something I&#8217;m aware of. In the hosting space, I think of CentOS as being really popular.</p>
<p><b>Louis:</b> It certainly is. CentOS and Red Hat are probably the standard bearers. Ubuntu&#8217;s appeal really stems from being what I chose for my desktop OS. I like the way the file system and the configuration stuff are structured, I like the toolkits, and I like a lot of the things that Ubuntu is doing across that space.</p>
<p>The fact that they&#8217;re starting to focus on the server is interesting to me, and I really like that they&#8217;re thinking about cloud and moving in that direction, because it looks like it&#8217;s one of the focuses that that whole community is taking with the server distribution.</p>
<p><a name="transition"></a></p>
<p><b>Scott:</b> One aspect of that space that really stands out to me, and that I can&#8217;t really figure out, is that one of the earliest cloud providers was Google App Engine, and when it came out, there was a lot of excitement around it. And then people were kind of turned off because they didn&#8217;t want to be limited to Python, and then they added support for Java.</p>
<p>It seems like the most obvious thing for Google to support would be PHP, and the second they came out with Java support, some guy ran off and started a project that I can&#8217;t think of the name of at the moment, but it lets you run PHP on top of Java.</p>
<p>So this guy said, &#8220;I&#8217;m going to port WordPress to run on top of PHP, on top of Java, on top of Google App Engine, using their BigTable, as kind of a science experiment.&#8221; He ran off and did it, and it works, but obviously, it&#8217;s probably not the stack you really want to build on with something real.</p>
<p>When I look at that, I think that if they supported PHP relatively quickly, a lot of PHP projects&#8211;WordPress and maybe Joomla and Drupal&#8211;would probably be supporting some version that could use a BigTable data store and run on Google App Engine.</p>
<p><b>Louis:</b> I&#8217;m actually not certain that that&#8217;s Google&#8217;s vision of what App Engine is about. They&#8217;re certainly not a PHP shop in any way. I&#8217;ve got several friends that work at Google, and I&#8217;ve never heard of them using PHP in anything, unlike Yahoo!, which is a &#8220;PHP shop.&#8221; They do a lot of PHP stuff.</p>
<p>Google is a Python shop. Yahoo! hired Rasmus, Google hired the Python guys, and that&#8217;s the direction they took. I think when they launched it, as is typical for Google, it was closed. You had to be invited to be able to get into the App Engine experience.</p>
<p>I don&#8217;t think they were certain about their ability to scale it out if adoption got too high, and the ubiquity of PHP and Drupal and Joomla and WordPress might actually have been a negative for them. I&#8217;m not certain that signing on to host all those apps would really be to their benefit.</p>
<p>I would almost think that part of their rationale there was wanting to raise awareness around Python and maybe get some clever people working on Python to see who those people are. Then they could collaborate with them, or hire them, or at least get a better feel for what&#8217;s going on in the Python community and expand upon it.</p>
<p><b>Scott:</b> To turn a bit toward the future, you mentioned that cloud is an area that you guys are thinking about. How does Joomla fit in there? How does it play on this new computing paradigm? What else do you feel are the big opportunities on the roadmap for some deep thinking and important work?</p>
<p><b>Louis:</b> The obvious thought is the semantic web, which everybody is thinking about. I feel that we in the open source CMS community can play an important role in terms of providing new ways by which information can be organized.</p>
<p>I think the semantic web is a place where a relatively small amount of work could have a very large impact. Still, that focus is difficult, in the sense that I&#8217;m really more focused within the Joomla community on the next generation, not necessarily the iterative versions that are coming out.</p>
<p>I&#8217;m trying to rethink the architecture. Joomla as it exists now is based on the same principal concepts as Mambo in 2000, namely being centered around software designed to run on a $6 a month hosting account on some offshore server.</p>
<p>The concept around one site, one instance is still a very serious bread and butter piece of the market for us, but I also want to be thinking about ways in which you can scale out multiple instances running the same software. That ability will position us to fall much more in line with cloud and enterprise deployment schemes.</p>
<p>I want to think about how we can make data more portable, so that you can transport things across instances or just have protocols to push in and pull out data, as well as incorporating the ideas of the semantic web.</p>
<p>I also think a lot about this concept of the rich Internet, like Adobe AIR and various other technologies like the high-powered JavaScript engines that are starting to evolve thanks to Google.</p>
<p>It would be really incredible to bring out that rich Internet environment and rich interfaces to make it easier to build a more interactive experience. Those are the things that I have been thinking about.</p>
<p><b>Scott:</b> This has been a great interview. Do you have any closing thoughts from your end?</p>
<p><a name="future"></a></p>
<p><b>Louis:</b> I think we&#8217;re really just growing accustomed to this whole new phase of the web, and I&#8217;m looking forward to seeing where it evolves. I don&#8217;t think that in five years it&#8217;s going to look at all how it looks today, and I don&#8217;t think the experiences are going to be the same.</p>
<p>I&#8217;m really excited to be a part of how this whole thing evolves, including what Microsoft and Google are working on with their Web OS initiatives. It could really change the ways people think about their computers and connectivity. I find all of these things converging and I can&#8217;t wait to see what happens.</p>
<p><b>Scott:</b> I remember one guy saying basically that, until you can run Quake in a web browser, Google is not done. [laughs]</p>
<p><b>Louis:</b> Actually, you can. Google released an experimental thing called Native Client. It&#8217;s not for public consumption yet, but you can compile Windows apps or whatever into this native client code, and then it will run in a browser with a plug in.</p>
<p>So it&#8217;s kind of like a virtual machine. It&#8217;s interesting what they are doing with that, and I&#8217;m very curious to see things like the plug in that they have built to enable OpenGL in the browser via JavaScript. I&#8217;m curious to see how they integrate that into future stuff.</p>
<p>Their example is Quake. You can click in the browser&#8230;</p>
<p><b>Scott:</b> It sounds almost like an ActiveX plug in.</p>
<p><b>Louis:</b> I think it probably is very similar to that concept, and hopefully they can find a way to make it more secure than that.</p>
<p><b>Scott:</b> That&#8217;s one approach: running a whole different something in the browser, which was kind of the Flash and the Silverlight approach.</p>
<p><b>Louis:</b> We would much more like to see it work in JavaScript, though.</p>
<p><b>Scott:</b> It seems like there is also an awful lot of work going that route. These are just things that are mainstream use cases for HTML and JavaScript, and they&#8217;re very fast. It&#8217;s just a whole new set of scenarios.</p>
<p><b>Louis:</b> The ability to access machine-level function calls and things like that within the browser would be just intense, but also intensely dangerous. I&#8217;m very curious about how we are going to resolve that, because at some point it is going to have to get resolved.</p>
<p>You&#8217;re going to have to be able to access the file system and things like that, and I just don&#8217;t know how we are going to do it.</p>
<p><b>Scott:</b> That seems like a good place to wrap up. Thanks for taking the time to talk today.</p>
<p><b>Louis:</b> Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2009/12/01/interview-with-louis-landry-joomla-development-lead/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Interview with Greg Kroah-Hartman &#8211; Linux Kernel Dev/Maintainer</title>
		<link>http://howsoftwareisbuilt.com/2009/11/18/interview-with-greg-kroah-hartman-linux-kernel-devmaintainer/</link>
		<comments>http://howsoftwareisbuilt.com/2009/11/18/interview-with-greg-kroah-hartman-linux-kernel-devmaintainer/#comments</comments>
		<pubDate>Wed, 18 Nov 2009 17:05:51 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=256</guid>
		<description><![CDATA[If anyone knows Linux kernel driver development, it&#39;s Greg Kroah-Hartman, who&#39;s been working deep in Linux for over a decade.&#160; In this interview, Greg talks about how the Linux project has accommodated the accelerating rate of change for the kernel, and offers some insight on where Linux is headed. How development on the Linux kernel [...]]]></description>
			<content:encoded><![CDATA[<p>If anyone knows Linux kernel driver development, it&#39;s Greg Kroah-Hartman, who&#39;s been working deep in Linux for over a decade. In this interview, Greg talks about how the Linux project has accommodated the accelerating rate of change for the kernel, and offers some insight on where Linux is headed.</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2009/11/18/interview-with-greg-kroah-hartman-linux-kernel-devmaintainer/#change">How development on the Linux kernel has changed in the past several years</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/11/18/interview-with-greg-kroah-hartman-linux-kernel-devmaintainer/#draw">Drawing in the larger community with the Linux Driver Project</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/11/18/interview-with-greg-kroah-hartman-linux-kernel-devmaintainer/#breadth">The breadth of uses for and contributors to the kernel</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/11/18/interview-with-greg-kroah-hartman-linux-kernel-devmaintainer/#future">The future trajectory of Linux, including functional balance with hardware</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/11/18/interview-with-greg-kroah-hartman-linux-kernel-devmaintainer/#hardware">Overcoming the challenges of maintaining universal hardware device support</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/11/18/interview-with-greg-kroah-hartman-linux-kernel-devmaintainer/#rate">The rate of change and growth in the kernel</a></li>
</ul>
<p><span id="more-256"></span></p>
<p><b>Scott Swigart:</b> Can you please introduce yourself and your association with Linux?</p>
<p><b>Greg Kroah-Hartman:</b> Sure. I&#8217;m a Novell fellow, and I&#8217;m working on the Linux kernel with the SUSE Labs division at Novell.</p>
<p>I&#8217;ve been doing Linux kernel work for more than 10 years now. I&#8217;m a maintainer of several different subsystems, including USB, the driver core, and some other minor things. I&#8217;m also responsible for releasing the stable Linux kernels. I used to be a maintainer of a lot of other driver subsystems and such over the years, but I&#8217;ve thankfully handed them off to other people.</p>
<p><a name="change"></a></p>
<p><b>Scott:</b> Let me start with a really big-picture question, since you&#8217;ve been involved in Linux for a long time. Obviously, there&#8217;s still lots of stuff happening on the mailing list, but fundamentally, the way code makes it into the kernel hasn&#8217;t really changed, and yet the scale of development is much larger than it was, say, five years ago.</p>
<p>How have you seen Linux kernel development change to handle the scale of the effort that&#8217;s going on today?</p>
<p><b>Greg:</b> I&#8217;ve been tracking this for over five years now, and we&#8217;ve gotten a lot better. There are a lot more people doing the work.</p>
<p>To give an example, for the 2.5 to 2.6 kernel development series, which took about two years, the top 30 people did 80 percent of the work. Now, the top 30 people do 30 percent of the work. The sheer number of developers has also increased. We were running a couple hundred developers, and now we&#8217;re running a couple thousand.</p>
<p><b>Scott:</b> How does that get managed? Are there more layers in the hierarchy?</p>
<p><b>Greg:</b> We&#8217;ve invented some tools to help us out a lot. It used to be that Linus was the only one who could commit anything, or who would accept anything. It was all done through email. </p>
<p>And then BitKeeper was developed, and a number of us started using that. It was very nice, because we had a much faster feedback loop with Linus. I could have him pull 50 patches, see that he pulled them, and then I could hit him with another 50 patches, if I wanted to, within a day. Before, we had a lag of a couple of weeks.</p>
<p>Then, when we couldn&#8217;t use BitKeeper anymore, Linus wrote Git, and then that just increased it even more, because everybody can use Git. Our development model increased its capacity, and part of that was that we started trusting more people. Now we have subsystem maintainers and such.</p>
<p>As I said, I maintain subsystems such as USB, and I have people who I trust enough that if they send me a patch, I&#8217;ll take it, no questions asked, because the most important thing is that I know they will still be around in case there&#8217;s a problem with it. [laughs]</p>
<p>And then I send stuff off to Linus. So, Linus trusts 10 to 15 people, and I trust 10 to 15 people. And I&#8217;m one of the subsystem maintainers. So, it&#8217;s a big, giant web of trust helping this go on.</p>
<p>We&#8217;ve developed some procedures that help guide how and when we do merges and releases. And we have a very regular release schedule: every three months, we do a release. We have a two-week merge window, and all those patches in that merge window have to have been tested. We&#8217;ve gotten the development process down really well over the past four years.</p>
<p>We&#8217;re also increasing the rate of change in our development. The amount of work that put a developer in the top 10 last year wouldn&#8217;t even make it into the top 20 this year. Our individual developers have got the workflow down, so we can actually contribute more, to an extent that&#8217;s amazing.</p>
<p><a name="draw"></a></p>
<p><b>Scott:</b> I understand that one of the things you&#8217;re involved with is the Linux Driver Project. It seems like there was a certain tension between Linux and certain hardware vendors, in the sense that, for a long time, hardware vendors had been really used to not shipping driver code.</p>
<p>They were used to shipping for Windows, and they didn&#8217;t have to be involved with drivers. They just provided an installer. And then, as Linux started to become a lot more popular in the server space and make some inroads in the desktop space, they wanted access to that market. </p>
<p>There was kind of a culture shift in a lot of companies to realize how to provide things like drivers in a way that matched with what the Linux community expected. Take the example of some hypothetical Acme hardware vendor that wants to submit a driver but has never done it before. What does the process look like?</p>
<p><b>Greg:</b> First, people need to realize that our driver model is different from that of other operating systems, because all our drivers ship with the kernel. The license requires our drivers to be open, so everything is in the main kernel tree.</p>
<p>Because of our huge rate of change, they pretty much have to be in the kernel tree. Keeping a driver outside the kernel is technically a very difficult thing to do, because our internal kernel APIs change very, very rapidly.</p>
<p>In other operating systems, APIs change more slowly, since their development process is comparatively very slow. So, it makes sense, from a technical standpoint and just to save money, to get your code into the kernel. It is maintained that way, and when changes happen within the kernel, your driver code will be fixed for you. If I have to change an API, I change it everywhere.</p>
<p>That approach also lets us see commonality. If I see that a driver&#8217;s doing the same thing as another driver, I might merge them together in the shared core. That way, everybody benefits. The new driver gets smaller and easier to maintain.</p>
<p>That&#8217;s happened a lot. We&#8217;ve merged a lot of drivers from a lot of different companies, and everybody&#8217;s happy about that. The companies are happy, the users are happy with the smaller code base, and it&#8217;s easier to maintain.</p>
<p>The Linux Driver Project started out because of the perception that Linux doesn&#8217;t support many devices. It turns out that Linux supports all devices out there. There&#8217;s really nothing manufactured today that Linux doesn&#8217;t support, in a major consumer market. There are some one-offs, like some small video-capture devices I know of that we don&#8217;t have support for, but people are actually working on those.</p>
<p>The initial goal of the Linux Driver Project was to remove all barriers that could possibly be there, which were mostly managerial. To attain that goal, we said that we will write and maintain any driver for free for any company. </p>
<p>It turned out that not very many companies really needed that. A few niche markets needed it, and we&#8217;re writing some drivers for them, but most companies had existing drivers floating around inside their company. Their challenge was typically that they needed to get it into the kernel tree, and they didn&#8217;t know how.</p>
<p>So, the Linux Driver Project, over the past couple years, has just been a big educational thing. I work with the companies and show them how to get the code in. I maintain the code, massage it and get it cleaned up, and then merge it into the kernel tree. That&#8217;s been the majority of the work we&#8217;ve been doing over the past two years.</p>
<p>We&#8217;ve also now codified a few ways that we can play in the kernel. We can accept code that doesn&#8217;t meet our normal standards into a staging area in the kernel. Everybody can clean it up there, and then the code graduates into the real stuff. That&#8217;s worked out really well.</p>
<p><b>Scott:</b> That makes good sense, because someone coming in from the outside isn&#8217;t necessarily going to know everything from coding standards to the best practices for doing things the right way. </p>
<p>And so what you&#8217;re saying is that you&#8217;ve got a place to get that code in, where people who know the right way to do things can look for things that can be factored out of drivers, drivers that can be merged, and that kind of stuff. </p>
<p>It doesn&#8217;t force every hardware vendor or everybody who needs to get a driver into the kernel to be a Linux driver development expert. Do I have that right?</p>
<p><b>Greg:</b> Yeah, although we do have lots of documentation now. We have free books on how to write Linux drivers, and we have documentation of things like our coding style and how to submit code. If people don&#8217;t know where it is, we&#8217;ll point them in the right direction. In case anyone&#8217;s curious, it all starts in a file called HOWTO in the kernel, so they can start there.</p>
<p>Even though we do have those education tools, though, we will also work with companies that want it. We have lots of people who have done this work before and who will do it for you.</p>
<p>The staging area is a great place for people wanting to get involved in the kernel to start out, because they can just run a script to find some errors and go fix them. We have people who started off just wanting to get involved and have ended up maintaining whole drivers and subsystems over time.</p>
<p>Still, in the end, my goal is actually to work with engineers from those companies and have them maintain and own it, so they become full-fledged members of the kernel community. That&#8217;s the only way we&#8217;re going to grow, and it&#8217;s working. The number of companies involved in the kernel has grown year over year. </p>
<p>Companies want to get the most value out of Linux, so I counsel them that they should drive the development of their driver and of Linux as a whole in the direction that they think makes the most sense. If they rely on Linux and feel that Linux is going to be part of their business, I think they should become involved so they can help change it for the better.</p>
<p><b>Scott:</b> That makes sense. In a different area, I saw somewhere that around half of the Linux code that&#8217;s being contributed nowadays is driver code. Do I have that right?</p>
<p><b>Greg:</b> That&#8217;s the percentage. Like I said, everything&#8217;s in the kernel itself. We&#8217;re at something like six or seven million lines of code, and over 50 percent of those lines of code are drivers.</p>
<p>I think 30 percent is architecture-specific stuff, for things like processors and networking. The core kernel is like five percent of the overall code. Those numbers have stayed pretty much the same over the past four or five years.</p>
<p>We change something like 5,000 lines a day, which is just scary. Fifty percent of that change will be in the drivers, and five percent will be in the core kernel. In other words, the kernel is being modified everywhere at that rate of change.</p>
<p><a name="breadth"></a></p>
<p><b>Scott:</b> Do I understand correctly that the core kernel is memory management, process scheduling, and those kinds of fundamental things that an OS has to do?</p>
<p><b>Greg:</b> Exactly. Basic system calls, memory management, scheduling, and inter-process communications.</p>
<p>And the only reason we&#8217;re changing is because people want that change. It&#8217;s not like we&#8217;re sitting there and going, &#8220;Hey! Let&#8217;s rewrite the scheduler again!&#8221; </p>
<p>[laughter]</p>
<p>We&#8217;re doing this stuff because we have to in order to survive, because people want it and because people need it. We do it for fun, but we don&#8217;t gratuitously change things. A big part of what drives that change is that what Linux is being used for is evolving. We&#8217;re the only operating system in something like 85 percent of the world&#8217;s top 500 supercomputers today, and we&#8217;re also in the number-one-selling phone for the past year, by quantity.</p>
<p>It&#8217;s the same exact kernel, and the same exact code base, which is pretty amazing. Nobody&#8217;s ever created something like this before. And we&#8217;re doing it in a way that doesn&#8217;t follow the traditional software design methodologies, which is fun. </p>
<p>I go and talk at a lot of colleges, and they&#8217;re changing their education system based on how the kernel has changed development models for large-scale systems.</p>
<p><b>Scott:</b> Like you mentioned, the kernel runs on some of the smallest devices, as well as some of the largest. What makes that work is a lot of stuff that is or isn&#8217;t included, whether you&#8217;ve got fairly bare-metal, embedded systems, or whether you&#8217;ve got desktops and music players, or whether you&#8217;ve got systems that are optimized for number crunching.</p>
<p>The kernel is an interesting thing. When you talk to people who aren&#8217;t really in the Linux space, they often don&#8217;t make a big distinction between the Linux kernel and a Linux distro. There&#8217;s a huge distinction though, right? The kernel is a fairly small piece of a distro.</p>
<p><b>Greg:</b> Sure, but consider Android, which threw away everything from Linux except the kernel and built something totally new on top of it. That&#8217;s a great proof point that the Linux kernel itself has to be really, really flexible to let people do something like that. It still meets the needs of a very big market, which is pretty funny to watch. From an engineering standpoint it was a pretty neat hack.</p>
<p><b>Scott:</b> Give us some sense of the kinds of people who are contributing to the Linux kernel. Obviously, there are hardware vendors. The distros out there are contributing&#8211;Red Hat and Novell, for instance, are big contributors. Give me a sense of the categories of contributors working on the kernel.</p>
<p><b>Greg:</b> You&#8217;ve got a pretty good short list there. Some distros contribute; some do not, and the same is true of hardware companies. Intel has come on like crazy these past couple of years, and they&#8217;re now one of the major contributors to Linux. Consulting groups and educational institutions play a big role as well.</p>
<p>There are also a lot of companies that just modify code for one-off little things, making support better in their hardware. Linux runs on yachts for a lot of automatic pilot steering things, and we&#8217;ve had some contributions from there.</p>
<p>More than five or ten percent of the US&#8217;s power is generated on turbines that are controlled by systems based on Linux, and those guys contribute some changes they need. It&#8217;s also interesting that 20-25 percent of all of our contributions by quantity are done by people that have no affiliation with anything.</p>
<p><b>Scott:</b> The proverbial guy in his basement or dorm room or whatever.</p>
<p><b>Greg:</b> Yeah. They needed something fixed, so they did it and moved on. The quantity there is still large, but on the other hand, 75 to 80 percent is done by people who are getting paid to do it.</p>
<p><b>Scott:</b> What do you feel like is driving change? You mentioned Intel, and Linux seems kind of tailor made for a chip company, because it provides a great opportunity to put new things in a processor or a chip set. Contributing to Linux lets them build support for those kinds of things and see how they are behaving in a real production operating system.</p>
<p>I can kind of see that new hardware architectures and capabilities drive a certain amount of change in the Linux kernel. I can just see the creation of new peripherals driving a certain amount of change. What other types of change drive the Linux kernel forward?</p>
<p><b>Greg:</b> Well, those two are major. New hardware accounts for the majority of our changes. We get more processors with more and more cores. We had to do a lot of work to make 8-, 16-, and 32-way machines run really well, and now we&#8217;re running 8000-processor machines really well. The scalability is there, and I actually know of people who have booted larger ones, but we&#8217;re not allowed to talk about it.</p>
<p>USB 3.0 was sponsored by Intel, and it was shown on Linux first&#8211;the first implementation of any operating system. Intel gets their hardware working well fast, and working with the Linux community lets them get it out to developers and other people who are making devices fast. The fact that they can do it for Linux much faster than they can for other operating systems is a huge driving factor for what we need to do for things like 10 Gigabit Ethernet or 40 Gigabit Ethernet.</p>
<p>Some of those speeds have required major changes to the way we do networking and the way we handle processes or data flowing through the kernel. We also have to accommodate workload issues, like real-time workloads, which required big changes to the scheduler, the way we do locking, and other things that have been merged in. </p>
<p>Note that those are all requirements driven by people who are using Linux. The real-time guys want to use operating systems like Linux for their stuff. Linux also drives Wall Street, the Chicago Mercantile Exchange, the Tokyo Exchange, and the London Stock Market.</p>
<p><a name="future"></a></p>
<p><b>Scott:</b> What do you see on the horizon, in terms of hardware use cases that aren&#8217;t quite mainstream or available yet but that are going to drive some additional kind of significant change all the way down to the level of the kernel?</p>
<p><b>Greg:</b> 40 Gigabit Ethernet is going to need some changes, and the networking guys are actually working on that already. There are also going to be more and more cores, better real-time operation, and USB 3.0. The USB 3.0 devices aren&#8217;t out yet, so once we start seeing more of those, we&#8217;ll start optimizing things there. SSDs are great; they&#8217;re growing fast.</p>
<p>People seem to forget that changes they want in the kernel, even ones it seems like no one else will want, very often end up being valuable to a lot of people. For example, maybe the embedded group makes some power management changes, and it turns out that the server guys really want that, too. Then the server guys want better scalability, and it turns out that the embedded guys love those scalability changes, because they&#8217;re running dual-core ARM systems now.</p>
<p>Part of what&#8217;s helped us make things work is that it turns out that a lot of changes benefit everybody. We&#8217;ve become such a flexible operating system by requiring changes to work for everything, and it&#8217;s made everything work even better.</p>
<p>There are lots of power management changes in the pipeline to be reworked that help address a lot of the issues the embedded guys have come up with that servers traditionally don&#8217;t have. In servers, you can&#8217;t shut off that many discrete components, whereas in embedded, you can pretty much shut off every single discrete component in your system, depending how much you want to do with it.</p>
<p><b>Scott:</b> I remember when we talked to the OLPC guys, one of their requirements was to be able to go to sleep, but not lose the network connection.</p>
<p><b>Greg:</b> Mesh networking, right?</p>
<p><b>Scott:</b> Yes. And they also needed super-aggressive power management, because they wanted to get the most battery life possible, because power is such a big issue in developing countries. If I remember correctly, they were actually putting the processor in a sleep state between keystrokes.</p>
<p><b>Greg:</b> That&#8217;s right; they were, but it turns out that the hardware guys can do it better. If you look at more modern processors, the hardware is putting itself to sleep faster. In modern PCs and these new netbooks, the OS really doesn&#8217;t have control of the power management stuff anymore, because the hardware can do it better.</p>
<p>It turns out it&#8217;s better to run as fast as possible to get your work done, then go to sleep, than to run your CPU at medium speed. The hardware guys realize this, and they know they can handle their workloads better than the operating system can.</p>
<p><b>Scott:</b> I don&#8217;t want to mischaracterize it, and I am intentionally overstating this to some extent, but when you talk to people at Intel, they always say, &#8220;Well, that should be in silicon.&#8221; It&#8217;s kind of like there&#8217;s this drive to put more and more and more in silicon, but at the same time, obviously software is growing at a geometric rate.</p>
<p>Talk a little bit about that balance between what ought to be in hardware and what ought to be in software, as well as the notion that something that ought to be in one place today should maybe be in the other a year or two from now.</p>
<p><b>Greg:</b> The hardware guys are always like, &#8220;Oh, I can solve that in hardware. Let me do that. You software guys aren&#8217;t ever going to get around to doing this.&#8221; And part of that is true.</p>
<p>If you look at the software adoption rates of some operating systems, they&#8217;re very, very slow. So, the fact that they make a change to the hardware, and they can save power today in the new processor, means that overall, the amount of power consumption in the world goes down.</p>
<p>People run old versions of Linux and other operating systems very frequently, so there&#8217;s sometimes an advantage to solving a problem in both places. They can add it to the hardware, and then they also add it to the software.</p>
<p>In the kernel, we know we typically can&#8217;t change user space applications, so we have to make changes in order to accommodate how they want to do things, even if we feel that they&#8217;ve done it the wrong way. So, there&#8217;s also a tension there.</p>
<p>Some companies actually talk to the engineers really, really well. Intel and IBM are very good at this. They sit down, and they talk to the kernel people once a year. &#8220;Hey, this is what we&#8217;re thinking of doing. How should we do this? How are we going to implement this?&#8221; And they get feedback from us directly. And that&#8217;s turned out to be a very, very valuable feedback loop.</p>
<p>It often works out very well in terms of how chips are designed, and how the operating system will take advantage of that. Hopefully, we&#8217;re all talking to all the right people, and everybody&#8217;s interacting well.</p>
<p><a name="hardware"></a></p>
<p><b>Scott:</b> Let me spin back to something you said very early on, which is basically that all hardware is supported in the Linux kernel, even though there seems to be the perception that not all the hardware works.</p>
<p>Certainly, one of the things that gets brought up is that if I buy a new device, it&#8217;ll come with a driver disk that probably won&#8217;t have Linux drivers on it. Talk a little bit about where you feel this disconnect comes from between the people working on the drivers in the Linux kernel and those incorrect user perceptions, even from people who are fairly experienced with Linux.</p>
<p><b>Greg:</b> That perception has existed for many, many years. Back in 2006, I gave a big keynote showing how it was all wrong. Everything is supported, but, yes, that complaint comes up a lot. I joke with Jim Zemlin of the Linux Foundation about this issue all the time. If he publicly says, &#8220;We&#8217;re still working on drivers,&#8221; I always say, &#8220;No, we&#8217;re done.&#8221;</p>
<p>[laughter]</p>
<p>So, there&#8217;s a disconnect there. Part of the issue is that we support more devices than any other operating system ever has. That&#8217;s a fact that&#8217;s been verified by other companies. The problem is that people only care about the devices that they have. Therefore, if your device doesn&#8217;t work for some reason, you don&#8217;t care how many thousands of other devices out there work.</p>
<p>When I started the Linux Driver Project, I&#8217;d been hearing this from all the major companies that were shipping Linux and cared about Linux. So, I went around to them individually and said, &#8220;OK, what do you need me to do? What needs to be worked on?&#8221;</p>
<p>Every single major hardware company said, &#8220;Hey, you&#8217;re right. Everything works on Linux. We&#8217;re fine.&#8221; That&#8217;s shown by these companies that ship Linux on their machines. Dell, HP, and all the big hardware companies and laptop manufacturers now ship Linux, and we&#8217;re working with them.</p>
<p>Some of the disconnect is that if you buy a machine that doesn&#8217;t have Linux on it, getting it to work sometimes requires weird BIOS tricks to get all the function keys working on a laptop, and things like that. New hardware, or existing hardware when you&#8217;re updating a system, should work. If not, then we messed up.</p>
<p>The graphics guys have issues that they&#8217;ve been addressing to help make that work much better. Intel hardware works wonderfully now on Linux. Their developers are very good. ATI and NVIDIA both ship closed source drivers that work on all their devices. Those all work and have very good support systems. The drivers aren&#8217;t in the kernel, but they&#8217;re out there.</p>
<p>Anything storage related or networking related has been working with Linux forever, and all new devices support Linux too. They have to, because that&#8217;s what those markets are going for.</p>
<p>Then you get to the tiny consumer devices, which is where I like having fun, because I do USB. I work really hard to get everything supported, and we don&#8217;t know of anything these days that isn&#8217;t. I was in Tokyo the other day for the Kernel Summit and walking around Akihabara and trying to find devices that we don&#8217;t support. We had all the kernel developers there and we couldn&#8217;t find anything.</p>
<p>I know there&#8217;s a disconnect, but if you look on a mailing list for the Windows guys, they have a lot of driver problems, where things aren&#8217;t working. Mac OS X does not support very many devices at all.</p>
<p>If issues arise with exotic configurations, I&#8217;ll be glad to work through them all. If something doesn&#8217;t work, tell me so I know to fix it.</p>
<p><b>Scott:</b> Is my perception correct that when hardware companies get into Linux, they sometimes start out with a bare-bones driver? It seems that often, the piece of hardware will work, but it won&#8217;t necessarily handle the full spectrum of services like power management with the device. Then they&#8217;ll provide more capabilities, gradually contributing code to use the device to its full capabilities.</p>
<p>Am I at all accurate in that perception, or is it more often the case that the driver they contribute offers the same level of functionality as on any other OS?</p>
<p><b>Greg:</b> You&#8217;re right. Some things don&#8217;t work, although we&#8217;ve made it easier for companies to write a driver that will do everything, because we put core functionality like power management into the core. Power management is now handled by the core of the kernel, so you don&#8217;t have to add that. You just have to add a few hooks in your driver and then you&#8217;re done.</p>
<p>The average driver for Linux is about one third the size of an equivalent driver for another operating system, so you have less code to write and maintain. Still, to your point, there are some new devices that come out whose value add is that they do X and Y differently.</p>
<p>Sometimes we don&#8217;t support that yet, because the manufacturer didn&#8217;t tell us how to do it or we have to figure out how to fit it into the Linux framework to support it properly. Those little webcams are a very good example. New devices come out that can support features like additional bit rates, frame sizes, or screen sizes, and we need to go and add support to the core of the kernel to support those.</p>
<p>So, your statement is very fair, and that contributes to the disgruntlement you&#8217;ll see on my face at times.</p>
<p>[laughter]</p>
<p>If a brand new device that says it can do X, Y, and Z is only doing X on Linux, we immediately start to try to figure out how to get it to do the other things. Sometimes, we just have to work with the manufacturers to do that.</p>
<p>A lot of companies are getting better at working with us to come out with drivers at the same time the hardware comes out. Sometimes we lag by six months just by virtue of the fact that Linux isn&#8217;t the first system they care about, so it takes us a while to get it going. Also, if you&#8217;re using an older distro, you may need to switch versions.</p>
<p>There are all sorts of configuration issues there, which is why Jim Zemlin of the Linux Foundation continues to say we need to work on driver development, and I agree with him there. There&#8217;s always going to be work, so I think I have a lifetime job here.</p>
<p><b>Scott:</b> Correct me if I&#8217;m wrong, but drivers never really leave the kernel, right? Is it more or less true that that&#8217;s one of the reasons that Linux supports old hardware so well? That once it&#8217;s in the kernel, it&#8217;s probably going to be in the kernel for a very long time, so I can grab a six-year-old laptop or six-year-old server, throw Linux at it, and have pretty good confidence it&#8217;s going to work?</p>
<p><b>Greg:</b> Yes, and actually that&#8217;s turned out to be a problem. Because our kernel size is growing so much, we want to make changes, and we&#8217;re running into places where we have drivers for hardware that we know hasn&#8217;t been made in the past ten years. It can be complicated to excise those pieces.</p>
<p>That came up at the Kernel Summit last week, and actually, I am going to be working on that. We&#8217;re going to move drivers that we think are broken if we are pretty sure that nobody out there uses them anymore. We&#8217;ll move those into the staging tree, keep them there for about a year, and if nobody complains, then we&#8217;ll remove them.</p>
<p>On the other hand, if anybody shows up and says they need one of those drivers, we can use our source code tool to restore and test it. If there is at least one user of an old device in the world, I will gladly maintain the support.</p>
<p><a name="rate"></a></p>
<p><b>Scott:</b> We&#8217;ve covered a lot of great stuff, and now I want to turn to a lazy interviewer trick and ask you what really interesting things I haven&#8217;t asked about. In other words, I&#8217;d like for you to supply both the insightful question and the brilliant answer.</p>
<p>[laughter]</p>
<p><b>Greg:</b> Well, just to touch back on that rate of change that I mentioned before, I just looked it up, and we add 11,000 lines, remove 5500 lines, and modify 2200 lines every single day.</p>
<p>People ask whether we can keep that up, and I have to tell you that every single year, I say there&#8217;s no way we can go any faster than this. And then we do. We keep growing, and I don&#8217;t see that slowing down at all anywhere.</p>
<p>I mean, the giant server guys love us, the embedded guys love us, and there are entire processor families that only run Linux, so they rely on us. The fact that we&#8217;re out there everywhere in the world these days is actually pretty scary from an engineering standpoint. And even at that rate of change, we maintain a stable kernel.</p>
<p>It&#8217;s something that no one company can keep up with. It would actually be impossible at this point to create an operating system to compete against us. You can&#8217;t sustain that rate of change on your own, which makes it interesting to consider what might come after Linux.</p>
<p>For my part, I think the only thing that&#8217;s going to come after Linux is Linux itself, because we keep changing so much. I don&#8217;t see how companies can really compete with that.</p>
<p><b>Scott:</b> Like you said, a lot of the rate of change in Linux as a whole is due to changes in the drivers. It&#8217;s fairly unique to Linux that it carries the drivers in the kernel, so as the ecosystem grows, you&#8217;ve got more and more people contributing code to Linux.</p>
<p>That&#8217;s not the case with other operating systems.</p>
<p><b>Greg:</b> Right. And while that rate of change is consistent across the whole thing, it&#8217;s also easier to write drivers. As I said before, your driver for Linux is one third the size of your driver for Windows, so even at this rate of change, writing a driver for Linux is less work than it is for other operating systems.</p>
<p>In Linux, we&#8217;ve re-written our USB stack three or four times. Windows has done the same thing, but they had to keep their old USB stack and a lot of their old code in order to work for those old drivers. So, their maintenance burden goes up over time while ours doesn&#8217;t.</p>
<p>Also, as people change jobs, they generally stay with the Linux kernel community. I&#8217;ve been doing USB work for over 10 years now for Linux, and that&#8217;s a big body of knowledge. Within other companies, engineers usually move around, and that body of knowledge doesn&#8217;t necessarily stick around.</p>
<p>For driver maintainers and driver authors within Linux, it does stick around. The networking guys have been doing that work for 15 years with Linux, and they know this stuff in and out. They can tell you what you need to change in your driver to make it work really well.</p>
<p>They&#8217;re unique, and they&#8217;ll help out any company.</p>
<p><b>Scott:</b> Thanks; we&#8217;re out of time, and that&#8217;s a good place to close.</p>
<p><b>Greg:</b> All right. Thanks a lot; this was fun.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2009/11/18/interview-with-greg-kroah-hartman-linux-kernel-devmaintainer/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>Interview with Ezra Zygmuntowicz &#8211; Engine Yard</title>
		<link>http://howsoftwareisbuilt.com/2009/11/09/interview-with-ezra-zygmuntowicz-engine-yard/</link>
		<comments>http://howsoftwareisbuilt.com/2009/11/09/interview-with-ezra-zygmuntowicz-engine-yard/#comments</comments>
		<pubDate>Mon, 09 Nov 2009 19:56:36 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/?p=252</guid>
		<description><![CDATA[The cloud providers are splitting into a few camps.&#160; On one side, you have companies like Amazon that offer infrastructure as a service (IaaS), and Google who offers platform as a service (PaaS).&#160; PaaS offers rapid development, and no server administration, but it locks you into a specific provider.&#160; Enter Engine Yard, a company that&#39;s [...]]]></description>
			<content:encoded><![CDATA[<p>
    The cloud providers are splitting into a few camps.&nbsp; On one side, you have<br />
    companies like Amazon that offer infrastructure as a service (IaaS); on the other, Google<br />
    offers platform as a service (PaaS).&nbsp; PaaS offers rapid development<br />
    and no server administration, but it locks you into a specific provider.&nbsp;<br />
    Enter Engine Yard, a company that&#39;s enhancing Ruby on Rails to run on top of<br />
    arbitrary IaaS.&nbsp; In this interview Ezra Zygmuntowicz paints the picture.</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2009/11/09/interview-with-ezra-zygmuntowicz-engine-yard/#defining">Defining platform as a service and the creation of Engine Yard</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/11/09/interview-with-ezra-zygmuntowicz-engine-yard/#cloud">Advancing cloud computing beyond buzzwords</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/11/09/interview-with-ezra-zygmuntowicz-engine-yard/#open">Opening cloud computing as a simple, provider-agnostic framework</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/11/09/interview-with-ezra-zygmuntowicz-engine-yard/#flex">Operating system and programming language flexibility</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/11/09/interview-with-ezra-zygmuntowicz-engine-yard/#scale">Providing flexible scalability of data stores</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/11/09/interview-with-ezra-zygmuntowicz-engine-yard/#paradigm">Paradigm shifts created by cloud computing, including admin roles and rapid start-ups</a></li>
</ul>
<p><span id="more-252"></span></p>
<p><b>Scott Swigart:</b> Please take a moment to introduce yourself and your company.</p>
<p><a name="defining"></a></p>
<p><b>Ezra Zygmuntowicz:</b> I&#8217;m one of the founders of Engine Yard. We started life as a hosting company, and we&#8217;ve evolved into an open source company with a platform as a service cloud offering that we run on other people&#8217;s infrastructure. </p>
<p>I&#8217;ve been heavily involved in Ruby and in the Ruby on Rails open source community ever since Ruby on Rails first came out in 2004. Back then, I was webmaster for a newspaper&#8211;the Yakima Herald Republic in Yakima, Washington. I was looking for some kind of new language to redo their whole website, because it was kind of an old mess.</p>
<p>I was reading up on Python and Ruby, and when Ruby on Rails came out, I took it and ran with it, building one of the first commercial sites in Ruby on Rails. That site is still live and running there, and that&#8217;s how I got into the deployment and hosting sides of things.</p>
<p>Back in the early days, nobody knew anything about deploying Ruby on Rails applications. There just wasn&#8217;t much information available on it, and there weren&#8217;t many big sites doing it. I was the only tech person at the newspaper, and they bought me a couple servers and said, &#8220;You need to put this new Ruby on Rails site online and take the 250,000 hits a day we get at the old site right off the bat.&#8221;</p>
<p>That pushed me to start researching how to deploy and scale Ruby on Rails applications on an open source stack. In the community, I became one of the de facto deployment experts, and I wrote a book, <i>Deploying Rails Applications</i>, for the Pragmatic Programmers.</p>
<p>The business partners that I started Engine Yard with had a consulting company called &#8220;Quality Humans&#8221; that was a Rails consulting shop. They had a couple of big clients, and they were trying to figure out how to deploy them. They took them to Rackspace, but Rackspace didn&#8217;t know anything about hosting Ruby back then, and Tom and Lance saw a business opportunity for building a high quality hosting platform for Rails applications.</p>
<p>They went looking for somebody to help them with that. They found me in the community, and we got together in early 2006 and started planning this platform for running Ruby applications.</p>
<p>We scrounged up our first $100,000 to get a few servers and our first data center in the summer of 2006. We started taking customers in September 2006, and it&#8217;s been a wild ride since then. We&#8217;ve grown from a tiny startup to 85 employees in 12 locations around the world, running quite a few very high-traffic Ruby on Rails applications. That&#8217;s where we are today.</p>
<p><b>Scott:</b> How does open source fit into your business?</p>
<p><b>Ezra:</b> Along the way we&#8217;ve had to build all kinds of tools, and our whole stack is built on open source. We are active in a bunch of open source projects, and we have a number of projects ourselves. We have a project called &#8220;Rubinius,&#8221; which is a new virtual machine for running Ruby code that we&#8217;ve been developing for a while.</p>
<p>We just hired the guys who run the JRuby project, which is a Ruby virtual machine built on the Java virtual machine so it runs on the JVM. They liked the idea of coming to work for a company focused on Ruby on Rails, so they ended up coming to work for us.</p>
<p>On top of that, we had an open source web framework called &#8220;Merb&#8221; that we wrote to create a sort of high performance version of Ruby on Rails, and that had quite a bit of success. It became more and more of a compelling product, and we realized that we were stealing some thunder from Ruby on Rails. </p>
<p>We had started our company to support Ruby on Rails, so we got together with the Rails core guys and decided to merge the two projects. Merb and Rails have merged to become Rails 3, which is kind of a cool open source story.</p>
<p>The long view is that we kind of forked off to make some performance improvements and provide some different architecting approaches. That was successful, and now we&#8217;ve merged that back into the main line of Ruby on Rails to make it better. That&#8217;s the little story of how we got where we are today.</p>
<p><b>Scott:</b> That&#8217;s an interesting history, and I think it&#8217;s pretty unique to open source. It proceeds from the idea that a group of people can take a code base off into the wilderness, take it in a different direction to support a different use case or optimize it, and finally bring it back and fold it in. </p>
<p>We see a lot of open source projects go through iterations like that. For instance, between Apache 1 and Apache 2, there were lots of changes, and PHP has gone through similar evolutions. As you mentioned with Ruby, sometimes things go their separate ways, and they may stay separate, or they may converge back together.</p>
<p><b>Ezra:</b> That&#8217;s a pretty cool open source story there, because it shows how open source is a new way of working on software. I think it&#8217;s a better way than closed source, because you can get so many more eyes on the problem.</p>
<p>Anybody can look at the code, and if they&#8217;ve got an itch, they can scratch it themselves and improve it to do what they want it to do. We were interested in being able to run these Ruby applications in the most efficient and resource-conscious way we could. </p>
<p>When we started building Merb out, it kind of took on a life of its own, eventually starting to look almost like a rival to the Rails project. Both projects basically have the same goals: to build a high-level, full stack that&#8217;s easy to get started with and that provides a very rapid application development platform.</p>
<p>It almost felt like we were fracturing the community a little bit by having Merb as an alternative to Ruby on Rails, and like we might be confusing people. There was starting to be a bit of rivalry between the two camps, and I just didn&#8217;t think it was useful for the Ruby community at large to have this kind of chasm.</p>
<p>We got together with the Rails core guys and started recognizing that our views on most things weren&#8217;t so different. We started looking for ways that we could patch the two projects together and come out stronger on the other end.</p>
<p>Since then, we have been focusing all our work on Rails, and bringing the good parts and the performance focus we had in Merb into the new release of Rails, which is going to come out in a beta release sometime fairly soon. Because they&#8217;re fairly similar projects, there wasn&#8217;t any reason for them to be separate. </p>
<p>It was cool that everybody was able to get along and see that we&#8217;re all basically working toward the same goal, and we&#8217;ll be stronger if we combine efforts.</p>
<p><a name="cloud"></a></p>
<p><b>Scott:</b> I guess Rails is sort of a platform as a service on top of cloud infrastructure. And since cloud&#8217;s the hot topic, let&#8217;s go ahead and stick the shovel in there. To start, how do you define that?</p>
<p>[laughter]</p>
<p><b>Ezra:</b> Nobody knows what a cloud is these days. It&#8217;s kind of become a term for anything that runs on the Internet.</p>
<p><b>Scott:</b> There are some people who are very serious about having a very strict definition, while others are kind of like, &#8220;Well, if we just put cloud on our marketing material, we&#8217;ll sell more. So let&#8217;s call what we&#8217;ve been doing for 10 years &#8216;cloud.&#8217;&#8221;</p>
<p><b>Ezra:</b> Absolutely. To me, there are a couple of different kinds of clouds. First are the infrastructure as a service clouds, with the Amazon Web Services platform as the canonical example. That&#8217;s what I mainly think of when I hear the term &#8220;cloud,&#8221; because it&#8217;s what I feel the true definition of &#8220;cloud&#8221; is&#8211;an on-demand compute service where resources are disposable and scalable.</p>
<p>I would call something a cloud if it&#8217;s API addressable first and foremost. It has to have an API so it can be manipulated programmatically. It has to be able to scale up and scale down on demand, so if you need more server capacity, you just bring a few more servers up. If your load has gone down, you drop a few and you go back down. The business model is kind of metered, like the electrical grid, where you just plug in and pay by the hour.</p>
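<p><i>Illustration (not from the interview):</i> the metered, pay-by-the-hour model described above can be sketched with a little arithmetic. The hourly rate and the usage figures below are invented for the example and do not reflect any real provider&#8217;s pricing; <code>metered_cost_cents</code> and <code>peak_cost_cents</code> are hypothetical helper names.</p>

```ruby
# Hypothetical price per server-hour, in cents. Not a real provider rate.
HOURLY_RATE_CENTS = 10

# Each entry records how many servers ran and for how many hours.
# The figures are made up to show a quiet baseline followed by a spike.
usage = [
  { servers: 2,  hours: 600 }, # quiet baseline
  { servers: 10, hours: 120 }  # traffic spike, extra servers brought up
]

# Metered billing: pay only for the server-hours actually consumed.
def metered_cost_cents(usage, rate_cents)
  usage.sum { |u| u[:servers] * u[:hours] } * rate_cents
end

# Fixed provisioning: pay for peak capacity for the whole period.
def peak_cost_cents(usage, rate_cents)
  peak = usage.map { |u| u[:servers] }.max
  total_hours = usage.sum { |u| u[:hours] }
  peak * total_hours * rate_cents
end

puts format("metered:          $%.2f", metered_cost_cents(usage, HOURLY_RATE_CENTS) / 100.0)
puts format("peak-provisioned: $%.2f", peak_cost_cents(usage, HOURLY_RATE_CENTS) / 100.0)
```

<p>Working in integer cents keeps the arithmetic exact; the gap between the two totals is the cost of keeping peak capacity online when the load does not need it.</p>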
<p>Then there&#8217;s all kinds of other stuff that people are calling cloud. There&#8217;s higher level stuff, like platform as a service, which runs on top of infrastructure as a service, but it&#8217;s higher level and deals more with application services, rather than the low-level infrastructure.</p>
<p><b>Scott:</b> What&#8217;s your own history with cloud computing?</p>
<p><b>Ezra:</b> When we started Engine Yard back in 2006, we needed a platform to be able to build this Ruby platform as a service thing. This was before EC2 was out, and before people were calling things cloud, and before there was Amazon Web Services.</p>
<p>We built something in our own data center that looks a lot like EC2, based on Xen virtualization and Gentoo Linux. We still have our own customized version of Gentoo that we optimize for Ruby.</p>
<p>We built a private cloud ourselves, although that&#8217;s kind of a loaded term as well. It was a virtualized cluster platform that ran in our own data centers. We built our platform as a service on top of that. That&#8217;s higher level and deals with the application.</p>
<p>As time has gone on, the Amazon platform has gotten very compelling, and other clouds have popped up, so we have realized that the value that Engine Yard adds is not necessarily at the infrastructure as a service level. We don&#8217;t really give ourselves a big differentiation by racking hardware and running data centers. We can&#8217;t compete with somebody like Amazon on that scale, as far as racking and stacking servers.</p>
<p>We realized that our strong point is what happens after you have a VM online with network and compute and storage. Then you need to put an operating system on there and bring that all the way up to the application. Then you need to monitor, scale and recover your application from failures. That&#8217;s more where we fit in, and that&#8217;s where our value is&#8211;in platform as a service.</p>
<p>Starting last year, we started taking everything that we&#8217;ve built, all our glue code, and all our open source stuff that we call our stack, and we started porting it to run on top of Amazon Web Services, rather than our own cloud. We&#8217;re kind of abstracting that, so that we can farm that out to Amazon or any of these other providers that have vCloud or other stuff popping up. </p>
<p>It&#8217;s not a super interesting business, racking and stacking servers, but the code and the platform that runs on top of that is a lot more interesting to us. So as a company, we&#8217;ve transitioned from the model where everything happens in our own data centers to a platform that runs out on the cloud in other people&#8217;s servers.</p>
<p><a name="open"></a></p>
<p><b>Scott:</b> We&#8217;ve talked to a lot of people who are Amazon Web Services customers, and they&#8217;ve talked about use cases. Generally, it seems to be fairly good at running existing applications.</p>
<p>For example, we talked to some guys who do web resources for the Indy 500. They&#8217;ve got an enormously elastic need. When there&#8217;s an event, they&#8217;re going to have a tremendous amount of traffic, but when there&#8217;s not an event, there&#8217;s very little traffic. And so they&#8217;ve got this need to spin up 100 servers, have them be up for a week or so, tear them down to a handful, and then spin them back up again.</p>
<p>We talked to other people who do diagnostic kind of stuff, and a lot of these show up as Amazon case studies, so if somebody wanted to go look, they could go get details on these. At any rate, this company runs their own internal grid for these analytical models, and when they&#8217;d get a particularly large data set from a customer, again, they have a very elastic need.</p>
<p>It worked pretty well for them to take their existing apps and just run them as is. They don&#8217;t need to maintain enough hardware for their peak load, and their internal servers now are just whatever they need for ongoing day to day operations. All the elasticity happens out in the cloud.</p>
<p>So that&#8217;s one kind of cloud, and then there are other kinds of clouds coming online. Google&#8217;s been out there for a while with App Engine. There&#8217;s also Windows Azure, where you&#8217;re coding to a certain kind of API, within a certain development framework, and you don&#8217;t even see machines. You just put your code up, and you don&#8217;t know or care whether it&#8217;s running on one box or 10 boxes or 100 boxes. It&#8217;s very request-response based.</p>
<p>And there you&#8217;re making a bet. With Amazon, you&#8217;re not making a bet on Windows or Linux or any of that stuff; you&#8217;re not even really making a long-term bet on Amazon as a cloud provider, unless you program to something like SQS. With somebody like Google or Microsoft, you&#8217;re making a bet not only on whether you want them as a cloud provider, but also on whether you really want to stick with that framework. If you&#8217;re using Google App Engine, then the only place it runs is on their framework. And the only place that framework is available is on their infrastructure.</p>
<p>Looking at that world, you might think that maybe there&#8217;s an opportunity for something that&#8217;s sort of in between. On one hand, maybe you could choose an infrastructure as a service provider, as a somewhat independent choice, and on the other hand, maybe you could choose the platform as a service framework, also independently. You could sort of mix and match the two.</p>
<p>Is that the kind of scenario that you are enabling? </p>
<p><b>Ezra:</b> I think you hit the nail on the head about what we&#8217;re trying to do. I look at it like this. You&#8217;ve got raw AWS as the canonical example of the infrastructure as a service right now. But they hand you very low level building blocks. You get compute, you get network, you get some memory, get some storage.</p>
<p>You can choose Linux or Windows or whatever, but they hand you these big building blocks, and you still have to assemble those into something that can run your application, and scale, and be reliable at the application level, and so on. You still require quite a bit of system administration, and it&#8217;s not something that Joe Schmoe can just use out of the box.</p>
<p>Then there are the higher level things like Google App Engine, where Google has made a lot of decisions for you. As long as you can fit within their box, you don&#8217;t have to worry about servers anymore. You just twist the dials and scale up as needed and pay for what you use.</p>
<p>That&#8217;s really compelling to people, but it requires them to buy into the lock-in of this Google App Engine thing. You can&#8217;t run elsewhere, and you&#8217;re going to have to be able to color within their lines. There&#8217;s a definite limitation on what you can and cannot do there. You are betting the farm on that being the right way for you to architect things.</p>
<p>We&#8217;ve built this platform as a service that takes the low-level building blocks of something like Amazon or another infrastructure as a service provider, and we have done the research on how to assemble the best practices. We have figured out how to load balance a scalable application farm, a database with replication, and all this kind of stuff for you, and we&#8217;ve figured out how to work with elastic block storage, and snap-shotting, and all that kind of stuff.</p>
<p>We&#8217;ve taken that and wrapped it up into a nice little push button platform as a service, where it&#8217;s pretty much just as easy to get going as with App Engine or something. You tell us about your application code and click a few buttons, and we&#8217;ll build a whole cluster for you.</p>
<p>But it still looks like your old school Web 2.0 deployment. You&#8217;ve got your load balanced applications and your MySQL or Postgres databases. You get SSH access with root login on your boxes, so you can still login and see what&#8217;s going on, and it feels familiar to people.</p>
<p>Our platform as a service is kind of in the middle of that. You can still not really know what you&#8217;re doing and come and just click a few buttons, and get this full cluster and everything up and running, and then add capacity and remove capacity as your traffic goes up and down.</p>
<p>But there are no limitations as far as what you can run there. It&#8217;s just Linux. You have root access, so anything you have been running on Linux servers will still run on this platform. There&#8217;s none of this color-within-the-lines stuff, where you can&#8217;t do a certain thing on a platform because they don&#8217;t allow it.</p>
<p><b>Scott:</b> Will it actually spin up and spin down nodes as demand increases and decreases?</p>
<p><b>Ezra:</b> We haven&#8217;t released that feature yet, but it&#8217;s something that we&#8217;re working on right now, and we hope to release it fairly shortly. Right now we can basically do that, but you would get alerts, and then you would have to make a manual decision to click a button to add or remove a node. We are working on adding the whole auto-scaling thing.</p>
<p>Auto scaling is really hard to do in a least common denominator type of way that works for everybody, because it can be so application specific. But we can kind of get away with it because we&#8217;re doing this platform that is all Ruby applications. They all have very similar needs and profiles in terms of how they work.</p>
<p><b>Scott:</b> I don&#8217;t know your guys&#8217; architecture, but one of the scenarios somebody mentioned to us was, &#8220;It&#8217;s not memory or CPU that I&#8217;m concerned with; I care about the queue length in the Simple Queue Service. And based on more queue items, I want to spin up more nodes.&#8221;</p>
<p><b>Ezra:</b> Right. If you just build an auto scaling system that adds nodes when your load average goes above X, it&#8217;s never going to work that well. You&#8217;re either adding too many nodes all the time, or not keeping up, or whatever.</p>
<p><b>Scott:</b> Or not adding them to the right tier.</p>
<p><b>Ezra:</b> Exactly. That&#8217;s not super helpful, so what we&#8217;re trying to do is build something that takes load into account as one of the factors. </p>
<p>Like you said, maybe they&#8217;ve got a queue they&#8217;re listening to. If the disk goes above X, maybe they want to spin up two more nodes. Or maybe their application is processing X requests a second, and if they hit a certain threshold, they want to scale up or down in a specific way.</p>
<p>We&#8217;re trying to build in some general heuristics that are least common denominator, but still allow it to be a little more intelligent about what it does.</p>
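<p><i>Illustration (not from the interview):</i> a minimal sketch of a scaling decision that treats load average as one factor among several, alongside queue length and request rate. This is not Engine Yard&#8217;s implementation; the function name <code>scaling_decision</code>, the metric names, and every threshold in <code>DEFAULT_LIMITS</code> are assumptions made up for the example.</p>

```ruby
# Hypothetical thresholds. A real system would tune these per application.
DEFAULT_LIMITS = { max_load: 4.0, max_queue: 100, max_rps: 500, min_rps: 50 }.freeze

# Returns :scale_up, :scale_down, or :hold for one application tier.
# Any single overloaded signal triggers a scale-up; a scale-down needs
# every signal to agree that the tier is idle.
def scaling_decision(load_average:, queue_length:, requests_per_second:, limits: DEFAULT_LIMITS)
  overloaded = [
    load_average > limits[:max_load],
    queue_length > limits[:max_queue],
    requests_per_second > limits[:max_rps]
  ]
  return :scale_up if overloaded.any?

  idle = [
    queue_length.zero?,
    limits[:min_rps] > requests_per_second,
    limits[:max_load] / 2.0 > load_average
  ]
  idle.all? ? :scale_down : :hold
end

# CPU looks calm, but the work queue is backing up, so the tier scales up.
puts scaling_decision(load_average: 1.0, queue_length: 250, requests_per_second: 100)
```

<p>Requiring all idle signals before scaling down is one conservative choice among many; it biases the sketch toward keeping capacity rather than dropping nodes too early.</p>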
<p><b>Scott:</b> Amazon&#8217;s an infrastructure as a service, and VMware&#8217;s approach doesn&#8217;t look like they&#8217;re going to stand up five massive data centers in five places on the planet. It looks like they have more of a strategy of going after existing hosters and getting them to be vCloud partners.  By being a vCloud partner, they&#8217;re going to have an API exposed to them that lets them programmatically control their instances, kind of like what they can do with AWS. They&#8217;re going to have a self-service provisioning portal, an admin console, and things like that.</p>
<p>Do you see the Rails stuff that you guys are working on down the line supporting this heterogeneous ecosystem of infrastructure as a service providers?</p>
<p><b>Ezra:</b> That&#8217;s definitely something we&#8217;re targeting. We&#8217;ve been talking with VMware a lot lately about the vCloud stuff, and we had an early demo of our platform as a service working against vCloud that we demoed during the keynote at VMworld in Paris six months ago.</p>
<p>We&#8217;ve got the AWS thing down, and it&#8217;s working great. We&#8217;ve got lots of customers on there, and the customer count is growing quickly. We went GA with our platform at the end of September, so this first version is fairly feature complete. </p>
<p>Porting our system to work against vCloud would give us a bunch of bang for our buck, because that&#8217;s not going to tie us to just one infrastructure as a service provider. If we code to the vCloud API, there&#8217;s going to be 10 or 20 of these vCloud approved service providers that we can point our application at, and go to town. Then we know how to instantiate a cluster that would run your applications on AWS or on Terremark&#8217;s vCloud, or on Hosting.com&#8217;s vCloud, or any of these other places. </p>
<p><b>Scott:</b> And that&#8217;s not the piece you are focused on, right? You&#8217;re focused on your framework, but like I said, you could envision a future where there&#8217;s a Rails cloud framework and a pile of hosts that people can choose from, and maybe some other Java cloud framework or PHP cloud framework or .NET cloud framework, or whatever, and they could mix and match.</p>
<p>In that scenario, if a given department prefers a certain language, there&#8217;s a cloud-based, auto scaling dynamic framework that supports that. That also provides a choice of development models and a choice of places to host it that can also be mixed and matched.</p>
<p>Today, I could run ASP.NET on Rackspace and run PHP on Hosting.com, but those are not really cloud frameworks, and those providers aren&#8217;t really in the path. The hosting providers haven&#8217;t really provided an elastic resource, and it&#8217;s been fairly static kinds of units that people have to buy in.</p>
<p>It seems like there&#8217;s a gravitational attraction toward the programming frameworks becoming better at cloud development and the traditional infrastructure providers becoming better at being cloud infrastructure providers.</p>
<p><b>Ezra:</b> It makes a lot of sense. For example, with Google App Engine, you really have to code to their platform to run your app on there. Everything you write in your app has to take into account what they&#8217;ve done there, and how to use a different data store and everything.</p>
<p>But with our platform it&#8217;s just your typical Web 2.0 stack with your applications here, your memcache here, and your database here. It looks familiar to people, and they can run their existing applications on it but still take advantage of the scalability in the cloud.</p>
<p>If they get their application running against our platform on AWS, as soon as we add other back ends, they&#8217;re going to be able to click a button to deploy it out onto a different cloud without any changes in their code or anything. </p>
<p>We&#8217;ll take care of the differences in architecture underneath and still expose the same interface to the application developers. They can just worry about writing their code and let us take care of figuring out the best architecture for each different cloud.</p>
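<p><i>Illustration (not from the interview):</i> the idea of exposing one interface to developers while swapping the infrastructure underneath can be sketched as a simple adapter arrangement. The class names <code>AwsBackend</code>, <code>VcloudBackend</code>, and <code>Cluster</code> are hypothetical, and neither backend calls a real provider API; each just returns a label where real provisioning logic would go.</p>

```ruby
# Hypothetical provider adapters. Each stands in for provider-specific
# provisioning logic and returns a label instead of calling a real API.
class AwsBackend
  def launch(role)
    "aws:#{role}"
  end
end

class VcloudBackend
  def launch(role)
    "vcloud:#{role}"
  end
end

# The platform describes the cluster once and delegates to whichever
# backend was chosen, so the application description never changes.
class Cluster
  ROLES = %i[load_balancer app_server database].freeze

  def initialize(backend)
    @backend = backend
  end

  # Launches one node per role on the configured backend.
  def deploy
    ROLES.map { |role| @backend.launch(role) }
  end
end

puts Cluster.new(AwsBackend.new).deploy.inspect
puts Cluster.new(VcloudBackend.new).deploy.inspect
```

<p>Adding another provider in this arrangement means writing one more class that responds to <code>launch</code>; <code>Cluster</code> and the application code stay untouched.</p>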
<p><b>Scott:</b> One other facet is that if people want to run this framework internally, they can benefit from using the same framework internally and externally.</p>
<p>Is the stuff you guys are working on mainly a Linux thing, or could you see people running Rails 3 on Solaris? Does it work on Windows? What degree of choice do people have there?</p>
<p><a name="flex"></a></p>
<p><b>Ezra:</b> Rails does work on Windows and Solaris, but Ruby in general is kind of sub-par on Windows compared to Linux, and nobody really deploys production Ruby stuff on Windows unless they absolutely have to integrate with some .NET or other Windows-specific stuff. </p>
<p><b>Scott:</b> As you look at the other choices that people have in that Linux ecosystem, a lot of times it&#8217;s Java, PHP, and a smattering of other things. And then there&#8217;s Rails, which has gained a lot of popularity in a relatively short period of time. </p>
<p>Popular conception seems to portray Java on one end as the strongly typed, object oriented, structured enterprise programming language and PHP on the other side that lends itself to learning in a weekend. Coming from a Microsoft background, I think of it like VB for the Web. There&#8217;s a very low barrier to entry, but the people with computer science degrees point at it as not real programming.</p>
<p>As a development framework in general, and then thinking about cloud development scenarios of the future, what do you think makes Rails a good fit?</p>
<p><b>Ezra:</b> I think that Rails is really good at building a prototype and getting something out and usable to your customers very quickly. It makes some of these decisions for you in starting a new project. </p>
<p>If you&#8217;re starting a project in Java, you&#8217;ve got to sit down and figure out what application server you&#8217;re going to be using, what choice of frameworks, and which ORM, and which XML config files. You&#8217;ve got to piece that all together, and then spend a few days configuring it, so there&#8217;s friction to starting something new and just trying it out.</p>
<p>Rails has a really nice out of the box experience, where it makes all these decisions for you that are fairly sane. You can tweak them later if you need to, but it allows you to just jump the gun and get something started very quickly. And then there&#8217;s a whole slew of plug ins for doing social networking things, friend of a friend, or sending emails, or all this stuff just out of the box you can plug in with very little code.</p>
<p>One of the big advantages of Rails I see is that it allows you to take an idea and have something to show people in a week or two with a couple guys developing on it. That might take three or four months to get the same prototype in a Java system. It&#8217;s just very rapid development.</p>
<p>You can rapidly prototype a website, get it in front of your customer, and then iterate based on real customer feedback, rather than having these really long iterations where you get a bunch of specifications from a customer, and then you come back to them three months later. </p>
<p>Then the customer doesn&#8217;t like it, so they give you a bunch more specifications. Then you come back three months later, and maybe you&#8217;re kind of ships sailing past each other in the night, not really getting the feedback that you need.</p>
<p>With Rails, you could be showing stuff to your customer every day, or every couple of days, and then iterating based on their real time feedback. You can just be so much more productive with it, and I think that&#8217;s where its big advantage is.</p>
<p>As your app grows up and becomes more complex, Rails can support that too. And it has tunables and other options so you can get in there to make it really scale. There are good stories for the stack and the way to scale it now.</p>
<p>One of the other big advantages is that Rails enforces a directory structure, the way the app&#8217;s laid out, and the testing frameworks and so forth. That means you can hire a Rails developer, and they&#8217;re going to be able to find their way around pretty easily. There&#8217;s not going to be a month or two of learning how you customize J2EE or whatever. So there&#8217;s this shared knowledge between all Rails apps that allows you to get programmers up to speed much quicker.</p>
<p><b>Scott:</b> When you compare it to PHP on the other end, is the difference the model-view-controller paradigm, or what? Because people think of PHP as being a very rapid way to stand something up and prototype it.</p>
<p><b>Ezra:</b> You can also rapidly prototype stuff in PHP, but when you do and then it gets successful, you&#8217;re probably going to have to rewrite it entirely because it&#8217;s probably a spaghetti mess. With Rails, you could rapidly prototype something, and then if that prototype is successful, you&#8217;re built on a very solid foundation where you can keep scaling that project and keep it maintainable as you add developers and stuff.</p>
<p><a name="scale"></a></p>
<p><b>Scott:</b> With things like Google and Google App Engine and Bigtable, it seems that the popular approach to scaling a website is to use a traditional kind of relational database back end. It could be MySQL, PostgreSQL, Oracle, SQL Server, take your pick.</p>
<p>But if you really want to scale infinitely from the beginning, you&#8217;ve got to build on top of one of these distributed, shared databases, whether that&#8217;s SimpleDB or Bigtable or whatever. Talk to me about the options that you have at the data layer.</p>
<p><b>Ezra:</b> There&#8217;s this whole NoSQL movement going on, where people are exploring other databases, because they&#8217;ve outgrown MySQL or PostgreSQL. The reality is that I think there&#8217;s a little bit too much hype around that stuff right now.</p>
<p>Most people will never outgrow MySQL or PostgreSQL, even though we have a lot of people who come to us saying that they&#8217;re going to be the next Google or Twitter, so they have to be able to scale huge.</p>
<p>They might lose a lot of time by trying different options, whereas they&#8217;d be better off just starting with a SQL database and getting something out there fast. Then, as they hit the ceiling, they can move portions of that out into some kind of key value store or something else.</p>
<p>We have support for a couple of different key value stores on our cloud. You can use Redis, or Tokyo Cabinet and Tokyo Tyrant. We have out-of-the-box support for those two. We also like MongoDB a lot. And also MySQL with as many replicas as you want.</p>
<p>The progression usually starts with the thing on a MySQL server and a couple of app servers, and as I scale those up, I might add a slave or two to the MySQL server and do read-only queries against the slaves to scale out my reads.</p>
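<p>The read/write split described here can be sketched roughly as follows. This is only an illustration, not Engine Yard&#8217;s implementation: <code>FakeConnection</code> and <code>ReadWriteRouter</code> are hypothetical names, the connections merely record queries instead of talking to MySQL, and treating anything that starts with <code>SELECT</code> as a read is a simplifying assumption.</p>

```python
import itertools


class FakeConnection:
    """Hypothetical stand-in for a database connection; it only records queries."""

    def __init__(self, name):
        self.name = name
        self.queries = []

    def execute(self, sql):
        self.queries.append(sql)
        return self.name


class ReadWriteRouter:
    """Send writes to the master and rotate reads across the replicas."""

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master for reads when there are no replicas yet.
        self._read_pool = itertools.cycle(replicas or [master])

    def execute(self, sql):
        # Simplifying assumption: any statement starting with SELECT is a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        target = next(self._read_pool) if is_read else self.master
        return target.execute(sql)


master = FakeConnection("master")
replicas = [FakeConnection("replica-1"), FakeConnection("replica-2")]
router = ReadWriteRouter(master, replicas)

served_by = [
    router.execute("INSERT INTO users (name) VALUES ('ezra')"),
    router.execute("SELECT * FROM users"),
    router.execute("SELECT * FROM users"),
    router.execute("SELECT * FROM users"),
]
print(served_by)  # ['master', 'replica-1', 'replica-2', 'replica-1']
```

<p>With asynchronous replication a replica can lag behind the master, so a real router would probably keep reads that must see a just-written row on the master.</p>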
<p>As I start to swamp that master database with my write capacity, I might move certain features out into a memcache or a Tokyo Tyrant tier where I don&#8217;t need as sophisticated data access patterns and ad hoc queries; I just need raw speed and scalability of storing a ton of data. </p>
<p>If you have some app that really just screams out, &#8220;I don&#8217;t need a database, I just need a key value store,&#8221; you might start with one of those from the beginning. Still, you&#8217;re going to want to keep your customer and your transactions&#8211;anything to do with money&#8211;in a SQL database with the proper constraints and stuff.</p>
<p>But there is quite a bit of stuff, especially in web pages. You can store rendered HTML template fragments in a memcache or a Tokyo Tyrant, because they might be pretty expensive to pull all the data and build a template, and then you might cache the result of that.</p>
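<p>The fragment caching mentioned above is commonly done with a cache-aside lookup. The sketch below assumes an in-memory dictionary in place of memcache or Tokyo Tyrant; <code>FragmentCache</code>, <code>render_sidebar</code>, and <code>cached_fragment</code> are hypothetical names chosen for illustration.</p>

```python
class FragmentCache:
    """In-memory stand-in for a key-value tier such as memcached."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


render_calls = 0


def render_sidebar(user_id):
    """Pretend to be the expensive step: pull the data and build the template.

    A real application would return rendered markup; plain text keeps the
    sketch short.
    """
    global render_calls
    render_calls += 1
    return "sidebar fragment for user {}".format(user_id)


def cached_fragment(cache, key, render):
    """Cache-aside: return the cached fragment, rendering it only on a miss."""
    fragment = cache.get(key)
    if fragment is None:
        fragment = render()
        cache.set(key, fragment)
    return fragment


cache = FragmentCache()
first = cached_fragment(cache, "sidebar:42", lambda: render_sidebar(42))
second = cached_fragment(cache, "sidebar:42", lambda: render_sidebar(42))
print(first == second, render_calls)  # True 1
```

<p>The second call never reaches the renderer. A production setup would also need an expiry time or an explicit invalidation step when the underlying data changes, which this sketch leaves out.</p>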
<p>The idea is basically to scale MySQL up as far as you can go and then start taking features and pushing them off into a key value store to take a load off of the database, and to have a system where you&#8217;ve got a relational database augmented by key value stores, and you&#8217;re using them each for what they&#8217;re really good at.</p>
<p><b>Scott:</b> That ties into another thought I had. As some sites have gotten really popular, they have crashed under their load, so they pushed all their images and static data into S3. The page would just have a bunch of image tags, and it would pull the images from Amazon S3. They would off-load a ton of bandwidth from their servers, and the user wouldn&#8217;t really see any difference.</p>
<p>There were a bunch of case studies around that, which told the story of how Amazon S3 saved some sites when they got popular, because they were able to off-load most of their bandwidth, without really off-loading any of their processing.</p>
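<p>As a rough illustration of that off-loading, the page only needs to emit URLs that point at the bucket instead of the site&#8217;s own servers. The sketch assumes the files have already been uploaded to a hypothetical bucket named <code>example-assets</code> under the same paths; <code>s3_url</code> and <code>offload</code> are illustrative helpers, not part of any library.</p>

```python
from urllib.parse import quote


def s3_url(bucket, key):
    """Build a virtual-hosted-style S3 URL for an object key."""
    return "https://{}.s3.amazonaws.com/{}".format(bucket, quote(key))


def offload(paths, bucket):
    """Map local static paths to the URLs the page should reference instead."""
    return {path: s3_url(bucket, path.lstrip("/")) for path in paths}


urls = offload(["/images/logo.png", "/images/team photo.jpg"], "example-assets")
for local_path, remote_url in urls.items():
    print(local_path, remote_url)
```

<p>The application servers still render the page; only the image bytes are served from elsewhere, which matches the point that bandwidth moves while processing stays put.</p>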
<p>Like you said, Google really forces you to color within their lines, with a limited choice of programming languages, no relational data store, and not really great support for a background process. It&#8217;s still sort of under the constraints of a request and a response.</p>
<p>People have taken the part that won&#8217;t fit in those lines and stuck that over on Amazon, and they just call out to it from the Google App Engine code, or those kinds of things. People I&#8217;ve talked to who have programmed on Google App Engine have said that data store is the paradigm shift. </p>
<p>That&#8217;s the hardest thing to wrap your head around if you&#8217;re used to programming against a relational database, which everybody is. Paradigm shifts are hard. It took 20 years to go from procedural programming to object oriented programming.</p>
<p><b>Ezra:</b> I think those things are definitely in the future for a lot of apps and a lot of programmers, but like you said, they are paradigm shifts, and a lot of people just aren&#8217;t ready to make that shift yet. Cloud computing is pretty young still, and everybody is still trying to figure out how it&#8217;s going to work, what they&#8217;re going to use it for. </p>
<p>There&#8217;s another good five years here where people need to move their legacy applications to the cloud, or they want a cloud environment that looks similar to their on-premises stuff, but they might not be ready to make the jump to the full-on Google App Engine style.</p>
<p>I think something like what we&#8217;ve built will be more compelling for the next several years. It takes advantage of the elasticity of cloud, and the programming capability of the APIs and stuff, but it still kind of looks like your traditional infrastructure.</p>
<p><b>Scott:</b> It might be compelling for a lot more than the next five years. At the end of the day, free is really nice, and it is going to attract a lot of customers. But, the corollary is that those providers are not necessarily going to make a lot of money off of those customers, because they were attracted by something that&#8217;s free.</p>
<p>You can kind of democratize and commoditize a certain kind of lower end of the hosting space, but I look at you guys, and I think, OK, there&#8217;s a framework that I could run. I could run it internally. I could in the future run it on a variety of external partners, whether it&#8217;s Google, or vCloud or whoever. It&#8217;s one of the frameworks I could choose. </p>
<p>Tell me if this is part of your pitch, but I could see companies that are looking at this saying, &#8220;Well yes, but you&#8217;re a small company, you&#8217;re a startup, we don&#8217;t really want to bet on you.&#8221; Is this a place where open source helps, since you&#8217;re able to tell them that they&#8217;re not really betting on you but rather on the longevity of an open source project?</p>
<p><b>Ezra:</b> I think that one of the strengths of our software is that it is built entirely of open source components. We have a lot of proprietary stuff that ties it all together, but at the end of the day, the stuff running your app is Linux, nginx, Rails, MySQL, memcache, and a bunch of other stuff that&#8217;s all open source.</p>
<p>If we went out of business or you didn&#8217;t want to host with us anymore, you&#8217;re not locked in. You can pick up your app and go elsewhere, and be able to put something together that looks close enough where you&#8217;re not going to have to rewrite your code. That helps, because you don&#8217;t get locked in when you use our platform like you would with App Engine or something.</p>
<p><b>Scott:</b> We&#8217;re getting near the end of our allotted time, so is there something interesting I didn&#8217;t ask about?</p>
<p><a name="paradigm"></a></p>
<p><b>Ezra:</b> I think we&#8217;ve covered the main points. The whole thing is a paradigm shift, since there are still a lot of people that I call &#8220;the server huggers,&#8221; who really like racking and stacking servers. They like seeing all those blinking lights, and they like having direct control over the servers and being able to choose the hardware down to every last card. Those guys are having a bit of a harder time with this transition.</p>
<p>I think the whole transition to cloud computing is really going to come to a head and force a merger between typical operations guys and developers. Since the infrastructure is programmable via API, developers are enabled with the power to control the infrastructure without having to call the systems guy every time they need a new server.</p>
<p>I think that your traditional operations guys are going to have to learn to become developers, and I think your traditional developers are going to have to learn more about operations, so that they can play within this new environment that can be totally programmable. I think that&#8217;s going to be one of the big paradigm shifts here.</p>
<p><b>Scott:</b> It&#8217;s also interesting that you mentioned that in order to start out, you had to scrape together $100,000 and put together a small data center. Today, the distance between an idea and a product in the market is getting smaller. </p>
<p>That&#8217;s due in large part to the work that companies like you, Amazon, and Google are doing. The startup money that somebody needs to take an idea and run with it is far less, and if that notion doesn&#8217;t work, they can fail fast and try something else.</p>
<p>It seems that these technologies are just letting people try things out and spend a lot less time doing the market opportunity analysis and writing the business plan, and instead just trying stuff.</p>
<p><b>Ezra:</b> The whole switch from Capex to Opex is a real game changer, and you&#8217;re right that it is largely cloud-enabled. If a couple of guys at a university have an idea that needs a couple hundred servers to run, in the old days, they would have to have a lot of money to buy a hundred servers and rack them up and get them available. </p>
<p>With the cloud, you can boot a hundred servers and play with them for a few days and shut them off by just spending 100 bucks or something. Being able to try out an idea that easily and see if it&#8217;s feasible or not is a game changer.</p>
<p>That flexibility is really compounded by things like Rails, which are enabling really rapid development of these sites. It gives a lot more power to individuals, and smaller teams these days can build pretty complicated products very fast and then scale them very easily. </p>
<p>That&#8217;s huge, I think. That&#8217;s one of the biggest things that this whole cloud trend is changing. All these open source technologies, along with the infrastructure on demand, really do allow people to just try big ideas and then fail fast. Like you said, if it&#8217;s not going to work, they&#8217;re not going to be out a lot of money.</p>
<p><b>Scott:</b> It also enables departments within a 150,000 person company to be really entrepreneurial and really experiment with things. The boss could practically try out an idea with his credit card, and the budget could go on an expense report. </p>
<p>Departments can go outside of traditional IT because they&#8217;ve got a budget and for things that are below a certain threshold, they don&#8217;t need approval. They can have a hosted CRM system or mail accounts or all kinds of things, including their own servers in the cloud that their IT department doesn&#8217;t even know about. They&#8217;re completely off the radar. </p>
<p>On one hand, IT cringes at that, but on the other hand, companies want risk taking, especially if there isn&#8217;t even really that much risk with this kind of entrepreneurial spirit, because it&#8217;s just hard to predict what&#8217;s going to work.</p>
<p><b>Ezra:</b> There&#8217;s a big push for agile development. Test-driven development, Scrum, Agile, and all this kind of stuff has really changed the way people write software. They do rapid iterations and they test and they get stuff done very fast. </p>
<p>I see this whole space we&#8217;re in as enabling agility and deployment as well, because it used to be that maybe your development team is rapidly iterating and coming up with new features and all this stuff, but then there&#8217;s this big brick wall to get it deployed. They had to hand it to the ops guys and order servers and wait. </p>
<p>It was a big ordeal to get stuff deployed, but this new approach totally wipes that out and allows you not only to rapidly develop your product but also continuously deploy it and rapidly have it online and visible. It tears down that last wall of having your product go from idea to actually being out alive in the world and doing something useful.</p>
<p><b>Scott:</b> I was just reading &#8220;Information Week&#8221; and there was a CIO saying, &#8220;One of the reasons I&#8217;m going with a hosted solution is my people never have to perform an upgrade. They never have to upgrade the CRM system. They never have to upgrade the mail system. Those used to be big projects that took lots of planning and lots of experimentation, and I will never have to devote staff to doing an upgrade. And at the same time, we feel safer making customizations because there&#8217;s a little bit more of a feeling that that SaaS provider isn&#8217;t going to break everybody&#8217;s customizations, because they&#8217;ve built the sandbox that we can customize within.&#8221;</p>
<p>And so they&#8217;ve got better visibility into what people are doing and not doing because it isn&#8217;t the case anymore that just anybody could be doing anything. They built the sandbox up fairly incrementally in terms of customers wanting to be able to do X, so we let them do X. Well, now they know they can do that, so if they&#8217;re upgrading their infrastructure, they&#8217;re able to take that into account.</p>
<p>I guess we are out of time. Thanks for talking with me today.</p>
<p><b>Ezra:</b> Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2009/11/09/interview-with-ezra-zygmuntowicz-engine-yard/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
