Eucalyptus is an open-source project that gives you the Amazon cloud API for
non-Amazon clouds.  In this interview, we chat with Rich Wolski, CTO of
Eucalyptus Systems, about the project and the company.

Scott Swigart: Could you start off by introducing yourself and Eucalyptus?

Rich Wolski: Sure. I’m the Chief Technology Officer and the college professor whose research group did work on Eucalyptus originally at the University of California, Santa Barbara.

The project began as a computer science research investigation. We were looking at how to combine large-scale HPC programs in fields like chemistry, physics, weather forecasting, and biology with the emergence of utility computing offered by third parties, which is to say, cloud computing.

We started looking at that question in an empirical way, using a driver application of weather forecasting code. The goal was to combine Amazon’s AWS, a number of university data centers, and the NSF supercomputer center site in one large execution platform for this legacy weather code. To do that, we actually wound up having to build an interface layer for that code that could run in university data centers.

To make our programming job easier, we made that interface layer look just like Amazon AWS. We were faced with the problem, from the application perspective, of porting the code to Amazon. That interface layer–that emulation, if you will, for university data centers–is what we released as open source, which became Eucalyptus. That’s how we got ourselves into commercialization.

It’s an open-source platform that’s actually architected to support a bunch of different APIs, with flexibility in terms of virtualization, operating systems, and hardware. It implements cloud computing with a sufficiently high-fidelity imitation of Amazon that the weather forecasting code we were working with couldn’t really tell the difference between when it was running at Amazon and when it was running at the universities.

Scott: When I think of things like the Amazon API, there are two things that spring to mind. The first is the API that you access, generally remotely. From my machine here I could spin off a hundred instances and provision them with Amazon machine images.

I could shut them down, I could take snapshots of images or take snapshots of the block store. All of the admin stuff that you can do through the management console is what I think of as the Amazon API.

If I understand what you just said, you guys replicated that API so that if I’ve got code locally that’s spinning up instances and kicking off jobs on Amazon, it could also kick off jobs through, say, a third-party vCloud provider, as long as they’ve bolted on Eucalyptus.

It’s exposing that same API, so I can federate my clouds and do the same logical operations even though, for example, starting an instance on VMware’s infrastructure is different from doing so on Amazon’s. Do I understand correctly that you have built a sort of abstraction layer so that I don’t really see that as a user?

Rich: Yeah, although I think it’s important to understand that that abstraction layer is more than just the API. Really, what Eucalyptus does is implement the cloud abstractions in an essentially technology-neutral way.

There is a set of abstractions that all successful clouds today support, and we’ve figured out how to implement those abstractions independently of the API that accesses them and independently of the virtualization technology that is used to implement them.

Eucalyptus is in the middle, as a generic cloud infrastructure. It is built so that you can manipulate that cloud infrastructure using a number of different APIs, although the one that has been most successful for us is Amazon. That’s the one that we ship with open source, but we’ve done others.

On the bottom part of it, there are lots of different virtualization technologies that can be used to realize those cloud abstractions, both in open source and in proprietary forms. Eucalyptus is built so that it could use any of those virtualization technologies to implement the abstraction.

Scott: So, in other words, you are positioning Eucalyptus to be the lingua franca of the clouds, in terms of starting instances, stopping instances, and that sort of thing.

Rich: Right. There really are three sets of abstractions that most clouds support. There are VM abstractions for starting, stopping, and manipulating virtual machines. There are abstractions having to do with storage, such as what persists, what does not, how it is accessed, and what the range of access is. Finally, there are abstractions having to do with networking: how your VMs are named, how can they communicate with each other internally and externally, and those types of things.

We have an internal framework that is able to combine different virtualization technologies to implement those abstractions and then export them via different APIs.
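The layering described here, with APIs on top, technology-neutral abstractions in the middle, and interchangeable virtualization technologies underneath, can be sketched roughly as follows. This is only an illustration of the idea, not Eucalyptus code: every class, function, and identifier below (`Hypervisor`, `FakeXen`, `FakeKvm`, `CloudCore`, `ec2_style_front_end`, the `img-demo` image name) is hypothetical, and only the VM abstraction is shown.

```python
from abc import ABC, abstractmethod


class Hypervisor(ABC):
    """Hypothetical back-end interface: one subclass per virtualization technology."""

    @abstractmethod
    def boot(self, image_id):
        """Start a VM from an image and return a back-end-specific handle."""


class FakeXen(Hypervisor):
    # Stand-in back end; a real one would talk to the hypervisor.
    def boot(self, image_id):
        return f"xen-domain:{image_id}"


class FakeKvm(Hypervisor):
    # A second stand-in back end with a different handle format.
    def boot(self, image_id):
        return f"kvm-guest:{image_id}"


class CloudCore:
    """Technology-neutral VM abstraction between the APIs and the hypervisors."""

    def __init__(self, hypervisor):
        self.hypervisor = hypervisor
        self.instances = {}  # instance id -> back-end handle

    def start_vm(self, image_id):
        instance_id = f"i-{len(self.instances) + 1:04d}"
        self.instances[instance_id] = self.hypervisor.boot(image_id)
        return instance_id


def ec2_style_front_end(core, request):
    """Hypothetical API adapter: maps an EC2-like action onto the core abstraction."""
    if request["Action"] == "RunInstances":
        return {"instanceId": core.start_vm(request["ImageId"])}
    raise ValueError(f"unsupported action: {request['Action']}")


# The same request works unchanged whichever back end is plugged in.
request = {"Action": "RunInstances", "ImageId": "img-demo"}
for backend in (FakeXen(), FakeKvm()):
    core = CloudCore(backend)
    print(ec2_style_front_end(core, request), core.instances)
```

The point of the sketch is that the caller's request never changes; only the object handed to `CloudCore` does. A second front end for a different API could reuse `CloudCore` in the same way.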

Scott: When an instance is running on Amazon, the instance itself doesn’t really persist data. So if you shut that instance down, its “local drive” basically gets blown away, whereas with other providers, that may not necessarily be the case. Or you may set up a portion of your data center so the instances are persistent.

How do you deal with those underlying differences that surface up through whatever kind of cloud or virtualization architecture somebody is using internally?

Rich: That’s a great question, and the answer is that the Amazon API, and in fact all the other APIs we’ve looked at, have really well-developed notions of Quality of Service attribution. I’ll just speak about Amazon concretely, because that’s the one that seems to be motivating a lot of others.

In Amazon, there is this notion of VM type, and there is also a notion of availability zone. Availability zones in Amazon are there specifically to give you some idea of independent failure probability. Two different VMs working in different availability zones have independent failure probabilities, so you can calculate the likelihood that you’ll lose both of them simply by calculating the joint probability.

We actually co-opt that a little bit and use the notion of availability zones as a collection of common QoS characteristics. One of those could be “is it failure-independent?” but that is not necessarily the only one.

For example, you could have two different availability zones in a Eucalyptus cloud, where the underlying infrastructure is implemented in two completely different ways. And what you want to export to your users is the notion that each specific availability zone has a particular set of QoS attributes.

Therefore, if VM persistence is an attribute that you want to export in one or more of your availability zones, you can do so. Often, there are capacity-allocation issues associated with that. If you’re going to do VM persistence, you have to have more of a persistent store available, and the Eucalyptus administrator is responsible for doing that capacity matching, so that the size of the availability zone, the things that can go in it, the users that are entitled to request image starts from it, and so on and so forth, are properly controlled.
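To make the idea of an availability zone as a bundle of QoS attributes concrete, here is a minimal sketch. It assumes a hypothetical `AvailabilityZone` record with made-up attribute names and made-up numbers; none of this comes from the Eucalyptus or Amazon APIs. It also shows the joint-probability calculation mentioned earlier, which holds only when the two zones really do fail independently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvailabilityZone:
    """Hypothetical record: a zone advertised as a bundle of QoS attributes."""
    name: str
    failure_probability: float  # assumed chance of losing the zone in some period
    persistent_vms: bool        # whether instance disks survive shutdown
    capacity: int               # VM slots the administrator has provisioned


def zones_offering(zones, *, persistent_vms):
    """Return the names of zones whose persistence attribute matches the request."""
    return [z.name for z in zones if z.persistent_vms == persistent_vms]


def joint_failure_probability(zone_a, zone_b):
    """Probability of losing both zones; valid only if their failures are independent."""
    return zone_a.failure_probability * zone_b.failure_probability


# Illustrative values only.
zones = [
    AvailabilityZone("zone-a", failure_probability=0.01, persistent_vms=False, capacity=200),
    AvailabilityZone("zone-b", failure_probability=0.02, persistent_vms=True, capacity=50),
]

print(zones_offering(zones, persistent_vms=True))
print(f"{joint_failure_probability(zones[0], zones[1]):.6f}")
```

In this toy setup the persistent zone has the smaller `capacity`, which mirrors the capacity matching the administrator has to do when persistence requires more storage.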

Scott: Does costing show up in the API at all, in terms of VM, storage, and network resources? In other words, is there a way through the API to know what an instance is costing you, or is that something that you need to know in advance?

Rich: That’s actually something we think about a lot. There’s no problem in putting the cost into the API, but it’s difficult in many cases to compute exactly what that cost should be automatically. If you’re charging against a credit card, you can calculate the occupancy cost or the number of bytes or whatever. There’s some monitoring that’s necessary to do that, but you can do it.

On the other hand, if you’re making calculations based, for instance, on the financial costs of running a VM, it’s far more complex. There are factors based on the cost of the hardware, the operating costs, and a specific life span of that hardware in a specific company. It’s a very difficult problem, because of all the factors that go into the cost benefit associated with owning that machine.

Of course, you can use average costs. Rather than calculating the operating cost on a particular machine, you can assume a reasonable rate and feed that to the system as a parameter. It would be no problem to export that through the API, although I’m not sure you get much utility out of calculations based on a static number like that.
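As a rough sketch of the average-cost approach, the calculation might look like the following. The function names and every input (hardware price, life span, operating cost, VMs per host) are assumptions chosen for illustration; a real deployment would have to supply its own figures, which is exactly the hard part described above.

```python
HOURS_PER_YEAR = 24 * 365


def average_hourly_rate(hardware_cost, lifespan_years, yearly_operating_cost, vms_per_host):
    """Flat per-VM hourly rate derived from averaged, assumed inputs."""
    hardware_per_hour = hardware_cost / (lifespan_years * HOURS_PER_YEAR)
    operating_per_hour = yearly_operating_cost / HOURS_PER_YEAR
    return (hardware_per_hour + operating_per_hour) / vms_per_host


def occupancy_charge(hours_used, hourly_rate):
    """Charge for occupying one VM for a number of hours at a static rate."""
    return hours_used * hourly_rate


# Illustrative numbers only: one host amortized over two years, shared by 8 VMs.
rate = average_hourly_rate(
    hardware_cost=17520, lifespan_years=2, yearly_operating_cost=8760, vms_per_host=8
)
print(rate)
print(occupancy_charge(100, rate))
```

Because `rate` is a static parameter, the result says nothing about what the same hours would cost next week, which is the limitation raised in the discussion of dynamic pricing.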

On the other hand, Amazon has just announced spot pricing, which changes the equation. Now the price is going to be fluctuating dynamically, and you don’t want to know so much what something cost you in the past, as much as you want to know what it’s going to cost you in the future, and that’s a whole different ball of wax.

I do think that almost all of the APIs at some point will include some notion of cost, but the utility of that is really going to be predicated on how predictive it is. I don’t think the providers have gotten to a point yet where they can reliably produce those kinds of predictions, so I suspect we’re a little ways off before we see it.

Scott: I was thinking more in terms of the customer-facing price showing up in the API, rather than the internal costs. That seems to be a simpler matter, since as you said, there are a million things that factor into internal costs, many of which aren’t really programmatically discernable, such as what your people are costing you and those sorts of things.

The notion of spot pricing leads me to another thought. Today, an IT manager may have a workload that they think may be well suited to the cloud, because their demand is fairly elastic. Maybe the organization needs 100 nodes for just a week, or two hours, or whatever.

Customers only want to pay for peak capacity when they need it, so they’re manually looking at various cloud providers and trying to find a fit for their needs. They may also be considering an internal cloud that bursts to a public cloud as needed.

Of necessity, they’re looking at all these relationships very individually, in terms of the characteristics of each provider and so on.

I think about when web services were just catching on, as the trendy technology of the moment. There was a lot of talk at the time about web services marketplaces and directory services, where if you wanted a weather service, you’d go to a directory service and find one. The stuff you’re talking about seems to lend itself to the evolution of a kind of cloud marketplace, especially if cloud capacity really is a commodity.

In fact, that’s the direction that people like Amazon seem to be pushing it, where there are spot prices for instances, for CPU time, and that sort of thing. It would allow you to monitor a marketplace of many providers and place workloads dynamically based on changing prices. Am I thinking too sci-fi here, or do you actually see this happening?

Rich: I actually agree with everything you said, and I would enhance your description in the following way.

There are business uses for which commodity markets with fluctuating prices make a lot of sense, assuming that the offerings are completely interchangeable. Typically, customers buy the lowest common denominator, because it’s a volume business.

At the same time, there are real advantages in certainty and long-term contracts, and being able to repurpose things you buy based on unforeseen events. Spot pricing can be either a hindrance or a boon, since if the spot price happens to be poor when you really need to move on something, you might have been better off with a longer term fixed price fee for service.

There was a study at UC Berkeley, I believe, of network bandwidth. They tried all sorts of pricing models for people in their homes, and they found that, in fact, spot pricing for network bandwidth at home is a really bad idea. People liked it the least of the various pricing options that were made available to them.

I think there’s going to be a computational commodities market, and I think the key there is going to be interchangeability. You have to know that if you buy it from one provider or another, or if you run it in-house, you really are getting the equivalent service. And in computing that’s tricky, because it’s a multidimensional quality of service bundle, and you have to match on all dimensions. Multidimensional matching is a hard problem, particularly if things are statistical.
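A minimal sketch of matching on all dimensions before comparing price is shown below. It makes several simplifying assumptions that are not in the interview: each QoS dimension is a single number where higher is better, requirements are minimums, and the statistical nature of real measurements is ignored. The provider names, dimensions, and prices are invented.

```python
def satisfies(required, offered):
    """True only if the offer meets or exceeds the requirement on every dimension."""
    return all(offered.get(dim, 0) >= minimum for dim, minimum in required.items())


def cheapest_equivalent(required, offers):
    """Pick the lowest-priced offer among those that match on all dimensions."""
    matching = [o for o in offers if satisfies(required, o["qos"])]
    return min(matching, key=lambda o: o["price"], default=None)


# Illustrative requirement and offers.
required = {"cpu_cores": 4, "memory_gb": 8, "availability": 0.99}
offers = [
    {"provider": "public-a", "price": 0.30,
     "qos": {"cpu_cores": 4, "memory_gb": 16, "availability": 0.999}},
    {"provider": "public-b", "price": 0.20,
     "qos": {"cpu_cores": 4, "memory_gb": 4, "availability": 0.999}},
    {"provider": "on-premises", "price": 0.25,
     "qos": {"cpu_cores": 8, "memory_gb": 8, "availability": 0.99}},
]

best = cheapest_equivalent(required, offers)
print(best["provider"] if best else "no equivalent offer")
```

Here the cheapest offer is rejected because it falls short on one dimension, so price alone does not decide the placement.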

It’s kind of hard to do that equivalence relationship, but you can do it, at least to some approximation. And for the things where you can deal with spot pricing and make those kinds of decisions, I think it’s going to work just fine.

I don’t see that as the only way that public cloud providers are going to offer their services, though, and I also think that independent of whether the market is commoditized or not, the on-premises cloud is going to play a role, acting as a hybrid.

Scott: That makes sense relative to the electricity market, too. Governments, agencies, and companies enter into long-term contracts with each other to get determinism around price, but there’s also a spot market for power.

To turn to a different subject area, Eucalyptus lets you get to the notion of arbitrary infrastructure as a service. While each service provider may have its own unique set of value adds, broad availability facilitates federating workloads across providers.

It seems like platform as a service started to evolve almost completely separately, but they seem to be coming together a little bit more now. Google App Engine was probably the first platform as a service provider that a lot of people really noticed.

The first option was to go the infrastructure as a service route with Amazon, and then you were your own administrator. You had to figure out what OS to load, patch your own boxes, and so on, but you could basically do whatever you could with a virtual machine. On the other hand, you could go with Google, which required a bit of a rewrite, but there were all kinds of things you didn’t have to worry about.

Now there seems to be work happening toward being able to layer platform-as-a-service frameworks on top of infrastructure-as-a-service offerings. I can envision a future where you can sort of mix and match those.

How does Eucalyptus fit into that world, where you’ve kind of got arbitrary platforms as a service on top of arbitrary infrastructures as a service?

Rich: Let me start off by saying that when Eucalyptus was a research project, we teamed with another research group at UCSB that was interested in doing for Google’s App Engine what we did for Eucalyptus, namely building an open source version of GAE. They layered it on top of Eucalyptus, and it’s called AppScale. It’s still a research project today. You can get AppScale and Eucalyptus, run them in your data center, and have GAE and Amazon all to yourself.

From the beginning, we really thought that people would want to mix and match platform services and infrastructure services. I think it will be interesting to see how applications work at different levels in the stack. One of the key questions with regard to this layering is how much of the inter-relationship between the application and its data can be explicitly represented.

With platform as a service, you wind up uploading into the cloud the data you wish to process on and the code you wish to process it with. In infrastructure as a service, you have to upload those two things and you have to upload the environment in which that code is going to run.

I think you can characterize software as a service in the terms that you upload your data, but you upload no code. With platform as a service, you upload your data and you upload your code, but you upload no environment. With infrastructure as a service, you upload the data, the code, and the environment.
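That schema can be written down as a small lookup table. The sketch below is only an illustration of the classification; the names `ARTIFACTS`, `CUSTOMER_UPLOADS`, `provider_supplies`, and `models_for` are hypothetical.

```python
# What exists in any deployment, per the schema above.
ARTIFACTS = {"data", "code", "environment"}

# What the customer uploads under each service model.
CUSTOMER_UPLOADS = {
    "software as a service": {"data"},
    "platform as a service": {"data", "code"},
    "infrastructure as a service": {"data", "code", "environment"},
}


def provider_supplies(model):
    """Everything the customer does not upload is supplied by the provider."""
    return ARTIFACTS - CUSTOMER_UPLOADS[model]


def models_for(customer_controls):
    """Service models in which the customer uploads at least the given artifacts."""
    return [m for m, uploads in CUSTOMER_UPLOADS.items() if customer_controls <= uploads]


for model in CUSTOMER_UPLOADS:
    print(model, "-> provider supplies", sorted(provider_supplies(model)))
print(models_for({"environment"}))
```

A component that needs control over its own environment only fits the last row, while a component that needs nothing beyond its code and data fits two of the three.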

Given that schema, what really differentiates platform as a service from infrastructure as a service in terms of the power-benefit trade-off is how much you can tease apart your application code and data from the ecosystem of software in which it lives.

That’s a harder job than it might sound. We all think our code lives in isolation, but it’s been a long time since people have written anything of substance that lives completely by itself.

If you want to use platform as a service, you have to get back to that sort of software engineering purity, where the environment in which the code runs, including the libraries, operating system version, security policies, and so on, is really kind of separated from the code.

Particularly complex, scalable applications consist of multiple components, some of which are perfectly suited to platform as a service deployment, and some of which are not for whatever reason.

Because of that, we think the key is this layering, which allows you to use platform as a service for the application components where it makes the most sense. If that’s on premises, we hope you use Eucalyptus, but if not, we still think that layering makes a lot of sense.

Scott: I talked to some guys who were doing Google App Engine development, and one of the things they said that was really sort of eye opening to me was that platform as a service is fantastic if you can live inside of that sandbox. So, with Google, for example, everything is a request and a response, and you have a certain amount of time to process the request; otherwise it just basically kills it. But, you get infinite scale, you don’t have to patch your OS, and those kinds of things.

That worked for 80-90 percent of the project they were working on, but for the other parts, they would have Google App Engine call out to other stuff running over on Amazon Web Services, for example, that would do some of the longer-running kinds of things and generate results.

The picture you just painted allows that whole thing to exist inside of that data center. You can have the infrastructure-as-a-service piece and the platform-as-a-service piece talking to each other, relatively seamlessly. Maybe you even run them on the same servers, and to accommodate service peaks, things move around relatively transparently, and from an internal IT perspective, that’s going to be the norm.

Platform as a service is going to be great, infrastructure as a service is going to be great, and they’re going to have to communicate with each other, because sometimes one of them is going to be a better fit than the other, and large complex apps are frankly going to require both to some degree.

Rich: That’s right, and I think you see in Azure that kind of acknowledgment. I know this sounds strange, but when I look at the Azure architecture, I see those platform-as-a-service types of restrictions built in. Still, you could imagine exporting in Azure more infrastructure-as-a-service type functionality, and then that would have to be coordinated in a reasonable way.

I think one advantage of Eucalyptus is that as you were saying, if you want a Ruby platform, a Java platform, a Python platform, or some other platform of your choosing, all of those things can happily coexist over the same infrastructure as a service framework.

Scott: Let’s flip over and talk a little bit about the open source nature of Eucalyptus and its significance.

Rich: There are a couple of ways to view the presence of Eucalyptus as an open source project. The easy one is the motivation for us to do it as open source, which is that we as researchers have really spent a lot of time working with and benefiting from other open source efforts.

If you are working in the university environment today, doing empirical system science, you are almost assuredly using somebody’s open source tool kit or platform or software infrastructure to do your work.

And so we felt like it was paramount that if we built something that might have utility for other researchers’ investigative activities, we should make it available as open source. And frankly speaking, before we even thought of commercializing Eucalyptus, we knew we were going to make it open source. Having now gone down this path and having become a commercial entity, we still feel very strongly about this.

Even from a personal experience perspective, we feel that open source has a lot of value that can be passed on to the community, as long as you can support that notion commercially. From a business perspective, I think there are some fairly good reasons to do it as well.

The first is that we get access to a lot of information very quickly about what Eucalyptus does and does not do. That information is freely given to us as people take the software and try things with it. We can see, almost instantly, which things we have done that have been successful, where we need to work on the quality of the system, where we need to work on its stability, what features might be missing, and those kinds of things. The open source community is very good about providing that information.

A second aspect that was a little less clear to us when we started this, but which is actually turning out to be very valuable, is that Eucalyptus is built to run in pretty much whatever environment you have.

We consider Linux environments primarily, but we really spent a lot of time when we were building it initially thinking about university data centers where we don’t know what version of Linux is there, we don’t know what the vintage of the hardware is, we don’t know well how it’s administered, and those kinds of things. As a result, we’ve architected the system to be flexible in this way.

What we’ve learned from the open source community, though, is just how widely varied those environments can be. We’re constantly hearing from people who say, “This is what my data center looks like, and Eucalyptus doesn’t do X, Y, or Z. Please help me.”

That type of interaction is not really feedback about the code or what Eucalyptus does, but rather it’s feedback about how Eucalyptus can be configured. Through that information, we’ve been able to make Eucalyptus extremely configurable, so that it really does allow itself to be molded into whatever environment you want to put it in. And that, I think, is part of its value proposition in a commercial sense that we didn’t anticipate.

Scott: Just to drill down a little bit, remind me what license Eucalyptus is under. Is it an Apache license, BSD, GPL, or something else?

Rich: It’s actually under two licenses, which has to do with the university and commercialization phases we’ve been through. The university license was a BSD-style license, modified in a way that the University of California, Santa Barbara absolutely had to have in order to allow us to release it from the university.

We had to modify that license when we commercialized, because it’s a UCSB-specific license. At that point, we took a look at it, thought about what the rest of the community was doing, and switched to GPL. It’s actually GPLv3 going forward, and as time goes on, less and less of the university code is present in the actual open source code base. At some point, it will be all GPLv3, but for now it’s both.

Scott: That makes sense. In terms of community involvement, you mentioned that you get a lot of feedback from the community about how it’s working in a certain environment, specific issues, and things like that.

With some open source companies, the majority of the people working on the project work for the company. That was sort of the way MySQL was originally structured.

Rich: Right.

Scott: Others really want the community contributing code. Where do you see yourselves on that spectrum, in terms of where the code comes from?

Rich: We’re actually moving away from one extreme. When we started out and looked at taking community contributions, we found that we were rolling the code forward so fast that people were contributing code that was already obsolete against our internal code base. And so they would get upset because they put a lot of energy into building code that we couldn’t take.

What we decided was that, in the first phase of the project, we would really limit code contribution to things that talk to the Eucalyptus API, meaning things on top of Eucalyptus or attached to it or communicating through the API, and then to take patches. We would take bug fixes, as long as they were not obsoleted by a bug fix we had internally or whatever.

What has happened is that we were trying as hard as possible to do a complete implementation of AWS while AWS was adding features. As a result, we decided to finish off the initial AWS API for open source, do a set of bug fix releases to re-factor some things internally for performance and stability reasons, and then adopt a much less aggressive release schedule.

Moving to a status where we’re releasing every six to twelve months instead of every six or eight weeks makes it more feasible for us to entertain contributions to the core. It’s important for us that the QA be done well.

We’ve put a lot of energy into making Eucalyptus extremely concurrent, which is a big differentiating factor between Eucalyptus and some other open source offerings. Eucalyptus is a very, very concurrent system, and what we find is that when people add things to it, in order to get stability, they often put locks in, because it’s easy.

That approach gets their thing to work, but at the cost of reducing the concurrency. We probably will look very long and hard at those contributions to make sure that we preserve the quality of what’s there, but I imagine that in the next year or so we’ll see community contributions to the core as well.

Scott: I remember seeing news articles that mentioned, for example, Ubuntu and Eucalyptus in the same paragraph. What are your relationships to distros or other open source projects?

Rich: We’ve always wanted to make Eucalyptus as widely valuable as possible in the open source community. The Ubuntu partnership has been fantastic in that way, as they have really put a great deal of energy into adding things that they believe their distribution community really wants.

They put a lot of energy, for example, into improving the installation experience in a way that is much more compatible with what the Ubuntu community is accustomed to seeing. And we think that’s great.

By the same token, we are open to working with other projects or companies that have different needs. We do work with the Debian community, we’re actively working to package it for Squeeze as part of their open source offering, and we’ve talked with some of the other distros about whether it makes sense for us to be part of the distro or just to remain compatible.

We have agreed to continue to package Eucalyptus for the Linux distros that people ask us to package it for, and we explore ways that we can have closer partnerships. That’s ongoing, but so far, in the distro space, our closest partnership is with Ubuntu.

There is a lot of synergy between what Eucalyptus does and what other projects do. So again, in this platform, there is this sort of layering capacity. We have ongoing compatibility and synergy discussions with a number of different partners.

There is also an ecosystem building up for tooling. RightScale is a good example of a company that provides powerful tools for using Amazon. You can program your Amazon allocation much more effectively through RightScale than you can by hand.

Because we’re Amazon-compatible, RightScale works with Eucalyptus, so that’s a nice partnership. If you want to use the same kind of dashboard to manage your Amazon and Eucalyptus clouds, you can, and there are a number of companies that provide that kind of tooling, backup services, and other things that complement what we do.

We think of those partnerships as being really important, particularly since we’re open source, because we think that the success of cloud computing today is going to come from an active ecosystem and not the success of any one single technology.

Scott: Talk a little bit about your business model. How do you guys keep the lights on around this open-source project?

Rich: We’re practicing what I think is commonly referred to as an open-core model, where we provide the core of the system as open source, and then there are value added components that can be purchased from us under license. The bright line differentiator for us is whether the feature or component is really tailored towards the notion of commerce or not.

If you want to experiment with cloud computing, if you want to understand it better, or if you want to see what it does and what it doesn’t do, then open source Eucalyptus is for you. If you want to make money with it and start a business using Eucalyptus, then there are things we think you need that we feel like you should, in some measure, pay for, because you’re going to go off and resell the value that comes from them.

So, for us, it’s actually pretty easy to see what goes in open source and what doesn’t. We have professional engagements that surround our proprietary products. We also offer support, under contract, for the open source code, but that’s not really our main business. We’re really a software development company, and we’re leveraging the open source community and the open source project for the proprietary products we ship.

Scott: Obviously, people are using Eucalyptus internally for things that are kind of confidential and not public, but what are some of the really cool stories you can talk about where you’ve seen Eucalyptus put into production?

Rich: The best example of where it’s in production is actually something we can talk about. NASA took Eucalyptus and added Lustre, which is a high performance shared file system, and some other things, and they’re running it in production at NASA Ames.

They offer cloud services to other agencies within the government, and we know it’s in production, because they call us and they yell at us about it. We think that’s pretty exciting.

The other stories around Eucalyptus are really more usage scenarios. We hear a lot from the open source community about what people are doing with it. We can sort of talk about that, in the sense that they only tell us so much.

There are interesting use cases that have emerged from Eucalyptus. One is the cloud bursting or the disaster recovery model. People think of that one almost immediately. A lot of people are also looking at the idea of augmenting their on-premises infrastructure with public clouds.

But then there’s also this funny phenomenon, where developers become frustrated with the pace at which they can get resources dedicated to their projects, and they pull out their own credit cards and start developing in a public cloud. Often that code is very valuable, and when that comes to light, the company for whom that developer is working often wishes to pull that stuff back into the company and run it on premises until it can be properly vetted, software engineering can be done on it, and so forth.

So, we see this going both ways. We see businesses that want to burst from their on-premises IT into the public clouds, and then we see development taking place in one venue, such as on the public clouds, and for whatever reason, the people who own the development want to move it back into the private space, perhaps only to move it forward.

The result is a kind of homogenization of the on-premises IT with public utility providers. We think that represents a really exciting future usage scenario for Eucalyptus.

Scott: It’s a slow moving trend, because it’s difficult for companies to get there, but there’s pressure for internal IT organizations to restructure themselves essentially as service-delivery arms.

We’ve talked to people in several companies who have told us, “I have to compete with everything out there. Any business unit that comes to me is free to buy from Amazon, they’re free to buy from Google, they’re free to buy from whoever they want, and I’ve basically got to make them a deal. I’ve got to deliver a better service at lower cost.”

Rich: We hear this idea that internal IT finds itself competing with commodity providers, but I predict that will prove to be a short-term phenomenon. Whenever a commodity emerges, it becomes inexpensive as well as very common. In the technology business, there’s often an advantage in using something that’s more rare, before your competition and before it becomes a commodity.

Where your business has to innovate, that’s really not necessarily a commodity business. I think what we’re going to see in the future is that on-premises IT will really be the engine of innovation, experimenting and coming up with the best ideas before the competition can do them.

And then as those advances commoditize, becoming the sorts of things where you’re going to compete on price and really try to drive the margins down, there will be pressure to move them from your internal infrastructure, out to the utilities, to make space for the next innovation.

I think long-term, that’s really what IT is going to look like. I don’t think it’s all going to be public, and it’s certainly not all going to be private. I think the business model that rationalizes how to partition resources between public and private is going to be driven by that innovation cycle.

Scott: That’s great. Looking at the time, I think that’s a good place to wrap up the interview. Thanks very much.

Rich: Thank you.

Interview with Rich Wolski – Eucalyptus CTO
Posted by campsean on Monday, January 11th, 2010