Interviewee: Mark Kortekaas
In this interview we talk with Mark Kortekaas from CBS. Specifically, we talk about:
- Choosing and implementing technologies for the production environment
- Relationships with open source communities
- Tailoring the organization for maximum benefit from open source
- Building infrastructure for capacity and agility
- Future-looking, cloud-aware data centers
Sean Campbell: Mark, would you tell us a bit about your professional background?
Mark Kortekaas: I am the Chief Technology Officer for CBS Interactive, which among other things runs cbs.com, cbsnews.com, and cbssports.com. My group runs core technologies for CBS Interactive.
Sean: What’s the technology that underlies those sites, broadly speaking?
Mark: We’re primarily open source underneath, with everything running on top of Linux variants of some sort. We use combinations of PHP, CGI with Perl, and some C-based stuff. We’ve got some instances running Tomcat, we have Lucene running for search, and we’ve got some MySQL around.
We also have some commercial search applications, both internal and external, and we’ve got some commercial database applications running, such as Oracle. So we don’t limit ourselves to a purist view that it’s open source or nothing, but from a base design perspective, we run on top of open source. And that sort of lends itself to a lot of custom built solutions, on top of that.
Scott Swigart: Tell us a bit about your decision-making criteria, when you’re evaluating databases, operating systems, or something else, and you have open source and closed source options to choose from. You mentioned a preference for open source but some inclusion of commercial software as well. How do you make the decision on a case-by-case basis about whether you’re to go with a particular open source option that’s available, or whether you need a proprietary piece of software?
Mark: We make the decision about what technology to use based on the problem we’re trying to solve, so there’s no one size fits all answer. There are times when a given project’s resources will be aligned with a particular technology, so they’ll decide to use technology “X.”
It’s a sort of a hammer problem–if you’ve got a hammer, everything looks like a nail, and if you’ve got a screwdriver in your hand, everything looks like a screw. We’ve got some development groups that are very Perl-focused and prefer to do things in Perl. We’ve got other development groups that are very PHP-based and therefore lend themselves to PHP. We have other development groups that are very Java-based and lend themselves to doing everything in Java.
Still, the biggest consideration is the business problem we’re trying to solve, so if we’re trying to solve a problem that lends itself to commercial solutions, we’ll use commercial solutions. There are certain things that call out for commercial solutions, like when there are control issues and things like that. How are you doing control systems for things that are in scope for Sarbanes-Oxley, for instance?
Other times, when there aren’t outside pressures, we look at the problem more from a resource perspective.
Scott: I saw you interviewed a bunch of different times about the March Madness implementation you guys rolled out. I know that MLB made a big investment on the Silverlight side, and to be frank, there were some challenges with it–especially the rollout at the beginning of the month.
I’d be curious about what decisions you made around that, because that’s obviously a challenging platform to work with. You’ve got millions of people that could log on, many of whom aren’t all that technical, and they’re probably not really able to deal with failures the same way that maybe somebody with a little more savvy would.
What was the decision-making process for that? That’s obviously a pretty big signature piece of technology you are pushing out.
Mark: We didn’t utilize Silverlight, but not because we felt that Silverlight was a fundamentally good or bad technology. We just knew that using straight Windows Media, which is what we delivered March Madness in this year, in an embedded window on a regular HTML-based page would work.
Silverlight is fairly new to the market, and people are just getting the clients and all that. We were concerned mostly with traditional things like penetration rates, the ability for people to download the client, how long that might take, and so on.
This is a case where we did look at other technologies, but at the end of the day, we went with something that we’d used in the past and knew we could make work again.
We looked at the decision not so much from a technology perspective, but more as history of success. I think you always face the choice between rolling out new technologies versus staying consistent.
There’s never a good answer for that. There are certain times you need to push the envelope and other times you need to go with a more conservative approach, and we take those different approaches at different times, depending on the property.
Sean: Closed source products all have armies of sales people with data sheets, marketing vehicles, and demos, and all that sort of stuff. How do you go about analyzing a given open source effort, whether it’s a new product that you want to transition to or a new version of an existing one? How do you guys analyze the effort to determine whether it’s reached the right stability point and so on?
Mark: You look at a couple of things there. You start with the problem you’re trying to solve and the support criteria you’re trying to deal with.
I also think you try to do things in test modes and trials and things of that sort. Apache’s a good example–you look at the last version of Apache 1 and you compare it to 2, and you see that 2 has a better threading model, and over time, you want to get the better performance that will give you.
Does that mean that when we rolled out 2 that we rushed it out immediately? No—we waited a fairly long time to roll from Apache 1 to Apache 2, and we did it in a fairly controlled environment. We rolled it out in a smaller scope first to see how well it would perform and to iron out the issues.
And then over time, as you get the hang of doing rollouts and things of that sort with it, you eventually upgrade your entire farm. In our case, I’m pretty sure we’ve upgraded the whole thing at this point, but for all I know we’ve still got some Apache 1 farms around.
We wouldn’t roll out a new version of Apache the day before a major event, unless there was a sizeable security issue that we were trying to address. So we have to take a risk-benefit approach, and we try to mitigate risk whenever possible.
Sean: With a project like Apache, you’ve got a really clear roadmap of features, and there’s a real active community. What about bringing in a little bit more of a dark horse, like a project that you’ve had your eye on, and now you want to bring it in?
Mark: We would try to bring it in on a small scale, in a project that is off to the side a bit, rather than risking disruption to the main line. There are always new things going on, and so we’ll bring things off to the side, set up a new farm, and do some trials. That’s how things like Tomcat and Lucene got implemented into our environment over time.
After you try it in a small environment, if it works, you start to use more and more of it.
Scott: What are some of the advantages that you see in the open source model, in terms of what it has enabled for your organization?
Mark: Anyone who says that cost doesn’t factor into their thought processes is probably being a little disingenuous. Clearly, cost factors in over time, and in the end, that does weigh in to any thought process we go through.
One thing that open source helps with is when you have a new project you want to try. You don’t know what’s going to happen with it. It may turn out to be a big business, it may turn out to be a small business, and it may not even exist anymore in three months. You just want to try it out and see if it works.
Being able to do that with open source means you don’t have to worry about the financial implications so much–it’s possible to do trials without signing data agreements and things of that sort that may require you to make a purchase three, six months down the line. That lets us more fully understand the business implications for what we’re doing before we commit significant resources.
This is easier to do when you’re dealing with a known business, but if we’re trying to set up something new that we’ve never done before, it’s hard to know what the business problems are going to be. The ability with open source not to have to think about financial implications so much in the early stages is a big deal and has to be thought of in terms of flexibility.
I don’t know if the access to source code is so relevant, to be honest. It’s rare that we take advantage of that, although there are times that my engineers will go in and find issues inside the kernel or issues in Apache and will submit changes back to the main project. That does happen on occasion, but by and large, we don’t really have the staff time to go in and deep-dive code unless something is really wrong.
Most of the time, we’re doing things through a known API, so I could just as easily work on those with a commercial web server as well as we do with Apache.
Scott: Given that point, what’s been your level of engagement in terms of contributions to open source projects? That contribution could be actual code, or just communities that you feel like your team’s more actively involved in, either helping other users or just solving problems. Is there anything you could point to in that particular area?
Mark: I know we have done some contributions on miscellaneous bug fixes and things like that, where we’ve submitted things back. By and large, it’s not one of the things we’re necessarily out there trying to do.
We have done it when we found certain things, but my developers are not kernel developers. They’re not core Apache developers or things of that sort, so it’s not something that we’re necessarily striving to do. But as we do stuff that does have relevance outside, we have made things available. I think most of those have probably been done under employees’ names directly and not sponsored by the company at large.
Scott: You mentioned that you were largely a CentOS shop and that, obviously, with open source, like you mentioned, you can try things out without incurring a lot of licensing expenses. Are there places where you have opted to pay for support agreements with Red Hat or MySQL, or things like that, or do you largely just self-support?
Mark: Certain companies have a model that’s not true open source; it’s sort of a hybrid, where for non-commercial use you can use it free, but for commercial use, you have to pay a licensing fee or a support agreement.
We have done those sorts of things, on occasion, but we largely support our own kernels and do our own off-shoot kernel builds.
There hasn’t been a lot of benefit for us in support agreements, although we have had discussions with them. I think those agreements work really well for a lot of companies, but our environment is fairly fluid in terms of what we’ve got done, and we have some very high level engineers that keep things running.
Sean: Regardless of whether you’re using proprietary software or open source software, you could define three ways to keep things up and running. One is that you can enter into the support agreements with the Red Hats and the MySQLs or go with a support aggregator, like an OpenLogic, who supports a wide variety of open source stuff.
Second, you can be an organization that sort of pulls in consultants when you need hard stuff done and you need hard problems solved. Finally, you can have some extremely technical staff, who basically are expected to solve any problem.
It sounds like for your organization, you’ve gone with that third option in terms of having some extremely technical people who are sort of go-to people if others in the organization are stuck. What do you see as the merits of going that route?
Mark: I think the question sort of assumes that we went down that path as a conscious decision. I don’t know whether that’s the case, since many of the systems that we’ve got predate me and the company.
There is no doubt that we’ve got some very technically savvy people. I can speculate that very technical people with a sort of “build it yourself” mentality tend to gravitate toward open source, because that’s their personality fit. And therefore, if you’re hiring for open source type positions, that’s what you tend to get.
So I think it’s sort of a self-selecting technical group of people that are out there, in terms of skill set. Candidates that we see traditionally come in knowing a lot of that and having the feeling that they want to go off and build and tinker and things of that sort.
I think one downside to open source is that you do have to have a fairly technical organization around you, and you have to have a slightly higher risk portfolio or a risk perspective than you might if you were in a different organization. I think there is an aspect of risk acceptance that we take on and that pervades the types of things we do.
We are not a financial institution. We take calculated risks with what we do, and we try to move very fast. Therefore, we feel open source helps us get there, but other people may not have the same wherewithal. Documentation on some open source products is frankly pretty terrible. It requires you to go in and spend a fair amount of time just figuring out how to make it work.
In order to be successful with a project that doesn’t have good documentation, you have got to have good individuals around that want to make it work and will spend the time to do it and not get frustrated at the lack of documentation.
I do think staff selection is a major consideration, in terms of people’s abilities to make open source work. Staff acceptance of it is another one. If your shop works in a risk-averse environment, where tinkering and mucking around with the innards of things isn’t promoted, I don’t think open source is probably the right fit for you as an organization.
It should very well be noted that there are certain aspects of CBS where that risk portfolio would not be acceptable either.
Sean: We have talked to some different CIOs, and to paraphrase, I have had people say, “We just need our servers to work, our databases to run, for everybody to get email, and for it to be easy to maintain, so proprietary software makes sense for us.”
So for your team, which is doing innovative customer-facing web software, you need to be as unconstrained as possible. You need to have control and freedom, to change or not change, to develop or not develop–whatever you want, and open source gives you that. It gives you that at a higher risk, but that risk makes sense for the kind of stuff that you guys are trying to do.
Maybe, for other parts of CBS, like people who are office workers or information workers, they have got a different infrastructure that makes sense. Do I have that right?
Mark: That is a fair way of saying it.
Sean: Let me ask you a question in a different area entirely. What do you think people would find interesting about the CBS infrastructure or the user base you have to support, either internal or external? What do you think is unique about it, and what in those unique features ties you to some open source efforts?
Mark: One thing that makes what we do challenging is that peak-to-average server-load ratios for us are completely different than those in most businesses.
We build for a nominal load of 10%, because we know we’ll need that capacity when we get a peak. For our sports business, for instance, football Sundays and the first couple of days of the NCAA tournament are going to be very busy days for us.
We build an infrastructure around massive scale for very finite events. The challenge that we face is how you deliver when you have a massive audience, yet never know what the peak is going to be year to year. We have estimates, but we never know how good they really are until the event comes. For example, we didn’t know exactly how large March Madness was going to be this year, and so we hoped we got it right, and we were successful. At the same time, you don’t want to overspend either.
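To make the peak-to-average point concrete, here is a minimal sketch of the arithmetic, not CBS’s actual planning method. Every number and name in it (requests per second, per-server throughput, the `headroom_pct` parameter) is a hypothetical assumption chosen only so that the result lines up with the “nominal load of 10%” figure mentioned above.

```python
def servers_for_peak(peak_rps, rps_per_server, headroom_pct=0):
    """Servers needed to carry the peak, plus an optional safety margin.

    Uses integer ceiling division so a fractional server always rounds up.
    """
    needed = peak_rps * (100 + headroom_pct)
    per_server = 100 * rps_per_server
    return -(-needed // per_server)


def average_utilization(avg_rps, servers, rps_per_server):
    """Fraction of total capacity that is busy on an ordinary day."""
    return avg_rps / (servers * rps_per_server)


# Hypothetical figures: a peak ten times larger than the everyday load.
PEAK_RPS = 50_000        # assumed traffic during a major event
AVG_RPS = 5_000          # assumed traffic on a normal day
RPS_PER_SERVER = 500     # assumed throughput of a single server

servers = servers_for_peak(PEAK_RPS, RPS_PER_SERVER)
utilization = average_utilization(AVG_RPS, servers, RPS_PER_SERVER)
print(f"servers: {servers}, average utilization: {utilization:.0%}")
```

Under these assumptions, a farm sized for the peak sits at roughly a tenth of its capacity the rest of the time, which is the trade-off between being ready for the event and overspending.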
I don’t know that open source helps me here. I think open source gives me a lot of benefit in the development realm, but I’m not sure it gives me anything in the capacity one.
Scott: That is an interesting transition, because I want to ask you a more development-oriented question. I think we have all seen web users who fall into the camp of guys like us that probably can thread through a piece of data on the web page very quickly, look and find what is relevant and what is not.
And then, even today, you have a mass of users who may be confused by the density of information, by the way the site composes itself, and the way it folds out information.
Do you feel that the open source development frameworks are moving more rapidly to give you more ability or the things you like in the development frameworks that are being pitched by closed source companies?
It’s an interesting kind of interplay happening now, because ASP.NET is a pretty cool technology by all accounts, but PHP is getting a lot of momentum. There’s a lot of community building around it; Microsoft has even embraced it and is running it on IIS, so there is an interesting battleground there.
Is there an area where the ability to build a good usable infrastructure and have strong development tools plays well with your user base and comes into the decision making?
Mark: My feeling is that what we use on the back end has no bearing at all on what my consumers feel. I feel that I’ve got a product issue in terms of what products I offer to my consumers and I’ve got a back end technology problem on how to deliver that product.
I honestly believe that if my developers were 100% Microsoft .NET developers, I would be able to deliver our product on .NET. It happens to be that my developers are not .NET developers. I don’t feel that, at the end of the day, open source enables a new business. I mean what Microsoft has done has been very successful. Oracle has been very successful.
You know, there are pros and cons to using either open source or proprietary technologies. You can solve your problem in any number of ways, and I think your skill sets and how you’ll choose to go about developing those solutions will change, based on the infrastructures you use.
I don’t think the problems you can solve are necessarily any different with proprietary versus open-source software. The economics around the solution might be different, but that’s a different question.
Sean: That’s interesting, because people sometimes have pretty rigid opinions about development technologies, rightly or wrongly.
Some people take the line, “I shall only code in C#,” or “I shall only code in PHP.” The fact that Google’s sort of “data storage in the cloud” solution that they released today has a Python interface is creating a bunch of consternation and happiness, depending on what camp you’re in. Of course, you’re always going to have that.
Mark: On the developer workstation side of things, I’m a firm believer in emacs and make my SAs install emacs on all the machines, just so I can personally edit files.
We have programmers that still code in a shell window and use vi or emacs, depending on their predilection. We have others that are coding on the exact same systems that have chosen to use an IDE on their Windows desktops and use more traditional IDE-based and graphical-based front ends that just happen to save it off to a Subversion repository. And then we pull it from SVN on the other side to actually run it in production.
I think there’s a lot of overlap in a lot of ways, and it’s not an either/or scenario. I think there are things about some commercial software that work really well for developers, but that doesn’t necessarily change my approach.
Scott: You mentioned fairly unusual capacity-planning challenges. One of the things that you hear around this type of problem is the concept of fabric computing and grid computing, where your data center is just a pool of CPU, disk, and memory capacity, and you can allocate things fairly dynamically.
I don’t get the impression that a lot of that is really happening in practice, though. Today, do you mainly model loads to estimate capacity for things like Sunday football, with a fairly static allocation of resources?
Do you see virtualization or other technology coming online that would let these kinds of systems be more dynamic and, for example, spin up 50 more virtual web servers or provision 50 more systems automatically? Do you see that kind of stuff becoming a reality? Is it already somewhat of a reality?
What can you tell us about the dynamic landscape of the data center versus the reality of today?
Mark: It’s a great space, although there are a lot of issues that are going to have to be faced, in terms of the way applications are written, provisioning times, and things of that sort. If I’m going out to a vendor and I want a certain number of CPU hours or whatever, there are a whole bunch of models of how you sell this stuff.
If I’m going to go and do a reservation for six months for a particular sport season, I still have to plan ahead of time in order to ensure that I’ve got enough capacity. Today, it’s largely a way of solving capacity with different financial sort of parameters around it. That said, a lot of applications don’t lend themselves to not being run in the same data center, and there are issues with not having access to the same raw storage, for instance.
As you look at this, I think over time what you are going to find is people are going to start to write applications that are more “cloud-aware,” and I don’t think we’re quite there yet. There has to be a fair amount of time people spend understanding how to write applications that can do things like scale quickly without breaking the application. I have a feeling that a fair number of applications out there, both our own as well as others, will be fairly particular about data latencies and things of that sort.
The other observation I’ll make is that I don’t have many hours to pull up additional servers. So, if I am looking at a sports season launch, that only happens once a year. And I have to hit that out of the park on that day. Therefore, I have to have an amount of capacity already in the racks, spinning, tested and everything else. The fact that I could spin up another 50 servers in eight hours doesn’t really help me if I have already blown my first day.
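The provisioning-lag argument above can be illustrated with a small sketch. It is a toy model under stated assumptions, not a description of any real system: the hourly demand curve, the per-server throughput, and the function name `unserved_demand` are all hypothetical, and only the “50 servers in eight hours” figure comes from the interview.

```python
def unserved_demand(demand_by_hour, initial_servers, extra_servers,
                    lag_hours, rps_per_server):
    """Sum, over all hours, of the demand that exceeds available capacity.

    The extra servers only count from hour `lag_hours` onward, which
    models the time it takes to provision them.
    """
    shortfall = 0
    for hour, demand in enumerate(demand_by_hour):
        servers = initial_servers
        if hour >= lag_hours:
            servers += extra_servers
        capacity = servers * rps_per_server
        shortfall += max(0, demand - capacity)
    return shortfall


# Hypothetical launch day: the spike lands in the first four hours.
demand = [40_000] * 4 + [20_000] * 8   # assumed requests per second, by hour
RPS_PER_SERVER = 500                   # assumed throughput of one server

# 50 servers in the racks, 50 more arriving after an eight-hour lag.
late = unserved_demand(demand, 50, 50, lag_hours=8, rps_per_server=RPS_PER_SERVER)
# The same 100 servers, but all racked and tested before the launch.
ready = unserved_demand(demand, 50, 50, lag_hours=0, rps_per_server=RPS_PER_SERVER)
# No extra servers at all.
none = unserved_demand(demand, 50, 0, lag_hours=0, rps_per_server=RPS_PER_SERVER)

print(f"late: {late}, ready: {ready}, none: {none}")
```

In this toy scenario the servers that arrive after eight hours leave exactly the same shortfall as having no extra servers, because the peak has already passed by the time they come online.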
There’s no miracle bullet out there that says, you can just do it. That said, I think there are some very interesting things out there, in terms of capacity, but even if you look at the largest of the farms that are out there, they are still being built on some economic reality. And there are still some peak periods that they are trying to hit. They are using the excess capacity they have around the edges to basically put these clouds out there.
So you still have a pretty traditional problem of capacity planning. It’s just changed how you pay for it. If I need 50 servers, I still need 50 servers and I have to plan for 50 servers. Someone has to pay for 50 servers, if I really want them. That’s my high-level observation. At the same time, I think anyone who is not looking at these things as part of their long-term strategy, and taking a close hard look at how these things work, isn’t really doing their job.
Scott: Well, thanks for meeting with us today. We’ve enjoyed talking with you.
Mark: Thank you.