Interviewee: Leon Shklar
In this interview we talk with Leon Shklar – EVP of Media Technology for Reuters. Specifically, we talk about:
- Moving Reuters.com to an open source stack
- Choosing and supporting open source software
- Making code changes to open source packages
- Licenses and governance in open source projects
- The perception and reality of open source companies
Sean Campbell: Leon, could you introduce yourself?
Leon Shklar: Sure. I am EVP of Technology for the Media division of Reuters (now Thomson-Reuters Markets).
I hold a PhD in Computer Science from Rutgers, and I have been doing some part-time teaching there as well. That came in handy in team building – the team includes some former Rutgers students and other people who have worked together through a couple of startups and other jobs.
I have been at Reuters for a little over two years. Prior to that, I was responsible for the technology side of the Wall Street Journal Online. I am pretty proud of what our team accomplished both at WSJ.com and here at Reuters so far.
To wrap up on the history, I worked for Bellcore from 1990 to 1996. Before that, I worked with a small software company, called Computer Corporation of America, which made DBMSs for IBM mainframes.
In 1996, I took the plunge into the world of startups. We started a small company in ’96, and then it got acquired in ’99. Then everything went up in 2000 and came crashing down in 2001. When things started going south, I joined Dow Jones, and was responsible for the Wall Street Journal Online, from summer 2002 to spring 2006, which is when I went to Reuters.
On a related subject–together with a friend who worked with me at a couple of jobs, I wrote a book on Web Applications Architecture, whose first edition was published by Wiley in 2003. The second edition is supposed to come out later this year.
Sean: It looks like it has some really good reviews. Scott and I have actually been lead authors on three technical books too. The joke my CTO gave me once was that “All authors like the act of having written. The act of writing, though, they are not so sure about.” I’d have to say I agree with that.
Tell me a little bit about the technology stack that Reuters sits on, in terms of how you guys use open source, what are the major products, and how that underpins the organization.
Leon: The responsibility of our team includes both consumer web sites and business-to-business media products. The former include the family of Reuters.com web sites–eighteen of them to be precise, for different countries and regions. The latter include text, pictures, and video products distributed over the Internet and via satellite. Overall, quite a mix of current and legacy systems, and not all of them are compatible. Our team is on a mission to bring all of our media systems to a common platform.
Over the last two years, we migrated Reuters.com sites from Microsoft .NET technologies to the open source stack. There is some similarity between what we did back at Wall Street Journal Online and what we have done here. When I started at Wall Street Journal Online, there were a bunch of disjointed pieces. The site itself was based on Vignette, there were some scripts, and there was a pretty big buy-in on the company side into Solaris and Sun and commercial tools. The company already had licenses for WebSphere and Oracle, which of course are pretty expensive.
I don’t think the open source components were as robust in 2002 as they are today, and the migration there was away from Vignette, heterogeneous scripts, and all kinds of obsolete CGIs and stuff like that, to a cohesive platform based on the Oracle tier, WebSphere, and a bunch of applications built on top of that.
The OS platform was Solaris. It wasn’t a cheap environment to run, because Solaris systems weren’t cheap. The Oracle licenses and WebSphere licenses are not cheap either, in terms of maintenance and upgrades. But we ultimately got the system to be pretty solid, by the time I left in 2006.
When I interviewed for the job at Reuters, the main question that I was asked was, “Well, can you fix the site? It is not standing up.”
Sean: Tell us a little bit about that. Was the issue page load times, or that users felt the site wasn’t responsive; what was the real issue?
Leon: It was a combination of problems. Response time was not very good, and the site wasn’t able to sustain the load at peak times. The other problem was that the implementation of changes was very slow.
The site was implemented as one big blob. To make even small changes, you pretty much needed to deploy the whole code-base.
That was a very difficult thing indeed. I am not religious about open source versus Microsoft, but one thing that’s true is that the bar for doing good work with open source is much higher. You need much better people. To build simple things with .NET is pretty straightforward. So you may have lower average technical levels on the team, when the requirement is just to write code according to a functional spec. My personal feeling is that .NET by its nature is too permissive of bad coding practices.
Sean: Where do you feel .NET is too permissive in terms of coding practices, by comparison to open source?
Leon: Ultimately, all the door-opening in .NET is done through tools. There is no reasoning from first principles. You write functional code in tools and it gets compiled and then it all theoretically comes together. .NET is designed to do things for you, and that is a perfectly good thing when you’re building simple web sites.
It becomes not necessarily such a good thing when you’re building complex and scalable applications. By the way, that’s part of the reason why .NET is so successful–80% of sites are rather simple, and they don’t have an incredible requirement in terms of performance flow and even the frequency of changes. So for those, it’s perfectly good.
It’s not a very high bar that you need to pass in order to be able to write code for .NET. And it’s much easier to hire people, which means the team is not as expensive.
All that’s fine–it’s just when you get to something like 10-20% of the general population of sites, that you start running into real challenges in using .NET. I’m convinced that you can write good software in anything, and I expect that the implementation of MSN.com, for example, is done well (though you never know). I would guess that the people who are doing it have good expertise and knowledge of that technology.
Unfortunately, that is more the exception than the rule. On average, the expertise level of .NET web development teams I’ve seen is considerably lower than the expertise of open source teams. With open source, you have a big challenge in building a really strong team. The advantage is that you have full control over the platform that you’re building, and ultimately, when you have a problem, you can reason from first principles.
I have seen the best .NET sites with very annoying intermittent errors in them, and nobody knows how to find them. The reason for that is, if the tools don’t help, there is not much you can do. It’s very difficult to look at elements of the code; it’s very difficult to trace HTTP messages through and to figure out, “Well, it’s this missing header that’s really causing this strange behavior.”
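As a rough illustration of the header-level debugging described here, the following minimal sketch compares the headers of a response that behaves correctly with those of one that does not, and reports what is missing or different. The function name `diff_headers` and the sample header values are hypothetical; nothing in it comes from the Reuters systems.

```python
def diff_headers(good, bad):
    """Compare two header dicts, ignoring the case of header names.

    Returns (missing, changed): names present in `good` but absent from
    `bad`, and names present in both whose values differ.
    """
    good_norm = {name.lower(): value for name, value in good.items()}
    bad_norm = {name.lower(): value for name, value in bad.items()}
    missing = sorted(name for name in good_norm if name not in bad_norm)
    changed = sorted(
        name for name in good_norm
        if name in bad_norm and good_norm[name] != bad_norm[name]
    )
    return missing, changed


# Hypothetical headers captured from a working and a misbehaving response.
good_response = {
    "Content-Type": "text/html; charset=UTF-8",
    "Cache-Control": "max-age=60",
    "Vary": "Accept-Encoding",
}
bad_response = {
    "content-type": "text/html",
    "Cache-Control": "max-age=60",
}

missing, changed = diff_headers(good_response, bad_response)
print("missing:", missing)
print("changed:", changed)
```

Header names are compared case-insensitively because HTTP treats them that way; header values are compared exactly, which may be stricter than a real investigation needs.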
Scott Swigart: Talk a little bit about the infrastructure that you guys are built on now. Are you on sort of a LAMP stack?
Leon: It’s LAMJ.
Scott: So it’s Java instead of PHP or Perl.
Leon: Right. We have Linux boxes with Apache fronting Tomcat. The backend is MySQL with master-slave replication.
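One common way an application takes advantage of master-slave replication is to send writes to the master and spread reads across the slaves. The sketch below shows that routing idea only; the class name, host names, and the simple "first keyword" rule are assumptions made for illustration, and the interview does not say how the Reuters applications actually route their queries.

```python
import itertools


class ReplicatedPool:
    """Route writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master for reads if no slaves are configured.
        self._read_cycle = itertools.cycle(slaves or [master])

    def host_for(self, sql):
        """Pick a host based on the first keyword of the statement."""
        words = sql.lstrip().split(None, 1)
        verb = words[0].upper() if words else ""
        if verb in ("SELECT", "SHOW"):
            return next(self._read_cycle)  # round-robin over the slaves
        return self.master


# Hypothetical host names.
pool = ReplicatedPool("db-master", ["db-slave-1", "db-slave-2"])
for statement in (
    "SELECT headline FROM stories WHERE id = 1",
    "UPDATE stories SET headline = 'x' WHERE id = 1",
    "select count(*) from stories",
):
    print(pool.host_for(statement), "<-", statement)
```

A real router would need more care than this: replication is typically asynchronous, so a read issued right after a write may not see it yet on a slave, and statements such as `SELECT ... FOR UPDATE` belong on the master.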
We have a number of application components that we built on top of Struts. Of course, we built our own platform with our own application components, but we tried our best not to reinvent the wheel.
The content management system is something we had a lot of pain and discussion over two years ago, but we were not able to find the one that everyone liked. I’ve seen a number of commercial products, and generally, the problem with them is that the amount of effort to customize and support them is close to that of building your own.
Drupal was not there yet, and we ended up building a content management system, which is specialized to support our custom applications. Because it is specialized, it doesn’t pretend to be generic, and users are quite happy with it.
Scott: Marc Frons expressed the same kind of thing–he sees that as an area that’s ripe for some open source project to coalesce around, because right now everybody seems to be building their own CMS to some degree.
Leon: That’s true, and there are not very good commercial products either, that I know of. Vignette was trying to build a content management system; I don’t think they succeeded. There are a bunch of other companies like Documentum and Interwoven that have content management systems. They’re very generic, and they don’t appear to be very robust in what we need.
So what ends up happening is that if you use one of those, the effort involved in installing, supporting, and customizing it is as high as (or higher than) the effort of building a very custom system of your own.
Scott: In terms of the stack that you’re built on top of–MySQL and Linux and things like that–do you rely on your own expertise and the community and things like that for support and figuring out problems? Do you get support from a vendor?
Leon: I would say that 95% of the time we rely on our own expertise and we rely on the information that’s available in the community. When we were first deploying MySQL, we didn’t have enough expertise, so we had to get help from MySQL. All the tuning they did for us was very, very useful.
We have a support agreement through Unisys that covers MySQL, Apache, and Tomcat. We almost never contact them with questions regarding Tomcat, Apache, or MySQL, but when we need it, we have this backup.
Scott: One aspect of the value from their acquisition by Sun that MySQL talked about was that they had been looking at doing an IPO in a fairly turbulent market. And with Sun, they felt that Fortune 500s would feel safer about using MySQL if it is backed by another Fortune 500. I would be interested to get your take on that, as a MySQL customer with long-term involvement with open source products.
Leon: I am not necessarily excited about it. This is conjecture, but when I heard about Sun acquiring MySQL, it didn’t make sense at first. The only context in which it does make sense is if Sun starts distributing MySQL with Solaris, because Sun is trying to gain acceptance for Solaris as an alternative to Linux.
My concern with that, and one of the reasons we went with Linux to begin with, is that the open source tools are built for Linux first, and only then ported to Solaris.
Sean: It’s the LAMP stack, not the SAMP stack.
Leon: Exactly. So if MySQL is going to become part of the SAMP stack, sooner or later they are going to start doing development on Solaris and porting it to Linux.
Sean: Some people say, “Hey, the great thing about Linux, and Apache, and a lot of these kinds of things is that they are open source, so if I really run into a problem, I can get into the code and fix it.”
But other people say, “Well, I have no interest in doing that. I don’t want to run my own private fork of Linux or Apache. I use those as a product. There is value in the fact that they were built using an open source methodology, but they are not something that I am going to really go in and muck with.”
What’s your take on all of that?
Leon: It is not a good thing to create your own branch of an open source product. That said, if we are in a critical situation, it’s not completely out of the question to go into the code and put a fix in, and then submit it, and make it part of the community so that it goes into the next release. Ultimately, when the next release becomes available, you replace this little emergency fix that you made.
But going into the code is the last resort, and you would only do that if there is no other way. We have people that know (or can analyze) the code for open source products pretty well. At the same time, we do have support agreements with Unisys and MySQL, and we would rather they did that, if deemed necessary.
If need be, we can work with them to figure out what the problem is, and try to come up with some temporary fix or workaround. It is clearly not a sustainable scenario to constantly make changes to open source components. To be specific, making changes to Apache, Tomcat or MySQL code is not a good idea, except as a short-term measure.
Sean: It sounds like, in a practical sense, those situations just don’t really come up very often. The products that you are building on are mature, and you are not doing such exotic stuff that you are taking them into a use case where they have never been used before.
Leon: We try not to. These are fairly sophisticated products. When we first deployed our system, we ran into the kinds of bugs that required people to go into the code. Not to make a change, but to understand the code and to find a workaround.
Most of the time, if you have an exotic error, you don’t have to fix the code. Once you understand the implications, you probably can come up with a workaround.
Sean: So you are talking about having a really high bar for your staff, if people have to be able to trace into the code of the web server, and things like that. But on the other hand–like you said–the scale of the stuff that you guys are doing is way outside of that 80% kind of case for web sites.
Leon: Yes, it is. And let me put it this way. I don’t remember a scenario where we actually had to change Tomcat code. We never changed the base Apache code either, but then there are plug-ins that are distributed with Apache. We did change those, even the ones that are considered part of the standard distribution.
Sean: I suppose that the plug-ins are a little more special purpose than the core web server, and they are not necessarily used by everyone the same way.
Leon: I would also say that they are not as mature, because not so many people are using them. The number of people using Apache is enormous; the number of people using a particular plug-in can be two orders of magnitude lower.
Sean: Many times, people are using a lot of open source stuff around the LAMP stack, like components, unit testing tools, automated build systems, and libraries.
Leon: We use that as well–we use Subversion as the versioning system, we use Maven, and we used to use Ant. That’s pretty much the primary environment.
Sean: How do you evaluate which open source projects to bet on? If you just go to SourceForge, there is a lot of stuff there, some of which is kind of abandon-ware, whereas some is very actively maintained and being worked on.
Leon: You try not to use the libraries that don’t have a lot of activity around them. You also need to look very carefully at their license; you don’t want to use a viral license.
There are a couple of different licenses, as I am sure you know, for open source. One type lets you do whatever you want. Another type lets you do whatever you want, but you can’t resell it. Then there are a number of really bad licenses, where you use the code as you wish, but all the code that you write becomes open source.
And those, we don’t want to use.
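The two screening criteria mentioned here, project activity and license type, could be turned into a simple automated first pass. The sketch below is one hypothetical way to do that: the license groupings use SPDX-style identifiers and are illustrative rather than complete (and not legal advice), the one-year inactivity cut-off is an arbitrary assumption, and the library names and dates are invented.

```python
from datetime import date

# Illustrative groupings only; incomplete and not legal advice.
PERMISSIVE = {"MIT", "BSD-3-Clause", "Apache-2.0"}
COPYLEFT = {"GPL-2.0-only", "GPL-3.0-only", "AGPL-3.0-only"}

MAX_IDLE_DAYS = 365  # hypothetical cut-off for "not a lot of activity"


def screen(license_id, last_release, today):
    """Return the reasons to reject a candidate library (empty list = ok)."""
    reasons = []
    if license_id in COPYLEFT:
        reasons.append("copyleft license: " + license_id)
    elif license_id not in PERMISSIVE:
        reasons.append("license needs manual review: " + license_id)
    idle_days = (today - last_release).days
    if idle_days > MAX_IDLE_DAYS:
        reasons.append("no release in %d days" % idle_days)
    return reasons


# Invented candidates: (name, license, date of last release).
today = date(2008, 6, 1)
candidates = [
    ("lib-a", "Apache-2.0", date(2008, 3, 1)),
    ("lib-b", "GPL-3.0-only", date(2008, 5, 1)),
    ("lib-c", "MIT", date(2006, 1, 1)),
]
for name, license_id, last_release in candidates:
    problems = screen(license_id, last_release, today)
    print(name, "->", problems or "ok")
```

A check like this can only narrow the list. Anything it passes would still need a person to read the license text and look at how the project is actually maintained.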
Sean: We had an interesting conversation with the president of the Apache Foundation, and he talked about some of their tenets. One of them is that any Apache project has to have maintainers from at least three different companies, so that no one company can force the direction of the project a certain way.
Which is a little different from how MySQL works, or a lot of the Sun stuff works, where it’s very much corporate governance versus community governance.
The other thing that he talked about was that they strive for a very commercial-friendly license, where you can embed stuff with the Apache license in commercial software and you’re not required to open source your derivative works.
Leon: That’s critical.
Scott: On one hand, you’ve got open source, and then a sub-community of that is the Free Software Foundation, which is very much ideologically oriented around the idea that “the world would be a better place if everybody released their source code for everything.”
Leon: I don’t buy into that. I’m very much in favor of the commercial model, and I guess my conviction is very close to that of MySQL, because they do derive commercial benefit from open sourcing their code. They made a conscious decision; and they’ve got many people in the community contributing to the evolution of their code. At the same time, they obviously invested a lot of money.
In the end, their revenue is not coming from licensing–it’s coming from services, and they’re the primary experts on the database. Consequently, they have the ability to charge premium money for those services. I’m totally in favor of this sort of approach to open source where software is a commodity, and you get interested parties to contribute to its evolution.
Scott: On the closed source side, if you’re using Microsoft stuff or VMware stuff or stuff from Adobe or whoever, do you feel like you have to look at the EULAs to the same degree?
Leon: With open source, you do have to pay more attention to the licensing. With commercial software, generally I haven’t found it to be a problem.
As a side issue, open source and closed source are not born equal. I wouldn’t use an open source component in a Microsoft environment. If I were forced to use a Microsoft environment, I would probably get a commercial database. If I had to build a system on the Microsoft OS, I would use SQL Server. I would not use one of the open source projects, because I don’t feel they’re robust enough. There is not enough acceptance. There is not enough power and attention from the community. It’s very, very limited.
I would rather not run MySQL on Windows in production. I would most definitely not run one of those open source databases built for Windows, which are not MySQL. I don’t want to name them, but there are a number of them that were built for Windows and are considered open source projects, and they’re not so good.
Scott: For open source that only runs on Windows, the community is probably an order of magnitude smaller than for open source that primarily targets Linux.
Leon: That’s one consideration, and the second consideration is that Microsoft is not the primary environment that attracts the open source community. I don’t want to sound like a bigot, but I do believe that the average expertise level of developers that work on Linux is higher than the average expertise level of developers that are working in Windows. And it’s not because Windows is not good and Linux is good. Windows is an excellent operating system that went through a couple of evolutions.
It’s just a completely different scenario. It’s relatively closed, and the conscious decision that Microsoft made at different times was to build something that targeted 80% of the development community, and it’s done well for them.
Sean: One of the reasons that I like my MacBook is that I can run any OS on it that I want at this point.
Yet Apple is a closed source company, by and large. And they’ve got this massive innovation that they’re driving. I just saw the iPhone sales numbers this morning. Yet you take another example of a big company like Google, who gets a lot of credit for being open source, but they’re not really, in a sense.
They’ve built their business on open source and they have some projects, but it’s not like they turn around and say, “Well, here are all the algorithms we use. Just go deal with them.” And then you’ve got companies like MySQL, that are purely open source, and they’re out to make a profit.
So I’m curious, as you look out across these different vectors, how do you think innovation plays into that? Because from my perspective, when I look at open source products, the more technical the product, and the more the product is by developers for developers, the more innovation seems to stream out.
But when they get into more commercial products like Open Office, to be frank, does it really have the same innovation level? That’s not to say that I’m in love with Outlook. I’ve got my problems with that too.
From the standpoint of acquiring technology in your role, when you’re looking for stuff that’s innovative, do you see a similar sort of breakdown, where the more technical the open source project is, the more likely it is to be innovative?
Leon: First, there is something very interesting that you said that I was kind of itching to respond to immediately. Think of Google, think of Apple. They’re not open source. But there is a perception of open source.
I’ll give you an anecdotal example from real life. We started using the Google appliance for search, and we were planning to deploy it and they had a bug that was preventing us from going live. And they actually guard their code and algorithms quite closely.
So we said, “All right guys, when are you going to fix the bug.” And they said, “Oh, the next release is in three months.” But that doesn’t work for us. We went back and forth and in the end they said, “You know, guys, we understand. We can’t release it sooner than that, so here is the code.”
I put a lot of value in that. It’s a different philosophy. Yeah, it’s closed, but Microsoft would never in their life do that.
Sean: Microsoft actually does make the Windows source available to certain customers, governments, etc., but I don’t think that’s widely known.
About the second part of the question, is there an innovation continuum in open source, where certain projects have innovation that is arguably better than in a closed source model, whereas in others maybe it struggles to really hit the mark and bring things forward?
Leon: You definitely have a strong point, and you gave an example before. I don’t think Open Office is necessarily innovative. I don’t use it. It has kinks still, and it’s too much trouble.
Sean: That’s an easy one to throw a stone at in terms of its innovation, but I want to get your sense as someone who has to acquire and find the right technology, where you think the soft spot is.
For example, the Mac is amazing for UI innovation and fidelity of user experience. On the other hand, go try and find a really good database product for the Mac.
On the flip side, I don’t think people have seen Vista as the paragon of great UI innovation, you know what I mean?
Leon: Just to elaborate a little bit on this example of Open Office, I don’t believe it’s quite fully an open source project. It’s heavily sponsored by Sun, and if it was strictly an open source project, the community would not spend this much effort in building an office product.
So there is a difference in an open source project that is supported by the community and a company deciding to open source something. When Sun open sourced Solaris, it didn’t suddenly become an open source operating system in the full definition of the word.
The fact that something is not closed source doesn’t make it open source. There is also a community element.
Scott: Right–who’s allowed to contribute and who controls the direction of it.
Leon: I believe that open source projects with strong community interest in them and strong community participation in them are all pretty innovative. Not all projects that are based on public code are innovative open source projects.
The other part of that is, taking Apple as an example, it is proprietary, but at the same time I can take an Apple laptop and open a command window and I can type my Linux command.
As far as business decisions go, catering to 80% of your audience is good, but it has a price. In the case of Windows, you don’t get a very strong following on the side of cutting edge developers. Because developers don’t love Windows machines. I don’t know a single one.
Scott: Oh, I know some extremely passionate .NET developers.
Leon: Ok, maybe it’s my bias coming through…
Sean: As usual, we have a lot more questions we wish we could have asked, but I want to be sensitive to the time, as we’d scheduled an hour. Thanks for taking the time to chat. This was a great conversation.
Leon: Thank you.