How Software is Built


Interviewers: Scott Swigart and Sean Campbell

Interviewee: Marc Frons

In this interview with Marc Frons we talked to him about:

Sean Campbell:  Marc, can you lay out a bit of your background with the New York Times?

Marc Frons:  Sure. I’m the Chief Technology Officer of the New York Times’ Digital Operations group. I’ve been at the Times since June of 2006. My role is to oversee technology strategy and operations for all of our digital properties. Since July 2007, I have been directly supervising NYTimes.com’s technology and product development as part of my responsibilities at the company, in addition to looking at potential partnerships, acquisitions, and technology strategy and tactics overall.

Sean:  Where are the places where the New York Times has made investments in open and closed source technologies?

Marc:  The history of the NewYorkTimes.com is interesting. It is one of the first newspaper websites and now the largest newspaper website on the planet, as far as I can tell. And for the past six years, it has been a proprietary platform and a very good proprietary platform. The folks long before I got here wrote their own application server, even wrote their own programming language. And it’s highly scalable, very secure, and incredibly fast and it has served us well for the past six years. At the same time it is somewhat inflexible in its current form. So we made the decision a little over a year‑and‑a‑half ago to begin to bring more open source technologies into the shop and develop some of our new products using open source technologies. And that was a conscious decision to leverage the knowledge of what other folks were doing in our industry, which would enable us to move faster and be more innovative. At the same time, that other platform isn’t going away and we are continuing to do some things on that as well.

Scott Swigart:  Since you essentially have the world’s largest news web site with the associated high expectations that brings, how do you approach something like that? What would be your advice to another organization looking at open source: the things you can take as a given with open source, and the things you need to take a closer look at in order to ensure they are going to be handled?

Marc:  We are making a bet, not a big bet, because we are being incremental. We are not just saying "OK, we are going to redo this entire platform that serves almost a billion pages a month and all sorts of other things on a new platform" and just flipping the switch all in one big bang.

We have an architecture that allows us to experiment, hedge our bets and be incremental. And I would advise anyone to do the same when making any platform transition.  Do it a little bit at a time, try out a couple of new lower risk products, things that you can do quickly, things that don’t get a ton of traffic, and then you move up the food chain as you go along. Test your hypothesis; see if these things that you thought were true really are true.

Scott:  So just to make sure I understand that right, you may take some section that is three levels down in the hierarchy and move that over. When you look at a website, most people never really bother to understand what it is running on and so you keep reimplementing more sections until eventually the home page is being served up by the new technology?

Marc:  That’s pretty much our approach, because obviously your new technology is going to evolve along the way as well. You want to try it out on the smaller pieces of your site or smaller sites that are in your infrastructure and not do it all at once because you are going to evolve your open source platform as you go along.

Scott:  And so when you say open source and you are focusing on web, what immediately springs to mind is the LAMP stack, Linux, Apache…

Marc:  That’s pretty much what we have.

Scott:  OK.

Marc:  We have Sun Solaris in the house, OpenSolaris. But we will also be moving to Linux just for some things; we already have Linux for some things. We will be doing some more things on a pure LAMP stack.

Scott:  And some of the concerns around that I would assume are the quality of the software, support, servicing, documentation. So what are some of the things that you feel are really good versus things that are going to take extra diligence on your part?

Marc:  Quality is obviously important. We feel like these various pieces of the stack have reached a maturity, certainly on the level of the operating system, that allows us to use them more widely.  This is not something we have to worry about particularly much. On the database level, it has also reached a level of maturity that for our purposes anyway is adequate. On the web and application serving level, it is the same thing.

So if you look at the various layers of the stack, these are all things that we have increasing confidence in, whereas perhaps two or three years ago you might have said, "Well, these products have a ways to go." I mean, others like Apache were more mature, but certainly even in the last couple of years, MySQL has made significant strides and Linux keeps getting better, and the kind of support you can get for all of the various pieces of technology in your stack has certainly grown. The documentation that you get with it and the cost are big factors, obviously, but these are just attributes that make it possible to use open source software in the first place. For us, the major factor is that when talented developers are excited about the tools and technologies they are using and are passionate about them, you get a lot of innovative development. You can’t overestimate the importance of that.

Sean:  One question, given that you’ve built somewhat of a home-brewed solution. With that in mind, how do you see the typical advantages open source gives you, such as the ability to fork it and complete access to the source? Is that part of the equation for you, or is it just that you feel the platform has matured to the right point?

Marc:  Access to the source is nice, but I really would rather not be rewriting the kernel.

Sean:  Right.

Marc:  What I like about open source is that you are leveraging the community to help you make improvements far more rapidly than any single company could make them for you with a vendor product.

Sean:  Talk to me a little bit about that. How do you see that playing out in terms of the rapid innovation? Do you see the New York Times eventually contributing more actively to some of these open source projects? The Apache folks would have to look at you guys and say, "It’s a billion web pages." Whatever you had to say about Apache, you’d think they would listen, etc.

Marc:  I think so. This is obviously a new thing for the New York Times. So there’s a lot of building we need to do internally.  But our goal, in general, is to think about ourselves as a platform‑‑not a technology platform so much as a news and information platform‑‑and adopt some of the same ethos of the open source community in software, when it comes to content, information, news, research and data. So as we move forward into 2008, our goal is to create APIs not only for our own internal consumption, but for the outside world so that you can access the New York Times search archive and mix and match that with other data and information and applications, and create compelling applications out of our content and our other tools.

Sean:  One quick follow‑up to that and then I’ll turn it over to Scott. I’m just going to pull on a thread that seemed to be underlying the previous discussion, but I’m not sure if this is accurate or not. Would you say that, from your perspective, there is an almost symbiotic nature between the ethos of open source and the ethos of an organization that’s journalistic, in that you want to provide information? You want to provide an open sense of community. You want to have multiple eyes looking at something trying to analyze the righteousness of it.

Marc:  That’s right.  We’ve increasingly been moving to open NYTimes.com both journalistically and technologically. We have close to 100 blogs by our journalists and launch new ones almost every week. We allow our readers to post comments on articles and we’ll soon be launching additional community and social networking functionality on the site, built, of course, using open source technologies. Perhaps most significant, last September we ended our paid subscription product, TimesSelect, and made virtually all of our content free. And the response has been pretty amazing.

So it’s true, open source fits nicely with how we think about NYTimes.com as a whole.  If you think about the history of journalism, Journalism 1.0 maybe was the typical "who, what, when, where, why" third person journalist‑‑just the facts. Journalism 2.0, the new journalism, maybe is Tom Wolfe, Hunter Thompson and others who themselves became the center of the story. And 3.0 is where the audience becomes a part of the story through blogs and other forms of online community, and where the Internet helps inform what you’re doing by enabling you to mash up other data, other news, other points of view. So I think that’s where the New York Times as an organization is going, and really needs to go, as we become not just an isolated node on the Internet but an integral part of this vast but still growing community.

Scott:  Tell me if I’m reading too much into what you’re saying, but I don’t think I am. It seems you’re saying if I had a team of developers and they were used to buying their software and they were used to a proprietary model of acquiring their tools, they may look at our business and our information in more of a proprietary way. But if you take developers who are kind of steeped in free flow of information - openness, taking from the community and contributing back - those people come up with ideas. And those ideas probably infuse a lot of different things they do, including ideas they come up with for how the New York Times could collect and contribute information to the planet.

Marc:   I haven’t really articulated that even to myself in quite that way, but I think you’re right. I think adopting an open source philosophy tends to attract a particular type of developer who perhaps‑‑and this is not true for everyone, obviously‑‑but who perhaps is more engaged and more aware of the possibilities of information sharing than someone who is very used to a single product, a single source approach, a more closed approach. A lot of that, too, has to do with personality and skill set and interest.

Scott:  Right, right, but people who tend to work in open source kind of gravitate that way because of their personality I would assume.

Marc:  Yes.

Scott:  People are kind of this sort…

Marc:  Let’s put it this way: it certainly couldn’t hurt, right?

Scott:  [laughter]
Well, let me ask how maybe the digital business contrasts with some of the other businesses in the New York Times. You guys are using a lot of open source technology. Is that pervasive, do you feel, throughout the New York Times, or is it more like, "Look, we’ve got a lot of office workers. They all run Office all day long." Different businesses but not…

Marc:  It is not yet pervasive, and I doubt that it will be pervasive anytime soon.

Scott:  On the Web and for what you guys are doing, open source makes a lot of sense. Are there areas where you think it doesn’t make as much sense?

Marc:  When a proprietary product is so clearly superior to anything you can get through open source, then naturally you’d go with the vendor solution. But I think for systems and products where software does not change very much‑‑where you have a vendor product that meets your needs, that allows you to run without a whole lot of technical infrastructure, a product that gets the job done, doesn’t have to be particularly customized, isn’t changing all the time, where business requirements are set, where you just sort of want it to work‑‑because your business really isn’t changing. You have a payroll system. You know you have 10,000 employees. You know they get a paycheck every two weeks.

Unless there’s an open source product that has been battle tested in the market and works and you can just download, install and license it, you’re probably going to go with an established vendor, right?

Scott:  Right, right.

Marc:  So the same is probably true for many of your other corporate systems.

Sean:  Well, let me ask a question on that, because maybe I’m right or wrong about this. I would guess the New York Times probably breaks down into a ton of different IT buckets, three broader ones at a minimum. One might be servicing the needs of the staff, for lack of a better phrase: the accounting teams and the HR teams. Then you’ve got the web presence, which is huge. But then, I would also imagine you have the needs of the reporters, which is the heart of what people think the New York Times consists of in any case.

Marc:  Yes.

Sean:  So with that in mind I’m curious. Maybe for the office worker, a closed source piece of software makes sense. Then you’ve got the Web, where you have a variety of reasons you went with open source. How does it play out for reporters who are having to report from almost anywhere on the planet? What type of technology stack enables them to do what they do?

Marc:  It’s a good question. And it’s actually very interesting that you mention that. Shortly after I assumed this role and reorganized the NYTimes software group, I created a group called Interactive News Technologies. Basically, it’s a small team of software engineers and technically-minded journalists who sit in the newsroom alongside the reporters, and are charged with building both tools and very rapid development applications for news stories and news events. And actually they are using Ruby on Rails for some of their stuff.

We’re also now talking about fronting the big print publishing systems with web forms and other tools in order to get structured data, and a lot of the other things that we want to do that we can’t do easily in big, closed systems. So we are beginning to use open source in that way to help our Newsroom.

Sean:  Well, one question on that then; so Ruby on Rails is obviously popular and the Basecamp guys have probably popularized it more than most. As in, they are one of the canonical apps where they focus on how easy it is to use, it has good fidelity, etc. So just to drill into that one for a minute, why the choice of Ruby on Rails? Because you obviously laid out a good business problem: you have a team of reporters, and they probably have needs that are immediate. You don’t want to be slowed up in any way. You have to have a development team that is able to move pretty fast. And with all that in mind I’m curious why you made the choice you did.

Marc:  My philosophy is, you tend to go to a large extent with what your team is comfortable with and excited about using. So when I was at Dow Jones, we were a big Java shop and had a Java J2EE stack with a mix of open source and vendor components. But we used Apache, and we were moving to Linux, MySQL, and Tomcat and slowly away from Oracle and WebSphere, as these open source pieces of the stack matured.  That is what my guys were comfortable with. Those were the technologies they liked. They proved to me they could make them work. I wasn’t about to tell them they had to use something they were opposed to using.

Our interactive news technology guys said, "Listen, we’d really like to try a Ruby platform as opposed to Django, as opposed to some other framework." And I said, "Well, go for it; show me that it will work and start to experiment, do some things on it." So my feeling was you have to let your people have the tools that they are comfortable with as long as you feel that they have a chance to succeed, and you have to charge them with succeeding.

Scott:  You said something earlier that I think ties into that and, again, let me say it in my own words and see if you’d agree. It is more important that you can build whatever you want with your choice of today’s modern tools. And so you want your developers to be passionate and excited, so you let them use the tools they are going to be passionate and excited about using. Because you are going to get more creativity, productivity, etc. if you are letting them have input into their toolset rather than if you are simply dictating a toolset such as, "Well we are a Java Shop, therefore everybody use Java."

Marc:  Yes, exactly. The typical corporate CIO wants to consolidate systems and technologies and have a monolithic structure because he thinks that is going to save him money. And for some things that is true, but corporate technology is almost always a cost center.  Whereas for the digital businesses, IT is integral to the growth of the business and you could argue that IT is the business — software development. You don’t have a business if you don’t have software developers. You don’t have one if you don’t have journalists either if you are in the New York Times, but that is why technology and journalism are inexorably intertwined in the digital world much more so than they have ever been before and why they will continue to be even closer, even more intertwined as we move forward.

Sean:  So what do you see in the open source community that excites or intrigues you the most, and I guess at the same time, what excites you in terms of the closed source side of the world? As a CTO, you are looking at technology trends as well as specific implementation issues. What seems to stand out?

Marc:  Let us put it this way: there are so many open source projects, there are so many things happening, there is so much innovation going on that it is hard to pick and choose. I mean, I think obviously the whole LAMP revolution has been fascinating to watch. These kinds of contained frameworks like Ruby, like Tapestry are quite fascinating in that they put tools in the hands of people who perhaps don’t have the skills of hardcore software engineers, who are busy writing code at the base level.

But what I think really excites me is the move to making a lot of these more sort of all‑in‑one frameworks more scalable because if you are in the New York Times, it has got to scale and if it doesn’t scale, we basically can’t use it for anything really important.

Sean:  Right, exactly. Yeah, a billion pages are simply a billion pages, right?

Marc:  Right. Now the truth of the matter is it is not exactly evenly distributed. I mean, some parts of the site get a ton of traffic and others not so much. So if you want to do something on Ruby, you put that on a thing where you know it is not going to get creamed, just because it is the first time you have really done that and it is Ruby, so you have to make sure that it is going to work OK. But as you gain confidence in these technologies, then you begin to roll them out in more highly visible and highly trafficked places.

Scott:  I don’t know how core this is to our investigation, but it is an interesting question to me and you might have an interesting answer to it. This is sort of a continuum, right? On one end of the continuum, you have closed source proprietary companies who want to be very secretive about their roadmap, very secretive about their features, not share their source code. And even that is diminishing because even they are realizing that there is business value in transparency.

And then on the other end, you have the Free Software Foundation, which feels like all code effectively should be public. Initially they were kind of happy with software as a service because there were people who were using open source and building on top of it. But they might even want to go after someone like Google and say, "Hey, you benefited from open source, share your code." Where do you think it makes sense between those two ends of the spectrum to kind of draw the line? Because it has got to make sense. You can’t kill the goose that laid the golden egg in either direction.

Marc:  Absolutely. Obviously, people need to be compensated in some way, shape or form for their hard work.  But there are emerging models: just because you are open doesn’t mean you don’t charge for it. Look at MySQL: you can download it, but then you pay for support. It may be cheaper than Oracle but it’s not really free.  The same is true for various flavors of Linux. So I think people are finding ways to make those models work.

Google, it is true, has used a lot of open source, but they have also written a lot of their own proprietary stuff that they have not given back to the community. And I think they have done that probably because they needed to, just because of the scale of their operation. It really wouldn’t even be germane to most other businesses, right?

Posted by campsean on Wednesday, January 16th, 2008

