<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
	xmlns:media="http://search.yahoo.com/mrss/"
>

<channel>
	<title>How Software is Built &#187; security</title>
	<atom:link href="http://howsoftwareisbuilt.com/tag/security/feed/" rel="self" type="application/rss+xml" />
	<link>http://howsoftwareisbuilt.com</link>
	<description></description>
	<lastBuildDate>Fri, 25 Jun 2010 19:53:36 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<!-- podcast_generator="podPress/8.8" - maintenance_release="8.8.4" -->
		<copyright>2006-2007 </copyright>
		<managingEditor>scottswigart@technologyevangelism.com (How Software is Built)</managingEditor>
		<webMaster>scottswigart@technologyevangelism.com (How Software is Built)</webMaster>
		<category>posts</category>
		<ttl>1440</ttl>
		<itunes:keywords></itunes:keywords>
		<itunes:subtitle></itunes:subtitle>
		<itunes:summary></itunes:summary>
		<itunes:author>How Software is Built</itunes:author>
		<itunes:category text="Society &amp; Culture"/>
		<itunes:owner>
			<itunes:name>How Software is Built</itunes:name>
			<itunes:email>scottswigart@technologyevangelism.com</itunes:email>
		</itunes:owner>
		<itunes:block>No</itunes:block>
		<itunes:explicit>no</itunes:explicit>
		<itunes:image href="http://howsoftwareisbuilt.com/wp-content/plugins/podpress/images/powered_by_podpress_large.jpg" />
		<image>
			<url>http://howsoftwareisbuilt.com/wp-content/plugins/podpress/images/powered_by_podpress.jpg</url>
			<title>How Software is Built</title>
			<link>http://howsoftwareisbuilt.com</link>
			<width>144</width>
			<height>144</height>
		</image>
		<item>
		<title>Interview with Gwyn Fisher &#8211; CTO &#8211; Klocwork</title>
		<link>http://howsoftwareisbuilt.com/2009/02/02/interview-with-gwyn-fisher-cto-klocwork/</link>
		<comments>http://howsoftwareisbuilt.com/2009/02/02/interview-with-gwyn-fisher-cto-klocwork/#comments</comments>
		<pubDate>Mon, 02 Feb 2009 20:02:14 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Sean Campbell]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/2009/02/02/interview-with-gwyn-fisher-cto-klocwork/</guid>
		<description><![CDATA[Interviewers: Scott Swigart and Sean Campbell
Interviewee: Gwyn Fisher
In this interview we talk with Gwyn. Specifically, we talk about:

The value of automated code review
The advantage of code review at the developer level
Judging the security of open source codebases
Coordinating bug resolution between distros and upstream projects
Educating users about vulnerability in the face of complacency
The changing face [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Interviewers:</strong> <a href="http://howsoftwareisbuilt.com/about-scott-swigart/">Scott Swigart</a> and <a href="http://howsoftwareisbuilt.com/about-sean-campbell/">Sean Campbell</a></p>
<p><strong>Interviewee: </strong><a href="http://howsoftwareisbuilt.com/about-gwyn-fisher-cto-klocwork/">Gwyn Fisher</a></p>
<p>In this interview we talk with Gwyn. Specifically, we talk about:</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2009/02/02/interview-with-gwyn-fisher-cto-klocwork#value">The value of automated code review</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/02/02/interview-with-gwyn-fisher-cto-klocwork#advantage">The advantage of code review at the developer level</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/02/02/interview-with-gwyn-fisher-cto-klocwork#security">Judging the security of open source codebases</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/02/02/interview-with-gwyn-fisher-cto-klocwork#coordinating">Coordinating bug resolution between distros and upstream projects</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/02/02/interview-with-gwyn-fisher-cto-klocwork#educating">Educating users about vulnerability in the face of complacency</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/02/02/interview-with-gwyn-fisher-cto-klocwork#changing">The changing face of computer attacks</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/02/02/interview-with-gwyn-fisher-cto-klocwork#why">Why isn&#8217;t everyone using code-validation tools?</a></li>
</ul>
<p><span id="more-209"></span></p>
<p><b>Sean Campbell:</b> Gwyn, why don&#8217;t you start with your background and tell us a bit about Klocwork?</p>
<p><b>Gwyn Fisher:</b> Sure. I&#8217;m the Chief Technology Officer at Klocwork. I&#8217;ve been with Klocwork for a couple of years now, and I actually worked with them as a consultant for several years before that.</p>
<p>Klocwork was founded in 2001 as a spinoff from Nortel Networks. The Nortel research group that created our basic technology was centered on the head of the CASE department at ISPRAS, part of the Russian Academy of Sciences and affiliated with the Moscow State University, who was very interested in the notion of very large scale software architecture, model discovery, efficiencies and inefficiencies in incredibly large source code bases.</p>
<p>As it happened, Nortel had a specific business problem that they were trying to deal with that required massive re-engineering on one of their switch operating systems, and so as part of the existing partnership between Nortel and ISPRAS, he was funded to help solve this problem. Following the spin-off and creation of our company, he and most of the team relocated to North America, where many of them continue today to be the kernel of the ever-expanding domestic product development group within Klocwork.</p>
<p>We stayed in that space for several years, helping customers in very similar situations, who had requirements for re-engineering very large, very old, very complicated code bases, in order to simplify, componentize, and improve their software for newer devices, reuse, maintainability, and all that good stuff.</p>
<p>And then some of those same customers&#8211;and these are all very recognizable brand names&#8211;came back to us and asked us to do defect analysis for them. So, in addition to finding things that are wrong with their architecture and the fundamental way they&#8217;re building their software, we also started locating things like where they were leaking memory, doing bad things with pointers, generating buffer overflows, and other basic software problems.</p>
<p>After a couple years of research, we brought the first iteration of a product suite to market for defect and security vulnerability location and analysis.</p>
<p>So, since we were founded in 2001, we have sort of morphed over the years, away from a pure architecture and model discovery kind of technology, into a much fuller footprint in terms of what we do around defect analysis. That includes operational defects and security vulnerabilities, as well as continuing to emphasize our strength in architectural and maintainability issues. </p>
<p>There&#8217;s also that whole backbone of model discovery, visualization, and re-engineering&#8211;all the great stuff our customers need to be able to do in these very large code bases to manage them over time. We help them keep building out across more languages, more platforms, more tool chains, and all the good stuff that goes with working in a production software development environment.</p>
<p><a name="value"></a></p>
<p><b>Scott Swigart:</b> What are the things that are really hard to find with standard code reviews and QA that tools like yours can quickly make apparent?</p>
<p><b>Gwyn:</b> The quick, sort of palm-to-forehead, kind of &#8220;Why wouldn&#8217;t we have found that? Why didn&#8217;t we think of that?&#8221; sort of issues tend to be in the area of what is called inter-procedural analysis.</p>
<p>If you have a very large system, it&#8217;s made up of a whole bunch of functions, methods, and modules all linked together into large system images. And the problem, whether you&#8217;re using code review, runtime testing, or whatever, is finding enough combinations so you can trace through, from source to sink point, exactly what&#8217;s going to go on with a pointer, buffer, array, or whatever it happens to be.</p>
<p>We can point out across multiple different function boundaries, that this pointer you&#8217;re grabbing from over here, you&#8217;re using inappropriately over here, or this memory you&#8217;ve allocated here, you&#8217;re never releasing on this particular thread. Those sorts of instances would require your average high level architect to really bend their mind around them for days or weeks at a time to try and find manually. They&#8217;re the sorts of things that we can point out to someone and they can understand it very easily within 30 seconds or a minute.</p>
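<p><em>To make the inter-procedural idea above concrete, here is a minimal sketch in Python. It is an illustration only, not Klocwork&#8217;s engine or any real tool&#8217;s API: the program model, the function names in <code>PROGRAM</code>, and the helpers <code>may_return_null</code> and <code>unchecked_derefs</code> are all hypothetical. The sketch assumes each function has already been reduced to a summary of what it can return and whose results it dereferences, then traces a possibly-null pointer from its source, across function boundaries, to the place it is used.</em></p>
<pre><code class="language-python"># Hypothetical program model (an assumption for illustration, not real tool output).
# "returns" holds the literal "NULL", the literal "VALID", or the name of a callee
# whose result is passed straight back to the caller.
# "derefs" lists callees whose result is dereferenced without a null check.
PROGRAM = {
    "find_user":   {"returns": ["NULL", "VALID"], "derefs": []},
    "lookup":      {"returns": ["find_user"],     "derefs": []},
    "load_config": {"returns": ["VALID"],         "derefs": []},
    "print_name":  {"returns": ["VALID"],         "derefs": ["lookup"]},
    "startup":     {"returns": ["VALID"],         "derefs": ["load_config"]},
}


def may_return_null(program):
    """Fixed-point pass: a function may return NULL if it does so directly
    or if it forwards the result of a callee that may."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for name, info in program.items():
            if name in nullable:
                continue
            if any(r == "NULL" or r in nullable for r in info["returns"]):
                nullable.add(name)
                changed = True
    return nullable


def unchecked_derefs(program):
    """Report (caller, callee) pairs where a possibly-NULL result is dereferenced."""
    nullable = may_return_null(program)
    return sorted(
        (caller, callee)
        for caller, info in program.items()
        for callee in info["derefs"]
        if callee in nullable
    )


print(unchecked_derefs(PROGRAM))
</code></pre>
<p><em>In this toy model the only finding is <code>print_name</code> dereferencing the result of <code>lookup</code>. Nothing in <code>lookup</code> itself mentions NULL; the problem is only visible once the summary of <code>find_user</code> is carried across the call boundary, which is the kind of source-to-sink tracing that is tedious to do by hand in a large code base.</em></p>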
<p>Obviously, we all bring our own biases to bear, in terms of how we think about our products and how we think about our tools. And it&#8217;s perhaps obvious to think about a tool like this as a QA tool, and certainly the heritage of tools like this has been very much driven by a downstream audit or an adjunct to code review. But, really, that&#8217;s not where this technology comes from&#8211;this is a developer&#8217;s tool. The first one of these was Lint, back in the &#8217;70s.</p>
<p>And as a developer&#8217;s tool, well, Lint&#8217;s got a pretty bad name. Let&#8217;s face it.</p>
<p>[laughter]</p>
<p>As sort of a history, we&#8217;ve all sat there and watched the CRTs desperately trying to refresh as acres upon acres of warning messages come out of these things. But when tools like Klocwork bring modern analysis technology and accuracy and complexity into that equation and deliver it back to the developer, all of a sudden you&#8217;re not just inundating them with all kinds of stuff that they can&#8217;t deal with. You&#8217;re actually giving them a few very useful bits of information before they check code in.</p>
<p>That&#8217;s our approach: instead of trying to kick in downstream and then figure out who the right developer is to blame, our goal is to build analysis much more organically into the development process, so it becomes an enabling environment rather than a blaming environment.</p>
<p>The uptake within an organization changes overnight, as soon as you can get through that cultural issue. Rather than giving QA a new tool for telling the developer that they&#8217;re an idiot, we are giving the developer a tool that will help them prevent being called an idiot. [laughs]</p>
<p><b>Scott:</b> In some of our conversations with Microsoft, they&#8217;ve talked about modeling tools that they use similarly to what you&#8217;re describing; before code ever gets checked in, the developer does analysis of the code and fixes issues that are flagged.</p>
<p>Talk a little bit about the benefit of having that done at the developer level versus, for instance, in a QA process that happens once or twice a year.</p>
<p><a name="advantage"></a></p>
<p><b>Gwyn:</b> If you do take a milestone approach or do it just before you roll out, the challenge tends to be that you get an overwhelming number of issues to deal with, often all at a critical time in the product life cycle. From the individual developer&#8217;s point of view, it can be an awful ordeal.</p>
<p><b>Scott:</b> In other words, the developer would tend to build up a big pile of issues that all come back for resolution at once, when they are periodically unearthed. On the other hand, if the issues are dealt with proactively, you never accumulate that big, ugly pile of bugs.</p>
<p><b>Gwyn:</b> Right. The other thing to consider is that this ongoing system means that the developer is dealing with mistakes that they made just recently, instead of six months ago, so it&#8217;s still fresh in their mind.</p>
<p><b>Sean:</b> You mentioned that you guys do some work with different open source communities. Educate me about your input into their process&#8211;are there open source analysis tools that are commonly used?</p>
<p>Do you find that developers are running tools that they have available prior to submitting code to the mailing list, or is it more the case that they really just trust the mailing list and are also interested in the scans that you and other vendors do and what those uncover?</p>
<p><b>Gwyn:</b> There is no doubt about the wisdom of the thousand eyes approach, particularly for the more mature open source projects, and people do use as many techniques and tools as they can get their hands on.</p>
<p>In the Java space, there&#8217;s a very nice open source analysis tool&#8211;FindBugs&#8211;that can really help developers get started with analysis. It doesn’t perform inter-procedural analysis, so in the words of the tool’s developer, they’re really focused on “low hanging fruit”, but it’s definitely a great place to start.</p>
<p>In the &#8220;C&#8221; world, well, not so much. There are a couple of tools that check coding style and things like that, but while they can tell you things like you should declare your copy constructor as private or you might generate memory leaks, their ability to find actual defects (as opposed to poor coding constructs) is very limited, unfortunately.</p>
<p><b>Sean:</b> And sometimes, it&#8217;s really hard to spot with the naked eye looking at apps committed to a mailing list.</p>
<p><b>Gwyn:</b> Exactly.</p>
<p><a name="security"></a></p>
<p><b>Sean:</b> This is in kind of a different area, but given your perspective and what you&#8217;ve done in terms of looking at a variety of codebases over the years, what tools do you think organizations should utilize to ascertain whether a given open source project is secure? </p>
<p>That&#8217;s assuming they&#8217;re not going to go buy a tool, do a massive code review or utilize a third party; they&#8217;re just out there having the internal conversation of Apache versus IIS, or MySQL versus SQL Server, or MySQL versus Oracle.</p>
<p>Different people make different claims about which is more secure, but how do we prove it? You end up in this kind of absence of evidence argument, I think it&#8217;s termed. Scott did a look through one of the mailing lists for one of the major projects, and he didn&#8217;t see anything about threat modeling. That doesn&#8217;t mean it&#8217;s not occurring, but that also doesn&#8217;t mean it is.</p>
<p>What&#8217;s your take on that? What should an organization do to really ascertain whether a given project has a robust development lifecycle around security?</p>
<p><b>Gwyn:</b> To be blunt, they should do everything they possibly can. You just have to assume that the guys on the other side of the fence haven&#8217;t done it yet. Do you believe Coverity, who says their favorite 11 open source projects are perfect? Or do you believe Fortify, who comes out two weeks later and says the sky is falling and they’re all insecure? Do you believe Mozilla, who says they have a thousand eyeballs looking at their code all the time? Or do you believe the users who say, &#8220;Yeah, but I filed almost as many bugs against you as I do against Microsoft&#8221;?</p>
<p>Nobody has the answer to this question. I&#8217;ve talked to several people over the years who said, &#8220;Wouldn&#8217;t it be nice if we could put together an indemnification agency that would use tools like yours to help some of our customers have more confidence in open source?&#8221; The idea would be to say, &#8220;If you want to use Apache (for example), here&#8217;s everything you need to know about it. Here&#8217;s the support protocol around it, and most importantly here&#8217;s a security rating for it.&#8221; In essence, they’d be attempting to provide a warranty for a particular revision of a particular open source project.</p>
<p>There are various organizations around the country and around the world who do little bits of this, and some of them are quite successful, but no one has ever gotten to the point of taking responsibility for saying &#8220;Tick or cross: you should use this.&#8221; It all comes down to providing a lot of data for the customer to evaluate, because no one is willing to take on the legal liability of saying what people should consider safe to use.</p>
<p>Because of all the sources out there and because all the tools and techniques are out there, you’d assume somebody would at some point do it. But there are enough litigious arguments against it that I don&#8217;t think it will ever happen, unfortunately.</p>
<p>I think it would be a great thing to do, and I think that most of the senior committers on those mature projects would want to see that happen as well. But they&#8217;re giving their time for free; it&#8217;s not something they&#8217;ll ever want to do themselves. And I think any commercial entity is just going to be scared witless of some bank somewhere going, &#8220;You know what, you told me to use X.Y.Z of Apache, and it turned out it had this vulnerability in it. We&#8217;d like you to compensate us for Grandma&#8217;s password finding its way to the Ukraine.&#8221;</p>
<p><a name="coordinating"></a></p>
<p><b>Scott:</b> We&#8217;ve talked a little bit about the value of static analysis to find things that aren&#8217;t necessarily easy to see with the human eye, and the value of doing it up front. One of the interesting things I find in open source is that with a proprietary company like Microsoft or Oracle, for the most part they own the code from soup to nuts. If there&#8217;s a bug, there&#8217;s a place to file the bug. They fix it, or they don&#8217;t fix it, but it&#8217;s their codebase.</p>
<p>When you&#8217;re dealing with something like a Linux distro, on one hand you&#8217;ve got this downstream distro that&#8217;s packaging a lot of upstream projects. And so, you&#8217;ve got the distro provider who has hired people to do the packaging and things like that. But, they&#8217;re also paying people to work on these upstream projects, whether it&#8217;s the shell, the web server, the kernel, or whatever. Then, when a vulnerability happens, there&#8217;s a relationship necessary for a fix posted to the distro to get posted to the upstream project.</p>
<p>Distros also talk about things to make these upstream projects enterprise ready. In your experience, do you find at the distro level that people are using tools like yours so that if an audio player for example didn&#8217;t go through the same kind of analysis upstream, they maybe are doing it downstream and working with the upstream project to improve the quality of the code?</p>
<p><b>Gwyn:</b> Wouldn&#8217;t that be a good way of organizing life? Unfortunately, it doesn&#8217;t happen so much. Some of the distributions today are backed by serious individuals or serious companies. It&#8217;s not just two guys off in Europe trying to bang a CD together, and in some ways, they&#8217;re doing phenomenal things. Look what Ubuntu has done around the user experience. They&#8217;re doing amazing stuff for the operating system as a whole.</p>
<p>I would say the distro manufacturers as a whole certainly validate in an industrial strength way and do the needed extra development, but they don&#8217;t really take responsibility. In terms of indemnification, guarantees of quality, or anything like that, they tend to fall back on a very dinosaurish kind of mentality. Got a problem? Hey, it’s only a patch, suck it up.</p>
<p>In just the same way as indemnification players don&#8217;t really want to step up and say it&#8217;s secure, distro vendors really don&#8217;t want to take this step of guaranteeing not just observed quality but absolute ground-up code quality, because of the exposure that would entail with all kinds of people over whom they have no control. Who do they go to if they have to build a fix on some kind of SLA? Of those thousand eyes, half of them are asleep at any given time. And the other half work for somebody else.</p>
<p>I&#8217;m certainly not saying you should go and buy commercial software all the time because you can always go and get hold of somebody. At the same time, though, the community just doesn&#8217;t have the level of leadership yet that lets it take that particular kind of responsibility for the quality and security of the software they’re producing. Perhaps the fact is that there&#8217;s just not that level of requirement being surfaced by the enterprise customers.</p>
<p>At the end of the day, nothing&#8217;s ever free. I think that open source has absolutely proved that software can be as free as you like, but that&#8217;s just one small component of the total cost of ownership of these things.</p>
<p><b>Scott:</b> I imagine that you run into the idea of a security IQ that starts out fairly low, where people say &#8220;Well, we haven&#8217;t had a problem, so we must be doing things right.&#8221;</p>
<p><b>Gwyn:</b> [laughs] And my five users are all very happy with it.</p>
<p><a name="educating"></a></p>
<p><b>Scott:</b> Right. So, how do you approach educating somebody about the fact that just because nothing has blown up and it hasn&#8217;t been the end of the world, they might not want to assume that everything&#8217;s hunky dory?</p>
<p><b>Gwyn:</b> It is actually fascinating how you can never feel anybody else&#8217;s pain. I think that&#8217;s a rule of life, whether it&#8217;s physical or intellectual. You can&#8217;t actually empathize with all the people who are going through it until you&#8217;ve done it yourself.</p>
<p>At one point early in my career, we got this very polite email from the DOD saying that they had taken 65 servers offline because they found a vulnerability in one of the products I was responsible for. And this was back ages before anyone was really worried about such things. </p>
<p>It was a tremendous wakeup call to realize that we had just taken away a significant capability from a very important agency just because of an error on our part.</p>
<p>They were very polite about it and didn&#8217;t intend to cause any significant commercial ramification, but as professionals, how were we to get to the root cause of it? How would we stop it from happening again in the future? Until that moment, I had never felt that pain, and I didn&#8217;t really understand it.</p>
<p>Increasingly, we get people coming to us from the device space to ask specifically about security. And they&#8217;re not coming to say &#8220;OK, we know we need to worry about this;&#8221; it&#8217;s &#8220;why would I have to worry about this? Everyone&#8217;s talking about this and I don&#8217;t get it.&#8221;</p>
<p>So, we were talking with this one medical device manufacturer, and they make truly scary stuff. If this stuff goes wrong, it&#8217;s not going to be happy days for anybody. Their senior architect was sitting in front of me and he said, &#8220;Marketing tells me I have to put an IP cable on the back of this box because he wants the paramedics to be able to send stuff off to the hospital before the guy gets there.&#8221;</p>
<p>And I asked him, &#8220;OK, and what are you doing about making sure that your device is doing what you think it&#8217;s doing and hasn&#8217;t been hijacked or something?&#8221; And he responded, &#8220;Why would anyone ever want to do that?&#8221;</p>
<p>It comes down to explaining to people that just because they would never think to go and hijack somebody&#8217;s super-fridge or their pacemaker or whatever it happens to be, that doesn&#8217;t mean there isn&#8217;t somebody else out there with too much time on their hands who would just think it&#8217;s a huge joke.</p>
<p>In many ways, it&#8217;s that first moment of enlightenment that&#8217;s the hardest for people. You&#8217;ve got to get across that notion that just because they would never do it, and just because they can&#8217;t understand why anybody would, doesn&#8217;t mean it&#8217;s not going to happen to their system or device.</p>
<p>Once they&#8217;re through that, then it becomes a very pragmatic process, but that very first thing you&#8217;ve got to have is for that senior architect, VP, or whoever to feel the pain.</p>
<p><a name="changing"></a></p>
<p><b>Scott:</b> It seems to me that in a way, Microsoft got a little bit lucky (though I&#8217;m sure they didn&#8217;t feel that way at the time) back in the early 2000s, when Code Red, NIMDA, Blaster, and Slammer came out. These really high profile exploits were very embarrassing, and they were the kind of exploits that brought a whole company network down. And so, they decided to just pour dollars and resources into solving the problem.</p>
<p>Today, it seems like the nature of exploits is very different. They are far less likely to absolutely saturate your network and let you know that you&#8217;ve been hit. It&#8217;s more about setting up a botnet and what some people call crime as a service.</p>
<p>It&#8217;s a lot more valuable to have Sally not know her computer&#8217;s been compromised, or to have some Linux server in a data center doing command and control for a botnet, without its admins knowing about it.</p>
<p>Like I said, Microsoft got lucky in the timing and the nature of the exploits. In some of these projects, there&#8217;s a false sense of security because they say &#8220;Well, where&#8217;s our NIMDA? We&#8217;ve never had one.&#8221; Well, they&#8217;re never going to have one, because that&#8217;s not the kind of exploits people are writing.</p>
<p><b>Gwyn:</b> Yeah, exactly. Over the last ten years, we have moved from script kiddies doing dumb things to these really very, very smart attack vectors. It is a whole new day.</p>
<p>We can&#8217;t go a whole interview without me poking at Microsoft. [laughs] Look at the animated cursor type of thing that hit the public eye a year ago. This was vulnerable code that had been in Windows since day one, but all of a sudden, out of nowhere, you don&#8217;t have to see the person, you never have to see their machine, you can do everything remotely, and all of a sudden that thing is yours.</p>
<p>And what&#8217;s more, if you&#8217;re smart enough, the person who&#8217;s getting compromised doesn&#8217;t know what&#8217;s going on. It sort of became that perfect storm of an attack vector. And at the end of the day, what is it? It&#8217;s a simple exploit of a buffer overflow vulnerability, albeit a phenomenally complicated one. Those guys should get a PhD just for creating those things. [laughter]</p>
<p>You&#8217;ve got to give them kudos, right? But, that happened at the height of Microsoft&#8217;s secure development life cycle, and that bug had been reported. It was in their queue, and they knew all about it, but it had been de-prioritized and was buried in a pile of sixty eight thousand other things they had to worry about.</p>
<p>So, they are at one end of the spectrum because they have so many products and so much going on, but if you take the whole open source domain as a thing, there&#8217;s probably more code there. And you know that same level or magnitude of vulnerability is out there.</p>
<p><b>Scott:</b> That&#8217;s one of the benefits of tools like yours. On one hand, there&#8217;s putting it in the hands of every developer so that they can have a &#8216;get clean, stay clean&#8217; sort of mentality. But, the other value I see is that the most dangerous code in the world is probably the old code that nobody is looking at anymore, because it was written before people knew about integer overflows and buffer overruns.</p>
<p>I remember, way back when, we bought these Unix systems from a vendor. We were writing some low level network code. We did something that crashed the server. We had a bug in our code, and it was sending a packet where the frame was the wrong size or something. The response from the Unix vendor was basically, &#8220;Hey, there&#8217;s an RFC, and if you&#8217;re operating outside that and bad stuff happens, that&#8217;s not something we&#8217;re going to worry about.&#8221;</p>
<p>[laughter]</p>
<p><b>Gwyn:</b> There you go. Good rationale. On that same tack, IBM&#8217;s X-Force does all these amazing things all the time, like showing how forcing an invalid pointer usage can compromise Flash applications, etc. &#8211; seriously smart people.</p>
<p>One of the things they do is issue a “top ten” offenders list for attacks during a year, and something struck me in, I think, last year&#8217;s report. No prizes for guessing who is the topmost vulnerable vendor, because Microsoft wins every year. But, of all the reports that they had coming to their NOC over the year, Microsoft products were responsible for three percent. Three percent of all the reports had something to do with Microsoft software, somewhere along the way.</p>
<p>The fascinating thing I saw was that at number five or number six&#8211;I forget exactly where they sat&#8211;was Mozilla. How many products do they make? They have one fantastic browser I use every day, and some other random stuff that is much less popular, but they are responsible for 1.4 percent, 1.3 percent &#8230; don&#8217;t quote me on the figures, but some relatively close number of available vulnerabilities, compared to all the things that Microsoft puts out.</p>
<p>Now, in reality, how many of the things that Microsoft puts out are actually getting attacked at the same level that Firefox is getting attacked? But still, it makes you think… Oh, and that top 10 was only responsible for about 15 percent of all reported attacks, so over 80 percent were against software and vendors you’ve never heard of.</p>
<p><b>Scott:</b> Well, I want to be sensitive to the time. We&#8217;ve covered a lot of great stuff, but what have we left out?</p>
<p><b>Gwyn:</b> If I were in your place doing the interview, I would ask what&#8217;s wrong with these tools? At face value, they are no brainers. Everybody should be using them, so why not? What&#8217;s the barrier to acceptance? Is it just because they are very new? What&#8217;s really going on here?</p>
<p><a name="why"></a></p>
<p>We try to answer this question for ourselves all the time. Why is nobody in this market a Microsoft yet? Why are we still a very nascent segment? Obviously, there are some elements of market maturity there. Most of the companies in this space have been around for a very short time.</p>
<p>It doesn&#8217;t matter which community you are working with, whether open or commercial. There is an overwhelming thought process involved around analysis. People tend to look at this process the same way that we started this conversation about it today&#8211;thinking of this as an audit environment. </p>
<p>People tend to regard it as a milestone activity and as something that is an onerous, annoying part of the software development process akin to code review. I can&#8217;t imagine anything more mind numbing than having to revisit the days in which I did code reviews all the time.</p>
<p>If this kind of technology is equated to that area of things, that&#8217;s a problem for this industry, and for the service technology type. It behooves companies like us&#8211;and open source projects who are trying to do the same sort of thing&#8211;to get over that, and to engage with the developer in a meaningful discussion around what you actually need out of one of these tools. </p>
<p>Is it that you need the most accurate tool with the lowest noise ratio? Or do you need something that finds every bug imaginable, regardless of the noise ratio? Do you need something that&#8217;s really fast, or that integrates with every environment, or something you never see and just pops up when you do something dumb? </p>
<p>The various vendors in this space have markedly different models for how they engage with development organizations. As for ourselves, we are very upstream. Everybody we ever talk to is in development. Other people in our space are very downstream. They talk to auditors and CFOs&#8211;guys who are buying firewalls in the morning and software validation tools in the afternoon. There is a completely different feel to how you engage with a development organization when you start getting that viral distribution down at the volume end of the buying pyramid.</p>
<p>I think it&#8217;s an interesting challenge in this space&#8211;how to get that massive distribution. When did IDEs suddenly hit the tipping point from being a bizarre luxury that a few companies used, to being something that every developer on the planet sees as a right and a prerogative?</p>
<p>It&#8217;s the same with runtime profilers. It&#8217;s the same with memory profilers. It&#8217;s the same with all these tools, which are just part of today&#8217;s normal, everyday SDLC.</p>
<p>When did they get there? How long had the market been around? Is it just a volume-based thing? Is it a market approach thing? I think that&#8217;s the fascinating question in full scale analysis now. Obviously, it has value and is having great effect in both commercial and open source, so when does it go big time?</p>
<p><b>Scott:</b> That&#8217;s a great closing thought. Thanks for taking the time to talk with us.</p>
<p><b>Gwyn:</b> You bet. Thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2009/02/02/interview-with-gwyn-fisher-cto-klocwork/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Interview with JanRain on OpenID</title>
		<link>http://howsoftwareisbuilt.com/2008/08/25/interview-with-janrain-on-openid/</link>
		<comments>http://howsoftwareisbuilt.com/2008/08/25/interview-with-janrain-on-openid/#comments</comments>
		<pubDate>Mon, 25 Aug 2008 17:39:24 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Sean Campbell]]></category>
		<category><![CDATA[Identity]]></category>
		<category><![CDATA[OpenID]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/2008/08/25/interview-with-janrain-on-openid/</guid>
		<description><![CDATA[Interviewers: Scott Swigart and Sean Campbell
Interviewees: Brian, Larry, and Michael.
In this interview we talk with Brian, Larry, and Michael. Specifically, we talk about:

OpenID&#8217;s offerings for decentralized online identity management
Securing online identity: technology and beyond
Aggregating identity information for multiple resources
Spurring adoption by developers, service providers, and web sites
Responding to objections about centralized identity management
Closed [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Interviewers:</strong> <a href="http://howsoftwareisbuilt.com/about-scott-swigart/">Scott Swigart</a> and <a href="http://howsoftwareisbuilt.com/about-sean-campbell/">Sean Campbell</a></p>
<p><strong>Interviewees: </strong><a href="http://howsoftwareisbuilt.com/about-brian-kissel-ceo-of-janrain/">Brian</a>, <a href="http://howsoftwareisbuilt.com/about-larry-drebes-vp-of-engineering-and-founder-janrain/">Larry</a>, and <a href="http://howsoftwareisbuilt.com/about-michael-graves-cto-of-janrain/">Michael</a>.</p>
<p>In this interview we talk with Brian, Larry, and Michael. Specifically, we talk about:</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2008/08/25/Interview-with-JanRain-on-OpenID#offerings">OpenID&#8217;s offerings for decentralized online identity management</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/08/25/Interview-with-JanRain-on-OpenID#security">Securing online identity: technology and beyond</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/08/25/Interview-with-JanRain-on-OpenID#aggregating">Aggregating identity information for multiple resources</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/08/25/Interview-with-JanRain-on-OpenID#adoption">Spurring adoption by developers, service providers, and web sites</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/08/25/Interview-with-JanRain-on-OpenID#responding">Responding to objections about centralized identity management</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/08/25/Interview-with-JanRain-on-OpenID#credibility">Closed source versus open source credibility</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/08/25/Interview-with-JanRain-on-OpenID#future">The future of online identity management</a></li>
</ul>
<p><span id="more-177"></span></p>
<p><b>Sean Campbell:</b> To get us started, could each of you introduce yourself and tell us about your current role as it relates to OpenID and JanRain?</p>
<p><b>Larry Drebes:</b> I am in engineering, and these days, I spend most of my day thinking about our software and how it relates to the market needs.</p>
<p>I was one of the co-founders of Four11, which produced a product called RocketMail, the underlying platform for what became Yahoo! Mail. Yahoo! bought RocketMail back in 1997, shortly after Microsoft bought Hotmail.</p>
<p>After that, I co-founded a company called Desktop.com, which was earlier than Google Docs, but similar in concept. Most of my background is with large-scale software, in software-as-a-service deployments.</p>
<p><b>Mike Graves:</b> I&#8217;m the CTO. I&#8217;ve been here a little over a year; I was previously at VeriSign, where I was CTO of one of their two divisions. I arrived at VeriSign in 2000 as the founder of a dot-com-era startup company called Signio. It became the merchant-payment processing system for VeriSign until they sold to eBay last year.</p>
<p>I took a little time off and came back to VeriSign to help lead the charge into new markets where I first became familiar with OpenID, and eventually ended up here at JanRain where we’re exclusively focusing on OpenID.</p>
<p><b>Brian Kissel:</b> I&#8217;m Brian Kissel, the CEO. I joined the company in January of this year at the point when the team felt like OpenID was really starting to reach an inflection point of market adoption. The OpenID V2.0 specification had recently been finalized, large corporate sponsors like IBM, Google, Microsoft, Yahoo!, and VeriSign were announcing their support, and website adoption was accelerating.</p>
<p>We wanted to take advantage of the extensive development work that the team here had accomplished over the last couple of years in creating next generation platforms and technologies for user-centric identity and authentication management via OpenID.</p>
<p><a name="offerings"></a></p>
<p><b>Scott Swigart:</b> Thanks. Give us a little bit of a description of OpenID, and what you as a company do around the technology.</p>
<p><b>Brian:</b> OpenID is the equivalent of single sign-on that you would get inside an enterprise. If you log in at the beginning of the day via Exchange or something like that, there&#8217;s an abstraction layer that also logs you into backend systems like CRM, ERP, HR, Accounting, and so forth. You only have to log in once.</p>
<p>You may be familiar with Microsoft&#8217;s Passport and InfoCard technologies, which embody the idea that you have one user name and password that you can use on any enabled web site.</p>
<p>That&#8217;s really the same concept behind OpenID&#8211;it&#8217;s for users on the open Internet to have just one or a few identities that they can use to access all web sites that are OpenID-enabled.</p>
<p><b>Scott:</b> And in terms of your company, what are some of the things that you do around OpenID?</p>
<p><b>Brian:</b> Our goal is to be the platform provider of choice for OpenID implementations, for the end user, the OpenID provider, and the web sites that are accepting OpenIDs for registration and login.</p>
<p>In OpenID parlance, there are two kinds of contributors to the OpenID ecosystem that benefit the end user. One type is called the relying parties (RPs), which are web sites that accept OpenID for authentication. The other type is the OpenID providers (OPs), which are organizations that issue OpenIDs to consumers, employees, or members to use on the web sites.</p>
<p>We have a property called MyOpenID.com, which is a hosted ASP service where you can come and get an OpenID. We also offer services to web site operators to enable their web sites for OpenID and to create better, more intuitive login user experiences.</p>
<p>And then we have solutions that we offer to OpenID providers who want to issue OpenIDs to their employees, their partners, or their customers with their own brand.</p>
<p><b>Scott:</b> Obviously, people who are building sites that may accept OpenID are using a variety of technologies. What kinds of technologies do you support, in terms of the components that you provide and things like that?</p>
<p><b>Brian:</b> There are open source OpenID libraries for about eight different platforms out there, including four or five major platforms.</p>
<p><b>Larry:</b> We currently support three of those; we used to support six. Very early in the design process, we created open source reference implementations in a variety of languages. For OpenID V1, we wrote Ruby, Perl, Python, PHP, Java, and C# libraries, all with a common API, and tried to seed the market to get OpenID into the hands of developers.</p>
<p>Since that stage, we&#8217;ve brought a lot of companies into the fold, and it&#8217;s no longer necessary for us to support some of the more niche platforms. Right now, we provide open source libraries for OpenID V2 in Ruby, Python, and PHP.</p>
<p>OpenID fits together between the end user, the provider (which is often us), and the web sites through common web protocols.</p>
<p><b>Scott:</b> As you mentioned, the idea behind OpenID is similar to what Microsoft was trying to accomplish with Passport. What are the advantages of OpenID versus a Passport implementation?</p>
<p><b>Mike:</b> The crucial factor is that there are a lot of technical things inside of Passport that are very nice architecturally, but politically and socially, it&#8217;s just not an option.</p>
<p>The ecosystem will resist any company owning that space in that way, no matter how good the technology is. No one wants to be locked into a silo at that level of the stack.</p>
<p>One extremely valuable thing that OpenID offers is its decentralized nature. The trust and provider infrastructures are decentralized, so anyone can spin up their own OpenID server.</p>
<p>That doesn&#8217;t mean they have to be trusted or that anybody is going to trust them. But the trust circles and trust relationships arise organically out of that, meaning that if you use OpenID, you have portability and the ability to avoid lock-in. You can move to another provider using the same technology, and let them compete against each other in terms of serving the user.</p>
<p>That&#8217;s not the only difference, but it&#8217;s crucial that no one owns this, and the turf is not proprietary at the lowest level.</p>
<p><b>Scott:</b> So since there are a variety of OpenID providers, people develop trust in specific ones, and the providers work to build their reputations.</p>
<p><b>Mike:</b> Right. A lot of factors go into the decision of which provider you want to go with, and those factors can change over time. With OpenID, however you decide, you&#8217;re not tied into that provider.</p>
<p>With OpenID, if you read a headline that gets you concerned that your data is being disseminated in a way that you don&#8217;t approve of, or that violates the EULA as you read it, you have more options.</p>
<p>That doesn&#8217;t mean that it&#8217;s not a headache, but the architecture of the system is such that you can migrate to a new provider that abides by the policies or values that you demand in that relationship.</p>
<p><b>Sean:</b> Would you say that what you are trying to do by federating identity is somewhat similar to the experience somebody gets when they can look at Debian versus Ubuntu?</p>
<p>That is, Ubuntu is Debian with some chrome and obvious other technical differences, but they come from the same lineage, and if I don&#8217;t like what Ubuntu is doing, I can go back to Debian, or CentOS, or elsewhere. Do you feel like OpenID shares some philosophical relationships to that approach?</p>
<p><b>Mike:</b> Yeah. We understand that it gets away from the end-user retail-consumer mentality, but at its core, that ethos is very much the same as what drives the open source philosophy of OSs.</p>
<p>Linux is a very good analogy to use, because even if there are things you don&#8217;t like in a Debian architecture that many users can&#8217;t fix themselves, the option is always available. If you really need to change things, you have a path if you want to marshal the resources and effort to change and accommodate your interests, even on your own.</p>
<p>OpenID is similar, in the sense that it&#8217;s a roll-your-own architecture. If you&#8217;re sufficiently motivated and equipped, you can roll your own to provide biometrics support, or requirements for attribute exchanges that are very stringent or that otherwise meet your needs. And you aren&#8217;t beholden to anybody&#8217;s particular implementation.</p>
<p><b>Sean:</b> What do you think the natural evolution of OpenID is down toward the desktop? Do you see people eventually weaving it into desktop scenarios to secure data in home and business settings, or do you think it&#8217;s predominately going to stay with service providers and data storage that you access in the cloud?</p>
<p><b>Mike:</b> My personal view is that part of OpenID&#8217;s value is that it&#8217;s fairly narrowly cloud centric, and that&#8217;s the way it should be, even long term. There are lots of great existing desktop identity management and security solutions. OpenID’s primary design intent was for open Internet identity and access control.</p>
<p>We see OpenID as part of a mosaic or a stack that provides user-centric identity and the security and privacy around that, but I don&#8217;t think OpenID has the charter to try and reach from the cloud into the desktop.</p>
<p>We want to tailor things into a modular concept, where OpenID meets its goals very well and in a very lightweight, flexible way, and it defers to other tools that can be plugged in and out, depending on the context, to work with it.</p>
<p>I think it&#8217;s far beyond what OpenID needs to address in terms of desktop integration, because as you said, the Linux desktop is going to have a different set of tools than the Microsoft one. As you know, InfoCard is actually a pretty nice solution in this area, but it&#8217;s Microsoft-centric. And despite the difficulties of getting beyond Microsoft on the desktop, that&#8217;s going to be a sweet spot for the long term.</p>
<p><a name="security"></a></p>
<p><b>Sean:</b> Let me ask you a question in a different area. How do you address the man in the middle question? Obviously, it&#8217;s a tough potential problem for any identity system, even if it depends on factors that are very hard to spoof, like biometric data.</p>
<p><b>Larry:</b> As you know, there are general security issues with the web, and to some extent, the world has survived with those security issues. For instance, if I&#8217;ve been phished and I think I&#8217;m on Amazon but I&#8217;m not really on Amazon, bad things could happen.</p>
<p>Certainly, OpenID is living within the web framework, and we can&#8217;t do anything to remove existing global security issues, but we can add additional features onto the normal intercourse of the web that provide additional protection.</p>
<p>With our product, we&#8217;ve added security measures as options for the user. For instance, the user can use a client side cert.</p>
<p>We have an out-of-band, second-factor authentication solution, called VerifID, which actually uses your cell phone in conjunction with a password to verify your identity. Every time you input your password, you get a phone call at that number, and you must hit a number on the keypad in order to continue.</p>
<p>We also allow InfoCard to be used, which is a SAML-based public key encryption protocol. So all those options can provide additional layers of protection. Of course, we have to strike a balance between inconveniencing the user and securing their web transactions.</p>
<p>This is really the same story everyone can tell you&#8211;Amazon is making the tradeoff of just using a password versus a more secure option, which might cause more friction against people filling up their carts and checking out.</p>
<p>We are participating with other members of the community, since it&#8217;s not at all a unique problem to us, and we&#8217;re giving users tools so that they can secure their identity themselves, if they choose to.</p>
<p><b>Mike:</b> A lot of this is anthropology, as you know. We have to be realistic about how much of this technology can actually solve. There are social relationships and social trust issues that play a major role. I worked at VeriSign for a long time, and they&#8217;re heavy into the mediating technology for trust relationships.</p>
<p>But even a big, badass security company like that understands that, at some point, it happens above the technology layer. That doesn&#8217;t mean we&#8217;re not cognizant of it, but because we&#8217;re cognizant of it, we know that it&#8217;s not something that can be solved with just bits.</p>
<p><a name="aggregating"></a></p>
<p><b>Scott:</b> Another topic that tends to come up a lot when we discuss identity management is that there are a couple of choices about aggregating that information, and they both have issues associated with them. One option is to have a different password for every single place you go, and nobody really likes that. On the other hand, there&#8217;s obvious danger to having one password for everything&#8211;if it gets compromised, that&#8217;s really bad.</p>
<p>But as I understand it, I can have more than one OpenID, so I might have different identities for different purposes.</p>
<p><b>Brian:</b> You may have one or you may have many. It&#8217;s up to you, and you&#8217;ll have different reasons why you&#8217;re using different identities, whether it&#8217;s a work identity, or a home identity, or an affinity group identity with your college, or some other interest area that you have. It could be an AARP identity. It could be an American Express identity.</p>
<p>The tradeoff with one or a few identities is that you can be a lot more vigilant and mindful of how you manage your identity. You may have 50 different username password combos out there on the open Internet, and most people do something which is called password reuse.</p>
<p>Your password is only as secure as the weakest place it&#8217;s used on the Internet, and if it gets phished at Billy Bob&#8217;s Phish and Tackle Shop, then it&#8217;s compromised, and it&#8217;s good anywhere else you&#8217;ve reused that password.</p>
<p>On the other hand, in the OpenID model you only share your password once, and that&#8217;s with your OpenID provider. And you can make sure that you trust that your OpenID provider is doing infrastructural things to protect it, and that you&#8217;re creating and managing strong passwords.</p>
<p>You can be reminded to reset your password with a given frequency, and you can be encouraged to implement multifactor authentication protocols&#8211;whether that&#8217;s an SSL certificate, an InfoCard, anti-phishing site verification tools, out-of-band authentication with cell phones, or an RSA token. With this approach, you get to choose how much security you&#8217;re willing to layer on top of your authentication process, to balance between convenience and security.</p>
<p>That&#8217;s what it will always boil down to. Unless every device and every opportunity uses biometrics, and it&#8217;s your retinal scan or your thumbprint scan on every device where you would ever authenticate, then there&#8217;s going to be some inconvenience to you to do something more than username and password.</p>
<p>Actually, though, it turns out that if you have a good relationship with your OpenID provider, you get very easy login on any OpenID-enabled site – single click with no text entry required in many cases. You maintain a very trust-driven relationship with your OpenID provider, and that OpenID provider can manage all that infrastructure across one account as opposed to many.</p>
<p>If every web site had to implement the same multiple layers of protection for your identity that an OpenID provider did, I would posit that the whole ecosystem would not adopt the same level. But if you have a handful of OpenID providers who are really focused on managing your identity and providing multiple layers of protection, then you&#8217;re more likely to have a more secure experience.</p>
<p><a name="adoption"></a></p>
<p><b>Sean:</b> What about the challenges of educating developers? Some technologies are just flat out sexier than others. That&#8217;s nothing against anybody in particular, but Scott and I have had the experience of educating people about some pretty dry stuff, and we appreciate how difficult it can be to engage people on it.</p>
<p>What do you do to evangelize to the broader open and closed source development communities about the importance of identity?</p>
<p>What are you trying to bring to market to make it easier for them? I have to imagine that, on one hand, everybody understands it&#8217;s important, but on the other hand, it might not be the thing that they want to pay as much attention to right out of the gate.</p>
<p><b>Brian:</b> The OpenID Foundation along with the member companies are doing outreach and education on the features and benefits of OpenID. We run a web site called OpenIDEnabled.com that hosts resources for the development community. Additionally, we provide tools like ID Selector (www.idselector.com) to enable website operators to adopt and deploy “best demonstrated practice” implementations of OpenID login. We’re also working on white papers for OpenID providers, web site operators, and end users on how to get the most out of OpenID.</p>
<p>In addition to outreach and education to the development community, we’re providing more “turn key” solutions for OpenID providers, website operators, and end users. Just as with SugarCRM, VA Linux, Red Hat, MySQL, and other open source initiatives, you can get the open source libraries or you can get professionally managed services using the technology. We provide the widest range of innovative, fully featured solutions on the market.</p>
<p><b>Scott:</b> Where are you seeing OpenID being most widely adopted? Who do you see as being the early adopters, and maybe the second wave of adoption? Who&#8217;s using it?</p>
<p><b>Larry:</b> At the user-created points of the web, like blogs and wikis, it&#8217;s gaining tremendous ground and is actually a sort of thought leadership. It&#8217;s the right way to do things, and it&#8217;s thought of as very cool in lots of circles.</p>
<p>Beyond what you might call the technical early adopter part of the web, you have large players like Yahoo!, Microsoft, AOL, Google, MySpace, and VeriSign all coming out in support of OpenID. So really, the adoption is expanding: the top end of the market has already either announced support or delivered production support.</p>
<p>It&#8217;s that middle of the Web that our eyes are really on, at this point, and it&#8217;s a large middle. That&#8217;s where future adoption is going to come from.</p>
<p><b>Scott:</b> What do adopters point to as being advantages of the technology, to validate their decision to go that way? </p>
<p><b>Brian:</b> As Larry mentioned, the early adopters have been either the large portals or the user-generated content sites, like blogs, discussion groups, wikis, social networks, and sites like that, where the primary objective is to get registered users on their site as frictionlessly and as quickly as possible.</p>
<p>If users can show up at a site with an identity that they can sign in with, it&#8217;s much more likely that they&#8217;re going to become active participants. The big benefit to website operators is that the more people they have active on their sites, the better they can do with personalization, advertising, promotion, and cross selling. Also, for sites that sell advertising, they can get higher CPMs for profiled users than generic site visitors.</p>
<p>So the first measure of success is probably the conversion rate from “site visitors” to “registered users,” followed by the activity level on the site.</p>
<p>Back to your earlier question, I think that the next categories we will likely see come online will be content sites other than user-generated content sites. They&#8217;ll be media properties, like newspapers, radio stations, TV stations, magazine sites, sports sites, gaming sites, and some affinity groups&#8211;organizations that don&#8217;t control their members, so they can&#8217;t manage single sign-on from an enterprise perspective, but that do want to provide those members with access.</p>
<p>Alumni associations, community organizations, little league teams, AARP, Boy Scouts, and organizations like that are not necessarily transacting business, and they&#8217;re not asking for credit card numbers, but they do want to have registered users.</p>
<p>Another category would be all the customer supported sales and customer self help resources on product web sites, where they have blogs, discussion groups, or wikis to allow people to ask and answer questions and help each other.</p>
<p>If you go to one of those sites and you have to register to participate, and you&#8217;ve already got 50 user names and passwords, the likelihood you&#8217;re going to participate is lower than if it would accept an OpenID that you already have.</p>
<p>That&#8217;s the primary initial benefit for the web site operator&#8211;ease of registration and login. Ease of registration also matters from the standpoint of being quick and error-free.</p>
<p>OpenID has with it the ability to transfer personal data at the user&#8217;s discretion. Right now in simple registration, there are 10 demographic data fields, but it&#8217;s extensible (and we think longer term it will go that way) with a component of OpenID, which is called Attribute Exchange where you can share as many data attributes as you&#8217;re willing to publish about yourself. And you can share that information with a web site at registration.</p>
<p>So instead of having to fill out all those data fields, you just pass them to the web site operator in a machine-readable format and pre-populate your application. As OpenID evolves, we&#8217;re going to get to the point of single-click login.</p>
<p>And that&#8217;s something our solutions actually support, so when you show up at a web site, you don&#8217;t even have to type in anything. It just remembers who you are, you click to authenticate, and it goes back to your OpenID provider to authenticate you seamlessly, and it comes back and logs you into the web site. So from a user&#8217;s perspective, the ability to register quickly and login easily is important.</p>
<p>Longer term, managing that data is going to be important. If you changed your phone number or email address and you had to go back to 50 different web sites and update your web site profile with that information, it would be a very tedious process.</p>
<p>If you have one digital identity, or maybe a handful that remember what sites are using that identity, and you can choose to pass that updated information to the site, that&#8217;s a benefit to you, and it&#8217;s a benefit to the site operator.</p>
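<p>As a rough illustration of the Simple Registration exchange described above, the following Python sketch shows how a site might ask for profile fields and then pre-populate a sign-up form from whatever the user agrees to share. The <code>openid.sreg.*</code> parameter names and the nine field names come from the Simple Registration extension; the function names and the form layout are hypothetical, and the sketch assumes the OpenID response has already been verified by a real OpenID library.</p>

```python
# Profile fields defined by the OpenID Simple Registration extension.
SREG_FIELDS = ("nickname", "email", "fullname", "dob", "gender",
               "postcode", "country", "language", "timezone")


def build_sreg_request(required, optional):
    """Return the extra parameters a site adds to its OpenID
    authentication request to ask for profile data (hypothetical helper)."""
    for name in list(required) + list(optional):
        if name not in SREG_FIELDS:
            raise ValueError("unknown Simple Registration field: " + name)
    params = {}
    if required:
        params["openid.sreg.required"] = ",".join(required)
    if optional:
        params["openid.sreg.optional"] = ",".join(optional)
    return params


def prefill_registration(response_params):
    """Copy the fields the user chose to share into a registration form.

    Assumes response_params was already verified; anything that is not
    a known Simple Registration field is ignored.
    """
    prefix = "openid.sreg."
    form = {}
    for key, value in response_params.items():
        if key.startswith(prefix) and key[len(prefix):] in SREG_FIELDS:
            form[key[len(prefix):]] = value
    return form
```

<p>A site that wanted an email address and, optionally, a nickname would call <code>build_sreg_request(["email"], ["nickname"])</code>, and the user would remain free to share only some of those fields.</p>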
<p><a name="responding"></a></p>
<p><b>Scott:</b> What are some of the misconceptions or common objections that you run into when you&#8217;re talking to people about using OpenID?</p>
<p><b>Brian:</b> It tends to fall into one of three buckets.</p>
<p>The first is that they don&#8217;t know whether their customers want it or need it. The second is that they wonder how long and how much effort it&#8217;s going to take them to implement. And the third is that they want to know who among their peers and competitors are using it, because they may not want to be first, and they have to consider how this ranks among their other priorities.</p>
<p>We&#8217;re actually trying to systematically address each of those concerns with products and services that we&#8217;re offering to make it easier (a) for web sites to implement OpenID, (b) for them to be aware of the number of users who would like to use an OpenID on their web site, and (c) to get a few testimonial accounts in any given category to adopt OpenID.</p>
<p>You can envision the scenario that as soon as a few major college alumni associations adopt it, others will follow. As soon as you get the Boy Scouts, you&#8217;ll get the Girl Scouts and the Campfire Girls and 4-H and everybody else. As soon as you get United Airlines, you&#8217;ll get Delta and Northwest. And as soon as you get Hertz, you&#8217;ll get Avis and National and Enterprise.</p>
<p>So I think part of the challenge for us is getting those early adopters who see the benefit and want to be perceived as thought leaders. As soon as that happens, the others will follow.</p>
<p>Right now, of the 18,000 OpenID enabled web sites, a majority of them are in that “user generated content” category. Some blogging sites, web sites, and discussion group web sites that are just going live now are choosing to go entirely OpenID.</p>
<p>Some are retroactively going OpenID and converting all their users to OpenID. So in that category, the benefits are compelling, in terms of ease of adoption and deployment. Reduction in customer care costs is another thing we should talk about.</p>
<p>If you only have one or two user name passwords to memorize and maintain, the likelihood that you&#8217;re going to forget it on a site that you go to less frequently goes down dramatically.</p>
<p>And it turns out, according to Forrester and Gartner and Meta, forgotten passwords account for 30 to 50 percent of customer care support calls. So if you can drive that cost down dramatically, you also drive down the customer frustration of returning to a site that they visit only infrequently.</p>
<p><b>Scott:</b> I guess the other advantage for the site operator is that it&#8217;s not their problem anymore, right? They just basically shuffle you off to your OpenID provider who handles all of that lost password, password reset kind of stuff.</p>
<p><b>Brian:</b> There are actually a couple of benefits there. One, you do outsource that customer care cost to your OpenID provider. The other is you don&#8217;t need to maintain the passwords, so you don&#8217;t have liability in the event that a password is compromised.</p>
<p><b>Scott:</b> I think there&#8217;s an interesting dynamic there, in terms of how the Internet has changed. I think originally, site operators saw it as a purely positive proposition to collect information on their users.</p>
<p>In the last couple of years, they have really started thinking about the fact that there&#8217;s liability associated with that data.</p>
<p>I know as a user, I really wonder when I go to a site whether they are really qualified to store my credit card information. Are they really competent to safeguard this information? And if they give me a check box that says: &#8220;We can forget your credit card and you have to enter it every time,&#8221; I always check that box.</p>
<p><b>Brian:</b> That&#8217;s one of the reasons why PayPal has become compelling on smaller sites where you might not have that trust factor. There&#8217;s a way for you to abstract away your proprietary account information.</p>
<p><a name="credibility"></a></p>
<p><b>Sean:</b> What do you think about the ability of something like OpenID to be born from a single closed source company or even a small gaggle of them, whether it was Google, who has tremendous credibility, or Apple, who&#8217;s also closed source but has a lot of positive brand equity? </p>
<p>Do you think something like this has to be created by an open source type of community?</p>
<p><b>Mike:</b> I think that this could&#8217;ve been pushed out by one of the big companies. Google, or Yahoo!, or Microsoft has the girth and user footprint to push something like this out.</p>
<p>But the technology has to have some genetic features that those companies typically wouldn&#8217;t provide&#8211;namely, the ability to easily migrate right off that particular silo into another.</p>
<p>It&#8217;s practically achievable for someone like a Google or Microsoft or Yahoo! or AOL, but disruptive to their community base. One of the things that we&#8217;ve seen happen is that Yahoo!, for example, has been very proactive with OpenID, relative to some of the other big companies.</p>
<p>But there was a time not long ago that they hadn’t warmed to this idea any more than Facebook or some of the other more closed companies, because they wanted to preserve their proprietary silo. One of the things that OpenID or similar technologies will do is to flatten the namespace and make it easy to move things around and facilitate portable user-managed identity. Momentum for that paradigm is expanding beyond OpenID in areas such as OpenSocial, OAuth, Data Portability, Portable Contacts, etc.</p>
<p><a name="future"></a></p>
<p><b>Scott:</b> We&#8217;re drawing near our time limit, so I&#8217;d like to ask whether there&#8217;s anything that we didn&#8217;t ask about that would be interesting in this space to discuss.</p>
<p><b>Brian:</b> One area that we think longer term is going to be interesting as OpenID becomes more prevalent is the notion of consolidating and aggregating and more intelligently managing all the content and communication that happens on the Internet today.</p>
<p>Right now, you have silos like email, chat, discussion groups, blogs, and wikis, with different people, groups, and topics that you&#8217;re trying to manage in a more disaggregated way.</p>
<p>Imagine a time in the future when all of your communications and all of your content sites are OpenID enabled, so you can pull them all together in a more consolidated way. You could keep track of various discussion topics, blog entries, emails, and chats in the context of topics that are of interest to you from people and groups that are of interest to you. It would let you prioritize and organize information in ways that make sense to you, based on your stated priorities or what the technology can infer from your actions.</p>
<p>That&#8217;s something we think that in the longer term, OpenID will enable that no other solution or approach has done to date. So we&#8217;re excited about what it can do for registration and sign on today. We&#8217;re more excited about what it can do with digital identity, reputation, and content management long term.</p>
<p><b>Mike:</b> One of the things that JanRain has to balance is the immediate, practical needs of equipping the industry to provide practical solutions, with the long range opportunity, which is huge. Once OpenIDs proliferate in a broad way, they will become the effective end points for all sorts of things. Communications is an obvious one, but commerce and trust and the things that spring from them are going to increasingly become factors, in terms of the basic building blocks that are OpenID end points.</p>
<p>For us, that&#8217;s an enormously motivating horizon to keep looking at, even if it&#8217;s not proper to talk about in this quarter or next, or even next year. But as time goes on, this is going to unleash a huge number of disruptive opportunities for new communication tools, new workflows for collaboration around content, new e-commerce trust mitigation, and e-commerce payment flows, all using OpenIDs as the atoms building into molecules and organisms in a way that hasn&#8217;t happened before.</p>
<p><b>Scott:</b> This has been a good conversation, and thanks for taking the time to chat with us. Identity and personal data are central to the work that lots of people are doing, and so it&#8217;s really nice to get your perspective on how OpenID fits in.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2008/08/25/interview-with-janrain-on-openid/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Interview with Gabor Cselle &#8211; VP of Engineering at Xobni</title>
		<link>http://howsoftwareisbuilt.com/2008/07/31/interview-with-gabor-cselle-vp-of-engineering-at-xobni/</link>
		<comments>http://howsoftwareisbuilt.com/2008/07/31/interview-with-gabor-cselle-vp-of-engineering-at-xobni/#comments</comments>
		<pubDate>Thu, 31 Jul 2008 01:34:46 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Sean Campbell]]></category>
		<category><![CDATA[QA]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[Usability]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/2008/07/31/interview-with-gabor-cselle-vp-of-engineering-at-xobni/</guid>
		<description><![CDATA[Interviewers: Scott Swigart and Sean Campbell
Interviewee: Gabor Cselle
In this interview we talk with Gabor Cselle. Specifically, we talk about:

Deciding between open and closed development models
Strategies around handling QA challenges effectively
Maintaining high usability and user security
Business models and corporate customers as opposed to individual users
The relationship between open-source and commercial software
The value of avoiding feature [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Interviewers:</strong> <a href="http://howsoftwareisbuilt.com/about-scott-swigart/">Scott Swigart</a> and <a href="http://howsoftwareisbuilt.com/about-sean-campbell/">Sean Campbell</a></p>
<p><strong>Interviewee: </strong><a href="http://howsoftwareisbuilt.com/about-gabor-cselle-vp-of-engineering-at-xobni/">Gabor Cselle</a></p>
<p>In this interview we talk with Gabor Cselle. Specifically, we talk about:</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2008/07/31/interview-with-gabor-cselle-vp-of-engineering-at-xobni#deciding">Deciding between open and closed development models</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/07/31/interview-with-gabor-cselle-vp-of-engineering-at-xobni#QA">Strategies around handling QA challenges effectively</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/07/31/interview-with-gabor-cselle-vp-of-engineering-at-xobni#experience">Maintaining high usability and user security</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/07/31/interview-with-gabor-cselle-vp-of-engineering-at-xobni#models">Business models and corporate customers as opposed to individual users</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/07/31/interview-with-gabor-cselle-vp-of-engineering-at-xobni#relationship">The relationship between open-source and commercial software</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/07/30/Interview-with-Gabor-Cselle-VP-of-Engineering-at-Xobni#features">The value of avoiding feature creep</a></li>
</ul>
<p><span id="more-170"></span></p>
<p><b>Sean Campbell:</b> To get us started, could you please give us a little bit of background on Xobni, your role with the company, and a little bit of your work history before that? </p>
<p><b>Gabor Cselle:</b> I&#8217;m the VP of engineering here at Xobni. I run our engineering team, manage QA and support, and was the first employee. Previously, I worked at Google as a software engineer, both in Zurich, Switzerland where they have a big engineering office and in Mountain View. </p>
<p>Xobni was started as a Y Combinator company. Y Combinator is a seed funding group that tries to encourage young college graduates to start companies, instead of working for &#8220;the man.&#8221;</p>
<p>The company was started by a graduate of MIT and a Penn State graduate in 2006. For about the first year, they were working on what we call Xobni Analytics today, which is a package that analyzes your Outlook emails just like Google Analytics analyzes your web traffic. It would chart out how many emails you got this month, how many people you emailed with, and so on.</p>
<p>After a while, we realized that people wouldn&#8217;t use that more than once or twice, and then they would let it fall by the wayside. We really wanted to build something that people would use for hours every day, and so we got the idea for today&#8217;s product, which is a sidebar for Outlook.</p>
<p>We got funding from Khosla, so after I joined, we built the current product in about six months, and launched the private beta in September of 2007. Since then, we&#8217;ve been adding features and broadening the project&#8217;s scope, and we have some new exciting projects as well.</p>
<p><a name="deciding"></a></p>
<p><b>Sean:</b> Given the overall open-source momentum these days, what was your thinking on making the project closed source from the start?</p>
<p>Obviously, some of it was driven by the fact that you are going to plug into Outlook, but that didn&#8217;t have to be a complete limiter, in and of itself.</p>
<p><b>Gabor:</b> I don&#8217;t think it was a conscious decision in the very beginning. It was more about the fact that eventually we wanted to license this to enterprises, because we were working on the Analytics product back then. And obviously, Outlook is closed source&#8211;there&#8217;s the thin layer around Outlook of APIs that Microsoft makes public, but everything else is closed source.</p>
<p>I think we are just kind of imitating that model, and eventually we may have versions with extra features as something that you can purchase. I think that&#8217;s why it&#8217;s still closed source, and we&#8217;re probably going to continue to follow this route.</p>
<p><b>Sean:</b> I guess the natural follow up to that is to ask what kind of push you have gotten from people to extend what you are doing with Outlook and put it in Evolution or Thunderbird? </p>
<p><b>Gabor:</b> Actually, the day after we released, we had about 400 emails in our support queue that all said, &#8220;I&#8217;m not an Outlook user. I&#8217;m a Thunderbird user,&#8221; or &#8220;I&#8217;m a Gmail user and I really like the functionality,&#8221; etcetera.</p>
<p>So this is definitely something that we&#8217;ve looked at very, very intensely. We decided not to pursue the path of integrating with any other desktop clients, at least for now. We would need to revise a lot of the interface, but the main reason is really that Outlook is the number one email application in the world. There are 400 million people using it, and a lot of these people are in pain.</p>
<p>Outlook doesn&#8217;t do what they want it to do. They can&#8217;t search for messages quickly, especially with the older versions. We solve that one pain point really well for Outlook. It&#8217;s a huge addressable market, so that&#8217;s why we are sticking to it. The future might hold a version for other clients as well, though.</p>
<p><a name="QA"></a></p>
<p><b>Sean:</b> One of the things that I find interesting is how you have handled the breadth of QA questions you must have come up against. I mean, everybody uses Outlook, from the most computer illiterate to the most sophisticated users, so what they&#8217;re doing with their machines varies radically. Notepad may be the only thing that gets more universal access on a Windows box.</p>
<p>What was your process around QA from the beginning, and how has it evolved? I&#8217;m sure you have to build something that is absolutely rock solid, and yet who knows what people are running alongside it?</p>
<p><b>Gabor:</b> My first reaction is to say that this issue has been a lot of really hard work. When we were originally building the product and the team was just four people, we outsourced QA to a manual testing company in California. We had a good experience with that, but the problem is that the number of possible configurations and the number of things that can go wrong is very high.</p>
<p>I think the best decision I&#8217;ve ever made here is to bring QA in-house and to hire really good people and just try to get these issues taken care of by adding more bandwidth to them.</p>
<p>We hired an excellent QA lead and testers, and they&#8217;re not just testing the product&#8211;they&#8217;re testing the different integration scenarios with Outlook. They&#8217;re testing scenarios like how it works with Exchange, IMAP, POP, and various combinations of plugins.</p>
<p>That&#8217;s a lot of work, and it&#8217;s not particularly sexy, because everyone wants to work on a great new feature, rather than what happens when a particular Active Directory setting is at one instead of zero. I&#8217;m very happy to say that we have QA engineers who are very passionate about getting these things right.</p>
<p><b>Scott Swigart:</b> In a lot of open source projects, code&#8217;s checked in and people download stuff before it&#8217;s officially called a release, and they work with it. The model is often just to have good bug tracking, but not a lot of overt structure beyond that.</p>
<p>What do you think about the notion of handling bug fixes that way? What do you think would be the net result of a model like that in your case?</p>
<p><b>Gabor:</b> That&#8217;s a very good question. I don&#8217;t think it would necessarily work for us&#8211;mainly because people don&#8217;t want to be exposed to bugs in their Outlook, and they don&#8217;t want to be the guinea pigs, because Outlook is their data store. It&#8217;s where all their personal data is, and they&#8217;re necessarily very sensitive about protecting that.</p>
<p>I think that our case requires in-house testing. That said, however, we originally did a staged private beta release of our software, to a limited population who had signed up on our web site. </p>
<p>That gave us a really strong test audience of people with a bunch of different configurations, and only once we were pretty comfortable with that did we go to the next order of magnitude.</p>
<p>There&#8217;s an effect at work that we call the &#8220;10X effect.&#8221; Every time you increase the number of users by an order of magnitude, new stuff starts popping up. For every 10X increase in user base, we get all sorts of weird scenarios that we haven&#8217;t seen before, and we go through these cases very methodically and address them one by one.</p>
<p><b>Scott:</b> Another thing that open source does a lot is to use very distributed teams. You can have people working on one project from all over the globe and lots of different companies.</p>
<p>What kinds of advantages does it give you to be able to pull everybody together in a conference room, with people working in the same building, versus what may be some of the challenges of using a highly decentralized, distributed model?</p>
<p><b>Gabor:</b> Everyone is in the office during work hours&#8211;no one really works from home on our current team. We have meetings about support, about QA, about where the product is headed, and about the current status of each of the projects that we&#8217;re currently working on.</p>
<p>We find that face time is very important, because if you want to describe a problem, you can just go up to the whiteboard and sketch it out. You don&#8217;t need to start up some virtual conferencing solution and try to drive your discussion through a window on the screen.</p>
<p>Really good ideas pop out from hallway conversations; we actually just had a conversation about how sometimes the Outlook UI would freeze when we were syncing data in the background.</p>
<p>We were able to fix that issue, but if we had not had that face-to-face conversation, we wouldn&#8217;t have been able to get a solid theory about why that bug was happening.</p>
<p><a name="experience"></a></p>
<p><b>Scott:</b> Another thing that springs to mind is that, aside from the architecture and what is going on behind the scenes, do you feel like a closed-source, centralized model facilitates getting where you want to be with the user experience and those kinds of things?</p>
<p><b>Gabor:</b> To answer that, I should first take you on an excursion through history. The interface of Xobni has been highly praised, and some people call it &#8220;Star Trek&#8221;-like, because it looks like the computers on &#8220;Star Trek: The Next Generation.&#8221; The way that interface came about was basically that, in the very early days of Xobni, I just tried out different things in Photoshop. I came up with three basic designs: one looked just like Outlook, one looked like Outlook with bigger icons, and one I made as a joke about how far out we could take it. That is what we use now. And that&#8217;s how the basic look and feel of the interface came about.</p>
<p>It was basically arrived at with experimentation and talking to the other team members at the time about what they felt about the different designs. Lately, we&#8217;ve moved into a slightly heavier development model. We have a user experience expert and weekly usability meetings, we view mockups, and we do hallway usability testing.</p>
<p>We also go to coffee shops. There&#8217;s a Starbucks we frequent a block from the office. We just ask people, &#8220;Do you understand this mockup? Do you want to try it out on our laptop and tell us whether it works for you?&#8221; One of us actually stands in line and says, &#8220;We&#8217;re going to pay for your latte if you have five minutes to look at some screen shots.&#8221;</p>
<p><b>Sean:</b> Wow. Ethnographic research with a latte&#8211;it&#8217;s the grass roots ethos, for sure. [laughs]</p>
<p><b>Gabor:</b> We&#8217;re in the financial district of San Francisco, so it&#8217;s a really good spot, as these are our target users. They all use Outlook, because they work at Morgan Stanley or Wells Fargo. They&#8217;re exactly the kind of people we want to talk to, and they&#8217;re exactly the kind of people that need to understand what our product does.</p>
<p>And so, back to the open source discussion. I don&#8217;t think that in a distributed team, these kinds of impromptu usability sessions would really be able to take place.</p>
<p><b>Sean:</b> There&#8217;s the issue of reliability, and nobody wants to see the &#8220;Outlook is checking your inbox&#8221; message for the 5,000th time, just because you happen to shut it down. But at the same time, there&#8217;s a security issue, especially as you get into those large companies in the financial district and so on.</p>
<p>So what is your process around security? In the open source model, it&#8217;s well-known, and it boils down to &#8220;many eyeballs create more secure code.&#8221; I don&#8217;t mean this at all in a pejorative way, but closed source companies, by nature, need to protect their IP, and so they&#8217;re a smaller team that keeps it in-house.</p>
<p>What has Xobni done around the security aspects of the development model? Are you following a particular framework, whether externally developed or internally?</p>
<p><b>Gabor:</b> Our basic premise is that we don&#8217;t ship any personal data back to us, including any part of your email, any part of your calendar, or any part of the data on your desktop. That&#8217;s been the case since the very beginning of Xobni, and it&#8217;s absolutely vital, because people are very sensitive about this issue.</p>
<p>Beginning in the very early versions, we shipped with something called &#8220;The Customer Experience Program,&#8221; which was an opt-in program, and it still exists, if you go into the options dialog. If you enable that, we ship back an anonymized data set about your usage patterns. It basically tells us that a user identified only by a randomly assigned number uses feature X and Y, but not Z. So, that is the one time where you can opt in and volunteer to give us some anonymized user data. </p>
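<p>The opt-in reporting described above can be sketched in a few lines of Python. This is only an illustration of the general pattern (a randomly assigned identifier plus feature counts, sent only when the user has opted in) and not Xobni&#8217;s actual implementation; the class name, method names, and report layout are all assumptions.</p>

```python
import json
import uuid
from collections import Counter


class UsageReporter:
    """Hypothetical opt-in usage reporter: counts feature use and never
    touches email, calendar, or other personal content."""

    def __init__(self, opted_in=False):
        self.opted_in = opted_in
        # Random identifier, not derived from the user's name or address.
        self.install_id = uuid.uuid4().hex
        self.feature_counts = Counter()

    def record(self, feature):
        # Nothing is collected unless the user has opted in.
        if self.opted_in:
            self.feature_counts[feature] += 1

    def build_report(self):
        """Return the JSON payload to send, or None when opted out."""
        if not self.opted_in:
            return None
        return json.dumps(
            {"id": self.install_id, "features": dict(self.feature_counts)},
            sort_keys=True,
        )
```

<p>With this design the opt-in check sits in both <code>record</code> and <code>build_report</code>, so a user who never enables the program generates no data at all.</p>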
<p><a name="models"></a></p>
<p><b>Sean:</b> That&#8217;s an interesting bridge to another question that I have had in mind&#8211;what types of issues arise when you support a corporate enterprise sale? I imagine that there might be some concern on the part of IT organizations about supporting a broadly distributed plug-in like this.</p>
<p><b>Gabor:</b> Actually, the first-line answer is that we&#8217;re a consumer company, and we want to speak directly to users and get them to download our products in order to address pain points that they are experiencing. We want them to be happy with our products and to be able to find the stuff in their inbox and to deal with it efficiently.</p>
<p>That&#8217;s much, much easier than trying to convince an IT department to get Xobni for their entire corporation.</p>
<p>That being said, a lot of the more progressive IT people have contacted us proactively and said, &#8220;We&#8217;re a company with 200 or a company with 5,000 people. We want Xobni on every single one of our desktops.&#8221; Of course, we encourage that, and may work out licensing that meets their needs.</p>
<p>I think in the future, we might have a support model specifically around this type of deployment, but that really hasn&#8217;t been worked out yet, mostly because the company is quite young and also because we are very focused on making a brilliant product that consumers really enjoy.</p>
<p><b>Scott:</b> In the open source world, there&#8217;s the notion that the source code is free. People can download it, and companies have to figure out a way to monetize around that. They might sell support or have a more enterprise version that they charge for, with features that aren&#8217;t in the free one. </p>
<p>As you look at the landscape, what&#8217;s your impression around business models? Do you have a sense that the traditional model of having people download an eval version and then pay for the full version is still viable?</p>
<p>We run into the impression on the part of some people that everything&#8217;s moving to an open model and that open source software is built on a superior development methodology. On the other hand, there are new and innovative companies like yours that didn&#8217;t choose that route.</p>
<p>Do you feel that the traditional business models are alive and well, or do you see things moving to that &#8220;give it away and figure out how to charge for something&#8221; approach?</p>
<p><b>Gabor:</b> People ask us this question a lot, and it basically comes down to, &#8220;How are you going to make money? Can you really charge for an add-in?&#8221; Our answer is that right now, we&#8217;re not focused on that. Right now, we are very focused on building a great product. We believe that if you build it, they will come.</p>
<p>And if you have millions of users that enjoy your product every day, somehow you will always be able to make money. Like Google. Google didn&#8217;t get started by saying, &#8220;Oh, we have search. Let&#8217;s find out how to monetize it.&#8221; They were, in the first few years, just focused on getting search better every single day, and later they found ways to monetize it.</p>
<p>There&#8217;s a story that the initial Google pitch stack was 70 pages long and explained how they would make money by selling search technology to enterprises, which was completely wrong at the time, but they later found a different way to monetize it, which has worked pretty well for them.</p>
<p>Our mindset, right now, is the same. I think we&#8217;re just building a great product and we&#8217;re going to find out how to make money eventually. People are asking us to buy pro versions. People are asking us whether they can pay if they want to install it into an enterprise.</p>
<p>We&#8217;re definitely looking at these models, but it&#8217;s really too early to take those conversations on in earnest&#8211;that still needs to wait six months or a year.</p>
<p><a name="relationship"></a></p>
<p><b>Sean:</b> That&#8217;s fair. Let me ask you a different kind of question. There are obviously many, many good open source applications. At the same time, though, the little holy duopoly of Exchange and Outlook never seems to really find a strong open source competitor. I don&#8217;t think that people love it, but it&#8217;s the best that the world has at the moment.</p>
<p>Every time you try some other combination, whether it&#8217;s Zimbra or Evolution or Thunderbird or Sunbird or you try some other backend, like IMAP, POP, Gmail, or whatever, they never seem to have that fidelity.</p>
<p>Why do you think the Exchange and Outlook combination never really sees the same strength of potential competitors as Red Hat versus Windows, or MySQL versus SQL Server, or any of these other parts of, let&#8217;s say, the LAMP stack? Why is it that in that one spot, there doesn&#8217;t seem to be a parallel, even though there is almost everywhere else?</p>
<p><b>Gabor:</b> I think you&#8217;re totally right, and I don&#8217;t know why that is. There are definitely great applications out there, such as the experience that Gmail provides. Also, Zimbra is great, and there&#8217;s just a fantastic variety of new applications in this field.</p>
<p>I think that Microsoft has the strength they do in the enterprise world right now because they know exactly what features they want, and they know exactly what these customers need and how to get into these enterprises. And once you have Exchange Server installed, it&#8217;s pretty hard to switch, because Exchange and Outlook have tremendous lock-in. I think that&#8217;s the key to their success and to their strength.</p>
<p><b>Sean:</b> As a follow-up to that then, since you came from Google, it&#8217;s an easy one to throw out there. What do you think about someone like Google&#8217;s relationship to open source? Or to give another example, what about Apple? Apple is seen as this amazingly open company, but they&#8217;re certainly not very open when it comes to their roadmap, or talking about other stuff like that.</p>
<p>With Google, you&#8217;ve got a company that builds a lot of stuff on essentially free open source, but yet they build a lot of commercialization on top of that, so there&#8217;s an interesting story there I think. Do you see any tension there?</p>
<p><b>Gabor:</b> First, I have a really high opinion of Google, and I think that they’re excellent at software development. Also, I think their fundamental tenet of not being evil and doing good for the world is genuine, and they&#8217;re certainly sticking to it. They’ve open-sourced a lot of software already – such as Google Gears or protocol buffers, and I’m sure they want to be open and share the wealth with the world, but have to balance that with being a strong company that can generate profit.</p>
<p><b>Scott:</b> During development, does open source fit in anything you do? I know as a developer that open source is really, really popular as developer tools for unit testing, automated builds, and for all kinds of different things. From the perspective of peeling back the covers of your organization, where do you find it useful or do you not really use it for much?</p>
<p><b>Gabor:</b> We do use open source for build tools, release tools, unit tests, and so on. The infrastructure provided by the open source community is very useful to us, and I could imagine us contributing once we have a little higher level of resources, but so far our interaction has been very limited.</p>
<p><b>Scott:</b> Would you mind giving a plug for some of the build tools and development tools that you really like?</p>
<p><b>Gabor:</b> There&#8217;s always the wish list that you have. I think that NAnt, NUnit, or CruiseControl are really good, but I wish that they could each do this one more thing. There are other tools we use, all of which are an integral part of our development cycle.</p>
<p><a name="features"></a></p>
<p><b>Scott:</b> One of the things that we have observed before is that open source really shines in projects that are &#8216;by developers, for developers,&#8217; rather than software that targets a non-developer user base. In the case of developer tools, for example, the people building the project have an intimate understanding of what the product needs.</p>
<p>Open source projects tend to be very good at building a lot of infrastructure stuff&#8211;operating systems like Linux, web servers like Apache, and so forth. That seems to be where open source really innovates toward excellence. On the other hand, the really end-user-focused innovation like the iPhone, Google Maps, and the Xobni UI examples we have been talking about don&#8217;t tend to come from open-source projects.</p>
<p>I feel like open source is very concerned with architecture and performance and modularity, but when it comes to building a peak experience product, a lot of that&#8217;s still coming from the Apples and the Googles and the Microsofts and those kind of companies.</p>
<p>Do you feel like I&#8217;m on to something there? And if so, do you feel like there are fundamental reasons for that, based on how the different kinds of software are built? Or am I kind of just drawing a circle around a few things I have observed and calling it a trend?</p>
<p><b>Gabor:</b> I have a very controversial opinion around this issue, which is that if you want to make a product with a peak experience like the iPhone for example, it&#8217;s not as much about what you build as what you take away, and about making those cuts at the right point.</p>
<p>For these types of decisions, you need very strong product people who understand exactly what the consumer wants and who understand exactly what the minimal feature set is that covers the maximum number of usage scenarios.</p>
<p>And this is something that I, as a developer, am not necessarily good at. There&#8217;s always the impulse to think of the next feature, and I always want to add, add, add, add. That&#8217;s why all of the open source tools have zillions of command-line flags and options and so on.</p>
<p>Whereas, if you want to make a consumer product like the iPhone, you have to make the cut somewhere, and make those things that are inside of the cut just outstandingly good instead of making a lot of things that are sort of a 75% solution.</p>
<p><b>Scott:</b> That&#8217;s interesting from the standpoint of a lot of work I have done as a consultant for software companies. They say, &#8220;We&#8217;re going to add this feature to the product, because one important customer wants it.&#8221; And especially with smaller companies, a lot of times they can be pushed around by their bigger corporate customers to add things into the product.</p>
<p>Even something like MySQL does a certain amount of non-recurring engineering, and they add a certain amount of features, because a person wants it really bad for some specific scenario. I&#8217;ve never heard anyone explain it exactly the way you did, but that makes perfect sense. The iPhone&#8217;s good not because of what&#8217;s in it&#8211;it&#8217;s good because that&#8217;s all that&#8217;s in it. They really did just focus on making those things really good.</p>
<p>So how does a company like yours balance those two impulses? You must run into people who say, &#8220;We&#8217;d buy it if only you would add this feature.&#8221; How do you balance that with the philosophy of less is more?</p>
<p><b>Gabor:</b> I think this question comes back to what I said earlier, which is that we&#8217;re really focused on consumers. We don&#8217;t, as a company, listen to the types of people who say, &#8220;If you just added this feature,&#8221; because right now we&#8217;re not even selling a product. So, the bottom line is that we can&#8217;t really consider a scenario where if we just added the foreign feature, we&#8217;d have $100,000 more revenue.</p>
<p>And a related thought is that it&#8217;s been very painful for me personally, because I&#8217;m one of those &#8220;add, add, add&#8221; people, and other people on the engineering team are as well. Xobni used to have in the private beta stage two tabs, which were &#8220;email&#8221; and &#8220;organize.&#8221; And one was about your email and the other one was about your calendars, your appointments, your outstanding tasks and so on.</p>
<p>We found through the customer experience improvement program that I talked about earlier, where we send back anonymized data about features if you opt in, that no one was using it.</p>
<p>We thought, &#8220;We built this great thing and no one is using it. Why? We should make it more prominent and maybe we should make the button blink or something.&#8221;</p>
<p>[laughter]</p>
<p>But then really, honestly, this is really an indication that people don&#8217;t want that functionality. And we decided to cut it out. It was probably one of the best decisions that we ever made, because we focused the product, and we made the things that people really do use even better. It&#8217;s worked great for us&#8211;it was just very painful at the time.</p>
<p><b>Scott:</b> That reminds me of Firefox. We had Netscape, and then Internet Explorer became dominant, and it kind of became the browser with one more thing and one more button. And when Firefox first came out, it really made people sit back and ask what they really needed from a browser. They asked whether they really needed a lot of extraneous stuff, and a lot of them decided that Firefox really had everything they needed.</p>
<p>When it kind of caught fire, I think it was because it had that minimalist approach. I wonder if it depends on the size of your target, in the sense that if you target a niche, then you may need to be able to get pushed around by the niche a lot. And you may need to build a feature for a customer, because the niche is only so big.</p>
<p>On the other hand, if you are targeting millions of users like Xobni did, maybe you&#8217;re automatically freed from a lot of that. In essence, it seems like you asked where the pain point is for hundreds of millions of users, and that freed you from trying to be all things to all people. You just have to be the right thing for a bunch of people.</p>
<p><b>Gabor:</b> Exactly. But finding that spot is really hard.</p>
<p><b>Scott:</b> I can imagine.</p>
<p><b>Sean:</b> Well, thanks for a great conversation. I&#8217;ve really enjoyed this.</p>
<p><b>Gabor:</b> Absolutely. Thanks for setting it up.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2008/07/31/interview-with-gabor-cselle-vp-of-engineering-at-xobni/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Interview with Ben Chelf &#8211; CTO &#8211; Coverity</title>
		<link>http://howsoftwareisbuilt.com/2008/06/17/interview-with-ben-chelf-cto-coverity/</link>
		<comments>http://howsoftwareisbuilt.com/2008/06/17/interview-with-ben-chelf-cto-coverity/#comments</comments>
		<pubDate>Tue, 17 Jun 2008 20:29:03 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Sean Campbell]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/2008/06/17/interview-with-ben-chelf-cto-coverity/</guid>
		<description><![CDATA[Interviewers: Scott Swigart and Sean Campbell
Interviewee: Ben Chelf
In this interview we talk with Ben Chelf from Coverity. Specifically, we talk about:

Scanning the worldwide open source code base for vulnerabilities
The necessity of automating defect identification
Building scalability into code analysis
Overcoming the limitations of automatic scanning
The hubris of believing too much in your code
The difficulty of appearing [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Interviewers:</strong> <a href="http://howsoftwareisbuilt.com/about-scott-swigart/">Scott Swigart</a> and <a href="http://howsoftwareisbuilt.com/about-sean-campbell/">Sean Campbell</a></p>
<p><strong>Interviewee: </strong><a href="http://howsoftwareisbuilt.com/about-ben-chelf-cto-coverity/">Ben Chelf</a></p>
<p>In this interview we talk with Ben Chelf from <a href="http://www.coverity.com/index.html">Coverity</a>. Specifically, we talk about:</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2008/06/17/interview-with-ben-chelf-cto-coverity#scanning">Scanning the worldwide open source code base for vulnerabilities</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/06/17/interview-with-ben-chelf-cto-coverity#necessity">The necessity of automating defect identification</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/06/17/interview-with-ben-chelf-cto-coverity#scalability">Building scalability into code analysis</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/06/17/interview-with-ben-chelf-cto-coverity#overcoming">Overcoming the limitations of automatic scanning</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/06/17/interview-with-ben-chelf-cto-coverity#hubris">The hubris of believing too much in your code</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/06/17/interview-with-ben-chelf-cto-coverity#difficulty">The difficulty of appearing objective</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/06/17/interview-with-ben-chelf-cto-coverity#changing">The changing face of code security and analysis</a></li>
</ul>
<p><span id="more-161"></span></p>
<p><b>Sean Campbell:</b> Hi, Ben. Could you start us off with some background of yourself and of Coverity?</p>
<p><a name="scanning"></a></p>
<p><b>Ben Chelf:</b> Sure. I&#8217;m the CTO and one of the founders of Coverity. We started the company in late 2002, with technology that I worked on with a few colleagues while we were graduate students at Stanford.</p>
<p>The idea behind the technology is to make static source code analysis practical for everyday commercial software developers. It was a bit of a departure from previous academic attempts at static analysis, in the sense that our approach seeks to scale to millions of lines of code, so it can encompass systems that people are actually developing today.</p>
<p>To fine-tune the technology, we applied it to the Linux kernel, and the nature of the open source community made it fairly simple to work with developers to correct the defects we found. As more and more people found out about the defects we had been able to find automatically&#8211;as opposed to the old-fashioned &#8220;hard way&#8221;&#8211;a lot of buzz started to develop around the technology.</p>
<p>Some companies started approaching us, saying, &#8220;Obviously this stuff works because it&#8217;s being applied to a real system like Linux and getting some traction there, so can we try it on our code bases?&#8221; Eventually, a customer offered to pay us, and we decided that it was the right time to get the company started.</p>
<p>We saw a rapid adoption of our product, and we learned a lot in those first few months about bringing the raw technology into product form. We also started to address some of the other issues around making static source code analysis practical for commercial development, in terms of integrating it into existing build systems and making it so that defects are automatically distributed to the right developers when they&#8217;re discovered. </p>
<p>To encourage adoption, it was also important for us to work our technology into the existing workflow so that the defects we discovered automatically could be treated in a similar fashion to those discovered in the QA process and so forth.</p>
<p>Over the course of that learning process, we acquired a lot of customers&#8211;we have more than 400 now. Because the technology is so valuable, the company has been cash-flow-positive since our inception. And that&#8217;s a testament to the real value, I think, that we&#8217;re providing to folks in discovering defects earlier in the software development process with this new, disruptive kind of technology.</p>
<p><b>Scott Swigart:</b> Coverity showed up in relatively recent news with a huge splash in making the announcement that certain open source projects are now in Coverity&#8217;s rung 2. Can you speak a bit to the significance of that announcement and what those different rungs signify?</p>
<p><b>Ben:</b> Obviously, we started our work with the open source community when we were at Stanford, but under the Coverity umbrella, a new wave of work with the open source community really started about two years ago. We have been providing our technology as a service to open source developers, and specifically to as many of the widely known packages as possible, through the site scan.coverity.com.</p>
<p>What led to this effort was that, a couple months earlier, we had been awarded a contract with the U.S. Department of Homeland Security, in conjunction with Stanford and Symantec, essentially to harden open source, since it is being used more and more in our nation&#8217;s infrastructure.</p>
<p>Software like Linux and Apache is everywhere, and as a matter of public security, it&#8217;s important to maximize the extent to which these systems are secure and reliable, work as they&#8217;re meant to, and are resistant to attack.</p>
<p>When we launched the site, we weren&#8217;t sure how it would be received. Being a proprietary software company, we sell a product. So we were a little bit nervous about how it would be received by the open source community.</p>
<p>So we tried to be very respectful of their code, and when we launched the site, I personally sent a note to the developer list saying, &#8220;OK. Here&#8217;s the deal. Here&#8217;s this site that we&#8217;re about to launch. I&#8217;m not letting anybody have access except for you. This is for the developers of these projects. This isn&#8217;t for press people to log in and troll through the bugs, or hackers to log in and look for vulnerabilities. This really is for the developers of the open source package.&#8221;</p>
<p>Today, we&#8217;ve kept that mantra of, &#8220;This is really for the developers.&#8221; Yes, we post statistics on the site, at a high level. But it&#8217;s really meant as a development tool to help these folks improve the open source software.</p>
<p>I think it&#8217;s because of that cautious approach that the site has been well received. Developers started flocking to the site, registrations came flooding in, and defects started to get fixed as a result of the site.</p>
<p>In fact, to date, almost 8,000 defects have been fixed as a result of our scanning efforts through this site, across the board from open to commercial packages. We&#8217;re now seeing tens of millions of lines of code on a regular basis.</p>
<p>As for the rung question: about a year after the site launch, we introduced the idea of the Scan ladder and the different rungs on it to give the packages an incentive to use the product more. To acknowledge and reward those who have been diligently using the product and fixing all the defects that we find, we offer upgrades to the latest and greatest from Coverity, from a product and feature standpoint, as well as giving them additional access to us in terms of giving us their feedback.</p>
<p>There were 11 packages that met the criteria we set up for rung 2, in terms of cleaning out defects on a regular basis. We kind of stepped up our relationship with those packages and our commitment to them to help them get the latest and greatest out of the technology. We anticipate, over time, that we will add more and more rungs, as we have more and more capabilities in our product.</p>
<p><a name="necessity"></a></p>
<p><b>Scott:</b> The press coverage seemed to stretch between two ends of a spectrum. One side seemed to assert that these projects that achieved rung two are bug free. Obviously, Coverity never suggested that&#8211;you were just rewarding projects that have made good use of your technology. On the other end of the spectrum, some people thought it pointed out inherent weakness of open source, because there were vulnerabilities to be addressed.</p>
<p>My impression is the truth lies somewhere in the middle. Large open source products like the Linux kernel or Apache have very high quality, and they&#8217;re worked on by very good developers. But I think it does show that certain jobs are best done by a machine. It stands to reason that code scanning is a job for automation that never gets tired or bored.</p>
<p>Where do you think the limits are of what eyeballs can do in fixing defects? And as an extension of that idea, where do you think the technology can continue to go? What are some of the opportunities for things that could potentially be scanned for that maybe are still hard problems to solve today?</p>
<p><b>Ben:</b> We started this project with the idea that we were not up to the task of debugging our own software; debugging is something that&#8217;s plagued developers forever, right?</p>
<p>In 1949 Maurice Wilkes&#8211;a pioneer in creating the EDSAC&#8211;said that the first time he had to debug his software, he realized that most of the time he spent in programming would be on debugging the mistakes he made.</p>
<p>The observation, of course, is that the existing techniques are not sufficient. Testing can&#8217;t cover everything, and humans are subject to frailty, no matter how good we are. First, we&#8217;re not perfect. Even if I only had 1,000 lines of code and I could look at it day in and day out for months and months and months, I would still make mistakes.</p>
<p>Layer on top of that the fact that systems now are millions or tens of millions of lines of code, and a single developer has no chance of keeping the complexity of that kind of system in his or her head all at once. It&#8217;s just impossible.</p>
<p>We&#8217;re at the limit. There is no way to uncover these defects with the traditional software development process. So you look to automation technology that can cover all the different possibilities in code, and this is where static analysis shines, especially compared to testing or dynamic techniques. It isn&#8217;t tied to the execution of the program. </p>
<p>When you execute the program, maybe you get five percent code coverage, meaning that five percent of the lines of code are touched at least once in the execution of a test suite. And that doesn&#8217;t speak at all about the different contexts that those lines could be executed in.</p>
<p>So when you&#8217;re talking about real coverage, it&#8217;s a minuscule percentage. Hundredths or thousandths or ten-thousandths of a percent in terms of the billions and trillions of different possible execution paths through the code.</p>
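<p>The gap between line coverage and path coverage can be made concrete with a little arithmetic. The following is a minimal Python sketch added for illustration, not something from Coverity; the function classify and its three flags are hypothetical. The assumption is that each branch is independent and two-way, so every extra branch doubles the number of execution paths.</p>

```python
from itertools import product


def classify(a, b, c):
    """Hypothetical toy function with three independent two-way branches."""
    trace = []
    trace.append("a+" if a else "a-")
    trace.append("b+" if b else "b-")
    trace.append("c+" if c else "c-")
    return tuple(trace)


def paths_covered(test_inputs):
    """Return the set of distinct execution paths the given inputs exercise."""
    return {classify(*args) for args in test_inputs}


def count_paths(branches):
    """Number of paths through a chain of independent two-way branches."""
    return 2 ** branches


# Every combination of the three flags, i.e. every possible path.
all_paths = paths_covered(product([False, True], repeat=3))

# Two tests are enough to take both sides of every branch at least once.
tests = [(True, True, True), (False, False, False)]
covered = paths_covered(tests)

path_coverage = len(covered) / len(all_paths)
print(f"{len(covered)} of {len(all_paths)} paths covered ({path_coverage:.0%})")
print(f"paths through 40 independent branches: {count_paths(40):,}")
```

<p>In this sketch, two tests exercise both sides of every branch yet reach only 2 of the 8 paths. With 40 independent branches the path count already exceeds a trillion, which is why test suites cover such a small fraction of possible executions.</p>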
<p>Static analysis affords us the luxury of being able to try to cover all of that without running it. And so you can find those corner cases that every human and test case overlooked. Now, obviously the promise of that sounds great, but there are some very practical limitations, in terms of analysis. </p>
<p>Analysis is not easy, partly because the systems are so complicated. This is where a lot of the breakthrough comes in from our technology&#8211;addressing that scalability while retaining enough analysis heft to find interesting things in the code.</p>
<p>It&#8217;s one thing to analyze ten million lines of code, but if you&#8217;re not doing any deep analysis, you&#8217;re not going to find any of those real defects, those real security vulnerabilities that really resonate with developers. On the other hand, if you are doing deep analysis, it might take you the age of the universe to analyze a code base of reasonable size.</p>
<p><b>Scott:</b> [laughs] Right.</p>
<p><a name="scalability"></a></p>
<p><b>Ben:</b> So that&#8217;s no good either. So there&#8217;s this very elusive sweet spot that for decades, literally, people didn&#8217;t quite have figured out in static analysis, and that&#8217;s what kept it from really delivering on the promise that it had ever since the days of Lint in the late 70s.</p>
<p>There are always tradeoffs that have to be made to get that scalability, and those tradeoffs manifest themselves as potential weaknesses, in terms of both false negatives and false positives.</p>
<p>False negatives, of course, are instances where static analysis did not report a defect that&#8217;s actually in the system. Of course, we&#8217;ll never have static analysis that can find every single defect, but if it doesn&#8217;t find anything, then it&#8217;s not going to be very useful. Therefore, in a nutshell, you need to find enough things to make the testing process valuable, in conjunction with your other mechanisms for testing.</p>
<p>The other issue is false positives, which tend to be a lot more deadly. When a static analysis tool reports to you something that is not actually a defect, it wastes a bit of the developer&#8217;s time. If you have too many false positives, then the developers will start ignoring even the good results. It&#8217;s the boy who cried wolf phenomenon.</p>
<p>If the tool cries wolf nine times in a row, even if that tenth one is a nasty bug, you&#8217;re probably going to gloss over it. I think the testament to the fact that we solved that problem was that open source developers gravitated to the product. They used it and they kept using it.</p>
<p>No manager told them, &#8220;You better go use this now,&#8221; or anything like that. It was just out there and they kept using it. We&#8217;ve measured the false positive rate to be very low, and that&#8217;s an important part of what makes it practical&#8211;having enough analysis heft to keep the rate of false positives low, yet still having a high enough rate of real bugs found to make it interesting and valuable to the end customer.</p>
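<p>The two failure modes described above can be tallied with simple set arithmetic. This is a minimal Python sketch added for illustration; the report names and counts are made up to mirror the &#8220;nine times in a row&#8221; example and are not Coverity data. It assumes the tool produced at least one report.</p>

```python
def triage_summary(reported, real_defects):
    """Compare what a tool reported against the defects that really exist."""
    reported = set(reported)
    real = set(real_defects)
    false_positives = reported.difference(real)   # reported, but not real
    false_negatives = real.difference(reported)   # real, but never reported
    true_positives = reported.intersection(real)  # reported and real
    return {
        "false_positive_share": len(false_positives) / len(reported),
        "missed_defects": len(false_negatives),
        "real_bugs_found": len(true_positives),
    }


# Hypothetical run: ten reports, of which only the tenth is a real bug,
# plus one real defect the tool never flagged.
reported = [f"report-{i}" for i in range(1, 11)]
real_defects = ["report-10", "unreported-bug"]

summary = triage_summary(reported, real_defects)
print(summary)
```

<p>Here nine of the ten reports are noise, so the one real finding is easy to gloss over, and one real defect is missed entirely. A practical tool has to keep the first number low without letting the second grow.</p>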
<p>All of that is really background to answer your question about where we take the product next. You want to add more analysis heft and still scale, so you can push that false positive rate even lower and find more and more and more bugs.</p>
<p>If you were to run Coverity 1.0 circa 2003 and compare it to now, we find double, triple, quadruple, or more the number of defects on any given code base at that same low false positive rate, because we&#8217;re learning. We&#8217;re getting examples from our customers&#8211;they&#8217;re giving us more and more bugs, and asking us to find ways to uncover them.</p>
<p>We&#8217;ve now analyzed close to two billion lines of code between the open source community and all of our commercial customers. That&#8217;s an awful lot of great data for us in terms of learning about software systems, how they fail, and specifically in the code, what goes wrong that makes developers pull their hair out.</p>
<p>The obvious question is how to do that. How do you increase your bug-finding rate and make sure you&#8217;re doing it in a reasonable fashion? Recently, we&#8217;ve introduced a new kind of analysis technology based on Boolean satisfiability. </p>
<p>You take a Boolean formula, which only deals in values of true and false&#8211;variables that are either true or false, and the logical operators &#8220;and,&#8221; &#8220;or,&#8221; and &#8220;not&#8221;&#8211;and you try to figure out whether it&#8217;s possible for that formula to be satisfied. In other words, you try to determine whether there&#8217;s some assignment of variables that makes the whole thing true.</p>
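<p>The satisfiability question can be illustrated with a brute-force check. This is a minimal Python sketch added for illustration only: it tries every assignment, which takes exponential time, and it says nothing about how Coverity&#8217;s analysis or any production SAT solver actually works. The helper satisfying_assignment and the two example formulas are hypothetical.</p>

```python
from itertools import product


def satisfying_assignment(formula, names):
    """Return the first assignment that makes the formula true, or None.

    Tries all 2**len(names) combinations of False/True in order.
    """
    for values in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, values))
        if formula(**assignment):
            return assignment
    return None


def example(a, b, c):
    """(a or b) and (not a or c) and (not b or not c)"""
    return (a or b) and ((not a) or c) and ((not b) or (not c))


def contradiction(a):
    """a and not a: no assignment can make this true."""
    return a and not a


found = satisfying_assignment(example, ["a", "b", "c"])
impossible = satisfying_assignment(contradiction, ["a"])
print("example:", found)
print("contradiction:", impossible)
```

<p>The first formula is satisfiable, so the search returns a concrete assignment; the second is not, so it returns None. In a static analysis setting, an unsatisfiable formula built from the conditions along a path would indicate that the path can never execute.</p>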
<p>This has actually been used in the hardware industry for a long time by the people who make the tools for chip design. Of course, chips are very, very expensive to patch if a defect hits in the field. We all remember the Intel Pentium FDIV bug&#8211;it cost Intel roughly half a billion dollars.</p>
<p>Obviously, that&#8217;s a pretty big mistake, and it&#8217;s not surprising that the hardware guys generally apply more rigor on the analysis front than the software guys do. But the fact is, static analysis for hardware verification has been around for a long time. No one&#8217;s really translated this for software before, because there hasn&#8217;t been as much of a push for software verification.</p>
<p>Now we have introduced that kind of technology to complement our traditional dataflow analysis, or abstract interpretation, or lattice simulation&#8211;there are a number of different names for the notion of pushing down all the different possible paths through the code. </p>
<p>But we want to marry that with a bit-level representation of the software system. That helps us reduce false positives by reducing false paths, meaning paths that can never be executed, as well as paths that shouldn&#8217;t be analyzed. </p>
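<p>As an illustration (not part of the conversation), a false path can be thought of as a sequence of branch conditions that no input can satisfy all at once. The sketch below assumes a single unsigned 8-bit variable <code>x</code> and simply tries all 256 values. The name <code>path_is_feasible</code> and both sample paths are hypothetical.</p>

```python
def path_is_feasible(conditions, bits=8):
    """Return True if some unsigned `bits`-wide value of x satisfies every
    branch condition on the path, i.e. the path can actually execute."""
    return any(all(cond(x) for cond in conditions) for x in range(2 ** bits))


# Hypothetical path: takes the branch `x > 200`, then later the branch `x == 0`.
# No single value of x can do both, so any defect reported here is a false positive.
false_path = [lambda x: x > 200, lambda x: x == 0]

# Hypothetical path: takes `x > 200`, then `x % 2 == 0`. Values such as 202 qualify.
real_path = [lambda x: x > 200, lambda x: x % 2 == 0]

print(path_is_feasible(false_path))
print(path_is_feasible(real_path))
```

<p>A report that depends on the first path could be suppressed, while the second path remains worth analyzing.</p>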
<p>It also helps us find new kinds of bugs like integer overflows and enhance the buffer overflow capabilities of the traditional static analysis technologies. So there are lots of different technologies out there that can do analysis of systems at various different levels.</p>
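<p>As an illustration (not part of the conversation), an integer overflow shows up once arithmetic is modeled at a fixed bit width. The sketch below assumes unsigned 8-bit addition that wraps modulo 256 and searches a bounded input range for a pair whose sum wraps. The names <code>add_uint8</code> and <code>find_overflow</code> are hypothetical.</p>

```python
def add_uint8(a, b):
    """Add two values the way an unsigned 8-bit register would (wraps modulo 256)."""
    return (a + b) % 256


def find_overflow(max_a, max_b):
    """Search inputs from 0 up to the given bounds for the first pair whose
    8-bit sum differs from the mathematical sum. Returns (a, b) or None."""
    for a in range(max_a + 1):
        for b in range(max_b + 1):
            if add_uint8(a, b) != a + b:
                return a, b
    return None


print(find_overflow(200, 100))  # some pair in this range wraps around
print(find_overflow(100, 100))  # the largest sum is 200, which still fits
```

<p>If the input bounds are known, the same search can also show that an addition is safe.</p>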
<p>But again, we always have an eye toward the scalability of the system and a low false positive rate, because that&#8217;s how you get the thing to be used. And of course that&#8217;s the name of the game. It doesn&#8217;t matter how good your analysis is, if developers don&#8217;t want to use it.</p>
<p><a name="overcoming"></a></p>
<p><b>Scott:</b> One of the things I did for a little homework was to look at the Linux kernel mailing list to see what the initial response was when Coverity started to show up on the list. </p>
<p>I remember seeing a thread where somebody was submitting a bunch of patches, and other people were arguing against checking them in. Their view was that they needed to actually look at why the tool reported a problem, and that it isn&#8217;t enough just to modify the code so that the tool doesn&#8217;t complain any more. They wanted to verify that there was really a vulnerability.</p>
<p>What are the safeguards, once the tool has output a lot of things, to make sure they actually get fixed the right way? And do you think there is a problem, or the potential for one, where people might just try to make the code pass through clean without necessarily spending the time to do a root cause analysis and ask, &#8220;Is there a vulnerability here?&#8221;</p>
<p><b>Ben:</b> Yes, that&#8217;s always a danger with any kind of technology, in terms of a developer who isn&#8217;t quite in the right mindset, who maybe is unfortunately not careful enough about the changes they&#8217;re making. They might just try to either make the code compile without throwing errors or warnings, or make it run through the static analysis tool without reporting anything.</p>
<p>What we do to discourage that kind of behavior is to interface robustly to the defect. What we communicate is pretty much everything we learned about the defect in the analysis of that path that discovered the problem.</p>
<p>So, for example, if we observe one event on line 23, and a number of different conditions had to be true or false to get another key event to occur, say, on line 53, we try to step the developer from one condition to the next, outlining the path as exactly as possible. </p>
<p>We try to present all of the information that we have to encourage a deeper look into the issue, although of course you can lead the horse to water, but you can&#8217;t make him drink.</p>
<p>For instance, we might say, &#8220;OK, this is where you have a memory leak.&#8221; And they think, &#8220;Oh, memory leak. I should put a free in there.&#8221; Or, for a null pointer dereference, &#8220;Oh, I guess I should just check that pointer to see if it&#8217;s null or not.&#8221;</p>
<p>Absolutely, you can end up introducing things that aren&#8217;t the right fix. And, of course, that&#8217;s a fact of life. It&#8217;s just as if a QA person told the developer, &#8220;Hey, I noticed that the system crashed.&#8221; Then the developer just made a bunch of arbitrary changes. Then it doesn&#8217;t crash anymore, but they never understood exactly why that crash happened in the first place. Chances are, they just masked the problem with maybe another problem that&#8217;s going to rear its head later on.</p>
<p>So the problem isn&#8217;t specific to static analysis, but certainly these tools can be susceptible to that.</p>
<p><b>Scott:</b> We were talking to some people on Firefox, and they said the dynamic nature of their product, the way they use pointers, and so on causes Coverity to generate excessive false positives. Do you see that type of issue in other environments, and if so, how do you respond to that?</p>
<p><b>Ben:</b> One of the things that we unfortunately don&#8217;t necessarily get to have with the open source community, as opposed to our commercial customers, is the implementation phase of the static analysis.</p>
<p>For the most part, users are going to get good scalability and a low rate of false positives right out of the box, but there are a lot of things you can do to tune it, as well. If you do happen to have a higher than average false positive rate, chances are good that we can address it by encapsulating just a couple of idioms in some configuration for the analysis engine. </p>
<p>For our commercial customers, part of licensing the technology includes our professional services, to come in and make sure everything looks exactly right for that environment. You need to tune things to the application that you&#8217;re analyzing, since every code base is a little different. Tuning the analysis engine can not only keep the false positive rate low, but it can also find more and more kinds of defects, if you add additional bits of knowledge.</p>
<p><b>Sean:</b> When you looked at open source projects, I understand that PHP and some others came out as having particularly high rates of vulnerabilities. Was there any particular class of vulnerability that seemed to show up more often than others? And I guess this is really a separate question, but in cases like that, how do you provide the next level down of analysis for customers?</p>
<p><b>Ben:</b> For the open source packages, we decided not to do that deeper kind of digging. Of course, the data&#8217;s there, and we could certainly look at the different kinds of vulnerabilities and defects that we discovered, but there was generally nothing that jumped out as so obvious that we thought it was either newsworthy or worth exposing in that way.</p>
<p>On the other hand, with X.org, there was one security vulnerability that caused them a great amount of potential grief. They were very glad that we found it, because they said it was literally one of their worst case scenarios&#8211;a root exploit that anybody could take advantage of.</p>
<p>It was actually just a simple matter of missing parentheses in a function call, where they meant to call a function but didn&#8217;t. That kind of issue is hard to categorize. It was just a simple coding error, but it led to a root exploit.</p>
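<p>As an illustration (not part of the conversation, and not the actual X.org code), the same class of mistake can be reproduced in Python, where a function object is always truthy. In C, a bare function name likewise evaluates to a non-null address instead of calling the function. Every name below is hypothetical.</p>

```python
def caller_is_root():
    """Hypothetical privilege check; in this sketch the caller is never root."""
    return False


def may_load_module_buggy():
    # Bug: the parentheses are missing, so this tests the function object
    # itself (always truthy) instead of the value the function returns.
    if caller_is_root:
        return True
    return False


def may_load_module_fixed():
    # Fix: actually call the function and test its result.
    if caller_is_root():
        return True
    return False


print(may_load_module_buggy())
print(may_load_module_fixed())
```

<p>The buggy version grants access to everyone, even though the two functions differ by only a pair of parentheses.</p>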
<p>That&#8217;s another thing with static analysis: it&#8217;s sometimes hard to predict the impact of a coding flaw. For instance, we might discover a place where you&#8217;re going to de-reference a NULL pointer. Well, that could translate into an innocuous crash that hardly ever happens, or a huge denial of service that every single attacker can take advantage of.</p>
<p><b>Sean:</b> Once you&#8217;ve got a mistake, there&#8217;s no way to really know what the ramifications of that mistake are going to be. A curly brace in the wrong place, or, like you said, lack of parens could be nothing, or it could be a worst-case scenario.</p>
<p><b>Ben:</b> Exactly. So we try to stay away from getting too sensational about claiming that a certain type of defect suggests a certain level of vulnerability. We prefer to let the data speak for itself, and we let the developers take care of the issues to get them fixed. </p>
<p><a name="hubris"></a></p>
<p><b>Sean:</b> At this point, you guys have looked at a lot of code across commercial and open source projects, projects of different sizes, and projects of different ages.</p>
<p>Do you find any correlation between the number of defects that you find with the size or age of the project? Are there other things where there seems to be a correlation, or is it really not correlated to much of anything?</p>
<p><b>Ben:</b> I would say it isn&#8217;t correlated to any of those obvious things like age or size. Certainly the bigger the project, the more susceptible it&#8217;ll be to the complexities of code interacting with other code, which leads to the kinds of things we find.</p>
<p>So there might be a slight correlation in terms of defects per thousand lines of code being a little bit higher on the larger code bases, but what I think dominates more than that is the mindset of the development organization.</p>
<p>Those who focus on quality, in terms of good processes and tools in that regard, have better code, and we find fewer bugs in that kind of code.</p>
<p><b>Sean:</b> I&#8217;ll just go ahead and lay some personal bias out there for the reader to judge me on, but there seem to be a lot of organizations that have a lot of hubris around the security of their products because they haven&#8217;t necessarily had a lot of high profile exploits.</p>
<p>Not to throw stones at any one company, but I read an article just today about enterprise organizations that have a lot of concerns about the iPhone, because of vulnerabilities like buffer overruns from people loading unsupported apps on them.</p>
<p>You&#8217;ve been looking at this for a long time. Do you feel like companies are sometimes a little overconfident, so they haven&#8217;t necessarily really done the work to make it secure, and if they really started looking under the rug they&#8217;d find a lot of unpleasantries?</p>
<p><b>Ben:</b> Obviously, I can&#8217;t comment on any particular companies, but I think software makers in general&#8211;whether commercial or open source&#8211;need to be paranoid.</p>
<p>They need to recognize that every system is attackable, and that there is no such thing as finding all the bugs. There is no such thing as finding all the security vulnerabilities, and that means you&#8217;re susceptible to malicious users.</p>
<p>I think software makers who have not been the victims of a lot of attacks should consider themselves lucky. It&#8217;s a game of probabilities, in many respects, and you can control some of the factors but not all of them. </p>
<p>You can control the degree to which you test your product, how good your secure coding practices are, whether you leverage the latest and greatest static analysis, find and fix all those defects, and do the penetration testing and so forth. All of that reduces the probability that malicious users will discover and attack the remaining security vulnerabilities.</p>
<p>On the other hand, software builders don&#8217;t have direct control over factors like how many attackers are out there trying to break the system. I think the companies that have the highest profile security problems are the ones the attackers are really going after, because it&#8217;s their products that are on most of the hardware out there. </p>
<p>There are a lot of different mindsets for attackers, but one motivation is certainly fame or the pervasiveness of the exploit. If I have a choice of hacking something that a million people have or five people have, I&#8217;m going to pick the one that applies to a million different systems.</p>
<p><a name="difficulty"></a></p>
<p><b>Sean:</b> What do you think are some of the most common misconceptions that people have about announcements that you guys push out, whether it&#8217;s the homeland security one or whatever?</p>
<p>When Microsoft or Mozilla goes and puts out an announcement about bug counts, the other one comes and spanks it down. In the end, you&#8217;re always left wondering. I&#8217;ve been wanting to put a certain great Mark Twain phrase into an interview transcript, and this looks like the right place&#8211;&#8220;Lies, damn lies, and statistics.&#8221;</p>
<p>What are the common misconceptions you see about the meaning of statistics you publish?</p>
<p><b>Ben:</b> The most common misconception, especially with the open source work we&#8217;re doing, is that we are somehow trying to make a judgment between the quality of open source versus the quality of proprietary software.</p>
<p>I think if you look at the debates and the Slashdot dialog and the comments to stories and so on, you&#8217;ll see people responding to a report by saying, &#8220;Oh, is Coverity saying that open source is bad?&#8221;, &#8220;No, Coverity is saying open source is good!&#8221;, &#8220;No, Coverity is saying open source is bad!&#8221;</p>
<p>[laughter]</p>
<p><b>Sean:</b> Your PR department must hate it when you put out an analysis, since they know they&#8217;re going to get hit by two sides. Both sides are going to pull the numbers and stretch it like taffy.</p>
<p><b>Ben:</b> Exactly. I think that&#8217;s the number one thing that people misconstrue. I think the second misconception, unless people are really trying to understand what we&#8217;re doing with these projects and so on, is they think that we&#8217;re claiming that static analysis is the be-all and end-all to software testing and the ultimate yardstick for software quality.</p>
<p>We never make that claim, and we&#8217;ve never fed it when other people make it. The reason we put these reports out and publicize them is so that people are aware of technologies that can help make software better.</p>
<p>Static analysis hasn&#8217;t been around for a long time, and there isn&#8217;t necessarily awareness of what it can do, but in the last two years, 8,000 defects have been fixed in open source packages because of this new technology. That says something.</p>
<p>The fact that 400 customers have signed up, many of them across their enterprise, to have our defect detection as part of their development process also says something.</p>
<p>There are still many billions of lines of code out there that aren&#8217;t going through this new way of checking. To get this stuff implemented used to be hard, but not anymore. There&#8217;s no excuse.</p>
<p>When we make these announcements and talk to people about the technologies, we want them to realize that the development world has improved, in terms of the tools and technologies that help them make the quality of software better. We do that in part using living proof that open source developers are using it and benefiting from it.</p>
<p><b>Scott:</b> For whatever reason, it&#8217;s hard to simply present data and have people believe that you&#8217;re not trying to subtly pass a judgment.</p>
<p>It must have been tremendously valuable to you that tens of millions of lines of open source code were available to throw at your product for analysis.</p>
<p><b>Ben:</b> Absolutely. I wrote an article about two years ago, and the title was something on the order of, &#8220;Open Source Software and Static Source Code Analysis: A Perfect Match.&#8221;</p>
<p>The article was really about how the open source movement enabled this innovation in static analysis. Before that, not enough companies would have let us come in and play around with their code for months on end. Now that we have all this code out there, it&#8217;s a great playground.</p>
<p><a name="changing"></a></p>
<p><b>Sean:</b> It seems to me that what hackers do has sort of changed over the past five, six, seven years. It used to be, if you had a virus, worm, or whatever, you knew it, because your corporate network came to its knees. </p>
<p>Today, the hackers are far more subtle, because they&#8217;ve figured out that it&#8217;s a lot more beneficial for them to build a bot-net that nobody knows they&#8217;re a part of. Do hackers&#8217; changing techniques affect the kind of scanning that you do?</p>
<p><b>Ben:</b> I don&#8217;t think there&#8217;s too much of a correlation there. Not all hackers are even really &#8216;hacking&#8217; so much, because there are so many social ways to take control of systems, just by getting people to install stuff. </p>
<p>I do think that the vulnerabilities that allow them to automatically have a worm or a bot spread across a network look the same as ever, though. Even if they hide their tracks a lot better than they used to, they&#8217;re attacking the same kinds of problems in the code.</p>
<p><b>Sean:</b> Do you think that track-hiding helps perpetuate a false sense of security, as opposed to the old malware back in the day, like &#8220;Code Red,&#8221; &#8220;Nimda,&#8221; and &#8220;Slammer,&#8221; where you definitely knew you had them? </p>
<p><b>Ben:</b> That&#8217;s a good question. Admittedly I don&#8217;t have my finger on the pulse of the consumer side of things, but my impression is that people are not as concerned about security, but they&#8217;re frustrated by a different aspect of the same issue. They buy a computer, and six months later it&#8217;s loaded up with all kinds of different services and applications that they have no idea about.</p>
<p>And some of those are probably malware, and some of them are legitimate, but I don&#8217;t think people see it as a security threat, they just see it as frustration with computing. </p>
<p><b>Scott:</b> It seems that, even on the server side, nobody in the data center wants to install any more patches than they have to. By using static analysis in combination with other tools, like dynamic analysis and fuzz testing, on one hand the number of defects per line of code can go down, but on the other hand, the number of lines of code goes up. Do you see it as a zero sum game, or can the number of defects per line of code be driven down faster than the increase in the total number of lines of code?</p>
<p><b>Ben:</b> You can definitely make progress, for the simple reason that people write code, and that&#8217;s slow, whereas analyzing code is fast. We can analyze code a lot faster than people can generate it, and when we innovate in terms of finding ways to analyze code and so on, we&#8217;re pushing that out to a billion and a half or two billion lines of code per year.</p>
<p>It doesn&#8217;t take very much innovation out of the Coverity Research and Development arm to drastically impact the quality of billions of lines of code, so I think we&#8217;re going to be able to stay ahead of the game in that regard. </p>
<p>It&#8217;s the exploding complexity in software that has people listening again to the messages of static analysis and disruptive technology, because they don&#8217;t have any other answers. They&#8217;re trying to make high-quality, secure code, but the volume is increasing geometrically. There are 10 million lines of code in a premium automobile today, where 10 years ago, there were almost none.</p>
<p>There&#8217;s code in everything; by the time you get to work in the morning, you&#8217;ve interacted with probably over 100 million lines of code, if you look at all the devices that you&#8217;ve interacted with. And since that code is coming from all over&#8211;outsourcing, interactions with third party libraries, new infrastructure frameworks, you name it&#8211;the complexity of code is just too great not to leverage these technologies.</p>
<p><b>Scott:</b> Do you see any sort of a trend where when a company gets components from external sources, they use tools like yours as sort of a quality gate? For example, they might go out to some component vendor and say, &#8220;Look, you don&#8217;t have to open source your component, but I want to run some tools on your code base,&#8221; so they could compare the outcomes among competitors.</p>
<p><b>Ben:</b> We&#8217;ve got a great case study on exactly that topic. One of our customers recently was trying to come to an agreement with their outsource vendor as to when some code should be accepted, in terms of whether it had been debugged sufficiently.</p>
<p>We didn&#8217;t suggest this to them, but they looked at Coverity as third-party arbitration, if you will, where they could say, &#8220;This is a gate, and we both can agree that when Coverity says there are no defects in it, we accept that development is complete.&#8221;</p>
<p>That approach cleaned up their relationship a lot, because they finally had some kind of commonly agreed upon measurement for acceptance. They also said that using the product in that capacity and also just fixing defects early in the development process let them shave a significant percentage off of the time it took them to release a product. Of course, that&#8217;s the ultimate ROI here&#8211;you want to get products out the door faster.</p>
<p>I think that&#8217;s something that we will see more and more of. People will finally have some way of making a reasonable gate for the code that they accept into their products.</p>
<p><b>Scott:</b> Well, I want to be sensitive to the time, so do you have any final thoughts to add?</p>
<p><b>Ben:</b> We have talked a lot about problems in software development and finding defects in code with static analysis, but we view this as just the beginning.</p>
<p>To do static analysis the right way, you have to understand how the software is built, how every file is compiled, and how it&#8217;s all assembled together. In the technology that we&#8217;ve generated over the last five years at Coverity, we have all of this information.</p>
<p>If you look at the marketing literature out there on Coverity, we talk about the software DNA map, the fundamental building blocks of everything in your software system&#8211;not just the source code. Versions, how every file was compiled, your build system, and all of that information is sitting right there.</p>
<p>The future for Coverity is to branch out from static analysis and to try to identify the places that are painful in the software development process, from build systems through the compilation process, and finding defects, both statically and dynamically.</p>
<p>Those are all areas where having a beautiful and complete representation of exactly what&#8217;s going on in the software system could benefit the developers, the product managers, and the software development managers. We want to help them answer questions like, &#8220;What&#8217;s going on in my code base? How is it changing? Am I ready for release? Is this code in good shape?&#8221;</p>
<p>Right now, no one knows how to answer those questions, so for us, the next few years will be spent exploring that. We&#8217;re really trying to find the tools and technologies that will enable high software integrity, and that encompasses much more than just finding a few bugs or a few security vulnerabilities&#8211;it means making sure that you&#8217;re confident in what you&#8217;re doing from a software development standpoint.</p>
<p>That&#8217;s how I&#8217;ll conclude&#8211;a little bit of a teaser for what&#8217;s coming next out of the pipe.</p>
<p><b>Scott:</b> Excellent. Thanks for taking the time to talk with us today.</p>
<p><b>Ben:</b> You&#8217;re welcome, and thank you.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2008/06/17/interview-with-ben-chelf-cto-coverity/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interview with Jamie Thingelstad &#8211; CTO &#8211; Wall Street Journal Digital Network</title>
		<link>http://howsoftwareisbuilt.com/2008/04/02/interview-with-jamie-thingelstad-cto-wall-street-journal-digital-network/</link>
		<comments>http://howsoftwareisbuilt.com/2008/04/02/interview-with-jamie-thingelstad-cto-wall-street-journal-digital-network/#comments</comments>
		<pubDate>Wed, 02 Apr 2008 20:50:10 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/2008/04/02/interview-with-jamie-thingelstad-cto-wall-street-journal-digital-network/</guid>
		<description><![CDATA[Interviewers: Scott Swigart and Sean Campbell
Interviewee: Jamie Thingelstad
In this interview we talk with Jamie Thingelstad &#8211; CTO of the Wall Street Journal&#8217;s Digital Network. Specifically, we talk about:

Open and closed-source technology stacks at WSJ
The technology evaluation process
Intersections between open source and commercial entities
Security considerations with open and closed platforms
The value and liability of owning [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Interviewers:</strong> <a href="http://howsoftwareisbuilt.com/about-scott-swigart/">Scott Swigart</a> and <a href="http://howsoftwareisbuilt.com/about-sean-campbell/">Sean Campbell</a></p>
<p><strong>Interviewee: </strong><a href="http://howsoftwareisbuilt.com/about-jamie-thingelstad-cto-wall-street-journal-digital-network">Jamie Thingelstad</a></p>
<p>In this interview we talk with Jamie Thingelstad &#8211; CTO of the Wall Street Journal&#8217;s Digital Network. Specifically, we talk about:</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2008/04/02/Interview-with-Jamie-Thingelstad-CTO-Wall-Street-Journal-Digital-Network#stacks">Open and closed-source technology stacks at WSJ</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/04/02/Interview-with-Jamie-Thingelstad-CTO-Wall-Street-Journal-Digital-Network#evaluation">The technology evaluation process</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/04/02/Interview-with-Jamie-Thingelstad-CTO-Wall-Street-Journal-Digital-Network#intersections">Intersections between open source and commercial entities</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/04/02/Interview-with-Jamie-Thingelstad-CTO-Wall-Street-Journal-Digital-Network#security">Security considerations with open and closed platforms</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/04/02/Interview-with-Jamie-Thingelstad-CTO-Wall-Street-Journal-Digital-Network#value">The value and liability of owning code</a></li>
</ul>
<p><span id="more-144"></span></p>
<p><b>Sean Campbell:</b> Jamie, thanks for taking the time to chat.  Could you give us a little background on your role at Dow Jones?</p>
<p><b>Jamie Thingelstad:</b> Sure. I&#8217;m chief technology officer for the Wall Street Journal Digital Network, which is comprised of WSJ.com, MarketWatch.com, Barrons.com, and a number of our smaller web properties. I&#8217;ve been in this role for about a year and a half. Prior to that, I was the CTO for MarketWatch.com inside Dow Jones, after we were acquired by Dow Jones. </p>
<p>Going back further, I was CTO and one of the early founders of BigCharts.com, which is a premium charting website. BigCharts was acquired by MarketWatch in 1999, and I became CTO of MarketWatch after the acquisition. We acquired a few companies, and then Dow Jones acquired MarketWatch. </p>
<p><a name="stacks"></a></p>
<p><b>Scott Swigart:</b> You mentioned a number of digital properties that Dow Jones has. Tell us a bit about the infrastructure behind those and the kind of volume that those properties have to handle. </p>
<p><b>Jamie:</b> Since they&#8217;re all news sites, there&#8217;s a certain unpredictability to the traffic volume. For instance, the market can drop at any second, and that can drive huge amounts of volume to our web sites.</p>
<p>The two largest sites that we run, WSJ.com and MarketWatch.com, are fairly different. WSJ.com is built in a Java environment using WebSphere, making a slow migration to Tomcat. It&#8217;s built in a fairly rigid framework, but it is very robust and very scalable. It&#8217;s stable, but also not quite as flexible as we&#8217;d like it to be.</p>
<p>The MarketWatch platform is a bit different. MarketWatch is built in a Windows environment and uses .NET. It was originally built in ASP and has been migrated to .NET in bits and pieces. It follows a &#8220;hundreds and hundreds of servers&#8221; approach, where no one server is doing all that much, so it is a bit more agile, and we can typically deploy things a little bit faster. It also has a slightly different performance characteristic. It can have the equivalent of a brownout more often than, say, WSJ.com. </p>
<p><b>Scott:</b> That&#8217;s pretty fascinating that you&#8217;ve got several high volume, critical sites and yet their underpinnings are so different from each other. You mentioned moving the Wall Street Journal site to an Apache open source back end, but the MarketWatch site is on a proprietary Microsoft stack. </p>
<p>What can you tell us about the decision-making process as different organizations evaluate technology choices? It looks like these two groups took pretty different approaches to reach similar goals.</p>
<p><b>Jamie:</b> The lesson to be learned there is that you can build large-scale sites to be stable, reliable, and highly available in either of those platforms. There is nothing about one versus the other that inherently makes it unable to build and provide those kinds of solutions. Good developers working in either one can get the job done, and do it elegantly.</p>
<p>The historical context is that the Journal web site as it exists now is the descendant of an engagement with IBM Consulting many years ago, and they obviously wanted to build in a WebSphere environment. </p>
<p>All this predates my being part of Dow Jones, but my understanding is that the decision was more about who to partner with than how to get to a specific architecture. In this case, IBM came to the table with WebSphere. In the MarketWatch example, that&#8217;s the platform that we built over the course of many years starting with BigCharts, and the platform actually started in a very different place.</p>
<p>In the very early days, before it was MarketWatch, when it was the platform that ran BigCharts, it was Sybase. It was Perl and CGI in Apache. And it was an ANSI C++ program that we had written to serve charts, that could actually be compiled on both the Win32 library as well as in Unix, at the time Solaris. And so, we had this sort of cross-platform environment. We made a decision around 1996 or &#8217;97 that we were spending too much effort building plumbing code and decided that we needed to move to something else.</p>
<p>At the time, we evaluated CORBA and COM, and our conclusion was that COM was better suited to the type of solutions that we were trying to build. We rewrote the entire system into a COM model and then switched our front-end production environment from Solaris to Windows NT. The net benefit of all that was that we stopped writing a lot of code that was about glue and plumbing, and focused all of our code on business requirements and functions and features that our customers wanted. </p>
<p><a name="evaluation"></a></p>
<p><b>Scott:</b> Can you walk us through how you think about the technology evaluation process? And, as a follow-up, broadly speaking, what are your observations about the pluses and minuses of building on top of a community-supported open source stack, versus going with more of a partner, like IBM or Microsoft?</p>
<p><b>Jamie:</b> I think you have to start with your team. If you&#8217;re able to build your own team, it&#8217;s less of an issue, but if you&#8217;re walking into an environment where the team is well-versed in one tool set or one part of the stack, you shouldn&#8217;t fool yourself into thinking that you can then choose some alien technology and that the team will just flow with it.</p>
<p>I don&#8217;t know many Java shops out there where you could just walk in and say, &#8220;Well, you know what? We&#8217;re going to do the next project in Windows&#8221; and have that be just fine. Independent of culture, independent of bigotry, I think it&#8217;s a matter of skills and capabilities, and some of those skills don&#8217;t transfer easily.</p>
<p>But if you take the team out of it, then I think it&#8217;s a pretty interesting angle. My view is that there really is little value that a proprietary platform like the Windows platform gives you. If I were going to start, green field, building my own shop, I would be heavily biased towards free platforms: some Linux distribution, likely something like Apache.</p>
<p>I would also be very biased towards an interpreted environment. I think there have been great advances in the compiled environments for the web, and .NET has made huge leaps in performance and capability. Similarly, so has Java. But it&#8217;s interesting that the sites that I see doing the most “out-there” stuff, that are the examples of innovation, are typically not running a compiled environment on the front end. WordPress, Twitter and Facebook come to mind.</p>
<p>I hate to suggest that there&#8217;s an either-or situation where either you are innovative and you&#8217;re on an interpreted environment, or you&#8217;re not and you run a compiled environment. I don&#8217;t think it&#8217;s quite that simple. But I do think that there is some real merit to the lightness that you get with something like PHP, with a robust framework, or Ruby on Rails.</p>
<p>In fact, we&#8217;ve tried to have our cake and eat it, too. One of the most recent products that we released on the MarketWatch site is a product called MarketWatch Community. And that project was actually built using the MonoRail framework from Castle Project, which is essentially the Rails framework laid on top of C#. It&#8217;s an interesting way to get the benefit of having a team that understands tools like Visual Studio, and wants to have that very robust debugging environment, but wants to also take advantage of the flexibility that Rails offers. We have been able to have it both ways, if you will. </p>
<p><b>Scott:</b> I think there has been a resurgence in the last few years of dynamic languages or interpreted languages. That programming model seems to be making a pretty strong comeback. When I got started with Unix, way back in the day, they were used for a lot of administrative things, and there were things like Tcl, Tk and some of that kind of stuff. Then it went very compiled. Now it seems to be going back towards more of a dynamic language approach. </p>
<p><b>Jamie:</b> It seems we always have the same problems, but we want to solve them in different ways. We try to find performance and optimization in different parts of the stacks. People have been arguing for decades: could you make an interpreted environment that&#8217;s as fast as a compiled environment just by doing JIT-based compilation?</p>
<p>Maybe you can, maybe you can&#8217;t&#8211;I don&#8217;t know. That&#8217;s pretty academic. At the end of the day, you can&#8217;t get away from the fact that Facebook is written in PHP. There are very large sites like Twitter that handle an extreme number of requests per second, and it was built on Rails. I don&#8217;t think that you can state emphatically anymore that those environments do not scale. Clearly, they can, even if it takes a different set of tools.</p>
<p><b>Scott:</b> There was an article I read a couple of years ago that was called &#8220;The Hundred Year Language,&#8221; or something like that, where somebody was speculating about what we&#8217;re going to be using to program in 100 years from now. His conjecture was it will be just insanely inefficient, but it won&#8217;t really matter.</p>
<p>Because we would all write in assembly, if we wanted to wring the most performance out of it. Whether you have to buy 100 servers or 200 servers to run it, if it&#8217;s very agile in terms of being able to scale, then scale out and focus on supporting change rapidly and those kinds of things. That&#8217;s a lot more important than just wringing every drop out of the CPU. </p>
<p><b>Jamie:</b> One of my professors in college, John Riedl, and this was the early &#8217;90s so I think he was a little ahead of his time, would say that CPU, memory, bandwidth and disk all scale to infinity. The only thing that doesn&#8217;t is programmer time. At the end of the day, the most important thing is to understand your code and to be agile with your code. Hopefully, you can solve everything else through other innovations. </p>
<p><b>Sean:</b> One thing I&#8217;m curious about is usability, whether that&#8217;s from a management perspective, a development environment perspective, or an end user perspective. </p>
<p>When you&#8217;re thinking about acquisitions, is it less about scalability, and more about usability? Do you feel like the open source community is best positioned to drive things that have a high degree of usability and utility? Or do you feel that that&#8217;s the place where maybe they&#8217;re still striving, or they have limits that are inherent in the model?</p>
<p><b>Jamie:</b> Well, I guess it depends on what level you define that usability. Some of the most opaque projects around are open source projects. The code is all available, but unfortunately, you have to actually look at the code to figure out how to do anything with it. </p>
<p>And the result of any question is, &#8220;Well, you&#8217;re an idiot. Go look at the code.&#8221; That&#8217;s the extreme, but I think that&#8217;s one of the places that the open source model puts an additional obligation on somebody like a CTO, or a VP of engineering. </p>
<p>With proprietary software, you can somewhat base the viability of the product on its market acceptance, and you get the benefit of the fact that either these other people have bought it or they haven&#8217;t. And they have reasons why they bought it or they haven&#8217;t bought it. And bad products die naturally, out of lack of adoption. In the open source model, that&#8217;s not necessarily the case. You can have a product that is great and just doesn&#8217;t get picked up, or vice versa, so it requires a vetting that&#8217;s a little bit different. What&#8217;s the velocity of the project? How many people are truly committed to the project? How willing are you to commit to the project?</p>
<p>I couldn&#8217;t tell you how many corporate CTO types I&#8217;ve talked to that would love to use open-source projects, but would never give anything back. And, if that&#8217;s your position, I think you should question the ethics of using open source software and just being a parasite. If you&#8217;re not willing to give back your additions, which in some cases is the requirement, that&#8217;s not good.</p>
<p>The Mac also has an interesting dichotomy. I&#8217;m an old-school Mac guy, and I&#8217;ve recently returned to the platform when they switched to Intel. I have commented to many developers that they need to go get a Mac, because I think that, in many ways, the most innovative software that is being built right now is being built on the Mac. In some cases, it&#8217;s just wacky programs like WriteRoom. </p>
<p>A great example is QuickSilver. QuickSilver re-defines how you would interact with your computer. It&#8217;s clear to me that the most innovative development right now is happening on the Mac. Yet at the same time, the thing that bothers me about the Mac is that it is positioned as the computer for the rest of us. It&#8217;s portrayed as the people&#8217;s computer, when in fact it&#8217;s the most closed platform out there. You can&#8217;t build one. </p>
<p>I have always felt that the people&#8217;s computer needs to be a computer that the people can build. So there&#8217;s this dichotomy. But it&#8217;s a great ecosystem. Really great software is being built on Macs. I don&#8217;t know if I have ever seen a Rails screencast that was done on a PC. It&#8217;s very clear that Apple has successfully gained mindshare in that developer community, and they&#8217;ve gotten a lot of market share as a result.</p>
<p>I doubt you&#8217;ll ever see any volume of Macs in server rooms. But at the end of the day, it&#8217;s Unix underneath, and that&#8217;s what gives it the credibility to be able to do things like Rails development. </p>
<p><a name="intersections"></a></p>
<p><b>Sean:</b> It seems like there are a lot of open source acquisitions and even some open source projects that have a single voice at the maintainer level. Sun gets criticized for this a lot. You see a lot of companies put an open-source bumper sticker on their product.  The &#8220;free&#8221; version is open-source, but that&#8217;s just to get some market share and up-sell you to the paid version.  Do you see a trend in that direction, or do you see it going a different way? </p>
<p><b>Jamie:</b> I think one of the great examples there is Red Hat. I don&#8217;t want to say that they&#8217;re not fully imbued with open source in their blood. They are. But Red Hat Enterprise is as expensive as Windows. You come to the table with this idea that it&#8217;s Linux, so it must be cheaper, but in fact, no it&#8217;s not. That proves out that technology teams still need support. No matter how good the teams are or how deep your team is, from time to time, most teams are going to need support from an expert. So there&#8217;s a market to be made there in providing that support.</p>
<p>It also highlights that the traditional software business model is still valid. Microsoft&#8217;s and Oracle&#8217;s models are valid, and are a proven way to run a business. These companies are successful doing that. One other thing concerns me&#8211;Sun buying MySQL makes me wonder about the future direction of MySQL. Is MySQL going to continue to be a viable platform for the kinds of projects that it has been, or are they going to try to move it into a different market space?</p>
<p>And even MySQL, prior to Sun buying it, is an interesting example. I&#8217;ve noticed that SQLite has really made a lot of headway. And what&#8217;s happening, from my perspective at least, is that SQLite is picking up the space that MySQL has abandoned. MySQL decided to go more up-market and tackle bigger problems, and build in some features that are going after the Oracles and the Microsoft SQLs of the world. And that left a space for a much more lightweight SQL engine, so SQLite is filling that gap.</p>
<p>So you could argue that, maybe, that&#8217;s how it should work, that when a void is created, there&#8217;s a new open source project that will fill it. But I do think that this need for an enterprise version of these various packages is really just a reflection on the support needs of organizations.</p>
<p>The big thing that they sell all that stuff on is FUD. When there&#8217;s a security issue, you want the verified Red Hat patch, so you&#8217;re going to pay a thousand dollars a year per box, or something like that. </p>
<p><b>Scott:</b> When you look at the spectrum of open source projects, you&#8217;ve got some projects that are very, very old school. You&#8217;ve got Linux. You&#8217;ve got Apache Web Server and some of that stuff. They&#8217;ve been around for a long time. Extremely mature communities, high-quality code, and ubiquitous.</p>
<p>And then you&#8217;ve got kind of this next-generation of open source, to some degree&#8211;PHP, MySQL, Xen&#8211;a lot of these kind of companies, where the main driver behind the code base is a single company. And some of them are really what I&#8217;ve heard people call an aquarium model&#8211;like MySQL&#8211;where they don&#8217;t really have outside contributors. You can look at the code, but unless you work for MySQL, you&#8217;re probably not a contributor. A lot of Sun&#8217;s stuff is that way, too, although I think they&#8217;re actually working to get more outside contributors.</p>
<p>And so the question is, if you&#8217;re going to launch an open source project today, how do you take the world by storm with it? Take a look at Ubuntu building on top of Debian. It seems like the ones that catch fire are the ones where there is a revenue model and there is a business around the code base. What are your thoughts on that? </p>
<p><b>Jamie:</b> I think you&#8217;re right, although it&#8217;s hard to know what&#8217;s the cause and what&#8217;s the effect. Is it successful because there&#8217;s this revenue model, or is there this revenue model because of something else? Are there examples of very, very successful open source frameworks that are not built around a commercial endeavor? Could you say Rails? Rails is fundamentally the creation of 37signals, and they&#8217;re doing commercial things with it. </p>
<p><b>Scott:</b> Debian comes to mind; they&#8217;re not commercial. But then again, Debian caught fire to some degree when Ubuntu used it as their basis. There seems to be a bit of an uneasy relationship between the two communities. I think Debian is thankful for the attention that Ubuntu brought, but at the same time, the philosophies of the projects are different. Going the other direction, Canonical is very happy to say, &#8220;we&#8217;re built on this rock solid distribution. It&#8217;s a distro you can trust, but we also have got our own ideas.&#8221;</p>
<p>One of the things that I look at is that we are in this first wave, so you&#8217;ve got PHP, which has the chief sponsor Zend. You have Xen, whose chief sponsor is XenSource. Red Hat acquires JBoss. And now as these acquisitions start to happen, I wonder what the landscape will look like in five years. MySQL gets acquired by Sun, which has promised to open source everything, so they are moving toward an open source model. </p>
<p>XenSource gets acquired by Citrix, which doesn&#8217;t have an open source history and ethos. JBoss gets acquired by Red Hat, which has been sort of open from the beginning. Then maybe some of those companies get acquired by other companies. To me, it will be interesting to see how much of the projects will remain tied to the companies.</p>
<p>And what happens if Citrix ends up not being a good steward of Xen? Will they end up having bought nothing because there will be a fork that emerges? </p>
<p><b>Jamie:</b> The other challenge that presents itself in those cases is, how many open source contributors want to contribute to Citrix? Probably not a lot. Once you tie projects into these corporations, the culture may resist it and go off in a different direction. </p>
<p>I think it will be a true test over the course of the next two years to watch the legal pushes on the GPL and the various software agreements out there. Will companies be able to weasel themselves out of that, or just identify ways to break the requirements of the GPL or the Apache license?</p>
<p><b>Scott:</b> It&#8217;s interesting too that MySQL had a way of unlocking the GPL, so if you didn&#8217;t want to buy into the GPL but you wanted to redistribute MySQL with your proprietary code, you could just pay MySQL for that right.</p>
<p>On the other hand, if you wanted to adopt a GPL license yourself, then you could redistribute without paying MySQL the licensing fee. It&#8217;s a very kind of Wild West feeling. Everybody is trying a different model and a different way. I wonder if that&#8217;s the way it will stay, or whether things will consolidate over the course of a decade. How much does this disparity of business models and licensing for open source projects weigh on your decision making? </p>
<p><b>Jamie:</b> It weighs a lot. It concerns me greatly that we would be leveraging an open source framework that could potentially die or be acquired by the wrong company. Of course, the acquisition angle can happen no matter what. Products can be acquired by companies that you&#8217;re not really excited about, so I don&#8217;t think you&#8217;re protected from that in any model.</p>
<p>But the bigger issue, in my book, is the longevity and the velocity of the project. For example, I think Ruby on Rails has an incredible amount of velocity right now. There are a lot of people working on it, there&#8217;s a huge developer community out there. But I ask myself questions like, &#8220;Will I be able to find a Ruby developer in 10 years?&#8221; I don&#8217;t know. Will I be able to find a Ruby developer in a year? They&#8217;re not out there at the same volume as other skills.</p>
<p>So in that case, the fallback is, how hard is it to teach people? If it&#8217;s a fairly transferable thing and it&#8217;s something that an average engineer who knows a few languages could easily pick up, then I don&#8217;t think there&#8217;s a high amount of risk. But if it&#8217;s something very esoteric, I wouldn&#8217;t touch it. The other question to ask is, at what layer of your architecture does that system live?</p>
<p>There is a certain level of the architecture where I would be concerned about the ability of an open-source project to provide that level of always-on availability. I have to admit that that flies in the face of Linux itself, since in many cases, Linux is the underlying OS. But it&#8217;s at a little bit of a different layer in the stack. Knowing the complexity of that layer of development and knowing that it&#8217;s something that companies have a very hard time pulling off, I would be hesitant to use an open-source load balancer. </p>
<p><a name="security"></a></p>
<p><b>Sean:</b> Given that a lot of the work you do is with financial data, I can only imagine you&#8217;re an interesting target for a variety of folks out there who have malevolent intent. You&#8217;ve got two stacks&#8211;one that&#8217;s predominately a closed-source base and one that&#8217;s predominately an open-source base. What do you think about security issues around those platforms? Lately, this is a really, really hot topic. It always is, but it seems to ebb and flow. </p>
<p><b>Jamie:</b> Maybe I&#8217;m too much of a technologist to look for it, but whenever somebody says their platform is more secure than everybody else&#8217;s, to me it&#8217;s just a bunch of crap. There are really very few platforms that have truly iron-clad security. And by and large, those are the platforms that have been around for so long that they&#8217;ve just been exposed to many, many tests. When you think of the Unix kernels themselves, some of them have not seen significant modifications for a decade. And those are probably pretty secure.</p>
<p>But the reality is, if you&#8217;re regularly modifying and releasing code, you&#8217;re going to have security issues. So we protect all our systems with counter-measures in the front and make sure that we have compensating systems to defend against all of that. So I wouldn&#8217;t go one way or the other based on this promise of security. In fact, I think it&#8217;s a little slimy for Red Hat to use security as the push toward getting people to buy Red Hat Enterprise. I think you should be selling based on features and not fear, but I think it&#8217;s a wash between the various platforms. </p>
<p><b>Scott:</b> There are organizations that are building things on top of Linux or Apache, and they&#8217;re not being hacked right and left. Like you said, it may not have that much to do with the platform itself. It may have to do with the expertise of the organization like yours. You actually do the implementation and know how to build these perimeters of defense and that kind of stuff. Do you have an opinion on the notion of many eyeballs as it relates to security? </p>
<p><b>Jamie:</b> My general opinion is there is some difference between the number of the eyeballs and the quality of the eyeballs. Just having thousands of developers looking at your code certainly is not going to qualify it as secure unless they really are qualified to say it is secure. I do think there is a benefit to that. At the end of the day it gets to these bigger issues, like what is the right way to develop source?</p>
<p>We know that in medicine, for example, it is not good for one doctor to just do everything themselves. They should consult and review their findings with other doctors and come to some conclusions. We certainly would not let one engineer build a bridge without ever having another engineer look at the code. In technology, we have this belief that it is our intellectual capital. The source code is the family jewels, and that is where the intellectual capital is located.</p>
<p>I don&#8217;t tend to really follow that idea. I tend to believe more that the intellectual capital is the engineers that are writing that code. I have actually gone on record many times saying that your source code repository is not intellectual capital, it is a liability. If you look at it on your balance sheet, you should be putting it on your balance sheet as a liability. The only thing you have to offset it is the intellectual capital you have in your engineering team. All code is just waiting to break. It is just a matter of time. </p>
<p><a name="value"></a></p>
<p><b>Scott:</b> I have heard people say that code is like maintaining inventory. You have a warehouse full of stuff. </p>
<p><b>Jamie:</b> Every line has a cost. In the future, is the fact that your code is open source really going to matter much to your business model?</p>
<p>You could argue that companies like Red Hat and MySQL are showing that it&#8217;s not the case&#8211;that you can have a very normal business model and have an open source code base underneath that. </p>
<p><b>Sean:</b> Right. Someone would have to treat MySQL very, very badly for somebody else to do a meaningful fork, because nobody else is going to staff 200 developers on it. </p>
<p>I think Sun is going through that transition, where they&#8217;re realizing the value of Solaris is not necessarily the code itself. It is the fact that they have x-hundred number of Solaris developers that know how to work on it. I think Microsoft could do the same thing, if they open-sourced Windows.</p>
<p><b>Jamie:</b> You could argue, what would happen if Microsoft just open-sourced Windows tomorrow? </p>
<p><b>Sean:</b> I don&#8217;t think it would change very much at all. Again, nobody is going to fork that in a way that Dow Jones would trust this competitor to enhance, patch, and build the next version of Windows more than they would trust Microsoft to keep going. </p>
<p>The value is in people&#8217;s expertise around that code base. That seems to be fundamentally what a company is acquiring. As long as they are reasonably good stewards of the code, I don&#8217;t think companies have to worry about losing control of a project just because it is open source. </p>
<p><b>Jamie:</b> It is interesting, because I am a believer in this kind of long view of computer science. What I mean by that is it is still an incredibly immature field. It is much more art than it is science. We have not figured out the basics to our craft to the extent that people who build bridges have figured it out. We may find 20 years from now that the best practice is for all the code to just be out there and that is just how you do it.</p>
<p>Part of the benefit of that is the many eyeballs. Or part of the benefit is automated agents that are scanning code on the behalf of third parties, looking for things. It is hard to know exactly what that would be. In general, I do think this belief that your code is a huge asset you have to keep locked in a big safe is terminal. I do not think that is going to be the path forever. </p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2008/04/02/interview-with-jamie-thingelstad-cto-wall-street-journal-digital-network/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Interview with Justin Erenkrantz &#8211; President &#8211; Apache Software Foundation &#8211; Part II</title>
		<link>http://howsoftwareisbuilt.com/2008/01/31/interview-with-justin-erenkrantz-president-apache-software-foundation-part-ii/</link>
		<comments>http://howsoftwareisbuilt.com/2008/01/31/interview-with-justin-erenkrantz-president-apache-software-foundation-part-ii/#comments</comments>
		<pubDate>Thu, 31 Jan 2008 16:00:02 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Sean Campbell]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[collaboration]]></category>
		<category><![CDATA[community]]></category>
		<category><![CDATA[Justin Erenkrantz]]></category>
		<category><![CDATA[methodology]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/2008/01/31/interview-with-justin-erenkrantz-president-apache-software-foundation-part-ii/</guid>
		<description><![CDATA[Interviewers: Scott Swigart and Sean Campbell
Interviewee: Justin Erenkrantz
In this second part of a two-part interview with Justin Erenkrantz, we talked to him about:

How the Apache project ensures good collaboration.
The Apache Foundation&#8217;s philosophy of having no single person as the leader.
Apache&#8217;s security committee.
The process of removing someone from a position of responsibility within the Apache [...]]]></description>
			<content:encoded><![CDATA[<p><b>Interviewers:</b> <a href="http://howsoftwareisbuilt.com/about-scott-swigart/">Scott Swigart</a> and <a href="http://howsoftwareisbuilt.com/about-sean-campbell/">Sean Campbell</a></p>
<p><b>Interviewee:</b> <a href="http://howsoftwareisbuilt.com/about-justin-r-erenkrantz-president-of-the-apache-software-foundation/">Justin Erenkrantz</a></p>
<p>In this second part of a two-part interview with Justin Erenkrantz, we talked to him about:</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2008/01/31/interview-with-justin-erenkrantz-president-apache-software-foundation-part-ii/#collaboration">How the Apache project ensures good collaboration.</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/01/31/interview-with-justin-erenkrantz-president-apache-software-foundation-part-ii/#nosingleleader">The Apache Foundation&#8217;s philosophy of having no single person as the leader.</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/01/31/interview-with-justin-erenkrantz-president-apache-software-foundation-part-ii/#security">Apache&#8217;s security committee.</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/01/31/interview-with-justin-erenkrantz-president-apache-software-foundation-part-ii/#firing">The process of removing someone from a position of responsibility within the Apache Foundation.</a></li>
<li><a href="http://howsoftwareisbuilt.com/2008/01/31/interview-with-justin-erenkrantz-president-apache-software-foundation-part-ii/#partofapache">What would make someone want to be part of the Apache Foundation&#8217;s group of projects.</a></li>
</ul>
<p><span id="more-129"></span></p>
<p><strong><a name="collaboration"></a>Scott  Swigart</strong>: &nbsp;In this part of the interview, we wanted to dig into some of  the tenets, if you would call it that, of the Apache way. And the first of  those of course is collaborative software development.</p>
<p>So talk a little bit, if you would, about how Apache does collaborative software development. I&rsquo;m sure some things are very traditional and similar to the way that other open source projects might do it, and there are probably things that also might just be a little bit unique to Apache. So how do you try to ensure good collaboration?
</p>
<p><strong>Justin  Erenkrantz</strong>: &nbsp;So the main center point of all of the collaborative  software development that we do in Apache is the mailing list. That&#8217;s where  pretty much everything happens. As one of the guys mentioned before, the maxim  has been: &quot;If it didn&#8217;t happen on the mailing list, it didn&#8217;t  happen.&quot; And that generally tends to be true.</p>
<p>Basically, if you follow the mailing list for a particular project, then we&#8217;re expecting that you should know what&#8217;s going on within the project. Those are pretty much all public lists. Everybody can subscribe: people who are committers, people who are just users, people who just work on another project that may consume the Web server or maybe PHP modules or something like that.</p>
<p>It&#8217;s pretty much an open forum. Anybody can voice their ideas. Generally  though, just the way things work, most of the traffic tends to be from the core  developers who are active at that time. The traffic patterns of the list  change, and you get an idea of how much discussion. You generally see peaks and  valleys for the mailing list discussion. Things get really heated or things are  just chugging along and there&#8217;s not much traffic on the list.</p>
<p>So that&#8217;s where all the discussion should happen. And then of course there is the source code repository. And in Apache we have always had the thought of having a shared repository. A lot of projects now are starting to move to Git and Mercurial and all of these distributed version control systems, as opposed to centralized version control systems.</p>
<p>And that&#8217;s something that, within Apache, goes against what our thoughts are, because we want to all be agreeing on, &quot;This is the Apache version.&quot;</p>
<p><strong>Scott</strong>:  &nbsp;Sure, no problem.</p>
<p><a name="nosingleleader"></a>
<p><strong>Justin</strong>: &nbsp;So there&#8217;s no leader within the ASF. There&#8217;s no one person. If you look at, say, Linux, you say, &quot;This is Linus&#8217; tree,&quot; or this is Andrew&rsquo;s tree or this is Alan&#8217;s tree. Instead there is just the Apache tree. So there&#8217;s really no concept of, &quot;This is Justin&#8217;s tree,&quot; or someone else&#8217;s.</p>
<p><strong>Scott</strong>:  &nbsp;Is the feeling then that essentially there should just be intensive  discussion before something gets checked in? So when you&rsquo;ve reached a point of consensus,  it should get checked in rather than the way other projects work where  different things get checked into different people&#8217;s trees. And then when it&#8217;s time to build a release, you have to pull stuff from these different sources to  figure out what&#8217;s going to be in the release and what isn&#8217;t.</p>
<p><strong>Justin</strong>:  &nbsp;Right. So, generally, what tends to happen is there are two states a tree  can be in within Apache. One of them is &quot;commit then review.&quot; And  then there is &quot;review then commit.&quot; And you&#8217;ll see some projects  differ on particular trees.</p>
<p>For example for the HTTP Server, the trunk &#8209;&#8209; which is the main development line &#8209;&#8209; is usually under &quot;commit then review,&quot; which basically means that anybody who has commit access can feel free to go and make changes, and they basically have the benefit of the doubt that the change is going to be good. And there&#8217;s generally an implied threshold: if you&#8217;re going to make a really big change, go discuss it on the list. But if it&#8217;s a minor change or adding a little feature that&#8217;s probably not going to be controversial, go ahead and commit that to trunk.</p>
<p>But our stable releases, generally things that have already been released and that we&#8217;re doing maintenance on, are under the &quot;review then commit&quot; model. So that&#8217;s going to be RTC, and that means that any change to those trees has to be pre&#8209;approved. That means you need to get three binding votes from other committers to say, &quot;Yes, this is a good change,&quot; and no one has vetoed it. Technically you would use a file called STATUS, just a plain text file that some projects will use, that basically tracks all of the things that are under discussion to be back&#8209;ported or added into this tree.</p>
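<p>The review-then-commit rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the interview (three binding approvals from other committers, and no veto); the names <code>Vote</code> and <code>rtc_approved</code> are hypothetical and do not come from any Apache tooling or from the STATUS file format.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """One vote on a proposed change (hypothetical structure)."""
    voter: str
    value: int      # +1 approve, 0 abstain, -1 veto
    binding: bool   # True when the voter is a committer on the project


def rtc_approved(votes, author, required=3):
    """Return True when a change may go into a review-then-commit tree.

    Rule as described in the interview: at least three binding approvals
    from committers other than the author, and no binding veto.
    """
    # Only binding votes from people other than the author count.
    counted = [v for v in votes if v.binding and v.voter != author]
    approvals = sum(1 for v in counted if v.value == 1)
    vetoed = any(v.value == -1 for v in counted)
    return approvals >= required and not vetoed


if __name__ == "__main__":
    votes = [
        Vote("alice", 1, True),
        Vote("bob", 1, True),
        Vote("carol", 1, True),
        Vote("dave", -1, False),  # non-binding, so it does not act as a veto
    ]
    print(rtc_approved(votes, author="erin"))
```

<p>Under &quot;commit then review,&quot; by contrast, no such gate runs before the commit; the same kind of objection would be raised on the mailing list afterwards.</p>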
<p><strong>Scott</strong>:  &nbsp;Got you. Like any open source project, and this one is democratic, there  is a vote to commit. Well, there is a vote if it is a release product. If it&#8217;s  not a release product and people later decide it wasn&#8217;t a good thing to do, it  can be reverted out.</p>
<p>And you mentioned before that part of your governance is that out of these 60&#8209;something  maintainers, one of them can essentially veto a change if they want to. So talk a little bit about how conflict resolution usually works. For example, when you  have people who are&mdash;and I understand it doesn&#8217;t happen often&mdash;adamant one way  and another person who&#8217;s adamant a different way. How do you see it play out  that those eventually get resolved and things move forward?</p>
<p><strong>Justin</strong>:  &nbsp;Generally what happens is, like before, the veto just tends to be a last  resort. So let&#8217;s just say that someone makes a change to the trunk and I don&#8217;t  like it. I might just say, &quot;You know, we should talk about this  change,&quot; or, &quot;Here&#8217;s the problem with this.&quot; Generally, the  person who has committed that says, &quot;Oh, yeah. Here&#8217;s why I did it this  way,&quot; and comes up with an explanation, and then through a process on the  mailing list, figures out and resolves those conflicts.</p>
<p>That&#8217;s what you tend to see happen. And it all tends to be, for the most part, &quot;I&#8217;m sorry, I forgot about that particular corner case or that.&quot; And everything tends to get resolved very naturally.</p>
<p>The veto tends to be when someone says, &quot;No. I have to do it this  way,&quot; and someone else says, &quot;No, that&#8217;s wrong.&quot; That&#8217;s  basically when the community is at risk of breaking down. But generally it  doesn&#8217;t get to that point. Everybody is, &quot;We&#8217;re all going to go in this  direction; this is the right direction for us to go in. We want to add this  feature. Let&#8217;s work through whatever issue you may have about this particular  commit.&quot;</p>
<p><strong>Scott</strong>: &nbsp;One last thing before we move on to the next point. Other than the fact that nobody has their own private tree, is there anything else about the collaborative nature that you think is somewhat unique to Apache?</p>
<p><strong>Justin</strong>: &nbsp;We tend to do a roll call before the release. So at this point there&#8217;s been a review of everything. But then let&#8217;s just say that I want to release the next Apache 2.4 or whatever. Then basically what I will do as a release manager is say, &quot;OK, I&#8217;m marking this as 2.4,&quot; and I produce all the artifacts. I produce the tarballs, [inaudible] generated files, whatever. And then I send it to the list and say, &quot;Hey, is everybody happy with that?&quot; And then that goes to a voting process.</p>
<p>There&#8217;s review at all stages, but there&#8217;s also a review at the point where you do a release, and you have to get at least three people to approve a release. One thing that&#8217;s different is that it&#8217;s not possible to veto a release. It&#8217;s strictly majority rule on the release.</p>
<p><strong>Scott</strong>:  &nbsp;So in other words, there is veto capability on the individual check&#8209;ins. But,  as you said, when it comes down to doing a release it is a majority rule vote.</p>
<p><strong>Justin</strong>:  &nbsp;Yes. You&#8217;ll tend to see someone voting &#8216;no&#8217; on a release if it doesn&#8217;t  work on Linux or something, and generally you&#8217;ll see it get stalled and it then  gets fixed. But there have been a couple of cases when we did the release even  though we knew that it didn&#8217;t work on a particular platform, and so we made a  release note. But the veto does not apply to releases.</p>
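<p>The release vote differs from the per-change vote in that a single &#8216;no&#8217; cannot block it. A minimal sketch of that rule in Python, assuming only what is said above (at least three approvals, and otherwise majority rule); the function name <code>release_passes</code> is hypothetical:</p>

```python
def release_passes(plus_ones, minus_ones, minimum_approvals=3):
    """Majority-rule release vote as described in the interview.

    There is no veto: a release passes when it has at least three
    approvals and more approvals than objections.
    """
    return plus_ones >= minimum_approvals and plus_ones > minus_ones


if __name__ == "__main__":
    # Four approvals and one objection: the objection is outvoted, not a veto.
    print(release_passes(4, 1))
    # Two approvals and no objections: not enough approvals yet.
    print(release_passes(2, 0))
```

<p>In practice, as Justin notes, a &#8216;no&#8217; vote usually stalls the release until the problem is fixed, even though the rule would allow the release to go ahead.</p>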
<p><strong>Scott</strong>:  &nbsp;Interesting. So moving on to licensing, one of the key things with Apache  is the commercial&#8209;friendly standard license. Talk a little bit about what that  means.</p>
<p><strong>Justin</strong>: &nbsp;Basically within Apache, we like to have a big tent where everybody can come in and play with us. We think that the community that we developed within Apache is going to be the motivation for you to stay involved. For example, within the HTTP Server community, you have all these experts in Web servers, and if you&#8217;re part of this community, you get the benefits from them. And so there&#8217;s an incentive to play within the community, so that I don&#8217;t have to hire five guys and do a whole team; I can leverage the other people within the community.</p>
<p>But in turn, all those people who are part of the community say, &quot;Whatever  you want to do with the code is fine. We&#8217;re not going to get hung up if you  make it a commercial product or an open source project. We created it and it  served our needs, and if it serves your needs, that&#8217;s great.&quot;</p>
<p><strong>Scott</strong>:  &nbsp;What springs to mind are things like GPL. Am I seeing it right in that  Apache is more commercial&#8209;friendly than GPL, V2 or V3?</p>
<p><strong>Justin</strong>:  &nbsp;There are companies built around GPL licensed software. But what we tend  to see are two classes of GPL products: In one, there is a real community,  maybe within Linux, and they&#8217;re all happy to make all their changes available  to everybody, and that&#8217;s a very good community. The other community you tend to  see has a single stakeholder that has a prevailing interest in the GPL product,  and they basically have an unfair share.</p>
<p>You can see this with some of the GPL projects that require copyright assignment. In order to participate, you license your changes under the GPL and you have to give a copyright assignment to the principal stakeholder. They are then free to release a commercial closed source product based on your work, because they have the copyright or whatever legal mechanism. There is an imbalance there when you look at those two.</p>
<p>Generally when you think about the GPL, you&#8217;re divided, with broad strokes, into those two groups. One is a real community, but in the other group it is clear that one stakeholder has a dominant role, and that&#8217;s one thing within Apache we don&#8217;t like to see. As our projects go through incubation and get added to the Foundation, one of the things we do is make sure that the community is diverse; that is, there is not a single dominant stakeholder that can direct the project in any untoward way.</p>
<p><strong>Scott</strong>:  &nbsp;Right. I can make changes to it, distribute it as part of a commercial  product, and I would not be required to contribute those changes back to  Apache. But under a GPL license, any modifications made require you to make the  source code available. You cannot have closed source proprietary extensions or  modifications of it. Any modifications you make, you have to open source and it  has to be under the same license. So that&#8217;s the key differentiator?</p>
<p><strong>Justin</strong>: &nbsp;Yeah, our philosophy is that the community is what&#8217;s going to bring you in and keep you there, and that&#8217;s why you&#8217;re going to stick around. If you release a commercial product around one of our projects, it&#8217;s going to be to your benefit to basically keep your commercial project as close as possible to whatever we&#8217;re releasing.</p>
<p><strong>Scott</strong>:  &nbsp;Right.</p>
<p><strong>Justin</strong>:  &nbsp;You can pick up all the bug fixes and whatever improvements; you get  those as a free rider. But in a sense, you are contributing whatever changes  you&#8217;re making voluntarily back into the greater community.</p>
<p><strong>Scott</strong>: &nbsp;Yeah, that makes sense. Do you have examples of companies that have used different Apache projects because of the commercial&#8209;friendly licensing, where they probably wouldn&#8217;t have if the license weren&#8217;t so commercial&#8209;friendly? Is that a topic that comes up?</p>
<p><strong>Justin</strong>:  &nbsp;Absolutely. You see companies like IBM that release their versions of the  Apache Web server or Geronimo under different names, but in the core, they are  Apache projects. We&#8217;ll see that even with smaller companies such as Covalent  that does commercial support. Basically, they added in a couple of extra things  that provide support to their users.</p>
<p>One thing that the Apache community really does not focus on providing is 24/7 support. Covalent goes in with their business model and provides the support and training around these particular Apache projects. You will see businesses like JBoss using Tomcat. So you see all of these commercial companies using things that are Apache projects under the covers.</p>
<p><strong>Scott</strong>:  &nbsp;Right, so that freedom has led people to be a lot more creative about how  they structure their business. They have a lot more options in how they  participate with the different Apache projects, how they contribute back and  how they structure their own products. What is the relationship between the  Apache Software Foundation and the Free Software Foundation? Is there any or  are those fairly separate endeavors?</p>
<p><strong>Justin</strong>:  &nbsp;There is no formal relationship. I&#8217;ve never had a conversation with  Richard Stallman, but I&#8217;ve had conversations with Bradley Kuhn who used to be,  at that time, the Executive Director of the Free Software Foundation. So  there&#8217;s an informal get&#8209;together of foundations to compare notes, and that&#8217;s  generally a very good thing. How do we keep our ears open to what Mozilla&#8217;s  doing? If they&#8217;re doing this new technique, then we can give them a call and  ask, &quot;What are you doing? We&#8217;d like to follow on it.&quot;</p>
<p>One thing we&#8217;ve been doing with Eclipse is a joint conference. There&#8217;s going to be a conference in Asia that&#8217;s now scheduled for 2008. So it&#8217;s a way for us to get the communities talking to each other.</p>
<p><strong>Scott</strong>:  &nbsp;Sure. So basically, if I can summarize, you guys get together around  joint events and joint things where it makes sense. You share information  because you&#8217;re all part of the open source community. Philosophically you may  agree to disagree in terms of the details of licensing, commercial friendly,  and that kind of stuff.</p>
<p><strong>Justin</strong>: &nbsp;Well and you ought to be using more&#8230;projects have different circumstances. Apache&#8209;&#8209;we have a very vocal membership and we have this and you compare that to, let&#8217;s say, Mozilla, which has a completely different governance structure. But if you look at Brian Behlendorf, he&#8217;s been on the Mozilla board for a very long time and he was one of the founders of Apache.</p>
<p><strong>Scott</strong>:  &nbsp;Gotcha.</p>
<p><strong>Justin</strong>: &nbsp;So you have this intermingling of the communities. So someone like Brian Behlendorf, who was brought into Mozilla, can say, &quot;Here&#8217;s how we did things within Apache,&quot; and he can share the expertise and experience that he got with the other people within Mozilla.</p>
<p><strong>Scott</strong>:  &nbsp;Gotcha, gotcha.</p>
<p><a name="security"></a>
<p><strong>Sean</strong>: &nbsp;Let me refer to one of the other tenets of the Apache way. I&#8217;m curious about this just because I was thinking about the conversation we were having. To state the obvious, you&rsquo;re focused on producing software. I notice that you have a security committee that, if I&#8217;m reading it right and for lack of a better phrase, provides a service to all of the projects that are part of the foundation. And it looks like those projects can turn to the security committee and ask security related questions or possibly look for guidance from them, regardless of whether they&#8217;re Tomcat or some other piece of the foundation? Is that accurate or is that not accurate?</p>
<p><strong>Justin</strong>: &nbsp;It&#8217;s somewhat so. We have a security team, which I believe is currently a Board committee. But basically what they&#8217;re responsible for doing is ensuring our security@apache.org mailing address gets responded to. And these are generally people who are very security savvy.</p>
<p>But there tend to be some people from Tomcat, from the HTTP Web server, from the higher profile projects on this internal mailing list. So, to give you an example, let&#8217;s say there was a security vulnerability in Derby. They could pass it to those reps and say, &quot;Hey, we have a security vulnerability. What do we do?&quot; And so there&#8217;s expertise and, &quot;OK. Here&#8217;s what you do. Here is your administrative contact. Make sure your mailing list&#8230; Go talk to&#8230;&quot; Kind of a shared resource. But they&#8217;re not going to focus on producing the fixes for the project; it&#8217;ll be, &quot;OK, here&#8217;s the responsible disclosure policy and an attribution policy.&quot; So that&#8217;s generally what their role is.</p>
<p><strong>Sean</strong>: &nbsp;Are they providing fairly prescriptive guidance but just not down to the &lsquo;I&rsquo;m going to change your code&rsquo; level because they don&#8217;t know the individual projects at that level? Would that group essentially be the center for discussions around a security development lifecycle for the Apache Software Foundation, and an attempt to pull those best practices together?</p>
<p>  <strong>Justin</strong>:  &nbsp;Yeah. I think basically our project concern&#8230;we have something. What&#8217;s  the process? What do we do? And that&#8217;s as an advisory role. OK, here&#8217;s the  process and the procedures to follow.</p>
<p><strong>Sean</strong>:  &nbsp;But it&#8217;s purely advisory, right? I mean one of the projects where they  feel that their code is &quot;secure enough&quot;, or they&#8217;ve looked at it long  enough or they feel that they&#8217;ve handled it. &nbsp;Then the advisory committee comes back and  says, &quot;Well, we really think you could take a look at this again.&quot;  That&#8217;s where the communication would stop and it would be up to the individual  project whether they want to take that under advisement or not.</p>
<p><strong>Justin</strong>: &nbsp;Yeah. I think so. I think, with regard to security, you can maybe at that point write back to your original reporter and say, &quot;We looked at it and we don&#8217;t feel there&#8217;s a security vulnerability here.&quot; That may be&#8230;that has happened where we look at things and we say, &quot;No. This is not an issue.&quot; But generally, really the security team is more reactive. So they&#8217;re not proactively performing security analysis on our code or anything like that.</p>
<p><strong>Scott</strong>: &nbsp;I just want to clarify. It sounds like they have a little bit of an all&#8209;up policy for somebody sending an email to that address; somebody reports what they perceive as a vulnerability or reports some kind of issue. They do a little air traffic control. They route it to the project.</p>
<p><strong>Justin</strong>:  &nbsp;Exactly, exactly.</p>
<p><strong>Scott</strong>:  &nbsp;There&#8217;s a general process that the different projects would follow. Basically  that sort of happens and that&#8217;s one of the things that the security group  advises the other projects on. Well this is generally a &#8216;way we do it&rsquo; sort of  thing.</p>
<p><strong>Sean</strong>:  &nbsp;Let&rsquo;s go to a different piece of the Apache way, the emphasis on a  technical&#8209;based interaction. One of the things that I find fascinating about  open&#8209;source projects is the way that they exorcise community members that maybe  aren&#8217;t following those rules.</p>
<p>  [laughter]</p>
<p><a name="firing"></a>
<p><strong>Sean</strong>:  &nbsp;Because we got some interesting responses when we talked to people about  it. It&#8217;s like, fine. We understand that everybody is an adult. We understand  that everybody will try to handle themselves in an appropriate manner. But if  anybody&#8217;s worked on a software project they know that not everybody does,  right? So&#8230;</p>
<p>  [laughter]</p>
<p><strong>Sean</strong>:  &nbsp;Considering that you can learn a lot from a story&hellip;if you&#8217;ve got a story  or two about either fully pulling the ejection handle on somebody that would be  interesting to hear? Or a scenario where it just took serious counseling to get  somebody pointed in the right direction.</p>
<p>  I would be curious to see how you guys handle that. You obviously have  procedures in place. But at times you have to go beyond those with some amount  of intervention and I&#8217;m just curious how that played out.</p>
<p><strong>Justin</strong>:  &nbsp;Yeah. So there&#8217;s one case that comes into mind but I&#8217;m trying to reserve  the right to figure out how much of this has been disclosed.</p>
<p><strong>Sean</strong>: &nbsp;Yeah, sure.</p>
<p><strong>Justin</strong>: &nbsp;So I&#8217;ll tell the story and then leave people&#8217;s names out of it but I have to go back and see how much of this has been told. So recently within one of our larger, well&#8209;known projects, there was a bout, to use the word, between two committers. And they basically ended up vetoing each other on everything. It was like no, no, no, and tempers flared. And it got into a very unhealthy situation. They&#8217;re two very strong&#8209;willed individuals.</p>
<p>And basically what happened is the PMC responsible for this particular activity had to step in and say, &quot;OK; we need to come up with some policy or come up with some new rules to get everybody back to ground zero.&quot; So it wasn&#8217;t a matter of ejecting anybody. It was never really an option that&#8230;basically what happened was they said, &quot;Here are the ground rules. Here is&#8230;if you&#8217;re going to go do this you have to follow this set of rules. If you&#8217;re going to go do that, you&#8217;ve got to go follow this set of rules.&quot; Basically the community agreed to say we&#8217;re going to go and we&#8217;re going to voluntarily adopt these rules. But as a settlement process; lots of flames and a lot of innocent people getting&#8230;</p>
<p><strong>Sean</strong>:  &nbsp;Right.</p>
<p><strong>Justin</strong>: &nbsp;&#8230;accused of things in that process. The other one that&#8217;s in our history was a project called Avalon. And this is one that&#8217;s definitely well known, so there won&#8217;t be any issue about this one. Avalon was a container framework. And what happened was two individuals just did not get along. And they were ending up in what we would call a commit war, where they would basically be reverting each other&#8217;s changes as soon as they came in. And it was just this whole really poisonous environment, and in that case the community wasn&#8217;t able to deal with it. And so basically what happened there was they fractured.</p>
<p>And so one of the individuals went off and took his code and, you know, we wished him luck and said have a nice life, and he went off. And then some other people came along and they did a project called Excalibur, which was basically the remnants of this whole Avalon project. If you look at the mailing list traffic, you will just see this giant peak and then this sudden nothingness, because the project got shut down because no one could play well with each other.</p>
<p><strong>Sean</strong>:  &nbsp;Right.</p>
<p><strong>Justin</strong>: &nbsp;By the way, what is interesting is that some of the veterans of Avalon really got involved in the greater Apache community, including one of our new directors this year. He was in the middle of all of this, but during that whole experience, he was one of the people trying to keep things level, and he stuck around, and this year he got elected to the board of directors.</p>
<p><strong>Sean</strong>:  &nbsp;Since Apache is a large foundation, I&rsquo;m curious about a different point.  If you were a closed source company and you feel you are short on testers or short  of security experts, you simply go to HR and put out a requisition and  hopefully some good folks come back.</p>
<p>  So has the foundation had to answer requests from projects where they say,  &quot;OK, look, we think we are geniuses on nine out of 10 of the things we  need to do, but this one thing we really need people to help.&quot; How does  the foundation help with staffing up a project in this type of case?</p>
<p><strong>Justin</strong>:  &nbsp;It is more bottom&#8209;up than that I think in the sense of the culture that  we have. You will see some overlap between the HTTP Server committers and the  Tomcat committers because you will see that &#8209;&#8209; sometimes the Tomcat committers  came over and they say, &quot;Hey, we need some help with HTTP.&quot; Well  lucky us, we have some of the world&#8217;s foremost experts in HTTP server, and that  basically got them within the communities.</p>
<p>So I think our community&#8217;s diverse enough that you can say, &quot;Hey, I am looking for a person who knows SQL.&quot; OK, I am going to go on the Derby mailing list and say, &quot;Hey, I need some help with SQL,&quot; or for something for build systems, I&#8217;ll go to the Ant community. And there will be some of the people who are the foremost experts in that. That&#8217;s actually one benefit of having such a large diverse community: you can pretty much find someone who understands something about something somewhere within the Foundation.</p>
<p><strong>Sean</strong>:  &nbsp; I figure that&#8217;s one of the advantages. You&#8217;ve got a massive talent base  but at the same time it is segmented into the project, so the Foundation can  help orient a little bit of that knowledge of where the talent base is.</p>
<p><strong>Justin</strong>: &nbsp;Yeah, and as I said, if you look at the social graph, it is a weird mix of people who are committers on Cocoon, maybe committers on Mina, then maybe on Gump. So you will see that the developers themselves, the committers, aren&#8217;t necessarily staying in their silo. There are some who do, but there are also just as many who will go to other communities and work on other projects.</p>
<p><strong>Sean</strong>: &nbsp;Well, I have a couple more questions but Scott can go ahead. I want to give you an opportunity to jump back in.</p>
<p><strong>Scott</strong>:  &nbsp;So talk a little bit about standards because one of the other tenets or  pieces of philosophy is faithful implementation of standards. Talk a little bit  about what that means for Apache?</p>
<p><strong>Justin</strong>:  &nbsp;Right. So initially, when we started off, there was HTTP and there were the IETF standards. When the editor of the HTTP standard is one of the people behind the code base, the knowledge goes both ways, in the sense that we are able to influence the standards process. But at the same time, some of us are involved with the standards process and feed those changes back into the development, supporting those standards.</p>
<p>  But since then, you see our participation within some of the key web server specifications, and then, probably most importantly, our participation in the Java Community Process. As you know, we have so many Java projects, and so many of our projects are implementing some JSR specification. Our involvement within the JCP has been to ensure that we can implement the specifications, and our projects have representation on the expert groups that devise these standards.</p>
<p><strong>Scott</strong>:  &nbsp;Right. So it isn&#8217;t just like the standard shows up and then you figure out how to implement it. There is this two&#8209;way street where you are shaping the standard because you guys have such a big, real world implementation. And meanwhile, the standard is telling you guys what to do because they have their own stakeholders, but they are considering your recommendations and you want to conform to what they eventually approve.</p>
<p><strong>Justin</strong>:  &nbsp;Correct.</p>
<p><strong>Scott</strong>:  &nbsp;Yeah, I don&#8217;t know. I don&#8217;t have anything else specific. Sean, do you?</p>
<p><strong>Sean</strong>:  &nbsp;No, not right now. I think this led us into some new stuff and from our  end, we really enjoyed chatting about it. Justin, do you have anything you&#8217;d  want to add or things you think we should address overall based on the theme of  where we were going?</p>
<p><strong>Justin</strong>:  &nbsp;No, I mean you did a good job of asking me the questions.</p>
<p><a name="partofapache"></a></p>
<p><strong>Scott</strong>:  &nbsp;I guess there is one final question. When somebody is starting or has an open source project, they can pick the license they want, they can do what they want for their community, things like that. What makes people want to be, in your mind, an Apache project? What is the draw, I guess?</p>
<p><strong>Justin</strong>:  &nbsp;Well, I think from my perspective, the draw is we handle a lot of the  mundane governing structures and all this and the infrastructure and the  licenses. And that is all essentially managed and I think you see a lot of open  source projects like, &quot;Oh we need to go get a foundation.&quot;</p>
<p>  And that&#8217;s a lot of overhead. There is a lot of overhead to create a corporation, handle donations, and handle essential infrastructure, and our goal at the Foundation, at the broadest level, is to provide that support. So these people who are working on all these different projects only have to worry about doing code. They don&#8217;t have to worry about, &quot;Oh, I need to go and buy a new server; how are we going to deal with this donation or this tax policy?&quot; We try to deal with all of that. So I think there is a critical mass that works in our favor.</p>
<p><strong>Scott</strong>:  &nbsp;Right, right and let them focus on the piece of it that they really  enjoy, which is whatever this project is that they have come up with. They have  a passion, like you said, for not having to worry about all the housekeeping  stuff.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2008/01/31/interview-with-justin-erenkrantz-president-apache-software-foundation-part-ii/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interview with David Campbell &#8211; Technical Fellow &#8211; Microsoft</title>
		<link>http://howsoftwareisbuilt.com/2008/01/04/interview-with-dave-campbell-technical-fellow-microsoft/</link>
		<comments>http://howsoftwareisbuilt.com/2008/01/04/interview-with-dave-campbell-technical-fellow-microsoft/#comments</comments>
		<pubDate>Fri, 04 Jan 2008 01:03:26 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Sean Campbell]]></category>
		<category><![CDATA[databases]]></category>
		<category><![CDATA[David Campbell]]></category>
		<category><![CDATA[enterprise]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[security development lifecycle]]></category>
		<category><![CDATA[sql server]]></category>
		<category><![CDATA[strategy]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/2007/12/19/interview-with-dave-campbell-technical-fellow-microsoft/</guid>
		<description><![CDATA[Interviewers: Scott Swigart and Sean Campbell
Interviewee: David Campbell
In this interview with David Campbell we talked to him about:

His background with the SQL Server team and Microsoft
What about planning, implementation, testing, delivery, and servicing do you think are unique to a product as high profile and critical as SQL Server.
The Security Development Lifecycle and SQL Server.
Usability [...]]]></description>
			<content:encoded><![CDATA[<p><b>Interviewers:</b> <a href="http://howsoftwareisbuilt.com/about-scott-swigart/">Scott Swigart</a> and <a href="http://howsoftwareisbuilt.com/about-sean-campbell/">Sean Campbell</a></p>
<p><b>Interviewee:</b> <a href="http://howsoftwareisbuilt.com/david-campbell-technical-fellow-microsoft/">David Campbell</a></p>
<p>In this interview with David Campbell we talked to him about:</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2007/12/19/interview-with-dave-campbell-technical-fellow-microsoft/#DavidBio">His background with the SQL Server team and Microsoft</a></li>
<li><a href="http://howsoftwareisbuilt.com/2007/12/19/interview-with-dave-campbell-technical-fellow-microsoft/#SQLUnique">What about planning, implementation, testing, delivery, and servicing do you think are unique to a product as high profile and critical as SQL Server.</a></li>
<li><a href="http://howsoftwareisbuilt.com/2007/12/19/interview-with-dave-campbell-technical-fellow-microsoft/#SDL">The Security Development Lifecycle and SQL Server.</a></li>
<li><a href="http://howsoftwareisbuilt.com/2007/12/19/interview-with-dave-campbell-technical-fellow-microsoft/#Usability">Usability and SQL Server&#8217;s Development.</a></li>
<li><a href="http://howsoftwareisbuilt.com/2007/12/19/interview-with-dave-campbell-technical-fellow-microsoft/#OpenSource">His experience with OpenSource databases.</a></li>
<li><a href="http://howsoftwareisbuilt.com/2007/12/19/interview-with-dave-campbell-technical-fellow-microsoft/#OpenSourceDB">How much the experiences of MySQL and other open source databases have affected SQL Server&#8217;s development.</a></li>
</ul>
<p><span id="more-120"></span></p>
<p><b>Sean:  </b>Tell us a bit about your role at Microsoft    with the SQL Server Team and as a Microsoft Fellow.</p>
<p><b><a name="DavidBio"></a>David: </b>I have been working on SQL Server for roughly    13 years in a variety of roles but my passions are in systems and product    development. Currently, I am heading up a team we call “Strategy,    Infrastructure and Architecture” – SIA for short but I’ve worked on SQL Server    as an SDE (Microsoft lingo: SDE = Software Development Engineer), Development    Manager, Product Unit Manager and General Manager overseeing a number of    component groups within the product.</p>
<p><b><a name="SQLUnique"></a>Sean: </b>SQL is a product upon which enterprises bet    their business.  So much of what’s mission critical depends on SQL Server.     What about planning, implementation, testing, delivery, and servicing do you    think are unique to a product as high profile and critical as SQL Server?</p>
<p><b>David: </b>In the business world, enterprise database    products define “mission critical”. A crash in many line of business    applications may result in a disruption of service until the server or    application is restarted but a severe crash in a database server resulting in    data loss can cripple a business. I learned this lesson the hard way before    coming to Microsoft when I worked on database systems at Digital Equipment    Corporation (DEC). One day I received a page from our support staff and found    that a product defect had resulted in a production line shutdown for a major    semiconductor manufacturer. They had to send folks home; some of the physical    equipment suffered damage as a result of stopping the production with partially    completed batches of material and they reminded me that they were losing    roughly a million dollars an hour in business while things were stopped. </p>
<p>Getting to your question might require a little background    on SQL Server. Some folks might know that Microsoft hired a bunch of people    from the database industry in the early to mid 1990’s to work on SQL Server as    Microsoft tried to become a player in enterprise database systems. We acquired    a source license for Sybase 4.2 and shipped two versions, SQL Server 6.0 and    SQL Server 6.5, on that architecture. We hired a number of query processing experts    and they started building a completely new query processor from a clean sheet    of paper. I worked on the storage engine team and SQL Server was getting beaten    up in the market around this time since we didn’t support row level locking  as    the Sybase architecture we inherited only supported page locking. I was    responsible for the design of the row level locking feature for SQL Server 7.0    and the more I dug into the Sybase architecture the more challenging the design    became since the entire Sybase transaction and recovery system was predicated    on page level locking and it was very difficult to do a clean row locking    design without a number of major compromises. After a number of sleepless    nights and arduous design meetings we ultimately came to the conclusion that we    would have to rewrite much of the Sybase storage engine to do this feature    correctly. As part of this major architecture shift we made a decision to    change the on disk format for SQL Server 7.0. This meant customers would have    to unload and reload all their data as part of the upgrade to the new release.</p>
<p>So, now we have the context to start answering your original question. We started what would become SQL Server 7.0 with 2 strikes against us: It was really going to be a V1 product with a V7.0 name and we would require every customer to completely unload and reload their data to migrate to the new product. We knew that poor quality would mean strike three. Since we had a number of people that had experience building enterprise database systems we weren’t lacking in design knowledge so the real success factors for the release came down to doing a great job of architecture, implementation and validation. On the validation front we did a number of interesting things that could probably fill a book but I’ll highlight a couple of them here. </p>
<p>Given that we were going to make everyone migrate their data we knew that we needed to make the data migration process both high-performance and rock solid. We engaged our sales force to ask our customers to help us by sharing their databases so we could convert them to the new format in the lab. We called this the 1,000 DB challenge and we wanted to get 1,000 real customer databases of all different sizes and complexity so we could run them through a database conversion lab that we created. The other interesting thing we did in this space was to write a playback system. This consisted of a capture utility that would log the customer activity on a production database and then a playback utility that would allow us to play back the actual customer workload against the customer database in our lab. We could play the workload back in “real time” which included the dwell and think time between queries, or “compressed” where we just jammed the queries into the server as fast as we could. In this mode we could stuff a day’s worth of work into the system in an hour or so and really stress the server. We’d ask customers to take a backup of the database that corresponded to the workload and to capture the actual work using the capture utility and then send us the database and capture log. This data allowed us to test the conversion of the database from the old version of SQL Server to the new version and also let us test the new version of SQL Server by replaying the actual customer workload on the new system. Later on we started to capture query performance of replay on the old vs. new version of the product to find performance bugs. </p>
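<p>As a rough illustration of the two playback modes described above, here is a minimal Python sketch. It is not the actual capture or playback utility; the <code>CapturedQuery</code> record, the <code>replay</code> function, and its parameters are hypothetical names chosen for this example. It assumes the capture log stores each statement with its offset in seconds from the start of the capture.</p>

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CapturedQuery:
    offset_s: float  # seconds since the start of the capture (assumed log format)
    sql: str


def replay(log: List[CapturedQuery],
           execute: Callable[[str], None],
           sleep: Callable[[float], None],
           compressed: bool = False) -> float:
    """Replay a captured workload in order; return the total time spent waiting."""
    waited = 0.0
    previous = 0.0
    for entry in sorted(log, key=lambda q: q.offset_s):
        if not compressed:
            # "Real time" mode: reproduce the dwell/think time between queries.
            gap = entry.offset_s - previous
            sleep(gap)
            waited += gap
        # "Compressed" mode skips the wait and issues queries back to back.
        previous = entry.offset_s
        execute(entry.sql)
    return waited
```

<p>In a real run, <code>sleep</code> would be <code>time.sleep</code> and <code>execute</code> would send the statement to the server under test; passing them in as parameters keeps the sketch easy to exercise without a database.</p>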
<p>The next interesting thing we did on the validation front came from Don Slutz, a long time database veteran that was working in Microsoft Research at the time. He wrote a program he called RAGS that was really a model based testing system that used the SQL language grammar and the schema of an existing database to generate bizarre, but syntactically legal, SQL statements and feed them into the query processor. Basically, he married the state domain from an existing database schema with that specified by the SQL grammar and was able to probe all the dark corners of the search space programmatically. He wrote an MSR technical report that is available on the Microsoft Research web site. The way this played out was pretty interesting. At first it was pretty easy for Don to crash the query processor. So he filed a bunch of bugs, the developers fixed a bunch of bugs then Don ran it again, etc. After a couple of iterations of this cycle RAGS needed to generate some pretty ugly queries to crash things. You’d wind up with these 5 page SQL queries that looked like random gibberish but were legal SQL syntax and the query processing team would spend a bunch of time figuring out what the query was supposed to do and then figure out why it crashed the system. Later on Don used RAGS against different versions of SQL Server and other database products generating queries over equivalent schemas and comparing both the results and the performance of the different products across a wide range of complex queries.</p>
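<p>The idea of generating syntactically legal statements from a grammar plus a schema can be sketched in a few lines of Python. This is not RAGS itself: the tiny expression grammar, the <code>SCHEMA</code> dictionary, and the function names are all assumptions made for illustration, and SQLite (from Python's standard library) stands in for the server under test.</p>

```python
import random
import sqlite3

# Hypothetical schema used only for this sketch: table name -> column names.
SCHEMA = {"orders": ["id", "qty", "price"]}


def gen_expr(rng, columns, depth):
    """Randomly expand a tiny arithmetic grammar over columns and literals."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(columns + [str(rng.randint(0, 9))])
    op = rng.choice(["+", "-", "*"])
    left = gen_expr(rng, columns, depth - 1)
    right = gen_expr(rng, columns, depth - 1)
    return f"({left} {op} {right})"


def gen_query(rng, schema, depth=3):
    """Build one random but syntactically legal SELECT statement."""
    table = rng.choice(sorted(schema))
    cols = schema[table]
    cmp_op = rng.choice(["<", "<=", "=", "<>", ">=", ">"])
    where = f"{gen_expr(rng, cols, depth)} {cmp_op} {gen_expr(rng, cols, depth)}"
    return f"SELECT {gen_expr(rng, cols, depth)} FROM {table} WHERE {where}"


def fuzz(conn, schema, n, seed=0):
    """Run n generated statements; collect any the engine fails to execute."""
    rng = random.Random(seed)
    failures = []
    for _ in range(n):
        sql = gen_query(rng, schema)
        try:
            conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            failures.append((sql, str(exc)))
    return failures
```

<p>Because every statement is legal by construction, any entry in <code>failures</code> points at the engine rather than the generator. Running the same seeded statements against two engines and comparing the result sets would be one way to approximate the cross-product comparison mentioned above.</p>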
<p>I could list more if you guys want to write a book…</p>
<p><b>Sean: </b>It’s hard to imagine a product that has higher    security requirements than a database server.  It has to talk on the network,    and it has to be able to store sensitive information.  Since the introduction    of the SDL, SQL has seen a dramatic reduction in vulnerabilities.  How does the    SDL play out on a day to day basis?  How does it affect your architecture?</p>
<p><b><a name="SDL"></a>David: </b>Great question! To fully appreciate the change you have to understand that 10-20 years ago there was very little widespread security knowledge in the software development world. In some sense the environment didn’t require it; most systems weren’t interconnected and remotely accessible. This meant that security breaches typically required physical breaches and people had a model in their head for physical security. It was easier to understand locks on doors than buffer overruns. What is interesting about the SDL and SQL’s evolution is that we’ve gone from a period in which security review and validation was done mostly after the coding was done to today’s world where we formally design security threat models as part of the design process before writing code. We also have a great training program in place that everyone touching the code needs to take. We also have refreshers that developers must take for emerging threats. Additionally, we’ve developed a great set of static code analysis and run-time tools to avoid and detect potential security issues. In 2002 we had a “security push” where the entire development team stood down and reviewed every line of code in the product. We did a smaller push in 2004 for SQL Server 2005 and with our current release we have truly integrated the security best practices into the development process and don’t need a separate security push as security is simply part of our day to day process. One challenge in the security space is that the threats are constantly evolving so we can’t rest and, as long as the bad guys are learning new tricks, we need to up our game and have a process in place for rapid mitigation in the event of a new threat or vulnerability. </p>
<p>In terms of how security has affected our architecture we    are much more mindful of the threat environment each line of code is executing    in. For example, there is a small amount of code that performs the initial    client authorization before allowing a connection into the server. Since a    remote client executes this code pre-authorization, any remote code that can    access the port over the network can execute this portion of code.    Architecturally, we strive to keep this code to a minimum and it is very    thoroughly reviewed. Similarly, the security base is designed in a layered    fashion so you have smallish amounts of thoroughly reviewed code providing    services that other aspects of the security system are built upon.</p>
<p><b>Sean: </b>How would you respond to a statement like, “If    you really wanted to make SQL secure, you’d forget about the SDL and just    open-source it so that many eyeballs could look at the code.”</p>
<p><b>David: </b>I think there is some benefit to having more eyes on a piece of code. There is benefit from each person’s fresh perspective, benefit from varied knowledge, etc. However, eyeballs alone are not nearly enough. The SDL process evolution has been really interesting in that we have changed the culture, habits, and processes of every Microsoft developer in a way that is much more effective than having many otherwise competent programmers looking at the code. With SDL we have many eyes looking at threat models, many looking at the design of a security feature and ultimately, many looking at the code following a pretty rigorous and proven process. I think there’s sufficient independent objective evidence to say that SDL is working for us. If you have a few minutes go to the National Vulnerability Database at nist.gov and check out the vulnerability reports for SQL Server vs. other databases. I’ll warn you, if you try to search for Oracle flaws they are hard to find since every major and minor release of the Oracle database server is listed separately so you need to do a little aggregation to get the real picture. </p>
<p><b>Sean: </b>It’s one thing to design powerful functionality, it’s another to make it easy to use. Talk about how Microsoft ensures not just functionality, but usability.</p>
<p><b><a name="Usability"></a>David: </b>OK. Now you’ve hit on one of my personal    passions. I believe that many technologies go through stages of evolution as    they mature. I define three major phases – nascent, developing, and refined.    There are many examples that can serve as a lesson for software developers. </p>
<p>Consider televisions; thirty to forty years ago when TVs    were a nascent technology you almost needed to be a technician to own one.    Certainly, you needed to know how to take the back off and pull the tubes out    so you could go to Radio Shack and test them when one of them went bad.    Furthermore, there were knobs on televisions that were there solely because the    technology hadn’t matured to the point where they weren’t necessary. I often    ask audiences how many people miss the horizontal and vertical control knobs on    their TV and we’re getting to the point where many in the audience don’t know    that these knobs even existed to keep the picture from rolling and waving in    early TVs.</p>
<p>Twenty years ago TVs entered the developing age as they    became fully solid state. They were much more reliable and the technology    developed to the point where many of the knobs disappeared. In this phase they    were good enough for the masses. You didn’t need to be a technician to own one    but it was nice to have one in the neighborhood. Frankly, I sort of think this    is where PCs are today.</p>
<p>Today’s TVs are refined in that the user’s control surface    captures the user’s intent rather than exposing the control surface of the    underlying technology. For example, instead of fiddling directly with color    temperature, saturation and hue to adjust the picture, my TV has a control that    asks me if I want to watch sports, movies, or regular programming and adjusts    the color parameters accordingly. Furthermore, some TVs are aware of the operating    environment such as ambient light and compensate for that. You can do this same    thought experiment on automobiles, microwave ovens, etc. When viewed through    this perspective, most system software still has a long way to go to become a    refined technology. Of course, there are cases where these maturity phases    ripple and repeat through a single technology where advances happen in waves. I    think my new Smartphone is an example of that. It’s way more capable than my    previous cell phone but I never had to reboot my earlier one.</p>
<p>So, how are we doing on SQL Server? One simple example is the work that we did in the database engine in SQL Server 7.0 when we got rid of many of the knobs and made many of them self-tuning. SQL Server 6.5 had roughly 100 knobs whereas SQL Server 7.0, which was much more complex in many ways, had roughly 20. Many people thought that more knobs meant more control but, in reality, we found many systems were performing poorly in the field due to mis-configuration. We classified the knobs into those that should simply take care of themselves – things like the number of locks the server could allocate when it booted, the number of hash buckets in the cache manager, etc. These were our “horizontal and vertical control knobs”. For other knobs we set them up where the database server managed them by default but if an administrator wanted to impose constraints he could. The amount of memory allocated to the server was in this category; by default, SQL Server manages memory in cooperation with the demands in the operating system but you can set low and high watermarks if you want to. In other areas we used control theory and feedback loops to have the system adapt dynamically to the environment and control things based upon instantaneous system response. The adaptive systems work we did was really interesting in that existing control knobs were often static; a good example is “sort memory”. In many database systems before SQL Server 7.0 you set aside a portion of memory for sorting to perform queries and build indexes, etc. During times where you weren’t building indexes or didn’t have queries that required a sort that reserved memory was just lying fallow and wasted. Further, if you had a sort that needed a little more space than what you had reserved it would spill to disk because the sort wouldn’t fit in the reserved space – even if there was plenty of memory unused elsewhere in the system.
In SQL Server 7.0 we created an internal memory broker that could use server memory for whatever purpose made the most sense over time so things like the procedure cache, workspace, sort, and the buffer pool all cooperated to use memory in the most efficient manner over time. The result was fewer knobs and a more efficient system.</p>
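<p>The contrast between a static “sort memory” knob and a shared broker can be sketched as follows. This is a toy model, not SQL Server's memory broker: the class names, the page counts, and the simple grant-or-refuse policy are all assumptions for illustration. A real broker would presumably also shrink or rebalance other consumers over time.</p>

```python
class StaticSortArea:
    """Static knob: a fixed slice of memory is reserved for sorts up front."""

    def __init__(self, reserved_pages):
        self.reserved_pages = reserved_pages

    def sort_spills(self, needed_pages):
        # The sort spills to disk whenever it exceeds the reservation,
        # regardless of how much memory is idle elsewhere in the system.
        return needed_pages > self.reserved_pages


class MemoryBroker:
    """Hypothetical broker: every consumer draws from one shared pool."""

    def __init__(self, total_pages):
        self.total_pages = total_pages
        self.granted = {}  # consumer name -> pages currently held

    def free_pages(self):
        return self.total_pages - sum(self.granted.values())

    def request(self, consumer, pages):
        """Grant the pages if the pool has room; return True on success."""
        if pages > self.free_pages():
            return False
        self.granted[consumer] = self.granted.get(consumer, 0) + pages
        return True

    def release(self, consumer):
        """Return everything a consumer holds to the shared pool."""
        self.granted.pop(consumer, None)
```

<p>With 1,000 pages in total, a static reservation of 100 pages forces a 150-page sort to spill even when the buffer pool is using only 500 pages. The broker grants the same 150 pages from the idle memory and takes them back once the sort releases them.</p>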
<p>This was great work but we ran into a situation where we    were ahead of the market in many respects. DBAs were worried we were going to    put them out of a job. They thought they were getting paid, in some part, to    respond to their pager at 2:00 AM because a big batch job failed because they    hadn’t allocated enough lock blocks for the server. Our competitors also used    this maturation against us – things like, “How can SQL Server be a real    enterprise database system – it only has 20 knobs and our system has 500!”.    What’s interesting is that if you look at the major database systems they have    all made major investments in self-tuning and ease of use.</p>
<p>Market dynamics also demand that we make these systems    easier to use and self managing. As an industry we’ve expanded database    deployment from an installed base that was likely measured in the small    100,000s of units 20 years ago to one that is likely measured in 100,000,000s    of units today. They had better be easier to use than 20 years ago!</p>
<p><b>Scott: </b>How do customer needs and requirements make it    into the planning process?  How do you handle situations where the customer is    asking for the wrong feature?  (The customer asks for a setting so they can    tune X, and you realize that if a subsystem was redesigned, they wouldn’t need    to tune X)</p>
<p><b>David: </b>We have had many situations where customers have asked for features that they have seen in other systems that didn’t make sense for SQL Server. Often customers ask for performance features that other products have that may provide a large advantage in those products but, given the way that SQL Server is architected, these same features may provide little to no benefit on our architecture. One example is raw device and partitioning support. 20 years ago many UNIX systems didn’t have advanced I/O features such as the ability to avoid the file cache, scatter/gather I/O or great asynchronous I/O. In fact, many early UNIX file systems had 32 bit file offsets so the maximum size of an individual database file could only be 2 or 4 GB in size. NTFS, in contrast, supported 64 bit file addressing, great asynchronous I/O, and the ability to do unbuffered I/O from the start. So, whereas other products needed separate partitions over multiple files with an I/O thread per file to simulate asynchronous I/O – SQL Server didn’t need any of this. This didn’t prevent customers from asking though. Of course, once you get up to very large databases, partitioning makes sense for a number of reasons such as the ability to physically manage large tables and indexes in smaller pieces but the point is that SQL Server didn’t need partitioning to get I/O parallelism in the same way some other systems did. We had to educate many customers on these points so they understood how our architecture achieved the performance that other systems did but through a different architecture.</p>
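<p>The “2 or 4 GB” figure follows directly from the width of the file offset. The short Python calculation below works it through; <code>addressable_bytes</code> is a hypothetical helper, and it gives only the theoretical limit implied by offset width, since real file systems impose lower practical limits.</p>

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(offset_bits, signed):
    """Bytes addressable by a file offset; a signed offset loses one bit."""
    usable_bits = offset_bits - 1 if signed else offset_bits
    return 2 ** usable_bits


print(addressable_bytes(32, signed=True) // GIB)   # 2 (signed 32 bit offset)
print(addressable_bytes(32, signed=False) // GIB)  # 4 (unsigned 32 bit offset)
print(addressable_bytes(64, signed=True) // GIB)   # 8589934592, i.e. 8 EiB
```

<p>Whether a system landed at 2 GB or 4 GB depended on whether it treated the 32 bit offset as signed or unsigned; a 64 bit offset removes the limit as a practical concern for a single database file.</p>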
<p>I’m also mindful that control doesn’t always represent    progress and often simplicity is the best approach – especially when it affords    an opportunity for the software to do a better job. I think one good example    involves a feature known as “tempdb in RAM”. Prior to SQL Server 7.0 you could    allocate a region of RAM to hold the temporary database which is used for    scratch tables and intermediate query results. In certain environments, placing    “tempdb in RAM” could provide a significant benefit given the way that SQL    Server 6.x was architected. Unfortunately, it was easy to under or over    allocate the amount of memory and wind up spilling to disk or paging    excessively. Another point is that the amount of RAM allocated to tempdb was statically    determined; once set you needed to reboot the server to change it. Since the    amount of space needed for tempdb varies depending on the workload, the optimal    caching strategy for tempdb is dynamic. We removed “tempdb in RAM” in SQL    Server 7.0 and did a number of other optimizations under the cover to better    manage tempdb pages in memory so the actual customer result was much, much    better across a wide range of scenarios for SQL Server 7.0 but customers who    had improved their performance 2-3x by using the tempdb in RAM feature in SQL    Server 6.5 screamed loudly. I finally wrote a long mail and included some    experimental results that proved that the new approach was better but I still    received hate mail for several years after that decision.</p>
<p> <b>Scott: </b>Areas like the developer division have    strived for greater transparency.  In open source, all development and    decisions are transparent.  Talk about how SQL Server views transparency during    development, and how you have to balance expectations from customers who want    full transparency, vs. not shooting yourself in the foot by disclosing early    and giving a closed source competitor like Oracle an advantage?</p>
<p><b>David: </b>You touch upon a very real challenge. SQL    Server has matured to the point where we do a very good job on the fundamentals    and, as a result, it’s more important that we listen to and work with our    customers to continue to produce a product that helps them better run their    business. We’ve talked mostly about the core relational database engine thus    far but today’s SQL Server includes data analysis, data mining, reporting,    enterprise class ETL, etc. The solutions we deliver in this space touch a    broader range of customers and the pace of innovation is much faster than that    seen in the core relational database engine. As a result, we need to be much    more in tune with our customers to produce the right product. We’re making some    real progress on including key customers and experts in our design process. For    some of our more complex SQL Server 2008 improvements we’ve done joint design    with key customers and MVPs and they’ve helped us make many of the tough    scenario and design tradeoffs required to deliver the right feature. Done well,    this reduces the need for iterative field testing to get solid feedback on new    features.</p>
<p>Disclosing early is a real risk and, yes, our competitors do    listen and respond. It’s funny, they pay much more attention to what we say now    than they did 10 years ago.</p>
<p><b><a name="OpenSource"></a>Sean: </b>What experience do you have with Open Source    Databases?</p>
<p><b>David: </b>Yes, I have experience in open source and    certainly follow the key open source databases from the perspective of    technology evolution and adoption. I don’t look at any source code from the    open source databases to avoid any potential IP issues. </p>
<p><b>Sean: </b>What do you think are some things that are easy    to accomplish in a closed source model that would be challenging in open    source?</p>
<p><b>David: </b>I think one thing is something I would call “consistency in the large”. This isn’t necessarily easy in the closed source world but in a coordinated engineering environment you can align things in a way that leads to a degree of consistency which creates customer value. For example, the products within Microsoft’s Server and Tools business have a set of “Common Engineering Criteria” (CEC) that we all follow. The criteria include things such as common processes and UE content that lead to a more uniform customer experience and aligned features, such as a Best Practices Analyzer or having a management pack for our management tools when we RTM. This sort of broad consistency doesn’t happen organically and is one of the challenges that large scale open source efforts have to wrestle with.</p>
<p><b><a name="OpenSourceDB"></a>: </b>How much is SQL development and features informed    by things like MySQL and DB2?  Are there features in SQL that were inspired by    these products?</p>
<p><b>David: </b>Certainly; and it goes both ways. Things like persisted views have been done by all major database players. I can’t recall whether Oracle or IBM did it first but now SQL Server, Oracle, and DB2 all have a form of persisted views with varying degrees of updatability and query matching sophistication. Oracle was the last major database player to have a credible fully cost-based optimizer. I would say that SQL Server led in the advancement of real self-tuning and ease of use. Unfortunately we did it in 1998 and tried to sell it to a market that didn’t understand its value and IBM did a great thing (for IBM) later on by coining the phrase “autonomic computing” and selling the value. I don’t think MySQL has contributed too much yet to the big 3 but I do like MySQL’s notion of installable storage engines with implementations optimized for various scenarios. What’s interesting is that SQL Server 7.0 was architected to support this concept with the OLE-DB interface between the relational and storage engine but MySQL has really gotten value from the concept.</p>
<p><b>Sean: </b>If you had the opportunity to borrow more from    the development model of the open source community when it comes to SQL Server    what are the top couple of things you would like to bring over to SQL Server    development that you see out there today in the Open Source community?</p>
<p><b>David: </b>In general I like the notion of tapping into a large development community’s collective energy. Different people have different passions and, if you can harness the energy effectively, it can lead to some interesting results. For example, if someone gets excited about adding a specific feature or fixing a particular bug they can often just make it happen. Of course, one challenge in the open source world is maintaining architectural and feature consistency. Smart people don’t always agree on what’s important for the customer or know how a particular feature should be implemented in a way that maintains the system’s design tenets. I think a key aspect of great design is in providing the most value with the simplest and most intuitive implementation and user model. The open source project maintainers have a very tough job keeping this in check.</p>
<p>Another aspect of the open source world which is powerful is    the ability to generate an ecosystem of tools and add-ons around a core    technology. This is an ecosystem effect that may or may not require open source    access to the core technology. For example, imagine if MySQL were a closed    source project but that it had a vibrant open source community building    installable storage engines for various scenarios. We are starting to do some    of this with SQL Server through Codeplex. </p>
<p><b>Sean: </b>How much do you think SQL Server’s development    has been impacted by Open Source development efforts in terms of how they build    community around various database products?</p>
<p><b>David: </b>Enabling the community to help the community    and giving the user base a direct voice to the development team is one of the    most exciting and powerful concepts to come out of the open source development    model. We have learned a lot from this and in our product review meetings we    regularly review input from our community including how they vote on design    change requests. Rather than us guessing or asking a small sample we can now    reach out to our community quickly and efficiently. Frankly the Internet and    access to our community have enabled us to produce a much better product.    Things such as Watson, SQM, and our community product provide us direct    feedback that allows us to respond in ways we couldn’t even imagine 10 years    ago.</p>
<p><b>Scott: </b>Give us some understanding of the structure of the SQL Server development team, with some high-level estimates of the number of testers, developers, number of machines in the test lab, length of time it takes test suites to run, etc.  Just so folks can put the SQL Server development effort in perspective relative to other efforts they are familiar with.</p>
<p><b>David: </b>I do not have exact numbers in my head but let’s just say we have 100’s of developers, 100’s of testers and probably 10,000 machines in various test labs. We literally have millions of individual test cases and the test system is highly automated. Many of our test machines are in offsite data centers and we’re able to connect to and control them via IP KVM. We have evolved our test methodology to include much more model-based testing rather than individual test case generation&#8230; One interesting thing is that we completely revamped our development methodology between SQL Server 2005 and the current release. This was a huge cultural and process change and we have learned a lot from this experience. Perhaps we can chat a bit about this next time.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2008/01/04/interview-with-dave-campbell-technical-fellow-microsoft/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Is Security Really That Hard to Measure?</title>
		<link>http://howsoftwareisbuilt.com/2007/10/12/is-security-really-that-hard-to-measure/</link>
		<comments>http://howsoftwareisbuilt.com/2007/10/12/is-security-really-that-hard-to-measure/#comments</comments>
		<pubDate>Fri, 12 Oct 2007 00:00:58 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[security development lifecycle]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/2007/10/12/is-security-really-that-hard-to-measure/</guid>
		<description><![CDATA[There&#8217;s a good article on LinuxWorld about the security debate between open-source and Windows.  My first question is, does it need to be a debate?  In this day and age, isn&#8217;t it easy enough to quantify vulnerabilities?
If you are looking for subjective opinion, I recommend looking through the interviews we&#8217;ve done here.  [...]]]></description>
			<content:encoded><![CDATA[<p>There&#8217;s a good article on LinuxWorld about the <a href="http://www.linuxworld.com.au/index.php/id;121116082;fp;4;fpid;3">security debate between open-source and Windows</a>.  My first question is, does it need to be a debate?  In this day and age, isn&#8217;t it easy enough to quantify vulnerabilities?</p>
<p>If you are looking for subjective opinion, I recommend looking through the interviews we&#8217;ve done here.  At the risk of sounding like a Microsoft fan-boy, the Microsoft interviews (in my opinion) demonstrate a company where secure coding is &#8220;in the water&#8221;.  Code goes through threat modeling, risky function calls have simply been banned, code goes through automated and human inspection, and vulnerabilities that do slip through feed back into the process to determine how to prevent them in the future.  </p>
<p>I simply don&#8217;t get the same feeling from the open-source people we&#8217;ve talked to.  When we&#8217;ve brought the subject up, the response is almost universally &#8220;many eyeballs,&#8221; and faith (without data) that &#8220;many eyeballs&#8221; is effective.</p>
<p>Am I completely off base?  Do things like the Linux kernel and Apache go through rigorous security reviews?  Is there proof that &#8220;many eyeballs&#8221; in open source is at least as good as something like the Security Development Lifecycle in Microsoft?  If you&#8217;re in a position to know, let&#8217;s chat!</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2007/10/12/is-security-really-that-hard-to-measure/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Interview with Microsoft&#8217;s Scott Charney</title>
		<link>http://howsoftwareisbuilt.com/2007/10/04/interview-with-microsofts-scott-charney/</link>
		<comments>http://howsoftwareisbuilt.com/2007/10/04/interview-with-microsofts-scott-charney/#comments</comments>
		<pubDate>Thu, 04 Oct 2007 19:35:33 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[michael howard]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[security development lifecycle]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/2007/10/04/interview-with-microsofts-scott-charney/</guid>
		<description><![CDATA[A while back, we did an in-depth interview with Michael Howard about Microsoft&#8217;s Security Development Lifecycle, which has been one of our most popular interviews to date.&#160; It seems there&#8217;s a lot of interest in pulling back the covers and looking at how Microsoft is approaching building secure code.&#160; 
ComputerWorld just did an interview with [...]]]></description>
			<content:encoded><![CDATA[<p>A while back, we did an in-depth interview with <a href="http://howsoftwareisbuilt.com/2007/06/24/michael-howard-microsoft-interview/">Michael Howard</a> about Microsoft&#8217;s Security Development Lifecycle, which has been one of our most popular interviews to date.&nbsp; It seems there&#8217;s a lot of interest in pulling back the covers and looking at how Microsoft is approaching building secure code.&nbsp; </p>
<p>ComputerWorld just did an <a href="http://www.computerworld.com/action/article.do?command=viewArticleBasic&amp;articleId=9038059&amp;pageNumber=1">interview with Microsoft&#8217;s Scott Charney</a>, which provides more insight into their efforts to produce secure products.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2007/10/04/interview-with-microsofts-scott-charney/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interview with Ryan Waite, Group Program Manager for High-Performance Computing at Microsoft.</title>
		<link>http://howsoftwareisbuilt.com/2007/07/22/interview-with-ryan-waite/</link>
		<comments>http://howsoftwareisbuilt.com/2007/07/22/interview-with-ryan-waite/#comments</comments>
		<pubDate>Sun, 22 Jul 2007 21:25:49 +0000</pubDate>
		<dc:creator>scottswigart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Compute Cluster Server]]></category>
		<category><![CDATA[high performance computing]]></category>
		<category><![CDATA[message passing interface]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[parallel programming]]></category>
		<category><![CDATA[Ryan Waite]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[supercomputing]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/2007/07/22/interview-with-ryan-waite/</guid>
		<description><![CDATA[Interviewers: Scott Swigart, and Sean Campbell 
Interviewee: Ryan Waite 
In this interview, we talk with Ryan Waite, Group Program Manager for High-Performance Computing at Microsoft.&#160; We talk about: 
Microsoft&#8217;s entry into the High Performance Computing area.
Microsoft&#8217;s support of the open source Message Passing Interface (MPI).
MPI has become bloated, and there&#8217;s no usability data (like [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Interviewers:</strong> <a href="http://howsoftwareisbuilt.com/about-scott-swigart/">Scott Swigart</a>, and <a href="http://howsoftwareisbuilt.com/about-sean-campbell/">Sean Campbell</a> </p>
<p><strong>Interviewee:</strong> <a href="http://howsoftwareisbuilt.com/about-ryan-waite-microsoft-group-program-manager-for-high-performance-computing/">Ryan Waite</a> </p>
<table>
<tr>
<td><img src='http://howsoftwareisbuilt.com/wp-content/uploads/2007/07/ryanwaite.thumbnail.jpg' alt='ryanwaite.jpg' />
</td>
</tr>
<tr>
<td align=center>
Ryan Waite
</td>
</tr>
</table>
<p>In this interview, we talk with Ryan Waite, Group Program Manager for High-Performance Computing at Microsoft.&nbsp; We talk about: </p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2007/07/22/interview-with-ryan-waite/#entry">Microsoft&#8217;s entry into the High Performance Computing area</a>.
<li><a href="http://howsoftwareisbuilt.com/2007/07/22/interview-with-ryan-waite/#mpi">Microsoft&#8217;s support of the open source Message Passing Interface (MPI)</a>.
<li><a href="http://howsoftwareisbuilt.com/2007/07/22/interview-with-ryan-waite/#bloat">MPI has become bloated, and there&#8217;s no usability data (like there is with Microsoft Office) to indicate what should be deprecated</a>.
<li><a href="http://howsoftwareisbuilt.com/2007/07/22/interview-with-ryan-waite/#features">At Microsoft, select customers help shape the features for the next product release</a>.
<li><a href="http://howsoftwareisbuilt.com/2007/07/22/interview-with-ryan-waite/#ease">Microsoft has always focused on ease of use, simplified administration, and uncomplicated setup. HPC is no different.</a>
<li><a href="http://howsoftwareisbuilt.com/2007/07/22/interview-with-ryan-waite/#instrument">Instrumentation helps find reliability issues. If partners are affecting reliability (with unstable drivers, for example) certification helps raise the bar.</a>
<li><a href="http://howsoftwareisbuilt.com/2007/07/22/interview-with-ryan-waite/#parallel">Parallel programming is now the only way to do more computation in the same amount of time. Processors just aren&#8217;t getting faster like they used to.</a>
<li><a href="http://howsoftwareisbuilt.com/2007/07/22/interview-with-ryan-waite/#radical">It&#8217;s easier for open source to be incremental. Proprietary software has to make radical changes.</a>
<li><a href="http://howsoftwareisbuilt.com/2007/07/22/interview-with-ryan-waite/#bandwidth">Hard drives + overnight shipping = really high bandwidth.</a>
</ul>
<p><span id="more-73"></span></p>
<p>&nbsp; </p>
<p><b>Scott Swigart:</b> We are conducting a series of interviews to explore the premise that open source software and closed source proprietary software are built differently &#8212; that the software development life-cycle is different, and that difference is likely to manifest itself in concrete ways in the final product. </p>
<p>That&#8217;s not to say that one&#8217;s better than the other, but simply that there are certain expectations you can have from either open source or closed source software because of the way it&#8217;s built, as well as certain inherent limitations. </p>
<p>In the past, ISVs pretty much just wrote closed source proprietary software, but today they have a little more of a choice in how they build a product, and we want to look at some of the decision points around that. And if you&#8217;re looking at bringing software into your organization, and you have a choice between something open source and something closed source proprietary, again, what are some of the common decision points or expectations around that? </p>
<p>To get us started, tell us a little about yourself and what you do, to give us a little bit of context. </p>
<p><b>Ryan Waite:</b> <a name="entry"></a>I&#8217;m the Group Program Manager for <a href="http://www.microsoft.com/windowsserver2003/ccs/default.aspx">High-Performance Computing</a> at Microsoft. What that means is that my team is responsible for writing the technical specifications for our products; we design which features are going to be necessary for the product, then we write up how those features should work. </p>
<p>We work in close partnership with the software development and software test teams. The development team, as you can imagine, writes the code itself, and the test team tests that code. I should also mention that my product actually combines both closed and open source software &#8212; both Microsoft-designed as well as open source software. </p>
<p>My staff is comprised mostly of computer scientists or computer engineers of one kind or another. I&#8217;ve been at Microsoft for over 15 years, and I&#8217;ve worked mostly on server products, including things like Exchange Server, BackOffice Server, Windows Server, and Small Business Server &#8212; all sorts of different pieces. </p>
<p>And the product I work on right now is called Compute Cluster Server. As an HPC product, Compute Cluster Server allows you to bind 50 or 100 or 1,000 machines together to work on a single computational problem or a single set of closely related computational problems. So it&#8217;s basically a category of supercomputing. </p>
<p>The idea is to take very large computational problems, like proteomics problems, genomic problems, problems in mechanical engineering, weather simulations, and computational finance; and you can then run these against this huge set of machines. And the great thing about these kinds of clusters is that they cost a fraction of the cost of traditional supercomputers. But clusters are very powerful, and they&#8217;re very well suited to certain types of programming problems. These kinds of clusters are based on the concept of what are called <a href="http://www.beowulf.org/">Beowulf</a> clusters, developed back in the early &#8217;90s at NASA. The idea is to bind together a bunch of commodity servers with a high-speed network, and you can run some very interesting computational problems on them. </p>
<p>Most problems can be designed to work on a cluster, although there is still a category of supercomputing problems that you really do need a traditional supercomputer for. </p>
<p><b>Scott:</b> Thanks. You mentioned that you have closed source and open source solutions; can you talk a little bit about what you do on each side? </p>
<p><b>Ryan:</b> Loosely speaking, you can think of our product as having three major components. One is what&#8217;s called a job scheduler, which schedules work across this big cluster of resources. If you have 1,000 machines, it&#8217;s not like every job takes 1,000 machines; some take 16 machines, some need 64 machines, some need 132 machines. And then they have time limits for how long they want the machines to go &#8212; for example, they might need eight hours of computational time. So we have a job scheduler. </p>
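<p>To make the scheduling idea concrete, here is a minimal sketch in Python of a first-come, first-served scheduler: each job asks for a number of machines and a time limit, and waits until enough machines are free. This is an illustration only, not how Compute Cluster Server schedules work; the names <code>Job</code> and <code>schedule</code> are hypothetical, and a production scheduler would add priorities, backfilling, and failure handling. </p>

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int   # number of machines the job needs
    hours: int   # requested time limit, in hours


def schedule(jobs, total_nodes):
    """First-come, first-served scheduling on a cluster of total_nodes machines.

    Returns a list of (job name, start hour, end hour). Jobs start strictly
    in submission order; a job waits until enough machines are free.
    """
    running = []   # (end hour, nodes held) for jobs occupying machines
    free = total_nodes
    now = 0
    plan = []
    for job in jobs:
        if job.nodes > total_nodes:
            raise ValueError(f"{job.name} needs more nodes than the cluster has")
        # Advance the clock, releasing finished jobs, until this job fits.
        while job.nodes > free:
            running.sort()
            end, held = running.pop(0)
            now = max(now, end)
            free += held
        plan.append((job.name, now, now + job.hours))
        running.append((now + job.hours, job.nodes))
        free -= job.nodes
    return plan


if __name__ == "__main__":
    submitted = [Job("a", 64, 8), Job("b", 16, 2), Job("c", 64, 4)]
    for name, start, end in schedule(submitted, total_nodes=100):
        print(name, start, end)
```

<p>On a 100-machine cluster, jobs "a" and "b" start immediately, while "c" has to wait for "a" to release its 64 machines. Because the sketch never lets a later job jump the queue, small jobs can sit idle behind a large one, which is the kind of inefficiency real schedulers work to avoid. </p>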
<p>One of the gentlemen on my team was an architect on the LSF job scheduler, at a company called Platform Computing &#8212; he&#8217;s a rocket scientist. The second thing we&#8217;ve built is systems management functionality, and this allows us to manage this big cluster of machines. There&#8217;s a woman named Cathy Palmer on my team who was at Cray and Terra Computing for 11 years and has a PhD in job scheduling; she&#8217;s a rocket scientist, too. These are some of the best people that I have ever worked with, so I&#8217;m just going to glow a little bit about how great they are. </p>
<p>The third area is about high-speed networking, and the parallel programming model itself. That&#8217;s headed up by another gentleman on my team named Eric Lantz. Part of that is high-speed networking. On the high-speed networking end of things, you need to enable these machines to talk to each other very quickly, because sometimes the speed of your computation is dependent on how fast the nodes can communicate with each other as they&#8217;re running their computations. It&#8217;s not just about the CPU inside that machine, but sometimes the I/O, or the communication speed between those machines. </p>
<p><a name="mpi"></a>For high-speed, tightly coupled communication there&#8217;s something called <a href="http://www-unix.mcs.anl.gov/mpi/">MPI</a>. MPI stands for the Message Passing Interface. This is basically the communication interface used for people to write these kinds of massively parallel applications that run on clusters. There are some standards for how MPI is implemented, but the reference design comes from <a href="http://www.anl.gov/">Argonne National Labs</a>. And Argonne National Labs distributes that code under a BSD-style open source license. </p>
<p>So Microsoft is a licensee for that code. We&#8217;ve taken that open source code from Argonne, and we&#8217;ve made modifications to it and we ship it as part of our product. The folks at Argonne are genius UNIX and Linux developers &#8212; really great UNIX developers. They&#8217;ve been doing this kind of stuff for years and years. And message passing is a tremendously complicated area of computer science, and they&#8217;ve done a great job at building this code. They call it <a href="http://en.wikipedia.org/wiki/MPICH">MPICH</a> and <a href="http://www.hpclab.niu.edu/mpi/g2_body.html">MPICH2</a>. </p>
<p>The problem is that they&#8217;re not Windows developers. They don&#8217;t have a lot of experience with that. They have one person over there who is their Windows developer, but we have a lot of experience programming Windows and can certainly help. And Windows and Linux are different operating systems, so you write programs for them in slightly different ways. Not radically different ways; in both cases you use C or Java or C# or whatever to write your applications. </p>
<p>But the thing to keep in mind with them is that you have different efficiencies in different kinds of ways. And so one of the things we&#8217;ve done is to implement those efficiencies in the Windows version of MPICH. All of this allows us to have a very high-performance solution. </p>
<p>We&#8217;re about to contribute those changes back. We&#8217;ve just been working through some of the final issues with how we transfer the code back. It&#8217;s just logistical stuff. In the next month or two, you&#8217;ll see a press announcement go out that talks about our contributions to this code, these modifications that we&#8217;ve made back to Argonne. As far as I can tell, this will be the largest contribution Microsoft has ever made to the open source community. </p>
<p>With this particular code, we want anybody running Windows to have access to a high-performance MPI library. If they buy it through my product, that&#8217;s great; if they are just running Windows and they want it, then they can get it from Argonne. </p>
<p><b>Scott:</b> You&#8217;re in kind of an interesting position, because you&#8217;re one of the few Microsoft people developing both closed source proprietary and open source projects. What are some of the things that stand out to you as key differences between how those two processes work? </p>
<p><b>Ryan:</b> What&#8217;s interesting about MPI is that there are a large number of features that have been driven because somebody in a standards body or open source group thought it was an interesting feature for what they were doing. They&#8217;ve implemented particular APIs. </p>
<p>Now, that API may never be used by anybody else except them. And yet it becomes part of this code base. The really great thing about that is that if you need to do a very specialized API for the application that you&#8217;re writing, you can go ahead and contribute it to Argonne &#8212; and if anybody else in the world has that same kind of problem, they might be able to use that same piece of code that you wrote. </p>
<p>The disadvantage of that is that this is now a very large code set with over 200 APIs. It&#8217;s a bit of a joke, and I don&#8217;t have any proof that this is true, but the joke is most developers use six or eight of the MPI commands out of that 200 for the bulk of their programming, so there&#8217;s a whole bunch of stuff that just isn&#8217;t used. And part of the problem is that you have to maintain all of that code as it marches forward, making sure all those APIs work. </p>
<p>They&#8217;re all documented, they&#8217;ve all been shipped, and ideally, they all continue to be well-supported. I think that&#8217;s the struggle with that kind of model &#8212; the code has exploded pretty quickly. It exploded because everybody gets to contribute the little things that they need into that code. </p>
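<p>For readers who have not seen message passing, the small core that the joke refers to usually amounts to a few operations: find out your rank and the size of the group, send a message, and receive one. The sketch below models that pattern in plain Python with threads and queues inside one process. It is not the MPI API and not MPICH; <code>run_ranks</code>, <code>send</code>, <code>recv</code>, and <code>partial_sum</code> are hypothetical names chosen for illustration. </p>

```python
import queue
import threading


def run_ranks(size, worker):
    """Run worker(rank, size, send, recv) once per rank, each in its own thread."""
    inboxes = [queue.Queue() for _ in range(size)]
    results = [None] * size

    def start(rank):
        def send(dest, value):
            # Deliver a message to the inbox of rank `dest`, tagged with the sender.
            inboxes[dest].put((rank, value))

        def recv():
            # Block until a message addressed to this rank arrives.
            return inboxes[rank].get()

        results[rank] = worker(rank, size, send, recv)

    threads = [threading.Thread(target=start, args=(r,)) for r in range(size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def partial_sum(rank, size, send, recv):
    """Each rank sums its share of 0..99; rank 0 collects the partial sums."""
    local = sum(range(rank, 100, size))
    if rank != 0:
        send(0, local)
        return local
    total = local
    for _ in range(size - 1):
        _sender, value = recv()
        total += value
    return total


if __name__ == "__main__":
    print(run_ranks(4, partial_sum)[0])
```

<p>Every rank runs the same function and branches on its rank, which is the usual shape of a message-passing program. On a real cluster the ranks would be separate processes on separate machines, and the cost of each send and receive would depend on the network between them. </p>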
<p><b>Scott:</b> That&#8217;s interesting, because in other open source projects, the maintainers tend to be pretty strict and conservative about what they&#8217;ll let in. </p>
<p><b>Ryan:</b> <a name="bloat"></a>There is no single person for MPI saying, &#8220;this is OK, this is not OK.&#8221; You don&#8217;t really have the same kind of maintainer making things cross a very high bar before they get included. And those bars can take many forms, other than just the quality of the code. </p>
<p>Sometimes people establish a quality bar for when code gets included and sometimes it&#8217;s a customer requirement bar &#8212; you might want to make sure that if people are going to use something, it works properly. We all have to have different reasons for what we decide to include in software and what we decide not to. </p>
<p>I think it&#8217;s just the nature of these projects and the way software projects mature. They get bigger. Take the Linux kernel &#8212; it&#8217;s huge compared to what it was five or ten years ago. </p>
<p>As these products become more and more general purpose, they have to be able to service the needs of a greater and greater community. And we have an enormous amount of experience with this at Microsoft with products like Office, for example. </p>
<p>We have more usability data about how people use Office than you can imagine. Because we shipped instrumented versions of Office during the beta period, we know exactly which features people use and we know exactly how they use them. </p>
<p>So, even though there may be a thousand features inside of Office, we know that every one of them is used. Everybody may use a different 20 percent, but we know that all those features are well-used. And features that aren&#8217;t used, we deprecate or remove from the product. We have very concrete usability data to tell us that. </p>
<p><b>Scott:</b> Contrast that with the closed source work that you do. It seems like Microsoft goes through a process where there&#8217;s a big wish list of features, and people start coding on things as the project marches toward release. Things get cut, and developer resources get put on the more critical things. </p>
<p>How do you see the closed source work that you do comparing with the way you&#8217;ve seen the open source projects developed, in terms of how features are spec&#8217;d, how they&#8217;re built, how features are cut, and how things make it into the release product? </p>
<p><b>Ryan:</b> That&#8217;s a hard question to answer, because it&#8217;s so different depending on which open source group you&#8217;re working with. I&#8217;d say for us, the first step early on in a project is to figure out whether a product is going to be more driven by a release date, or more driven by a feature set. </p>
<p>We do both at Microsoft. If you&#8217;re more driven by a release date, you really make sure that you scope that product so you&#8217;re going to do well at hitting that release date. And Office is a good example of a product that really drives to release dates. </p>
<p>The video game industry is an even better example. If you look at Xbox or you look at any of the game title providers that are building stuff for Xbox, they have to hit Christmas. No ifs, ands or buts. Otherwise, they&#8217;re not going to recoup any of their development costs, and those companies could fail. It&#8217;s really important for those organizations to scope their features to a particular date. </p>
<p>For other projects, you might look at customer requirements and pick a date, but you&#8217;re really going to drive to implementing that set of features and making sure that they hit the appropriate level of quality. I think that as those projects move on, that&#8217;s where you may begin to see some features get cut in order to hit a release date, or just because the team realizes that they&#8217;re less important. </p>
<p><a name="features"></a>We spend a lot of time sitting with customers. We have a customer advisory board called the <a href="http://msdn2.microsoft.com/en-us/isv/bb190413.aspx">Technology Adoption Program</a>, and they come out here for a couple of days. These are big technology decision makers at government national labs, biotech companies, and financial institutions like banks and stock trading houses. We spend two days presenting to them on what we&#8217;re planning to build, and they give us feedback about what they&#8217;d like to see. </p>
<p>We release public betas. In the Technology Adoption Program, people actually deploy beta code and run it in production prior to the release of our software. That lets us know which features are being used and which ones aren&#8217;t. </p>
<p>I also spend a lot of time with people asking them which things they think are dumb or badly implemented, with an eye toward cutting those and focusing on other stuff instead. </p>
<p>The overall process takes the entire universe of things that you could possibly build, and you continually cut, as you move through the software development process, until you have what is really important at the end. </p>
<p><b>Scott:</b> Again, this varies from open source project to open source project, but a lot of them are built by developers for developers, so the way a feature gets into an open source product is that somebody just sits down and writes it. Their first interaction with the process is by submitting code for a feature. </p>
<p>If they can&#8217;t code it, then that feature isn&#8217;t going to get included, so the downside is that you only have coders to some degree, spec&#8217;ing features. The upside is that nobody writes something that they aren&#8217;t personally going to use. </p>
<p>At a company like Microsoft, when you&#8217;re building the next version of a product, you&#8217;re doing a certain amount of work that&#8217;s fairly speculative. You&#8217;re building what you think people are going to need &#8212; you&#8217;re really building what you think people are going to use, and sometimes that&#8217;s more successful than others. </p>
<p>I mean, sometimes Microsoft delivers stuff that&#8217;s wildly popular and everybody uses it, and some things are dead on arrival, even though there was a lot of effort put into building them. </p>
<p><b>Ryan:</b> Right &#8212; do you remember the &#8220;Bob&#8221; operating system? It&#8217;s not a successful product. </p>
<p>[laughter] </p>
<p>Commercial software companies have to determine what people want, and what will be successful. That&#8217;s why we spend a lot of time doing things like customer research, usability research &#8212; really trying to figure out what we should build. I talk a lot about market opportunity documents, which is a fancy title for &#8220;what do people want?&#8221; </p>
<p>It tells us first of all, what set of users we are going to help. We segment the market, and we figure out where we&#8217;re going to prioritize. That&#8217;s the first round of cuts &#8212; we&#8217;re not building something for everybody; we&#8217;re building something for a particular group. </p>
<p>Then we ask, &#8220;What do those people really want?&#8221; and we visit customers to try to answer that. One of the first questions I ask is an open-ended one, which is, &#8220;What drives you most crazy about your HPC system?&#8221; Windows or Linux &#8212; it doesn&#8217;t matter &#8212; what&#8217;s the thing that drives you most nuts? </p>
<p>If you have some really open-ended questions, you hear some really interesting things come back from customers. Remember that HPC was entirely a UNIX market in the past. A lot has moved to Linux, but there&#8217;s still a fair amount of Unix out there, and some of them are beginning to adopt Windows. </p>
<p>And because so much of it is driven by the open source people, you have a whole set of computer scientists really driving at issues that they think are important in high performance computing. </p>
<p>But when you go talk to the system administrators, they tell you things like, &#8220;I&#8217;m struggling with power and cooling &#8212; the systems just run too hot.&#8221; The second problem that they talk about is, &#8220;It&#8217;s so hard to get the cluster up and running in the first place.&#8221; </p>
<p><b>Sean Campbell:</b> Do you think that either closed source or open source software reaches out a little more effectively to some of those audiences and meets their needs? </p>
<p><b>Ryan:</b><a name="ease"></a> Take systems administration software for example. How do I get servers up and running, and deployed, and get the whole system going? We do really well with that, and it’s a good thing to do well at since a lot of people complain about it. </p>
<p>This is where things get a little bit touchy, because there are a lot of UNIX administrators who are very passionate about UNIX or Linux. And Windows administrators like to talk about how great Windows is. People throw mud all the time about this. I&#8217;m on a bunch of user group mailing lists, and people there say, &#8220;Well, the problem with Windows administrators is that they&#8217;re inexpensive, and they&#8217;re all the same.&#8221; </p>
<p>On the other hand, I have heard people say, &#8220;Every time we get a call from a Unix customer, or a Linux customer, their whole network is set up in a unique way. Every time I get a call from one of our Windows customers they&#8217;re easier to support because all Windows networks are traditionally set up the same.&#8221; It&#8217;s true that there are better prescriptions of how to do things in Windows than there are in the Linux world, or the Unix world, and so that has some benefit. </p>
<p>I think we do a good job of servicing the needs of the system administrators. Certainly in the database space there are a lot of debates about it. I think MySQL has actually come a long way, and is doing pretty well. I&#8217;m not a big database expert, but SQL Server really changed the database market. </p>
<p>I don&#8217;t know if you guys have ever set up an Oracle database before, but it&#8217;s hard [laughs]. The first one I set up was about 10 years ago on a Solaris box, and oh my God. I consider myself to be a pretty technically competent guy. I&#8217;m not a database expert, but I spent a weekend just trying to get through the ten volumes of manuals about how to get the thing up and running and performing correctly. </p>
<p>With SQL Server, you walk through a wizard for setup, and you&#8217;re pretty much up and running. So, I think we do really well at servicing particular user needs, and servicing a broad general market as well, even though it may not be as technically savvy as people that are making contributions into the open source community. </p>
<p><b>Scott:</b> Another area that I&#8217;d like to dig into involves some of the differences around identification of bugs, and support, and so forth. I&#8217;d guess that when you&#8217;re dealing with high performance computing and different architectures, you see all kinds of strange things that might be hard to reproduce. </p>
<p>On the closed source side, how do you handle that? What is the process for a customer to come to you with a problem and get a resolution, whether it&#8217;s just sending knowledge back to the customer, or whether it&#8217;s an actual change to the product to resolve it? </p>
<p><b>Ryan:</b> One of the things that&#8217;s important for my group to keep in mind is that we&#8217;re brand new in this space. Windows is eight to ten percent of the HPC market right now, which is really pretty good. But if somebody has an HPC cluster, it&#8217;s probably a Linux or Unix cluster, so what has been important for us is that we understand the way that people are using their existing clusters, and that we fit into the model. </p>
<p>We provide support through our traditional Microsoft product support services, and we also have newsgroups that we support. But we also behave a lot more like open source organizations might behave. </p>
<p>We have a site called WindowsHPC.net, and there are a set of bulletin boards up there that people can search to look for problems that they run into. Our development team here in Redmond reads those groups all the time. </p>
<p>That helps us plug in to that type of traditional model for how people support these systems. And that&#8217;s how people can report bugs as well. We also get bug reports from a lot of our partners. In the HPC market, it&#8217;s not like building a packaged product like Office, where you kind of ship it out there and you&#8217;re done. </p>
<p>I have to get OEMs like Dell and HP and IBM and SGI all lined up to ship our software. We have network hardware vendors, software vendors, critical applications like Mathematica and MatLab, that are all integrated in with our products. We have to get all of that put together. </p>
<p>So, we get bugs from those customers or partners as well. And those come, usually, directly in to us. They talk to somebody that&#8217;s kind of an account representative for them at Microsoft, and we get bugs filed directly. We get on the phone with them, and partners get a really close level of support in that way. When end users run into problems, they post to the newsgroups, and we help them out there. </p>
<p>So, the interesting thing here is that we do both. We do stuff that&#8217;s a lot more like the open source community as far as finding out about bugs as well as stuff that is more traditional. </p>
<p><a name="instrument"></a>The bug reporting service built into Windows gives us really great, qualitative data about where things crash, like the fact that most crashes in Windows occurred because of video device drivers. I forget the exact numbers, but something like 50 percent of crashes in Windows used to be because of video device drivers. </p>
<p>And then we have something called WHQL, Windows Hardware Qualification Lab, and they run tests on drivers. In order for a driver to get WHQL certification, it has to pass a battery of tests. So, we would look at the ways these different video drivers were crashing, and create tests that we would then put into WHQL, and that would improve the quality of those drivers continuously. Because even if it&#8217;s a particular driver that crashes, you still blame Microsoft, because Windows crashed. And so, it behooves us to really go through making sure that all of that code is performing correctly. </p>
<p><b>Scott:</b> In other words, you&#8217;re able to regulate the ecosystem to some degree by saying, &#8220;If you want to be certified for Windows, submit your driver to our lab.&#8221; And then if your lab is running tests and they&#8217;re getting the driver to crash, you go back to the vendor and tell them to fix it? </p>
<p><b>Ryan:</b> Actually, the qualification labs are run by a separate agency. They can contact us to get a waiver for a particular company, and under some circumstances that&#8217;s worth doing, to grant a waiver if they&#8217;re not able to pass a particular test. </p>
<p>It could also be, as hardware continually evolves, that the tests may not always keep up. If a hardware vendor implements a totally new piece of hardware, the older tests may not work correctly, so you have to be flexible, of course. But everyone wants to go through that qualification process, because the OEMs &#8212; like Dell and HP and IBM, etcetera &#8212; all want WHQL-qualified drivers on their systems. Otherwise, they&#8217;re going to be getting support calls as well. </p>
<p><b>Scott:</b> One of the things that Microsoft&#8217;s been really focused on for about the last five years is security &#8212; just increasing the security of the stuff that you guys ship. Is any of that included in your certification process? </p>
<p><b>Ryan:</b> There are parts. We&#8217;ve been releasing a lot of our security tools to the community, for people to use the same security tools that we use in their own products as they build their own software. We have a lot of very sophisticated software analysis tools that help identify security bugs before they even pop up. We published a book called &#8220;Writing Secure Code.&#8221; </p>
<p><b>Scott:</b> We interviewed Michael Howard, so we&#8217;ve got a pretty good background on that. </p>
<p><b>Ryan:</b> Actually, Michael and I have worked really closely. As he was doing the first round of the security development life-cycle, I was responsible for security on mobile devices like Pocket PC and Smartphone. </p>
<p>I would say that we do really well on the security end of things. And if you talked to Michael, you know that the issue is that if you miss one thing, there&#8217;s some hacker out there that has years to try and find that one bug that somebody missed. </p>
<p><b>Scott:</b> He also pointed out that as Microsoft hardens their products, hackers are focusing more on the ecosystem because they&#8217;re softer targets. </p>
<p><b>Ryan:</b> I&#8217;ve read that same data and I think that&#8217;s fascinating, the way that these kinds of ecosystems change over time. </p>
<p><b>Scott:</b> So, the third party that does the certification runs tests, but I&#8217;m guessing it hasn&#8217;t evolved to the point where they say to a driver vendor, &#8220;Well, submit your threat models.&#8221; Do you see things evolving in that direction, or is that more intrusive into how an organization does what it does than Microsoft would want to get? </p>
<p><b>Ryan:</b> I would say it leans toward your second point, that when you hand over your threat models, you&#8217;re basically handing over the architecture of your product. And I&#8217;m not sure that a lot of people want to do that. </p>
<p><b>Scott:</b> One of the things that Michael also pointed out is that certain APIs have just been banned, things as simple as strcpy(), because they&#8217;re vulnerable to buffer overruns. </p>
<p>It seems like you guys would be in a difficult position because high-performance computing, on one hand, is all about speed, but I&#8217;m guessing that there&#8217;s a security aspect to it, too. So, how do you balance security and performance, or do you see that there&#8217;s any tension between those at all? </p>
<p><b>Ryan:</b> I don&#8217;t think it&#8217;s black and white. I think there&#8217;s a little bit of tension, but not a lot. Here&#8217;s what people traditionally do with HPC systems: you have a cluster &#8212; let&#8217;s say you have 500 machines &#8212; and you have what&#8217;s called a head node. And a head node is where you submit computational jobs. </p>
<p>That head node will be sitting on the public network, and when you submit your job and your job is ready to run against all the compute nodes in the cluster, those compute nodes are sitting behind a firewall on a private network. </p>
<p>In a sense, all of those compute nodes can have a very open relationship with each other. They may not be hardened in the same way that every other system on the network is. And that&#8217;s OK, because they&#8217;re all behind a firewall. And all the jobs, when they&#8217;re submitted, are validated. You know what user submitted them, and they have permission to submit jobs. And if they don&#8217;t have permission, the job is rejected, and so forth. Those jobs are submitted into that cluster and they run on the cluster at that point. </p>
<p>There&#8217;s a pretty well-established security model for how clusters run computational jobs already. In the military installations that we&#8217;ve talked to, they actually hook them up in separate rooms. They&#8217;re not plugged into any kind of public network at all. </p>
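<p>The head-node model described above can be sketched in a few lines of Python: the head node is the only public entry point, it checks who is submitting, and only accepted jobs are queued for the compute nodes on the private network. The class and field names here are hypothetical, and a real deployment would rely on the operating system&#8217;s own authentication rather than a hard-coded set of user names. </p>

```python
from dataclasses import dataclass


@dataclass
class Submission:
    user: str
    command: str
    nodes: int


class HeadNode:
    """Accepts jobs on the public side; only validated jobs reach the queue."""

    def __init__(self, allowed_users, max_nodes):
        self.allowed_users = set(allowed_users)
        self.max_nodes = max_nodes
        self.queue = []   # jobs cleared to run on the private compute network

    def submit(self, job):
        # Reject anyone who does not have permission to submit jobs.
        if job.user not in self.allowed_users:
            return "rejected: user may not submit jobs"
        # Reject requests the cluster could never satisfy.
        if job.nodes not in range(1, self.max_nodes + 1):
            return "rejected: invalid node count"
        self.queue.append(job)
        return "accepted"


if __name__ == "__main__":
    head = HeadNode(allowed_users={"alice"}, max_nodes=500)
    print(head.submit(Submission("alice", "simulate", 64)))
    print(head.submit(Submission("mallory", "simulate", 64)))
```

<p>Because every job passes through this single checkpoint, the compute nodes behind it can trust one another, which is the trade-off Ryan describes. </p>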
<p><b>Scott:</b> I would assume that just for performance reasons if nothing else, you don&#8217;t want your cluster sharing a network, and it&#8217;s also probably on a much higher speed network than the rest of your organization. I would assume that it&#8217;s pretty isolated. </p>
<p><b>Ryan:</b> Here&#8217;s where it gets really interesting. The thing that my group is really pushing for is moving high-performance computing into more mainstream areas. We don&#8217;t focus on share-shift of moving people from Linux to Windows in the existing market; our job is really to grow the HPC market. And we expect that there&#8217;s going to be a little bit of share-shift, and that&#8217;s great for us, but we&#8217;re really focused on growing that market in new spaces. </p>
<p>One of the areas that&#8217;s very interesting is departmental and workgroup clusters. Those people may set up the whole cluster on a public network. In those systems we need to make sure that that system is secure by default when it comes up. </p>
<p><b>Scott:</b> Do you ever see a day where every computer in the organization is potentially also part of a high performance cluster? Or do you think there will always be dedicated machines that are separate from machines running the user-workloads? </p>
<p><b>Ryan:</b> Are you talking about a situation where my computer in front of me would actually be plugged into a cluster on the back-end? </p>
<p><b>Scott:</b> Right. Kind of like the SETI@home thing, where my machine is idle, so it starts working on this job. </p>
<p><b>Ryan:</b> In some cases, I do see that happening. People call it &#8220;cycle stealing,&#8221; or &#8220;workstation clusters.&#8221; </p>
<p>Those systems only work well for a particular set of computing problems, though. For certain kinds of computational problems that have a high number of cycles of computation per byte of data, those kinds of clusters work great. So SETI@home is a great example; there&#8217;s also a protein folding project which works this way. When your screensaver kicks on, you start doing your computation as a part of the screensaver. There&#8217;s a whole set of other problems that don&#8217;t actually work that well. Seismic processing is a great example of a problem that doesn&#8217;t work very well, because it has a very low number of cycles of computation per byte of data. I talked to one of the companies that does this kind of work on a 10,000-node cluster, and when they get customer data in from an oil company, it&#8217;s a whole pallet of tapes. </p>
<p>The whole thing gets loaded into a gigantic tape array, and then they farm it out across a 10,000-node cluster for computational analysis. They&#8217;re dealing with just ridiculous amounts of data, and that stuff wouldn&#8217;t work well with cycle stealing. When you really get these systems working in the enterprise phase, you&#8217;ve got to figure out how to make sure that data that&#8217;s residing in London doesn&#8217;t get shuffled off to China for computation, because it&#8217;s so inefficient to move it all the way around the world. You may actually spend more time moving data than you do computing it. </p>
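<p>The trade-off between moving data and computing on it can be put in rough numbers. The Python sketch below compares the time to ship a work unit over a network link with the time to process it on the receiving machine. The function names and every figure in the example are illustrative assumptions, not measurements from any real workload. </p>

```python
def cycle_stealing_estimate(data_bytes, cycles_per_byte, link_bytes_per_s, cpu_hz):
    """Return (transfer seconds, compute seconds) for shipping one work unit
    to a remote machine and processing it there."""
    transfer_s = data_bytes / link_bytes_per_s
    compute_s = data_bytes * cycles_per_byte / cpu_hz
    return transfer_s, compute_s


def worth_shipping(data_bytes, cycles_per_byte, link_bytes_per_s, cpu_hz):
    """A work unit suits cycle stealing when computing dominates moving data."""
    transfer_s, compute_s = cycle_stealing_estimate(
        data_bytes, cycles_per_byte, link_bytes_per_s, cpu_hz)
    return compute_s > transfer_s


if __name__ == "__main__":
    # Assumed figures: a 1 MB/s link and a 2 GHz processor.
    link, cpu = 1e6, 2e9
    # Small input, heavy computation per byte (the SETI@home style of problem).
    print(worth_shipping(1e6, 1e6, link, cpu))
    # Large input, light computation per byte (the seismic style of problem).
    print(worth_shipping(1e9, 10, link, cpu))
```

<p>In this simple model the size of the data cancels out: what decides the answer is the number of cycles per byte compared with the ratio of processor speed to link speed. A slow wide-area link therefore rules out exactly the data-heavy jobs Ryan mentions. </p>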
<p><b>Scott:</b> So, in other words, &#8216;high volume of data, low computation per byte&#8217; is bad for that cycle stealing scenario. &#8216;Low volume of data, high computation per byte&#8217; is good for cycle stealing, because you ship a little bit of data out to a machine, and it just crunches on it for quite some time and submits, again, a fairly small result set back. </p>
<p><b>Ryan:</b> <a name="parallel"></a>Yes. Here&#8217;s what I think is a more interesting direction, in terms of your question about what&#8217;s happening with my PC and my cluster in the background. We&#8217;ve hit a point where CPUs aren&#8217;t going faster any longer. 4GHz is pretty fast, and the amount of heat generated by a processor running at 4GHz is enormous. The reason we can&#8217;t go faster is they just get too hot. </p>
<p>So the silicon vendors have started moving in a multi-core direction, and so we see four cores right now, we&#8217;re going to see eight cores, and pretty soon a bazillion cores are going to be on these machines. </p>
<p>And this is an issue for the software industry as well as for Microsoft. What&#8217;s been so great about the software industry is, as processors get faster, we can add more and more features, right? The Linux kernel gets bigger, the Windows operating system gets bigger, we have richer features, we can do more sophisticated stuff than ever before. But now that processors have hit kind of their Gigahertz speed limit, we can&#8217;t just willy-nilly add features into our products any longer. </p>
<p>The bad thing that could happen is that the software industry would become very similar to the white goods industry. You don&#8217;t buy a new washing machine because ooh wow, this one has this great new spin cycle! No, you wait until your washing machine breaks, and then you buy another one. </p>
<p>What has to happen in the software industry is that all software development needs to start moving in the direction of being aware of concurrency. </p>
<p>That concurrency could be across multiple cores on your local machine, or it could be across multiple machines running in a cluster. And so, the thing that I predict we&#8217;re going to see in the future, is that programming models and the kind of programs that are being written will not only be better at taking advantage of multiple cores, but those programs will be better able to run on a cluster as well. </p>
<p><b>Scott:</b> I&#8217;m sure that this is an area of just absolutely intense research, but right now, it&#8217;s just too hard for the average developer to write a good parallel program. </p>
<p>It&#8217;s too hard to write a multi-threaded program; it&#8217;s too hard to write a program that&#8217;s going to run on a cluster. Event driven programming became easy when it was just something the languages natively knew about. And right now, the languages don&#8217;t really natively know about threads, and concurrency, and all that stuff. It&#8217;s done through libraries; it&#8217;s not really done through core concepts in the language. You have to write the thread synchronization logic, and you can&#8217;t really just declare a synchronized integer, for example. </p>
<p>What do you see as the future of the tools, so that it&#8217;s not so insanely difficult to do it right? Do you see the way it&#8217;s done, with high performance computing, where you&#8217;re sending off batches of work and getting results back, do you see that as being a viable model for a multi-core system? Or do you think there&#8217;s some new paradigm on the horizon that might unify both? </p>
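<p>As an illustration of the hand-written synchronization Scott describes, here is a minimal sketch of a &#8220;synchronized integer&#8221; built from a library lock. The class and function names are hypothetical, and Python is used only as a compact notation; the same pattern applies in other languages with library-level threading.</p>

```python
import threading


class SynchronizedInt:
    """An integer whose read-modify-write updates are guarded by a lock."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def add(self, amount):
        # Only one thread at a time may run the read-modify-write.
        with self._lock:
            self._value += amount
            return self._value

    def get(self):
        with self._lock:
            return self._value


def count_in_parallel(workers=8, increments=10_000):
    """Have several threads bump one shared counter and return the total."""
    counter = SynchronizedInt()

    def work():
        for _ in range(increments):
            counter.add(1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.get()
```

<p>The point of the sketch is how much ceremony surrounds a single shared number: the lock, the wrapper methods, and the thread bookkeeping are all the programmer&#8217;s responsibility.</p>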
<p><b>Ryan:</b> For general purpose programming, it&#8217;s the latter. We start moving to new programming models &#8212; not radically new programming models, because we can&#8217;t retrain everybody in totally new programming constructs, but you&#8217;ll see subtle shifts. Languages like C# will continue to evolve and understand better concurrency, and it will become simple for developers. </p>
<p>I think we have to get there, because I agree with you, trying to teach average software developers about how to do concurrent programming just isn&#8217;t going to work, as it stands right now. And if you think that&#8217;s hard, you should look at the MPI programming that people write. This is akin to Assembly level programming &#8212; it&#8217;s just outrageously complicated. </p>
<p>You get this 500-node cluster, and the communications are not handled automatically. One node switches to receive mode, the other node switches to send mode, and then it sends data to the machine that&#8217;s in receive mode. If they both switch to send simultaneously, or they both switch to receive simultaneously, it deadlocks! And your whole 500-node job halts. </p>
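<p>The deadlock described here can be sketched with a toy model. This is not MPI code: it assumes two nodes, unbuffered blocking operations (each send completes only when the peer is in a matching receive), and hypothetical function names. Ordering the exchange by rank parity is one common way to avoid the problem.</p>

```python
def runs_to_completion(ops_a, ops_b):
    """Step two nodes through their blocking operations in lockstep.

    Each list holds "send" or "recv". A step succeeds only when one
    node is sending while the other is receiving; otherwise both
    nodes wait forever, which is the deadlock described above.
    """
    for op_a, op_b in zip(ops_a, ops_b):
        if {op_a, op_b} != {"send", "recv"}:
            return False  # both sending or both receiving: stuck
    # Any unmatched trailing operation would also block forever.
    return len(ops_a) == len(ops_b)


def exchange_plan(rank):
    """Order the two halves of a pairwise exchange by rank parity."""
    return ["send", "recv"] if rank % 2 == 0 else ["recv", "send"]
```

<p>In this model, <code>runs_to_completion(exchange_plan(0), exchange_plan(1))</code> succeeds because the even rank sends first while the odd rank receives first. If both nodes use the plan <code>["send", "recv"]</code>, the very first step blocks and the job never finishes.</p>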
<p><b>Scott:</b> Wow. Getting back to some of the differences between closed source and open source, to follow up on your statement that &#8220;you can&#8217;t just change everything because it&#8217;s hard to train everybody to do things differently,&#8221; open source seems to move very incrementally, because of the way it&#8217;s built, because things are submitted as 100 lines of code to a mailing list where they get scrutinized, and maybe it makes it in and maybe it doesn&#8217;t. And a lot of times, things are worked on in fairly small pieces. </p>
<p>It seems like it&#8217;s fairly impossible for an open source project to make a radical leap. There seem to only be two ways that happens: one is that somebody forks the source and does a fairly major change to it, or a whole new thing comes out of nowhere; you know, Ruby on Rails shows up out of nowhere, and it&#8217;s all the rage for building web stuff. People look at it, and they have to decide whether to stick with PHP or to learn a whole new thing. </p>
<p>In closed source, it seems to be a little different. It seems that there&#8217;s a lot higher barrier to just releasing something completely new that has no connection to an existing product. There&#8217;s a lower barrier to taking an existing product and maybe doing something a little bit more radical with it. Like in Office, they came out with the Ribbon, a whole new UI paradigm. I don&#8217;t know that something like OpenOffice would have ever evolved something like that. </p>
<p>Vista is obviously built on the same code base as previous operating systems, but there were some pretty radical departures there that had some impacts on application compatibility and issues like that. And again, it would be kind of a stretch for open source to do that. But on the other hand, you might come out with a whole new shell that competes with Gnome or KDE or their peers. </p>
<p>Talk about that a little bit. Do you think I&#8217;m onto something, or do you see things that run counter to my observation? </p>
<p><b>Ryan:</b> I think it&#8217;s a very good point, and it&#8217;s something that I&#8217;ve also thought about and wondered about. Part of what&#8217;s powerful about the open source community is that it is incremental. And as they make their incremental changes, if they&#8217;re good, people use them; if they&#8217;re not, nobody uses them, and they eventually are discarded. A good example is user interface models. We have one, Apple has one, and those are very popular user interface models. </p>
<p>The X-Windows model was never really that popular. It&#8217;s popular among a particular group of people, but not like the Macintosh and Windows paradigms were. And they&#8217;ve continually competed and evolved with each other. And I think you see Gnome and KDE coming in and taking the best of what they like out of those models and evolving that, and maybe keeping a little bit of some of the X-Windows stuff that they like as well. </p>
<p>Those are highly evolutionary systems. Things like OpenOffice can really pick at what has been successful&#8211;like Office, for example&#8211;and take that and put it into their products. At the same time &#8212; and I think this will be true for a lot of commercial software companies &#8212; because we spend so much time trying to figure out what people would pay for, if we think something is going to be paid for, we&#8217;ll invest the money there. </p>
<p><b>Scott:</b> It seems like there&#8217;s almost a pressure not to do things that are incremental, because unless it&#8217;s a big enough delta between this version and the previous version, people won&#8217;t pay for it. </p>
<p><b>Ryan:</b><a name="radical"></a> That&#8217;s right. We&#8217;re constantly in this business of having to generate value on a huge scale, because it&#8217;s not like we can build something that&#8217;s interesting for a hundred or a thousand people. It has to be interesting for millions of people. And so that&#8217;s why we can spend so much time figuring out the next big shift that&#8217;s going to come along. </p>
<p>Sometimes that means that we look at stuff that&#8217;s been popular in niche markets and figure out how to make that go really big. And sometimes it&#8217;s radical new things. In some ways, for example, you can say that the original Windows NT operating system was an evolution of traditional systems, based on work that was done at DEC, and work that was done at some other software companies, to create this new kind of kernel. </p>
<p>But NT was a fascinating concept. If we go back and think about it, the idea was that you could write a kernel that would span from the smallest laptop computer up to gigantic SMP systems. </p>
<p><b>Scott:</b> The whole Hardware Abstraction Layer was an unproven theory and a pretty radical idea. </p>
<p><b>Ryan:</b> Right, and it&#8217;s now the foundation for Vista. </p>
<p>You know, HPC is a really interesting market to study from an open source point of view, because it&#8217;s a full ecosystem, and it&#8217;s still pretty heavily driven by the academic community, which is, by nature, fairly open source-oriented. </p>
<p>We fund work at something like 10 universities and research labs around the world, like Jiaotong University in China, and Pacific Northwest National Labs, and the University of Tennessee where Jack Dongarra does all of his HPC research. And we do that because it&#8217;s still critical to really be working well with the academic community, in order to be successful in this market. </p>
<p><b>Scott:</b> They&#8217;re the ones running the climate simulations, things like that. They&#8217;re the ones who, largely, are running the kind of models that need high-performance computing. One of the things that&#8217;s kind of interesting (it&#8217;s not really high-performance computing) is that Amazon has in beta what they call their Elastic Compute Cloud. </p>
<p><b>Ryan:</b> Yeah. My friend Marvin Theimer is an architect at Amazon. </p>
<p><b>Scott:</b> Do you see a future for hosted high-performance computing? Do you see a future for maybe &#8216;high-performance computing lite,&#8217; where it is more tailored to the kind of jobs where you&#8217;re not moving enormous quantities of data, the computation per byte is higher, and you can have simplified communication between the nodes? </p>
<p>I wouldn&#8217;t have to have 500 nodes, but I could push a job up to some hosted environment that could put 10 nodes or 10,000 nodes on it. Does that kind of thing exist today? </p>
<p><b>Ryan:</b> It does exist today. Sun has been doing this for a few years now; it&#8217;s at least two years. They have this model where they charge on a CPU-hour basis. We talk to those guys, and they&#8217;re very open about what they&#8217;ve been up to, or at least they used to be. And the way that those jobs work is you actually ship all the disks over there. </p>
<p><a name="bandwidth"></a>Jim Gray did some really interesting research here at Microsoft. He computed how much it costs to move a certain amount of data and how long it would take to move that much data over a particular kind of pipe, because the question is: how long does it take you to get a terabyte of data from one site to another? </p>
<p><b>Scott:</b> DVDs and FedEx have an incredibly high bandwidth. </p>
<p><b>Ryan:</b> Yeah. That was the result of the research, years ago: &#8220;Just put it on DVDs and ship it that way.&#8221; And so what Sun does is that you actually ship them a cabinet of hard drives, and they plug that into the cluster and use it to compute. </p>
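<p>The arithmetic behind that conclusion is easy to reproduce. The sketch below is not taken from Jim Gray&#8217;s research; the 10 Mbit/s link and the 24-hour delivery time in the usage note are assumed figures, and the function names are hypothetical.</p>

```python
def network_hours(data_bytes, link_bits_per_second):
    """Hours needed to push the data through a link at full line rate."""
    return data_bytes * 8 / link_bits_per_second / 3600


def effective_shipping_mbps(data_bytes, door_to_door_hours):
    """Average bandwidth, in megabits per second, of a shipped package."""
    return data_bytes * 8 / (door_to_door_hours * 3600) / 1e6


TERABYTE = 10**12  # decimal terabyte, in bytes
```

<p>Under these assumptions, <code>network_hours(TERABYTE, 10e6)</code> comes to roughly 222 hours, a little over nine days, while <code>effective_shipping_mbps(TERABYTE, 24)</code> works out to about 93 Mbit/s for a box of media delivered the next day. The shipped box also scales almost for free: doubling the data doubles its effective bandwidth, whereas the network transfer simply takes twice as long.</p>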
<p>Your point is very good though, that a lot of customers will typically do small computational jobs and occasionally they&#8217;ll need a really big cluster. I think that is true going forward, that we&#8217;ll see more and more of that, and people will occasionally need to run a big job. </p>
<p>If part of what we believe is true, that this revolution will happen in workgroup and departmental clusters, it&#8217;s because workgroups and departments will need those smaller clusters for some of their work, but every now and then they&#8217;re going to want to hand jobs to a huge cluster for computation. </p>
<p>And that may be handled using grid standards that are generated by the Open Grid Forum. That may be handled by just handing off to a place like Amazon or Sun. </p>
<p>In closing on the discussion of open source, the HPC market, and the commercial software that we&#8217;re building for HPC, I&#8217;m really hoping that our group at Microsoft will help pave the way for commercial groups at Microsoft to work better with open source projects. </p>
<p><b>Scott:</b> Thanks, Ryan, for all your insight. This was a great conversation. </p>
<p><b>Ryan:</b> Thanks.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2007/07/22/interview-with-ryan-waite/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
