<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
		xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
	xmlns:media="http://search.yahoo.com/mrss/"
>

<channel>
	<title>How Software is Built &#187; PostgreSQL</title>
	<atom:link href="http://howsoftwareisbuilt.com/tag/postgresql/feed/" rel="self" type="application/rss+xml" />
	<link>http://howsoftwareisbuilt.com</link>
	<description></description>
	<lastBuildDate>Fri, 25 Jun 2010 19:53:36 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
	<copyright>2006-2007 </copyright>
	<managingEditor>scottswigart@technologyevangelism.com (How Software is Built)</managingEditor>
	<webMaster>scottswigart@technologyevangelism.com (How Software is Built)</webMaster>
	<ttl>1440</ttl>
	<image>
		<url>http://howsoftwareisbuilt.com/wp-content/plugins/podpress/images/powered_by_podpress.jpg</url>
		<title>How Software is Built</title>
		<link>http://howsoftwareisbuilt.com</link>
		<width>144</width>
		<height>144</height>
	</image>
	<itunes:subtitle></itunes:subtitle>
	<itunes:summary></itunes:summary>
	<itunes:keywords></itunes:keywords>
	<itunes:category text="Society &#38; Culture" />
	<itunes:author>How Software is Built</itunes:author>
	<itunes:owner>
		<itunes:name>How Software is Built</itunes:name>
		<itunes:email>scottswigart@technologyevangelism.com</itunes:email>
	</itunes:owner>
	<itunes:block>no</itunes:block>
	<itunes:explicit>no</itunes:explicit>
	<itunes:image href="http://howsoftwareisbuilt.com/wp-content/plugins/podpress/images/powered_by_podpress_large.jpg" />
		<item>
		<title>Interview with Denis Lussier &#8211; Co-Founder &#8211; CTO &#8211; EnterpriseDB</title>
		<link>http://howsoftwareisbuilt.com/2009/04/22/interiew-with-denis-lussier-co-founder-cto-enterprisedb/</link>
		<comments>http://howsoftwareisbuilt.com/2009/04/22/interiew-with-denis-lussier-co-founder-cto-enterprisedb/#comments</comments>
		<pubDate>Wed, 22 Apr 2009 16:54:13 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Sean Campbell]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/2009/04/22/interiew-with-denis-lussier-co-founder-cto-enterprisedb/</guid>
		<description><![CDATA[Interviewers: Scott Swigart and Sean Campbell Interviewee: Denis Lussier In this interview we talk with Denis. In specific, we talk about: The role of PostgreSQL in the enterprise database sphere Building performance to expand upon the capabilities of LAMP Creating a hybrid open source/corporate business model The role of various license types in enabling business [...]]]></description>
			<content:encoded><![CDATA[
<p><strong>Interviewers:</strong> <a href="http://howsoftwareisbuilt.com/about-scott-swigart/">Scott Swigart</a> and <a href="http://howsoftwareisbuilt.com/about-sean-campbell/">Sean Campbell</a></p>
<p><strong>Interviewee: </strong>Denis Lussier</p>
<p>In this interview we talk with Denis. Specifically, we talk about:</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2009/04/22/interiew-with-denis-lussier-co-founder-cto-enterprisedb#role">The role of PostgreSQL in the enterprise database sphere</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/04/22/interiew-with-denis-lussier-co-founder-cto-enterprisedb#LAMP">Building performance to expand upon the capabilities of LAMP</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/04/22/interiew-with-denis-lussier-co-founder-cto-enterprisedb#model">Creating a hybrid open source/corporate business model</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/04/22/interiew-with-denis-lussier-co-founder-cto-enterprisedb#licensing">The role of various license types in enabling business models</a></li>
<li><a href="http://howsoftwareisbuilt.com/2009/04/22/interiew-with-denis-lussier-co-founder-cto-enterprisedb#sun">Sun&#8217;s relationship with PostgreSQL</a></li>
</ul>
<p><span id="more-217"></span></p>
<p><b>Sean Campbell:</b> Please give us a little bit of background about yourself, both with and prior to EnterpriseDB.</p>
<p><b>Denis:</b> Sure. In 1994, I founded a small boutique system integrations firm called Fusion Technologies that grew to be about 250 people by 2004. In 2003, as a bored CEO running a bigger company, I started an open source practice area at Fusion Technologies. That is where I stumbled across PostgreSQL and MySQL.</p>
<p>I came to realize in that process that PostgreSQL was a superior database to MySQL, although it was not as popular as MySQL. And neither of them was really all that popular compared to Linux in the operating systems space or compared to JBoss in the app server space, just because ANSI compatibility was not that big of a deal in the database space, because nobody really has it and nobody cares about it.</p>
<p>Ultimately, I spun off EnterpriseDB as a separate company. I took venture capital financing in early 2005, and we have done several rounds since then, producing what you now know as EnterpriseDB. I sold Fusion Technologies to a larger company, so I am not involved in that anymore.</p>
<p><b>Sean:</b> I was a database guy for a long time, and I wrote my share of code, although Scott wrote more than me. Where does EnterpriseDB fit into the grand pantheon that includes everything from DB2, to Oracle, to MySQL?</p>
<p><a name="role"></a></p>
<p><b>Denis:</b> We are definitely going after the enterprise space, because we are built on rock solid PostgreSQL, which is enterprise class. If you look at the headlines, MySQL might be referred to as &#8220;The world&#8217;s most popular open source database,&#8221; while PostgreSQL will be referred to as &#8220;The world&#8217;s most advanced open source database.&#8221;</p>
<p>PostgreSQL appeals to the open source UNIX and BSD types of guys, and it is the engineers&#8217; choice for an enterprise class open source database management. MySQL is not silly enough to go directly after Oracle&#8211;even now that they are with Sun&#8211;and neither are we.</p>
<p>Just as you didn&#8217;t rip and replace Solaris to go with Linux, you don&#8217;t just rip and replace Oracle and go with EnterpriseDB, because we are PostgreSQL with Oracle compatibility. Migrations like that can take years.</p>
<p>Even though going after the enterprise market takes years of development, we have an enterprise class product that is appropriate for the enterprise, especially for up and coming companies. For existing companies, we have got to get in with some developmental type projects and earn our stripes through true enterprise class replacement, which again takes years.</p>
<p>Whereas MySQL is sort of positioned as the web database, we do a lot with scale out and so forth, for companies like hi5 and Skype. Guys that are doing huge things with PostgreSQL understand what you have got to do to truly scale out a database, which is PostgreSQL&#8217;s strength. </p>
<p>You also have to scale up the database in terms of OLTP type transactions. Huge numbers of users and transactions are PostgreSQL&#8217;s traditional strength. If you want to do industrial class scale out, PostgreSQL also excels at that. </p>
<p>We fancy ourselves as &#8220;The PostgreSQL Company.&#8221; We are the largest sponsor of PostgreSQL core committers, and we also have a value added product on top of that, which is EnterpriseDB’s Postgres Plus Advanced Server.</p>
<p>That offering includes all of PostgreSQL, plus advanced features that enterprises are interested in&#8211;hence the name. Other than a couple parts of it being closed source in much the same way as Red Hat Enterprise Linux versus Fedora, anyone can kind of do whatever they want whenever they want, without paying us.</p>
<p><b>Sean:</b> It&#8217;s an interesting time to be opposite MySQL, with Sun buying that project to essentially buy a letter in the LAMP stack as a play for some open source credibility.</p>
<p><b>Denis:</b> That&#8217;s not working out so well with all of MySQL’s executives leaving Sun.</p>
<p><b>Sean:</b> I think they found an HR department that was not to their liking, although I have a feeling that the situation stems 25% from the entrepreneur going to work for a large company and 75% from Sun.</p>
<p><b>Denis:</b> We all love Sun and Unix and stuff like that, but Sun and software don&#8217;t do so well, and the MySQL guys in a big corporate culture don&#8217;t do so well, so there is nothing particularly surprising there.</p>
<p><a name="LAMP"></a></p>
<p><b>Sean:</b> When we talked to Dries over at Acquia, who is smaller than you guys by far, he said his goal is essentially to be the Red Hat to Drupal, the Red Hat to portals. </p>
<p>It kind of sounds like you want to be the Red Hat to the database stack, modifying people&#8217;s perception of an M in that LAMP stack as you deal with more scale out in Web 2.0 and aid them in moving beyond being just small sites that run PHP in MySQL.</p>
<p><b>Denis:</b> There is definitely an element of that. Among the features we are about to announce, there is one called &#8220;Infinite Cache.&#8221; Basically, we take memcached, and rather than putting it in front of a database and requiring you to change your application, we take it as the basis.</p>
<p>It is BSD, so you can do whatever you want with it. We take it as the basis and put it behind the database so you can still do your relational queries. We take what is effectively a hypothetical database and put memcached behind it so that we do the caching and you don&#8217;t have to worry about doing special things in your application.</p>
<p>We use a distributed cache behind the scenes. When we say a Web 2.0 database, what that really means to us is that you can do stuff like memcached, but we make it transparent behind the database.</p>
<p>If you want to use a huge scale-out database, like Skype or hi5 does, and you use this technology behind the database, it will automatically cache everything that you put in the distributed cache.</p>
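<p>To make the contrast Denis draws concrete, here is a minimal Python sketch. It is not EnterpriseDB&#8217;s implementation: a plain dictionary stands in for memcached, and <code>SlowStore</code>, <code>cache_aside_get</code>, and <code>CachingDatabase</code> are hypothetical names invented for illustration. The function shows a cache in front of the database, which the application has to manage itself; the class shows a cache behind the database, where callers simply query.</p>

```python
class SlowStore:
    """Stand-in for on-disk table storage; counts how often it is read."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def fetch(self, key):
        self.reads += 1
        return self.rows[key]


def cache_aside_get(cache, store, key):
    """Cache in front of the database: application code manages the cache."""
    if key in cache:
        return cache[key]
    value = store.fetch(key)
    cache[key] = value
    return value


class CachingDatabase:
    """Cache behind the database: callers just query, caching is hidden."""

    def __init__(self, store):
        self._store = store
        self._cache = {}  # assumption: stands in for a distributed cache

    def query(self, key):
        if key not in self._cache:
            self._cache[key] = self._store.fetch(key)
        return self._cache[key]


store = SlowStore({"user:1": "alice"})
db = CachingDatabase(store)
first = db.query("user:1")   # goes to storage
second = db.query("user:1")  # served from the hidden cache
```

<p>In the second style the application never mentions the cache, which is the point of putting it behind the database: existing queries keep working unchanged.</p>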
<p>The remaining part that isn&#8217;t the read&#8211;maybe 90% of a typical web based application&#8211;is an OLTP application. PostgreSQL and, therefore, EnterpriseDB based on it, is one of the best databases in the world for that.</p>
<p>The architecture underneath is very, very similar to Oracle in terms of how you do distributed locking and how you do row level locking versus page level locking, and all that stuff. So, we have the traditional strengths of Oracle in that area, but we leverage various open source projects when creating Postgres Plus Advanced Server. It is similar in approach to Red Hat Enterprise Linux.</p>
<p><a name="model"></a></p>
<p><b>Sean:</b> Assuming the analogy still holds true, Red Hat has been doing really well. They had a good quarter for a technology company in a quarter where many did not. They have been growing continually, but particularly in the last quarter. What would you like to follow or not follow in terms of their example?</p>
<p><b>Denis:</b> They obviously defined themselves as an enterprise open source company, but they try to combine the best of the open source world and the non open source world. We seek to follow that; for example, they defined the successful subscription model that we follow, and there is a lot to follow there.</p>
<p>Ed Boyajian, who is our CEO now, ran North American operations for Red Hat, so we have an ongoing affinity for Red Hat. Everything that we do moves towards open source, but we don&#8217;t rush making it open source before it is time, because we want to get the economic benefit out of some of our features.</p>
<p>Our features move to open source at the rate that makes sense. We don&#8217;t have a policy that says everything must be GPL2 immediately. We want to emulate Red Hat&#8217;s success, but frankly the database market is somewhere between six and eight times bigger than the operating system market.</p>
<p>We would hope to be as successful as Red Hat someday, but we dream of being even more successful because of the bigger market.</p>
<p><b>Scott Swigart:</b> How do you make that decision about what makes sense to keep in your proprietary offering and when it makes sense to push those patches out to PostgreSQL?</p>
<p><b>Denis:</b> To speak in sweeping generalities for a moment, if we have got a feature that we have made closed source, and somebody else is creating a competitive feature that would cause the source code to fork unnecessarily, we will contribute our feature.</p>
<p>Sometimes, we open source defensively like that. We also open source sometimes because we have a feature that is going to be enterprise class, but we need a much wider QA community before we can make it enterprise class. That&#8217;s more of an offensive reason to open source.</p>
<p>In general, the PostgreSQL community is all about ANSI compatibility. PostgreSQL is the strongest ANSI database in the world. Our customers who care about Oracle compatibility features are obsessed with the money saving aspect of open source and with getting an enterprise class product based on open source.</p>
<p>They understand that some of our Oracle compatibility features are not open source. To open source those features, we would really have to try and force them on the community, because they don&#8217;t really want them anyway.</p>
<p>The decision about whether to open source a specific feature is sort of complicated, but most decisions we make about how, when, and where to open source something make sense if you break it down into those three categories. The decision that you make and the timing become self evident.</p>
<p><a name="licensing"></a></p>
<p><b>Scott:</b> Does the fact that PostgreSQL is under the BSD license help to enable your business model? In other words, GPL wouldn&#8217;t really work for what you are doing.</p>
<p><b>Denis:</b> It helps to look at things in terms of generalities. There is GPL that is not controlled by a single commercial interest, such as Linux. There is also GPL that is dominated by a single company, such as JBoss and MySQL.</p>
<p>Then there is BSD. There is BSD and Apache, where the licenses are more permissive, so you can commercialize them and make your own decision as to how, when, and what you open source.</p>
<p>I chose PostgreSQL when I founded EnterpriseDB because, from an engineer&#8217;s perspective, PostgreSQL is a better database, but from a license perspective, it also provides flexibility for our business model to combine open source vision with commercial pragmatism.</p>
<p><b>Scott:</b> How has it been beneficial for PostgreSQL itself, in terms of its evolution? I know there are some people who would say that a GPL license would be best for a given project, because people have to share their derivative work.</p>
<p>On the other hand, there are people who will say that you have to make enough money to stay in business, and that by growing a company, they are going to be able to contribute a lot more to that core project.</p>
<p><b>Denis:</b> Both models can and do work. They have their strengths and weaknesses. You can&#8217;t really directly sell a subscription to something like PostgreSQL, the Apache server, Tomcat, or anything with an Apache type license, a BSD type license, or an MIT license.</p>
<p>You can be a professional services company that specializes in that, but the actual software is very high quality and it is free. That makes it harder for a company to make money on it, but on the other hand, it is easy for a lot of companies to get behind contributing to it, because they want to invest their time and they want to use it for free.</p>
<p>Therefore, the Apache, BSD, and MIT licenses encourage people to trade time for money. GPL, on the other hand, lets a company get behind it. I would argue that PostgreSQL is the world&#8217;s best open source database and MySQL is the world&#8217;s most popular. It was the most popular because there is a commercial entity behind it.</p>
<p>We don&#8217;t control the PostgreSQL community that EnterpriseDB is a big contributor to. It is a very independent community. Several of the core committers work for us and so on, but we don&#8217;t control it, much as IBM doesn&#8217;t strictly control the Apache community. They are a major contributor to it, but they don&#8217;t control it, and they don&#8217;t need to.</p>
<p>You can sell WebSphere, whereas you can&#8217;t sell Apache or Tomcat. MySQL is successful and it is a good product (although, naturally, I think that PostgreSQL is better). We think that models like PostgreSQL, Apache, BSD, and MIT can be successfully commercialized by a company that understands open source, the way IBM does. They can have some closed source aspects of their product and still be a good open source company.</p>
<p>IBM is an investor in our company, and we frankly try to emulate their success in the open source market.</p>
<p><a name="sun"></a></p>
<p><b>Scott:</b> Sun used to be, or maybe even still is, a fairly big supporter of PostgreSQL. What is that relationship like?</p>
<p><b>Denis:</b> Sun is trying to be a good open source company in the model of IBM, but once they spent a billion dollars on MySQL, they slowly and steadily backed away from PostgreSQL, which is what you would expect. </p>
<p>We are steadily getting into the Solaris space ourselves, rather than just doing a third party contract through Sun, which has basically expired.</p>
<p><b>Scott:</b> Who is your target customer? Are they the kind of people who would have bought Oracle in the past, but just don&#8217;t see the justification for spending that kind of money for an enterprise class database?</p>
<p><b>Denis:</b> There are startup companies that see the value of PostgreSQL but want people who are true experts at it. They are not going to build it themselves, and they get all the build optimizations and cache update service, and that kind of stuff.</p>
<p>There are also companies that are spending, in some cases, tens or hundreds of millions of dollars on Oracle per year that want to, over time, decrease that dependency. And then you have everything in the middle.</p>
<p>We recognize that big companies can&#8217;t just replace Oracle or SQL Server overnight, but we think we have something that they can move toward over time, and we have been very successful with that model.</p>
<p>In fact, that is very similar to the Red Hat Linux model. They targeted Solaris, but you didn&#8217;t just rip and replace Solaris eight years ago, even though you might now.</p>
<p><b>Scott:</b> Well, we have come to the end of our time, but I want to thank you for a great conversation.</p>
<p><b>Denis:</b> Thank you. I enjoyed it.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2009/04/22/interiew-with-denis-lussier-co-founder-cto-enterprisedb/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Interview with Josh Berkus &#8211; PostgreSQL Core Team Lead &#8211; Sun Microsystems</title>
		<link>http://howsoftwareisbuilt.com/2007/08/22/interview-with-josh-berkus-postgresql-core-team-lead-sun-microsystems/</link>
		<comments>http://howsoftwareisbuilt.com/2007/08/22/interview-with-josh-berkus-postgresql-core-team-lead-sun-microsystems/#comments</comments>
		<pubDate>Wed, 22 Aug 2007 16:35:17 +0000</pubDate>
		<dc:creator>campsean</dc:creator>
				<category><![CDATA[Sean Campbell]]></category>
		<category><![CDATA[databases]]></category>
		<category><![CDATA[Josh Berkus]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[PostgreSQL]]></category>
		<category><![CDATA[quality]]></category>
		<category><![CDATA[sun microsystems]]></category>

		<guid isPermaLink="false">http://howsoftwareisbuilt.com/2007/08/22/interview-with-josh-berkus-postgresql-core-team-lead-sun-microsystems/</guid>
		<description><![CDATA[Interviewers: Scott Swigart and Sean Campbell Interviewee: Josh Berkus In this interview with Josh, PostgreSQL Core Team Lead at Sun Microsystems Inc., we asked him about: How PostgreSQL fits into the landscape of open source databases New uses of PostgreSQL How new code and features are selected for PostgreSQL Quality control and maintenance of code [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Interviewers:</strong><br />
<a href="http://howsoftwareisbuilt.com/about-scott-swigart/">Scott Swigart</a> and <a href="http://howsoftwareisbuilt.com/about-sean-campbell/">Sean Campbell</a></p>
<p><strong>Interviewee:</strong> <a href="http://howsoftwareisbuilt.com/about-josh-berkus-postgressql-core-team-lead/">Josh Berkus</a></p>
<p>In this interview with Josh, PostgreSQL Core Team Lead at Sun Microsystems Inc., we asked him about:</p>
<ul>
<li><a href="http://howsoftwareisbuilt.com/2007/08/22/interview-with-josh-berkus-postgresql-core-team-lead-sun-microsystems/#whatispostgres">How PostgreSQL fits into the landscape of open source databases</a></li>
<li><a href="http://howsoftwareisbuilt.com/2007/08/22/interview-with-josh-berkus-postgresql-core-team-lead-sun-microsystems/#usesofpostgres">New uses of PostgreSQL</a></li>
<li><a href="http://howsoftwareisbuilt.com/2007/08/22/interview-with-josh-berkus-postgresql-core-team-lead-sun-microsystems/#includedfeatures">How new code and features are selected for PostgreSQL</a></li>
<li><a href="http://howsoftwareisbuilt.com/2007/08/22/interview-with-josh-berkus-postgresql-core-team-lead-sun-microsystems/#codequality">Quality control and maintenance of code</a></li>
<li><a href="http://howsoftwareisbuilt.com/2007/08/22/interview-with-josh-berkus-postgresql-core-team-lead-sun-microsystems/#maintainers">Core maintainers and code contributors</a></li>
<li><a href="http://howsoftwareisbuilt.com/2007/08/22/interview-with-josh-berkus-postgresql-core-team-lead-sun-microsystems/#coders">How contributors participate in PostgreSQL development</a></li>
<li><a href="http://howsoftwareisbuilt.com/2007/08/22/interview-with-josh-berkus-postgresql-core-team-lead-sun-microsystems/#opensource">Differences between open and closed source databases</a></li>
<li><a href="http://howsoftwareisbuilt.com/2007/08/22/interview-with-josh-berkus-postgresql-core-team-lead-sun-microsystems/#variations">Variations between PostgreSQL products</a></li>
<li><a href="http://howsoftwareisbuilt.com/2007/08/22/interview-with-josh-berkus-postgresql-core-team-lead-sun-microsystems/#microsoft">PostgreSQL Windows users and Microsoft support</a></li>
</ul>
<p><span id="more-90"></span></p>
<p><strong>Sean Campbell:</strong> Josh, tell us a bit about your role with PostgreSQL.</p>
<p><strong>Josh Berkus: </strong>OK. Well, what I&#8217;m best known for is that I&#8217;m a member of the seven member core steering committee for the PostgreSQL Open Source Database Project. I also work at Sun on PostgreSQL and I&#8217;m the PostgreSQL lead, which is sort of a strategic and evangelism position at Sun Microsystems in our database technology group.</p>
<p>I&#8217;ve been a database applications developer since about 1994. I started out actually with desktop databases and moved to Microsoft SQL Server, and from there to PostgreSQL in 1998. For the last couple of years, I worked at Greenplum, a data warehousing company. I did a lot of consulting on database performance. Now I seem to spend most of my time going to conferences.</p>
<p><strong>Sean: </strong>Talk a little bit, if you would, about PostgreSQL and where it kind of fits in the ecosystem, with open source things like MySQL on one side and the proprietary Microsoft SQL Server and Oracle Database on the other.</p>
<p><a name="whatispostgres"></a><strong>Josh: </strong>Well, the way I like to describe PostgreSQL is we&#8217;re sort of the high end of open source databases: very SMP scalable; capable of running large, complex queries involving multiple sub selects; and all kinds of SQL tricks. A lot of functionality is built into the database, including triggers and views and schemas, and the ability to use 11 or 12 procedural languages to write procedures in. </p>
<p>Mostly, people who look at PostgreSQL are also considering databases like Oracle or DB2 or SQL Server 2005 for their needs &#8211; large database installations involving possible terabytes of data and usually a dedicated database administrator.</p>
<p>That&#8217;s probably the majority of our usage. We also get a fairly substantial amount of sort of embedded usage. Not because PostgreSQL is really an embedded database, but because of our extremely liberal BSD open source licensing terms.</p>
<p>Something I would like us to be better known for is the extensibility of the database model. The origin of PostgreSQL goes back to 1986, which was the POSTGRES Project at the University of California &#8211; Berkeley. It was the second UCB database project. The reason it was called Postgres was it actually stands for post Ingres, Ingres having been the first UCB database project.</p>
<p>One of the principles that PostgreSQL was founded on was the idea of an object relational database. That is, the database administrator or designer should be able to modify the behavior of database objects with code, or add their own new types of database objects. On a practical basis, that&#8217;s given us a whole bunch of exotic data types that are extremely useful for keeping certain specific and unusual types of data &#8211; genetic sequences, geographic data, or cryptographic data &#8211; that might be hard to store in the standard SQL data types, let alone to index or do anything useful with.</p>
<p><strong>Sean: </strong>Well, tell me a little bit about that, just to dive bomb for a second. Coming from a database background, there&#8217;s been a lot of discussion about this on the SQL Server side, progressively integrating additional data types, working in an XML data type, and so on.</p>
<p><strong>Josh: </strong>Right.</p>
<p><strong>Sean: </strong>But obviously, one of the challenges with something like this is making sure that it&#8217;s valid and useful for the community at large. I think the deeper you get into a particular data type, you really have to have it vetted by the community, to make sure it&#8217;s actually useful, and people don&#8217;t just go back and throw it into a more generic field type.</p>
<p>So, tell me a little bit about the process of getting something like that included in PostgreSQL, and maybe take the most interesting example you can think of. From soup to nuts, how did it get originally proposed, and then how did it get eventually integrated into the product?</p>
<p><strong>Josh: </strong>So an example of one, actually, that&#8217;s going to be fully integrated into the core code is the UUID data type. It&#8217;s coming out in version 8.3, which is the next release version, and will come out somewhere around October. </p>
<p>Now, I know that SQL Server has had UUIDs for a while. We&#8217;ve actually been putting off incorporating UUIDs and GUIDs, simply because there are five or six competing ways to make up such a data type. </p>
<p>So, when UUIDs and GUIDs were proposed for inclusion in Postgres, one of the goals was to actually create a data type that would support the various different ways of forming a UUID, and yet have a compact representation as well as its own index types and operators.</p>
<p>That is, for example, equality for a UUID is the same sort of concept as it would be for an integer. Differential operators, like greater than, less than, and that sort of thing, actually don&#8217;t function the same.</p>
<p><strong>Sean: </strong>Sure.</p>
<p><strong>Josh: </strong>Because, among other things, UUIDs will contain information on which machine created that particular ID.</p>
<p><strong>Sean: </strong>Right.</p>
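<p>As a rough illustration of the points above (several spellings of one UUID, a compact stored form, well-defined equality, and a machine identifier embedded in the value), here is a minimal sketch using Python&#8217;s standard <code>uuid</code> module. It is not PostgreSQL&#8217;s implementation; the sample ID and the node value are made up for the example.</p>
<pre><code class="language-python">import uuid

# Three textual spellings of the same 128-bit value (hypothetical sample ID).
spellings = [
    "a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11",
    "{A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11}",
    "urn:uuid:a0eebc999c0b4ef8bb6d6bb9bd380a11",
]
parsed = [uuid.UUID(s) for s in spellings]

# Equality compares the underlying value, not the spelling.
all_equal = all(p == parsed[0] for p in parsed)

# The compact form is always 16 bytes, versus 36 characters as canonical text.
compact_size = len(parsed[0].bytes)

# A version-1 (time-based) UUID embeds a 48-bit node ID, typically a MAC address.
# The node value here is invented for the example.
v1 = uuid.uuid1(node=0x001122334455, clock_seq=0)

print(all_equal, compact_size, v1.version, hex(v1.node))
</code></pre>
<p>Because part of a version-1 value identifies the generating machine rather than a quantity, sorting such IDs by their raw bits does not give an ordering with any numeric meaning, which is one way to read the remark about greater-than and less-than.</p>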
<p><a name="usesofpostgres">
<p><strong>Josh: </strong></a>Yeah. So, an example of a more exotic one that&#8217;s not yet in the core code, but is out there in our add ins and that I find much more interesting, is genetics data types. There&#8217;s a couple of projects; one of them is called Unison DB, which was sponsored and open sourced by Genentech, and another one is a community project called BLASTgres. </p>
<p>These are all projects of genomics scientists.</p>
<p>One of the things that they needed to be able to do is they needed to be able to store sequences of base pairs in the database. Now, that&#8217;s what we call a multi value data type, as in it has meaning as a whole, in sequential order, but also, each component has meaning in small pairs. Initially they tried to store this, actually, sort of vertically &#8211; you have each pair as a row. But when you&#8217;re talking about millions of protein sequences, each of which can have a couple thousand base pairs, that&#8217;s really not a realistic method of storage.</p>
<p>So instead, they actually modified our already existing array data type &#8211; which is quite fully featured, in terms of having its own special comparison operators and special ways of indexing it &#8211; and modified it to hold base pairs. So, that allowed them to do meaningful comparisons of things like equality, and particularly to compare individual base pairs. That is, if the base pair in the third place is the same in both sequences, what percent of the base pairs are equal across those sequences?</p>
<p>Because in protein analysis, you&#8217;re not expecting an exact match; you&#8217;re expecting a match, basically, according to percentages. You want to be able to look up and say, &#8220;OK, I want to find all the genetic sequences that have this particular sequence of six base pairs anywhere in the protein,&#8221; and that requires a special index type which is based on our generalized index search tree. The tree allows you to create your own index types to support some of these exotic data types.</p>
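<p>The two operations described here, a percentage match between sequences and a search for a short run anywhere inside a sequence, can be sketched in a few lines of Python. This is only an illustration under simple assumptions (plain strings, equal-length comparison, a linear scan with no index); the function names and sample data are hypothetical and are not taken from Unison DB or BLASTgres.</p>
<pre><code class="language-python">def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        return 0.0
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def sequences_containing(motif, sequences):
    """Names of sequences that contain the motif anywhere (linear scan, no index)."""
    return [name for name, seq in sequences.items() if motif in seq]


# Made-up sample data: one string per sequence rather than one row per element.
samples = {
    "seq1": "TTGATTACAGG",
    "seq2": "CCCCGGGGAAAA",
    "seq3": "GATTACTTTT",
}

print(percent_identity("GATTACAG", "GACTATAG"))
print(sequences_containing("GATTAC", samples))
</code></pre>
<p>The scan in <code>sequences_containing</code> touches every sequence, which is exactly the cost a purpose-built index type is meant to avoid when there are millions of stored sequences.</p>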
<p><a name="includedfeatures">
<p><strong>Sean: </strong></a>One of the questions out of that, I guess, because that lays out pretty clearly what the feature set is, is about how you go about making the decision to incorporate something like that into the mainline product and/or to just kind of leave it as a community add on, for lack of a better phrase?</p>
<p><a name="codequality">
<p><strong>Josh:</strong></a>  There&#8217;s actually a variety of features, and we&#8217;re going through this right now with our full text search type. So there&#8217;s a whole variety of decisions that goes into that. One is how broad the usage is. That is, is this something which is used by a large percentage of our users, or is it relatively obscure?</p>
<p>A second question is how good the code quality is. That&#8217;s a big deal, because we&#8217;re an open source project that&#8217;s been open source for 11 years and part of a project that&#8217;s 21 years old. Maintainability in our code base is actually, possibly, our number one priority because every one of our major contributors now was not here at the beginning of the project.</p>
<p>Every single one of them inherited the code from someone else, and they know how important it is to be able to pass it on. So that&#8217;s actually been a big holdup in incorporating our full text search type into the core, because it was written by some Russian developers and they had a lot of issues with doing internal documentation of the code in English.</p>
<p>So, in that case, for other data types and the like, there&#8217;s going to be the same sort of standards in terms of internal documentation of the code; good quality public user documentation; the code being formatted correctly and easy to read; and, having consistent sub functions and references that are easy to follow and match our other coding standards that are part of our documentation. </p>
<p>Another part of it is going to be, again, for the maintenance issue, whether or not we feel that the contributors are going to be with the project for the medium to long term, because if it&#8217;s something that somebody just dumps on us &#8211; however good it is right now &#8211; and then leaves, then what we&#8217;ve done is we&#8217;ve added to the burden of the core maintainers to keep up that extra 10,000 or 15,000 lines of code.</p>
<p>For an individual data type, that&#8217;s not very much. If you look at it like we add 10 new data types, then that requires us to have two more code maintainers, just to maintain that extra code base. So, if somebody is not going to be making a long term commitment to maintain the code, then the bar to add it to the core code becomes much higher.</p>
<p>Then, a final consideration would be external dependencies. That is, one of the things that we do to make PostgreSQL easy to install is that the dependencies to install the very core code of PostgreSQL with no additional options are extremely light. You basically need a handful of GNU utilities, certain C code building utilities, and that&#8217;s it. That&#8217;s made it very easy to install PostgreSQL on 30 different platforms; so has building it from source, so there&#8217;s a variety of means. For the people out on Windows, you can build it with either MinGW or with Visual Studio. That wouldn&#8217;t be possible if we had a whole slew of external dependencies.</p>
<p>So, there&#8217;s add ons that require heavy external dependencies. For example, there&#8217;s a procedural plug-in to allow you to use PHP inside the database. The reason it hasn&#8217;t been included in the PostgreSQL core code, even though it&#8217;s fairly feature complete and is reasonably popular, is specifically because in order to build it you have to have PHP and Apache installed and even configured in certain ways. So that makes it a real dependency issue in order to build that component.</p>
<p>That&#8217;s very tricky, and we don&#8217;t necessarily want things that are that hard to build in the core code. We&#8217;ll put them in add ons, where people realize that they have to take extra steps.</p>
<p><a name="maintainers">
<p><strong>Sean</strong>:  </a>One follow on to that, too, back onto the core maintainers. I&#8217;m just curious, how many core maintainers do you have, considering that came up a couple times in the vetting process? And, how does someone kind of move from just a general community member to a core maintainer? </p>
<p>Because different projects handle that promotion process a little bit differently.</p>
<p><strong>Josh</strong>:  Yeah. We don&#8217;t have a formal process. We&#8217;re actually at the sort of extreme end of open source projects in that all of our policies are negotiated and unwritten, pretty much. It&#8217;s because of the age of the project and it&#8217;s because we&#8217;ve always had a consensus process for making decisions, which has yet to break down. So there hasn&#8217;t been a need for some of the more elaborate, formal structures you would find in, for example, Apache.</p>
<p><strong>Sean</strong>:  Right, so you guys haven&#8217;t had to have like a voting process, per se, on things. It&#8217;s kind of been a communal decision process.</p>
<p><strong>Josh</strong>:  Yeah. There are some things that are conventions. You&#8217;re not going to be considered, for example, for getting commit ability on the CVS tree if you haven&#8217;t been around the project for a couple of years, contributing. Again, it&#8217;s a 10 year old project. We feel that we can wait two or three years for somebody&#8230;</p>
<p><strong>Sean</strong>:  [laughs] Right, right.</p>
<p><strong>Josh</strong>:  If they go away in that amount of time, then we didn&#8217;t want them as a committer anyway.</p>
<p><strong>Sean</strong>:  Right, right. Somebody that flashes onto the list and is like, &#8220;I want to be helpful! I want to be helpful!&#8221; then six months later, you&#8217;ve never heard from them. You have kind of a base vetting process just from that alone.</p>
<p><strong>Josh</strong>:  Yeah, yeah. So we&#8217;ve got people who&#8217;ve been around and contributing code for a couple of years. Volume is also a consideration, because somebody who&#8217;s only contributing one or two small patches per version, then there&#8217;s no particular need for them to have any greater level of access.</p>
<p>The big issue is having lots of free time or having time paid by your employer to work on these things, because the main thing that we need from major contributors now is actually time to review other people&#8217;s code.</p>
<p><a name="coders">
<p><strong>Sean</strong>:</a>  Well, one question on that, because we&#8217;ve found this to be interesting as we&#8217;ve talked to different projects, is how much of the project is funded by proxy, in the sense that somebody&#8217;s got a day job? </p>
<p>Is there a scenario where someone is funded predominantly by their employer to write code for PostgreSQL?</p>
<p><strong>Josh</strong>:  Yes.</p>
<p><strong>Sean</strong>:  And what percentage of the base of people writing the mainline code probably falls into that type of category?</p>
<p><strong>Josh</strong>:  We haven&#8217;t tried to do a count for about three years, but I would estimate, now, 80% to 90% of the code changes that go into any given version are written by people who were either paid directly to work on PostgreSQL development, or for whom working on PostgreSQL development is an approved use of their work time.</p>
<p>The second class would be large PostgreSQL users, like the staff of Afilias, for example. They&#8217;re not required by Afilias to contribute to Postgres, but if they want to spend Tuesday afternoon working on a Postgres patch, it&#8217;s completely acceptable to Afilias if they do so.</p>
<p>There&#8217;s a number of those. So, yeah. That&#8217;s actually one of the big myths of open source is people imagine a bunch of hobbyists. And I&#8217;ll say, in the early days of Postgres, we were hobbyists, because you couldn&#8217;t use it for much. I was actually earning my living as a SQL Server performance consultant, and working on Postgres for sort of my own stuff. But once a project gets big and commercially adopted, you&#8217;re going to find that at least three quarters of the code contribution comes from people who are paid to work on the project.</p>
<p><strong>Sean</strong>:  I want to give Scott a chance to chime in here, too. But one of the things that came out of one of the conversations we had &#8211; I think it was with Michael Tiemann &#8211; was the concept of: alright, so you&#8217;ve got a set of developers, they work for a large company, and that company would like to get something into the product. So, they go off and squirrel away and work on some feature.</p>
<p>Let&#8217;s say it&#8217;s the genetic discussion we were having earlier, right? That&#8217;s probably not the best example, but let&#8217;s imagine that was the case. How do you deal with the challenge of someone going and building &#8220;x&#8221;, and they feel they&#8217;ve invested real company time and stake and equity in it, but yet, maybe it doesn&#8217;t make it in because the community writ large just doesn&#8217;t feel it&#8217;s maintainable or it doesn&#8217;t meet the standards you guys are looking for?</p>
<p><strong>Josh</strong>:  Yeah. Well, that&#8217;s something that needs to be handled with a fair amount of diplomacy. And there have certainly been failures on that in the past, but I think it&#8217;s because all of our interactions with our contributors tend to be highly personal. That is, if something gets rejected, then it&#8217;s going to be after Bruce or one of the other reviewers had 16 or 17 different email interactions with a contributor. But no, that&#8217;s not until after we&#8217;ve given the contributor multiple chances, made it clear to them what needed to be changed, and given them multiple chances to modify their stuff, and hopefully made them understand why it was being postponed.</p>
<p>We&#8217;re actually struggling with that right now, because we&#8217;re having a bit of a fire hose problem with version 8.3; as in, when we hit feature freeze at the beginning of April, we had something over 100 different patches pending, some of which involved up to 50,000 to 60,000 lines of code. So, the result is we&#8217;ve actually set a very high bar for things making it into 8.3. Patches that we might have accepted in an earlier version and spent more time getting up to the acceptable standards of code are instead being held back for 8.4.</p>
<p><a name="opensource">
<p><strong>Sean</strong>:</a>  Well, one last question, and then I want to give Scott a turn. </p>
<p>So, from a database side in the open source world, what do you think the open source development methodology brings to a database product that either gives it more credibility, a better feature set, etc. when compared to a closed source model for developing a database product? </p>
<p>Because we&#8217;ve asked everybody this, in some form or other, and the responses have been really interesting. I don&#8217;t mean that in a pejorative way, I mean they&#8217;ve been very interesting to listen to.</p>
<p>But, we haven&#8217;t talked to anybody from the database side.</p>
<p><strong>Josh</strong>:  Well, actually, one of the biggest benefits is for security and reliability. We actually had Coverity run a code check on PostgreSQL a couple of years ago, something that they&#8217;re apparently going to do again for us, and one of the first things that I noticed is that the PostgreSQL core actually has possibly the lowest code count, in terms of lines of code, for any major SQL database.</p>
<p>What that&#8217;s indicative of is that we&#8217;ve spent a lot of time; that every time we release a new version, there is significant refactoring involved in it, and a real effort to keep the code clean and eliminate anything that&#8217;s Byzantine or hard to read.</p>
<p>The payoff for that is that it makes it very easy to keep the code reliable and secure. That is, if somebody reports a security issue, we can generally come up with a fix in 24 to 48 hours, because it&#8217;s very easy to zero in on exactly where the problem is happening.</p>
<p>It also prevents such issues from occurring in the first place, because there aren&#8217;t mystery functions that nobody understands and can&#8217;t touch. Having worked on some proprietary software, I know how that kind of stuff creeps into your proprietary software, because you&#8217;re more concerned with meeting the ship date, and the idea is that you&#8217;ll clean it up in the first update version. Only after you meet the ship date, cleaning it up becomes a low priority. </p>
<p>So for us, because all of the code is out there and that it&#8217;s all visible, maintainability is a primary goal. There is no postponing cleaning it up. The cleaning it up has to happen before we release. The result has been very highly secure and very highly reliable code.</p>
<p><strong>Scott</strong>:  So, you mention that there&#8217;s a lot of patches queued up, and a lot of them aren&#8217;t necessarily going to make it into this upcoming version. How is that a decision that&#8217;s made? Open source projects kind of have a different hierarchy and a different culture, so I&#8217;m trying to understand with PostgreSQL, is it kind of a representative democracy where the steering committee sort of votes on those or&#8230;?</p>
<p><strong>Josh</strong>:  Think of it as almost a pure democracy. Yeah, because most of those decisions are made by rough consensus on what we call the &#8220;hackers mailing list,&#8221; which has something on the order of 7,000 subscribers, although probably only 75 to 100 of those people are really active.</p>
<p><strong>Scott</strong>:  Sure.</p>
<p><strong>Josh</strong>:  The rest of them are just monitoring what goes on. Basically what happens is, if there&#8217;s a patch, there&#8217;s a couple of other lists attached to that &#8211; the actual patches mailing list or the actual committer&#8217;s mailing list. But most of the discussion happens on &#8220;hackers.&#8221;</p>
<p>If somebody submits a patch, or preferably a specification before they submit the patch, then there&#8217;s going to be lots of discussion and we&#8217;ll form a rough consensus on whether it&#8217;s a good idea or not, whether it belongs in the core code, and whether it belongs in an add in, and other issues like that. Then, when the patch actually gets submitted, it is up to the code reviewer &#8211; who will generally be one of our handful of committers, people who actually have direct access to the CVS tree &#8211; to decide whether the code is of sufficient quality to make it in or whether it needs work.</p>
<p>If that&#8217;s an extended process, they will generally take that back onto the hackers mailing list and say, &#8220;This is a really cool feature, but the code is a mess and it needs X, Y, and Z. If the original contributor didn&#8217;t clean it up, is there someone else who cares enough about it to clean it up, or are we going to hold it back?&#8221; That will get sort of worked out there. And it&#8217;s sort of pure democracy. It&#8217;s not so much pure democracy as actually what I call &#8220;volunteerocracy.&#8221;</p>
<p>[laughter]</p>
<p><strong>Josh</strong>:  It&#8217;s that somebody can force a decision by saying, &#8220;This feature is really, really important to me and I&#8217;m going to do whatever it takes to clean it up so it can go in.&#8221;</p>
<p><strong>Scott</strong>:  Right.</p>
<p><strong>Josh</strong>:  When somebody doesn&#8217;t step forward and do that, often stuff gets held back.</p>
<p><strong>Scott</strong>:  OK. So there isn&#8217;t like a formalized vote, but it is pretty obvious kind of what the consensus is.</p>
<p><strong>Josh</strong>:  Yeah. I mean, the core team is a steering committee, but our goal is to actually do as little as possible. The main thing that we do is we set the date for feature freeze, beta and release. We handle security issues, because those need to be dealt with in a confidential forum, and that&#8217;s it. </p>
<p>There&#8217;s been months where we&#8217;ve gotten maybe a dozen messages total on the closed core list. The vast majority of any decision making, any reviewing, any discussion, happens on the public mailing list, particularly hackers, but also to a lesser degree, patches and committers.</p>
<p>There&#8217;s a few segments of PostgreSQL, things like the JDBC driver, which have their own development mailing list, so they make their own decisions within their development mailing list to submit their decisions to hackers. We take their word for it because the rest of us don&#8217;t know anything about Java.</p>
<p><a name="variations">
<p><strong>Scott</strong>:</a>  Now, one of the things that happens with something like, let&#8217;s say, the Linux kernel&#8230; You&#8217;ve got Linus Torvalds, who essentially says, &#8220;Here&#8217;s the new kernel&#8221; and it goes out to the different distros. </p>
<p>And the different distros are all basically kind of a fork of that kernel.</p>
<p>The kernel that ships in Ubuntu isn&#8217;t exactly the same as the one that&#8217;ll ship in RedHat or something like that. And part of the reason for that is there may be certain features which are very important to RedHat customers, but it&#8217;s not something that they&#8217;re able to get necessarily into the core kernel, right? I&#8217;m curious if the same kind of thing happens with PostgreSQL; if Sun, for example, had customers who really needed certain features but the consensus was that those features shouldn&#8217;t end up in the core products &#8211; at least at this time &#8211; do you end up with companies kind of making their own weak forks for their specific customer or does that not really happen on this project?</p>
<p><strong>Josh</strong>:  Yeah, I&#8217;ll speak for PostgreSQL in general and then I&#8217;ll tell you specifically what we&#8217;re doing at Sun. In the case of PostgreSQL in general, it wasn&#8217;t something that used to happen, even though we supported the idea. We always have had sort of a kernel model, like Linux.</p>
<p>You have the PostgreSQL core code, that 13 megabytes of code, and then you have probably 100 megabytes of add ins on places like pgFoundry and SourceForge and our contrib modules and a whole bunch of other places. In the past, it&#8217;s sort of been up to the user or the developer to add these things together themselves. What&#8217;s been happening more recently is that the packager for the Linux distribution or the BSD distribution has made certain decisions about what packages they want to include.</p>
<p>But those decisions have been fairly lightweight. People have not been taking the strategy of putting together an actual distribution until recently. And what&#8217;s changed recently is that we&#8217;ve gotten a lot of startups like Greenplum and EnterpriseDB that have sort of their own special version of Postgres that are deliberately maintained forms. In the case of Greenplum, it&#8217;s a data warehouse enhanced form of Postgres, and in the case of EnterpriseDB, it&#8217;s an Oracle compatible version of Postgres. There&#8217;s been a couple of others that are older, like the old Windows version, and the multi threaded Windows version called PowerGres in Japan, which is actually about four years old. Fujitsu actually has their own version, which is called Fujitsu Supported Postgres.</p>
<p>So, what&#8217;s been happening with this is that a lot of those companies do develop their own features for Postgres, which they submit, but don&#8217;t necessarily get accepted immediately, and possibly don&#8217;t get accepted in their original form.</p>
<p>That&#8217;s going to cause problems for those companies down the road, because&#8230; Well, I&#8217;ll give you an example. Greenplum actually developed bitmap on disk indexes. We currently have bitmap in memory indexes released with Postgres. But Greenplum developed bitmap on disk indexes a couple of years ago and put them into the Bizgres open source project to be submitted into the PostgreSQL core code.</p>
<p>The thing is that there have been some issues with index maintenance and with code style, and particularly with conflicts with other patches that we&#8217;ve incorporated to improve the performances of indexes, and that bitmap index patch.</p>
<p>Because those have not been resolved, the bitmap index patch is still not in the core code of Postgres. The problem that that is going to lead to is that when that patch does make it in &#8211; in 8.4 or whatever &#8211; it&#8217;s quite possibly going to have a slightly different API from the version that Greenplum has been distributing with their own proprietary product.</p>
<p>That will put Greenplum in the position where they actually need to have support for both versions: their original version and the version that made it into the core code. What that results in for these companies that are actually making core changes and distributing them in advance of getting at least vocal approval from the community, is that they develop an increasing maintenance burden. Now, companies like Greenplum and EnterpriseDB and Fujitsu in general recognize this and try to avoid that situation. They try to wait until their patch is queued and accepted before they start distributing it. But, like with the bitmap indexes, it doesn&#8217;t always work. Since the process of actually distributing modified versions of Postgres and marketing them heavily is relatively new, except for PowerGres, then it&#8217;s a little hard to see what could happen.</p>
<p>I mean, what could happen is what happened with PowerGres. SRA in Japan developed a multi threaded, high performance version of PostgreSQL 7.3 for Windows. But they modified PostgreSQL heavily to make it work in that context, to the point where it no longer worked on Linux. When the PostgreSQL project decided that we were going to adopt Windows as a platform, which we finally released in version 8.0, one of the decisions was that we wanted to have the same core code with no substantial differences regardless of operating system. That is, we weren&#8217;t going to have a separate code tree for Windows because that was going to be impossible to maintain.</p>
<p>So as a result, when we released the official Windows version, it was substantially different from PowerGres. So now a lot of the users in Japan are in the sort of weird position where they&#8217;re stuck with PowerGres, which is no longer advancing; it&#8217;s stuck at the version 7.3 feature set.</p>
<p>Or, they adopt the new official version, which is already five versions and four years later, and has a different performance profile and a lot of changes that will require them to change their applications. So, that&#8217;s the sort of thing you want to avoid. That&#8217;s why at Sun, with our distribution of PostgreSQL, the PostgreSQL for Solaris, one of our policies is that we actually don&#8217;t distribute anything until it is accepted into the patch queue with a very strong assurance of acceptance versus revision by the PostgreSQL community.</p>
<p>If it is completely separable as an add in, if it&#8217;s something that can be added at build time and no later and doesn&#8217;t modify other APIs, then we might accept something. The particular example I&#8217;m thinking there is probes, which is that we can add in additional probes non invasively. It doesn&#8217;t matter if we add in a few extra probes before those probes appear in a community version, because they don&#8217;t affect other functionalities.</p>
<p><strong>Scott</strong>:  If you had to kind of ballpark it, how much of the effort do you feel like goes into adding new features to the product versus how much effort it is to support such a variety of platforms?</p>
<p>There&#8217;s different Linux distros, obviously, and supporting Windows on the same core code base obviously presents challenges also. How much time goes into bugs and testing and things like that just to ensure really good compatibility on all these different platforms versus building out new functionality?</p>
<p><strong>Josh</strong>:  Well, see there&#8217;s another way&#8230; You asked earlier about what the benefits were of the open source development process, and that&#8217;s another area where having really clean code is the big benefit, in that it&#8217;s allowed us to actually minimize the amount of platform specific engineering that we do.</p>
<p>We&#8217;ve also made some sacrifices in order to minimize that maintenance version, that maintenance issue. For example, on Linux and Unix platforms, we only use POSIX standard interfaces. This means that we&#8217;re not making use of some other operating system interfaces specific to particular operating systems that might give us additional operating system features and performance. For example, one of the big discussions I have here at Sun is that Sun&#8217;s new file system is ZFS and has some of its own APIs that are non POSIX. The ZFS people keep telling me it&#8217;s going to give us some tremendous benefit using databases on ZFS.</p>
<p>But we really don&#8217;t want to do that as an open source community, because the moment that you do that, you&#8217;re dedicating some amount of hours of somebody&#8217;s time just to maintain that interface for the code. So, we&#8217;ve completely avoided doing that, and that&#8217;s allowed us to minimize the maintenance version. That sort of platform specific bug and compatibility issue then becomes less than 20% of our overall development effort.</p>
<p>Now, Windows in particular has more maintenance issues associated with it, because Windows is very different from the POSIX platforms. We do have a couple of people for that. For example, one of our major contributors is Magnus&#8230; I&#8217;m going to mispronounce his last name, so I won&#8217;t say it. Magnus H. from Sweden. He spends most of his contribution time to Postgres, and he probably spends somewhere around half of his work time contributing to Postgres. He spends the majority of that, actually, maintaining Windows compatibility issues. So think about that as sort of one quarter of a developer for a year. Plus, a bunch of our other contributors and maintainers, like Bruce and Tom and Dave Page particularly, spend a minority of their time dealing with Windows build specific issues.</p>
<p>Again though, we try to stick to standard interfaces and to minimizing any particular code paths for Windows. Now, unfortunately that does limit our level of performance on Windows and our ability to integrate with some of the Windows utilities. But in terms of preventing us from having to have a completely separate version of Windows, it works.</p>
<p><strong>Scott</strong>:  If you talk to somebody who&#8217;s shipping a compiler, they would expect that&#8230; Well, Intel would sort of put people on the project to make compiler optimizations for Intel architectures, because Intel would really know how to do that.</p>
<p>AMD would put people on the project who make the compiler optimization for AMD architecture, because they would know how to do that. So you end up with kind of this collaborative effort where companies are kind of coming together and they&#8217;re putting their expertise in on the stuff they&#8217;ve developed. Microsoft isn&#8217;t staffing anybody on making sure that Postgres runs as well as it can on Microsoft&#8217;s operating system, is that correct? I mean, it sounds to me like you&#8217;re saying that work is being done by other people who&#8230;</p>
<p><strong>Josh</strong>:</a>  Yeah. The Microsoft folks have been friendly to us, particularly Microsoft Labs have been consistently friendly to us, but Microsoft doesn&#8217;t contribute any efforts or help with the project. And I&#8217;ll actually say, except for Sun, who is directly involved.</p>
<p>For some of the other things, for example, Intel did actually have some ICC optimization efforts, but that was through EnterpriseDB, and I don&#8217;t think it would have happened without EnterpriseDB&#8217;s involvement. So, a lot of this has been by proxy, which would be the case with Microsoft as well, if there was any huge interest in it. Microsoft has its own database product though, one that they&#8217;re pretty dedicated to promoting. Well, a lot of individual Microsoft engineers have been extremely friendly to us, but nobody actually contributing to the Postgres project works at Microsoft that I know of.</p>
<p>That work is done entirely by community people. And even those that are primarily Windows maintainers, like Magnus, spend as much or more of their time using, say, Linux than they do Windows. So, there hasn&#8217;t been a big push to Windows specific optimization through them.</p>
<p><strong>Sean</strong>:  This is good. I guess the same thing we&#8217;ve offered to everyone, Josh, are there any closing comments you would like to add to the discussion as it&#8217;s extrapolated so far? Is there something you feel we haven&#8217;t touched that you&#8217;d like to get on the record, I guess?</p>
<p><a name="microsoft">
<p><strong>Josh</strong></a>:  I&#8217;d like to throw in a fun little factoid. This is aimed at Windows developers, yes?</p>
<p><strong>Sean</strong>:  Well, both. We&#8217;re getting heavy traffic from both sides. But still, there&#8217;s definitely some Windows folks involved, so go ahead.</p>
<p><strong>Josh</strong>:  Well, actually one of the other things that made it possible for us to do Microsoft for it has been that in general, not for our product specifically, Microsoft has actually made tools that make it a lot easier to build and run programs made for Linux and Unix on the Windows platform than it used to be.</p>
<p>One thing in particular is that the PostgreSQL project was actually the first user of the open source WiX installer, something we&#8217;re actually extremely happy with. It has allowed us to make PostgreSQL vastly more accessible to users on Windows because it provides them with a really nice installer. So nice, in fact, that after we released the first Windows version, a bunch of the Linux folks began saying, &#8220;Hey, why can&#8217;t we have an installer like that in Linux?&#8221;</p>
<p>[laughter]</p>
<p><strong>Scott</strong>:  Well, that&#8217;s one of the other things too that you run into, is that in the Linux world, and even in the Unix world, people are more comfortable kind of running, making and building it for their particular configuration.</p>
<p>But in the Windows world, it&#8217;s really sort of mandatory to ship a binary and an installer, not so much the source. But that&#8217;s interesting that you found the WiX project to be really useful for your needs.</p>
<p><strong>Josh</strong>:  Yeah. It showed up at exactly the right time, because it got released like, I don&#8217;t know, three or four months before our targeted first Windows release. We were able to go out of the gate with an installer, as I recall. I wasn&#8217;t actually involved in the Windows build at the time, so I&#8217;m not sure that&#8217;s 100% accurate. But certainly very close after having the Windows release, we had a nice graphical installer that was not only a nice graphical installer, but it also has like little check boxes for all the most popular add-ins.</p>
<p><strong>Scott</strong>:  Oh, nice.</p>
<p><strong>Josh</strong>:  And again, if you&#8217;re a Linux user or whatever, somebody says, &#8220;Oh, that&#8217;s in the contrib module, you just need to type in these three commands and it will be installed.&#8221; That&#8217;s not really available to Windows users. They have to have a whole tool chain that doesn&#8217;t ship with Windows. So, having that nice binary installer has made it vastly more accessible for Windows users.</p>
<p><strong>Scott</strong>:  Do you have any sense for how much Postgres runs on Unix versus Linux versus Windows?</p>
<p><strong>Josh</strong>:  It&#8217;s a little hard to tell, because where we really get a full sense is the people that are active on the mailing lists. But we&#8217;re keenly aware that that doesn&#8217;t actually represent what&#8217;s actually out in the field.</p>
<p><strong>Scott</strong>:  Sure.</p>
<p><strong>Josh</strong>:  Because we only have, like, 35,000 &#8211; 40,000 people that are active on the various mailing lists and forums. Whereas even just judging by a certain manufacturer&#8217;s distributions, who are bundling PostgreSQL in their products, there&#8217;s several tens of millions of copies out there in the field.</p>
<p>Just given that Windows users tend to be more used to web forums and IM than they are to mailing lists and IRC, we&#8217;re guessing that we probably have a lower level of participation from Windows users. So, on the one hand, we have this vast majority of downloads from the Windows users, but on the other hand in terms of people who actually come back to the project and ask questions, the people that we know are on Windows is a minority.</p>
<p>But that doesn&#8217;t tell us who&#8217;s using it. You follow me?</p>
<p><strong>Scott</strong>:  Yeah, sure.</p>
<p><strong>Josh</strong>:  Because it may be that a lot of people are using it on Windows and they&#8217;re just not joining the mailing lists. They&#8217;re getting their help in other ways.</p>
<p>The other thing is that having a unified code base where we have the same version for Windows and Linux, etc., means that actually for a lot of the newbies asking questions about how to do this and how to do that, until we have some interaction with them, we don&#8217;t actually know what platform they&#8217;re on.</p>
<p><strong>Scott</strong>:  Right.</p>
<p><strong>Josh</strong>:  I&#8217;ve had a number of chats on IRC where it wasn&#8217;t until I got to like the third or fourth exchange of questions that I realized that somebody was on Windows. Which is a terrific thing in terms of being able to support our user base, because that was one of the biggest things I was worried about when we released the Windows version, that all of a sudden we would have 50,000 new Windows users hitting the mailing list, and those of us who are not Windows users would not be able to help them.</p>
<p><strong>Scott</strong>:  Right.</p>
<p><strong>Josh</strong>:  And it hasn&#8217;t turned out that way. So, I would guess that probably the majority of installations are in Windows, just based on the download numbers. But how much those people are using their installations, I couldn&#8217;t tell you.</p>
<p><strong>Scott</strong>:  OK. Great. Well, thank you very, very much for taking the time to chat with us. This has been great. You were really good to talk to because you were able to really dig into a lot of how the process works. If you look at a lot of our interviews, process tends to be some of the stuff that we&#8217;re the most interested in. And I learned a lot, just from this conversation.</p>
<p><strong>Josh</strong>:  Well, I&#8217;m happy to have a chance to explain this and some stuff about how the project works. We don&#8217;t have a lot of written policy, so it can be very opaque to somebody who&#8217;s just joining. I actually recently did a presentation on developing PostgreSQL at OSCON because we realized a few people had problems with how very indecipherable this was to how we were supposed to do things&#8230;</p>
<p><strong>Scott</strong>:  Right.</p>
<p><strong>Josh</strong>: I&#8217;m really glad to have a chance to explain that a little.</p>
]]></content:encoded>
			<wfw:commentRss>http://howsoftwareisbuilt.com/2007/08/22/interview-with-josh-berkus-postgresql-core-team-lead-sun-microsystems/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

