Open source has evolved beyond simple infrastructure and is steadily working
its way up the stack. A prime example is Jaspersoft, which supports the
world’s most popular open source business intelligence software. In this
interview we chat with Brian Gentile, Jaspersoft CEO.
- Breadth of open source business intelligence offerings from Jaspersoft
- The rise of information as the key asset owned by many businesses
- The role of the open source model in Jaspersoft’s success
- Transition in business intelligence toward smaller companies and less expensive solutions
- The role of cloud computing in the future of business and government
- How the business intelligence layer fits into the larger cloud computing model
Scott Swigart: Please introduce yourself and Jaspersoft, as well as the company’s relationship with open source.
Brian Gentile: I’m President and Chief Executive Officer of Jaspersoft, which, we’re happy to say, has emerged as the world’s most widely used business intelligence software. We’ve very successfully used the open source model of development and distribution to reach this point. It allows our software to be, in a sense, downloaded and put into use by virtually anyone who has the acumen or even interest to solve some data problem using a BI tool for reporting and analysis.
Today we sponsor and promote five major business intelligence projects that are also signature products. JasperReports is the world’s most widely used reporting engine and library, and iReport is a companion report design and development product that sits on top of JasperReports.
JasperServer is a repository-based report and analysis server that provides scheduling, authentication, and all sorts of server-based business intelligence functions. JasperAnalysis is a full relational OLAP environment that works side-by-side with JasperServer and provides multi-dimensional capabilities through the ROLAP engine.
Lastly, we sponsor the JasperETL Project, which allows for very sophisticated data integration, enabling extract, transform, and load functions in creating a data warehouse.
The combination of all five projects we refer to as the Jaspersoft Business Intelligence Suite, and we take it to market first in community releases, which means freely downloadable, mostly GPL-licensed releases, and second, in the form of commercial editions, which we refer to as the Professional Edition and the Enterprise Edition.
These two commercial editions have features and capabilities that go above and beyond the community-oriented project releases. So for our commercial customers, there is a feature-based reason to potentially subscribe to and pay for the professional or enterprise editions of our suite.
Scott: In its early days, a lot of open source was infrastructure-oriented, and in many cases, it was written for developers, by developers, such as the Linux kernel and the Apache web server. Now, we’re seeing success with companies like yours that offer software that operates at a higher level, along with open source ERP systems, CRM systems, content-management systems, and those kinds of things.
Talk a little bit about the trajectory of open source evolution. Why is now a good time in history for these higher-level products to emerge in open source, where it used to be more infrastructure?
Brian: Let me use some numbers that provide clear testimony that the world and its development model have surely changed. With our business intelligence platform, we are now approaching volumes that are close to those infrastructure open source projects that preceded us. For instance, we surpassed the 10 million download mark in January 2010, and we’re closing in on 10.5 million downloads today.
Right now, we have a new community member registering at the Jasper website about every five minutes. There are more than 125,000 registered members, and we have more than 70,000 forum posts. Forum posts are where our community interacts, so they’re helping one another every time they post something to solve one another’s problems, answer each other’s questions, and so on.
We now have what is known clearly as the largest ecosystem in all of BI, because of the partners, contractors, community members, and developers that are participating. It really is fundamentally changing the way software is built.
Just as the Spring Framework, Hibernate, and a variety of other development tools have completely transformed and enabled the LAMP stack and the new world of web development, Jaspersoft now is the BI layer in that world.
The volumes and the kind of development that is occurring, the kind of people that are developing and creating applications for which BI now becomes a web-based layer, has expanded dramatically over the last few years. And it has expanded because of the existence of all these open source products and different layers.
Therefore, it makes all the sense in the world that open source software has marched up the stack. Jaspersoft dwells at a level in between middleware and applications. We’re not quite an application, but we’re certainly more than middleware. Our providing the business intelligence layer within a web application frees the developer to focus on his core competencies at the application level. And, assembling applications rapidly based on the new web model and using open source tools whenever they’re available is clearly the new design point.
Scott: Historically, the need for content management systems, CRM systems, and ERP systems goes back a long way, so it seems obvious in hindsight that open source alternatives have become available for them. There’s a relatively longstanding need that is relatively well understood.
CMS and CRM are slightly newer, but even those go back a decade in terms of proprietary solutions. Business intelligence, on the other hand, seems to be driven by a different force. It seems like we’re more addicted to data than we’ve ever been.
More and more, the standard seems to be just to save everything. Save every search query that anyone enters, and save every detail about every interaction with the customer. Just as an oilfield or a super collider produces terabytes of data in a really short time, now even mundane business tasks produce terabytes of data, too.
This addiction to data has driven the need to have really good business intelligence, because it’s just too much to look at and make sense out of any other way.
Brian: For the last 20 or so years, information and its use have constituted a large percentage of a company’s relative value. 100 years ago, perhaps 90 percent of the company’s value was based on its ability to manipulate land, labor, and capital–the fundamental factors of production.
As we moved into the latter part of the last century, information technology began to drive an important part of the economy as we became so much more capable of using it effectively. Today, the management of information makes up the vast majority of the value of many companies.
For example, one could argue that nearly 100 percent of the value of a financial services firm is in the information that they command and maintain. Vast amounts of data exist because that data is increasingly valuable, even for an old guard steel mill or manufacturing company.
That trend makes it very important to efficiently manage and manipulate that information. Value is created from insights gleaned from that information, by increasingly large numbers of people. The data needs to be distributed, decentralized, and democratized so it can be used in the largest possible variety of ways, as quickly as possible.
That is a fundamental reason why open source and lighter weight, faster tools for business intelligence have become so popular, and their popularity will go nowhere but up over the course of the next decade.
Scott: Concerning the democratization of data, not that long ago, companies had canned reports that they printed on a big line printer on a weekly, monthly, or quarterly basis. Someone had determined ahead of time what data and reports were useful, in a fairly inflexible way, and they went into a binder on a shelf.
Today, there’s the understanding that the company can’t fully predict how that data might be used. Individuals may have very specific problems that they need to solve, and to do so, they may use data in a unique way that is not apparent or useful to anyone else.
Brian: That phenomenon is exactly why Gartner estimates that about 40 percent of the budgets for business intelligence are now pushed out to the business units, either consciously or unconsciously.
Those who really understand the data and need to make use of it for better business decisions and faster insights are not in IT, and they’re not at headquarters. They need to be able to make decisions about which tools they’re going to use, how they’re going to put them in place, and which data they need to access and manipulate.
We’re seeing a fundamental shift toward working with departments and enterprises who are now creating software and using the tools inside of that software at a faster pace than we’ve ever seen before.
It’s all being done using web techniques and thin-client tools, and they’re doing it in days and weeks rather than months and quarters, for tens of thousands of dollars rather than millions of dollars. That trend is the next stage of the democratization of data, and it is largely enabled by open source.
Scott: Open source has clearly been successful for Jaspersoft, and as you said, you are getting millions of downloads. What is it about the open source nature of the foundation that has made your approach so successful?
Brian: Based on surveys that I’ve participated in and read, first, the innovation and the capabilities available in open source software are simply not found in their proprietary alternatives, if those proprietary alternatives even exist.
Second, open source affords companies flexibility. Some enjoy the fact that they have the source code, some enjoy the fact that they have access to lots of other people who are using it and can share ideas and improve it, and all of them like the idea that they have an alternative to a proprietary player, again, if one exists in the category.
Many of the senior people in these organizations that have been successful in using open source would go on to say that it prevents them from feeling locked in to a proprietary vendor who can constantly increase prices or put pressure on them to do things that they might not otherwise want to do. If there is any lock-in in open source, it is surely at a much lower cost and a much lower threshold level.
Today, because of those factors, we’re seeing what I’ve referred to as the mainstreaming of or the mainstream use of open source. As a result, we are seeing real depth in the use of open source, even among those who are not on the bleeding edge of technology.
One of the best examples is the public sector’s adoption of open source. Governments around the world, by definition and by charter, have never really been on the edge or in the vanguard in their use of information technologies. And now public sector entities are adopting and using open source in big numbers.
The clear advantages occur in terms of innovation, flexibility, and avoiding lock-in. Notice that I didn’t even talk about low costs, which would probably come fourth or fifth in importance for those enterprises that have been most successful with open source.
Scott: My experience with government customers has been that one of the reasons they love open source is that their procurement process is so onerous. The ability to use a free, open source tool that doesn’t require approval, a competitive bidding process, RFPs, and all that kind of stuff is very attractive to them. They can have an idea this morning, download some stuff this afternoon, and have a prototype mocked up by this evening.
Obviously, though, you guys have got to make money, and obviously the community of users is going to be larger than the community of paying users. But talk about how a government user translates from free to fee, in a way that kind of keeps your business going and lets you keep contributing back to the underlying projects.
Brian: Their use of the free, open source tools is no different than in private enterprise, in the sense that they can, just as you said, download something and get started on a prototype or a proof of concept on their own time. There’s nothing about the licensing that would restrict a public sector employee any differently than a private sector one.
The good news is that the public sector’s big enough that there are plenty of employees willing to do what they need to do to get their jobs done better, and that’s a good thing for taxpayers all around.
But then your second point is the procurement process, which I cannot deny is certainly more laborious than in a private enterprise, although I can point to some of our most difficult contract discussions and lengthy procurement processes actually coming from the private world and not the public sector. There are many private-sector companies that would give any government a run for its money in terms of ridiculous procurement processes.
Your point is accurate, though; it does take longer on average in the public sector. As an open source company, we have to have the ability and patience to work through those issues so we can get on purchasing schedules and so forth. It’s all about making the purchase process easier for them. It takes time, but it’s absolutely worthwhile.
And, that’s the right thing to do for public sector spending in general. I’ve written plenty about the need for governments to embrace and use open source wherever it exists. Using our taxpayer dollars more efficiently than they would be otherwise is definitely worth the effort, even though it can be difficult.
Scott: Different companies that have an open source foundation unpack the difference between the community version and the paid version differently. Do your customers find value mostly in features they want that are only available in the enterprise version, or is it more the support peace of mind?
Brian: There are three or four major reasons that a customer would choose to move to a commercial, paid version instead of the community version, and I’ll list them in no particular order, because it’s going to vary by customer.
The first reason is that there are features and capabilities in the commercial versions that are not in the community version, and those features and capabilities may make the paid product better aligned with what they need to do.
Our community versions are clearly capable products, so what we’ve chosen to put in the professional or commercial versions is very significant. Any commercial company, I think, would look at our commercial products and say, “Yeah, that’s absolutely worth paying for.”
The second reason is the license. Some customers are not comfortable with the license that governs the open source code. Usually, it’s in the range of some relatively permissive license, like the LGPL or Apache or something like that, to a relatively restrictive one, like the GPL.
Each of those licenses comes with terms that make it less interesting for some companies to put the software into use. They would rather have standard, commercial terms around the software and its use. By going to a commercial version, customers have a commercial license which includes the indemnification and protection they want to go along with it.
The third reason is professional support. The forums and the support that you get from working with the Jaspersoft community are pretty robust, but if your use of Jaspersoft is relatively mission critical, you probably want the comfort of knowing that you have a support team that’s very intimate with the code and available to you 24/7.
Scott: Another area that’s interesting to me is the notion that business intelligence used to be used by the largest companies, and it was sold to them by very large software companies. These were very expensive solutions and implementations.
Talk a little bit about the need for BI in the small and medium size business sector. What kinds of examples have you seen where BI has become increasingly critical to relatively small companies?
Brian: This is one of the most important and fundamental shifts that has occurred in the last 10 years: the size of the company now has almost nothing to do with the size of the data problems. We have many small to mid-size companies that need to analyze enormous volumes of data, in very sophisticated ways and in almost real time.
One of my favorite examples in our portfolio of customers is a small company called CrowdStar, which creates Facebook games.
They need to analyze, in almost real-time, 20 million records of data at least every day to understand patterns of movement and player activity within their games. They can modify those games during the course of a day to enable more activity, which for them equates to more use and more payments.
They analyze all that data using Jaspersoft in the cloud, which they pay an incredibly small amount of money for. CrowdStar is not a very big company, but they use big volumes of data, and there is literally almost no other way they could afford to do this without Jaspersoft in the cloud.
I think that’s a quintessential example. There are others that are less dramatic, perhaps, but they are equally important in the magnitude of data that needs to be analyzed and the size of the company that is behind it.
This trend is causing all kinds of technological shifts, and it has triggered what I refer to as “therefores” in the world of business and technology. By “therefores”, I simply mean that they are a second order effect, caused by the sea of data so commonly barraging any size enterprise. Examples of these “therefores” are increased interest in columnar orientation analytic data stores, proactive analysis, and predictive-like analysis. There’s much more interest in elegant design of responsive mechanisms for data analysis, such as dashboarding methods of quickly drilling into data, even through multiple dimensions.
It’s created a need for fast, successful small projects. It has pushed, as we described earlier, more of the emphasis out to the business unit levels, so they can have these small projects around where the data is best known and understood, and they can accomplish things more quickly.
So this host of business and technology “therefores” has come from the enormous amount of data now that is being analyzed by almost everyone.
Scott: There are so many different trends converging that let an entrepreneur go from an idea to a product in less time, with less capital expense than ever before. It used to be that, if you had an idea for software, step one was finding an office. Step two was buying servers. Step three was hiring a bunch of people.
That all burned through a lot of cash trying to get a product built, before you even had something that your first customer could look at. Today, you don’t have to buy a server. Even before the cloud, hosting providers offered dedicated servers, and then they started offering virtual servers.
Then clouds came online like Amazon EC2, and you started to see the platform as a service plays like Google App Engine and Azure, and you see a lot of the traditional hosting providers also becoming cloud providers to some degree, building on top of a virtualization infrastructure.
You can put a $2,000, or $5,000, or $10,000 project out on Elance or something like that, or build it yourself if you’ve got programming capability. You can be a company of five people, 10 people, 20 people who relatively quickly layer on top of another infrastructure like Facebook or something that has millions of users. Your user base can grow at a fairly steep exponential curve, having millions of users and enormous amounts of data.
There is also a huge amount of open source data, or data that you can subscribe to. There is a website that’s built on Google App Engine called Walk Score, where you put in an address and it tells you how walkable a neighborhood is. It taps into all kinds of different streams of data that they don’t own.
It’s not their data, and they didn’t have to collect it, but it’s syndicated and accessible data. Whether that’s baseball stats, map and geo-location data, or the data that you might need to use business intelligence on, it increasingly might not even be your data. You might be looking for correlations or you might be relating data in novel ways that nobody else has bothered to put together.
What do you see as the future of these trends?
Brian: Some of the changes in the last few years have been profound, and the great amplifier of this phenomenon, going forward, is of course the cloud. If we fast forward five or seven years, it will look ridiculous that we built all these data centers. Everybody having their own data center will seem as absurd as everybody having their own power plant.
The ability to find tremendous efficiency, and deliver computing power on demand, instant on, as it’s needed, is one of the most important things happening right now. With so many transformational technologies, we tend to overestimate their impact in the near term and underestimate their impact in the long term.
All that is why Jaspersoft has been working in the cloud for more than a year. It’s not because there is yet a huge demand for business intelligence in the cloud. It’s because I know fundamentally that we have to be there so we can learn, adapt, and ensure that our BI platform can be delivered as a service through the cloud even more easily than it can be downloaded today.
My favorite example of all this is the U.S. federal government, which manages slightly more than 1,000 data centers today. If you’re a U.S. taxpayer, your money is going in a big way to help continue the running and management of more than 1,000 data centers.
Let’s just say, for fun, they could cut that number in half, which should be, by the way, an incredibly modest goal. It should actually come down to about a couple of dozen data centers. But let’s just say they could cut it in half by virtualizing and taking into the cloud a substantial percentage of the infrastructure platforms (hardware and software) that is common, or should be common, across every government agency.
Fortunately, that is a goal that the new federal government CIO and CTO are working on right now. It’s a high priority on their agenda, and I don’t think it can happen fast enough. I’m hoping that they do it in a way that is highly successful, and I’m hoping it results in a 90 percent reduction in those data centers in the course of the next decade or so.
You could use that same example across much of the private-sector landscape. Private clouds are going to have a huge impact. I think the public cloud has probably been overrated, but it will have an impact in a longer term way. The methods of building and creating private clouds are going to be among the most important things that have come along since the invention of the personal computer and the Internet.
Scott: When I think of private clouds, I envision a company that still has a data center, but it’s just providing cloud-like services to the organization. A guy I talked to one time said, “A cloud is basically a data center with an API.”
You can run, start, stop, and clone instances, and there’s the notion of IT being more of a service provider to the organization. How does that square with the way you’re using the term “private cloud?”
Brian: Back to the government example, that’s all true, but what Vivek Kundra should be creating is a single, enormous private cloud for the U.S. government that should allow the reduction of 1,000 data centers to, say, 50. This would allow agencies to share these resources simply and capably.
And, this Government cloud should be private, so nobody unauthorized is able to tap into it. It should be a U.S. government agency cloud-based infrastructure that is shareable and provisionable for each agency, but it’s managed by some central authority, and enormous in scale.
Scott: The idea being that, even if you look at the government doing a census and tax time for the IRS and all of these peak loads, the entire capacity of the thousand data centers is nowhere near fully utilized.
So you could consolidate that all down to 50 data centers and still have lots of headroom, but without the risk that if a public cloud provider gets compromised, Los Alamos would be potentially compromised.
Brian: That’s exactly right. The present duplication of resources is remarkable, and cutting a lot of that out is where the cost savings would come from, more so than the virtualized hardware and software, and it would allow for the fungible provisioning of computer resources based upon different agencies’ needs for it.
Scott: Certain things are holding organizations back from going to the cloud. Part of the hesitation is that they just don’t trust the technology yet, but there are other factors as well. Some organizations are held back by the difficulty of extending their identity and security boundaries or issues around compliance requirements.
How does that impact Jaspersoft and business intelligence, given your stated goal of having it run as well in the cloud as it does if you download it and run it on premises? What are some of the specific things that you are tackling to make sure that you’re ready, as some of the network security trust issues get solved, and people start adopting the cloud in a more mainstream way?
Brian: We’re avoiding those use cases where those preventatives are in place, to be honest. It’s one of those things where gravity finds its own level or water runs downhill. The use cases come to us. The customers that don’t have those obstacles in their way and that have problems that can be solved with BI in the cloud are the ones that are finding us now.
The kinds of things we’re seeing come to us now in the cloud pretty commonly occur around three or four major project types. You are right that those project types do not involve industries or processes where there are major restrictions of security or data flow, or authentication issues like HIPAA-based requirements, and so on. Those are not the use cases being solved for now.
Our logic is to solve the green field problems where the marriage of cloud and BI makes good sense today, and CrowdStar is a great example (one I offered earlier). Over time, the benefits of using instant-on, highly provisionable cloud-based services like BI are going to become even more efficient.
Even those industries and sectors that are highly regulated or highly concerned with security will find ways eventually to adapt and provide secure provisioning and whatever’s necessary to make those services available for these more difficult use cases.
If we were to target these security-minded use cases now, we would be foolish. Instead, we’re going to target and work with those entities, companies, and sectors that don’t have those restrictions or problems, solving mostly green field problems today. Then, over time, as the benefits emerge more clearly, we’ll surely see even the laggard use by those industries that are highly regulated.
Scott: As everybody works on that problem and they figure out a way to do HIPAA in the cloud, what are some of the things that Jaspersoft needs to do, or is doing and needs to continue to invest in to be a great BI platform in the cloud?
Brian: Truthfully, I don’t think we need to do anything differently than we’re doing today as a great BI platform. I think the issues that will need to be solved with the cloud are going to be separate from the BI layer.
Some things we’re doing in general to help us to be a better participant in highly regulated urgent care industries include things like marrying BI to processes and collaboration environments.
Those are probably the two most requested emerging trends that we’re seeing, where we have partners that are building Jaspersoft into industry-specific BPM software, to take some time out of processing and add analytics to it, so that there are huge gains in the output and the results.
One of our good partners is named HandySoft. It’s created a remarkable environment using Jaspersoft and its own BPM expertise to fully automate the process flow of secure documents that need to travel inside the Nuclear Regulatory Commission for the maintenance and management of nuclear materials all around the world.
Of course, that’s a very secure, highly authenticated environment. We’re solving the challenges there today with HandySoft for the NRC. They’re actually providing a hosting service for the NRC, very cost effectively.
I think someday that will be available generically in the cloud. We have a bunch of hurdles to cross before that can happen, but there’s no change needed to the BI software, really. What’s needed will be changes to the cloud infrastructure, the perception of the cloud, and how it behaves in order to satisfy uses beyond what I described for the NRC.
Scott: There are a couple of different cloud models, and I would imagine that infrastructure as a service is particularly interesting to you. I’m referring to the model where somebody essentially has a virtual machine, and they can put whatever they want on it in terms of an OS and a stack, and they can put Jaspersoft’s stuff on it.
On the other end of the spectrum, there are things like Google App Engine, where the developer is really insulated from even knowing that there’s a machine underneath them, an OS, a file system, and that kind of stuff. It can call out to something running on another cloud provider, but it seems like there wouldn’t be a lot of partner opportunity for a company like yours.
And then in the middle, there are things like Engine Yard, where companies are making things like Ruby on Rails or these other development frameworks into platform as a service offerings that can run on an arbitrary infrastructure as a service.
What do you see as the opportunities associated with some of these different models?
Brian: I think that many of these emerging application infrastructures will include Jaspersoft as the BI reporting and analytics layer, so that if you’re choosing from a smorgasbord of functionality in your platform as a service environment, you could very well choose Jaspersoft as your reporting and analytic layer to become part of your solution.
Our job is to ensure that the Jaspersoft platform is inside of all these environments, which is precisely why we’re doing what we’re doing with both the platform fundamentally and the cloud work that we’ve done.
Scott: In some ways, it’s almost like an OEM relationship, where you make the case to the platform or the service providers why their customers may want Jaspersoft as a BI solution they can choose from.
Brian: In fact, a little more than 60 percent of our business is in arrangements that are OEM-embedded uses of our tool, for a lot of reasons. That substantial base of customers who are embedding Jaspersoft makes me not want to compete with them, but to help them succeed by attracting more customers.
I don’t want to, for instance, host Jaspersoft and try to compete with the very customers I’m selling to; that is, our OEM customers. I would much rather allow them to do it in a way that is probably application- or domain-specific, and which adds their expertise to our BI platform – and includes our expertise, which is all about building a great BI platform.
Scott: This has been a great interview. We’re running short of time, so are there any closing thoughts you’d like to add?
Brian: The kind of innovation that we’ve represented in the last few years and the substantial community that we’ve built around our open source heritage is about to take some important leaps forward.
We have two major releases planned this calendar year, and we have significant announcements with our community that will show just how big and broad it is. I think you’ll see a variety of projects that use Jaspersoft as the open source engine across standalone and embedded implementations that will prove we really are the alternative now to the aged, proprietary vendors of the past.
Our ability to innovate rapidly and steadily and our ability to work with the community are going to continue to be substantial differentiators for us. It’s going to be a lot of fun, and a year from now, our progress is going to be shocking. There are plenty of surprises ahead over the course of the next 12 months.
Scott: Thanks for taking the time to talk today.
Brian: Scott, thank you so much. This has been great.