Video streaming has grown way beyond YouTube and Hulu. Today, broadcasters such as ABC, NBC, and the Comedy Channel are streaming full shows straight to viewers. Netflix is living up to its name, and Amazon offers a huge library of movies on demand. And open source has moved well beyond the infrastructure layer, with open source video platforms like Kaltura. In this interview, we chat with Shay David, co-founder of Kaltura.
Scott Swigart: To get us started, could you introduce yourself and Kaltura?
Shay David: I am one of the co-founders of Kaltura, and I run business and community development. Kaltura is the first open source video platform, meaning that we have a set of tools, technologies, best practices, and services that allow people to do anything and everything with video, including adding video to any site.
We empower the entire video stack with a set of open source building blocks, including ingestion, metadata normalization, player development, playlist development, content moderation, scheduling (for both live and on-demand video feeds), monetization, and analytics. Those building blocks are the foundations of video applications.
People are using Kaltura for video publishing, goods and asset management, distance education, and enterprise collaboration, among many other uses. Pretty much across every industry, we feel that we are becoming part of every web experience.
Scott: Why do you think video is becoming so important right now, whereas it wasn’t as widespread in the past?
Shay: One of the reasons is that the means of production are so available. If you look five years back, it took a lot more to be a media company or a studio. People needed to have a satellite van, $60,000 cameras, and whatnot.
Today, two billion people carry the means of production in their pockets with smartphones, and more and more of the population is equipped with broadband. The capacity to produce and to consume video is very prevalent.
This creates a whole paradigmatic shift in the set of technologies that are necessary, because, unlike text, video is complicated. A lot of things that we know how to do with text pretty well are not well understood for video; we need the means to handle, store, and stream very large files, transcode one format to another, and manage advertising and analytics.
Scott: Do you feel that there’s more opportunity for analytics in video? For example, if somebody goes to a web page, you can determine how long they spent on the page, but you don’t necessarily know what was interesting to them.
With video, on the other hand, I imagine you might be able to know what portions they watched, what they skipped, and so on. How are the opportunities around analytics different for video than other types of content?
Shay: Analytics actually encompass two very different types of information. The first is audience measurement, such as who is looking at the content, which countries and demographics they come from, and so on. That type is well understood.
The second set of data is media analytics, which consists of the types of things you were mentioning, like drop-off points, popular pieces of content, level of engagement, etc., which is a much less understood science. Part of what we bring to the table is a whole set of technologies around that.
Using this information begins with identifying the data you actually want to collect and the events on the player side that are associated with such data. Next, you have to decide how to close the loop back into your production systems and determine how you will use the information. Just measuring it without acting on it is only half the task.
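To make the idea of drop-off points concrete, here is a minimal sketch in Python. It assumes the player reports, for each viewing session, the last playback position it reached; the session ids, the data shape, and the function name are hypothetical and are not taken from Kaltura's analytics API.

```python
from collections import Counter


def retention_curve(last_positions, duration):
    """Return the fraction of sessions still watching at each second.

    `last_positions` maps a (hypothetical) session id to the last playback
    position, in whole seconds, that the player reported for that session.
    `duration` is the length of the video in whole seconds.
    """
    total = len(last_positions)
    if total == 0:
        return [0.0] * (duration + 1)

    # Count how many sessions ended at each second (capped at the duration).
    ended_at = Counter(min(pos, duration) for pos in last_positions.values())

    curve = []
    remaining = total
    for second in range(duration + 1):
        curve.append(remaining / total)
        remaining -= ended_at.get(second, 0)
    return curve


sample = {"a": 2, "b": 5, "c": 5, "d": 1}
print(retention_curve(sample, duration=5))
# [1.0, 1.0, 0.75, 0.5, 0.5, 0.5]
```

In this sample, a quarter of the audience is gone by second 2 and another quarter by second 3. A steep step in the curve is a candidate drop-off point, which is the kind of signal that could be fed back into production decisions.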
Scott: The first time I got a sense for video analytics was the whole Janet Jackson Super Bowl thing. With TiVo, you can just hit a button and go back eight seconds, and the company reported that the eight-second rewind function was used at that point more than at any other time in the history of their service.
I thought, “Oh my goodness, TiVo knows every time I back up eight seconds. I had no idea they collected that kind of telemetry information.” That opened my eyes to some of the things that you can track with video.
Shay: Absolutely, and that’s just the beginning. When we talk about social media and interactive media around video, we can go well beyond basic audience measurement, but we have to be sensitive to privacy concerns too.
The Nielsen paradigm is about people-metering, and making some gross assumptions about who’s watching each channel in very large categories. And that was fine when we were talking about a broadcast paradigm, but who are the people watching? How many of them are there? How many couch potatoes are in front of the screen?
When you’re dealing with interactivity, there’s a whole new set of measurements to be concerned with, because that’s what the advertisers or publishers ultimately care about.
Scott: Talk a little bit about the open source projects that are underneath what you do as a company, how you engage, and how the development of that works.
Shay: Kaltura is open source through and through, and we run everything on the LAMP stack. So underneath, for production, we use Linux, Apache, MySQL, and PHP. Our entire backend is in PHP, and the front end is a combination of technologies.
We want to maximize availability to publishers, so we provide a set of code in Flash, a set of code in HTML5, and more recently, a Silverlight front end. It’s completely multi-platform; I run it on my Mac, our developers sometimes develop on Windows, and production is based on Linux. The code itself is a set of extensions on top of Apache.
Scott: Did you choose PHP for the presentation layer just because you wanted your customers to be able to easily modify it?
Shay: Absolutely, and PHP is also the back end. A lot of the presentation layer, for the most part, is the graphics hub. You could use a variety of libraries when presenting content in PHP, .NET, or Java. Ruby, too, is becoming more and more popular. The idea is just to maximize publisher freedom.
A publisher can take the entire set of functionality and just plug it into whatever kind of system they like. One of the things that we’re proud of is a lot of pre-made integration kits, so if you want Kaltura within whatever context, you can most probably go to our newly released Application Exchange, which is a marketplace for all the extensions, plugins, and applications that Kaltura, our partners, and our developer community create.
You can find kits that we have already pre-integrated with Drupal, Joomla, MediaWiki, WordPress, and many other systems like that, regardless of the operating system or the language you’re using for the front end. That takes the pain out of the integration.
Sean Campbell: Walk me through how you make money. When I look at the way you describe the five layers of your open source stack, you’re leveraging a lot of open source projects, but you don’t seem to have a fully packaged, open source version of the solution.
Shay: We make money in three ways. We bring to market the best of breed open source platform, where people that understand open source and have the development wherewithal and capacity to do it can go to our community site, kaltura.org, download the software, install it on their own, and never buy a single service from us.
In exchange for that, we expect them to play nicely with the open source rules. Any derivative work has to be contributed to the community, and they have to follow rules about attribution. The code is licensed under the Affero GPL, so even if they only offer it as a web service, they’re still bound by the open source license.
From that group of people, we may never make any money. We have 60,000 publishers in our network, and the majority of them never pay us license fees. They use the software and contribute code or bug reports, or just become part of an active community. Some of them opt in to receive ads.
On the other hand, we make money from that group by selling them professional services, such as maintenance, support, installation, application development, and things like that. That’s one revenue bucket.
The second revenue bucket is a service business for people who understand the open source game but don’t have the capacity to do it themselves, so they pay us to provide them software and services.
Using the same stack of code, we offer a hosted service where we host their files, we run the transcoding of the files, and we run the applications that are going to be integrated into their stack in a hosted manner.
The third revenue bucket comes from service providers. To hosting companies, cloud companies, telcos, system integrators and the like, the platform is also available as a self-hosted solution, under a commercial license with full support and maintenance from Kaltura.
Sean: I’ve heard a lot of people say that there are certain things they wouldn’t do with a cloud-based app, and a common thing people tend to bring up is heavy multimedia editing. It requires significant CPU, and you want to have full-fidelity solutions like Final Cut Pro or even key solutions like iMovie or Final Cut Express or Pic Premiere on Windows.
On top of that, the open source community has at times struggled with desktop Linux OSs like Canonical to really get good embedded tools that would rival some of those toolsets. You guys have obviously spent some time on the editing side, and it looks like you have done really great work.
Talk a bit about the limits you have to deal with and work around, in terms of people’s user experience with uploading and editing.
Shay: I think that’s challenging, but we have a few solutions for that. One is that we do offer what we call the Self-hosted Edition, which has the capacity to run the hardened version locally. If you’re a service provider, an enterprise, or a university and you want to run this locally, you don’t necessarily have to run it from the cloud.
We give you a hardened version that you can install on your premises, and you own root. You can control it, and the files are close to the destination so you don’t have to operate over VPN.
Depending on the details of the deployment, you might be able to enjoy economies of scale without having to have the complexity of cloud. We see a lot of adoption of this model in higher education, where many institutions have robust on-campus infrastructure.
I think you’re correct to point out that it’s only a matter of time until you can pretty much do everything remotely. People like Sun have been saying for ages that the “network is the computer,” meaning that, at some point in time, if the network is fast enough, you don’t really care where the CPU is.
In that sense, I think that what we have is a beginning. It’s not ready to replace high-end, broadcast-grade editing solutions any time soon, but people are using it for production, and you can run really complex workflows around it.
Where cloud-based editing becomes essential is when you’re starting to think about collaboration. On one hand, if I’m editing my own files that I just uploaded from my camera to my computer, I have little motivation to run everything on the cloud, other than the benefits of automatic updates, auto-backup, and so on.
On the other hand, a group of students working on a class project doesn’t really have the option of doing it offline. In the context of enterprise collaboration or training, the need arises to gather information from multiple sources, and potentially from multiple geographical locations.
Therefore, you are almost mandated to have a network solution. For a network solution like that, the first tier replicates the broadcast model online. A provider has a library of content and wants to capture data out there. But that is just the starting point.
As we’re moving into the new realm of video, it’s about interactive media; it’s about collecting information from multiple sources and being able to transmit it. That’s the place where I think we are seeing a lot of adoption of the more interesting concepts around editing, remixing, annotation, content manipulation, and metadata extraction.
In my mind, all those fall in the same category of advanced network services that have to do with collaboration.
Sean: People obviously love social media streams and the collaboration associated with them, but it’s somewhat uni-directional. Normally, I upload my video, and people watch it. It gets more interesting when they remix it or do something interactive with it. Some musicians actually like that their fans are doing that.
That creates an interactive world where, by lifting all that weight off of physical drives in everybody’s house and office, we enable new use cases. If it really is all up there and you’ve got good editing tools, you can blend and create environments and communities.
There’s plenty of video on the Web, but it’s mostly a broadcasting-type model. I’m hearing you say that you might be able to enable a certain level of community around video as a content stream, given the tools and the environment you provide.
Shay: Absolutely. I think that revolution is impending, and I think the first step we’re going to see at a large scale is on Wikipedia. Using this technology, we’re providing Wikipedia the same type of media collaboration that they know from text.
Some people are going to upload, some people are going to edit, some people are going to clip, and some people are going to transmit. It’s a completely open system that allows people to collect multiple media files, sequence them, edit them, and translate them, all in an open way.
It’s very exciting stuff, and two years of work are coming to fruition. The first phase just went live a couple of weeks ago. People are contributing content, and then later in the year, all the sequencing and remixing software is going to go live. That’s going to be very significant.
It’s been a great year for HTML5 video. Safari, Chrome, and Firefox already support it, and a few weeks ago, Microsoft announced that a new version of IE is going to support it. Google announced that it has open-sourced the VP8 codec. That kind of opens the way to creating a truly open HTML5 solution.
Sean: It seems that one of your potential business development strategies would be to look at properties on the Web that are already successful but also text-heavy, which would naturally lend themselves to a video overlay.
I imagine you also have your eyes on other text-heavy properties that are driven by community contributions.
Shay: True, but I think the opportunity goes beyond publishers like Wikipedia in the classic sense of publishing. It’s not only text-heavy content that is aimed at end users; it’s also other places where there is mainly text.
Think about university learning-management systems (LMSs) that are 95 percent text right now and maybe a PDF and an image here and there. Think about Enterprise Resource Planning (ERP) or Customer Relationship Management (CRM) software, and all the other places that couldn’t use rich media because it was too complicated. Now we can change all that.
We are giving the market a set of technologies that are enabled for rich media and video. I’ve seen statistics that show that video is going to be more than 90 percent of the traffic on the web. I think that’s a conservative estimate; I think it’s going to be somewhere like 95 to 99 percent of the traffic in the world in terms of number of bits.
For large education institutions or enterprises, this requires a huge shift in thinking. For these guys, it’s not about all the libraries, but being able to integrate into the platform that’s going to give them all the tools, because now, if you don’t do it properly, all of a sudden you’re going to choke on it.
Imagine you’re a university campus of 50,000 students and you have an LMS that’s running great. You have a new film professor who assigns his 300 students homework to upload movies to a campus web site. If they do that before you provide for it, 300 students each uploading half-gigabyte files will cause the system to collapse. True story.
In scenarios like that, you really have to think about the infrastructure. You have to use a new set of tools. That’s just one example.
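The scale of that classroom scenario can be checked with a few lines of Python. The 300 students and half-gigabyte files come from the example above; the 100 Mbps of ingest bandwidth is purely an assumed figure for illustration, and the function name is hypothetical.

```python
def upload_burst(students, file_gb, ingest_mbps):
    """Estimate total volume and best-case transfer time for a batch of uploads.

    Uses decimal units (1 GB = 8,000 megabits) and ignores protocol overhead,
    so the time returned is a lower bound.
    """
    total_gb = students * file_gb
    total_megabits = total_gb * 8 * 1000
    hours = total_megabits / ingest_mbps / 3600
    return total_gb, hours


# 300 students and 0.5 GB files are from the interview; 100 Mbps is assumed.
total_gb, hours = upload_burst(students=300, file_gb=0.5, ingest_mbps=100)
print(f"{total_gb:.0f} GB, about {hours:.1f} hours")
# 150 GB, about 3.3 hours
```

Even before transcoding and storage are counted, a single assignment would tie up a link of that size for hours, which is why the infrastructure has to be planned ahead of the first upload.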
Scott: You mentioned HTML5. For the longest time, as videos started to emerge on the Web, it all gravitated toward Flash, but now, the industry seems to be moving toward HTML5. Google is pushing it, Microsoft’s going to support it, and of course, Apple has declared war on Flash.
I can’t play at least 85 percent of the videos on the web using my iPad, because they’re all Flash-based. What are you seeing in terms of emerging video formats, and what opportunities do you think that creates?
Shay: I think a lot of the war is driven by adoption and devices. There’s no question that Flash is installed on a large majority of PCs and Macs, but that’s not the case on mobile devices. Flash doesn’t have any adoption on IP-enabled TVs, set-top boxes, Windows Mobile phones, Android phones, iPhones, iPads, and iPod Touches.
Only a relatively small portion of devices can support Flash, and while its absolute adoption numbers might be increasing because the PC market is still growing, it has declined as a percentage of the overall ’screens’ out there.
I think there is an opportunity for open formats to take Flash’s place, and HTML5 is one of those formats. HTML5 has always been a codeword in that sense, because it does not define the codec or the transfer protocols for video, so we’re going to see a lot of bifurcation in the market moving forward.
The promise of HTML5 is that it allows developers to write once and publish anywhere. You can include video tags on your page, and it’s the device’s responsibility and the browser’s responsibility to figure out the codec and the transfers that render the video properly. In that sense it’s a huge step forward, but I don’t think Flash is going to disappear overnight.
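The "write once" idea can be sketched as follows: the page lists several encodings of the same clip, and the client plays the first one it supports. The Python below only models that selection; in a real page the browser makes the choice from the sources listed for the video element. The file names and the supported-format sets are hypothetical.

```python
# Encodings of one clip, in the publisher's order of preference.
# MIME types are standard; the file names are made up for illustration.
SOURCES = [
    ("video/webm", "clip.webm"),
    ("video/mp4", "clip.mp4"),
    ("video/ogg", "clip.ogv"),
]


def pick_source(sources, supported_types):
    """Return the URL of the first source whose MIME type the client supports.

    Returns None when the client supports none of the listed formats, which
    is the case where a publisher would need some other fallback.
    """
    for mime_type, url in sources:
        if mime_type in supported_types:
            return url
    return None


print(pick_source(SOURCES, {"video/mp4"}))                # clip.mp4
print(pick_source(SOURCES, {"video/ogg", "video/webm"}))  # clip.webm
print(pick_source(SOURCES, set()))                        # None
```

The publisher writes the list once; which file actually plays depends on what each device can decode, and that is also where the codec fragmentation shows up.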
It’s a paradigm shift that is going to take a few years, and I think Flash has a lot of things going for it. We work very closely with both Microsoft and Adobe to maximize the benefits to publishers. The biggest opportunity it presents for us is that today, publishers are faced with a bifurcated audience, because they have to address the iPhone, the iPad, Android, set-top boxes, and PCs.
They need a set of tools that allows them to simplify writing and publishing their application without having to think about all the devices, and that represents to us a big opportunity to provide an infrastructure that allows developers to write just once. Our libraries are going to do the heavy lifting for them in order to deploy the app with the appropriate protocols.
Sean: Anybody who’s produced video for the Web, even to put a video on their blog, is immediately faced with choosing a bitrate, encoding, and whether to put it up as an MOV file, Flash, a WMV file, or whatever.
People trying to create interesting content don’t really want to worry about all those technical details, and there is no right choice anyway. It really depends on what a particular user has on their machine.
Like you said, if they’re using a mobile phone one format’s right, and if they’re using a Mac or a Windows PC, a different format might be optimal. I think you’re right that there’s a huge opportunity to deal with those issues and offload the content developer from having to worry about them.
Shay: Precisely. I think that no single company can solve the challenge alone. We’re big believers in creating standards in the market, interoperability, and distributed control.
We’re a founding member of an organization called the Open Video Alliance. That’s an umbrella organization that is bringing together all the people who care about open video. We had a wonderful Open Video Conference last year where more than 900 people showed up to talk about these issues, and 4,000 more were on the live feed.
This year it is going to happen again on October 1 and 2. I’m inviting the developer community and the people who care about it to actually come and participate and just vote with their feet, helping solve that problem on the ground.
We are also involved with the industry resource website called HTML5Video.org, which allows people that have interest to showcase the technology, explore the libraries, play around with it, and experience the potential of HTML5.
Scott: That’s great. Any closing thoughts from your end?
Shay: It’s an exciting time for this market, and we are maybe at the end of the first inning, with eight more innings to go. I think this is the same order of explosion that we saw with the first wave of the Web 10 to 12 years ago, with text.
I think next year, more than half of the TVs are going to be IP enabled, and half of the homes are connected to broadband. Two billion people have access to the production process, and imagination is their only limitation.
I think 2010 is the year of HTML5, and we are going to see massive adoption. 2011 and 2012 are going to be all about productization and adoption of new business models around video. We’re nearing a state where the plumbing of video, so to speak, is done, and the beautiful gardens will soon spring into full bloom.
Publishers no longer have to spend human resources on the infrastructure. The problems are beginning to be well understood—and solved with technology like what Kaltura brings to the market. Now people can start focusing on innovation and actually building the applications and what follows around it.
The Application Exchange we recently launched is an excellent example of that: it allows people to think about higher-level applications instead of plumbing.
How could video drive better distance education experiences? How could it drive better enterprise collaboration? How could it drive better publishing and media experiences? I think that’s the future we’re looking at.
Scott: Thanks so much for taking the time to chat.
Shay: Thank you. For more information check out: