Testing a web site against multiple browsers has been a long-time challenge of
Web development. In this interview, Jason Huggins, CTO of Sauce Labs,
explains how the Selenium project makes multi-browser testing much faster.
- The creation of the Selenium web-testing tool
- The Sauce Labs startup around Selenium to run testing on cloud systems
- Automated web testing with Selenium
- Taking Selenium beyond testing into software documentation and marketing
- The revenue and business models of Sauce Labs
- The maturation of Selenium beyond predecessors and into the emerging cloud ecosystem
Scott Swigart: To start us off, could you introduce yourself and your company?
Jason Huggins: I’m the founder of Sauce Labs. I think my unofficial title is CTO, but the official one on my business card is “Executive Software Chef.” I’m a fan of silly titles, so that’s my silly title. I’m also the creator of the Selenium web-testing tool, and Sauce Labs is built around the things that we wanted to do, as a startup, with Selenium.
I was the tech lead of a team at ThoughtWorks that was writing an in-house time-expense system. Unlike most corporate entities, we couldn’t dictate by fiat that everyone should use the same browser. We had a bunch of alpha geeks there, and people used whatever they wanted to.
That was frustrating, and I decided that we needed to put a stop to it. We decided to come up with a way to write one test and run it across all of the browsers we cared about. That was the genesis of the Selenium tool.
Scott: How did the tool progress from that point to become an open source project?
Jason: I thought I was going to be famous with this time-expenses thing we were building. It turned out, though, that as I was showing people what we had done to test these two browsers, they said they wanted to use it on their own projects.
Those developers wanted access to the code, so we made a formal request to open-source it. One thing led to another, and it kind of took off. We never actually open-sourced the time-expense system, but Selenium definitely took off.
Scott: What was the trajectory from that point? Did the project grow primarily by word of mouth?
Jason: It grew by word of mouth, one project at a time. At one of the last projects I worked on at ThoughtWorks, the developers were frustrated that running tests with Selenium was taking too long, about 30 minutes to run all tests end-to-end. In a classic agile project, you want to run everything often, ideally at least every time you check a file in, with everything done in 10 minutes or less. If the whole process of build and test takes too long, you’ll skip running those tests.
Every project has a different conception of how long is too long, but usually five to 10 minutes is kind of the ideal. On this particular project, total test time was hovering at about 25 minutes. Mutiny was afoot, and I had the idea of getting a couple of extra machines. Using four machines, I wrote some code to split the tests across the four machines, running concurrently, and we got the testing time back to a good level.
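The splitting Jason describes can be sketched in a few lines. This is only an illustration of the idea, not the code he wrote: the names `split_round_robin`, `run_across_machines`, and `fake_run_test` are hypothetical, and the "machines" are simulated with local threads rather than real remote hosts.

```python
from concurrent.futures import ThreadPoolExecutor


def split_round_robin(tests, machine_count):
    """Deal tests out to machines like cards, so each gets a similar share."""
    return [tests[i::machine_count] for i in range(machine_count)]


def run_shard(machine_name, shard, run_test):
    """Run one machine's share of the tests in order and collect the results."""
    return {test: run_test(machine_name, test) for test in shard}


def run_across_machines(tests, machines, run_test):
    """Run every shard at the same time and merge the per-test results."""
    shards = split_round_robin(tests, len(machines))
    results = {}
    with ThreadPoolExecutor(max_workers=len(machines)) as pool:
        futures = [
            pool.submit(run_shard, machine, shard, run_test)
            for machine, shard in zip(machines, shards)
        ]
        for future in futures:
            results.update(future.result())
    return results


def fake_run_test(machine_name, test):
    # Hypothetical stand-in for dispatching a real browser test to a machine.
    return "passed"


if __name__ == "__main__":
    tests = [f"test_{n}" for n in range(10)]
    machines = ["machine-a", "machine-b", "machine-c", "machine-d"]
    print(split_round_robin(tests, len(machines)))
    print(run_across_machines(tests, machines, fake_run_test))
```

If the tests take roughly similar time, four shards running side by side finish in about a quarter of the serial time, which is the kind of reduction described in this interview.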
A couple of months later, I heard through the grapevine that Google was also using Selenium, and that same contact asked me if I had an updated resume. Of course, that’s the offer you can’t refuse, to be courted by Google, so I applied. It turns out they were building a Selenium farm, and I arrived on the scene right when they were doing that.
It turns out that the Writely team, which is now Google Docs, had the same kind of an epiphany, if you will. They were also using Selenium at the time, and it was taking an hour to run their tests. They ran the testing on four machines, and the time went down to 15 minutes.
They started telling other groups at Google, and it became a virtuous cycle where more teams started using this farm, which enabled them to get approval to buy more machines. Having more machines let them divide all of their tests across those machines and make everything go faster. Selenium eventually started being used for Google Apps and Gmail, Maps, Calendar, and lots of apps that I can’t talk about.
Scott: How did Sauce Labs grow out of that experience?
Jason: I was at Google for about a year and a half, but I was looking to get back to Chicago. Meanwhile, I met John Dunham, who is now Sauce Labs CEO, through Elisabeth Hendrickson, an expert in agile testing, and a mutual friend of ours. John wanted to get back into the testing space, and he sought me out just to have lunch with a guy interested in the same field.
I revealed to him I might be looking to do a startup, too. I was the idea guy, but I didn’t have all the business sense to be able to actually run a company. The initial “cocktail napkin” business plan was to prove the viability of throwing hardware at the software testing problem.
We had proved it on a small scale with my little project at ThoughtWorks and also on a big scale at Google. The next question was to determine whether there was an actual market for it, and we started Sauce Labs to answer that question.
It didn’t hurt that cloud computing was the most hyped story of 2009. The type of testing we’re doing at Sauce is the perfect use of cloud technology. We use disposable machines to run a whole bunch of stuff in parallel and run web browsers on all those cloud machines. Customers then drive those machines through our API to execute their tests.
It’s a classic problem where a long, slow process needs to go faster, and we just need a lot of disposable machines to run it against. Sauce Labs wouldn’t be really feasible without cloud technology.
Scott: Do I understand correctly that Selenium provides a plug-in that you use with your browser of choice that lets you record what you do? My understanding is that it emits a script that can then drive the other browsers through those same actions.
It can actually video record what’s on the screen as it’s driving them through those tasks, and it’s got logic to verify that the right thing shows up on the screen. Do I have that right?
Jason: There are several tools in the Selenium toolbox, but the tool you’re thinking of is a Firefox extension called “Selenium IDE”. The video recording is not included in Selenium out of the box; that’s something Sauce Labs has added to our online service to extend the capability and usefulness of running Selenium tests in the cloud.
Taking a step back, though, Selenium specifically is a functional testing tool (also called “acceptance testing”) that can verify features by interacting with a website exactly as a user would. Think of Selenium as a programmable software robot that knows how to use a web browser, and does so in all browsers, not just in IE or Firefox. We use IE, Firefox, Safari, Opera, and Google Chrome, on Windows, Linux, and Mac OS. We have APIs for Java, C#, Python, Ruby, Perl, and PHP.
The core minimum Selenium API is a collection of simple commands: “open”, “type”, “click”, and maybe “verify”. Open a web page, type text into a field, click a couple of buttons, and then ask the page if some text or another element is present or whether some state has changed. That’s the minimum essence of what Selenium does.
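To make those four commands concrete, here is a toy model of them. It does not use the real Selenium client and drives no actual browser: `FakeBrowser`, `run_script`, the `search_button` and `query` locators, and the page contents are all hypothetical, chosen only to show the open, type, click, verify rhythm.

```python
class FakeBrowser:
    """In-memory stand-in for a browser, used only to illustrate the commands."""

    def __init__(self, pages):
        self.pages = pages   # url -> page text
        self.fields = {}     # field locator -> typed text
        self.text = ""       # text of the page currently shown

    def open(self, url):
        self.text = self.pages[url]

    def type(self, locator, value):
        self.fields[locator] = value

    def click(self, locator):
        # The hypothetical search button echoes the typed query into the page.
        if locator == "search_button":
            self.text = "Results for " + self.fields.get("query", "")

    def verify_text_present(self, expected):
        return expected in self.text


def run_script(browser, steps):
    """Execute (command, target, value) steps and return the verify results."""
    checks = []
    for command, target, value in steps:
        if command == "open":
            browser.open(target)
        elif command == "type":
            browser.type(target, value)
        elif command == "click":
            browser.click(target)
        elif command == "verifyTextPresent":
            checks.append(browser.verify_text_present(target))
        else:
            raise ValueError(f"unknown command: {command}")
    return checks


if __name__ == "__main__":
    browser = FakeBrowser({"/": "Welcome"})
    steps = [
        ("open", "/", ""),
        ("verifyTextPresent", "Welcome", ""),
        ("type", "query", "selenium"),
        ("click", "search_button", ""),
        ("verifyTextPresent", "Results for selenium", ""),
    ]
    print(run_script(browser, steps))
```

A real test swaps the fake browser for one that Selenium controls, but the script keeps the same shape: a short list of actions followed by a check on what the page shows.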
Scott: My sense, too, is that it’s not just simulating the browser. It looks like one of the reasons it seems suited to the cloud is that it’s basically starting up a VM with specific OS and browser builds, driving it through certain things, collecting screen shots, video, and whatever, and then tearing it down.
Jason: That’s actually what makes us unique as a cloud vendor. Everyone else is putting servers in the clouds, but we’re actually booting up the desktop environments and are running what you would otherwise call “client applications” in the cloud on those servers.
One of the more cryptic tag lines we had was, “Web browsers as a service.” That probably doesn’t play in Peoria, but it kind of gives you a sense of what we’re doing. We’re launching these desktops that you can connect to remotely and see that they are starting up IE or Firefox. It really is starting up the browser and you get to see everything a user would see.
Since you mentioned video, I will add that it’s been really important to me to add that functionality, to see directly what happens when a browser running in a cloud crashes, for example.
It was important to me in creating Selenium, since I am very visually oriented, to see what the browser was doing. I want to see whether it looks good or ugly, and I don’t really believe the application’s working until it actually runs in IE and Firefox. And by default, when you run things in the cloud, you can’t see what’s going on, so we put in the effort to add back that visual feedback when customers use our service. Video recording of every test has been a pretty popular feature with our users. They don’t just have some log files to look at. They can actually view the video, share it with their coworkers, and so on.
Scott: I would imagine that as the web becomes more interactive, testing gets a different or expanded focus, as well. For example, with a traditional shopping cart app, you need to make sure data is handled the right way, and that at the end, the “your purchase has been completed” string shows up on the final page.
On the other hand, to test the functionality of dragging a map or something like that, there’s not a string to test for. In some of these cases, the only way you’re actually going to know if it really worked is to look and see whether the map has been dragged to the right position or whatever.
That need for video suggests that you are always stuck with a certain number of tests that require some human subjectivity, even if you can automate them to some degree.
Jason: That’s true, although one of the “maybe someday,” crazy ideas I have is to add in some kind of “Mechanical Turk” functionality. For instance, once the video is created, perhaps you can automate answering the question of whether the functionality worked, even if it takes a human to answer the follow-up question of whether it looked ugly.
In other words, even if there are still questions that only a human could answer, it would be easier for them to do that by watching a movie and answering a couple of multiple-choice questions, instead of requiring them to actually perform the entire workflow manually themselves.
I read a blog post recently by Hugh MacLeod, where the author suggested that, if you’re doing a start-up for any kind of company, there needs to be some kind of social object: some conversational aspect of what you do or produce.
For us, that’s this video that we can pass around, which can include assertions about whether it is an example of a passing test or failing test. If the test failed, it becomes a social thing that you can pass around to the other developers to say, “Look at the one-minute mark; you can see where the bug appears and the test fails.”
Ideally, you have a test for every feature in a website. That might be sending a message in Gmail, doing a product search on eBay, or buying something on Amazon.
Scott: It seems like that could have implications beyond app testing. What kinds of things have you considered elsewhere in the life cycle?
Jason: I would like to see some of the lines erased between documentation, marketing, and functional testing. My best example is an iPhone commercial, or an Android one these days. All they do is zoom in on the phone and show you opening an app, clicking buttons, and seeing how cool it is.
They just show one feature, and how well it’s working, which is actually an ideal use case for Selenium and Sauce Labs. People write these tests so they show off one little feature, then there’s a video for it. Ideally, if you are a software developer, and you’re coding up a project, you add more features over time to the thing you are building.
You could have this continually growing little check-list of the features that you have written, and you can drill down and look at the recorded video of that feature working on the latest version of your code.
You can see the history of that feature working, but specifically you can actually see it working over the life of the project. People are doing screencasts now to help sell their software, putting up little YouTube videos on their site.
The old approach was to let people try before they buy, downloading a trial before they bought it. With the unfocused attention of most people these days, now they want to watch the video before they even try it.
I’d like to make it so these videos get recorded all the time with all of your little features. You can use them either as-is or as your prototype for making little feature screencasts you can put up on your website.
Scott: It also potentially solves the problem of keeping the docs in sync, especially with software as a service solutions that might roll out little changes daily, weekly, or every couple of weeks.
Keeping the videos in sync could become fairly automatic, since if it’s been tested it’s also documented, and the only stuff that’s undocumented is the stuff that’s untested.
Jason: Actually, I’d like to see testing move away from being so focused on the tests themselves, and toward looking at the tests as more like screenplays, with the real focus being on the videos they produce. Those would be the actual artifacts that you care about.
As recently as five years ago, project testing was usually kind of a late process activity, done by one or two people in the corner. You called on testers and they used really expensive commercial tools, in isolation from everybody else.
Testing with Selenium follows the growing adoption of agile methodologies where testing is something that developers do all the time. By analogy, I see a lot of those marketing screencasts being done very late in the process, and not by the developers. Screencasting today is where testing was five years ago.
I believe screencasting could benefit greatly by being pulled into the development process using the video capture functionality from Sauce Labs or similar tools. I think that the key glue there is automation, as well as encouraging testers and developers to think of themselves as screenwriters showing off their applications, rather than being so focused on acquiring the pass or fail data point.
Scott: There’s Selenium, which is an open source project, and then as you said, wrapped around it is Sauce Labs. Talk a bit about what Sauce Labs does. What do you add beyond the open source project? What’s your revenue model?
Jason: We started with the single focus of Selenium and cloud computing. As a request comes in, we either route that request to an existing running machine, or we boot more machines from a cloud vendor. There’s a lot of code needed that’s not in Selenium to manage this cloud environment and scale the infrastructure.
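The route-or-boot decision described here can be sketched as a small pool. This is an assumption-laden illustration, not Sauce Labs code: `BrowserPool`, `make_fake_booter`, and the machine naming are hypothetical, and booting a cloud machine is replaced by a function that just returns a new name.

```python
import itertools


class BrowserPool:
    """Hand out an idle machine for a browser if one exists, else boot one."""

    def __init__(self, boot_machine):
        self.boot_machine = boot_machine  # callable(browser) -> machine id
        self.idle = {}                    # browser name -> idle machine ids
        self.busy = {}                    # machine id -> browser name

    def acquire(self, browser):
        idle_machines = self.idle.get(browser, [])
        if idle_machines:
            machine = idle_machines.pop()          # route to a running machine
        else:
            machine = self.boot_machine(browser)   # otherwise boot a new one
        self.busy[machine] = browser
        return machine

    def release(self, machine):
        # A real multi-customer service would also wipe the machine here
        # before making it available to anyone else.
        browser = self.busy.pop(machine)
        self.idle.setdefault(browser, []).append(machine)


def make_fake_booter():
    """Hypothetical stand-in for asking a cloud vendor for a new machine."""
    counter = itertools.count(1)

    def boot(browser):
        return f"{browser}-vm-{next(counter)}"

    return boot


if __name__ == "__main__":
    pool = BrowserPool(make_fake_booter())
    first = pool.acquire("firefox")
    second = pool.acquire("firefox")
    pool.release(first)
    print(first, second, pool.acquire("firefox"))
```

The design choice worth noting is that idle machines are tracked per browser, so a request for Firefox never gets routed to a machine that was booted for IE.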
That’s the bulk of what we do, but we also collect log files, record video of each test, and make all that data accessible through our website. We also add support for this notion of multi-tenancy, in the sense that we have to make it work with lots of customers, so we have to set up and tear down cleanly between customer uses. That’s more code we have to write and maintain.
Along the lines of those security considerations, that’s how I found my other co-founder, Steve Hazel. He created a site called codepad.org, and you actually can paste in code from pretty much any language, and it’ll run it on his servers in EC2.
In addition to the actual security code itself, and managing how to properly sandbox running code in the cloud, Steve also provides that mindset of being security conscious and vigilant about how people can use and abuse your site.
Along those lines, we’ve expanded our mindset beyond just providing cloud execution to become the Selenium support company more generally. Because Selenium’s already been out there for five and a half years, a lot of people who have a pretty big installed base may not want to use our cloud service, but they do want a phone number or an email address where they can get support, bug fixes, updates for the latest browsers, and things like that.
We’ve expanded the scope of the company to a classic open source model of being the obvious company that you go to for help when you use this particular tool. Blended with that, we also buy cloud resources at wholesale rates by the hour and sell them by the minute.
Scott: I think you mentioned that the cloud provider behind this is Amazon EC2. It would seem that there are going to be a lot more choices as time goes on. I could envision you using multiple cloud providers simultaneously, transparently to the user, because it’s at a lower abstraction level where the VM is actually running, versus the data that the user’s getting back.
Jason: Right. There’s also an argument to be made that, because we’re telling browsers to go to some website, ideally those browsers should be as close as possible to the active servers that we’re testing.
It’s a simple matter of the speed of light. If the server we’re hitting is in Japan, and our test lab is in Virginia, latency is going to be a problem. Therefore, we’d like to be wherever our customers are, in the sense that if they’re inclined to host their application on EC2, our browsers will be there, too, so the network obviously is really fast.
There are lots of good reasons to be cloud-vendor neutral.
Scott: Obviously, if you’re in the same geography as somebody that helps, but if you’re in the same data center, that’s even better.
Jason: If nothing else, your bandwidth is free when it’s inside the data center.
Scott: On another topic, in getting ready to chat with you, we went out and did a little bit of homework and came across Adobe BrowserLab, which seems to provide some services that are similar to Selenium. Can you talk a little bit about some of the things that are unique about Selenium?
Jason: I’m glad you asked that, because it’s where I think the original synapses fired and I was like, “Hey, maybe there’s something here with Selenium as a service.” The word “cloud” didn’t exist a couple of years ago, but the phrase “as a service” was in fairly broad usage.
What was out there was very similar to that Adobe site. There was something called BrowserCam, which sounds shady, but it’s not.
The limitation there was that all of them were what I might characterize as just single-page snapshot services. They specifically market themselves to designers who just want to see whether a page that they’re developing renders right in all these different browsers. They don’t really target developers who need deeper functionality than that.
Also, at the same time, five years ago, virtual machines were not as commonly used, so if you only had a Windows machine and you wanted to see what a site looked like on Safari on Mac, you would submit your URL and get a screen shot back.
The most obvious limitation is that you can only give it one URL, and that’s it. They can’t test a multi-step workflow behind a login. So if you had, let’s say, a shopping cart, what you might really be interested in is to go from the front page, add something to the shopping cart, click “submit” and get all the way to the end of the purchasing process, then take a screen shot.
None of those screen-shot services could do that, unless you add some kind of automation mechanism to be able to navigate through the site to get to that point and then take a screen shot.
The genesis of what I originally wanted to do in the cloud was to take a classic screen-shot service and combine it with Selenium. Of course, I kind of took it to 11 and said, “Hey, why not, instead of just screen shots, record a movie?” And then you’ve got, theoretically, 30 screen shots per second that you can play with.
Scott: Is the size of the movies prohibitive at all? Does it take screen shots at set points, such as when it knows that a response is finished rendering, or is it really full-motion video of the browser?
Jason: Right now, it’s full-motion, in the sense that we’re running VNC as the protocol, remote desktop. VNC and RDP are the two main industry-standard protocols. VNC is kind of the open-source lowest common denominator, and RDP, which is the default one for Windows, is the next step up.
We just have a VNC server running on all of our test machines, and instead of connecting as a client application, we have a software library that I added some improvements to. The open source version of that library is called Castro. Instead of a user watching the remote screen, you dump the VNC traffic to what ultimately becomes a Flash video file.
Of course, we have done tweaking here and there, although it is inefficient if you do 30 frames per second. At five or even three frames per second, it actually works out well enough, though.
We’ve made some other improvements as well, although there are still more to be made. Specifically, I would love to bookmark the moments in the video where the interesting things happen, synchronizing it with the command you’re giving to the browser.
To your point, I could see where, instead of watching a video to get to those little bookmarked items, you could end up just getting something that’s a little bit more static. Using the raw video, for instance, you could go to each point of interest, pull out the video frame, and produce a sort of photo book or slideshow of just the relevant frames.
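The bookmark idea amounts to mapping each command's timestamp onto a frame number in the recording. The sketch below assumes a command log of (milliseconds into the video, description) pairs and a fixed frame rate; both the log format and the function names are hypothetical, since the interview does not describe how such a log would be stored.

```python
def frame_index(milliseconds_into_video, frames_per_second):
    """Frame showing at the given offset, using integer math to avoid rounding surprises."""
    return milliseconds_into_video * frames_per_second // 1000


def slideshow_frames(command_log, frames_per_second):
    """Pair each logged command with the frame to pull out for the slideshow."""
    return [
        (frame_index(milliseconds, frames_per_second), description)
        for milliseconds, description in command_log
    ]


if __name__ == "__main__":
    # Hypothetical log: offsets in milliseconds, at five frames per second.
    log = [(0, "open /"), (1200, "type query"), (2500, "click search")]
    for index, description in slideshow_frames(log, frames_per_second=5):
        print(index, description)
```

With the frame numbers in hand, producing the "photo book" is a matter of extracting those frames from the video, a step this sketch leaves out.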
Scott: It would be similar in some ways to apps that detect when a sufficient number of pixels change, and so they know that some kind of transition happened.
Jason: Right. Perhaps it could be streamlined down to a slide show of only the key parts of the work flow, making reviewing the results go faster. Since we have recorded all of the video, we have all of the data, and we can then edit it in any number of ways.
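The pixel-change approach Scott mentions could look roughly like this. It is a minimal sketch under simplifying assumptions: frames are plain lists of rows of pixel values, all frames have the same size, and the threshold is an arbitrary illustrative number. The names `changed_fraction` and `key_frames` are hypothetical.

```python
def changed_fraction(previous, current):
    """Share of pixels that differ between two same-sized frames."""
    total = 0
    changed = 0
    for previous_row, current_row in zip(previous, current):
        for before, after in zip(previous_row, current_row):
            total += 1
            if before != after:
                changed += 1
    return changed / total if total else 0.0


def key_frames(frames, threshold=0.25):
    """Indices of frames that differ enough from the frame just before them.

    The first frame is always kept so the slideshow has a starting point.
    """
    if not frames:
        return []
    keep = [0]
    for index in range(1, len(frames)):
        if changed_fraction(frames[index - 1], frames[index]) >= threshold:
            keep.append(index)
    return keep


if __name__ == "__main__":
    blank = [[0, 0], [0, 0]]
    one_dot = [[0, 0], [0, 1]]
    filled = [[1, 1], [1, 1]]
    print(key_frames([blank, one_dot, one_dot, filled], threshold=0.5))
```

Raising the threshold keeps only large transitions such as page loads, while lowering it also keeps small changes such as a single field being filled in.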
Of course, people still come to us just expecting screen shots, which we will do for them. I think the next step is to synchronize the API traffic, the clicking, the typing, and so on, to make it easier to watch that flow of pictures, rather than having to explicitly take a screen shot of it the whole time.
Scott: Well, I think we’ve covered most of the stuff that I had planned to talk about. Any closing thoughts on your end?
Jason: The only thing that comes to mind is that we didn’t really get into the differences between Selenium user groups. There are developers on the one hand, using programming languages like Java, C#, Python, Perl, and Ruby. On the other hand, there are other Selenium users who don’t really do a lot of programming, but they use Selenium IDE specifically.
Sauce originally started out just to target those people who are using Selenium Remote Control. As we become more of a general Selenium company, we’re also embracing those users who are a little bit newer to automated testing, who are using Selenium IDE. We are branching out to do more things for those people as well. For example, we have a version of Selenium IDE, called Sauce IDE, that integrates with our Sauce OnDemand cloud service. So within a few seconds of creating a test, you can start using our cloud service from inside the IDE.
Scott: Thanks for taking the time to chat.
Jason: Thank you.