In this interview, we chat with Andy Chou, the chief architect at Coverity, a
company that produces proprietary software analysis products. In addition,
Coverity has been scanning over 300 open source projects, finding more than
11,000 defects that have since been fixed by the open source community.
Scott Swigart: Why don’t you introduce yourself and your role at Coverity?
Andy Chou: I'm chief architect at Coverity. I really wear many different hats, but basically, I'm responsible for the technology direction of our products. Coverity was founded seven years ago, and we build products that help companies improve their software integrity. Our flagship product performs static analysis to find defects in source code automatically.
This is an interesting space, because we do software development ourselves, but we also analyze other people's software. We get to see a wide variety of processes, tools, and products.
We started out as a research project at Stanford University. The potential of static analysis technology was clear after we spent one weekend in 2000 and discovered hundreds of defects in the Linux kernel.
In 2003, we commercialized the technology and bootstrapped the company for four years without much external funding. We've since taken our first round of VC funding, and we've been expanding internationally and growing our product line.
Scott: Various open source projects use static analysis, and I know that Coverity has scanned something like 300 open source projects and provided those results back. Some projects have worked harder than others to root out the issues found.
Before we go there, though, talk a little more about integrity scanning and higher-level issues related to software quality and design.
Andy: One of the things we discovered as we started to expand is that companies are interested in more than just the code in terms of software integrity. They’re interested in how the code is designed. They’re interested in how it’s assembled by the build process. They’re interested in how it’s tested and whether there’s anything that can be found by automatic tools during QA.
We have different modules that together form the Coverity Integrity Center. It’s our combined offering that addresses different parts of the development life cycle.
There are a lot of tools in the Application Lifecycle Management space that focus on streamlining and automating processes in software development, such as tracking incoming bug tickets or managing source control.
We consider ourselves to be in an adjacent space because we analyze the artifacts of software development and really dig into how they are put together and what they are doing. For example, a source control system treats source code as nothing more than a pile of text. We treat source code at a more semantic level, by parsing it and analyzing its behavior on specific paths.
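To make that distinction concrete, here is a minimal, hypothetical C fragment (not drawn from any scanned project) with a defect that exists only on certain paths. A text-level view of the code shows nothing unusual; a path-sensitive analysis can follow the branch where the buffer is never allocated:

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical fragment: the defect exists only on some paths. */
    void log_message(const char *msg, int verbose)
    {
        char *buf = NULL;

        if (verbose)
            buf = malloc(strlen(msg) + 1);

        /* BUG: on the path where verbose == 0 (or where malloc
         * fails), buf is still NULL and strcpy dereferences it. */
        strcpy(buf, msg);

        free(buf);
    }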
Scott: When you’re talking about analyzing code, you have to consider the broad range of programming languages out there: C#, Java, PHP, and all kinds of other stuff. When you’re getting more into the design and build process, even though there are artifacts, there are also build instruction files, for lack of a better term, and things like that to consider.
How did you choose where to invest? Back in the day, organizations used a fairly discrete set of technologies, whether it was .NET, Java, or C++. As time went on, we got Ruby on Rails, different kinds of cloud platforms and services, development for handhelds, and whole ecosystems growing up around those.
I assume that supporting each one of those requires a separate investment from you, so how do you decide where you're going to focus your resources? It seems like it would take a crystal ball to determine what's going to be around for a long time.
Andy: We focus primarily on areas where software integrity is essential: safety-critical or business-critical systems. Often, this means embedded software, such as software that's going into a device, a plane, a car, or a router, where it can be very, very expensive to fix a problem (think recall, or flying an engineer on-site to debug a problem).
The amount of software that’s in these devices is growing tremendously. I think it’s not really widely understood that software systems have become truly enormous over the past decade or so. For our customer base, a million lines of code is not considered very large.
If you printed out a million lines of code, it would be a six-foot-high stack of paper. It's far too much for a single person to truly comprehend all the details. It's hard to imagine any human product on that scale that has no errors in it. Some of the larger individual code bases we work with run into the tens of millions of lines of code.
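(As a rough sanity check, assuming around 55 printed lines per page and paper roughly 0.004 inches thick: 1,000,000 lines ÷ 55 lines per page ≈ 18,000 pages, and 18,000 pages × 0.004 inches ≈ 73 inches, or about six feet.)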
And by the way, that’s just one variant of that code base. We all know that in real software development, there are many branches of development that share a great deal of the code. We also have products put together by sharing different pieces of software across teams, and the complexity grows tremendously.
It’s always a challenge to understand where to invest in order to maximize your impact, but we believe that listening to our customers is of paramount importance. They tell us which types of their software are most critical, and the most critical stuff is where automated scanning is most valuable.
Scott: You mentioned embedded devices, which makes a lot of sense, since they take a long time to develop, and they are very difficult to patch. You can push a patch to some types of embedded devices, but there are others that don't have a network connection, and it basically takes a product recall to make changes.
Cars are a great example, one that I think brings home to the general public the fact that software is everywhere.
Andy: The average person probably thinks of software as something they interact with when they turn on their PC, but the reality is that most of the software we all interact with on a day-to-day basis is not on a desktop computer.
Some of it is in your car, your mobile phone has tons of software on it, and you use software almost every time you purchase something. Furthermore, when you get a piece of information off the Internet, you use many layers of software. It’s not just on your computer and the website you’re going to, but also on all the infrastructure in between – the routers, switches, wireless access points, firewalls, and database servers. There’s an incredible amount of software that supports mundane everyday activities, and that’s not really well understood.
At Coverity, we think in terms of the software supply chain. When companies put together applications or devices, they're no longer developing all of it themselves. So much software development is about pulling together components from different suppliers (including open source), integrating them, and customizing them.
Scott: Let’s switch over and talk a little bit about the work you do with open source projects.
Andy: We worked with the Department of Homeland Security (DHS) putting together a scanning service for open source projects called Coverity Scan. That effort began in 2006, and today (early 2010) we scan over 300 different open source packages and have found over 11,000 defects that have been fixed by the open source community.
The DHS was looking for a way to have a big impact on software integrity across a spectrum of different kinds of open source projects. They saw that static analysis technology could apply to millions of lines of code and provide a quick way to harden open source infrastructure. Open source is used by virtually every company and government agency.
The economics of using shared code is compelling. But the downside is that defects in the shared infrastructure propagate everywhere. I think the DHS was concerned about this growing ubiquity, and what could be done to accelerate fixing more defects earlier.
Scan has been really successful in terms of raising awareness of what static analysis can do.
Scott: You bring up a good point about evangelizing the benefits of static analysis. I think for a long time in the open source community, there was the notion that once code had been posted to a mailing list and looked at by a lot of people, that scrutiny would find all of the bugs.
When Coverity started scanning projects and turning these reports over to the maintainers, some projects probably really appreciated it, while others may have been more inclined to dismiss the results as a bunch of false positives. Some people would hesitate to believe that a program could really understand their code.
How have you seen the thinking about the benefits of static analysis evolve among open source projects over the past three years or so?
Andy: There has been a greater embrace of static analysis, both within the open source community and among commercial software makers. Part of the reason is that static analyzers have gotten better. They’re starting to think about software more like a person would and less like a machine enforcing rules blindly.
Part of why we think static analysis is being adopted more is that it has begun to provide enough information to understand the problems that could arise if you don’t fix a particular bug. That makes it more valuable and clearer that there is a real issue.
Some open source communities have really taken the results and run with them, fixing virtually everything that's found by our tool, whereas others are a lot slower to adopt it. We've found a high correlation between the projects that adopt static analysis and those that do other things, like extensive unit testing, that also help with quality.
Scott: In other words, certain projects broadly adopt tools and techniques to raise quality as part of their culture. Those measures might include static analysis, unit testing, and maybe other things.
Andy: And they see static analysis as a useful technique that can address some of the gaps left by other techniques: it can achieve coverage of all of the code without requiring test cases that cover everything, and it can catch things that a developer might overlook.
Static analysis is good at looking at all of the different paths in the software, which people are not so good at.
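A small, hypothetical C illustration of why: each independent branch doubles the number of paths through a function, and only particular combinations may be broken. A test suite rarely exercises every combination, but a path-sensitive analyzer can reason about each one:

    /* Hypothetical: three branches give up to eight paths, and the
     * defect shows up only on particular combinations of them. */
    int compute(int a, int b, int c)
    {
        int divisor = 1;

        if (a > 0)
            divisor = a;
        if (b > 0)
            divisor -= 1;
        if (c > 0)
            return 100 / divisor;  /* BUG: division by zero, e.g. on
                                    * the path a <= 0, b > 0, c > 0 */
        return 0;
    }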
Still, it's been very difficult to get companies and open source adopters to really incorporate static analysis deeply into their daily workflows. We have found that folding defect detection into a nightly build process, for example, really helps.
In that scenario, the developers are talking about defects that are fresh and about code that they just wrote, which really brings up the relevance level. Even for commercial software development, that matters quite a lot.
A lot of companies that use our software, although not all, see the initial set of defects that come out of the tool and say, “Well, it’s too much work to fix all of those right away, so we’ll baseline that and fix everything that comes in new after today.”
Scott: That makes sense. Like you said, if they’re running it over a million or ten million lines of code, and thousands and thousands of defects come out, they might say, “Well, the software works today, and we’re not getting a lot of complaints about the quality, but we want to continue to raise the bar.”
It also seems like a pretty good educational tool. You check your code in before you go home at night, and in the morning, there are some defects waiting for you. That could very quickly help you change your own development practices to avoid defects.
Andy: We recently published a paper “A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World” in the Communications of the ACM that described some of the lessons we’ve learned over the past few years in terms of making static analysis work in a commercial setting.
One of the continuing lessons we’ve learned is that incorporating a new technology like static analysis into real, commercial software development really requires you to understand how people work and some of the complexities of their environment.
One example is that in our forthcoming Coverity 5 version, there's going to be support for multiple code branches and for identifying defects in shared code, a situation I think almost every software development shop has. They have a release version, a development version, perhaps a branch where they're doing advanced work, and perhaps branches for customers that have special sets of requirements.
That is a very common scenario, but all of those different branches are variants of the same code base. It’s the same when you have different targets. You may have a Windows version, a Linux version, and versions for other target platforms.
Even in embedded applications, each device may have a different CPU or set of devices and, therefore, a different layer on the bottom to interface with the low level components. Some phones have cameras and some have Bluetooth, but not all of them. The variants between these different products and branches are part of the real life situation in software development.
Supporting that well requires more than scanning a code base and finding a bunch of bugs. It means digging into where else the same bug may arise. If I understand your branching structure and the ways in which you put together all of the software components into different products, I can tell you which products might be impacted by this bug.
That can be really useful. If you have a bug and you fix it in one branch, you may not realize that it’s also present in others. If you don’t also find it there and tell the developer about it when it’s timely, you may miss out on an opportunity to save quite a lot of time.
Scott: You mentioned that you’ve got an upcoming release, Coverity 5. Give us a little bit of a preview, in terms of what is coming in that version, and even some of the interesting problems that you’re thinking about beyond that.
Andy: One of the features we’re excited about in Coverity 5 is defect impact mapping. It gives developers a quick way to see what the most likely high-impact defects are. Instead of thousands of defects in a flat list, you can focus in on those that are most likely to be critical, such as those that involve things like memory corruption and memory leaks, which can be very expensive to find and to fix.
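As a rough, hypothetical illustration of the defect classes that would rank as high impact (the classification here is illustrative, not Coverity's actual scoring):

    #include <stdlib.h>
    #include <string.h>

    void process(const char *input)
    {
        char small[8];

        /* Memory corruption (typically ranked high impact): no
         * bounds check, so a long input overwrites adjacent
         * stack memory. */
        strcpy(small, input);

        char *scratch = malloc(256);
        if (scratch == NULL)
            return;
        /* ... use scratch ... */

        /* Memory leak (less immediately catastrophic, but expensive
         * to track down in a long-running system): scratch is never
         * freed before the function returns. */
    }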
When you combine insight into which defects are critical with which defects occur in multiple code branches or streams, it’s a huge step forward in understanding the impact of defects on the business.
A third feature is that we've mapped all the results that we find to the Common Weakness Enumeration, the CWE, which is an industry standard way of talking about software vulnerabilities. It's something that our customers have been asking for, because it allows them to understand defects not only in terms of how Coverity classifies them, but also in terms of how the industry talks about them.
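For instance, here are a few classic C defects alongside the standard CWE entries they map to (a hypothetical snippet; the CWE IDs shown are the standard identifiers for each defect class):

    #include <stdlib.h>

    void cwe_examples(void)
    {
        int *p = malloc(sizeof *p);
        *p = 42;        /* CWE-476: NULL pointer dereference, if
                         * malloc failed and returned NULL */

        free(p);
        *p = 0;         /* CWE-416: use after free */

        int *q = malloc(sizeof *q);
        (void)q;        /* CWE-401: memory leak; q is never freed */
    }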
Scott: You've talked about a lot of ways of finding and analyzing defects. What are some of the hard problems that are left? I think that, in terms of compute problems, this is probably right up there with speech recognition and automatic language translation as a really hard computer science problem.
Andy: In our space, every time you turn around you face an NP-complete or undecidable problem. What we've found is that you really need to be careful about the trade-offs made when finding defects. You simply can't find all defects with a low false positive rate while scaling to millions of lines of code and remaining almost completely automatic.
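A tiny, hypothetical sketch of why: whether the dangerous line below can ever execute depends on whether mystery() can return zero, and answering that for an arbitrary function is as hard as deciding what an arbitrary program does. Any analyzer has to approximate, either reporting the defect (risking a false positive) or staying silent (risking a false negative):

    extern int mystery(int x);  /* arbitrary, possibly enormous code */

    int check(int x)
    {
        if (mystery(x) == 0)
            return 1 / (x - x); /* division by zero, if ever reached */
        return 0;
    }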
The balance between these many factors is something that you have to strike very, very carefully, because a tool needs to work for very large code bases as well as very small code bases, and on code bases written using different styles.
One of the challenges we continually face is how to make software analysis something that really customizes itself for the particular code base that’s being scanned. You should treat code bases like patients, in the sense that you don’t treat each patient exactly the same way.
If you look at their history, the processes used to develop them, and the idioms used in the code base, you may come up with a different set of diagnoses about what’s wrong. That’s a relatively new field. We’ve done some of the work in our products, but I think there’s a lot of room to grow in that dimension.
If you're talking about a web application, for example, security may be of paramount importance, but if you're talking about an embedded device, you may have a different set of concerns. The blind scanners that just look for a list of problems don't necessarily have that context, so they may look for things that are a lot less relevant, just because they don't understand what you're trying to do with this application, how you developed the software, and what idioms you're using.
That's one area for future work. From a top-down point of view, I am interested in pushing software analysis more deeply into the workflow and incorporating information from different sources.
Correlating the information between different systems that track bugs and track changes in source code is another interesting area for the future. There's tremendous value in bringing together information from many different sources and using it to better determine whether software is ready to release, to gauge its quality, and to determine which teams are doing a good job.
Scott: This is definitely an interesting space, and it’s good to see it becoming more mainstream. We’ve covered most of the stuff I had on my agenda, so do you have any closing thoughts?
Andy: Despite the growth in the past few years, this space is still kind of nascent. I think the market is large for tools that integrate information from different software components.
One of the things that we’re looking at is how to increase the quality of the supply chain, in addition to just individual software projects. That’s an area that we think will require some automated analysis. It’s too expensive to manually look at code bases that are being produced by third parties or by suppliers.
There's no common regulation or standard by which software quality is measured today. So, going beyond just finding defects, how do you determine what high-quality software is? I think that's an open question, because today there is no real metric for quality.
There’s a lot of work to be done in that area, and that’s going to make for an interesting future.
Scott: Great. Thanks for taking time to chat. This has been a great interview.
Andy: Thank you.