The Clang Static Analyzer
Posted By Quentin Carnicelli on July 14th, 2008
In recent years, my patience has increased such that I’m now often content to wait for software to be out of beta before trying it. But some things are just so good that even in their early, buggy stages, putting up with their beta-ness is well worth it. That certainly is the case with the Clang Static Analyzer.
You may have heard about it at WWDC, or when it made the rounds last week. In short, Clang Static Analysis is like an extended set of compiler warnings for your code (C and Obj-C as of this writing). However, the Clang Checker has a vastly better understanding of your code than a compiler. It can detect memory leaks, double-frees, bad pointer references, and other such bugs that keep you up at night.
When run, the checker produces nice interactive HTML reports. Here’s an image of the report for Nicecast:
We choose not to display the Dead Stores and Missing Deallocs, as they are rarely bugs (at least for our code). Some of the memory leaks are false positives as the checker is confused by some things, like objects that live for the lifetime of the application. But that memory leak listed for PMRunLoop.c was real. Clicking on “View” link gives the following:
As you can see, this is a very nice syntax colored HTML listing of the source, with comments inserted showing how the leak occurs. In this case the bug is obvious: we commented out the call to Release and forgot to undo it later. In other cases the checker has found leaks for us in far more subtle places, such as large recursive functions with multiple exit points.
On our first run over our entire code base (nearing half a million lines), Clang found one major crashing bug in Nicecast that had been plaguing us for awhile, several other possible crashers in our various frameworks, and more memory leaks then you can shake a stick at (although most were in edge cases such as error handling).
After fixing all these problems, we went and re-ran the checker to generate new reports. With almost forty separate projects to run the checker on, we realized that automating this would probably be preferable. So I invested some time upgrading our build system such that it generates checker reports nightly, and posts them to our development wiki and nightly builds RSS feed. Now, each day, we get a report like this:
While I am still hesitant about beta software, especially beta development tools, I very much recommend the Clang static checker to all developers. The amount of time it takes to use it is dwarfed by the time it saves in improving the quality of your code.
Ted Kremenek says:July 14th, 2008 at 10:46 pm
We’re very interested in feedback from users to help us improve the analyzer. This includes plugging false positives, implementing new checks, and improving the analyzer’s diagnostics (i.e., bug reports). In the case of false positives, my belief is that either a check produces a low number of false positives or it is not useful (as people will ignore it). The “missing dealloc” check definitely feels like something that was turned on by default too early, so feedback on false positives for this check in particular would be extremely helpful.
Quentin Carnicelli says:July 15th, 2008 at 2:05 am
Ted – Overall we are very happy with it as is.
Most of the MissingDealloc false positives are coming from Window/View objects whos ivars are all IBOutlets (and thus released by NSWindowController or the like). Over time we’ll probably just add -deallocs to all our classes, as stylistically I think that is a beneficial thing to do.
The DeadStores on the other hand, bug me more. We often do DeadStores on purpose, like: int err = doStuff(); //Ignore error because we don’t care if doStuff() fails. This way the code annotates that an error is returned, and ignored on purpose. These I have no plans on changing.
I believe the true solution for false positives though, is a way of tracking these bugs, this way a false positive can be marked as such, and ignored from future reports. Between BUGDESC, BUGLINE and BUGPATH we can almost do this now, but not quite. If we had just a little more information like the function the bug is in, and the line number relative to the start of the function (or similar), that would be enough for us to build a persistent tracker on top of the reports.
Kevin Ballard says:July 15th, 2008 at 2:14 am
Quentin, if you want to annotate that a function’s return value is being explicitly ignored, I would suggest something like: (void)doStuff();
table says:July 15th, 2008 at 8:02 am
I’m happy Mac/Objective C land is finally getting static analysers. Are there any continuous integration build servers that group together unit tests, Clang’s static analysis and other reports?
Jeff says:July 15th, 2008 at 9:49 am
Kevin, that’s what I’ve been telling him, but he doesn’t like that. ;-)
Robert ‘Groby’ Blum says:July 15th, 2008 at 2:02 pm
Can’t speak for the RA guys, but in my case, I’m also reluctant to use the (void)doStuff();
Why? Because by assigning it to a variable, you can see the error code in debug builds, or potentially even attach conditional break points.
Dan Villiom Podlaski Christiansen says:July 15th, 2008 at 2:49 pm
I’d suggest adding a line below the statement:
When I write new code, I generally enable as many checks and diagnostics as possible on the assumption that it will improve the quality of the code, and make it more maintainable and readable. I find the aforementioned method a good way of saying “yes, I know this is unused” in e.g. main(). Wrapping it in a macro and optionally using __attribute__((unused)) might be even better.
(I’m a measly CS student, so I wouldn’t know whether it actually works out there in the Real World ;))
Ilja Iwas says:July 15th, 2008 at 4:38 pm
Thanks for taking the time and writing this article. Otherwise I’d never known this tool even existed.
Ted Kremenek says:July 16th, 2008 at 1:15 pm
For the MissingDealloc check, yesterday (after reading your comments) I added support in the analyzer to not flag warnings for objects whose ivars are all IBOutlets (please report problems when this doesn’t work). This should now be in checker-62 or later (available at http://clang.llvm.org). If anyone has ideas of how the MissingDealloc check can be further improved, please feel free to file a Bugzilla report via the Clang website.
As for the dead stores checker, its purpose is to identify logical mistakes, not penalize legitimate code, so I am completely open to making it smarter to work with different coding idioms. This checker has actually found interesting bugs in code at Apple, with sometimes the check identifying performance bugs and other cases identifying insidious logic errors. From the performance side, although the optimizer in a compiler can optimize away the store itself, it cannot always optimize away the computation used to generate the value that would be stored to the variable. This is particularly the case when the value is computed via Objective-C message expressions; since Objective-C is so dynamic, the compiler takes a conservative approach and typically cannot ever inline the called methods. This means that dead method calls will always be executed, so dead stores can be useful in identifying optimization opportunities for the programmer.
To help address some of your own concerns, in checker-62 I also added support in the analyzer to distinguish between “dead initializations” and “dead stores”, which may go part of the way to give you a way to automatically filter some of the false positives for your specific coding idiom. I realize that this may not actually help you for your use of error variables, but in general it should be helpful for users to better triage these bugs. Any suggestions of how to further refine the reporting of dead store bugs to make this check more useful are welcome!
As for reporting more information such as BUGDESC, BUGLINE, etc., we can easily report more information. Please feel free to file Bugzilla reports with such feature requests! We want to make the workflow of using this tool both flexible and fluid, so suggestions of how the workflow can be improved are greatly appreciated. There has also been some mention on the Clang developers’ mailing list to potentially support XML output from the analyzer. This would help enable others who wanted to build their own report tracking and presentation infrastructure. This is something that I personally won’t have time to work on in the near future, but since the entire tool is open source, anyone who is interested in helping is welcome to do so (more information on the Clang website).
Quentin Carnicelli says:July 20th, 2008 at 8:33 pm
Ted – MissingDealloc fix worked good, took out 26 false MissingDeallocs for us. Most of the rest were confused about non-retained delegates, which I imagine will be harder to filter out. Also I did see one about a class which had 2 SEL ivars. I’ll see about reporting these on Bugzilla.
We did start working through some of our dead stores, and did find at least one bug in them. But most of them have yet to be useful. The new flags to turn these reports on/off in the newer builds are cool and should settle that though. Thanks!
Ted Kremenek says:July 22nd, 2008 at 7:43 pm
Quentin – Thanks for the confirmation on the fix to the MissingDealloc check. A test case that illustrates the check giving erroneous output for SEL ivars would also be extremely useful.
One comment about the dead stores: even if it doesn’t identify real bugs for you, it should never flag a store as being dead if that value is ever read. Put another way, if the check flags a dead store, it really should be a dead store. That particular check is not suppose to emit any false positives. If you notice such cases, please feel free to file a bug report.
Thanks for your continuing feedback regarding the tool!