Did you mean ‘clang’ compiler?

One of the many cool features of Google is the “Did you mean” suggestion it offers when you fat-finger your search keywords. Go ahead, search for “clong compiler”. Google says, yeah, we think you might have meant “clang compiler”. Thanks, Google!

While working one day with some OpenCV code (I’ll blog about that some day, when I know what I’m doing), I noticed I had mistyped cvNamedWindow as cvNameWindow and clang replied:

Well, isn’t that cool? Intrigued, I decided to see how gcc would respond to some typos, and then compare it to clang.

For a quick illustration I’m going to use g++ and clang++, both of which compile the .c files below as C++.

mathr.h
[c]
int addTwoInts(int a, int b);
int addThreeInts(int a, int b, int c);
[/c]

mathr.c
[c]
int addTwoInts(int a, int b) { return a + b; }
int addThreeInts(int a, int b, int c) { return a + b + c; }
[/c]

umath.c
[c]
#include "mathr.h"
int main(void) {
    addTwInts(4, 5);
}
[/c]

Notice the addTwInts function call. Obviously I meant addTwoInts.

Compiling with g++-4.8 gives

Okay, addTwInts not declared in this scope. Since this is a really simple example I know where I went wrong. But, take a look at what clang++ can tell me:

Nice! Yes, I did mean addTwoInts!

Searching on Google led me to Chris Lattner’s 2010 article on Clang’s neat error recovery features, and there’s more than just the spell-checking-suggestion-engine.

Clang uses the Levenshtein distance to find possible corrections for typos: the number of single-character insertions, deletions, or substitutions needed to turn one string into another. Notice that addTwInts is just one edit (a single insertion) away from addTwoInts, so Clang is able to recognize the near miss and make the suggestion.
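To make the idea concrete, here is a minimal sketch of the textbook Levenshtein distance in C. This is not Clang's actual code, just the classic dynamic-programming version. The function name levenshtein and the helper min3 are my own for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Smallest of three values. */
static size_t min3(size_t a, size_t b, size_t c) {
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/* Textbook Levenshtein distance (illustrative, not Clang's code):
 * the minimum number of single-character insertions, deletions, or
 * substitutions needed to turn s into t.
 * Returns (size_t)-1 if memory allocation fails. */
size_t levenshtein(const char *s, const char *t) {
    assert(s != NULL && t != NULL);
    size_t n = strlen(s), m = strlen(t);

    /* One row of the DP table is enough; the cast keeps this valid
     * when the file is compiled as C++ by g++ or clang++. */
    size_t *row = (size_t *)malloc((m + 1) * sizeof *row);
    if (row == NULL) return (size_t)-1;

    /* Turning "" into the first j characters of t takes j insertions. */
    for (size_t j = 0; j <= m; j++) row[j] = j;

    for (size_t i = 1; i <= n; i++) {
        size_t diag = row[0];          /* previous row, column j-1 */
        row[0] = i;                    /* i deletions to reach "" */
        for (size_t j = 1; j <= m; j++) {
            size_t above = row[j];     /* previous row, column j */
            size_t cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
            row[j] = min3(above + 1,       /* delete s[i-1] */
                          row[j - 1] + 1,  /* insert t[j-1] */
                          diag + cost);    /* match or substitute */
            diag = above;
        }
    }

    size_t distance = row[m];
    free(row);
    return distance;
}
```

Calling levenshtein("addTwInts", "addTwoInts") returns 1, which is the single missing o.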

Of course there are limits. Substituting addTwInts with addInts results in:

The distance is 3 inserts from addTwoInts and 5 inserts from addThreeInts. What’s interesting however is that a “strong match” (not a technical term!) will boost Clang’s ability to make a suggestion. For example, you can change the call from addTwoInts to addTwoInts_____ (your underscore key was stuck that day), and Clang will still make the suggestion “use of undeclared identifier 'addTwoInts_____'; did you mean 'addTwoInts'?” Add another underscore (for a total of 6) and you are back to “use of undeclared identifier” with no suggestions. To quickly calculate the Levenshtein distance of two arbitrary strings, you can try an online Levenshtein distance calculator.
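The pattern above (a suggestion for short distances, silence once the typo drifts too far) can be sketched as a tiny “did you mean” helper. The cutoff here is purely my assumption: suggest a name only if its distance is at most one third of the typo’s length. I picked it because it happens to match the four results in this post, not because I know it to be the rule Clang applies. The names edit_distance and did_you_mean are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Textbook Levenshtein distance; returns (size_t)-1 on allocation failure. */
static size_t edit_distance(const char *s, const char *t) {
    size_t n = strlen(s), m = strlen(t);
    size_t *row = (size_t *)malloc((m + 1) * sizeof *row);
    if (row == NULL) return (size_t)-1;
    for (size_t j = 0; j <= m; j++) row[j] = j;
    for (size_t i = 1; i <= n; i++) {
        size_t diag = row[0];
        row[0] = i;
        for (size_t j = 1; j <= m; j++) {
            size_t above = row[j];
            size_t best = diag + (s[i - 1] == t[j - 1] ? 0 : 1);
            if (above + 1 < best) best = above + 1;
            if (row[j - 1] + 1 < best) best = row[j - 1] + 1;
            row[j] = best;
            diag = above;
        }
    }
    size_t d = row[m];
    free(row);
    return d;
}

/* Returns the closest known name to typo, or NULL if nothing is close
 * enough (or typo is already a known name).
 * ASSUMED cutoff, not Clang's documented rule: distance * 3 must not
 * exceed the length of the typo. */
const char *did_you_mean(const char *typo,
                         const char *const *names, size_t count) {
    assert(typo != NULL && names != NULL);
    size_t typo_len = strlen(typo);
    const char *best = NULL;
    size_t best_dist = 0;

    for (size_t i = 0; i < count; i++) {
        size_t d = edit_distance(typo, names[i]);
        if (d == (size_t)-1) continue;   /* allocation failed, skip */
        if (d == 0) return NULL;         /* exact match, nothing to fix */
        if (d * 3 > typo_len) continue;  /* too far away to suggest */
        if (best == NULL || d < best_dist) {
            best = names[i];
            best_dist = d;
        }
    }
    return best;
}
```

Given the two names declared in mathr.h, this sketch suggests addTwoInts for addTwInts and for addTwoInts_____, and returns NULL for addInts and for the six-underscore version.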

I’m currently only using Clang on my Mac, and the diagnostics features work for C, C++, and Objective-C (Swift has its own LLVM-based compiler). I’m definitely looking forward to using Clang on Linux, where GCC has been the only game in town for decades. Of course, the rising adoption of Clang has led to significant improvements in GCC’s diagnostics. Only time will tell if GCC will remain the premier compiler on Linux systems (sorry, GNU/Linux. *smirk*).
