Discussion of Trying to Make GCC Work With LLVM in the Pre-Clang Days

From back in 2005:

On the positive side, LLVM has a lot more experience in managing whole program compilation, and it is a much cleaner software base. It would be good to find some mechanism to take advantage of that experience. Trying to make the hippo dance is not really a lot of fun.

Re: LLVM/GCC Integration Proposal

Compiling your C/C++ and Obj C/C++ Code with “clang” (LLVM)

clang is a recent addition to the landscape of development within the C family. GCC is a household name (well, in my household), but clang has the advantage of being built on LLVM, a modular and versatile compiler platform. In fact, because it’s built on LLVM, clang can emit LLVM’s intermediate representation (IR) as text, which is the human-readable counterpart of LLVM bitcode:

Source:

#include <stdio.h>

int main()
{
    printf("Testing.\n");

    return 0;
}

Command:

clang -emit-llvm -S main.c -o -

Output:

; ModuleID = 'main.c'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32-n8:16:32-S128"
target triple = "i386-pc-linux-gnu"

@.str = private unnamed_addr constant [10 x i8] c"Testing.\0A\00", align 1

define i32 @main() nounwind {
  %1 = alloca i32, align 4
  store i32 0, i32* %1
  %2 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([10 x i8]* @.str, i32 0, i32 0))
  ret i32 0
}

declare i32 @printf(i8*, ...)

Benefits of using clang versus gcc include the following:

  • Considerably better error messages (a very popular feature).
  • Considerable improvements in speed and resource usage, across the board, according to clang’s own feature page (http://clang.llvm.org/features.html#performance). The average discussion on Stack Overflow suggests that this might not always be the case, though.
  • Its ASTs and code are reportedly simpler and more straightforward for those who would like to study them.
  • It’s a single parser for the C family of languages (C, C++, Objective-C, and Objective-C++, but not C#), and it is designed to be extended further.
  • It’s built as a library with an API, so other tools can bind to it.

clang is not nearly as mature as GCC, though I haven’t seen (as a casual observer) much negative feedback due to this.

A two-part build (compile first, then link) works much as it does with GCC. The basic parameters are similar, though there are six pages of parameters available:

clang -emit-llvm -o foo.bc -c foo.c
clang -o foo foo.bc

It’s important to mention that clang comes bundled with a static analyzer. This makes it that much easier to check your code for bugs at a deeper level than the compiler itself is concerned with. For example, suppose we adjust the code above to allocate memory, but neglect to free it:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    printf("Testing.\n");

    void *new = malloc((size_t)2000);

    return 0;
}

We can then tell clang to run the static analyzer over the file. Note that with --analyze, clang analyzes the source rather than producing an executable:

clang --analyze main.c -o main
main.c:8:11: warning: Value stored to 'new' during its initialization is never read
    void *new = malloc((size_t)2000);
          ^~~   ~~~~~~~~~~~~~~~~~~~~
main.c:10:5: warning: Memory is never released; potential leak of memory pointed to by 'new'
    return 0;
    ^
2 warnings generated.

In truth, I don’t know how clang’s static analyzer compares in practice with Valgrind, the standard, heavyweight, open-source tool for finding memory errors. The two work differently, though. Valgrind is a dynamic-analysis tool rather than a static analyzer: it actually runs your program and watches to make sure that your allocations are managed properly. clang’s static analyzer examines the source code without executing it.