3. Compilation, Debugging and Debuggers

In the last chapter, we were introduced to the compiler as the tool that translates the C++ code we write into machine code the CPU understands. In this chapter, we will dig a little deeper into C++ compilers and their various stages, and get a general feeling for how to read disassembly. We will finish by talking about debugging, i.e., how to diagnose where and/or why our program is failing.

3.1. C++ Compilers

There are essentially three standard compilers in the C++ world. These are:

  • gcc: the GNU Compiler Collection

    • This is the default compiler on most Linux operating systems. It is also the oldest and most opaque of the three compilers. However, it does compile highly performant code.

  • clang (pronounced either as one word or as c-lang): the LLVM-based compiler

    • This is the default compiler on macOS (though Apple maintains its own branch of the compiler), and it sits on top of the LLVM compiler infrastructure.

    • Many modern compilers, such as those for Rust and Swift, rely on LLVM to do the optimizations necessary to make fast code. It’s really a remarkable project.

  • cl: the compiler that ships with Microsoft’s Visual Studio

    • Also referred to as the MSVC compiler, it only works on Windows operating systems.

    • It also tends to be the most up-to-date with the latest ISO C++ standards, which are updated about every three years.

We will focus on clang, as it is available on Linux and macOS, and can be accessed on Windows through the Windows Subsystem for Linux (WSL2). Clang is also available through Visual Studio, for those reluctant to evolve from Windows.

Tip

Most high performance computing happens on computing clusters, which require you to be familiar with the POSIX command line interface (CLI). It is very worthwhile to learn this now. Where possible, we will include examples of using the CLI throughout these notes.

There are plenty of resources that explain how to install these compilers. For simplicity, we will only include the necessary command line instructions for Ubuntu-based operating systems:

sudo apt install clang

To ensure that clang is on your system, you can run this command, which asks clang to print its version:

clang --version

Go ahead and create a folder in your favorite spot on your computer called learning_scientific_computing and a subfolder named hello_world. In this subfolder, execute the following commands:

# Create a file containing the hello world program
# (`>` overwrites the file, so rerunning this does not duplicate its contents)
cat > hello_world.cpp << 'EOF'
#include <iostream>

int main() {
    std::cout << "Hello, World!" << std::endl;
    return 0;
}
EOF

# Compile the program and name it `hello_world`
clang++ -o hello_world hello_world.cpp

# Execute the program
./hello_world

These commands create the source file, compile it, and then execute the program. The resulting output should just be Hello, World!

3.1.1. Digging a little deeper into the compiler

Let’s take a quick digression to understand what the compiler is doing under the hood. To produce an executable, exactly one int main() function must be found across all input files; this function is the entry point where execution of the program begins. Compilation generally happens in several stages:

  1. Preprocessing

    • This step involves passing through the source code and processing the preprocessing directives (the lines starting with #), including expanding macros wherever they occur. The most common directive is the #include statement, which takes the contents of the file being included and pastes it right there into the current file.

    • It also removes any comments, as well as dead bits of code that are enclosed in #if guards that evaluate to false.

  2. Compilation

    • This is where the compiler reads in every line of code in a file and represents it in some form that it can easily manipulate (an example is an abstract syntax tree, or AST). This allows it to analyze your code to see if there are any errors, such as missing semicolons, undeclared variables, etc. This step is generally referred to as lexing and parsing.

    • The AST is then converted into a lower-level intermediate representation that is no longer C++: gcc uses GIMPLE and the Register Transfer Language (RTL), while clang uses the LLVM Intermediate Representation (IR).

    • The compiler then makes several passes over the code, transforming this representation to make the resulting machine code more efficient. This process is called optimization; A LOT of research has gone into teaching compilers to be extremely good optimization machines.

  3. Assembling

    • Once the compiler is happy with its optimization passes, it converts its internal representation into machine code that the hardware architecture we are targeting can understand. For me, this is usually an Intel-based chip, which requires x86_64 instructions, but it could also be an Apple Silicon chip, which requires ARM instructions.

  4. Linking

    • Once all the machine code has been generated for the various .cpp files, the pieces have to be linked together. That is, functions defined in one .cpp file but used in another have to be cross-referenced, so that every symbol is resolved and every function call points to the address of the code it should run.

    • Modern compilers can even perform further optimizations at this stage, called link time optimization or LTO, to reduce executable size or improve performance.

    • This is also the step where the program is made aware of where to look for symbols that are situated in libraries and not a part of the executable itself.

There are more intricacies associated with each step that we needn’t review or understand. However, it is good to be familiar with what exactly it is that a compiler does. Though we won’t be able to appreciate it from these notes, C++ provides a tool set that is very well equipped for developing compiler infrastructure.

3.2. Disassembly

We have used the word disassembly without really explaining what it is, so let’s clarify that here. Disassembly is a play on the word assembly, which describes the programming language that humans write and which translates directly into machine code for the CPU. For this reason, every CPU architecture has its own version of assembly. Disassembly is the process of taking machine code and translating it back into assembly. Two flavors of assembly that show up in personal computers are x86 (Intel® and AMD) and ARM (Apple Silicon).

We will take a little time to familiarize ourselves with the registers and commands commonly seen. We will not talk about register sizes, or the evolution of assembly languages, but encourage you to read up on this if you’re at all curious. A register is a physical location on the CPU where data can be loaded and on which the CPU can perform any number of commands (which we list below). The set of commands available ultimately dictates the assembly that the chip can understand. The registers and their functions are:

3.2.1. Register Types

The argument roles below follow the System V calling convention used on x86_64 Linux and macOS.

Register | Name | Function
--- | --- | ---
rax | accumulator register | generally the register that stores the return value of a function
rbx | base register | general-purpose register, historically used as a base address for memory accesses; its value is preserved across function calls
rcx | count register | typically keeps track of the loop count or index into a string for string operations; also carries the fourth function argument
rdx | data register | carries the third function argument
rsi, rdi | index registers | source and destination registers for string and stream operations (such as outputting text to the terminal); also carry the second (rsi) and first (rdi) function arguments
rbp | base pointer | points to the bottom of the current stack frame (the stack is a first-in-last-out data structure)
rsp | stack pointer | points to the top of the stack
r8-r15 | additional registers | general purpose; r8 and r9 carry the fifth and sixth function arguments
xmm0-xmm15 | SIMD registers | used to vectorize arithmetic operations, and for floating-point arithmetic and floating-point arguments
st0-st7 | floating-point unit registers | for floating-point arithmetic on the legacy x87 unit

Note

There is a limit to the number of arguments that can be passed to a function in registers. Under the System V convention on x86_64, this is six integer or pointer arguments; any further arguments are passed on the stack, which is generally slower. If you find yourself passing a large number of variables to a function, you should think about restructuring your code.

3.2.2. Command Types

3.2.3. Revisiting Hello, World!

In the last chapter, we introduced what the disassembly looks like for a “Hello, World!” program. You may have noticed that Compiler Explorer was set to the gcc compiler. For completeness, we include the generated disassembly for a clang compiler.

__cxx_global_var_init:                  # @__cxx_global_var_init
        push    rbp
        mov     rbp, rsp
        movabs  rdi, offset std::__ioinit
        call    std::ios_base::Init::Init() [complete object constructor]
        movabs  rdi, offset std::ios_base::Init::~Init() [complete object destructor]
        movabs  rsi, offset std::__ioinit
        movabs  rdx, offset __dso_handle
        call    __cxa_atexit
        pop     rbp
        ret
main:                                   # @main
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     dword ptr [rbp - 4], 0
        movabs  rdi, offset std::cout
        movabs  rsi, offset .L.str
        call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
        mov     rdi, rax
        movabs  rsi, offset std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)
        call    std::basic_ostream<char, std::char_traits<char> >::operator<<(std::basic_ostream<char, std::char_traits<char> >& (*)(std::basic_ostream<char, std::char_traits<char> >&))
        xor     eax, eax
        add     rsp, 16
        pop     rbp
        ret
_GLOBAL__sub_I_example.cpp:             # @_GLOBAL__sub_I_example.cpp
        push    rbp
        mov     rbp, rsp
        call    __cxx_global_var_init
        pop     rbp
        ret
.L.str:
        .asciz  "Hello, World!"

Because it’s simpler, we add comments to the gcc disassembly instead, even though, as mentioned before, we will principally work with the clang compiler for this tutorial. Below, we have modified the disassembly by adding comments (the lines starting with semicolons) to indicate what each line does:

.LC0:
        ; Location in memory that stores the output string;
        ; this string ships with the program
        .string "Hello, World!"
main:
        ; Save the caller's base pointer on the stack
        push    rbp
        ; Set the base pointer to the current top of the stack, which starts a new stack frame
        mov     rbp, rsp
        ; Store the address of the output string in register esi (second argument)
        mov     esi, OFFSET FLAT:.LC0
        ; Store the address of the output stream `std::cout` in register edi (first argument)
        mov     edi, OFFSET FLAT:_ZSt4cout
        ; Call the insertion operator `<<` to write the string to the stream
        call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
        ; Store the address of `std::endl` in register esi (second argument of the next call)
        mov     esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
        ; Copy the return value of the last `call` (the stream object) into rdi (first argument of the next call)
        mov     rdi, rax
        ; Call the insertion operator `<<` to apply `std::endl` to the stream
        call    std::basic_ostream<char, std::char_traits<char> >::operator<<(std::basic_ostream<char, std::char_traits<char> >& (*)(std::basic_ostream<char, std::char_traits<char> >&))
        ; Store the value 0 in eax, to be returned from `int main()`
        mov     eax, 0
        ; Restore the caller's base pointer
        pop     rbp
        ; Return from function
        ret

3.3. Debugging

Debugging is generally used for one of two problems:

  1. Code crashes, such as a segmentation fault

  2. Deviation from expected behavior

To assist us in narrowing down the source of the error, it would be nice to be able to step through the code one line at a time, one function at a time, or in any other increment we can envision. The two tools that we introduce here are intended to provide you with basic command-line tools that every C++ programmer should have some familiarity with. These are valgrind and GDB.

Here we will only go through the most basic usage, which should be adequate for your debugging purposes.

3.3.1. Valgrind

3.3.2. GDB

GDB is the GNU project’s debugger and is commonly installed alongside the gcc compiler, while the LLVM project behind the clang compiler has its own debugger called LLDB. To appreciate the differences between what these offer, we would have to understand what the LLVM backend of the clang compiler provides us. But that is a rabbit hole we really don’t want to go down. For simplicity, we will just provide an example for GDB for our hello world program.