3. Compilation, Debugging and Debuggers#
In the last chapter, we introduced the compiler as the tool that translates the C++ code we write into machine code the CPU understands. In this chapter, we will dig a little deeper into C++ compilers and their various stages, and get a general feeling for how to read disassembly. We will finish by talking about debugging, i.e., how to diagnose where and/or why our program is failing.
3.1. C++ Compilers#
There are essentially three standard compilers in the C++ world. These are:
gcc
: the GNU Compiler Collection. This is the default compiler on most Linux operating systems; it is also the oldest and most opaque of the three compilers. However, it does compile highly performant code.
clang
: (pronounced either as one word or as c-lang), the LLVM-based compiler. This is the default compiler on macOS (though Apple maintains its own branch of the compiler), and it sits on top of the LLVM compiler infrastructure.
Many modern compilers rely on LLVM to do the optimizations necessary to make fast code. It’s really a remarkable project.
cl
: the compiler that ships with Microsoft’s Visual Studio. Also referred to as the MSVC compiler, it only works on Windows operating systems.
It also tends to be the most up-to-date with the latest ISO C++ standards, which are updated about every three years.
We will focus on clang, as it is available on Linux and macOS, and can be accessed on Windows through the Windows Subsystem for Linux (WSL2). Clang is also available through Visual Studio, for those reluctant to move away from Windows.
Tip
Most high-performance computing happens on computing clusters, which require you to be familiar with the POSIX command line interface (CLI). It is very worthwhile to learn this now. Where possible, we will include examples of using the CLI throughout these notes.
There are plenty of resources that explain how to install these compilers. For simplicity, we will only include the necessary command line instructions for Ubuntu-based operating systems:
sudo apt install clang
To ensure that clang is on your system, you can run this command, which asks clang to print its version:
clang --version
Go ahead and create a folder in your favorite spot on your computer called learning_scientific_computing, with a subfolder named hello_world. In this subfolder, execute the following commands:
# Create a file containing the hello world program
# (`>` overwrites the file, so rerunning this does not append a second copy)
cat > hello_world.cpp << EOF
#include <iostream>
int main() {
    std::cout << "Hello, World!" << std::endl;
    return 0;
}
EOF
# Compile the program and name it `hello_world`
clang++ -o hello_world hello_world.cpp
# Execute the program
./hello_world
These commands create the source file, compile it, and then execute the program.
The resulting output should just be Hello, World!
3.1.1. Digging a little deeper into the compiler#
Let’s take a quick digression to understand what the compiler is doing under the hood.
To build an executable, the toolchain must find exactly one int main() function across all input files.
This function is the entry point of the program: it tells the toolchain where execution begins, and therefore how the rest of the program fits together.
Compilation generally happens in several stages:
Preprocessing
This step involves passing through the source code and processing preprocessor directives (lines that start with #), expanding macros wherever they occur. The most common directive is the #include statement, which takes the contents of the file being included and pastes it right there into the current file. It also removes any comments, as well as any dead bits of code that are enclosed in #if guards that evaluate to false.
Compilation
This is where the compiler reads in every line of code in a file and represents it in some form that it can easily manipulate (an example is an abstract syntax tree, or AST). This allows it to analyze your code to see if there are any errors, such as missing semicolons, undeclared variables, etc. This step is generally referred to as lexing and parsing.
The AST is then converted into some intermediate (no longer easily human-readable) representation; for gcc this is the Register Transfer Language (RTL), and clang uses the LLVM Intermediate Representation (IR). The compiler then makes several passes over the code, transforming this representation to make the resulting machine code more efficient. This process is called optimization; A LOT of research has gone into teaching compilers to be extremely good optimization machines.
Assembling
Once the compiler is happy with its optimization passes, it converts its intermediate representation into one that the hardware architecture we are targeting can understand. For me, this is usually an Intel-based chip, which requires x86_64 instructions, but it could also be an Apple Silicon chip, which requires ARM instructions.
Linking
Once all the machine code has been generated for the various .cpp files, the pieces have to be linked together; that is, functions defined in one .cpp file and used in another have to be cross-referenced so that every symbol is resolved and the CPU knows where to look for what when stepping through the call stack (the ordered record of function calls that are currently in progress).
Modern compilers can even perform further optimizations, called link-time optimization or LTO, to reduce executable size or improve performance.
This is also the step where the program is made aware of where to look for symbols that are situated in libraries and not a part of the executable itself.
There are more intricacies associated with each step that we needn’t review or understand. However, it is good to be familiar with what exactly it is that a compiler does. Though we won’t be able to appreciate it from these notes, C++ provides a tool set that is very well equipped for developing compiler infrastructure.
3.2. Disassembly#
We have used the word disassembly without really explaining what it is, so let’s clarify that here. Disassembly is a play on the word assembly, which describes the programming language that humans write and that translates directly into machine code for the CPU. For this reason, every CPU architecture (and manufacturer) has its own version of assembly. Disassembly is the process of taking machine code and translating it back into assembly. Two flavors of assembly that show up in personal computers are x86 (Intel®) and ARM (Apple Silicon).
We will take a little time to familiarize ourselves with the registers and commands commonly seen. We will not talk about register sizes, or the evolution of assembly languages, but we encourage you to read up on this if you’re at all curious. A register is a physical location on the CPU where data can be loaded and on which the CPU can perform any number of commands (which we list below). The set of commands available ultimately dictates the assembly that the chip can understand. The registers and their functions are:
3.2.1. Register Types#
| Register | Name | Function |
|---|---|---|
| rax | accumulator register | generally the register that stores the return value of a function (if necessary, it will store a function argument) |
| rbx | base register | typically used for storing function arguments, especially pointers |
| rcx | count register | typically keeps track of the loop count or index into a string for string operations |
| rdx | data register | can be a register for an argument to a function |
| rsi, rdi | index registers | source and destination registers for stream operations (such as outputting text to the terminal) |
| rbp | base pointer | points to the base of the current stack frame (the stack is a first-in-last-out data structure) |
| rsp | stack pointer | points to the top of the stack |
| r8 to r15 | additional registers | generally used to store function arguments |
| xmm0 to xmm15 | SIMD registers | used to vectorize arithmetic operations |
| st0 to st7 | floating-point unit registers | for floating-point arithmetic |
Note
There is a limit to the number of arguments a function can receive in registers. This is generally determined by the number of registers available on the CPU; any remaining arguments have to be passed through the stack instead. If you find yourself passing a large number of variables to a function, you should think about restructuring your code.
3.2.2. Command Types#
3.2.3. Revisiting Hello, World!#
In the last chapter, we introduced what the disassembly looks like for a “Hello, World!” program.
You may have noticed that Compiler Explorer was set to the gcc compiler.
For completeness, we include the generated disassembly for the clang compiler.
__cxx_global_var_init: # @__cxx_global_var_init
push rbp
mov rbp, rsp
movabs rdi, offset std::__ioinit
call std::ios_base::Init::Init() [complete object constructor]
movabs rdi, offset std::ios_base::Init::~Init() [complete object destructor]
movabs rsi, offset std::__ioinit
movabs rdx, offset __dso_handle
call __cxa_atexit
pop rbp
ret
main: # @main
push rbp
mov rbp, rsp
sub rsp, 16
mov dword ptr [rbp - 4], 0
movabs rdi, offset std::cout
movabs rsi, offset .L.str
call std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
mov rdi, rax
movabs rsi, offset std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)
call std::basic_ostream<char, std::char_traits<char> >::operator<<(std::basic_ostream<char, std::char_traits<char> >& (*)(std::basic_ostream<char, std::char_traits<char> >&))
xor eax, eax
add rsp, 16
pop rbp
ret
_GLOBAL__sub_I_example.cpp: # @_GLOBAL__sub_I_example.cpp
push rbp
mov rbp, rsp
call __cxx_global_var_init
pop rbp
ret
.L.str:
.asciz "Hello, World!"
Because it’s simpler, we add comments to the gcc disassembly.
But, as mentioned before, we will principally work with the clang compiler for this tutorial.
Below, we have modified the disassembly by adding comments (the lines starting with semicolons) to indicate what each line does:
.LC0:
; Create location in memory to store output
; this string ships with the program
.string "Hello, World!"
main:
; Save the caller's base pointer on the stack
push rbp
; Set the base pointer to the current stack pointer (the start of this function's stack frame)
mov rbp, rsp
; Store the address of the output string in register esi (second argument)
mov esi, OFFSET FLAT:.LC0
; Store the address of the output stream in register edi (first argument)
mov edi, OFFSET FLAT:_ZSt4cout
; Call the insertion operator `<<` to write the string to the stream
call std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
; Store the address of the `std::endl` function in register esi (to be consumed by an output stream)
mov esi, OFFSET FLAT:_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
; Store the return value from the last `call` (a stream object) in register rdi (first argument for the next function call)
mov rdi, rax
; Call the insertion operator `<<` to apply `std::endl` to the stream
call std::basic_ostream<char, std::char_traits<char> >::operator<<(std::basic_ostream<char, std::char_traits<char> >& (*)(std::basic_ostream<char, std::char_traits<char> >&))
; Store the value to be returned from `int main()`
mov eax, 0
; Restore the caller's base pointer (this tears down the stack frame)
pop rbp
; Return from function
ret
3.3. Debugging#
Debugging is typically used for one of two problems:
Code crashes, such as a segmentation fault
Deviation from expected behavior
To assist us in narrowing down the source of the error, it would be nice to be able to step through the code one line at a time, one function at a time, or in any other increment we can envision. The two tools that we will introduce here are basic command-line tools that every C++ programmer should have some familiarity with. These are Valgrind and GDB.
Here we will only go through the most basic usage, which should be adequate for your debugging purposes.
3.3.1. Valgrind#
3.3.2. GDB#
GDB is the GNU debugger that usually accompanies the gcc compiler, and the LLVM project behind the clang compiler has its own debugger called LLDB. To appreciate the differences between what these offer, we would have to understand what the LLVM backend of the clang compiler provides us. But that is a rabbit hole we really don’t want to go down. For simplicity, we will just provide an example for GDB for our hello world program.