Getting started with the Make build tool

Nov. 22, 2024, 2:15 p.m.

In the previous posts introducing compilation and linking, we ran all commands directly in the CLI, one by one. When working with larger projects, this gets tedious fast. Another issue is that your project becomes cluttered with intermediate files. In the linking example, for instance, you might have had files like "badadd.o", "badmultiply.o" and "libbadmath.a", which are only intermediate steps towards the final executable. If you have a project with hundreds or thousands of files, this quickly becomes untenable. So how can we better organize compilation and linking?

In this post, like the previous one about linking, the example commands assume that you're on Linux, but the concepts discussed are general.

Make in a nutshell

One of the oldest build tools still in wide use is Make, developed in the 1970s. At its core, it is a way to arrange different shell commands/scripts and specify dependencies between them. It lets you dictate e.g. that "For the script building executable domath, a script building static library libbadmath.a must have been run" (since domath has a dependency on the badmath library and needs to link it in).

Make also uses file timestamps to figure out if a script needs to be rerun or not. Say yesterday you asked Make to build domath, which triggered libbadmath.a, and by extension badadd.o and badmultiply.o (since they are included in the static library), to be built. Today, you've changed badadd.c, but not badmultiply.c, and ask Make to build domath again. You do want badadd.o to be rebuilt, since it's likely to be different given that badadd.c has been changed, but it'd be wasteful to rebuild badmultiply.o, given that badmultiply.c is still exactly the same. Make does indeed rebuild badadd.o, but not badmultiply.o, before rebuilding libbadmath.a and finally domath. Make "knows" to do this because it compares file timestamps (you can see an overview of file timestamps in a directory by running ls -l). Since "badadd.o" has an older timestamp than "badadd.c", it might be out of date and so is rebuilt. Likewise, since "badmultiply.o" has a newer timestamp than "badmultiply.c", "badmultiply.o" must be up to date and there's no need to rebuild it.
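To make the timestamp comparison concrete, here is a minimal shell sketch of the check described above. It is not how Make is implemented, only an imitation of the rule "rebuild if the target is missing or older than its source". The needs_rebuild function is a name I made up for illustration, the files are empty placeholders, and the -nt ("newer than") test is assumed to be available, which it is in bash and most other common shells.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Empty stand-ins for a source file and the object file built from it.
# The object file is stamped one hour later than the source.
touch -t 202411211600 badadd.c
touch -t 202411211700 badadd.o

# needs_rebuild TARGET SOURCE (hypothetical helper):
# succeeds if TARGET is missing or SOURCE is newer than TARGET.
needs_rebuild() {
    [ ! -e "$1" ] || [ "$2" -nt "$1" ]
}

if needs_rebuild badadd.o badadd.c; then first="rebuild"; else first="up to date"; fi
echo "before edit: $first"

# Simulate editing the source after the object file was built.
touch -t 202411211800 badadd.c

if needs_rebuild badadd.o badadd.c; then second="rebuild"; else second="up to date"; fi
echo "after edit: $second"
```

This prints "before edit: up to date" followed by "after edit: rebuild", mirroring what Make decides for badmultiply.o and badadd.o in the scenario above.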

These two mechanisms, specifying dependencies between scripts and using timestamps to rebuild only the parts that are needed, form the core of Make's functionality. Make comes with many more features that among other things help to reduce the amount of boilerplate script writing you need to do. There are very good freely available tutorials explaining these additional features, one of which I'll link to at the end. But in this post, I'll deliberately write out all scripts and only focus on the core, since grasping the main concepts helps to understand the rest much more easily, and they are what I hope to build on in future posts related to build tooling.

Installation

The most popular version of Make today is GNU Make. On Linux, macOS and even Windows, you can install it with package managers like APT, Homebrew or Chocolatey, e.g. sudo apt install make. Try running the command make --version to check that the installation succeeded.

Running an arbitrary script with Make

I will assume that you've created the "domath.c", "badadd.c", "badmath.h" and "badmultiply.c" files described in the linking post in a directory. In that directory, create a new file called "Makefile" (no file extension). When you run make, Make will check for a Makefile in the current directory and use the information in it.

Let's create as simple a Makefile as we can.

say_hello:
    echo "hello make"

(note that the indentation in front of echo must be a tab, using spaces instead will not work)

Now try running:

make say_hello
# echo "hello make"
# hello make

We have essentially told make, "look for a rule say_hello in the Makefile and run the script specified for it". The script consists of all indented lines of commands following the target declaration line (say_hello:). I should note here that the script linked to a target is usually called the target's recipe. Also, you'll note that before running the single command in the recipe, make actually printed out the command itself, resulting in the output line echo "hello make". You can disable this command-printing behavior on a line-by-line basis by preceding commands with an @, e.g. @echo "hello make".
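If you'd like to see the effect of @ side by side, the following sketch writes a small Makefile into a scratch directory and runs both variants. The say_hello_quietly target is a name I've added purely for illustration. The Makefile is written with printf so that the \t escape produces the tab that recipe lines need.

```shell
# Work in a scratch directory so no existing Makefile is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Two targets with the same command; the second one is prefixed with @.
printf 'say_hello:\n\techo "hello make"\n\n' > Makefile
printf 'say_hello_quietly:\n\t@echo "hello make"\n' >> Makefile

# Prints the command itself, then its output.
make say_hello

# Prints only the command's output.
quiet_output=$(make say_hello_quietly)
echo "$quiet_output"
```

The first make call prints two lines (the echo command and then hello make), while the second prints only hello make.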

Let's try specifying a non-existent rule:

make invalid
# make: *** No rule to make target 'invalid'.  Stop.

As expected, this fails. The error message might seem curious though, No rule to make target 'invalid'. This is related to a certain expectation of Make. Namely, it expects the names of rules, like say_hello, to indicate a target file to be produced, like a "say_hello" executable. This isn't a requirement, but it is expected, so our say_hello rule is a bit odd in that sense. Rather than building anything, we've essentially just named an arbitrary script that prints a message. Using Make for such arbitrary scripts isn't uncommon, but again, building stuff is its primary purpose and this explains the error message.

Building a single file with Make

Let's get to actually defining a rule for building a target.

badadd.o: badadd.c
    gcc -c badadd.c -o badadd.o

Note that:

- The name of the rule is usually the same as the target that is to be built, in this case "badadd.o".
- We now have something to the right of the :, namely "badadd.c". Whatever files we specify here will be considered dependencies of our target ("badadd.o").
  - This information is used for the "timestamp checking" I explained earlier. We'll see this in action soon.
- As before, we indent lines to specify the rule's recipe.
  - We now adhere to Make's expectations about rules being for building stuff, since the recipe actually does build "badadd.o".
  - However, Make doesn't "know" if the recipe builds something or not; it's up to you to ensure that the recipe does what you want it to do.

Let's try running Make against the badadd.o rule:

make badadd.o
# gcc -c badadd.c -o badadd.o

You should now see that a "badadd.o" file has been generated, just like if you had run the gcc command directly in the CLI yourself.

Let's try running the command once again:

make badadd.o
# make: 'badadd.o' is up to date.

Make informs us that "badadd.o" is up to date, so there's no reason to build it again. You can check why this is by using the CLI tool stat:

stat badadd.c
# -snip-
# Modify: 2024-11-21 16:33:24.721117720 +0100
# -snip-
stat badadd.o
# -snip-
# Modify: 2024-11-21 16:57:55.778606967 +0100
# -snip-

stat provides detailed information about a file, including its precise timestamps. The Modify timestamp says when the file's contents were last modified. Since "badadd.o" has a later timestamp than "badadd.c", it must have been generated after the last change to "badadd.c". Rebuilding would therefore use exactly the same input and produce exactly the same "badadd.o" output.

We can use the touch CLI tool to deliberately "update the access and modification times" (as specified in the man page) of "badadd.c" and try again.

touch badadd.c
make badadd.o
# gcc -c badadd.c -o badadd.o

If you use stat you will see that a new badadd.o has indeed been built as its timestamps are different, although the file contents are exactly the same as in the previous file.

Building more with Make

Now you know the essentials of Make, but let's flesh out the example a little bit to further illustrate how things go together.

badadd.o: badadd.c
    gcc -c badadd.c -o badadd.o

badmultiply.o: badmultiply.c
    gcc -c badmultiply.c -o badmultiply.o

libbadmath.a: badadd.o badmultiply.o
    ar rcs libbadmath.a badadd.o badmultiply.o 

domath: domath.c libbadmath.a badmath.h
    gcc -I. domath.c -L. -lbadmath -o domath

All of the recipes' commands are copied from the previous post on linking, so please have a look there if you're unsure about what they do. What's different from previously is that we can now explicitly state dependencies between the targets - "domath" depends on "domath.c", "badmath.h" and "libbadmath.a", which in turn depends on "badadd.o" and "badmultiply.o", etc. This is often referred to as a dependency graph, where the targets are nodes linked to each other through directed edges that specify dependencies. For instance, "libbadmath.a" is a node connected by an edge to the node "badmultiply.o".

If we do:

make domath
# gcc -c badmultiply.c -o badmultiply.o
# ar rcs libbadmath.a badadd.o badmultiply.o 
# gcc -I. domath.c -L. -lbadmath -o domath

All of the required files, e.g. libbadmath.a, will be generated for us. If we try make domath once again, we get a 'domath' is up to date message, similar to what we saw with badadd.o. Try predicting and checking what files are rebuilt if you do a touch badmultiply.c and then make domath a third time.
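One way to check a prediction like this without actually rebuilding anything is GNU Make's -n flag (a "dry run"), which prints the commands Make would run instead of running them. The sketch below is self-contained: it recreates the Makefile from this post in a scratch directory and uses empty placeholder files with staged timestamps in place of real sources and build outputs, which is an assumption made only so the example runs anywhere. Since nothing is executed, gcc and ar don't even need to be installed.

```shell
# Work in a scratch directory so the real project is left alone.
workdir=$(mktemp -d)
cd "$workdir"

# Recreate the Makefile from this post; \t is the tab that recipe lines require.
{
    printf 'badadd.o: badadd.c\n\tgcc -c badadd.c -o badadd.o\n\n'
    printf 'badmultiply.o: badmultiply.c\n\tgcc -c badmultiply.c -o badmultiply.o\n\n'
    printf 'libbadmath.a: badadd.o badmultiply.o\n\tar rcs libbadmath.a badadd.o badmultiply.o\n\n'
    printf 'domath: domath.c libbadmath.a badmath.h\n\tgcc -I. domath.c -L. -lbadmath -o domath\n'
} > Makefile

# Empty placeholders, stamped as if everything had been built in order earlier.
touch -t 202411210900 badadd.c badmultiply.c domath.c badmath.h
touch -t 202411211000 badadd.o badmultiply.o
touch -t 202411211100 libbadmath.a
touch -t 202411211200 domath

# Simulate an edit, then ask Make what it WOULD run, without running it.
touch badmultiply.c
plan=$(make -n domath)
echo "$plan"
```

In your own project directory, the last three lines are all you need. Because -n leaves every file untouched, you can run it as often as you like before doing the real make domath.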

Cleaning up

Lastly, we want a way to tidy up after us. This is usually done with a simple removal script/recipe, like this:

clean:
    rm badadd.o badmultiply.o libbadmath.a domath
    @echo "All cleaned up!"

We do have to write out the exact rm command (though you can of course shorten it using globbing like rm *.o), but we only need to do it once in the Makefile rather than typing it out every time. Now we can just run make clean instead.
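Two small caveats with a clean rule like this one, both outside the core I'm focusing on in this post. First, rm without -f fails when some of the files don't exist (say, after a partial build), and Make stops at the failing command. Second, if a file called "clean" ever shows up in the directory, Make considers the clean target up to date and doesn't run the recipe. A common way around both is rm -f together with GNU Make's .PHONY declaration, which tells Make that clean is not a file to be built. The sketch below tries this out in a scratch directory with empty placeholder files.

```shell
# Work in a scratch directory so no real build outputs are deleted.
workdir=$(mktemp -d)
cd "$workdir"

# clean is declared phony, and rm -f ignores files that don't exist.
printf '.PHONY: clean\n' > Makefile
printf 'clean:\n\trm -f badadd.o badmultiply.o libbadmath.a domath\n' >> Makefile
printf '\t@echo "All cleaned up!"\n' >> Makefile

# Only some outputs exist, plus a stray file that happens to be named "clean".
touch badadd.o domath clean

clean_output=$(make clean)
echo "$clean_output"
```

Despite the missing files and the stray "clean" file, the recipe runs to the end and removes badadd.o and domath.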

Managing large projects

What we've done with Make here is a big step up from running a single CLI command at a time, but it's still unwieldy if you want to work on a large project distributed across a hierarchy of directories. With just Make, you can do things like putting Makefiles in subdirectories and using a root directory Makefile which triggers make commands to be run against the other Makefiles. However, there are many options for additional tooling that help manage larger projects. A popular choice is CMake, developed in the late 1990s and early 2000s, which adds another layer on top of build tools like Make. Another is Bazel, developed in the 2010s. Before delving into CMake or Bazel though, it's well worth spending some time with Make since it helps you understand the fundamentals.
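As a rough illustration of the "root Makefile triggering other Makefiles" approach, here is a minimal sketch. The directory names badmath and app, the all targets and the echo commands standing in for real build steps are all made up for this example. It relies on two real pieces of GNU Make: the -C option, which runs make inside another directory, and the $(MAKE) variable, which is the conventional way to invoke make from within a recipe.

```shell
# Work in a scratch directory with two hypothetical subprojects.
workdir=$(mktemp -d)
cd "$workdir"
mkdir badmath app

# Each subdirectory gets its own Makefile (echo stands in for real build commands).
printf 'all:\n\t@echo "building in badmath/"\n' > badmath/Makefile
printf 'all:\n\t@echo "building in app/"\n' > app/Makefile

# The root Makefile runs make inside each subdirectory, in order.
printf 'all:\n\t$(MAKE) -C badmath\n\t$(MAKE) -C app\n' > Makefile

recursive_output=$(make all)
echo "$recursive_output"
```

Each subdirectory is built by its own make process, which is part of why this gets unwieldy as projects grow: the root Makefile knows nothing about the dependencies declared inside the subdirectories.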