This is an informal document about internals of kati. This document is not meant to be a comprehensive document of kati or GNU make. This explains some random topics which other programmers may be interested in.
The motivation of kati was to speed up Android platform build. Especially, its incremental build time was the main focus. Android platform's build system is a very unique system. It provides a DSL, (ab)using Turing-completeness of GNU make. The DSL allows developers to write build rules in a descriptive way, but the downside is it's complicated and slow.
When we say a build system is slow, we consider "null build" and "full build". Null build is a build which does nothing, because all output files are already up-to-date. Full build is a build which builds everything, because there were nothing which have been already built. Actual builds in daily development are somewhere between null build and full build. Most benchmarks below were done for null build.
For Android with my fairly beefy workstation, null build took ~100 secs with GNU make. This means you needed to wait ~100 secs to see if there's a compile error when you changed a single C file. To be fair, things were not that bad. There are tools called mm/mmm. They allow developers to build an individual module. As they ignore dependencies between modules, they are fast. However, you need to be somewhat experienced to use them properly. You should know which modules will be affected by your change. It would be nicer if you can just type "make" whenever you change something.
This is why we started this project. We decided to create a GNU make clone from scratch, but there were some other options. One option was to replace all Android.mk by files with a better format. There is actually a longer-term project for this. Kati was planned to be a short-term project. Another option was to hack GNU make instead of developing a clone. We didn't take this option because we thought the source code of GNU make is somewhat complicated due to historical reason. It's written in old-style C, has a lot of ifdefs for some unknown architectures, etc.
Currently, kati's main mode is --ninja mode. Instead of executing build commands by itself, kati generates build.ninja file and ninja actually runs commands. There were some back-and-forths before kati became the current form. Some experiments succeeded and some others failed. We even changed the language for kati. At first, we wrote kati in Go. We naively expected we can get enough performance with Go. I guessed at least one of the following statements are true: 1. GNU make is not very optimized for computation heavy Makefiles, 2. Go is fast for our purpose, or 3. we can come up with some optimization tricks for Android's build system. As for 3, some of such optimization succeeded but it's performance gain didn't cancel the slowness of Go.
Go's performance would be somewhat interesting topic. I didn't study the performance difference in detail, but it seemed both our use of Go and Go language itself were making the Go version of kati slower. As for our fault, I think Go version has more unnecessary string allocations than C++ version has. As for Go itself, it seemed GC was the main show-stopper. For example, Android's build system defines about one million make variables, and buffers for them will be never freed. IIRC, this kind of allocation pattern isn't good for non-generational GC.
Go version and test cases were written by ukai and me, and C++ rewrite was done mostly by me. The rest of this document is mostly about the C++ version.
Kati consists of the following components:
A Makefile has some statements which consist of zero or more expressions. There are two parsers and two evaluators - one for statements and the other for expressions.
Most of users of GNU make may not care about the evaluator much. However, GNU make's evaluator is very powerful and is Turing-complete. For Android's null build, most time is spent in this phase. Other tasks, such as building dependency graphs and calling stat function for build targets, are not the bottleneck. This would be a very Android specific characteristics. Android's build system uses a lot of GNU make black magics.
The evaluator outputs a list of build rules and a variable table. The dependency builder creates a dependency graph from the list of build rules. Note this step doesn't use the variable table.
Then either executor or ninja generator will be used. Either way, kati runs its evaluator again for command lines. The variable table is used again for this step.
We'll look at each components closely. GNU make is a somewhat different language from modern languages. Let's see.
I'm not 100% sure, but I think GNU make parses and evaluates Makefiles simultaneously, but kati has two phases for parsing and evaluation. The reason of this design is for performance. For Android build, kati (or GNU make) needs to read ~3k files ~50k times. The file which is read most often is read ~5k times. It's waste of time to parse such files again and again. Kati can re-use parsed results when it needs to evaluate a Makefile second time. If we stop caching the parsed results, kati will be two times slower for Android's build. Caching parsed statements is done in file_cache.cc.
The statement parser is defined in parser.cc. In kati, there are four kinds of statements:
Data structures for them are defined in stmt.h. Here are examples of these statements:
VAR := yay! # An assignment all: # A rule echo $(VAR) # A command include xxx.mk # A make directive (include)
In addition to include directive, there are ifeq/ifneq/ifdef/ifndef directives and export/unexport directives. Also, kati internally uses "parse error statement". As GNU make doesn't show parse errors in branches which are not taken, we need to delay parse errors to evaluation time.
A tricky point of parsing make statements is that the parsing depends on the context of the evaluation. See the following Makefile chunk for example:
$(VAR) X=hoge echo $${X}
You cannot tell whether the second line is a command or an assignment until $(VAR) is evaluated. If $(VAR) is a rule statement, the second line is a command and otherwise it's an assignment. If the previous line is
VAR := target:
the second line will turn out to be a command.
For some reason, GNU make expands expressions before it decides the type of a statement only for rules. Storing assignments or directives in a variable won't work as assignments or directives. For example
ASSIGN := A=B $(ASSIGN):
doesn't assign "B:" to A, but defines a build rule whose target is A=B.
Anyway, as a line starts with a tab character can be either a command statement or other statements depending on the evaluation result of the previous line, sometimes kati's parser cannot tell the statement type of a line. In this case, kati's parser speculatively creates a command statement object, keeping the original line. If it turns out the line is actually not a command statement, the evaluator re-runs the parser.
In most programming languages, line concatenations by a backslash character and comments are handled at a very early stage of a language implementation. However, GNU make changes the behavior for them depending on parse/eval context. For example, the following Makefile outputs "has space" and "hasnospace":
VAR := has\ space all: echo $(VAR) echo has\ nospace
GNU make usually inserts a whitespace between lines, but for command lines it doesn't. As we've seen in the previous subsection, sometimes kati cannot tell a line is a command statement or not. This means we should handle them after evaluating statements. Similar discussion applies for comments. GNU make usually trims characters after '#', but it does nothing for '#' in command lines.
We have a bunch of comment/backslash related testcases in the testcase directory of kati's repository.
A statement may have one or more expressions. The number of expressions in a statement depends on the statement's type. For example,
A := $(X)
This is an assignment statement, which has two expressions - A and $(X). Types of expressions and their parser are defined in expr.cc. Like other programming languages, an expression is a tree of expressions. The type of a leaf expression is either literal, variable reference, substitution references, or make functions.
As written, backslashes and comments change their behavior depending on the context. Kati handles them in this phase. ParseExprOpt is the enum for the contexts.
As a nature of old systems, GNU make is very permissive. For some reason, it allows some kind of unmatched pairs of parentheses. For example, GNU make doesn't think $($(foo) is an error - this is a reference to variable $(foo. If you have some experiences with parsers, you may wonder how one can implement a parser which allows such expressions. It seems GNU make intentionally allows this:
http://git.savannah.gnu.org/cgit/make.git/tree/expand.c#n285
No one won't use this feature intentionally. However, as GNU make allows this, some Makefiles have unmatched parentheses, so kati shouldn't raise an error for them, unfortunately.
GNU make has a bunch of functions. Most users would use only simple ones such as $(wildcard ...) and $(subst ...). There are also more complex functions such as $(if ...) and $(call ...), which make GNU make Turing-complete. Make functions are defined in func.cc. Though func.cc is not short, the implementation is fairly simple. There is only one weirdness I remember around functions. GNU make slightly changes its parsing for $(if ...), $(and ...), and $(or ...). See trim_space and trim_right_space_1st in func.h and how they are used in expr.cc.
Evaluator for statements are defined in eval.cc. As written, there are four kinds of statements:
There is nothing tricky around commands and make directives. A rule statement have some forms and should be parsed after evaluating expression by the third parser. This will be discussed in the next section.
Assignments in GNU make is tricky a bit. There are two kinds of variables in GNU make - simple variables and recursive variables. See the following code snippet:
A = $(info world!) # recursive B := $(info Hello,) # simple $(A) $(B)
This code outputs "Hello," and "world!", in this order. The evaluation of a recursive variable is delayed until the variable is referenced. So the first line, which is an assignment of a recursive variable, outputs nothing. The content of the variable $(A) will be $(info world!) after the first line. The assignment in the second line uses := which means this is a simple variable assignment. For simple variables, the right hand side is evaluated immediately. So "Hello," will be output and the value of $(B) will be an empty string ($(info ...) returns an empty string). Then, "world!" will be shown when the third line is evaluated as $(A) is evaluated, and lastly the forth line does nothing, as $(B) is an empty string.
There are two more kinds of assignments (i.e., += and ?=). These assignments keep the type of the original variable. Evaluation of them will be done immediately only when the left hand side of the assignment is already defined and is a simple variable.
After evaluating a rule statement, kati needs to parse the evaluated result. A rule statement can actually be the following four things:
Parsing them is mostly done in rule.cc.
A rule is something like all: hello.exe. You should be familiar with it. There are several kinds of rules such as pattern rules, double colon rules, and order only dependencies, but they don't complicate the rule parser.
A feature which complicates the parser is semicolon. You can write the first build command on the same line as the rule. For example,
target: echo hi!
and
target: ; echo hi!
have the same meaning. This is tricky because kati shouldn't evaluate expressions in a command until the command is actually invoked. As a semicolon can appear as the result of expression evaluation, there are some corner cases. A tricky example:
all: $(info foo) ; $(info bar) $(info baz)
should output foo, baz, and then bar, in this order, but
VAR := all: $(info foo) ; $(info bar) $(VAR) $(info baz)
outputs foo, bar, and then baz.
Again, for the command line after a semicolon, kati should also change how backslashes and comments are handled.
target: has\ space ; echo no\ space
The above example says target depends on two targets, has and space, and to build target, echo nospace should be executed.
You may not familiar with target specific variables. This feature allows you to define variable which can be referenced only from commands in a specified target. See the following code:
VAR := X target1: VAR := Y target1: echo $(VAR) target2: echo $(VAR)
In this example, target1 shows Y and target2 shows X. I think this feature is somewhat similar to namespaces in other programming languages. If a target specific variable is specified for a non-leaf target, the variable will be used even in build commands of prerequisite targets.
In general, I like GNU make, but this is the only GNU make's feature I don't like. See the following Makefile:
hello: CFLAGS := -g hello: hello.o gcc $(CFLAGS) $< -o $@ hello.o: hello.c gcc $(CFLAGS) -c $< -o $@
If you run make for the target hello, CFLAGS is applied for both commands:
$ make hello gcc -g -c hello.c -o hello.o gcc -g hello.o -o hello
However, CFLAGS for hello won't be used when you build only hello.o:
$ make hello.o gcc -c hello.c -o hello.o
Things could be even worse when two targets with different target specific variables depend on a same target. The build result will be inconsistent. I think there is no valid usage of this feature for non-leaf targets.
Let's go back to the parsing. Like for semicolons, we need to delay the evaluation of the right hand side of the assignment for recursive variables. Its implementation is very similar to the one for semicolons, but the combination of the assignment and the semicolon makes parsing a bit trickier. An example:
target1: ;X=Y echo $(X) # A rule with a command target2: X=;Y echo $(X) # A target specific variable
Evaluation of expressions is done in expr.cc, func.cc, and command.cc. The amount of code for this step is fairly large especially because of the number of GNU make functions. However, their implementations are fairly straightforward.
One tricky function is $(wildcard ...). It seems GNU make is doing some kind of optimization only for this function and $(wildcard ...) in commands seem to be evaluated before the evaluation phase for commands. Both C++ kati and Go kati are different from GNU make's behavior in different ways, but it seems this incompatibility is OK for Android build.
There is an important optimization done for Android. Android's build system has a lot of $(shell find ...) calls to create a list of all .java/.mk files under a directory, and they are slow. For this, kati has a builtin emulator of GNU find. The find emulator traverses the directory tree and creates an in-memory directory tree. Then the find emulator returns results of find commands using the cached tree. For my environment, the find command emulator makes kati ~1.6x faster for AOSP.
The implementations of some IO-related functions in commands are tricky in the ninja generation mode. This will be described later.
Now we get a list of rules and a variable table. dep.cc builds a dependency graph using the list of rules. I think this step is what GNU make is supposed to do for normal users.
This step is fairly complex like other components but there's nothing strange. There are three types of rules in GNU make:
The following code shows the three types:
all: foo.o foo.o: echo explicit %.o: echo implicit .c.o: echo suffix
In the above example, all of these three rules match the target foo.o. GNU make prioritizes explicit rules first. When there's no explicit rule for a target, it uses an implicit rule with longer pattern string. Suffix rules are used only when there are no explicit/implicit rules.
Android has more than one thousand implicit rules and there are ten thousands of targets. It's too slow to do matching for them with a naive O(NM) algorithm. Kati uses a trie to speed up this step.
Multiple rules without commands should be merged into the rule with a command. For example:
foo.o: foo.h %.o: %.c $(CC) -c $< -o $@
foo.o depends not only on foo.c, but also on foo.h.
C++ kati's executor is fairly simple. This is defined in exec.cc. This is useful only for testing because this lacks some important features for a build system (e.g., parallel build).
Expressions in commands are evaluated at this stage. When they are evaluated, target specific variables and some special variables (e.g., $< and $@) should be considered. command.cc is handling them. This file is used by both the executor and the ninja generator.
Evaluation at this stage is tricky when both += and target specific variables are involved. Here is an example code:
all: test1 test2 test3 test4 A:=X B=X X:=foo test1: A+=$(X) test1: @echo $(A) # X bar test2: B+=$(X) test2: @echo $(B) # X bar test3: A:= test3: A+=$(X) test3: @echo $(A) # foo test4: B= test4: B+=$(X) test4: @echo $(B) # bar X:=bar
$(A) in test3 is a simple variable. Though $(A) in the global scope is simple, $(A) in test1 is a recursive variable. This means types of global variables don't affect types of target specific variables. However, The result of test1 ("X bar") shows the value of a target specific variable is concatenated to the value of a global variable.
ninja.cc generates a ninja file using the results of other components. This step is actually fairly complicated because kati needs to map GNU make's features to ninja's.
A build rule in GNU make may have multiple commands, while ninja's has always a single command. To mitigate this, the ninja generator translates multiple commands into something like (cmd1) && (cmd2) && .... Kati should also escape some special characters for ninja and shell.
The tougher thing is $(shell ...) in commands. Current kati's implementation translates it into shell's $(...). This works for many cases. But this approach won't work when the result of $(shell ...) is passed to another make function. For example
all: echo $(if $(shell echo),FAIL,PASS)
should output PASS, because the result of $(shell echo) is an empty string. GNU make and kati's executor mode output PASS correctly. However, kati's ninja generator emits a ninja file which shows FAIL.
I wrote a few experimental patches for this issue, but they didn't work well. The current kati's implementation has an Android specific workaround for this. See HasNoIoInShellScript in func.cc for detail.
C++ kati has --regen flag. If this flag is specified, kati checks if anything in your environment was changed after the previous run. If kati thinks it doesn't need to regenerate the ninja file, it finishes quickly. For Android, running kati takes ~30 secs at the first run but the second run takes only ~1 sec.
Kati thinks it needs to regenerate the ninja file when one of the followings is changed:
Quickly doing the last check is not trivial. It takes ~18 secs to run all $(shell ...) in Android's build system due to the slowness of $(shell find ...). So, for find commands executed by kati's find emulator, kati stores the timestamps of traversed directories with the find command itself. For each find commands, kati checks the timestamps of them. If they are not changed, kati skips re-running the find command.
Kati doesn't run $(shell date ...) and $(shell echo ...) during this check. The former always changes so there's no sense to re-run them. Android uses the latter to create a file and the result of them are empty strings. We don't want to update these files to get empty strings.
A big TODO is sub-makes invoked by $(MAKE). I wrote some experimental patches but nothing is ready to be used as of writing.