# BPF Compiler Collection (BCC)

This directory contains source code for BCC, a toolkit for creating small
programs that can be dynamically loaded into a Linux kernel.

The compiler relies upon eBPF (Extended Berkeley Packet Filters), which is a
feature in Linux kernels starting from 3.15. Currently, this compiler leverages
features which are mostly available in Linux 4.1 and above.

## Installing

See [INSTALL.md](INSTALL.md) for installation steps on your platform.

## Motivation

BPF guarantees that programs loaded into the kernel cannot crash and cannot
run forever, yet BPF is general purpose enough to perform many arbitrary types
of computation. Currently, it is possible to write a program in C that will
compile into a valid BPF program, yet it is vastly easier to write a C program
that will compile into invalid BPF (C is like that). The user won't know
whether the program is valid until trying to run it.

With a BPF-specific frontend, one should be able to write in a language and
receive feedback from the compiler on the validity as it pertains to a BPF
backend. This toolkit aims to provide a frontend that can only create valid BPF
programs while still harnessing its full flexibility.

Furthermore, current integrations with BPF have a kludgy workflow, sometimes
involving compiling directly in a Linux kernel source tree. This toolchain aims
to minimize the time that a developer spends getting BPF compiled, so that the
focus can shift to the applications that can be written and the problems that
can be solved with BPF.

The features of this toolkit include:
* End-to-end BPF workflow in a shared library
  * A modified C language for BPF backends
  * Integration with llvm-bpf backend for JIT
  * Dynamic (un)loading of JITed programs
  * Support for BPF kernel hooks: socket filters, tc classifiers,
      tc actions, and kprobes
* Bindings for Python
* Examples for socket filters, tc classifiers, and kprobes

In the future, more bindings besides Python will likely be supported. Feel free
to add support for the language of your choice and send a pull request!

## Examples

This toolchain is currently composed of two parts: a C wrapper around LLVM, and
a Python API to interact with the running program. Later, we will go into more
detail on how this all works.

### Hello, World

First, we should include the BPF class from the bpf module:
```python
from bpf import BPF
```

Since the C code is so short, we will embed it inside the Python script.

The BPF program always takes at least one argument, which is a pointer to the
context for this type of program. Different program types have different calling
conventions, but for this one we don't care, so `void *` is fine.
```python
prog = """
int hello(void *ctx) {
  bpf_trace_printk("Hello, World!\\n");
  return 0;
}
"""
b = BPF(text=prog)
```

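The doubled backslash in the embedded program matters because the C source lives inside an ordinary (non-raw) Python string. Python turns `\\n` into the two characters `\n`, which the C compiler then reads as a newline escape inside the string literal. A single `\n` would instead place a real line break in the middle of the C string literal. The sketch below checks this without touching BCC; the variable names are illustrative only.

```python
# The C source is held in a normal Python string, so Python processes
# backslash escapes before the text ever reaches the C compiler.
escaped = 'bpf_trace_printk("Hello, World!\\n");'   # spelling used in the example
raw = r'bpf_trace_printk("Hello, World!\n");'       # raw-string equivalent
unescaped = 'bpf_trace_printk("Hello, World!\n");'  # Python inserts a real newline

# Both of the first two spellings hand the C compiler a backslash followed by 'n'.
same_text = escaped == raw
# The unescaped form splits the C string literal across two lines.
line_count = len(unescaped.splitlines())

print(same_text, line_count)
```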
For this example, we will call the program every time `fork()` is called by a
userspace process. Under the hood, fork translates to the `clone` syscall,
so we will attach our program to the kernel symbol `sys_clone`.
```python
fn = b.load_func("hello", BPF.KPROBE)
BPF.attach_kprobe(fn, "sys_clone")
```

The Python process will then print the trace printk circular buffer until Ctrl-C
is pressed. The BPF program is removed from the kernel when the userspace
process that loaded it closes the fd (or exits).
```python
from subprocess import call
try:
    call(["cat", "/sys/kernel/debug/tracing/trace_pipe"])
except KeyboardInterrupt:
    pass
```

Output:
```
bcc/examples$ sudo python hello_world.py
 python-7282 [002] d... 3757.488508: : Hello, World!
```

[Source code listing](examples/hello_world.py)

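Each line read from `trace_pipe` carries more than the message itself. Judging only from the sample line above, the fields appear to be the task name and pid, the CPU number, a flags column, a timestamp in seconds, and then the message. The sketch below splits such a line with a regular expression. The pattern, the field names, and the `parse_trace_line` helper are assumptions made for illustration and are not part of BCC; other kernels may format the line differently.

```python
import re

# Pattern inferred from the single sample line in this README (an assumption).
TRACE_LINE = re.compile(
    r"^\s*(?P<task>.+)-(?P<pid>\d+)\s+"   # task name and pid, e.g. python-7282
    r"\[(?P<cpu>\d+)\]\s+"                # CPU number in brackets
    r"(?P<flags>\S+)\s+"                  # flags column, e.g. d...
    r"(?P<ts>\d+\.\d+):\s+"               # timestamp in seconds
    r"(?P<func>[^:]*):\s?"                # function column (empty in the sample)
    r"(?P<msg>.*)$"                       # the text passed to bpf_trace_printk
)


def parse_trace_line(line):
    """Return the fields of one trace line as a dict, or None if it does not match."""
    m = TRACE_LINE.match(line)
    if m is None:
        return None
    return {
        "task": m.group("task"),
        "pid": int(m.group("pid")),
        "cpu": int(m.group("cpu")),
        "flags": m.group("flags"),
        "timestamp": float(m.group("ts")),
        "message": m.group("msg"),
    }


sample = " python-7282 [002] d... 3757.488508: : Hello, World!"
fields = parse_trace_line(sample)
print(fields["pid"], fields["message"])
```

A script could read `trace_pipe` line by line and pass each line to a helper like this instead of shelling out to `cat`.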
### Networking

At Red Hat Summit 2015, BCC was presented as part of a [session on BPF](http://www.devnation.org/#7784f1f7513e8542e4db519e79ff5eec).
A multi-host VXLAN environment is simulated and a BPF program is used to monitor
one of the physical interfaces. The BPF program keeps statistics on the inner
and outer IP addresses traversing the interface, and the userspace component
turns those statistics into a graph showing the traffic distribution at
multiple granularities. See the code [here](examples/tunnel_monitor).

[![Screenshot](http://img.youtube.com/vi/yYy3Cwce02k/0.jpg)](https://youtu.be/yYy3Cwce02k)

### Tracing

Here is a slightly more complex tracing example than Hello World. This program
will be invoked for every task change in the kernel, and will record the new
and old pids in a BPF map.

The C program below introduces two new concepts.

The first is the macro `BPF_TABLE`. This defines a table (type="hash"), with key
type `key_t` and leaf type `u64` (a single counter). The table name is `stats`,
containing 1024 entries maximum. One can `lookup`, `lookup_or_init`, `update`,
and `delete` entries from the table.

The second concept is the `prev` argument. This argument is treated specially by
the BCC frontend, such that accesses to this variable are read from the saved
context that is passed by the kprobe infrastructure. The prototype of the args
starting from position 1 should match the prototype of the kernel function being
kprobed. If it does, the program will have seamless access to the function
parameters.
```c
#include <uapi/linux/ptrace.h>
#include <linux/sched.h>

struct key_t {
  u32 prev_pid;
  u32 curr_pid;
};
// map_type, key_type, leaf_type, table_name, num_entry
BPF_TABLE("hash", struct key_t, u64, stats, 1024);
int count_sched(struct pt_regs *ctx, struct task_struct *prev) {
  struct key_t key = {};
  u64 zero = 0, *val;

  // The u64 return value is truncated to its lower 32 bits (the pid).
  key.curr_pid = bpf_get_current_pid_tgid();
  key.prev_pid = prev->pid;

  // Fetch the counter for this key, creating it as zero if absent.
  val = stats.lookup_or_init(&key, &zero);
  (*val)++;
  return 0;
}
```
[Source code listing](examples/task_switch.c)

The userspace component loads the file shown above, and attaches it to the
`finish_task_switch` kernel function (which takes one `struct task_struct *`
argument). The `get_table` API returns an object that gives dict-style access
to the stats BPF map. The Python program could use that handle to modify the
kernel table as well.
```python
from bpf import BPF
from time import sleep

b = BPF(src_file="task_switch.c")
fn = b.load_func("count_sched", BPF.KPROBE)
stats = b.get_table("stats")
BPF.attach_kprobe(fn, "finish_task_switch")

# generate many schedule events
for i in range(0, 100):
    sleep(0.01)

for k, v in stats.items():
    print("task_switch[%5d->%5d]=%u" % (k.prev_pid, k.curr_pid, v.value))
```
[Source code listing](examples/task_switch.py)

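To make the table semantics concrete without a kernel, the logic of `count_sched` can be modeled in plain Python. This is only a sketch under stated assumptions: an ordinary dict stands in for the `stats` map, a namedtuple stands in for `struct key_t`, and `ctypes.c_uint64` counters are used so that `v.value` reads like the script above (whether the real table hands back ctypes objects is assumed here). The helpers `low32`, `count_switch`, and `report` are hypothetical names. The value returned by `bpf_get_current_pid_tgid()` packs the tgid in the upper 32 bits and the pid in the lower 32 bits, so storing it in a `u32` field keeps only the pid.

```python
import ctypes
from collections import namedtuple

# Stand-in for `struct key_t`; a namedtuple is hashable, so it can key a dict.
Key = namedtuple("Key", ["prev_pid", "curr_pid"])


def low32(value):
    """Keep the lower 32 bits, as happens when a u64 is stored in a u32 field."""
    return value & 0xFFFFFFFF


def count_switch(table, prev_pid, pid_tgid):
    """Model count_sched(): fetch or create the counter, then increment it."""
    key = Key(prev_pid, low32(pid_tgid))
    val = table.setdefault(key, ctypes.c_uint64(0))  # plays the role of lookup_or_init
    val.value += 1


def report(table):
    """Format the entries the way the script above prints them, busiest first."""
    rows = sorted(table.items(), key=lambda kv: kv[1].value, reverse=True)
    return ["task_switch[%5d->%5d]=%u" % (k.prev_pid, k.curr_pid, v.value)
            for k, v in rows]


stats = {}
tgid = 100
# (prev pid, current pid) pairs standing in for real scheduler events.
for prev_pid, pid in [(0, 101), (101, 102), (0, 101)]:
    count_switch(stats, prev_pid, (tgid << 32) | pid)

for line in report(stats):
    print(line)
```

Sorting by counter value is a userspace convenience added here; the README script prints entries in whatever order the table yields them.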
## Requirements

To get started using this toolchain in binary format, one needs:
* Linux kernel 4.1 or newer, with these flags enabled:
  * `CONFIG_BPF=y`
  * `CONFIG_BPF_SYSCALL=y`
  * `CONFIG_NET_CLS_BPF=m` [optional, for tc filters]
  * `CONFIG_NET_ACT_BPF=m` [optional, for tc actions]
  * `CONFIG_BPF_JIT=y`
  * `CONFIG_HAVE_BPF_JIT=y`
  * `CONFIG_BPF_EVENTS=y` [optional, for kprobes]
* Headers for the above kernel
* gcc, make, python
* python-pyroute2 (for some networking features only)

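The kernel flags above can be checked mechanically against a kernel config file. The following is a minimal sketch, assuming the usual `NAME=value` config format with `# NAME is not set` comments for disabled options; the helper names `parse_kconfig` and `check_flags` are hypothetical, and the sample text is made up for the demonstration.

```python
# Flags from the list above: required ones must be built in (=y),
# optional ones may be built in or modular (=y or =m).
REQUIRED = ["CONFIG_BPF", "CONFIG_BPF_SYSCALL", "CONFIG_BPF_JIT", "CONFIG_HAVE_BPF_JIT"]
OPTIONAL = {
    "CONFIG_NET_CLS_BPF": "tc filters",
    "CONFIG_NET_ACT_BPF": "tc actions",
    "CONFIG_BPF_EVENTS": "kprobes",
}


def parse_kconfig(text):
    """Map option names to values, skipping blank lines and comments."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            opts[name] = value
    return opts


def check_flags(text):
    """Return (missing required flags, optional features that are unavailable)."""
    opts = parse_kconfig(text)
    missing = [name for name in REQUIRED if opts.get(name) != "y"]
    unavailable = [feature for name, feature in OPTIONAL.items()
                   if opts.get(name) not in ("y", "m")]
    return missing, unavailable


# Made-up config fragment used only to exercise the helpers.
sample = """\
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
# CONFIG_BPF_JIT is not set
CONFIG_HAVE_BPF_JIT=y
CONFIG_NET_CLS_BPF=m
CONFIG_BPF_EVENTS=y
"""
missing, unavailable = check_flags(sample)
print("missing:", missing)
print("unavailable features:", unavailable)
```

On many distributions the running kernel's config is available as `/boot/config-$(uname -r)` or `/proc/config.gz`; the location varies, so treat those paths as a starting point rather than a guarantee.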
## Getting started

As of this writing, binary packages for the above requirements are available
in unstable formats. Both Ubuntu and Fedora have 4.2-rcX builds with the above
flags defaulted to on. LLVM provides 3.7 Ubuntu packages (but not Fedora yet).

See [INSTALL.md](INSTALL.md) for installation steps on your platform.