![BCC Logo](images/logo2.png)
# BPF Compiler Collection (BCC)

This directory contains source code for BCC, a toolkit for creating small
programs that can be dynamically loaded into a Linux kernel.

The compiler relies upon eBPF (extended Berkeley Packet Filter), which is a
feature in Linux kernels starting from 3.15. Currently, this compiler leverages
features which are mostly available in Linux 4.1 and above.

## Installing

See [INSTALL.md](INSTALL.md) for installation steps on your platform.

## Motivation

BPF guarantees that programs loaded into the kernel cannot crash and cannot
run forever, yet BPF is general-purpose enough to perform many arbitrary types
of computation. Currently, it is possible to write a program in C that will
compile into a valid BPF program, but it is vastly easier to write a C program
that will compile into invalid BPF (C is like that). The user won't know
whether the program is valid until trying to run it.

With a BPF-specific frontend, one should be able to write in a language and
receive feedback from the compiler on the validity as it pertains to a BPF
backend. This toolkit aims to provide a frontend that can only create valid BPF
programs while still harnessing its full flexibility.

Furthermore, current integrations with BPF have a kludgy workflow, sometimes
involving compiling directly in a Linux kernel source tree. This toolchain aims
to minimize the time that a developer spends getting BPF compiled, so that the
developer can instead focus on the applications that can be written and the
problems that can be solved with BPF.

The features of this toolkit include:
* End-to-end BPF workflow in a shared library
  * A modified C language for BPF backends
  * Integration with llvm-bpf backend for JIT
  * Dynamic (un)loading of JITed programs
  * Support for BPF kernel hooks: socket filters, tc classifiers,
      tc actions, and kprobes
* Bindings for Python
* Examples for socket filters, tc classifiers, and kprobes

In the future, more bindings besides Python will likely be supported. Feel free
to add support for the language of your choice and send a pull request!

## Examples

This toolchain is currently composed of two parts: a C wrapper around LLVM, and
a Python API to interact with the running program. Later, we will go into more
detail of how this all works.

### Hello, World

First, we should include the BPF class from the bpf module:
```python
from bpf import BPF
```

Since the C code is so short, we will embed it inside the Python script.

The BPF program always takes at least one argument, which is a pointer to the
context for this type of program. Different program types have different calling
conventions, but for this one we don't care, so `void *` is fine.
```python
prog = """
int hello(void *ctx) {
  bpf_trace_printk("Hello, World!\\n");
  return 0;
}
"""
b = BPF(text=prog)
```

For this example, we will call the program every time `fork()` is called by a
userspace process. Under the hood, `fork()` translates to the `clone` syscall,
so we will attach our program to the kernel symbol `sys_clone`.
```python
fn = b.load_func("hello", BPF.KPROBE)
BPF.attach_kprobe(fn, "sys_clone")
```

The Python process will then print the contents of the `trace_printk` circular
buffer until Ctrl-C is pressed. The BPF program is removed from the kernel when
the userspace process that loaded it closes the fd (or exits).
```python
from subprocess import call
try:
    # stream the kernel trace buffer to stdout until interrupted
    call(["cat", "/sys/kernel/debug/tracing/trace_pipe"])
except KeyboardInterrupt:
    pass
```

Output:
```
bcc/examples$ sudo python hello_world.py
 python-7282 [002] d... 3757.488508: : Hello, World!
```

[Source code listing](examples/hello_world.py)

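If the trace output needs to be consumed by a script rather than read by eye, each line can be split into fields. The following is a minimal sketch, not part of BCC: the regular expression assumes the line layout seen in the sample output above (command name and pid, CPU number in brackets, a flags field, a timestamp, then the message), and `parse_trace_line` is a hypothetical helper name.

```python
import re

# Assumed layout, taken from the sample line above:
#   <comm>-<pid> [<cpu>] <flags> <timestamp>: <function>: <message>
TRACE_LINE = re.compile(
    r"^\s*(?P<comm>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+(?P<flags>\S+)\s+"
    r"(?P<ts>\d+\.\d+):\s*(?P<func>[^:]*):\s?(?P<msg>.*)$"
)

def parse_trace_line(line):
    """Return the fields of one trace line as a dict, or None if it does not match."""
    m = TRACE_LINE.match(line)
    if m is None:
        return None
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "cpu": int(m.group("cpu")),
        "timestamp": float(m.group("ts")),
        "message": m.group("msg"),
    }

sample = " python-7282 [002] d... 3757.488508: : Hello, World!"
fields = parse_trace_line(sample)
print(fields["comm"], fields["pid"], fields["message"])
```

Lines that do not follow the assumed layout yield `None`, so a caller can skip them instead of failing.
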
### Networking

At Red Hat Summit 2015, BCC was presented as part of a [session on BPF](http://www.devnation.org/#7784f1f7513e8542e4db519e79ff5eec).
In the demo, a multi-host VXLAN environment is simulated and a BPF program is
used to monitor one of the physical interfaces. The BPF program keeps statistics
on the inner and outer IP addresses traversing the interface, and the userspace
component turns those statistics into a graph showing the traffic distribution
at multiple granularities. See the code [here](examples/tunnel_monitor).

[![Screenshot](http://img.youtube.com/vi/yYy3Cwce02k/0.jpg)](https://youtu.be/yYy3Cwce02k)

### Tracing

Here is a slightly more complex tracing example than Hello World. This program
will be invoked for every task change in the kernel, and records the new and
old pids in a BPF map.

The C program below introduces two new concepts.
The first is the macro `BPF_TABLE`. This defines a table (type="hash"), with key
type `key_t` and leaf type `u64` (a single counter). The table name is `stats`,
containing 1024 entries maximum. One can `lookup`, `lookup_or_init`, `update`,
and `delete` entries from the table.
The second concept is the `prev` argument. This argument is treated specially by
the BCC frontend, such that accesses to this variable are read from the saved
context that is passed by the kprobe infrastructure. The prototype of the args
starting from position 1 should match the prototype of the kernel function being
kprobed. When they match, the program has seamless access to the function
parameters.
```c
#include <uapi/linux/ptrace.h>
#include <linux/sched.h>

struct key_t {
  u32 prev_pid;
  u32 curr_pid;
};
// map_type, key_type, leaf_type, table_name, num_entry
BPF_TABLE("hash", struct key_t, u64, stats, 1024);
int count_sched(struct pt_regs *ctx, struct task_struct *prev) {
  struct key_t key = {};
  u64 zero = 0, *val;

  key.curr_pid = bpf_get_current_pid_tgid();
  key.prev_pid = prev->pid;

  // fetch the counter for this (prev, curr) pair, creating it at zero if absent
  val = stats.lookup_or_init(&key, &zero);
  (*val)++;
  return 0;
}
```
[Source code listing](examples/task_switch.c)

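For readers more at home in Python, the counting logic of `count_sched` can be modelled with an ordinary dictionary. This is only an analogy, not the BCC API: `HashTableModel` and the Python `count_sched` are hypothetical names, and the behaviour when the table is full (refusing the insert and returning `None`) is an assumption of this sketch.

```python
from collections import namedtuple

# Mirrors struct key_t from the C program above.
Key = namedtuple("Key", ["prev_pid", "curr_pid"])

class HashTableModel:
    """Plain-Python stand-in for a bounded hash table (not the BCC API)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = {}

    def lookup(self, key):
        return self.entries.get(key)

    def lookup_or_init(self, key, init):
        # Assumption of this sketch: a full table refuses new keys.
        if key not in self.entries:
            if len(self.entries) >= self.max_entries:
                return None
            self.entries[key] = init
        return self.entries[key]

    def update(self, key, value):
        self.entries[key] = value

    def delete(self, key):
        self.entries.pop(key, None)

def count_sched(stats, prev_pid, curr_pid):
    """Increment the counter for one (prev, curr) task switch."""
    key = Key(prev_pid, curr_pid)
    val = stats.lookup_or_init(key, 0)
    if val is not None:
        stats.update(key, val + 1)

stats = HashTableModel(max_entries=1024)
for prev_pid, curr_pid in [(1, 2), (1, 2), (2, 1)]:
    count_sched(stats, prev_pid, curr_pid)
print(stats.lookup(Key(1, 2)), stats.lookup(Key(2, 1)))
```
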
The userspace component loads the file shown above, and attaches it to the
`finish_task_switch` kernel function (which takes one `struct task_struct *`
argument). The table handle, obtained here with `b["stats"]`, gives dict-style
access to the stats BPF map. The Python program could use that handle to modify
the kernel table as well.
```python
from bpf import BPF
from time import sleep

b = BPF(src_file="task_switch.c")
b.attach_kprobe(event="finish_task_switch", fn_name="count_sched")

# generate many schedule events
for i in range(0, 100): sleep(0.01)

for k, v in b["stats"].items():
    print("task_switch[%5d->%5d]=%u" % (k.prev_pid, k.curr_pid, v.value))
```
[Source code listing](examples/task_switch.py)

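Because the table handle behaves like a dictionary, the collected counters can be post-processed with ordinary Python. The sketch below ranks the most frequent task switches. It runs on stand-in data rather than a live BPF map: `Key`, `Leaf`, and `top_switches` are hypothetical names, and the only assumption about the real objects is the attribute layout used in the loop above (`k.prev_pid`, `k.curr_pid`, `v.value`).

```python
from collections import namedtuple

# Stand-ins that mimic the attribute layout used in the loop above.
Key = namedtuple("Key", ["prev_pid", "curr_pid"])
Leaf = namedtuple("Leaf", ["value"])

def top_switches(items, n):
    """Return the n most frequent (prev_pid, curr_pid, count) tuples."""
    ranked = sorted(items, key=lambda kv: kv[1].value, reverse=True)
    return [(k.prev_pid, k.curr_pid, v.value) for k, v in ranked[:n]]

# Made-up sample counts, for illustration only.
sample = {
    Key(0, 7282): Leaf(12),
    Key(7282, 0): Leaf(11),
    Key(0, 15): Leaf(40),
}
for prev_pid, curr_pid, count in top_switches(sample.items(), 2):
    print("task_switch[%5d->%5d]=%u" % (prev_pid, curr_pid, count))
```

With a loaded program, `b["stats"].items()` would presumably take the place of `sample.items()`.
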
## Requirements

To get started using this toolchain in binary format, one needs:
* Linux kernel 4.1 or newer, with these flags enabled:
  * `CONFIG_BPF=y`
  * `CONFIG_BPF_SYSCALL=y`
  * `CONFIG_NET_CLS_BPF=m` [optional, for tc filters]
  * `CONFIG_NET_ACT_BPF=m` [optional, for tc actions]
  * `CONFIG_BPF_JIT=y`
  * `CONFIG_HAVE_BPF_JIT=y`
  * `CONFIG_BPF_EVENTS=y` [optional, for kprobes]
* Headers for the above kernel
* gcc, make, python
* python-pyroute2 (for some networking features only)

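The kernel flags listed above can be checked against a kernel config file with a few lines of Python. This is a rough sketch, not a BCC tool: `parse_kernel_config` and `missing_flags` are hypothetical helpers, the sample text is made up, and where the real config lives (many distributions ship it as `/boot/config-<kernel release>`) is an assumption that varies by platform.

```python
# Flags from the list above, split by whether the list marks them optional.
REQUIRED = ["CONFIG_BPF", "CONFIG_BPF_SYSCALL", "CONFIG_BPF_JIT", "CONFIG_HAVE_BPF_JIT"]
OPTIONAL = ["CONFIG_NET_CLS_BPF", "CONFIG_NET_ACT_BPF", "CONFIG_BPF_EVENTS"]

def parse_kernel_config(text):
    """Parse NAME=value lines; blank lines and '#' comment lines are skipped."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            options[name] = value
    return options

def missing_flags(options, names):
    """Return the names that are neither built in (y) nor built as a module (m)."""
    return [n for n in names if options.get(n) not in ("y", "m")]

# Made-up sample; a real check would read the config file of the running kernel.
sample_config = """\
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
# CONFIG_NET_CLS_BPF is not set
CONFIG_NET_ACT_BPF=m
CONFIG_BPF_JIT=y
CONFIG_HAVE_BPF_JIT=y
"""
options = parse_kernel_config(sample_config)
print("missing required:", missing_flags(options, REQUIRED))
print("missing optional:", missing_flags(options, OPTIONAL))
```
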
## Getting started

As of this writing, binary packages for the above requirements are available
in unstable formats. Both Ubuntu and Fedora have 4.2-rcX builds with the above
flags enabled by default. LLVM provides 3.7 Ubuntu packages (but not Fedora yet).

See [INSTALL.md](INSTALL.md) for installation steps on your platform.