CXX — safe FFI between Rust and C++
=========================================

[![Build Status](https://api.travis-ci.com/dtolnay/cxx.svg?branch=master)](https://travis-ci.com/dtolnay/cxx)
[![Latest Version](https://img.shields.io/crates/v/cxx.svg)](https://crates.io/crates/cxx)
[![Rust Documentation](https://img.shields.io/badge/api-rustdoc-blue.svg)](https://docs.rs/cxx)

This library provides a **safe** mechanism for calling C++ code from Rust and
Rust code from C++, not subject to the many ways that things can go wrong when
using bindgen or cbindgen to generate unsafe C-style bindings.

This doesn't change the fact that 100% of C++ code is unsafe. When auditing a
project, you would be on the hook for auditing all the unsafe Rust code and
*all* the C++ code. The core safety claim under this new model is that auditing
just the C++ side would be sufficient to catch all problems, i.e. the Rust side
can be 100% safe.

```toml
[dependencies]
cxx = "0.2"
```

*Compiler support: requires rustc 1.42+*

<br>

## Overview

The idea is that we define the signatures of both sides of our FFI boundary
embedded together in one Rust module (the next section shows an example). From
this, CXX receives a complete picture of the boundary to perform static analyses
against the types and function signatures to uphold both Rust's and C++'s
invariants and requirements.

If everything checks out statically, then CXX uses a pair of code generators to
emit the relevant `extern "C"` signatures on both sides together with any
necessary static assertions for later in the build process to verify
correctness. On the Rust side this code generator is simply an attribute
procedural macro. On the C++ side it can be a small Cargo build script if your
build is managed by Cargo, or for other build systems like Bazel or Buck we
provide a command line tool which generates the header and source file and
should be easy to integrate.
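
To give a feel for what a static assertion of this kind looks like, here is a minimal sketch in plain Rust. It is not the code that CXX generates: the struct `Pair`, the helper `pair_layout`, and the expected size and alignment are all assumptions made for illustration, and the `assert!` in a `const` needs a newer compiler than the minimum listed above.

```rust
use std::mem::{align_of, size_of};

// Hypothetical shared struct, laid out the way a C++ compiler would lay out
// `struct Pair { uint8_t tag; uint32_t value; };`.
#[repr(C)]
pub struct Pair {
    pub tag: u8,
    pub value: u32,
}

// Evaluated at compile time: the build fails if either assumption is wrong,
// and nothing is checked at runtime.
const _: () = assert!(size_of::<Pair>() == 8);
const _: () = assert!(align_of::<Pair>() == 4);

// Hypothetical helper so the layout can also be inspected from ordinary code.
pub fn pair_layout() -> (usize, usize) {
    (size_of::<Pair>(), align_of::<Pair>())
}
```

A matching `static_assert` on the C++ side would pin down the same numbers, which is how both compilers can be made to agree on a layout before anything runs.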

The resulting FFI bridge operates at zero or negligible overhead, i.e. no
copying, no serialization, no memory allocation, no runtime checks needed.

The FFI signatures are able to use native types from whichever side they please,
such as Rust's `String` or C++'s `std::string`, Rust's `Box` or C++'s
`std::unique_ptr`, Rust's `Vec` or C++'s `std::vector`, etc. in any combination.
CXX guarantees an ABI-compatible signature that both sides understand, based on
builtin bindings for key standard library types to expose an idiomatic API on
those types to the other language. For example when manipulating a C++ string
from Rust, its `len()` method becomes a call of the `size()` member function
defined by C++; when manipulating a Rust string from C++, its `size()` member
function calls Rust's `len()`.
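
The naming idea can be sketched without any C++ at all. The type below is a hypothetical stand-in, not CXX's `CxxString`: its bytes live in a Rust `Vec` so that the sketch runs on its own, and its private `size()` plays the role of the C++ member function.

```rust
// Stand-in for a C++ `std::string`. In a real bridge the bytes would live on
// the C++ side; here they are a Rust Vec so that the sketch is runnable.
pub struct FakeCxxString {
    bytes: Vec<u8>,
}

impl FakeCxxString {
    pub fn new(s: &str) -> Self {
        FakeCxxString { bytes: s.as_bytes().to_vec() }
    }

    // Plays the role of the C++ member function `size()`.
    fn size(&self) -> usize {
        self.bytes.len()
    }

    // The name a Rust caller expects; it simply forwards to `size()`.
    pub fn len(&self) -> usize {
        self.size()
    }

    // Rust convention: a type with `len()` usually offers `is_empty()` too.
    pub fn is_empty(&self) -> bool {
        self.len() == 0
    }
}
```

The caller only ever sees the name that is idiomatic in its own language; the forwarding is a direct call with no copying.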

<br>

## Example

A runnable version of this example is provided under the *demo-rs* directory of
this repo (with the C++ side of the implementation in the *demo-cxx* directory).
To try it out, jump into demo-rs and run `cargo run`.

```rust
#[cxx::bridge]
mod ffi {
    // Any shared structs, whose fields will be visible to both languages.
    struct SharedThing {
        z: i32,
        y: Box<ThingR>,
        x: UniquePtr<ThingC>,
    }

    extern "C" {
        // One or more headers with the matching C++ declarations. Our code
        // generators don't read it but it gets #include'd and used in static
        // assertions to ensure our picture of the FFI boundary is accurate.
        include!("demo-cxx/demo.h");

        // Zero or more opaque types which both languages can pass around but
        // only C++ can see the fields.
        type ThingC;

        // Functions implemented in C++.
        fn make_demo(appname: &str) -> UniquePtr<ThingC>;
        fn get_name(thing: &ThingC) -> &CxxString;
        fn do_thing(state: SharedThing);
    }

    extern "Rust" {
        // Zero or more opaque types which both languages can pass around but
        // only Rust can see the fields.
        type ThingR;

        // Functions implemented in Rust.
        fn print_r(r: &ThingR);
    }
}
```

Now we simply provide C++ definitions of all the things in the `extern "C"`
block and Rust definitions of all the things in the `extern "Rust"` block, and
get to call back and forth safely.

Here are links to the complete set of source files involved in the demo:

- [demo-rs/src/main.rs](demo-rs/src/main.rs)
- [demo-rs/build.rs](demo-rs/build.rs)
- [demo-cxx/demo.h](demo-cxx/demo.h)
- [demo-cxx/demo.cc](demo-cxx/demo.cc)

To look at the code generated in both languages for the example by the CXX code
generators:

```console
  # run Rust code generator and print to stdout
  # (requires https://github.com/dtolnay/cargo-expand)
$ cargo expand --manifest-path demo-rs/Cargo.toml

  # run C++ code generator and print to stdout
$ cargo run --manifest-path cmd/Cargo.toml -- demo-rs/src/main.rs
```

<br>

## Details

As seen in the example, the language of the FFI boundary involves 3 kinds of
items:

- **Shared structs** &mdash; their fields are made visible to both languages.
  The definition written within cxx::bridge is the single source of truth.

- **Opaque types** &mdash; their fields are secret from the other language.
  These cannot be passed across the FFI by value but only behind an indirection,
  such as a reference `&`, a Rust `Box`, or a `UniquePtr`. Can be a type alias
  for an arbitrarily complicated generic language-specific type depending on
  your use case.

- **Functions** &mdash; implemented in either language, callable from the other
  language.

Within the `extern "C"` part of the CXX bridge we list the types and functions
for which C++ is the source of truth, as well as the header(s) that declare
those APIs. In the future it's possible that this section could be generated
bindgen-style from the headers but for now we need the signatures written out;
static assertions will verify that they are accurate.

Within the `extern "Rust"` part, we list types and functions for which Rust is
the source of truth. These all implicitly refer to the `super` module, the
parent module of the CXX bridge. You can think of the two items listed in the
example above as being like `use super::ThingR` and `use super::print_r` except
re-exported to C++. The parent module will either contain the definitions
directly for simple things, or contain the relevant `use` statements to bring
them into scope from elsewhere.

Your function implementations themselves, whether in C++ or Rust, *do not* need
to be defined as `extern "C"` ABI or no\_mangle. CXX will put in the right shims
where necessary to make it all work.
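
As a rough sketch of what the parent module might hold for the `extern "Rust"` items in the example, the definitions can be ordinary Rust. `ThingR` and `print_r` are the names from the bridge above; the `usize` field and the `describe_r` helper are assumptions made for illustration, and the bridge module itself is left out so that the snippet compiles on its own.

```rust
// Opaque to C++: only Rust code can see this field.
pub struct ThingR(pub usize);

// Hypothetical helper that builds the message, kept separate so it is easy
// to check without capturing stdout.
pub fn describe_r(r: &ThingR) -> String {
    format!("called back with r={}", r.0)
}

// An ordinary Rust function: no `extern "C"` and no `#[no_mangle]` here.
pub fn print_r(r: &ThingR) {
    println!("{}", describe_r(r));
}
```

Nothing in these definitions mentions the FFI; the signature listed in the bridge is what ties `print_r` to its C++ callers.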

<br>

## Comparison vs bindgen and cbindgen

Notice that with CXX there is repetition of all the function signatures: they
are typed out once where the implementation is defined (in C++ or Rust) and
again inside the cxx::bridge module, though compile-time assertions guarantee
these are kept in sync. This is different from [bindgen] and [cbindgen] where
function signatures are typed by a human once and the tool consumes them in one
language and emits them in the other language.

[bindgen]: https://github.com/rust-lang/rust-bindgen
[cbindgen]: https://github.com/eqrion/cbindgen/

This is because CXX fills a somewhat different role. It is a lower level tool
than bindgen or cbindgen in a sense; you can think of it as being a replacement
for the concept of `extern "C"` signatures as we know them, rather than a
replacement for a bindgen. It would be reasonable to build a higher level
bindgen-like tool on top of CXX which consumes a C++ header and/or Rust module
(and/or IDL like Thrift) as source of truth and generates the cxx::bridge,
eliminating the repetition while leveraging the static analysis safety
guarantees of CXX.

But note in other ways CXX is higher level than the bindgens, with rich support
for common standard library types. Frequently with bindgen when we are dealing
with an idiomatic C++ API we would end up manually wrapping that API in C-style
raw pointer functions, applying bindgen to get unsafe raw pointer Rust
functions, and replicating the API again to expose those idiomatically in Rust.
That's a much worse form of repetition because it is unsafe all the way through.
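
The layering described above can be sketched as follows. Everything here is hypothetical (`Counter`, `counter_new`, `counter_hit`, `counter_free`, `SafeCounter`), and the C-style layer is written in Rust rather than C++ purely so that the sketch compiles on its own.

```rust
// Layer 1: what a C-style wrapper around an idiomatic C++ class tends to look
// like. A real one would be written in C++.
#[repr(C)]
pub struct Counter {
    hits: u32,
}

pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { hits: 0 }))
}

/// # Safety
/// `c` must come from `counter_new` and must not have been freed.
pub unsafe extern "C" fn counter_hit(c: *mut Counter) -> u32 {
    let counter = unsafe { &mut *c };
    counter.hits += 1;
    counter.hits
}

/// # Safety
/// `c` must come from `counter_new` and must not be used afterwards.
pub unsafe extern "C" fn counter_free(c: *mut Counter) {
    drop(unsafe { Box::from_raw(c) });
}

// Layer 2 (what bindgen would emit) is skipped: it would be `extern "C"`
// declarations mirroring the three signatures above.

// Layer 3: the hand-written safe API that repeats the same surface again.
pub struct SafeCounter {
    raw: *mut Counter,
}

impl SafeCounter {
    pub fn new() -> Self {
        SafeCounter { raw: counter_new() }
    }

    pub fn hit(&mut self) -> u32 {
        // Sound only because `raw` is never exposed or freed before Drop.
        unsafe { counter_hit(self.raw) }
    }
}

impl Drop for SafeCounter {
    fn drop(&mut self) {
        unsafe { counter_free(self.raw) }
    }
}
```

Each method of the original class ends up written three times, and every `unsafe` block in the last layer rests on a comment rather than on anything the compiler can check.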

By using a CXX bridge as the shared understanding between the languages, rather
than `extern "C"` C-style signatures as the shared understanding, common FFI use
cases become expressible using 100% safe code.

It would also be reasonable to mix and match, using CXX bridge for the 95% of
your FFI that is straightforward and doing the remaining few oddball signatures
the old-fashioned way with bindgen and cbindgen, if for some reason CXX's static
restrictions get in the way. Please file an issue if you end up taking this
approach so that we know in what ways it would be worthwhile to make the tool
more expressive.

<br>

## Cargo-based setup

For builds that are orchestrated by Cargo, you will use a build script that runs
CXX's C++ code generator and compiles the resulting C++ code along with any
other C++ code for your crate.

The canonical build script is as follows. The indicated line returns a
[`cc::Build`] instance (from the usual widely used `cc` crate) on which you can
set up any additional source files and compiler flags as normal.

[`cc::Build`]: https://docs.rs/cc/1.0/cc/struct.Build.html

```rust
// build.rs

fn main() {
    cxx::Build::new()
        .bridge("src/main.rs")  // returns a cc::Build
        .file("../demo-cxx/demo.cc")
        .flag("-std=c++11")
        .compile("cxxbridge-demo");

    println!("cargo:rerun-if-changed=src/main.rs");
    println!("cargo:rerun-if-changed=../demo-cxx/demo.h");
    println!("cargo:rerun-if-changed=../demo-cxx/demo.cc");
}
```

<br>

## Non-Cargo setup

For use in non-Cargo builds like Bazel or Buck, CXX provides an alternate way of
invoking the C++ code generator as a standalone command line tool. The tool is
packaged as the `cxxbridge-cmd` crate on crates.io or can be built from the
*cmd* directory of this repo.

```bash
$ cargo install cxxbridge-cmd

$ cxxbridge src/main.rs --header > path/to/mybridge.h
$ cxxbridge src/main.rs > path/to/mybridge.cc
```

<br>

## Safety

Be aware that the design of this library is intentionally restrictive and
opinionated! It isn't a goal to be powerful enough to handle arbitrary
signatures in either language. Instead this project is about carving out a
reasonably expressive set of functionality about which we can make useful safety
guarantees today and maybe extend over time. You may find that it takes some
practice to use CXX bridge effectively as it won't work in all the ways that you
are used to.

Some of the considerations that go into ensuring safety are:

- By design, our paired code generators work together to control both sides of
  the FFI boundary. Ordinarily in Rust writing your own `extern "C"` blocks is
  unsafe because the Rust compiler has no way to know whether the signatures
  you've written actually match the signatures implemented in the other
  language. With CXX we achieve that visibility and know what's on the other
  side.

- Our static analysis detects and prevents passing types by value that shouldn't
  be passed by value from C++ to Rust, for example because they may contain
  internal pointers that would be screwed up by Rust's move behavior.

- To many people's surprise, it is possible to have a struct in Rust and a
  struct in C++ with exactly the same layout / fields / alignment / everything,
  and still not the same ABI when passed by value. This is a longstanding
  bindgen bug that leads to segfaults in absolutely correct-looking code
  ([rust-lang/rust-bindgen#778]). CXX knows about this and can insert the
  necessary zero-cost workaround transparently where needed, so go ahead and
  pass your structs by value without worries. This is made possible by owning
  both sides of the boundary rather than just one.

- Template instantiations: for example in order to expose a UniquePtr\<T\> type
  in Rust backed by a real C++ unique\_ptr, we have a way of using a Rust trait
  to connect the behavior back to the template instantiations performed by the
  other language.

[rust-lang/rust-bindgen#778]: https://github.com/rust-lang/rust-bindgen/issues/778
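
The last point, using a trait to tie a generic Rust type to per-type instantiations, can be sketched in plain Rust. This is not CXX's actual trait or its `UniquePtr`: the names `PtrTarget`, `FakeUniquePtr`, `released_count` and the integer handle are assumptions made for illustration, and where a real bridge would call into C++ the sketch only bumps a counter.

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical trait: one impl per C++ type for which a `unique_ptr<T>`
// instantiation exists. A real bridge would forward these calls to C++.
pub trait PtrTarget {
    fn type_name() -> &'static str;
    // Stands in for running the matching C++ destructor.
    fn release(raw: usize);
}

// Generic on the Rust side, yet every operation is routed through the trait,
// so it can only be used with types that have an instantiation behind them.
pub struct FakeUniquePtr<T: PtrTarget> {
    raw: usize,
    _marker: PhantomData<T>,
}

impl<T: PtrTarget> FakeUniquePtr<T> {
    pub fn from_raw(raw: usize) -> Self {
        FakeUniquePtr { raw, _marker: PhantomData }
    }

    pub fn describe(&self) -> String {
        format!("unique_ptr<{}>@{}", T::type_name(), self.raw)
    }
}

impl<T: PtrTarget> Drop for FakeUniquePtr<T> {
    fn drop(&mut self) {
        T::release(self.raw);
    }
}

// The one "instantiation" provided in this sketch.
pub struct ThingC;

static RELEASED: AtomicUsize = AtomicUsize::new(0);

impl PtrTarget for ThingC {
    fn type_name() -> &'static str {
        "ThingC"
    }

    fn release(_raw: usize) {
        RELEASED.fetch_add(1, Ordering::SeqCst);
    }
}

// Number of times the stand-in destructor has run.
pub fn released_count() -> usize {
    RELEASED.load(Ordering::SeqCst)
}
```

Writing `FakeUniquePtr<SomeOtherType>` fails to compile unless `SomeOtherType` implements the trait, which mirrors the idea that a pointer type is only usable for instantiations the other language really performs.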

<br>

## Builtin types

In addition to all the primitive types (i32 &lt;=&gt; int32_t), the following
common types may be used in the fields of shared structs and the arguments and
returns of functions.

<table>
<tr><th>name in Rust</th><th>name in C++</th><th>restrictions</th></tr>
<tr><td>String</td><td>rust::String</td><td></td></tr>
<tr><td>&amp;str</td><td>rust::Str</td><td></td></tr>
<tr><td>&amp;[u8]</td><td>rust::Slice&lt;uint8_t&gt;</td><td><sup><i>arbitrary &amp;[T] not implemented yet</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.2/cxx/struct.CxxString.html">CxxString</a></td><td>std::string</td><td><sup><i>cannot be passed by value</i></sup></td></tr>
<tr><td>Box&lt;T&gt;</td><td>rust::Box&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.2/cxx/struct.UniquePtr.html">UniquePtr&lt;T&gt;</a></td><td>std::unique_ptr&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust type</i></sup></td></tr>
<tr><td>fn(T, U) -&gt; V</td><td>rust::Fn&lt;V(T, U)&gt;</td><td><sup><i>only passing from Rust to C++ is implemented so far</i></sup></td></tr>
<tr><td>Result&lt;T&gt;</td><td>throw/catch</td><td><sup><i>allowed as return type only</i></sup></td></tr>
</table>

The C++ API of the `rust` namespace is defined by the *include/cxx.h* file in
this repo. You will need to include this header in your C++ code when working
with those types.

The following types are intended to be supported "soon" but are just not
implemented yet. I don't expect any of these to be hard to make work but it's a
matter of designing a nice API for each in its non-native language.

<table>
<tr><th>name in Rust</th><th>name in C++</th></tr>
<tr><td>Vec&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>BTreeMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>HashMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>Arc&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::vector&lt;T&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::map&lt;K, V&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::unordered_map&lt;K, V&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::shared_ptr&lt;T&gt;</td></tr>
</table>

<br>

## Remaining work

This is still early days for CXX; I am releasing it as a minimum viable product
to collect feedback on the direction and invite collaborators. Here are some of
the facets that I still intend for this project to tackle:

- [ ] Support associated methods: `extern "Rust" { fn f(self: &Struct); }`
- [ ] Support C++ member functions
- [ ] Support structs with type parameters
- [ ] Support async functions

On the build side, I don't have much experience with the `cc` crate so I expect
there may be someone who can suggest ways to make that aspect of this crate
friendlier or more robust. Please report issues if you run into trouble building
or linking any of this stuff.

Finally, I know more about Rust library design than C++ library design so I
would appreciate help making the C++ APIs in this project more idiomatic where
anyone has suggestions.

<br>

#### License

<sup>
Licensed under either of <a href="LICENSE-APACHE">Apache License, Version
2.0</a> or <a href="LICENSE-MIT">MIT license</a> at your option.
</sup>

<br>

<sub>
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in this project by you, as defined in the Apache-2.0 license,
shall be dual licensed as above, without any additional terms or conditions.
</sub>