CXX — safe FFI between Rust and C++
=========================================

[![Build Status](https://api.travis-ci.com/dtolnay/cxx.svg?branch=master)](https://travis-ci.com/dtolnay/cxx)
[![Latest Version](https://img.shields.io/crates/v/cxx.svg)](https://crates.io/crates/cxx)
[![Rust Documentation](https://img.shields.io/badge/api-rustdoc-blue.svg)](https://docs.rs/cxx)

This library provides a **safe** mechanism for calling C++ code from Rust and
Rust code from C++, not subject to the many ways that things can go wrong when
using bindgen or cbindgen to generate unsafe C-style bindings.

```toml
[dependencies]
cxx = "0.1"
```

*Compiler support: requires rustc 1.42+ (beta on January 30, stable on March
12)*

<br>

## Overview

The idea is that we define the signatures of both sides of our FFI boundary
embedded together in one Rust module (the next section shows an example). From
this, CXX receives a complete picture of the boundary to perform static analyses
against the types and function signatures to uphold both Rust's and C++'s
invariants and requirements.

If everything checks out statically, then CXX uses a pair of code generators to
emit the relevant `extern "C"` signatures on both sides together with any
necessary static assertions for later in the build process to verify
correctness. On the Rust side this code generator is simply an attribute
procedural macro. On the C++ side it can be a small Cargo build script if your
build is managed by Cargo, or for other build systems like Bazel or Buck we
provide a command line tool which generates the header and source file and
should be easy to integrate.
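As a rough illustration of the kind of check a static assertion performs, the sketch below pins down the size and alignment of a `#[repr(C)]` struct at compile time. This is not CXX's generated code, which this README does not show; the struct `Pair` and the helper `pair_layout` are hypothetical names invented for the example, and the expected values assume a target where `i32` is 4-byte aligned.

```rust
use std::mem::{align_of, size_of};

// Hypothetical shared struct; #[repr(C)] gives it a C-compatible layout.
#[repr(C)]
pub struct Pair {
    pub a: i32,
    pub b: i32,
}

// Compile-time assertions: the build fails if the layout ever drifts from
// what the other side of the boundary expects.
const _: () = assert!(size_of::<Pair>() == 8);
const _: () = assert!(align_of::<Pair>() == 4);

// Returns (size, alignment) so the same values can be inspected at runtime.
pub fn pair_layout() -> (usize, usize) {
    (size_of::<Pair>(), align_of::<Pair>())
}
```

If a field were added to `Pair` without updating the assertions, compilation would stop at the `const` items rather than producing a mismatched binary.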

The resulting FFI bridge operates at zero or negligible overhead, i.e. no
copying, no serialization, no memory allocation, no runtime checks needed.

The FFI signatures are able to use native types from whichever side they please,
such as Rust's `String` or C++'s `std::string`, Rust's `Box` or C++'s
`std::unique_ptr`, Rust's `Vec` or C++'s `std::vector`, etc. in any combination.
CXX guarantees an ABI-compatible signature that both sides understand, based on
builtin bindings for key standard library types to expose an idiomatic API on
those types to the other language. For example when manipulating a C++ string
from Rust, its `len()` method becomes a call of the `size()` member function
defined by C++; when manipulating a Rust string from C++, its `size()` member
function calls Rust's `len()`.
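The `len()` / `size()` delegation can be pictured with a small stand-in. In the sketch below, `cpp_string_size` is a hypothetical placeholder for the C++ `size()` member function (written in Rust here so the example is self-contained), and `ForeignString` is an invented wrapper type. Neither is part of CXX's API.

```rust
// Stand-in for a C++ std::string; the real object would live on the C++ side.
pub struct ForeignString {
    bytes: Vec<u8>,
}

// Hypothetical placeholder for the C++ `size()` member function. With a real
// bridge, this call would cross the FFI boundary.
fn cpp_string_size(s: &ForeignString) -> usize {
    s.bytes.len()
}

impl ForeignString {
    pub fn new(text: &str) -> Self {
        ForeignString { bytes: text.as_bytes().to_vec() }
    }

    // Idiomatic Rust name, implemented by delegating to the foreign `size()`.
    pub fn len(&self) -> usize {
        cpp_string_size(self)
    }

    pub fn is_empty(&self) -> bool {
        self.len() == 0
    }
}
```

The point of the design is that Rust callers see the method name they expect, while the work is still done by the type's native implementation.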

<br>

## Example

A runnable version of this example is provided under the *demo-rs* directory of
this repo (with the C++ side of the implementation in the *demo-cxx* directory).
To try it out, jump into demo-rs and run `cargo run`.

```rust
#[cxx::bridge]
mod ffi {
    // Any shared structs, whose fields will be visible to both languages.
    struct SharedThing {
        z: i32,
        y: Box<ThingR>,
        x: UniquePtr<ThingC>,
    }

    extern "C" {
        // One or more headers with the matching C++ declarations. Our code
        // generators don't read it but it gets #include'd and used in static
        // assertions to ensure our picture of the FFI boundary is accurate.
        include!("demo-cxx/demo.h");

        // Zero or more opaque types which both languages can pass around but
        // only C++ can see the fields.
        type ThingC;

        // Functions implemented in C++.
        fn make_demo(appname: &str) -> UniquePtr<ThingC>;
        fn get_name(thing: &ThingC) -> &CxxString;
        fn do_thing(state: SharedThing);
    }

    extern "Rust" {
        // Zero or more opaque types which both languages can pass around but
        // only Rust can see the fields.
        type ThingR;

        // Functions implemented in Rust.
        fn print_r(r: &ThingR);
    }
}
```

Now we simply provide C++ definitions of all the things in the `extern "C"`
block and Rust definitions of all the things in the `extern "Rust"` block, and
get to call back and forth safely.
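For the Rust half, the definitions are ordinary Rust items. The following is a minimal sketch of what could back the `extern "Rust"` block above; the field inside `ThingR`, the message text, and the helper `describe_r` are assumptions made for illustration, and the actual demo sources linked in this section are the reference.

```rust
// Opaque to C++: only Rust can see this field (the field itself is assumed).
pub struct ThingR(pub usize);

// Hypothetical helper that builds the message, kept separate so that the
// output is easy to check.
pub fn describe_r(r: &ThingR) -> String {
    format!("called back with r={}", r.0)
}

// Matches the `fn print_r(r: &ThingR)` signature declared in the bridge.
pub fn print_r(r: &ThingR) {
    println!("{}", describe_r(r));
}
```

Note that `print_r` is a plain safe function with no FFI-specific attributes.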

Here are links to the complete set of source files involved in the demo:

- [demo-rs/src/main.rs](demo-rs/src/main.rs)
- [demo-rs/build.rs](demo-rs/build.rs)
- [demo-cxx/demo.h](demo-cxx/demo.h)
- [demo-cxx/demo.cc](demo-cxx/demo.cc)

To look at the code generated in both languages for the example by the CXX code
generators:

```console
  # run Rust code generator and print to stdout
  # (requires https://github.com/dtolnay/cargo-expand)
$ cargo expand --manifest-path demo-rs/Cargo.toml

  # run C++ code generator and print to stdout
$ cargo run --manifest-path cmd/Cargo.toml -- demo-rs/src/main.rs
```

<br>

## Details

As seen in the example, the language of the FFI boundary involves 3 kinds of
items:

- **Shared structs** &mdash; their fields are made visible to both languages.
  The definition written within cxx::bridge is the single source of truth.

- **Opaque types** &mdash; their fields are secret from the other language.
  These cannot be passed across the FFI by value but only behind an indirection,
  such as a reference `&`, a Rust `Box`, or a `UniquePtr`. An opaque type can be
  a type alias for an arbitrarily complicated generic language-specific type
  depending on your use case.

- **Functions** &mdash; implemented in either language, callable from the other
  language.

Within the `extern "C"` part of the CXX bridge we list the types and functions
for which C++ is the source of truth, as well as the header(s) that declare
those APIs. In the future it's possible that this section could be generated
bindgen-style from the headers but for now we need the signatures written out;
static assertions will verify that they are accurate.

Within the `extern "Rust"` part, we list types and functions for which Rust is
the source of truth. These all implicitly refer to the `super` module, the
parent module of the CXX bridge. You can think of the two items listed in the
example above as being like `use super::ThingR` and `use super::print_r` except
re-exported to C++. The parent module will either contain the definitions
directly for simple things, or contain the relevant `use` statements to bring
them into scope from elsewhere.
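The `super` lookup is plain Rust name resolution, which the sketch below demonstrates without involving CXX at all. The module names `implementation` and `bridge_like` and the function `thing_value` are hypothetical; `bridge_like` merely stands in for the position a bridge module would occupy.

```rust
// Definitions live in an ordinary module elsewhere in the crate.
mod implementation {
    pub struct ThingR(pub usize);

    pub fn thing_value(r: &ThingR) -> usize {
        r.0
    }
}

// The parent module brings them into scope with a `use` statement...
use self::implementation::{thing_value, ThingR};

// ...so a child module can reach them through `super`, the same path the
// `extern "Rust"` items are described as referring to.
mod bridge_like {
    pub fn call_through_super(n: usize) -> usize {
        let thing = super::ThingR(n);
        super::thing_value(&thing)
    }
}
```

If the `use` line were removed, `super::ThingR` would fail to resolve, which mirrors why the parent of a bridge must have the listed items in scope.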

Your function implementations themselves, whether in C++ or Rust, *do not* need
to be defined as `extern "C"` ABI or no\_mangle. CXX will put in the right shims
where necessary to make it all work.
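To make the idea of a shim concrete, here is a hand-written sketch of the pattern: a safe Rust function plus a thin C ABI entry point that forwards to it. The names `r_value` and `shim_r_value` are invented, and the shims CXX actually generates are not shown in this README and may look quite different.

```rust
pub struct ThingR(pub usize);

// Ordinary safe Rust function: no `extern "C"`, no `#[no_mangle]`.
pub fn r_value(r: &ThingR) -> usize {
    r.0
}

// Hand-written stand-in for a generated shim. It exposes a C ABI entry point
// and forwards to the safe function above. (A shim that is really exported to
// another language would also need a stable symbol name.)
pub unsafe extern "C" fn shim_r_value(r: *const ThingR) -> usize {
    // Safety: the caller must pass a valid, non-null pointer to a ThingR.
    r_value(unsafe { &*r })
}
```

The unsafety is confined to the shim; the function you write and the function your callers see can both stay safe.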

<br>

## Comparison vs bindgen and cbindgen

Notice that with CXX there is repetition of all the function signatures: they
are typed out once where the implementation is defined (in C++ or Rust) and
again inside the cxx::bridge module, though compile-time assertions guarantee
these are kept in sync. This is different from [bindgen] and [cbindgen] where
function signatures are typed by a human once and the tool consumes them in one
language and emits them in the other language.

[bindgen]: https://github.com/rust-lang/rust-bindgen
[cbindgen]: https://github.com/eqrion/cbindgen/

This is because CXX fills a somewhat different role. It is a lower level tool
than bindgen or cbindgen in a sense; you can think of it as being a replacement
for the concept of `extern "C"` signatures as we know them, rather than a
replacement for a bindgen. It would be reasonable to build a higher level
bindgen-like tool on top of CXX which consumes a C++ header and/or Rust module
(and/or IDL like Thrift) as source of truth and generates the cxx::bridge,
eliminating the repetition while leveraging the static analysis safety
guarantees of CXX.

But note in other ways CXX is higher level than the bindgens, with rich support
for common standard library types. Frequently with bindgen when we are dealing
with an idiomatic C++ API we would end up manually wrapping that API in C-style
raw pointer functions, applying bindgen to get unsafe raw pointer Rust
functions, and replicating the API again to expose those idiomatically in Rust.
That's a much worse form of repetition because it is unsafe all the way through.

By using a CXX bridge as the shared understanding between the languages, rather
than `extern "C"` C-style signatures as the shared understanding, common FFI use
cases become expressible using 100% safe code.

It would also be reasonable to mix and match, using CXX bridge for the 95% of
your FFI that is straightforward and doing the remaining few oddball signatures
the old fashioned way with bindgen and cbindgen, if for some reason CXX's static
restrictions get in the way. Please file an issue if you end up taking this
approach so that we know what ways it would be worthwhile to make the tool more
expressive.

<br>

## Cargo-based setup

For builds that are orchestrated by Cargo, you will use a build script that runs
CXX's C++ code generator and compiles the resulting C++ code along with any
other C++ code for your crate.

The canonical build script is as follows. The indicated line returns a
[`cc::Build`] instance (from the usual widely used `cc` crate) on which you can
set up any additional source files and compiler flags as normal.

[`cc::Build`]: https://docs.rs/cc/1.0/cc/struct.Build.html

```rust
// build.rs

fn main() {
    cxx::Build::new()
        .bridge("src/main.rs")  // returns a cc::Build
        .file("../demo-cxx/demo.cc")
        .flag("-std=c++11")
        .compile("cxxbridge-demo");

    println!("cargo:rerun-if-changed=src/main.rs");
    println!("cargo:rerun-if-changed=../demo-cxx/demo.h");
    println!("cargo:rerun-if-changed=../demo-cxx/demo.cc");
}
```

<br>

## Non-Cargo setup

For use in non-Cargo builds like Bazel or Buck, CXX provides an alternate way of
invoking the C++ code generator as a standalone command line tool. The tool is
packaged as the `cxxbridge-cmd` crate on crates.io or can be built from the
*cmd* directory of this repo.

```bash
$ cargo install cxxbridge-cmd

$ cxxbridge src/main.rs --header > path/to/mybridge.h
$ cxxbridge src/main.rs > path/to/mybridge.cc
```

<br>

## Safety

Be aware that the design of this library is intentionally restrictive and
opinionated! It isn't a goal to be powerful enough to handle arbitrary
signatures in either language. Instead this project is about carving out a
reasonably expressive set of functionality about which we can make useful safety
guarantees today and maybe extend over time. You may find that it takes some
practice to use CXX bridge effectively as it won't work in all the ways that you
are used to.

Some of the considerations that go into ensuring safety are:

- By design, our paired code generators work together to control both sides of
  the FFI boundary. Ordinarily in Rust writing your own `extern "C"` blocks is
  unsafe because the Rust compiler has no way to know whether the signatures
  you've written actually match the signatures implemented in the other
  language. With CXX we achieve that visibility and know what's on the other
  side.

- Our static analysis detects and prevents passing types by value that shouldn't
  be passed by value from C++ to Rust, for example because they may contain
  internal pointers that would be invalidated by Rust's move behavior.

- To many people's surprise, it is possible to have a struct in Rust and a
  struct in C++ with exactly the same layout / fields / alignment / everything,
  and still not the same ABI when passed by value. This is a longstanding
  bindgen bug that leads to segfaults in absolutely correct-looking code
  ([rust-lang/rust-bindgen#778]). CXX knows about this and can insert the
  necessary zero-cost workaround transparently where needed, so go ahead and
  pass your structs by value without worries. This is made possible by owning
  both sides of the boundary rather than just one.

- Template instantiations: for example in order to expose a UniquePtr\<T\> type
  in Rust backed by a real C++ unique\_ptr, we have a way of using a Rust trait
  to connect the behavior back to the template instantiations performed by the
  other language.
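This README does not spell out what the by-value workaround looks like. One general way to sidestep by-value ABI differences, shown here purely as an assumption and not as CXX's actual mechanism, is to pass a pointer across the C ABI and move the value out on the far side, while keeping a by-value signature for callers. All names in the sketch (`Shared`, `sum_fields`, `shim_sum_fields`, `call_sum_fields`) are hypothetical.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

// Hypothetical shared struct with a C-compatible layout.
#[repr(C)]
pub struct Shared {
    pub z: i32,
    pub w: i32,
}

// The implementation takes the struct by value, as the author wrote it.
pub fn sum_fields(state: Shared) -> i32 {
    state.z + state.w
}

// The C ABI entry point receives a pointer instead of the struct itself,
// then moves the value out before calling the implementation.
pub unsafe extern "C" fn shim_sum_fields(state: *mut Shared) -> i32 {
    // Safety: the caller hands over ownership of `*state` and must not use
    // or drop it again afterwards.
    let owned = unsafe { ptr::read(state) };
    sum_fields(owned)
}

// Caller-side wrapper that restores the by-value signature.
pub fn call_sum_fields(state: Shared) -> i32 {
    // ManuallyDrop prevents a double drop once the shim has taken ownership.
    let mut slot = ManuallyDrop::new(state);
    unsafe { shim_sum_fields(&mut *slot as *mut Shared) }
}
```

A wrapper of this shape is only safe to write when one tool controls both the caller and the callee, which is the property the list above attributes to CXX.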

[rust-lang/rust-bindgen#778]: https://github.com/rust-lang/rust-bindgen/issues/778

<br>

## Builtin types

In addition to all the primitive types (for example, Rust's `i32` corresponds to
C++'s `int32_t`), the following common types may be used in the fields of shared
structs and the arguments and returns of functions.

<table>
<tr><th>name in Rust</th><th>name in C++</th><th>restrictions</th></tr>
<tr><td>String</td><td>cxxbridge::RustString</td><td></td></tr>
<tr><td>&amp;str</td><td>cxxbridge::RustStr</td><td></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.1/cxx/struct.CxxString.html">CxxString</a></td><td>std::string</td><td><sup><i>cannot be passed by value</i></sup></td></tr>
<tr><td>Box&lt;T&gt;</td><td>cxxbridge::RustBox&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.1/cxx/struct.UniquePtr.html">UniquePtr&lt;T&gt;</a></td><td>std::unique_ptr&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust type</i></sup></td></tr>
<tr><td></td><td></td><td></td></tr>
</table>

The C++ API of the `cxxbridge` namespace is defined by the *include/cxxbridge.h*
file in this repo. You will need to include this header in your C++ code when
working with those types.

The following types are intended to be supported "soon" but are just not
implemented yet. I don't expect any of these to be hard to make work but it's a
matter of designing a nice API for each in its non-native language.

<table>
<tr><th>name in Rust</th><th>name in C++</th></tr>
<tr><td>&amp;[T]</td><td></td></tr>
<tr><td>Vec&lt;T&gt;</td><td></td></tr>
<tr><td>BTreeMap&lt;K, V&gt;</td><td></td></tr>
<tr><td>HashMap&lt;K, V&gt;</td><td></td></tr>
<tr><td></td><td>std::vector&lt;T&gt;</td></tr>
<tr><td></td><td>std::map&lt;K, V&gt;</td></tr>
<tr><td></td><td>std::unordered_map&lt;K, V&gt;</td></tr>
</table>

<br>

## Remaining work

This is still early days for CXX; I am releasing it as a minimum viable product
to collect feedback on the direction and invite collaborators. Here are some of
the facets that I still intend for this project to tackle:

- [ ] Support associated methods: `extern "Rust" { fn f(self: &Struct); }`
- [ ] Support C++ member functions
- [ ] Support passing function pointers across the FFI
- [ ] Support translating between `Result` and exceptions
- [ ] Support structs with type parameters
- [ ] Support async functions

On the build side, I don't have much experience with the `cc` crate so I expect
there may be someone who can suggest ways to make that aspect of this crate
friendlier or more robust. Please report issues if you run into trouble building
or linking any of this stuff.

Finally, I know more about Rust library design than C++ library design so I
would appreciate help making the C++ APIs in this project more idiomatic where
anyone has suggestions.

<br>

#### License

<sup>
Licensed under either of <a href="LICENSE-APACHE">Apache License, Version
2.0</a> or <a href="LICENSE-MIT">MIT license</a> at your option.
</sup>

<br>

<sub>
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in this project by you, as defined in the Apache-2.0 license,
shall be dual licensed as above, without any additional terms or conditions.
</sub>