CXX — safe FFI between Rust and C++
=========================================

[![Build Status](https://api.travis-ci.com/dtolnay/cxx.svg?branch=master)](https://travis-ci.com/dtolnay/cxx)
[![Latest Version](https://img.shields.io/crates/v/cxx.svg)](https://crates.io/crates/cxx)
[![Rust Documentation](https://img.shields.io/badge/api-rustdoc-blue.svg)](https://docs.rs/cxx)

This library provides a **safe** mechanism for calling C++ code from Rust and
Rust code from C++, not subject to the many ways that things can go wrong when
using bindgen or cbindgen to generate unsafe C-style bindings.

This doesn't change the fact that 100% of C++ code is unsafe. When auditing a
project, you would be on the hook for auditing all the unsafe Rust code and
*all* the C++ code. The core safety claim under this new model is that auditing
just the C++ side would be sufficient to catch all problems, i.e. the Rust side
can be 100% safe.

```toml
[dependencies]
cxx = "0.2"
```

*Compiler support: requires rustc 1.42+*

<br>

## Overview

The idea is that we define the signatures of both sides of our FFI boundary
embedded together in one Rust module (the next section shows an example). From
this, CXX receives a complete picture of the boundary to perform static analyses
against the types and function signatures to uphold both Rust's and C++'s
invariants and requirements.

If everything checks out statically, then CXX uses a pair of code generators to
emit the relevant `extern "C"` signatures on both sides together with any
necessary static assertions for later in the build process to verify
correctness. On the Rust side this code generator is simply an attribute
procedural macro. On the C++ side it can be a small Cargo build script if your
build is managed by Cargo, or for other build systems like Bazel or Buck we
provide a command line tool which generates the header and source file and
should be easy to integrate.

The resulting FFI bridge operates at zero or negligible overhead, i.e. no
copying, no serialization, no memory allocation, no runtime checks needed.

The FFI signatures are able to use native types from whichever side they please,
such as Rust's `String` or C++'s `std::string`, Rust's `Box` or C++'s
`std::unique_ptr`, Rust's `Vec` or C++'s `std::vector`, etc. in any combination.
CXX guarantees an ABI-compatible signature that both sides understand, based on
builtin bindings for key standard library types to expose an idiomatic API on
those types to the other language. For example when manipulating a C++ string
from Rust, its `len()` method becomes a call of the `size()` member function
defined by C++; when manipulating a Rust string from C++, its `size()` member
function calls Rust's `len()`.

<br>

## Example

A runnable version of this example is provided under the *demo-rs* directory of
this repo (with the C++ side of the implementation in the *demo-cxx* directory).
To try it out, jump into demo-rs and run `cargo run`.

```rust
#[cxx::bridge]
mod ffi {
    // Any shared structs, whose fields will be visible to both languages.
    struct SharedThing {
        z: i32,
        y: Box<ThingR>,
        x: UniquePtr<ThingC>,
    }

    extern "C" {
        // One or more headers with the matching C++ declarations. Our code
        // generators don't read it but it gets #include'd and used in static
        // assertions to ensure our picture of the FFI boundary is accurate.
        include!("demo-cxx/demo.h");

        // Zero or more opaque types which both languages can pass around but
        // only C++ can see the fields.
        type ThingC;

        // Functions implemented in C++.
        fn make_demo(appname: &str) -> UniquePtr<ThingC>;
        fn do_thing(state: SharedThing);

        // Methods implemented in C++.
        fn get_name(self: &ThingC) -> &CxxString;
    }

    extern "Rust" {
        // Zero or more opaque types which both languages can pass around but
        // only Rust can see the fields.
        type ThingR;

        // Functions implemented in Rust.
        fn print_r(r: &ThingR);

        // Methods implemented in Rust.
        fn print(self: &ThingR);
    }
}
```

Now we simply provide C++ definitions of all the things in the `extern "C"`
block and Rust definitions of all the things in the `extern "Rust"` block, and
get to call back and forth safely.

Here are links to the complete set of source files involved in the demo:

- [demo-rs/src/main.rs](demo-rs/src/main.rs)
- [demo-rs/build.rs](demo-rs/build.rs)
- [demo-cxx/demo.h](demo-cxx/demo.h)
- [demo-cxx/demo.cc](demo-cxx/demo.cc)

To look at the code generated in both languages for the example by the CXX code
generators:

```console
  # run Rust code generator and print to stdout
  # (requires https://github.com/dtolnay/cargo-expand)
$ cargo expand --manifest-path demo-rs/Cargo.toml

  # run C++ code generator and print to stdout
$ cargo run --manifest-path cmd/Cargo.toml -- demo-rs/src/main.rs
```

<br>

## Details

As seen in the example, the language of the FFI boundary involves 3 kinds of
items:

- **Shared structs** &mdash; their fields are made visible to both languages.
  The definition written within cxx::bridge is the single source of truth.

- **Opaque types** &mdash; their fields are secret from the other language.
  These cannot be passed across the FFI by value but only behind an indirection,
  such as a reference `&`, a Rust `Box`, or a `UniquePtr`. An opaque type can be
  a type alias for an arbitrarily complicated generic language-specific type
  depending on your use case.

- **Functions** &mdash; implemented in either language, callable from the other
  language.

Within the `extern "C"` part of the CXX bridge we list the types and functions
for which C++ is the source of truth, as well as the header(s) that declare
those APIs. In the future it's possible that this section could be generated
bindgen-style from the headers but for now we need the signatures written out;
static assertions will verify that they are accurate.

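To give a feel for what a static assertion buys us, here is a minimal sketch of a compile-time layout check written in plain Rust. It is only an analogy: it is not the code CXX generates, and the `Point` struct is a hypothetical example type. The point is that a mismatch fails the build instead of surfacing at runtime.

```rust
use std::mem::{align_of, size_of};

// Hypothetical shared struct, for illustration only. `#[repr(C)]` gives it a
// C-compatible field order and padding.
#[repr(C)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

// Evaluated at compile time. If a condition is false, the array length on the
// right becomes 0, no longer matches the declared length of 1, and the build
// fails.
const _: [(); 1] = [(); (size_of::<Point>() == 2 * size_of::<i32>()) as usize];
const _: [(); 1] = [(); (align_of::<Point>() == align_of::<i32>()) as usize];

fn main() {
    println!(
        "Point is {} bytes, aligned to {}",
        size_of::<Point>(),
        align_of::<Point>()
    );
}
```
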
Within the `extern "Rust"` part, we list types and functions for which Rust is
the source of truth. These all implicitly refer to the `super` module, the
parent module of the CXX bridge. You can think of the two items listed in the
example above as being like `use super::ThingR` and `use super::print_r` except
re-exported to C++. The parent module will either contain the definitions
directly for simple things, or contain the relevant `use` statements to bring
them into scope from elsewhere.

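The `use super::...` analogy can be sketched in plain Rust without the bridge macro. This is a stand-in only: the field inside `ThingR` and the body of `print_r` are assumptions made for illustration, and an ordinary `pub use` re-exports to other Rust modules, not to C++.

```rust
// Parent module: Rust is the source of truth for these items.
// The `usize` field is hypothetical; C++ would never see it.
pub struct ThingR(pub usize);

pub fn print_r(r: &ThingR) {
    println!("called back with r={}", r.0);
}

// Stand-in for the bridge module. A real `#[cxx::bridge]` module would list
// `type ThingR;` and `fn print_r(r: &ThingR);` inside `extern "Rust"` rather
// than writing a `use` statement.
mod ffi {
    pub use super::{print_r, ThingR};
}

fn main() {
    let thing = ffi::ThingR(7);
    ffi::print_r(&thing);
}
```
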
Your function implementations themselves, whether in C++ or Rust, *do not* need
to be defined with the `extern "C"` ABI or marked `#[no_mangle]`. CXX will put
in the right shims where necessary to make it all work.

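For intuition, here is a hand-written sketch of the general kind of shim that would otherwise have to be written manually. It is not the code CXX emits: the names `value_of` and `demo_shim_value_of` are hypothetical, and a shim meant to be linked from C++ would also need a stable, unmangled symbol name, which is omitted here. The implementation stays ordinary safe Rust and only the thin wrapper deals with the C ABI and raw pointers.

```rust
// Ordinary safe Rust implementation: no `extern "C"`, no `#[no_mangle]`.
pub struct ThingR(pub usize);

pub fn value_of(r: &ThingR) -> usize {
    r.0
}

// Hypothetical shim exposing the function over the C ABI. The caller must
// pass a non-null pointer to a live `ThingR`.
pub unsafe extern "C" fn demo_shim_value_of(r: *const ThingR) -> usize {
    // Turn the raw pointer back into a reference and delegate to the safe
    // implementation.
    unsafe { value_of(&*r) }
}

fn main() {
    let thing = ThingR(42);
    // Call through the C ABI entry point, the way foreign code would.
    let got = unsafe { demo_shim_value_of(&thing) };
    println!("shim returned {}", got);
}
```
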
<br>

## Comparison vs bindgen and cbindgen

Notice that with CXX there is repetition of all the function signatures: they
are typed out once where the implementation is defined (in C++ or Rust) and
again inside the cxx::bridge module, though compile-time assertions guarantee
these are kept in sync. This is different from [bindgen] and [cbindgen] where
function signatures are typed by a human once and the tool consumes them in one
language and emits them in the other language.

[bindgen]: https://github.com/rust-lang/rust-bindgen
[cbindgen]: https://github.com/eqrion/cbindgen/

This is because CXX fills a somewhat different role. It is a lower level tool
than bindgen or cbindgen in a sense; you can think of it as being a replacement
for the concept of `extern "C"` signatures as we know them, rather than a
replacement for bindgen. It would be reasonable to build a higher level
bindgen-like tool on top of CXX which consumes a C++ header and/or Rust module
(and/or IDL like Thrift) as source of truth and generates the cxx::bridge,
eliminating the repetition while leveraging the static analysis safety
guarantees of CXX.

But note in other ways CXX is higher level than the bindgens, with rich support
for common standard library types. Frequently with bindgen when we are dealing
with an idiomatic C++ API we would end up manually wrapping that API in C-style
raw pointer functions, applying bindgen to get unsafe raw pointer Rust
functions, and replicating the API again to expose those idiomatically in Rust.
That's a much worse form of repetition because it is unsafe all the way through.

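The layering described above can be sketched in plain Rust. Everything here is hypothetical: `RawCounter` and the `counter_*` functions stand in for a C-style wrapper that would normally be written in C++ and declared in Rust by bindgen, and they are implemented in Rust only so that the sketch runs on its own. Note how the same small API is spelled out twice, with `unsafe` at every call into the raw layer.

```rust
// Stand-in for the C-style raw pointer API wrapped around an idiomatic C++
// class. With bindgen these would be foreign declarations, all `unsafe`.
pub struct RawCounter {
    count: u64,
}

pub unsafe extern "C" fn counter_new() -> *mut RawCounter {
    Box::into_raw(Box::new(RawCounter { count: 0 }))
}

pub unsafe extern "C" fn counter_increment(c: *mut RawCounter) -> u64 {
    unsafe {
        (*c).count += 1;
        (*c).count
    }
}

pub unsafe extern "C" fn counter_free(c: *mut RawCounter) {
    // Reclaim the allocation made in `counter_new`.
    unsafe { drop(Box::from_raw(c)) }
}

// The same API replicated once more as a safe, idiomatic Rust wrapper.
pub struct Counter {
    raw: *mut RawCounter,
}

impl Counter {
    pub fn new() -> Self {
        Counter { raw: unsafe { counter_new() } }
    }

    pub fn increment(&mut self) -> u64 {
        unsafe { counter_increment(self.raw) }
    }
}

impl Drop for Counter {
    fn drop(&mut self) {
        unsafe { counter_free(self.raw) }
    }
}

fn main() {
    let mut counter = Counter::new();
    counter.increment();
    println!("count = {}", counter.increment());
}
```
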
By using a CXX bridge as the shared understanding between the languages, rather
than `extern "C"` C-style signatures as the shared understanding, common FFI use
cases become expressible using 100% safe code.

It would also be reasonable to mix and match, using CXX bridge for the 95% of
your FFI that is straightforward and doing the remaining few oddball signatures
the old fashioned way with bindgen and cbindgen, if for some reason CXX's static
restrictions get in the way. Please file an issue if you end up taking this
approach so that we know what ways it would be worthwhile to make the tool more
expressive.

<br>

## Cargo-based setup

For builds that are orchestrated by Cargo, you will use a build script that runs
CXX's C++ code generator and compiles the resulting C++ code along with any
other C++ code for your crate.

The canonical build script is as follows. The indicated line returns a
[`cc::Build`] instance (from the usual widely used `cc` crate) on which you can
set up any additional source files and compiler flags as normal.

[`cc::Build`]: https://docs.rs/cc/1.0/cc/struct.Build.html

```rust
// build.rs

fn main() {
    cxx::Build::new()
        .bridge("src/main.rs") // returns a cc::Build
        .file("../demo-cxx/demo.cc")
        .flag("-std=c++11")
        .compile("cxxbridge-demo");

    println!("cargo:rerun-if-changed=src/main.rs");
    println!("cargo:rerun-if-changed=../demo-cxx/demo.h");
    println!("cargo:rerun-if-changed=../demo-cxx/demo.cc");
}
```

<br>

## Non-Cargo setup

For use in non-Cargo builds like Bazel or Buck, CXX provides an alternate way of
invoking the C++ code generator as a standalone command line tool. The tool is
packaged as the `cxxbridge-cmd` crate on crates.io or can be built from the
*cmd* directory of this repo.

```bash
$ cargo install cxxbridge-cmd

$ cxxbridge src/main.rs --header > path/to/mybridge.h
$ cxxbridge src/main.rs > path/to/mybridge.cc
```

<br>

## Safety

Be aware that the design of this library is intentionally restrictive and
opinionated! It isn't a goal to be powerful enough to handle arbitrary
signatures in either language. Instead this project is about carving out a
reasonably expressive set of functionality about which we can make useful safety
guarantees today and maybe extend over time. You may find that it takes some
practice to use CXX bridge effectively as it won't work in all the ways that you
are used to.

Some of the considerations that go into ensuring safety are:

- By design, our paired code generators work together to control both sides of
  the FFI boundary. Ordinarily in Rust writing your own `extern "C"` blocks is
  unsafe because the Rust compiler has no way to know whether the signatures
  you've written actually match the signatures implemented in the other
  language. With CXX we achieve that visibility and know what's on the other
  side.

- Our static analysis detects and prevents passing types by value that shouldn't
  be passed by value from C++ to Rust, for example because they may contain
  internal pointers that would be screwed up by Rust's move behavior.

- To many people's surprise, it is possible to have a struct in Rust and a
  struct in C++ with exactly the same layout / fields / alignment / everything,
  and still not the same ABI when passed by value. This is a longstanding
  bindgen bug that leads to segfaults in absolutely correct-looking code
  ([rust-lang/rust-bindgen#778]). CXX knows about this and can insert the
  necessary zero-cost workaround transparently where needed, so go ahead and
  pass your structs by value without worries. This is made possible by owning
  both sides of the boundary rather than just one.

- Template instantiations: for example in order to expose a UniquePtr\<T\> type
  in Rust backed by a real C++ unique\_ptr, we have a way of using a Rust trait
  to connect the behavior back to the template instantiations performed by the
  other language.

[rust-lang/rust-bindgen#778]: https://github.com/rust-lang/rust-bindgen/issues/778

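The internal-pointer hazard mentioned in the list above can be reproduced in plain Rust. In this minimal sketch the `SelfRef` type is hypothetical and stands in for a C++ type that stores a pointer into itself. A Rust move is a plain bitwise copy, so nothing updates the stored pointer when the value changes address.

```rust
// A type that stores a pointer to one of its own fields, as some C++ types
// do internally.
struct SelfRef {
    value: u32,
    ptr_to_value: *const u32,
}

impl SelfRef {
    /// True while the stored pointer still refers to this instance's field.
    fn is_consistent(&self) -> bool {
        std::ptr::eq(self.ptr_to_value, &self.value)
    }
}

fn main() {
    let mut on_stack = SelfRef {
        value: 5,
        ptr_to_value: std::ptr::null(),
    };
    on_stack.ptr_to_value = &on_stack.value;
    println!("before move: consistent = {}", on_stack.is_consistent());

    // Moving into a Box copies the bytes to the heap. No move constructor
    // runs, so the stored pointer keeps the old stack address.
    let on_heap = Box::new(on_stack);
    println!("after move: consistent = {}", on_heap.is_consistent());
}
```
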
<br>

## Builtin types

In addition to all the primitive types (i32 &lt;=&gt; int32_t), the following
common types may be used in the fields of shared structs and the arguments and
returns of functions.

<table>
<tr><th>name in Rust</th><th>name in C++</th><th>restrictions</th></tr>
<tr><td>String</td><td>rust::String</td><td></td></tr>
<tr><td>&amp;str</td><td>rust::Str</td><td></td></tr>
<tr><td>&amp;[u8]</td><td>rust::Slice&lt;uint8_t&gt;</td><td><sup><i>arbitrary &amp;[T] not implemented yet</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.2/cxx/struct.CxxString.html">CxxString</a></td><td>std::string</td><td><sup><i>cannot be passed by value</i></sup></td></tr>
<tr><td>Box&lt;T&gt;</td><td>rust::Box&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.2/cxx/struct.UniquePtr.html">UniquePtr&lt;T&gt;</a></td><td>std::unique_ptr&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust type</i></sup></td></tr>
<tr><td>fn(T, U) -&gt; V</td><td>rust::Fn&lt;V(T, U)&gt;</td><td><sup><i>only passing from Rust to C++ is implemented so far</i></sup></td></tr>
<tr><td>Result&lt;T&gt;</td><td>throw/catch</td><td><sup><i>allowed as return type only</i></sup></td></tr>
</table>

The C++ API of the `rust` namespace is defined by the *include/cxx.h* file in
this repo. You will need to include this header in your C++ code when working
with those types.

The following types are intended to be supported "soon" but are just not
implemented yet. I don't expect any of these to be hard to make work but it's a
matter of designing a nice API for each in its non-native language.

<table>
<tr><th>name in Rust</th><th>name in C++</th></tr>
<tr><td>Vec&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>BTreeMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>HashMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>Arc&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::vector&lt;T&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::map&lt;K, V&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::unordered_map&lt;K, V&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::shared_ptr&lt;T&gt;</td></tr>
</table>

<br>

## Remaining work

This is still early days for CXX; I am releasing it as a minimum viable product
to collect feedback on the direction and invite collaborators. Here are some of
the facets that I still intend for this project to tackle:

- [ ] Support structs with type parameters
- [ ] Support async functions

On the build side, I don't have much experience with the `cc` crate so I expect
there may be someone who can suggest ways to make that aspect of this crate
friendlier or more robust. Please report issues if you run into trouble building
or linking any of this stuff.

Finally, I know more about Rust library design than C++ library design so I
would appreciate help making the C++ APIs in this project more idiomatic where
anyone has suggestions.

<br>

#### License

<sup>
Licensed under either of <a href="LICENSE-APACHE">Apache License, Version
2.0</a> or <a href="LICENSE-MIT">MIT license</a> at your option.
</sup>

<br>

<sub>
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in this project by you, as defined in the Apache-2.0 license,
shall be dual licensed as above, without any additional terms or conditions.
</sub>