CXX — safe FFI between Rust and C++
=========================================

[<img alt="github" src="https://img.shields.io/badge/github-dtolnay/cxx-8da0cb?style=for-the-badge&labelColor=555555&logo=github" height="20">](https://github.com/dtolnay/cxx)
[<img alt="crates.io" src="https://img.shields.io/crates/v/cxx.svg?style=for-the-badge&color=fc8d62&logo=rust" height="20">](https://crates.io/crates/cxx)
[<img alt="docs.rs" src="https://img.shields.io/badge/docs.rs-cxx-66c2a5?style=for-the-badge&labelColor=555555&logoColor=white&logo=" height="20">](https://docs.rs/cxx)
[<img alt="build status" src="https://img.shields.io/github/workflow/status/dtolnay/cxx/CI/master?style=for-the-badge" height="20">](https://github.com/dtolnay/cxx/actions?query=branch%3Amaster)

This library provides a **safe** mechanism for calling C++ code from Rust and
Rust code from C++, not subject to the many ways that things can go wrong when
using bindgen or cbindgen to generate unsafe C-style bindings.

This doesn't change the fact that 100% of C++ code is unsafe. When auditing a
project, you would be on the hook for auditing all the unsafe Rust code and
*all* the C++ code. The core safety claim under this new model is that auditing
just the C++ side would be sufficient to catch all problems, i.e. the Rust side
can be 100% safe.

```toml
[dependencies]
cxx = "0.4"

[build-dependencies]
cxx-build = "0.4"
```

*Compiler support: requires rustc 1.42+ and c++11 or newer*<br>
*[Release notes](https://github.com/dtolnay/cxx/releases)*

<br>

## Overview

The idea is that we define the signatures of both sides of our FFI boundary
embedded together in one Rust module (the next section shows an example). From
this, CXX receives a complete picture of the boundary to perform static analyses
against the types and function signatures to uphold both Rust's and C++'s
invariants and requirements.

If everything checks out statically, then CXX uses a pair of code generators to
emit the relevant `extern "C"` signatures on both sides together with any
necessary static assertions for later in the build process to verify
correctness. On the Rust side this code generator is simply an attribute
procedural macro. On the C++ side it can be a small Cargo build script if your
build is managed by Cargo, or for other build systems like Bazel or Buck we
provide a command line tool which generates the header and source file and
should be easy to integrate.

The resulting FFI bridge operates at zero or negligible overhead, i.e. no
copying, no serialization, no memory allocation, no runtime checks needed.

The FFI signatures are able to use native types from whichever side they please,
such as Rust's `String` or C++'s `std::string`, Rust's `Box` or C++'s
`std::unique_ptr`, Rust's `Vec` or C++'s `std::vector`, etc. in any combination.
CXX guarantees an ABI-compatible signature that both sides understand, based on
builtin bindings for key standard library types to expose an idiomatic API on
those types to the other language. For example when manipulating a C++ string
from Rust, its `len()` method becomes a call of the `size()` member function
defined by C++; when manipulating a Rust string from C++, its `size()` member
function calls Rust's `len()`.

<br>

## Example

A runnable version of this example is provided under the *demo* directory of
this repo. To try it out, run `cargo run` from that directory.

```rust
#[cxx::bridge]
mod ffi {
    // Any shared structs, whose fields will be visible to both languages.
    struct SharedThing {
        z: i32,
        y: Box<ThingR>,
        x: UniquePtr<ThingC>,
    }

    extern "C" {
        // One or more headers with the matching C++ declarations. Our code
        // generators don't read it but it gets #include'd and used in static
        // assertions to ensure our picture of the FFI boundary is accurate.
        include!("demo/include/demo.h");

        // Zero or more opaque types which both languages can pass around but
        // only C++ can see the fields.
        type ThingC;

        // Functions implemented in C++.
        fn make_demo(appname: &str) -> UniquePtr<ThingC>;
        fn get_name(thing: &ThingC) -> &CxxString;
        fn do_thing(state: SharedThing);
    }

    extern "Rust" {
        // Zero or more opaque types which both languages can pass around but
        // only Rust can see the fields.
        type ThingR;

        // Functions implemented in Rust.
        fn print_r(r: &ThingR);
    }
}
```

Now we simply provide C++ definitions of all the things in the `extern "C"`
block and Rust definitions of all the things in the `extern "Rust"` block, and
get to call back and forth safely.

Here are links to the complete set of source files involved in the demo:

- [demo/src/main.rs](demo/src/main.rs)
- [demo/build.rs](demo/build.rs)
- [demo/include/demo.h](demo/include/demo.h)
- [demo/src/demo.cc](demo/src/demo.cc)

To look at the code generated in both languages for the example by the CXX code
generators:

```console
# run Rust code generator and print to stdout
# (requires https://github.com/dtolnay/cargo-expand)
$ cargo expand --manifest-path demo/Cargo.toml

# run C++ code generator and print to stdout
$ cargo run --manifest-path gen/cmd/Cargo.toml -- demo/src/main.rs
```

<br>

## Details

As seen in the example, the language of the FFI boundary involves 3 kinds of
items:

- **Shared structs** &mdash; their fields are made visible to both languages.
  The definition written within cxx::bridge is the single source of truth.

- **Opaque types** &mdash; their fields are secret from the other language.
  These cannot be passed across the FFI by value but only behind an indirection,
  such as a reference `&`, a Rust `Box`, or a `UniquePtr`. Can be a type alias
  for an arbitrarily complicated generic language-specific type depending on
  your use case.

- **Functions** &mdash; implemented in either language, callable from the other
  language.

Within the `extern "C"` part of the CXX bridge we list the types and functions
for which C++ is the source of truth, as well as the header(s) that declare
those APIs. In the future it's possible that this section could be generated
bindgen-style from the headers but for now we need the signatures written out;
static assertions will verify that they are accurate.

Within the `extern "Rust"` part, we list types and functions for which Rust is
the source of truth. These all implicitly refer to the `super` module, the
parent module of the CXX bridge. You can think of the two items listed in the
example above as being like `use super::ThingR` and `use super::print_r` except
re-exported to C++. The parent module will either contain the definitions
directly for simple things, or contain the relevant `use` statements to bring
them into scope from elsewhere.
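
The `use super::...` analogy can be tried out with plain Rust modules, without
the CXX macro at all. The sketch below only illustrates the name resolution:
the `ffi` module here is an ordinary module standing in for the bridge,
`render_r` is a hypothetical helper added so the result can be inspected, and
nothing is actually exported to C++.

```rust
// Definitions live in the parent module, as the text describes.
pub struct ThingR(pub usize);

// Hypothetical helper (not part of the demo) that returns the text to print.
pub fn render_r(r: &ThingR) -> String {
    format!("r={}", r.0)
}

// Same signature as the `print_r` listed in the example's `extern "Rust"` block.
pub fn print_r(r: &ThingR) {
    println!("{}", render_r(r));
}

// Plain-Rust stand-in for the bridge module: it names the parent's items
// through `super`, which is how the `extern "Rust"` items are resolved.
mod ffi {
    pub use super::print_r;
    pub use super::ThingR;
}

fn main() {
    let r = ffi::ThingR(333);
    ffi::print_r(&r);
}
```

If `ThingR` were defined in some other module, a `use` statement in the parent
module bringing it into scope would serve the same purpose.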

Your function implementations themselves, whether in C++ or Rust, *do not* need
to be defined as `extern "C"` ABI or no\_mangle. CXX will put in the right shims
where necessary to make it all work.
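
For contrast, this is roughly the kind of C-ABI shim one would otherwise write
by hand. It is a sketch for illustration only, not the code that CXX generates;
`describe_r` and `describe_r_shim` are hypothetical names, and the shim is
called from Rust here purely to keep the example self-contained.

```rust
pub struct ThingR(usize);

// Ordinary safe Rust: the kind of function you would list in `extern "Rust"`.
fn describe_r(r: &ThingR) -> usize {
    r.0
}

// Hand-written shim with the C calling convention. A shim that is really
// exported to another language would also need an unmangled symbol name.
unsafe extern "C" fn describe_r_shim(r: *const ThingR) -> usize {
    // SAFETY: the caller must pass a non-null pointer to a live `ThingR`.
    let r = unsafe { &*r };
    describe_r(r)
}

fn main() {
    let thing = ThingR(333);
    // The caller has to uphold the pointer contract manually.
    let value = unsafe { describe_r_shim(&thing) };
    println!("shim returned {}", value);
}
```

Both the raw pointer and the `unsafe` call site are bookkeeping that the bridge
takes off your hands.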

<br>

## Comparison vs bindgen and cbindgen

Notice that with CXX there is repetition of all the function signatures: they
are typed out once where the implementation is defined (in C++ or Rust) and
again inside the cxx::bridge module, though compile-time assertions guarantee
these are kept in sync. This is different from [bindgen] and [cbindgen] where
function signatures are typed by a human once and the tool consumes them in one
language and emits them in the other language.

[bindgen]: https://github.com/rust-lang/rust-bindgen
[cbindgen]: https://github.com/eqrion/cbindgen/

This is because CXX fills a somewhat different role. It is a lower level tool
than bindgen or cbindgen in a sense; you can think of it as being a replacement
for the concept of `extern "C"` signatures as we know them, rather than a
replacement for a bindgen. It would be reasonable to build a higher level
bindgen-like tool on top of CXX which consumes a C++ header and/or Rust module
(and/or IDL like Thrift) as source of truth and generates the cxx::bridge,
eliminating the repetition while leveraging the static analysis safety
guarantees of CXX.

But note in other ways CXX is higher level than the bindgens, with rich support
for common standard library types. Frequently with bindgen when we are dealing
with an idiomatic C++ API we would end up manually wrapping that API in C-style
raw pointer functions, applying bindgen to get unsafe raw pointer Rust
functions, and replicating the API again to expose those idiomatically in Rust.
That's a much worse form of repetition because it is unsafe all the way through.

By using a CXX bridge as the shared understanding between the languages, rather
than `extern "C"` C-style signatures as the shared understanding, common FFI use
cases become expressible using 100% safe code.

It would also be reasonable to mix and match, using CXX bridge for the 95% of
your FFI that is straightforward and doing the remaining few oddball signatures
the old fashioned way with bindgen and cbindgen, if for some reason CXX's static
restrictions get in the way. Please file an issue if you end up taking this
approach so that we know what ways it would be worthwhile to make the tool more
expressive.

<br>

## Cargo-based setup

For builds that are orchestrated by Cargo, you will use a build script that runs
CXX's C++ code generator and compiles the resulting C++ code along with any
other C++ code for your crate.

The canonical build script is as follows. The indicated line returns a
[`cc::Build`] instance (from the usual widely used `cc` crate) on which you can
set up any additional source files and compiler flags as normal.

[`cc::Build`]: https://docs.rs/cc/1.0/cc/struct.Build.html

```toml
# Cargo.toml

[build-dependencies]
cxx-build = "0.4"
```

```rust
// build.rs

fn main() {
    cxx_build::bridge("src/main.rs") // returns a cc::Build
        .file("src/demo.cc")
        .flag_if_supported("-std=c++11")
        .compile("cxxbridge-demo");

    println!("cargo:rerun-if-changed=src/main.rs");
    println!("cargo:rerun-if-changed=src/demo.cc");
    println!("cargo:rerun-if-changed=include/demo.h");
}
```

<br>

## Non-Cargo setup

For use in non-Cargo builds like Bazel or Buck, CXX provides an alternate way of
invoking the C++ code generator as a standalone command line tool. The tool is
packaged as the `cxxbridge-cmd` crate on crates.io or can be built from the
*gen/cmd* directory of this repo.

```bash
$ cargo install cxxbridge-cmd

$ cxxbridge src/main.rs --header > path/to/mybridge.h
$ cxxbridge src/main.rs > path/to/mybridge.cc
```

<br>

## Safety

Be aware that the design of this library is intentionally restrictive and
opinionated! It isn't a goal to be powerful enough to handle arbitrary
signatures in either language. Instead this project is about carving out a
reasonably expressive set of functionality about which we can make useful safety
guarantees today and maybe extend over time. You may find that it takes some
practice to use CXX bridge effectively as it won't work in all the ways that you
are used to.

Some of the considerations that go into ensuring safety are:

- By design, our paired code generators work together to control both sides of
  the FFI boundary. Ordinarily in Rust writing your own `extern "C"` blocks is
  unsafe because the Rust compiler has no way to know whether the signatures
  you've written actually match the signatures implemented in the other
  language. With CXX we achieve that visibility and know what's on the other
  side.

- Our static analysis detects and prevents passing types by value that shouldn't
  be passed by value from C++ to Rust, for example because they may contain
  internal pointers that would be screwed up by Rust's move behavior.

- To many people's surprise, it is possible to have a struct in Rust and a
  struct in C++ with exactly the same layout / fields / alignment / everything,
  and still not the same ABI when passed by value. This is a longstanding
  bindgen bug that leads to segfaults in absolutely correct-looking code
  ([rust-lang/rust-bindgen#778]). CXX knows about this and can insert the
  necessary zero-cost workaround transparently where needed, so go ahead and
  pass your structs by value without worries. This is made possible by owning
  both sides of the boundary rather than just one.

- Template instantiations: for example in order to expose a UniquePtr\<T\> type
  in Rust backed by a real C++ unique\_ptr, we have a way of using a Rust trait
  to connect the behavior back to the template instantiations performed by the
  other language.

[rust-lang/rust-bindgen#778]: https://github.com/rust-lang/rust-bindgen/issues/778
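
To illustrate the first consideration, here is a minimal hand-written binding
to the C standard library's `abs`, independent of CXX. The compiler takes the
declared signature on trust, which is why the call needs `unsafe`. `call_abs`
is a hypothetical wrapper name, and the `unsafe extern` spelling assumes a
recent toolchain (Rust 1.82 or newer); older compilers write the block as plain
`extern "C"`.

```rust
use std::os::raw::c_int;

// Hand-written declaration of C's `int abs(int)`. Nothing verifies that it
// matches the real function; a wrong signature here would still compile.
unsafe extern "C" {
    fn abs(x: c_int) -> c_int;
}

// Safe wrapper that rules out the one input for which C's `abs` is undefined.
fn call_abs(x: c_int) -> Option<c_int> {
    if x == c_int::MIN {
        return None;
    }
    // SAFETY: the declaration above matches the C prototype, and the
    // overflowing input has been excluded.
    Some(unsafe { abs(x) })
}

fn main() {
    println!("{:?}", call_abs(-3));
}
```

The burden of keeping that declaration accurate falls entirely on the author,
which is the gap the paired code generators are meant to close.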

<br>

## Builtin types

In addition to all the primitive types (i32 &lt;=&gt; int32_t), the following
common types may be used in the fields of shared structs and the arguments and
returns of functions.

<table>
<tr><th>name in Rust</th><th>name in C++</th><th>restrictions</th></tr>
<tr><td>String</td><td>rust::String</td><td></td></tr>
<tr><td>&amp;str</td><td>rust::Str</td><td></td></tr>
<tr><td>&amp;[u8]</td><td>rust::Slice&lt;uint8_t&gt;</td><td><sup><i>arbitrary &amp;[T] not implemented yet</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.4/cxx/struct.CxxString.html">CxxString</a></td><td>std::string</td><td><sup><i>cannot be passed by value</i></sup></td></tr>
<tr><td>Box&lt;T&gt;</td><td>rust::Box&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.4/cxx/struct.UniquePtr.html">UniquePtr&lt;T&gt;</a></td><td>std::unique_ptr&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust type</i></sup></td></tr>
<tr><td>Vec&lt;T&gt;</td><td>rust::Vec&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.4/cxx/struct.CxxVector.html">CxxVector&lt;T&gt;</a></td><td>std::vector&lt;T&gt;</td><td><sup><i>cannot be passed by value, cannot hold opaque Rust type</i></sup></td></tr>
<tr><td>fn(T, U) -&gt; V</td><td>rust::Fn&lt;V(T, U)&gt;</td><td><sup><i>only passing from Rust to C++ is implemented so far</i></sup></td></tr>
<tr><td>Result&lt;T&gt;</td><td>throw/catch</td><td><sup><i>allowed as return type only</i></sup></td></tr>
</table>

The C++ API of the `rust` namespace is defined by the *include/cxx.h* file in
this repo. You will need to include this header in your C++ code when working
with those types.

The following types are intended to be supported "soon" but are just not
implemented yet. I don't expect any of these to be hard to make work but it's a
matter of designing a nice API for each in its non-native language.

<table>
<tr><th>name in Rust</th><th>name in C++</th></tr>
<tr><td>BTreeMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>HashMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>Arc&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>Option&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::map&lt;K, V&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::unordered_map&lt;K, V&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::shared_ptr&lt;T&gt;</td></tr>
</table>

<br>

## Remaining work

This is still early days for CXX; I am releasing it as a minimum viable product
to collect feedback on the direction and invite collaborators. Please check the
open issues.

Especially please report issues if you run into trouble building or linking any
of this stuff. I'm sure there are ways to make the build aspects friendlier or
more robust.

Finally, I know more about Rust library design than C++ library design so I
would appreciate help making the C++ APIs in this project more idiomatic where
anyone has suggestions.

<br>

#### License

<sup>
Licensed under either of <a href="LICENSE-APACHE">Apache License, Version
2.0</a> or <a href="LICENSE-MIT">MIT license</a> at your option.
</sup>

<br>

<sub>
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in this project by you, as defined in the Apache-2.0 license,
shall be dual licensed as above, without any additional terms or conditions.
</sub>