CXX — safe FFI between Rust and C++
=========================================

[<img alt="github" src="https://img.shields.io/badge/github-dtolnay/cxx-8da0cb?style=for-the-badge&labelColor=555555&logo=github" height="20">](https://github.com/dtolnay/cxx)
[<img alt="crates.io" src="https://img.shields.io/crates/v/cxx.svg?style=for-the-badge&color=fc8d62&logo=rust" height="20">](https://crates.io/crates/cxx)
[<img alt="docs.rs" src="https://img.shields.io/badge/docs.rs-cxx-66c2a5?style=for-the-badge&labelColor=555555&logoColor=white&logo=" height="20">](https://docs.rs/cxx)
[<img alt="build status" src="https://img.shields.io/github/workflow/status/dtolnay/cxx/CI/master?style=for-the-badge" height="20">](https://github.com/dtolnay/cxx/actions?query=branch%3Amaster)

This library provides a **safe** mechanism for calling C++ code from Rust and
Rust code from C++, not subject to the many ways that things can go wrong when
using bindgen or cbindgen to generate unsafe C-style bindings.

This doesn't change the fact that 100% of C++ code is unsafe. When auditing a
project, you would be on the hook for auditing all the unsafe Rust code and
*all* the C++ code. The core safety claim under this new model is that auditing
just the C++ side would be sufficient to catch all problems, i.e. the Rust side
can be 100% safe.

```toml
[dependencies]
cxx = "0.3"
```

*Compiler support: requires rustc 1.42+ and C++11 or newer*<br>
*[Release notes](https://github.com/dtolnay/cxx/releases)*

<br>

## Overview

The idea is that we define the signatures of both sides of our FFI boundary
embedded together in one Rust module (the next section shows an example). From
this, CXX receives a complete picture of the boundary to perform static analyses
against the types and function signatures to uphold both Rust's and C++'s
invariants and requirements.

If everything checks out statically, then CXX uses a pair of code generators to
emit the relevant `extern "C"` signatures on both sides together with any
necessary static assertions for later in the build process to verify
correctness. On the Rust side this code generator is simply an attribute
procedural macro. On the C++ side it can be a small Cargo build script if your
build is managed by Cargo, or for other build systems like Bazel or Buck we
provide a command line tool which generates the header and source file and
should be easy to integrate.
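
To give a feel for the static assertions mentioned above, here is a minimal sketch in plain Rust using only the standard library. `SharedPoint` and `shared_point_layout` are hypothetical names invented for illustration, and this is not the code CXX generates; it only shows how a layout assumption can be turned into a build failure instead of a runtime surprise.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical shared struct. `repr(C)` gives it a C-compatible field layout.
#[repr(C)]
pub struct SharedPoint {
    pub x: i32,
    pub y: i32,
}

// Evaluated at compile time: if either condition is false, the build fails.
// The expected values assume a mainstream target where `i32` is 4-byte aligned.
const _: () = assert!(size_of::<SharedPoint>() == 8);
const _: () = assert!(align_of::<SharedPoint>() == 4);

/// Returns `(size, alignment)` so the same facts can also be inspected at runtime.
pub fn shared_point_layout() -> (usize, usize) {
    (size_of::<SharedPoint>(), align_of::<SharedPoint>())
}
```

If a field were later added to `SharedPoint` without updating the expected size, compilation would stop at the first `const` assertion.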

The resulting FFI bridge operates at zero or negligible overhead, i.e. no
copying, no serialization, no memory allocation, no runtime checks needed.

The FFI signatures are able to use native types from whichever side they please,
such as Rust's `String` or C++'s `std::string`, Rust's `Box` or C++'s
`std::unique_ptr`, Rust's `Vec` or C++'s `std::vector`, etc. in any combination.
CXX guarantees an ABI-compatible signature that both sides understand, based on
builtin bindings for key standard library types to expose an idiomatic API on
those types to the other language. For example when manipulating a C++ string
from Rust, its `len()` method becomes a call of the `size()` member function
defined by C++; when manipulating a Rust string from C++, its `size()` member
function calls Rust's `len()`.
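
The naming translation described above can be pictured with a small mock. This sketch does not touch C++ at all: `MockCxxString` and `mock_cxx_size` are hypothetical stand-ins for a C++ `std::string` and its `size()` member function, used only to show the shape of an idiomatic Rust method that forwards to the other language's operation.

```rust
/// Hypothetical stand-in for a C++ `std::string`. A real binding would refer
/// to the actual C++ object instead of owning a Rust `Vec<u8>`.
pub struct MockCxxString {
    bytes: Vec<u8>,
}

/// Stand-in for the C++ member function `std::string::size()`.
fn mock_cxx_size(s: &MockCxxString) -> usize {
    s.bytes.len()
}

impl MockCxxString {
    pub fn new(text: &str) -> Self {
        MockCxxString {
            bytes: text.as_bytes().to_vec(),
        }
    }

    /// Rust callers see the idiomatic name `len()`; the work is done by the
    /// "C++" `size()` function above.
    pub fn len(&self) -> usize {
        mock_cxx_size(self)
    }

    pub fn is_empty(&self) -> bool {
        self.len() == 0
    }
}
```

The point of the pattern is that callers on each side keep the vocabulary of their own language while a single implementation does the work.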

<br>

## Example

A runnable version of this example is provided under the *demo-rs* directory of
this repo (with the C++ side of the implementation in the *demo-cxx* directory).
To try it out, jump into demo-rs and run `cargo run`.

```rust
#[cxx::bridge]
mod ffi {
    // Any shared structs, whose fields will be visible to both languages.
    struct SharedThing {
        z: i32,
        y: Box<ThingR>,
        x: UniquePtr<ThingC>,
    }

    extern "C" {
        // One or more headers with the matching C++ declarations. Our code
        // generators don't read it but it gets #include'd and used in static
        // assertions to ensure our picture of the FFI boundary is accurate.
        include!("demo-cxx/demo.h");

        // Zero or more opaque types which both languages can pass around but
        // only C++ can see the fields.
        type ThingC;

        // Functions implemented in C++.
        fn make_demo(appname: &str) -> UniquePtr<ThingC>;
        fn get_name(thing: &ThingC) -> &CxxString;
        fn do_thing(state: SharedThing);
    }

    extern "Rust" {
        // Zero or more opaque types which both languages can pass around but
        // only Rust can see the fields.
        type ThingR;

        // Functions implemented in Rust.
        fn print_r(r: &ThingR);
    }
}
```

Now we simply provide C++ definitions of all the things in the `extern "C"`
block and Rust definitions of all the things in the `extern "Rust"` block, and
get to call back and forth safely.

Here are links to the complete set of source files involved in the demo:

- [demo-rs/src/main.rs](demo-rs/src/main.rs)
- [demo-rs/build.rs](demo-rs/build.rs)
- [demo-cxx/demo.h](demo-cxx/demo.h)
- [demo-cxx/demo.cc](demo-cxx/demo.cc)

To look at the code generated in both languages for the example by the CXX code
generators:

```console
  # run Rust code generator and print to stdout
  # (requires https://github.com/dtolnay/cargo-expand)
$ cargo expand --manifest-path demo-rs/Cargo.toml

  # run C++ code generator and print to stdout
$ cargo run --manifest-path gen/cmd/Cargo.toml -- demo-rs/src/main.rs
```

<br>

## Details

As seen in the example, the language of the FFI boundary involves 3 kinds of
items:

- **Shared structs** &mdash; their fields are made visible to both languages.
  The definition written within cxx::bridge is the single source of truth.

- **Opaque types** &mdash; their fields are secret from the other language.
  These cannot be passed across the FFI by value but only behind an indirection,
  such as a reference `&`, a Rust `Box`, or a `UniquePtr`. Can be a type alias
  for an arbitrarily complicated generic language-specific type depending on
  your use case.

- **Functions** &mdash; implemented in either language, callable from the other
  language.

Within the `extern "C"` part of the CXX bridge we list the types and functions
for which C++ is the source of truth, as well as the header(s) that declare
those APIs. In the future it's possible that this section could be generated
bindgen-style from the headers but for now we need the signatures written out;
static assertions will verify that they are accurate.

Within the `extern "Rust"` part, we list types and functions for which Rust is
the source of truth. These all implicitly refer to the `super` module, the
parent module of the CXX bridge. You can think of the two items listed in the
example above as being like `use super::ThingR` and `use super::print_r` except
re-exported to C++. The parent module will either contain the definitions
directly for simple things, or contain the relevant `use` statements to bring
them into scope from elsewhere.

Your function implementations themselves, whether in C++ or Rust, *do not* need
to be defined as `extern "C"` ABI or no\_mangle. CXX will put in the right shims
where necessary to make it all work.
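
As a sketch of what that means for the Rust side of the example, the definitions sitting next to the bridge module can be ordinary Rust items. The tuple-struct shape of `ThingR` and the `describe_r` function are assumptions made for illustration; `describe_r` is a hypothetical variant of `print_r` that returns its message instead of printing it, so the behavior is easy to check.

```rust
/// Opaque from the C++ point of view: only Rust can see the field.
/// (The tuple-struct shape is an assumption for this sketch.)
pub struct ThingR(pub usize);

/// An ordinary Rust function: no `extern "C"` ABI and no `#[no_mangle]`.
/// In a bridge, an item like this would be listed in the `extern "Rust"` block.
pub fn describe_r(r: &ThingR) -> String {
    format!("called back with r={}", r.0)
}
```

Nothing in these definitions mentions FFI; the boundary is described only in the bridge module.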

<br>

## Comparison vs bindgen and cbindgen

Notice that with CXX there is repetition of all the function signatures: they
are typed out once where the implementation is defined (in C++ or Rust) and
again inside the cxx::bridge module, though compile-time assertions guarantee
these are kept in sync. This is different from [bindgen] and [cbindgen] where
function signatures are typed by a human once and the tool consumes them in one
language and emits them in the other language.

[bindgen]: https://github.com/rust-lang/rust-bindgen
[cbindgen]: https://github.com/eqrion/cbindgen/

This is because CXX fills a somewhat different role. It is a lower level tool
than bindgen or cbindgen in a sense; you can think of it as being a replacement
for the concept of `extern "C"` signatures as we know them, rather than a
replacement for bindgen. It would be reasonable to build a higher level
bindgen-like tool on top of CXX which consumes a C++ header and/or Rust module
(and/or IDL like Thrift) as source of truth and generates the cxx::bridge,
eliminating the repetition while leveraging the static analysis safety
guarantees of CXX.

But note in other ways CXX is higher level than the bindgens, with rich support
for common standard library types. Frequently with bindgen when we are dealing
with an idiomatic C++ API we would end up manually wrapping that API in C-style
raw pointer functions, applying bindgen to get unsafe raw pointer Rust
functions, and replicating the API again to expose those idiomatically in Rust.
That's a much worse form of repetition because it is unsafe all the way through.
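
For contrast, this is roughly what a hand-written C-style raw pointer layer tends to look like when written in Rust. `Counter` and the `counter_*` functions are hypothetical names, and a real export would also need an unmangled symbol name, which is left out to keep the sketch self-contained.

```rust
/// Hypothetical type whose API is being exposed through raw pointers.
pub struct Counter {
    count: u32,
}

/// Hands ownership to the caller as a raw pointer.
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { count: 0 }))
}

/// # Safety
/// `ptr` must come from `counter_new` and must not have been freed.
pub unsafe extern "C" fn counter_increment(ptr: *mut Counter) -> u32 {
    let counter = unsafe { &mut *ptr };
    counter.count += 1;
    counter.count
}

/// # Safety
/// `ptr` must come from `counter_new` and must be freed at most once.
pub unsafe extern "C" fn counter_free(ptr: *mut Counter) {
    if !ptr.is_null() {
        // Rebuild the Box so that its destructor releases the allocation.
        drop(unsafe { Box::from_raw(ptr) });
    }
}
```

Every use of the pointer relies on rules that only the documentation states: a double free or a use after free compiles without complaint.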

By using a CXX bridge as the shared understanding between the languages, rather
than `extern "C"` C-style signatures as the shared understanding, common FFI use
cases become expressible using 100% safe code.

It would also be reasonable to mix and match, using CXX bridge for the 95% of
your FFI that is straightforward and doing the remaining few oddball signatures
the old-fashioned way with bindgen and cbindgen, if for some reason CXX's static
restrictions get in the way. Please file an issue if you end up taking this
approach so that we know what ways it would be worthwhile to make the tool more
expressive.

<br>

## Cargo-based setup

For builds that are orchestrated by Cargo, you will use a build script that runs
CXX's C++ code generator and compiles the resulting C++ code along with any
other C++ code for your crate.

The canonical build script is as follows. The indicated line returns a
[`cc::Build`] instance (from the widely used `cc` crate) on which you can
set up any additional source files and compiler flags as normal.

[`cc::Build`]: https://docs.rs/cc/1.0/cc/struct.Build.html

```toml
# Cargo.toml

[build-dependencies]
cxx-build = "0.3"
```

```rust
// build.rs

fn main() {
    cxx_build::bridge("src/main.rs")  // returns a cc::Build
        .file("../demo-cxx/demo.cc")
        .flag_if_supported("-std=c++11")
        .compile("cxxbridge-demo");

    // Re-run this build script whenever the bridge or the C++ sources change.
    println!("cargo:rerun-if-changed=src/main.rs");
    println!("cargo:rerun-if-changed=../demo-cxx/demo.h");
    println!("cargo:rerun-if-changed=../demo-cxx/demo.cc");
}
```

<br>

## Non-Cargo setup

For use in non-Cargo builds like Bazel or Buck, CXX provides an alternate way of
invoking the C++ code generator as a standalone command line tool. The tool is
packaged as the `cxxbridge-cmd` crate on crates.io or can be built from the
*gen/cmd* directory of this repo.

```bash
$ cargo install cxxbridge-cmd

$ cxxbridge src/main.rs --header > path/to/mybridge.h
$ cxxbridge src/main.rs > path/to/mybridge.cc
```

<br>

## Safety

Be aware that the design of this library is intentionally restrictive and
opinionated! It isn't a goal to be powerful enough to handle arbitrary
signatures in either language. Instead this project is about carving out a
reasonably expressive set of functionality about which we can make useful safety
guarantees today and maybe extend over time. You may find that it takes some
practice to use CXX bridge effectively as it won't work in all the ways that you
are used to.

Some of the considerations that go into ensuring safety are:

- By design, our paired code generators work together to control both sides of
  the FFI boundary. Ordinarily in Rust writing your own `extern "C"` blocks is
  unsafe because the Rust compiler has no way to know whether the signatures
  you've written actually match the signatures implemented in the other
  language. With CXX we achieve that visibility and know what's on the other
  side.

- Our static analysis detects and prevents passing types by value that shouldn't
  be passed by value from C++ to Rust, for example because they may contain
  internal pointers that would be screwed up by Rust's move behavior.

- To many people's surprise, it is possible to have a struct in Rust and a
  struct in C++ with exactly the same layout / fields / alignment / everything,
  and still not the same ABI when passed by value. This is a longstanding
  bindgen bug that leads to segfaults in absolutely correct-looking code
  ([rust-lang/rust-bindgen#778]). CXX knows about this and can insert the
  necessary zero-cost workaround transparently where needed, so go ahead and
  pass your structs by value without worries. This is made possible by owning
  both sides of the boundary rather than just one.

- Template instantiations: for example in order to expose a UniquePtr\<T\> type
  in Rust backed by a real C++ unique\_ptr, we have a way of using a Rust trait
  to connect the behavior back to the template instantiations performed by the
  other language.

[rust-lang/rust-bindgen#778]: https://github.com/rust-lang/rust-bindgen/issues/778
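
As a rough illustration of the by-value ABI point in the list above, the sketch below shows one way a shim can sidestep the calling convention for structs: pass the struct across the boundary behind a pointer and read it on the other side. `Pair`, `total`, and `total_shim` are hypothetical names, and whether the shims generated by CXX take exactly this shape is an assumption, not something stated in this README.

```rust
/// Hypothetical plain-data struct shared by value. `repr(C)` fixes its field layout.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Pair {
    pub tag: u8,
    pub value: u64,
}

/// The function as its author wants to write it: ordinary Rust, by value.
pub fn total(p: Pair) -> u64 {
    u64::from(p.tag) + p.value
}

/// Hypothetical shim: the argument crosses the boundary behind a pointer, so
/// the by-value calling convention for `Pair` never comes into play.
///
/// # Safety
/// `p` must point to a valid, initialized `Pair`.
pub unsafe extern "C" fn total_shim(p: *const Pair) -> u64 {
    total(unsafe { *p })
}
```

The user-facing signature stays by value on both sides; only the hidden shim changes how the bytes travel.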

<br>

## Builtin types

In addition to all the primitive types (i32 &lt;=&gt; int32_t), the following
common types may be used in the fields of shared structs and the arguments and
returns of functions.

<table>
<tr><th>name in Rust</th><th>name in C++</th><th>restrictions</th></tr>
<tr><td>String</td><td>rust::String</td><td></td></tr>
<tr><td>&amp;str</td><td>rust::Str</td><td></td></tr>
<tr><td>&amp;[u8]</td><td>rust::Slice&lt;uint8_t&gt;</td><td><sup><i>arbitrary &amp;[T] not implemented yet</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.3/cxx/struct.CxxString.html">CxxString</a></td><td>std::string</td><td><sup><i>cannot be passed by value</i></sup></td></tr>
<tr><td>Box&lt;T&gt;</td><td>rust::Box&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.3/cxx/struct.UniquePtr.html">UniquePtr&lt;T&gt;</a></td><td>std::unique_ptr&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust type</i></sup></td></tr>
<tr><td>Vec&lt;T&gt;</td><td>rust::Vec&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/0.3/cxx/struct.CxxVector.html">CxxVector&lt;T&gt;</a></td><td>std::vector&lt;T&gt;</td><td><sup><i>cannot be passed by value, cannot hold opaque Rust type</i></sup></td></tr>
<tr><td>fn(T, U) -&gt; V</td><td>rust::Fn&lt;V(T, U)&gt;</td><td><sup><i>only passing from Rust to C++ is implemented so far</i></sup></td></tr>
<tr><td>Result&lt;T&gt;</td><td>throw/catch</td><td><sup><i>allowed as return type only</i></sup></td></tr>
</table>

The C++ API of the `rust` namespace is defined by the *include/cxx.h* file in
this repo. You will need to include this header in your C++ code when working
with those types.
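
To illustrate the `Result<T>` row, here is a sketch of a fallible Rust function of the kind that could be listed in an `extern "Rust"` block. `parse_port` is a hypothetical name; the assumption is that the bridge would declare it with a `Result<u16>` return type, so that an `Err` reaches C++ as a thrown exception, per the table above.

```rust
use std::num::ParseIntError;

/// Hypothetical function implemented in Rust. On the Rust side it is an
/// ordinary fallible function returning `Result`.
pub fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    // Surrounding whitespace is tolerated; anything that is not a u16 is an error.
    text.trim().parse::<u16>()
}
```

Rust callers handle the error with `?` or `match` as usual; the translation to throw/catch only matters to callers on the other side of the boundary.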

The following types are intended to be supported "soon" but are just not
implemented yet. I don't expect any of these to be hard to make work but it's a
matter of designing a nice API for each in its non-native language.

<table>
<tr><th>name in Rust</th><th>name in C++</th></tr>
<tr><td>BTreeMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>HashMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>Arc&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::map&lt;K, V&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::unordered_map&lt;K, V&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::shared_ptr&lt;T&gt;</td></tr>
</table>

<br>

## Remaining work

This is still early days for CXX; I am releasing it as a minimum viable product
to collect feedback on the direction and invite collaborators. Please check the
open issues.

On the build side, I don't have much experience with the `cc` crate so I expect
there may be someone who can suggest ways to make that aspect of this crate
friendlier or more robust. Please report issues if you run into trouble building
or linking any of this stuff.

Finally, I know more about Rust library design than C++ library design so I
would appreciate help making the C++ APIs in this project more idiomatic where
anyone has suggestions.

<br>

#### License

<sup>
Licensed under either of <a href="LICENSE-APACHE">Apache License, Version
2.0</a> or <a href="LICENSE-MIT">MIT license</a> at your option.
</sup>

<br>

<sub>
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in this project by you, as defined in the Apache-2.0 license,
shall be dual licensed as above, without any additional terms or conditions.
</sub>