CXX — safe FFI between Rust and C++
=========================================

[<img alt="github" src="https://img.shields.io/badge/github-dtolnay/cxx-8da0cb?style=for-the-badge&labelColor=555555&logo=github" height="20">](https://github.com/dtolnay/cxx)
[<img alt="crates.io" src="https://img.shields.io/crates/v/cxx.svg?style=for-the-badge&color=fc8d62&logo=rust" height="20">](https://crates.io/crates/cxx)
[<img alt="docs.rs" src="https://img.shields.io/badge/docs.rs-cxx-66c2a5?style=for-the-badge&labelColor=555555&logoColor=white&logo=" height="20">](https://docs.rs/cxx)
[<img alt="build status" src="https://img.shields.io/github/workflow/status/dtolnay/cxx/CI/master?style=for-the-badge" height="20">](https://github.com/dtolnay/cxx/actions?query=branch%3Amaster)

This library provides a **safe** mechanism for calling C++ code from Rust and
Rust code from C++, not subject to the many ways that things can go wrong when
using bindgen or cbindgen to generate unsafe C-style bindings.

This doesn't change the fact that 100% of C++ code is unsafe. When auditing a
project, you would be on the hook for auditing all the unsafe Rust code and
*all* the C++ code. The core safety claim under this new model is that auditing
just the C++ side would be sufficient to catch all problems, i.e. the Rust side
can be 100% safe.

```toml
[dependencies]
cxx = "1.0"

[build-dependencies]
cxx-build = "1.0"
```

*Compiler support: requires rustc 1.48+ and C++11 or newer*<br>
*[Release notes](https://github.com/dtolnay/cxx/releases)*

<br>

## Guide

Please see **<https://cxx.rs>** for a tutorial, reference material, and example
code.

<br>

## Overview

The idea is that we define the signatures of both sides of our FFI boundary
embedded together in one Rust module (the next section shows an example). From
this, CXX receives a complete picture of the boundary to perform static analyses
against the types and function signatures to uphold both Rust's and C++'s
invariants and requirements.

If everything checks out statically, then CXX uses a pair of code generators to
emit the relevant `extern "C"` signatures on both sides together with any
necessary static assertions for later in the build process to verify
correctness. On the Rust side this code generator is simply an attribute
procedural macro. On the C++ side it can be a small Cargo build script if your
build is managed by Cargo, or for other build systems like Bazel or Buck we
provide a command line tool which generates the header and source file and
should be easy to integrate.

The resulting FFI bridge operates at zero or negligible overhead, i.e. no
copying, no serialization, no memory allocation, no runtime checks needed.

The FFI signatures are able to use native types from whichever side they please,
such as Rust's `String` or C++'s `std::string`, Rust's `Box` or C++'s
`std::unique_ptr`, Rust's `Vec` or C++'s `std::vector`, etc. in any combination.
CXX guarantees an ABI-compatible signature that both sides understand, based on
builtin bindings for key standard library types to expose an idiomatic API on
those types to the other language. For example when manipulating a C++ string
from Rust, its `len()` method becomes a call of the `size()` member function
defined by C++; when manipulating a Rust string from C++, its `size()` member
function calls Rust's `len()`.

<br>

## Example

In this example we are writing a Rust application that wishes to take advantage
of an existing C++ client for a large-file blobstore service. The blobstore
supports a `put` operation for a discontiguous buffer upload. For example we
might be uploading snapshots of a circular buffer which would tend to consist of
2 chunks, or fragments of a file spread across memory for some other reason.
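
To make "discontiguous" concrete, here is a small sketch using the standard
library's `VecDeque`, a ring buffer whose contents may wrap around the end of
its allocation. It only illustrates where two-chunk data can come from; the
helper names `snapshot_chunks` and `wrapped_ring` are hypothetical and are not
part of the demo.

```rust
use std::collections::VecDeque;

/// Hypothetical helper (not part of the demo): copy the contiguous pieces of
/// a ring buffer into separate chunks without joining them.
fn snapshot_chunks(ring: &VecDeque<u8>) -> Vec<Vec<u8>> {
    // `as_slices` exposes the ring's contents as a front and a back slice.
    let (front, back) = ring.as_slices();
    [front, back]
        .iter()
        .filter(|chunk| !chunk.is_empty())
        .map(|chunk| chunk.to_vec())
        .collect()
}

/// Build a ring buffer whose logical contents are 3, 4, 5, 6. Popping from
/// the front and pushing at the back typically makes the contents wrap, but
/// exactly how they are split is an implementation detail of `VecDeque`.
fn wrapped_ring() -> VecDeque<u8> {
    let mut ring = VecDeque::with_capacity(4);
    ring.extend([1u8, 2, 3, 4]);
    ring.pop_front();
    ring.pop_front();
    ring.push_back(5);
    ring.push_back(6);
    ring
}
```

Joining the returned chunks in order always yields the logical contents,
however the ring happens to be laid out in memory.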

A runnable version of this example is provided under the *demo* directory of
this repo. To try it out, run `cargo run` from that directory.

```rust
#[cxx::bridge]
mod ffi {
    // Any shared structs, whose fields will be visible to both languages.
    struct BlobMetadata {
        size: usize,
        tags: Vec<String>,
    }

    extern "Rust" {
        // Zero or more opaque types which both languages can pass around but
        // only Rust can see the fields.
        type MultiBuf;

        // Functions implemented in Rust.
        fn next_chunk(buf: &mut MultiBuf) -> &[u8];
    }

    unsafe extern "C++" {
        // One or more headers with the matching C++ declarations. Our code
        // generators don't read it but it gets #include'd and used in static
        // assertions to ensure our picture of the FFI boundary is accurate.
        include!("demo/include/blobstore.h");

        // Zero or more opaque types which both languages can pass around but
        // only C++ can see the fields.
        type BlobstoreClient;

        // Functions implemented in C++.
        fn new_blobstore_client() -> UniquePtr<BlobstoreClient>;
        fn put(&self, parts: &mut MultiBuf) -> u64;
        fn tag(&self, blobid: u64, tag: &str);
        fn metadata(&self, blobid: u64) -> BlobMetadata;
    }
}
```

Now we simply provide Rust definitions of all the things in the `extern "Rust"`
block and C++ definitions of all the things in the `extern "C++"` block, and get
to call back and forth safely.
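
As a rough sketch of the Rust half, definitions matching the `extern "Rust"`
block could look like the following. The field names `chunks` and `pos` are
assumptions chosen for illustration; the demo sources in this repo are the
authoritative version.

```rust
// Sketch of the Rust-side definitions named in the `extern "Rust"` block.
// The field names are illustrative assumptions; C++ never sees them because
// `MultiBuf` is opaque on that side.
pub struct MultiBuf {
    chunks: Vec<Vec<u8>>,
    pos: usize,
}

// Hands out one chunk per call, then an empty slice once all chunks have
// been consumed.
pub fn next_chunk(buf: &mut MultiBuf) -> &[u8] {
    let next = buf.chunks.get(buf.pos);
    buf.pos += 1;
    next.map_or(&[], Vec::as_slice)
}
```

Note that neither item is declared `extern "C"` or `#[no_mangle]`; they are
ordinary Rust definitions living in the parent module of the bridge.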

Here are links to the complete set of source files involved in the demo:

- [demo/src/main.rs](demo/src/main.rs)
- [demo/build.rs](demo/build.rs)
- [demo/include/blobstore.h](demo/include/blobstore.h)
- [demo/src/blobstore.cc](demo/src/blobstore.cc)

To look at the code generated in both languages for the example by the CXX code
generators:

```console
  # run Rust code generator and print to stdout
  # (requires https://github.com/dtolnay/cargo-expand)
$ cargo expand --manifest-path demo/Cargo.toml

  # run C++ code generator and print to stdout
$ cargo run --manifest-path gen/cmd/Cargo.toml -- demo/src/main.rs
```

<br>

## Details

As seen in the example, the language of the FFI boundary involves 3 kinds of
items:

- **Shared structs** &mdash; their fields are made visible to both languages.
  The definition written within cxx::bridge is the single source of truth.

- **Opaque types** &mdash; their fields are secret from the other language.
  These cannot be passed across the FFI by value but only behind an indirection,
  such as a reference `&`, a Rust `Box`, or a `UniquePtr`. An opaque type can be
  a type alias for an arbitrarily complicated generic language-specific type
  depending on your use case.

- **Functions** &mdash; implemented in either language, callable from the other
  language.

Within the `extern "Rust"` part of the CXX bridge we list the types and
functions for which Rust is the source of truth. These all implicitly refer to
the `super` module, the parent module of the CXX bridge. You can think of the
two items listed in the example above as being like `use super::MultiBuf` and
`use super::next_chunk` except re-exported to C++. The parent module will either
contain the definitions directly for simple things, or contain the relevant
`use` statements to bring them into scope from elsewhere.
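
The name resolution described above can be mimicked in plain Rust without CXX.
In this sketch, `storage` is a made-up module and `bridge_like` is only a
hypothetical stand-in for a real `#[cxx::bridge]` module; it shows nothing more
than that items brought into the parent with `use` are reachable through
`super`.

```rust
// Defined "elsewhere" in the crate (hypothetical module name).
mod storage {
    pub struct MultiBuf {
        pub chunks: Vec<Vec<u8>>,
    }

    pub fn chunk_count(buf: &MultiBuf) -> usize {
        buf.chunks.len()
    }
}

// The parent module brings the items into scope with `use`...
use storage::{chunk_count, MultiBuf};

// ...so a child module can reach them through `super`, which is where the
// names listed in `extern "Rust"` are looked up. This module is a plain-Rust
// stand-in, not a real bridge.
mod bridge_like {
    pub fn describe(buf: &super::MultiBuf) -> String {
        format!("{} chunk(s)", super::chunk_count(buf))
    }
}
```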

Within the `extern "C++"` part, we list types and functions for which C++ is the
source of truth, as well as the header(s) that declare those APIs. In the future
it's possible that this section could be generated bindgen-style from the
headers but for now we need the signatures written out; static assertions will
verify that they are accurate.

Your function implementations themselves, whether in C++ or Rust, *do not* need
to be defined as `extern "C"` ABI or no\_mangle. CXX will put in the right shims
where necessary to make it all work.

<br>

## Comparison vs bindgen and cbindgen

Notice that with CXX there is repetition of all the function signatures: they
are typed out once where the implementation is defined (in C++ or Rust) and
again inside the cxx::bridge module, though compile-time assertions guarantee
these are kept in sync. This is different from [bindgen] and [cbindgen] where
function signatures are typed by a human once and the tool consumes them in one
language and emits them in the other language.

[bindgen]: https://github.com/rust-lang/rust-bindgen
[cbindgen]: https://github.com/eqrion/cbindgen/

This is because CXX fills a somewhat different role. It is a lower level tool
than bindgen or cbindgen in a sense; you can think of it as being a replacement
for the concept of `extern "C"` signatures as we know them, rather than a
replacement for bindgen. It would be reasonable to build a higher level
bindgen-like tool on top of CXX which consumes a C++ header and/or Rust module
(and/or IDL like Thrift) as source of truth and generates the cxx::bridge,
eliminating the repetition while leveraging the static analysis safety
guarantees of CXX.

But note in other ways CXX is higher level than the bindgens, with rich support
for common standard library types. Frequently with bindgen when we are dealing
with an idiomatic C++ API we would end up manually wrapping that API in C-style
raw pointer functions, applying bindgen to get unsafe raw pointer Rust
functions, and replicating the API again to expose those idiomatically in Rust.
That's a much worse form of repetition because it is unsafe all the way through.

By using a CXX bridge as the shared understanding between the languages, rather
than `extern "C"` C-style signatures as the shared understanding, common FFI use
cases become expressible using 100% safe code.

It would also be reasonable to mix and match, using CXX bridge for the 95% of
your FFI that is straightforward and doing the remaining few oddball signatures
the old fashioned way with bindgen and cbindgen, if for some reason CXX's static
restrictions get in the way. Please file an issue if you end up taking this
approach so that we know in what ways it would be worthwhile to make the tool
more expressive.

<br>

## Cargo-based setup

For builds that are orchestrated by Cargo, you will use a build script that runs
CXX's C++ code generator and compiles the resulting C++ code along with any
other C++ code for your crate.

The canonical build script is as follows. The indicated line returns a
[`cc::Build`] instance (from the usual widely used `cc` crate) on which you can
set up any additional source files and compiler flags as normal.

[`cc::Build`]: https://docs.rs/cc/1.0/cc/struct.Build.html

```toml
# Cargo.toml

[build-dependencies]
cxx-build = "1.0"
```

```rust
// build.rs

fn main() {
    cxx_build::bridge("src/main.rs")  // returns a cc::Build
        .file("src/demo.cc")
        .flag_if_supported("-std=c++11")
        .compile("cxxbridge-demo");

    println!("cargo:rerun-if-changed=src/main.rs");
    println!("cargo:rerun-if-changed=src/demo.cc");
    println!("cargo:rerun-if-changed=include/demo.h");
}
```

<br>

## Non-Cargo setup

For use in non-Cargo builds like Bazel or Buck, CXX provides an alternate way of
invoking the C++ code generator as a standalone command line tool. The tool is
packaged as the `cxxbridge-cmd` crate on crates.io or can be built from the
*gen/cmd* directory of this repo.

```bash
$ cargo install cxxbridge-cmd

$ cxxbridge src/main.rs --header > path/to/mybridge.h
$ cxxbridge src/main.rs > path/to/mybridge.cc
```

<br>

## Safety

Be aware that the design of this library is intentionally restrictive and
opinionated! It isn't a goal to be powerful enough to handle arbitrary
signatures in either language. Instead this project is about carving out a
reasonably expressive set of functionality about which we can make useful safety
guarantees today and maybe extend over time. You may find that it takes some
practice to use CXX bridge effectively as it won't work in all the ways that you
are used to.

Some of the considerations that go into ensuring safety are:

- By design, our paired code generators work together to control both sides of
  the FFI boundary. Ordinarily in Rust writing your own `extern "C"` blocks is
  unsafe because the Rust compiler has no way to know whether the signatures
  you've written actually match the signatures implemented in the other
  language. With CXX we achieve that visibility and know what's on the other
  side.

- Our static analysis detects and prevents passing types by value that shouldn't
  be passed by value from C++ to Rust, for example because they may contain
  internal pointers that would be screwed up by Rust's move behavior.

- To many people's surprise, it is possible to have a struct in Rust and a
  struct in C++ with exactly the same layout / fields / alignment / everything,
  and still not the same ABI when passed by value. This is a longstanding
  bindgen bug that leads to segfaults in absolutely correct-looking code
  ([rust-lang/rust-bindgen#778]). CXX knows about this and can insert the
  necessary zero-cost workaround transparently where needed, so go ahead and
  pass your structs by value without worries. This is made possible by owning
  both sides of the boundary rather than just one.

- Template instantiations: for example in order to expose a UniquePtr\<T\> type
  in Rust backed by a real C++ unique\_ptr, we have a way of using a Rust trait
  to connect the behavior back to the template instantiations performed by the
  other language.

[rust-lang/rust-bindgen#778]: https://github.com/rust-lang/rust-bindgen/issues/778
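
For contrast with the first consideration in the list above, this is roughly
what a hand-written binding looks like without CXX. The sketch declares the C
library's `strlen`, which is assumed to be available from the platform's libc;
`c_string_len` is a hypothetical wrapper name. Nothing checks the declared
signature against the real one, which is why the call has to be wrapped in
`unsafe`.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// Hand-written declaration. The compiler trusts this signature blindly; if it
// disagreed with the real C function, the result would be undefined behavior.
// (`unsafe extern` needs Rust 1.82 or newer; older compilers spell this as a
// plain `extern "C"` block.)
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Hypothetical wrapper: the caller-facing API is safe only because we
/// manually uphold the contract of a valid, NUL-terminated pointer.
fn c_string_len(text: &str) -> Option<usize> {
    // `CString::new` fails if `text` contains an interior NUL byte.
    let c_text = CString::new(text).ok()?;
    // SAFETY: `c_text` is a valid NUL-terminated string that outlives the call.
    Some(unsafe { strlen(c_text.as_ptr()) })
}
```

With a CXX bridge, the equivalent declaration is checked against the C++
header by static assertions, so no `unsafe` block is needed at the call site.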

<br>

## Builtin types

In addition to all the primitive types (i32 &lt;=&gt; int32_t), the following
common types may be used in the fields of shared structs and the arguments and
returns of functions.

<table>
<tr><th>name in Rust</th><th>name in C++</th><th>restrictions</th></tr>
<tr><td>String</td><td>rust::String</td><td></td></tr>
<tr><td>&amp;str</td><td>rust::Str</td><td></td></tr>
<tr><td>&amp;[T]</td><td>rust::Slice&lt;const T&gt;</td><td><sup><i>cannot hold opaque Rust or C++ type</i></sup></td></tr>
<tr><td>&amp;mut [T]</td><td>rust::Slice&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust or C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.CxxString.html">CxxString</a></td><td>std::string</td><td><sup><i>cannot be passed by value</i></sup></td></tr>
<tr><td>Box&lt;T&gt;</td><td>rust::Box&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.UniquePtr.html">UniquePtr&lt;T&gt;</a></td><td>std::unique_ptr&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.SharedPtr.html">SharedPtr&lt;T&gt;</a></td><td>std::shared_ptr&lt;T&gt;</td><td><sup><i>cannot hold opaque Rust type</i></sup></td></tr>
<tr><td>[T; N]</td><td>std::array&lt;T, N&gt;</td><td><sup><i>cannot hold opaque Rust or C++ type</i></sup></td></tr>
<tr><td>Vec&lt;T&gt;</td><td>rust::Vec&lt;T&gt;</td><td><sup><i>cannot hold opaque C++ type</i></sup></td></tr>
<tr><td><a href="https://docs.rs/cxx/1.0/cxx/struct.CxxVector.html">CxxVector&lt;T&gt;</a></td><td>std::vector&lt;T&gt;</td><td><sup><i>cannot be passed by value, cannot hold opaque Rust type</i></sup></td></tr>
<tr><td>fn(T, U) -&gt; V</td><td>rust::Fn&lt;V(T, U)&gt;</td><td><sup><i>only passing from Rust to C++ is implemented so far</i></sup></td></tr>
<tr><td>Result&lt;T&gt;</td><td>throw/catch</td><td><sup><i>allowed as return type only</i></sup></td></tr>
</table>

The C++ API of the `rust` namespace is defined by the *include/cxx.h* file in
this repo. You will need to include this header in your C++ code when working
with those types.

The following types are intended to be supported "soon" but are just not
implemented yet. I don't expect any of these to be hard to make work but it's a
matter of designing a nice API for each in its non-native language.

<table>
<tr><th>name in Rust</th><th>name in C++</th></tr>
<tr><td>BTreeMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>HashMap&lt;K, V&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>Arc&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td>Option&lt;T&gt;</td><td><sup><i>tbd</i></sup></td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::map&lt;K, V&gt;</td></tr>
<tr><td><sup><i>tbd</i></sup></td><td>std::unordered_map&lt;K, V&gt;</td></tr>
</table>

<br>

## Remaining work

This is still early days for CXX; I am releasing it as a minimum viable product
to collect feedback on the direction and invite collaborators. Please check the
open issues.

Especially please report issues if you run into trouble building or linking any
of this stuff. I'm sure there are ways to make the build aspects friendlier or
more robust.

Finally, I know more about Rust library design than C++ library design so I
would appreciate help making the C++ APIs in this project more idiomatic where
anyone has suggestions.

<br>

#### License

<sup>
Licensed under either of <a href="LICENSE-APACHE">Apache License, Version
2.0</a> or <a href="LICENSE-MIT">MIT license</a> at your option.
</sup>

<br>

<sub>
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in this project by you, as defined in the Apache-2.0 license,
shall be dual licensed as above, without any additional terms or conditions.
</sub>