.. _module-pw_protobuf:

===========
pw_protobuf
===========
The protobuf module provides a lightweight interface for encoding and decoding
the Protocol Buffer wire format.

.. note::

  The protobuf module is a work in progress. Wire format encoding and decoding
  is supported, though the APIs are not final. C++ code generation exists for
  encoding, but not decoding.

Design
======
Unlike other protobuf libraries, which typically provide in-memory data
structures to represent protobuf messages, ``pw_protobuf`` operates directly on
the wire format and leaves data storage to the user. This has a few benefits.
The primary one is that it allows the library to be incredibly small, with the
encoder and decoder each having a code size of around 1.5K and negligible RAM
usage. Users can choose the tradeoffs most suitable for their product on top of
this core implementation.

``pw_protobuf`` also provides zero-overhead C++ code generation which wraps its
low-level wire format operations with a user-friendly API for processing
specific protobuf messages. The code generation integrates with Pigweed's GN
build system.

Configuration
=============
``pw_protobuf`` supports the following configuration options.

* ``PW_PROTOBUF_CFG_MAX_VARINT_SIZE``:
  When encoding nested messages, the number of bytes to reserve for the varint
  submessage length. Nested messages are limited in size to the maximum value
  that can be varint-encoded into this reserved space.

  The values that can be set, and their corresponding maximum submessage
  lengths, are outlined below.

  +-------------------+----------------------------------------+
  | MAX_VARINT_SIZE   | Maximum submessage length              |
  +===================+========================================+
  | 1 byte            | 127                                    |
  +-------------------+----------------------------------------+
  | 2 bytes           | 16,383 or < 16KiB                      |
  +-------------------+----------------------------------------+
  | 3 bytes           | 2,097,151 or < 2048KiB                 |
  +-------------------+----------------------------------------+
  | 4 bytes (default) | 268,435,455 or < 256MiB                |
  +-------------------+----------------------------------------+
  | 5 bytes           | 4,294,967,295 or < 4GiB (max uint32_t) |
  +-------------------+----------------------------------------+
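
The limits in the table follow from the varint encoding itself: each byte
carries 7 bits of payload, so ``N`` reserved bytes can hold lengths up to
``2^(7 * N) - 1``, capped at the largest ``uint32_t`` when 5 bytes are
reserved. The following is a minimal sketch of that arithmetic.
``MaxSubmessageLength`` is a hypothetical helper written for illustration; it
is not part of ``pw_protobuf``.

.. Code:: cpp

  #include <cstddef>
  #include <cstdint>
  #include <limits>

  // Hypothetical helper (not a pw_protobuf API): returns the largest
  // submessage length that fits in `varint_bytes` reserved bytes. Each varint
  // byte holds 7 payload bits; the result is capped at the maximum uint32_t.
  // Intended for the values in the table above (1 through 5).
  constexpr uint64_t MaxSubmessageLength(size_t varint_bytes) {
    const uint64_t max_encodable = (uint64_t{1} << (7 * varint_bytes)) - 1;
    const uint64_t max_u32 = std::numeric_limits<uint32_t>::max();
    return max_encodable < max_u32 ? max_encodable : max_u32;
  }

  static_assert(MaxSubmessageLength(1) == 127);
  static_assert(MaxSubmessageLength(4) == 268'435'455);
  static_assert(MaxSubmessageLength(5) == 4'294'967'295);

Evaluating the helper for each row reproduces the second column of the table.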

========
Encoding
========

Usage
=====
Pigweed's protobuf encoders encode directly to the wire format of a proto rather
than staging information to a mutable data structure. This means any writes of
a value are final, and can't be referenced or modified as a later step in the
encode process.
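
To illustrate what encoding directly to the wire format means, the following
sketch appends a single varint field to a byte buffer: a key of
``(field_number << 3) | wire_type`` followed by the value, both written as
base-128 varints. ``AppendVarint`` and ``AppendUint32Field`` are hypothetical
names used for illustration only; they are not ``pw_protobuf`` APIs, and the
sketch uses a ``std::vector`` purely for brevity.

.. Code:: cpp

  #include <cstdint>
  #include <vector>

  // Appends `value` as a base-128 varint: 7 bits per byte, least significant
  // group first, with the high bit set on every byte except the last.
  inline void AppendVarint(std::vector<uint8_t>& out, uint64_t value) {
    while (value >= 0x80) {
      out.push_back(static_cast<uint8_t>((value & 0x7f) | 0x80));
      value >>= 7;
    }
    out.push_back(static_cast<uint8_t>(value));
  }

  // Appends a uint32 field: the key (field number plus wire type 0, varint),
  // then the value. Once appended, the bytes are never revisited.
  inline void AppendUint32Field(std::vector<uint8_t>& out,
                                uint32_t field_number,
                                uint32_t value) {
    AppendVarint(out, (uint64_t{field_number} << 3) | 0);
    AppendVarint(out, value);
  }

For example, field number 1 with the value 150 encodes to the three bytes
``08 96 01``. Because each call only appends, there is no in-memory message
object to read back or modify afterwards.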

MemoryEncoder
=============
A MemoryEncoder directly encodes a proto to an in-memory buffer.

.. Code:: cpp

  // Writes a proto response to the provided buffer, returning the encode
  // status and number of bytes written.
  StatusWithSize WriteProtoResponse(ByteSpan response) {
    // All proto writes are directly written to the `response` buffer.
    MemoryEncoder encoder(response);
    encoder.WriteUint32(kMagicNumberField, 0x1a1a2b2b);
    encoder.WriteString(kFavoriteFood, "cookies");
    return StatusWithSize(encoder.status(), encoder.size());
  }

StreamEncoder
=============
pw_protobuf's ``StreamEncoder`` class operates on ``pw::stream::Writer``
objects to serialize proto data. This means you can directly encode a proto to
something like ``pw::sys_io`` without needing to build the complete message in
memory first.

.. Code:: cpp

  #include "pw_bytes/span.h"
  #include "pw_log/log.h"
  #include "pw_protobuf/encoder.h"
  #include "pw_stream/sys_io_stream.h"

  pw::stream::SysIoWriter sys_io_writer;
  pw::protobuf::StreamEncoder my_proto_encoder(sys_io_writer,
                                               pw::ByteSpan());

  // Once this line returns, the field has been written to the Writer.
  my_proto_encoder.WriteInt64(kTimestampFieldNumber, system::GetUnixEpoch());

  // There's no intermediate buffering when writing a string directly to a
  // StreamEncoder.
  my_proto_encoder.WriteString(kWelcomeMessageFieldNumber,
                               "Welcome to Pigweed!");
  if (!my_proto_encoder.status().ok()) {
    PW_LOG_INFO("Failed to encode proto; %s", my_proto_encoder.status().str());
  }

Nested submessages
==================
Writing proto messages with nested submessages requires buffering due to
limitations of the proto format. Every proto submessage must know the size of
the submessage before its final serialization can begin. A streaming encoder can
be passed a scratch buffer to use when constructing nested messages. All
submessage data is buffered to this scratch buffer until the submessage is
finalized. Note that the contents of this scratch buffer are not necessarily
valid proto data, so don't try to use them directly.

MemoryEncoder objects use the final destination buffer rather than relying on a
scratch buffer. Note that this means your destination buffer might need
additional space for overhead incurred by nesting submessages. The
``MaxScratchBufferSize()`` helper function can be useful in estimating how much
space to allocate to account for nested submessage encoding overhead.

.. Code:: cpp

  #include "pw_bytes/span.h"
  #include "pw_log/log.h"
  #include "pw_protobuf/encoder.h"
  #include "pw_stream/sys_io_stream.h"

  pw::stream::SysIoWriter sys_io_writer;
  // The scratch buffer should be at least as big as the largest nested
  // submessage. It's a good idea to be a little generous.
  std::byte submessage_scratch_buffer[64];

  // Provide the scratch buffer to the proto encoder. The buffer must remain
  // valid for the lifetime of the encoder.
  pw::protobuf::StreamEncoder my_proto_encoder(sys_io_writer,
                                               submessage_scratch_buffer);

  {
    // Note that the parent encoder, my_proto_encoder, cannot be used until the
    // nested encoder, nested_encoder, has been destroyed.
    pw::protobuf::StreamEncoder nested_encoder =
        my_proto_encoder.GetNestedEncoder(kPetsFieldNumber);

    // There's intermediate buffering when writing to a nested encoder.
    nested_encoder.WriteString(kNameFieldNumber, "Spot");
    nested_encoder.WriteString(kPetTypeFieldNumber, "dog");

    // When this scope ends, the nested encoder is serialized to the Writer.
    // In addition, the parent encoder, my_proto_encoder, can be used again.
  }

  // If an encode error occurs when encoding the nested messages, it will be
  // reflected at the root encoder.
  if (!my_proto_encoder.status().ok()) {
    PW_LOG_INFO("Failed to encode proto; %s", my_proto_encoder.status().str());
  }

.. warning::
  While a nested submessage encoder exists, any use of the parent encoder that
  created it will trigger a crash. To resume using the parent encoder, destroy
  the submessage encoder first.
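
The buffering requirement comes from the wire format: a submessage is written
as a key, then its length as a varint, then its payload, so the length must be
known before the payload can be emitted. The sketch below shows one way an
in-memory encoder could handle this, by reserving a fixed number of bytes for
the length and filling them in once the submessage is complete. The names
``BeginSubmessage``, ``EndSubmessage``, and ``kReservedLengthBytes`` are
hypothetical; this is an illustration under those assumptions, not
``pw_protobuf``'s actual implementation.

.. Code:: cpp

  #include <cstddef>
  #include <cstdint>
  #include <vector>

  // Hypothetical constant for illustration: bytes reserved for the length.
  constexpr size_t kReservedLengthBytes = 2;

  // Writes the key for a length-delimited field (wire type 2) and reserves
  // space for the length. Returns the offset of the reserved bytes.
  // Assumes field_number < 16 so that the key fits in a single byte.
  inline size_t BeginSubmessage(std::vector<uint8_t>& out,
                                uint32_t field_number) {
    out.push_back(static_cast<uint8_t>((field_number << 3) | 2));
    const size_t length_offset = out.size();
    out.insert(out.end(), kReservedLengthBytes, 0);
    return length_offset;
  }

  // Back-fills the reserved bytes with the payload size as a fixed-width
  // varint: every byte except the last keeps its continuation bit set, even
  // when its payload bits are zero. Returns false if the payload is too large
  // for the reserved space.
  inline bool EndSubmessage(std::vector<uint8_t>& out, size_t length_offset) {
    size_t length = out.size() - length_offset - kReservedLengthBytes;
    if (length >= (size_t{1} << (7 * kReservedLengthBytes))) {
      return false;
    }
    for (size_t i = 0; i < kReservedLengthBytes; ++i) {
      uint8_t byte = static_cast<uint8_t>(length & 0x7f);
      if (i + 1 < kReservedLengthBytes) {
        byte |= 0x80;
      }
      out[length_offset + i] = byte;
      length >>= 7;
    }
    return true;
  }

In this sketch the reserved bytes are the per-submessage overhead, and a
payload that does not fit in them is rejected, which mirrors the size limits
described for ``PW_PROTOBUF_CFG_MAX_VARINT_SIZE``.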

Error Handling
==============
While individual write calls on a proto encoder return pw::Status objects, the
encoder tracks all status returns and "latches" onto the first error
encountered. This status can be accessed via ``StreamEncoder::status()``.
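
The latching behavior can be sketched in a few lines. The ``Status`` enum and
``StatusLatch`` class below are simplified, hypothetical stand-ins written for
illustration; they are not ``pw::Status`` or any ``pw_protobuf`` type.

.. Code:: cpp

  // Simplified stand-in for a status type (not pw::Status).
  enum class Status { kOk, kResourceExhausted, kInvalidArgument };

  // Hypothetical illustration of latching: every result is reported back to
  // the caller, but only the first non-OK result is retained.
  class StatusLatch {
   public:
    // Records the result of one write and returns it unchanged.
    Status Record(Status result) {
      if (status_ == Status::kOk) {
        status_ = result;
      }
      return result;
    }

    // Returns the first error recorded, or kOk if there was none.
    Status status() const { return status_; }

   private:
    Status status_ = Status::kOk;
  };

With this pattern, a caller can issue a series of writes and check the status
once at the end, as the examples in this document do.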

Codegen
=======
pw_protobuf encoder codegen integration is supported in GN, Bazel, and CMake.
The codegen is just a light wrapper around the ``StreamEncoder`` and
``MemoryEncoder`` objects, providing named helper functions to write proto
fields rather than requiring that field numbers are directly passed to an
encoder. Namespaced proto enums are also generated, and used as the arguments
when writing enum fields of a proto message.

All generated messages provide a ``Fields`` enum that can be used directly for
out-of-band encoding, or with the ``pw::protobuf::Decoder``.

This module's codegen is available through the ``*.pwpb`` sub-target of a
``pw_proto_library`` in GN, CMake, and Bazel. See :ref:`pw_protobuf_compiler's
documentation <module-pw_protobuf_compiler>` for more information on build
system integration for pw_protobuf codegen.

Example ``BUILD.gn``:

.. Code:: none

  import("//build_overrides/pigweed.gni")

  import("$dir_pw_build/target_types.gni")
  import("$dir_pw_protobuf_compiler/proto.gni")

  # This target controls where the *.pwpb.h headers end up on the include path.
  # In this example, it's at "pet_daycare_protos/client.pwpb.h".
  pw_proto_library("pet_daycare_protos") {
    sources = [
      "pet_daycare_protos/client.proto",
    ]
  }

  pw_source_set("example_client") {
    sources = [ "example_client.cc" ]
    deps = [
      ":pet_daycare_protos.pwpb",
      dir_pw_bytes,
      dir_pw_stream,
    ]
  }

Example ``pet_daycare_protos/client.proto``:

.. Code:: none

  syntax = "proto3";
  // The proto package controls the namespacing of the codegen. If this package
  // were fuzzy.friends, the namespace for codegen would be fuzzy::friends::*.
  package fuzzy_friends;

  message Pet {
    string name = 1;
    string pet_type = 2;
  }

  message Client {
    repeated Pet pets = 1;
  }

Example ``example_client.cc``:

.. Code:: cpp

  #include "pet_daycare_protos/client.pwpb.h"
  #include "pw_bytes/span.h"
  #include "pw_log/log.h"
  #include "pw_protobuf/encoder.h"
  #include "pw_stream/sys_io_stream.h"

  pw::stream::SysIoWriter sys_io_writer;
  std::byte submessage_scratch_buffer[64];
  // The constructor is the same as a pw::protobuf::StreamEncoder.
  fuzzy_friends::Client::StreamEncoder client(sys_io_writer,
                                              submessage_scratch_buffer);
  {
    fuzzy_friends::Pet::StreamEncoder pet1 = client.GetPetsEncoder();
    pet1.WriteName("Spot");
    pet1.WritePetType("dog");
  }

  {
    fuzzy_friends::Pet::StreamEncoder pet2 = client.GetPetsEncoder();
    pet2.WriteName("Slippers");
    pet2.WritePetType("rabbit");
  }

  if (!client.status().ok()) {
    PW_LOG_INFO("Failed to encode proto; %s", client.status().str());
  }

========
Decoding
========

Size report
===========

Full size report
----------------

This report demonstrates the size of using the entire decoder with all of its
decode methods and a decode callback for a proto message containing each of the
protobuf field types.

.. include:: size_report/decoder_full


Incremental size report
-----------------------

This report is generated using the full report as a base and adding some int32
fields to the decode callback to demonstrate the incremental cost of decoding
fields in a message.

.. include:: size_report/decoder_incremental

========================================
Comparison with other protobuf libraries
========================================

protobuf-lite
=============
protobuf-lite is the official reduced-size C++ implementation of protobuf. It
uses a restricted subset of the protobuf library's features to minimize code
size. However, it is still around 150K in size and requires dynamic memory
allocation, making it unsuitable for many embedded systems.

nanopb
======
`nanopb <https://github.com/nanopb/nanopb>`_ is a commonly used embedded
protobuf library with very small code size and full code generation. It provides
both encoding/decoding functionality and in-memory C structs representing
protobuf messages.

nanopb works well for many embedded products; however, using its generated code
can run into RAM usage issues when processing nontrivial protobuf messages due
to the necessity of defining a struct capable of storing all configurations of
the message, which can grow incredibly large. In one project, Pigweed developers
encountered an 11K struct statically allocated for a single message, over twice
the size of the final encoded output! (This was what prompted the development of
``pw_protobuf``.)

To avoid this issue, it is possible to use nanopb's low-level encode/decode
functions to process individual message fields directly, but this loses all of
the useful semantics of code generation. ``pw_protobuf`` is designed to optimize
for this use case; it allows for efficient operations on the wire format with an
intuitive user interface.

Depending on the requirements of a project, either of these libraries could be
suitable.