| <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" |
| "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" > |
| <article> |
| <title>The C Code Generator</title> |
| |
| <section> |
| <title>Design</title> |
| |
| <para>The overall goal is to keep the code-generator as simple |
| as possible. Hopefully performance isn't sacrificed to that end!</para> |
| |
| <para>Anyways, we generate very little code: we mostly generate |
| structure definitions (for example enums and structures |
| for messages) and some metadata which is basically |
| reflection-type data.</para> |
| |
| <para>The serializing and deserializing is implemented in a library, |
| called libprotobuf-c rather than generated code.</para> |
| |
| </section> |
| <section> |
| <title>The Generated Code</title> |
| <para> |
| For each enum, we generate a C enum. |
| For each message, we generate a C structure |
| which can be cast to a <type>ProtobufCMessage</type>. |
| </para> |
| <para> |
| For each enum and message, we generate a descriptor |
| object that allows us to implement a kind of reflection |
| on the structures. |
| </para> |
| <section><title>Naming Conventions</title> |
| <para>First, some naming conventions: |
| <itemizedlist> |
| <listitem><para> |
| The name of the type for enums and messages and services |
| is camel case (meaning WordsAreCrammedTogether) |
| except that double-underscores are used to delimit |
| scopes. For example: |
| <programlisting><![CDATA[ |
| package foo.bar; |
| message BazBah { |
| int32 val; |
| } |
| ]]></programlisting> |
| would generate a C type <type>Foo__Bar__BazBah</type>.</para> |
| </listitem><listitem> |
| <para>Functions and globals are all lowercase, with camel-case |
| words separated by single underscores; namespaces are separated with |
| double-underscores. |
| For example: |
| <programlisting><![CDATA[ |
| Foo__Bar__BazBah *foo__bar__baz_bah__unpack |
| (ProtobufCAllocator *allocator, |
| size_t length, |
| const unsigned char *data); |
| ]]></programlisting> |
| </para> |
| </listitem><listitem> |
| <para>Enums values are all uppercase.</para> |
| </listitem> |
| <listitem><para> |
| Stuff we dd to your symbol names will also be |
| separated by a double-underscore. For example, |
| the unpack method above.</para></listitem> |
| </itemizedlist> |
| </para> |
| </section> |
| <section><title>Generated Descriptors</title> |
| <para> |
| We also generate descriptor objects for messages |
| and enums. These are declared in the .h files: |
| <programlisting><![CDATA[ |
| extern const ProtobufCMessageDescriptor |
| foo__bar__baz_bah__descriptor; |
| ]]></programlisting> |
| </para> |
| </section> |
| <section><title>Message Methods</title> |
| <para> |
| The message structures all begin with <type>ProtobufCMessage</type>, |
| so they may be cast to that type. |
| </para> |
| <para> |
| We generate some functions for each message: |
| <itemizedlist> |
| <listitem> |
| <para><function>unpack()</function>. Unpack data for a particular |
| message-format: |
| <programlisting><![CDATA[ |
| Foo__Bar__BazBah * |
| foo__bar__baz_bah__unpack (ProtobufCAllocator *allocator, |
| size_t length, |
| const unsigned char *data); |
| ]]></programlisting> |
| Note that <parameter>allocator</parameter> may be NULL. |
| </para> |
| </listitem> |
| <listitem> |
| <para><function>free_unpacked()</function>. Free a message |
| that you obtained with the unpack method: |
| <programlisting><![CDATA[ |
| void |
| foo__bar__baz_bah__free_unpacked (Foo__Bar__BazBah *baz_bah, |
| ProtobufCAllocator *allocator); |
| ]]></programlisting> |
| </para> |
| </listitem> |
| <listitem> |
| <para><function>get_packed_size()</function>. Find how long |
| the serialized representation of the data will be: |
| message-format: |
| <programlisting><![CDATA[ |
| size_t |
| foo__bar__baz_bah__get_packed_size |
| (const Foo__Bar__BazBah *message); |
| ]]></programlisting> |
| </para> |
| </listitem> |
| <listitem> |
| <para><function>pack()</function>. Pack message |
| into buffer; assumes that buffer is long enough (use get_packed_size first!). |
| <programlisting><![CDATA[ |
| size_t |
| foo__bar__baz_bah__pack |
| (const Foo__Bar__BazBah *message, |
| unsigned char *packed_data_out); |
| ]]></programlisting> |
| </para> |
| </listitem> |
| <listitem> |
| <para><function>pack_to_buffer()</function>. Pack message |
| into virtualize buffer. |
| <programlisting><![CDATA[ |
| size_t |
| foo__bar__baz_bah__pack_to_buffer |
| (const Foo__Bar__BazBah *message, |
| ProtobufCBuffer *buffer); |
| ]]></programlisting> |
| </para> |
| </listitem> |
| </itemizedlist> |
| </para> |
| </section> |
| |
| <section><title>Services</title> |
| <para> |
| Services are collections of methods each having an input and output type. |
| Unlike messages where we generate a structure that corresponds |
| to the actual message object, for services we generate |
| a function that creates a <type>ProtobufCService</type> |
| from a collection of user-defined methods. |
| </para> |
| <para> |
| We also define simple functions that invoke each method of a service. |
| These functions work if the service is created by |
| the <function>create_service</function> generated function |
| or if the service is instantiated by an RPC system. |
| </para> |
| <para> |
| Suppose we have a .proto file: |
| <programlisting><![CDATA[ |
| message A { |
| required uint32 val; |
| } |
| message B { |
| required string foo; |
| } |
| service Convert { |
| rpc Itoa (A) returns (B); |
| rpc Atoi (B) returns (A); |
| } |
| ]]></programlisting> |
| We will get generated code: |
| <programlisting><![CDATA[ |
| struct _Convert_Service { |
| ProtobufCService base; |
| void (*itoa) (Convert_Service *service, |
| const A *input, |
| B__Closure closure, |
| void *closure_data); |
| void (*atoi) (Convert_Service *service, |
| const B *input, |
| A__Closure closure, |
| void *closure_data); |
| void (*destroy) (Convert_Service *service); |
| }; |
| ]]></programlisting> |
| <programlisting><![CDATA[ |
| /* structure derived from Convert_Service. */ |
| typedef struct { |
| Convert_Service base; /* must be first member */ |
| unsigned radix; |
| } Convert_WithRadix; |
| |
| /* convert int to string (not really implemented) */ |
| static void radix_itoa (Convert_Service *service, |
| const A *input, |
| B__Closure closure, |
| void *closure_data) |
| { |
| char buf[256]; |
| Convert_WithRadix *wr = (Convert_WithRadix *) service; |
| B rv; |
| print_int_with_radix (input->val, wr->radix, buf); |
| rv.descriptor = &b__descriptor; |
| rv.str = buf; |
| closure (&rv, closure_data); |
| } |
| |
| /* convert string to int: use strtoul */ |
| static void radix_atoi (Convert_Service *service, |
| const B *input, |
| A__Closure closure, |
| void *closure_data) |
| { |
| Convert_WithRadix *wr = (Convert_WithRadix *) service; |
| A rv; |
| rv.val = strtoul (input->val, NULL, wr->radix); |
| rv.descriptor = &a__descriptor; |
| closure (&rv, closure_data); |
| } |
| |
| /* create a new convert service by radix */ |
| ProtobufCService * |
| create_convert_service_from_radix (unsigned radix) |
| { |
| Convert_WithRadix *wr = malloc (sizeof (Convert_WithRadix)); |
| convert__init (wr, (Convert__ServiceDestroy) free); |
| wr->base.itoa = radix_itoa; |
| wr->base.atoi = radix_atoi; |
| wr->radix = radix; |
| return (ProtobufCService *) wr; |
| } |
| ]]></programlisting> |
| Just like with messages, you may cast |
| from <type>Convert_Service</type> to <type>ProtobufCService</type>, |
| at least as long as you have run the __init function. |
| </para> |
| <para> |
| Conversely, we generate functions to help you invoke service |
| methods on generic <type>ProtobufCService</type> objects. |
| These go through the <function>invoke()</function> of service |
| and they work on both services created with create_service |
| as well as factory-provided services like those provided by RPC systems. |
| For example: |
| <programlisting><![CDATA[ |
| void convert__itoa (ProtobufCService *service, |
| const B *input, |
| A__Closure closure, |
| void *closure_data); |
| ]]></programlisting> |
| </para> |
| </section> |
| |
| </section> |
| |
| <section> |
| <title>The protobuf-c Library</title> |
| |
| <para>This library is used by the generated code; |
| it includes common structures and enums, |
| as well as functions that most users of the generated code |
| will want.</para> |
| |
| <para> |
| There are three main components: |
| <orderedlist> |
| <listitem><para>the Descriptor structures</para></listitem> |
| <listitem><para>helper structures and objects</para></listitem> |
| <listitem><para>packing and unpacking code</para></listitem> |
| </orderedlist> |
| </para> |
| |
| </section> |
| <section> |
| <title>protobuf-c: the Descriptor structures</title> |
| |
| <para>For example, enums are described in terms of structures: |
| |
| <programlisting><![CDATA[ |
| struct _ProtobufCEnumValue |
| { |
| const char *name; |
| const char *c_name; |
| int value; |
| }; |
| |
| struct _ProtobufCEnumDescriptor |
| { |
| const char *name; |
| const char *short_name; |
| const char *package_name; |
| |
| /* sorted by value */ |
| unsigned n_values; |
| const ProtobufCEnumValue *values; |
| |
| /* sorted by name */ |
| unsigned n_value_names; |
| const ProtobufCEnumValue *values_by_name; |
| }; |
| ]]></programlisting></para> |
| |
| <para>Likewise, messages are described by: |
| |
| <programlisting><![CDATA[ |
| struct _ProtobufCFieldDescriptor |
| { |
| const char *name; |
| int id; |
| ProtobufCFieldLabel label; |
| ProtobufCFieldType type; |
| unsigned quantifier_offset; |
| unsigned offset; |
| void *descriptor; /* for MESSAGE and ENUM types */ |
| }; |
| struct _ProtobufCMessageDescriptor |
| { |
| const char *name; |
| const char *short_name; |
| const char *package_name; |
| |
| /* sorted by field-id */ |
| unsigned n_fields; |
| const ProtobufCFieldDescriptor *fields; |
| }; |
| ]]></programlisting></para> |
| |
| <para> |
| And finally services are described by: |
| |
| <programlisting><![CDATA[ |
| struct _ProtobufCMethodDescriptor |
| { |
| const char *name; |
| const ProtobufCMessageDescriptor *input; |
| const ProtobufCMessageDescriptor *output; |
| }; |
| struct _ProtobufCServiceDescriptor |
| { |
| const char *name; |
| unsigned n_methods; |
| ProtobufCMethodDescriptor *methods; // sorted by name |
| }; |
| ]]></programlisting></para> |
| |
| </section> |
| <section> |
| <title>protobuf-c: helper structures and typedefs</title> |
| |
| <para>We defined typedefs for a few types |
| which are used in .proto files but do not |
| have obvious standard C equivalents: |
| <itemizedlist> |
| <listitem><para>a boolean type (<type>protobuf_c_boolean</type>)</para></listitem> |
| <listitem><para>a binary-data (bytes) type (<type>ProtobufCBinaryData</type>)</para></listitem> |
| <listitem><para>the various int types (<type>int32_t</type>, <type>uint32_t</type>, <type>int64_t</type>, <type>uint64_t</type>) |
| are obtained by including <filename>inttypes.h</filename></para></listitem> |
| </itemizedlist> |
| </para> |
| |
| <para>We also define a simple allocator object, ProtobufCAllocator |
| that let's you control how allocations are done. |
| This is predominately used for parsing.</para> |
| |
| <para>There is a virtual buffer facility that |
| only has to implement a method to append binary-data |
| to the buffer. This can be used to serialize messages |
| to different targets (instead of a flat slab of data).</para> |
| |
| <para>We define a base-type for all messages, |
| for code that handles messages generically. |
| All it has is the descriptor object.</para> |
| |
| <section id="buffers"> |
| <title>Buffers</title> |
| <para>One important helper type is the <type>ProtobufCBuffer</type> |
| which allows you to abstract the target of serialization. The only |
| thing that a buffer has is an <function>append</function> method: |
| <programlisting><![CDATA[ |
| struct _ProtobufCBuffer |
| { |
| void (*append)(ProtobufCBuffer *buffer, |
| size_t len, |
| const unsigned char *data); |
| } |
| ]]></programlisting> |
| ProtobufCBuffer subclasses are often defined on the stack. |
| </para> |
| |
| <para> |
| For example, to write to a <type>FILE</type> you could make: |
| <programlisting><![CDATA[ |
| typedef struct |
| { |
| ProtobufCBuffer base; |
| FILE *fp; |
| } BufferAppendToFile |
| static void my_buffer_file_append (ProtobufCBuffer *buffer, |
| unsigned len, |
| const unsigned char *data) |
| { |
| BufferAppendToFile *file_buf = (BufferAppendToFile *) buffer; |
| fwrite (data, len, 1, file_buf->fp); // XXX: no error handling! |
| } |
| ]]></programlisting> |
| </para> |
| |
| <para> |
| To use this new type of Buffer, you would do something like: |
| <programlisting><![CDATA[ |
| ... |
| BufferAppendToFile tmp; |
| tmp.base.append = my_buffer_file_append; |
| tmp.fp = fp; |
| protobuf_c_message_pack_to_buffer (&message, &tmp); |
| ... |
| ]]></programlisting> |
| </para> |
| <para> |
| A commonly builtin subtype is the BufferSimple |
| which is declared on the stack and uses a scratch buffer provided by the user |
| for its initial allocation. It does exponential resizing. |
| To create a BufferSimple, use code like: |
| <programlisting><![CDATA[ |
| unsigned char pad[128]; |
| ProtobufCBufferSimple buf = PROTOBUF_C_BUFFER_SIMPLE_INIT (pad); |
| ProtobufCBuffer *buffer = (ProtobufCBuffer *) &simple; |
| protobuf_c_buffer_append (buffer, 6, (unsigned char *) "hi mom"); |
| ]]></programlisting> |
| You can access the data as buf.len and buf.data. For example, |
| <programlisting><![CDATA[ |
| assert (buf.len == 6); |
| assert (memcmp (buf.data, "hi mom", 6) == 0); |
| ]]></programlisting> |
| To finish up, use: |
| <programlisting><![CDATA[ |
| PROTOBUF_C_BUFFER_SIMPLE_CLEAR (&buf); |
| ]]></programlisting> |
| </para> |
| </section> |
| </section> |
| <section> |
| <title>protobuf-c: packing and unpacking messages</title> |
| |
| <para> |
| To pack messages one first computes their packed size, |
| then provide a buffer to pack into. |
| <programlisting><![CDATA[ |
| size_t protobuf_c_message_get_packed_size |
| (ProtobufCMessage *message); |
| void protobuf_c_message_pack (ProtobufCMessage *message, |
| unsigned char *out); |
| ]]></programlisting> |
| </para> |
| |
| <para> |
| Or you can use the "streaming" approach: |
| <programlisting><![CDATA[ |
| void protobuf_c_message_pack_to_buffer |
| (ProtobufCMessage *message, |
| ProtobufCBuffer *buffer); |
| ]]></programlisting> |
| where <type>ProtobufCBuffer</type> is a base object with an append metod. |
| See <xref linkend="buffers" />. |
| </para> |
| |
| |
| |
| <para> |
| To unpack messages, you should simple call |
| <programlisting><![CDATA[ |
| ProtobufCMessage * |
| protobuf_c_message_unpack (const ProtobufCMessageDescriptor *, |
| ProtobufCAllocator *allocator, |
| size_t len, |
| const unsigned char *data); |
| ]]></programlisting> |
| If you pass NULL for <parameter>allocator</parameter>, then |
| the default allocator will be used. |
| </para> |
| |
| <para> |
| You can cast the result to the type that matches |
| the descriptor. |
| </para> |
| |
| <para> |
| The result of unpacking should be freed with protobuf_c_message_free(). |
| </para> |
| |
| |
| </section> |
| <section> |
| <title>Author</title> |
| <para>Dave Benson.</para> |
| </section> |
| </article> |