.. _module-pw_log_tokenized:

----------------
pw_log_tokenized
----------------
The ``pw_log_tokenized`` module contains utilities for tokenized logging. It
connects ``pw_log`` to ``pw_tokenizer``.

C++ backend
===========
``pw_log_tokenized`` provides a backend for ``pw_log`` that tokenizes log
messages with the ``pw_tokenizer`` module. By default, log messages are
tokenized with the ``PW_TOKENIZE_TO_GLOBAL_HANDLER_WITH_PAYLOAD`` macro.
The log level, line number, flag bits, and 16-bit tokenized module name are
passed through the payload argument. The macro eventually passes logs to the
``pw_tokenizer_HandleEncodedMessageWithPayload`` function, which must be
implemented by the application.

Example implementation:

.. code-block:: cpp

  // current_log_level, HIGH_PRIORITY_LOG, and the Emit* functions are
  // application-defined.
  extern "C" void pw_tokenizer_HandleEncodedMessageWithPayload(
      pw_tokenizer_Payload payload, const uint8_t message[], size_t size) {
    // The metadata object provides the log level, module token, and flags.
    // These values can be recorded and used for runtime filtering.
    pw::log_tokenized::Metadata metadata(payload);

    if (metadata.level() < current_log_level) {
      return;
    }

    // Parentheses are required: != binds more tightly than &.
    if ((metadata.flags() & HIGH_PRIORITY_LOG) != 0) {
      EmitHighPriorityLog(metadata.module(), message, size);
    } else {
      EmitLowPriorityLog(metadata.module(), message, size);
    }
  }

See the documentation for :ref:`module-pw_tokenizer` for further details.

Metadata in the format string
-----------------------------
With tokenized logging, the log format string is converted to a 32-bit token.
Regardless of how long the format string is, it's always represented by a 32-bit
token. Because of this, metadata can be packed into the tokenized string at no
additional cost.

``pw_log_tokenized`` uses a simple key-value format to encode metadata in a
format string. Each field starts with the ``■`` (U+25A0 "Black Square")
character, followed by the key name, the ``♦`` (U+2666 "Black Diamond Suit")
character, and then the value. The string is encoded as UTF-8. Key names
consist of alphanumeric ASCII characters and underscores, and start with a
letter.

.. code-block::

  "■key1♦contents1■key2♦contents2■key3♦contents3"

This format makes the message easy to parse by machine and easy to read. Because
of the characters used, it is extremely unlikely to conflict with log message
contents.

``pw_log_tokenized`` uses three fields: ``msg``, ``module``, and ``file``.
Implementations may add other fields, but they will be ignored by the
``pw_log_tokenized`` tooling.

.. code-block::

  "■msg♦Hyperdrive %d set to %f■module♦engine■file♦propulsion/hyper.cc"
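
The field format above can be split with a few string searches. The following
is a minimal sketch, not the parser used by the ``pw_log_tokenized`` tooling.
``ParseFields`` and ``DemoParseFields`` are hypothetical names introduced for
illustration, and the sketch assumes that the source file and its string
literals are encoded as UTF-8.

.. code-block:: cpp

  #include <cassert>
  #include <cstddef>
  #include <string>
  #include <string_view>
  #include <utility>
  #include <vector>

  // UTF-8 encodings of the two delimiter characters.
  constexpr std::string_view kFieldStart = "\xE2\x96\xA0";  // U+25A0
  constexpr std::string_view kKeyEnd = "\xE2\x99\xA6";      // U+2666

  using Field = std::pair<std::string, std::string>;

  // Hypothetical helper: splits a format string into (key, value) pairs.
  // Text before the first field marker is skipped, as are fields that have
  // no key/value separator.
  inline std::vector<Field> ParseFields(std::string_view text) {
    constexpr size_t npos = std::string_view::npos;
    std::vector<Field> fields;

    size_t start = text.find(kFieldStart);
    while (start != npos) {
      const size_t body = start + kFieldStart.size();
      const size_t next = text.find(kFieldStart, body);
      const std::string_view field =
          text.substr(body, next == npos ? npos : next - body);

      const size_t split = field.find(kKeyEnd);
      if (split != npos) {
        fields.emplace_back(std::string(field.substr(0, split)),
                            std::string(field.substr(split + kKeyEnd.size())));
      }
      start = next;
    }
    return fields;
  }

  // Usage: the three fields of the example string come back in order.
  inline void DemoParseFields() {
    const std::vector<Field> fields = ParseFields(
        "■msg♦Hyperdrive %d set to %f■module♦engine■file♦propulsion/hyper.cc");
    assert(fields.size() == 3);
    assert(fields[0].first == "msg");
    assert(fields[1].second == "engine");
  }

Returning the fields as an ordered list rather than a map keeps the sketch
independent of field order, which the format does not fix.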

Using key-value pairs allows placing the fields in any order.
``pw_log_tokenized`` places the message first. This is preferred when tokenizing
C code because the tokenizer only hashes a fixed number of characters. If the
file were first, the long path might take most of the hashed characters,
increasing the odds of a collision with other strings in that file. In C++, all
characters in the string are hashed, so the order is not important.
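
To see why field order matters when only a prefix is hashed, consider the toy
model below. It uses 32-bit FNV-1a, which is an assumption made purely for
illustration and is not the hash that ``pw_tokenizer`` uses. ``PrefixHash``,
``DemoPrefixHash``, and the 24-byte limit are likewise hypothetical.

.. code-block:: cpp

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <string_view>

  // Illustrative only: 32-bit FNV-1a over at most max_bytes bytes of text.
  // This models "hash a fixed-length prefix", not pw_tokenizer's real hash.
  constexpr uint32_t PrefixHash(std::string_view text, size_t max_bytes) {
    uint32_t hash = 2166136261u;
    const size_t length = text.size() < max_bytes ? text.size() : max_bytes;
    for (size_t i = 0; i < length; ++i) {
      hash ^= static_cast<uint8_t>(text[i]);
      hash *= 16777619u;
    }
    return hash;
  }

  inline void DemoPrefixHash() {
    constexpr size_t kLimit = 24;  // Hypothetical limit, small for the demo.

    // File first: the file field alone is 29 bytes of UTF-8, so the hashed
    // prefix never reaches the message and both logs get the same hash.
    assert(PrefixHash("■file♦propulsion/hyper.cc■msg♦Pump 1 on", kLimit) ==
           PrefixHash("■file♦propulsion/hyper.cc■msg♦Pump 2 on", kLimit));

    // Message first: the differing byte falls inside the hashed prefix, so
    // the two logs hash differently.
    assert(PrefixHash("■msg♦Pump 1 on■file♦propulsion/hyper.cc", kLimit) !=
           PrefixHash("■msg♦Pump 2 on■file♦propulsion/hyper.cc", kLimit));
  }

With a hash that covers the whole string, both orderings distinguish the two
messages, which matches the statement that order is not important in C++.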

Metadata in the tokenizer payload argument
-------------------------------------------
``pw_log_tokenized`` packs runtime-accessible metadata into a 32-bit integer
which is passed as the "payload" argument for ``pw_tokenizer``'s global
handler with payload facade. Packing this metadata into a single word rather
than separate arguments reduces the code size significantly.

Four items are packed into the payload argument:

- Log level -- Used for runtime log filtering by level.
- Line number -- Used to track where a log message originated.
- Log flags -- Implementation-defined log flags.
- Tokenized :c:macro:`PW_LOG_MODULE_NAME` -- Used for runtime log filtering by
  module.

Configuring metadata bit fields
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The number of bits to use for each metadata field is configurable through macros
in ``pw_log/config.h``. The field widths must sum to 32 bits. A field with zero
bits allocated is excluded from the log metadata.
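
The packing can be pictured with a small sketch. It is not the
``pw_log_tokenized`` implementation: ``Pack``, ``Unpack``, ``Fields``, and
``DemoPackMetadata`` are hypothetical names, the widths are the default values
listed for the macros in this section, and the bit order (level in the lowest
bits, module token in the highest) is an assumption.

.. code-block:: cpp

  #include <cassert>
  #include <cstdint>

  // Default field widths from this section. The bit order is assumed.
  constexpr unsigned kLevelBits = 3;
  constexpr unsigned kLineBits = 11;
  constexpr unsigned kFlagBits = 2;
  constexpr unsigned kModuleBits = 16;
  static_assert(kLevelBits + kLineBits + kFlagBits + kModuleBits == 32,
                "Field widths must sum to 32 bits");

  // Mask with the lowest `bits` bits set; a zero-width field gets mask 0.
  constexpr uint32_t Mask(unsigned bits) {
    return static_cast<uint32_t>((uint64_t{1} << bits) - 1);
  }

  struct Fields {
    uint32_t level;
    uint32_t line;
    uint32_t flags;
    uint32_t module_token;
  };

  constexpr uint32_t Pack(uint32_t level,
                          uint32_t line,
                          uint32_t flags,
                          uint32_t module_token) {
    // A line number that does not fit in its field is stored as 0.
    const uint32_t stored_line = line <= Mask(kLineBits) ? line : 0;
    return (level & Mask(kLevelBits)) |
           (stored_line << kLevelBits) |
           ((flags & Mask(kFlagBits)) << (kLevelBits + kLineBits)) |
           ((module_token & Mask(kModuleBits))
            << (kLevelBits + kLineBits + kFlagBits));
  }

  constexpr Fields Unpack(uint32_t payload) {
    return {payload & Mask(kLevelBits),
            (payload >> kLevelBits) & Mask(kLineBits),
            (payload >> (kLevelBits + kLineBits)) & Mask(kFlagBits),
            (payload >> (kLevelBits + kLineBits + kFlagBits)) &
                Mask(kModuleBits)};
  }

  inline void DemoPackMetadata() {
    const Fields fields = Unpack(Pack(2, 130, 1, 0xBEEF));
    assert(fields.level == 2 && fields.line == 130);
    assert(fields.flags == 1 && fields.module_token == 0xBEEF);

    // Line 5000 does not fit in 11 bits, so it is recorded as 0.
    assert(Unpack(Pack(2, 5000, 1, 0xBEEF)).line == 0);
  }

Because the four values share one 32-bit word, a call site only has to pass a
single constant, which is where the code size saving comes from.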

.. c:macro:: PW_LOG_TOKENIZED_LEVEL_BITS

  Bits to allocate for the log level. Defaults to :c:macro:`PW_LOG_LEVEL_BITS`
  (3).

.. c:macro:: PW_LOG_TOKENIZED_LINE_BITS

  Bits to allocate for the line number. Defaults to 11 (up to line 2047). If the
  line number is too large to be represented by this field, line is reported as
  0.

  Including the line number can slightly increase code size. Without the line
  number, the log metadata argument is the same for all logs with the same level
  and flags. With the line number, each metadata value is unique and must be
  encoded as a separate word in the binary. Systems with extreme space
  constraints may exclude line numbers by setting this macro to 0.

  It is possible to include line numbers in tokenized log format strings, but
  that is discouraged because line numbers change whenever a file is edited.
  Passing the line number with the metadata is a lightweight way to include it.

.. c:macro:: PW_LOG_TOKENIZED_FLAG_BITS

  Bits to use for implementation-defined flags. Defaults to 2.

.. c:macro:: PW_LOG_TOKENIZED_MODULE_BITS

  Bits to use for the tokenized version of :c:macro:`PW_LOG_MODULE_NAME`.
  Defaults to 16, which gives a ~1% probability of a collision with 37 module
  names.
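
The ~1% figure for 16 module bits follows from the birthday problem. The sketch
below reproduces the arithmetic under the assumption that module tokens are
uniformly distributed; ``CollisionProbability`` and
``DemoCollisionProbability`` are hypothetical names used only here.

.. code-block:: cpp

  #include <cassert>
  #include <cstdint>

  // Probability that at least two of `names` tokens share a value when each
  // token is drawn uniformly from 2^bits possibilities. Assumes bits < 64.
  inline double CollisionProbability(unsigned names, unsigned bits) {
    const double token_count = static_cast<double>(uint64_t{1} << bits);
    double all_unique = 1.0;
    for (unsigned i = 0; i < names; ++i) {
      // The (i+1)-th name must avoid the i tokens that are already taken.
      all_unique *= 1.0 - static_cast<double>(i) / token_count;
    }
    return 1.0 - all_unique;
  }

  inline void DemoCollisionProbability() {
    // 37 module names with 16-bit tokens: roughly a 1% chance of a collision.
    const double probability = CollisionProbability(37, 16);
    assert(probability > 0.0100 && probability < 0.0102);
  }

The same function can be used to judge the effect of a narrower module field
before changing :c:macro:`PW_LOG_TOKENIZED_MODULE_BITS`.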

Using a custom macro
--------------------
Applications may use their own macro instead of
``PW_TOKENIZE_TO_GLOBAL_HANDLER_WITH_PAYLOAD`` by setting the
``PW_LOG_TOKENIZED_ENCODE_MESSAGE`` config macro. This macro should take
arguments equivalent to ``PW_TOKENIZE_TO_GLOBAL_HANDLER_WITH_PAYLOAD``:

.. c:macro:: PW_LOG_TOKENIZED_ENCODE_MESSAGE(log_metadata, message, ...)

  :param log_metadata:

    Packed metadata for the log message. See the Metadata_ class for how to
    unpack the details.

  :type log_metadata: pw_tokenizer_Payload

  :param message: The log message format string (untokenized)
  :type message: :c:texpr:`const char*`

  .. _Metadata: https://cs.opensource.google/pigweed/pigweed/+/HEAD:pw_log_tokenized/public/pw_log_tokenized/log_tokenized.h;l=113

  For instructions on how to implement a custom tokenization macro, see
  :ref:`module-pw_tokenizer-custom-macro`.

Build targets
-------------
The GN build for ``pw_log_tokenized`` has two targets: ``pw_log_tokenized`` and
``log_backend``. The ``pw_log_tokenized`` target provides the
``pw_log_tokenized/log_tokenized.h`` header. The ``log_backend`` target
implements the backend for the ``pw_log`` facade. ``pw_log_tokenized`` invokes
the ``pw_tokenizer:global_handler_with_payload`` facade, which must be
implemented by the user of ``pw_log_tokenized``.

Python package
==============
``pw_log_tokenized`` includes a Python package for decoding tokenized logs.

pw_log_tokenized
----------------
.. automodule:: pw_log_tokenized
  :members:
  :undoc-members: