.. _module-pw_string:

---------
pw_string
---------
String manipulation is a very common operation, but the standard C and C++
string libraries have drawbacks. The C++ functions are easy-to-use and powerful,
but require too much flash and memory for many embedded projects. The C string
functions are lighter weight, but can be difficult to use correctly. Mishandling
of null terminators or buffer sizes can result in serious bugs.

The ``pw_string`` module provides the flexibility, ease-of-use, and safety of
C++-style string manipulation, but with no dynamic memory allocation and a much
smaller binary size impact. Using ``pw_string`` in place of the standard C
functions eliminates issues related to buffer overflow or missing null
terminators.

Compatibility
=============
C++17

pw::string::Format
==================
The ``pw::string::Format`` and ``pw::string::FormatVaList`` functions provide
safer alternatives to ``std::snprintf`` and ``std::vsnprintf``. The ``snprintf``
return value is awkward to interpret: it is the length the complete output would
have had, not the number of characters stored, and it is negative on an encoding
error. Misinterpreting it can lead to serious bugs.

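The return value handling that a caller otherwise has to get right by hand can
be sketched with the standard library alone. ``FormatResult`` and
``CheckedFormat`` are hypothetical names used for illustration; this is not the
``pw_string`` API or its implementation.

.. code-block:: cpp

  #include <cstdarg>
  #include <cstddef>
  #include <cstdio>

  // Hypothetical result type for this sketch; not part of pw_string.
  struct FormatResult {
    bool ok;           // True only if the entire string fit in the buffer.
    std::size_t size;  // Characters stored, excluding the null terminator.
  };

  // Wraps std::vsnprintf and converts its return value into a result that
  // cannot be mistaken for a count of stored characters.
  FormatResult CheckedFormat(char* buffer,
                             std::size_t capacity,
                             const char* format,
                             ...) {
    if (capacity == 0) {
      return {false, 0};  // No room for anything, not even a terminator.
    }

    va_list args;
    va_start(args, format);
    const int result = std::vsnprintf(buffer, capacity, format, args);
    va_end(args);

    if (result < 0) {  // Encoding error: leave an empty string behind.
      buffer[0] = '\0';
      return {false, 0};
    }

    // vsnprintf reports the length the full output would have had. At most
    // capacity - 1 characters were actually stored.
    const std::size_t wanted = static_cast<std::size_t>(result);
    if (wanted >= capacity) {
      return {false, capacity - 1};  // Truncated, but still null-terminated.
    }
    return {true, wanted};
  }

Code that uses the raw return value as an index or as a running offset into the
buffer is where truncation typically turns into an out-of-bounds write, which
is why the sketch reports truncation separately from the stored size.
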
Size report: replacing snprintf with pw::string::Format
-------------------------------------------------------
The ``Format`` functions have a small, fixed code size cost. However, relative
to equivalent ``std::snprintf`` calls, there is no incremental code size cost to
using ``Format``.

.. include:: format_size_report

Safe Length Checking
====================
This module provides two safer alternatives to ``std::strlen`` in case the
string is extremely long and/or potentially not null-terminated.

First, a constexpr alternative to C11's ``strnlen_s`` is offered through
:cpp:func:`pw::string::ClampedCString`. By design, this does not return a
length. Instead, it returns a ``string_view``, which does not require
null-termination.

Second, a constexpr specialized form is offered where null termination is
required through :cpp:func:`pw::string::NullTerminatedLength`. This will only
return a length if the string is null-terminated.

.. cpp:function:: constexpr std::string_view pw::string::ClampedCString(std::span<const char> str)
.. cpp:function:: constexpr std::string_view pw::string::ClampedCString(const char* str, size_t max_len)

  Safe alternative to the ``string_view`` constructor to avoid the risk of an
  unbounded implicit or explicit use of ``strlen``.

  This is strongly recommended over using something like C11's ``strnlen_s``
  as a ``string_view`` does not require null-termination.

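The idea of a clamped view can be sketched with the standard library alone.
``ClampedView`` is a hypothetical stand-in written for illustration; it is not
the ``pw_string`` implementation.

.. code-block:: cpp

  #include <cstddef>
  #include <string_view>

  // Hypothetical helper for this sketch; not part of pw_string.
  // Returns a view of str that ends at the first null character, or that
  // covers all max_len characters if no null character is found in bounds.
  constexpr std::string_view ClampedView(const char* str,
                                         std::size_t max_len) {
    std::size_t length = 0;
    while (length < max_len && str[length] != '\0') {
      ++length;
    }
    return std::string_view(str, length);
  }

Because the result carries its own length, the caller never needs to read past
``max_len``, even when the buffer holds no terminator at all.
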
.. cpp:function:: constexpr pw::Result<size_t> pw::string::NullTerminatedLength(std::span<const char> str)
.. cpp:function:: pw::Result<size_t> pw::string::NullTerminatedLength(const char* str, size_t max_len)

  Safe alternative to ``strlen`` to calculate the null-terminated length of the
  string within the specified span, excluding the null terminator. Like C11's
  ``strnlen_s``, the scan for the null terminator is bounded.

  Returns:
    The null-terminated length of the string, excluding the null terminator.
    OutOfRange - if the string is not null-terminated.

  Precondition: The string pointer shall be valid.

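A bounded scan of this kind can be sketched as follows. ``BoundedLength`` is a
hypothetical helper, and ``std::optional`` stands in for ``pw::Result<size_t>``
so that the sketch depends only on the standard library; an empty optional
plays the role of the OutOfRange status.

.. code-block:: cpp

  #include <cstddef>
  #include <optional>

  // Hypothetical helper for this sketch; not part of pw_string.
  // Returns the index of the first null character within max_len characters,
  // or std::nullopt if the bounded scan finds no terminator.
  constexpr std::optional<std::size_t> BoundedLength(const char* str,
                                                     std::size_t max_len) {
    for (std::size_t i = 0; i < max_len; ++i) {
      if (str[i] == '\0') {
        return i;
      }
    }
    return std::nullopt;
  }

Returning an empty result instead of ``max_len`` forces the caller to handle
the unterminated case explicitly rather than treating it as a valid length.
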
pw::string::Copy
================
The ``pw::string::Copy`` functions provide a safer alternative to
``std::strncpy`` because they always null-terminate the output whenever the
destination buffer has a non-zero size.

.. cpp:function:: StatusWithSize Copy(const std::string_view& source, std::span<char> dest)
.. cpp:function:: StatusWithSize Copy(const char* source, std::span<char> dest)
.. cpp:function:: StatusWithSize Copy(const char* source, char* dest, size_t num)

  Copies the source string to the dest, truncating if the full string does not
  fit. Always null terminates if ``dest.size()`` or ``num`` is greater than 0.

  Returns the number of characters written, excluding the null terminator. If
  the string is truncated, the status is ResourceExhausted.

  Precondition: The destination and source shall not overlap.

  Precondition: The source shall be a valid pointer.

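For contrast, ``std::strncpy`` leaves the destination unterminated when the
source is at least as long as the buffer. The behavior described above can be
sketched with the standard library alone. ``CopyResult`` and ``BoundedCopy``
are hypothetical names, and treating a zero-size destination as truncation is
an assumption of this sketch rather than documented ``pw_string`` behavior.

.. code-block:: cpp

  #include <cassert>
  #include <cstddef>

  // Hypothetical result type; stands in for StatusWithSize in this sketch.
  struct CopyResult {
    bool truncated;    // True if the source did not fit completely.
    std::size_t size;  // Characters written, excluding the null terminator.
  };

  // Copies source into dest, writing at most num bytes including the
  // terminator. The output is null-terminated whenever num > 0.
  CopyResult BoundedCopy(const char* source, char* dest, std::size_t num) {
    assert(source != nullptr);  // Mirrors the documented precondition.
    if (num == 0) {
      return {true, 0};  // Assumption: no room at all counts as truncation.
    }

    std::size_t copied = 0;
    while (copied < num - 1 && source[copied] != '\0') {
      dest[copied] = source[copied];
      ++copied;
    }
    dest[copied] = '\0';

    // If the source has characters left, the copy stopped early.
    return {source[copied] != '\0', copied};
  }

Reserving the last byte for the terminator before copying is what makes the
output usable as a C string on every path, including truncation.
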
pw::StringBuilder
=================
``pw::StringBuilder`` facilitates building formatted strings in a fixed-size
buffer. It is designed to give the flexibility of ``std::string`` and
``std::ostringstream``, but with a small footprint.

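The general technique of streaming into a caller-provided buffer can be
illustrated with a heavily simplified, hypothetical class. ``TinyBuilder`` is
not ``pw::StringBuilder`` and does not reflect its interface or implementation;
it only shows how appends can stay within a fixed buffer and remain
null-terminated.

.. code-block:: cpp

  #include <charconv>
  #include <cstddef>
  #include <cstring>
  #include <string_view>

  // Hypothetical, simplified builder for illustration only.
  class TinyBuilder {
   public:
    // capacity must be at least 1 so that the terminator always fits.
    TinyBuilder(char* buffer, std::size_t capacity)
        : buffer_(buffer), capacity_(capacity) {
      buffer_[0] = '\0';
    }

    // Appends as much of text as fits and records whether it was cut short.
    TinyBuilder& operator<<(std::string_view text) {
      const std::size_t room = capacity_ - 1 - size_;
      const std::size_t count = text.size() < room ? text.size() : room;
      if (count > 0) {
        std::memcpy(buffer_ + size_, text.data(), count);
      }
      size_ += count;
      buffer_[size_] = '\0';
      if (count < text.size()) {
        truncated_ = true;
      }
      return *this;
    }

    // Converts the integer to decimal without allocating, then appends it.
    TinyBuilder& operator<<(int value) {
      char digits[12];  // Fits a 32-bit int including the sign.
      const auto result =
          std::to_chars(digits, digits + sizeof(digits), value);
      const auto length = static_cast<std::size_t>(result.ptr - digits);
      return *this << std::string_view(digits, length);
    }

    const char* c_str() const { return buffer_; }
    std::size_t size() const { return size_; }
    bool truncated() const { return truncated_; }

   private:
    char* buffer_;
    std::size_t capacity_;
    std::size_t size_ = 0;
    bool truncated_ = false;
  };

With a 16-byte buffer, ``sb << "x=" << 42`` leaves ``x=42`` in the buffer.
Appends that do not fit are cut off and flagged instead of overflowing.
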
Supporting custom types with StringBuilder
------------------------------------------
As with ``std::ostream``, StringBuilder supports printing custom types by
overloading the ``<<`` operator. This is done by defining ``operator<<`` in
the same namespace as the custom type. For example:

.. code-block:: cpp

  #include "pw_string/string_builder.h"

  namespace my_project {

  struct MyType {
    int foo;
    const char* bar;
  };

  // Found through argument-dependent lookup because it lives in the same
  // namespace as MyType.
  pw::StringBuilder& operator<<(pw::StringBuilder& sb, const MyType& value) {
    return sb << "MyType(" << value.foo << ", " << value.bar << ')';
  }

  }  // namespace my_project

Internally, ``StringBuilder`` uses the ``ToString`` function to print. The
``ToString`` template function can be specialized to support custom types with
``StringBuilder``, though it is recommended to overload ``operator<<`` instead.
This example shows how to specialize ``pw::ToString``:

.. code-block:: cpp

  #include "pw_string/to_string.h"

  namespace pw {

  // MyStatus and MyStatusString() are the application's own type and helper.
  template <>
  StatusWithSize ToString<MyStatus>(MyStatus value, std::span<char> buffer) {
    return string::Copy(MyStatusString(value), buffer);
  }

  }  // namespace pw

Size report: replacing snprintf with pw::StringBuilder
------------------------------------------------------
StringBuilder is safe, flexible, and results in much smaller code size than
using ``std::ostringstream``. However, applications sensitive to code size
should use StringBuilder with care.

The fixed code size cost of StringBuilder is significant, though smaller than
``std::snprintf``. Using StringBuilder's ``<<`` and append methods exclusively
in place of ``snprintf`` reduces code size, but ``snprintf`` may be difficult to
avoid.

The incremental code size cost of StringBuilder is comparable to ``snprintf`` if
errors are handled. Each argument to StringBuilder's ``<<`` expands to a
function call, but one or two StringBuilder appends may have a smaller code size
impact than a single ``snprintf`` call.

.. include:: string_builder_size_report

Future work
===========
* StringBuilder's fixed size cost can be dramatically reduced by limiting
  support for 64-bit integers.
* Consider integrating with the tokenizer module.