.. _module-pw_string:
=========
pw_string
=========
String manipulation is a very common operation, but the standard C and C++
string libraries have drawbacks. The C++ functions are easy to use and powerful,
but require too much flash and memory for many embedded projects. The C string
functions are lighter weight, but can be difficult to use correctly. Mishandling
of null terminators or buffer sizes can result in serious bugs.
The ``pw_string`` module provides the flexibility, ease-of-use, and safety of
C++-style string manipulation, but with no dynamic memory allocation and a much
smaller binary size impact. Using ``pw_string`` in place of the standard C
functions eliminates issues related to buffer overflow or missing null
terminators.
-------------
Compatibility
-------------
C++17, C++14 (:cpp:type:`pw::InlineString`)
--------
Features
--------
pw::InlineString
================
:cpp:class:`pw::InlineBasicString` and :cpp:type:`pw::InlineString` are
C++14-compatible, fixed-capacity, null-terminated string classes. They are
equivalent to ``std::basic_string<T>`` and ``std::string``, but store the string
contents inline and use no dynamic memory.
:cpp:type:`pw::InlineString` takes the fixed capacity as a template argument,
but may be used generically without specifying the capacity. The capacity value
is stored in a member variable, which the generic ``pw::InlineString<>`` /
``pw::InlineBasicString<T>`` specialization uses in place of the template
parameter.
:cpp:type:`pw::InlineString` is efficient and compact. The current size and
capacity are stored in a single word. Accessing the contents of a
:cpp:type:`pw::InlineString` is a simple array access within the object, with no
pointer indirection, even when working from a generic ``pw::InlineString<>``
reference.
Key differences from ``std::string``
------------------------------------
- **Fixed capacity** -- Operations that add characters to the string beyond its
capacity are an error. These trigger a ``PW_ASSERT`` at runtime. When
detectable, these situations trigger a ``static_assert`` at compile time.
- **Minimal overhead** -- :cpp:type:`pw::InlineString` operations never
allocate. Reading the contents of the string is a direct memory access within
the string object, without pointer indirection.
- **Constexpr support** -- :cpp:type:`pw::InlineString` works in ``constexpr``
contexts, which is not supported by ``std::string`` until C++20.
API reference
-------------
:cpp:type:`pw::InlineString` / :cpp:class:`pw::InlineBasicString` follows the
``std::string`` / ``std::basic_string<T>`` API, with a few variations:
- :cpp:type:`pw::InlineString` provides overloads specific to character arrays.
These perform compile-time capacity checks and are used for class template
argument deduction. Like ``std::string``, character arrays are treated as
null-terminated strings.
- :cpp:type:`pw::InlineString` allows implicit conversions from
``std::string_view``. Specifying the capacity parameter is cumbersome, so
implicit conversions are helpful. Also, implicitly creating a
:cpp:type:`pw::InlineString` is less costly than creating a ``std::string``.
As with ``std::string``, explicit conversions are required from types that
convert to ``std::string_view``.
- Functions related to dynamic memory allocation are not present (``reserve()``,
``shrink_to_fit()``, ``get_allocator()``).
- ``resize_and_overwrite()`` only takes the ``Operation`` argument, since the
underlying string buffer cannot be resized.
.. cpp:class:: template <typename T, unsigned short kCapacity> pw::InlineBasicString
Represents a fixed-capacity string of a generic character type. Equivalent to
``std::basic_string<T>``. Always null (``T()``) terminated.
.. cpp:type:: template <unsigned short kCapacity> pw::InlineString = pw::InlineBasicString<char, kCapacity>
Represents a fixed-capacity string of ``char`` characters. Equivalent to
``std::string``. Always null (``'\0'``) terminated.
See the `std::string documentation
<https://en.cppreference.com/w/cpp/string/basic_string>`_ for full details.
Usage
-----
:cpp:type:`pw::InlineString` objects must be constructed by specifying a fixed
capacity for the string.
.. code-block:: c++
// Initialize from a C string.
pw::InlineString<32> inline_string = "Literally";
inline_string.append(3, '?'); // contains "Literally???"
// Supports copying into known-capacity strings.
pw::InlineString<64> other = inline_string;
// Supports various helpful std::string functions
if (inline_string.starts_with("Lit") || inline_string == "not\0literally"sv) {
other += inline_string;
}
// Like std::string, InlineString is always null terminated when accessed
// through c_str(). InlineString can be used to null-terminate
// length-delimited strings for APIs that expect null-terminated strings.
std::string_view file(".gif");
if (std::fopen(pw::InlineString<kMaxNameLen>(file).c_str(), "r") == nullptr) {
return;
}
// pw::InlineString integrates well with std::string_view. It supports
// implicit conversions to and from std::string_view.
inline_string = std::string_view("not\0literally", 13);
FunctionThatTakesAStringView(inline_string);
FunctionThatTakesAnInlineString(std::string_view("1234", 4));
All :cpp:type:`pw::InlineString` operations may be performed on strings without
specifying their capacity.
.. code-block:: c++
void RemoveSuffix(pw::InlineString<>& string, std::string_view suffix) {
if (string.ends_with(suffix)) {
string.resize(string.size() - suffix.size());
}
}
void DoStuff() {
pw::InlineString<32> str1 = "Good morning!";
RemoveSuffix(str1, " morning!");
pw::InlineString<40> str2 = "Good";
RemoveSuffix(str2, " morning!");
PW_ASSERT(str1 == str2);
}
:cpp:type:`pw::InlineString` operations on known-size strings may be used in
``constexpr`` expressions.
.. code-block:: c++
static constexpr pw::InlineString<64> kMyString = [] {
pw::InlineString<64> string;
for (int i = 0; i < 10; ++i) {
string += "Hello";
}
return string;
}();
:cpp:class:`pw::InlineBasicString` supports class template argument deduction
(CTAD) in C++17 and newer. Since :cpp:type:`pw::InlineString` is an alias, CTAD
is not supported for it until C++20.
.. code-block:: c++
// Deduces a capacity of 5 characters to match the 5-character string literal
// (not counting the null terminator).
pw::InlineBasicString inline_string = "12345";
// In C++20, CTAD may be used with the pw::InlineString alias.
pw::InlineString my_other_string("123456789");
pw::string::Format
==================
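The ``pw::string::Format`` and ``pw::string::FormatVaList`` functions provide
safer alternatives to ``std::snprintf`` and ``std::vsnprintf``. The ``snprintf``
return value is awkward to interpret: it is negative on error and, on
truncation, it is the length the full output would have had rather than the
number of characters written. Misinterpreting it can lead to serious bugs.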
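The following is a minimal sketch, using only the C++ standard library, of the
bookkeeping a caller has to do around ``std::vsnprintf``. ``FormatChecked`` and
``FormatResult`` are hypothetical names introduced for illustration; they are
not part of ``pw_string`` and do not show how ``pw::string::Format`` is
implemented.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>

// Hypothetical helper types for illustration; not part of pw_string.
struct FormatResult {
  std::size_t written;  // Characters stored, excluding the null terminator.
  bool ok;              // False on an encoding error or on truncation.
};

// Wraps std::vsnprintf and folds its three possible outcomes (error, fits,
// truncated) into an explicit result.
FormatResult FormatChecked(char* buffer,
                           std::size_t size,
                           const char* fmt,
                           ...) {
  if (size == 0) {
    return {0, false};  // No room for even a null terminator.
  }
  assert(buffer != nullptr && fmt != nullptr);

  std::va_list args;
  va_start(args, fmt);
  const int result = std::vsnprintf(buffer, size, fmt, args);
  va_end(args);

  if (result < 0) {
    // Encoding error: leave an empty, terminated string behind.
    buffer[0] = '\0';
    return {0, false};
  }
  if (static_cast<std::size_t>(result) >= size) {
    // Truncated: result is the length that WOULD have been written, so it
    // must not be used as the number of characters in the buffer.
    return {size - 1, false};
  }
  return {static_cast<std::size_t>(result), true};
}
```

Every call site that uses ``snprintf`` directly has to repeat these checks, which
is the kind of error-prone handling the ``Format`` functions are meant to avoid.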
Size report: replacing snprintf with pw::string::Format
-------------------------------------------------------
The ``Format`` functions have a small, fixed code size cost. However, relative
to equivalent ``std::snprintf`` calls, there is no incremental code size cost to
using ``Format``.
.. include:: format_size_report
Safe Length Checking
====================
This module provides two safer alternatives to ``std::strlen`` in case the
string is extremely long and/or potentially not null-terminated.
First, a constexpr alternative to C11's ``strnlen_s`` is offered through
:cpp:func:`pw::string::ClampedCString`. By design, this does not return a
length; instead it returns a ``std::string_view``, which does not require
null-termination.
Second, a constexpr specialized form is offered where null termination is
required through :cpp:func:`pw::string::NullTerminatedLength`. This will only
return a length if the string is null-terminated.
.. cpp:function:: constexpr std::string_view pw::string::ClampedCString(span<const char> str)
.. cpp:function:: constexpr std::string_view pw::string::ClampedCString(const char* str, size_t max_len)
Safe alternative to the string_view constructor to avoid the risk of an
unbounded implicit or explicit use of strlen.
This is strongly recommended over using something like C11's strnlen_s as
a string_view does not require null-termination.
.. cpp:function:: constexpr pw::Result<size_t> pw::string::NullTerminatedLength(span<const char> str)
.. cpp:function:: pw::Result<size_t> pw::string::NullTerminatedLength(const char* str, size_t max_len)
Safe alternative to strlen to calculate the null-terminated length of the
string within the specified span, excluding the null terminator. Like C11's
strnlen_s, the scan for the null-terminator is bounded.
Returns:
null-terminated length of the string excluding the null terminator.
OutOfRange - if the string is not null-terminated.
Precondition: The string shall be at a valid pointer.
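The bounded scan described above can be sketched with the standard library
alone. ``ClampedView`` and ``TerminatedLength`` are hypothetical stand-ins
written for illustration, with ``std::optional`` used in place of
``pw::Result``; they are not the ``pw_string`` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <optional>
#include <string_view>

// Hypothetical stand-ins for illustration; not the pw_string implementation.

// Returns a view of at most max_len characters, stopping at the first null
// character if one is found within the bound.
std::string_view ClampedView(const char* str, std::size_t max_len) {
  assert(str != nullptr);
  const void* terminator = std::memchr(str, '\0', max_len);
  if (terminator == nullptr) {
    return std::string_view(str, max_len);
  }
  const char* end = static_cast<const char*>(terminator);
  return std::string_view(str, static_cast<std::size_t>(end - str));
}

// Returns the string length only if a null terminator exists within max_len
// characters; otherwise returns std::nullopt.
std::optional<std::size_t> TerminatedLength(const char* str,
                                            std::size_t max_len) {
  const std::string_view view = ClampedView(str, max_len);
  if (view.size() == max_len) {
    return std::nullopt;  // No terminator found inside the bound.
  }
  return view.size();
}
```

In both functions the scan never reads past ``max_len`` characters, so an
unterminated buffer yields either a clamped view or an explicit failure rather
than an out-of-bounds read.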
pw::string::Copy
================
The ``pw::string::Copy`` functions provide a safer alternative to
``std::strncpy``, as they always null-terminate whenever the destination
buffer has a non-zero size.
.. cpp:function:: StatusWithSize Copy(const std::string_view& source, span<char> dest)
.. cpp:function:: StatusWithSize Copy(const char* source, span<char> dest)
.. cpp:function:: StatusWithSize Copy(const char* source, char* dest, size_t num)
.. cpp:function:: StatusWithSize Copy(const pw::Vector<char>& source, span<char> dest)
Copies the source string to the dest, truncating if the full string does not
fit. Always null terminates if dest.size() or num > 0.
Returns the number of characters written, excluding the null terminator. If
the string is truncated, the status is ResourceExhausted.
Precondition: The destination and source shall not overlap.
Precondition: The source shall be a valid pointer.
It also has variants that provide a destination of ``pw::Vector<char>``
(see :ref:`module-pw_containers` for details) that do not store the null
terminator in the vector.
.. cpp:function:: StatusWithSize Copy(const std::string_view& source, pw::Vector<char>& dest)
.. cpp:function:: StatusWithSize Copy(const char* source, pw::Vector<char>& dest)
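As a rough illustration of the truncating, always-terminating behavior described
for the ``span<char>`` and ``char*`` destinations, here is a standard-library
sketch. ``TerminatedCopy`` and ``CopyResult`` are hypothetical names (a plain
struct stands in for ``StatusWithSize``), and the handling of a zero-size
destination is an assumption of this sketch, not a statement about
``pw::string::Copy``.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for illustration; not the pw_string implementation.
struct CopyResult {
  std::size_t size;  // Characters copied, excluding the null terminator.
  bool truncated;    // True if the full source did not fit.
};

// Copies as much of source as fits in dest[0..num) and always writes a null
// terminator when num > 0.
CopyResult TerminatedCopy(std::string_view source,
                          char* dest,
                          std::size_t num) {
  if (num == 0) {
    // Nothing can be stored, not even a terminator. This sketch reports that
    // as truncation (an assumption made for illustration).
    return {0, true};
  }
  assert(dest != nullptr);
  const std::size_t count = source.size() < num - 1 ? source.size() : num - 1;
  source.copy(dest, count);  // Copies count characters; adds no terminator.
  dest[count] = '\0';
  return {count, count < source.size()};
}
```

By contrast, ``std::strncpy`` leaves the destination unterminated when the
source is at least as long as the destination, so callers must remember to
terminate it themselves.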
pw::string::PrintableCopy
=========================
The ``pw::string::PrintableCopy`` function provides a safe printable copy of a
string. It offers the same safety as ``pw::string::Copy`` while also
converting any non-printable characters to a ``.`` character.
.. cpp:function:: StatusWithSize PrintableCopy(const std::string_view& source, span<char> dest)
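The substitution of non-printable characters can be sketched as follows.
``PrintableCopySketch`` is a hypothetical function for illustration only; it
uses ``std::isprint`` in the default C locale, which is an assumption and may
differ from how ``pw::string::PrintableCopy`` classifies characters.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for illustration; not the pw_string implementation.
// Copies source into dest[0..num), replacing non-printable characters with
// '.', and always null-terminates when num > 0. Returns the number of
// characters written, excluding the null terminator.
std::size_t PrintableCopySketch(std::string_view source,
                                char* dest,
                                std::size_t num) {
  if (num == 0) {
    return 0;
  }
  assert(dest != nullptr);
  const std::size_t count = source.size() < num - 1 ? source.size() : num - 1;
  for (std::size_t i = 0; i < count; ++i) {
    // std::isprint has undefined behavior for negative values, so convert to
    // unsigned char before calling it.
    const unsigned char c = static_cast<unsigned char>(source[i]);
    dest[i] = std::isprint(c) ? source[i] : '.';
  }
  dest[count] = '\0';
  return count;
}
```

A copy like this is convenient for logging data of unknown origin, since control
characters cannot corrupt the log output.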
pw::StringBuilder
=================
``pw::StringBuilder`` facilitates building formatted strings in a fixed-size
buffer. It is designed to give the flexibility of ``std::string`` and
``std::ostringstream``, but with a small footprint.
.. code-block:: cpp
#include "pw_log/log.h"
#include "pw_string/string_builder.h"
pw::Status LogProducedData(std::string_view func_name,
span<const std::byte> data) {
pw::StringBuffer<42> sb;
// Append a std::string_view to the buffer.
sb << func_name;
// Append a format string to the buffer.
sb.Format(" produced %d bytes of data: ", static_cast<int>(data.size()));
// Append bytes as hex to the buffer.
sb << data;
// Log the final string.
PW_LOG_DEBUG("%s", sb.c_str());
// Errors encountered while mutating the string builder are tracked.
return sb.status();
}
Supporting custom types with StringBuilder
------------------------------------------
As with ``std::ostream``, StringBuilder supports printing custom types by
overloading the ``<<`` operator. This is done by defining ``operator<<`` in
the same namespace as the custom type. For example:
.. code-block:: cpp
namespace my_project {
struct MyType {
int foo;
const char* bar;
};
pw::StringBuilder& operator<<(pw::StringBuilder& sb, const MyType& value) {
return sb << "MyType(" << value.foo << ", " << value.bar << ')';
}
} // namespace my_project
Internally, ``StringBuilder`` uses the ``ToString`` function to print. The
``ToString`` template function can be specialized to support custom types with
``StringBuilder``, though it is recommended to overload ``operator<<`` instead.
This example shows how to specialize ``pw::ToString``:
.. code-block:: cpp
#include "pw_string/to_string.h"
namespace pw {
template <>
StatusWithSize ToString<MyStatus>(MyStatus value, span<char> buffer) {
return string::Copy(MyStatusString(value), buffer);
}
} // namespace pw
Choosing between InlineString and StringBuilder
-----------------------------------------------
:cpp:type:`pw::InlineString` is comparable to ``std::string``, while
:cpp:class:`pw::StringBuilder` is comparable to ``std::ostringstream``.
Because :cpp:class:`pw::StringBuilder` provides high-level stream functionality,
it has more overhead than :cpp:type:`pw::InlineString`.
Use :cpp:type:`pw::InlineString` unless :cpp:class:`pw::StringBuilder`'s
capabilities are needed. Features unique to :cpp:class:`pw::StringBuilder`
include:
* Polymorphic C++ stream-style output, potentially supporting custom types.
* Non-fatal handling of failed append/format operations.
* Tracking the status of a series of operations.
* Building a string in an external buffer.
If those features are not required, use :cpp:type:`pw::InlineString`. A common
example of when to prefer :cpp:type:`pw::InlineString` is wrapping a
length-delimited string (e.g. ``std::string_view``) for APIs that require null
termination.
.. code-block:: cpp
void ProcessName(std::string_view name) {
PW_LOG_DEBUG("The name is %s", pw::InlineString<kMaxNameLen>(name).c_str());
}
Size report: replacing snprintf with pw::StringBuilder
------------------------------------------------------
StringBuilder is safe, flexible, and results in much smaller code size than
using ``std::ostringstream``. However, applications sensitive to code size
should use StringBuilder with care.
The fixed code size cost of StringBuilder is significant, though smaller than
``std::snprintf``. Using StringBuilder's ``<<`` and append methods exclusively in
place of ``snprintf`` reduces code size, but ``snprintf`` may be difficult to
avoid.
The incremental code size cost of StringBuilder is comparable to ``snprintf`` if
errors are handled. Each argument to StringBuilder's ``<<`` expands to a
function call, but one or two StringBuilder appends may have a smaller code size
impact than a single ``snprintf`` call.
.. include:: string_builder_size_report
Module Configuration Options
============================
The following configuration options can be adjusted via compile-time
configuration of this module.
.. c:macro:: PW_STRING_ENABLE_DECIMAL_FLOAT_EXPANSION
Setting this to a non-zero value will result in the ``ToString`` function
outputting string representations of floating-point values with a decimal
expansion after the point, by using the ``Format`` function. The default
value of this configuration option is zero, which will result in floating
point values being rounded to the nearest integer in their string
representation.
Using a non-zero value for this configuration option may incur a code size
cost due to the dependency on ``Format``.
-----------
Future work
-----------
* StringBuilder's fixed size cost can be dramatically reduced by limiting
support for 64-bit integers.
* Consider integrating with the tokenizer module.
Zephyr
======
To enable ``pw_string`` for Zephyr add ``CONFIG_PIGWEED_STRING=y`` to the
project's configuration.