.. _module-pw_string:

=========
pw_string
=========
String manipulation is a very common operation, but the standard C and C++
string libraries have drawbacks. The C++ functions are easy to use and powerful,
but require too much flash and memory for many embedded projects. The C string
functions are lighter weight, but can be difficult to use correctly. Mishandling
null terminators or buffer sizes can result in serious bugs.

The ``pw_string`` module provides the flexibility, ease-of-use, and safety of
C++-style string manipulation, but with no dynamic memory allocation and a much
smaller binary size impact. Using ``pw_string`` in place of the standard C
functions eliminates issues related to buffer overflow or missing null
terminators.

-------------
Compatibility
-------------
C++17, C++14 (:cpp:type:`pw::InlineString`)

--------
Features
--------

pw::InlineString
================
:cpp:class:`pw::InlineBasicString` and :cpp:type:`pw::InlineString` are
C++14-compatible, fixed-capacity, null-terminated string classes. They are
equivalent to ``std::basic_string<T>`` and ``std::string``, but store the string
contents inline and use no dynamic memory.

:cpp:type:`pw::InlineString` takes the fixed capacity as a template argument,
but may be used generically without specifying the capacity. The capacity value
is stored in a member variable, which the generic ``pw::InlineString<>`` /
``pw::InlineBasicString<T>`` specialization uses in place of the template
parameter.

:cpp:type:`pw::InlineString` is efficient and compact. The current size and
capacity are stored in a single word. Accessing the contents of a
:cpp:type:`pw::InlineString` is a simple array access within the object, with no
pointer indirection, even when working from a generic ``pw::InlineString<>``
reference.

Key differences from ``std::string``
------------------------------------
- **Fixed capacity** -- Operations that add characters to the string beyond its
  capacity are an error. These trigger a ``PW_ASSERT`` at runtime. When
  detectable, these situations trigger a ``static_assert`` at compile time.
- **Character array support** -- :cpp:type:`pw::InlineString` provides overloads
  specific to character arrays. These allow for compile-time capacity checks and
  class template argument deduction.
- **Constexpr support** -- :cpp:type:`pw::InlineString` works in ``constexpr``
  contexts, which is not supported by ``std::string`` until C++20.

API reference
-------------
:cpp:type:`pw::InlineString` / :cpp:class:`pw::InlineBasicString` follows the
``std::string`` / ``std::basic_string<T>`` API as closely as possible.
:cpp:class:`pw::InlineBasicString` is intended to support all
``std::basic_string<T>`` operations, except those having to do with dynamic
memory allocation (``reserve()``, ``shrink_to_fit()``, ``get_allocator()``).

.. cpp:class:: template <typename T, unsigned short kCapacity> pw::InlineBasicString

   Represents a fixed-capacity string of a generic character type. Equivalent to
   ``std::basic_string<T>``. Always null (``T()``) terminated.

.. cpp:type:: template <unsigned short kCapacity> pw::InlineString = pw::InlineBasicString<char, kCapacity>

   Represents a fixed-capacity string of ``char`` characters. Equivalent to
   ``std::string``. Always null (``'\0'``) terminated.

Usage
-----
:cpp:type:`pw::InlineString` objects must be constructed by specifying a fixed
capacity for the string.

.. code-block:: c++

  // Initialize from a C string.
  pw::InlineString<32> my_string = "Literally";
  my_string.append(3, '?');   // contains "Literally???"

  // Like std::string, always null terminated when accessed through c_str().
  if (std::fopen(my_string.c_str(), "r") == nullptr) {
    // Integrates with std::string_view.
    my_string = std::string_view("not\0literally", 13);
    TakesAStringView(std::string_view(my_string));
  }

  // Supports copying into known-capacity strings.
  pw::InlineString<64> other = my_string;

  // Supports various helpful std::string functions
  if (my_string.starts_with("Lit") || my_string == "not\0literally"sv) {
    other += my_string;
  }

All :cpp:type:`pw::InlineString` operations may be performed on strings without
specifying their capacity.

.. code-block:: c++

  void RemoveSuffix(pw::InlineString<>& string, std::string_view suffix) {
    if (string.ends_with(suffix)) {
      string.resize(string.size() - suffix.size());
    }
  }

  void DoStuff() {
    pw::InlineString<32> str1 = "Good morning!";
    RemoveSuffix(str1, " morning!");

    pw::InlineString<40> str2 = "Good";
    RemoveSuffix(str2, " morning!");

    PW_ASSERT(str1 == str2);
  }

:cpp:type:`pw::InlineString` operations on known-size strings may be used in
``constexpr`` expressions.

.. code-block:: c++

   static constexpr pw::InlineString<64> kMyString = [] {
     pw::InlineString<64> string;

     for (int i = 0; i < 10; ++i) {
       string += "Hello";
     }

     return string;
   }();

:cpp:class:`pw::InlineBasicString` supports class template argument deduction
(CTAD) in C++17 and newer. Since :cpp:type:`pw::InlineString` is an alias, CTAD
is not supported until C++20.

.. code-block:: c++

   // Deduces a capacity of 5 characters to match the 5-character string literal
   // (not counting the null terminator).
   pw::InlineBasicString my_string = "12345";

   // In C++20, CTAD may be used with the pw::InlineString alias.
   pw::InlineString my_other_string("123456789");

pw::string::Format
==================
The ``pw::string::Format`` and ``pw::string::FormatVaList`` functions provide
safer alternatives to ``std::snprintf`` and ``std::vsnprintf``. The ``snprintf``
return value is awkward to interpret, and misinterpreting it can lead to serious
bugs.
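
As an illustration of why the ``snprintf`` return value is easy to misuse, the
following standard-library-only sketch wraps ``std::vsnprintf`` and interprets
its result in one place. ``FormatResult`` and ``CheckedFormat`` are hypothetical
names for this example. They are not part of ``pw_string`` and do not reflect
how ``pw::string::Format`` is implemented.

.. code-block:: cpp

  #include <cstdarg>
  #include <cstddef>
  #include <cstdio>

  // Hypothetical result type for this sketch only.
  struct FormatResult {
    bool ok;           // false if the output was truncated or formatting failed
    std::size_t size;  // characters written, excluding the null terminator
  };

  // Hypothetical wrapper that interprets the vsnprintf return value once.
  FormatResult CheckedFormat(char* buffer, std::size_t buffer_size,
                             const char* format, ...) {
    if (buffer_size == 0) {
      return {false, 0};  // No room for even a null terminator.
    }

    va_list args;
    va_start(args, format);
    const int result = std::vsnprintf(buffer, buffer_size, format, args);
    va_end(args);

    if (result < 0) {
      buffer[0] = '\0';  // Encoding error: leave an empty string behind.
      return {false, 0};
    }
    if (static_cast<std::size_t>(result) >= buffer_size) {
      // vsnprintf reports the length it wanted to write, not what it wrote.
      return {false, buffer_size - 1};
    }
    return {true, static_cast<std::size_t>(result)};
  }

Code that uses the raw return value as the written length would index past the
end of the buffer in the truncated case, which is the kind of bug a checked
interface is meant to prevent.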

Size report: replacing snprintf with pw::string::Format
-------------------------------------------------------
The ``Format`` functions have a small, fixed code size cost. However, relative
to equivalent ``std::snprintf`` calls, there is no incremental code size cost to
using ``Format``.

.. include:: format_size_report

Safe Length Checking
====================
This module provides two safer alternatives to ``std::strlen`` for strings that
may be extremely long or not null-terminated.

First, :cpp:func:`pw::string::ClampedCString` offers a constexpr alternative to
C11's ``strnlen_s``. By design, it does not return a length. Instead, it returns
a ``std::string_view``, which does not require null termination.

Second, :cpp:func:`pw::string::NullTerminatedLength` offers a constexpr
specialized form for cases where null termination is required. It only returns
a length if the string is null-terminated.

.. cpp:function:: constexpr std::string_view pw::string::ClampedCString(span<const char> str)
.. cpp:function:: constexpr std::string_view pw::string::ClampedCString(const char* str, size_t max_len)

   Safe alternative to the string_view constructor to avoid the risk of an
   unbounded implicit or explicit use of strlen.

   This is strongly recommended over using something like C11's strnlen_s as
   a string_view does not require null-termination.

.. cpp:function:: constexpr pw::Result<size_t> pw::string::NullTerminatedLength(span<const char> str)
.. cpp:function:: pw::Result<size_t> pw::string::NullTerminatedLength(const char* str, size_t max_len)

   Safe alternative to strlen to calculate the null-terminated length of the
   string within the specified span, excluding the null terminator. Like C11's
   strnlen_s, the scan for the null-terminator is bounded.

   Returns:
     null-terminated length of the string excluding the null terminator.
     OutOfRange - if the string is not null-terminated.

   Precondition: The string pointer shall be valid.
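
The bounded scan described above can be sketched with the standard library
alone. ``ClampedViewSketch`` and ``BoundedLengthSketch`` are hypothetical
stand-ins written for this example, with ``std::optional`` taking the place of
``pw::Result``. They only illustrate the idea and are not the ``pw_string``
implementation. The sketch assumes C++17.

.. code-block:: cpp

  #include <cstddef>
  #include <optional>
  #include <string_view>

  // Hypothetical stand-in for a clamped view: never reads past max_len chars.
  constexpr std::string_view ClampedViewSketch(const char* str,
                                               std::size_t max_len) {
    std::size_t length = 0;
    while (length < max_len && str[length] != '\0') {
      ++length;
    }
    return std::string_view(str, length);
  }

  // Hypothetical stand-in for a bounded strlen: empty if no terminator is
  // found within the first max_len characters.
  constexpr std::optional<std::size_t> BoundedLengthSketch(const char* str,
                                                           std::size_t max_len) {
    const std::string_view view = ClampedViewSketch(str, max_len);
    if (view.size() == max_len) {
      return std::nullopt;  // Ran out of buffer before reaching a '\0'.
    }
    return view.size();
  }

The view-returning form stays usable for unterminated input, while the
length-returning form reports the missing terminator to the caller.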

pw::string::Copy
================
The ``pw::string::Copy`` functions provide a safer alternative to
``std::strncpy``: they always null-terminate whenever the destination buffer
has a non-zero size.

.. cpp:function:: StatusWithSize Copy(const std::string_view& source, span<char> dest)
.. cpp:function:: StatusWithSize Copy(const char* source, span<char> dest)
.. cpp:function:: StatusWithSize Copy(const char* source, char* dest, size_t num)
.. cpp:function:: StatusWithSize Copy(const pw::Vector<char>& source, span<char> dest)

   Copies the source string to the dest, truncating if the full string does not
   fit. Always null terminates if dest.size() or num > 0.

   Returns the number of characters written, excluding the null terminator. If
   the string is truncated, the status is ResourceExhausted.

   Precondition: The destination and source shall not overlap.
   Precondition: The source shall be a valid pointer.

``Copy`` also has variants that take a ``pw::Vector<char>`` destination
(see :ref:`module-pw_containers` for details). These variants do not store the
null terminator in the vector.

.. cpp:function:: StatusWithSize Copy(const std::string_view& source, pw::Vector<char>& dest)
.. cpp:function:: StatusWithSize Copy(const char* source, pw::Vector<char>& dest)
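
The truncate-and-terminate behavior described for the ``span<char>`` variants
can be sketched with the standard library alone. ``CopyResult`` and
``TruncatingCopy`` are hypothetical names for this example. The real ``Copy``
functions return a ``StatusWithSize``, and the handling of a zero-size
destination below is an assumption of the sketch.

.. code-block:: cpp

  #include <algorithm>
  #include <cstddef>
  #include <string_view>

  // Hypothetical result type for this sketch only.
  struct CopyResult {
    bool truncated;    // true if the whole source did not fit
    std::size_t size;  // characters written, excluding the null terminator
  };

  // Hypothetical truncating copy that terminates any non-empty destination.
  CopyResult TruncatingCopy(std::string_view source, char* dest,
                            std::size_t dest_size) {
    if (dest_size == 0) {
      return {true, 0};  // Nothing can be written, not even a terminator.
    }
    // Reserve one character of the destination for the null terminator.
    const std::size_t count = std::min(source.size(), dest_size - 1);
    std::copy_n(source.data(), count, dest);
    dest[count] = '\0';
    return {count < source.size(), count};
  }

By contrast, ``std::strncpy`` leaves the destination unterminated whenever the
source is at least as long as the destination.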


pw::string::PrintableCopy
=========================
The ``pw::string::PrintableCopy`` function provides a safe printable copy of a
string. It provides the same safety as ``pw::string::Copy`` while also
converting any non-printable characters to a ``.`` character.

.. cpp:function:: StatusWithSize PrintableCopy(const std::string_view& source, span<char> dest)
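
The character substitution can be sketched as follows. ``PrintableCopySketch``
is a hypothetical name, and the use of ``std::isprint`` to decide what counts as
printable is an assumption of this example rather than a statement about the
``pw_string`` implementation.

.. code-block:: cpp

  #include <cctype>
  #include <cstddef>
  #include <string_view>

  // Hypothetical sketch: copies source into dest, replacing non-printable
  // characters with '.', truncating if needed, and always null-terminating a
  // non-empty destination. Returns the number of characters written.
  std::size_t PrintableCopySketch(std::string_view source, char* dest,
                                  std::size_t dest_size) {
    if (dest_size == 0) {
      return 0;
    }
    std::size_t count = 0;
    for (; count < source.size() && count < dest_size - 1; ++count) {
      // std::isprint requires a value representable as unsigned char.
      const unsigned char c = static_cast<unsigned char>(source[count]);
      dest[count] = std::isprint(c) ? source[count] : '.';
    }
    dest[count] = '\0';
    return count;
  }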

pw::StringBuilder
=================
``pw::StringBuilder`` facilitates building formatted strings in a fixed-size
buffer. It is designed to give the flexibility of ``std::string`` and
``std::ostringstream``, but with a small footprint.

.. code-block:: cpp

  #include "pw_log/log.h"
  #include "pw_string/string_builder.h"

  pw::Status LogProducedData(std::string_view func_name,
                             span<const std::byte> data) {
    pw::StringBuffer<42> sb;

    // Append a std::string_view to the buffer.
    sb << func_name;

    // Append a format string to the buffer.
    sb.Format(" produced %d bytes of data: ", static_cast<int>(data.size()));

    // Append bytes as hex to the buffer.
    sb << data;

    // Log the final string.
    PW_LOG_DEBUG("%s", sb.c_str());

    // Errors encountered while mutating the string builder are tracked.
    return sb.status();
  }

Supporting custom types with StringBuilder
------------------------------------------
As with ``std::ostream``, StringBuilder supports printing custom types by
overloading the ``<<`` operator. This is done by defining ``operator<<`` in
the same namespace as the custom type. For example:

.. code-block:: cpp

  namespace my_project {

  struct MyType {
    int foo;
    const char* bar;
  };

  pw::StringBuilder& operator<<(pw::StringBuilder& sb, const MyType& value) {
    return sb << "MyType(" << value.foo << ", " << value.bar << ')';
  }

  }  // namespace my_project

Internally, ``StringBuilder`` uses the ``ToString`` function to print. The
``ToString`` template function can be specialized to support custom types with
``StringBuilder``, though it is recommended to overload ``operator<<`` instead.
This example shows how to specialize ``pw::ToString``:

.. code-block:: cpp

  #include "pw_string/to_string.h"

  namespace pw {

  template <>
  StatusWithSize ToString<MyStatus>(MyStatus value, span<char> buffer) {
    return Copy(MyStatusString(value), buffer);
  }

  }  // namespace pw

Size report: replacing snprintf with pw::StringBuilder
------------------------------------------------------
StringBuilder is safe, flexible, and results in much smaller code size than
using ``std::ostringstream``. However, applications sensitive to code size
should use StringBuilder with care.

The fixed code size cost of StringBuilder is significant, though smaller than
``std::snprintf``. Using StringBuilder's ``<<`` and ``append`` methods
exclusively in place of ``snprintf`` reduces code size, but ``snprintf`` may be
difficult to avoid.

The incremental code size cost of StringBuilder is comparable to ``snprintf`` if
errors are handled. Each argument to StringBuilder's ``<<`` expands to a
function call, but one or two StringBuilder appends may have a smaller code size
impact than a single ``snprintf`` call.

.. include:: string_builder_size_report

Module Configuration Options
============================
The following configuration options can be adjusted via compile-time
configuration of this module.

.. c:macro:: PW_STRING_ENABLE_DECIMAL_FLOAT_EXPANSION

   Setting this to a non-zero value will result in the ``ToString`` function
   outputting string representations of floating-point values with a decimal
   expansion after the point, by using the ``Format`` function. The default
   value of this configuration option is zero, which will result in
   floating-point values being rounded to the nearest integer in their string
   representation.

   Using a non-zero value for this configuration option may incur a code size
   cost due to the dependency on ``Format``.

-----------
Future work
-----------
* StringBuilder's fixed size cost can be dramatically reduced by limiting
  support for 64-bit integers.
* Consider integrating with the tokenizer module.

------
Zephyr
------
To enable ``pw_string`` for Zephyr, add ``CONFIG_PIGWEED_STRING=y`` to the
project's configuration.
