docs: add OS abstraction layers doc

Adds a new top level doc on Pigweed's OS abstraction layers.
While doing so, also adds some minor updates to the relevant modules.

Change-Id: Ie5c324088d06a1b4c6c11ff5abbbea03f4e9c61d
Reviewed-on: https://pigweed-review.googlesource.com/c/pigweed/pigweed/+/44920
Commit-Queue: Ewout van Bekkum <ewout@google.com>
Reviewed-by: Keir Mierle <keir@google.com>
diff --git a/docs/BUILD.gn b/docs/BUILD.gn
index 0c99c11..26004b7 100644
--- a/docs/BUILD.gn
+++ b/docs/BUILD.gn
@@ -37,6 +37,7 @@
     "faq.rst",
     "getting_started.rst",
     "module_structure.rst",
+    "os_abstraction_layers.rst",
     "style_guide.rst",
   ]
 }
diff --git a/docs/index.rst b/docs/index.rst
index 8d996e4..28cf4ab 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -15,6 +15,7 @@
   docs/code_of_conduct
   docs/embedded_cpp_guide
   Code Style <docs/style_guide>
+  docs/os_abstraction_layers
   targets
   Build System <build_system>
   FAQ <docs/faq>
diff --git a/docs/os_abstraction_layers.rst b/docs/os_abstraction_layers.rst
new file mode 100644
index 0000000..08a6db6
--- /dev/null
+++ b/docs/os_abstraction_layers.rst
@@ -0,0 +1,493 @@
+.. _docs-os_abstraction_layers:
+
+=====================
+OS Abstraction Layers
+=====================
+Pigweed’s operating system abstraction layers are configurable building blocks.
+They are designed to be lightweight, portable, and easy to use while giving
+users full control and configurability.
+
+Although we primarily target smaller-footprint MMU-less 32-bit microcontrollers,
+the OS abstraction layers are written to work on everything from single-core
+bare metal low end microcontrollers to asymmetric multiprocessing (AMP) and
+symmetric multiprocessing (SMP) embedded systems using Real Time Operating
+Systems (RTOS). They even fully work on your developer workstation on Linux,
+Windows, or macOS!
+
+.. list-table::
+
+  * - **Environment**
+    - **Status**
+  * - FreeRTOS
+    - Supported
+  * - ThreadX
+    - Supported
+  * - embOS
+    - *In Progress*
+  * - STL
+    - Supported
+  * - Zephyr
+    - Planned
+  * - CMSIS-RTOS API v2 & RTX5
+    - Planned
+  * - Baremetal
+    - *In Progress*
+
+.. contents::
+   :local:
+   :depth: 1
+
+-------------
+OS Primitives
+-------------
+Pigweed's OS abstraction layers are divided by the functional grouping of the
+primitives. Many of our APIs are similar or nearly identical to C++'s Standard
+Template Library (STL) with the notable exception that we do not support
+exceptions. We opted to follow the STL's APIs partially because they are
+relatively well thought out and many developers are already familiar with them,
+but also because this means they are compatible with existing helpers in the STL
+which can then be further leveraged.
+
+---------------------------
+pw_chrono - Time Primitives
+---------------------------
+The :ref:`module-pw_chrono` module provides the building blocks for expressing
+durations, timestamps, and acquiring the current time. This in turn is used by
+other modules, including :ref:`module-pw_sync` and :ref:`module-pw_thread`, as
+the basis for any time bound APIs (i.e. with timeouts and/or deadlines). Note
+that this module is optional and bare metal targets may opt not to use it.
+
+.. list-table::
+
+  * - **Supported On**
+    - **SystemClock**
+  * - FreeRTOS
+    - :ref:`module-pw_chrono_freertos`
+  * - ThreadX
+    - :ref:`module-pw_chrono_threadx`
+  * - embOS
+    - :ref:`module-pw_chrono_embos`
+  * - STL
+    - :ref:`module-pw_chrono_stl`
+  * - Zephyr
+    - Planned
+  * - CMSIS-RTOS API v2 & RTX5
+    - Planned
+  * - Baremetal
+    - Planned
+
+
+SystemClock
+===========
+For RTOS and HAL interactions, we provide a ``pw::chrono::SystemClock`` facade
+which provides 64 bit timestamps and duration support along with a C API. For
+C++ there is an optional virtual wrapper, ``pw::chrono::VirtualSystemClock``,
+around the singleton clock facade to enable dependency injection.
+
+.. code-block:: cpp
+
+  #include <chrono>
+
+  #include "pw_thread/sleep.h"
+
+  using namespace std::literals::chrono_literals;
+
+  void ThisSleeps() {
+    pw::thread::sleep_for(42ms);
+  }
+
+Unlike the STL's time bound templated APIs which are not specific to a
+particular clock, Pigweed's time bound APIs are strongly typed to use the
+``pw::chrono::SystemClock``'s ``duration`` and ``time_point`` types directly.
+
+.. code-block:: cpp
+
+  #include "pw_chrono/system_clock.h"
+
+  bool HasThisPointInTimePassed(
+      const pw::chrono::SystemClock::time_point timestamp) {
+    return pw::chrono::SystemClock::now() > timestamp;
+  }
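+
+As a minimal sketch of how a relative timeout relates to an absolute deadline,
+the following uses ``std::chrono::steady_clock`` as a stand-in for
+``pw::chrono::SystemClock`` so that it builds with only the STL. The ``Clock``
+alias and both helper names are assumptions made for illustration and are not
+part of Pigweed.
+
+.. code-block:: cpp
+
+  #include <chrono>
+
+  // Stand-in for pw::chrono::SystemClock (assumption for this sketch).
+  using Clock = std::chrono::steady_clock;
+
+  // Converts a relative timeout into an absolute deadline.
+  Clock::time_point DeadlineAfter(Clock::time_point now,
+                                  Clock::duration timeout) {
+    return now + timeout;
+  }
+
+  // Returns true once `now` has reached or passed `deadline`.
+  bool HasExpired(Clock::time_point deadline, Clock::time_point now) {
+    return now >= deadline;
+  }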
+
+------------------------------------
+pw_sync - Synchronization Primitives
+------------------------------------
+The :ref:`module-pw_sync` module provides the building blocks for synchronizing
+between threads and/or interrupts through signaling primitives and critical
+section lock primitives.
+
+Critical Section Lock Primitives
+================================
+Pigweed's locks support Clang's thread safety lock annotations and the STL's
+RAII helpers.
+
+.. list-table::
+
+  * - **Supported On**
+    - **Mutex**
+    - **TimedMutex**
+    - **InterruptSpinLock**
+  * - FreeRTOS
+    - :ref:`module-pw_sync_freertos`
+    - :ref:`module-pw_sync_freertos`
+    - :ref:`module-pw_sync_freertos`
+  * - ThreadX
+    - :ref:`module-pw_sync_threadx`
+    - :ref:`module-pw_sync_threadx`
+    - :ref:`module-pw_sync_threadx`
+  * - embOS
+    - :ref:`module-pw_sync_embos`
+    - :ref:`module-pw_sync_embos`
+    - :ref:`module-pw_sync_embos`
+  * - STL
+    - :ref:`module-pw_sync_stl`
+    - :ref:`module-pw_sync_stl`
+    - :ref:`module-pw_sync_stl`
+  * - Zephyr
+    - Planned
+    - Planned
+    - Planned
+  * - CMSIS-RTOS API v2 & RTX5
+    - Planned
+    - Planned
+    - Planned
+  * - Baremetal
+    - Planned, not ready for use
+    - ✗
+    - Planned, not ready for use
+
+
+Thread Safe Mutex
+-----------------
+The ``pw::sync::Mutex`` protects shared data from being simultaneously accessed
+by multiple threads. Optionally, the ``pw::sync::TimedMutex`` can be used as an
+extension with timeout and deadline based semantics.
+
+.. code-block:: cpp
+
+  #include <mutex>
+
+  #include "pw_sync/mutex.h"
+
+  pw::sync::Mutex mutex;
+
+  void ThreadSafeCriticalSection() {
+    std::lock_guard lock(mutex);
+    NotThreadSafeCriticalSection();
+  }
+
+Interrupt Safe InterruptSpinLock
+--------------------------------
+The ``pw::sync::InterruptSpinLock`` protects shared data from being
+simultaneously accessed by multiple threads and/or interrupts as a targeted
+global lock, with the exception of Non-Maskable Interrupts (NMIs). Unlike global
+interrupt locks, this also works safely and efficiently on SMP systems.
+
+.. code-block:: cpp
+
+  #include <mutex>
+
+  #include "pw_sync/interrupt_spin_lock.h"
+
+  pw::sync::InterruptSpinLock interrupt_spin_lock;
+
+  void InterruptSafeCriticalSection() {
+    std::lock_guard lock(interrupt_spin_lock);
+    NotThreadSafeCriticalSection();
+  }
+
+Signaling Primitives
+====================
+Native signaling primitives tend to vary more across platforms than critical
+section locks do. For example, although common signaling primitives like
+semaphores exist in most if not all RTOSes and even POSIX, they were not in the
+STL before C++20. Likewise, many C++ developers are surprised that condition
+variables tend not to be natively supported on RTOSes. Although you can usually
+build any signaling primitive based on other native signaling primitives,
+this may come with non-trivial added overhead in ROM, RAM, and execution
+efficiency.
+
+For this reason, Pigweed intends to provide some simpler signaling primitives
+which each solve a narrow programming need but can be implemented as
+efficiently as possible for the platform they are used on. This simpler but
+highly portable class of signaling primitives is intended to ensure that a
+tradeoff between portability and efficiency does not have to be made up front.
+
+.. list-table::
+
+  * - **Supported On**
+    - **ThreadNotification**
+    - **TimedThreadNotification**
+    - **CountingSemaphore**
+    - **BinarySemaphore**
+  * - FreeRTOS
+    - :ref:`module-pw_sync_freertos`
+    - :ref:`module-pw_sync_freertos`
+    - :ref:`module-pw_sync_freertos`
+    - :ref:`module-pw_sync_freertos`
+  * - ThreadX
+    - :ref:`module-pw_sync_threadx`
+    - :ref:`module-pw_sync_threadx`
+    - :ref:`module-pw_sync_threadx`
+    - :ref:`module-pw_sync_threadx`
+  * - embOS
+    - :ref:`module-pw_sync_embos`
+    - :ref:`module-pw_sync_embos`
+    - :ref:`module-pw_sync_embos`
+    - :ref:`module-pw_sync_embos`
+  * - STL
+    - :ref:`module-pw_sync_stl`
+    - :ref:`module-pw_sync_stl`
+    - :ref:`module-pw_sync_stl`
+    - :ref:`module-pw_sync_stl`
+  * - Zephyr
+    - Planned
+    - Planned
+    - Planned
+    - Planned
+  * - CMSIS-RTOS API v2 & RTX5
+    - Planned
+    - Planned
+    - Planned
+    - Planned
+  * - Baremetal
+    - Planned
+    - ✗
+    - TBD
+    - TBD
+
+Thread Notification
+-------------------
+Pigweed intends to provide the ``pw::sync::ThreadNotification`` and
+``pw::sync::TimedThreadNotification`` facades which permit a single consumer to
+block until an event occurs. This should be backed by the most efficient native
+primitive for a target, regardless of whether that is a semaphore, event flag
+group, condition variable, direct task notification with a critical section, or
+something else.
+
+Counting Semaphore
+------------------
+The ``pw::sync::CountingSemaphore`` is a synchronization primitive that can be
+used for counting events and/or resource management where receiver(s) can block
+on acquire until notifier(s) signal by invoking release.
+
+.. code-block:: cpp
+
+  #include "pw_sync/counting_semaphore.h"
+
+  pw::sync::CountingSemaphore event_semaphore;
+
+  void NotifyEventOccurred() {
+    event_semaphore.release();
+  }
+
+  void HandleEventsForever() {
+    while (true) {
+      event_semaphore.acquire();
+      HandleEvent();
+    }
+  }
+
+Binary Semaphore
+----------------
+The ``pw::sync::BinarySemaphore`` is a specialization of the counting semaphore
+with an arbitrary token limit of 1, meaning it's either full or empty.
+
+.. code-block:: cpp
+
+  #include "pw_sync/binary_semaphore.h"
+
+  pw::sync::BinarySemaphore result_ready_semaphore;
+
+  void NotifyResultReady() {
+    result_ready_semaphore.release();
+  }
+
+  void BlockUntilResultReady() {
+    result_ready_semaphore.acquire();
+  }
+
+--------------------------------
+pw_thread - Threading Primitives
+--------------------------------
+The :ref:`module-pw_thread` module provides the building blocks for creating and
+using threads including yielding and sleeping.
+
+.. list-table::
+
+  * - **Supported On**
+    - **Thread Creation**
+    - **Thread Id/Sleep/Yield**
+  * - FreeRTOS
+    - :ref:`module-pw_thread_freertos`
+    - :ref:`module-pw_thread_freertos`
+  * - ThreadX
+    - :ref:`module-pw_thread_threadx`
+    - :ref:`module-pw_thread_threadx`
+  * - embOS
+    - Under Development
+    - :ref:`module-pw_thread_embos`
+  * - STL
+    - :ref:`module-pw_thread_stl`
+    - :ref:`module-pw_thread_stl`
+  * - Zephyr
+    - Planned
+    - Planned
+  * - CMSIS-RTOS API v2 & RTX5
+    - Planned
+    - Planned
+  * - Baremetal
+    - ✗
+    - ✗
+
+Thread Creation
+===============
+The ``pw::thread::Thread``’s API is C++11 STL ``std::thread`` like. Unlike
+``std::thread``, Pigweed's API requires ``pw::thread::Options`` as an
+argument for creating a thread. This gives the user full control over the
+native OS's threading options without getting in their way.
+
+.. code-block:: cpp
+
+  #include "pw_thread/detached_thread.h"
+  #include "pw_thread_freertos/context.h"
+  #include "pw_thread_freertos/options.h"
+
+  pw::thread::freertos::ContextWithStack<42> example_thread_context;
+
+  void StartDetachedExampleThread() {
+    pw::thread::DetachedThread(
+        pw::thread::freertos::Options()
+            .set_name("static_example_thread")
+            .set_priority(kFooPriority)
+            .set_static_context(example_thread_context),
+        example_thread_function);
+  }
+
+Controlling the current thread
+==============================
+Beyond thread creation, Pigweed offers support for sleeping, identifying, and
+yielding the current thread.
+
+.. code-block:: cpp
+
+  #include "pw_thread/yield.h"
+
+  void CooperativeBusyLooper() {
+    while (true) {
+      DoChunkOfWork();
+      pw::this_thread::yield();
+    }
+  }
+
+--------------------------------------------
+Execution Contexts & Thread-Safety API Model
+--------------------------------------------
+The explosion of real contexts is too large for Pigweed to fully cover in a way
+that provides value. First, there are many more contexts than just threads and
+IRQ handlers on microcontrollers, including meta contexts like non-blocking
+thread callbacks which may have scheduling and/or interrupts masked to some
+degree. On top of this, some environments, such as userspace, may not even have
+interrupts and instead deal with signals. Instead we use the following
+simplified execution thread-safety model which our APIs should be ported to
+support regardless of the real contexts they are executed in:
+
+**Thread Safe APIs** - These APIs are safe to use in any execution context where
+one can use blocking or yielding APIs such as sleeping, blocking on a mutex, or
+waiting on a semaphore.
+
+**Interrupt (IRQ) Safe APIs** - These APIs can be used in any execution context
+which cannot use blocking and yielding APIs. These APIs must protect themselves
+from preemption from maskable interrupts, etc. This includes critical section
+thread contexts in addition to "real" interrupt contexts. Our definition
+explicitly excludes any interrupts which are not masked when holding a SpinLock;
+those are all considered non-maskable interrupts. An interrupt safe API may
+always be safely used in a context which permits thread safe APIs.
+
+**Non-Maskable Interrupt (NMI) Safe APIs** - Like the Interrupt Safe APIs, these
+can be used in any execution context which cannot use blocking or yielding APIs.
+In addition, these may be used by interrupts which are not masked when, for
+example, holding a SpinLock, such as CPU exceptions or C++/POSIX signals. These
+tend to come with significant overhead and restrictions compared to regular
+interrupt safe APIs as they cannot rely on critical sections for their
+implementations; instead only atomic signaling can be used. An NMI safe API may
+always be safely used in a context which permits interrupt safe and thread safe
+APIs.
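+
+A minimal sketch of the atomic-only signaling that NMI safe code is limited to,
+written against the STL. The handler and function names are hypothetical, and a
+real port would also need to confirm that the atomic type is lock-free on its
+target.
+
+.. code-block:: cpp
+
+  #include <atomic>
+
+  // Set from the NMI handler, consumed from thread context.
+  std::atomic<bool> fault_pending{false};
+
+  // Hypothetical NMI handler: it may not block or take locks, so it only
+  // performs a single atomic store.
+  void OnNonMaskableInterrupt() {
+    fault_pending.store(true, std::memory_order_release);
+  }
+
+  // Returns true if a fault was signaled since the last call and clears it.
+  bool ConsumePendingFault() {
+    return fault_pending.exchange(false, std::memory_order_acq_rel);
+  }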
+
+Instead of going with context specific APIs, e.g. FreeRTOS's ``*FromISR()``
+APIs, Pigweed opted to go with the merged (context agnostic) API which validates
+the context requirements through ``DASSERT`` and ``DCHECK`` in the backends
+(user configurable). We did this primarily for two reasons. First, the explosion
+of real contexts is too large for Pigweed to fully cover as mentioned above,
+meaning there would likely have to be some context aware multiplexing with our
+simplified thread safety model split APIs. Second, we would recommend a
+``DCHECK`` to enforce context requirements regardless, so we've opted for the
+simplest API which also happens to match both C++'s STL and Google's Abseil
+relatively closely.
+
+---------------------------------------------------
+Construction Requirements & Initialization Paradigm
+---------------------------------------------------
+
+**TL;DR: Pigweed OS primitives are initialized through C++ construction.**
+
+We have chosen to go with a model which initializes the synchronization
+primitive during C++ object construction. For static instantiation to be safe,
+the user must therefore ensure that any necessary kernel and/or platform
+initialization is done before the global static constructors run, which
+includes construction of the C++ synchronization primitives.
+
+In addition this model for now assumes that Pigweed code will always be used to
+construct synchronization primitives used with Pigweed modules. Note that with
+this model the backend provider can decide if they want to statically
+preallocate space for the primitives or rely on dynamic allocation strategies.
+If we discover at a later point that this is not sufficiently portable then we
+can either produce an optional constructor that takes in a reference to an
+existing native synchronization type and wastes a little bit of RAM or we can
+refactor the existing class into two layers where one is a StaticMutex for
+example and the other is a Mutex which only holds a handle to the native mutex
+type. This would then permit users who cannot construct their synchronization
+primitives to skip the optional static layer.
+
+Kernel / Platform Initialization Before C++ Global Static Constructors
+======================================================================
+What is this kernel and/or platform initialization that must be done first?
+
+It's not uncommon for an RTOS to require some initialization functions to be
+invoked before more of its API can be safely used. For example for CMSIS RTOSv2
+``osKernelInitialize()`` must be invoked before anything but two basic getters
+are called. Similarly, Segger's embOS requires ``OS_Init()`` to be invoked first
+before any other embOS API.
+
+.. Note::
+  To get around this one should invoke these initialization functions earlier
+  and/or delay the static C++ constructors to meet this ordering requirement. As
+  an example if you were using :ref:`module-pw_boot_armv7m`, then
+  ``pw_boot_PreStaticConstructorInit()`` would be a great place to invoke kernel
+  initialization.
+
+-------
+Roadmap
+-------
+Pigweed is still actively expanding and improving its OS Abstraction Layers.
+That being said, the following concrete areas are being worked on and can be
+expected to land at some point in the future:
+
+1. Thread creation support for embOS is in progress.
+2. We'd like to offer a system clock based timer abstraction facade which can be
+   used on either an RTOS or a hardware timer.
+3. We are evaluating a less-portable but very useful portability facade for
+   event flags / groups. This would make it even easier to ensure all firmware
+   can be fully executed on the host.
+4. Cooperative cancellation thread joining along with a ``std::jthread`` like
+   wrapper is in progress.
+5. We'd like to add support for queues, message queues, and similar channel
+   abstractions which also support interprocessor communication in a transparent
+   manner.
+6. We're interested in supporting asynchronous worker queues and worker queue
+   pools.
+7. Migrate HAL and similar APIs to use deadlines for the backend virtual
+   interfaces to permit a smaller vtable which supports both public timeout and
+   deadline semantics.
+8. Baremetal support is partially in place today, but it's not ready for use.
+9. Most of our APIs today are focused around synchronous blocking APIs, however
+   we would love to extend this to include asynchronous APIs.
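+
+Roadmap item 7 can be illustrated with a small sketch: the public API offers
+both timeout and deadline flavors, but only the deadline flavor is virtual, so a
+backend implements a single time bound method. ``Reader``, ``FakeReader``, and
+the method names are hypothetical, and ``std::chrono::steady_clock`` stands in
+for ``pw::chrono::SystemClock``.
+
+.. code-block:: cpp
+
+  #include <chrono>
+
+  // Stand-in for pw::chrono::SystemClock (assumption for this sketch).
+  using Clock = std::chrono::steady_clock;
+
+  class Reader {
+   public:
+    virtual ~Reader() = default;
+
+    // Timeout flavor: converted to a deadline, then forwarded.
+    bool ReadFor(Clock::duration timeout) {
+      return DoReadUntil(Clock::now() + timeout);
+    }
+
+    // Deadline flavor: forwarded as-is.
+    bool ReadUntil(Clock::time_point deadline) { return DoReadUntil(deadline); }
+
+   private:
+    // The only time bound virtual method a backend has to implement.
+    virtual bool DoReadUntil(Clock::time_point deadline) = 0;
+  };
+
+  // Hypothetical backend which succeeds only if the deadline has not passed.
+  class FakeReader final : public Reader {
+   private:
+    bool DoReadUntil(Clock::time_point deadline) override {
+      return Clock::now() < deadline;
+    }
+  };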
diff --git a/pw_chrono/docs.rst b/pw_chrono/docs.rst
index e62aba6..62cc03b 100644
--- a/pw_chrono/docs.rst
+++ b/pw_chrono/docs.rst
@@ -7,9 +7,20 @@
 leveraging many pieces of STL's the ``std::chrono`` library but with a focus
 on portability for constrained embedded devices and maintaining correctness.
 
+At a high level Pigweed's time primitives rely on C++'s
+`<chrono> <https://en.cppreference.com/w/cpp/header/chrono>`_ library to enable
+users to express intents with strongly typed real time units. In addition, it
+extends the C++ named
+`Clock <https://en.cppreference.com/w/cpp/named_req/Clock>`_ and
+`TrivialClock <https://en.cppreference.com/w/cpp/named_req/TrivialClock>`_
+requirements with additional attributes such as whether a clock is monotonic
+(not just steady), is always enabled (or requires enabling), is free running
+(works even if interrupts are masked), whether it is safe to use in a
+Non-Maskable Interrupts (NMI), what the epoch is, and more.
+
 .. warning::
-  This module is under construction, not ready for use, and the documentation
-  is incomplete.
+  This module is still under construction and the API is not yet stable. The
+  documentation is also incomplete.
 
 SystemClock facade
 ------------------
diff --git a/pw_sync/docs.rst b/pw_sync/docs.rst
index f2959c8..b5835fb 100644
--- a/pw_sync/docs.rst
+++ b/pw_sync/docs.rst
@@ -969,7 +969,7 @@
 
 This simpler but highly portable class of signaling primitives is intended to
 ensure that a portability efficiency tradeoff does not have to be made up front.
-For example we intend to provide a ``pw::sync::Notification`` facade which
+For example we intend to provide a ``pw::sync::ThreadNotification`` facade which
 permits a singler consumer to block until an event occurs. This should be
 backed by the most efficient native primitive for a target, regardless of
 whether that is a semaphore, event flag group, condition variable, or something
@@ -1043,12 +1043,23 @@
   * - CMSIS-RTOS API v2 & RTX5
     - Planned
 
+Condition Variables
+===================
+We've decided for now to skip condition variables. These are constructs which
+are typically not natively available on RTOSes. CVs would have to be backed by
+multiple hidden semaphores in addition to the explicit public mutex. In other
+words a CV typically ends up as a composition of synchronization primitives on
+RTOSes. That being said, one could implement them using our semaphore and mutex
+layers and we may consider providing this in the future. However, most of our
+resource constrained customers will most likely be using semaphores more often
+than CVs.
+
 Coming Soon
 ===========
 We are intending to provide facades for:
 
-* ``pw::sync::Notification``: A portable abstraction to allow threads to receive
-  notification of a single occurrence of a single event.
+* ``pw::sync::ThreadNotification``: A portable abstraction to allow a single
+  thread to receive notification of a single occurrence of a single event.
 
 * ``pw::sync::EventGroup`` A facade for a common primitive on RTOSes like
   FreeRTOS, RTX5, ThreadX, and embOS which permit threads and interrupts to
diff --git a/pw_thread/docs.rst b/pw_thread/docs.rst
index c2b5ef3..99003cd 100644
--- a/pw_thread/docs.rst
+++ b/pw_thread/docs.rst
@@ -27,6 +27,13 @@
 ``pw::thread::ThreadCore`` objects and functions which match the
 ``pw::thread::Thread::ThreadRoutine`` signature.
 
+We recognize that C++11's STL ``std::thread`` API has some drawbacks where
+it is easy to forget to join or detach the thread handle. Because of this, we
+offer helper wrappers like the ``pw::thread::DetachedThread``. Soon we will
+extend this by also adding a ``pw::thread::JoiningThread`` helper wrapper which
+will also have a lighter weight C++20 ``std::jthread`` like cooperative
+cancellation contract to make joining safer and easier.
+
 Threads may begin execution immediately upon construction of the associated
 thread object (pending any OS scheduling delays), starting at the top-level
 function provided as a constructor argument. The return value of the