Update documentation
diff --git a/docs/menu.rst b/docs/menu.rst
index 2c110de..afd9772 100644
--- a/docs/menu.rst
+++ b/docs/menu.rst
@@ -5,9 +5,11 @@
     3) `API reference`_
     4) `Security model`_
     5) `Migration from older versions`_
+    6) `New features`_
     
 .. _`Overview`: index.html
 .. _`Concepts`: concepts.html
 .. _`API reference`: reference.html
 .. _`Security model`: security.html
 .. _`Migration from older versions`: migration.html
+.. _`New features`: whats_new.html
diff --git a/docs/whats_new.rst b/docs/whats_new.rst
new file mode 100644
index 0000000..38bfec5
--- /dev/null
+++ b/docs/whats_new.rst
@@ -0,0 +1,171 @@
+====================
+Nanopb: New features
+====================
+
+.. include :: menu.rst
+
+.. contents ::
+
+What's new in nanopb 0.4
+========================
+Long in the making, nanopb 0.4 has seen some wide-reaching improvements in
+reaction to the development of the rest of the protobuf ecosystem. This document
+showcases features that are not immediately visible, but that you may want to
+take advantage of.
+
+A lot of effort has been spent on retaining backwards and forwards compatibility
+with previous nanopb versions. For a list of breaking changes, see the `migration document`_.
+
+.. _`migration document`: migration.html
+
+New field descriptor format
+---------------------------
+The basic design of nanopb has always been that the information about messages
+is stored in a compact descriptor format, which is iterated over at runtime.
+Initially this format was very tightly coupled to the encoder and decoder logic.
+
+In nanopb-0.3.0 the field iteration logic was separated into `pb_common.c`.
+Already at that point it was clear that the old format was getting too limited,
+but it wasn't extended at that time.
+
+Now in 0.4, the descriptor format has been completely decoupled from the encoder
+and decoder logic, and redesigned to meet new demands. Previously each field
+was stored as a `pb_field_t` struct, which was between 8 and 32 bytes in size,
+depending on compilation options and platform. Now information about fields is
+stored as a variable-length sequence of `uint32_t` data words. There
+are 1-, 2-, 4- and 8-word formats, with the 8-word format containing plenty of
+space for future extensibility.
+
+One benefit of the variable-length format is that most messages now take less
+storage space. Most fields use 2 words, while simple fields in small messages
+require only 1 word. The benefit is larger if the code previously required the
+`PB_FIELD_16BIT` or `PB_FIELD_32BIT` options. In the `AllTypes` test case, 0.3
+had a data size of 1008 bytes in 8-bit configuration and 1408 bytes in 16-bit
+configuration. The new format in 0.4 takes 896 bytes in either case.
+
+In addition, the new decoupling has allowed moving most of the field descriptor
+data into flash on Harvard architectures, such as AVR. Previously nanopb was
+quite RAM-heavy on AVR, which cannot place ordinary constants in flash the way
+most other platforms do.
+
+Python packaging for generator
+------------------------------
+The nanopb generator is now available as a Python package, installable using the
+`pip` package manager. This reduces the need for binary packages: if you already
+have Python installed, you can just `pip install nanopb` and have the
+generator available on the path as `nanopb_generator`.
+
+The generator can also take advantage of the Python-based `protoc` available in
+the `grpcio-tools` Python package. If you also install that, there is no longer
+a need to have a binary `protoc` available.
+
+Generator now automatically calls protoc
+----------------------------------------
+Initially, the nanopb generator was used in two steps: first calling `protoc` to
+parse the `.proto` file into the binary `.pb` format, and then calling `nanopb_generator`
+to output the `.pb.h` and `.pb.c` files.
+
+Nanopb 0.2.3 added support for running as a `protoc` plugin, which allowed
+single-step generation using the `--nanopb_out` parameter. However, the plugin
+mode has two complications: passing options to the nanopb generator itself becomes
+more difficult, and the generator does not know the actual path of input files.
+The second limitation has been particularly problematic for locating `.options`
+files.
+
+Both of these older methods still work and will remain supported. However, now
+`nanopb_generator` can also take `.proto` files directly and it will transparently
+call `protoc` in the background.
+
+Callbacks bound by function name
+--------------------------------
+Since its very beginnings, nanopb has supported field callbacks to allow processing
+structures that are larger than what could fit in memory at once. So far the
+callback functions have been stored in the message structure in a `pb_callback_t`
+struct.
+
+Storing pointers along with user data is somewhat risky from a security point of
+view. In addition, it has caused problems with `oneof` fields, which reuse the
+same storage space for multiple submessages. Because there is no separate area
+for each submessage, there is no space to store the callback pointers either.
+
+Nanopb-0.4.0 introduces callbacks that are referenced by function name
+instead of by separately set pointers. This should work well for most
+applications that have a single callback function for each message type.
+For more complex needs, `pb_callback_t` will also remain supported.
+
+Function name callbacks also allow specifying custom data types for inclusion
+in the message structure. For example, you could have a `MyObject*` pointer along
+with other message fields, and then process that object in a custom way in your
+callback.
+
+This feature is demonstrated in the `tests/oneof_callback` test case and the
+`examples/network_server` example.
+
+Message level callback for oneofs
+---------------------------------
+As mentioned above, callbacks inside submessages inside oneofs have been
+problematic to use. To make `pb_callback_t`-style callbacks usable there,
+a new generator option `submsg_callback` was added.
+
+Setting this option to true will cause a new message-level callback to be added
+before the `which_field` of the oneof. This callback will be called when the
+submessage tag number is known, but before the actual message is decoded. The
+callback can either choose to set callback pointers inside the submessage, or
+just completely decode the submessage there and then. If any unread data remains
+after the callback returns, normal submessage decoding will continue.
+
+There is an example of this in the `tests/oneof_callback` test case.
+
+Binding message types to custom structures
+------------------------------------------
+It is often said that good C code is chock full of macros. Or maybe I got it
+wrong. But since nanopb 0.2, the field descriptor generation has heavily relied
+on macros. This allows it to automatically adapt to differences in type alignment
+on different platforms, and to decouple the Python generation logic from how
+the message descriptors are implemented on the C side.
+
+Now in 0.4.0, I've made the macros even more abstract. Time will tell whether
+this was as great an idea as I think it is, but now the complete list of
+fields in each message is available in the `.pb.h` file. This allows a kind of
+metaprogramming using `X-macros`_.
+
+.. _`X-macros`: https://en.wikipedia.org/wiki/X_Macro
+
+One feature that this can be used for is binding the message descriptor to a
+custom structure or C++ class type. You could have a bunch of other fields in
+the structure and even the datatypes can be different to an extent, and nanopb
+will automatically detect the size and position of each field. The generated
+`.pb.c` files now just have calls to `PB_BIND(msgname, structname, width)`.
+Adding a similar call to your own code will bind the message to your own structure.
+
+UTF-8 validation
+----------------
+The protobuf format specifies that strings should consist of valid UTF-8 codepoints.
+Previously nanopb did not enforce this, requiring extra care in the user code.
+Now optional UTF-8 validation is available with the compilation option `PB_VALIDATE_UTF8`.
+
+Double to float conversion
+--------------------------
+Some platforms such as `AVR` do not support the `double` datatype, instead making
+it an alias for `float`. This has resulted in problems when trying to process
+message types containing `double` fields generated on other machines. There
+has been an example of how to manually perform the conversion between `double`
+and `float`.
+
+Now that example is integrated as an optional feature in the nanopb core. By defining
+`PB_CONVERT_DOUBLE_FLOAT`, the required conversion between 32- and 64-bit floating
+point formats happens automatically on decoding and encoding.
+
+Improved testing
+----------------
+Testing on embedded platforms has been integrated into the continuous testing
+environment. Now all of the 80+ test cases are automatically run on STM32 and
+AVR targets. Previously only a few specialized test cases were manually tested
+on embedded systems.
+
+The nanopb fuzzer has also been integrated into Google's `OSSFuzz`_ platform, giving
+a huge boost in the CPU power available for randomized testing.
+
+.. _`OSSFuzz`: https://google.github.io/oss-fuzz/
+
+