Update documentation
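
Note for reviewers: the new "Message descriptor" text describes a per-message
field list in X-macro form that PB_BIND expands into descriptor tables. The
sketch below shows that general technique in plain C and is not nanopb's
actual implementation. The three-column list layout, the `field_desc_t` type,
the `PhoneNumber` struct and the `dump_phone_number_fields()` helper are all
hypothetical names invented for this illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical message struct, loosely modelled on Person_PhoneNumber. */
typedef struct {
    char number[40];
    int type;
} PhoneNumber;

/* Field list in X-macro form: X(context, field name, tag number).
 * Assumption: this layout is simplified; generated nanopb lists
 * carry more columns (allocation, label, type). */
#define PHONENUMBER_FIELDLIST(X, a) \
    X(a, number, 1) \
    X(a, type,   2)

/* Hypothetical descriptor row; this is NOT nanopb's pb_msgdesc_t. */
typedef struct {
    const char *name;
    unsigned tag;
    size_t offset;
    size_t size;
} field_desc_t;

/* Expand one list entry into one table row. */
#define AS_DESC(structname, field, tag) \
    { #field, tag, offsetof(structname, field), \
      sizeof(((structname *)0)->field) },

static const field_desc_t PhoneNumber_fields[] = {
    PHONENUMBER_FIELDLIST(AS_DESC, PhoneNumber)
};

/* Expand the same list a second time to count the fields. */
#define COUNT_ONE(structname, field, tag) + 1
enum {
    PhoneNumber_field_count = 0 PHONENUMBER_FIELDLIST(COUNT_ONE, PhoneNumber)
};

/* Print one line per field in the generated table. */
void dump_phone_number_fields(void)
{
    int i;
    for (i = 0; i < PhoneNumber_field_count; i++) {
        const field_desc_t *f = &PhoneNumber_fields[i];
        printf("%s: tag=%u offset=%zu size=%zu\n",
               f->name, f->tag, f->offset, f->size);
    }
}
```

Calling `dump_phone_number_fields()` from a `main()` prints one line per field.
Because the list is written once and expanded several times, the table and the
field count cannot drift apart, which is the main reason to prefer this pattern
over a hand-maintained array.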
diff --git a/docs/Makefile b/docs/Makefile
index 24e6ab7..62920d0 100644
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -5,5 +5,5 @@
 	rsvg-convert $< > $@
 
 %.html: %.rst
-	rst2html --stylesheet=lsr.css --link-stylesheet $< $@
+	rst2html --field-name-limit=32 --stylesheet=lsr.css --link-stylesheet $< $@
 	sed -i 's!</head>!<link href="favicon.ico" type="image/x-icon" rel="shortcut icon" />\n</head>!' $@
diff --git a/docs/concepts.rst b/docs/concepts.rst
index 34efcc9..a8d5e52 100644
--- a/docs/concepts.rst
+++ b/docs/concepts.rst
@@ -16,13 +16,18 @@
 
 Compiling .proto files for nanopb
 ---------------------------------
-Nanopb uses the Google's protoc compiler to parse the .proto file, and then a
-python script to generate the C header and source code from it::
+Nanopb comes with a Python script to generate `.pb.c` and `.pb.h` files from
+the `.proto` definition::
 
-    user@host:~$ protoc -omessage.pb message.proto
-    user@host:~$ python ../generator/nanopb_generator.py message.pb
-    Writing to message.h and message.c
-    user@host:~$
+    user@host:~$ python nanopb/generator/nanopb_generator.py message.proto
+    Writing to message.pb.h and message.pb.c
+
+Internally this script uses Google's `protoc` to parse the input file. If you
+do not have it available, you may receive an error message. You can install
+either the `grpcio-tools` Python package using `pip`, or the `protoc` compiler
+itself from the `protobuf-compiler` distribution package. The Python package
+is generally recommended, because nanopb requires protoc version 3.6 or newer,
+and some distributions come with an older version.
 
 Modifying generator behaviour
 -----------------------------
@@ -192,7 +197,7 @@
 ------------------
 ::
 
-    bool (*encode)(pb_ostream_t *stream, const pb_field_t *field, void * const *arg);
+    bool (*encode)(pb_ostream_t *stream, const pb_field_iter_t *field, void * const *arg);
 
 When encoding, the callback should write out complete fields, including the wire type and field number tag. It can write as many or as few fields as it likes. For example, if you want to write out an array as *repeated* field, you should do it all in a single call.
 
@@ -206,7 +211,7 @@
 
 This callback writes out a dynamically sized string::
 
-    bool write_string(pb_ostream_t *stream, const pb_field_t *field, void * const *arg)
+    bool write_string(pb_ostream_t *stream, const pb_field_iter_t *field, void * const *arg)
     {
         char *str = get_string_from_somewhere();
         if (!pb_encode_tag_for_field(stream, field))
@@ -219,7 +224,7 @@
 ------------------
 ::
 
-    bool (*decode)(pb_istream_t *stream, const pb_field_t *field, void **arg);
+    bool (*decode)(pb_istream_t *stream, const pb_field_iter_t *field, void **arg);
 
 When decoding, the callback receives a length-limited substring that reads the contents of a single field. The field tag has already been read. For *string* and *bytes*, the length value has already been parsed, and is available at *stream->bytes_left*.
 
@@ -229,7 +234,7 @@
 
 This callback reads multiple integers and prints them::
 
-    bool read_ints(pb_istream_t *stream, const pb_field_t *field, void **arg)
+    bool read_ints(pb_istream_t *stream, const pb_field_iter_t *field, void **arg)
     {
         while (stream->bytes_left)
         {
@@ -241,10 +246,32 @@
         return true;
     }
 
-Field description array
-=======================
+Function name bound callbacks
+-----------------------------
+::
 
-For using the *pb_encode* and *pb_decode* functions, you need an array of pb_field_t constants describing the structure you wish to encode. This description is usually autogenerated from .proto file.
+    bool MyMessage_callback(pb_istream_t *istream, pb_ostream_t *ostream, const pb_field_iter_t *field);
+
+:istream:   Input stream to read from, or NULL if called in encoding context.
+:ostream:   Output stream to write to, or NULL if called in decoding context.
+:field:     Iterator for the field currently being encoded or decoded.
+
+Storing function pointers in `pb_callback_t` fields inside the message requires extra storage space and is often cumbersome.
+As an alternative, the generator options `callback_function` and `callback_datatype` can be used to bind a callback function based on its name.
+
+Typically this feature is used by setting `callback_datatype` to e.g. `void*` or other data type used for callback state.
+Then the generator will automatically set `callback_function` to `MessageName_callback` and produce a prototype for it in generated `.pb.h`.
+By implementing this function in your own code, you will receive callbacks for fields without having to separately set function pointers.
+
+If you want to use function name bound callbacks for some fields and `pb_callback_t` for other fields,
+you can call `pb_default_field_callback` from the message-level callback.
+It will then read a function pointer from `pb_callback_t` and call it.
+
+Message descriptor
+==================
+
+To use the *pb_encode* and *pb_decode* functions, you need a description of
+all the fields contained in a message. This description is usually autogenerated from the .proto file.
 
 For example this submessage in the Person.proto file::
 
@@ -255,13 +282,30 @@
     }
  }
 
-generates this field description array for the structure *Person_PhoneNumber*::
+This in turn generates a macro list in the `.pb.h` file::
 
- const pb_field_t Person_PhoneNumber_fields[3] = {
-    PB_FIELD(  1, STRING  , REQUIRED, STATIC, Person_PhoneNumber, number, number, 0),
-    PB_FIELD(  2, ENUM    , OPTIONAL, STATIC, Person_PhoneNumber, type, number, &Person_PhoneNumber_type_default),
-    PB_LAST_FIELD
- };
+    #define Person_PhoneNumber_FIELDLIST(X, a) \
+    X(a, STATIC,   REQUIRED, STRING,   number,            1) \
+    X(a, STATIC,   OPTIONAL, UENUM,    type,              2)
+
+Inside the `.pb.c` file there is a macro call to `PB_BIND`::
+
+    PB_BIND(Person_PhoneNumber, Person_PhoneNumber, AUTO)
+
+Together, these macros generate the `pb_msgdesc_t` structure and associated lists::
+
+    const uint32_t Person_PhoneNumber_field_info[] = { ... };
+    const pb_msgdesc_t * const Person_PhoneNumber_submsg_info[] = { ... };
+    const pb_msgdesc_t Person_PhoneNumber_msg = {
+      2,
+      Person_PhoneNumber_field_info,
+      Person_PhoneNumber_submsg_info,
+      Person_PhoneNumber_DEFAULT,
+      NULL,
+    };
+
+The encoding and decoding functions take a pointer to this structure and use it
+to process each field in the message.
 
 Oneof
 =====
@@ -317,12 +361,14 @@
 Notice that neither ``which_payload`` field nor the unused fields in ``payload``
 will consume any space in the resulting encoded message.
 
-When a C union is used to represent a ``oneof`` section, the union cannot have
-callback fields or nested callback fields. Otherwise, the decoding process may
-fail. If callbacks must be used inside a ``oneof`` section, the generator
-option *no_unions* should be set to *true* for that section.
+When a field inside ``oneof`` contains `pb_callback_t` fields, the callback
+values cannot be set before decoding. This is because the different fields
+share the same storage space in a C `union`. Instead, either function name
+bound callbacks or a separate message-level callback can be used.
+See `tests/oneof_callback`_ for an example.
 
 .. _`oneof`: https://developers.google.com/protocol-buffers/docs/reference/proto2-spec#oneof_and_oneof_field
+.. _`tests/oneof_callback`: https://github.com/nanopb/nanopb/tree/master/tests/oneof_callback
 
 Extension fields
 ================
@@ -384,7 +430,7 @@
  MyMessage msg = MyMessage_init_default;
 
 In addition to this, `pb_decode()` will initialize message fields to defaults
-at runtime. If this is not desired, `pb_decode_noinit()` can be used instead.
+at runtime. If this is not desired, `pb_decode_ex()` can be used with the `PB_DECODE_NOINIT` flag instead.
 
 Message framing
 ===============
@@ -402,7 +448,7 @@
 
 Nanopb provides a few helpers to facilitate implementing framing formats:
 
-1. Functions *pb_encode_delimited* and *pb_decode_delimited* prefix the message data with a varint-encoded length.
+1. Functions *pb_encode_ex* and *pb_decode_ex* prefix the message data with a varint-encoded length when called with the *PB_ENCODE_DELIMITED* and *PB_DECODE_DELIMITED* flags.
 2. Union messages and oneofs are supported in order to implement top-level container messages.
 3. Message IDs can be specified using the *(nanopb_msgopt).msgid* option and can then be accessed from the header.
 
diff --git a/docs/index.rst b/docs/index.rst
index 9b88014..b94f56d 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -12,10 +12,11 @@
 Overall structure
 =================
 
-For the runtime program, you always need *pb.h* for type declarations.
+For the runtime program, you always need *pb.h* for type declarations and *pb_common.h/c* for base functions.
 Depending on whether you want to encode, decode, or both, you also need *pb_encode.h/c* or *pb_decode.h/c*.
 
-The high-level encoding and decoding functions take an array of *pb_field_t* structures, which describes the fields of a message structure. Usually you want these autogenerated from a *.proto* file. The tool script *nanopb_generator.py* accomplishes this.
+The high-level encoding and decoding functions take a pointer to a *pb_msgdesc_t* structure, which describes the fields of a message structure.
+Usually you want this autogenerated from a *.proto* file. The tool script *nanopb_generator.py* accomplishes this.
 
 .. image:: generator_flow.png
 
@@ -37,8 +38,8 @@
 **Features**
 
 #) Pure C runtime
-#) Small code size (2–10 kB depending on processor, plus any message definitions)
-#) Small ram usage (typically ~300 bytes, plus any message structs)
+#) Small code size (5–10 kB depending on processor and compilation options, plus any message definitions)
+#) Small RAM usage (typically ~300 bytes of stack, plus any message structs)
 #) Allows specifying maximum size for strings and arrays, so that they can be allocated statically.
 #) No malloc needed: everything can be allocated statically or on the stack. Optional malloc support available.
 #) You can use either encoder or decoder alone to cut the code size in half.
@@ -56,6 +57,7 @@
 #) Reflection (runtime introspection) is not supported. E.g. you can't request a field by giving its name in a string.
 #) Numeric arrays are always encoded as packed, even if not marked as packed in .proto.
 #) Cyclic references between messages are supported only in callback and malloc mode.
+#) Nanopb doesn't have a stable ABI (application binary interface) between versions, so using it as a shared library (.so / .dll) requires extra care.
 
 Getting started
 ===============
@@ -68,8 +70,7 @@
 
 Save this in *message.proto* and compile it::
 
-    user@host:~$ protoc -omessage.pb message.proto
-    user@host:~$ python nanopb/generator/nanopb_generator.py message.pb
+    user@host:~$ python nanopb/generator/nanopb_generator.py message.proto
 
 You should now have in *message.pb.h*::
 
@@ -77,7 +78,8 @@
     int32_t value;
  } Example;
  
- extern const pb_field_t Example_fields[2];
+ extern const pb_msgdesc_t Example_msg;
+ #define Example_fields &Example_msg
 
 Then you have to include the nanopb headers and the generated header::
 
@@ -107,6 +109,7 @@
 #) *stdint.h*, for definitions of *int32_t* etc.
 #) *stddef.h*, for definition of *size_t*
 #) *stdbool.h*, for definition of *bool*
+#) *limits.h*, for definition of *CHAR_BIT*
 
 If these header files do not come with your compiler, you can use the
 file *extra/pb_syshdr.h* instead. It contains an example of how to provide
@@ -122,11 +125,16 @@
 
 To build the tests, you will need the `scons`__ build system. The tests should
 be runnable on most platforms. Windows and Linux builds are regularly tested.
+The tests also support embedded targets: STM32 (ARM Cortex-M) and AVR builds
+are regularly tested.
 
 __ http://www.scons.org/
 
 In addition to the build system, you will also need a working Google Protocol
-Buffers *protoc* compiler, and the Python bindings for Protocol Buffers. On
-Debian-based systems, install the following packages: *protobuf-compiler*,
-*python-protobuf* and *libprotobuf-dev*.
+Buffers *protoc* compiler, and the Python bindings for Protocol Buffers.
+
+The easiest way to install the dependencies is to use the Python package manager `pip`,
+which works on all platforms supported by Python::
+
+    pip install scons protobuf grpcio-tools
 
diff --git a/docs/reference.rst b/docs/reference.rst
index 82fb480..eff5fd1 100644
--- a/docs/reference.rst
+++ b/docs/reference.rst
@@ -7,8 +7,6 @@
 .. contents ::
 
 
-
-
 Compilation options
 ===================
 The following options can be specified in one of two ways:
@@ -29,10 +27,6 @@
                                presence. Default value is 64. Increases stack
                                usage 1 byte per every 8 fields. Compiler
                                warning will tell if you need this.
-PB_FIELD_16BIT                 Add support for tag numbers > 255 and fields
-                               larger than 255 bytes or 255 array entries.
-                               Increases code size 3 bytes per each field.
-                               Compiler error will tell if you need this.
 PB_FIELD_32BIT                 Add support for tag numbers > 65535 and fields
                                larger than 65535 bytes or 65535 array entries.
                                Increases code size 9 bytes per each field.
@@ -45,9 +39,6 @@
                                supports encoding and decoding with memory
                                buffers. Speeds up execution and decreases code
                                size slightly.
-PB_OLD_CALLBACK_STYLE          Use the old function signature (void\* instead
-                               of void\*\*) for callback fields. This was the
-                               default until nanopb-0.2.1.
 PB_SYSTEM_HEADER               Replace the standard header files with a single
                                header file. It should define all the required
                                functions and typedefs listed on the
@@ -66,9 +57,9 @@
                                slightly and slightly increases code size.
 ============================  ================================================
 
-The PB_MAX_REQUIRED_FIELDS, PB_FIELD_16BIT and PB_FIELD_32BIT settings allow
+The PB_MAX_REQUIRED_FIELDS and PB_FIELD_32BIT settings allow
 raising some datatype limits to suit larger messages. Their need is recognized
-automatically by C-preprocessor #if-directives in the generated .pb.h files.
+automatically by C-preprocessor #if-directives in the generated `.pb.c` files.
 The default setting is to use the smallest datatypes (least resources used).
 
 .. _`overview page`: index.html#compiler-requirements
@@ -76,8 +67,11 @@
 
 Proto file options
 ==================
-The generator behaviour can be adjusted using these options, defined in the
-'nanopb.proto' file in the generator folder:
+The generator behaviour can be adjusted using several options, defined in the
+`nanopb.proto`_ file in the generator folder. Here is a list of the most common
+options, but see the file for a full list:
+
+.. _`nanopb.proto`: https://github.com/nanopb/nanopb/blob/master/generator/proto/nanopb.proto
 
 ============================  ================================================
 max_size                       Allocated size for *bytes* and *string* fields.
@@ -113,15 +107,13 @@
 These options can be defined for the .proto files before they are converted
 using the nanopb-generatory.py. There are three ways to define the options:
 
-1. Using a separate .options file.
-   This is the preferred way as of nanopb-0.2.1, because it has the best
-   compatibility with other protobuf libraries.
+1. Using a separate .options file. This allows using wildcards to apply
+   the same options to multiple fields.
 2. Defining the options on the command line of nanopb_generator.py.
    This only makes sense for settings that apply to a whole file.
 3. Defining the options in the .proto file using the nanopb extensions.
-   This is the way used in nanopb-0.1, and will remain supported in the
-   future. It however sometimes causes trouble when using the .proto file
-   with other protobuf libraries.
+   This keeps the options close to the fields they apply to, but can be
+   problematic if the same .proto file is shared with many projects.
 
 The effect of the options is the same no matter how they are given. The most
 common purpose is to define maximum size for string fields in order to
@@ -167,9 +159,10 @@
   ones later.
   
 To debug problems in applying the options, you can use the *-v* option for the
-plugin. Plugin options are specified in front of the output path:
+nanopb generator. With protoc, plugin options are specified in front of the output path:
 
-    protoc ... --nanopb_out=-v:. message.proto
+    nanopb_generator -v message.proto           # When invoked directly
+    protoc ... --nanopb_out=-v:. message.proto  # When invoked through protoc
 
 Protoc doesn't currently pass include path into plugins. Therefore if your
 *.proto* is in a subdirectory, nanopb may have trouble finding the associated
@@ -200,12 +193,10 @@
     }
 
 A small complication is that you have to set the include path of protoc so that
-nanopb.proto can be found. This file, in turn, requires the file
-*google/protobuf/descriptor.proto*. This is usually installed under
-*/usr/include*. Therefore, to compile a .proto file which uses options, use a
+nanopb.proto can be found. Therefore, to compile a .proto file which uses options, use a
 protoc command similar to::
 
-    protoc -I/usr/include -Inanopb/generator -I. --nanopb_out=. message.proto
+    protoc -Inanopb/generator/proto -I. --nanopb_out=. message.proto
 
 The options can be defined in file, message and field scopes::
 
@@ -229,6 +220,15 @@
 For most platforms this is equivalent to `uint8_t`. Some platforms however do not support
 8-bit variables, and on those platforms 16 or 32 bits need to be used for each byte.
 
+pb_size_t
+---------
+Type used for storing tag numbers and sizes of message fields. By default the type is 16-bit::
+
+    typedef uint_least16_t pb_size_t;
+
+If tag numbers or fields larger than 65535 are needed, the `PB_FIELD_32BIT` option
+can be used to change the type to a 32-bit one.
+
 pb_type_t
 ---------
 Type used to store the type of each field, to control the encoder/decoder behaviour. ::
@@ -240,19 +240,23 @@
 =========================== ===== ================================================
 LTYPE identifier            Value Storage format
 =========================== ===== ================================================
-PB_LTYPE_VARINT             0x00  Integer.
-PB_LTYPE_UVARINT            0x01  Unsigned integer.
-PB_LTYPE_SVARINT            0x02  Integer, zigzag encoded.
-PB_LTYPE_FIXED32            0x03  32-bit integer or floating point.
-PB_LTYPE_FIXED64            0x04  64-bit integer or floating point.
-PB_LTYPE_BYTES              0x05  Structure with *size_t* field and byte array.
-PB_LTYPE_STRING             0x06  Null-terminated string.
-PB_LTYPE_SUBMESSAGE         0x07  Submessage structure.
-PB_LTYPE_EXTENSION          0x08  Point to *pb_extension_t*.
-PB_LTYPE_FIXED_LENGTH_BYTES 0x09  Inline *pb_byte_t* array of fixed size.
+PB_LTYPE_BOOL               0x00  Boolean.
+PB_LTYPE_VARINT             0x01  Integer.
+PB_LTYPE_UVARINT            0x02  Unsigned integer.
+PB_LTYPE_SVARINT            0x03  Integer, zigzag encoded.
+PB_LTYPE_FIXED32            0x04  32-bit integer or floating point.
+PB_LTYPE_FIXED64            0x05  64-bit integer or floating point.
+PB_LTYPE_BYTES              0x06  Structure with *size_t* field and byte array.
+PB_LTYPE_STRING             0x07  Null-terminated string.
+PB_LTYPE_SUBMESSAGE         0x08  Submessage structure.
+PB_LTYPE_SUBMSG_W_CB        0x09  Submessage with pre-decoding callback.
+PB_LTYPE_EXTENSION          0x0A  Point to *pb_extension_t*.
+PB_LTYPE_FIXED_LENGTH_BYTES 0x0B  Inline *pb_byte_t* array of fixed size.
 =========================== ===== ================================================
 
-The bits 4-5 define whether the field is required, optional or repeated:
+The bits 4-5 define whether the field is required, optional or repeated.
+There are separate definitions for semantically different modes, even though
+some of them share values and are distinguished based on values of other fields:
 
 ==================== ===== ================================================
 HTYPE identifier     Value Field handling
@@ -260,10 +264,12 @@
 PB_HTYPE_REQUIRED    0x00  Verify that field exists in decoded message.
 PB_HTYPE_OPTIONAL    0x10  Use separate *has_<field>* boolean to specify
                            whether the field is present.
-                           (Unless it is a callback)
+PB_HTYPE_SINGULAR    0x10  Proto3 field, which is present when its value is
+                           non-zero.
 PB_HTYPE_REPEATED    0x20  A repeated field with preallocated array.
                            Separate *<field>_count* for number of items.
-                           (Unless it is a callback)
+PB_HTYPE_FIXARRAY    0x20  A repeated field that has constant length.
+PB_HTYPE_ONEOF       0x30  Oneof-field, only one of each group can be present.
 ==================== ===== ================================================
 
 The bits 6-7 define the how the storage for the field is allocated:
@@ -272,36 +278,83 @@
 ATYPE identifier     Value Allocation method
 ==================== ===== ================================================
 PB_ATYPE_STATIC      0x00  Statically allocated storage in the structure.
+PB_ATYPE_POINTER     0x80  Dynamically allocated storage. Struct field contains
+                           a pointer to the storage.
 PB_ATYPE_CALLBACK    0x40  A field with dynamic storage size. Struct field
-                           actually contains a pointer to a callback
-                           function.
+                           contains a pointer to a callback function.
 ==================== ===== ================================================
 
 
-pb_field_t
-----------
-Describes a single structure field with memory position in relation to others. The descriptions are usually autogenerated. ::
+pb_msgdesc_t
+------------
+Autogenerated structure that contains information about a message and pointers
+to the field descriptors. Use functions defined in `pb_common.h` to process
+the field information::
 
-    typedef struct pb_field_s pb_field_t;
-    struct pb_field_s {
+    typedef struct pb_msgdesc_s pb_msgdesc_t;
+    struct pb_msgdesc_s {
+        pb_size_t field_count;
+        const uint32_t *field_info;
+        const pb_msgdesc_t * const * submsg_info;
+        const pb_byte_t *default_value;
+
+        bool (*field_callback)(pb_istream_t *istream, pb_ostream_t *ostream, const pb_field_iter_t *field);
+    };
+
+:field_count:    Total number of fields in the message.
+:field_info:     Pointer to compact representation of the field information.
+:submsg_info:    Pointer to array of pointers to descriptors for submessages.
+:default_value:  Default values for this message as an encoded protobuf message.
+:field_callback: Function used to handle all callback fields in this message.
+                 By default `pb_default_field_callback()` loads per-field
+                 callbacks from a `pb_callback_t` structure.
+
+
+pb_field_iter_t
+---------------
+Describes a single structure field with memory position in relation to others.
+The field information is stored in a compact format and loaded into `pb_field_iter_t`
+by the functions defined in `pb_common.h`. ::
+
+    typedef struct pb_field_iter_s pb_field_iter_t;
+    struct pb_field_iter_s {
+        const pb_msgdesc_t *descriptor;
+        void *message;
+
+        pb_size_t index;
+        pb_size_t field_info_index;
+        pb_size_t required_field_index;
+        pb_size_t submessage_index;
+
         pb_size_t tag;
-        pb_type_t type;
-        pb_size_t data_offset;
-        pb_ssize_t size_offset;
         pb_size_t data_size;
         pb_size_t array_size;
-        const void *ptr;
-    } pb_packed;
+        pb_type_t type;
 
-:tag:           Tag number of the field or 0 to terminate a list of fields.
-:type:          LTYPE, HTYPE and ATYPE of the field.
-:data_offset:   Offset of field data, relative to the end of the previous field.
-:size_offset:   Offset of *bool* flag for optional fields or *size_t* count for arrays, relative to field data.
-:data_size:     Size of a single data entry, in bytes. For PB_LTYPE_BYTES, the size of the byte array inside the containing structure. For PB_HTYPE_CALLBACK, size of the C data type if known.
-:array_size:    Maximum number of entries in an array, if it is an array type.
-:ptr:           Pointer to default value for optional fields, or to submessage description for PB_LTYPE_SUBMESSAGE.
+        void *pField;
+        void *pData;
+        void *pSize;
 
-The *uint8_t* datatypes limit the maximum size of a single item to 255 bytes and arrays to 255 items. Compiler will give error if the values are too large. The types can be changed to larger ones by defining *PB_FIELD_16BIT*.
+        const pb_msgdesc_t *submsg_desc;
+    };
+
+:descriptor:              Pointer to `pb_msgdesc_t` for the message that contains this field.
+:message:                 Pointer to the start of the message structure.
+:index:                   Index of the field inside the message.
+:field_info_index:        Index to the internal `field_info` array.
+:required_field_index:    Index that counts only the required fields.
+:submessage_index:        Index that counts only submessages.
+:tag:                     Tag number defined in `.proto` file for this field.
+:data_size:               `sizeof()` of the field in the structure. For repeated fields this is for a single array entry.
+:array_size:              Maximum number of items in a statically allocated array.
+:type:                    Type (`pb_type_t`_) of the field.
+:pField:                  Pointer to the field storage in the structure.
+:pData:                   Pointer to data contents. For arrays and pointers this can be different than `pField`.
+:pSize:                   Pointer to the *count* or *has* field, or NULL if this field has neither.
+:submsg_desc:             For submessage fields, points to the descriptor for the submessage.
+
+By default `pb_size_t`_ is 16-bit, limiting the sizes and tags to 65535. The limit
+can be raised by defining `PB_FIELD_32BIT`.
 
 pb_bytes_array_t
 ----------------
@@ -312,7 +365,9 @@
         pb_byte_t bytes[1];
     } pb_bytes_array_t;
 
-In an actual array, the length of *bytes* may be different.
+In an actual array, the length of *bytes* may be different. The macros
+`PB_BYTES_ARRAY_T()` and `PB_BYTES_ARRAY_T_ALLOCSIZE()` are used to allocate
+variable length storage for bytes fields.
 
 pb_callback_t
 -------------
@@ -321,18 +376,26 @@
     typedef struct _pb_callback_t pb_callback_t;
     struct _pb_callback_t {
         union {
-            bool (*decode)(pb_istream_t *stream, const pb_field_t *field, void **arg);
-            bool (*encode)(pb_ostream_t *stream, const pb_field_t *field, void * const *arg);
+            bool (*decode)(pb_istream_t *stream, const pb_field_iter_t *field, void **arg);
+            bool (*encode)(pb_ostream_t *stream, const pb_field_iter_t *field, void * const *arg);
         } funcs;
         
         void *arg;
     };
 
-A pointer to the *arg* is passed to the callback when calling. It can be used to store any information that the callback might need.
+A pointer to the *arg* is passed to the callback when calling.
+It can be used to store any information that the callback might need.
+Note that this is a double pointer. If you set `field.arg` to point to `&data` in your
+main code, in the callback you can access it like this::
 
-Previously the function received just the value of *arg* instead of a pointer to it. This old behaviour can be enabled by defining *PB_OLD_CALLBACK_STYLE*.
+    myfunction(*arg);           /* Gives pointer to data as argument */
+    myfunction(*(data_t*)*arg); /* Gives value of data as argument */
+    *arg = newdata;             /* Alters value of field.arg in structure */
 
-When calling `pb_encode`_, *funcs.encode* is used, and similarly when calling `pb_decode`_, *funcs.decode* is used. The function pointers are stored in the same memory location but are of incompatible types. You can set the function pointer to NULL to skip the field.
+When calling `pb_encode`_, *funcs.encode* is used, and similarly when calling
+`pb_decode`_, *funcs.decode* is used. The function pointers are stored in the
+same memory location but are of incompatible types.
+You can set the function pointer to NULL to skip the field.
 
 pb_wire_type_t
 --------------
@@ -359,7 +422,7 @@
 
 In the normal case, the function pointers are *NULL* and the decoder and
 encoder use their internal implementations. The internal implementations
-assume that *arg* points to a *pb_field_t* that describes the field in question.
+assume that *arg* points to a `pb_field_iter_t`_ that describes the field in question.
 
 To implement custom processing of unknown fields, you can provide pointers
 to your own functions. Their functionality is mostly the same as for normal
@@ -415,6 +478,21 @@
 
 The *msg* parameter must be a constant string.
 
+PB_BIND
+-------
+This macro generates the `pb_msgdesc_t`_ and associated arrays, based on a list
+of fields in `X-macro`_ format. ::
+
+    #define PB_BIND(msgname, structname, width) ...
+
+:msgname:    Name of the message type. Expects `msgname_FIELDLIST` macro to exist.
+:structname: Name of the C structure to bind to.
+:width:      Number of words per field descriptor, or `AUTO` to use minimum size possible.
+
+This macro is automatically invoked inside the autogenerated `.pb.c` files.
+User code can also call it to bind message types with custom structures or class types.
+
+.. _`X-macro`: https://en.wikipedia.org/wiki/X_Macro
 
 
 pb_encode.h
@@ -449,25 +527,42 @@
 ---------
 Encodes the contents of a structure as a protocol buffers message and writes it to output stream. ::
 
-    bool pb_encode(pb_ostream_t *stream, const pb_field_t fields[], const void *src_struct);
+    bool pb_encode(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct);
 
 :stream:        Output stream to write to.
-:fields:        A field description array, usually autogenerated.
+:fields:        Message descriptor, usually autogenerated.
 :src_struct:    Pointer to the data that will be serialized.
 :returns:       True on success, false on IO error, on detectable errors in field description, or if a field encoder returns false.
 
 Normally pb_encode simply walks through the fields description array and serializes each field in turn. However, submessages must be serialized twice: first to calculate their size and then to actually write them to output. This causes some constraints for callback fields, which must return the same data on every call.
 
-pb_encode_delimited
+pb_encode_ex
 -------------------
-Calculates the length of the message, encodes it as varint and then encodes the message. ::
+Encodes the message, with several extended options::
 
-    bool pb_encode_delimited(pb_ostream_t *stream, const pb_field_t fields[], const void *src_struct);
+    bool pb_encode_ex(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct, unsigned int flags);
 
-(parameters are the same as for `pb_encode`_.)
+:stream:        Output stream to write to.
+:fields:        Message descriptor, usually autogenerated.
+:src_struct:    Pointer to the data that will be serialized.
+:flags:         Extended options, see below.
+:returns:       True on success, false on IO error, on detectable errors in field description, or if a field encoder returns false.
 
-A common way to indicate the message length in Protocol Buffers is to prefix it with a varint.
-This function does this, and it is compatible with *parseDelimitedFrom* in Google's protobuf library.
+The options that can be defined are:
+
+:PB_ENCODE_DELIMITED:      Indicate the length of the message by prefixing with a varint-encoded length. Compatible with *parseDelimitedFrom* in Google's protobuf library.
+:PB_ENCODE_NULLTERMINATED: Indicate the length of the message by appending a zero tag value after it. Supported by nanopb decoder, but not by most other protobuf libraries.
+
+pb_get_encoded_size
+-------------------
+Calculates the length of the encoded message. ::
+
+    bool pb_get_encoded_size(size_t *size, const pb_msgdesc_t *fields, const void *src_struct);
+
+:size:          Calculated size of the encoded message.
+:fields:        Message descriptor, usually autogenerated.
+:src_struct:    Pointer to the data that will be serialized.
+:returns:       True on success, false on detectable errors in field description or if a field encoder returns false.
 
 .. sidebar:: Encoding fields manually
 
@@ -477,17 +572,6 @@
 
     Writing packed arrays is a little bit more involved: you need to use `pb_encode_tag` and specify `PB_WT_STRING` as the wire type. Then you need to know exactly how much data you are going to write, and use `pb_encode_varint`_ to write out the number of bytes before writing the actual data. Substreams can be used to determine the number of bytes beforehand; see `pb_encode_submessage`_ source code for an example.
 
-pb_get_encoded_size
--------------------
-Calculates the length of the encoded message. ::
-
-    bool pb_get_encoded_size(size_t *size, const pb_field_t fields[], const void *src_struct);
-
-:size:          Calculated size of the encoded message.
-:fields:        A field description array, usually autogenerated.
-:src_struct:    Pointer to the data that will be serialized.
-:returns:       True on success, false on detectable errors in field description or if a field encoder returns false.
-
 pb_encode_tag
 -------------
 Starts a field in the Protocol Buffers binary format: encodes the field number and the wire type of the data. ::
@@ -501,12 +585,12 @@
 
 pb_encode_tag_for_field
 -----------------------
-Same as `pb_encode_tag`_, except takes the parameters from a *pb_field_t* structure. ::
+Same as `pb_encode_tag`_, except takes the parameters from a *pb_field_iter_t* structure. ::
 
-    bool pb_encode_tag_for_field(pb_ostream_t *stream, const pb_field_t *field);
+    bool pb_encode_tag_for_field(pb_ostream_t *stream, const pb_field_iter_t *field);
 
 :stream:        Output stream to write to. 1-5 bytes will be written.
-:field:         Field description structure. Usually autogenerated.
+:field:         Field iterator for this field.
 :returns:       True on success, false on IO error or unknown field type.
 
 This function only considers the LTYPE of the field. You can use it from your field callbacks, because the source generator writes correct LTYPE also for callback type fields.
@@ -573,14 +657,26 @@
 :value:     Pointer to a 8-bytes large C variable, for example `uint64_t foo;`.
 :returns:   True on success, false on IO error.
 
+pb_encode_float_as_double
+-------------------------
+Encodes a 32-bit `float` value so that it appears like a 64-bit `double` in the
+encoded message. This is sometimes needed when platforms like AVR, which do not
+support 64-bit `double`, need to communicate using a message type that contains `double` fields. ::
+
+    bool pb_encode_float_as_double(pb_ostream_t *stream, float value);
+
+:stream:    Output stream to write to.
+:value:     Float value to encode.
+:returns:   True on success, false on IO error.
+
 pb_encode_submessage
 --------------------
 Encodes a submessage field, including the size header for it. Works for fields of any message type::
 
-    bool pb_encode_submessage(pb_ostream_t *stream, const pb_field_t fields[], const void *src_struct);
+    bool pb_encode_submessage(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct);
 
 :stream:        Output stream to write to.
-:fields:        Pointer to the autogenerated field description array for the submessage type, e.g. `MyMessage_fields`.
+:fields:        Pointer to the autogenerated message descriptor for the submessage type, e.g. `MyMessage_fields`.
 :src:           Pointer to the structure where submessage data is.
 :returns:       True on success, false on IO errors, pb_encode errors or if submessage size changes between calls.
 
@@ -592,13 +688,6 @@
 
 
 
-
-
-
-
-
-
-
 pb_decode.h
 ===========
 
@@ -629,57 +718,50 @@
 ---------
 Read and decode all fields of a structure. Reads until EOF on input stream. ::
 
-    bool pb_decode(pb_istream_t *stream, const pb_field_t fields[], void *dest_struct);
+    bool pb_decode(pb_istream_t *stream, const pb_msgdesc_t *fields, void *dest_struct);
 
 :stream:        Input stream to read from.
-:fields:        A field description array. Usually autogenerated.
+:fields:        Message descriptor, usually autogenerated.
 :dest_struct:   Pointer to structure where data will be stored.
 :returns:       True on success, false on IO error, on detectable errors in field description, if a field encoder returns false or if a required field is missing.
 
 In Protocol Buffers binary format, EOF is only allowed between fields. If it happens anywhere else, pb_decode will return *false*. If pb_decode returns false, you cannot trust any of the data in the structure.
 
-In addition to EOF, the pb_decode implementation supports terminating a message with a 0 byte. This is compatible with the official Protocol Buffers because 0 is never a valid field tag.
-
 For optional fields, this function applies the default value and sets *has_<field>* to false if the field is not present.
 
 If *PB_ENABLE_MALLOC* is defined, this function may allocate storage for any pointer type fields.
 In this case, you have to call `pb_release`_ to release the memory after you are done with the message.
 On error return `pb_decode` will release the memory itself.
 
-pb_decode_noinit
-----------------
-Same as `pb_decode`_, except does not apply the default values to fields. ::
+pb_decode_ex
+------------
+Same as `pb_decode`_, but allows extended options. ::
 
-    bool pb_decode_noinit(pb_istream_t *stream, const pb_field_t fields[], void *dest_struct);
+    bool pb_decode_ex(pb_istream_t *stream, const pb_msgdesc_t *fields, void *dest_struct, unsigned int flags);
 
-(parameters are the same as for `pb_decode`_.)
+:stream:        Input stream to read from.
+:fields:        Message descriptor, usually autogenerated.
+:dest_struct:   Pointer to structure where data will be stored.
+:flags:         Extended options, see below.
+:returns:       True on success, false on IO error, on detectable errors in field description, if a field decoder returns false or if a required field is missing.
 
-The destination structure should be filled with zeros before calling this function. Doing a *memset* manually can be slightly faster than using `pb_decode`_ if you don't need any default values.
+The following options can be defined and combined with the bitwise `|` operator:
 
-In addition to decoding a single message, this function can be used to merge two messages, so that
-values from previous message will remain if the new message does not contain a field.
+:PB_DECODE_NOINIT:         Do not initialize the structure before decoding. This can be used to combine multiple messages, or if you have already initialized the message yourself.
+:PB_DECODE_DELIMITED:      Expect a length prefix in varint format before the message. The counterpart of `PB_ENCODE_DELIMITED`.
+:PB_DECODE_NULLTERMINATED: Expect the message to be terminated with a zero tag. The counterpart of `PB_ENCODE_NULLTERMINATED`.
 
-This function *will not* release the message even on error return. If you use *PB_ENABLE_MALLOC*,
-you will need to call `pb_release`_ yourself.
-
-pb_decode_delimited
--------------------
-Same as `pb_decode`_, except that it first reads a varint with the length of the message. ::
-
-    bool pb_decode_delimited(pb_istream_t *stream, const pb_field_t fields[], void *dest_struct);
-
-(parameters are the same as for `pb_decode`_.)
-
-A common method to indicate message size in Protocol Buffers is to prefix it with a varint.
-This function is compatible with *writeDelimitedTo* in the Google's Protocol Buffers library.
+If *PB_ENABLE_MALLOC* is defined, this function may allocate storage for any pointer type fields.
+In this case, you have to call `pb_release`_ to release the memory after you are done with the message.
+On error return `pb_decode_ex` will release the memory itself.
 
 pb_release
 ----------
 Releases any dynamically allocated fields::
 
-    void pb_release(const pb_field_t fields[], void *dest_struct);
+    void pb_release(const pb_msgdesc_t *fields, void *dest_struct);
 
-:fields:        A field description array. Usually autogenerated.
+:fields:        Message descriptor, usually autogenerated.
 :dest_struct:   Pointer to structure where data is stored. If NULL, function does nothing.
 
 This function is only available if *PB_ENABLE_MALLOC* is defined. It will release any
@@ -730,6 +812,17 @@
 :dest:          Storage for the decoded integer. Value is undefined on error.
 :returns:       True on success, false if value exceeds uint64_t range or an IO error happens.
 
+pb_decode_varint32
+------------------
+Same as `pb_decode_varint`_, but limits the value to 32 bits::
+
+    bool pb_decode_varint32(pb_istream_t *stream, uint32_t *dest);
+
+Parameters are the same as for `pb_decode_varint`_. This function can be used for
+decoding lengths and other commonly occurring elements that you know shouldn't
+be larger than 32 bits. It will return an error if the value exceeds the range of the
+`uint32_t` datatype.
+
 pb_decode_svarint
 -----------------
 Similar to `pb_decode_varint`_, except that it performs zigzag-decoding on the value. This corresponds to the Protocol Buffers *sint32* and *sint64* datatypes. ::
@@ -764,6 +857,17 @@
 
 Same as `pb_decode_fixed32`_, except this reads 8 bytes.
 
+pb_decode_double_as_float
+-------------------------
+Decodes a 64-bit `double` value into a 32-bit `float` variable.
+Counterpart of `pb_encode_float_as_double`_. ::
+
+    bool pb_decode_double_as_float(pb_istream_t *stream, float *dest);
+
+:stream:        Input stream to read from. 8 bytes will be read.
+:dest:          Pointer to destination *float*.
+:returns:       True on success, false on IO errors.
+
 pb_make_string_substream
 ------------------------
 Decode the length for a field with wire type *PB_WT_STRING* and create a substream for reading the data. ::
@@ -787,3 +891,61 @@
 
 This function copies back the state from the substream to the parent stream.
 It must be called after done with the substream.
+
+
+
+pb_common.h
+===========
+
+pb_field_iter_begin
+-------------------
+Begins iterating over the fields in a message type::
+
+    bool pb_field_iter_begin(pb_field_iter_t *iter, const pb_msgdesc_t *desc, void *message);
+
+:iter:     Pointer to destination `pb_field_iter_t`_ variable.
+:desc:     Autogenerated message descriptor.
+:message:  Pointer to message structure.
+:returns:  True on success, false if the message type has no fields.
+
+pb_field_iter_next
+------------------
+Advance to the next field in the message::
+
+    bool pb_field_iter_next(pb_field_iter_t *iter);
+
+:iter:      Pointer to `pb_field_iter_t`_ previously initialized by `pb_field_iter_begin`_.
+:returns:   True on success, false after last field in the message.
+
+When the last field in the message has been processed, this function will return
+false and initialize `iter` back to the first field in the message.
+
+pb_field_iter_find
+------------------
+Find a field specified by tag number in the message::
+
+    bool pb_field_iter_find(pb_field_iter_t *iter, uint32_t tag);
+
+:iter:      Pointer to `pb_field_iter_t`_ previously initialized by `pb_field_iter_begin`_.
+:tag:       Tag number to search for.
+:returns:   True if field was found, false otherwise.
+
+This function is functionally identical to calling `pb_field_iter_next()` until
+`iter.tag` equals the searched value. Internally this function avoids fully
+processing the descriptor for intermediate fields.
+
+pb_validate_utf8
+----------------
+Validates a UTF-8 encoded string::
+
+    bool pb_validate_utf8(const char *s);
+
+:s:         Pointer to the beginning of a string.
+:returns:   True if the string is valid UTF-8, false otherwise.
+
+The protobuf standard requires that `string` fields only contain valid UTF-8
+encoded text, while `bytes` fields can contain arbitrary data. When the
+compilation option `PB_VALIDATE_UTF8` is defined, nanopb will automatically
+validate strings on both encoding and decoding.
+
+User code can call this function to validate strings in e.g. custom callbacks.
diff --git a/docs/security.rst b/docs/security.rst
index 6f7152e..ddc3587 100644
--- a/docs/security.rst
+++ b/docs/security.rst
@@ -29,7 +29,7 @@
 1. Callback, pointer and extension fields in message structures given to
    pb_encode() and pb_decode(). These fields are memory pointers, and are
    generated depending on the message definition in the .proto file.
-2. The automatically generated field definitions, i.e. *pb_field_t* lists.
+2. The automatically generated field definitions, i.e. *pb_msgdesc_t*.
 3. Contents of the *pb_istream_t* and *pb_ostream_t* structures (this does not
    mean the contents of the stream itself, just the stream definition).