| ==================== |
| Nanopb: New features |
| ==================== |
| |
| .. include :: menu.rst |
| |
| .. contents :: |
| |
| What's new in nanopb 0.4 |
| ======================== |
| Long in the making, nanopb 0.4 has seen some wide-reaching improvements in |
| reaction to the development of the rest of the protobuf ecosystem. This document |
| showcases features that are not immediately visible, but that you may want to |
| take advantage of. |
| |
| A lot of effort has been spent on retaining backwards and forwards compatibility |
| with previous nanopb versions. For a list of breaking changes, see the `migration document`_. |
| |
| .. _`migration document`: migration.html |
| |
| New field descriptor format |
| --------------------------- |
| The basic design of nanopb has always been that the information about messages |
| is stored in a compact descriptor format, which is iterated at runtime. |
| Initially it was very tightly coupled to the encoder and decoder logic. |
| |
| In nanopb-0.3.0 the field iteration logic was separated into `pb_common.c`. |
| Already at that point it was clear that the old format was getting too limited, |
| but it wasn't extended at that time. |
| |
| Now in 0.4, the descriptor format was completely decoupled from the encoder |
| and decoder logic, and redesigned to meet new demands. Previously each field |
| was stored as a `pb_field_t` struct, which was between 8 and 32 bytes in size, |
| depending on compilation options and platform. Now information about fields is |
| stored as a variable length sequence of `uint32_t` data words. There |
| are 1, 2, 4 and 8 word formats, with the 8 word format containing plenty of |
| space for future extensibility. |
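| |
| As a rough illustration of how such a variable-length sequence can be walked, |
| the sketch below assumes that the two lowest bits of the first word of each |
| descriptor select its length (1, 2, 4 or 8 words). That bit layout and the names |
| `descriptor_words`, `count_fields` and `example_descriptors` are assumptions made |
| for this example; the real encoding is defined by nanopb itself and may differ. |
| |
```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, for illustration only: the low 2 bits of the first word
 * give the descriptor length as a power of two (0 -> 1, 1 -> 2, 2 -> 4, 3 -> 8). */
static size_t descriptor_words(uint32_t word0)
{
    return (size_t)1 << (word0 & 3u);
}

/* Count the field descriptors packed into an array of `total_words` words. */
static size_t count_fields(const uint32_t *words, size_t total_words)
{
    size_t pos = 0;
    size_t count = 0;

    while (pos < total_words) {
        size_t len = descriptor_words(words[pos]);
        if (len > total_words - pos) {
            break; /* truncated descriptor: stop instead of reading past the end */
        }
        pos += len;
        count++;
    }
    return count;
}

/* Sample data: descriptors of 1, 2, 4 and 1 words (payload words left as zero). */
static const uint32_t example_descriptors[8] = {
    0u,             /* 1-word descriptor */
    1u, 0u,         /* 2-word descriptor */
    2u, 0u, 0u, 0u, /* 4-word descriptor */
    0u              /* 1-word descriptor */
};
```
| |
| With this sample data, `count_fields(example_descriptors, 8)` steps over four |
| descriptors that together occupy 8 words, or 32 bytes. |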
| |
| One benefit of the variable length format is that most messages now take less |
| storage space. Most fields use 2 words, while simple fields in small messages |
| require only 1 word. The benefit is larger if the code previously required the |
| `PB_FIELD_16BIT` or `PB_FIELD_32BIT` options. In the `AllTypes` test case, 0.3 |
| had a data size of 1008 bytes in the 8-bit configuration and 1408 bytes in the |
| 16-bit configuration. The new format in 0.4 takes 896 bytes for either of these. |
| |
| In addition, the new decoupling has allowed moving most of the field descriptor |
| data into flash on Harvard architectures, such as AVR. Previously nanopb was |
| quite RAM-heavy on AVR, which cannot put normal constants in flash like most |
| other platforms do. |
| |
| Python packaging for generator |
| ------------------------------ |
| The nanopb generator is now available as a Python package, installable using the |
| `pip` package manager. This reduces the need for binary packages: if you already |
| have Python installed, you can just `pip install nanopb` and have the generator |
| available on the path as `nanopb_generator`. |
| |
| The generator can also take advantage of the Python-based `protoc` available in |
| the `grpcio-tools` Python package. If you also install that, there is no longer |
| a need to have a binary `protoc` available. |
| |
| Generator now automatically calls protoc |
| ---------------------------------------- |
| Initially, the nanopb generator was used in two steps: first calling `protoc` to |
| parse the `.proto` file into the `.pb` binary format, and then calling `nanopb_generator` |
| to output the `.pb.h` and `.pb.c` files. |
| |
| Nanopb 0.2.3 added support for running as a `protoc` plugin, which allowed |
| single-step generation using the `--nanopb_out` parameter. However, the plugin |
| mode has two complications: passing options to the nanopb generator itself becomes |
| more difficult, and the generator does not know the actual path of input files. |
| The second limitation has been particularly problematic for locating `.options` |
| files. |
| |
| Both of these older methods still work and will remain supported. However, |
| `nanopb_generator` can now also take `.proto` files directly, and it will |
| transparently call `protoc` in the background. |
| |
| Callbacks bound by function name |
| -------------------------------- |
| Since its very beginnings, nanopb has supported field callbacks to allow processing |
| structures that are larger than what could fit in memory at once. So far the |
| callback functions have been stored in the message structure in a `pb_callback_t` |
| struct. |
| |
| Storing pointers along with user data is somewhat risky from a security point of |
| view. In addition, it has caused problems with `oneof` fields, which reuse the |
| same storage space for multiple submessages. Because there is no separate area |
| for each submessage, there is no space to store the callback pointers either. |
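| |
| The storage problem can be seen in a small standalone sketch. The types below |
| (`field_callback_t`, `LargeSubmessage`, `SmallSubmessage`, `Envelope`) are |
| hypothetical stand-ins written for this illustration, not the structures that |
| the nanopb generator emits. |
| |
```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type, not the nanopb definition. */
typedef bool (*field_callback_t)(const uint8_t *data, size_t len, void *arg);

typedef struct {
    field_callback_t on_payload; /* callback pointer stored next to the data */
    void *arg;
} LargeSubmessage;

typedef struct {
    int32_t value;
} SmallSubmessage;

typedef struct {
    int which; /* selects the active union member */
    union {
        LargeSubmessage large;
        SmallSubmessage small;
    } content;
} Envelope;

/* Both alternatives of the oneof start at the same offset in the envelope. */
static bool alternatives_overlap(void)
{
    return offsetof(Envelope, content.large) ==
           offsetof(Envelope, content.small);
}
```
| |
| Because the alternatives share their bytes, storing `content.small.value` |
| overwrites memory that also holds `content.large.on_payload`, so a pointer set |
| in one alternative cannot be relied on once another alternative is written. |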
| |
| Nanopb-0.4.0 introduces callbacks that are referenced by the function name |
| instead of setting the pointers separately. This should work well for most |
| applications that have a single callback function for each message type. |
| For more complex needs, `pb_callback_t` will also remain supported. |
| |
| Function name callbacks also allow specifying custom data types for inclusion |
| in the message structure. For example, you could have a `MyObject*` pointer along |
| with other message fields, and then process that object in a custom way in your |
| callback. |
| |
| This feature is demonstrated in the `tests/oneof_callback`_ test case and the |
| `examples/network_server`_ example. |
| |
| .. _`examples/network_server`: https://github.com/nanopb/nanopb/tree/master/examples/network_server |
| |
| Message level callback for oneofs |
| --------------------------------- |
| As mentioned above, callbacks inside submessages inside oneofs have been |
| problematic to use. To make using `pb_callback_t`-style callbacks there possible, |
| a new generator option `submsg_callback` was added. |
| |
| Setting this option to true will cause a new message-level callback to be added |
| before the `which_field` of the oneof. This callback will be called when the |
| submessage tag number is known, but before the actual message is decoded. The |
| callback can either choose to set callback pointers inside the submessage, or |
| just completely decode the submessage there and then. If any unread data remains |
| after the callback returns, normal submessage decoding will continue. |
| |
| There is an example of this in the `tests/oneof_callback`_ test case. |
| |
| .. _`tests/oneof_callback`: https://github.com/nanopb/nanopb/tree/master/tests/oneof_callback |
| |
| Binding message types to custom structures |
| ------------------------------------------ |
| It is often said that good C code is chock full of macros. Or maybe I got it |
| wrong. But since nanopb 0.2, the field descriptor generation has heavily relied |
| on macros. This allows it to automatically adapt to differences in type alignment |
| on different platforms, and to decouple the Python generation logic from how |
| the message descriptors are implemented on the C side. |
| |
| Now in 0.4.0, I've made the macros even more abstract. Time will tell whether |
| this was as great an idea as I think it is, but now the complete list of |
| fields in each message is available in the `.pb.h` file. This allows a kind of |
| metaprogramming using `X-macros`_. |
| |
| .. _`X-macros`: https://en.wikipedia.org/wiki/X_Macro |
| |
| One feature that this can be used for is binding the message descriptor to a |
| custom structure or C++ class type. You could have a bunch of other fields in |
| the structure and even the datatypes can be different to an extent, and nanopb |
| will automatically detect the size and position of each field. The generated |
| `.pb.c` files now just contain calls to `PB_BIND(msgname, structname, width)`. |
| Adding a similar call to your own code will bind the message to your own structure. |
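| |
| The general X-macro technique can be sketched without any nanopb headers. In the |
| example below, `SENSOR_FIELDLIST`, `sensor_t`, `field_info_t` and `FIELD_INFO` |
| are hypothetical names invented for this illustration; the sketch does not use |
| the generated field lists or `PB_BIND`, it only shows how a single field list |
| can be expanded against a user-defined structure to find the size and position |
| of each field. |
| |
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical field list in X-macro form: X(type, name, tag). */
#define SENSOR_FIELDLIST(X) \
    X(int32_t, id,          1) \
    X(float,   temperature, 2) \
    X(uint8_t, flags,       3)

/* User-defined structure with an extra member that is not a message field. */
typedef struct {
    void   *user_context;
    int32_t id;
    float   temperature;
    uint8_t flags;
} sensor_t;

typedef struct {
    const char *name;
    unsigned    tag;
    size_t      offset; /* position of the member inside sensor_t */
    size_t      size;   /* size of the member in bytes */
} field_info_t;

/* Expand one list entry into a table row; the compiler fills in the layout. */
#define FIELD_INFO(type, name, tag) \
    { #name, tag, offsetof(sensor_t, name), sizeof(((sensor_t *)0)->name) },

static const field_info_t sensor_fields[] = {
    SENSOR_FIELDLIST(FIELD_INFO)
};

/* The same list expanded a second time, here to count the fields. */
#define COUNT_ONE(type, name, tag) + 1
enum { SENSOR_FIELD_COUNT = 0 SENSOR_FIELDLIST(COUNT_ONE) };
```
| |
| Since the offsets and sizes come from `offsetof` and `sizeof`, the table stays |
| correct if members are reordered or extra members such as `user_context` are |
| added to the structure. |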
| |
| UTF-8 validation |
| ---------------- |
| The protobuf format specifies that strings should consist of valid UTF-8. |
| Previously nanopb did not enforce this, requiring extra care in the user code. |
| Now optional UTF-8 validation is available with the compilation option `PB_VALIDATE_UTF8`. |
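| |
| For reference, the kind of check involved can be sketched as a standalone |
| function. `utf8_is_valid` below is a hypothetical helper written for this |
| example and is not the routine used inside nanopb; it rejects truncated |
| sequences, overlong encodings, surrogates and code points above U+10FFFF. |
| |
```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if `b` is a UTF-8 continuation byte (binary 10xxxxxx). */
static bool is_continuation(uint8_t b)
{
    return (b & 0xC0u) == 0x80u;
}

/* Check that the `len` bytes at `s` form well-formed UTF-8. */
static bool utf8_is_valid(const uint8_t *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        uint8_t b0 = s[i];

        if (b0 < 0x80u) {
            i += 1; /* ASCII */
        } else if (b0 >= 0xC2u && b0 <= 0xDFu) {
            if (len - i < 2 || !is_continuation(s[i + 1])) return false;
            i += 2;
        } else if (b0 >= 0xE0u && b0 <= 0xEFu) {
            if (len - i < 3 || !is_continuation(s[i + 1]) ||
                !is_continuation(s[i + 2])) return false;
            if (b0 == 0xE0u && s[i + 1] < 0xA0u) return false; /* overlong */
            if (b0 == 0xEDu && s[i + 1] > 0x9Fu) return false; /* surrogate */
            i += 3;
        } else if (b0 >= 0xF0u && b0 <= 0xF4u) {
            if (len - i < 4 || !is_continuation(s[i + 1]) ||
                !is_continuation(s[i + 2]) ||
                !is_continuation(s[i + 3])) return false;
            if (b0 == 0xF0u && s[i + 1] < 0x90u) return false; /* overlong */
            if (b0 == 0xF4u && s[i + 1] > 0x8Fu) return false; /* > U+10FFFF */
            i += 4;
        } else {
            return false; /* stray continuation byte, 0xC0, 0xC1 or 0xF5..0xFF */
        }
    }
    return true;
}
```
| |
| A check like this costs code space and decoding time, which is presumably why |
| the validation is optional rather than always enabled. |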
| |
| Double to float conversion |
| -------------------------- |
| Some platforms such as `AVR` do not support the `double` datatype, instead making |
| it an alias for `float`. This has resulted in problems when trying to process |
| message types containing `double` fields generated on other machines. Previously |
| there was only an example showing how to manually perform the conversion between |
| `double` and `float`. |
| |
| Now that example is integrated as an optional feature in nanopb core. By defining |
| `PB_CONVERT_DOUBLE_FLOAT`, the required conversion between 32- and 64-bit floating |
| point formats happens automatically on decoding and encoding. |
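| |
| On a platform without a 64-bit `double`, such a conversion has to work on the |
| raw bit pattern using integer arithmetic. The function below is a minimal |
| sketch of the decoding direction and is not the code used in nanopb; the name |
| `double_bits_to_float_bits` is made up for this example, and as a |
| simplification it flushes results that would be subnormal as a `float` to zero. |
| |
```c
#include <stdint.h>

/* Convert an IEEE 754 binary64 bit pattern to a binary32 bit pattern. */
static uint32_t double_bits_to_float_bits(uint64_t d)
{
    uint32_t sign = (uint32_t)(d >> 63) << 31;
    int32_t exponent = (int32_t)((d >> 52) & 0x7FFu);
    uint64_t mantissa = d & UINT64_C(0xFFFFFFFFFFFFF); /* 52 fraction bits */
    uint32_t fraction = (uint32_t)(mantissa >> 29);    /* top 23 fraction bits */
    uint64_t dropped = mantissa & UINT64_C(0x1FFFFFFF); /* 29 discarded bits */
    uint32_t result;

    if (exponent == 0x7FF) {
        /* Infinity or NaN: make sure a NaN does not collapse into infinity. */
        if (mantissa != 0 && fraction == 0) fraction = 1;
        return sign | 0x7F800000u | fraction;
    }

    exponent = exponent - 1023 + 127; /* change the exponent bias */

    if (exponent >= 0xFF) return sign | 0x7F800000u; /* overflow to infinity */
    if (exponent <= 0) return sign; /* simplification: flush to signed zero */

    result = sign | ((uint32_t)exponent << 23) | fraction;

    /* Round to nearest, ties to even. A carry out of the fraction correctly
     * bumps the exponent, possibly all the way to infinity. */
    if (dropped > UINT64_C(0x10000000) ||
        (dropped == UINT64_C(0x10000000) && (fraction & 1u))) {
        result++;
    }
    return result;
}
```
| |
| For example, the 64-bit pattern of 1.0 (`0x3FF0000000000000`) maps to the 32-bit |
| pattern `0x3F800000`. Encoding needs the reverse mapping, which widens the |
| exponent and fraction instead of narrowing them. |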
| |
| Improved testing |
| ---------------- |
| Testing on embedded platforms has been integrated into the continuous testing |
| environment. Now all of the 80+ test cases are automatically run on STM32 and |
| AVR targets. Previously only a few specialized test cases were manually tested |
| on embedded systems. |
| |
| The nanopb fuzzer has also been integrated into Google's `OSSFuzz`_ platform, giving |
| a huge boost in the CPU power available for randomized testing. |
| |
| .. _`OSSFuzz`: https://google.github.io/oss-fuzz/ |
| |
| |