| # Emboss User Guide |
| |
| |
| ## Getting Started |
| |
| First, you must identify a data structure you want to read and write. These are |
| often documented in hardware manuals a bit like [this one, for the fictional |
| BN-P-6000404 illuminated button panel](BogoNEL_BN-P-6000404_User_Guide.pdf). We |
| will use the BN-P-6000404 as an example. |
| |
| |
| ### A Caution |
| |
| Emboss is still beta software. While we believe that we will not need to make |
| any more breaking changes before 1.0, you may still encounter bugs and there are |
| many missing features. |
| |
| You can contact `emboss-dev@google.com` with any issues. Emboss is not an |
| officially supported Google product, but the Emboss authors will try to answer |
| emails. |
| |
| |
| ### System Requirements |
| |
| #### Running the Emboss Compiler |
| |
| The Emboss compiler requires Python 3.8 or later -- the minimum supported |
| version tracks the support timeline of the Python project. On a Linux-like |
| system with Python 3 installed in the usual place (`/usr/bin/python3`), you |
| can run the embossc script at the top level on an `.emb` file to generate |
| C++, like so: |
| |
| ``` |
| embossc --generate cc --output-path path/to/object/dir path/to/input.emb |
| ``` |
| |
| If your project is using Bazel, the `build_defs.bzl` file has an |
| `emboss_cc_library` rule that you can use from your project. |
| |
| |
| #### Using the Generated C++ Code |
| |
| The code generated by Emboss requires a C++11-compliant compiler, and a |
| reasonably up-to-date standard library. Emboss has been tested with GCC and |
| Clang, and with libc++ and libstdc++. In theory, it should work with MSVC, ICC, |
| etc., but those have not been tested, so there are likely to be bugs. |
| |
| The generated C++ code lives entirely in a `.h` file, one per `.emb` file. All |
| of the generated code is in C++ templates or (in a very few cases) `inline` |
| functions. The generated code is structured this way in order to implement |
| "pay-as-you-use" for code size: any functions, methods, or views that are not |
| used by your code won't end up in your final binary. This is often important |
| for environments like microcontrollers! |
| |
| There is an Emboss runtime library (under `runtime/cpp`), which is also |
| header-only. You will need to add the root of the Emboss source tree to your |
| `#include` path. |
| |
| Note that it is *strongly* recommended that you compile your release code with |
| at least some optimizations: `-Os` or `-O2`. The Emboss generated code leans |
| fairly heavily on your C++ compiler's inlining and common code elimination to |
| produce fast, lean compiled code. |
| |
| |
| #### Contributing to the Compiler |
| |
| If you want to contribute features or bugfixes to the Emboss compiler itself, |
| you will need Bazel to run the Emboss test suite. |
| |
| |
| ### Create an `.emb` file |
| |
| Next, you will need to translate your structures into an `.emb` file. We start |
| with the file-level attributes: |
| |
| ``` |
| [$default byte_order: "LittleEndian"] |
| [(cpp) namespace: "bogonel::bnp6000404"] |
| ``` |
| |
| The BN-P-6000404 uses little-endian numbers, so we can set the default byte |
| order to `LittleEndian`. There is no particular C++ namespace implied by the |
| BN-P-6000404 user guide, so we use one that is specific to the BN-P-6000404. |
| |
| The BN-P-6000404, like many devices with serial interfaces, uses a framed |
| message system, with a fixed header and a variable message body depending on a |
| message ID. For the BN-P-6000404, this framing looks like this: |
| |
| <!-- TODO(bolms): finalize the "magic value initialization" feature, document it |
| here. --> |
| |
| ``` |
| struct Message: |
|   -- Top-level message structure, specified in section 5.3 of the BN-P-6000404 |
|   -- user guide. |
| |
|   0 [+1] UInt sync_1 |
|     [requires: this == 0x42] |
| |
|   1 [+1] UInt sync_2 |
|     [requires: this == 0x4E] |
| |
|   2 [+1] MessageId message_id |
|     -- Type of message |
| |
|   3 [+1] UInt message_length (ml) |
|     -- Length of message, including header and checksum |
| |
|   # ... body fields to follow ... |
| ``` |
| |
| We could have chosen to put the header fields into a separate `Header` structure |
| instead of placing them directly in the `Message` structure. |
| |
| The `sync_1` and `sync_2` fields are required to have specific magic values, so |
| we add the appropriate `[requires: ...]` attributes to them. This tells Emboss |
| that if those fields do not have those values, then the `Message` `struct` is |
| ill-formed: in the client code, the `Message` will not be `Ok()` if those fields |
| have the wrong values, and Emboss will not allow wrong values to be written into |
| those fields using the checked (default) APIs. |
| |
| Unfortunately, BogoNEL does not provide a nice table of message IDs, but |
| fortunately there are only a few, so we can gather them from the individual |
| messages: |
| |
| ``` |
| enum MessageId: |
|   -- Message type identifiers for the BN-P-6000404. |
|   IDENTIFICATION = 0x01 |
|   INTERACTION = 0x02 |
|   QUERY_IDENTIFICATION = 0x10 |
|   QUERY_BUTTONS = 0x11 |
|   SET_ILLUMINATION = 0x12 |
| ``` |
| |
| Next, we should translate the individual messages to Emboss. |
| |
| ``` |
| struct Identification: |
|   -- IDENTIFICATION message, specified in section 5.3.3. |
| |
|   0 [+4] UInt vendor |
|     # 0x4F474F42 is "BOGO" in ASCII, interpreted as a 4-byte little-endian |
|     # value. |
|     [requires: this == 0x4F47_4F42] |
| |
|   0 [+4] UInt:8[4] vendor_ascii |
|     -- "BOGO" for BogoNEL Corp |
|     # The `vendor` field really contains the four ASCII characters "BOGO", so we |
|     # could use a byte array instead of a single UInt. Since it is valid to |
|     # have overlapping fields, we can have both `vendor` and `vendor_ascii` in |
|     # our Emboss specification. |
| |
|   4 [+2] UInt firmware_major |
|     -- Firmware major version |
| |
|   6 [+2] UInt firmware_minor |
|     -- Firmware minor version |
| ``` |
| |
| <!-- TODO(bolms): fixed-length, ASCIIZ, and variable-length string support? --> |
| |
| The `Identification` structure is fairly straightforward. In this case, we |
| provide an alternate view of the `vendor` field via `vendor_ascii`: 0x4F474F42 |
| in little-endian works out to the ASCII characters "BOGO". |
| |
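The byte-order arithmetic behind `vendor` and `vendor_ascii` can be checked by hand. The following is a minimal plain C++ sketch, not code generated by Emboss; the helper names `ReadLittleEndian` and `VendorIsBogo` are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Reads `size` bytes (at most 4) starting at `bytes` as an unsigned
// little-endian integer: the first byte is the least significant.
// Hypothetical helper, not part of the Emboss runtime.
inline std::uint32_t ReadLittleEndian(const unsigned char *bytes,
                                      std::size_t size) {
  std::uint32_t value = 0;
  for (std::size_t i = 0; i < size; ++i) {
    value |= static_cast<std::uint32_t>(bytes[i]) << (8 * i);
  }
  return value;
}

// Returns true if the four bytes at `bytes` are the ASCII characters "BOGO",
// which is the same condition as the `vendor` field equalling 0x4F474F42.
inline bool VendorIsBogo(const unsigned char *bytes) {
  return ReadLittleEndian(bytes, 4) == 0x4F474F42u;
}
```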
| Note that `vendor_ascii` uses `UInt:8[4]` for its type, and not `UInt[4]`. For |
| most fields, we can use plain `UInt` and Emboss will figure out how big the |
| `UInt` should be, but for an array we must be explicit that we want 8-bit |
| elements. |
| |
| ``` |
| struct Interaction: |
|   -- INTERACTION message, specified in section 5.3.4. |
| |
|   0 [+1] UInt number_of_buttons (n) |
|     -- Number of buttons currently depressed by user |
| |
|   4 [+n] ButtonId:8[n] button_id |
|     -- ID of pressed button. A number of entries equal to number_of_buttons |
|     -- will be provided. |
| ``` |
| |
| <!-- TODO(bolms): reserved field support --> |
| |
| `Interaction` is also fairly straightforward. The only tricky bit is the |
| `button_id` field: since `Interaction` can return a variable number of button |
| IDs, depending on how many buttons are currently pressed, the `button_id` field |
| must have length `n`. It would have been OK to use `[+number_of_buttons]`, but |
| full field names can get cumbersome, particularly when the length involves a |
| more complex expression. Instead, we set an *alias* for `number_of_buttons` |
| using `(n)`, and then use the alias in `button_id`'s length. The `n` alias is |
| not visible outside of the `Interaction` message, and won't be available in the |
| generated code, so the short name is not likely to cause confusion. |
| |
| ``` |
| enum ButtonId: |
|   -- Button IDs, specified in table 5-6. |
|   BUTTON_A = 0x00 |
|   BUTTON_B = 0x04 |
|   BUTTON_C = 0x08 |
|   BUTTON_D = 0x0C |
|   BUTTON_E = 0x01 |
|   BUTTON_F = 0x05 |
|   BUTTON_G = 0x09 |
|   BUTTON_H = 0x0D |
|   BUTTON_I = 0x02 |
|   BUTTON_J = 0x06 |
|   BUTTON_K = 0x0A |
|   BUTTON_L = 0x0E |
|   BUTTON_M = 0x03 |
|   BUTTON_N = 0x07 |
|   BUTTON_O = 0x0B |
|   BUTTON_P = 0x0F |
| ``` |
| |
| We had to prefix all of the button names with `BUTTON_` because Emboss does not |
| allow single-character enum names. |
| |
| The QUERY IDENTIFICATION and QUERY BUTTONS messages don't have any fields other |
| than `checksum`, so we will handle them a bit differently. |
| |
| ``` |
| struct SetIllumination: |
|   -- SET ILLUMINATION message, specified in section 5.3.7. |
| |
|   0 [+1] bits: |
|     0 [+1] Flag red_channel_enable |
|       -- Enables setting the RED channel. |
| |
|     1 [+1] Flag blue_channel_enable |
|       -- Enables setting the BLUE channel. |
| |
|     2 [+1] Flag green_channel_enable |
|       -- Enables setting the GREEN channel. |
| |
|   1 [+1] UInt blink_duty |
|     -- Sets the proportion of time between time on and time off for blink |
|     -- feature. |
|     -- |
|     -- Minimum value = 0 (no illumination) |
|     -- |
|     -- Maximum value = 240 (constant illumination) |
|     [requires: 0 <= this <= 240] |
| |
|   2 [+2] UInt blink_period |
|     -- Sets the blink period, in milliseconds. |
|     -- |
|     -- Minimum value = 10 |
|     -- |
|     -- Maximum value = 10000 |
|     [requires: 10 <= this <= 10_000] |
| |
|   4 [+4] bits: |
|     0 [+32] UInt:2[16] intensity |
|       -- Intensity values for the unmasked channels. 2 bits of intensity for |
|       -- each button. |
| ``` |
| |
| `SetIllumination` requires us to use bitfields. The first bitfield is in the |
| CHANNEL MASK field: rather than making a single `channel_mask` field, Emboss |
| lets us specify the red, green, and blue channel masks separately. |
| |
| As with `sync_1` and `sync_2`, we have added `[requires: ...]` to the |
| `blink_duty` and `blink_period` fields: this time, specifying a range of valid |
| values. `[requires: ...]` accepts an arbitrary expression, which can be as |
| simple or as complex as desired. |
| |
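To make the two range constraints concrete, here they are written out by hand in plain C++. This is only a sketch of what the `[requires: ...]` expressions mean: the function names are hypothetical, and this is not the code that Emboss generates.

```cpp
#include <cstdint>

// Mirrors `[requires: 0 <= this <= 240]` on `blink_duty`. The lower bound
// always holds for an unsigned value, so only the upper bound is tested.
// Hypothetical helper for illustration.
inline bool BlinkDutyIsValid(std::uint8_t blink_duty) {
  return blink_duty <= 240;
}

// Mirrors `[requires: 10 <= this <= 10_000]` on `blink_period`.
// Hypothetical helper for illustration.
inline bool BlinkPeriodIsValid(std::uint16_t blink_period) {
  return blink_period >= 10 && blink_period <= 10000;
}
```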
| It is not clear from BogoNEL's documentation whether "bit 0" means the least |
| significant or most significant bit of its byte, but a little experimentation |
| with the device shows that setting the least significant bit causes |
| `SetIllumination` to set its red channel. Emboss always numbers bits in |
| bitfields from least significant (bit 0) to most significant. |
| |
| The other bitfield is the `intensity` array. The BN-P-6000404 packs sixteen |
| 2-bit intensity values into 32 bits, so we declare the array as `UInt:2[16]` |
| inside a 4-byte `bits` block. |
| |
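The bit numbering can be illustrated with ordinary shifts and masks. The sketch below is plain C++ with hypothetical helper names, not the Emboss API. It assumes that element 0 of the `intensity` array occupies the two least significant bits of the 32-bit value, consistent with bit 0 being the least significant bit.

```cpp
#include <cstdint>

// Returns bit `index` of `byte`, where bit 0 is the least significant bit.
inline bool GetBit(std::uint8_t byte, unsigned index) {
  return ((byte >> index) & 1u) != 0;
}

// Returns 2-bit element `index` (0 to 15) of `word`, assuming element 0 sits
// in the two least significant bits.
inline unsigned GetIntensity(std::uint32_t word, unsigned index) {
  return static_cast<unsigned>((word >> (2 * index)) & 0x3u);
}

// Returns `word` with 2-bit element `index` replaced by the low two bits of
// `value`; all other elements are left unchanged.
inline std::uint32_t SetIntensity(std::uint32_t word, unsigned index,
                                  unsigned value) {
  const std::uint32_t mask = static_cast<std::uint32_t>(0x3u) << (2 * index);
  const std::uint32_t bits = static_cast<std::uint32_t>(value & 0x3u)
                             << (2 * index);
  return (word & ~mask) | bits;
}
```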
| Finally, we should add all of the sub-messages into `Message`, and also take |
| care of `checksum`. After making those changes, `Message` looks like: |
| |
| ``` |
| struct Message: |
|   -- Top-level message structure, specified in section 5.3 of the BN-P-6000404 |
|   -- user guide. |
| |
|   0 [+1] UInt sync_1 |
|     [requires: this == 0x42] |
| |
|   1 [+1] UInt sync_2 |
|     [requires: this == 0x4E] |
| |
|   2 [+1] MessageId message_id |
|     -- Type of message |
| |
|   3 [+1] UInt message_length (ml) |
|     -- Length of message, including header and checksum |
| |
|   if message_id == MessageId.IDENTIFICATION: |
|     4 [+ml-8] Identification identification |
| |
|   if message_id == MessageId.INTERACTION: |
|     4 [+ml-8] Interaction interaction |
| |
|   if message_id == MessageId.SET_ILLUMINATION: |
|     4 [+ml-8] SetIllumination set_illumination |
| |
|   0 [+ml-4] UInt:8[] checksummed_bytes |
| |
|   ml-4 [+4] UInt checksum |
| ``` |
| |
| By wrapping the various message types in `if message_id == ...` constructs, |
| those substructures will only be available when the `message_id` field is set to |
| the corresponding message type. This kind of selection is used for any |
| structure field that is only valid some of the time. |
| |
| The substructures all have the length `ml-8`. The `ml` is a short alias for the |
| `message_length` field; these short aliases are available so that the field |
| types and names don't have to be pushed far to the right. Aliases may only be |
| used directly in the same structure definition where they are created; they may |
| not be used elsewhere in an Emboss file, and they are not available in the |
| generated code. The length is `ml-8` in this case because the `message_length` |
| includes the header and checksum, which are left out of the substructures. |
| |
| Note that we simply don't have any subfield for QUERY IDENTIFICATION or QUERY |
| BUTTONS: since those messages do not have any fields, there is no need for a |
| zero-byte structure. |
| |
| We also added the `checksummed_bytes` field as a convenience for computing the |
| checksum. |
| |
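The offset arithmetic in `Message` can be spelled out as follows. This is a hand-written sketch in plain C++, not generated code; `MessageLayout` and `ComputeMessageLayout` are hypothetical names, and the sizes are taken from the structure definition (a 4-byte header and a 4-byte checksum).

```cpp
#include <cstddef>

// Sizes from the `Message` definition: four 1-byte header fields and a
// 4-byte trailing checksum.
constexpr std::size_t kHeaderSize = 4;
constexpr std::size_t kChecksumSize = 4;

// Hypothetical helper type; not part of the generated code.
struct MessageLayout {
  std::size_t body_offset;       // start of the substructure (always 4)
  std::size_t body_size;         // ml - 8
  std::size_t checksummed_size;  // length of `checksummed_bytes`: ml - 4
  std::size_t checksum_offset;   // start of `checksum`: ml - 4
};

// Fills `layout` from the value of `message_length`. Returns false if the
// length is too small to hold even the header and checksum.
inline bool ComputeMessageLayout(std::size_t message_length,
                                 MessageLayout *layout) {
  if (message_length < kHeaderSize + kChecksumSize) {
    return false;
  }
  layout->body_offset = kHeaderSize;
  layout->body_size = message_length - kHeaderSize - kChecksumSize;
  layout->checksummed_size = message_length - kChecksumSize;
  layout->checksum_offset = message_length - kChecksumSize;
  return true;
}
```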
| |
| ### Generate code |
| |
| Once you have an `.emb`, you will need to generate code from it. |
| |
| The simplest way to do so is to run the `embossc` tool: |
| |
| ``` |
| embossc -I src --generate cc --output-path generated bogonel.emb |
| ``` |
| |
| The `-I` option adds a directory to the *include path*. The input file -- in |
| this case, `bogonel.emb` -- must be found somewhere on the include path. |
| |
| The `--generate` option specifies which back end to use; `cc` is the C++ back |
| end. |
| |
| The `--output-path` option specifies where the generated file should be placed. |
| Note that the output path will include all of the path components of the input |
| file: if the input file is `x/y/z.emb`, then the path `x/y/z.emb.h` will be |
| appended to the `--output-path`. Missing directories will be created. |
| |
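The output-path rule described above amounts to simple string concatenation. The following sketch only illustrates that rule; `GeneratedHeaderPath` is a hypothetical helper, not part of `embossc`, and it assumes `/` as the path separator.

```cpp
#include <string>

// Returns the location of the generated header: the input file's path, with
// ".h" appended, placed under `output_path`. Hypothetical helper for
// illustration only.
inline std::string GeneratedHeaderPath(const std::string &output_path,
                                       const std::string &input_file) {
  std::string result = output_path;
  if (!result.empty() && result.back() != '/') {
    result += '/';
  }
  return result + input_file + ".h";
}
```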
| |
| <!-- #### Using Bazel --> |
| |
| <!-- TODO(bolms): Make this usable from Bazel. --> |
| |
| |
| ### Include the generated C++ code |
| |
| Emboss generates a single C++ header file from your `.emb` by appending `.h` to |
| the file name: to use the BogoNEL definitions, you would `#include |
| "path/to/bogonel.emb.h"` in your C++ code. |
| |
| Currently, Emboss does not generate a corresponding `.cc` file: the code that |
| Emboss generates is all templates, which exist in the `.h`. Although the Emboss |
| maintainers (e.g., bolms@) like the simplicity of generating a single file, this |
| could change at some point. |
| |
| |
| ### Use the generated C++ code |
| |
| Emboss generates *views*, which your program can use to read and write existing |
| arrays of bytes, and which do not take ownership. For example: |
| |
| ```c++ |
| #include "path/to/bogonel.emb.h" |
| |
| template <typename View> |
| bool ChecksumIsCorrect(View message_view); |
| |
| // Handles BogoNEL BN-P-6000404 device messages from a byte stream. Returns |
| // the number of bytes that were processed. Unprocessed bytes should be |
| // passed into the next call. |
| int HandleBogonelPanelMessages(const char *bytes, int byte_count) { |
|   auto message_view = bogonel::bnp6000404::MakeMessageView(bytes, byte_count); |
| |
|   // IsComplete() will return true if the view has enough bytes to fully |
|   // contain the message; i.e., that byte_count is at least |
|   // message_view.message_length().Read(), which already includes the header |
|   // and checksum. |
|   if (!message_view.IsComplete()) { |
|     return 0; |
|   } |
| |
|   // If Emboss is happy with the message, we still need to check the checksum: |
|   // Emboss does not (yet) have support for automatically checking checksums and |
|   // CRCs. |
|   if (!message_view.Ok() || !ChecksumIsCorrect(message_view)) { |
|     // If the message is complete, but not correct, we need to log an error. |
|     HandleBrokenMessage(message_view); |
|     return message_view.Size(); |
|   } |
| |
|   // At this point, we know the message is complete and (basically) OK, so |
|   // we dispatch it to a message-type-specific handler. |
|   switch (message_view.message_id().Read()) { |
|     case bogonel::bnp6000404::MessageId::IDENTIFICATION: |
|       HandleIdentificationMessage(message_view); |
|       break; |
| |
|     case bogonel::bnp6000404::MessageId::INTERACTION: |
|       HandleInteractionMessage(message_view); |
|       break; |
| |
|     case bogonel::bnp6000404::MessageId::QUERY_IDENTIFICATION: |
|     case bogonel::bnp6000404::MessageId::QUERY_BUTTONS: |
|     case bogonel::bnp6000404::MessageId::SET_ILLUMINATION: |
|       Log("Unexpected host to device message type."); |
|       break; |
| |
|     default: |
|       Log("Unknown message type."); |
|       break; |
|   } |
| |
|   return message_view.Size(); |
| } |
| |
| template <typename View> |
| bool ChecksumIsCorrect(View message_view) { |
|   // Sum every byte covered by the `checksummed_bytes` field of `Message`. |
|   uint32_t checksum = 0; |
|   for (int i = 0; i < message_view.checksummed_bytes().ElementCount(); |
|        ++i) { |
|     checksum += message_view.checksummed_bytes()[i].Read(); |
|   } |
|   return checksum == message_view.checksum().Read(); |
| } |
| ``` |
| |
| <!-- TODO(bolms): solidify support for checksums, so that the Ok() call in the |
| example actually checks them. --> |
| |
| The `message_view` object in this example is a lightweight object that simply |
| provides *access* to the bytes in `bytes`. Emboss views are very cheap to |
| construct because they only contain a couple of pointers and a length -- they do |
| not copy or take ownership of the underlying bytes. This also means that you |
| have to keep the underlying bytes alive as long as you are using a view -- you |
| can't let them go out of scope or delete them. |
| |
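The non-owning behavior can be illustrated with a toy view class. This is a deliberately simplified sketch and is not how Emboss implements its views; `ByteView` and `ViewSharesStorage` are hypothetical names used only here.

```cpp
#include <cstddef>
#include <vector>

// A toy non-owning view: it stores a pointer and a length, and never copies
// or frees the bytes it points at. NOT the Emboss implementation.
class ByteView {
 public:
  ByteView(unsigned char *data, std::size_t size) : data_(data), size_(size) {}

  std::size_t size() const { return size_; }
  unsigned char Read(std::size_t i) const { return data_[i]; }
  void Write(std::size_t i, unsigned char value) const { data_[i] = value; }

 private:
  unsigned char *data_;
  std::size_t size_;
};

// Demonstrates that the view and the buffer share storage: a write through
// either one is visible through the other. The buffer must outlive the view.
inline bool ViewSharesStorage() {
  std::vector<unsigned char> buffer = {0x42, 0x4E, 0x10, 0x08};
  ByteView view(buffer.data(), buffer.size());
  view.Write(2, 0x11);  // Changes buffer[2].
  buffer[3] = 0x09;     // Visible through view.Read(3).
  return buffer[2] == 0x11 && view.Read(3) == 0x09 && view.size() == 4;
}
```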
| Views can also be used for writing, if they are given pointers to mutable |
| memory: |
| |
| ```c++ |
| void ConstructSetIlluminationMessage(const std::vector<bool> &lit_buttons, |
|                                      std::vector<char> *result) { |
|   // The SetIllumination message has a constant size, so SizeInBytes() is |
|   // available as a static method. |
|   int length = bogonel::bnp6000404::SetIllumination::SizeInBytes() + 8; |
|   result->clear(); |
|   result->resize(length); |
| |
|   auto view = bogonel::bnp6000404::MakeMessageView(result); |
|   view.sync_1().Write(0x42); |
|   view.sync_2().Write(0x4E); |
|   view.message_id().Write(bogonel::bnp6000404::MessageId::SET_ILLUMINATION); |
|   view.message_length().Write(length); |
|   view.set_illumination().red_channel_enable().Write(true); |
|   view.set_illumination().blue_channel_enable().Write(true); |
|   view.set_illumination().green_channel_enable().Write(true); |
|   view.set_illumination().blink_duty().Write(240); |
|   view.set_illumination().blink_period().Write(10000); |
|   // `lit_buttons` is expected to hold one entry per element of `intensity`. |
|   for (int i = 0; i < view.set_illumination().intensity().ElementCount(); |
|        ++i) { |
|     view.set_illumination().intensity()[i].Write(lit_buttons[i] ? 3 : 0); |
|   } |
|   // Note: the `checksum` field is not filled in by this example. |
| } |
| ``` |
| |
| |
| ### Use the `.emb` Autoformatter |
| |
| You can use the `.emb` autoformatter to avoid manual formatting. For now, it is |
| available at `compiler/front_end/format.py`. |
| |
| *TODO(bolms): Package the Emboss tools for easy workstation installation.* |