Emboss User Guide

Getting Started

First, you must identify a data structure you want to read and write. These are often documented in hardware manuals a bit like this one, for the fictional BN-P-6000404 illuminated button panel. We will use the BN-P-6000404 as an example.

A Caution

Emboss is still beta software. While we believe that we will not need to make any more breaking changes before 1.0, you may still encounter bugs and there are many missing features.

You can contact emboss-dev@google.com with any issues. Emboss is not an officially supported Google product, but the Emboss authors will try to answer emails.

System Requirements

Running the Emboss Compiler

The Emboss compiler requires Python 3.6 or later. On a Linux-like system with Python 3 installed in the usual place (/usr/bin/python3), you can run the embossc script at the top level on an .emb file to generate C++, like so:

embossc --generate cc --output-path path/to/object/dir path/to/input.emb

If your project is using Bazel, the build_defs.bzl file has an emboss_cc_library rule that you can use from your project.

Using the Generated C++ Code

The code generated by Emboss requires a C++11-compliant compiler and a reasonably up-to-date standard library. Emboss has been tested with GCC and Clang, and with libc++ and libstdc++. In theory, it should work with MSVC, ICC, etc., but it has not been tested with them, so there are likely to be bugs.

The generated C++ code lives entirely in a .h file, one per .emb file. All of the generated code is in C++ templates or (in a very few cases) inline functions. The generated code is structured this way in order to implement “pay-as-you-use” for code size: any functions, methods, or views that are not used by your code won't end up in your final binary. This is often important for environments like microcontrollers!

There is an Emboss runtime library (under runtime/cpp), which is also header-only. You will need to add the root of the Emboss source tree to your #include path.

Note that it is strongly recommended that you compile your release code with at least some optimizations: -Os or -O2. The Emboss generated code leans fairly heavily on your C++ compiler's inlining and common code elimination to produce fast, lean compiled code.

Contributing to the Compiler

If you want to contribute features or bugfixes to the Emboss compiler itself, you will need Bazel to run the Emboss test suite.

Create an .emb file

Next, you will need to translate your structures into Emboss's .emb format.

[$default byte_order: "LittleEndian"]
[(cpp) namespace: "bogonel::bnp6000404"]

The BN-P-6000404 uses little-endian numbers, so we can set the default byte order to LittleEndian. There is no particular C++ namespace implied by the BN-P-6000404 user guide, so we use one that is specific to the BN-P-6000404.

The BN-P-6000404, like many devices with serial interfaces, uses a framed message system, with a fixed header and a variable message body depending on a message ID. For the BN-P-6000404, this framing looks like this:

struct Message:
  -- Top-level message structure, specified in section 5.3 of the BN-P-6000404
  -- user guide.

  0 [+1]  UInt       sync_1
    [requires: this == 0x42]

  1 [+1]  UInt       sync_2
    [requires: this == 0x4E]

  2 [+1]  MessageId  message_id
    -- Type of message

  3 [+1]  UInt       message_length (ml)
    -- Length of message, including header and checksum

  # ... body fields to follow ...

We could have chosen to put the header fields into a separate Header structure instead of placing them directly in the Message structure.

The sync_1 and sync_2 fields are required to have specific magic values, so we add the appropriate [requires: ...] attributes to them. This tells Emboss that if those fields do not have those values, then the Message struct is ill-formed: in the client code, the Message will not be Ok() if those fields have the wrong values, and Emboss will not allow wrong values to be written into those fields using the checked (default) APIs.
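To make the effect of these attributes concrete, here is a hand-written C++ sketch of the equivalent check on a raw buffer. It is not the code Emboss generates, and the name HasValidSync is hypothetical; with the generated view, the same condition is folded into Ok().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of Emboss or its generated code.  Returns true
// if the buffer starts with the sync bytes that the Message struct requires.
bool HasValidSync(const std::uint8_t *bytes, std::size_t size) {
  assert(bytes != nullptr);
  if (size < 2) {
    return false;  // Too short to hold both sync fields.
  }
  return bytes[0] == 0x42 &&  // sync_1
         bytes[1] == 0x4E;    // sync_2
}
```

Declaring the constraint in the .emb file means this kind of check does not have to be repeated by hand at every place a message is read or written.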

Unfortunately, BogoNEL does not provide a nice table of message IDs, but fortunately there are only a few, so we can gather them from the individual messages:

enum MessageId:
  -- Message type identifiers for the BN-P-6000404.
  IDENTIFICATION       = 0x01
  INTERACTION          = 0x02
  QUERY_IDENTIFICATION = 0x10
  QUERY_BUTTONS        = 0x11
  SET_ILLUMINATION     = 0x12

Next, we should translate the individual messages to Emboss.

struct Identification:
  -- IDENTIFICATION message, specified in section 5.3.3.

  0 [+4]  UInt       vendor
    # 0x4F474F42 is "BOGO" in ASCII, interpreted as a 4-byte little-endian
    # value.
    [requires: this == 0x4F47_4F42]

  0 [+4]  UInt:8[4]  vendor_ascii
    -- "BOGO" for BogoNEL Corp
    # The `vendor` field really contains the four ASCII characters "BOGO", so we
    # could use a byte array instead of a single UInt.  Since it is valid to
    # have overlapping fields, we can have both `vendor` and `vendor_ascii` in
    # our Emboss specification.

  4 [+2]  UInt       firmware_major
    -- Firmware major version

  6 [+2]  UInt       firmware_minor
    -- Firmware minor version

The Identification structure is fairly straightforward. In this case, we provide an alternate view of the vendor field via vendor_ascii: 0x4F474F42 in little-endian works out to the ASCII characters “BOGO”.
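The relationship between the two views of the vendor bytes can be checked with a few lines of plain C++. This is only an illustrative sketch: ReadLittleEndian32 is a hypothetical helper, not an Emboss API, and Emboss performs the equivalent byte-order handling for you when you read the vendor field.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not an Emboss API.  Assembles four bytes into a 32-bit
// value, treating the first byte as the least significant (little-endian).
std::uint32_t ReadLittleEndian32(const unsigned char *bytes) {
  assert(bytes != nullptr);
  return static_cast<std::uint32_t>(bytes[0]) |
         (static_cast<std::uint32_t>(bytes[1]) << 8) |
         (static_cast<std::uint32_t>(bytes[2]) << 16) |
         (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

Passing the bytes 'B', 'O', 'G', 'O' (0x42, 0x4F, 0x47, 0x4F) to this function yields 0x4F474F42, the value required of the vendor field.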

Note that vendor_ascii uses UInt:8[4] for its type, and not UInt[4]. For most fields, we can use plain UInt and Emboss will figure out how big the UInt should be, but for an array we must be explicit that we want 8-bit elements.

struct Interaction:
  -- INTERACTION message, specified in section 5.3.4.

  0 [+1]  UInt           number_of_buttons (n)
    -- Number of buttons currently depressed by user

  4 [+n]  ButtonId:8[n]  button_id
    -- ID of pressed button.  A number of entries equal to number_of_buttons
    -- will be provided.

Interaction is also fairly straightforward. The only tricky bit is the button_id field: since Interaction can return a variable number of button IDs, depending on how many buttons are currently pressed, the button_id field must have length n. It would have been OK to use [+number_of_buttons], but full field names can get cumbersome, particularly when the length involves a more complex expression. Instead, we set an alias for number_of_buttons using (n), and then use the alias in button_id's length. The n alias is not visible outside of the Interaction message, and won't be available in the generated code, so the short name is not likely to cause confusion.

enum ButtonId:
  -- Button IDs, specified in table 5-6.
  BUTTON_A = 0x00
  BUTTON_B = 0x04
  BUTTON_C = 0x08
  BUTTON_D = 0x0C
  BUTTON_E = 0x01
  BUTTON_F = 0x05
  BUTTON_G = 0x09
  BUTTON_H = 0x0D
  BUTTON_I = 0x02
  BUTTON_J = 0x06
  BUTTON_K = 0x0A
  BUTTON_L = 0x0E
  BUTTON_M = 0x03
  BUTTON_N = 0x07
  BUTTON_O = 0x0B
  BUTTON_P = 0x0F

We had to prefix all of the button names with BUTTON_ because Emboss does not allow single-character enum names.

The QUERY IDENTIFICATION and QUERY BUTTONS messages don't have any fields other than checksum, so we will handle them a bit differently.

struct SetIllumination:
  -- SET ILLUMINATION message, specified in section 5.3.7.

  0 [+1]    bits:
    0 [+1]  Flag  red_channel_enable
      -- Enables setting the RED channel.

    1 [+1]  Flag  blue_channel_enable
      -- Enables setting the BLUE channel.

    2 [+1]  Flag  green_channel_enable
      -- Enables setting the GREEN channel.

  1 [+1]    UInt  blink_duty
      -- Sets the proportion of time between time on and time off for blink
      -- feature.
      -- Minimum value = 0 (no illumination)
      -- Maximum value = 240 (constant illumination)
      [requires: 0 <= this <= 240]

  2 [+2]    UInt  blink_period
      -- Sets the blink period, in milliseconds.
      -- Minimum value = 10
      -- Maximum value = 10000
      [requires: 10 <= this <= 10_000]

  4 [+4]    bits:
    0 [+32]  UInt:2[16]  intensity
      -- Intensity values for the unmasked channels.  2 bits of intensity for
      -- each button.

SetIllumination requires us to use bitfields. The first bitfield is in the CHANNEL MASK field: rather than making a single channel_mask field, Emboss lets us specify the red, green, and blue channel masks separately.

As with sync_1 and sync_2, we have added [requires: ...] to the blink_duty and blink_period fields: this time, specifying a range of valid values. [requires: ...] accepts an arbitrary expression, which can be as simple or as complex as desired.

It is not clear from BogoNEL's documentation whether “bit 0” means the least significant or most significant bit of its byte, but a little experimentation with the device shows that setting the least significant bit causes SetIllumination to set its red channel. Emboss always numbers bits in bitfields from least significant (bit 0) to most significant.

The other bitfield is the intensity array. The BN-P-6000404 uses an array of sixteen 2-bit intensity values packed into 32 bits, so we specify that array as UInt:2[16].
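As a rough illustration of what a packed array of 2-bit values means at the bit level, the following plain C++ sketch reads and writes one element of a 32-bit word. It assumes that element i sits at bits 2*i and 2*i+1, counting from the least significant bit, which matches the bit numbering described above; SetIntensity and GetIntensity are hypothetical names, not Emboss APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not Emboss APIs.  Element `index` of the packed array
// is assumed to occupy bits 2*index and 2*index+1, with bit 0 being the least
// significant bit of the 32-bit word.
std::uint32_t SetIntensity(std::uint32_t packed, int index, unsigned value) {
  assert(index >= 0 && index < 16);
  assert(value <= 3);
  const int shift = 2 * index;
  packed &= ~(std::uint32_t{3} << shift);                 // Clear the old value.
  packed |= static_cast<std::uint32_t>(value) << shift;   // Store the new value.
  return packed;
}

unsigned GetIntensity(std::uint32_t packed, int index) {
  assert(index >= 0 && index < 16);
  return (packed >> (2 * index)) & 3u;
}
```

With the generated code, this shifting and masking is hidden behind the intensity()[i].Read() and intensity()[i].Write() accessors.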

Finally, we should add all of the sub-messages into Message, and also take care of checksum. After making those changes, Message looks like:

struct Message:
  -- Top-level message structure, specified in section 5.3 of the BN-P-6000404
  -- user guide.

  0 [+1]       UInt                 sync_1
    [requires: this == 0x42]

  1 [+1]       UInt                 sync_2
    [requires: this == 0x4E]

  2 [+1]       MessageId            message_id
    -- Type of message

  3 [+1]       UInt                 message_length (ml)
    -- Length of message, including header and checksum

  if message_id == MessageId.IDENTIFICATION:
    4 [+ml-8]  Identification       identification

  if message_id == MessageId.INTERACTION:
    4 [+ml-8]  Interaction          interaction

  if message_id == MessageId.SET_ILLUMINATION:
    4 [+ml-8]  SetIllumination      set_illumination

  0 [+ml-4]    UInt:8[]             checksummed_bytes

  ml-4 [+4]    UInt                 checksum

By wrapping the various message types in if message_id == ... constructs, those substructures will only be available when the message_id field is set to the corresponding message type. This kind of selection is used for any structure field that is only valid some of the time.

The substructures all have the length ml-8. The ml is a short alias for the message_length field; these short aliases are available so that the field types and names don't have to be pushed far to the right. Aliases may only be used directly in the same structure definition where they are created; they may not be used elsewhere in an Emboss file, and they are not available in the generated code. The length is ml-8 in this case because message_length includes the 4-byte header and the 4-byte checksum, which are left out of the substructures.
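The offset arithmetic in Message can be spelled out in plain C++ as a sanity check. This is a sketch under the layout shown above (a 4-byte header and a 4-byte checksum); MessageLayout and ComputeLayout are hypothetical names and are not generated by Emboss.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct, not generated by Emboss.
struct MessageLayout {
  std::size_t body_offset;      // Where the substructure starts (always 4).
  std::size_t body_length;      // ml - 8
  std::size_t checksum_offset;  // ml - 4
};

const std::size_t kHeaderSize = 4;    // sync_1, sync_2, message_id, message_length
const std::size_t kChecksumSize = 4;  // Trailing checksum field.

// Derives the body and checksum positions from the message_length value.
MessageLayout ComputeLayout(std::size_t message_length) {
  assert(message_length >= kHeaderSize + kChecksumSize);
  MessageLayout layout;
  layout.body_offset = kHeaderSize;
  layout.body_length = message_length - kHeaderSize - kChecksumSize;
  layout.checksum_offset = message_length - kChecksumSize;
  return layout;
}
```

For example, an 8-byte Identification body gives a message_length of 16, so the body spans bytes 4 through 11 and the checksum starts at byte 12. A message_length of 8 leaves a zero-length body.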

Note that we simply don't have any subfield for QUERY IDENTIFICATION or QUERY BUTTONS: since those messages do not have any fields, there is no need for a zero-byte structure.

We also added the checksummed_bytes field as a convenience for computing the checksum.

Generate code

Once you have an .emb, you will need to generate code from it.

The simplest way to do so is to run the embossc tool:

embossc -I src --generate cc --output-path generated bogonel.emb

The -I option adds a directory to the include path. The input file -- in this case, bogonel.emb -- must be found somewhere on the include path.

The --generate option specifies which back end to use; cc is the C++ back end.

The --output-path option specifies where the generated file should be placed. Note that the output path will include all of the path components of the input file: if the input file is x/y/z.emb, then the path x/y/z.emb.h will be appended to the --output-path. Missing directories will be created.

Include the generated C++ code

Emboss generates a single C++ header file from your .emb by appending .h to the file name: to use the BogoNEL definitions, you would #include "path/to/bogonel.emb.h" in your C++ code.

Currently, Emboss does not generate a corresponding .cc file: the code that Emboss generates is all templates, which exist in the .h. Although the Emboss maintainers (e.g., bolms@) like the simplicity of generating a single file, this could change at some point.

Use the generated C++ code

Emboss generates views, which your program can use to read and write existing arrays of bytes, and which do not take ownership. For example:

#include <cstdint>

#include "path/to/bogonel.emb.h"

template <typename View>
bool ChecksumIsCorrect(View message_view);

// Handles BogoNEL BN-P-6000404 device messages from a byte stream.  Returns
// the number of bytes that were processed.  Unprocessed bytes should be
// passed into the next call.
int HandleBogonelPanelMessages(const char *bytes, int byte_count) {
  auto message_view = bogonel::bnp6000404::MakeMessageView(bytes, byte_count);

  // IsComplete() will return true if the view has enough bytes to fully
  // contain the message; i.e., that byte_count is at least
  // message_view.message_length().Read().
  if (!message_view.IsComplete()) {
    return 0;
  }

  // If Emboss is happy with the message, we still need to check the checksum:
  // Emboss does not (yet) have support for automatically checking checksums and
  // CRCs.
  if (!message_view.Ok() || !ChecksumIsCorrect(message_view)) {
    // If the message is complete, but not correct, we need to log an error.
    Log("Malformed message.");
    return message_view.Size();
  }

  // At this point, we know the message is complete and (basically) OK, so
  // we dispatch it to a message-type-specific handler.
  switch (message_view.message_id().Read()) {
    case bogonel::bnp6000404::MessageId::IDENTIFICATION:
      // Handle the IDENTIFICATION message here.
      break;

    case bogonel::bnp6000404::MessageId::INTERACTION:
      // Handle the INTERACTION message here.
      break;

    case bogonel::bnp6000404::MessageId::QUERY_IDENTIFICATION:
    case bogonel::bnp6000404::MessageId::QUERY_BUTTONS:
    case bogonel::bnp6000404::MessageId::SET_ILLUMINATION:
      Log("Unexpected host to device message type.");
      break;

    default:
      Log("Unknown message type.");
      break;
  }

  return message_view.Size();
}

// Sums every byte covered by checksummed_bytes and compares the total with
// the checksum field stored at the end of the message.
template <typename View>
bool ChecksumIsCorrect(View message_view) {
  uint32_t checksum = 0;
  for (int i = 0; i < message_view.checksummed_bytes().ElementCount(); ++i) {
    checksum += message_view.checksummed_bytes()[i].Read();
  }
  return checksum == message_view.checksum().Read();
}

The message_view object in this example is a lightweight object that simply provides access to the bytes in message. Emboss views are very cheap to construct because they only contain a couple of pointers and a length -- they do not copy or take ownership of the underlying bytes. This also means that you have to keep the underlying bytes alive as long as you are using a view -- you can't let them go out of scope or delete them.
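The ownership rule is easier to see with a stripped-down stand-in. The ByteView class below is a hypothetical, simplified sketch and is not how Emboss views are implemented; it only shows the idea of an object that holds a pointer and a length and refers to storage owned by someone else.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a view: it stores a pointer and a
// length, and never copies or frees the bytes it refers to.
class ByteView {
 public:
  ByteView(unsigned char *data, std::size_t size) : data_(data), size_(size) {}

  std::size_t size() const { return size_; }

  // Reads go straight to the caller's storage.
  unsigned char Read(std::size_t i) const {
    assert(i < size_);
    return data_[i];
  }

  // Writes modify the caller's storage in place.
  void Write(std::size_t i, unsigned char value) const {
    assert(i < size_);
    data_[i] = value;
  }

 private:
  unsigned char *data_;  // Not owned; must outlive this view.
  std::size_t size_;
};
```

If a std::vector<unsigned char> backs such a view, a write through the view changes the vector and a change to the vector is visible through the view. Destroying or reallocating the vector while the view is still in use leaves the view pointing at invalid memory.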

Views can also be used for writing, if they are given pointers to mutable memory:

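void ConstructSetIlluminationMessage(const std::vector<bool> &lit_buttons,
                                     std::vector<char> *result) {
  // The SetIllumination message has a constant size, so SizeInBytes() is
  // available as a static method.
  int length = bogonel::bnp6000404::SetIllumination::SizeInBytes() + 8;
  result->assign(length, 0);  // Make room for the message and zero it.

  auto view = bogonel::bnp6000404::MakeMessageView(result);
  view.sync_1().Write(0x42);
  view.sync_2().Write(0x4E);
  view.message_id().Write(bogonel::bnp6000404::MessageId::SET_ILLUMINATION);
  view.message_length().Write(length);
  view.set_illumination().blink_duty().Write(240);
  view.set_illumination().blink_period().Write(10000);
  for (int i = 0; i < view.set_illumination().intensity().ElementCount();
       ++i) {
    view.set_illumination().intensity()[i].Write(lit_buttons[i] ? 3 : 0);
  }
}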

Use the .emb Autoformatter

You can use the .emb autoformatter to avoid manual formatting. For now, it is available at compiler/front_end/format.py.

TODO(bolms): Package the Emboss tools for easy workstation installation.