# Emboss User Guide


## Getting Started

First, you must identify a data structure you want to read and write.  These are
often documented in hardware manuals a bit like [this one, for the fictional
BN-P-6000404 illuminated button panel](BogoNEL_BN-P-6000404_User_Guide.pdf).  We
will use the BN-P-6000404 as an example.


### A Caution

Emboss is still beta software.  While we believe that we will not need to make
any more breaking changes before 1.0, you may still encounter bugs and there are
many missing features.

You can contact `emboss-dev@google.com` with any issues.  Emboss is not an
officially supported Google product, but the Emboss authors will try to answer
emails.


### System Requirements

#### Running the Emboss Compiler

The Emboss compiler requires Python 3.6 or later.  On a Linux-like system with
Python 3 installed in the usual place (`/usr/bin/python3`), you can run the
embossc script at the top level on an `.emb` file to generate C++, like so:

```
embossc --generate cc --output-path path/to/object/dir path/to/input.emb
```

If your project is using Bazel, the `build_defs.bzl` file has an
`emboss_cc_library` rule that you can use from your project.


#### Using the Generated C++ Code

The code generated by Emboss requires a C++11-compliant compiler, and a
reasonably up-to-date standard library.  Emboss has been tested with GCC and
Clang, libc++ and libstdc++.  In theory, it should work with MSVC, ICC, etc., but
it has not been tested with them, so there are likely to be bugs.

The generated C++ code lives entirely in a `.h` file, one per `.emb` file.  All
of the generated code is in C++ templates or (in a very few cases) `inline`
functions.  The generated code is structured this way in order to implement
"pay-as-you-use" for code size: any functions, methods, or views that are not
used by your code won't end up in your final binary.  This is often important
for environments like microcontrollers!

There is an Emboss runtime library (under `runtime/cpp`), which is also
header-only.  You will need to add the root of the Emboss source tree to your
`#include` path.

Note that it is *strongly* recommended that you compile your release code with
at least some optimizations: `-Os` or `-O2`.  The Emboss generated code leans
fairly heavily on your C++ compiler's inlining and common code elimination to
produce fast, lean compiled code.


#### Contributing to the Compiler

If you want to contribute features or bugfixes to the Emboss compiler itself,
you will need Bazel to run the Emboss test suite.


### Create an `.emb` file

Next, you will need to translate your structures into an `.emb` file.

```
[$default byte_order: "LittleEndian"]
[(cpp) namespace: "bogonel::bnp6000404"]
```

The BN-P-6000404 uses little-endian numbers, so we can set the default byte
order to `LittleEndian`.  There is no particular C++ namespace implied by the
BN-P-6000404 user guide, so we use one that is specific to the BN-P-6000404.

The BN-P-6000404, like many devices with serial interfaces, uses a framed
message system, with a fixed header and a variable message body depending on a
message ID.  For the BN-P-6000404, this framing looks like this:

<!-- TODO(bolms): finalize the "magic value initialization" feature, document it
here.  -->

```
struct Message:
  -- Top-level message structure, specified in section 5.3 of the BN-P-6000404
  -- user guide.

  0 [+1]  UInt       sync_1
    [requires: this == 0x42]

  1 [+1]  UInt       sync_2
    [requires: this == 0x4E]

  2 [+1]  MessageId  message_id
    -- Type of message

  3 [+1]  UInt       message_length (ml)
    -- Length of message, including header and checksum

  # ... body fields to follow ...
```

We could have chosen to put the header fields into a separate `Header` structure
instead of placing them directly in the `Message` structure.

The `sync_1` and `sync_2` fields are required to have specific magic values, so
we add the appropriate `[requires: ...]` attributes to them.  This tells Emboss
that if those fields do not have those values, then the `Message` `struct` is
ill-formed: in the client code, the `Message` will not be `Ok()` if those fields
have the wrong values, and Emboss will not allow wrong values to be written into
those fields using the checked (default) APIs.
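
To make the effect of the two `[requires: ...]` attributes concrete, here is a
hand-written C++ sketch of the check they imply.  It is only an illustration on
raw bytes, not the code Emboss generates, and the name `HasValidSync` is
hypothetical.

```c++
#include <cstddef>
#include <cstdint>

// Hypothetical hand-written equivalent of the [requires: ...] checks on
// sync_1 and sync_2.  This is NOT Emboss-generated code.
inline bool HasValidSync(const std::uint8_t *bytes, std::size_t size) {
  // Both sync bytes must be present before they can be checked.
  if (size < 2) return false;
  return bytes[0] == 0x42 && bytes[1] == 0x4E;
}
```

With Emboss, this check happens as part of `Ok()`, so client code does not need
to write it by hand.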

Unfortunately, BogoNEL does not provide a nice table of message IDs, but
fortunately there are only a few, so we can gather them from the individual
messages:

```
enum MessageId:
  -- Message type identifiers for the BN-P-6000404.
  IDENTIFICATION       = 0x01
  INTERACTION          = 0x02
  QUERY_IDENTIFICATION = 0x10
  QUERY_BUTTONS        = 0x11
  SET_ILLUMINATION     = 0x12
```

Next, we should translate the individual messages to Emboss.

```
struct Identification:
  -- IDENTIFICATION message, specified in section 5.3.3.

  0 [+4]  UInt       vendor
    # 0x4F474F42 is "BOGO" in ASCII, interpreted as a 4-byte little-endian
    # value.
    [requires: this == 0x4F47_4F42]

  0 [+4]  UInt:8[4]  vendor_ascii
    -- "BOGO" for BogoNEL Corp
    # The `vendor` field really contains the four ASCII characters "BOGO", so we
    # could use a byte array instead of a single UInt.  Since it is valid to
    # have overlapping fields, we can have both `vendor` and `vendor_ascii` in
    # our Emboss specification.

  4 [+2]  UInt       firmware_major
    -- Firmware major version

  6 [+2]  UInt       firmware_minor
    -- Firmware minor version
```

<!-- TODO(bolms): fixed-length, ASCIIZ, and variable-length string support? -->

The `Identification` structure is fairly straightforward.  In this case, we
provide an alternate view of the `vendor` field via `vendor_ascii`: 0x4F474F42
in little-endian works out to the ASCII characters "BOGO".
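
The relationship between the two views of those four bytes can be checked with
a few lines of plain C++.  This is a minimal sketch that does not use Emboss;
`ReadLittleEndian32` is a hypothetical helper written for this illustration.

```c++
#include <cstdint>

// Hypothetical helper (not part of Emboss): decodes four bytes as a
// little-endian 32-bit unsigned integer, lowest-addressed byte first.
inline std::uint32_t ReadLittleEndian32(const std::uint8_t *bytes) {
  return static_cast<std::uint32_t>(bytes[0]) |
         (static_cast<std::uint32_t>(bytes[1]) << 8) |
         (static_cast<std::uint32_t>(bytes[2]) << 16) |
         (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

Passing the bytes `'B'`, `'O'`, `'G'`, `'O'` (0x42, 0x4F, 0x47, 0x4F) to this
helper yields 0x4F474F42, the value required of `vendor`.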

Note that `vendor_ascii` uses `UInt:8[4]` for its type, and not `UInt[4]`.  For
most fields, we can use plain `UInt` and Emboss will figure out how big the
`UInt` should be, but for an array we must be explicit that we want 8-bit
elements.

```
struct Interaction:
  -- INTERACTION message, specified in section 5.3.4.

  0 [+1]  UInt           number_of_buttons (n)
    -- Number of buttons currently depressed by user

  4 [+n]  ButtonId:8[n]  button_id
    -- ID of pressed button.  A number of entries equal to number_of_buttons
    -- will be provided.
```

<!-- TODO(bolms): reserved field support -->

`Interaction` is also fairly straightforward.  The only tricky bit is the
`button_id` field: since `Interaction` can return a variable number of button
IDs, depending on how many buttons are currently pressed, the `button_id` field
must have length `n`.  It would have been OK to use `[+number_of_buttons]`, but
full field names can get cumbersome, particularly when the length involves a
more complex expression.  Instead, we set an *alias* for `number_of_buttons`
using `(n)`, and then use the alias in `button_id`'s length.  The `n` alias is
not visible outside of the `Interaction` message, and won't be available in the
generated code, so the short name is not likely to cause confusion.

```
enum ButtonId:
  -- Button IDs, specified in table 5-6.
  BUTTON_A = 0x00
  BUTTON_B = 0x04
  BUTTON_C = 0x08
  BUTTON_D = 0x0C
  BUTTON_E = 0x01
  BUTTON_F = 0x05
  BUTTON_G = 0x09
  BUTTON_H = 0x0D
  BUTTON_I = 0x02
  BUTTON_J = 0x06
  BUTTON_K = 0x0A
  BUTTON_L = 0x0E
  BUTTON_M = 0x03
  BUTTON_N = 0x07
  BUTTON_O = 0x0B
  BUTTON_P = 0x0F
```

We had to prefix all of the button names with `BUTTON_` because Emboss does not
allow single-character enum names.

The QUERY IDENTIFICATION and QUERY BUTTONS messages don't have any fields other
than `checksum`, so we will handle them a bit differently.

```
struct SetIllumination:
  -- SET ILLUMINATION message, specified in section 5.3.7.

  0 [+1]    bits:
    0 [+1]  Flag  red_channel_enable
      -- Enables setting the RED channel.

    1 [+1]  Flag  blue_channel_enable
      -- Enables setting the BLUE channel.

    2 [+1]  Flag  green_channel_enable
      -- Enables setting the GREEN channel.

  1 [+1]    UInt  blink_duty
      -- Sets the proportion of time between time on and time off for blink
      -- feature.
      --
      -- Minimum value = 0 (no illumination)
      --
      -- Maximum value = 240 (constant illumination)
      [requires: 0 <= this <= 240]

  2 [+2]    UInt  blink_period
      -- Sets the blink period, in milliseconds.
      --
      -- Minimum value = 10
      --
      -- Maximum value = 10000
      [requires: 10 <= this <= 10_000]

  4 [+4]    bits:
    0 [+32]  UInt:2[16]  intensity
      -- Intensity values for the unmasked channels.  2 bits of intensity for
      -- each button.
```

`SetIllumination` requires us to use bitfields.  The first bitfield is in the
CHANNEL MASK field: rather than making a single `channel_mask` field, Emboss
lets us specify the red, green, and blue channel masks separately.

As with `sync_1` and `sync_2`, we have added `[requires: ...]` to the
`blink_duty` and `blink_period` fields: this time, specifying a range of valid
values.  `[requires: ...]` accepts an arbitrary expression, which can be as
simple or as complex as desired.

It is not clear from BogoNEL's documentation whether "bit 0" means the least
significant or most significant bit of its byte, but a little experimentation
with the device shows that setting the least significant bit causes
`SetIllumination` to set its red channel.  Emboss always numbers bits in
bitfields from least significant (bit 0) to most significant.
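
As a rough illustration of that numbering, the sketch below builds the CHANNEL
MASK byte by hand, with bit 0 as the least significant bit.  The constant and
function names are hypothetical and exist only for this example; Emboss
generates accessors such as `red_channel_enable()` instead.

```c++
#include <cstdint>

// Hypothetical constants mirroring the bit offsets in the SetIllumination
// definition above: red is bit 0, blue is bit 1, green is bit 2.
constexpr std::uint8_t kRedChannelEnable = 1u << 0;
constexpr std::uint8_t kBlueChannelEnable = 1u << 1;
constexpr std::uint8_t kGreenChannelEnable = 1u << 2;

// Builds the channel mask byte from three booleans.  Not Emboss-generated.
inline std::uint8_t MakeChannelMask(bool red, bool blue, bool green) {
  unsigned mask = 0;
  if (red) mask |= kRedChannelEnable;
  if (blue) mask |= kBlueChannelEnable;
  if (green) mask |= kGreenChannelEnable;
  return static_cast<std::uint8_t>(mask);
}
```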

The other bitfield is the `intensity` array.  The BN-P-6000404 uses an array of
2-bit intensity values, so we specify that array.
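
For readers who want to see what such a packed array looks like at the bit
level, here is a hand-written sketch.  It assumes that element `i` occupies
bits `2*i` and `2*i + 1` of the 32-bit field, counting from the least
significant bit; `SetIntensity` and `GetIntensity` are hypothetical helpers,
not Emboss APIs.

```c++
#include <cstdint>

// Hypothetical helpers, not Emboss-generated code.  Assumes element i of the
// intensity array occupies bits 2*i and 2*i+1 of the 32-bit field, counting
// from the least significant bit.
inline std::uint32_t SetIntensity(std::uint32_t field, unsigned index,
                                  unsigned value) {
  const unsigned shift = 2 * (index & 0xFu);  // 16 elements: index 0..15
  field &= ~(static_cast<std::uint32_t>(3) << shift);  // Clear the old value.
  field |= static_cast<std::uint32_t>(value & 3u) << shift;
  return field;
}

inline unsigned GetIntensity(std::uint32_t field, unsigned index) {
  return static_cast<unsigned>((field >> (2 * (index & 0xFu))) & 3u);
}
```

With the generated code, the equivalent operations are
`intensity()[i].Write(...)` and `intensity()[i].Read()`.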

Finally, we should add all of the sub-messages into `Message`, and also take
care of `checksum`.  After making those changes, `Message` looks like:

```
struct Message:
  -- Top-level message structure, specified in section 5.3 of the BN-P-6000404
  -- user guide.

  0 [+1]       UInt                 sync_1
    [requires: this == 0x42]

  1 [+1]       UInt                 sync_2
    [requires: this == 0x4E]

  2 [+1]       MessageId            message_id
    -- Type of message

  3 [+1]       UInt                 message_length (ml)
    -- Length of message, including header and checksum

  if message_id == MessageId.IDENTIFICATION:
    4 [+ml-8]  Identification       identification

  if message_id == MessageId.INTERACTION:
    4 [+ml-8]  Interaction          interaction

  if message_id == MessageId.SET_ILLUMINATION:
    4 [+ml-8]  SetIllumination      set_illumination

  0 [+ml-4]    UInt:8[]             checksummed_bytes

  ml-4 [+4]    UInt                 checksum
```

By wrapping the various message types in `if message_id == ...` constructs,
those substructures will only be available when the `message_id` field is set to
the corresponding message type.  This kind of selection is used for any
structure field that is only valid some of the time.

The substructures all have the length `ml-8`.  The `ml` is a short alias for the
`message_length` field; these short aliases are available so that the field
types and names don't have to be pushed far to the right.  Aliases may only be
used directly in the same structure definition where they are created; they may
not be used elsewhere in an Emboss file, and they are not available in the
generated code.  The length is `ml-8` in this case because the `message_length`
includes the header and checksum, which are left out of the substructures.
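
The offset arithmetic in `Message` can be summarized in a small sketch.  The
`FrameLayout` struct and `ComputeFrameLayout` function are hypothetical names
used only here; the sketch assumes the 4-byte header and 4-byte checksum shown
in the definition above.

```c++
#include <cstddef>

// Hypothetical description of where the pieces of a Message live, given the
// value of message_length.  Not Emboss-generated code.
struct FrameLayout {
  std::size_t body_offset;      // Always 4: the body follows the header.
  std::size_t body_size;        // ml - 8
  std::size_t checksum_offset;  // ml - 4
};

// Returns false if message_length is too small to hold header and checksum.
inline bool ComputeFrameLayout(std::size_t message_length,
                               FrameLayout *layout) {
  constexpr std::size_t kHeaderSize = 4;
  constexpr std::size_t kChecksumSize = 4;
  if (message_length < kHeaderSize + kChecksumSize) return false;
  layout->body_offset = kHeaderSize;
  layout->body_size = message_length - kHeaderSize - kChecksumSize;
  layout->checksum_offset = message_length - kChecksumSize;
  return true;
}
```

For example, a SET ILLUMINATION message has an 8-byte body, so its
`message_length` is 16, its body occupies bytes 4 through 11, and its checksum
starts at byte 12.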

Note that we simply don't have any subfield for QUERY IDENTIFICATION or QUERY
BUTTONS: since those messages do not have any fields, there is no need for a
zero-byte structure.

We also added the `checksummed_bytes` field as a convenience for computing the
checksum.
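
As a sketch of why that field is convenient, the function below sums the same
byte range by hand.  It assumes the checksum is a plain 32-bit sum of every
byte before the `checksum` field, which is the scheme used by the
`ChecksumIsCorrect()` example later in this guide; `SumChecksummedBytes` is a
hypothetical name.

```c++
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not Emboss-generated code.  Assumes the checksum is
// the 32-bit sum of all bytes that precede the checksum field.
inline std::uint32_t SumChecksummedBytes(const std::uint8_t *message,
                                         std::size_t message_length) {
  std::uint32_t sum = 0;
  // checksummed_bytes covers the byte range [0, message_length - 4).
  for (std::size_t i = 0; i + 4 < message_length; ++i) {
    sum += message[i];
  }
  return sum;
}
```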


### Generate code

Once you have an `.emb`, you will need to generate code from it.

The simplest way to do so is to run the `embossc` tool:

```
embossc -I src --generate cc --output-path generated bogonel.emb
```

The `-I` option adds a directory to the *include path*.  The input file -- in
this case, `bogonel.emb` -- must be found somewhere on the include path.

The `--generate` option specifies which back end to use; `cc` is the C++ back
end.

The `--output-path` option specifies where the generated file should be placed.
Note that the output path will include all of the path components of the input
file: if the input file is `x/y/z.emb`, then the path `x/y/z.emb.h` will be
appended to the `--output-path`.  Missing directories will be created.
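
The naming rule can be written down as a one-line function, which may be handy
in build scripts or tests that need to predict where the header will land.
This is only a sketch of the rule described above, not `embossc`'s actual
implementation; `GeneratedHeaderPath` is a hypothetical name and `/` is
assumed as the path separator.

```c++
#include <string>

// Hypothetical helper: predicts the generated header location from the
// --output-path value and the input file path as given on the command line.
inline std::string GeneratedHeaderPath(const std::string &output_path,
                                       const std::string &input_file) {
  return output_path + "/" + input_file + ".h";
}
```

For the command shown earlier, `GeneratedHeaderPath("generated",
"bogonel.emb")` gives `generated/bogonel.emb.h`.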


<!-- #### Using Bazel -->

<!-- TODO(bolms): Make this usable from Bazel. -->


### Include the generated C++ code

Emboss generates a single C++ header file from your `.emb` by appending `.h` to
the file name: to use the BogoNEL definitions, you would `#include
"path/to/bogonel.emb.h"` in your C++ code.

Currently, Emboss does not generate a corresponding `.cc` file: the code that
Emboss generates is all templates, which exist in the `.h`.  Although the Emboss
maintainers (e.g., bolms@) like the simplicity of generating a single file, this
could change at some point.


### Use the generated C++ code

Emboss generates *views*, which your program can use to read and write existing
arrays of bytes, and which do not take ownership.  For example:

```c++
#include <cstdint>

#include "path/to/bogonel.emb.h"

// HandleBrokenMessage(), HandleIdentificationMessage(),
// HandleInteractionMessage(), and Log() are application-specific functions
// that are assumed to be declared elsewhere.

template <typename View>
bool ChecksumIsCorrect(View message_view);

// Handles BogoNEL BN-P-6000404 device messages from a byte stream.  Returns
// the number of bytes that were processed.  Unprocessed bytes should be
// passed into the next call.
int HandleBogonelPanelMessages(const char *bytes, int byte_count) {
  auto message_view = bogonel::bnp6000404::MakeMessageView(bytes, byte_count);

  // IsComplete() will return true if the view has enough bytes to fully
  // contain the message; i.e., that byte_count is at least
  // message_view.message_length().Read(), which already includes the header
  // and the checksum.
  if (!message_view.IsComplete()) {
    return 0;
  }

  // If Emboss is happy with the message, we still need to check the checksum:
  // Emboss does not (yet) have support for automatically checking checksums and
  // CRCs.
  if (!message_view.Ok() || !ChecksumIsCorrect(message_view)) {
    // If the message is complete, but not correct, we need to log an error.
    HandleBrokenMessage(message_view);
    return message_view.Size();
  }

  // At this point, we know the message is complete and (basically) OK, so
  // we dispatch it to a message-type-specific handler.
  switch (message_view.message_id().Read()) {
    case bogonel::bnp6000404::MessageId::IDENTIFICATION:
      HandleIdentificationMessage(message_view);
      break;

    case bogonel::bnp6000404::MessageId::INTERACTION:
      HandleInteractionMessage(message_view);
      break;

    case bogonel::bnp6000404::MessageId::QUERY_IDENTIFICATION:
    case bogonel::bnp6000404::MessageId::QUERY_BUTTONS:
    case bogonel::bnp6000404::MessageId::SET_ILLUMINATION:
      Log("Unexpected host to device message type.");
      break;

    default:
      Log("Unknown message type.");
      break;
  }

  return message_view.Size();
}

// Sums every byte covered by the checksummed_bytes field and compares the
// result with the checksum stored in the message.
template <typename View>
bool ChecksumIsCorrect(View message_view) {
  uint32_t checksum = 0;
  for (int i = 0; i < message_view.checksummed_bytes().ElementCount(); ++i) {
    checksum += message_view.checksummed_bytes()[i].Read();
  }
  return checksum == message_view.checksum().Read();
}
```

<!-- TODO(bolms): solidify support for checksums, so that the Ok() call in the
example actually checks them. -->

The `message_view` object in this example is a lightweight object that simply
provides *access* to the memory that `bytes` points to.  Emboss views are very cheap to
construct because they only contain a couple of pointers and a length -- they do
not copy or take ownership of the underlying bytes.  This also means that you
have to keep the underlying bytes alive as long as you are using a view -- you
can't let them go out of scope or delete them.
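
The ownership rule is the same as for any non-owning pointer-and-length type.
The sketch below uses a hypothetical `ByteView` struct as a stand-in for an
Emboss view (it is not the real view class) to show that reads and writes go
straight through to the caller's buffer.

```c++
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an Emboss view: a pointer and a length, with no
// ownership of the bytes.  This is NOT the real Emboss view type.
struct ByteView {
  std::uint8_t *data;
  std::size_t size;

  std::uint8_t Read(std::size_t index) const { return data[index]; }
  void Write(std::size_t index, std::uint8_t value) const {
    data[index] = value;
  }
};

// Creates a view over an existing buffer.  The buffer must outlive the view,
// and must not be resized while the view is in use.
inline ByteView MakeByteView(std::vector<std::uint8_t> *buffer) {
  return ByteView{buffer->data(), buffer->size()};
}
```

A function that builds a local `std::vector`, makes a view of it, and returns
only the view would hand its caller a dangling pointer; the buffer has to be
owned by something that lives at least as long as the view.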

Views can also be used for writing, if they are given pointers to mutable
memory:

```c++
#include <vector>

#include "path/to/bogonel.emb.h"

// Fills *result with a SET ILLUMINATION message.  lit_buttons is assumed to
// have at least 16 elements, one per button.
void ConstructSetIlluminationMessage(const std::vector<bool> &lit_buttons,
                                     std::vector<char> *result) {
  // The SetIllumination message has a constant size, so SizeInBytes() is
  // available as a static method.  The extra 8 bytes are the header and the
  // checksum.
  int length = bogonel::bnp6000404::SetIllumination::SizeInBytes() + 8;
  result->clear();
  result->resize(length);

  auto view = bogonel::bnp6000404::MakeMessageView(result);
  view.sync_1().Write(0x42);
  view.sync_2().Write(0x4E);
  view.message_id().Write(bogonel::bnp6000404::MessageId::SET_ILLUMINATION);
  view.message_length().Write(length);
  view.set_illumination().red_channel_enable().Write(true);
  view.set_illumination().blue_channel_enable().Write(true);
  view.set_illumination().green_channel_enable().Write(true);
  view.set_illumination().blink_duty().Write(240);
  view.set_illumination().blink_period().Write(10000);
  for (int i = 0; i < view.set_illumination().intensity().ElementCount();
       ++i) {
    view.set_illumination().intensity()[i].Write(lit_buttons[i] ? 3 : 0);
  }
  // Note: this example does not fill in the checksum field; it must be
  // computed and written before the message is sent.
}
```


### Use the `.emb` Autoformatter

You can use the `.emb` autoformatter to avoid manual formatting.  For now, it is
available at `compiler/front_end/format.py`.

*TODO(bolms): Package the Emboss tools for easy workstation installation.*
