| # Use in C++ |
| |
| Assuming you have written a schema using the above language in, say, |
| `mygame.fbs` (FlatBuffer Schema, though the extension doesn't matter), |
| and have generated a C++ header called `mygame_generated.h` using the |
| compiler (e.g. `flatc -c mygame.fbs`), you can now start using this in |
| your program by including the header. As noted, this header relies on |
| `flatbuffers/flatbuffers.h`, which should be in your include path. |
| |
| ### Writing in C++ |
| |
| To start creating a buffer, create an instance of `FlatBufferBuilder` |
| which will contain the buffer as it grows: |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp} |
| FlatBufferBuilder fbb; |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Before we serialize a Monster, we need to first serialize any objects |
| that are contained therein, i.e. we serialize the data tree depth |
| first, children before their parents. This is generally easy to do on |
| any tree structure. For example: |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp} |
| auto name = fbb.CreateString("MyMonster"); |
| |
| unsigned char inv[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }; |
| auto inventory = fbb.CreateVector(inv, 10); |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| `CreateString` and `CreateVector` serialize these two built-in |
| datatypes, and return offsets into the serialized data indicating where |
| they are stored, such that `Monster` below can refer to them. |
| |
| `CreateString` can also take an `std::string`, or a `const char *` with |
| an explicit length, and is suitable for holding UTF-8 and binary |
| data if needed. |
| |
| `CreateVector` can also take an `std::vector`. The |
| offset it returns is typed, i.e. it can only be used to set fields of the |
| correct type below. To create a vector of struct objects (which will |
| be stored as contiguous memory in the buffer), use `CreateVectorOfStructs` |
| instead. |
| |
| To create a vector of nested objects (e.g. tables, strings or other vectors), |
| collect their offsets in a temporary array/vector, then call `CreateVector` |
| on that (see e.g. the array of strings example in `CreateFlatBufferTest` |
| in `test.cpp`). |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp} |
| Vec3 vec(1, 2, 3); |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| `Vec3` is the first example of code from our generated |
| header. Structs (unlike tables) translate to simple structs in C++, so |
| we can construct them in a familiar way. |
| |
| We have now serialized the non-scalar components of the monster |
| example, so we could create the monster something like this: |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp} |
| auto mloc = CreateMonster(fbb, &vec, 150, 80, name, inventory, Color_Red, 0, Any_NONE); |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Note that we're passing `150` for the `mana` field, which happens to be the |
| default value: this means the field will not actually be written to the buffer, |
| since we'll get that value anyway when we query it. This is a nice space |
| savings, since it is very common for fields to be at their default. It means |
| we also don't need to be scared to add fields only used in a minority of cases, |
| since they won't bloat up the buffer sizes if they're not actually used. |
| |
| We do something similar for the union field `test` by specifying a `0` offset |
| and the `NONE` enum value (part of every union) to indicate we don't actually |
| want to write this field. You can use `0` also as a default for other |
| non-scalar types, such as strings, vectors and tables. |
| |
| Tables (like `Monster`) give you full flexibility on what fields you write |
| (unlike `Vec3`, which always has all fields set because it is a `struct`). |
| If you want even more control over this (i.e. skip fields even when they are |
| not default), instead of the convenient `CreateMonster` call we can also |
| build the object field-by-field manually: |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp} |
| MonsterBuilder mb(fbb); |
| mb.add_pos(&vec); |
| mb.add_hp(80); |
| mb.add_name(name); |
| mb.add_inventory(inventory); |
| auto mloc = mb.Finish(); |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| We start with a temporary helper class `MonsterBuilder` (which is |
| defined in our generated code also), then call the various `add_` |
| methods to set fields, and `Finish` to complete the object. This is |
| pretty much the same code as you find inside `CreateMonster`, except |
| we're leaving out a few fields. Fields may also be added in any order, |
| though orderings with fields of the same size adjacent |
| to each other are most efficient in size, due to alignment. You should |
| not nest these Builder classes (finish serializing any child objects |
| before you start the builder for their parent). |
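
The effect of alignment on size mentioned above can be seen with ordinary C++ structs, which are padded in a comparable way. The sketch below is only an analogy: `Interleaved`, `Grouped` and `PaddingSaved` are hypothetical names, and these are plain structs rather than FlatBuffers tables. It shows that keeping same-sized fields adjacent leaves fewer padding gaps.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layouts; plain C++ structs, not generated FlatBuffers code.
// A 4-byte field between two 1-byte fields typically forces padding
// before it and at the end of the struct.
struct Interleaved {
  std::uint8_t a;
  std::uint32_t b;
  std::uint8_t c;
};

// Same fields, with the two 1-byte fields next to each other.
struct Grouped {
  std::uint32_t b;
  std::uint8_t a;
  std::uint8_t c;
};

// Bytes saved by grouping same-sized fields (compiler and ABI dependent).
inline std::size_t PaddingSaved() {
  assert(sizeof(Grouped) <= sizeof(Interleaved));
  return sizeof(Interleaved) - sizeof(Grouped);
}
```

On common ABIs where `uint32_t` is 4-byte aligned, `Interleaved` typically occupies 12 bytes and `Grouped` 8, though the exact numbers depend on the compiler.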
| |
| Regardless of whether you used `CreateMonster` or `MonsterBuilder`, you |
| now have an offset to the root of your data, and you can finish the |
| buffer using: |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp} |
| FinishMonsterBuffer(fbb, mloc); |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| The buffer is now ready to be stored somewhere, sent over the network, |
| compressed, or whatever you'd like to do with it. You can access the |
| start of the buffer with `fbb.GetBufferPointer()`, and its size from |
| `fbb.GetSize()`. |
| |
| `samples/sample_binary.cpp` is a complete code sample similar to |
| the code above, that also includes the reading code below. |
| |
| ### Reading in C++ |
| |
| If you've received a buffer from somewhere (disk, network, etc.) you can |
| directly start traversing it using: |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp} |
| auto monster = GetMonster(buffer_pointer); |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| `monster` is of type `Monster *`, and points to somewhere *inside* your |
| buffer (root object pointers are not the same as `buffer_pointer`!). |
| If you look in your generated header, you'll see it has |
| convenient accessors for all fields, e.g. |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp} |
| assert(monster->hp() == 80); |
| assert(monster->mana() == 150); // default |
| assert(strcmp(monster->name()->c_str(), "MyMonster") == 0); |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| These should all be true. Note that we never stored a `mana` value, so |
| it will return the default. |
| |
| To access sub-objects, in this case the `Vec3`: |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp} |
| auto pos = monster->pos(); |
| assert(pos); |
| assert(pos->z() == 3); |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| If we had not set the `pos` field during serialization, it would be |
| `NULL`. |
| |
| Similarly, we can access elements of the inventory array: |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp} |
| auto inv = monster->inventory(); |
| assert(inv); |
| assert(inv->Get(9) == 9); |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| ### Storing maps / dictionaries in a FlatBuffer |
| |
| FlatBuffers doesn't support maps natively, but there is support to |
| emulate their behavior with vectors and binary search, which means you |
| can have fast lookups directly from a FlatBuffer without having to unpack |
| your data into a `std::map` or similar. |
| |
| To use it: |
| - Designate one of the fields in a table as the "key" field. You do this |
| by setting the `key` attribute on this field, e.g. |
| `name:string (key)`. |
| You may only have one key field, and it must be of string or scalar type. |
| - Write out tables of this type as usual, collect their offsets in an |
| array or vector. |
| - Instead of `CreateVector`, call `CreateVectorOfSortedTables`, |
| which will first sort all offsets such that the tables they refer to |
| are sorted by the key field, then serialize it. |
| - Now when you're accessing the FlatBuffer, you can use `Vector::LookupByKey` |
| instead of just `Vector::Get` to access elements of the vector, e.g.: |
| `myvector->LookupByKey("Fred")`, which returns a pointer to the |
| corresponding table type, or `nullptr` if not found. |
| `LookupByKey` performs a binary search, so it should have a similar speed to |
| `std::map`, though it may be faster because of better caching. `LookupByKey` |
| only works if the vector has been sorted; it will likely not find elements |
| if it hasn't been. |
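
The idea behind the sort-then-binary-search steps above can be sketched with the standard library alone. This is not the FlatBuffers implementation: `Entry`, `KeyLess`, `SortByKey` and the free function `LookupByKey` below are hypothetical stand-ins that operate on a plain `std::vector`, purely to show why the vector must be sorted before lookups.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a table whose `name` field has the `key` attribute.
struct Entry {
  std::string name;  // key field
  int hp;
};

inline bool KeyLess(const Entry &a, const Entry &b) { return a.name < b.name; }

// Sort once when writing (the role `CreateVectorOfSortedTables` plays).
inline void SortByKey(std::vector<Entry> &entries) {
  std::sort(entries.begin(), entries.end(), KeyLess);
}

// Binary search when reading; returns nullptr if the key is not present.
inline const Entry *LookupByKey(const std::vector<Entry> &entries,
                                const std::string &key) {
  // An unsorted vector would make the binary search miss elements.
  assert(std::is_sorted(entries.begin(), entries.end(), KeyLess));
  auto it = std::lower_bound(
      entries.begin(), entries.end(), key,
      [](const Entry &e, const std::string &k) { return e.name < k; });
  return (it != entries.end() && it->name == key) ? &*it : nullptr;
}
```

With entries for "Wilma", "Fred" and "Barney" passed through `SortByKey`, a call such as `LookupByKey(entries, "Fred")` returns a pointer to Fred's entry, while a missing key yields `nullptr`.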
| |
| ### Direct memory access |
| |
| As you can see from the above examples, all elements in a buffer are |
| accessed through generated accessors. This is because everything is |
| stored in little endian format on all platforms (the accessor |
| performs a swap operation on big endian machines), and also because |
| the layout of things is generally not known to the user. |
| |
| For structs, layout is deterministic and guaranteed to be the same |
| across platforms (scalars are aligned to their |
| own size, and structs themselves to their largest member), and you |
| are allowed to access this memory directly by using `sizeof()` and |
| `memcpy` on the pointer to a struct, or even an array of structs. |
| |
| To compute offsets to sub-elements of a struct, make sure they |
| are structs themselves, as then you can use the pointers to |
| figure out the offset without having to hardcode it. This is |
| handy for use of arrays of structs with calls like `glVertexAttribPointer` |
| in OpenGL or similar APIs. |
| |
| It is important to note that structs are still little endian on all |
| machines, so only use tricks like this if you can guarantee you're not |
| shipping on a big endian machine (an `assert(FLATBUFFERS_LITTLEENDIAN)` |
| would be wise). |
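
A minimal sketch of these direct-access tricks follows, using plain C++ structs in place of generated ones. `Vec3` and `Vertex` here are hypothetical hand-written types, and `HostIsLittleEndian`, `NormalOffset` and `CopyVertices` are illustrative helpers, not part of FlatBuffers. The sketch derives a sub-struct offset from pointers instead of hardcoding it, and copies an array of structs with one `memcpy` after checking the host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical plain structs standing in for generated FlatBuffers structs.
struct Vec3 { float x, y, z; };
struct Vertex { Vec3 pos; Vec3 normal; };

// True on little endian hosts, where raw struct bytes match the stored format.
inline bool HostIsLittleEndian() {
  const std::uint16_t probe = 1;
  std::uint8_t first_byte;
  std::memcpy(&first_byte, &probe, 1);
  return first_byte == 1;
}

// Byte offset of the `normal` sub-struct, computed from pointers
// rather than hardcoded.
inline std::size_t NormalOffset(const Vertex &v) {
  return static_cast<std::size_t>(
      reinterpret_cast<const char *>(&v.normal) -
      reinterpret_cast<const char *>(&v));
}

// Copy an array of structs in a single memcpy.
inline std::vector<Vertex> CopyVertices(const Vertex *src, std::size_t count) {
  assert(HostIsLittleEndian());  // raw access is only valid on little endian
  std::vector<Vertex> out(count);
  if (count != 0) std::memcpy(out.data(), src, count * sizeof(Vertex));
  return out;
}
```

An offset computed this way, together with `sizeof(Vertex)` as the stride, is the kind of value an API such as `glVertexAttribPointer` expects.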
| |
| ### Access of untrusted buffers |
| |
| The generated accessor functions access fields over offsets, which is |
| very quick. These offsets are not verified at run-time, so a malformed |
| buffer could cause a program to crash by accessing random memory. |
| |
| When you're processing large amounts of data from a source you know (e.g. |
| your own generated data on disk), this is acceptable, but when reading |
| data from the network that can potentially have been modified by an |
| attacker, this is undesirable. |
| |
| For this reason, you can optionally use a buffer verifier before you |
| access the data. This verifier will check all offsets, all sizes of |
| fields, and null termination of strings to ensure that when a buffer |
| is accessed, all reads will end up inside the buffer. |
| |
| Each root type will have a verification function generated for it, |
| e.g. for `Monster`, you can call: |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp} |
| bool ok = VerifyMonsterBuffer(Verifier(buf, len)); |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| If `ok` is true, the buffer is safe to read. |
| |
| Besides untrusted data, this function may be useful to call in debug |
| mode, as extra insurance against data being corrupted somewhere along |
| the way. |
| |
| While verifying a buffer isn't "free", it is typically faster than |
| a full traversal (since any scalar data is not actually touched), |
| and since it may cause the buffer to be brought into cache before |
| reading, the actual overhead may be even lower than expected. |
| |
| In specialized cases where a denial of service attack is possible, |
| the verifier has two additional constructor arguments that allow |
| you to limit the nesting depth and total number of tables the |
| verifier may encounter before declaring the buffer malformed. |
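
To give a feel for the kind of check a verifier performs, here is a simplified sketch of bounds-checking one string. It is not the library's `Verifier`: `ReadU32LE`, `InBounds` and `VerifyStringAt` are hypothetical helpers, and the sketch assumes a string laid out as a 32-bit little endian length, the bytes, then a zero terminator. Real code should call the generated verification function instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read a little endian uint32 byte by byte, so the result is host independent.
inline std::uint32_t ReadU32LE(const std::uint8_t *p) {
  return static_cast<std::uint32_t>(p[0]) |
         static_cast<std::uint32_t>(p[1]) << 8 |
         static_cast<std::uint32_t>(p[2]) << 16 |
         static_cast<std::uint32_t>(p[3]) << 24;
}

// True if `size` bytes starting at `offset` lie inside a buffer of `len`
// bytes. Written so that `offset + size` is never computed and cannot overflow.
inline bool InBounds(std::size_t len, std::size_t offset, std::size_t size) {
  return offset <= len && size <= len - offset;
}

// Hypothetical check for one string: length prefix, bytes, zero terminator.
inline bool VerifyStringAt(const std::uint8_t *buf, std::size_t len,
                           std::size_t offset) {
  assert(buf != nullptr);
  if (!InBounds(len, offset, 4)) return false;      // length prefix in range
  const std::size_t n = ReadU32LE(buf + offset);
  if (!InBounds(len, offset + 4, n)) return false;  // string bytes in range
  const std::size_t end = offset + 4 + n;           // index of the terminator
  return end < len && buf[end] == 0;
}
```

Every read is preceded by a range check, so a corrupted length or offset is rejected rather than followed into memory outside the buffer.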
| |
| ## Text & schema parsing |
| |
| Using binary buffers with the generated header provides a super low |
| overhead use of FlatBuffer data. There are, however, times when you want |
| to use text formats, for example because it interacts better with source |
| control, or you want to give your users easy access to data. |
| |
| Another reason might be that you already have a lot of data in JSON |
| format, or a tool that generates JSON, and if you can write a schema for |
| it, this will provide you an easy way to use that data directly. |
| |
| (See the schema documentation for some specifics on the JSON format |
| accepted.) |
| |
| There are two ways to use text formats: |
| |
| ### Using the compiler as a conversion tool |
| |
| This is the preferred path, as it doesn't require you to add any new |
| code to your program, and is maximally efficient since you can ship with |
| binary data. The disadvantage is that it is an extra step for your |
| users/developers to perform, though you might be able to automate it. |
| |
|     flatc -b myschema.fbs mydata.json |
| |
| This will generate the binary file `mydata_wire.bin` which can be loaded |
| as before. |
| |
| ### Making your program capable of loading text directly |
| |
| This gives you maximum flexibility. You could even opt to support both, |
| i.e. check for both files, and regenerate the binary from text when |
| required, otherwise just load the binary. |
| |
| This option is currently only available for C++, or Java through JNI. |
| |
| As mentioned in the section "Building" above, this technique requires |
| you to link a few more files into your program, and you'll want to include |
| `flatbuffers/idl.h`. |
| |
| Load text (either a schema or JSON) into an in-memory buffer (there is a |
| convenient `LoadFile()` utility function in `flatbuffers/util.h` if you |
| wish). Construct a parser: |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp} |
| flatbuffers::Parser parser; |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Now you can parse any number of text files in sequence: |
| |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp} |
| parser.Parse(text_file.c_str()); |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| This works similarly to how the command-line compiler works: a sequence |
| of files parsed by the same `Parser` object allows later files to |
| reference definitions in earlier files. Typically this means you first |
| load a schema file (which populates `Parser` with definitions), followed |
| by one or more JSON files. |
| |
| As an optional argument to `Parse`, you may specify a null-terminated list of |
| include paths. If not specified, any include statements try to resolve from |
| the current directory. |
| |
| If there were any parsing errors, `Parse` will return `false`, and |
| `Parser::err` contains a human readable error string with a line number |
| etc., which you should present to the creator of that file. |
| |
| After each JSON file, the `Parser::fbb` member variable is the |
| `FlatBufferBuilder` that contains the binary buffer version of that |
| file, which you can access as described above. |
| |
| `samples/sample_text.cpp` is a code sample showing the above operations. |
| |
| ### Threading |
| |
| Reading a FlatBuffer does not touch any memory outside the original buffer, |
| and is entirely read-only (all const), so is safe to access from multiple |
| threads even without synchronisation primitives. |
| |
| Creating a FlatBuffer is not thread safe. All state related to building |
| a FlatBuffer is contained in a FlatBufferBuilder instance, and no memory |
| outside of it is touched. To make this thread safe, either do not |
| share instances of FlatBufferBuilder between threads (recommended), or |
| manually wrap it in synchronisation primitives. There's no automatic way to |
| accomplish this, by design, as we feel multithreaded construction |
| of a single buffer will be rare, and synchronisation overhead would be costly. |