docs/source/Schemas.md - third_party/github/google/flatbuffers - Git at Google

 # Writing a schema

 The syntax of the schema language (aka IDL, Interface Definition
 Language) should look quite familiar to users of any of the C family of
 languages, and also to users of other IDLs. Let's look at an example
 first:

     // example IDL file

     namespace MyGame;

     enum Color : byte { Red = 1, Green, Blue }

     union Any { Monster, Weapon, Pickup }

     struct Vec3 {
       x:float;
       y:float;
       z:float;
     }

     table Monster {
       pos:Vec3;
       mana:short = 150;
       hp:short = 100;
       name:string;
       friendly:bool = false (deprecated, priority: 1);
       inventory:[ubyte];
       color:Color = Blue;
       test:Any;
     }

     root_type Monster;

 (Weapon & Pickup not defined as part of this example).

 ### Tables

 Tables are the main way of defining objects in FlatBuffers, and consist
 of a name (here `Monster`) and a list of fields. Each field has a name,
 a type, and optionally a default value (if omitted, it defaults to 0 /
 NULL).

 Each field is optional: It does not have to appear in the wire
 representation, and you can choose to omit fields for each individual
 object. As a result, you have the flexibility to add fields without fear of
 bloating your data. This design is also FlatBuffer's mechanism for forward
 and backwards compatibility. Note that:

 -   You can add new fields in the schema ONLY at the end of a table
     definition. Older data will still
     read correctly, and give you the default value when read. Older code
     will simply ignore the new field.
     If you want to have flexibility to use any order for fields in your
     schema, you can manually assign ids (much like Protocol Buffers),
     see the `id` attribute below.

 -   You cannot delete fields you don't use anymore from the schema,
     but you can simply
     stop writing them into your data for almost the same effect.
     Additionally you can mark them as `deprecated` as in the example
     above, which will prevent the generation of accessors in the
     generated C++, as a way to enforce the field not being used any more.
     (careful: this may break code!).

 -   You may change field names and table names, if you're ok with your
     code breaking until you've renamed them there too.


 ### Structs

 Similar to a table, only now none of the fields are optional (so no defaults
 either), and fields may not be added or be deprecated. Structs may only contain
 scalars or other structs. Use this for
 simple objects where you are very sure no changes will ever be made
 (as quite clear in the example `Vec3`). Structs use less memory than
 tables and are even faster to access (they are always stored in-line in their
 parent object, and use no virtual table).

 ### Types

 Builtin scalar types are:

 -   8 bit: `byte ubyte bool`

 -   16 bit: `short ushort`

 -   32 bit: `int uint float`

 -   64 bit: `long ulong double`

 -   Vector of any other type (denoted with `[type]`). Nesting vectors
     is not supported, instead you can wrap the inner vector in a table.

 -   `string`, which may only hold UTF-8 or 7-bit ASCII. For other text encodings
     or general binary data use vectors (`[byte]` or `[ubyte]`) instead.

 -   References to other tables or structs, enums or unions (see
     below).

 You can't change types of fields once they're used, with the exception
 of same-size data where a `reinterpret_cast` would give you a desirable result,
 e.g. you could change a `uint` to an `int` if no values in current data use the
 high bit yet.

 ### (Default) Values

 Values are a sequence of digits, optionally followed by a `.` and more digits
 for float constants, and optionally prefixed by a `-`. Non-scalar defaults are
 currently not supported (always NULL).

 You generally do not want to change default values after they're initially
 defined. Fields that have the default value are not actually stored in the
 serialized data but are generated in code, so when you change the default, you'd
 now get a different value than from code generated from an older version of
 the schema. There are situations however where this may be
 desirable, especially if you can ensure a simultaneous rebuild of
 all code.

 ### Enums

 Define a sequence of named constants, each with a given value, or
 increasing by one from the previous one. The default first value
 is `0`. As you can see in the enum declaration, you specify the underlying
 integral type of the enum with `:` (in this case `byte`), which then determines
 the type of any fields declared with this enum type.

 ### Unions

 Unions share a lot of properties with enums, but instead of new names
 for constants, you use names of tables. You can then declare
 a union field which can hold a reference to any of those types, and
 additionally a hidden field with the suffix `_type` is generated that
 holds the corresponding enum value, allowing you to know which type to
 cast to at runtime.

 ### Namespaces

 These will generate the corresponding namespace in C++ for all helper
 code, and packages in Java. You can use `.` to specify nested namespaces /
 packages.

 ### Root type

 This declares what you consider to be the root table (or struct) of the
 serialized data.

 ### Comments & documentation

 May be written as in most C-based languages. Additionally, a triple
 comment (`///`) on a line by itself signals that a comment is documentation
 for whatever is declared on the line after it
 (table/struct/field/enum/union/element), and the comment is output
 in the corresponding C++ code. Multiple such lines per item are allowed.

 ### Attributes

 Attributes may be attached to a declaration, behind a field, or after
 the name of a table/struct/enum/union. These may either have a value or
 not. Some attributes like `deprecated` are understood by the compiler,
 others are simply ignored (like `priority`), but are available to query
 if you parse the schema at runtime.
 This is useful if you write your own code generators/editors etc., and
 you wish to add additional information specific to your tool (such as a
 help text).

 Current understood attributes:

 -   `id: n` (on a table field): manually set the field identifier to `n`.
     If you use this attribute, you must use it on ALL fields of this table,
     and the numbers must be a contiguous range from 0 onwards.
     Additionally, since a union type effectively adds two fields, its
     id must be that of the second field (the first field is the type
     field and not explicitly declared in the schema).
     For example, if the last field before the union field had id 6,
     the union field should have id 8, and the unions type field will
     implicitly be 7.
     IDs allow the fields to be placed in any order in the schema.
     When a new field is added to the schema is must use the next available ID.
 -   `deprecated` (on a field): do not generate accessors for this field
     anymore, code should stop using this data.
 -   `original_order` (on a table): since elements in a table do not need
     to be stored in any particular order, they are often optimized for
     space by sorting them to size. This attribute stops that from happening.
 -   `force_align: size` (on a struct): force the alignment of this struct
     to be something higher than what it is naturally aligned to. Causes
     these structs to be aligned to that amount inside a buffer, IF that
     buffer is allocated with that alignment (which is not necessarily
     the case for buffers accessed directly inside a `FlatBufferBuilder`).

 ## Gotchas

 ### Schemas and version control

 FlatBuffers relies on new field declarations being added at the end, and earlier
 declarations to not be removed, but be marked deprecated when needed. We think
 this is an improvement over the manual number assignment that happens in
 Protocol Buffers (and which is still an option using the `id` attribute
 mentioned above).

 One place where this is possibly problematic however is source control. If user
 A adds a field, generates new binary data with this new schema, then tries to
 commit both to source control after user B already committed a new field also,
 and just auto-merges the schema, the binary files are now invalid compared to
 the new schema.

 The solution of course is that you should not be generating binary data before
 your schema changes have been committed, ensuring consistency with the rest of
 the world. If this is not practical for you, use explicit field ids, which
 should always generate a merge conflict if two people try to allocate the same
 id.
	# Writing a schema

	The syntax of the schema language (aka IDL, Interface Definition
	Language) should look quite familiar to users of any of the C family of
	languages, and also to users of other IDLs. Let's look at an example
	first:

	// example IDL file

	namespace MyGame;

	enum Color : byte { Red = 1, Green, Blue }

	union Any { Monster, Weapon, Pickup }

	struct Vec3 {
	x:float;
	y:float;
	z:float;
	}

	table Monster {
	pos:Vec3;
	mana:short = 150;
	hp:short = 100;
	name:string;
	friendly:bool = false (deprecated, priority: 1);
	inventory:[ubyte];
	color:Color = Blue;
	test:Any;
	}

	root_type Monster;

	(Weapon & Pickup not defined as part of this example).

	### Tables

	Tables are the main way of defining objects in FlatBuffers, and consist
	of a name (here `Monster`) and a list of fields. Each field has a name,
	a type, and optionally a default value (if omitted, it defaults to 0 /
	NULL).

	Each field is optional: It does not have to appear in the wire
	representation, and you can choose to omit fields for each individual
	object. As a result, you have the flexibility to add fields without fear of
	bloating your data. This design is also FlatBuffer's mechanism for forward
	and backwards compatibility. Note that:

	- You can add new fields in the schema ONLY at the end of a table
	definition. Older data will still
	read correctly, and give you the default value when read. Older code
	will simply ignore the new field.
	If you want to have flexibility to use any order for fields in your
	schema, you can manually assign ids (much like Protocol Buffers),
	see the `id` attribute below.

	- You cannot delete fields you don't use anymore from the schema,
	but you can simply
	stop writing them into your data for almost the same effect.
	Additionally you can mark them as `deprecated` as in the example
	above, which will prevent the generation of accessors in the
	generated C++, as a way to enforce the field not being used any more.
	(careful: this may break code!).

	- You may change field names and table names, if you're ok with your
	code breaking until you've renamed them there too.



	### Structs

	Similar to a table, only now none of the fields are optional (so no defaults
	either), and fields may not be added or be deprecated. Structs may only contain
	scalars or other structs. Use this for
	simple objects where you are very sure no changes will ever be made
	(as quite clear in the example `Vec3`). Structs use less memory than
	tables and are even faster to access (they are always stored in-line in their
	parent object, and use no virtual table).

	### Types

	Builtin scalar types are:

	- 8 bit: `byte ubyte bool`

	- 16 bit: `short ushort`

	- 32 bit: `int uint float`

	- 64 bit: `long ulong double`

	- Vector of any other type (denoted with `[type]`). Nesting vectors
	is not supported, instead you can wrap the inner vector in a table.

	- `string`, which may only hold UTF-8 or 7-bit ASCII. For other text encodings
	or general binary data use vectors (`[byte]` or `[ubyte]`) instead.

	- References to other tables or structs, enums or unions (see
	below).

	You can't change types of fields once they're used, with the exception
	of same-size data where a `reinterpret_cast` would give you a desirable result,
	e.g. you could change a `uint` to an `int` if no values in current data use the
	high bit yet.

	### (Default) Values

	Values are a sequence of digits, optionally followed by a `.` and more digits
	for float constants, and optionally prefixed by a `-`. Non-scalar defaults are
	currently not supported (always NULL).

	You generally do not want to change default values after they're initially
	defined. Fields that have the default value are not actually stored in the
	serialized data but are generated in code, so when you change the default, you'd
	now get a different value than from code generated from an older version of
	the schema. There are situations however where this may be
	desirable, especially if you can ensure a simultaneous rebuild of
	all code.

	### Enums

	Define a sequence of named constants, each with a given value, or
	increasing by one from the previous one. The default first value
	is `0`. As you can see in the enum declaration, you specify the underlying
	integral type of the enum with `:` (in this case `byte`), which then determines
	the type of any fields declared with this enum type.

	### Unions

	Unions share a lot of properties with enums, but instead of new names
	for constants, you use names of tables. You can then declare
	a union field which can hold a reference to any of those types, and
	additionally a hidden field with the suffix `_type` is generated that
	holds the corresponding enum value, allowing you to know which type to
	cast to at runtime.

	### Namespaces

	These will generate the corresponding namespace in C++ for all helper
	code, and packages in Java. You can use `.` to specify nested namespaces /
	packages.

	### Root type

	This declares what you consider to be the root table (or struct) of the
	serialized data.

	### Comments & documentation

	May be written as in most C-based languages. Additionally, a triple
	comment (`///`) on a line by itself signals that a comment is documentation
	for whatever is declared on the line after it
	(table/struct/field/enum/union/element), and the comment is output
	in the corresponding C++ code. Multiple such lines per item are allowed.

	### Attributes

	Attributes may be attached to a declaration, behind a field, or after
	the name of a table/struct/enum/union. These may either have a value or
	not. Some attributes like `deprecated` are understood by the compiler,
	others are simply ignored (like `priority`), but are available to query
	if you parse the schema at runtime.
	This is useful if you write your own code generators/editors etc., and
	you wish to add additional information specific to your tool (such as a
	help text).

	Current understood attributes:

	- `id: n` (on a table field): manually set the field identifier to `n`.
	If you use this attribute, you must use it on ALL fields of this table,
	and the numbers must be a contiguous range from 0 onwards.
	Additionally, since a union type effectively adds two fields, its
	id must be that of the second field (the first field is the type
	field and not explicitly declared in the schema).
	For example, if the last field before the union field had id 6,
	the union field should have id 8, and the unions type field will
	implicitly be 7.
	IDs allow the fields to be placed in any order in the schema.
	When a new field is added to the schema is must use the next available ID.
	- `deprecated` (on a field): do not generate accessors for this field
	anymore, code should stop using this data.
	- `original_order` (on a table): since elements in a table do not need
	to be stored in any particular order, they are often optimized for
	space by sorting them to size. This attribute stops that from happening.
	- `force_align: size` (on a struct): force the alignment of this struct
	to be something higher than what it is naturally aligned to. Causes
	these structs to be aligned to that amount inside a buffer, IF that
	buffer is allocated with that alignment (which is not necessarily
	the case for buffers accessed directly inside a `FlatBufferBuilder`).

	## Gotchas

	### Schemas and version control

	FlatBuffers relies on new field declarations being added at the end, and earlier
	declarations to not be removed, but be marked deprecated when needed. We think
	this is an improvement over the manual number assignment that happens in
	Protocol Buffers (and which is still an option using the `id` attribute
	mentioned above).

	One place where this is possibly problematic however is source control. If user
	A adds a field, generates new binary data with this new schema, then tries to
	commit both to source control after user B already committed a new field also,
	and just auto-merges the schema, the binary files are now invalid compared to
	the new schema.

	The solution of course is that you should not be generating binary data before
	your schema changes have been committed, ensuring consistency with the rest of
	the world. If this is not practical for you, use explicit field ids, which
	should always generate a merge conflict if two people try to allocate the same
	id.