Initial open-source commit of Emboss.
diff --git a/.bazelrc b/.bazelrc
new file mode 100644
index 0000000..6682e93
--- /dev/null
+++ b/.bazelrc
@@ -0,0 +1 @@
+build --copt=-std=c++11
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..939e534
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,28 @@
+# How to Contribute
+
+We'd love to accept your patches and contributions to this project. There are
+just a few small guidelines you need to follow.
+
+## Contributor License Agreement
+
+Contributions to this project must be accompanied by a Contributor License
+Agreement. You (or your employer) retain the copyright to your contribution;
+this simply gives us permission to use and redistribute your contributions as
+part of the project. Head over to <https://cla.developers.google.com/> to see
+your current agreements on file or to sign a new one.
+
+You generally only need to submit a CLA once, so if you've already submitted one
+(even if it was for a different project), you probably don't need to do it
+again.
+
+## Code reviews
+
+All submissions, including submissions by project members, require review. We
+use GitHub pull requests for this purpose. Consult
+[GitHub Help](https://help.github.com/articles/about-pull-requests/) for more
+information on using pull requests.
+
+## Community Guidelines
+
+This project follows [Google's Open Source Community
+Guidelines](https://opensource.google.com/conduct/).
diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000..d645695
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,202 @@
+
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright [yyyy] [name of copyright owner]
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..9bb3998
--- /dev/null
+++ b/README.md
@@ -0,0 +1,96 @@
+# Emboss
+
+Emboss is a tool for generating code that reads and writes binary data
+structures. It is designed to help write code that communicates with hardware
+devices such as GPS receivers, LIDAR scanners, or actuators.
+
+## What does Emboss *do*?
+
+Emboss takes specifications of binary data structures, and produces code that
+will efficiently and safely read and write those structures.
+
+Currently, Emboss only generates C++ code, but the compiler is structured so
+that writing new back ends is relatively easy -- contact emboss-dev@google.com
+if you think Emboss would be useful, but your project uses a different language.
+
+
+## When should I use Emboss?
+
+If you're sitting down with a manual that looks something like
+[this](http://www.novatel.com/assets/Documents/Manuals/om-20000094.pdf) or
+[this](http://www.u-blox.com/images/downloads/Product_Docs/u-blox6_ReceiverDescriptionProtocolSpec_%28GPS.G6-SW-10018%29.pdf),
+Emboss is meant for you.
+
+
+## When should I not use Emboss?
+
+Emboss is not designed to handle text-based protocols; if you can use minicom or
+telnet to connect to your device, and manually enter commands and see responses,
+Emboss probably won't help you.
+
+Emboss is intended for cases where you do not control the data format. If you
+are defining your own format, you may be better off using [Protocol
+Buffers](https://developers.google.com/protocol-buffers/) or [Cap'n
+Proto](https://capnproto.org/) or [BSON](http://bsonspec.org/) or some similar
+system.
+
+
+## Why not just use packed structs?
+
+In C++, packed structs are the most common method of dealing with these kinds of
+structures; however, they have a number of drawbacks compared to Emboss views:
+
+1. Access to packed structs is not checked. Emboss (by default) ensures that
+ you do not read or write out of bounds.
+2. It is easy to accidentally trigger C++ undefined behavior using packed
+ structs, for example by not respecting the struct's alignment restrictions
+ or by running afoul of strict aliasing rules. Emboss is designed to work
+ with misaligned data, and is careful to use strict-aliasing-safe constructs.
+3. Packed structs do not handle variable-size arrays, nor arrays of
+ sub-byte-size fields, such as boolean flags.
+4. Packed structs do not handle endianness; your code must be very careful to
+ correctly convert stored endianness to native.
+5. Packed structs do not handle variable-sized fields, such as embedded
+ substructs with variable length.
+6. Although unions can sometimes help, packed structs do not handle overlapping
+ fields well.
+7. Although unions can sometimes help, packed structs do not handle optional
+ fields well.
+8. Certain aspects of bitfields in C++, such as their exact placement within
+ the larger containing block, are implementation-defined. Emboss always
+ reads and writes bitfields in a portable way.
+9. Packed structs do not have support for conversion to human-readable text
+ format.
+10. It is difficult to read the definition of a packed struct in order to
+ generate documentation, alternate representations, or support in languages
+ other than C and C++.
+
+
+## What does Emboss *not* do?
+
+Emboss does not help you transmit data over a wire -- you must use something
+else to actually transmit bytes back and forth. This is partly because there
+are too many possible ways of communicating with devices, but also because it
+allows you to manipulate structures independently of where they came from or
+where they are going.
+
+Emboss does not help you interpret your data, or implement any kind of
+higher-level logic. It is strictly meant to help you turn bit patterns into
+something suitable for your programming language to handle.
+
+
+## What state is Emboss in?
+
+Emboss is currently under development. While it should be entirely ready for
+many data formats, it may still be missing features. If you find something that
+Emboss can't handle, please contact `emboss-dev@google.com` to see if and when
+support can be added.
+
+Emboss is not an officially supported Google product: while the Emboss authors
+will try to answer feature requests, bug reports, and questions, there is no SLA
+(service level agreement).
+
+
+## Getting Started
+
+Head over to the [User Guide](g3doc/guide.md) to get started.
diff --git a/WORKSPACE b/WORKSPACE
new file mode 100644
index 0000000..f26dd06
--- /dev/null
+++ b/WORKSPACE
@@ -0,0 +1,19 @@
+workspace(name = "com_google_emboss")
+
+load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
+load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")
+
+# googletest
+git_repository(
+ name = "com_google_googletest",
+ remote = "https://github.com/google/googletest",
+ commit = "f899e81e43407c9a3433d9ad3a0a8f64e450ba44",
+ shallow_since = "1563302555 -0400",
+)
+
+git_repository(
+ name = "com_google_absl",
+ remote = "https://github.com/abseil/abseil-cpp",
+ commit = "44efe96dfca674a17b45ca53fc77fb69f1e29bf4",
+ shallow_since = "1562769772 +0000",
+)
diff --git a/back_end/__init__.py b/back_end/__init__.py
new file mode 100644
index 0000000..2c31d84
--- /dev/null
+++ b/back_end/__init__.py
@@ -0,0 +1,14 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
diff --git a/back_end/cpp/BUILD b/back_end/cpp/BUILD
new file mode 100644
index 0000000..d04f30c
--- /dev/null
+++ b/back_end/cpp/BUILD
@@ -0,0 +1,313 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Emboss C++ code generator.
+
+load(":build_defs.bzl", "emboss_cc_test")
+
+package(
+ default_visibility = ["//visibility:private"],
+ features = ["-layering_check"],
+)
+
+py_binary(
+ name = "emboss_codegen_cpp",
+ srcs = ["emboss_codegen_cpp.py"],
+ python_version = "PY3",
+ visibility = ["//visibility:public"],
+ deps = [
+ ":header_generator",
+ "//public:ir_pb2",
+ ],
+)
+
+py_library(
+ name = "header_generator",
+ srcs = ["header_generator.py"],
+ data = [
+ "generated_code_templates",
+ ],
+ deps = [
+ "//back_end/util:code_template",
+ "//public:ir_pb2",
+ "//util:ir_util",
+ "//util:name_conversion",
+ ],
+)
+
+emboss_cc_test(
+ name = "span_se_log_file_status_emb_generated_code_test",
+ srcs = [
+ "testcode/read_log_file_status_test.cc",
+ ],
+ deps = [
+ "//testdata:span_se_log_file_status_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "nested_structure_test",
+ srcs = [
+ "testcode/nested_structure_test.cc",
+ ],
+ deps = [
+ "//testdata:nested_structure_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "alignments_test",
+ srcs = [
+ "testcode/alignments_test.cc",
+ ],
+ deps = [
+ "//testdata:alignments_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "condition_test",
+ srcs = [
+ "testcode/condition_test.cc",
+ ],
+ deps = [
+ "//testdata:condition_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "enum_test",
+ srcs = [
+ "testcode/enum_test.cc",
+ ],
+ deps = [
+ "//testdata:enum_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "explicit_sizes_test",
+ srcs = [
+ "testcode/explicit_sizes_test.cc",
+ ],
+ deps = [
+ "//testdata:explicit_sizes_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "importer_test",
+ srcs = [
+ "testcode/importer_test.cc",
+ ],
+ deps = [
+ "//testdata:importer_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "uint_sizes_test",
+ srcs = [
+ "testcode/uint_sizes_test.cc",
+ ],
+ deps = [
+ "//testdata:uint_sizes_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "int_sizes_test",
+ srcs = [
+ "testcode/int_sizes_test.cc",
+ ],
+ deps = [
+ "//testdata:int_sizes_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "float_test",
+ srcs = [
+ "testcode/float_test.cc",
+ ],
+ deps = [
+ "//testdata:float_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "dynamic_size_test",
+ srcs = [
+ "testcode/dynamic_size_test.cc",
+ ],
+ deps = [
+ "//testdata:dynamic_size_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "auto_array_size_test",
+ srcs = [
+ "testcode/auto_array_size_test.cc",
+ ],
+ deps = [
+ "//testdata:auto_array_size_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "start_size_range_test",
+ srcs = [
+ "testcode/start_size_range_test.cc",
+ ],
+ deps = [
+ "//testdata:start_size_range_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "bcd_test",
+ srcs = [
+ "testcode/bcd_test.cc",
+ ],
+ deps = [
+ "//testdata:bcd_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "namespace_test",
+ srcs = [
+ "testcode/namespace_test.cc",
+ ],
+ deps = [
+ "//testdata:absolute_cpp_namespace_emboss",
+ "//testdata:cpp_namespace_emboss",
+ "//testdata:no_cpp_namespace_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "requires_test",
+ srcs = [
+ "testcode/requires_test.cc",
+ ],
+ deps = [
+ "//testdata:requires_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "subtypes_test",
+ srcs = [
+ "testcode/subtypes_test.cc",
+ ],
+ deps = [
+ "//testdata:subtypes_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "inline_type_test",
+ srcs = [
+ "testcode/inline_type_test.cc",
+ ],
+ deps = [
+ "//testdata:inline_type_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "bits_test",
+ srcs = [
+ "testcode/bits_test.cc",
+ ],
+ deps = [
+ "//public:cpp_utils",
+ "//testdata:bits_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "anonymous_bits_test",
+ srcs = [
+ "testcode/anonymous_bits_test.cc",
+ ],
+ deps = [
+ "//public:cpp_utils",
+ "//testdata:anonymous_bits_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "virtual_field_test",
+ srcs = [
+ "testcode/virtual_field_test.cc",
+ ],
+ deps = [
+ "//testdata:virtual_field_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "text_format_test",
+ srcs = [
+ "testcode/text_format_test.cc",
+ ],
+ deps = [
+ "//testdata:text_format_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "parameters_test",
+ srcs = [
+ "testcode/parameters_test.cc",
+ ],
+ deps = [
+ "//testdata:parameters_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_test(
+ name = "complex_structure_test",
+ srcs = ["testcode/complex_structure_test.cc"],
+ deps = [
+ "//testdata:complex_structure_emboss",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
diff --git a/back_end/cpp/__init__.py b/back_end/cpp/__init__.py
new file mode 100644
index 0000000..2c31d84
--- /dev/null
+++ b/back_end/cpp/__init__.py
@@ -0,0 +1,14 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
diff --git a/back_end/cpp/build_defs.bzl b/back_end/cpp/build_defs.bzl
new file mode 100644
index 0000000..fd891da
--- /dev/null
+++ b/back_end/cpp/build_defs.bzl
@@ -0,0 +1,33 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# -*- mode: python; -*-
+# vim:set ft=blazebuild:
+"""Rule to generate cc_tests with and without system-specific optimizations."""
+
+def emboss_cc_test(name, copts = None, **kwargs):
+ """Generates cc_test rules with and without -DEMBOSS_NO_OPTIMIZATIONS."""
+ native.cc_test(
+ name = name,
+ copts = ["-DEMBOSS_FORCE_ALL_CHECKS"] + (copts or []),
+ **kwargs
+ )
+ native.cc_test(
+ name = name + "_no_opts",
+ copts = [
+ "-DEMBOSS_NO_OPTIMIZATIONS",
+ "-DEMBOSS_FORCE_ALL_CHECKS",
+ ] + (copts or []),
+ **kwargs
+ )
diff --git a/back_end/cpp/emboss_codegen_cpp.py b/back_end/cpp/emboss_codegen_cpp.py
new file mode 100644
index 0000000..1dbe5ad
--- /dev/null
+++ b/back_end/cpp/emboss_codegen_cpp.py
@@ -0,0 +1,37 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Emboss C++ code generator.
+
+This is a driver program that reads IR, feeds it to header_generator, and prints
+the result.
+"""
+
+from __future__ import print_function
+
+import sys
+
+from back_end.cpp import header_generator
+from public import ir_pb2
+
+
+def main(argv):
+ del argv # Unused.
+ ir = ir_pb2.EmbossIr.from_json(sys.stdin.read())
+ print(header_generator.generate_header(ir))
+ return 0
+
+
+if __name__ == '__main__':
+ sys.exit(main(sys.argv))
diff --git a/back_end/cpp/generated_code_templates b/back_end/cpp/generated_code_templates
new file mode 100644
index 0000000..ebd828e
--- /dev/null
+++ b/back_end/cpp/generated_code_templates
@@ -0,0 +1,813 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// -*- mode: C++ -*-
+// vim: set filetype=cpp:
+
+// Fragments of C++ code used by the Emboss C++ code generator. Anything before
+// the first template is ignored. The names between ** ** are used as template
+// names. See code_template.py for more details. Local variable names are
+// prefixed with `emboss_reserved_local_` to avoid conflicting with struct field
+// names.
+
+// ** outline ** ///////////////////////////////////////////////////////////////
+// Generated by the Emboss compiler. DO NOT EDIT!
+#ifndef $_header_guard_$
+#define $_header_guard_$
+#include <stdint.h>
+#include <string.h>
+
+#include <algorithm>
+#include <ostream>
+#include <type_traits>
+#include <utility>
+
+#include "public/emboss_cpp_util.h"
+
+$_includes_$
+
+$_body_$
+
+#endif // $_header_guard_$
+
+
+// ** include ** ///////////////////////////////////////////////////////////////
+#include "$_file_name_$"
+
+
+// ** body ** //////////////////////////////////////////////////////////////////
+$_type_declarations_$
+$_type_definitions_$
+$_method_definitions_$
+
+
+// ** namespace_wrap ** ////////////////////////////////////////////////////////
+namespace $_component_$ {
+$_body_$
+} // namespace $_component_$
+
+
+// ** structure_view_declaration ** ////////////////////////////////////////////
+template <class Storage>
+class Generic$_name_$View;
+
+
+// ** structure_view_class ** //////////////////////////////////////////////////
+template <class View>
+struct EmbossReservedInternalIsGeneric$_name_$View;
+
+template <class Storage>
+class Generic$_name_$View final {
+ public:
+ Generic$_name_$View() : backing_() {}
+ explicit Generic$_name_$View(
+ $_constructor_parameters_$ Storage emboss_reserved_local_bytes)
+ : backing_(emboss_reserved_local_bytes) $_parameter_initializers_$
+ $_initialize_parameters_initialized_true_$ {}
+
+ // Views over compatible backing storage should be freely assignable.
+ template <typename OtherStorage>
+ Generic$_name_$View(
+ const Generic$_name_$View<OtherStorage> &emboss_reserved_local_other)
+ : backing_{emboss_reserved_local_other.BackingStorage()} {}
+
+ // Allow pass-through construction of backing_, but only if there is at least
+ // one argument, and, if exactly one argument, that argument is not a
+ // (possibly c/v/ref-qualified) Generic$_name_$View.
+ //
+ // Explicitly ruling out overloads that might match the copy or move
+ // constructor is necessary in order for the copy and move constructors to be
+ // reliably found during overload resolution.
+ template <typename Arg,
+ typename = typename ::std::enable_if<
+ !EmbossReservedInternalIsGeneric$_name_$View<
+ typename ::std::remove_cv<typename ::std::remove_reference<
+ Arg>::type>::type>::value>::type>
+ explicit Generic$_name_$View(
+ $_constructor_parameters_$ Arg &&emboss_reserved_local_arg)
+ : backing_(::std::forward<Arg>(
+ emboss_reserved_local_arg)) $_parameter_initializers_$
+ $_initialize_parameters_initialized_true_$ {}
+ template <typename Arg0, typename Arg1, typename... Args>
+ explicit Generic$_name_$View(
+ $_constructor_parameters_$ Arg0 &&emboss_reserved_local_arg0,
+ Arg1 &&emboss_reserved_local_arg1, Args &&... emboss_reserved_local_args)
+ : backing_(::std::forward<Arg0>(emboss_reserved_local_arg0),
+ ::std::forward<Arg1>(emboss_reserved_local_arg1),
+ ::std::forward<Args>(
+ emboss_reserved_local_args)...) $_parameter_initializers_$
+ $_initialize_parameters_initialized_true_$ {}
+
+ template <typename OtherStorage>
+ Generic$_name_$View<Storage> &operator=(
+ const Generic$_name_$View<OtherStorage> &emboss_reserved_local_other) {
+ backing_ = emboss_reserved_local_other.BackingStorage();
+ return *this;
+ }
+
+ $_enum_usings_$
+
+ bool Ok() const {
+ if (!IsComplete()) return false;
+$_parameter_ok_checks_$
+$_field_ok_checks_$
+$_requires_check_$
+ return true;
+ }
+ Storage BackingStorage() const { return backing_; }
+ bool IsComplete() const {
+ return backing_.Ok() && IntrinsicSizeIn$_units_$().Ok() &&
+ backing_.SizeIn$_units_$() >=
+ static_cast</**/ ::std::size_t>(
+ IntrinsicSizeIn$_units_$().UncheckedRead());
+ }
+$_size_method_$
+
+ template <typename OtherStorage>
+ bool Equals(
+ Generic$_name_$View<OtherStorage> emboss_reserved_local_other) const {
+ $_equals_method_body_$ return true;
+ }
+ template <typename OtherStorage>
+ bool UncheckedEquals(
+ Generic$_name_$View<OtherStorage> emboss_reserved_local_other) const {
+ $_unchecked_equals_method_body_$ return true;
+ }
+ // (Unchecked)CopyFrom copies the number of bytes included in the other view,
+ // and ignores the size of the current view. Even if they differ before
+ // copying, the destination view's size should match the source view's size
+ // after copying, because any fields used in the calculation of the
+ // destination view's size should be updated by the copy.
+ template <typename OtherStorage>
+ void UncheckedCopyFrom(
+ Generic$_name_$View<OtherStorage> emboss_reserved_local_other) const {
+ backing_.UncheckedCopyFrom(
+ emboss_reserved_local_other.BackingStorage(),
+ emboss_reserved_local_other.IntrinsicSizeIn$_units_$().UncheckedRead());
+ }
+
+ template <typename OtherStorage>
+ void CopyFrom(
+ Generic$_name_$View<OtherStorage> emboss_reserved_local_other) const {
+ backing_.CopyFrom(
+ emboss_reserved_local_other.BackingStorage(),
+ emboss_reserved_local_other.IntrinsicSizeIn$_units_$().Read());
+ }
+ template <typename OtherStorage>
+ bool TryToCopyFrom(
+ Generic$_name_$View<OtherStorage> emboss_reserved_local_other) const {
+ return emboss_reserved_local_other.Ok() && backing_.TryToCopyFrom(
+ emboss_reserved_local_other.BackingStorage(),
+ emboss_reserved_local_other.IntrinsicSizeIn$_units_$().Read());
+ }
+
+ template <class Stream>
+ bool UpdateFromTextStream(Stream *emboss_reserved_local_stream) const {
+ ::std::string emboss_reserved_local_brace;
+ if (!::emboss::support::ReadToken(emboss_reserved_local_stream,
+ &emboss_reserved_local_brace))
+ return false;
+ if (emboss_reserved_local_brace != "{") return false;
+ for (;;) {
+ ::std::string emboss_reserved_local_name;
+ if (!::emboss::support::ReadToken(emboss_reserved_local_stream,
+ &emboss_reserved_local_name))
+ return false;
+ if (emboss_reserved_local_name == ",")
+ if (!::emboss::support::ReadToken(emboss_reserved_local_stream,
+ &emboss_reserved_local_name))
+ return false;
+ if (emboss_reserved_local_name == "}") return true;
+ ::std::string emboss_reserved_local_colon;
+ if (!::emboss::support::ReadToken(emboss_reserved_local_stream,
+ &emboss_reserved_local_colon))
+ return false;
+ if (emboss_reserved_local_colon != ":") return false;
+$_decode_fields_$
+ // decode_fields will `continue` if it successfully finds a field.
+ return false;
+ }
+ }
+
+ template <class Stream>
+ void WriteToTextStream(
+ Stream *emboss_reserved_local_stream,
+ ::emboss::TextOutputOptions emboss_reserved_local_options) const {
+ ::emboss::TextOutputOptions emboss_reserved_local_field_options =
+ emboss_reserved_local_options.PlusOneIndent();
+ if (emboss_reserved_local_options.multiline()) {
+ emboss_reserved_local_stream->Write("{\n");
+ } else {
+ emboss_reserved_local_stream->Write("{");
+ }
+ bool emboss_reserved_local_wrote_field = false;
+$_write_fields_$
+ // Avoid unused variable warnings for empty structures:
+ (void)emboss_reserved_local_wrote_field;
+ if (emboss_reserved_local_options.multiline()) {
+ emboss_reserved_local_stream->Write(
+ emboss_reserved_local_options.current_indent());
+ emboss_reserved_local_stream->Write("}");
+ } else {
+ emboss_reserved_local_stream->Write(" }");
+ }
+ }
+
+$_field_method_declarations_$
+
+ private:
+ Storage backing_;
+ $_parameter_fields_$
+ $_parameters_initialized_flag_$
+
+ // This is a bit of a hack to handle Equals() and UncheckedEquals() between
+ // views with different underlying storage -- otherwise, structs with
+ // anonymous members run into access violations.
+ //
+ // TODO(bolms): Revisit this once the special-case code for anonymous members
+ // is replaced by explicit read/write virtual fields in the IR.
+ template <class OtherStorage>
+ friend class Generic$_name_$View;
+};
+using $_name_$View =
+ Generic$_name_$View</**/ ::emboss::support::ReadOnlyContiguousBuffer>;
+using $_name_$Writer =
+ Generic$_name_$View</**/ ::emboss::support::ReadWriteContiguousBuffer>;
+
+template <class View>
+struct EmbossReservedInternalIsGeneric$_name_$View {
+ static constexpr bool value = false;
+};
+
+template <class Storage>
+struct EmbossReservedInternalIsGeneric$_name_$View<
+ Generic$_name_$View<Storage>> {
+ static constexpr bool value = true;
+};
+
+template <typename T>
+inline Generic$_name_$View<
+ /**/ ::emboss::support::ContiguousBuffer<
+ typename ::std::remove_reference<
+ decltype(*::std::declval<T>()->data())>::type,
+ 1, 0>>
+Make$_name_$View($_constructor_parameters_$ T &&emboss_reserved_local_arg) {
+ return Generic$_name_$View<
+ /**/ ::emboss::support::ContiguousBuffer<
+ typename ::std::remove_reference<decltype(
+ *::std::declval<T>()->data())>::type,
+ 1, 0>>(
+ $_forwarded_parameters_$ ::std::forward<T>(emboss_reserved_local_arg));
+}
+
+template <typename T>
+inline Generic$_name_$View</**/ ::emboss::support::ContiguousBuffer<T, 1, 0>>
+Make$_name_$View($_constructor_parameters_$ T *emboss_reserved_local_data,
+ ::std::size_t emboss_reserved_local_size) {
+ return Generic$_name_$View</**/ ::emboss::support::ContiguousBuffer<T, 1, 0>>(
+ $_forwarded_parameters_$ emboss_reserved_local_data,
+ emboss_reserved_local_size);
+}
+
+template <typename T, ::std::size_t kAlignment>
+inline Generic$_name_$View<
+ /**/ ::emboss::support::ContiguousBuffer<T, kAlignment, 0>>
+MakeAligned$_name_$View(
+ $_constructor_parameters_$ T *emboss_reserved_local_data,
+ ::std::size_t emboss_reserved_local_size) {
+ return Generic$_name_$View<
+ /**/ ::emboss::support::ContiguousBuffer<T, kAlignment, 0>>(
+ $_forwarded_parameters_$ emboss_reserved_local_data,
+ emboss_reserved_local_size);
+}
+
+// ** decode_field ** //////////////////////////////////////////////////////////
+ // If the field name matches $_field_name_$, handle it, otherwise fall
+ // through to the next field.
+ if (emboss_reserved_local_name == "$_field_name_$") {
+ // TODO(bolms): How should missing optional fields be handled?
+ if (!$_field_name_$().UpdateFromTextStream(
+ emboss_reserved_local_stream)) {
+ return false;
+ }
+ continue;
+ }
+
+
+// ** write_field_to_text_stream ** ////////////////////////////////////////////
+ if (has_$_field_name_$().ValueOr(false)) {
+ if (emboss_reserved_local_field_options.multiline()) {
+ emboss_reserved_local_stream->Write(
+ emboss_reserved_local_field_options.current_indent());
+ } else {
+ if (emboss_reserved_local_wrote_field) {
+ emboss_reserved_local_stream->Write(",");
+ }
+ emboss_reserved_local_stream->Write(" ");
+ }
+ emboss_reserved_local_stream->Write("$_field_name_$: ");
+ $_field_name_$().WriteToTextStream(emboss_reserved_local_stream,
+ emboss_reserved_local_field_options);
+ emboss_reserved_local_wrote_field = true;
+ if (emboss_reserved_local_field_options.multiline()) {
+ emboss_reserved_local_stream->Write("\n");
+ }
+ }
+
+
+// ** write_read_only_field_to_text_stream ** //////////////////////////////////
+ if (has_$_field_name_$().ValueOr(false) &&
+ emboss_reserved_local_field_options.comments()) {
+ emboss_reserved_local_stream->Write(
+ emboss_reserved_local_field_options.current_indent());
+ // TODO(bolms): When there are multiline read-only fields, add an option
+ // to TextOutputOptions to add `# ` to the current indent and use it here,
+ // so that subsequent lines are also commented out.
+ emboss_reserved_local_stream->Write("# $_field_name_$: ");
+ $_field_name_$().WriteToTextStream(emboss_reserved_local_stream,
+ emboss_reserved_local_field_options);
+ emboss_reserved_local_stream->Write("\n");
+ }
+
+// ** constant_structure_size_method ** ////////////////////////////////////////
+ static constexpr ::std::size_t SizeIn$_units_$() {
+ return static_cast</**/ ::std::size_t>(IntrinsicSizeIn$_units_$().Read());
+ }
+ static constexpr bool SizeIsKnown() {
+ return IntrinsicSizeIn$_units_$().Ok();
+ }
+
+// ** runtime_structure_size_method ** /////////////////////////////////////////
+ ::std::size_t SizeIn$_units_$() const {
+ return static_cast</**/ ::std::size_t>(IntrinsicSizeIn$_units_$().Read());
+ }
+ bool SizeIsKnown() const { return IntrinsicSizeIn$_units_$().Ok(); }
+
+
+// ** ok_method_test ** ////////////////////////////////////////////////////////
+ // If we don't have enough information to determine whether $_field_$ is
+ // present in the structure, then structure.Ok() should be false.
+ if (!has_$_field_$.Known()) return false;
+ // If $_field_$ is present, but not Ok(), then structure.Ok() should be
+ // false. If $_field_$ is not present, it does not matter whether it is
+ // Ok().
+ if (has_$_field_$.ValueOrDefault() && !$_field_$.Ok()) return false;
+
+
+// ** equals_method_test ** ////////////////////////////////////////////////////
+ // If this->$_field_$ is not equal to emboss_reserved_local_other.$_field_$,
+ // then the structures are not equal.
+
+ // If either structure's has_$_field_$ is unknown, then default to not
+ // Equals().
+ //
+ // TODO(bolms): Should Equals() return Maybe<bool> and/or return true for
+ // non-Ok()-but-equivalent structures?
+ if (!has_$_field_$.Known()) return false;
+ if (!emboss_reserved_local_other.has_$_field_$.Known()) return false;
+
+ // If one side has $_field_$ but the other side does not, then the fields
+ // are not equal. We use ValueOrDefault() instead of Value() since Value()
+ // is more complex and non-constexpr, and we already know that
+ // has_$_field_$.Known() is true for both structures.
+ if (emboss_reserved_local_other.has_$_field_$.ValueOrDefault() &&
+ !has_$_field_$.ValueOrDefault())
+ return false;
+ if (has_$_field_$.ValueOrDefault() &&
+ !emboss_reserved_local_other.has_$_field_$.ValueOrDefault())
+ return false;
+
+ // If both sides have $_field_$, then check that their Equals() returns
+ // true.
+ if (emboss_reserved_local_other.has_$_field_$.ValueOrDefault() &&
+ has_$_field_$.ValueOrDefault() &&
+ !$_field_$.Equals(emboss_reserved_local_other.$_field_$))
+ return false;
+
+
+// ** unchecked_equals_method_test ** //////////////////////////////////////////
+ // The contract for UncheckedEquals() is that the caller must ensure that
+ // both views are Ok() (which implies that has_$_field_$.Known() is true),
+ // and UncheckedEquals() will never perform any assertion checks (which
+ // implies that UncheckedEquals() cannot call has_$_field_$.Value()).
+
+ // If this->has_$_field_$ but !emboss_reserved_local_other.has_$_field_$, or
+ // vice versa, then the structures are not equal. If neither structure
+ // has_$_field_$, then $_field_$ is considered equal.
+ if (emboss_reserved_local_other.has_$_field_$.ValueOr(false) &&
+ !has_$_field_$.ValueOr(false))
+ return false;
+ if (has_$_field_$.ValueOr(false) &&
+ !emboss_reserved_local_other.has_$_field_$.ValueOr(false))
+ return false;
+
+ // If $_field_$ is present in both structures, then check its equality.
+ if (emboss_reserved_local_other.has_$_field_$.ValueOr(false) &&
+ has_$_field_$.ValueOr(false) &&
+ !$_field_$.UncheckedEquals(emboss_reserved_local_other.$_field_$))
+ return false;
+
+
+// ** structure_view_type ** ///////////////////////////////////////////////////
+$_namespace_$::Generic$_name_$View<typename $_buffer_type_$>
+
+
+// ** external_view_type ** ////////////////////////////////////////////////////
+$_namespace_$::$_name_$View<
+ /**/ ::emboss::support::FixedSizeViewParameters<$_bits_$, $_validator_$>,
+ typename $_buffer_type_$>
+
+
+// ** enum_view_type ** ////////////////////////////////////////////////////////
+$_support_namespace_$::EnumView<
+ /**/ $_enum_type_$,
+ ::emboss::support::FixedSizeViewParameters<$_bits_$, $_validator_$>,
+ typename $_buffer_type_$>
+
+
+// ** array_view_adapter ** ////////////////////////////////////////////////////
+$_support_namespace_$::GenericArrayView<
+ typename $_element_view_type_$, typename $_buffer_type_$, $_element_size_$,
+ $_addressable_unit_size_$ $_element_view_parameter_types_$>
+
+
+// ** structure_field_validator ** /////////////////////////////////////////////
+struct $_name_$ {
+ template <typename ValueType>
+ static constexpr bool ValueIsOk(ValueType emboss_reserved_local_value) {
+ return ($_expression_$).ValueOrDefault();
+ }
+};
+
+
+// ** structure_single_field_method_declarations ** ////////////////////////////
+ $_visibility_$:
+ typename $_type_reader_$ $_name_$() const;
+ ::emboss::support::Maybe<bool> has_$_name_$() const;
+
+
+// ** structure_single_field_method_definitions ** /////////////////////////////
+template <class Storage>
+inline typename $_type_reader_$ Generic$_parent_type_$View<Storage>::$_name_$()
+ const {
+ // If it's not possible to read the location of this field, provide a view
+ // into a null storage -- the only safe methods to call on it will be Ok() and
+ // IsComplete(), but it is necessary to return a view so that client code can
+ // call those methods at all. Similarly, if the end of the field would come
+ // before the start, we provide a null storage, though arguably we should
+ // not.
+ if ($_parameters_known_$ has_$_name_$().ValueOr(false) && $_size_$.Known() &&
+ $_size_$.ValueOr(0) >= 0 && $_offset_$.Known() &&
+ $_offset_$.ValueOr(0) >= 0) {
+ return $_type_reader_$(
+ $_parameter_values_$ backing_
+ .template GetOffsetStorage<$_alignment_$, $_static_offset_$>(
+ $_offset_$.ValueOrDefault(), $_size_$.ValueOrDefault()));
+ } else {
+ return $_type_reader_$();
+ }
+}
+
+template <class Storage>
+inline ::emboss::support::Maybe<bool>
+Generic$_parent_type_$View<Storage>::has_$_name_$() const {
+ return $_field_exists_$;
+}
+
+
+// ** structure_single_const_virtual_field_method_declarations ** //////////////
+ class $_virtual_view_type_name_$ final {
+ public:
+ using ValueType = $_logical_type_$;
+
+ constexpr $_virtual_view_type_name_$() {}
+ $_virtual_view_type_name_$(const $_virtual_view_type_name_$ &) = default;
+ $_virtual_view_type_name_$($_virtual_view_type_name_$ &&) = default;
+ $_virtual_view_type_name_$ &operator=(const $_virtual_view_type_name_$ &) =
+ default;
+ $_virtual_view_type_name_$ &operator=($_virtual_view_type_name_$ &&) =
+ default;
+ ~$_virtual_view_type_name_$() = default;
+
+ static constexpr $_logical_type_$ Read();
+ static constexpr $_logical_type_$ UncheckedRead();
+ static constexpr bool Ok() { return true; }
+ template <class Stream>
+ void WriteToTextStream(Stream *emboss_reserved_local_stream,
+ const ::emboss::TextOutputOptions
+ &emboss_reserved_local_options) const {
+ ::emboss::support::$_write_to_text_stream_function_$(
+ this, emboss_reserved_local_stream, emboss_reserved_local_options);
+ }
+ };
+
+ static constexpr $_virtual_view_type_name_$ $_name_$() {
+ return $_virtual_view_type_name_$();
+ }
+ static constexpr ::emboss::support::Maybe<bool> has_$_name_$() {
+ return ::emboss::support::Maybe<bool>(true);
+ }
+
+
+// ** structure_single_const_virtual_field_method_definitions ** ///////////////
+namespace $_parent_type_$ {
+inline constexpr $_logical_type_$ $_name_$() {
+ return $_read_value_$.ValueOrDefault();
+}
+} // namespace $_parent_type_$
+
+template <class Storage>
+inline constexpr $_logical_type_$
+Generic$_parent_type_$View<Storage>::$_virtual_view_type_name_$::Read() {
+ return $_parent_type_$::$_name_$();
+}
+
+template <class Storage>
+inline constexpr $_logical_type_$ Generic$_parent_type_$View<
+ Storage>::$_virtual_view_type_name_$::UncheckedRead() {
+ return $_parent_type_$::$_name_$();
+}
+
+// ** structure_single_virtual_field_method_declarations ** ////////////////////
+ class $_virtual_view_type_name_$ final {
+ public:
+ using ValueType = $_logical_type_$;
+
+ explicit $_virtual_view_type_name_$(
+ const Generic$_parent_type_$View &emboss_reserved_local_view)
+ : view_(emboss_reserved_local_view) {}
+ $_virtual_view_type_name_$() = delete;
+ $_virtual_view_type_name_$(const $_virtual_view_type_name_$ &) = default;
+ $_virtual_view_type_name_$($_virtual_view_type_name_$ &&) = default;
+ $_virtual_view_type_name_$ &operator=(const $_virtual_view_type_name_$ &) =
+ default;
+ $_virtual_view_type_name_$ &operator=($_virtual_view_type_name_$ &&) =
+ default;
+ ~$_virtual_view_type_name_$() = default;
+
+ $_logical_type_$ Read() const {
+ EMBOSS_CHECK(view_.has_$_name_$().ValueOr(false));
+ auto emboss_reserved_local_value = MaybeRead();
+ EMBOSS_CHECK(emboss_reserved_local_value.Known());
+ EMBOSS_CHECK(ValueIsOk(emboss_reserved_local_value.ValueOrDefault()));
+ return emboss_reserved_local_value.ValueOrDefault();
+ }
+ $_logical_type_$ UncheckedRead() const {
+ // UncheckedRead() on a virtual still calls Ok() on its dependencies;
+ // i.e., it still does some bounds checking. This is because of a subtle
+ // case, illustrated by the example below:
+ //
+ // # .emb
+ // struct Foo:
+ // 0 [+1] UInt x
+ // if x != 0:
+ // 1 [+1] UInt y
+ // let x_and_y = x != 0 && y != 0
+ //
+ // // .cc
+ // std::array<char, 1> buffer = {0};
+ // const auto view = MakeFooView(&buffer);
+ // assert(!view.x_and_y().UncheckedRead());
+ //
+ // Without the checks for Ok(), the implementation of UncheckedRead()
+ // looks something like:
+ //
+ // bool UncheckedRead() const {
+ // return And(view_.x().UncheckedRead(),
+ // view_.y().UncheckedRead()).ValueOrDefault();
+ // }
+ //
+ // Unfortunately, even if x().UncheckedRead() is false, this will call
+ // UncheckedRead() on y(), which will segfault.
+ //
+ // TODO(bolms): Figure out a way to minimize bounds checking, instead of
+ // just always checking here.
+ return MaybeRead().ValueOrDefault();
+ }
+ // Ok() can be false if some dependency is unreadable, *or* if there is an
+ // error somewhere in the arithmetic -- say, division by zero.
+ bool Ok() const {
+ auto emboss_reserved_local_value = MaybeRead();
+ return emboss_reserved_local_value.Known() &&
+ ValueIsOk(emboss_reserved_local_value.ValueOrDefault());
+ }
+ template <class Stream>
+ void WriteToTextStream(Stream *emboss_reserved_local_stream,
+ const ::emboss::TextOutputOptions
+ &emboss_reserved_local_options) const {
+ ::emboss::support::$_write_to_text_stream_function_$(
+ this, emboss_reserved_local_stream, emboss_reserved_local_options);
+ }
+
+$_write_methods_$
+
+ private:
+ ::emboss::support::Maybe</**/ $_logical_type_$> MaybeRead() const {
+ return $_read_value_$;
+ }
+
+ static constexpr bool ValueIsOk(
+ $_logical_type_$ emboss_reserved_local_value) {
+ return $_value_is_ok_$.ValueOr(false);
+ }
+
+ const Generic$_parent_type_$View view_;
+ };
+ $_virtual_view_type_name_$ $_name_$() const;
+ ::emboss::support::Maybe<bool> has_$_name_$() const;
+
+
+// ** structure_single_virtual_field_write_methods ** //////////////////////////
+ bool TryToWrite($_logical_type_$ emboss_reserved_local_value) {
+ const auto emboss_reserved_local_maybe_new_value = $_transform_$;
+ if (!CouldWriteValue(emboss_reserved_local_value)) return false;
+ return view_.$_destination_$.TryToWrite(
+ emboss_reserved_local_maybe_new_value.ValueOrDefault());
+ }
+ void Write($_logical_type_$ emboss_reserved_local_value) {
+ EMBOSS_CHECK(TryToWrite(emboss_reserved_local_value));
+ }
+ void UncheckedWrite($_logical_type_$ emboss_reserved_local_value) {
+ view_.$_destination_$.UncheckedWrite(($_transform_$).ValueOrDefault());
+ }
+ bool CouldWriteValue($_logical_type_$ emboss_reserved_local_value) {
+ if (!ValueIsOk(emboss_reserved_local_value)) return false;
+ const auto emboss_reserved_local_maybe_new_value = $_transform_$;
+ if (!emboss_reserved_local_maybe_new_value.Known()) return false;
+ return view_.$_destination_$.CouldWriteValue(
+ emboss_reserved_local_maybe_new_value.ValueOrDefault());
+ }
+ template <class Stream>
+ bool UpdateFromTextStream(Stream *emboss_reserved_local_stream) {
+ return ::emboss::support::ReadIntegerFromTextStream(
+ this, emboss_reserved_local_stream);
+ }
+
+
+// ** structure_single_virtual_field_method_definitions ** /////////////////////
+template <class Storage>
+inline typename Generic$_parent_type_$View<Storage>::$_virtual_view_type_name_$
+Generic$_parent_type_$View<Storage>::$_name_$() const {
+ return
+ typename Generic$_parent_type_$View<Storage>::$_virtual_view_type_name_$(
+ *this);
+}
+
+template <class Storage>
+inline ::emboss::support::Maybe<bool>
+Generic$_parent_type_$View<Storage>::has_$_name_$() const {
+ return $_field_exists_$;
+}
+
+
+// ** structure_single_field_indirect_method_declarations ** ///////////////////
+ $_visibility_$:
+ // The "this->" is required for (some versions of?) GCC.
+ auto $_name_$() const -> decltype(this->$_aliased_field_$) {
+ return has_$_name_$().ValueOrDefault() ? $_aliased_field_$
+ : decltype(this->$_aliased_field_$)();
+ }
+ ::emboss::support::Maybe<bool> has_$_name_$() const;
+
+
+// ** struct_single_field_indirect_method_definitions ** ///////////////////////
+template <class Storage>
+inline ::emboss::support::Maybe<bool>
+Generic$_parent_type_$View<Storage>::has_$_name_$() const {
+ return $_field_exists_$;
+}
+
+
+// ** structure_single_parameter_field_method_declarations ** //////////////////
+ private:
+ // TODO(bolms): Is there any harm if these are public methods?
+ constexpr ::emboss::support::MaybeConstantView</**/ $_logical_type_$>
+ $_name_$() const {
+ return parameters_initialized_
+ ? ::emboss::support::MaybeConstantView</**/ $_logical_type_$>(
+ $_name_$_)
+ : ::emboss::support::MaybeConstantView</**/ $_logical_type_$>();
+ }
+ constexpr ::emboss::support::Maybe<bool> has_$_name_$() const {
+ return ::emboss::support::Maybe<bool>(parameters_initialized_);
+ }
+
+
+// ** enum_declaration ** //////////////////////////////////////////////////////
+enum class $_enum_$ : $_enum_type_$;
+
+
+// ** enum_definition ** ///////////////////////////////////////////////////////
+enum class $_enum_$ : $_enum_type_$ {
+$_enum_values_$
+};
+
+// This setup (ab)uses the fact that C++ templates can be defined in many
+// translation units, but will be collapsed to a single definition at link time
+// (or no definition, if no client code instantiates the template).
+//
+// Emboss could accomplish almost the same result by generating multiple .cc
+// files (one per function), but Bazel doesn't have great support for specifying
+// "the output of this rule is an indeterminate number of files, all of which
+// should be used as input to this other rule," which would be necessary to
+// generate all the .cc files and then build and link them into a library.
+template <class Enum>
+class EnumTraits;
+
+template <>
+class EnumTraits<$_enum_$> final {
+ public:
+ static bool TryToGetEnumFromName(const char *emboss_reserved_local_name,
+ $_enum_$ *emboss_reserved_local_result) {
+ if (emboss_reserved_local_name == nullptr) return false;
+ // TODO(bolms): The generated code here would be much more efficient for
+ // large enums if the mapping were performed using a prefix trie rather than
+ // repeated strcmp().
+$_enum_from_name_cases_$
+ return false;
+ }
+
+ static const char *TryToGetNameFromEnum(
+ $_enum_$ emboss_reserved_local_value) {
+ switch (emboss_reserved_local_value) {
+$_name_from_enum_cases_$
+ default: return nullptr;
+ }
+ }
+
+ static bool EnumIsKnown($_enum_$ emboss_reserved_local_value) {
+ switch (emboss_reserved_local_value) {
+$_enum_is_known_cases_$
+ default:
+ return false;
+ }
+ }
+
+ static ::std::ostream &SendToOstream(::std::ostream &emboss_reserved_local_os,
+ $_enum_$ emboss_reserved_local_value) {
+ const char *emboss_reserved_local_name =
+ TryToGetNameFromEnum(emboss_reserved_local_value);
+ if (emboss_reserved_local_name == nullptr) {
+ emboss_reserved_local_os
+ << static_cast</**/ ::std::underlying_type<$_enum_$>::type>(
+ emboss_reserved_local_value);
+ } else {
+ emboss_reserved_local_os << emboss_reserved_local_name;
+ }
+ return emboss_reserved_local_os;
+ }
+};
+
+// These functions are intended to be found via ADL.
+static inline bool TryToGetEnumFromName(
+ const char *emboss_reserved_local_name,
+ $_enum_$ *emboss_reserved_local_result) {
+ return EnumTraits<$_enum_$>::TryToGetEnumFromName(
+ emboss_reserved_local_name, emboss_reserved_local_result);
+}
+
+static inline const char *TryToGetNameFromEnum(
+ $_enum_$ emboss_reserved_local_value) {
+ return EnumTraits<$_enum_$>::TryToGetNameFromEnum(
+ emboss_reserved_local_value);
+}
+
+static inline bool EnumIsKnown($_enum_$ emboss_reserved_local_value) {
+ return EnumTraits<$_enum_$>::EnumIsKnown(emboss_reserved_local_value);
+}
+
+static inline ::std::ostream &operator<<(
+ ::std::ostream &emboss_reserved_local_os,
+ $_enum_$ emboss_reserved_local_value) {
+ return EnumTraits<$_enum_$>::SendToOstream(emboss_reserved_local_os,
+ emboss_reserved_local_value);
+}
+
+// ** enum_from_name_case ** ///////////////////////////////////////////////////
+ if (!strcmp("$_name_$", emboss_reserved_local_name)) {
+ *emboss_reserved_local_result = $_enum_$::$_name_$;
+ return true;
+ }
+
+// ** name_from_enum_case ** ///////////////////////////////////////////////////
+ case $_enum_$::$_name_$: return "$_name_$";
+
+// ** enum_is_known_case ** ////////////////////////////////////////////////////
+ case $_enum_$::$_name_$: return true;
+
+// ** enum_value ** ////////////////////////////////////////////////////////////
+ $_name_$ = $_value_$,
+
+// ** enum_using_statement ** //////////////////////////////////////////////////
+ using $_name_$ = $_component_$;
diff --git a/back_end/cpp/header_generator.py b/back_end/cpp/header_generator.py
new file mode 100644
index 0000000..167cb26
--- /dev/null
+++ b/back_end/cpp/header_generator.py
@@ -0,0 +1,1248 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""C++ header code generator.
+
+Call generate_header(ir) to get the text of a C++ header file implementing View
+classes for the ir.
+"""
+
+import collections
+import pkgutil
+import re
+
+from back_end.util import code_template
+from public import ir_pb2
+from util import ir_util
+from util import name_conversion
+
+_TEMPLATES = code_template.parse_templates(pkgutil.get_data(
+ "back_end.cpp",
+ "generated_code_templates").decode(encoding="UTF-8"))
+
+_CPP_RESERVED_WORDS = set((
+ # C keywords. A few of these are not (yet) C++ keywords, but some compilers
+ # accept the superset of C and C++, so we still want to avoid them.
+ "asm", "auto", "break", "case", "char", "const", "continue", "default",
+ "do", "double", "else", "enum", "extern", "float", "for", "fortran", "goto",
+ "if", "inline", "int", "long", "register", "restrict", "return", "short",
+ "signed", "sizeof", "static", "struct", "switch", "typedef", "union",
+ "unsigned", "void", "volatile", "while", "_Alignas", "_Alignof", "_Atomic",
+ "_Bool", "_Complex", "_Generic", "_Imaginary", "_Noreturn", "_Pragma",
+ "_Static_assert", "_Thread_local",
+ # The following are not technically reserved words, but collisions are
+ # likely due to the standard macros.
+ "complex", "imaginary", "noreturn",
+ # C++ keywords that are not also C keywords.
+ "alignas", "alignof", "and", "and_eq", "asm", "bitand", "bitor", "bool",
+ "catch", "char16_t", "char32_t", "class", "compl", "concept", "constexpr",
+ "const_cast", "decltype", "delete", "dynamic_cast", "explicit", "export",
+ "false", "friend", "mutable", "namespace", "new", "noexcept", "not",
+ "not_eq", "nullptr", "operator", "or", "or_eq", "private", "protected",
+ "public", "reinterpret_cast", "requires", "static_assert", "static_cast",
+ "template", "this", "thread_local", "throw", "true", "try", "typeid",
+ "typename", "using", "virtual", "wchar_t", "xor", "xor_eq",
+ # "NULL" is not a keyword, but is still very likely to cause problems if
+ # used as a namespace name.
+ "NULL",
+))
+
+# The support namespace, as a C++ namespace prefix. This namespace contains the
+# Emboss C++ support classes.
+_SUPPORT_NAMESPACE = "::emboss::support"
+
+# TODO(bolms): This should be a command-line flag.
+_PRELUDE_INCLUDE_FILE = "public/emboss_prelude.h"
+
+
+def _get_module_namespace(module):
+ """Returns the C++ namespace of the module, as a list of components.
+
+ Arguments:
+ module: The IR of an Emboss module whose namespace should be returned.
+
+ Returns:
+ A list of strings, one per namespace component. This list can be formatted
+ as appropriate by the caller.
+ """
+ namespace_attr = ir_util.get_attribute(module.attribute, "namespace")
+ if namespace_attr and namespace_attr.string_constant.text:
+ namespace = namespace_attr.string_constant.text
+ else:
+ namespace = "emboss_generated_code"
+ if namespace[0:2] == "::":
+ # If the user explicitly specified the leading "::", trim it off: it will be
+ # re-added later, when the namespace is used as a prefix (as opposed to
+ # "namespace foo { }").
+ namespace = namespace[2:]
+ namespace_list = namespace.split("::")
+ for namespace_component in namespace_list:
+ assert re.match("[a-zA-Z_][a-zA-Z0-9_]*$", namespace_component), (
+ "Bad namespace '{}'".format(namespace))
+ assert namespace_component not in _CPP_RESERVED_WORDS, (
+ "Reserved word '{}' is not allowed as a namespace component.".format(
+ namespace_component))
+ return namespace_list
+
+
+def _cpp_string_escape(string):
+ return re.sub("['\"\\\\]", r"\\\g<0>", string)
+
+
+def _get_includes(module):
+ """Returns the appropriate #includes based on module's imports."""
+ includes = []
+ for import_ in module.foreign_import:
+ if import_.file_name.text:
+ includes.append(
+ code_template.format_template(
+ _TEMPLATES.include,
+ file_name=_cpp_string_escape(import_.file_name.text + ".h")))
+ else:
+ includes.append(
+ code_template.format_template(
+ _TEMPLATES.include,
+ file_name=_cpp_string_escape(_PRELUDE_INCLUDE_FILE)))
+ return "".join(includes)
+
+
+def _render_namespace_prefix(namespace):
+ """Returns namespace rendered as a prefix, like ::foo::bar::baz."""
+ return "".join(["::" + n for n in namespace])
+
+
+def _render_integer(value):
+ """Returns a C++ string representation of a constant integer."""
+ integer_type = _cpp_integer_type_for_range(value, value)
+ assert integer_type, ("Bug: value should never be outside [-2**63, 2**64), "
+ "got {}.".format(value))
+ # C++ integer literals are never negative. Negative constants are actually
+ # a non-negative literal with the unary `-` operator applied.
+ #
+ # This means that C++ compilers for 2s-complement systems get finicky about
+ # minimum integers: if you feed `-9223372036854775808` into GCC, with -Wall,
+ # you get:
+ #
+ # warning: integer constant is so large that it is unsigned
+ #
+ # and Clang gives:
+ #
+ # warning: integer literal is too large to be represented in a signed
+ # integer type, interpreting as unsigned [-Wimplicitly-unsigned-literal]
+ #
+ # and MSVC:
+ #
+ # warning C4146: unary minus operator applied to unsigned type, result
+ # still unsigned
+ #
+ # So, workaround #1: -(2**63) must be written `(-9223372036854775807 - 1)`.
+ #
+ # The next problem is that MSVC (but not Clang or GCC) will pick `unsigned`
+ # as the type of a literal like `2147483648`. As far as I can tell, this is a
+ # violation of the C++11 standard, but it's possible that the final standard
+ # has different rules. (MSVC seems to treat decimal literals the way that the
+ # standard says octal and hexadecimal literals should be treated.)
+ #
+ # Luckily, workaround #2: we can unconditionally append `LL` to all constants
+ # to force them to be interpreted as `long long` (or `unsigned long long` for
+ # `ULL`-suffixed constants), and then use a narrowing cast to the appropriate
+ # type, without any warnings on any major compilers.
+ #
+ # TODO(bolms): This suffix computation is kind of a hack.
+ suffix = "U" if "uint" in integer_type else ""
+ if value == -(2**63):
+ return "static_cast</**/{0}>({1}LL - 1)".format(integer_type, -(2**63 - 1))
+ else:
+ return "static_cast</**/{0}>({1}{2}LL)".format(integer_type, value, suffix)
+
+
+def _maybe_type(wrapped_type):
+ """Returns the C++ Maybe<> type that wraps wrapped_type."""
+ return "::emboss::support::Maybe</**/{}>".format(wrapped_type)
+
+
+def _render_integer_for_expression(value):
+ integer_type = _cpp_integer_type_for_range(value, value)
+ return "{0}({1})".format(_maybe_type(integer_type), _render_integer(value))
+
+
+def _wrap_in_namespace(body, namespace):
+ """Returns the given body wrapped in the given namespace."""
+ for component in reversed(namespace):
+ body = code_template.format_template(_TEMPLATES.namespace_wrap,
+ component=component,
+ body=body) + "\n"
+ return body
+
+
+def _get_type_size(type_ir, ir):
+ size = ir_util.fixed_size_of_type_in_bits(type_ir, ir)
+ assert size is not None, (
+ "_get_type_size should only be called for constant-sized types.")
+ return size
+
+
+def _offset_storage_adapter(buffer_type, alignment, static_offset):
+ return "{}::template OffsetStorageType<{}, {}>".format(
+ buffer_type, alignment, static_offset)
+
+
+def _bytes_to_bits_convertor(buffer_type, byte_order, size):
+ assert byte_order, "byte_order should not be empty."
+ return "{}::BitBlock</**/{}::{}ByteOrderer<typename {}>, {}>".format(
+ _SUPPORT_NAMESPACE,
+ _SUPPORT_NAMESPACE,
+ byte_order,
+ buffer_type,
+ size)
+
+
+def _get_fully_qualified_namespace(name, ir):
+ module = ir_util.find_object((name.module_file,), ir)
+ namespace = _render_namespace_prefix(_get_module_namespace(module))
+ return namespace + "".join(["::" + str(s) for s in name.object_path[:-1]])
+
+
+def _get_unqualified_name(name):
+ return name.object_path[-1]
+
+
+def _get_fully_qualified_name(name, ir):
+ return (_get_fully_qualified_namespace(name, ir) + "::" +
+ _get_unqualified_name(name))
+
+
+def _get_adapted_cpp_buffer_type_for_field(type_definition, size_in_bits,
+ buffer_type, byte_order,
+ parent_addressable_unit):
+ if (parent_addressable_unit == ir_pb2.TypeDefinition.BYTE and
+ type_definition.addressable_unit == ir_pb2.TypeDefinition.BIT):
+ assert byte_order
+ return _bytes_to_bits_convertor(buffer_type, byte_order, size_in_bits)
+ else:
+ assert parent_addressable_unit == type_definition.addressable_unit, (
+ "Addressable unit mismatch: {} vs {}".format(
+ parent_addressable_unit,
+ type_definition.addressable_unit))
+ return buffer_type
+
+
+def _get_cpp_view_type_for_type_definition(
+ type_definition, size, ir, buffer_type, byte_order, parent_addressable_unit,
+ validator):
+ """Returns the C++ type information needed to construct a view.
+
+ Returns the C++ type for a view of the given Emboss TypeDefinition, and the
+ C++ types of its parameters, if any.
+
+ Arguments:
+ type_definition: The ir_pb2.TypeDefinition whose view should be
+ constructed.
+ size: The size, in type_definition.addressable_units, of the instantiated
+ type, or None if it is not known at compile time.
+ ir: The complete IR.
+ buffer_type: The C++ type to be used as the Storage parameter of the view
+ (e.g., "ContiguousBuffer<...>").
+ byte_order: For BIT types which are direct children of BYTE types,
+ "LittleEndian", "BigEndian", or "None". Otherwise, None.
+ parent_addressable_unit: The addressable_unit_size of the structure
+ containing this structure.
+ validator: The name of the validator type to be injected into the view.
+
+ Returns:
+ A tuple of: the C++ view type and a (possibly-empty) list of the C++ types
+ of Emboss parameters which must be passed to the view's constructor.
+ """
+ adapted_buffer_type = _get_adapted_cpp_buffer_type_for_field(
+ type_definition, size, buffer_type, byte_order, parent_addressable_unit)
+ if type_definition.HasField("external"):
+ # Externals do not (yet) support runtime parameters.
+ return code_template.format_template(
+ _TEMPLATES.external_view_type,
+ namespace=_get_fully_qualified_namespace(
+ type_definition.name.canonical_name, ir),
+ name=_get_unqualified_name(type_definition.name.canonical_name),
+ bits=size,
+ validator=validator,
+ buffer_type=adapted_buffer_type), []
+ elif type_definition.HasField("structure"):
+ parameter_types = []
+ for parameter in type_definition.runtime_parameter:
+ parameter_types.append(
+ _cpp_basic_type_for_expression_type(parameter.type, ir))
+ return code_template.format_template(
+ _TEMPLATES.structure_view_type,
+ namespace=_get_fully_qualified_namespace(
+ type_definition.name.canonical_name, ir),
+ name=_get_unqualified_name(type_definition.name.canonical_name),
+ buffer_type=adapted_buffer_type), parameter_types
+ elif type_definition.HasField("enumeration"):
+ return code_template.format_template(
+ _TEMPLATES.enum_view_type,
+ support_namespace=_SUPPORT_NAMESPACE,
+ enum_type=_get_fully_qualified_name(type_definition.name.canonical_name,
+ ir),
+ bits=size,
+ validator=validator,
+ buffer_type=adapted_buffer_type), []
+ else:
+ assert False, "Unknown variety of type {}".format(type_definition)
+
+
+def _get_cpp_view_type_for_physical_type(
+ type_ir, size, byte_order, ir, buffer_type, parent_addressable_unit,
+ validator):
+ """Returns the C++ type information needed to construct a field's view.
+
+ Returns the C++ type of an ir_pb2.Type, and the C++ types of its parameters,
+ if any.
+
+ Arguments:
+ type_ir: The ir_pb2.Type whose view should be constructed.
+ size: The size, in type_definition.addressable_units, of the instantiated
+ type, or None if it is not known at compile time.
+ byte_order: For BIT types which are direct children of BYTE types,
+ "LittleEndian", "BigEndian", or "None". Otherwise, None.
+ ir: The complete IR.
+ buffer_type: The C++ type to be used as the Storage parameter of the view
+ (e.g., "ContiguousBuffer<...>").
+ parent_addressable_unit: The addressable_unit_size of the structure
+ containing this type.
+ validator: The name of the validator type to be injected into the view.
+
+ Returns:
+ A tuple of: the C++ type for a view of the given Emboss Type and a list of
+ the C++ types of any parameters of the view type, which should be passed
+ to the view's constructor.
+ """
+ if ir_util.is_array(type_ir):
+ # An array view is parameterized by the element's view type.
+ base_type = type_ir.array_type.base_type
+ element_size_in_bits = _get_type_size(base_type, ir)
+ assert element_size_in_bits, (
+ "TODO(bolms): Implement arrays of dynamically-sized elements.")
+ assert element_size_in_bits % parent_addressable_unit == 0, (
+ "Array elements must fall on byte boundaries.")
+ element_size = element_size_in_bits // parent_addressable_unit
+ element_view_type, element_view_parameter_types, element_view_parameters = (
+ _get_cpp_view_type_for_physical_type(
+ base_type, element_size_in_bits, byte_order, ir,
+ _offset_storage_adapter(buffer_type, element_size, 0),
+ parent_addressable_unit, validator))
+ return (
+ code_template.format_template(
+ _TEMPLATES.array_view_adapter,
+ support_namespace=_SUPPORT_NAMESPACE,
+ # TODO(bolms): The element size should be calculable from the field
+ # size and array length.
+ element_view_type=element_view_type,
+ element_view_parameter_types="".join(
+ ", " + p for p in element_view_parameter_types),
+ element_size=element_size,
+ addressable_unit_size=parent_addressable_unit,
+ buffer_type=buffer_type),
+ element_view_parameter_types,
+ element_view_parameters
+ )
+ else:
+ assert type_ir.HasField("atomic_type")
+ reference = type_ir.atomic_type.reference
+ referenced_type = ir_util.find_object(reference, ir)
+ if parent_addressable_unit > referenced_type.addressable_unit:
+ assert byte_order, repr(type_ir)
+ reader, parameter_types = _get_cpp_view_type_for_type_definition(
+ referenced_type, size, ir, buffer_type, byte_order,
+ parent_addressable_unit, validator)
+ return reader, parameter_types, list(type_ir.atomic_type.runtime_parameter)
+
+
+def _render_variable(variable, prefix=""):
+ """Renders a variable reference (e.g., `foo` or `foo.bar.baz`) in C++ code."""
+ # A "variable" could be an immediate field or a subcomponent of an immediate
+ # field. For either case, in C++ it is valid to just use the last component
+ # of the name; it is not necessary to qualify the method with the type.
+ components = []
+ for component in variable:
+ components.append(_cpp_field_name(component[-1]) + "()")
+ components[-1] = prefix + components[-1]
+ return ".".join(components)
+
+
+def _render_enum_value(enum_type, ir):
+ cpp_enum_type = _get_fully_qualified_name(enum_type.name.canonical_name, ir)
+ return "{}(static_cast</**/{}>({}))".format(
+ _maybe_type(cpp_enum_type), cpp_enum_type, enum_type.value)
+
+
+def _builtin_function_name(function):
+ """Returns the C++ operator name corresponding to an Emboss operator."""
+ functions = {
+ ir_pb2.Function.ADDITION: "Sum",
+ ir_pb2.Function.SUBTRACTION: "Difference",
+ ir_pb2.Function.MULTIPLICATION: "Product",
+ ir_pb2.Function.EQUALITY: "Equal",
+ ir_pb2.Function.INEQUALITY: "NotEqual",
+ ir_pb2.Function.AND: "And",
+ ir_pb2.Function.OR: "Or",
+ ir_pb2.Function.LESS: "LessThan",
+ ir_pb2.Function.LESS_OR_EQUAL: "LessThanOrEqual",
+ ir_pb2.Function.GREATER: "GreaterThan",
+ ir_pb2.Function.GREATER_OR_EQUAL: "GreaterThanOrEqual",
+ ir_pb2.Function.CHOICE: "Choice",
+ ir_pb2.Function.MAXIMUM: "Maximum",
+ }
+ return functions[function]
+
+
+def _cpp_basic_type_for_expression_type(expression_type, ir):
+ """Returns the C++ basic type (int32_t, bool, etc.) for an ExpressionType."""
+ if expression_type.WhichOneof("type") == "integer":
+ return _cpp_integer_type_for_range(
+ int(expression_type.integer.minimum_value),
+ int(expression_type.integer.maximum_value))
+ elif expression_type.WhichOneof("type") == "boolean":
+ return "bool"
+ elif expression_type.WhichOneof("type") == "enumeration":
+ return _get_fully_qualified_name(
+ expression_type.enumeration.name.canonical_name, ir)
+ else:
+ assert False, "Unknown expression type {}".format(
+ expression_type.WhichOneof("type"))
+
+
+def _cpp_basic_type_for_expression(expression, ir):
+ """Returns the C++ basic type (int32_t, bool, etc.) for an Expression."""
+ return _cpp_basic_type_for_expression_type(expression.type, ir)
+
+
+def _cpp_integer_type_for_range(min_val, max_val):
+ """Returns the appropriate C++ integer type to hold min_val up to max_val."""
+ # The choice of int32_t, uint32_t, int64_t, then uint64_t is somewhat
+ # arbitrary here, and might not be ideal. I (bolms@) have chosen
+ # this set of types to a) minimize the number of casts that occur in
+ # arithmetic expressions, and b) favor 32-bit arithmetic, which is mostly
+ # "cheapest" on current (2018) systems. Signed integers are also preferred
+ # over unsigned so that the C++ compiler can take advantage of undefined
+ # overflow.
+ for size in (32, 64):
+ if min_val >= -(2**(size - 1)) and max_val <= 2**(size - 1) - 1:
+ return "::std::int{}_t".format(size)
+ elif min_val >= 0 and max_val <= 2**size - 1:
+ return "::std::uint{}_t".format(size)
+ return None
+
+
+def _render_builtin_operation(expression, ir, field_reader):
+ """Renders a built-in operation (+, -, &&, etc.) into C++ code."""
+ assert expression.function.function not in (
+ ir_pb2.Function.UPPER_BOUND, ir_pb2.Function.LOWER_BOUND), (
+ "UPPER_BOUND and LOWER_BOUND should be constant.")
+ if expression.function.function == ir_pb2.Function.PRESENCE:
+ return field_reader.render_existence(expression.function.args[0])
+ args = expression.function.args
+ rendered_args = [
+ _render_expression(arg, ir, field_reader).rendered for arg in args]
+ minimum_integers = []
+ maximum_integers = []
+ enum_types = set()
+ have_boolean_types = False
+ for subexpression in [expression] + list(args):
+ if subexpression.type.WhichOneof("type") == "integer":
+ minimum_integers.append(int(subexpression.type.integer.minimum_value))
+ maximum_integers.append(int(subexpression.type.integer.maximum_value))
+ elif subexpression.type.WhichOneof("type") == "enumeration":
+ enum_types.add(_cpp_basic_type_for_expression(subexpression, ir))
+ elif subexpression.type.WhichOneof("type") == "boolean":
+ have_boolean_types = True
+ # At present, all Emboss functions other than `$has` take and return one of
+ # the following:
+ #
+ # integers
+ # integers and booleans
+ # a single enum type
+ # a single enum type and booleans
+ # booleans
+ #
+ # Really, the intermediate type is only necessary for integers, but it
+ # simplifies the C++ somewhat if the appropriate enum/boolean type is provided
+ # as "IntermediateT" -- it means that, e.g., the choice ("?:") operator does
+ # not have to have two versions, one of which casts (some of) its arguments to
+ # IntermediateT, and one of which does not.
+ #
+ # This is not a particularly robust scheme, but it works for all of the Emboss
+ # functions I (bolms@) have written and am considering (division, modulus,
+ # exponentiation, logical negation, bit shifts, bitwise and/or/xor, $min,
+ # $floor, $ceil, $has).
+ if minimum_integers and not enum_types:
+ intermediate_type = _cpp_integer_type_for_range(min(minimum_integers),
+ max(maximum_integers))
+ elif len(enum_types) == 1 and not minimum_integers:
+ intermediate_type = list(enum_types)[0]
+ else:
+ assert have_boolean_types
+ assert not enum_types
+ assert not minimum_integers
+ intermediate_type = "bool"
+ arg_types = [_cpp_basic_type_for_expression(arg, ir) for arg in args]
+ result_type = _cpp_basic_type_for_expression(expression, ir)
+ function_variant = "</**/{}, {}, {}>".format(
+ intermediate_type, result_type, ", ".join(arg_types))
+ return "::emboss::support::{}{}({})".format(
+ _builtin_function_name(expression.function.function),
+ function_variant, ", ".join(rendered_args))
+
+
+class _FieldRenderer(object):
+ """Base class for rendering field reads."""
+
+ def render_field_read_with_context(self, expression, ir, prefix):
+ expression_cpp_type = _cpp_basic_type_for_expression(expression, ir)
+ return ("({0}{1}.Ok()"
+ " ? {2}(static_cast</**/{3}>({0}{1}.UncheckedRead()))"
+ " : {2}())".format(
+ prefix,
+ _render_variable(ir_util.hashable_form_of_field_reference(
+ expression.field_reference)),
+ _maybe_type(expression_cpp_type),
+ expression_cpp_type))
+
+ def render_existence_with_context(self, expression, prefix):
+ return "{1}{0}".format(
+ _render_variable(
+ ir_util.hashable_form_of_field_reference(
+ expression.field_reference),
+ "has_"),
+ prefix)
+
+
+class _DirectFieldRenderer(_FieldRenderer):
+ """Renderer for fields read from inside a structure's View type."""
+
+ def render_field(self, expression, ir):
+ return self.render_field_read_with_context(expression, ir, "")
+
+ def render_existence(self, expression):
+ return self.render_existence_with_context(expression, "")
+
+
+class _VirtualViewFieldRenderer(_FieldRenderer):
+ """Renderer for field reads from inside a virtual field's View."""
+
+ def render_existence(self, expression):
+ return self.render_existence_with_context(expression, "view_.")
+
+ def render_field(self, expression, ir):
+ return self.render_field_read_with_context(expression, ir, "view_.")
+
+
+_ExpressionResult = collections.namedtuple("ExpressionResult",
+ ["rendered", "is_constant"])
+
+
+def _render_expression(expression, ir, field_reader=None):
+ """Renders an expression into C++ code.
+
+ Arguments:
+ expression: The expression to render.
+ ir: The IR in which to look up references.
+ field_reader: An object with render_existence and render_field methods
+ appropriate for the C++ context of the expression.
+
+ Returns:
+ An _ExpressionResult of (rendered, is_constant), where rendered is C++ code
+ that can be emitted, and is_constant is True if the expression is a
+ compile-time constant suitable for use in a C++11 constexpr context,
+ otherwise False.
+ """
+ if field_reader is None:
+ field_reader = _DirectFieldRenderer()
+
+ # If the expression is constant, there are no guarantees that subexpressions
+ # will fit into C++ types, or that operator arguments and return types can fit
+ # in the same type: expressions like `-0x8000_0000_0000_0000` and
+ # `0x1_0000_0000_0000_0000 - 1` can appear.
+ if expression.type.WhichOneof("type") == "integer":
+ if expression.type.integer.modulus == "infinity":
+ return _ExpressionResult(_render_integer_for_expression(int(
+ expression.type.integer.modular_value)), True)
+ elif expression.type.WhichOneof("type") == "boolean":
+ if expression.type.boolean.HasField("value"):
+ if expression.type.boolean.value:
+ return _ExpressionResult(_maybe_type("bool") + "(true)", True)
+ else:
+ return _ExpressionResult(_maybe_type("bool") + "(false)", True)
+ elif expression.type.WhichOneof("type") == "enumeration":
+ if expression.type.enumeration.HasField("value"):
+ return _ExpressionResult(
+ _render_enum_value(expression.type.enumeration, ir), True)
+ else:
+ # There shouldn't be any "opaque" type expressions here.
+ assert False, "Unhandled expression type {}".format(
+ expression.type.WhichOneof("type"))
+
+ # Otherwise, render the operation.
+ if expression.WhichOneof("expression") == "function":
+ return _ExpressionResult(
+ _render_builtin_operation(expression, ir, field_reader), False)
+ elif expression.WhichOneof("expression") == "field_reference":
+ return _ExpressionResult(field_reader.render_field(expression, ir), False)
+ elif (expression.WhichOneof("expression") == "builtin_reference" and
+ expression.builtin_reference.canonical_name.object_path[-1] ==
+ "$logical_value"):
+ return _ExpressionResult(
+ _maybe_type("decltype(emboss_reserved_local_value)") +
+ "(emboss_reserved_local_value)", False)
+ # Any of the constant expression types should have been handled in the
+ # previous section.
+
+ assert False, "Unable to render expression {}".format(str(expression))
+
+
+def _render_existence_test(field, ir):
+ return _render_expression(field.existence_condition, ir)
+
+
+def _alignment_of_location(location):
+ """Returns the (alignment, static offset) of location's start."""
+ constraints = location.start.type.integer
+ if constraints.modulus == "infinity":
+ # The C++ templates use 0 as a sentinel value meaning infinity for
+ # alignment.
+ return 0, constraints.modular_value
+ else:
+ return constraints.modulus, constraints.modular_value
+
+
+def _get_cpp_type_reader_of_field(field_ir, ir, buffer_type, validator,
+ parent_addressable_unit):
+ """Returns the C++ view type for a field."""
+ field_size = None
+ if field_ir.type.HasField("size_in_bits"):
+ field_size = ir_util.constant_value(field_ir.type.size_in_bits)
+ assert field_size is not None
+ elif ir_util.is_constant(field_ir.location.size):
+ # TODO(bolms): Normalize the IR so that this clause is unnecessary.
+ field_size = (ir_util.constant_value(field_ir.location.size) *
+ parent_addressable_unit)
+ byte_order_attr = ir_util.get_attribute(field_ir.attribute, "byte_order")
+ if byte_order_attr:
+ byte_order = byte_order_attr.string_constant.text
+ else:
+ byte_order = ""
+ field_alignment, field_offset = _alignment_of_location(field_ir.location)
+ return _get_cpp_view_type_for_physical_type(
+ field_ir.type, field_size, byte_order, ir,
+ _offset_storage_adapter(buffer_type, field_alignment, field_offset),
+ parent_addressable_unit, validator)
+
+
+def _generate_structure_field_methods(enclosing_type_name, field_ir, ir,
+ parent_addressable_unit):
+ if ir_util.field_is_virtual(field_ir):
+ return _generate_structure_virtual_field_methods(
+ enclosing_type_name, field_ir, ir)
+ else:
+ return _generate_structure_physical_field_methods(
+ enclosing_type_name, field_ir, ir, parent_addressable_unit)
+
+
+def _generate_custom_validator_expression_for(field_ir, ir):
+ """Returns a validator expression for the given field, or None."""
+ requires_attr = ir_util.get_attribute(field_ir.attribute, "requires")
+ if requires_attr:
+ class _ValidatorFieldReader(object):
+ """A "FieldReader" that translates the current field to `value`."""
+
+ def render_existence(self, expression):
+ del expression # Unused.
+ assert False, "Shouldn't be here."
+
+ def render_field(self, expression, ir):
+ assert len(expression.field_reference.path) == 1
+ assert (expression.field_reference.path[0].canonical_name ==
+ field_ir.name.canonical_name)
+ expression_cpp_type = _cpp_basic_type_for_expression(expression, ir)
+ return "{}(emboss_reserved_local_value)".format(
+ _maybe_type(expression_cpp_type))
+
+ validation_body = _render_expression(requires_attr.expression, ir,
+ _ValidatorFieldReader())
+ return validation_body.rendered
+ else:
+ return None
+
+
+def _generate_validator_expression_for(field_ir, ir):
+ """Returns a validator expression for the given field."""
+ result = _generate_custom_validator_expression_for(field_ir, ir)
+ if result is None:
+ return "::emboss::support::Maybe<bool>(true)"
+ return result
+
+
+def _generate_structure_virtual_field_methods(enclosing_type_name, field_ir,
+ ir):
+ """Generates C++ code for methods for a single virtual field.
+
+ Arguments:
+ enclosing_type_name: The text name of the enclosing type.
+ field_ir: The IR for the field to generate methods for.
+ ir: The full IR for the module.
+
+ Returns:
+ A tuple of ("", declarations, definitions). The declarations can be
+ inserted into the class definition for the enclosing type's View. Any
+ definitions should be placed after the class definition. These are
+ separated to satisfy C++'s declaration-before-use requirements.
+ """
+ if field_ir.write_method.WhichOneof("method") == "alias":
+ return _generate_field_indirection(field_ir, enclosing_type_name, ir)
+
+ read_value = _render_expression(
+ field_ir.read_transform, ir,
+ field_reader=_VirtualViewFieldRenderer())
+ field_exists = _render_existence_test(field_ir, ir)
+ logical_type = _cpp_basic_type_for_expression(field_ir.read_transform, ir)
+
+ if read_value.is_constant and field_exists.is_constant:
+ declaration_template = (
+ _TEMPLATES.structure_single_const_virtual_field_method_declarations)
+ definition_template = (
+ _TEMPLATES.structure_single_const_virtual_field_method_definitions)
+ else:
+ declaration_template = (
+ _TEMPLATES.structure_single_virtual_field_method_declarations)
+ definition_template = (
+ _TEMPLATES.structure_single_virtual_field_method_definitions)
+
+ if field_ir.write_method.WhichOneof("method") == "transform":
+ destination = _render_variable(
+ ir_util.hashable_form_of_field_reference(
+ field_ir.write_method.transform.destination))
+ transform = _render_expression(
+ field_ir.write_method.transform.function_body, ir,
+ field_reader=_VirtualViewFieldRenderer()).rendered
+ write_methods = code_template.format_template(
+ _TEMPLATES.structure_single_virtual_field_write_methods,
+ logical_type=logical_type,
+ destination=destination,
+ transform=transform)
+ else:
+ write_methods = ""
+
+ name = field_ir.name.canonical_name.object_path[-1]
+ if name.startswith("$"):
+ name = _cpp_field_name(field_ir.name.name.text)
+ virtual_view_type_name = "EmbossReservedDollarVirtual{}View".format(name)
+ else:
+ virtual_view_type_name = "EmbossReservedVirtual{}View".format(
+ name_conversion.snake_to_camel(name))
+ assert logical_type, "Could not find appropriate C++ type for {}".format(
+ field_ir.read_transform)
+ if field_ir.read_transform.type.WhichOneof("type") == "integer":
+ write_to_text_stream_function = "WriteIntegerViewToTextStream"
+ elif field_ir.read_transform.type.WhichOneof("type") == "boolean":
+ write_to_text_stream_function = "WriteBooleanViewToTextStream"
+ elif field_ir.read_transform.type.WhichOneof("type") == "enumeration":
+ write_to_text_stream_function = "WriteEnumViewToTextStream"
+ else:
+ assert False, "Unexpected read-only virtual field type {}".format(
+ field_ir.read_transform.type.WhichOneof("type"))
+
+ value_is_ok = _generate_validator_expression_for(field_ir, ir)
+ declaration = code_template.format_template(
+ declaration_template,
+ visibility=_visibility_for_field(field_ir),
+ name=name,
+ virtual_view_type_name=virtual_view_type_name,
+ logical_type=logical_type,
+ read_value=read_value.rendered,
+ write_to_text_stream_function=write_to_text_stream_function,
+ parent_type=enclosing_type_name,
+ write_methods=write_methods,
+ value_is_ok=value_is_ok)
+ definition = code_template.format_template(
+ definition_template,
+ name=name,
+ virtual_view_type_name=virtual_view_type_name,
+ logical_type=logical_type,
+ read_value=read_value.rendered,
+ parent_type=enclosing_type_name,
+ field_exists=field_exists.rendered)
+ return "", declaration, definition
+
+
+def _generate_validator_type_for(enclosing_type_name, field_ir, ir):
+ """Returns a validator type name and definition for the given field."""
+ result_expression = _generate_custom_validator_expression_for(field_ir, ir)
+ if result_expression is None:
+ return "::emboss::support::AllValuesAreOk", ""
+
+ field_name = field_ir.name.canonical_name.object_path[-1]
+ validator_type_name = "EmbossReservedValidatorFor{}".format(
+ name_conversion.snake_to_camel(field_name))
+ qualified_validator_type_name = "{}::{}".format(enclosing_type_name,
+ validator_type_name)
+
+ validator_declaration = code_template.format_template(
+ _TEMPLATES.structure_field_validator,
+ name=validator_type_name,
+ expression=result_expression,
+ )
+ validator_declaration = _wrap_in_namespace(validator_declaration,
+ [enclosing_type_name])
+ return qualified_validator_type_name, validator_declaration
+
+
+def _generate_structure_physical_field_methods(enclosing_type_name, field_ir,
+ ir, parent_addressable_unit):
+ """Generates C++ code for methods for a single physical field.
+
+ Arguments:
+ enclosing_type_name: The text name of the enclosing type.
+ field_ir: The IR for the field to generate methods for.
+ ir: The full IR for the module.
+ parent_addressable_unit: The addressable unit (BIT or BYTE) of the enclosing
+ structure.
+
+ Returns:
+ A tuple of (validator_declaration, declarations, definitions). The
+ validator declaration is empty if the field has no custom validator. The
+ declarations can be inserted into the class definition for the enclosing
+ type's View. Any definitions should be placed after the class definition.
+ These are separated to satisfy C++'s declaration-before-use requirements.
+ """
+ validator_type, validator_declaration = _generate_validator_type_for(
+ enclosing_type_name, field_ir, ir)
+
+ type_reader, unused_parameter_types, parameter_expressions = (
+ _get_cpp_type_reader_of_field(field_ir, ir, "Storage", validator_type,
+ parent_addressable_unit))
+
+ field_name = field_ir.name.canonical_name.object_path[-1]
+ parameter_values = []
+ parameters_known = []
+ for parameter in parameter_expressions:
+ parameter_cpp_expr = _render_expression(parameter, ir)
+ parameter_values.append(
+ "{}.ValueOrDefault(), ".format(parameter_cpp_expr.rendered))
+ parameters_known.append(
+ "{}.Known() && ".format(parameter_cpp_expr.rendered))
+ field_alignment, field_offset = _alignment_of_location(field_ir.location)
+ declaration = code_template.format_template(
+ _TEMPLATES.structure_single_field_method_declarations,
+ type_reader=type_reader,
+ visibility=_visibility_for_field(field_ir),
+ name=field_name)
+ definition = code_template.format_template(
+ _TEMPLATES.structure_single_field_method_definitions,
+ parent_type=enclosing_type_name,
+ name=field_name,
+ type_reader=type_reader,
+ offset=_render_expression(field_ir.location.start, ir).rendered,
+ size=_render_expression(field_ir.location.size, ir).rendered,
+ field_exists=_render_existence_test(field_ir, ir).rendered,
+ alignment=field_alignment,
+ parameters_known="".join(parameters_known),
+ parameter_values="".join(parameter_values),
+ static_offset=field_offset)
+ return validator_declaration, declaration, definition
+
+
+def _render_size_method(fields, ir):
+ """Renders the Size methods of a struct or bits, using the correct templates.
+
+ Arguments:
+ fields: The list of fields in the struct or bits. This is used to find the
+ $size_in_bits or $size_in_bytes virtual field.
+ ir: The IR to which fields belong.
+
+ Returns:
+ A string representation of the Size methods, suitable for inclusion in an
+ Emboss View class.
+ """
+ # The SizeInBytes(), SizeInBits(), and SizeIsKnown() methods just forward to
+ # the generated IntrinsicSizeIn$_units_$() method, which returns a virtual
+ # field with Read() and Ok() methods.
+ #
+ # TODO(bolms): Remove these shims, rename IntrinsicSizeIn$_units_$ to
+ # SizeIn$_units_$, and update all callers to the new API.
+ for field in fields:
+ if field.name.name.text in ("$size_in_bits", "$size_in_bytes"):
+ # If the read_transform and existence_condition are constant, then the
+ # size is constexpr.
+ if (_render_expression(field.read_transform, ir).is_constant and
+ _render_expression(field.existence_condition, ir).is_constant):
+ template = _TEMPLATES.constant_structure_size_method
+ else:
+ template = _TEMPLATES.runtime_structure_size_method
+ return code_template.format_template(
+ template,
+ units="Bits" if field.name.name.text == "$size_in_bits" else "Bytes")
+ assert False, "Expected a $size_in_bits or $size_in_bytes field."
+
+
+def _visibility_for_field(field_ir):
+ """Returns the C++ visibility for field_ir within its parent view."""
+ # Generally, the Google style guide for hand-written C++ forbids having
+ # multiple public: and private: sections, but trying to conform to that bit of
+ # the style guide would make this file significantly more complex.
+ #
+ # Alias fields are generated as simple methods that forward directly to the
+ # aliased field's method:
+ #
+ # auto alias() const -> decltype(parent().child().aliased_subchild()) {
+ # return parent().child().aliased_subchild();
+ # }
+ #
+ # Figuring out the return type of `parent().child().aliased_subchild()` is
+ # quite complex, since there are several levels of template indirection
+ # involved. It is much easier to just leave it up to the C++ compiler.
+ #
+ # Unfortunately, the C++ compiler will complain if `parent()` is not declared
+ # before `alias()`. If the `parent` field happens to be anonymous, the Google
+ # style guide would put `parent()`'s declaration after `alias()`'s
+ # declaration, which causes the C++ compiler to complain that `parent` is
+ # unknown.
+ #
+ # The easy fix to this is just to declare `parent()` before `alias()`, and
+ # explicitly mark `parent()` as `private` and `alias()` as `public`.
+ #
+ # Perhaps surprisingly, this limitation does not apply when `parent()`'s type
+ # is not yet complete at the point where `alias()` is declared; I believe this
+ # is because both `parent()` and `alias()` exist in a templated `class`, and
+ # by the time `parent().child().aliased_subchild()` is actually resolved, the
+ # compiler is instantiating the class and has the full definitions of all the
+ # other classes available.
+ if field_ir.name.is_anonymous:
+ return "private"
+ else:
+ return "public"
+
+
+def _generate_field_indirection(field_ir, parent_type_name, ir):
+  """Renders a method which forwards to a field's view."""
+  rendered_aliased_field = _render_variable(
+      ir_util.hashable_form_of_field_reference(field_ir.write_method.alias))
+  declaration = code_template.format_template(
+      _TEMPLATES.structure_single_field_indirect_method_declarations,
+      aliased_field=rendered_aliased_field,
+      visibility=_visibility_for_field(field_ir),
+      parent_type=parent_type_name,
+      name=field_ir.name.name.text)
+  definition = code_template.format_template(
+      _TEMPLATES.struct_single_field_indirect_method_definitions,
+      parent_type=parent_type_name,
+      name=field_ir.name.name.text,
+      aliased_field=rendered_aliased_field,
+      field_exists=_render_existence_test(field_ir, ir).rendered)
+  return "", declaration, definition
+
+
+def _generate_subtype_definitions(type_ir, ir):
+  """Generates C++ code for subtypes of type_ir."""
+  subtype_bodies = []
+  subtype_forward_declarations = []
+  subtype_method_definitions = []
+  type_name = type_ir.name.name.text
+  for subtype in type_ir.subtype:
+    inner_defs = _generate_type_definition(subtype, ir)
+    subtype_forward_declaration, subtype_body, subtype_methods = inner_defs
+    subtype_forward_declarations.append(subtype_forward_declaration)
+    subtype_bodies.append(subtype_body)
+    subtype_method_definitions.append(subtype_methods)
+  wrapped_forward_declarations = _wrap_in_namespace(
+      "\n".join(subtype_forward_declarations), [type_name])
+  wrapped_bodies = _wrap_in_namespace("\n".join(subtype_bodies), [type_name])
+  wrapped_method_definitions = _wrap_in_namespace(
+      "\n".join(subtype_method_definitions), [type_name])
+  return (wrapped_bodies, wrapped_forward_declarations,
+          wrapped_method_definitions)
+
+
+def _cpp_field_name(name):
+  """Returns the C++ name for the given field name."""
+  if name.startswith("$"):
+    dollar_field_names = {
+        "$size_in_bits": "IntrinsicSizeInBits",
+        "$size_in_bytes": "IntrinsicSizeInBytes",
+        "$max_size_in_bits": "MaxSizeInBits",
+        "$min_size_in_bits": "MinSizeInBits",
+        "$max_size_in_bytes": "MaxSizeInBytes",
+        "$min_size_in_bytes": "MinSizeInBytes",
+    }
+    return dollar_field_names[name]
+  else:
+    return name
+
+
+def _generate_structure_definition(type_ir, ir):
+  """Generates C++ for an Emboss structure (struct or bits).
+
+  Arguments:
+    type_ir: The IR for the struct definition.
+    ir: The full IR; used for type lookups.
+
+  Returns:
+    A tuple of: (forward declaration for classes, class bodies, method bodies),
+    suitable for insertion into the appropriate places in the generated header.
+  """
+  subtype_bodies, subtype_forward_declarations, subtype_method_definitions = (
+      _generate_subtype_definitions(type_ir, ir))
+  type_name = type_ir.name.name.text
+  field_helper_type_definitions = []
+  field_method_declarations = []
+  field_method_definitions = []
+  virtual_field_type_definitions = []
+  decode_field_clauses = []
+  write_field_clauses = []
+  ok_method_clauses = []
+  equals_method_clauses = []
+  unchecked_equals_method_clauses = []
+  enum_using_statements = []
+  parameter_fields = []
+  constructor_parameters = []
+  forwarded_parameters = []
+  parameter_initializers = []
+  units = {1: "Bits", 8: "Bytes"}[type_ir.addressable_unit]
+
+  for subtype in type_ir.subtype:
+    if subtype.HasField("enumeration"):
+      enum_using_statements.append(
+          code_template.format_template(
+              _TEMPLATES.enum_using_statement,
+              component=_get_fully_qualified_name(subtype.name.canonical_name,
+                                                  ir),
+              name=_get_unqualified_name(subtype.name.canonical_name)))
+
+  # TODO(bolms): Reorder parameter fields to optimize packing in the view type.
+  for parameter in type_ir.runtime_parameter:
+    parameter_type = _cpp_basic_type_for_expression_type(parameter.type, ir)
+    parameter_name = parameter.name.name.text
+    parameter_fields.append("{} {}_;".format(parameter_type, parameter_name))
+    constructor_parameters.append(
+        "{} {}, ".format(parameter_type, parameter_name))
+    forwarded_parameters.append(
+        "::std::forward<{}>({}),".format(parameter_type, parameter_name))
+    parameter_initializers.append(", {0}_({0})".format(parameter_name))
+    field_method_declarations.append(
+        code_template.format_template(
+            _TEMPLATES.structure_single_parameter_field_method_declarations,
+            name=parameter_name,
+            logical_type=parameter_type))
+    # TODO(bolms): Should parameters appear in text format?
+    equals_method_clauses.append(
+        code_template.format_template(_TEMPLATES.equals_method_test,
+                                      field=parameter_name + "()"))
+    unchecked_equals_method_clauses.append(
+        code_template.format_template(_TEMPLATES.unchecked_equals_method_test,
+                                      field=parameter_name + "()"))
+  if type_ir.runtime_parameter:
+    flag_name = "parameters_initialized_"
+    parameters_initialized_flag = "bool {} = false;".format(flag_name)
+    initialize_parameters_initialized_true = ", {}(true)".format(flag_name)
+    parameter_checks = ["if (!{}) return false;".format(flag_name)]
+  else:
+    parameters_initialized_flag = ""
+    initialize_parameters_initialized_true = ""
+    parameter_checks = [""]
+
+  for field_index in type_ir.structure.fields_in_dependency_order:
+    field = type_ir.structure.field[field_index]
+    helper_types, declaration, definition = (
+        _generate_structure_field_methods(
+            type_name, field, ir, type_ir.addressable_unit))
+    field_helper_type_definitions.append(helper_types)
+    field_method_definitions.append(definition)
+    ok_method_clauses.append(
+        code_template.format_template(
+            _TEMPLATES.ok_method_test,
+            field=_cpp_field_name(field.name.name.text) + "()"))
+    if not ir_util.field_is_virtual(field):
+      # Virtual fields do not participate in equality tests -- they are equal by
+      # definition.
+      equals_method_clauses.append(
+          code_template.format_template(
+              _TEMPLATES.equals_method_test, field=field.name.name.text + "()"))
+      unchecked_equals_method_clauses.append(
+          code_template.format_template(
+              _TEMPLATES.unchecked_equals_method_test,
+              field=field.name.name.text + "()"))
+    field_method_declarations.append(declaration)
+    if not field.name.is_anonymous and not ir_util.field_is_read_only(field):
+      # As above, read-only fields cannot be decoded from text format.
+      decode_field_clauses.append(
+          code_template.format_template(
+              _TEMPLATES.decode_field,
+              field_name=field.name.canonical_name.object_path[-1]))
+    text_output_attr = ir_util.get_attribute(field.attribute, "text_output")
+    if not text_output_attr or text_output_attr.string_constant == "Emit":
+      if ir_util.field_is_read_only(field):
+        write_field_template = _TEMPLATES.write_read_only_field_to_text_stream
+      else:
+        write_field_template = _TEMPLATES.write_field_to_text_stream
+      write_field_clauses.append(
+          code_template.format_template(
+              write_field_template,
+              field_name=field.name.canonical_name.object_path[-1]))
+
+  requires_attr = ir_util.get_attribute(type_ir.attribute, "requires")
+  if requires_attr is not None:
+    requires_clause = _render_expression(
+        requires_attr.expression, ir, _DirectFieldRenderer()).rendered
+    requires_check = " if (!({}).ValueOr(false))\n return false;".format(
+        requires_clause)
+  else:
+    requires_check = ""
+
+  class_forward_declarations = code_template.format_template(
+      _TEMPLATES.structure_view_declaration,
+      name=type_name)
+  class_bodies = code_template.format_template(
+      _TEMPLATES.structure_view_class,
+      name=type_ir.name.canonical_name.object_path[-1],
+      size_method=_render_size_method(type_ir.structure.field, ir),
+      field_method_declarations="".join(field_method_declarations),
+      field_ok_checks="\n".join(ok_method_clauses),
+      parameter_ok_checks="\n".join(parameter_checks),
+      requires_check=requires_check,
+      equals_method_body="\n".join(equals_method_clauses),
+      unchecked_equals_method_body="\n".join(unchecked_equals_method_clauses),
+      decode_fields="\n".join(decode_field_clauses),
+      enum_usings="\n".join(enum_using_statements),
+      write_fields="\n".join(write_field_clauses),
+      parameter_fields="\n".join(parameter_fields),
+      constructor_parameters="".join(constructor_parameters),
+      forwarded_parameters="".join(forwarded_parameters),
+      parameter_initializers="\n".join(parameter_initializers),
+      parameters_initialized_flag=parameters_initialized_flag,
+      initialize_parameters_initialized_true=(
+          initialize_parameters_initialized_true),
+      units=units)
+  method_definitions = "\n".join(field_method_definitions)
+  early_virtual_field_types = "\n".join(virtual_field_type_definitions)
+  all_field_helper_type_definitions = "\n".join(field_helper_type_definitions)
+  return (early_virtual_field_types + subtype_forward_declarations +
+          class_forward_declarations,
+          all_field_helper_type_definitions + subtype_bodies + class_bodies,
+          subtype_method_definitions + method_definitions)
+
+
+def _generate_enum_definition(type_ir):
+  """Generates C++ for an Emboss enum."""
+  enum_values = []
+  enum_from_string_statements = []
+  string_from_enum_statements = []
+  enum_is_known_statements = []
+  previously_seen_numeric_values = set()
+  # Because enum types in Emboss allow unknown values, the C++ enum has to be
+  # based on uint64_t or int64_t; otherwise, if the enum is used on a 64-bit
+  # field anywhere in any structure, then the return type of Read() (et al)
+  # would be too small to hold the full range of values.
+  #
+  # TODO(bolms): Should Emboss have a way to annotate enums as "known values
+  # only" or "32-bit only", so that the C++ enum can be 32 bits (or smaller)?
+  #
+  # TODO(bolms): Should the default type be int64_t?
+  enum_type = "::std::uint64_t"
+  for value in type_ir.enumeration.value:
+    numeric_value = ir_util.constant_value(value.value)
+    if numeric_value < 0:
+      enum_type = "::std::int64_t"
+    enum_values.append(
+        code_template.format_template(_TEMPLATES.enum_value,
+                                      name=value.name.name.text,
+                                      value=_render_integer(numeric_value)))
+    enum_from_string_statements.append(
+        code_template.format_template(_TEMPLATES.enum_from_name_case,
+                                      enum=type_ir.name.name.text,
+                                      name=value.name.name.text))
+    if numeric_value not in previously_seen_numeric_values:
+      string_from_enum_statements.append(
+          code_template.format_template(_TEMPLATES.name_from_enum_case,
+                                        enum=type_ir.name.name.text,
+                                        name=value.name.name.text))
+      enum_is_known_statements.append(
+          code_template.format_template(_TEMPLATES.enum_is_known_case,
+                                        enum=type_ir.name.name.text,
+                                        name=value.name.name.text))
+    previously_seen_numeric_values.add(numeric_value)
+  return (
+      code_template.format_template(
+          _TEMPLATES.enum_declaration,
+          enum=type_ir.name.name.text,
+          enum_type=enum_type),
+      code_template.format_template(
+          _TEMPLATES.enum_definition,
+          enum=type_ir.name.name.text,
+          enum_type=enum_type,
+          enum_values="".join(enum_values),
+          enum_from_name_cases="\n".join(enum_from_string_statements),
+          name_from_enum_cases="\n".join(string_from_enum_statements),
+          enum_is_known_cases="\n".join(enum_is_known_statements)),
+      ""
+  )
+
+
+def _generate_type_definition(type_ir, ir):
+  """Generates C++ for an Emboss type."""
+  if type_ir.HasField("structure"):
+    return _generate_structure_definition(type_ir, ir)
+  elif type_ir.HasField("enumeration"):
+    return _generate_enum_definition(type_ir)
+  elif type_ir.HasField("external"):
+    # TODO(bolms): This should probably generate an #include.
+    return "", "", ""
+  else:
+    # TODO(bolms): provide error message instead of ICE
+    assert False, "Unknown type {}".format(type_ir)
+
+
+def _generate_header_guard(file_path):
+  # TODO(bolms): Make this configurable.
+  header_path = file_path + ".h"
+  uppercased_path = header_path.upper()
+  no_punctuation_path = re.sub(r"[^A-Za-z0-9_]", "_", uppercased_path)
+  no_double_underscore_path = re.sub(r"__+", "_", no_punctuation_path)
+  return no_double_underscore_path + "_"
+
+
+def generate_header(ir):
+  """Generates a C++ header from an Emboss module.
+
+  Arguments:
+    ir: An EmbossIr of the module.
+
+  Returns:
+    A string containing the text of a C++ header which implements Views for the
+    types in the Emboss module.
+  """
+  type_declarations = []
+  type_definitions = []
+  method_definitions = []
+  for type_definition in ir.module[0].type:
+    declaration, definition, methods = _generate_type_definition(
+        type_definition, ir)
+    type_declarations.append(declaration)
+    type_definitions.append(definition)
+    method_definitions.append(methods)
+  body = code_template.format_template(
+      _TEMPLATES.body,
+      type_declarations="".join(type_declarations),
+      type_definitions="".join(type_definitions),
+      method_definitions="".join(method_definitions))
+  body = _wrap_in_namespace(body, _get_module_namespace(ir.module[0]))
+  includes = _get_includes(ir.module[0])
+  return code_template.format_template(
+      _TEMPLATES.outline,
+      includes=includes,
+      body=body,
+      header_guard=_generate_header_guard(ir.module[0].source_file_name))
diff --git a/back_end/cpp/testcode/alignments_test.cc b/back_end/cpp/testcode/alignments_test.cc
new file mode 100644
index 0000000..383e44a
--- /dev/null
+++ b/back_end/cpp/testcode/alignments_test.cc
@@ -0,0 +1,275 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests that generated code properly preserves and propagates alignment
+// information.
+#include <stdint.h>
+
+#include <vector>
+
+#include "public/emboss_cpp_util.h"
+#include "testdata/alignments.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+using ::emboss::support::ContiguousBuffer;
+
+TEST(AlignmentsTest, DirectFieldAlignments) {
+ auto unaligned_view = MakeAlignmentsView<char>(nullptr, 0);
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.zero_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.four_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.twelve_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.three_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.eleven_offset())>::value));
+
+ auto four_aligned_view = MakeAlignedAlignmentsView<char, 4>(nullptr, 0);
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 4, 0>>,
+ decltype(four_aligned_view.zero_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 4, 0>>,
+ decltype(four_aligned_view.four_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 4, 0>>,
+ decltype(four_aligned_view.twelve_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 4, 3>>,
+ decltype(four_aligned_view.three_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 4, 3>>,
+ decltype(four_aligned_view.eleven_offset())>::value));
+
+ auto eight_aligned_view = MakeAlignedAlignmentsView<char, 8>(nullptr, 0);
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 8, 0>>,
+ decltype(eight_aligned_view.zero_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 8, 4>>,
+ decltype(eight_aligned_view.four_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 8, 4>>,
+ decltype(eight_aligned_view.twelve_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 8, 3>>,
+ decltype(eight_aligned_view.three_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 8, 3>>,
+ decltype(eight_aligned_view.eleven_offset())>::value));
+}
+
+TEST(AlignmentsTest, AlignmentReductionAssignment) {
+ alignas(4) unsigned char data[4];
+ auto four_aligned_view = MakeAlignedAlignmentsView<unsigned char, 4>(data, 4);
+ {
+ // Implicit construction.
+ AlignmentsView unaligned_view{four_aligned_view};
+ EXPECT_EQ(data, unaligned_view.BackingStorage().data());
+ }
+ {
+ // Implicit conversion during assignment.
+ AlignmentsView unaligned_view;
+ unaligned_view = four_aligned_view;
+ EXPECT_EQ(data, unaligned_view.BackingStorage().data());
+ }
+}
+
+TEST(AlignmentsTest, ArrayFieldAlignments) {
+ auto unaligned_view = MakeAlignmentsView<char>(nullptr, 0);
+ EXPECT_TRUE(
+ (::std::is_same<
+ GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.zero_offset_four_stride_array()[0])>::value));
+ EXPECT_TRUE(
+ (::std::is_same<
+ GenericPlaceholder6View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.zero_offset_six_stride_array()[0])>::value));
+ EXPECT_TRUE(
+ (::std::is_same<
+ GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(
+ unaligned_view.three_offset_four_stride_array()[0])>::value));
+ EXPECT_TRUE(
+ (::std::is_same<
+ GenericPlaceholder6View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.four_offset_six_stride_array()[0])>::value));
+
+ auto four_aligned_view = MakeAlignedAlignmentsView<char, 4>(nullptr, 0);
+ EXPECT_TRUE(
+ (::std::is_same<
+ GenericPlaceholder4View<ContiguousBuffer<char, 4, 0>>,
+ decltype(
+ four_aligned_view.zero_offset_four_stride_array()[0])>::value));
+ EXPECT_TRUE((
+ ::std::is_same<GenericPlaceholder6View<ContiguousBuffer<char, 2, 0>>,
+ decltype(four_aligned_view
+ .zero_offset_six_stride_array()[0])>::value));
+ EXPECT_TRUE(
+ (::std::is_same<
+ GenericPlaceholder4View<ContiguousBuffer<char, 4, 3>>,
+ decltype(
+ four_aligned_view.three_offset_four_stride_array()[0])>::value));
+ EXPECT_TRUE((
+ ::std::is_same<GenericPlaceholder6View<ContiguousBuffer<char, 2, 0>>,
+ decltype(four_aligned_view
+ .four_offset_six_stride_array()[0])>::value));
+
+ auto eight_aligned_view = MakeAlignedAlignmentsView<char, 8>(nullptr, 0);
+ EXPECT_TRUE(
+ (::std::is_same<
+ GenericPlaceholder4View<ContiguousBuffer<char, 4, 0>>,
+ decltype(
+ eight_aligned_view.zero_offset_four_stride_array()[0])>::value));
+ EXPECT_TRUE((
+ ::std::is_same<GenericPlaceholder6View<ContiguousBuffer<char, 2, 0>>,
+ decltype(eight_aligned_view
+ .zero_offset_six_stride_array()[0])>::value));
+ EXPECT_TRUE(
+ (::std::is_same<
+ GenericPlaceholder4View<ContiguousBuffer<char, 4, 3>>,
+ decltype(
+ eight_aligned_view.three_offset_four_stride_array()[0])>::value));
+ EXPECT_TRUE((
+ ::std::is_same<GenericPlaceholder6View<ContiguousBuffer<char, 2, 0>>,
+ decltype(eight_aligned_view
+ .four_offset_six_stride_array()[0])>::value));
+}
+
+TEST(AlignmentsTest, SubFieldAlignments) {
+ auto unaligned_view = MakeAlignmentsView<char>(nullptr, 0);
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.zero_offset_substructure()
+ .zero_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.zero_offset_substructure()
+ .two_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.two_offset_substructure()
+ .zero_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.two_offset_substructure()
+ .two_offset())>::value));
+
+ auto four_aligned_view = MakeAlignedAlignmentsView<char, 4>(nullptr, 0);
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 4, 0>>,
+ decltype(four_aligned_view.zero_offset_substructure()
+ .zero_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 4, 2>>,
+ decltype(four_aligned_view.zero_offset_substructure()
+ .two_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 4, 2>>,
+ decltype(four_aligned_view.two_offset_substructure()
+ .zero_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 4, 0>>,
+ decltype(four_aligned_view.two_offset_substructure()
+ .two_offset())>::value));
+
+ auto eight_aligned_view = MakeAlignedAlignmentsView<char, 8>(nullptr, 0);
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 8, 0>>,
+ decltype(eight_aligned_view.zero_offset_substructure()
+ .zero_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 8, 2>>,
+ decltype(eight_aligned_view.zero_offset_substructure()
+ .two_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 8, 2>>,
+ decltype(eight_aligned_view.two_offset_substructure()
+ .zero_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 8, 4>>,
+ decltype(eight_aligned_view.two_offset_substructure()
+ .two_offset())>::value));
+}
+
+TEST(AlignmentsTest, ArraySubFieldAlignments) {
+ auto unaligned_view = MakeAlignmentsView<char>(nullptr, 0);
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.zero_offset_six_stride_array()[0]
+ .zero_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.zero_offset_six_stride_array()[0]
+ .two_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.four_offset_six_stride_array()[0]
+ .zero_offset())>::value));
+ EXPECT_TRUE(
+ (::std::is_same<GenericPlaceholder4View<ContiguousBuffer<char, 1, 0>>,
+ decltype(unaligned_view.four_offset_six_stride_array()[0]
+ .two_offset())>::value));
+
+ auto four_aligned_view = MakeAlignedAlignmentsView<char, 4>(nullptr, 0);
+ EXPECT_TRUE((::std::is_same<
+ GenericPlaceholder4View<ContiguousBuffer<char, 2, 0>>,
+ decltype(four_aligned_view.zero_offset_six_stride_array()[0]
+ .zero_offset())>::value));
+ EXPECT_TRUE((::std::is_same<
+ GenericPlaceholder4View<ContiguousBuffer<char, 2, 0>>,
+ decltype(four_aligned_view.zero_offset_six_stride_array()[0]
+ .two_offset())>::value));
+ EXPECT_TRUE((::std::is_same<
+ GenericPlaceholder4View<ContiguousBuffer<char, 2, 0>>,
+ decltype(four_aligned_view.four_offset_six_stride_array()[0]
+ .zero_offset())>::value));
+ EXPECT_TRUE((::std::is_same<
+ GenericPlaceholder4View<ContiguousBuffer<char, 2, 0>>,
+ decltype(four_aligned_view.four_offset_six_stride_array()[0]
+ .two_offset())>::value));
+
+ auto eight_aligned_view = MakeAlignedAlignmentsView<char, 8>(nullptr, 0);
+ EXPECT_TRUE((::std::is_same<
+ GenericPlaceholder4View<ContiguousBuffer<char, 2, 0>>,
+ decltype(eight_aligned_view.zero_offset_six_stride_array()[0]
+ .zero_offset())>::value));
+ EXPECT_TRUE((::std::is_same<
+ GenericPlaceholder4View<ContiguousBuffer<char, 2, 0>>,
+ decltype(eight_aligned_view.zero_offset_six_stride_array()[0]
+ .two_offset())>::value));
+ EXPECT_TRUE((::std::is_same<
+ GenericPlaceholder4View<ContiguousBuffer<char, 2, 0>>,
+ decltype(eight_aligned_view.four_offset_six_stride_array()[0]
+ .zero_offset())>::value));
+ EXPECT_TRUE((::std::is_same<
+ GenericPlaceholder4View<ContiguousBuffer<char, 2, 0>>,
+ decltype(eight_aligned_view.four_offset_six_stride_array()[0]
+ .two_offset())>::value));
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/anonymous_bits_test.cc b/back_end/cpp/testcode/anonymous_bits_test.cc
new file mode 100644
index 0000000..f2a918f
--- /dev/null
+++ b/back_end/cpp/testcode/anonymous_bits_test.cc
@@ -0,0 +1,114 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for generated code for anonymous "bits" types, using
+// anonymous_bits.emb.
+#include <stdint.h>
+
+#include <vector>
+
+#include "public/emboss_cpp_util.h"
+#include <gtest/gtest.h>
+
+#include "testdata/anonymous_bits.emb.h"
+
+namespace emboss {
+namespace test {
+namespace bits {
+namespace {
+
+TEST(AnonymousBits, InnerEnumIsVisibleAtOuterScope) {
+ EXPECT_EQ(static_cast<Foo::Bar>(0), Foo::Bar::BAR);
+}
+
+TEST(AnonymousBits, BitsAreReadable) {
+ alignas(8) uint8_t data[] = {0x01, 0x00, 0x00, 0x80, 0x01, 0x00, 0x80, 0x00};
+ EXPECT_FALSE((FooWriter{data, sizeof data - 1}.Ok()));
+
+ auto foo = MakeAlignedFooView<uint8_t, 8>(data, sizeof data);
+ ASSERT_TRUE(foo.Ok());
+ EXPECT_TRUE(foo.high_bit().Read());
+ EXPECT_TRUE(foo.first_bit().Read());
+ EXPECT_TRUE(foo.bit_23().Read());
+ EXPECT_TRUE(foo.low_bit().Read());
+ foo.first_bit().Write(false);
+ EXPECT_EQ(0, data[0]);
+ foo.bit_23().Write(false);
+ EXPECT_EQ(0, data[6]);
+}
+
+TEST(AnonymousBits, Equals) {
+ alignas(8) uint8_t buf_x[] = {0x01, 0x00, 0x00, 0x80, 0x01, 0x00, 0x80, 0x00};
+ alignas(8) uint8_t buf_y[] = {0x01, 0x00, 0x00, 0x80, 0x01, 0x00, 0x80, 0x00};
+
+ auto x = MakeAlignedFooView<uint8_t, 8>(buf_x, sizeof buf_x);
+ auto x_const = MakeFooView(static_cast<const uint8_t *>(buf_x), sizeof buf_x);
+ auto y = MakeAlignedFooView<uint8_t, 8>(buf_y, sizeof buf_y);
+
+ EXPECT_TRUE(x.Equals(x));
+ EXPECT_TRUE(x.UncheckedEquals(x));
+ EXPECT_TRUE(y.Equals(y));
+ EXPECT_TRUE(y.UncheckedEquals(y));
+
+ EXPECT_TRUE(x.Equals(y));
+ EXPECT_TRUE(x.UncheckedEquals(y));
+ EXPECT_TRUE(y.Equals(x));
+ EXPECT_TRUE(y.UncheckedEquals(x));
+
+ EXPECT_TRUE(x_const.Equals(y));
+ EXPECT_TRUE(x_const.UncheckedEquals(y));
+ EXPECT_TRUE(y.Equals(x_const));
+ EXPECT_TRUE(y.UncheckedEquals(x_const));
+
+ // Changing the second byte of buf_y should have no effect on equality.
+ ++buf_y[1];
+ EXPECT_NE(buf_x[1], buf_y[1]);
+ EXPECT_TRUE(x.Equals(y));
+ EXPECT_TRUE(x.UncheckedEquals(y));
+ EXPECT_TRUE(y.Equals(x));
+ EXPECT_TRUE(y.UncheckedEquals(x));
+
+ ++buf_y[0];
+ EXPECT_FALSE(x.Equals(y));
+ EXPECT_FALSE(x.UncheckedEquals(y));
+ EXPECT_FALSE(y.Equals(x));
+ EXPECT_FALSE(y.UncheckedEquals(x));
+}
+
+TEST(AnonymousBits, WriteToString) {
+ const uint8_t data[] = {0x01, 0x00, 0x00, 0x80, 0x01, 0x00, 0x80, 0x00};
+ auto foo = MakeFooView(data, sizeof data);
+ ASSERT_TRUE(foo.Ok());
+ EXPECT_EQ(
+ "{ high_bit: true, bar: BAR, first_bit: true, bit_23: true, low_bit: "
+ "true }",
+ ::emboss::WriteToString(foo));
+}
+
+TEST(AnonymousBits, ReadFromString) {
+ const uint8_t data[] = {0x01, 0x00, 0x00, 0x80, 0x01, 0x00, 0x80, 0x00};
+ uint8_t data2[] = {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
+ auto foo = MakeFooView(data, sizeof data);
+ auto foo_writer = MakeFooView(data2, sizeof data2);
+ ASSERT_TRUE(foo.Ok());
+ ASSERT_TRUE(foo_writer.Ok());
+ ::emboss::UpdateFromText(foo_writer, ::emboss::WriteToString(foo));
+ EXPECT_EQ(::std::vector<uint8_t>(data, data + sizeof data),
+ ::std::vector<uint8_t>(data2, data2 + sizeof data2));
+}
+
+} // namespace
+} // namespace bits
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/auto_array_size_test.cc b/back_end/cpp/testcode/auto_array_size_test.cc
new file mode 100644
index 0000000..405a66e
--- /dev/null
+++ b/back_end/cpp/testcode/auto_array_size_test.cc
@@ -0,0 +1,374 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for automatically-sized arrays from auto_array_size.emb.
+
+#include <stdint.h>
+
+#include <iterator>
+#include <random>
+#include <vector>
+
+#include "public/emboss_text_util.h"
+#include "testdata/auto_array_size.emb.h"
+#include <gmock/gmock.h>
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+alignas(8) static const uint8_t kAutoSize[22] = {
+ 0x03, // 0:1 array_size == 3
+ 0x10, 0x20, 0x30, 0x40, // 1:5 four_byte_array
+ 0x11, 0x12, 0x21, 0x22, // 5:9 four_struct_array[0, 1]
+ 0x31, 0x32, 0x41, 0x42, // 9:13 four_struct_array[2, 3]
+ 0x50, 0x60, 0x70, // 13:16 dynamic_byte_array
+ 0x51, 0x52, 0x61, 0x62, 0x71, 0x72, // 16:22 dynamic_struct_array
+};
+
+TEST(AutoSizeView, IteratorIncrement) {
+ auto src_buf = std::vector<uint8_t>(kAutoSize, kAutoSize + sizeof kAutoSize);
+ auto src = MakeAutoSizeView(&src_buf).four_struct_array();
+ auto dst_buf = std::vector<uint8_t>(kAutoSize, kAutoSize + sizeof kAutoSize);
+ auto dst = MakeAutoSizeView(&dst_buf).four_struct_array();
+ EXPECT_TRUE(src.Equals(dst));
+
+ std::fill(dst.BackingStorage().begin(), dst.BackingStorage().end(), 0);
+ EXPECT_FALSE(src.Equals(dst));
+ for (auto src_it = src.begin(), dst_it = dst.begin();
+ src_it != src.end() && dst_it != dst.end(); ++src_it, ++dst_it) {
+ dst_it->CopyFrom(*src_it);
+ }
+ EXPECT_TRUE(src.Equals(dst));
+
+ std::fill(dst.BackingStorage().begin(), dst.BackingStorage().end(), 0);
+ EXPECT_FALSE(src.Equals(dst));
+ for (auto src_it = src.begin(), dst_it = dst.begin();
+ src_it != src.end() && dst_it != dst.end(); src_it++, dst_it++) {
+ dst_it->CopyFrom(*src_it);
+ }
+ EXPECT_TRUE(src.Equals(dst));
+
+ std::fill(dst.BackingStorage().begin(), dst.BackingStorage().end(), 0);
+ EXPECT_FALSE(src.Equals(dst));
+ for (auto src_it = src.rbegin(), dst_it = dst.rbegin();
+ src_it != src.rend() && dst_it != dst.rend(); ++src_it, ++dst_it) {
+ dst_it->CopyFrom(*src_it);
+ }
+ EXPECT_TRUE(src.Equals(dst));
+
+ std::fill(dst.BackingStorage().begin(), dst.BackingStorage().end(), 0);
+ EXPECT_FALSE(src.Equals(dst));
+ for (auto src_it = src.rbegin(), dst_it = dst.rbegin();
+ src_it != src.rend() && dst_it != dst.rend(); src_it++, dst_it++) {
+ dst_it->CopyFrom(*src_it);
+ }
+ EXPECT_TRUE(src.Equals(dst));
+
+ EXPECT_EQ(src.begin(), src.begin()++);
+ EXPECT_EQ(src.rbegin(), src.rbegin()++);
+ EXPECT_EQ(src.end(), (src.end())--);
+ EXPECT_EQ(src.rend(), src.rend()--);
+}
+
+TEST(AutoSizeView, PreviousNext) {
+ auto view = MakeAutoSizeView(kAutoSize, sizeof kAutoSize).four_struct_array();
+ EXPECT_TRUE(std::next(view.begin(), 0)->Equals(view[0]));
+ EXPECT_TRUE(std::next(view.begin(), 1)->Equals(view[1]));
+ EXPECT_TRUE(std::next(view.begin(), 2)->Equals(view[2]));
+ EXPECT_TRUE(std::next(view.begin(), 3)->Equals(view[3]));
+
+ EXPECT_TRUE(std::next(view.rbegin(), 0)->Equals(view[3]));
+ EXPECT_TRUE(std::next(view.rbegin(), 1)->Equals(view[2]));
+ EXPECT_TRUE(std::next(view.rbegin(), 2)->Equals(view[1]));
+ EXPECT_TRUE(std::next(view.rbegin(), 3)->Equals(view[0]));
+
+ EXPECT_TRUE(std::prev(view.end(), 1)->Equals(view[3]));
+ EXPECT_TRUE(std::prev(view.end(), 2)->Equals(view[2]));
+ EXPECT_TRUE(std::prev(view.end(), 3)->Equals(view[1]));
+ EXPECT_TRUE(std::prev(view.end(), 4)->Equals(view[0]));
+
+ EXPECT_TRUE(std::prev(view.rend(), 1)->Equals(view[0]));
+ EXPECT_TRUE(std::prev(view.rend(), 2)->Equals(view[1]));
+ EXPECT_TRUE(std::prev(view.rend(), 3)->Equals(view[2]));
+ EXPECT_TRUE(std::prev(view.rend(), 4)->Equals(view[3]));
+}
+
+TEST(AutoSizeView, ForEach) {
+ auto view = MakeAutoSizeView(kAutoSize, sizeof kAutoSize).four_struct_array();
+
+ int i = 0;
+ std::for_each(view.begin(), view.end(), [&](ElementView element) {
+ ASSERT_TRUE(element.Equals(view[i++]));
+ });
+
+ i = view.ElementCount() - 1;
+ std::for_each(view.rbegin(), view.rend(), [&](ElementView element) {
+ ASSERT_TRUE(element.Equals(view[i--]));
+ });
+}
+
+TEST(AutoSizeView, Find) {
+ auto view = MakeAutoSizeView(kAutoSize, sizeof kAutoSize).four_struct_array();
+
+ EXPECT_TRUE(
+ std::find_if(view.begin(), view.end(), [view](ElementView element) {
+ return element.Equals(view[0]);
+ })->Equals(view[0]));
+ EXPECT_TRUE(
+ std::find_if(view.begin(), view.end(), [view](ElementView element) {
+ return element.Equals(view[1]);
+ })->Equals(view[1]));
+ EXPECT_TRUE(
+ std::find_if(view.begin(), view.end(), [view](ElementView element) {
+ return element.Equals(view[2]);
+ })->Equals(view[2]));
+ EXPECT_TRUE(
+ std::find_if(view.begin(), view.end(), [view](ElementView element) {
+ return element.Equals(view[3]);
+ })->Equals(view[3]));
+
+ EXPECT_TRUE(
+ std::find_if(view.rbegin(), view.rend(), [view](ElementView element) {
+ return element.Equals(view[0]);
+ })->Equals(view[0]));
+ EXPECT_TRUE(
+ std::find_if(view.rbegin(), view.rend(), [view](ElementView element) {
+ return element.Equals(view[1]);
+ })->Equals(view[1]));
+ EXPECT_TRUE(
+ std::find_if(view.rbegin(), view.rend(), [view](ElementView element) {
+ return element.Equals(view[2]);
+ })->Equals(view[2]));
+ EXPECT_TRUE(
+ std::find_if(view.rbegin(), view.rend(), [view](ElementView element) {
+ return element.Equals(view[3]);
+ })->Equals(view[3]));
+}
+
+TEST(AutoSizeView, Comparison) {
+ auto view = MakeAutoSizeView(kAutoSize, sizeof kAutoSize).four_struct_array();
+
+ EXPECT_EQ(view.begin() + view.ElementCount(), view.end());
+ EXPECT_EQ(view.end() - view.ElementCount(), view.begin());
+
+ EXPECT_EQ(view.rbegin() + view.ElementCount(), view.rend());
+ EXPECT_EQ(view.rend() - view.ElementCount(), view.rbegin());
+
+ EXPECT_LT(view.begin(), view.end());
+ EXPECT_LT(view.rbegin(), view.rend());
+
+ EXPECT_LE(view.begin() - 1, view.end());
+ EXPECT_LE(view.rbegin() - 1, view.rend());
+ EXPECT_LE(view.begin() - 1, view.end());
+ EXPECT_LE(view.rbegin() - 1, view.rend());
+
+ EXPECT_GT(view.end(), view.begin());
+ EXPECT_GT(view.rend(), view.rbegin());
+
+ EXPECT_GE(view.end() + 1, view.begin());
+ EXPECT_GE(view.end() + 1, view.begin());
+ EXPECT_GE(view.rend() + 1, view.rbegin());
+ EXPECT_GE(view.rend() + 1, view.rbegin());
+}
+
+TEST(AutoSizeView, RangeBasedFor) {
+ auto view = MakeAutoSizeView(kAutoSize, sizeof kAutoSize).four_struct_array();
+
+ int i = 0;
+ for (auto element : view) {
+ ASSERT_TRUE(element.Equals(view[i++]));
+ }
+}
+
+TEST(AutoSizeView, CanReadAutoArrays) {
+ auto view =
+ MakeAlignedAutoSizeView<const uint8_t, 8>(kAutoSize, sizeof kAutoSize);
+ EXPECT_EQ(22, view.SizeInBytes());
+ EXPECT_EQ(3, view.array_size().Read());
+ EXPECT_EQ(0x10, view.four_byte_array()[0].Read());
+ EXPECT_EQ(0x20, view.four_byte_array()[1].Read());
+ EXPECT_EQ(0x30, view.four_byte_array()[2].Read());
+ EXPECT_EQ(0x40, view.four_byte_array()[3].Read());
+ EXPECT_EQ(4, view.four_byte_array().SizeInBytes());
+ EXPECT_DEATH(view.four_byte_array()[4].Read(), "");
+ EXPECT_EQ(0x11, view.four_struct_array()[0].a().Read());
+ EXPECT_EQ(0x12, view.four_struct_array()[0].b().Read());
+ EXPECT_EQ(0x21, view.four_struct_array()[1].a().Read());
+ EXPECT_EQ(0x22, view.four_struct_array()[1].b().Read());
+ EXPECT_EQ(0x31, view.four_struct_array()[2].a().Read());
+ EXPECT_EQ(0x32, view.four_struct_array()[2].b().Read());
+ EXPECT_EQ(0x41, view.four_struct_array()[3].a().Read());
+ EXPECT_EQ(0x42, view.four_struct_array()[3].b().Read());
+ EXPECT_EQ(8, view.four_struct_array().SizeInBytes());
+ EXPECT_DEATH(view.four_struct_array()[4].a().Read(), "");
+ EXPECT_EQ(0x50, view.dynamic_byte_array()[0].Read());
+ EXPECT_EQ(0x60, view.dynamic_byte_array()[1].Read());
+ EXPECT_EQ(0x70, view.dynamic_byte_array()[2].Read());
+ EXPECT_EQ(3, view.dynamic_byte_array().SizeInBytes());
+ EXPECT_FALSE(view.dynamic_byte_array()[3].IsComplete());
+ EXPECT_EQ(0x51, view.dynamic_struct_array()[0].a().Read());
+ EXPECT_EQ(0x52, view.dynamic_struct_array()[0].b().Read());
+ EXPECT_EQ(0x61, view.dynamic_struct_array()[1].a().Read());
+ EXPECT_EQ(0x62, view.dynamic_struct_array()[1].b().Read());
+ EXPECT_EQ(0x71, view.dynamic_struct_array()[2].a().Read());
+ EXPECT_EQ(0x72, view.dynamic_struct_array()[2].b().Read());
+ EXPECT_EQ(6, view.dynamic_struct_array().SizeInBytes());
+ EXPECT_FALSE(view.dynamic_struct_array()[3].IsComplete());
+}
+
+TEST(AutoSizeWriter, CanWriteAutoArrays) {
+ ::std::vector<char> buffer(sizeof kAutoSize, 0);
+ auto writer = MakeAutoSizeView(&buffer);
+ writer.array_size().Write(0);
+ EXPECT_EQ(13, writer.SizeInBytes());
+ EXPECT_DEATH(writer.dynamic_byte_array()[0].Read(), "");
+ writer.array_size().Write(3);
+ EXPECT_EQ(22, writer.SizeInBytes());
+ writer.four_byte_array()[0].Write(0x10);
+ writer.four_byte_array()[1].Write(0x20);
+ writer.four_byte_array()[2].Write(0x30);
+ writer.four_byte_array()[3].Write(0x40);
+ EXPECT_DEATH(writer.four_byte_array()[4].Write(0), "");
+ writer.four_struct_array()[0].a().Write(0x11);
+ writer.four_struct_array()[0].b().Write(0x12);
+ writer.four_struct_array()[1].a().Write(0x21);
+ writer.four_struct_array()[1].b().Write(0x22);
+ writer.four_struct_array()[2].a().Write(0x31);
+ writer.four_struct_array()[2].b().Write(0x32);
+ writer.four_struct_array()[3].a().Write(0x41);
+ writer.four_struct_array()[3].b().Write(0x42);
+ EXPECT_FALSE(writer.four_struct_array()[4].IsComplete());
+ writer.dynamic_byte_array()[0].Write(0x50);
+ writer.dynamic_byte_array()[1].Write(0x60);
+ writer.dynamic_byte_array()[2].Write(0x70);
+ EXPECT_FALSE(writer.dynamic_byte_array()[3].IsComplete());
+ writer.dynamic_struct_array()[0].a().Write(0x51);
+ writer.dynamic_struct_array()[0].b().Write(0x52);
+ writer.dynamic_struct_array()[1].a().Write(0x61);
+ writer.dynamic_struct_array()[1].b().Write(0x62);
+ writer.dynamic_struct_array()[2].a().Write(0x71);
+ writer.dynamic_struct_array()[2].b().Write(0x72);
+ EXPECT_FALSE(writer.dynamic_struct_array()[3].IsComplete());
+ EXPECT_EQ(std::vector<char>(kAutoSize, kAutoSize + sizeof kAutoSize), buffer);
+}
+
+TEST(AutoSizeView, CanUseDataMethod) {
+ auto view =
+ MakeAlignedAutoSizeView<const uint8_t, 8>(kAutoSize, sizeof kAutoSize);
+
+ for (unsigned i = 0; i < view.SizeInBytes(); ++i) {
+ EXPECT_EQ(*(view.BackingStorage().data() + i), kAutoSize[i])
+ << " at element " << i;
+ }
+}
+
+TEST(AutoSizeView, CanCopyFrom) {
+ auto source =
+ MakeAlignedAutoSizeView<const uint8_t, 8>(kAutoSize, sizeof kAutoSize);
+
+ std::array<uint8_t, sizeof kAutoSize> buf = {0};
+ auto dest = MakeAlignedAutoSizeView<uint8_t, 8>(buf.data(), buf.size());
+
+ // Copy one element.
+ EXPECT_NE(source.four_struct_array()[0].a().Read(),
+ dest.four_struct_array()[0].a().Read());
+ EXPECT_NE(source.four_struct_array()[0].b().Read(),
+ dest.four_struct_array()[0].b().Read());
+ dest.four_struct_array()[0].CopyFrom(source.four_struct_array()[0]);
+ EXPECT_EQ(source.four_struct_array()[0].a().Read(),
+ dest.four_struct_array()[0].a().Read());
+ EXPECT_EQ(source.four_struct_array()[0].b().Read(),
+ dest.four_struct_array()[0].b().Read());
+
+ // Copy entire view.
+ dest.CopyFrom(source);
+ for (unsigned i = 0; i < source.four_struct_array().ElementCount(); ++i) {
+ EXPECT_EQ(source.four_struct_array()[i].a().Read(),
+ dest.four_struct_array()[i].a().Read());
+ EXPECT_EQ(source.four_struct_array()[i].b().Read(),
+ dest.four_struct_array()[i].b().Read());
+ }
+}
+
+TEST(AutoSizeView, CanCopyFromDifferentSizes) {
+ constexpr int padding = 10;
+ std::array<uint8_t, sizeof kAutoSize + padding> source_buffer;
+ memset(source_buffer.data(), 0, source_buffer.size());
+ memcpy(source_buffer.data(), kAutoSize, sizeof kAutoSize);
+ auto source = MakeAutoSizeView(&source_buffer);
+
+ std::array<uint8_t, sizeof kAutoSize + padding> buf;
+ memset(buf.data(), 0xff, buf.size());
+ auto dest = MakeAutoSizeView(buf.data(), sizeof kAutoSize);
+
+ dest.CopyFrom(source);
+ for (unsigned i = 0; i < sizeof kAutoSize; ++i) {
+ EXPECT_EQ(buf[i], source_buffer[i]) << i;
+ }
+ for (unsigned i = sizeof kAutoSize; i < sizeof kAutoSize + padding; ++i) {
+ EXPECT_EQ(buf[i], 0xff) << i;
+ }
+}
+
+TEST(AutoSizeView, CanCopyFromOverlapping) {
+ constexpr int kElementSizeBytes = ElementView::SizeInBytes();
+ std::vector<uint8_t> buf = {1, 2, 3};
+
+ auto source = MakeElementView(buf.data(), kElementSizeBytes);
+ auto dest = MakeElementView(buf.data() + 1, kElementSizeBytes);
+ EXPECT_EQ(source.a().Read(), buf[0]);
+ EXPECT_EQ(source.b().Read(), dest.a().Read());
+ EXPECT_EQ(dest.b().Read(), buf[2]);
+
+ dest.CopyFrom(source); // Forward overlap.
+ EXPECT_EQ(buf, std::vector<uint8_t>({1, 1, 2}));
+ source.CopyFrom(dest); // Reverse overlap.
+ EXPECT_EQ(buf, std::vector<uint8_t>({1, 2, 2}));
+}
+
+TEST(AutoSizeView, Equals) {
+ std::vector<uint8_t> buf_x = {1, 2};
+ std::vector<uint8_t> buf_y = {1, 2, 3};
+ auto x = MakeElementView(&buf_x);
+ auto x_const =
+ MakeElementView(static_cast<const std::vector<uint8_t>*>(&buf_x));
+ auto y = MakeElementView(&buf_y);
+
+ EXPECT_TRUE(x.Equals(x));
+ EXPECT_TRUE(x.UncheckedEquals(x));
+ EXPECT_TRUE(y.Equals(y));
+ EXPECT_TRUE(y.UncheckedEquals(y));
+
+ EXPECT_TRUE(x.Equals(y));
+ EXPECT_TRUE(x.UncheckedEquals(y));
+ EXPECT_TRUE(y.Equals(x));
+ EXPECT_TRUE(y.UncheckedEquals(x));
+
+ EXPECT_TRUE(y.Equals(x_const));
+ EXPECT_TRUE(y.UncheckedEquals(x_const));
+ EXPECT_TRUE(x_const.Equals(y));
+ EXPECT_TRUE(x_const.UncheckedEquals(y));
+
+ ++buf_y[1];
+ EXPECT_FALSE(x.Equals(y));
+ EXPECT_FALSE(x.UncheckedEquals(y));
+ EXPECT_FALSE(y.Equals(x));
+ EXPECT_FALSE(y.UncheckedEquals(x));
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/bcd_test.cc b/back_end/cpp/testcode/bcd_test.cc
new file mode 100644
index 0000000..44d0bbb
--- /dev/null
+++ b/back_end/cpp/testcode/bcd_test.cc
@@ -0,0 +1,300 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for the generated View class from bcd.emb.
+//
+// These tests check that Binary-Coded Decimal (BCD) numbers work.
+#include <stdint.h>
+
+#include <array>
+#include <string>
+#include <vector>
+
+#include "testdata/bcd.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+alignas(8) static const uint8_t kBcd[40] = {
+ 0x02, // 0 [+1] one_byte == 2
+ 0x04, 0x01, // 1 [+2] two_byte == 104
+ 0x66, 0x55, 0x44, // 3 [+3] three_byte == 445566
+ 0x06, 0x05, 0x04, 0x03, // 6 [+4] four_byte == 3040506
+ 0x21, 0x43, 0x65, 0x87, // 10 [+5] five_byte
+ 0x99, // 10 [+5] five_byte == 9987654321
+ 0x23, 0x91, 0x78, 0x56, // 15 [+6] six_byte
+ 0x34, 0x12, // 15 [+6] six_byte == 123456789123
+ 0x37, 0x46, 0x55, 0x64, // 21 [+7] seven_byte
+ 0x73, 0x82, 0x91, // 21 [+7] seven_byte == 91827364554637
+ 0x06, 0x05, 0x04, 0x03, // 28 [+8] eight_byte
+ 0x02, 0x01, 0x00, 0x99, // 28 [+8] eight_byte == 9900010203040506
+ 0x34, 0x1d, 0x3c, 0x12, // 36 [+4] four_bit = 4,
+ // six_bit = 13,
+ // ten_bit = 307,
+ // twelve_bit = 123,
+};
+
+TEST(BcdSizesView, CanReadBcd) {
+ auto view = MakeAlignedBcdSizesView<const uint8_t, 8>(kBcd, sizeof kBcd);
+ EXPECT_EQ(2, view.one_byte().Read());
+ EXPECT_EQ(104, view.two_byte().Read());
+ EXPECT_EQ(445566, view.three_byte().Read());
+ EXPECT_EQ(3040506, view.four_byte().Read());
+ EXPECT_EQ(9987654321UL, view.five_byte().Read());
+ EXPECT_EQ(123456789123UL, view.six_byte().Read());
+ EXPECT_EQ(91827364554637UL, view.seven_byte().Read());
+ EXPECT_EQ(9900010203040506UL, view.eight_byte().Read());
+ EXPECT_EQ(4, view.four_bit().Read());
+ EXPECT_EQ(13, view.six_bit().Read());
+ EXPECT_EQ(307, view.ten_bit().Read());
+ EXPECT_EQ(123, view.twelve_bit().Read());
+ // Test that the views return appropriate integer widths.
+ EXPECT_EQ(1, sizeof(view.one_byte().Read()));
+ EXPECT_EQ(2, sizeof(view.two_byte().Read()));
+ EXPECT_EQ(4, sizeof(view.three_byte().Read()));
+ EXPECT_EQ(4, sizeof(view.four_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.five_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.six_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.seven_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.eight_byte().Read()));
+ EXPECT_EQ(1, sizeof(view.four_bit().Read()));
+ EXPECT_EQ(1, sizeof(view.six_bit().Read()));
+ EXPECT_EQ(2, sizeof(view.ten_bit().Read()));
+ EXPECT_EQ(2, sizeof(view.twelve_bit().Read()));
+}
+
+TEST(BcdSizesWriter, CanWriteBcd) {
+ uint8_t buffer[sizeof kBcd] = {0};
+ auto writer = BcdSizesWriter(buffer, sizeof buffer);
+ writer.one_byte().Write(2);
+ writer.two_byte().Write(104);
+ writer.three_byte().Write(445566);
+ writer.four_byte().Write(3040506);
+ writer.five_byte().Write(9987654321UL);
+ writer.six_byte().Write(123456789123UL);
+ writer.seven_byte().Write(91827364554637UL);
+ writer.eight_byte().Write(9900010203040506UL);
+ writer.four_bit().Write(4);
+ writer.six_bit().Write(13);
+ writer.ten_bit().Write(307);
+ writer.twelve_bit().Write(123);
+ EXPECT_EQ(std::vector<uint8_t>(kBcd, kBcd + sizeof kBcd),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+
+ EXPECT_DEATH(writer.one_byte().Write(100), "");
+ EXPECT_DEATH(writer.three_byte().Write(1445566), "");
+ EXPECT_DEATH(writer.ten_bit().Write(400), "");
+}
+
+TEST(BcdSizesView, OkIsTrueForGoodBcd) {
+ auto view = BcdSizesView(kBcd, sizeof kBcd);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_TRUE(view.one_byte().Ok());
+ EXPECT_TRUE(view.one_byte().Ok());
+ EXPECT_TRUE(view.two_byte().Ok());
+ EXPECT_TRUE(view.three_byte().Ok());
+ EXPECT_TRUE(view.four_byte().Ok());
+ EXPECT_TRUE(view.five_byte().Ok());
+ EXPECT_TRUE(view.six_byte().Ok());
+ EXPECT_TRUE(view.seven_byte().Ok());
+ EXPECT_TRUE(view.eight_byte().Ok());
+}
+
+static const uint8_t kBadBcd[40] = {
+ 0x0a, // 0 [+1] one_byte
+ 0xb4, 0x01, // 1 [+2] two_byte
+ 0xaa, 0x55, 0x44, // 3 [+3] three_byte
+ 0x06, 0x05, 0x04, 0xff, // 6 [+4] four_byte
+ 0xff, 0xff, 0xff, 0xff, // 10 [+5] five_byte
+ 0xff, // 10 [+5] five_byte
+ 0xff, 0xff, 0xff, 0xff, // 15 [+6] six_byte
+ 0xff, 0xff, // 15 [+6] six_byte
+ 0xff, 0xff, 0xff, 0xff, // 21 [+7] seven_byte
+ 0xff, 0xff, 0xff, // 21 [+7] seven_byte
+ 0xff, 0xff, 0xff, 0xff, // 28 [+8] eight_byte
+ 0xff, 0xff, 0xff, 0xff, // 28 [+8] eight_byte
+ 0xff, 0xff, 0xff, 0xff, // 36 [+4] four_, six_, ten_, twelve_bit
+};
+
+TEST(BcdSizesView, UncheckedReadingInvalidBcdDoesNotCrash) {
+ auto view = BcdSizesView(kBadBcd, sizeof kBadBcd);
+ view.one_byte().UncheckedRead();
+ view.two_byte().UncheckedRead();
+ view.three_byte().UncheckedRead();
+ view.four_byte().UncheckedRead();
+ view.five_byte().UncheckedRead();
+ view.six_byte().UncheckedRead();
+ view.seven_byte().UncheckedRead();
+ view.eight_byte().UncheckedRead();
+ view.four_bit().UncheckedRead();
+ view.six_bit().UncheckedRead();
+ view.ten_bit().UncheckedRead();
+ view.twelve_bit().UncheckedRead();
+}
+
+TEST(BcdSizesView, ReadingInvalidBcdCrashes) {
+ auto view = BcdSizesView(kBadBcd, sizeof kBadBcd);
+ EXPECT_DEATH(view.one_byte().Read(), "");
+ EXPECT_DEATH(view.two_byte().Read(), "");
+ EXPECT_DEATH(view.three_byte().Read(), "");
+ EXPECT_DEATH(view.four_byte().Read(), "");
+ EXPECT_DEATH(view.five_byte().Read(), "");
+ EXPECT_DEATH(view.six_byte().Read(), "");
+ EXPECT_DEATH(view.seven_byte().Read(), "");
+ EXPECT_DEATH(view.eight_byte().Read(), "");
+ EXPECT_DEATH(view.four_bit().Read(), "");
+ EXPECT_DEATH(view.six_bit().Read(), "");
+ EXPECT_DEATH(view.ten_bit().Read(), "");
+ EXPECT_DEATH(view.twelve_bit().Read(), "");
+}
+
+TEST(BcdSizesView, OkIsFalseForBadBcd) {
+ auto view = BcdSizesView(kBadBcd, sizeof kBadBcd);
+ EXPECT_FALSE(view.Ok());
+ EXPECT_FALSE(view.one_byte().Ok());
+ EXPECT_FALSE(view.two_byte().Ok());
+ EXPECT_FALSE(view.three_byte().Ok());
+ EXPECT_FALSE(view.four_byte().Ok());
+ EXPECT_FALSE(view.five_byte().Ok());
+ EXPECT_FALSE(view.six_byte().Ok());
+ EXPECT_FALSE(view.seven_byte().Ok());
+ EXPECT_FALSE(view.eight_byte().Ok());
+ EXPECT_FALSE(view.four_bit().Ok());
+ EXPECT_FALSE(view.six_bit().Ok());
+ EXPECT_FALSE(view.ten_bit().Ok());
+ EXPECT_FALSE(view.twelve_bit().Ok());
+}
+
+TEST(BcdBigEndianView, BigEndianReadWrite) {
+ uint8_t big_endian[4] = {0x12, 0x34, 0x56, 0x78};
+ auto writer = BcdBigEndianWriter(big_endian, sizeof big_endian);
+ EXPECT_EQ(12345678, writer.four_byte().Read());
+ writer.four_byte().Write(87654321);
+ EXPECT_EQ(0x87, big_endian[0]);
+ EXPECT_EQ(0x65, big_endian[1]);
+ EXPECT_EQ(0x43, big_endian[2]);
+ EXPECT_EQ(0x21, big_endian[3]);
+}
+
+TEST(BcdBigEndianView, CopyFrom) {
+ std::array<uint8_t, 4> buf_x = {0x12, 0x34, 0x56, 0x78};
+ std::array<uint8_t, 4> buf_y = {0x00, 0x00, 0x00, 0x00};
+
+ auto x = BcdBigEndianWriter(&buf_x);
+ auto y = BcdBigEndianWriter(&buf_y);
+
+ EXPECT_NE(x.four_byte().Read(), y.four_byte().Read());
+ x.four_byte().CopyFrom(y.four_byte());
+ EXPECT_EQ(x.four_byte().Read(), y.four_byte().Read());
+}
+
+TEST(BcdBigEndianView, TryToCopyFrom) {
+ std::array<uint8_t, 4> buf_x = {0x12, 0x34, 0x56, 0x78};
+ std::array<uint8_t, 4> buf_y = {0x00, 0x00, 0x00, 0x00};
+
+ auto x = BcdBigEndianWriter(&buf_x);
+ auto y = BcdBigEndianWriter(&buf_y);
+
+ EXPECT_NE(x.four_byte().Read(), y.four_byte().Read());
+ EXPECT_TRUE(x.four_byte().TryToCopyFrom(y.four_byte()));
+ EXPECT_EQ(x.four_byte().Read(), y.four_byte().Read());
+}
+
+TEST(BcdSizesView, Equals) {
+ std::array<uint8_t, sizeof kBcd> buf_x;
+ std::array<uint8_t, sizeof kBcd> buf_y;
+
+ std::copy(kBcd, kBcd + sizeof kBcd, buf_x.begin());
+ std::copy(kBcd, kBcd + sizeof kBcd, buf_y.begin());
+
+ EXPECT_EQ(buf_x, buf_y);
+ auto x = BcdSizesView(&buf_x);
+ auto x_const = BcdSizesView(
+ static_cast</**/ ::std::array</**/ ::std::uint8_t, sizeof kBcd>*>(
+ &buf_x));
+ auto y = BcdSizesView(&buf_y);
+
+ EXPECT_TRUE(x.Equals(x));
+ EXPECT_TRUE(x.UncheckedEquals(x));
+ EXPECT_TRUE(y.Equals(y));
+ EXPECT_TRUE(y.UncheckedEquals(y));
+
+ EXPECT_TRUE(x.Equals(y));
+ EXPECT_TRUE(x.UncheckedEquals(y));
+ EXPECT_TRUE(y.Equals(x));
+ EXPECT_TRUE(y.UncheckedEquals(x));
+
+ EXPECT_TRUE(x_const.Equals(y));
+ EXPECT_TRUE(x_const.UncheckedEquals(y));
+ EXPECT_TRUE(y.Equals(x_const));
+ EXPECT_TRUE(y.UncheckedEquals(x_const));
+
+ ++buf_y[1];
+ EXPECT_FALSE(x.Equals(y));
+ EXPECT_FALSE(x.UncheckedEquals(y));
+ EXPECT_FALSE(y.Equals(x));
+ EXPECT_FALSE(y.UncheckedEquals(x));
+
+ EXPECT_FALSE(x_const.Equals(y));
+ EXPECT_FALSE(x_const.UncheckedEquals(y));
+ EXPECT_FALSE(y.Equals(x_const));
+ EXPECT_FALSE(y.UncheckedEquals(x_const));
+}
+
+TEST(BcdSizesView, UncheckedEquals) {
+ std::array<uint8_t, sizeof kBadBcd> buf_x;
+ std::array<uint8_t, sizeof kBadBcd> buf_y;
+
+ std::copy(kBadBcd, kBadBcd + sizeof kBadBcd, buf_x.begin());
+ std::copy(kBadBcd, kBadBcd + sizeof kBadBcd, buf_y.begin());
+
+ EXPECT_EQ(buf_x, buf_y);
+ auto x = BcdSizesView(&buf_x);
+ auto x_const = BcdSizesView(
+ static_cast</**/ ::std::array</**/ ::std::uint8_t, sizeof kBadBcd>*>(
+ &buf_x));
+ auto y = BcdSizesView(&buf_y);
+
+ EXPECT_TRUE(x.UncheckedEquals(x));
+ EXPECT_DEATH(x.Equals(x), "");
+ EXPECT_TRUE(y.UncheckedEquals(y));
+ EXPECT_DEATH(y.Equals(y), "");
+
+ EXPECT_TRUE(x.UncheckedEquals(y));
+ EXPECT_DEATH(x.Equals(y), "");
+ EXPECT_TRUE(y.UncheckedEquals(x));
+ EXPECT_DEATH(y.Equals(x), "");
+
+ EXPECT_TRUE(x_const.UncheckedEquals(y));
+ EXPECT_DEATH(x_const.Equals(y), "");
+ EXPECT_TRUE(y.UncheckedEquals(x_const));
+ EXPECT_DEATH(y.Equals(x_const), "");
+
+ ++buf_y[1];
+ EXPECT_FALSE(x.UncheckedEquals(y));
+ EXPECT_DEATH(x.Equals(y), "");
+ EXPECT_FALSE(y.UncheckedEquals(x));
+ EXPECT_DEATH(y.Equals(x), "");
+
+ EXPECT_FALSE(x_const.UncheckedEquals(y));
+ EXPECT_DEATH(x_const.Equals(y), "");
+ EXPECT_FALSE(y.UncheckedEquals(x_const));
+ EXPECT_DEATH(y.Equals(x_const), "");
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/bits_test.cc b/back_end/cpp/testcode/bits_test.cc
new file mode 100644
index 0000000..947b1b0
--- /dev/null
+++ b/back_end/cpp/testcode/bits_test.cc
@@ -0,0 +1,244 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for generated code for "bits" types, using bits.emb.
+#include <stdint.h>
+
+#include <vector>
+
+#include "public/emboss_cpp_util.h"
+#include "testdata/bits.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+TEST(Bits, OneByteView) {
+ uint8_t data[] = {0x2b};
+ auto one_byte = GenericOneByteView<support::BitBlock<
+ support::LittleEndianByteOrderer<support::ReadWriteContiguousBuffer>, 8>>{
+ support::BitBlock<
+ support::LittleEndianByteOrderer<support::ReadWriteContiguousBuffer>,
+ 8>{support::ReadWriteContiguousBuffer{data, sizeof data}}};
+ EXPECT_EQ(0xa, one_byte.mid_nibble().Read());
+ EXPECT_EQ(0, one_byte.high_bit().Read());
+ EXPECT_EQ(1, one_byte.low_bit().Read());
+ one_byte.less_high_bit().Write(1);
+ EXPECT_EQ(0x6b, data[0]);
+ one_byte.less_low_bit().Write(0);
+ EXPECT_EQ(0x69, data[0]);
+ one_byte.mid_nibble().Write(5);
+ EXPECT_EQ(0x55, data[0]);
+}
+
+TEST(Bits, StructOfBits) {
+ alignas(8) uint8_t data[] = {0xe8, 0x7f, 0xfe, 0xf1, 0xff, 0xbf, 0x3d};
+ auto struct_of_bits =
+ MakeAlignedStructOfBitsView<uint8_t, 8>(data, sizeof data);
+ EXPECT_EQ(0xa, struct_of_bits.one_byte().mid_nibble().Read());
+ EXPECT_FALSE(struct_of_bits.Ok());
+ EXPECT_FALSE(struct_of_bits.located_byte().Ok());
+ struct_of_bits.one_byte().mid_nibble().Write(0x01);
+ EXPECT_EQ(0xc4, data[0]);
+ EXPECT_TRUE(struct_of_bits.Ok());
+ EXPECT_TRUE(struct_of_bits.located_byte().Ok());
+ EXPECT_EQ(0x7f, struct_of_bits.located_byte().Read());
+ EXPECT_EQ(0x9, struct_of_bits.two_byte().mid_nibble().Read());
+ EXPECT_EQ(0x6, struct_of_bits.four_byte().one_byte().mid_nibble().Read());
+ EXPECT_EQ(0x3, struct_of_bits.four_byte().high_nibble().Read());
+ struct_of_bits.four_byte().one_byte().mid_nibble().Write(0x9);
+ EXPECT_EQ(0x7f, data[5]);
+ EXPECT_EQ(0x3e, data[6]);
+ EXPECT_EQ(101, struct_of_bits.four_byte().low_nibble().Read());
+ struct_of_bits.four_byte().low_nibble().Write(115);
+ EXPECT_EQ(0xff, data[3]);
+ // Writing a value outside the field's [range] must crash.
+ EXPECT_DEATH(struct_of_bits.four_byte().low_nibble().Write(100), "");
+}
+
+TEST(Bits, StructOfBitsFromText) {
+ alignas(8) uint8_t data[] = {0xe8, 0x7f, 0xfe, 0xf1, 0xff, 0xbf, 0x3d};
+ auto struct_of_bits =
+ MakeAlignedStructOfBitsView<uint8_t, 8>(data, sizeof data);
+ EXPECT_TRUE(::emboss::UpdateFromText(struct_of_bits, R"(
+ {
+ one_byte: {
+ high_bit: false
+ mid_nibble: 0x01
+ }
+ four_byte: {
+ one_byte: {
+ mid_nibble: 0x9
+ }
+ low_nibble: 115
+ }
+ }
+ )"));
+ EXPECT_EQ(0x44, data[0]);
+ EXPECT_EQ(0x7f, data[5]);
+ EXPECT_EQ(0x3e, data[6]);
+ EXPECT_EQ(0xff, data[3]);
+}
+
+TEST(Bits, ArrayOfBits) {
+ alignas(8) uint8_t data[] = {0xe8, 0x7f, 0xfe, 0xf1, 0xff, 0xbf, 0x00, 0x3d};
+ auto bit_array = MakeAlignedBitArrayView<uint8_t, 8>(data, sizeof data);
+ EXPECT_EQ(0xa, bit_array.one_byte()[0].mid_nibble().Read());
+ EXPECT_EQ(0xf, bit_array.one_byte()[7].mid_nibble().Read());
+ bit_array.one_byte()[7].mid_nibble().Write(0x0);
+ EXPECT_EQ(0x01, data[7]);
+ EXPECT_TRUE(bit_array.Ok());
+ bit_array = MakeAlignedBitArrayView<uint8_t, 8>(data, sizeof data - 1);
+ EXPECT_FALSE(bit_array.Ok());
+}
+
+TEST(Bits, ArrayInBits) {
+ uint8_t data[] = {0xaa, 0xaa};
+ auto array = ArrayInBitsInStructWriter{data, sizeof data};
+ EXPECT_EQ(false, array.array_in_bits().flags()[0].Read());
+ EXPECT_EQ(true, array.array_in_bits().flags()[1].Read());
+ EXPECT_EQ(false, array.array_in_bits().flags()[10].Read());
+ EXPECT_EQ(true, array.array_in_bits().flags()[11].Read());
+ array.array_in_bits().flags()[8].Write(true);
+ EXPECT_EQ(0xab, data[1]);
+ EXPECT_EQ(12, array.array_in_bits().flags().SizeInBits());
+ EXPECT_EQ(12, array.array_in_bits().flags().ElementCount());
+ EXPECT_TRUE(array.array_in_bits().flags().Ok());
+ EXPECT_TRUE(array.array_in_bits().flags().IsComplete());
+}
+
+TEST(Bits, ArrayInBitsFromText) {
+ uint8_t data[] = {0, 0};
+ auto array = ArrayInBitsInStructWriter{data, sizeof data};
+ EXPECT_TRUE(::emboss::UpdateFromText(array.array_in_bits(), R"(
+ {
+ lone_flag: true
+ flags: { true, false, true, false, true, false,
+ true, false, true, false, true, false }
+ }
+ )"));
+ EXPECT_EQ(0x55, data[0]);
+ EXPECT_EQ(0x85, data[1]);
+}
+
+TEST(Bits, ArrayInBitsToText) {
+ uint8_t data[] = {0x55, 0x85};
+ auto array = ArrayInBitsInStructWriter{data, sizeof data};
+ EXPECT_EQ(
+ "{\n"
+ " lone_flag: true\n"
+ " flags: {\n"
+ " [0]: true\n"
+ " [1]: false\n"
+ " [2]: true\n"
+ " [3]: false\n"
+ " [4]: true\n"
+ " [5]: false\n"
+ " [6]: true\n"
+ " [7]: false\n"
+ " [8]: true\n"
+ " [9]: false\n"
+ " [10]: true\n"
+ " [11]: false\n"
+ " }\n"
+ "}",
+ ::emboss::WriteToString(array.array_in_bits(),
+ ::emboss::MultilineText()));
+}
+
+TEST(Bits, CopyFrom) {
+ std::array<uint8_t, 4> buf_x = {0x00, 0x00};
+ std::array<uint8_t, 4> buf_y = {0xff, 0xff};
+
+ auto x = ArrayInBitsInStructWriter{&buf_x};
+ auto y = ArrayInBitsInStructWriter{&buf_y};
+
+ EXPECT_NE(x.array_in_bits().flags()[0].Read(),
+ y.array_in_bits().flags()[0].Read());
+
+ x.array_in_bits().flags()[0].CopyFrom(y.array_in_bits().flags()[0]);
+ EXPECT_EQ(x.array_in_bits().flags()[0].Read(),
+ y.array_in_bits().flags()[0].Read());
+
+ EXPECT_NE(x.array_in_bits().flags()[1].Read(),
+ y.array_in_bits().flags()[1].Read());
+ EXPECT_NE(x.array_in_bits().flags()[10].Read(),
+ y.array_in_bits().flags()[10].Read());
+ EXPECT_NE(x.array_in_bits().flags()[11].Read(),
+ y.array_in_bits().flags()[11].Read());
+}
+
+TEST(Bits, TryToCopyFrom) {
+ std::array<uint8_t, 4> buf_x = {0x00, 0x00};
+ std::array<uint8_t, 4> buf_y = {0xff, 0xff};
+
+ auto x = ArrayInBitsInStructWriter{&buf_x};
+ auto y = ArrayInBitsInStructWriter{&buf_y};
+
+ EXPECT_NE(x.array_in_bits().flags()[0].Read(),
+ y.array_in_bits().flags()[0].Read());
+
+ EXPECT_TRUE(
+ x.array_in_bits().flags()[0].TryToCopyFrom(y.array_in_bits().flags()[0]));
+ EXPECT_EQ(x.array_in_bits().flags()[0].Read(),
+ y.array_in_bits().flags()[0].Read());
+
+ EXPECT_NE(x.array_in_bits().flags()[1].Read(),
+ y.array_in_bits().flags()[1].Read());
+ EXPECT_NE(x.array_in_bits().flags()[10].Read(),
+ y.array_in_bits().flags()[10].Read());
+ EXPECT_NE(x.array_in_bits().flags()[11].Read(),
+ y.array_in_bits().flags()[11].Read());
+}
+
+TEST(Bits, Equals) {
+ alignas(8) uint8_t buf_x[] = {0xe8, 0x7f, 0xfe, 0xf1, 0xff, 0xbf, 0x00, 0x3d};
+ alignas(8) uint8_t buf_y[] = {0xe8, 0x7f, 0xfe, 0xf1, 0xff, 0xbf, 0x00, 0x3d};
+
+ auto x = MakeAlignedBitArrayView<uint8_t, 8>(buf_x, sizeof buf_x);
+ auto x_const =
+ MakeBitArrayView(static_cast</**/ ::std::uint8_t *>(buf_x), sizeof buf_x);
+ auto y = MakeAlignedBitArrayView<uint8_t, 8>(buf_y, sizeof buf_y);
+
+ EXPECT_TRUE(x.Equals(x));
+ EXPECT_TRUE(x.UncheckedEquals(x));
+ EXPECT_TRUE(y.Equals(y));
+ EXPECT_TRUE(y.UncheckedEquals(y));
+
+ EXPECT_TRUE(x.Equals(y));
+ EXPECT_TRUE(x.UncheckedEquals(y));
+ EXPECT_TRUE(y.Equals(x));
+ EXPECT_TRUE(y.UncheckedEquals(x));
+
+ EXPECT_TRUE(x_const.Equals(y));
+ EXPECT_TRUE(x_const.UncheckedEquals(y));
+ EXPECT_TRUE(y.Equals(x_const));
+ EXPECT_TRUE(y.UncheckedEquals(x_const));
+
+ ++buf_y[1];
+ EXPECT_FALSE(x.Equals(y));
+ EXPECT_FALSE(x.UncheckedEquals(y));
+ EXPECT_FALSE(y.Equals(x));
+ EXPECT_FALSE(y.UncheckedEquals(x));
+
+ EXPECT_FALSE(x_const.Equals(y));
+ EXPECT_FALSE(x_const.UncheckedEquals(y));
+ EXPECT_FALSE(y.Equals(x_const));
+ EXPECT_FALSE(y.UncheckedEquals(x_const));
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/complex_structure_test.cc b/back_end/cpp/testcode/complex_structure_test.cc
new file mode 100644
index 0000000..25a2819
--- /dev/null
+++ b/back_end/cpp/testcode/complex_structure_test.cc
@@ -0,0 +1,52 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests that malformed text-format input does not crash or hang the parser.
+#include <stdint.h>
+
+#include <type_traits>
+#include <utility>
+#include <vector>
+
+#include "testdata/complex_structure.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss_test {
+namespace {
+
+TEST(InvalidTextInput, PrematureEnd) {
+ ::std::array<char, 64> values = {0};
+ const auto view = ::emboss_test::MakeComplexView(&values);
+ ::emboss::UpdateFromText(view, "{a:");
+}
+
+TEST(InvalidTextInput, ReallyPrematureEnd) {
+ ::std::array<char, 64> values = {0};
+ const auto view = ::emboss_test::MakeComplexView(&values);
+ ::emboss::UpdateFromText(view, "\x01");
+}
+
+TEST(InvalidTextInput, WeirdInputDoesNotHang) {
+ ::std::string text{0x7b, 0x78, 0x32, 0x3a, 0x30, 0x0d, 0x0d, 0x62, 0x32,
+ 0x7f, 0x30, 0x0d, 0x0d, 0x62, 0x32, 0x3a, 0x30, 0x0d,
+ 0x0d, 0x62, 0x32, 0x3a, 0x30, 0x0d, 0x0c, 0x30, 0x0d,
+ 0x0d, 0x63, 0x32, 0x3a, 0x30, 0x0d, 0x0d, 0x62, 0x36,
+ 0x3a, 0x30, 0x0d, 0x32, 0x3a, 0x30, 0x0d};
+ ::std::array<char, 64> values = {0};
+ const auto view = ::emboss_test::MakeComplexView(&values);
+ ::emboss::UpdateFromText(view, text);
+}
+
+} // namespace
+} // namespace emboss_test
diff --git a/back_end/cpp/testcode/condition_test.cc b/back_end/cpp/testcode/condition_test.cc
new file mode 100644
index 0000000..bb3e559
--- /dev/null
+++ b/back_end/cpp/testcode/condition_test.cc
@@ -0,0 +1,842 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests to ensure that conditional fields work.
+#include <stdint.h>
+
+#include <vector>
+
+#include "testdata/condition.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+TEST(Conditional, WithConditionTrueFieldsAreOk) {
+ uint8_t buffer[2] = {0, 0};
+ auto writer = BasicConditionalWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ EXPECT_TRUE(writer.x().Ok());
+ EXPECT_TRUE(writer.xc().Ok());
+}
+
+TEST(Conditional, WithConditionTrueAllFieldsAreReadable) {
+ uint8_t buffer[2] = {0, 2};
+ auto writer = BasicConditionalWriter(buffer, sizeof buffer);
+ EXPECT_EQ(0, writer.x().Read());
+ EXPECT_EQ(2, writer.xc().Read());
+}
+
+TEST(Conditional, WithConditionTrueConditionalFieldIsWritable) {
+ uint8_t buffer1[2] = {0, 2};
+ auto writer1 = BasicConditionalWriter(buffer1, sizeof buffer1);
+ EXPECT_TRUE(writer1.xc().TryToWrite(3));
+ EXPECT_EQ(3, buffer1[1]);
+
+ uint8_t buffer2[2] = {0, 0};
+ auto writer2 = BasicConditionalWriter(buffer2, sizeof buffer2);
+ EXPECT_FALSE(writer2.xc().Equals(writer1.xc()));
+ EXPECT_TRUE(writer2.xc().TryToCopyFrom(writer1.xc()));
+ EXPECT_TRUE(writer2.xc().Equals(writer1.xc()));
+ EXPECT_EQ(3, buffer2[1]);
+}
+
+TEST(Conditional, WithConditionFalseStructIsOkButConditionalFieldIsNot) {
+ uint8_t buffer[2] = {1, 2};
+ auto writer = BasicConditionalWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ EXPECT_TRUE(writer.x().Ok());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(Conditional, BasicConditionFalseReadCrashes) {
+ uint8_t buffer[2] = {1, 2};
+ auto writer = BasicConditionalWriter(buffer, sizeof buffer);
+ EXPECT_DEATH(writer.xc().Read(), "");
+}
+
+TEST(Conditional, BasicConditionFalseWriteCrashes) {
+ uint8_t buffer[2] = {1, 2};
+ auto writer = BasicConditionalWriter(buffer, sizeof buffer);
+ EXPECT_DEATH(writer.xc().Write(3), "");
+}
+
+TEST(Conditional, BasicConditionTrueSizeIncludesConditionalField) {
+ uint8_t buffer[2] = {0, 2};
+ auto writer = BasicConditionalWriter(buffer, sizeof buffer);
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_EQ(2, BasicConditional::MaxSizeInBytes());
+ EXPECT_EQ(1, BasicConditional::MinSizeInBytes());
+ EXPECT_EQ(2, writer.MaxSizeInBytes().Read());
+ EXPECT_EQ(1, writer.MinSizeInBytes().Read());
+}
+
+TEST(Conditional, BasicConditionFalseSizeDoesNotIncludeConditionalField) {
+ uint8_t buffer[2] = {1, 2};
+ auto writer = BasicConditionalWriter(buffer, sizeof buffer);
+ EXPECT_EQ(1, writer.SizeInBytes());
+ EXPECT_EQ(2, writer.MaxSizeInBytes().Read());
+ EXPECT_EQ(1, writer.MinSizeInBytes().Read());
+}
+
+TEST(Conditional, WithConditionFalseStructIsOkWhenBufferIsSmall) {
+ uint8_t buffer[1] = {1};
+ auto writer = BasicConditionalWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ EXPECT_TRUE(writer.x().Ok());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(Conditional, WithConditionTrueStructIsNotOkWhenBufferIsSmall) {
+ uint8_t buffer[1] = {0};
+ auto writer = BasicConditionalWriter(buffer, sizeof buffer);
+ EXPECT_FALSE(writer.Ok());
+ EXPECT_TRUE(writer.x().Ok());
+ EXPECT_TRUE(writer.has_xc().Known());
+ EXPECT_TRUE(writer.has_xc().Value());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(Conditional, WithNegativeConditionTrueFieldsAreOk) {
+ uint8_t buffer[2] = {1, 0};
+ auto writer = NegativeConditionalWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ EXPECT_TRUE(writer.x().Ok());
+ EXPECT_TRUE(writer.xc().Ok());
+}
+
+TEST(Conditional, WithNegativeConditionTrueAllFieldsAreReadable) {
+ uint8_t buffer[2] = {1, 2};
+ auto writer = NegativeConditionalWriter(buffer, sizeof buffer);
+ EXPECT_EQ(1, writer.x().Read());
+ EXPECT_EQ(2, writer.xc().Read());
+}
+
+TEST(Conditional, WithNegativeConditionTrueConditionalFieldIsWritable) {
+ uint8_t buffer1[2] = {1, 2};
+ auto writer1 = NegativeConditionalWriter(buffer1, sizeof buffer1);
+ EXPECT_TRUE(writer1.xc().TryToWrite(3));
+ EXPECT_EQ(3, buffer1[1]);
+
+ uint8_t buffer2[2] = {1, 0};
+ auto writer2 = NegativeConditionalWriter(buffer2, sizeof buffer2);
+ EXPECT_FALSE(writer2.xc().Equals(writer1.xc()));
+ EXPECT_TRUE(writer2.xc().TryToCopyFrom(writer1.xc()));
+ EXPECT_TRUE(writer2.xc().Equals(writer1.xc()));
+ EXPECT_EQ(3, buffer2[1]);
+}
+
+TEST(Conditional,
+ WithNegativeConditionFalseStructIsOkButConditionalFieldIsNot) {
+ uint8_t buffer[2] = {0, 2};
+ auto writer = NegativeConditionalWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ EXPECT_TRUE(writer.x().Ok());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(Conditional, NegativeConditionFalseReadCrashes) {
+ uint8_t buffer1[2] = {0, 2};
+ auto writer1 = NegativeConditionalWriter(buffer1, sizeof buffer1);
+ EXPECT_DEATH(writer1.xc().Read(), "");
+
+ uint8_t buffer2[2] = {0, 0};
+ auto writer2 = BasicConditionalWriter(buffer2, sizeof buffer2);
+ EXPECT_TRUE(writer2.xc().CouldWriteValue(2));
+ EXPECT_EQ(writer2.xc().Read(), 0);
+ EXPECT_FALSE(writer2.xc().TryToCopyFrom(writer1.xc()));
+ EXPECT_EQ(writer2.xc().Read(), 0);
+}
+
+TEST(Conditional, NegativeConditionFalseWriteCrashes) {
+ uint8_t buffer1[2] = {0, 2};
+ auto writer1 = NegativeConditionalWriter(buffer1, sizeof buffer1);
+ EXPECT_DEATH(writer1.xc().Write(3), "");
+
+ uint8_t buffer2[2] = {0, 2};
+ auto writer2 = NegativeConditionalWriter(buffer2, sizeof buffer2);
+ EXPECT_FALSE(writer2.xc().TryToCopyFrom(writer1.xc()));
+}
+
+TEST(Conditional, NegativeConditionTrueSizeIncludesConditionalField) {
+ uint8_t buffer[2] = {1, 2};
+ auto writer = NegativeConditionalWriter(buffer, sizeof buffer);
+ EXPECT_EQ(2, writer.SizeInBytes());
+}
+
+TEST(Conditional, NegativeConditionFalseSizeDoesNotIncludeConditionalField) {
+ uint8_t buffer[2] = {0, 2};
+ auto writer = NegativeConditionalWriter(buffer, sizeof buffer);
+ EXPECT_EQ(1, writer.SizeInBytes());
+}
+
+TEST(Conditional, WithNegativeConditionFalseStructIsOkWhenBufferIsSmall) {
+ uint8_t buffer[1] = {0};
+ auto writer = NegativeConditionalWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ EXPECT_TRUE(writer.x().Ok());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(Conditional, WithNegativeConditionTrueStructIsNotOkWhenBufferIsSmall) {
+ uint8_t buffer[1] = {1};
+ auto writer = NegativeConditionalWriter(buffer, sizeof buffer);
+ EXPECT_FALSE(writer.Ok());
+ EXPECT_TRUE(writer.x().Ok());
+ EXPECT_TRUE(writer.has_xc().Known());
+ EXPECT_TRUE(writer.has_xc().Value());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(Conditional,
+ SizeIncludesUnconditionalFieldsThatOverlapWithConditionalFields) {
+ uint8_t buffer[2] = {1, 2};
+ auto writer = ConditionalAndUnconditionalOverlappingFinalFieldWriter(
+ buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ EXPECT_EQ(2, writer.SizeInBytes());
+}
+
+TEST(Conditional,
+ SizeIsConstantWhenUnconditionalFieldsOverlapWithConditionalFields) {
+ EXPECT_EQ(
+ 2, ConditionalAndUnconditionalOverlappingFinalFieldWriter::SizeInBytes());
+}
+
+TEST(Conditional, WhenConditionalFieldIsFirstSizeIsConstant) {
+ EXPECT_EQ(2, ConditionalBasicConditionalFieldFirstWriter::SizeInBytes());
+}
+
+TEST(Conditional, WhenConditionIsFalseDynamicallyPlacedFieldDoesNotAffectSize) {
+ uint8_t buffer[3] = {1, 0, 10};
+ auto writer = ConditionalAndDynamicLocationWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ EXPECT_EQ(3, writer.SizeInBytes());
+}
+
+TEST(Conditional, WhenConditionIsTrueDynamicallyPlacedFieldDoesAffectSize) {
+ uint8_t buffer[4] = {0, 0, 3, 0};
+ auto writer = ConditionalAndDynamicLocationWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ EXPECT_EQ(4, writer.SizeInBytes());
+}
+
+TEST(Conditional, WhenConditionIsTrueDynamicallyPlacedFieldOutOfRangeIsError) {
+ uint8_t buffer[3] = {0, 0, 3};
+ auto writer = ConditionalAndDynamicLocationWriter(buffer, sizeof buffer);
+ EXPECT_FALSE(writer.Ok());
+ EXPECT_EQ(4, writer.SizeInBytes());
+}
+
+TEST(Conditional, ConditionUsesMinInt) {
+ uint8_t buffer[2] = {0, 0};
+ auto view = MakeConditionUsesMinIntView(buffer, sizeof buffer);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_FALSE(view.has_xc().ValueOr(true));
+ EXPECT_EQ(1, view.SizeInBytes());
+ buffer[0] = 0x80;
+ EXPECT_TRUE(view.Ok());
+ EXPECT_EQ(-0x80, view.x().Read());
+ EXPECT_TRUE(view.has_xc().ValueOr(false));
+ EXPECT_EQ(2, view.SizeInBytes());
+}
+
+TEST(Conditional,
+ StructWithNestedConditionIsNotOkWhenOuterConditionDoesNotExist) {
+ uint8_t buffer[3] = {1, 0, 3};
+ auto writer = NestedConditionalWriter(buffer, sizeof buffer);
+ ASSERT_FALSE(writer.IntrinsicSizeInBytes().Ok());
+ ASSERT_FALSE((writer.xc().Ok()));
+ ASSERT_FALSE(writer.SizeIsKnown());
+ ASSERT_FALSE(writer.IsComplete());
+ ASSERT_FALSE(writer.Ok());
+ ASSERT_TRUE(writer.has_xc().Known());
+ ASSERT_FALSE(writer.has_xc().Value());
+ ASSERT_FALSE(writer.has_xcc().Known());
+}
+
+TEST(Conditional,
+ StructWithCorrectNestedConditionIsOkWhenOuterConditionDoesNotExist) {
+ uint8_t buffer[3] = {1, 0, 3};
+ auto writer = CorrectNestedConditionalWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.IsComplete());
+ EXPECT_TRUE(writer.Ok());
+ EXPECT_TRUE(writer.has_xc().Known());
+ EXPECT_FALSE(writer.has_xc().Value());
+ EXPECT_TRUE(writer.has_xcc().Known());
+ EXPECT_FALSE(writer.has_xcc().Value());
+ EXPECT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(1, writer.SizeInBytes());
+}
+
+TEST(Conditional, StructWithNestedConditionIsOkWhenOuterConditionExists) {
+ uint8_t buffer[3] = {0, 1, 3};
+ auto writer = NestedConditionalWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_TRUE(writer.has_xc().Known());
+ EXPECT_TRUE(writer.has_xc().Value());
+ EXPECT_TRUE(writer.has_xcc().Known());
+ EXPECT_FALSE(writer.has_xcc().Value());
+ EXPECT_EQ(2, writer.SizeInBytes());
+}
+
+TEST(Conditional, AlwaysMissingFieldDoesNotContributeToStaticSize) {
+ EXPECT_EQ(0, OnlyAlwaysFalseConditionWriter::SizeInBytes());
+ EXPECT_EQ(1, AlwaysFalseConditionWriter::SizeInBytes());
+}
+
+TEST(Conditional, AlwaysMissingFieldDoesNotContributeToSize) {
+ uint8_t buffer[1] = {0};
+ auto view = MakeAlwaysFalseConditionDynamicSizeView(buffer, sizeof buffer);
+ ASSERT_TRUE(view.SizeIsKnown());
+ EXPECT_EQ(1, view.SizeInBytes());
+}
+
+TEST(Conditional, StructIsOkWithAlwaysMissingField) {
+ uint8_t buffer[1] = {0};
+ auto writer = AlwaysFalseConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(1, writer.SizeInBytes());
+ EXPECT_EQ(1, AlwaysFalseConditionView::SizeInBytes());
+}
+
+TEST(Conditional, StructIsOkWithOnlyAlwaysMissingField) {
+ uint8_t buffer[1] = {0};
+ auto writer = OnlyAlwaysFalseConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(0, writer.SizeInBytes());
+ EXPECT_EQ(0, OnlyAlwaysFalseConditionView::SizeInBytes());
+}
+
+TEST(Conditional, ConditionDoesNotBlockStaticSize) {
+ EXPECT_EQ(3, ConditionDoesNotContributeToSizeView::SizeInBytes());
+}
+
+TEST(Conditional, EqualsHaveAllFields) {
+ std::array<uint8_t, 2> buf_a = {0, 1};
+ std::array<uint8_t, 2> buf_b = {0, 1};
+ EXPECT_EQ(buf_a, buf_b);
+
+ auto a = BasicConditionalWriter(&buf_a);
+ auto a_const = BasicConditionalWriter(
+ static_cast</**/ ::std::array<uint8_t, 2> *>(&buf_a));
+ auto b = BasicConditionalWriter(&buf_b);
+
+ EXPECT_TRUE(a.has_xc().Known());
+ EXPECT_TRUE(a.has_xc().Value());
+ EXPECT_TRUE(b.has_xc().Known());
+ EXPECT_TRUE(b.has_xc().Value());
+
+ EXPECT_TRUE(a.Equals(a));
+ EXPECT_TRUE(a.UncheckedEquals(a));
+ EXPECT_TRUE(b.Equals(b));
+ EXPECT_TRUE(b.UncheckedEquals(b));
+
+ EXPECT_TRUE(a.Equals(b));
+ EXPECT_TRUE(a.UncheckedEquals(b));
+ EXPECT_TRUE(b.Equals(a));
+ EXPECT_TRUE(b.UncheckedEquals(a));
+
+ EXPECT_TRUE(a_const.Equals(b));
+ EXPECT_TRUE(a_const.UncheckedEquals(b));
+ EXPECT_TRUE(b.Equals(a_const));
+ EXPECT_TRUE(b.UncheckedEquals(a_const));
+
+ b.xc().Write(b.xc().Read() + 1);
+ EXPECT_FALSE(a.Equals(b));
+ EXPECT_FALSE(a.UncheckedEquals(b));
+ EXPECT_FALSE(b.Equals(a));
+ EXPECT_FALSE(b.UncheckedEquals(a));
+
+ EXPECT_FALSE(a_const.Equals(b));
+ EXPECT_FALSE(a_const.UncheckedEquals(b));
+ EXPECT_FALSE(b.Equals(a_const));
+ EXPECT_FALSE(b.UncheckedEquals(a_const));
+}
+
+TEST(Conditional, EqualsOneViewMissingField) {
+ std::array<uint8_t, 2> buf_a = {0, 1};
+ std::array<uint8_t, 2> buf_b = {1, 1};
+ EXPECT_NE(buf_a, buf_b);
+
+ auto a = BasicConditionalWriter(&buf_a);
+ auto b = BasicConditionalWriter(&buf_b);
+
+ EXPECT_TRUE(a.has_xc().Known());
+ EXPECT_TRUE(a.has_xc().Value());
+ EXPECT_TRUE(b.has_xc().Known());
+ EXPECT_FALSE(b.has_xc().Value());
+
+ EXPECT_FALSE(a.Equals(b));
+ EXPECT_FALSE(a.UncheckedEquals(b));
+ EXPECT_FALSE(b.Equals(a));
+ EXPECT_FALSE(b.UncheckedEquals(a));
+}
+
+TEST(Conditional, EqualsBothFieldsMissing) {
+ std::array<uint8_t, 2> buf_a = {1, 1};
+ std::array<uint8_t, 2> buf_b = {1, 1};
+ EXPECT_EQ(buf_a, buf_b);
+
+ auto a = BasicConditionalWriter(&buf_a);
+ auto a_const = BasicConditionalWriter(
+ static_cast</**/ ::std::array<uint8_t, 2> *>(&buf_a));
+ auto b = BasicConditionalWriter(&buf_b);
+
+ EXPECT_TRUE(a.has_xc().Known());
+ EXPECT_FALSE(a.has_xc().Value());
+ EXPECT_TRUE(b.has_xc().Known());
+ EXPECT_FALSE(b.has_xc().Value());
+
+ EXPECT_TRUE(a.Equals(b));
+ EXPECT_TRUE(a.UncheckedEquals(b));
+ EXPECT_TRUE(b.Equals(a));
+ EXPECT_TRUE(b.UncheckedEquals(a));
+
+ EXPECT_TRUE(a_const.Equals(b));
+ EXPECT_TRUE(a_const.UncheckedEquals(b));
+ EXPECT_TRUE(b.Equals(a_const));
+ EXPECT_TRUE(b.UncheckedEquals(a_const));
+
+ ++buf_b[1];
+ EXPECT_NE(buf_a, buf_b);
+ EXPECT_TRUE(a.Equals(b));
+ EXPECT_TRUE(a.UncheckedEquals(b));
+ EXPECT_TRUE(b.Equals(a));
+ EXPECT_TRUE(b.UncheckedEquals(a));
+
+ EXPECT_TRUE(a_const.Equals(b));
+ EXPECT_TRUE(a_const.UncheckedEquals(b));
+ EXPECT_TRUE(b.Equals(a_const));
+ EXPECT_TRUE(b.UncheckedEquals(a_const));
+}
+
+TEST(Conditional, TrueEnumBasedCondition) {
+ uint8_t buffer[2] = {1};
+ auto writer = EnumConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_EQ(0, writer.xc().Read());
+}
+
+TEST(Conditional, FalseEnumBasedCondition) {
+ uint8_t buffer[2] = {0};
+ auto writer = EnumConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(1, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(Conditional, TrueEnumBasedNegativeCondition) {
+ uint8_t buffer[2] = {0};
+ auto writer = NegativeEnumConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_EQ(0, writer.xc().Read());
+}
+
+TEST(Conditional, FalseEnumBasedNegativeCondition) {
+ uint8_t buffer[2] = {1};
+ auto writer = NegativeEnumConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(1, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(LessThanConditional, LessThan) {
+ uint8_t buffer[2] = {4};
+ auto writer = LessThanConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_TRUE(writer.xc().Ok());
+}
+
+TEST(LessThanConditional, Equal) {
+ uint8_t buffer[2] = {5};
+ auto writer = LessThanConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(1, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(LessThanConditional, GreaterThan) {
+ uint8_t buffer[2] = {6};
+ auto writer = LessThanConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(1, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(LessThanOrEqualConditional, LessThan) {
+ uint8_t buffer[2] = {4};
+ auto writer = LessThanOrEqualConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_TRUE(writer.xc().Ok());
+}
+
+TEST(LessThanOrEqualConditional, Equal) {
+ uint8_t buffer[2] = {5};
+ auto writer = LessThanOrEqualConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_TRUE(writer.xc().Ok());
+}
+
+TEST(LessThanOrEqualConditional, GreaterThan) {
+ uint8_t buffer[2] = {6};
+ auto writer = LessThanOrEqualConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(1, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(GreaterThanConditional, LessThan) {
+ uint8_t buffer[2] = {4};
+ auto writer = GreaterThanConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(1, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(GreaterThanConditional, Equal) {
+ uint8_t buffer[2] = {5};
+ auto writer = GreaterThanConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(1, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(GreaterThanConditional, GreaterThan) {
+ uint8_t buffer[2] = {6};
+ auto writer = GreaterThanConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_TRUE(writer.xc().Ok());
+}
+
+TEST(GreaterThanOrEqualConditional, LessThan) {
+ uint8_t buffer[2] = {4};
+ auto writer = GreaterThanOrEqualConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(1, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(GreaterThanOrEqualConditional, Equal) {
+ uint8_t buffer[2] = {5};
+ auto writer = GreaterThanOrEqualConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_TRUE(writer.xc().Ok());
+}
+
+TEST(GreaterThanOrEqualConditional, GreaterThan) {
+ uint8_t buffer[2] = {6};
+ auto writer = GreaterThanOrEqualConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_TRUE(writer.xc().Ok());
+}
+
+TEST(RangeConditional, ValueTooSmall) {
+ uint8_t buffer[3] = {1, 9};
+ auto writer = RangeConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(RangeConditional, ValueTooLarge) {
+ uint8_t buffer[3] = {11, 12};
+ auto writer = RangeConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(RangeConditional, ValuesSwapped) {
+ uint8_t buffer[3] = {8, 7};
+ auto writer = RangeConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(RangeConditional, True) {
+ uint8_t buffer[3] = {7, 8};
+ auto writer = RangeConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(3, writer.SizeInBytes());
+ EXPECT_TRUE(writer.xc().Ok());
+}
+
+TEST(ReverseRangeConditional, ValueTooSmall) {
+ uint8_t buffer[3] = {1, 9};
+ auto writer = ReverseRangeConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(ReverseRangeConditional, ValueTooLarge) {
+ uint8_t buffer[3] = {11, 12};
+ auto writer = ReverseRangeConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(ReverseRangeConditional, ValuesSwapped) {
+ uint8_t buffer[3] = {8, 7};
+ auto writer = ReverseRangeConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(ReverseRangeConditional, True) {
+ uint8_t buffer[3] = {7, 8};
+ auto writer = ReverseRangeConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(3, writer.SizeInBytes());
+ EXPECT_TRUE(writer.xc().Ok());
+}
+
+TEST(AndConditional, BothFalse) {
+ uint8_t buffer[3] = {1, 1};
+ auto writer = AndConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(AndConditional, FirstFalse) {
+ uint8_t buffer[3] = {1, 5};
+ auto writer = AndConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(AndConditional, SecondFalse) {
+ uint8_t buffer[3] = {5, 1};
+ auto writer = AndConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(AndConditional, BothTrue) {
+ uint8_t buffer[3] = {5, 5};
+ auto writer = AndConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(3, writer.SizeInBytes());
+ EXPECT_TRUE(writer.xc().Ok());
+}
+
+TEST(OrConditional, BothFalse) {
+ uint8_t buffer[3] = {1, 1};
+ auto writer = OrConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(2, writer.SizeInBytes());
+ EXPECT_FALSE(writer.xc().Ok());
+}
+
+TEST(OrConditional, FirstFalse) {
+ uint8_t buffer[3] = {1, 5};
+ auto writer = OrConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(3, writer.SizeInBytes());
+ EXPECT_TRUE(writer.xc().Ok());
+}
+
+TEST(OrConditional, SecondFalse) {
+ uint8_t buffer[3] = {5, 1};
+ auto writer = OrConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(3, writer.SizeInBytes());
+ EXPECT_TRUE(writer.xc().Ok());
+}
+
+TEST(OrConditional, BothTrue) {
+ uint8_t buffer[3] = {5, 5};
+ auto writer = OrConditionWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ ASSERT_TRUE(writer.SizeIsKnown());
+ EXPECT_EQ(3, writer.SizeInBytes());
+ EXPECT_TRUE(writer.xc().Ok());
+}
+
+TEST(ChoiceConditional, UseX) {
+ ::std::array<uint8_t, 4> buffer = {1, 5, 0, 10};
+ auto view = MakeChoiceConditionView(&buffer);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_TRUE(view.SizeIsKnown());
+ EXPECT_EQ(4, view.SizeInBytes());
+ EXPECT_TRUE(view.has_xyc().ValueOr(false));
+ EXPECT_EQ(10, view.xyc().Read());
+}
+
+TEST(ChoiceConditional, UseY) {
+ ::std::array<uint8_t, 4> buffer = {2, 5, 0, 10};
+ auto view = MakeChoiceConditionView(&buffer);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_TRUE(view.SizeIsKnown());
+ EXPECT_EQ(3, view.SizeInBytes());
+ EXPECT_FALSE(view.has_xyc().ValueOr(true));
+}
+
+TEST(FlagConditional, True) {
+ uint8_t buffer[2] = {0x80, 0xff};
+ auto writer = ContainsContainsBitsWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(writer.Ok());
+ EXPECT_TRUE(writer.SizeIsKnown());
+ EXPECT_TRUE(writer.top().Ok());
+}
+
+TEST(WriteToString, MissingFieldsAreNotWritten) {
+ uint8_t buffer[2] = {0x01, 0x00};
+ auto reader = BasicConditionalWriter(buffer, 1U);
+ EXPECT_EQ(
+ "{\n"
+ " x: 1 # 0x1\n"
+ "}",
+ ::emboss::WriteToString(reader, ::emboss::MultilineText()));
+ EXPECT_EQ("{ x: 1 }", ::emboss::WriteToString(reader));
+}
+
+TEST(WriteToString, PresentFieldsAreWritten) {
+ uint8_t buffer[2] = {0x00, 0x01};
+ auto reader = BasicConditionalWriter(buffer, 2U);
+ EXPECT_EQ(
+ "{\n"
+ " x: 0 # 0x0\n"
+ " xc: 1 # 0x1\n"
+ "}",
+ ::emboss::WriteToString(reader, ::emboss::MultilineText()));
+ EXPECT_EQ("{ x: 0, xc: 1 }", ::emboss::WriteToString(reader));
+}
+
+TEST(WriteToString, AlwaysFalseCondition) {
+ uint8_t buffer[2] = {0x00};
+ auto reader = MakeAlwaysFalseConditionView(buffer, 1U);
+ EXPECT_EQ(
+ "{\n"
+ " x: 0 # 0x0\n"
+ "}",
+ ::emboss::WriteToString(reader, ::emboss::MultilineText()));
+ EXPECT_EQ("{ x: 0 }", ::emboss::WriteToString(reader));
+}
+
+TEST(WriteToString, OnlyAlwaysFalseCondition) {
+ uint8_t buffer[2] = {0x00};
+ auto reader = MakeOnlyAlwaysFalseConditionView(buffer, 0U);
+ EXPECT_EQ(
+ "{\n"
+ "}",
+ ::emboss::WriteToString(reader, ::emboss::MultilineText()));
+ EXPECT_EQ("{ }", ::emboss::WriteToString(reader));
+}
+
+TEST(WriteToString, EmptyStruct) {
+ uint8_t buffer[2] = {0x00};
+ auto reader = MakeEmptyStructView(buffer, 0U);
+ EXPECT_EQ(
+ "{\n"
+ "}",
+ ::emboss::WriteToString(reader, ::emboss::MultilineText()));
+ EXPECT_EQ("{ }", ::emboss::WriteToString(reader));
+}
+
+TEST(ConditionalInline, ConditionalInline) {
+ uint8_t buffer[4] = {0x00, 0x01, 0x02, 0x03};
+ auto reader = ConditionalInlineWriter(buffer, 4U);
+ EXPECT_EQ(1, reader.type_0().a().Read());
+ EXPECT_TRUE(reader.has_type_1().Known());
+ EXPECT_FALSE(reader.has_type_1().Value());
+}
+
+TEST(ConditionalAnonymous, ConditionalAnonymous) {
+ ::std::array<uint8_t, 2> buffer = {0x00, 0x98};
+ auto view = MakeConditionalAnonymousView(&buffer);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_FALSE(view.has_low().Value());
+ EXPECT_FALSE(view.has_mid().Value());
+ EXPECT_FALSE(view.has_high().Value());
+ view.x().Write(100);
+ EXPECT_TRUE(view.has_low().Value());
+ EXPECT_FALSE(view.has_mid().Value());
+ EXPECT_TRUE(view.has_high().Value());
+ EXPECT_EQ(0, view.low().Read());
+ EXPECT_EQ(1, view.high().Read());
+ view.low().Write(1);
+ EXPECT_TRUE(view.has_low().Value());
+ EXPECT_TRUE(view.has_mid().Value());
+ EXPECT_TRUE(view.has_high().Value());
+ EXPECT_EQ(1, view.low().Read());
+ EXPECT_EQ(3, view.mid().Read());
+ EXPECT_EQ(1, view.high().Read());
+}
+
+TEST(ConditionalOnFlag, ConditionalOnFlag) {
+ ::std::array<uint8_t, 2> buffer = {0x00, 0x98};
+ auto view = MakeConditionalOnFlagView(&buffer);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_FALSE(view.enabled().Read());
+ EXPECT_FALSE(view.has_value().Value());
+ buffer[0] = 1;
+ EXPECT_TRUE(view.enabled().Read());
+ EXPECT_TRUE(view.has_value().Value());
+ EXPECT_EQ(0x98, view.value().Read());
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/dynamic_size_test.cc b/back_end/cpp/testcode/dynamic_size_test.cc
new file mode 100644
index 0000000..3344a10
--- /dev/null
+++ b/back_end/cpp/testcode/dynamic_size_test.cc
@@ -0,0 +1,685 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for fields and structs with dynamic sizes.
+#include <stdint.h>
+#include <array>
+#include <vector>
+
+#include "testdata/dynamic_size.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+static constexpr std::array<uint8_t, 16> kMessage = {{
+ 0x02, // 0:1 header_length = 2
+ 0x06, // 1:2 message_length = 6
+ 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, // 2:8 message
+ 0x07, 0x08, 0x09, 0x0a, // 8:12 crc32
+ 0x00, 0x00, 0x00, 0x00, // Extra, unused bytes.
+}};
+
+// MessageView::SizeInBytes() returns the expected value.
+TEST(MessageView, DynamicSizeIsCorrect) {
+ auto view = MessageView(&kMessage);
+ EXPECT_EQ(12, view.SizeInBytes());
+}
+
+// Fields read the correct values.
+TEST(MessageView, FieldsAreCorrect) {
+ auto view = MessageView(&kMessage);
+ EXPECT_EQ(2, view.header_length().Read());
+ EXPECT_EQ(6, view.message_length().Read());
+ EXPECT_EQ(1, view.message()[0].Read());
+ EXPECT_EQ(2, view.message()[1].Read());
+ EXPECT_EQ(3, view.message()[2].Read());
+ EXPECT_EQ(4, view.message()[3].Read());
+ EXPECT_EQ(5, view.message()[4].Read());
+ EXPECT_EQ(6, view.message()[5].Read());
+ EXPECT_EQ(6, view.message().SizeInBytes());
+ EXPECT_DEATH(view.message()[6].Read(), "");
+ EXPECT_EQ(0x0a090807, view.crc32().Read());
+}
+
+// The zero-length padding field works as expected.
+TEST(MessageView, PaddingFieldWorks) {
+ auto view = MessageView(&kMessage);
+ EXPECT_EQ(0, view.padding().SizeInBytes());
+ EXPECT_DEATH(view.padding()[0].Read(), "");
+}
+
+static constexpr std::array<uint8_t, 16> kPaddedMessage = {{
+ 0x06, // 0:1 header_length = 6
+ 0x04, // 1:2 message_length = 4
+ 0x01, 0x02, 0x03, 0x04, // 2:6 padding
+ 0x05, 0x06, 0x07, 0x08, // 6:10 message
+ 0x09, 0x0a, 0x0b, 0x0c, // 10:14 crc32
+ 0x00, 0x00, // Extra, unused bytes.
+}};
+
+// Fields read the correct values when the header leaves room for padding.
+TEST(MessageView, PaddedMessageFieldsAreCorrect) {
+ auto view = MessageView(&kPaddedMessage);
+ EXPECT_EQ(6, view.header_length().Read());
+ EXPECT_EQ(4, view.message_length().Read());
+ EXPECT_EQ(1, view.padding()[0].Read());
+ EXPECT_EQ(2, view.padding()[1].Read());
+ EXPECT_EQ(3, view.padding()[2].Read());
+ EXPECT_EQ(4, view.padding()[3].Read());
+ EXPECT_EQ(4, view.padding().SizeInBytes());
+ EXPECT_DEATH(view.padding()[4].Read(), "");
+ EXPECT_EQ(5, view.message()[0].Read());
+ EXPECT_EQ(6, view.message()[1].Read());
+ EXPECT_EQ(7, view.message()[2].Read());
+ EXPECT_EQ(8, view.message()[3].Read());
+ EXPECT_EQ(4, view.message().SizeInBytes());
+ EXPECT_DEATH(view.message()[4].Read(), "");
+ EXPECT_EQ(0x0c0b0a09, view.crc32().Read());
+}
+
+// Writes to fields produce the correct byte values.
+TEST(MessageView, Writer) {
+ uint8_t buffer[kPaddedMessage.size()] = {0};
+ auto writer = MessageWriter(buffer, sizeof buffer);
+
+ // Write values that should match kMessage.
+ writer.header_length().Write(2);
+ writer.message_length().Write(6);
+ EXPECT_EQ(6, writer.message_length().Read());
+ for (int i = 0; i < writer.message_length().Read(); ++i) {
+ writer.message()[i].Write(i + 1);
+ }
+ EXPECT_EQ(12, writer.SizeInBytes());
+ EXPECT_DEATH(writer.message()[writer.message_length().Read()].Read(), "");
+ EXPECT_DEATH(writer.padding()[0].Read(), "");
+ writer.crc32().Write(0x0a090807);
+ EXPECT_EQ(std::vector<uint8_t>(kMessage.begin(), kMessage.end()),
+ std::vector<uint8_t>(buffer, buffer + kMessage.size()));
+
+ // Update values to match kPaddedMessage. Only update values that are
+ // different.
+ auto writer2 = MessageWriter(buffer, sizeof buffer);
+ writer2.header_length().Write(6);
+ // Writes made through one writer should be immediately visible to the other.
+ EXPECT_EQ(6, writer.header_length().Read());
+ EXPECT_EQ(6, writer2.header_length().Read());
+ writer2.message_length().Write(4);
+ // The message() field is now pointing to a different place; it should read
+ // the data that was already there.
+ EXPECT_EQ(5, writer2.message()[0].Read());
+ // The padding bytes are already set to the correct values; do not update
+ // them. Rewrite only the message bytes, which now start at offset 6.
+ for (int i = 0; i < writer2.message_length().Read(); ++i) {
+ writer2.message()[i].Write(i + 5);
+ }
+ writer2.crc32().Write(0x0c0b0a09);
+ EXPECT_EQ(std::vector<uint8_t>(kPaddedMessage.begin(), kPaddedMessage.end()),
+ std::vector<uint8_t>(buffer, buffer + kPaddedMessage.size()));
+}
+
+TEST(MessageView, MakeFromPointerArrayIterator) {
+ std::array<const std::array<uint8_t, 16> *, 2> buffers = {
+ {&kMessage, &kPaddedMessage}};
+ // Ensure that the const-reference-to-pointer type produced by iterating over
+ // a std::array of pointers to std::arrays actually compiles.
+ for (const auto &buffer : buffers) {
+ auto view = MakeMessageView(buffer);
+ // Message length is 4 or 6, depending on the iteration.
+ EXPECT_TRUE(view.message_length().Read() == 4 ||
+ view.message_length().Read() == 6);
+ }
+}
+
+static const uint8_t kThreeByFiveImage[46] = {
+ 0x03, // 0:1 size
+ 0x01, 0x02, 0x03, // pixels[0][0]
+ 0x04, 0x05, 0x06, // pixels[0][1]
+ 0x07, 0x08, 0x09, // pixels[0][2]
+ 0x0a, 0x0b, 0x0c, // pixels[0][3]
+ 0x0d, 0x0e, 0x0f, // pixels[0][4]
+ 0x10, 0x11, 0x12, // pixels[1][0]
+ 0x13, 0x14, 0x15, // pixels[1][1]
+ 0x16, 0x17, 0x18, // pixels[1][2]
+ 0x19, 0x1a, 0x1b, // pixels[1][3]
+ 0x1c, 0x1d, 0x1e, // pixels[1][4]
+ 0x1f, 0x20, 0x21, // pixels[2][0]
+ 0x22, 0x23, 0x24, // pixels[2][1]
+ 0x25, 0x26, 0x27, // pixels[2][2]
+ 0x28, 0x29, 0x2a, // pixels[2][3]
+ 0x2b, 0x2c, 0x2d, // pixels[2][4]
+};
+
+// A variable-sized array of fixed-size arrays of fixed-size arrays reads
+// correct values.
+TEST(ImageView, PixelsAreCorrect) {
+ auto view = ImageView(kThreeByFiveImage, sizeof kThreeByFiveImage);
+ int counter = 1;
+ for (int x = 0; x < view.size().Read(); ++x) {
+ for (int y = 0; y < 5; ++y) {
+ for (int channel = 0; channel < 3; ++channel) {
+ EXPECT_EQ(counter, view.pixels()[x][y][channel].Read())
+ << "x: " << x << "; y: " << y << "; channel: " << channel;
+ ++counter;
+ }
+ }
+ }
+}
+
+TEST(ImageView, WritePixels) {
+ uint8_t buffer[sizeof kThreeByFiveImage];
+ auto writer = ImageWriter(buffer, sizeof buffer);
+ writer.size().Write(3);
+ int counter = 1;
+ for (int x = 0; x < writer.size().Read(); ++x) {
+ for (int y = 0; y < 5; ++y) {
+ for (int channel = 0; channel < 3; ++channel) {
+ writer.pixels()[x][y][channel].Write(counter);
+ ++counter;
+ }
+ }
+ }
+ EXPECT_EQ(std::vector<uint8_t>(kThreeByFiveImage,
+ kThreeByFiveImage + sizeof kThreeByFiveImage),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+static const uint8_t kTwoRegionsAFirst[10] = {
+ 0x04, // 0:1 a_start
+ 0x02, // 1:2 a_size
+ 0x06, // 2:3 b_start
+ 0x0a, // 3:4 b_end
+ 0x11, 0x22, // 4:6 region_a
+ 0x33, 0x44, 0x55, 0x66, // 6:10 region_b
+};
+
+// With two dynamically-positioned regions, having the binary locations match
+// the order of their declarations works.
+TEST(TwoRegionsView, RegionAFirstWorks) {
+ auto view = TwoRegionsView(kTwoRegionsAFirst, sizeof kTwoRegionsAFirst);
+ EXPECT_EQ(10, view.SizeInBytes());
+ EXPECT_EQ(4, view.a_start().Read());
+ EXPECT_EQ(2, view.a_size().Read());
+ EXPECT_EQ(6, view.b_start().Read());
+ EXPECT_EQ(10, view.b_end().Read());
+ EXPECT_EQ(0x11, view.region_a()[0].Read());
+ EXPECT_EQ(0x22, view.region_a()[1].Read());
+ EXPECT_EQ(0x33, view.region_b()[0].Read());
+ EXPECT_EQ(0x66, view.region_b()[3].Read());
+}
+
+static const uint8_t kTwoRegionsBFirst[14] = {
+ 0x0a, // 0:1 a_start
+ 0x04, // 1:2 a_size
+ 0x04, // 2:3 b_start
+ 0x06, // 3:4 b_end
+ 0x11, 0x22, // 4:6 region_b
+ 0xff, 0xff, 0xff, 0xff, // 6:10 unmapped
+ 0x33, 0x44, 0x55, 0x66, // 10:14 region_a
+};
+
+// With two dynamically-positioned regions, having the binary locations in the
+// opposite order from their declarations works.
+TEST(TwoRegionsView, RegionBFirstWorks) {
+ auto view = TwoRegionsView(kTwoRegionsBFirst, sizeof kTwoRegionsBFirst);
+ EXPECT_EQ(14, view.SizeInBytes());
+ EXPECT_EQ(10, view.a_start().Read());
+ EXPECT_EQ(4, view.a_size().Read());
+ EXPECT_EQ(4, view.b_start().Read());
+ EXPECT_EQ(6, view.b_end().Read());
+ EXPECT_EQ(0x33, view.region_a()[0].Read());
+ EXPECT_EQ(0x66, view.region_a()[3].Read());
+ EXPECT_EQ(0x11, view.region_b()[0].Read());
+ EXPECT_EQ(0x22, view.region_b()[1].Read());
+}
+
+static const uint8_t kTwoRegionsAAndBOverlap[8] = {
+ 0x05, // 0:1 a_start
+ 0x02, // 1:2 a_size
+ 0x04, // 2:3 b_start
+ 0x08, // 3:4 b_end
+ 0x11, 0x22, 0x33, 0x44, // 4:8 region_a / region_b
+};
+
+// With two dynamically-positioned regions, having the binary locations overlap
+// works.
+TEST(TwoRegionsView, RegionAAndBOverlappedWorks) {
+ auto view =
+ TwoRegionsView(kTwoRegionsAAndBOverlap, sizeof kTwoRegionsAAndBOverlap);
+ EXPECT_EQ(8, view.SizeInBytes());
+ EXPECT_EQ(5, view.a_start().Read());
+ EXPECT_EQ(2, view.a_size().Read());
+ EXPECT_EQ(4, view.b_start().Read());
+ EXPECT_EQ(8, view.b_end().Read());
+ EXPECT_EQ(0x22, view.region_a()[0].Read());
+ EXPECT_EQ(0x33, view.region_a()[1].Read());
+ EXPECT_EQ(0x11, view.region_b()[0].Read());
+ EXPECT_EQ(0x22, view.region_b()[1].Read());
+ EXPECT_EQ(0x33, view.region_b()[2].Read());
+ EXPECT_EQ(0x44, view.region_b()[3].Read());
+}
+
+TEST(TwoRegionsView, Write) {
+ uint8_t buffer[64];
+ auto writer = TwoRegionsWriter(buffer, sizeof buffer);
+ writer.a_start().Write(4);
+ writer.a_size().Write(2);
+ writer.b_start().Write(6);
+ writer.b_end().Write(10);
+ writer.region_a()[0].Write(0x11);
+ writer.region_a()[1].Write(0x22);
+ writer.region_b()[0].Write(0x33);
+ writer.region_b()[1].Write(0x44);
+ writer.region_b()[2].Write(0x55);
+ writer.region_b()[3].Write(0x66);
+ EXPECT_EQ(std::vector<uint8_t>(kTwoRegionsAFirst,
+ kTwoRegionsAFirst + sizeof kTwoRegionsAFirst),
+ std::vector<uint8_t>(buffer, buffer + sizeof kTwoRegionsAFirst));
+
+ writer.a_start().Write(10);
+ writer.a_size().Write(4);
+ writer.b_start().Write(4);
+ writer.b_end().Write(6);
+ writer.region_a()[0].Write(0x33);
+ writer.region_a()[1].Write(0x44);
+ writer.region_a()[2].Write(0x55);
+ writer.region_a()[3].Write(0x66);
+ writer.region_b()[0].Write(0x11);
+ writer.region_b()[1].Write(0x22);
+ // Set the unmapped region correctly.
+ buffer[6] = 0xff;
+ buffer[7] = 0xff;
+ buffer[8] = 0xff;
+ buffer[9] = 0xff;
+ EXPECT_EQ(std::vector<uint8_t>(kTwoRegionsBFirst,
+ kTwoRegionsBFirst + sizeof kTwoRegionsBFirst),
+ std::vector<uint8_t>(buffer, buffer + sizeof kTwoRegionsBFirst));
+
+ writer.a_start().Write(5);
+ writer.a_size().Write(2);
+ writer.b_start().Write(4);
+ writer.b_end().Write(8);
+ writer.region_b()[0].Write(0x11);
+ writer.region_b()[1].Write(0xff);
+ writer.region_b()[2].Write(0xee);
+ writer.region_b()[3].Write(0x44);
+ EXPECT_EQ(0xff, writer.region_a()[0].Read());
+ EXPECT_EQ(0xee, writer.region_a()[1].Read());
+ writer.region_a()[0].Write(0x22);
+ writer.region_a()[1].Write(0x33);
+ EXPECT_EQ(0x22, writer.region_b()[1].Read());
+ EXPECT_EQ(0x33, writer.region_b()[2].Read());
+ EXPECT_EQ(
+ std::vector<uint8_t>(
+ kTwoRegionsAAndBOverlap,
+ kTwoRegionsAAndBOverlap + sizeof kTwoRegionsAAndBOverlap),
+ std::vector<uint8_t>(buffer, buffer + sizeof kTwoRegionsAAndBOverlap));
+}
+
+static const uint8_t kMultipliedSize[299] = {
+ 0x09, // 0:1 width == 9
+ 0x21, // 1:2 height == 33
+ // 9 x 33 == 297-byte block for data.
+ 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 2:11
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 11:20
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 20:29
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 29:38
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 38:47
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 47:56
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 56:65
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 65:74
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 74:83
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 83:92
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 92:101
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 101:110
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 110:119
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 119:128
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 128:137
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 137:146
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 146:155
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 155:164
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 164:173
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 173:182
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 182:191
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 191:200
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 200:209
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 209:218
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 218:227
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 227:236
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 236:245
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 245:254
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 254:263
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 263:272
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 272:281
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 281:290
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, // 290:299
+};
+
+// A structure with two 8-bit fields whose values are multiplied to get the
+// length of an array works, even when the length of the array is too big to
+// fit in 8 bits.
+TEST(MultipliedSizeView, MultipliedSizesUseWideEnoughArithmetic) {
+ auto view = MultipliedSizeView(kMultipliedSize, sizeof kMultipliedSize);
+ EXPECT_EQ(299, view.SizeInBytes());
+ EXPECT_EQ(9, view.width().Read());
+ EXPECT_EQ(33, view.height().Read());
+ EXPECT_EQ(1, view.data()[0].Read());
+ EXPECT_EQ(0xff, view.data()[296].Read());
+}
+
+static const uint8_t kNegativeTermsInSizesAMinusBIsBiggest[7] = {
+ 0x07, // 0:1 a
+ 0x01, // 1:2 b
+ 0x02, // 2:3 c
+ // 3:a-b == 3:6 a_minus_b
+ // 3:a-2*b == 3:5 a_minus_2b
+ // 3:a-b-c == 3:4 a_minus_b_minus_c
+ // 3:10-a == 3:3 ten_minus_a
+ // 3:a-2*c == 3:3 a_minus_2c
+ // 3:a-c == 3:5 a_minus_c
+ 0x11,
+ 0x22,
+ 0x33,
+ 0x44,
+};
+
+// Given a variety of potential sizes for a structure, the correct one is
+// selected.
+TEST(NegativeTermsInSizes, AMinusBIsBiggest) {
+ auto view =
+ NegativeTermsInSizesView(kNegativeTermsInSizesAMinusBIsBiggest,
+ sizeof kNegativeTermsInSizesAMinusBIsBiggest);
+ EXPECT_EQ(6, view.SizeInBytes());
+ EXPECT_EQ(7, view.a().Read());
+ EXPECT_EQ(1, view.b().Read());
+ EXPECT_EQ(2, view.c().Read());
+ EXPECT_EQ(0x33, view.a_minus_b()[2].Read());
+}
+
+static const uint8_t kNegativeTermsInSizesAMinusCIsBiggest[7] = {
+ 0x07, // 0:1 a
+ 0x02, // 1:2 b
+ 0x01, // 2:3 c
+ // 3:a-b == 3:5 a_minus_b
+ // 3:a-2*b == 3:3 a_minus_2b
+ // 3:a-b-c == 3:4 a_minus_b_minus_c
+ // 3:10-a == 3:3 ten_minus_a
+ // 3:a-2*c == 3:5 a_minus_2c
+ // 3:a-c == 3:6 a_minus_c
+ 0x11,
+ 0x22,
+ 0x33,
+ 0x44,
+};
+
+// Given a variety of potential sizes for a structure, the correct one is
+// selected.
+TEST(NegativeTermsInSizes, AMinusCIsBiggest) {
+ auto view =
+ NegativeTermsInSizesView(kNegativeTermsInSizesAMinusCIsBiggest,
+ sizeof kNegativeTermsInSizesAMinusCIsBiggest);
+ EXPECT_EQ(6, view.SizeInBytes());
+ EXPECT_EQ(7, view.a().Read());
+ EXPECT_EQ(2, view.b().Read());
+ EXPECT_EQ(1, view.c().Read());
+ EXPECT_EQ(0x33, view.a_minus_c()[2].Read());
+ EXPECT_TRUE(view.a_minus_b().IsComplete());
+ EXPECT_TRUE(view.a_minus_2b().IsComplete());
+}
+
+static const uint8_t kNegativeTermsInSizesTenMinusAIsBiggest[7] = {
+ 0x04, // 0:1 a
+ 0x00, // 1:2 b
+ 0x00, // 2:3 c
+ // 3:a-b == 3:4 a_minus_b
+ // 3:a-2*b == 3:4 a_minus_2b
+ // 3:a-b-c == 3:4 a_minus_b_minus_c
+ // 3:10-a == 3:6 ten_minus_a
+ // 3:a-2*c == 3:4 a_minus_2c
+ // 3:a-c == 3:4 a_minus_c
+ 0x11,
+ 0x22,
+ 0x33,
+ 0x44,
+};
+
+// Given a variety of potential sizes for a structure, the correct one is
+// selected.
+TEST(NegativeTermsInSizes, TenMinusAIsBiggest) {
+ auto view =
+ NegativeTermsInSizesView(kNegativeTermsInSizesTenMinusAIsBiggest,
+ sizeof kNegativeTermsInSizesTenMinusAIsBiggest);
+ EXPECT_EQ(6, view.SizeInBytes());
+ EXPECT_EQ(4, view.a().Read());
+ EXPECT_EQ(0, view.b().Read());
+ EXPECT_EQ(0, view.c().Read());
+ EXPECT_EQ(0x33, view.ten_minus_a()[2].Read());
+ EXPECT_TRUE(view.a_minus_b().IsComplete());
+ EXPECT_TRUE(view.a_minus_2b().IsComplete());
+}
+
+static const uint8_t kNegativeTermsEndWouldBeNegative[10] = {
+ 0x00, // 0:1 a
+ 0x02, // 1:2 b
+ 0x02, // 2:3 c
+ // 3:a-b == 3:-2 a_minus_b
+ // 3:a-2*b == 3:-4 a_minus_2b
+ // 3:a-b-c == 3:-4 a_minus_b_minus_c
+ // 3:10-a == 3:10 ten_minus_a
+ // 3:a-2*c == 3:-4 a_minus_2c
+ // 3:a-c == 3:-2 a_minus_c
+ 0x11,
+ 0x22,
+ 0x33,
+ 0x44,
+ 0x55,
+ 0x66,
+ 0x77,
+};
+
+// Given a variety of potential sizes for a structure, some of which would be
+// negative, the correct one is selected.
+TEST(NegativeTermsInSizes, NegativeEnd) {
+ auto view = NegativeTermsInSizesView(kNegativeTermsEndWouldBeNegative,
+ sizeof kNegativeTermsEndWouldBeNegative);
+ EXPECT_EQ(10, view.SizeInBytes());
+ EXPECT_TRUE(view.SizeIsKnown());
+ EXPECT_EQ(0, view.a().Read());
+ EXPECT_EQ(2, view.b().Read());
+ EXPECT_EQ(2, view.c().Read());
+ EXPECT_EQ(0x77, view.ten_minus_a()[6].Read());
+ EXPECT_FALSE(view.a_minus_b().IsComplete());
+ EXPECT_FALSE(view.a_minus_2b().IsComplete());
+}
+
+// If a field's offset is negative, the field is neither Ok() nor IsComplete().
+TEST(NegativeTermInLocation, NegativeLocation) {
+ ::std::array<char, 256> bytes = {15};
+ auto view = MakeNegativeTermInLocationView(&bytes);
+ EXPECT_FALSE(view.Ok());
+ EXPECT_TRUE(view.a().Ok());
+ EXPECT_TRUE(view.IsComplete());
+ EXPECT_FALSE(view.b().IsComplete());
+ EXPECT_FALSE(view.b().Ok());
+}
+
+static const uint8_t kChainedSizeInOrder[4] = {
+ 0x01, // 0:1 a
+ 0x02, // 1:2 b
+ 0x03, // 2:3 c
+ 0x04, // 3:4 d
+};
+
+// Fields are readable, even through multiple levels of indirection.
+TEST(ChainedSize, ChainedSizeInOrder) {
+ auto view = ChainedSizeView(kChainedSizeInOrder, sizeof kChainedSizeInOrder);
+ ASSERT_TRUE(view.SizeIsKnown());
+ EXPECT_EQ(4, view.SizeInBytes());
+ ASSERT_TRUE(view.a().IsComplete());
+ EXPECT_EQ(1, view.a().Read());
+ ASSERT_TRUE(view.b().IsComplete());
+ EXPECT_EQ(2, view.b().Read());
+ ASSERT_TRUE(view.c().IsComplete());
+ EXPECT_EQ(3, view.c().Read());
+ ASSERT_TRUE(view.d().IsComplete());
+ EXPECT_EQ(4, view.d().Read());
+}
+
+static const uint8_t kChainedSizeNotInOrder[4] = {
+ 0x03, // 0:1 a
+ 0x04, // 1:2 d
+ 0x01, // 2:3 c
+ 0x02, // 3:4 b
+};
+
+// Fields are readable, even through multiple levels of indirection, when their
+// placement in the binary structure is not in the same order.
+TEST(ChainedSize, ChainedSizeNotInOrder) {
+ auto view =
+ ChainedSizeView(kChainedSizeNotInOrder, sizeof kChainedSizeNotInOrder);
+ ASSERT_TRUE(view.Ok());
+ ASSERT_TRUE(view.SizeIsKnown());
+ EXPECT_EQ(4, view.SizeInBytes());
+ ASSERT_TRUE(view.a().IsComplete());
+ EXPECT_EQ(3, view.a().Read());
+ ASSERT_TRUE(view.b().IsComplete());
+ EXPECT_EQ(2, view.b().Read());
+ ASSERT_TRUE(view.c().IsComplete());
+ EXPECT_EQ(1, view.c().Read());
+ ASSERT_TRUE(view.d().IsComplete());
+ EXPECT_EQ(4, view.d().Read());
+}
+
+// Fields are writable, even through multiple levels of indirection.
+TEST(ChainedSize, Write) {
+ uint8_t buffer[4] = {0};
+ auto writer = ChainedSizeWriter(buffer, sizeof buffer);
+ writer.a().Write(1);
+ writer.b().Write(2);
+ writer.c().Write(3);
+ writer.d().Write(4);
+ EXPECT_EQ(
+ std::vector<uint8_t>(kChainedSizeInOrder,
+ kChainedSizeInOrder + sizeof kChainedSizeInOrder),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+ writer.a().Write(3);
+ writer.b().Write(2);
+ writer.c().Write(1);
+ writer.d().Write(4);
+ EXPECT_EQ(std::vector<uint8_t>(
+ kChainedSizeNotInOrder,
+ kChainedSizeNotInOrder + sizeof kChainedSizeNotInOrder),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+static const uint8_t kChainedSizeTooShortForD[3] = {
+ 0x01, // 0:1 a
+ 0x02, // 1:2 b
+ 0x03, // 2:3 c
+ // d is missing
+};
+
+// When a structure is partial, fields whose locations are available are still
+// readable, and the SizeInBytes method can be called as long as all of the
+// fields required to calculate the size are readable, even if other fields are
+// not.
+TEST(ChainedSize, ChainedSizeTooShortForD) {
+ auto view = ChainedSizeView(kChainedSizeTooShortForD,
+ sizeof kChainedSizeTooShortForD);
+ ASSERT_FALSE(view.Ok());
+ ASSERT_TRUE(view.SizeIsKnown());
+ EXPECT_EQ(4, view.SizeInBytes());
+ ASSERT_TRUE(view.a().IsComplete());
+ EXPECT_EQ(1, view.a().Read());
+ ASSERT_TRUE(view.b().IsComplete());
+ EXPECT_EQ(2, view.b().Read());
+ ASSERT_TRUE(view.c().IsComplete());
+ EXPECT_EQ(3, view.c().Read());
+ ASSERT_FALSE(view.d().IsComplete());
+}
+
+static const uint8_t kChainedSizeTooShortForC[2] = {
+ 0x01, // 0:1 a
+ 0x02, // 1:2 b
+ // c is missing
+ // d is missing
+};
+
+// When not all fields required to compute SizeInBytes() can be read,
+// SizeIsKnown() returns false.
+TEST(ChainedSize, ChainedSizeTooShortForC) {
+ auto view = ChainedSizeView(kChainedSizeTooShortForC,
+ sizeof kChainedSizeTooShortForC);
+ ASSERT_FALSE(view.Ok());
+ EXPECT_FALSE(view.SizeIsKnown());
+ ASSERT_TRUE(view.a().IsComplete());
+ EXPECT_EQ(1, view.a().Read());
+ ASSERT_TRUE(view.b().IsComplete());
+ EXPECT_EQ(2, view.b().Read());
+ ASSERT_FALSE(view.c().IsComplete());
+ ASSERT_FALSE(view.d().IsComplete());
+}
+
+// A structure with static size and two end-aligned fields compiles and returns
+// the correct size.
+TEST(FinalFieldOverlaps, FinalSizeIsCorrect) {
+ ASSERT_EQ(5, FinalFieldOverlapsView::SizeInBytes());
+}
+
+static const uint8_t kDynamicFinalFieldOverlapsDynamicFieldIsLast[12] = {
+ 0x0a, // 0:1 a
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 1:9 padding
+ 0x01, // 9:10 b
+ 0x02, // 10:11 (a:a+1) low byte of c
+ 0x03, // 11:12 (a+1:a+2) d; high byte of c
+};
+
+static const uint8_t kDynamicFinalFieldOverlapsStaticFieldIsLast[10] = {
+ 0x07, // 0:1 a
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // 1:7 padding
+ 0x02, // 7:8 (a:a+1) low byte of c
+ 0x03, // 8:9 (a+1:a+2) d; high byte of c
+ 0x01, // 9:10 b
+};
+
+// A structure with dynamic size and two end-aligned fields compiles and returns
+// the correct size.
+TEST(DynamicFinalFieldOverlaps, FinalSizeIsCorrect) {
+ auto dynamic_last_view = DynamicFinalFieldOverlapsView(
+ kDynamicFinalFieldOverlapsDynamicFieldIsLast,
+ sizeof kDynamicFinalFieldOverlapsDynamicFieldIsLast);
+ ASSERT_EQ(12, dynamic_last_view.SizeInBytes());
+ auto static_last_view = DynamicFinalFieldOverlapsView(
+ kDynamicFinalFieldOverlapsStaticFieldIsLast,
+ sizeof kDynamicFinalFieldOverlapsStaticFieldIsLast);
+ ASSERT_EQ(10, static_last_view.SizeInBytes());
+}
+
+TEST(DynamicFieldDependsOnLaterField, DynamicLocationIsNotKnown) {
+ uint8_t bytes[5] = {0x04, 0x03, 0x02, 0x01, 0x00};
+ auto view = MakeDynamicFieldDependsOnLaterFieldView(bytes, 4);
+ EXPECT_FALSE(view.b().Ok());
+ view = MakeDynamicFieldDependsOnLaterFieldView(bytes, 5);
+ EXPECT_TRUE(view.b().Ok());
+ EXPECT_EQ(3, view.b().Read());
+}
+
+TEST(DynamicFieldDoesNotAffectSize, DynamicFieldDoesNotAffectSize) {
+ EXPECT_EQ(256, DynamicFieldDoesNotAffectSizeView::SizeInBytes());
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/enum_test.cc b/back_end/cpp/testcode/enum_test.cc
new file mode 100644
index 0000000..49e5d4d
--- /dev/null
+++ b/back_end/cpp/testcode/enum_test.cc
@@ -0,0 +1,264 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for the generated enum types and View classes from enum.emb, such as
+// Kind and ManifestEntryView.
+//
+// These tests check that enums work.
+#include <gtest/gtest.h>
+
+#include <array>
+#include <cstdint>
+#include <sstream>
+#include <string>
+#include <type_traits>
+#include <vector>
+
+#include "testdata/enum.emb.h"
+
+namespace emboss {
+namespace test {
+namespace {
+
+alignas(8) static const ::std::uint8_t kManifestEntry[14] = {
+ 0x01, // 0:1 Kind kind == SPROCKET
+ 0x04, 0x00, 0x00, 0x00, // 1:5 UInt count == 4
+ 0x02, 0x00, 0x00, 0x00, // 5:9 Kind wide_kind == GEEGAW
+ 0x20, 0x00, 0x00, 0x00, 0x00, // 9:14 Kind wide_kind_in_bits == GEEGAW
+};
+
+TEST(ManifestEntryView, CanReadKind) {
+ auto view = MakeAlignedManifestEntryView<const ::std::uint8_t, 8>(
+ kManifestEntry, sizeof kManifestEntry);
+ EXPECT_EQ(Kind::SPROCKET, view.kind().Read());
+ EXPECT_EQ(Kind::GEEGAW, view.wide_kind().Read());
+ EXPECT_EQ(Kind::GEEGAW, view.wide_kind_in_bits().Read());
+}
+
+TEST(ManifestEntryView, Equals) {
+ ::std::array</**/ ::std::uint8_t, sizeof kManifestEntry> buf_x;
+ ::std::array</**/ ::std::uint8_t, sizeof kManifestEntry> buf_y;
+
+ ::std::copy(kManifestEntry, kManifestEntry + sizeof kManifestEntry,
+ buf_x.begin());
+ ::std::copy(kManifestEntry, kManifestEntry + sizeof kManifestEntry,
+ buf_y.begin());
+
+ EXPECT_EQ(buf_x, buf_y);
+ auto x = MakeManifestEntryView(&buf_x);
+ auto x_const = MakeManifestEntryView(
+ static_cast<const ::std::array</**/ ::std::uint8_t,
+ sizeof kManifestEntry> *>(&buf_x));
+ auto y = MakeManifestEntryView(&buf_y);
+
+ EXPECT_TRUE(x.Equals(x));
+ EXPECT_TRUE(x.UncheckedEquals(x));
+ EXPECT_TRUE(y.Equals(y));
+ EXPECT_TRUE(y.UncheckedEquals(y));
+
+ EXPECT_TRUE(x.Equals(y));
+ EXPECT_TRUE(x.UncheckedEquals(y));
+ EXPECT_TRUE(y.Equals(x));
+ EXPECT_TRUE(y.UncheckedEquals(x));
+
+ EXPECT_TRUE(x_const.Equals(y));
+ EXPECT_TRUE(x_const.UncheckedEquals(y));
+ EXPECT_TRUE(y.Equals(x_const));
+ EXPECT_TRUE(y.UncheckedEquals(x_const));
+
+ ++buf_y[1];
+ EXPECT_FALSE(x.Equals(y));
+ EXPECT_FALSE(x.UncheckedEquals(y));
+ EXPECT_FALSE(y.Equals(x));
+ EXPECT_FALSE(y.UncheckedEquals(x));
+
+ EXPECT_FALSE(x_const.Equals(y));
+ EXPECT_FALSE(x_const.UncheckedEquals(y));
+ EXPECT_FALSE(y.Equals(x_const));
+ EXPECT_FALSE(y.UncheckedEquals(x_const));
+}
+
+static const ::std::uint8_t kManifestEntryEdgeCases[14] = {
+ 0xff, // 0:1 Kind kind == 0xff
+ 0x04, 0x00, 0x00, 0x00, // 1:5 UInt count == 4
+ 0xff, 0xff, 0xff, 0xff, // 5:9 Kind wide_kind == MAX32BIT
+ 0xf0, 0xff, 0xff, 0xff, 0x0f, // 9:14 Kind wide_kind_in_bits == MAX32BIT
+};
+
+TEST(ManifestEntryView, EdgeCases) {
+ auto view = ManifestEntryView(kManifestEntryEdgeCases,
+ sizeof kManifestEntryEdgeCases);
+ EXPECT_EQ(static_cast<Kind>(255), view.kind().Read());
+ EXPECT_EQ(255, static_cast<uint64_t>(view.kind().Read()));
+ EXPECT_EQ(Kind::MAX32BIT, view.wide_kind().Read());
+ EXPECT_EQ(Kind::MAX32BIT, view.wide_kind_in_bits().Read());
+}
+
+TEST(Kind, Values) {
+ EXPECT_EQ(static_cast<Kind>(0), Kind::WIDGET);
+ EXPECT_EQ(static_cast<Kind>(1), Kind::SPROCKET);
+ EXPECT_EQ(static_cast<Kind>(2), Kind::GEEGAW);
+ EXPECT_EQ(static_cast<Kind>(static_cast<uint64_t>(Kind::GEEGAW) +
+ static_cast<uint64_t>(Kind::SPROCKET)),
+ Kind::COMPUTED);
+ EXPECT_EQ(static_cast<Kind>(4294967295), Kind::MAX32BIT);
+}
+
+TEST(ManifestEntryWriter, CanWriteKind) {
+ ::std::uint8_t buffer[sizeof kManifestEntry] = {0};
+ auto writer = ManifestEntryWriter(buffer, sizeof buffer);
+ writer.kind().Write(Kind::SPROCKET);
+ writer.count().Write(4);
+ writer.wide_kind().Write(Kind::GEEGAW);
+ writer.wide_kind_in_bits().Write(Kind::GEEGAW);
+ EXPECT_EQ(std::vector</**/ ::std::uint8_t>(
+ kManifestEntry, kManifestEntry + sizeof kManifestEntry),
+ std::vector</**/ ::std::uint8_t>(buffer, buffer + sizeof buffer));
+
+ EXPECT_DEATH(writer.kind().Write(Kind::LARGE_VALUE), "");
+ writer.kind().Write(static_cast<Kind>(0xff));
+ EXPECT_EQ(static_cast<Kind>(0xff), writer.kind().Read());
+ EXPECT_EQ(0xff, buffer[0]);
+ // The writes to kind() should not have overwritten the next field.
+ EXPECT_EQ(0x04, buffer[1]);
+ writer.wide_kind().Write(Kind::MAX32BIT);
+ writer.wide_kind_in_bits().Write(Kind::MAX32BIT);
+ EXPECT_EQ(std::vector</**/ ::std::uint8_t>(
+ kManifestEntryEdgeCases,
+ kManifestEntryEdgeCases + sizeof kManifestEntryEdgeCases),
+ std::vector</**/ ::std::uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+TEST(Kind, EnumToName) {
+ EXPECT_EQ("WIDGET", TryToGetNameFromEnum(Kind::WIDGET));
+ EXPECT_EQ("SPROCKET", TryToGetNameFromEnum(Kind::SPROCKET));
+ EXPECT_EQ("MAX32BIT", TryToGetNameFromEnum(Kind::MAX32BIT));
+ // In the case of duplicate values, the first one listed in the .emb is
+ // chosen.
+ // TODO(bolms): Decide if this policy is good enough, or if the choice should
+ // be explicit.
+ EXPECT_EQ("LARGE_VALUE", TryToGetNameFromEnum(Kind::LARGE_VALUE));
+ EXPECT_EQ("LARGE_VALUE", TryToGetNameFromEnum(Kind::DUPLICATE_LARGE_VALUE));
+ EXPECT_EQ(nullptr, TryToGetNameFromEnum(static_cast<Kind>(100)));
+}
+
+TEST(Kind, EnumToOstream) {
+ {
+ std::ostringstream s;
+ s << Kind::WIDGET;
+ EXPECT_EQ("WIDGET", s.str());
+ }
+ {
+ std::ostringstream s;
+ s << Kind::MAX32BIT;
+ EXPECT_EQ("MAX32BIT", s.str());
+ }
+ {
+ std::ostringstream s;
+ s << static_cast<Kind>(10005);
+ EXPECT_EQ("10005", s.str());
+ }
+ {
+ std::ostringstream s;
+ s << Kind::WIDGET << ":" << Kind::SPROCKET;
+ EXPECT_EQ("WIDGET:SPROCKET", s.str());
+ }
+}
+
+TEST(ManifestEntryView, CopyFrom) {
+ std::array</**/ ::std::uint8_t, 14> buf_x = {0x00};
+ std::array</**/ ::std::uint8_t, 14> buf_y = {0xff};
+
+ auto x = MakeManifestEntryView(&buf_x);
+ auto y = MakeManifestEntryView(&buf_y);
+
+ EXPECT_NE(x.kind().Read(), y.kind().Read());
+ x.kind().CopyFrom(y.kind());
+ EXPECT_EQ(x.kind().Read(), y.kind().Read());
+}
+
+TEST(ManifestEntryView, TryToCopyFrom) {
+ std::array</**/ ::std::uint8_t, 14> buf_x = {0x00};
+ std::array</**/ ::std::uint8_t, 14> buf_y = {0xff};
+
+ auto x = MakeManifestEntryView(&buf_x);
+ auto y = MakeManifestEntryView(&buf_y);
+
+ EXPECT_NE(x.kind().Read(), y.kind().Read());
+ EXPECT_TRUE(x.kind().TryToCopyFrom(y.kind()));
+ EXPECT_EQ(x.kind().Read(), y.kind().Read());
+}
+
+TEST(Kind, NameToEnum) {
+ Kind result;
+ EXPECT_TRUE(TryToGetEnumFromName("WIDGET", &result));
+ EXPECT_EQ(Kind::WIDGET, result);
+ EXPECT_TRUE(TryToGetEnumFromName("SPROCKET", &result));
+ EXPECT_EQ(Kind::SPROCKET, result);
+ EXPECT_TRUE(TryToGetEnumFromName("MAX32BIT", &result));
+ EXPECT_EQ(Kind::MAX32BIT, result);
+ EXPECT_TRUE(TryToGetEnumFromName("LARGE_VALUE", &result));
+ EXPECT_EQ(Kind::LARGE_VALUE, result);
+ EXPECT_EQ(Kind::DUPLICATE_LARGE_VALUE, result);
+ EXPECT_TRUE(TryToGetEnumFromName("DUPLICATE_LARGE_VALUE", &result));
+ EXPECT_EQ(Kind::LARGE_VALUE, result);
+ EXPECT_EQ(Kind::DUPLICATE_LARGE_VALUE, result);
+
+ result = Kind::WIDGET;
+ EXPECT_FALSE(TryToGetEnumFromName("MAX32BIT ", &result));
+ EXPECT_EQ(Kind::WIDGET, result);
+ EXPECT_FALSE(TryToGetEnumFromName("", &result));
+ EXPECT_EQ(Kind::WIDGET, result);
+ EXPECT_FALSE(TryToGetEnumFromName(nullptr, &result));
+ EXPECT_EQ(Kind::WIDGET, result);
+ EXPECT_FALSE(TryToGetEnumFromName(" MAX32BIT", &result));
+ EXPECT_EQ(Kind::WIDGET, result);
+ EXPECT_FALSE(TryToGetEnumFromName("MAX32BI", &result));
+ EXPECT_EQ(Kind::WIDGET, result);
+ EXPECT_FALSE(TryToGetEnumFromName("max32bit", &result));
+ EXPECT_EQ(Kind::WIDGET, result);
+}
+
+TEST(Kind, Type) {
+ EXPECT_TRUE(
+ (::std::is_same<uint64_t, ::std::underlying_type<Kind>::type>::value));
+}
+
+TEST(Signed, Type) {
+ EXPECT_TRUE(
+ (::std::is_same<int64_t, ::std::underlying_type<Signed>::type>::value));
+}
+
+TEST(Foo, EnumsExposedFromView) {
+ EXPECT_EQ(StructContainingEnum::Status::OK,
+ StructContainingEnumView::Status::OK);
+ EXPECT_EQ(StructContainingEnum::Status::FAILURE,
+ StructContainingEnumView::Status::FAILURE);
+}
+
+TEST(Kind, EnumIsKnown) {
+ EXPECT_TRUE(EnumIsKnown(Kind::WIDGET));
+ EXPECT_TRUE(EnumIsKnown(Kind::SPROCKET));
+ EXPECT_TRUE(EnumIsKnown(Kind::GEEGAW));
+ EXPECT_TRUE(EnumIsKnown(Kind::COMPUTED));
+ EXPECT_TRUE(EnumIsKnown(Kind::LARGE_VALUE));
+ EXPECT_TRUE(EnumIsKnown(Kind::DUPLICATE_LARGE_VALUE));
+ EXPECT_TRUE(EnumIsKnown(Kind::MAX32BIT));
+ EXPECT_TRUE(EnumIsKnown(Kind::MAX64BIT));
+ EXPECT_FALSE(EnumIsKnown(static_cast<Kind>(12345)));
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/explicit_sizes_test.cc b/back_end/cpp/testcode/explicit_sizes_test.cc
new file mode 100644
index 0000000..4be4598
--- /dev/null
+++ b/back_end/cpp/testcode/explicit_sizes_test.cc
@@ -0,0 +1,49 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for the generated View class for BitArrayContainer from
+// explicit_sizes.emb.
+//
+// These tests check that fields with explicit sizes can be read.
+#include <stdint.h>
+
+#include <vector>
+
+#include "testdata/explicit_sizes.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+static const uint8_t kUIntArrays[21] = {
+ 0x21, // one_nibble == { 0x1, 0x2 }
+ 0x10, 0x20, // two_nibble == { 0x10, 0x20 }
+ 0x10, 0x11, 0x20, 0x22, // four_nibble == { 0x1110, 0x2220 }
+};
+
+TEST(SizesView, CanReadSizes) {
+ auto outer_view = BitArrayContainerView(kUIntArrays, sizeof kUIntArrays);
+ auto view = outer_view.uint_arrays();
+ EXPECT_EQ(0x1, view.one_nibble()[0].Read());
+ EXPECT_EQ(0x2, view.one_nibble()[1].Read());
+ EXPECT_EQ(0x10, view.two_nibble()[0].Read());
+ EXPECT_EQ(0x20, view.two_nibble()[1].Read());
+ EXPECT_EQ(0x1110, view.four_nibble()[0].Read());
+ EXPECT_EQ(0x2220, view.four_nibble()[1].Read());
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/float_test.cc b/back_end/cpp/testcode/float_test.cc
new file mode 100644
index 0000000..a8360a9
--- /dev/null
+++ b/back_end/cpp/testcode/float_test.cc
@@ -0,0 +1,369 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for Emboss floating-point support.
+
+#include <stdint.h>
+
+#include <cmath>
+#include <vector>
+
+#include "testdata/float.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+std::array<char, 8> MakeFloats(uint32_t bits) {
+ return std::array<char, 8>({{
+ // Little endian version
+ static_cast<char>(bits & 0xff), //
+ static_cast<char>((bits >> 8) & 0xff), //
+ static_cast<char>((bits >> 16) & 0xff), //
+ static_cast<char>((bits >> 24) & 0xff), //
+
+ // Big endian version
+ static_cast<char>((bits >> 24) & 0xff), //
+ static_cast<char>((bits >> 16) & 0xff), //
+ static_cast<char>((bits >> 8) & 0xff), //
+ static_cast<char>(bits & 0xff), //
+ }});
+}
+
+std::array<char, 16> MakeDoubles(uint64_t bits) {
+ return std::array<char, 16>({{
+ // Little endian version
+ static_cast<char>(bits & 0xff), //
+ static_cast<char>((bits >> 8) & 0xff), //
+ static_cast<char>((bits >> 16) & 0xff), //
+ static_cast<char>((bits >> 24) & 0xff), //
+ static_cast<char>((bits >> 32) & 0xff), //
+ static_cast<char>((bits >> 40) & 0xff), //
+ static_cast<char>((bits >> 48) & 0xff), //
+ static_cast<char>((bits >> 56) & 0xff), //
+
+ // Big endian version
+ static_cast<char>((bits >> 56) & 0xff), //
+ static_cast<char>((bits >> 48) & 0xff), //
+ static_cast<char>((bits >> 40) & 0xff), //
+ static_cast<char>((bits >> 32) & 0xff), //
+ static_cast<char>((bits >> 24) & 0xff), //
+ static_cast<char>((bits >> 16) & 0xff), //
+ static_cast<char>((bits >> 8) & 0xff), //
+ static_cast<char>(bits & 0xff), //
+ }});
+}
+
+// This is used separately for tests where !(a == a), that is, for NaN values.
+void TestFloatWrite(float value, uint32_t bits) {
+ const auto floats = MakeFloats(bits);
+
+ std::array<char, 8> buffer = {};
+ auto writer = MakeFloatsView(&buffer);
+ EXPECT_TRUE(writer.float_little_endian().CouldWriteValue(value));
+ EXPECT_TRUE(writer.float_big_endian().CouldWriteValue(value));
+ writer.float_little_endian().Write(value);
+ writer.float_big_endian().Write(value);
+ EXPECT_EQ(floats, buffer);
+}
+
+std::array<char, 8> TestFloatValue(float value, uint32_t bits) {
+ const auto floats = MakeFloats(bits);
+
+ auto view = MakeFloatsView(&floats);
+ EXPECT_EQ(value, view.float_little_endian().Read());
+ EXPECT_EQ(value, view.float_big_endian().Read());
+
+ TestFloatWrite(value, bits);
+
+ return floats;
+}
+
+// This is used separately for tests where !(a == a), that is, for NaN values.
+void TestDoubleWrite(double value, uint64_t bits) {
+ const auto doubles = MakeDoubles(bits);
+
+ std::array<char, 16> buffer = {};
+ auto writer = MakeDoublesView(&buffer);
+ EXPECT_TRUE(writer.double_little_endian().CouldWriteValue(value));
+ EXPECT_TRUE(writer.double_big_endian().CouldWriteValue(value));
+ writer.double_little_endian().Write(value);
+ writer.double_big_endian().Write(value);
+ EXPECT_EQ(doubles, buffer);
+}
+
+std::array<char, 16> TestDoubleValue(double value, uint64_t bits) {
+ const auto doubles = MakeDoubles(bits);
+
+ auto view = MakeDoublesView(&doubles);
+ EXPECT_EQ(value, view.double_little_endian().Read());
+ EXPECT_EQ(value, view.double_big_endian().Read());
+
+ TestDoubleWrite(value, bits);
+
+ return doubles;
+}
+
+TEST(Floats, One) { TestFloatValue(+1.0f, 0x3f800000); }
+TEST(Floats, Fraction) { TestFloatValue(-0.375f, 0xbec00000); }
+TEST(Floats, MinimumDenorm) { TestFloatValue(std::exp2(-149.0f), 0x00000001); }
+
+TEST(Floats, PlusZero) {
+ auto floats = TestFloatValue(+0.0f, 0x00000000);
+ auto view = MakeFloatsView(&floats);
+ EXPECT_FALSE(std::signbit(view.float_little_endian().Read()));
+ EXPECT_FALSE(std::signbit(view.float_big_endian().Read()));
+}
+
+TEST(Floats, MinusZero) {
+ auto floats = TestFloatValue(-0.0f, 0x80000000);
+ auto view = MakeFloatsView(&floats);
+ EXPECT_TRUE(std::signbit(view.float_little_endian().Read()));
+ EXPECT_TRUE(std::signbit(view.float_big_endian().Read()));
+}
+
+TEST(Floats, PlusInfinity) {
+ auto floats = MakeFloats(0x7f800000);
+ auto view = MakeFloatsView(&floats);
+ EXPECT_TRUE(std::isinf(view.float_little_endian().Read()));
+ EXPECT_TRUE(std::isinf(view.float_big_endian().Read()));
+ EXPECT_FALSE(std::signbit(view.float_little_endian().Read()));
+ EXPECT_FALSE(std::signbit(view.float_big_endian().Read()));
+ TestFloatWrite(view.float_little_endian().Read(), 0x7f800000);
+}
+
+TEST(Floats, MinusInfinity) {
+ auto floats = MakeFloats(0xff800000);
+ auto view = MakeFloatsView(&floats);
+ EXPECT_TRUE(std::isinf(view.float_little_endian().Read()));
+ EXPECT_TRUE(std::isinf(view.float_big_endian().Read()));
+ EXPECT_TRUE(std::signbit(view.float_little_endian().Read()));
+ EXPECT_TRUE(std::signbit(view.float_big_endian().Read()));
+ TestFloatWrite(view.float_little_endian().Read(), 0xff800000);
+}
+
+TEST(Floats, Nan) {
+ // TODO(bolms): IEEE 754 does not specify the difference between quiet and
+ // signalling NaN, and there are two completely incompatible definitions in
+ // use by modern processors. Ideally, Emboss should provide some way to
+ // specify which convention is in use, but in practice it probably doesn't
+ // matter when dealing with hardware devices.
+ //
+ // Note that the bit pattern below is a signalling NaN on some processors,
+ // and thus any operation on it other than 'std::isnan' should be avoided.
+
+ auto floats = MakeFloats(0x7f800001);
+ auto view = MakeFloatsView(&floats);
+ EXPECT_TRUE(std::isnan(view.float_little_endian().Read()));
+ EXPECT_TRUE(std::isnan(view.float_big_endian().Read()));
+ TestFloatWrite(view.float_little_endian().Read(), 0x7f800001);
+}
+
+TEST(FloatView, Equals) {
+ auto buf_x = MakeFloats(64);
+ auto buf_y = MakeFloats(64);
+ EXPECT_EQ(buf_x, buf_y);
+
+ auto x = MakeFloatsView(&buf_x);
+ auto x_const =
+ MakeFloatsView(static_cast<const ::std::array<char, 8>*>(&buf_x));
+ auto y = MakeFloatsView(&buf_y);
+
+ EXPECT_TRUE(x.Equals(x));
+ EXPECT_TRUE(x.UncheckedEquals(x));
+ EXPECT_TRUE(y.Equals(y));
+ EXPECT_TRUE(y.UncheckedEquals(y));
+
+ EXPECT_TRUE(x.Equals(y));
+ EXPECT_TRUE(x.UncheckedEquals(y));
+ EXPECT_TRUE(y.Equals(x));
+ EXPECT_TRUE(y.UncheckedEquals(x));
+
+ EXPECT_TRUE(x_const.Equals(y));
+ EXPECT_TRUE(x_const.UncheckedEquals(y));
+ EXPECT_TRUE(y.Equals(x_const));
+ EXPECT_TRUE(y.UncheckedEquals(x_const));
+
+ ++buf_y[1];
+ EXPECT_FALSE(x.Equals(y));
+ EXPECT_FALSE(x.UncheckedEquals(y));
+ EXPECT_FALSE(y.Equals(x));
+ EXPECT_FALSE(y.UncheckedEquals(x));
+}
+
+TEST(FloatView, EqualsNaN) {
+ auto buf_x = MakeFloats(0x7f800001);
+ auto buf_y = MakeFloats(0x7f800001);
+ EXPECT_EQ(buf_x, buf_y);
+
+ auto x = MakeFloatsView(&buf_x);
+ auto y = MakeFloatsView(&buf_y);
+
+ EXPECT_TRUE(std::isnan(x.float_little_endian().Read()));
+ EXPECT_TRUE(std::isnan(x.float_big_endian().Read()));
+ EXPECT_TRUE(std::isnan(y.float_little_endian().Read()));
+ EXPECT_TRUE(std::isnan(y.float_big_endian().Read()));
+
+ EXPECT_FALSE(x.Equals(x));
+ EXPECT_FALSE(y.Equals(y));
+ EXPECT_FALSE(x.Equals(y));
+ EXPECT_FALSE(y.Equals(x));
+}
+
+TEST(Doubles, One) { TestDoubleValue(+1.0, 0x3ff0000000000000UL); }
+TEST(Doubles, Fraction) { TestDoubleValue(-0.375, 0xbfd8000000000000UL); }
+TEST(Doubles, MinimumDenorm) {
+ TestDoubleValue(std::exp2(-1074.0), 0x0000000000000001UL);
+}
+
+TEST(Doubles, PlusZero) {
+ auto doubles = TestDoubleValue(+0.0, 0x0000000000000000UL);
+ auto view = MakeDoublesView(&doubles);
+ EXPECT_FALSE(std::signbit(view.double_little_endian().Read()));
+ EXPECT_FALSE(std::signbit(view.double_big_endian().Read()));
+}
+
+TEST(Doubles, MinusZero) {
+ auto doubles = TestDoubleValue(-0.0, 0x8000000000000000UL);
+ auto view = MakeDoublesView(&doubles);
+ EXPECT_TRUE(std::signbit(view.double_little_endian().Read()));
+ EXPECT_TRUE(std::signbit(view.double_big_endian().Read()));
+}
+
+TEST(Doubles, PlusInfinity) {
+ auto doubles = MakeDoubles(0x7ff0000000000000UL);
+ auto view = MakeDoublesView(&doubles);
+ EXPECT_TRUE(std::isinf(view.double_little_endian().Read()));
+ EXPECT_TRUE(std::isinf(view.double_big_endian().Read()));
+ EXPECT_FALSE(std::signbit(view.double_little_endian().Read()));
+ EXPECT_FALSE(std::signbit(view.double_big_endian().Read()));
+ TestDoubleWrite(view.double_little_endian().Read(), 0x7ff0000000000000UL);
+}
+
+TEST(Doubles, MinusInfinity) {
+ auto doubles = MakeDoubles(0xfff0000000000000UL);
+ auto view = MakeDoublesView(&doubles);
+ EXPECT_TRUE(std::isinf(view.double_little_endian().Read()));
+ EXPECT_TRUE(std::isinf(view.double_big_endian().Read()));
+ EXPECT_TRUE(std::signbit(view.double_little_endian().Read()));
+ EXPECT_TRUE(std::signbit(view.double_big_endian().Read()));
+ TestDoubleWrite(view.double_little_endian().Read(), 0xfff0000000000000UL);
+}
+
+TEST(Doubles, Nan) {
+ auto doubles = MakeDoubles(0x7ff0000000000001UL);
+ auto view = MakeDoublesView(&doubles);
+ EXPECT_TRUE(std::isnan(view.double_little_endian().Read()));
+ EXPECT_TRUE(std::isnan(view.double_big_endian().Read()));
+ TestDoubleWrite(view.double_little_endian().Read(), 0x7ff0000000000001UL);
+}
+
+TEST(Doubles, CopyFrom) {
+ auto doubles_x = MakeDoubles(0x7ff0000000000001UL);
+ auto doubles_y = MakeDoubles(0x0000000000000000UL);
+
+ auto x = MakeDoublesView(&doubles_x);
+ auto y = MakeDoublesView(&doubles_y);
+
+ EXPECT_NE(x.double_little_endian().Read(), y.double_little_endian().Read());
+ EXPECT_NE(x.double_big_endian().Read(), y.double_big_endian().Read());
+ x.double_little_endian().CopyFrom(y.double_little_endian());
+ x.double_big_endian().CopyFrom(y.double_big_endian());
+ EXPECT_EQ(x.double_little_endian().Read(), y.double_little_endian().Read());
+ EXPECT_EQ(x.double_big_endian().Read(), y.double_big_endian().Read());
+}
+
+TEST(Doubles, TryToCopyFrom) {
+ auto doubles_x = MakeDoubles(0x7ff0000000000001UL);
+ auto doubles_y = MakeDoubles(0x0000000000000000UL);
+
+ auto x = MakeDoublesView(&doubles_x);
+ auto y = MakeDoublesView(&doubles_y);
+
+ EXPECT_NE(x.double_little_endian().Read(), y.double_little_endian().Read());
+ EXPECT_NE(x.double_big_endian().Read(), y.double_big_endian().Read());
+ EXPECT_TRUE(x.double_little_endian().TryToCopyFrom(y.double_little_endian()));
+ EXPECT_TRUE(x.double_big_endian().TryToCopyFrom(y.double_big_endian()));
+ EXPECT_EQ(x.double_little_endian().Read(), y.double_little_endian().Read());
+ EXPECT_EQ(x.double_big_endian().Read(), y.double_big_endian().Read());
+}
+
+TEST(DoubleView, Equals) {
+ auto buf_x = MakeDoubles(64);
+ auto buf_y = MakeDoubles(64);
+ EXPECT_EQ(buf_x, buf_y);
+
+ auto x = MakeDoublesView(&buf_x);
+ auto y = MakeDoublesView(&buf_y);
+
+ EXPECT_TRUE(x.Equals(x));
+ EXPECT_TRUE(y.Equals(y));
+
+ EXPECT_TRUE(x.Equals(y));
+ EXPECT_TRUE(y.Equals(x));
+
+ ++buf_y[1];
+ EXPECT_FALSE(x.Equals(y));
+ EXPECT_FALSE(y.Equals(x));
+}
+
+TEST(DoubleView, EqualsNaN) {
+ auto buf_x = MakeDoubles(0x7ff0000000000001UL);
+ auto buf_y = MakeDoubles(0x7ff0000000000001UL);
+ EXPECT_EQ(buf_x, buf_y);
+
+ auto x = MakeDoublesView(&buf_x);
+ auto y = MakeDoublesView(&buf_y);
+
+ EXPECT_TRUE(std::isnan(x.double_little_endian().Read()));
+ EXPECT_TRUE(std::isnan(x.double_big_endian().Read()));
+ EXPECT_TRUE(std::isnan(y.double_little_endian().Read()));
+ EXPECT_TRUE(std::isnan(y.double_big_endian().Read()));
+
+ EXPECT_FALSE(x.Equals(x));
+ EXPECT_FALSE(y.Equals(y));
+
+ EXPECT_FALSE(x.Equals(y));
+ EXPECT_FALSE(y.Equals(x));
+}
+
+TEST(DoubleView, WriteTextFormat) {
+ auto buf_x = MakeDoubles(0x4050000000000000UL);
+ auto x = MakeDoublesView(&buf_x);
+ EXPECT_EQ("{ double_little_endian: 64, double_big_endian: 64 }",
+ ::emboss::WriteToString(x));
+ EXPECT_EQ(
+ "{\n"
+ " double_little_endian: 64\n"
+ " double_big_endian: 64\n"
+ "}",
+ ::emboss::WriteToString(x, ::emboss::MultilineText()));
+}
+
+TEST(DoubleView, ReadTextFormat) {
+ auto buf_x = MakeDoubles(0UL);
+ auto x = MakeDoublesView(&buf_x);
+ EXPECT_TRUE(::emboss::UpdateFromText(x,
+ "{\n"
+ " double_little_endian: 64\n"
+ " double_big_endian: 64\n"
+ "}"));
+ EXPECT_EQ(64, x.double_little_endian().Read());
+ EXPECT_EQ(64, x.double_big_endian().Read());
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/generated_code_nc.cc b/back_end/cpp/testcode/generated_code_nc.cc
new file mode 100644
index 0000000..d840d42
--- /dev/null
+++ b/back_end/cpp/testcode/generated_code_nc.cc
@@ -0,0 +1,51 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include <stdint.h>
+
+#include "testdata/auto_array_size.emb.h"
+
+namespace emboss {
+namespace test {
+namespace {
+
+void X() {
+ static const uint8_t kAutoSize[36] = {0};
+ (void)kAutoSize; // Suppress unused variable warning.
+
+#ifdef TEST_WRITER_IS_NOT_CONSTRUCTIBLE_FROM_CONSTANT
+ // A FooWriter should not be constructible from a constant pointer.
+ AutoSizeView view = AutoSizeWriter(kAutoSize, sizeof kAutoSize);
+#endif // TEST_WRITER_IS_NOT_CONSTRUCTIBLE_FROM_CONSTANT
+
+#ifdef TEST_CANNOT_CALL_SET_ON_CONSTANT_VIEW
+ // A call to FooView::xxx().Write() should not compile.
+ AutoSizeView view = AutoSizeView(kAutoSize, sizeof kAutoSize);
+ // .Read() should be OK.
+ (void)view.array_size().Read();
+ view.array_size().Write(1);
+#endif // TEST_CANNOT_CALL_SET_ON_CONSTANT_VIEW
+
+#ifdef TEST_CANNOT_CALL_SET_ON_CONSTANT_VIEW_OF_ARRAY
+ // A call to FooView::xxx()[y].Write() should not compile.
+ AutoSizeView view = AutoSizeView(kAutoSize, sizeof kAutoSize);
+ // .Read() should be OK.
+ (void)view.four_byte_array()[1].Read();
+ view.four_byte_array()[1].Write(0);
+#endif // TEST_CANNOT_CALL_SET_ON_CONSTANT_VIEW_OF_ARRAY
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/importer_test.cc b/back_end/cpp/testcode/importer_test.cc
new file mode 100644
index 0000000..60c6aa3
--- /dev/null
+++ b/back_end/cpp/testcode/importer_test.cc
@@ -0,0 +1,41 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for using imported types.
+
+#include <stdint.h>
+
+#include <vector>
+
+#include "testdata/importer.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+const uint8_t kOuter[16] = {
+ 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, // inner
+ 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10, // inner_gen
+};
+
+TEST(Importer, CanAccessInner) {
+ auto view = OuterView(kOuter, sizeof kOuter);
+ EXPECT_EQ(0x0807060504030201UL, view.inner().value().Read());
+ EXPECT_EQ(0x100f0e0d0c0b0a09UL, view.inner_gen().value().Read());
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/inline_type_test.cc b/back_end/cpp/testcode/inline_type_test.cc
new file mode 100644
index 0000000..5862f86
--- /dev/null
+++ b/back_end/cpp/testcode/inline_type_test.cc
@@ -0,0 +1,52 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for types defined inline.
+#include <stdint.h>
+
+#include <vector>
+
+#include "testdata/inline_type.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+static const uint8_t kFoo[2] = {0, 12};
+
+static const uint8_t kFooOnFire[2] = {12, 0};
+
+// Tests that inline-defined enums have correct, independent values.
+TEST(FooView, EnumValuesAreAsExpected) {
+ EXPECT_EQ(0, static_cast<int>(Foo::Status::OK));
+ EXPECT_EQ(12, static_cast<int>(Foo::Status::FAILURE));
+ EXPECT_EQ(12, static_cast<int>(Foo::SecondaryStatus::OK));
+ EXPECT_EQ(0, static_cast<int>(Foo::SecondaryStatus::FAILURE));
+}
+
+// Tests that a structure containing inline-defined enums can be read correctly.
+TEST(FooView, ReadsCorrectly) {
+ auto ok_view = FooView(kFoo, sizeof kFoo);
+ EXPECT_EQ(Foo::Status::OK, ok_view.status().Read());
+ EXPECT_EQ(Foo::SecondaryStatus::OK, ok_view.secondary_status().Read());
+ auto on_fire_view = FooView(kFooOnFire, sizeof kFooOnFire);
+ EXPECT_EQ(Foo::Status::FAILURE, on_fire_view.status().Read());
+ EXPECT_EQ(Foo::SecondaryStatus::FAILURE,
+ on_fire_view.secondary_status().Read());
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/int_sizes_test.cc b/back_end/cpp/testcode/int_sizes_test.cc
new file mode 100644
index 0000000..00a805b
--- /dev/null
+++ b/back_end/cpp/testcode/int_sizes_test.cc
@@ -0,0 +1,157 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for the generated View class for Sizes from int_sizes.emb.
+//
+// These tests check that signed integers of every byte width from 1 to 8 can
+// be read and written.
+#include <stdint.h>
+
+#include <vector>
+
+#include "testdata/int_sizes.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+alignas(8) static const uint8_t kIntSizes[36] = {
+ 0x02, // 0:1 one_byte == 2
+ 0xfc, 0xfe, // 1:3 two_byte == -260
+ 0x66, 0x55, 0x44, // 3:6 three_byte == 0x445566
+ 0xfa, 0xfa, 0xfb, 0xfc, // 6:10 four_byte == -0x03040506
+ 0x21, 0x43, 0x65, 0x87, // 10:14 five_byte
+ 0x29, // 14:15 five_byte == 0x2987654321
+ 0x44, 0x65, 0x87, 0xa9, // 15:19 six_byte
+ 0xcb, 0xed, // 19:21 six_byte == -0x123456789abc
+ 0x97, 0xa6, 0xb5, 0xc4, // 21:25 seven_byte
+ 0xd3, 0xe2, 0x71, // 25:28 seven_byte == 0x71e2d3c4b5a697
+ 0xfa, 0xfa, 0xfb, 0xfc, // 28:32 eight_byte
+ 0xfd, 0xfe, 0xff, 0x80, // 32:36 eight_byte == -0x7f00010203040506
+};
+
+TEST(SizesView, CanReadSizes) {
+ auto view =
+ MakeAlignedSizesView<const uint8_t, 8>(kIntSizes, sizeof kIntSizes);
+ EXPECT_EQ(2, view.one_byte().Read());
+ EXPECT_EQ(-260, view.two_byte().Read());
+ EXPECT_EQ(0x445566, view.three_byte().Read());
+ EXPECT_EQ(-0x03040506, view.four_byte().Read());
+ EXPECT_EQ(0x2987654321, view.five_byte().Read());
+ EXPECT_EQ(-0x123456789abc, view.six_byte().Read());
+ EXPECT_EQ(0x71e2d3c4b5a697, view.seven_byte().Read());
+ EXPECT_EQ(-0x7f00010203040506, view.eight_byte().Read());
+ // Test that the views return appropriate integer widths.
+ EXPECT_EQ(1, sizeof(view.one_byte().Read()));
+ EXPECT_EQ(2, sizeof(view.two_byte().Read()));
+ EXPECT_EQ(4, sizeof(view.three_byte().Read()));
+ EXPECT_EQ(4, sizeof(view.four_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.five_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.six_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.seven_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.eight_byte().Read()));
+}
+
+TEST(SizesWriter, CanWriteSizes) {
+ uint8_t buffer[sizeof kIntSizes];
+ auto writer = SizesWriter(buffer, sizeof buffer);
+ writer.one_byte().Write(2);
+ writer.two_byte().Write(-260);
+ writer.three_byte().Write(0x445566);
+ writer.four_byte().Write(-0x03040506);
+ writer.five_byte().Write(0x2987654321);
+ writer.six_byte().Write(-0x123456789abc);
+ writer.seven_byte().Write(0x71e2d3c4b5a697);
+ writer.eight_byte().Write(-0x7f00010203040506);
+ EXPECT_EQ(std::vector<uint8_t>(kIntSizes, kIntSizes + sizeof kIntSizes),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+alignas(8) static const uint8_t kIntSizesNegativeOnes[36] = {
+ 0xff, // 0:1 one_byte == -1
+ 0xff, 0xff, // 1:3 two_byte == -1
+ 0xff, 0xff, 0xff, // 3:6 three_byte == -1
+ 0xff, 0xff, 0xff, 0xff, // 6:10 four_byte == -1
+ 0xff, 0xff, 0xff, 0xff, // 10:14 five_byte
+ 0xff, // 14:15 five_byte == -1
+ 0xff, 0xff, 0xff, 0xff, // 15:19 six_byte
+ 0xff, 0xff, // 19:21 six_byte == -1
+ 0xff, 0xff, 0xff, 0xff, // 21:25 seven_byte
+ 0xff, 0xff, 0xff, // 25:28 seven_byte == -1
+ 0xff, 0xff, 0xff, 0xff, // 28:32 eight_byte
+ 0xff, 0xff, 0xff, 0xff, // 32:36 eight_byte == -1
+};
+
+TEST(SizesView, CanReadNegativeOne) {
+ auto view = MakeAlignedSizesView<const uint8_t, 8>(
+ kIntSizesNegativeOnes, sizeof kIntSizesNegativeOnes);
+ EXPECT_EQ(-1, view.one_byte().Read());
+ EXPECT_EQ(-1, view.two_byte().Read());
+ EXPECT_EQ(-1, view.three_byte().Read());
+ EXPECT_EQ(-1, view.four_byte().Read());
+ EXPECT_EQ(-1, view.five_byte().Read());
+ EXPECT_EQ(-1, view.six_byte().Read());
+ EXPECT_EQ(-1, view.seven_byte().Read());
+ EXPECT_EQ(-1, view.eight_byte().Read());
+}
+
+TEST(SizesView, CanWriteNegativeOne) {
+ uint8_t buffer[sizeof kIntSizesNegativeOnes];
+ auto writer = SizesWriter(buffer, sizeof buffer);
+ writer.one_byte().Write(-1);
+ writer.two_byte().Write(-1);
+ writer.three_byte().Write(-1);
+ writer.four_byte().Write(-1);
+ writer.five_byte().Write(-1);
+ writer.six_byte().Write(-1);
+ writer.seven_byte().Write(-1);
+ writer.eight_byte().Write(-1);
+ EXPECT_EQ(std::vector<uint8_t>(
+ kIntSizesNegativeOnes,
+ kIntSizesNegativeOnes + sizeof kIntSizesNegativeOnes),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+TEST(SizesView, CopyFrom) {
+ std::array<uint8_t, sizeof kIntSizesNegativeOnes> buf_x = {};
+ std::array<uint8_t, sizeof kIntSizesNegativeOnes> buf_y = {};
+
+ auto x = SizesWriter(&buf_x);
+ auto y = SizesWriter(&buf_y);
+
+ constexpr int kValue = -1;
+ x.one_byte().Write(kValue);
+ EXPECT_NE(x.one_byte().Read(), y.one_byte().Read());
+ y.one_byte().CopyFrom(x.one_byte());
+ EXPECT_EQ(x.one_byte().Read(), y.one_byte().Read());
+}
+
+TEST(SizesView, TryToCopyFrom) {
+ std::array<uint8_t, sizeof kIntSizesNegativeOnes> buf_x = {};
+ std::array<uint8_t, sizeof kIntSizesNegativeOnes> buf_y = {};
+
+ auto x = SizesWriter(&buf_x);
+ auto y = SizesWriter(&buf_y);
+
+ constexpr int kValue = -1;
+ x.one_byte().Write(kValue);
+ EXPECT_NE(x.one_byte().Read(), y.one_byte().Read());
+ EXPECT_TRUE(y.one_byte().TryToCopyFrom(x.one_byte()));
+ EXPECT_EQ(x.one_byte().Read(), y.one_byte().Read());
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/namespace_test.cc b/back_end/cpp/testcode/namespace_test.cc
new file mode 100644
index 0000000..7f01590
--- /dev/null
+++ b/back_end/cpp/testcode/namespace_test.cc
@@ -0,0 +1,40 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests that generated code ends up in the correct C++ namespaces.
+#include <stdint.h>
+
+#include <vector>
+
+#include "testdata/absolute_cpp_namespace.emb.h"
+#include "testdata/cpp_namespace.emb.h"
+#include "testdata/no_cpp_namespace.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+TEST(Namespace, FooValueHasCorrectValueInDifferentNamespaces) {
+ EXPECT_EQ(static_cast</**/ ::emboss_generated_code::Foo>(10),
+ ::emboss_generated_code::Foo::VALUE);
+ EXPECT_EQ(static_cast</**/ ::emboss::test::no_leading_double_colon::Foo>(11),
+ ::emboss::test::no_leading_double_colon::Foo::VALUE);
+ EXPECT_EQ(static_cast</**/ ::emboss::test::leading_double_colon::Foo>(12),
+ ::emboss::test::leading_double_colon::Foo::VALUE);
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/nested_structure_test.cc b/back_end/cpp/testcode/nested_structure_test.cc
new file mode 100644
index 0000000..3d99def
--- /dev/null
+++ b/back_end/cpp/testcode/nested_structure_test.cc
@@ -0,0 +1,195 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for the generated View class for Container and Box from
+// nested_structure.emb.
+//
+// These tests check that nested structures work.
+#include <stdint.h>
+
+#include <vector>
+
+#include "testdata/nested_structure.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+alignas(8) static const uint8_t kContainer[20] = {
+ 0x28, 0x00, 0x00, 0x00, // 0:4 weight == 40
+ 0x78, 0x56, 0x34, 0x12, // 4:8 important_box.id == 0x12345678
+ 0x03, 0x02, 0x01, 0x00, // 8:12 important_box.count == 0x010203
+ 0x21, 0x43, 0x65, 0x87, // 12:16 other_box.id == 0x87654321
+ 0xcc, 0xbb, 0xaa, 0x00, // 16:20 other_box.count == 0xaabbcc
+};
+
+// ContainerView::SizeInBytes() returns the expected value.
+TEST(ContainerView, StaticSizeIsCorrect) {
+ EXPECT_EQ(20, ContainerView::SizeInBytes());
+}
+
+// ContainerView::weight() returns the expected value.
+TEST(ContainerView, SizeFieldIsCorrect) {
+ auto view =
+ MakeAlignedContainerView<const uint8_t, 8>(kContainer, sizeof kContainer);
+ EXPECT_EQ(40, view.weight().Read());
+}
+
+// ContainerView::important_box() returns a BoxView, and not a different
+// template instantiation.
+TEST(ContainerView, FieldTypesAreExpected) {
+ auto container =
+ MakeAlignedContainerView<const uint8_t, 8>(kContainer, sizeof kContainer);
+ auto box = container.important_box();
+ EXPECT_EQ(0x12345678, box.id().Read());
+}
+
+// Box::SizeInBytes() returns the expected value, when retrieved from a
+// Container.
+TEST(ContainerView, BoxSizeFieldIsCorrect) {
+ auto view =
+ MakeAlignedContainerView<const uint8_t, 8>(kContainer, sizeof kContainer);
+ EXPECT_EQ(8, view.important_box().SizeInBytes());
+}
+
+// Box::id() and Box::count() return correct values when retrieved from
+// Container.
+TEST(ContainerView, BoxFieldValuesAreCorrect) {
+ auto view =
+ MakeAlignedContainerView<const uint8_t, 8>(kContainer, sizeof kContainer);
+ EXPECT_EQ(0x12345678, view.important_box().id().Read());
+ EXPECT_EQ(0x010203, view.important_box().count().Read());
+ EXPECT_EQ(0x87654321, view.other_box().id().Read());
+ EXPECT_EQ(0xaabbcc, view.other_box().count().Read());
+}
+
+TEST(ContainerView, CanWriteValues) {
+ alignas(8) uint8_t buffer[sizeof kContainer];
+ auto writer = MakeAlignedContainerView<uint8_t, 8>(buffer, sizeof buffer);
+ writer.weight().Write(40);
+ writer.important_box().id().Write(0x12345678);
+ writer.important_box().count().Write(0x010203);
+ writer.other_box().id().Write(0x87654321);
+ writer.other_box().count().Write(0xaabbcc);
+ EXPECT_EQ(std::vector<uint8_t>(kContainer, kContainer + sizeof kContainer),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+TEST(ContainerView, CanReadTextFormat) {
+ alignas(8) uint8_t buffer[sizeof kContainer];
+ auto writer = MakeAlignedContainerView<uint8_t, 8>(buffer, sizeof buffer);
+ EXPECT_TRUE(::emboss::UpdateFromText(writer, R"(
+ {
+ weight: 40
+ important_box: {
+ id: 0x12345678
+ count: 0x010203
+ }
+ other_box: {
+ id: 0x87654321
+ count: 0xaabbcc
+ }
+ }
+ )"));
+ EXPECT_EQ(std::vector<uint8_t>(kContainer, kContainer + sizeof kContainer),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+alignas(8) static const uint8_t kTruck[44] = {
+ 0x88, 0x66, 0x44, 0x22, // 0:4 id == 0x22446688
+ 0x64, 0x00, 0x00, 0x00, // 4:8 cargo[0].weight == 100
+ 0xff, 0x00, 0x00, 0x00, // 8:12 cargo[0].important_box.id == 255
+ 0x0a, 0x00, 0x00, 0x00, // 12:16 cargo[0].important_box.count == 10
+ 0x00, 0x94, 0x35, 0x77, // 16:20 cargo[0].other_box.id == 2000000000
+ 0xf4, 0x01, 0x00, 0x00, // 20:24 cargo[0].other_box.count == 500
+ 0x65, 0x00, 0x00, 0x00, // 24:28 cargo[1].weight == 101
+ 0xfe, 0x00, 0x00, 0x00, // 28:32 cargo[1].important_box.id == 254
+ 0x09, 0x00, 0x00, 0x00, // 32:36 cargo[1].important_box.count == 9
+ 0x01, 0x94, 0x35, 0x77, // 36:40 cargo[1].other_box.id == 2000000001
+ 0xf3, 0x01, 0x00, 0x00, // 40:44 cargo[1].other_box.count == 499
+};
+
+TEST(TruckView, ValuesAreCorrect) {
+ auto view = MakeAlignedTruckView<const uint8_t, 8>(kTruck, sizeof kTruck);
+ EXPECT_EQ(0x22446688, view.id().Read());
+ EXPECT_EQ(100, view.cargo()[0].weight().Read());
+ EXPECT_EQ(255, view.cargo()[0].important_box().id().Read());
+ EXPECT_EQ(10, view.cargo()[0].important_box().count().Read());
+ EXPECT_EQ(2000000000, view.cargo()[0].other_box().id().Read());
+ EXPECT_EQ(500, view.cargo()[0].other_box().count().Read());
+ EXPECT_EQ(101, view.cargo()[1].weight().Read());
+ EXPECT_EQ(254, view.cargo()[1].important_box().id().Read());
+ EXPECT_EQ(9, view.cargo()[1].important_box().count().Read());
+ EXPECT_EQ(2000000001, view.cargo()[1].other_box().id().Read());
+ EXPECT_EQ(499, view.cargo()[1].other_box().count().Read());
+}
+
+TEST(TruckView, WriteValues) {
+ uint8_t buffer[sizeof kTruck];
+ auto writer = TruckWriter(buffer, sizeof buffer);
+ writer.id().Write(0x22446688);
+ writer.cargo()[0].weight().Write(100);
+ writer.cargo()[0].important_box().id().Write(255);
+ writer.cargo()[0].important_box().count().Write(10);
+ writer.cargo()[0].other_box().id().Write(2000000000);
+ writer.cargo()[0].other_box().count().Write(500);
+ writer.cargo()[1].weight().Write(101);
+ writer.cargo()[1].important_box().id().Write(254);
+ writer.cargo()[1].important_box().count().Write(9);
+ writer.cargo()[1].other_box().id().Write(2000000001);
+ writer.cargo()[1].other_box().count().Write(499);
+ EXPECT_EQ(std::vector<uint8_t>(kTruck, kTruck + sizeof kTruck),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+TEST(TruckView, CanReadTextFormat) {
+ uint8_t buffer[sizeof kTruck];
+ auto writer = TruckWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(::emboss::UpdateFromText(writer, R"(
+ {
+ id: 0x22446688
+ cargo: {
+ {
+ weight: 100
+ important_box: {
+ id: 255
+ count: 10
+ }
+ other_box: {
+ id: 2_000_000_000
+ count: 500
+ }
+ },
+ {
+ weight: 101
+ important_box: {
+ id: 254
+ count: 9
+ }
+ other_box: {
+ id: 2_000_000_001
+ count: 499
+ }
+ },
+ }
+ }
+ )"));
+ EXPECT_EQ(std::vector<uint8_t>(kTruck, kTruck + sizeof kTruck),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/ok_fuzzer.cc b/back_end/cpp/testcode/ok_fuzzer.cc
new file mode 100644
index 0000000..7bf8689
--- /dev/null
+++ b/back_end/cpp/testcode/ok_fuzzer.cc
@@ -0,0 +1,29 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Fuzz test for `Complex::Ok()` from `complex_structure.emb` on arbitrary data.
+//
+// This fuzz target verifies that `Ok()` does not crash on any input. It does
+// not verify that `Ok()` does the right thing.
+
+#include "testdata/complex_structure.emb.h"
+
+// Entry point for fuzz tester: this must have this exact signature, including
+// the name `LLVMFuzzerTestOneInput`, or it will not work.
+extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
+ auto storage = ::std::basic_string<uint8_t>(data, size);
+ const auto view = ::emboss_test::MakeComplexView(&storage);
+ (void)view.Ok();
+ return 0;
+}
diff --git a/back_end/cpp/testcode/parameters_test.cc b/back_end/cpp/testcode/parameters_test.cc
new file mode 100644
index 0000000..5392743
--- /dev/null
+++ b/back_end/cpp/testcode/parameters_test.cc
@@ -0,0 +1,118 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests of generated code for parameterized structures.
+#include <stdint.h>
+
+#include <type_traits>
+#include <utility>
+#include <vector>
+
+#include "testdata/parameters.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss_test {
+namespace {
+
+TEST(Axes, Construction) {
+ ::std::array<char, 12> values = {1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0};
+ auto view = MakeAxesView(2, &values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_EQ(2, view.values().ElementCount());
+ EXPECT_EQ(1, view.values()[0].value().Read());
+ EXPECT_EQ(1, view.x().x().Read());
+ EXPECT_EQ(2, view.values()[1].value().Read());
+ EXPECT_EQ(2, view.y().y().Read());
+ EXPECT_FALSE(view.has_z().Value());
+
+ view = MakeAxesView(3, &values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_EQ(3, view.values().ElementCount());
+ EXPECT_EQ(1, view.values()[0].value().Read());
+ EXPECT_EQ(2, view.values()[1].value().Read());
+ EXPECT_EQ(3, view.values()[2].value().Read());
+ EXPECT_EQ(3, view.z().z().Read());
+
+ view = MakeAxesView(4, &values);
+ EXPECT_FALSE(view.Ok());
+}
+
+TEST(Axes, VirtualUsingParameter) {
+ ::std::array<char, 12> values = {1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0};
+ auto view = MakeAxesView(2, &values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_EQ(3, view.axis_count_plus_one().Read());
+}
+
+TEST(AxesEnvelope, FieldPassedAsParameter) {
+ ::std::array<unsigned char, 9> values = {2, 0, 0, 0, 0x80, 0, 100, 0, 0};
+ auto view = MakeAxesEnvelopeView(&values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_EQ(0x80000000U, view.axes().x().value().Read());
+ EXPECT_EQ(9, view.SizeInBytes());
+}
+
+TEST(AxesEnvelope, ParameterValueIsOutOfRange) {
+ ::std::array<unsigned char, 9> values = {16, 0, 0, 0, 0x80, 0, 100, 0, 0};
+ auto view = MakeAxesEnvelopeView(&values);
+ EXPECT_FALSE(view.Ok());
+ EXPECT_FALSE(view.axes().Ok());
+}
+
+TEST(Multiversion, ParameterPassedDown) {
+ ::std::array<char, 13> values = {0, 1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0};
+ auto view = MakeMultiversionView(Product::VERSION_1, &values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_EQ(2, view.axes().y().y().Read());
+ EXPECT_FALSE(view.axes().has_z().Value());
+ view = MakeMultiversionView(Product::VERSION_X, &values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_EQ(2, view.axes().y().y().Read());
+ EXPECT_TRUE(view.axes().has_z().Value());
+}
+
+TEST(Multiversion, ParameterUsedToSwitchField) {
+ ::std::array<unsigned char, 9> values = {1, 0, 0, 0, 0x80, 0, 100, 0, 0};
+ auto view = MakeMultiversionView(Product::VERSION_1, &values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_TRUE(view.config().power().Read());
+ EXPECT_FALSE(view.has_config_vx().Value());
+ EXPECT_EQ(5, view.SizeInBytes());
+ view = MakeMultiversionView(Product::VERSION_X, &values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_TRUE(view.config().power().Read());
+ EXPECT_TRUE(view.has_config_vx().Value());
+ EXPECT_EQ(25600, view.config_vx().gain().Read());
+ EXPECT_EQ(9, view.SizeInBytes());
+}
+
+TEST(StructContainingStructWithUnusedParameter, NoParameterIsNotOk) {
+ ::std::array<char, 1> bytes = {1};
+ auto view = MakeStructContainingStructWithUnusedParameterView(&bytes);
+ EXPECT_FALSE(view.Ok());
+ EXPECT_FALSE(view.swup().Ok());
+ // In theory, view.swup().y().Ok() could be true, but as of time of writing,
+ // missing/invalid parameters cause the parent structure to withhold backing
+ // storage, making the entire child struct not Ok().
+}
+
+TEST(SizedArrayOfBiasedValues, ArrayElementsAreAccessible) {
+ ::std::array<char, 3> bytes = {1, 10, 100};
+ auto view = MakeSizedArrayOfBiasedValuesView(&bytes);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_EQ(110, view.values()[0].value().Read());
+}
+
+} // namespace
+} // namespace emboss_test
diff --git a/back_end/cpp/testcode/read_log_file_status_test.cc b/back_end/cpp/testcode/read_log_file_status_test.cc
new file mode 100644
index 0000000..e2070ab
--- /dev/null
+++ b/back_end/cpp/testcode/read_log_file_status_test.cc
@@ -0,0 +1,96 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for the generated View class for a LogFileStatus from
+// span_se_log_file_status.emb.
+#include <stdint.h>
+
+#include <vector>
+
+#include "testdata/golden/span_se_log_file_status.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+// A simple, static LogFileStatus. There are technically no invalid
+// LogFileStatuses as long as there are at least 24 bytes to read.
+static const ::std::uint8_t kLogFileStatus[24] = {
+ 0x01, 0x02, 0x03, 0x04, // 0:4 UInt file_state
+ 'A', 'B', 'C', 'D', // 4:16 UInt:8[12] file_name
+ 'E', 'F', 'G', 'H', // 4:16 UInt:8[12] file_name
+ 'I', 'J', 'K', 'L', // 4:16 UInt:8[12] file_name
+ 0x05, 0x06, 0x07, 0x08, // 16:20 UInt file_size_kb
+ 0x09, 0x0a, 0x0b, 0x0c, // 20:24 UInt media
+};
+
+// LogFileStatusView constructor compiles and runs without crashing.
+TEST(LogFileStatusView, ConstructorRuns) {
+ LogFileStatusView(kLogFileStatus, sizeof kLogFileStatus);
+}
+
+// LogFileStatusView::SizeInBytes() returns the expected value.
+TEST(LogFileStatusView, SizeIsCorrect) {
+ EXPECT_EQ(24, LogFileStatusView::SizeInBytes());
+}
+
+// LogFileStatusView's atomic field accessors work.
+TEST(LogFileStatusView, AtomicFieldAccessorsWork) {
+ auto view = LogFileStatusView(kLogFileStatus, sizeof kLogFileStatus);
+ EXPECT_EQ(0x04030201, view.file_state().Read());
+ EXPECT_EQ(0x08070605, view.file_size_kb().Read());
+ EXPECT_EQ(0x0c0b0a09, view.media().Read());
+}
+
+// LogFileStatusView's array field accessor works.
+TEST(LogFileStatusView, ArrayFieldAccessor) {
+ auto view = LogFileStatusView(kLogFileStatus, sizeof kLogFileStatus);
+ EXPECT_EQ('A', view.file_name()[0].Read());
+ EXPECT_EQ('L', view.file_name()[11].Read());
+}
+
+// The "Ok()" method works.
+TEST(LogFileStatusView, Ok) {
+ auto view = LogFileStatusView(kLogFileStatus, sizeof kLogFileStatus);
+ EXPECT_TRUE(view.Ok());
+ view = LogFileStatusView(kLogFileStatus, sizeof kLogFileStatus - 1);
+ EXPECT_FALSE(view.Ok());
+ std::vector</**/ ::std::uint8_t> bigger_than_necessary(sizeof kLogFileStatus +
+ 1);
+ view = LogFileStatusView(&bigger_than_necessary[0],
+ bigger_than_necessary.size());
+ EXPECT_TRUE(view.Ok());
+}
+
+TEST(LogFileStatusView, Writing) {
+ ::std::uint8_t buffer[sizeof kLogFileStatus] = {0};
+ auto writer = LogFileStatusWriter(buffer, sizeof buffer);
+ writer.file_state().Write(0x04030201);
+ writer.file_size_kb().Write(0x08070605);
+ writer.media().Write(0x0c0b0a09);
+ // TODO(bolms): Add a Count method that returns the element count instead of
+ // the byte count. (Not a problem here, since file_name's elements are each
+ // one byte anyway.)
+ for (::std::size_t i = 0; i < writer.file_name().SizeInBytes(); ++i) {
+ writer.file_name()[i].Write('A' + i);
+ }
+ EXPECT_EQ(std::vector</**/ ::std::uint8_t>(
+ kLogFileStatus, kLogFileStatus + sizeof kLogFileStatus),
+ std::vector</**/ ::std::uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/requires_test.cc b/back_end/cpp/testcode/requires_test.cc
new file mode 100644
index 0000000..2ab3279
--- /dev/null
+++ b/back_end/cpp/testcode/requires_test.cc
@@ -0,0 +1,204 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for [requires] using requires.emb.
+#include <stdint.h>
+
+#include <array>
+
+#include "testdata/requires.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+TEST(RequiresIntegers, Ok) {
+ std::array<std::uint8_t, 3> buffer = {0, 0, 0};
+ auto view = MakeRequiresIntegersView(&buffer);
+ EXPECT_TRUE(view.zero_through_nine().Ok());
+ EXPECT_FALSE(view.ten_through_twenty().Ok());
+ EXPECT_TRUE(view.disjoint().Ok());
+ EXPECT_FALSE(view.ztn_plus_ttt().Ok());
+ EXPECT_TRUE(view.zero_through_nine_plus_five().Ok());
+ EXPECT_FALSE(view.alias_of_zero_through_nine().Ok());
+
+ buffer[1] = 10;
+ EXPECT_TRUE(view.zero_through_nine().Ok());
+ EXPECT_TRUE(view.ten_through_twenty().Ok());
+ EXPECT_TRUE(view.disjoint().Ok());
+ EXPECT_TRUE(view.ztn_plus_ttt().Ok());
+ EXPECT_TRUE(view.zero_through_nine_plus_five().Ok());
+ EXPECT_FALSE(view.alias_of_zero_through_nine().Ok());
+
+ buffer[0] = 2;
+ EXPECT_TRUE(view.zero_through_nine().Ok());
+ EXPECT_TRUE(view.ten_through_twenty().Ok());
+ EXPECT_TRUE(view.disjoint().Ok());
+ EXPECT_TRUE(view.ztn_plus_ttt().Ok());
+ EXPECT_TRUE(view.zero_through_nine_plus_five().Ok());
+ EXPECT_TRUE(view.alias_of_zero_through_nine().Ok());
+ EXPECT_FALSE(view.Ok());
+
+ buffer[1] = 12;
+ EXPECT_TRUE(view.zero_through_nine().Ok());
+ EXPECT_TRUE(view.ten_through_twenty().Ok());
+ EXPECT_TRUE(view.disjoint().Ok());
+ EXPECT_TRUE(view.ztn_plus_ttt().Ok());
+ EXPECT_TRUE(view.zero_through_nine_plus_five().Ok());
+ EXPECT_TRUE(view.alias_of_zero_through_nine().Ok());
+ EXPECT_TRUE(view.Ok());
+}
+
+TEST(RequiresIntegers, CouldWriteValue) {
+ std::array<std::uint8_t, 3> buffer = {0, 0, 0};
+ auto view = MakeRequiresIntegersView(&buffer);
+
+ EXPECT_TRUE(view.zero_through_nine().CouldWriteValue(0));
+ EXPECT_TRUE(view.zero_through_nine().CouldWriteValue(9));
+ EXPECT_FALSE(view.zero_through_nine().CouldWriteValue(-1));
+ EXPECT_FALSE(view.zero_through_nine().CouldWriteValue(10));
+
+ EXPECT_TRUE(view.ten_through_twenty().CouldWriteValue(10));
+ EXPECT_TRUE(view.ten_through_twenty().CouldWriteValue(20));
+ EXPECT_FALSE(view.ten_through_twenty().CouldWriteValue(-1));
+ EXPECT_FALSE(view.ten_through_twenty().CouldWriteValue(9));
+ EXPECT_FALSE(view.ten_through_twenty().CouldWriteValue(21));
+
+ EXPECT_TRUE(view.disjoint().CouldWriteValue(0));
+ EXPECT_TRUE(view.disjoint().CouldWriteValue(5));
+ EXPECT_TRUE(view.disjoint().CouldWriteValue(15));
+ EXPECT_TRUE(view.disjoint().CouldWriteValue(20));
+ EXPECT_FALSE(view.disjoint().CouldWriteValue(-1));
+ EXPECT_FALSE(view.disjoint().CouldWriteValue(6));
+ EXPECT_FALSE(view.disjoint().CouldWriteValue(14));
+ EXPECT_FALSE(view.disjoint().CouldWriteValue(21));
+}
+
+TEST(RequiresBools, Ok) {
+ std::array<std::uint8_t, 1> buffer = {0};
+ auto view = MakeRequiresBoolsView(&buffer);
+ EXPECT_TRUE(view.must_be_false().Ok());
+ EXPECT_FALSE(view.must_be_true().Ok());
+ EXPECT_TRUE(view.b_must_be_false().Ok());
+ EXPECT_FALSE(view.alias_of_a_must_be_true().Ok());
+ EXPECT_FALSE(view.Ok());
+
+ view.a().Write(true);
+ view.must_be_true().Write(true);
+ EXPECT_TRUE(view.must_be_false().Ok());
+ EXPECT_TRUE(view.must_be_true().Ok());
+ EXPECT_TRUE(view.b_must_be_false().Ok());
+ EXPECT_TRUE(view.alias_of_a_must_be_true().Ok());
+ EXPECT_TRUE(view.Ok());
+}
+
+TEST(RequiresBools, CouldWriteValue) {
+ std::array<std::uint8_t, 1> buffer = {0};
+ auto view = MakeRequiresBoolsView(&buffer);
+
+ EXPECT_TRUE(view.a().CouldWriteValue(true));
+ EXPECT_TRUE(view.a().CouldWriteValue(false));
+ EXPECT_TRUE(view.b().CouldWriteValue(true));
+ EXPECT_TRUE(view.b().CouldWriteValue(false));
+ EXPECT_TRUE(view.must_be_true().CouldWriteValue(true));
+ EXPECT_FALSE(view.must_be_true().CouldWriteValue(false));
+ EXPECT_FALSE(view.must_be_false().CouldWriteValue(true));
+ EXPECT_TRUE(view.must_be_false().CouldWriteValue(false));
+ EXPECT_TRUE(view.alias_of_a_must_be_true().CouldWriteValue(true));
+ EXPECT_FALSE(view.alias_of_a_must_be_true().CouldWriteValue(false));
+ EXPECT_DEATH(view.alias_of_a_must_be_true().Write(false), "");
+}
+
+TEST(RequiresEnums, Ok) {
+ std::array<std::uint8_t, 3> buffer = {0, 0, 0};
+ auto view = MakeRequiresEnumsView(&buffer);
+ EXPECT_TRUE(view.a().Ok());
+ EXPECT_TRUE(view.b().Ok());
+ EXPECT_TRUE(view.c().Ok());
+ EXPECT_TRUE(view.filtered_a().Ok());
+ EXPECT_FALSE(view.alias_of_a().Ok());
+ EXPECT_FALSE(view.Ok());
+
+ view.a().Write(RequiresEnums::Enum::EN1);
+ EXPECT_TRUE(view.a().Ok());
+ EXPECT_TRUE(view.b().Ok());
+ EXPECT_TRUE(view.c().Ok());
+ EXPECT_TRUE(view.filtered_a().Ok());
+ EXPECT_TRUE(view.alias_of_a().Ok());
+ EXPECT_TRUE(view.Ok());
+
+ view.b().Write(RequiresEnums::Enum::EN1);
+ EXPECT_TRUE(view.a().Ok());
+ EXPECT_TRUE(view.b().Ok());
+ EXPECT_TRUE(view.c().Ok());
+ EXPECT_TRUE(view.filtered_a().Ok());
+ EXPECT_TRUE(view.alias_of_a().Ok());
+ EXPECT_FALSE(view.Ok());
+
+ buffer[2] = 2;
+ EXPECT_FALSE(view.c().Ok());
+}
+
+TEST(RequiresEnums, CouldWriteValue) {
+ std::array<std::uint8_t, 3> buffer = {0, 0, 0};
+ auto view = MakeRequiresEnumsView(&buffer);
+
+ EXPECT_TRUE(view.a().CouldWriteValue(RequiresEnums::Enum::EN0));
+ EXPECT_TRUE(view.a().CouldWriteValue(RequiresEnums::Enum::EN3));
+ EXPECT_TRUE(view.b().CouldWriteValue(RequiresEnums::Enum::EN0));
+ EXPECT_TRUE(view.b().CouldWriteValue(RequiresEnums::Enum::EN3));
+ EXPECT_TRUE(view.c().CouldWriteValue(RequiresEnums::Enum::EN0));
+ EXPECT_TRUE(view.c().CouldWriteValue(RequiresEnums::Enum::EN1));
+ EXPECT_FALSE(view.c().CouldWriteValue(RequiresEnums::Enum::EN2));
+ EXPECT_FALSE(view.c().CouldWriteValue(RequiresEnums::Enum::EN3));
+ EXPECT_FALSE(view.alias_of_a().CouldWriteValue(RequiresEnums::Enum::EN0));
+ EXPECT_TRUE(view.alias_of_a().CouldWriteValue(RequiresEnums::Enum::EN1));
+ EXPECT_FALSE(view.alias_of_a().CouldWriteValue(RequiresEnums::Enum::EN2));
+ EXPECT_FALSE(view.alias_of_a().CouldWriteValue(RequiresEnums::Enum::EN3));
+}
+
+TEST(RequiresWithOptionalFields, Ok) {
+ std::array<std::uint8_t, 1> buffer = {0};
+ auto view = MakeRequiresWithOptionalFieldsView(&buffer);
+ EXPECT_FALSE(view.Ok());
+ view.a().Write(true);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_FALSE(view.has_b().Value());
+ view.b_exists().Write(true);
+ EXPECT_TRUE(view.b().Ok());
+ EXPECT_FALSE(view.b_true().Ok());
+ view.b_true().Write(true);
+ EXPECT_TRUE(view.b_true().Ok());
+ EXPECT_TRUE(view.Ok());
+ view.a().Write(false);
+ EXPECT_TRUE(view.Ok());
+}
+
+TEST(RequiresWithOptionalFields, CouldWriteValue) {
+ std::array<std::uint8_t, 1> buffer = {0};
+ auto view = MakeRequiresWithOptionalFieldsView(&buffer);
+ view.b_exists().Write(true);
+
+ EXPECT_TRUE(view.a().CouldWriteValue(true));
+ EXPECT_TRUE(view.a().CouldWriteValue(false));
+ EXPECT_TRUE(view.b().CouldWriteValue(true));
+ EXPECT_TRUE(view.b().CouldWriteValue(false));
+ EXPECT_TRUE(view.b_true().CouldWriteValue(true));
+ EXPECT_FALSE(view.b_true().CouldWriteValue(false));
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/start_size_range_test.cc b/back_end/cpp/testcode/start_size_range_test.cc
new file mode 100644
index 0000000..0071c9f
--- /dev/null
+++ b/back_end/cpp/testcode/start_size_range_test.cc
@@ -0,0 +1,45 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Test for the generated View class for StartSize from start_size_range.emb.
+
+#include <stdint.h>
+
+#include "testdata/start_size_range.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+static const uint8_t kStartSizeRange[9] = {
+ 0x02, // 0:1 0:1 size == 2
+ 0xe8, 0x03, // 1:3 1 [+2] start_size_constants == 1000
+ 0x11, 0x22, // 3:5 3 [+s] payload
+ 0x21, 0x43, 0x65, 0x87, // 5:9 3+s [+4] counter == 0x87654321
+};
+
+TEST(StartSizeView, EverythingInPlace) {
+ auto view = StartSizeView(kStartSizeRange, sizeof kStartSizeRange);
+ EXPECT_EQ(9, view.SizeInBytes());
+ EXPECT_EQ(2, view.size().Read());
+ EXPECT_EQ(1000, view.start_size_constants().Read());
+ EXPECT_EQ(0x11, view.payload()[0].Read());
+ EXPECT_EQ(0x22, view.payload()[1].Read());
+ EXPECT_EQ(0x87654321, view.counter().Read());
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/subtypes_test.cc b/back_end/cpp/testcode/subtypes_test.cc
new file mode 100644
index 0000000..e19c1cb
--- /dev/null
+++ b/back_end/cpp/testcode/subtypes_test.cc
@@ -0,0 +1,59 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for the generated View class from subtypes.emb.
+//
+// These tests check that nested types work.
+#include <stdint.h>
+
+#include <vector>
+
+#include "testdata/subtypes.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+TEST(SubtypesTest, InnerEnumNames) {
+ EXPECT_EQ(0, static_cast<int>(Out::In::InIn::InInIn::NO));
+ EXPECT_EQ(0, static_cast<int>(Out::In::InInView::InInIn::NO));
+}
+
+TEST(SubtypesTest, OuterStructure) {
+ uint8_t buffer[OutWriter::SizeInBytes()] = {0};
+ auto view = OutWriter(buffer, sizeof buffer);
+ buffer[1] = 0xcc;
+ EXPECT_EQ(0xcc, view.in_1().in_in_1().in_2().field_byte().Read());
+ view.in_1().in_in_1().in_2().field_byte().Write(0x88);
+ EXPECT_EQ(0x88, buffer[1]);
+ buffer[static_cast<int>(Out::In::InIn::outer_offset())] = 0xff;
+ EXPECT_EQ(0xff, view.nested_constant_check().Read());
+ view.nested_constant_check().Write(0x77);
+ EXPECT_EQ(0x77, buffer[24]);
+
+ buffer[6] = 0x7;
+ buffer[7] = 0x8;
+ buffer[14] = 0x6;
+ buffer[22] = 0xee;
+ EXPECT_EQ(0xee, view.name_collision().Read());
+ EXPECT_EQ(0x7, view.in_1().name_collision().Read());
+ EXPECT_EQ(0x8, view.in_1().name_collision_check().Read());
+ EXPECT_EQ(0x6, view.in_2().name_collision().Read());
+ EXPECT_EQ(0x6, view.in_2().name_collision().Read());
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/text_format_fuzzer.cc b/back_end/cpp/testcode/text_format_fuzzer.cc
new file mode 100644
index 0000000..42f1b4d
--- /dev/null
+++ b/back_end/cpp/testcode/text_format_fuzzer.cc
@@ -0,0 +1,31 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Fuzz test for reading Emboss text format of the `Complex` struct in
+// `complex_structure.emb`.
+//
+// This fuzz target verifies that ::emboss::UpdateFromText does not crash on any
+// input. It does not verify that UpdateFromText does the right thing.
+
+#include "testdata/complex_structure.emb.h"
+
+// Entry point for fuzz tester: this must have this exact signature, including
+// the name `LLVMFuzzerTestOneInput`, or it will not work.
+extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
+ ::std::array<char, 64> values = {0};
+ const auto view = ::emboss_test::MakeComplexView(&values);
+ ::emboss::UpdateFromText(
+ view, ::std::string(reinterpret_cast<const char *>(data), size));
+ return 0;
+}
diff --git a/back_end/cpp/testcode/text_format_test.cc b/back_end/cpp/testcode/text_format_test.cc
new file mode 100644
index 0000000..8e669b7
--- /dev/null
+++ b/back_end/cpp/testcode/text_format_test.cc
@@ -0,0 +1,81 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests of generated code for text format.
+#include <stdint.h>
+
+#include <array>
+#include <type_traits>
+#include <utility>
+
+#include "testdata/text_format.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+TEST(TextFormat, VanillaOutput) {
+ std::array<char, 2> values = {1, 2};
+ const auto view = MakeVanillaView(&values);
+ EXPECT_EQ("{ a: 1, b: 2 }", ::emboss::WriteToString(view));
+ EXPECT_EQ(
+ "{\n"
+ " a: 1 # 0x1\n"
+ " b: 2 # 0x2\n"
+ "}",
+ ::emboss::WriteToString(view, ::emboss::MultilineText()));
+}
+
+TEST(TextFormat, SkippedFieldOutput) {
+ std::array<char, 3> values = {1, 2, 3};
+ const auto view = MakeStructWithSkippedFieldsView(&values);
+ EXPECT_EQ("{ a: 1, c: 3 }", ::emboss::WriteToString(view));
+ EXPECT_EQ(
+ "{\n"
+ " a: 1 # 0x1\n"
+ " c: 3 # 0x3\n"
+ "}",
+ ::emboss::WriteToString(view, ::emboss::MultilineText()));
+}
+
+TEST(TextFormat, SkippedStructureFieldOutput) {
+ std::array<char, 6> values = {1, 2, 3, 4, 5, 6};
+ const auto view = MakeStructWithSkippedStructureFieldsView(&values);
+ EXPECT_EQ("{ a: { a: 1, b: 2 }, c: { a: 5, b: 6 } }",
+ ::emboss::WriteToString(view));
+ EXPECT_EQ(
+ "{\n"
+ " a: {\n"
+ " a: 1 # 0x1\n"
+ " b: 2 # 0x2\n"
+ " }\n"
+ " c: {\n"
+ " a: 5 # 0x5\n"
+ " b: 6 # 0x6\n"
+ " }\n"
+ "}",
+ ::emboss::WriteToString(view, ::emboss::MultilineText()));
+ EXPECT_EQ("{ a: 3, b: 4 }", ::emboss::WriteToString(view.b()));
+ EXPECT_EQ(
+ "{\n"
+ " a: 3 # 0x3\n"
+ " b: 4 # 0x4\n"
+ "}",
+ ::emboss::WriteToString(view.b(), ::emboss::MultilineText()));
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/uint_sizes_test.cc b/back_end/cpp/testcode/uint_sizes_test.cc
new file mode 100644
index 0000000..faad3a0
--- /dev/null
+++ b/back_end/cpp/testcode/uint_sizes_test.cc
@@ -0,0 +1,417 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests for the generated View classes for Sizes, BigEndianSizes,
+// AlternatingEndianSizes, and EnumSizes from uint_sizes.emb.
+//
+// These tests check that unsigned integers and enums of each byte width work.
+#include <stdint.h>
+
+#include <vector>
+
+#include "testdata/uint_sizes.emb.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+alignas(8) static const uint8_t kUIntSizes[36] = {
+ 0x02, // 0:1 one_byte == 2
+ 0x04, 0x01, // 1:3 two_byte == 260
+ 0x66, 0x55, 0x44, // 3:6 three_byte == 0x445566
+ 0x06, 0x05, 0x04, 0x03, // 6:10 four_byte == 0x03040506
+ 0x21, 0x43, 0x65, 0x87, // 10:14 five_byte
+ 0xa9, // 14:15 five_byte == 0xa987654321
+ 0xbc, 0x9a, 0x78, 0x56, // 15:19 six_byte
+ 0x34, 0x12, // 19:21 six_byte == 0x123456789abc
+ 0x97, 0xa6, 0xb5, 0xc4, // 21:25 seven_byte
+ 0xd3, 0xe2, 0xf1, // 25:28 seven_byte == 0xf1e2d3c4b5a697
+ 0x06, 0x05, 0x04, 0x03, // 28:32 eight_byte
+ 0x02, 0x01, 0x00, 0xff, // 32:36 eight_byte == 0xff00010203040506
+};
+
+TEST(SizesView, CanReadSizes) {
+ auto view =
+ MakeAlignedSizesView<const uint8_t, 8>(kUIntSizes, sizeof kUIntSizes);
+ EXPECT_EQ(2, view.one_byte().Read());
+ EXPECT_EQ(260, view.two_byte().Read());
+ EXPECT_EQ(0x445566U, view.three_byte().Read());
+ EXPECT_EQ(0x03040506U, view.four_byte().Read());
+ EXPECT_EQ(0xa987654321, view.five_byte().Read());
+ EXPECT_EQ(0x123456789abc, view.six_byte().Read());
+ EXPECT_EQ(0xf1e2d3c4b5a697, view.seven_byte().Read());
+ EXPECT_EQ(0xff00010203040506UL, view.eight_byte().Read());
+ // Test that the views return appropriate integer widths.
+ EXPECT_EQ(1, sizeof(view.one_byte().Read()));
+ EXPECT_EQ(2, sizeof(view.two_byte().Read()));
+ EXPECT_EQ(4, sizeof(view.three_byte().Read()));
+ EXPECT_EQ(4, sizeof(view.four_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.five_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.six_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.seven_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.eight_byte().Read()));
+}
+
+TEST(SizesWriter, CanWriteSizes) {
+ uint8_t buffer[sizeof kUIntSizes];
+ auto writer = SizesWriter(buffer, sizeof buffer);
+ writer.one_byte().Write(2);
+ writer.two_byte().Write(260);
+ writer.three_byte().Write(0x445566U);
+ writer.four_byte().Write(0x03040506U);
+ writer.five_byte().Write(0xa987654321);
+ writer.six_byte().Write(0x123456789abc);
+ writer.seven_byte().Write(0xf1e2d3c4b5a697);
+ writer.eight_byte().Write(0xff00010203040506UL);
+ EXPECT_EQ(std::vector<uint8_t>(kUIntSizes, kUIntSizes + sizeof kUIntSizes),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+TEST(SizesView, CanReadSizesBigEndian) {
+ auto view = BigEndianSizesView(kUIntSizes, sizeof kUIntSizes);
+ EXPECT_EQ(2, view.one_byte().Read());
+ EXPECT_EQ(0x0401, view.two_byte().Read());
+ EXPECT_EQ(0x665544U, view.three_byte().Read());
+ EXPECT_EQ(0x06050403U, view.four_byte().Read());
+ EXPECT_EQ(0x21436587a9, view.five_byte().Read());
+ EXPECT_EQ(0xbc9a78563412, view.six_byte().Read());
+ EXPECT_EQ(0x97a6b5c4d3e2f1, view.seven_byte().Read());
+ EXPECT_EQ(0x06050403020100ffUL, view.eight_byte().Read());
+ // Test that the views return appropriate integer widths.
+ EXPECT_EQ(1, sizeof(view.one_byte().Read()));
+ EXPECT_EQ(2, sizeof(view.two_byte().Read()));
+ EXPECT_EQ(4, sizeof(view.three_byte().Read()));
+ EXPECT_EQ(4, sizeof(view.four_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.five_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.six_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.seven_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.eight_byte().Read()));
+}
+
+TEST(SizesWriter, CanWriteSizesBigEndian) {
+ uint8_t buffer[sizeof kUIntSizes];
+ auto writer = BigEndianSizesWriter(buffer, sizeof buffer);
+ writer.one_byte().Write(2);
+ writer.two_byte().Write(0x0401);
+ writer.three_byte().Write(0x665544U);
+ writer.four_byte().Write(0x06050403U);
+ writer.five_byte().Write(0x21436587a9);
+ writer.six_byte().Write(0xbc9a78563412);
+ writer.seven_byte().Write(0x97a6b5c4d3e2f1);
+ writer.eight_byte().Write(0x06050403020100ffUL);
+ EXPECT_EQ(std::vector<uint8_t>(kUIntSizes, kUIntSizes + sizeof kUIntSizes),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+TEST(SizesView, CanReadSizesAlternatingEndian) {
+ auto view = AlternatingEndianSizesView(kUIntSizes, sizeof kUIntSizes);
+ EXPECT_EQ(2, view.one_byte().Read());
+ EXPECT_EQ(0x0104, view.two_byte().Read());
+ EXPECT_EQ(0x665544U, view.three_byte().Read());
+ EXPECT_EQ(0x03040506U, view.four_byte().Read());
+ EXPECT_EQ(0x21436587a9, view.five_byte().Read());
+ EXPECT_EQ(0x123456789abc, view.six_byte().Read());
+ EXPECT_EQ(0x97a6b5c4d3e2f1, view.seven_byte().Read());
+ EXPECT_EQ(0xff00010203040506UL, view.eight_byte().Read());
+ // Test that the views return appropriate integer widths.
+ EXPECT_EQ(1, sizeof(view.one_byte().Read()));
+ EXPECT_EQ(2, sizeof(view.two_byte().Read()));
+ EXPECT_EQ(4, sizeof(view.three_byte().Read()));
+ EXPECT_EQ(4, sizeof(view.four_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.five_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.six_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.seven_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.eight_byte().Read()));
+}
+
+TEST(SizesWriter, CanWriteSizesAlternatingEndian) {
+ uint8_t buffer[sizeof kUIntSizes];
+ auto writer = AlternatingEndianSizesWriter(buffer, sizeof buffer);
+ writer.one_byte().Write(2);
+ writer.two_byte().Write(0x0104);
+ writer.three_byte().Write(0x665544U);
+ writer.four_byte().Write(0x03040506);
+ writer.five_byte().Write(0x21436587a9);
+ writer.six_byte().Write(0x123456789abc);
+ writer.seven_byte().Write(0x97a6b5c4d3e2f1);
+ writer.eight_byte().Write(0xff00010203040506UL);
+ EXPECT_EQ(std::vector<uint8_t>(kUIntSizes, kUIntSizes + sizeof kUIntSizes),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+TEST(SizesView, DecodeUIntsFromText) {
+ uint8_t buffer[sizeof kUIntSizes] = {0};
+ auto writer = SizesWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(::emboss::UpdateFromText(writer, R"(
+ {
+ one_byte: 2
+ two_byte: 260
+ three_byte: 0x445566
+ four_byte: 0x03040506
+ five_byte: 0xa987654321
+ six_byte: 0x123456789abc
+ seven_byte: 0xf1e2d3c4b5a697
+ eight_byte: 0xff00010203040506
+ }
+ )"));
+ EXPECT_EQ(std::vector<uint8_t>(kUIntSizes, kUIntSizes + sizeof kUIntSizes),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+ EXPECT_EQ(2, writer.one_byte().Read());
+ EXPECT_TRUE(::emboss::UpdateFromText(writer, "{one_byte:5}"));
+ EXPECT_EQ(5, buffer[0]);
+ EXPECT_EQ(5, writer.one_byte().Read());
+ EXPECT_FALSE(::emboss::UpdateFromText(writer, "{one_byte:256}"));
+ EXPECT_EQ(5, buffer[0]);
+ EXPECT_EQ(5, writer.one_byte().Read());
+ EXPECT_FALSE(::emboss::UpdateFromText(writer, "{three_byte:0x1000000}"));
+ EXPECT_FALSE(::emboss::UpdateFromText(writer, "{no_byte:0}"));
+}
+
+TEST(SizesView, DecodeUIntsFromTextWithCommas) {
+ uint8_t buffer[sizeof kUIntSizes] = {0};
+ auto writer = SizesWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(::emboss::UpdateFromText(writer, R"(
+ {
+ one_byte: 2,
+ two_byte: 260,
+ three_byte: 0x445566,
+ four_byte: 0x03040506,
+ five_byte: 0xa987654321,
+ six_byte: 0x123456789abc,
+ seven_byte: 0xf1e2d3c4b5a697,
+ eight_byte: 0xff00010203040506,
+ }
+ )"));
+ EXPECT_EQ(std::vector<uint8_t>(kUIntSizes, kUIntSizes + sizeof kUIntSizes),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+TEST(SizesView, DecodeBigEndianUIntsFromText) {
+ uint8_t buffer[sizeof kUIntSizes] = {0};
+ auto writer = BigEndianSizesWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(::emboss::UpdateFromText(writer, R"(
+ {
+ one_byte: 2
+ two_byte: 0x0401
+ three_byte: 0x665544
+ four_byte: 0x06050403
+ five_byte: 0x21436587a9
+ six_byte: 0xbc9a78563412
+ seven_byte: 0x97a6b5c4d3e2f1
+ eight_byte: 0x06050403020100ff
+ }
+ )"));
+ EXPECT_EQ(std::vector<uint8_t>(kUIntSizes, kUIntSizes + sizeof kUIntSizes),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+TEST(SizesView, EncodeUIntsToText) {
+ auto view =
+ MakeAlignedSizesView<const uint8_t, 8>(kUIntSizes, sizeof kUIntSizes);
+ EXPECT_EQ(
+ "{\n"
+ " one_byte: 2 # 0x2\n"
+ " two_byte: 260 # 0x104\n"
+ " three_byte: 4_478_310 # 0x44_5566\n"
+ " four_byte: 50_595_078 # 0x304_0506\n"
+ " five_byte: 728_121_033_505 # 0xa9_8765_4321\n"
+ " six_byte: 20_015_998_343_868 # 0x1234_5678_9abc\n"
+ " seven_byte: 68_084_868_553_483_927 # 0xf1_e2d3_c4b5_a697\n"
+ " eight_byte: 18_374_687_587_823_781_126 # 0xff00_0102_0304_0506\n"
+ "}",
+ ::emboss::WriteToString(view, ::emboss::MultilineText()));
+ EXPECT_EQ(
+ "{ one_byte: 2, two_byte: 260, three_byte: 4478310, four_byte: 50595078, "
+ "five_byte: 728121033505, six_byte: 20015998343868, seven_byte: "
+ "68084868553483927, eight_byte: 18374687587823781126 }",
+ ::emboss::WriteToString(view));
+}
+
+static const uint8_t kEnumSizes[36] = {
+ 0x01, // 0:1 one_byte == VALUE1
+ 0x0a, 0x00, // 1:3 two_byte == VALUE10
+ 0x10, 0x27, 0x00, // 3:6 three_byte == VALUE10000
+ 0x64, 0x00, 0x00, 0x00, // 6:10 four_byte == VALUE100
+ 0xa0, 0x86, 0x01, 0x00, // 10:14 five_byte
+ 0x00, // 14:15 five_byte == VALUE100000
+ 0x40, 0x42, 0x0f, 0x00, // 15:19 six_byte
+ 0x00, 0x00, // 19:21 six_byte == VALUE1000000
+ 0x80, 0x96, 0x98, 0x00, // 21:25 seven_byte
+ 0x00, 0x00, 0x00, // 25:28 seven_byte == VALUE10000000
+ 0xe8, 0x03, 0x00, 0x00, // 28:32 eight_byte
+ 0x00, 0x00, 0x00, 0x00, // 32:36 eight_byte == VALUE1000
+};
+
+TEST(SizesView, CanReadEnumSizes) {
+ auto view = EnumSizesView(kEnumSizes, sizeof kEnumSizes);
+ EXPECT_EQ(Enum::VALUE1, view.one_byte().Read());
+ EXPECT_EQ(Enum::VALUE10, view.two_byte().Read());
+ EXPECT_EQ(Enum::VALUE10000, view.three_byte().Read());
+ EXPECT_EQ(Enum::VALUE100, view.four_byte().Read());
+ EXPECT_EQ(Enum::VALUE100000, view.five_byte().Read());
+ EXPECT_EQ(Enum::VALUE1000000, view.six_byte().Read());
+ EXPECT_EQ(Enum::VALUE10000000, view.seven_byte().Read());
+ EXPECT_EQ(Enum::VALUE1000, view.eight_byte().Read());
+ // Emboss enums are always derived from uint64_t.
+ EXPECT_EQ(8, sizeof(view.one_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.two_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.three_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.four_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.five_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.six_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.seven_byte().Read()));
+ EXPECT_EQ(8, sizeof(view.eight_byte().Read()));
+}
+
+TEST(SizesWriter, CanWriteEnumSizes) {
+ uint8_t buffer[sizeof kEnumSizes];
+ auto writer = EnumSizesWriter(buffer, sizeof buffer);
+ writer.one_byte().Write(Enum::VALUE1);
+ writer.two_byte().Write(Enum::VALUE10);
+ writer.three_byte().Write(Enum::VALUE10000);
+ writer.four_byte().Write(Enum::VALUE100);
+ writer.five_byte().Write(Enum::VALUE100000);
+ writer.six_byte().Write(Enum::VALUE1000000);
+ writer.seven_byte().Write(Enum::VALUE10000000);
+ writer.eight_byte().Write(Enum::VALUE1000);
+ EXPECT_EQ(std::vector<uint8_t>(kEnumSizes, kEnumSizes + sizeof kEnumSizes),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+TEST(SizesView, DecodeEnumsFromText) {
+ uint8_t buffer[sizeof kEnumSizes] = {0};
+ auto writer = EnumSizesWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(::emboss::UpdateFromText(writer, R"(
+ {
+ one_byte: VALUE1
+ two_byte: VALUE10
+ three_byte: VALUE10000
+ four_byte: VALUE100
+ five_byte: VALUE100000
+ six_byte: VALUE1000000
+ seven_byte: VALUE10000000
+ eight_byte: VALUE1000
+ }
+ )"));
+ EXPECT_EQ(std::vector<uint8_t>(kEnumSizes, kEnumSizes + sizeof kEnumSizes),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+TEST(SizesView, DecodeEnumsFromIntegerText) {
+ uint8_t buffer[sizeof kEnumSizes] = {0};
+ auto writer = EnumSizesWriter(buffer, sizeof buffer);
+ EXPECT_TRUE(::emboss::UpdateFromText(writer, R"(
+ {
+ one_byte: 1
+ two_byte: 10
+ three_byte: 10000
+ four_byte: 100
+ five_byte: 100000
+ six_byte: 1000000
+ seven_byte: 10000000
+ eight_byte: 1000
+ }
+ )"));
+ EXPECT_EQ(std::vector<uint8_t>(kEnumSizes, kEnumSizes + sizeof kEnumSizes),
+ std::vector<uint8_t>(buffer, buffer + sizeof buffer));
+}
+
+static const uint8_t kUIntArraySizes[72] = {
+ 0x02, // 0:2 one_byte[0] == 2
+ 0x03, // 0:2 one_byte[1] == 3
+ 0x04, 0x01, // 2:6 two_byte[0] == 260
+ 0x05, 0x01, // 2:6 two_byte[1] == 261
+ 0x66, 0x55, 0x44, // 6:12 three_byte[0] == 0x445566
+ 0x67, 0x55, 0x44, // 6:12 three_byte[1] == 0x445567
+ 0x06, 0x05, 0x04, 0x03, // 12:20 four_byte[0] == 0x03040506
+ 0x07, 0x05, 0x04, 0x03, // 12:20 four_byte[1] == 0x03040507
+ 0x21, 0x43, 0x65, 0x87, // 20:30 five_byte[0]
+ 0xa9, // 20:30 five_byte[0] == 0xa987654321
+ 0x22, 0x43, 0x65, 0x87, // 20:30 five_byte[1]
+ 0xa9, // 20:30 five_byte[1] == 0xa987654322
+ 0xbc, 0x9a, 0x78, 0x56, // 30:42 six_byte[0]
+ 0x34, 0x12, // 30:42 six_byte[0] == 0x123456789abc
+ 0xbd, 0x9a, 0x78, 0x56, // 30:42 six_byte[1]
+ 0x34, 0x12, // 30:42 six_byte[1] == 0x123456789abd
+ 0x97, 0xa6, 0xb5, 0xc4, // 42:56 seven_byte[0]
+ 0xd3, 0xe2, 0xf1, // 42:56 seven_byte[0] == 0xf1e2d3c4b5a697
+ 0x98, 0xa6, 0xb5, 0xc4, // 42:56 seven_byte[1]
+ 0xd3, 0xe2, 0xf1, // 42:56 seven_byte[1] == 0xf1e2d3c4b5a698
+ 0x06, 0x05, 0x04, 0x03, // 56:72 eight_byte[0]
+ 0x02, 0x01, 0x00, 0xff, // 56:72 eight_byte[0] == 0xff00010203040506
+ 0x07, 0x05, 0x04, 0x03, // 56:72 eight_byte[1]
+ 0x02, 0x01, 0x00, 0xff, // 56:72 eight_byte[1] == 0xff00010203040507
+};
+
+TEST(SizesView, CanReadArraySizes) {
+ auto view = ArraySizesView(kUIntArraySizes, sizeof kUIntArraySizes);
+ EXPECT_EQ(2, view.one_byte()[0].Read());
+ EXPECT_EQ(3, view.one_byte()[1].Read());
+ EXPECT_EQ(260, view.two_byte()[0].Read());
+ EXPECT_EQ(261, view.two_byte()[1].Read());
+ EXPECT_EQ(0x445566U, view.three_byte()[0].Read());
+ EXPECT_EQ(0x445567U, view.three_byte()[1].Read());
+ EXPECT_EQ(0x03040506U, view.four_byte()[0].Read());
+ EXPECT_EQ(0x03040507U, view.four_byte()[1].Read());
+ EXPECT_EQ(0xa987654321, view.five_byte()[0].Read());
+ EXPECT_EQ(0xa987654322, view.five_byte()[1].Read());
+ EXPECT_EQ(0x123456789abc, view.six_byte()[0].Read());
+ EXPECT_EQ(0x123456789abd, view.six_byte()[1].Read());
+ EXPECT_EQ(0xf1e2d3c4b5a697, view.seven_byte()[0].Read());
+ EXPECT_EQ(0xf1e2d3c4b5a698, view.seven_byte()[1].Read());
+ EXPECT_EQ(0xff00010203040506UL, view.eight_byte()[0].Read());
+ EXPECT_EQ(0xff00010203040507UL, view.eight_byte()[1].Read());
+ // Test that the views return appropriate integer widths.
+ EXPECT_EQ(1, sizeof(view.one_byte()[0].Read()));
+ EXPECT_EQ(2, sizeof(view.two_byte()[0].Read()));
+ EXPECT_EQ(4, sizeof(view.three_byte()[0].Read()));
+ EXPECT_EQ(4, sizeof(view.four_byte()[0].Read()));
+ EXPECT_EQ(8, sizeof(view.five_byte()[0].Read()));
+ EXPECT_EQ(8, sizeof(view.six_byte()[0].Read()));
+ EXPECT_EQ(8, sizeof(view.seven_byte()[0].Read()));
+ EXPECT_EQ(8, sizeof(view.eight_byte()[0].Read()));
+}
+
+TEST(SizesView, CopyFrom) {
+ std::array<uint8_t, sizeof kUIntArraySizes> buf_x = {};
+ std::array<uint8_t, sizeof kUIntArraySizes> buf_y = {};
+
+ const auto x = SizesWriter(&buf_x);
+ const auto y = SizesWriter(&buf_y);
+
+ constexpr int kValue = 42;
+ x.one_byte().Write(kValue);
+ EXPECT_NE(x.one_byte().Read(), y.one_byte().Read());
+ y.one_byte().CopyFrom(x.one_byte());
+ EXPECT_EQ(x.one_byte().Read(), y.one_byte().Read());
+}
+
+TEST(SizesView, TryToCopyFrom) {
+ std::array<uint8_t, sizeof kUIntArraySizes> buf_x = {};
+ std::array<uint8_t, sizeof kUIntArraySizes> buf_y = {};
+
+ const auto x = SizesWriter(&buf_x);
+ const auto y = SizesWriter(&buf_y);
+
+ constexpr int kValue = 42;
+ x.one_byte().Write(kValue);
+ EXPECT_NE(x.one_byte().Read(), y.one_byte().Read());
+ EXPECT_TRUE(y.one_byte().TryToCopyFrom(x.one_byte()));
+ EXPECT_EQ(x.one_byte().Read(), y.one_byte().Read());
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/cpp/testcode/virtual_field_test.cc b/back_end/cpp/testcode/virtual_field_test.cc
new file mode 100644
index 0000000..21b338f
--- /dev/null
+++ b/back_end/cpp/testcode/virtual_field_test.cc
@@ -0,0 +1,436 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Tests of generated code for virtual fields.
+#include "testdata/virtual_field.emb.h"
+
+#include <stdint.h>
+
+#include <type_traits>
+#include <utility>
+#include <vector>
+
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+// Check that the constant methods are generated as constexpr free functions in
+// the type's aliased namespace, and have the appropriate values.
+static_assert(StructureWithConstants::ten() == 10,
+ "StructureWithConstants::ten() == 10");
+static_assert(StructureWithConstants::twenty() == 20,
+ "StructureWithConstants::twenty() == 20");
+static_assert(StructureWithConstants::four_billion() == 4000000000U,
+ "StructureWithConstants::four_billion() == 4000000000U");
+static_assert(StructureWithConstants::ten_billion() == 10000000000L,
+ "StructureWithConstants::ten_billion() == 10000000000L");
+static_assert(StructureWithConstants::minus_ten_billion() == -10000000000L,
+ "StructureWithConstants::minus_ten_billion() == -10000000000L");
+
+// Check the return types of the static Read methods.
+static_assert(
+ ::std::is_same<int32_t, decltype(StructureWithConstants::ten())>::value,
+ "StructureWithConstants::ten() returns int32_t");
+static_assert(
+ ::std::is_same<int32_t, decltype(StructureWithConstants::twenty())>::value,
+ "StructureWithConstants::twenty() returns int32_t");
+static_assert(
+ ::std::is_same<uint32_t,
+ decltype(StructureWithConstants::four_billion())>::value,
+ "StructureWithConstants::four_billion() returns uint32_t");
+static_assert(
+ ::std::is_same<int64_t,
+ decltype(StructureWithConstants::ten_billion())>::value,
+ "StructureWithConstants::ten_billion() returns int64_t");
+
+TEST(Constants, ValuesOnView) {
+ ::std::array<char, 4> values = {0, 0, 0, 0};
+ const auto view = MakeStructureWithConstantsView(&values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_EQ(10, view.ten().Read());
+ EXPECT_EQ(10, view.alias_of_ten().Read());
+ EXPECT_EQ(10, view.alias_of_alias_of_ten().Read());
+ EXPECT_EQ(20, view.twenty().Read());
+ EXPECT_EQ(4000000000U, view.four_billion().Read());
+ EXPECT_EQ(10000000000L, view.ten_billion().Read());
+ EXPECT_EQ(0, view.value().Read());
+ EXPECT_EQ(0, view.alias_of_value().Read());
+ EXPECT_EQ(0, view.alias_of_alias_of_value().Read());
+ view.alias_of_alias_of_value().Write(10);
+ EXPECT_EQ(10, view.value().Read());
+ EXPECT_EQ(10, view.alias_of_value().Read());
+ EXPECT_EQ(10, view.alias_of_alias_of_value().Read());
+}
+
+TEST(Computed, Values) {
+ ::std::array<char, 8> values = {5, 0, 0, 0, 50, 0, 0, 0};
+ const auto view = MakeStructureWithComputedValuesView(&values);
+ EXPECT_EQ(5, view.value().Read());
+ EXPECT_EQ(10, view.doubled().Read());
+ EXPECT_EQ(15, view.plus_ten().Read());
+ EXPECT_EQ(50, view.value2().Read());
+ EXPECT_EQ(100, view.signed_doubled().Read());
+ EXPECT_EQ(60, view.signed_plus_ten().Read());
+ EXPECT_EQ(250, view.product().Read());
+ view.value2().Write(-50);
+ EXPECT_EQ(-100, view.signed_doubled().Read());
+ EXPECT_EQ(-40, view.signed_plus_ten().Read());
+ EXPECT_EQ(-250, view.product().Read());
+}
+
+TEST(Computed, ReadFailsWhenUnderlyingFieldIsNotOk) {
+ ::std::array<char, 0> values = {};
+ const auto view = MakeStructureWithComputedValuesView(&values);
+ EXPECT_DEATH(view.value().Read(), "");
+ EXPECT_DEATH(view.doubled().Read(), "");
+}
+
+// Check the return types of nonstatic Read methods.
+static_assert(
+ ::std::is_same<int64_t, decltype(MakeStructureWithComputedValuesView("x", 1)
+ .doubled()
+ .Read())>::value,
+ "StructureWithComputedValuesView::doubled().Read() should return int64_t.");
+// Likewise, product().Read() should return int64_t.
+static_assert(
+ ::std::is_same<int64_t, decltype(MakeStructureWithComputedValuesView("x", 1)
+ .product()
+ .Read())>::value,
+ "StructureWithComputedValuesView::product().Read() should return int64_t.");
+
+TEST(Constants, TextFormatWrite) {
+ ::std::array<char, 4> values = {5, 0, 0, 0};
+ const auto view = MakeStructureWithConstantsView(&values);
+ // TODO(bolms): Provide a way of marking fields as "not for text format," so
+ // that end users can choose whether to use an alias or an original field or
+ // both in the text format.
+ EXPECT_EQ("{ value: 5, alias_of_value: 5, alias_of_alias_of_value: 5 }",
+ ::emboss::WriteToString(view));
+ EXPECT_EQ(
+ "{\n"
+ " # ten: 10 # 0xa\n"
+ " # twenty: 20 # 0x14\n"
+ " # four_billion: 4_000_000_000 # 0xee6b_2800\n"
+ " # ten_billion: 10_000_000_000 # 0x2_540b_e400\n"
+ " # minus_ten_billion: -10_000_000_000 # -0x2_540b_e400\n"
+ " value: 5 # 0x5\n"
+ " alias_of_value: 5 # 0x5\n"
+ " alias_of_alias_of_value: 5 # 0x5\n"
+ " # alias_of_ten: 10 # 0xa\n"
+ " # alias_of_alias_of_ten: 10 # 0xa\n"
+ "}",
+ ::emboss::WriteToString(view, ::emboss::MultilineText()));
+}
+
+TEST(Computed, TextFormatWrite) {
+ ::std::array<char, 8> values = {5, 0, 0, 0, 50, 0, 0, 0};
+ const auto view = MakeStructureWithComputedValuesView(&values);
+ EXPECT_EQ("{ value: 5, plus_ten: 15, value2: 50, signed_plus_ten: 60 }",
+ ::emboss::WriteToString(view));
+ EXPECT_EQ(
+ "{\n"
+ " value: 5 # 0x5\n"
+ " # doubled: 10 # 0xa\n"
+ " plus_ten: 15 # 0xf\n"
+ " value2: 50 # 0x32\n"
+ " # signed_doubled: 100 # 0x64\n"
+ " signed_plus_ten: 60 # 0x3c\n"
+ " # product: 250 # 0xfa\n"
+ "}",
+ ::emboss::WriteToString(view, ::emboss::MultilineText()));
+}
+
+TEST(Constants, TextFormatRead) {
+ ::std::array<char, 4> values = {5, 0, 0, 0};
+ const auto view = MakeStructureWithConstantsView(&values);
+ EXPECT_TRUE(::emboss::UpdateFromText(view, "{ value: 50 }"));
+ EXPECT_EQ(50, values[0]);
+ EXPECT_FALSE(::emboss::UpdateFromText(view, "{ ten: 50 }"));
+ // TODO(bolms): Should this be allowed?
+ EXPECT_FALSE(::emboss::UpdateFromText(view, "{ ten: 10 }"));
+}
+
+TEST(Computed, TextFormatRead) {
+ ::std::array<char, 8> values = {5, 0, 0, 0, 50, 0, 0, 0};
+ const auto view = MakeStructureWithComputedValuesView(&values);
+ EXPECT_TRUE(::emboss::UpdateFromText(view, "{ value: 50, value2: 5 }"));
+ EXPECT_EQ(50, values[0]);
+ EXPECT_EQ(5, values[4]);
+ EXPECT_FALSE(::emboss::UpdateFromText(view, "{ product: 10 }"));
+ // TODO(bolms): Make Emboss automatically infer write_transform for
+ // easily-reversible cases like `field * 2`.
+ EXPECT_FALSE(::emboss::UpdateFromText(view, "{ doubled: 10 }"));
+}
+
+TEST(ConditionalVirtual, ConditionChecks) {
+ ::std::array<char, 4> values = {5, 0, 0, 0};
+ const auto view = MakeStructureWithConditionalValueView(&values);
+ EXPECT_TRUE(view.has_two_x().Value());
+ EXPECT_TRUE(view.has_x_plus_one().Value());
+ EXPECT_EQ(10, view.two_x().Read());
+ EXPECT_EQ(6, view.x_plus_one().Read());
+ EXPECT_EQ(10, view.two_x().UncheckedRead());
+ EXPECT_EQ(6, view.x_plus_one().UncheckedRead());
+ view.x().Write(0x80000000U);
+ EXPECT_FALSE(view.has_two_x().Value());
+ EXPECT_DEATH(view.two_x().Read(), "");
+ EXPECT_TRUE(view.has_x_plus_one().Value());
+ EXPECT_EQ(0x80000001U, view.x_plus_one().Read());
+}
+
+TEST(ConditionalVirtual, UncheckedRead) {
+ ::std::array<char, 4> values = {5, 0, 0, 0};
+ const auto view = MakeStructureWithConditionalValueView(&values[0], 1);
+ EXPECT_FALSE(view.Ok());
+ EXPECT_FALSE(view.x().Ok());
+ EXPECT_DEATH(view.two_x().Read(), "");
+ EXPECT_EQ(0, view.two_x().UncheckedRead());
+}
+
+TEST(ConditionalVirtual, TextFormatWrite) {
+ ::std::array<unsigned char, 4> values = {0, 0, 0, 0x80};
+ const auto view = MakeStructureWithConditionalValueView(&values);
+ EXPECT_EQ("{ x: 2147483648, x_plus_one: 2147483649 }",
+ ::emboss::WriteToString(view));
+ EXPECT_EQ(
+ "{\n"
+ " x: 2_147_483_648 # 0x8000_0000\n"
+ " x_plus_one: 2_147_483_649 # 0x8000_0001\n"
+ "}",
+ ::emboss::WriteToString(view, ::emboss::MultilineText()));
+ view.x().Write(5);
+ EXPECT_EQ("{ x: 5, x_plus_one: 6 }", ::emboss::WriteToString(view));
+ EXPECT_EQ(
+ "{\n"
+ " x: 5 # 0x5\n"
+ " # two_x: 10 # 0xa\n"
+ " x_plus_one: 6 # 0x6\n"
+ "}",
+ ::emboss::WriteToString(view, ::emboss::MultilineText()));
+}
+
+TEST(VirtualInCondition, ConditionCheck) {
+ ::std::array<char, 8> values = {5, 0, 0, 0, 50, 0, 0, 0};
+ const auto view = MakeStructureWithValueInConditionView(&values);
+ EXPECT_TRUE(view.has_if_two_x_lt_100().Value());
+ view.x().Write(75);
+ EXPECT_FALSE(view.has_if_two_x_lt_100().Value());
+}
+
+TEST(VirtualInLocation, Offset) {
+ ::std::array<char, 8> values = {5, 0, 0, 0, 50, 0, 0, 0};
+ const auto view = MakeStructureWithValuesInLocationView(&values);
+ EXPECT_FALSE(view.Ok());
+ view.x().Write(2);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_EQ(50, view.offset_two_x().Read());
+ EXPECT_EQ(50, view.size_two_x().Read());
+ view.x().Write(1);
+ EXPECT_FALSE(view.Ok());
+ EXPECT_EQ(50 * 0x10000, view.offset_two_x().Read());
+ view.x().Write(0);
+ EXPECT_EQ(0, view.offset_two_x().Read());
+}
+
+TEST(BooleanVirtual, TrueAndFalse) {
+ ::std::array<char, 4> values = {5, 0, 0, 0};
+ const auto view = MakeStructureWithBoolValueView(&values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_FALSE(view.x_is_ten().Read());
+ view.x().Write(10);
+ EXPECT_TRUE(view.x_is_ten().Read());
+}
+
+TEST(EnumVirtual, SmallAndLarge) {
+ ::std::array<char, 4> values = {5, 0, 0, 0};
+ const auto view = MakeStructureWithEnumValueView(&values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_EQ(StructureWithEnumValue::Category::SMALL, view.x_size().Read());
+ view.x().Write(100);
+ EXPECT_EQ(StructureWithEnumValue::Category::LARGE, view.x_size().Read());
+}
+
+TEST(BitsVirtual, Sum) {
+ ::std::array<char, 4> values = {5, 0, 10, 0};
+ const auto view = MakeStructureWithBitsWithValueView(&values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_EQ(5, view.b().a().Read());
+ EXPECT_EQ(5, view.alias_of_b_a().Read());
+ EXPECT_EQ(10, view.b().b().Read());
+ EXPECT_EQ(15, view.b().sum().Read());
+ EXPECT_EQ(15, view.alias_of_b_sum().Read());
+ view.alias_of_b_a().Write(20);
+ EXPECT_EQ(20, view.b().a().Read());
+ EXPECT_EQ(20, view.alias_of_b_a().Read());
+ EXPECT_EQ(20, values[0]);
+}
+
+TEST(ForeignConstants, ForeignConstants) {
+ static_assert(StructureUsingForeignConstants::one_hundred() == 100,
+ "StructureUsingForeignConstants::one_hundred() == 100");
+ ::std::array<char, 14> values = {5, 0, 0, 0, 10, 0, 0, 0, 15, 0, 20, 0, 0, 0};
+ const auto view = MakeStructureUsingForeignConstantsView(&values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_EQ(20, view.x().Read());
+ EXPECT_EQ(100, view.one_hundred().Read());
+}
+
+TEST(HasField, HasField) {
+ ::std::array<char, 3> values = {0, 0, 0};
+ const auto view = MakeHasFieldView(&values);
+ EXPECT_FALSE(view.Ok()); // There is not enough information to evaluate
+ // view.has_y(), so the view is not Ok().
+ view.z().Write(11);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_FALSE(view.has_y().Value());
+ EXPECT_TRUE(view.has_x().Value());
+ EXPECT_FALSE(view.x().has_y().Value());
+ EXPECT_FALSE(view.x_has_y().Read());
+ EXPECT_FALSE(view.x_has_y().UncheckedRead());
+ view.x().v().Write(11);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_TRUE(view.has_y().Value());
+ EXPECT_TRUE(view.has_x().Value());
+ EXPECT_TRUE(view.x().has_y().Value());
+ EXPECT_TRUE(view.x_has_y().Read());
+ EXPECT_TRUE(view.x_has_y().UncheckedRead());
+}
+
+TEST(RestrictedAlias, RestrictedAlias) {
+ ::std::array<char, 5> values = {1, 2, 3, 4, 5};
+ const auto view = MakeRestrictedAliasView(&values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_TRUE(view.has_a_b().Value());
+ EXPECT_TRUE(view.a_b().Ok());
+ EXPECT_FALSE(view.has_a_b_alias().Value());
+ EXPECT_FALSE(view.a_b_alias().Ok());
+ EXPECT_FALSE(view.a_b_alias().a().Ok());
+ EXPECT_FALSE(view.a_b_alias().b().Ok());
+ view.alias_switch().Write(11);
+ EXPECT_TRUE(view.has_a_b().Value());
+ EXPECT_TRUE(view.a_b().Ok());
+ EXPECT_TRUE(view.has_a_b_alias().Value());
+ EXPECT_TRUE(view.a_b_alias().Ok());
+}
+
+TEST(VirtualWithConditionalComponent, ReadWhenAllPresent) {
+ ::std::array<char, 2> values = {0, 0};
+ const auto view = MakeVirtualUnconditionallyUsesConditionalView(&values);
+ EXPECT_TRUE(view.x_nor_xc().Read());
+ EXPECT_TRUE(view.x_nor_xc().UncheckedRead());
+}
+
+TEST(VirtualWithConditionalComponent, ReadWhenNotAllPresent) {
+ ::std::array<char, 2> values = {1, 0};
+ const auto view = MakeVirtualUnconditionallyUsesConditionalView(&values);
+ EXPECT_FALSE(view.x_nor_xc().Read());
+ EXPECT_FALSE(view.x_nor_xc().UncheckedRead());
+}
+
+TEST(IntrinsicSize, SizeInBytes) {
+ ::std::array<char, 1> values = {10};
+ const auto view = MakeUsesSizeView(&values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_TRUE(view.IntrinsicSizeInBytes().Ok());
+ EXPECT_EQ(1, view.IntrinsicSizeInBytes().Read());
+ EXPECT_EQ(1, UsesSizeView::IntrinsicSizeInBytes().Read());
+ EXPECT_EQ(1, UsesSize::IntrinsicSizeInBytes());
+ EXPECT_TRUE(view.r().IntrinsicSizeInBits().Ok());
+ EXPECT_EQ(8, view.r().IntrinsicSizeInBits().Read());
+ EXPECT_EQ(8, UsesSize::R::IntrinsicSizeInBits());
+ EXPECT_EQ(values[0], view.r().q().Read());
+ EXPECT_EQ(values[0] + 1, view.r_q_plus_byte_size().Read());
+ EXPECT_EQ(values[0] + 8, view.r().q_plus_bit_size().Read());
+}
+
+TEST(VirtualFields, SizeInBytes) {
+ const ::std::array<uint8_t, 8> values = {0x11, 0x11, 0x11, 0x11,
+ 0x22, 0x22, 0x22, 0x22};
+ const auto view = MakeUsesExternalSizeView(&values);
+ EXPECT_TRUE(view.Ok());
+ EXPECT_EQ(8, view.SizeInBytes());
+ EXPECT_EQ(view.x().SizeInBytes(), 4);
+ EXPECT_EQ(view.y().SizeInBytes(), 4);
+ EXPECT_EQ(view.x().value().Read(), 0x11111111);
+ EXPECT_EQ(view.y().value().Read(), 0x22222222);
+ EXPECT_TRUE(view.IntrinsicSizeInBytes().Ok());
+ EXPECT_EQ(8, UsesExternalSizeView::IntrinsicSizeInBytes().Read());
+ EXPECT_EQ(8, UsesExternalSize::MaxSizeInBytes());
+}
+
+TEST(WriteTransform, Write) {
+ ::std::array<char, 1> values = {0};
+ const auto view = MakeImplicitWriteBackView(&values);
+
+ view.x_plus_ten().Write(11);
+ EXPECT_EQ(1, view.x().Read());
+ EXPECT_EQ(11, view.x_plus_ten().Read());
+
+ view.ten_plus_x().Write(12);
+ EXPECT_EQ(2, view.x().Read());
+ EXPECT_EQ(12, view.ten_plus_x().Read());
+
+ EXPECT_TRUE((::std::is_same<decltype(view.x_minus_ten())::ValueType,
+ ::std::int32_t>::value));
+
+ view.x_minus_ten().Write(-7);
+ EXPECT_EQ(3, view.x().Read());
+ EXPECT_EQ(-7, view.x_minus_ten().Read());
+
+ view.ten_minus_x().Write(6);
+ EXPECT_EQ(4, view.x().Read());
+ EXPECT_EQ(6, view.ten_minus_x().Read());
+
+ view.ten_minus_x_plus_ten().Write(4);
+ EXPECT_EQ(16, view.x().Read());
+ EXPECT_EQ(4, view.ten_minus_x_plus_ten().Read());
+}
+
+TEST(WriteTransform, CouldWriteValue) {
+ ::std::array<char, 1> values = {0};
+ const auto view = MakeImplicitWriteBackView(&values);
+ EXPECT_EQ(0, view.x().Read());
+ // x is UInt:8, so has range [0, 255].
+
+ EXPECT_FALSE(view.x_plus_ten().CouldWriteValue(9));
+ EXPECT_TRUE(view.x_plus_ten().CouldWriteValue(10));
+ EXPECT_TRUE(view.x_plus_ten().CouldWriteValue(265));
+ EXPECT_FALSE(view.x_plus_ten().CouldWriteValue(266));
+
+ EXPECT_FALSE(view.ten_plus_x().CouldWriteValue(9));
+ EXPECT_TRUE(view.ten_plus_x().CouldWriteValue(10));
+ EXPECT_TRUE(view.ten_plus_x().CouldWriteValue(265));
+ EXPECT_FALSE(view.ten_plus_x().CouldWriteValue(266));
+
+ EXPECT_FALSE(view.x_minus_ten().CouldWriteValue(-11));
+ EXPECT_TRUE(view.x_minus_ten().CouldWriteValue(-10));
+ EXPECT_TRUE(view.x_minus_ten().CouldWriteValue(245));
+ EXPECT_FALSE(view.x_minus_ten().CouldWriteValue(246));
+
+ EXPECT_FALSE(view.ten_minus_x().CouldWriteValue(-246));
+ EXPECT_TRUE(view.ten_minus_x().CouldWriteValue(-245));
+ EXPECT_TRUE(view.ten_minus_x().CouldWriteValue(10));
+ EXPECT_FALSE(view.ten_minus_x().CouldWriteValue(11));
+
+ EXPECT_FALSE(view.ten_minus_x_plus_ten().CouldWriteValue(-236));
+ EXPECT_TRUE(view.ten_minus_x_plus_ten().CouldWriteValue(-235));
+ EXPECT_TRUE(view.ten_minus_x_plus_ten().CouldWriteValue(20));
+ EXPECT_FALSE(view.ten_minus_x_plus_ten().CouldWriteValue(21));
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/back_end/util/BUILD b/back_end/util/BUILD
new file mode 100644
index 0000000..f2db10c
--- /dev/null
+++ b/back_end/util/BUILD
@@ -0,0 +1,34 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Shared utilities for Emboss back ends.
+
+package(
+ default_visibility = ["//back_end:__subpackages__"],
+)
+
+py_library(
+ name = "code_template",
+ srcs = ["code_template.py"],
+ deps = [],
+)
+
+py_test(
+ name = "code_template_test",
+ srcs = ["code_template_test.py"],
+ python_version = "PY3",
+ deps = [
+ ":code_template",
+ ],
+)
diff --git a/back_end/util/__init__.py b/back_end/util/__init__.py
new file mode 100644
index 0000000..2c31d84
--- /dev/null
+++ b/back_end/util/__init__.py
@@ -0,0 +1,14 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
diff --git a/back_end/util/code_template.py b/back_end/util/code_template.py
new file mode 100644
index 0000000..bbea043
--- /dev/null
+++ b/back_end/util/code_template.py
@@ -0,0 +1,140 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""A formatter for code templates.
+
+Use the format_template function to render a code template.
+"""
+
+import collections
+import re
+import string
+
+
+class _CppFormatter(string.Formatter):
+ """Customized Formatter using $_name_$ instead of {name}.
+
+ This class exists for the format_template() function; see its documentation
+ for details.
+ """
+
+ def parse(self, format_string):
+ """Overrides string.Formatter.parse.
+
+ Arguments:
+ format_string: a format string to be parsed.
+
+ Yields:
+ A sequence of 4-element tuples (literal, name, format_spec, conversion),
+ where:
+
+ literal: A literal string to include in the output. This will be
+ output before the substitution, if any.
+ name: The name of a substitution, or None if no substitution.
+ format_spec: A format specification.
+ conversion: A conversion specification.
+
+ Consult the documentation for string.Formatter for the format of the
+ format_spec and conversion elements.
+ """
+ # A replacement spec is $_field_name!conversion:format_spec_$, where
+ # conversion and format_spec are optional. string.Formatter will take care
+ # of parsing and interpreting the conversion and format_spec, so this method
+ # just extracts them.
+ delimiter_matches = re.finditer(
+ r"""(?x)
+ \$_
+ (?P<field_name> ( [^!:_] | _[^$] )* )
+ ( ! (?P<conversion> ( [^:_] | _[^$] )* ) )?
+ ( : (?P<format_spec> ( [^_] | _[^$] )* ) )?
+ _\$""", format_string)
+ after_last_delimiter = 0
+ for match in delimiter_matches:
+ yield (format_string[after_last_delimiter:match.start()],
+ match.group("field_name"),
+ # A missing format_spec is indicated by ""...
+ match.group("format_spec") or "",
+ # ... but a missing conversion is indicated by None. Consistency!
+ match.group("conversion") or None)
+ after_last_delimiter = match.end()
+ yield format_string[after_last_delimiter:], None, None, None
+
+
+_FORMATTER = _CppFormatter()
+
+
+def format_template(template, *args, **kwargs):
+ """format_template acts like str.format, but uses $_name_$ instead of {name}.
+
+ format_template acts like str.format, except that instead of using { and }
+ to delimit substitutions, format_template uses $_ and _$. This simplifies
+ templates of source code in most languages, which frequently use "{" and "}",
+ but very rarely use "$". The choice of "$_" and "_$" is conducive to the use
+ of clang-format on templates.
+
+ format_template does not currently have a way to put literal "$_..._$" into a
+ format string.
+
+ See the documentation for str.format and string.Formatter for details about
+ template strings and the format of substitutions.
+
+ Arguments:
+ template: A template to format.
+ *args: Positional arguments for string.Formatter.format.
+ **kwargs: Keyword arguments for string.Formatter.format.
+
+ Returns:
+ A formatted string.
+ """
+ return _FORMATTER.format(template, *args, **kwargs)
+
+
+def parse_templates(text):
+ """Parses text into a namedtuple of templates.
+
+ parse_templates will split its argument into templates by searching for lines
+ of the form:
+
+ [punctuation] " ** " [name] " ** " [punctuation]
+
+ e.g.:
+
+ // ** struct_field_accessor ** ////////
+
+ Leading and trailing punctuation is ignored, and [name] is used as the name
+ of the template. [name] should match [A-Za-z][A-Za-z0-9_]* -- that is, it
+ should be a valid ASCII Python identifier.
+
+ Arguments:
+ text: The text to parse into templates.
+
+ Returns:
+ A namedtuple object whose attributes are the templates from text.
+ """
+ delimiter_re = re.compile(r"^\W*\*\* ([A-Za-z][A-Za-z0-9_]*) \*\*\W*$")
+ templates = {}
+ name = None
+ template = []
+ for line in text.splitlines():
+ if delimiter_re.match(line):
+ if name:
+ templates[name] = "\n".join(template)
+ name = delimiter_re.match(line).group(1)
+ template = []
+ else:
+ template.append(line)
+ if name:
+ templates[name] = "\n".join(template)
+ return collections.namedtuple("Templates",
+ list(templates.keys()))(**templates)
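Reviewer note (not part of the patch): a minimal, self-contained Python sketch of how the two helpers in code_template.py are meant to combine, namely splitting one text into named templates and then filling `$_name_$` fields. `split_sections` and `render` are hypothetical stand-ins written for this illustration; they are not the functions from the patch, and `render` deliberately omits the `!conversion` and `:format_spec` handling that `format_template` supports.

```python
import re

# Hypothetical stand-ins for illustration only; NOT the code from the patch.
# Same delimiter shape as parse_templates: [punctuation] ** name ** [punctuation]
_DELIMITER = re.compile(r"^\W*\*\* ([A-Za-z][A-Za-z0-9_]*) \*\*\W*$")
# Simplified field syntax: $_name_$ with no conversion or format spec.
_FIELD = re.compile(r"\$_([A-Za-z][A-Za-z0-9]*(?:_[A-Za-z0-9]+)*)_\$")


def split_sections(text):
    """Splits text into {name: body} at lines such as `// ** name ** ///`."""
    sections = {}
    name = None
    body = []
    for line in text.splitlines():
        match = _DELIMITER.match(line)
        if match:
            # A new delimiter closes the section collected so far, if any.
            if name is not None:
                sections[name] = "\n".join(body)
            name = match.group(1)
            body = []
        else:
            body.append(line)
    if name is not None:
        sections[name] = "\n".join(body)
    return sections


def render(template, **values):
    """Replaces each plain $_name_$ field with str(values[name])."""
    return _FIELD.sub(lambda m: str(values[m.group(1)]), template)


TEMPLATES = split_sections("""\
// ** getter ** ////////
$_type_$ $_name_$() const { return $_name_$_; }
// ** setter ** ////////
void set_$_name_$($_type_$ v) { $_name_$_ = v; }
""")

# Prints: int size() const { return size_; }
print(render(TEMPLATES["getter"], type="int", name="size"))
```

With the module from the patch, the corresponding calls would be `parse_templates(text).getter` (the result is a namedtuple, so templates are attributes) and `format_template(template, type="int", name="size")`. Braces in the C++ body need no escaping because the fields are delimited by `$_` and `_$`.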
diff --git a/back_end/util/code_template_test.py b/back_end/util/code_template_test.py
new file mode 100644
index 0000000..97322a2
--- /dev/null
+++ b/back_end/util/code_template_test.py
@@ -0,0 +1,98 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for code_template."""
+
+import unittest
+from back_end.util import code_template
+
+
+class FormatTest(unittest.TestCase):
+ """Tests for code_template.format_template."""
+
+ def test_no_replacement_fields(self):
+ self.assertEqual("foo", code_template.format_template("foo"))
+ self.assertEqual("{foo}", code_template.format_template("{foo}"))
+ self.assertEqual("$foo$", code_template.format_template("$foo$"))
+ self.assertEqual("$_foo$", code_template.format_template("$_foo$"))
+ self.assertEqual("$foo_$", code_template.format_template("$foo_$"))
+
+ def test_one_replacement_field(self):
+ self.assertEqual("foo", code_template.format_template("$_bar_$", bar="foo"))
+ self.assertEqual("bazfoo",
+ code_template.format_template("baz$_bar_$", bar="foo"))
+ self.assertEqual("foobaz",
+ code_template.format_template("$_bar_$baz", bar="foo"))
+ self.assertEqual("bazfooqux",
+ code_template.format_template("baz$_bar_$qux", bar="foo"))
+
+ def test_one_replacement_field_with_formatting(self):
+ self.assertEqual("1.000000",
+ code_template.format_template("$_bar:.6f_$", bar=1))
+ self.assertEqual("'foo'",
+ code_template.format_template("$_bar!r_$", bar="foo"))
+ self.assertEqual("==foo==",
+ code_template.format_template("$_bar:=^7_$", bar="foo"))
+ self.assertEqual("=='foo'==",
+ code_template.format_template("$_bar!r:=^9_$", bar="foo"))
+ self.assertEqual("xx=='foo'==yy",
+ code_template.format_template("xx$_bar!r:=^9_$yy",
+ bar="foo"))
+
+ def test_one_replacement_field_value_missing(self):
+ self.assertRaises(KeyError, code_template.format_template, "$_bar_$")
+
+ def test_multiple_replacement_fields(self):
+ self.assertEqual(" aaa bbb ",
+ code_template.format_template(" $_bar_$ $_baz_$ ",
+ bar="aaa",
+ baz="bbb"))
+
+
+class ParseTemplatesTest(unittest.TestCase):
+ """Tests for code_template.parse_templates."""
+
+ def test_handles_no_template_case(self):
+ self.assertEqual({}, code_template.parse_templates("")._asdict())
+ self.assertEqual({}, code_template.parse_templates(
+ "this is not a template")._asdict())
+
+ def test_handles_one_template_at_start(self):
+ self.assertEqual({"foo": "bar"},
+ code_template.parse_templates("** foo **\nbar")._asdict())
+
+ def test_handles_one_template_after_start(self):
+ self.assertEqual(
+ {"foo": "bar"},
+ code_template.parse_templates("text\n** foo **\nbar")._asdict())
+
+ def test_handles_delimiter_with_other_text(self):
+ self.assertEqual(
+ {"foo": "bar"},
+ code_template.parse_templates("text\n// ** foo ** ////\nbar")._asdict())
+ self.assertEqual(
+ {"foo": "bar"},
+ code_template.parse_templates("text\n# ** foo ** #####\nbar")._asdict())
+
+ def test_handles_multiple_delimiters(self):
+ self.assertEqual({"foo": "bar",
+ "baz": "qux"}, code_template.parse_templates(
+ "** foo **\nbar\n** baz **\nqux")._asdict())
+
+ def test_returns_object_with_attributes(self):
+ self.assertEqual("bar", code_template.parse_templates(
+ "** foo **\nbar\n** baz **\nqux").foo)
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/embossc b/embossc
new file mode 100755
index 0000000..81279db
--- /dev/null
+++ b/embossc
@@ -0,0 +1,98 @@
+#!/usr/bin/python3
+
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+import os
+from os import path
+import subprocess
+import sys
+
+def _parse_args(argv):
+ parser = argparse.ArgumentParser(description="Emboss compiler")
+ parser.add_argument("--color-output",
+ default="if_tty",
+ choices=["always", "never", "if_tty", "auto"],
+ help="Print error messages using color. 'auto' is a "
+ "synonym for 'if_tty'.")
+ parser.add_argument("--import-dir", "-I",
+ dest="import_dirs",
+ action="append",
+ default=["."],
+ help="A directory to use when searching for imported "
+ "embs. May be repeated. The current directory is "
+ "always included, in addition to any given here.")
+ parser.add_argument("--generate",
+ nargs=1,
+ choices=["cc"],
+ default=["cc"],
+ help="Which back end to use. Currently only C++ is "
+ "supported.")
+ parser.add_argument("--output-path",
+ nargs=1,
+ default=["."],
+ help="Prefix to use for the generated output file.")
+ parser.add_argument("input_file",
+ type=str,
+ nargs=1,
+ help=".emb file to compile.")
+ return parser.parse_args(argv[1:])
+
+
+def main(argv):
+ flags = _parse_args(argv)
+ base_path = path.dirname(__file__) or "."
+ subprocess_environment = dict(os.environ)
+ if subprocess_environment.get("PYTHONPATH"):
+ subprocess_environment["PYTHONPATH"] = (
+ base_path + os.pathsep + subprocess_environment.get("PYTHONPATH"))
+ else:
+ subprocess_environment["PYTHONPATH"] = base_path
+ front_end_args = [
+ sys.executable,
+ path.join(base_path, "front_end", "emboss_front_end.py"),
+ "--output-ir-to-stdout",
+ "--color-output", flags.color_output,
+ ]
+ for import_dir in flags.import_dirs:
+ front_end_args.extend([
+ "--import-dir",
+ import_dir
+ ])
+ front_end_args.append(flags.input_file[0])
+ front_end_status = subprocess.run(front_end_args,
+ stdout=subprocess.PIPE,
+ env=subprocess_environment)
+ if front_end_status.returncode != 0:
+ return front_end_status.returncode
+ back_end_status = subprocess.run(
+ [
+ sys.executable,
+ path.join(base_path, "back_end", "cpp", "emboss_codegen_cpp.py"),
+ ],
+ input=front_end_status.stdout,
+ stdout=subprocess.PIPE,
+ env=subprocess_environment)
+ if back_end_status.returncode != 0:
+ return back_end_status.returncode
+ output_file = path.join(flags.output_path[0], flags.input_file[0]) + ".h"
+ os.makedirs(path.dirname(output_file), exist_ok=True)
+ with open(output_file, "wb") as output:
+ output.write(back_end_status.stdout)
+ return 0
+
+
+if __name__ == "__main__":
+ sys.exit(main(sys.argv))
diff --git a/examples/span_se_log_file_status.emb b/examples/span_se_log_file_status.emb
new file mode 100644
index 0000000..b0c1088
--- /dev/null
+++ b/examples/span_se_log_file_status.emb
@@ -0,0 +1,21 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+[$default byte_order: "LittleEndian"]
+
+struct LogFileStatus:
+ 0:4 UInt file_state
+ 4:16 UInt:8[12] file_name
+ 16:20 UInt file_size_kb
+ 20:24 UInt media
diff --git a/front_end/BUILD b/front_end/BUILD
new file mode 100644
index 0000000..07b96a2
--- /dev/null
+++ b/front_end/BUILD
@@ -0,0 +1,461 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Emboss front end
+#
+# The code in this directory translates an Emboss definition file (.emb) to an
+# intermediate representation (IR). The IR is passed to back end code
+# generators to generate code in various languages.
+
+package(
+ default_visibility = [
+ "//:__subpackages__",
+ ],
+)
+
+py_library(
+ name = "tokenizer",
+ srcs = ["tokenizer.py"],
+ deps = [
+ "//util:error",
+ "//util:parser_types",
+ ],
+)
+
+py_test(
+ name = "tokenizer_test",
+ srcs = ["tokenizer_test.py"],
+ python_version = "PY3",
+ deps = [
+ ":tokenizer",
+ "//util:error",
+ "//util:parser_types",
+ ],
+)
+
+py_library(
+ name = "lr1",
+ srcs = ["lr1.py"],
+ deps = [
+ "//util:parser_types",
+ ],
+)
+
+py_test(
+ name = "lr1_test",
+ srcs = ["lr1_test.py"],
+ python_version = "PY3",
+ deps = [
+ ":lr1",
+ "//util:parser_types",
+ ],
+)
+
+py_library(
+ name = "module_ir",
+ srcs = ["module_ir.py"],
+ deps = [
+ "//public:ir_pb2",
+ "//util:name_conversion",
+ "//util:parser_types",
+ ],
+)
+
+py_test(
+ name = "module_ir_test",
+ srcs = ["module_ir_test.py"],
+ data = [
+ "//testdata:golden_files",
+ ],
+ python_version = "PY3",
+ deps = [
+ ":module_ir",
+ ":parser",
+ ":test_util",
+ ":tokenizer",
+ "//public:ir_pb2",
+ ],
+)
+
+py_library(
+ name = "parser",
+ srcs = ["parser.py"],
+ data = [
+ "error_examples",
+ ],
+ deps = [
+ ":lr1",
+ ":module_ir",
+ ":tokenizer",
+ "//util:simple_memoizer",
+ ],
+)
+
+py_test(
+ name = "parser_test",
+ srcs = ["parser_test.py"],
+ python_version = "PY3",
+ deps = [
+ ":lr1",
+ ":parser",
+ ":tokenizer",
+ "//util:parser_types",
+ ],
+)
+
+py_library(
+ name = "test_util",
+ testonly = 1,
+ srcs = ["test_util.py"],
+ deps = [
+ ],
+)
+
+py_test(
+ name = "test_util_test",
+ srcs = ["test_util_test.py"],
+ python_version = "PY3",
+ deps = [
+ ":test_util",
+ "//public:ir_pb2",
+ "//util:parser_types",
+ ],
+)
+
+py_library(
+ name = "glue",
+ srcs = ["glue.py"],
+ data = [
+ "prelude.emb",
+ ],
+ visibility = ["//:__subpackages__"],
+ deps = [
+ ":attribute_checker",
+ ":constraints",
+ ":dependency_checker",
+ ":expression_bounds",
+ ":lr1",
+ ":module_ir",
+ ":parser",
+ ":symbol_resolver",
+ ":synthetics",
+ ":tokenizer",
+ ":type_check",
+ ":write_inference",
+ "//public:ir_pb2",
+ "//util:error",
+ "//util:parser_types",
+ ],
+)
+
+py_test(
+ name = "glue_test",
+ srcs = ["glue_test.py"],
+ data = [
+ "//testdata:golden_files",
+ ],
+ python_version = "PY3",
+ deps = [
+ ":glue",
+ ":test_util",
+ "//public:ir_pb2",
+ "//util:error",
+ "//util:parser_types",
+ ],
+)
+
+py_library(
+ name = "synthetics",
+ srcs = ["synthetics.py"],
+ visibility = ["//visibility:private"],
+ deps = [
+ "//public:ir_pb2",
+ "//util:expression_parser",
+ "//util:traverse_ir",
+ ],
+)
+
+py_test(
+ name = "synthetics_test",
+ srcs = ["synthetics_test.py"],
+ python_version = "PY3",
+ deps = [
+ ":glue",
+ ":synthetics",
+ ":test_util",
+ ],
+)
+
+py_library(
+ name = "symbol_resolver",
+ srcs = ["symbol_resolver.py"],
+ visibility = ["//visibility:private"],
+ deps = [
+ "//public:ir_pb2",
+ "//util:error",
+ "//util:ir_util",
+ "//util:traverse_ir",
+ ],
+)
+
+py_test(
+ name = "symbol_resolver_test",
+ srcs = ["symbol_resolver_test.py"],
+ python_version = "PY3",
+ shard_count = 8,
+ deps = [
+ ":glue",
+ ":symbol_resolver",
+ ":test_util",
+ "//util:error",
+ ],
+)
+
+py_library(
+ name = "write_inference",
+ srcs = ["write_inference.py"],
+ visibility = ["//visibility:private"],
+ deps = [
+ ":attributes",
+ ":expression_bounds",
+ "//public:ir_pb2",
+ "//util:ir_util",
+ "//util:traverse_ir",
+ ],
+)
+
+py_test(
+ name = "write_inference_test",
+ srcs = ["write_inference_test.py"],
+ python_version = "PY3",
+ shard_count = 8,
+ deps = [
+ ":glue",
+ ":test_util",
+ ":write_inference",
+ "//public:ir_pb2",
+ ],
+)
+
+py_library(
+ name = "attribute_checker",
+ srcs = ["attribute_checker.py"],
+ deps = [
+ ":attributes",
+ ":type_check",
+ "//public:ir_pb2",
+ "//util:error",
+ "//util:ir_util",
+ "//util:traverse_ir",
+ ],
+)
+
+py_library(
+ name = "attributes",
+ srcs = ["attributes.py"],
+ deps = [],
+)
+
+py_test(
+ name = "attribute_checker_test",
+ timeout = "long",
+ srcs = ["attribute_checker_test.py"],
+ python_version = "PY3",
+ shard_count = 16,
+ deps = [
+ ":attribute_checker",
+ ":glue",
+ ":test_util",
+ "//public:ir_pb2",
+ "//util:error",
+ "//util:ir_util",
+ ],
+)
+
+py_library(
+ name = "type_check",
+ srcs = ["type_check.py"],
+ deps = [
+ ":attributes",
+ "//public:ir_pb2",
+ "//util:error",
+ "//util:ir_util",
+ "//util:traverse_ir",
+ ],
+)
+
+py_test(
+ name = "type_check_test",
+ srcs = ["type_check_test.py"],
+ python_version = "PY3",
+ shard_count = 8,
+ deps = [
+ ":glue",
+ ":test_util",
+ ":type_check",
+ "//util:error",
+ ],
+)
+
+py_library(
+ name = "expression_bounds",
+ srcs = ["expression_bounds.py"],
+ data = [
+ "reserved_words",
+ ],
+ deps = [
+ ":attributes",
+ "//public:ir_pb2",
+ "//util:ir_util",
+ "//util:traverse_ir",
+ ],
+)
+
+py_test(
+ name = "expression_bounds_test",
+ srcs = ["expression_bounds_test.py"],
+ python_version = "PY3",
+ shard_count = 4,
+ deps = [
+ ":expression_bounds",
+ ":glue",
+ ":test_util",
+ ],
+)
+
+py_library(
+ name = "constraints",
+ srcs = ["constraints.py"],
+ data = [
+ "reserved_words",
+ ],
+ deps = [
+ ":attributes",
+ "//public:ir_pb2",
+ "//util:error",
+ "//util:ir_util",
+ "//util:traverse_ir",
+ ],
+)
+
+py_test(
+ name = "constraints_test",
+ srcs = ["constraints_test.py"],
+ python_version = "PY3",
+ shard_count = 8,
+ deps = [
+ ":constraints",
+ ":glue",
+ ":test_util",
+ "//util:error",
+ ],
+)
+
+py_library(
+ name = "dependency_checker",
+ srcs = ["dependency_checker.py"],
+ deps = [
+ "//public:ir_pb2",
+ "//util:error",
+ "//util:ir_util",
+ "//util:traverse_ir",
+ ],
+)
+
+py_test(
+ name = "dependency_checker_test",
+ srcs = ["dependency_checker_test.py"],
+ python_version = "PY3",
+ shard_count = 8,
+ deps = [
+ ":dependency_checker",
+ ":glue",
+ ":test_util",
+ "//util:error",
+ ],
+)
+
+py_binary(
+ name = "emboss_front_end",
+ srcs = ["emboss_front_end.py"],
+ python_version = "PY3",
+ visibility = ["//visibility:public"],
+ deps = [
+ ":glue",
+ ":module_ir",
+ "//util:error",
+ ],
+)
+
+py_binary(
+ name = "format",
+ srcs = ["format.py"],
+ main = "format.py",
+ python_version = "PY3",
+ visibility = ["//visibility:public"],
+ deps = [
+ ":format_emb",
+ ":parser",
+ ":tokenizer",
+ "//util:error",
+ ],
+)
+
+py_library(
+ name = "format_emb",
+ srcs = ["format_emb.py"],
+ deps = [
+ ":module_ir",
+ ":tokenizer",
+ "//util:parser_types",
+ ],
+)
+
+py_test(
+ name = "format_emb_test",
+ srcs = ["format_emb_test.py"],
+ data = [
+ "//testdata:format_embs",
+ ],
+ python_version = "PY3",
+ deps = [
+ ":format_emb",
+ ":module_ir",
+ ":parser",
+ ":tokenizer",
+ ],
+)
+
+py_binary(
+ name = "generate_grammar_md",
+ srcs = ["generate_grammar_md.py"],
+ python_version = "PY3",
+ deps = [
+ ":constraints",
+ ":module_ir",
+ ":tokenizer",
+ ],
+)
+
+py_test(
+ name = "docs_are_up_to_date_test",
+ srcs = ["docs_are_up_to_date_test.py"],
+ data = [
+ "//g3doc:grammar_md",
+ ],
+ python_version = "PY3",
+ deps = [
+ ":generate_grammar_md",
+ ],
+)
diff --git a/front_end/__init__.py b/front_end/__init__.py
new file mode 100644
index 0000000..2c31d84
--- /dev/null
+++ b/front_end/__init__.py
@@ -0,0 +1,14 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
diff --git a/front_end/attribute_checker.py b/front_end/attribute_checker.py
new file mode 100644
index 0000000..b1e4072
--- /dev/null
+++ b/front_end/attribute_checker.py
@@ -0,0 +1,573 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Module which adds and verifies attributes in Emboss IR.
+
+The main entry point is normalize_and_verify(), which adds attributes and/or
+verifies attributes which may have been manually entered.
+"""
+
+from front_end import attributes
+from front_end import type_check
+from public import ir_pb2
+from util import error
+from util import ir_util
+from util import traverse_ir
+
+# The "namespace" attribute is C++-back-end specific, and so should not be used
+# by the front end.
+_NAMESPACE = "namespace"
+
+
+# Error messages used by multiple attribute type checkers.
+_BAD_TYPE_MESSAGE = "Attribute '{name}' must have {type} value."
+_MUST_BE_CONSTANT_MESSAGE = "Attribute '{name}' must have a constant value."
+
+
+# Attribute type checkers
+def _is_constant_boolean(attr, module_source_file):
+ """Checks if the given attr is a constant boolean."""
+ if not attr.value.expression.type.boolean.HasField("value"):
+ return [[error.error(module_source_file,
+ attr.value.source_location,
+ _BAD_TYPE_MESSAGE.format(name=attr.name.text,
+ type="a constant boolean"))]]
+ return []
+
+
+def _is_boolean(attr, module_source_file):
+ """Checks if the given attr is a boolean."""
+ if attr.value.expression.type.WhichOneof("type") != "boolean":
+ return [[error.error(module_source_file,
+ attr.value.source_location,
+ _BAD_TYPE_MESSAGE.format(name=attr.name.text,
+ type="a boolean"))]]
+ return []
+
+
+def _is_constant_integer(attr, module_source_file):
+ """Checks if the given attr is an integer constant expression."""
+ if (not attr.value.HasField("expression") or
+ attr.value.expression.type.WhichOneof("type") != "integer"):
+ return [[error.error(module_source_file,
+ attr.value.source_location,
+ _BAD_TYPE_MESSAGE.format(name=attr.name.text,
+ type="an integer"))]]
+ if not ir_util.is_constant(attr.value.expression):
+ return [[error.error(module_source_file,
+ attr.value.source_location,
+ _MUST_BE_CONSTANT_MESSAGE.format(
+ name=attr.name.text))]]
+ return []
+
+
+def _is_string(attr, module_source_file):
+ """Checks if the given attr is a string."""
+ if not attr.value.HasField("string_constant"):
+ return [[error.error(module_source_file,
+ attr.value.source_location,
+ _BAD_TYPE_MESSAGE.format(name=attr.name.text,
+ type="a string"))]]
+ return []
+
+
+def _is_valid_byte_order(attr, module_source_file):
+ """Checks if the given attr is a valid byte_order."""
+ return _is_string_from_list(attr, module_source_file,
+ {"BigEndian", "LittleEndian", "Null"})
+
+
+def _is_string_from_list(attr, module_source_file, valid_values):
+ """Checks if the given attr has one of the valid_values."""
+ if attr.value.string_constant.text not in valid_values:
+ return [[error.error(module_source_file,
+ attr.value.source_location,
+ "Attribute '{name}' must be '{options}'.".format(
+ name=attr.name.text,
+ options="' or '".join(sorted(valid_values))))]]
+ return []
+
+
+def _is_valid_text_output(attr, module_source_file):
+ """Checks if the given attr is a valid text_output."""
+ return _is_string_from_list(attr, module_source_file, {"Emit", "Skip"})
+
+
+# Attributes must be the same type no matter where they occur.
+_ATTRIBUTE_TYPES = {
+ ("", attributes.ADDRESSABLE_UNIT_SIZE): _is_constant_integer,
+ ("", attributes.BYTE_ORDER): _is_valid_byte_order,
+ ("", attributes.FIXED_SIZE): _is_constant_integer,
+ ("", attributes.IS_INTEGER): _is_constant_boolean,
+ ("", attributes.REQUIRES): _is_boolean,
+ ("", attributes.STATIC_REQUIREMENTS): _is_boolean,
+ ("", attributes.TEXT_OUTPUT): _is_valid_text_output,
+ ("cpp", _NAMESPACE): _is_string,
+}
+
+_MODULE_ATTRIBUTES = {
+ ("", attributes.BYTE_ORDER, True),
+ # TODO(bolms): Allow back-end-specific attributes to be specified
+ # externally.
+ ("cpp", _NAMESPACE, False),
+}
+_BITS_ATTRIBUTES = {
+ ("", attributes.FIXED_SIZE, False),
+ ("", attributes.REQUIRES, False),
+ ("", attributes.STATIC_REQUIREMENTS, False),
+}
+_STRUCT_ATTRIBUTES = {
+ ("", attributes.FIXED_SIZE, False),
+ ("", attributes.BYTE_ORDER, True),
+ ("", attributes.REQUIRES, False),
+ ("", attributes.STATIC_REQUIREMENTS, False),
+}
+_ENUM_ATTRIBUTES = {
+ ("", attributes.STATIC_REQUIREMENTS, False),
+}
+_EXTERNAL_ATTRIBUTES = {
+ ("", attributes.ADDRESSABLE_UNIT_SIZE, False),
+ ("", attributes.FIXED_SIZE, False),
+ ("", attributes.IS_INTEGER, False),
+ ("", attributes.STATIC_REQUIREMENTS, False),
+}
+_STRUCT_PHYSICAL_FIELD_ATTRIBUTES = {
+ ("", attributes.BYTE_ORDER, False),
+ ("", attributes.REQUIRES, False),
+ ("", attributes.TEXT_OUTPUT, False),
+}
+_STRUCT_VIRTUAL_FIELD_ATTRIBUTES = {
+ ("", attributes.REQUIRES, False),
+ ("", attributes.TEXT_OUTPUT, False),
+}
+
+
+def _construct_integer_attribute(name, value, source_location):
+ """Constructs an integer Attribute with the given name and value."""
+ attr_value = ir_pb2.AttributeValue(
+ expression=ir_pb2.Expression(
+ constant=ir_pb2.NumericConstant(value=str(value),
+ source_location=source_location),
+ type=ir_pb2.ExpressionType(
+ integer=ir_pb2.IntegerType(modular_value=str(value),
+ modulus="infinity",
+ minimum_value=str(value),
+ maximum_value=str(value))),
+ source_location=source_location),
+ source_location=source_location)
+ return ir_pb2.Attribute(name=ir_pb2.Word(text=name),
+ value=attr_value,
+ source_location=source_location)
+
+
+def _construct_string_attribute(name, value, source_location):
+ """Constructs a string Attribute with the given name and value."""
+ attr_value = ir_pb2.AttributeValue(
+ string_constant=ir_pb2.String(text=value,
+ source_location=source_location),
+ source_location=source_location)
+ return ir_pb2.Attribute(name=ir_pb2.Word(text=name,
+ source_location=source_location),
+ value=attr_value,
+ source_location=source_location)
+
+
+def _check_attributes_in_ir(ir):
+ """Performs basic checks on all attributes in the given ir.
+
+ This function calls _check_attributes on each attribute list in ir.
+
+ Arguments:
+ ir: An ir_pb2.EmbossIr to check.
+
+ Returns:
+ A list of lists of error.Error, or an empty list if there were no errors.
+ """
+
+ def check_module(module, errors):
+ errors.extend(_check_attributes(
+ module.attribute, _MODULE_ATTRIBUTES, "module '{}'".format(
+ module.source_file_name), module.source_file_name))
+
+ def check_type_definition(type_definition, source_file_name, errors):
+ if type_definition.HasField("structure"):
+ if type_definition.addressable_unit == ir_pb2.TypeDefinition.BYTE:
+ errors.extend(_check_attributes(
+ type_definition.attribute, _STRUCT_ATTRIBUTES, "struct '{}'".format(
+ type_definition.name.name.text), source_file_name))
+ elif type_definition.addressable_unit == ir_pb2.TypeDefinition.BIT:
+ errors.extend(_check_attributes(
+ type_definition.attribute, _BITS_ATTRIBUTES, "bits '{}'".format(
+ type_definition.name.name.text), source_file_name))
+ else:
+ assert False, "Unexpected addressable_unit '{}'".format(
+ type_definition.addressable_unit)
+ elif type_definition.HasField("enumeration"):
+ errors.extend(_check_attributes(
+ type_definition.attribute, _ENUM_ATTRIBUTES, "enum '{}'".format(
+ type_definition.name.name.text), source_file_name))
+ elif type_definition.HasField("external"):
+ errors.extend(_check_attributes(
+ type_definition.attribute, _EXTERNAL_ATTRIBUTES,
+ "external '{}'".format(
+ type_definition.name.name.text), source_file_name))
+
+ def check_struct_field(field, source_file_name, errors):
+ if ir_util.field_is_virtual(field):
+ field_attributes = _STRUCT_VIRTUAL_FIELD_ATTRIBUTES
+ field_adjective = "virtual "
+ else:
+ field_attributes = _STRUCT_PHYSICAL_FIELD_ATTRIBUTES
+ field_adjective = ""
+ errors.extend(_check_attributes(
+ field.attribute, field_attributes,
+ "{}struct field '{}'".format(field_adjective, field.name.name.text),
+ source_file_name))
+
+ errors = []
+ # TODO(bolms): Add a check that only known $default'ed attributes are
+ # used.
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Module], check_module,
+ parameters={"errors": errors})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.TypeDefinition], check_type_definition,
+ parameters={"errors": errors})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Field], check_struct_field,
+ parameters={"errors": errors})
+ return errors
+
+
+def _check_attributes(attribute_list, attribute_specs, context_name,
+ module_source_file):
+ """Performs basic checks on the given list of attributes.
+
+ Checks the given attribute_list for duplicates, unknown attributes, attributes
+ with incorrect type, and attributes whose values are not constant.
+
+ Arguments:
+ attribute_list: An iterable of ir_pb2.Attribute.
+ attribute_specs: A set of (back_end, attribute_name, is_default) tuples
+ specifying the allowed attributes.
+ context_name: A name for the context of these attributes, such as "struct
+ 'Foo'" or "module 'm.emb'". Used in error messages.
+ module_source_file: The value of module.source_file_name from the module
+ containing 'attribute_list'. Used in error messages.
+
+ Returns:
+ A list of lists of error.Errors. An empty list indicates no errors were
+ found.
+ """
+ errors = []
+ already_seen_attributes = {}
+ for attr in attribute_list:
+ if attr.back_end.text:
+ attribute_name = "({}) {}".format(attr.back_end.text, attr.name.text)
+ else:
+ attribute_name = attr.name.text
+ if (attr.name.text, attr.is_default) in already_seen_attributes:
+ original_attr = already_seen_attributes[attr.name.text, attr.is_default]
+ errors.append([
+ error.error(module_source_file,
+ attr.source_location,
+ "Duplicate attribute '{}'.".format(attribute_name)),
+ error.note(module_source_file,
+ original_attr.source_location,
+ "Original attribute")])
+ continue
+ already_seen_attributes[attr.name.text, attr.is_default] = attr
+
+ if ((attr.back_end.text, attr.name.text, attr.is_default) not in
+ attribute_specs):
+ if attr.is_default:
+ error_message = "Attribute '{}' may not be defaulted on {}.".format(
+ attribute_name, context_name)
+ else:
+ error_message = "Unknown attribute '{}' on {}.".format(attribute_name,
+ context_name)
+ errors.append([error.error(module_source_file,
+ attr.name.source_location,
+ error_message)])
+ else:
+ attribute_check = _ATTRIBUTE_TYPES[attr.back_end.text, attr.name.text]
+ errors.extend(attribute_check(attr, module_source_file))
+ return errors
+
+
+def _fixed_size_of_struct_or_bits(struct, unit_size):
+ """Returns size of struct in bits, or None if struct is not fixed size."""
+ size = 0
+ for field in struct.field:
+ if not field.HasField("location"):
+ # Virtual fields do not contribute to the physical size of the struct.
+ continue
+ field_start = ir_util.constant_value(field.location.start)
+ field_size = ir_util.constant_value(field.location.size)
+ if field_start is None or field_size is None:
+ # Technically, start + size could be constant even if start and size are
+ # not; e.g. if start == x and size == 10 - x, but we don't handle that
+ # here.
+ return None
+ # TODO(bolms): knows_own_size
+ # TODO(bolms): compute min/max sizes for variable-sized arrays.
+ field_end = field_start + field_size
+ if field_end >= size:
+ size = field_end
+ return size * unit_size
+
+
+def _verify_size_attributes_on_structure(struct, type_definition,
+ source_file_name, errors):
+ """Verifies size attributes on a struct or bits."""
+ fixed_size = _fixed_size_of_struct_or_bits(struct,
+ type_definition.addressable_unit)
+ fixed_size_attr = ir_util.get_attribute(type_definition.attribute,
+ attributes.FIXED_SIZE)
+ if not fixed_size_attr:
+ return
+ if fixed_size is None:
+ errors.append([error.error(
+ source_file_name, fixed_size_attr.source_location,
+ "Struct is marked as fixed size, but contains variable-location "
+ "fields.")])
+ elif ir_util.constant_value(fixed_size_attr.expression) != fixed_size:
+ errors.append([error.error(
+ source_file_name, fixed_size_attr.source_location,
+ "Struct is {} bits, but is marked as {} bits.".format(
+ fixed_size, ir_util.constant_value(fixed_size_attr.expression)))])
+
+
+# TODO(bolms): remove [fixed_size]; it is superseded by $size_in_{bits,bytes}
+def _add_missing_size_attributes_on_structure(struct, type_definition):
+ """Adds missing size attributes on a struct."""
+ fixed_size = _fixed_size_of_struct_or_bits(struct,
+ type_definition.addressable_unit)
+ if fixed_size is None:
+ return
+ fixed_size_attr = ir_util.get_attribute(type_definition.attribute,
+ attributes.FIXED_SIZE)
+ if not fixed_size_attr:
+ # TODO(bolms): Use the offset and length of the last field as the
+ # source_location of the fixed_size attribute?
+ type_definition.attribute.extend([
+ _construct_integer_attribute(attributes.FIXED_SIZE, fixed_size,
+ type_definition.source_location)])
+
+
+def _field_needs_byte_order(field, type_definition, ir):
+ """Returns true if the given field needs a byte_order attribute."""
+ if ir_util.field_is_virtual(field):
+ # Virtual fields have no physical type, and thus do not need a byte order.
+ return False
+ field_type = ir_util.find_object(
+ ir_util.get_base_type(field.type).atomic_type.reference.canonical_name,
+ ir)
+ assert field_type is not None
+ assert field_type.addressable_unit != ir_pb2.TypeDefinition.NONE
+ return field_type.addressable_unit != type_definition.addressable_unit
+
+
+def _field_may_have_null_byte_order(field, type_definition, ir):
+ """Returns true if "Null" is a valid byte order for the given field."""
+ # If the field is one unit in length, then byte order does not matter.
+ if (ir_util.is_constant(field.location.size) and
+ ir_util.constant_value(field.location.size) == 1):
+ return True
+ unit = type_definition.addressable_unit
+ # Otherwise, if the field's type is either a one-unit-sized type or an array
+ # of a one-unit-sized type, then byte order does not matter.
+ if (ir_util.fixed_size_of_type_in_bits(ir_util.get_base_type(field.type), ir)
+ == unit):
+ return True
+ # In all other cases, byte order does matter.
+ return False
+
+
+def _add_missing_byte_order_attribute_on_field(field, type_definition, ir,
+ defaults):
+ """Adds missing byte_order attributes to fields that need them."""
+ if _field_needs_byte_order(field, type_definition, ir):
+ byte_order_attr = ir_util.get_attribute(field.attribute,
+ attributes.BYTE_ORDER)
+ if byte_order_attr is None:
+ if attributes.BYTE_ORDER in defaults:
+ field.attribute.extend([defaults[attributes.BYTE_ORDER]])
+ elif _field_may_have_null_byte_order(field, type_definition, ir):
+ field.attribute.extend(
+ [_construct_string_attribute(attributes.BYTE_ORDER, "Null",
+ field.source_location)])
+
+
+def _add_addressable_unit_to_external(external, type_definition):
+ """Sets the addressable_unit field for an external TypeDefinition."""
+ # Strictly speaking, addressable_unit isn't an "attribute," but it's close
+ # enough that it makes sense to handle it with attributes.
+ del external # Unused.
+ size = ir_util.get_integer_attribute(type_definition.attribute,
+ attributes.ADDRESSABLE_UNIT_SIZE)
+ if size == 1:
+ type_definition.addressable_unit = ir_pb2.TypeDefinition.BIT
+ elif size == 8:
+ type_definition.addressable_unit = ir_pb2.TypeDefinition.BYTE
+ # If the addressable_unit_size is not in (1, 8), it will be caught by
+ # _verify_addressable_unit_attribute_on_external, below.
+
+
+def _verify_byte_order_attribute_on_field(field, type_definition,
+                                          source_file_name, ir, errors):
+  """Verifies the byte_order attribute on the given field."""
+  byte_order_attr = ir_util.get_attribute(field.attribute,
+                                          attributes.BYTE_ORDER)
+  field_needs_byte_order = _field_needs_byte_order(field, type_definition, ir)
+  if byte_order_attr and not field_needs_byte_order:
+    errors.append([error.error(
+        source_file_name, byte_order_attr.source_location,
+        "Attribute 'byte_order' not allowed on field which is not byte order "
+        "dependent.")])
+  if not byte_order_attr and field_needs_byte_order:
+    errors.append([error.error(
+        source_file_name, field.source_location,
+        "Attribute 'byte_order' required on field which is byte order "
+        "dependent.")])
+  if (byte_order_attr and byte_order_attr.string_constant.text == "Null" and
+      not _field_may_have_null_byte_order(field, type_definition, ir)):
+    errors.append([error.error(
+        source_file_name, byte_order_attr.source_location,
+        "Attribute 'byte_order' may only be 'Null' for one-byte fields.")])
+
+
+def _verify_requires_attribute_on_field(field, source_file_name, ir, errors):
+  """Verifies that [requires] is valid on the given field."""
+  requires_attr = ir_util.get_attribute(field.attribute, attributes.REQUIRES)
+  if not requires_attr:
+    return
+  if ir_util.field_is_virtual(field):
+    field_expression_type = field.read_transform.type
+  else:
+    if not field.type.HasField("atomic_type"):
+      errors.append([
+          error.error(source_file_name, requires_attr.source_location,
+                      "Attribute 'requires' is only allowed on integer, "
+                      "enumeration, or boolean fields, not arrays."),
+          error.note(source_file_name, field.type.source_location,
+                     "Field type."),
+      ])
+      return
+    field_type = ir_util.find_object(field.type.atomic_type.reference, ir)
+    assert field_type, "Field type should be non-None after name resolution."
+    field_expression_type = (
+        type_check.unbounded_expression_type_for_physical_type(field_type))
+  if field_expression_type.WhichOneof("type") not in (
+      "integer", "enumeration", "boolean"):
+    errors.append([error.error(
+        source_file_name, requires_attr.source_location,
+        "Attribute 'requires' is only allowed on integer, enumeration, or "
+        "boolean fields.")])
+
+
+def _verify_addressable_unit_attribute_on_external(external, type_definition,
+                                                   source_file_name, errors):
+  """Verifies the addressable_unit_size attribute on an external."""
+  del external  # Unused.
+  addressable_unit_size_attr = ir_util.get_integer_attribute(
+      type_definition.attribute, attributes.ADDRESSABLE_UNIT_SIZE)
+  if addressable_unit_size_attr is None:
+    errors.append([error.error(
+        source_file_name, type_definition.source_location,
+        "Expected '{}' attribute for external type.".format(
+            attributes.ADDRESSABLE_UNIT_SIZE))])
+  elif addressable_unit_size_attr not in (1, 8):
+    errors.append([
+        error.error(source_file_name, type_definition.source_location,
+                    "Only values '1' (bit) and '8' (byte) are allowed for the "
+                    "'{}' attribute".format(attributes.ADDRESSABLE_UNIT_SIZE))
+    ])
+
+
+def _gather_default_attributes(obj, defaults):
+  defaults = defaults.copy()
+  for attr in obj.attribute:
+    if attr.is_default:
+      defaulted_attr = ir_pb2.Attribute()
+      defaulted_attr.CopyFrom(attr)
+      defaulted_attr.is_default = False
+      defaults[attr.name.text] = defaulted_attr
+  return {"defaults": defaults}
+
+
+def _add_missing_attributes_on_ir(ir):
+  """Adds missing attributes in a complete IR."""
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.External], _add_addressable_unit_to_external)
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.Structure], _add_missing_size_attributes_on_structure,
+      incidental_actions={
+          ir_pb2.Module: _gather_default_attributes,
+          ir_pb2.TypeDefinition: _gather_default_attributes,
+          ir_pb2.Field: _gather_default_attributes,
+      },
+      parameters={"defaults": {}})
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.Field], _add_missing_byte_order_attribute_on_field,
+      incidental_actions={
+          ir_pb2.Module: _gather_default_attributes,
+          ir_pb2.TypeDefinition: _gather_default_attributes,
+          ir_pb2.Field: _gather_default_attributes,
+      },
+      parameters={"defaults": {}})
+  return []
+
+
+def _verify_field_attributes(field, type_definition, source_file_name, ir,
+                             errors):
+  _verify_byte_order_attribute_on_field(field, type_definition,
+                                        source_file_name, ir, errors)
+  _verify_requires_attribute_on_field(field, source_file_name, ir, errors)
+
+
+def _verify_attributes_on_ir(ir):
+  """Verifies attributes in a complete IR."""
+  errors = []
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.Structure], _verify_size_attributes_on_structure,
+      parameters={"errors": errors})
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.External], _verify_addressable_unit_attribute_on_external,
+      parameters={"errors": errors})
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.Field], _verify_field_attributes,
+      parameters={"errors": errors})
+  return errors
+
+
+def normalize_and_verify(ir):
+  """Performs various normalizations and verifications on ir.
+
+  Checks for duplicate attributes.
+
+  Adds fixed_size_in_bits and byte_order attributes when they are missing, and
+  checks the correctness of attributes that are already present.
+
+  Arguments:
+    ir: The IR object to normalize.
+
+  Returns:
+    A list of validation errors, or an empty list if no errors were encountered.
+  """
+  errors = _check_attributes_in_ir(ir)
+  if errors:
+    return errors
+  _add_missing_attributes_on_ir(ir)
+  return _verify_attributes_on_ir(ir)
diff --git a/front_end/attribute_checker_test.py b/front_end/attribute_checker_test.py
new file mode 100644
index 0000000..f9a6cfc
--- /dev/null
+++ b/front_end/attribute_checker_test.py
@@ -0,0 +1,589 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for attribute_checker.py."""
+
+import unittest
+from front_end import attribute_checker
+from front_end import glue
+from front_end import test_util
+from public import ir_pb2
+from util import error
+from util import ir_util
+
+# These are not shared with attribute_checker.py because their values are part
+# of the contract with back ends.
+_BYTE_ORDER = "byte_order"
+_FIXED_SIZE = "fixed_size_in_bits"
+
+
+def _make_ir_from_emb(emb_text, name="m.emb"):
+  ir, unused_debug_info, errors = glue.parse_emboss_file(
+      name,
+      test_util.dict_file_reader({name: emb_text}),
+      stop_before_step="normalize_and_verify")
+  assert not errors
+  return ir
+
+
+class NormalizeIrTest(unittest.TestCase):
+
+  def test_rejects_may_be_used_as_integer(self):
+    enum_ir = _make_ir_from_emb("enum Foo:\n"
+                                "  [may_be_used_as_integer: false]\n"
+                                "  VALUE = 1\n")
+    enum_type_ir = enum_ir.module[0].type[0]
+    self.assertEqual([[
+        error.error(
+            "m.emb", enum_type_ir.attribute[0].name.source_location,
+            "Unknown attribute 'may_be_used_as_integer' on enum 'Foo'.")
+    ]], attribute_checker.normalize_and_verify(enum_ir))
+
+  def test_adds_fixed_size_attribute_to_struct(self):
+    # field2 is intentionally after field3, in order to trigger certain code
+    # paths in attribute_checker.py.
+    struct_ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                                  "struct Foo:\n"
+                                  "  0 [+2] UInt field1\n"
+                                  "  4 [+4] UInt field2\n"
+                                  "  2 [+2] UInt field3\n")
+    self.assertEqual([], attribute_checker.normalize_and_verify(struct_ir))
+    size_attr = ir_util.get_attribute(struct_ir.module[0].type[0].attribute,
+                                      _FIXED_SIZE)
+    self.assertEqual(64, ir_util.constant_value(size_attr.expression))
+    self.assertEqual(struct_ir.module[0].type[0].source_location,
+                     size_attr.source_location)
+
+  def test_adds_fixed_size_attribute_to_struct_with_virtual_field(self):
+    struct_ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                                  "struct Foo:\n"
+                                  "  0 [+2] UInt field1\n"
+                                  "  let field2 = field1\n"
+                                  "  2 [+2] UInt field3\n")
+    self.assertEqual([], attribute_checker.normalize_and_verify(struct_ir))
+    size_attr = ir_util.get_attribute(struct_ir.module[0].type[0].attribute,
+                                      _FIXED_SIZE)
+    self.assertEqual(32, ir_util.constant_value(size_attr.expression))
+    self.assertEqual(struct_ir.module[0].type[0].source_location,
+                     size_attr.source_location)
+
+  def test_adds_fixed_size_attribute_to_anonymous_bits(self):
+    struct_ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                                  "struct Foo:\n"
+                                  "  0 [+4] bits:\n"
+                                  "    0 [+8] UInt field\n")
+    self.assertEqual([], attribute_checker.normalize_and_verify(struct_ir))
+    size_attr = ir_util.get_attribute(struct_ir.module[0].type[0].attribute,
+                                      _FIXED_SIZE)
+    self.assertEqual(32, ir_util.constant_value(size_attr.expression))
+    bits_size_attr = ir_util.get_attribute(
+        struct_ir.module[0].type[0].subtype[0].attribute, _FIXED_SIZE)
+    self.assertEqual(8, ir_util.constant_value(bits_size_attr.expression))
+    self.assertEqual(struct_ir.module[0].type[0].source_location,
+                     size_attr.source_location)
+
+  def test_does_not_add_fixed_size_attribute_to_variable_size_struct(self):
+    struct_ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                                  "struct Foo:\n"
+                                  "  0 [+4] UInt n\n"
+                                  "  4 [+n] UInt:8[] payload\n")
+    self.assertEqual([], attribute_checker.normalize_and_verify(struct_ir))
+    self.assertIsNone(ir_util.get_attribute(
+        struct_ir.module[0].type[0].attribute, _FIXED_SIZE))
+
+  def test_accepts_correct_fixed_size_and_size_attributes_on_struct(self):
+    struct_ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                                  "struct Foo:\n"
+                                  "  [fixed_size_in_bits: 64]\n"
+                                  "  0 [+2] UInt field1\n"
+                                  "  2 [+2] UInt field2\n"
+                                  "  4 [+4] UInt field3\n")
+    self.assertEqual([], attribute_checker.normalize_and_verify(struct_ir))
+    size_attr = ir_util.get_attribute(struct_ir.module[0].type[0].attribute,
+                                      _FIXED_SIZE)
+    self.assertTrue(size_attr)
+    self.assertEqual(64, ir_util.constant_value(size_attr.expression))
+
+  def test_accepts_correct_size_attribute_on_struct(self):
+    struct_ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                                  "struct Foo:\n"
+                                  "  [fixed_size_in_bits: 64]\n"
+                                  "  0 [+2] UInt field1\n"
+                                  "  4 [+4] UInt field3\n")
+    self.assertEqual([], attribute_checker.normalize_and_verify(struct_ir))
+    size_attr = ir_util.get_attribute(struct_ir.module[0].type[0].attribute,
+                                      _FIXED_SIZE)
+    self.assertTrue(size_attr.expression)
+    self.assertEqual(64, ir_util.constant_value(size_attr.expression))
+
+  def test_rejects_incorrect_fixed_size_attribute_on_variable_size_struct(self):
+    struct_ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                                  "struct Foo:\n"
+                                  "  [fixed_size_in_bits: 8]\n"
+                                  "  0 [+4] UInt n\n"
+                                  "  4 [+n] UInt:8[] payload\n")
+    struct_type_ir = struct_ir.module[0].type[0]
+    self.assertEqual([[error.error(
+        "m.emb", struct_type_ir.attribute[0].value.source_location,
+        "Struct is marked as fixed size, but contains variable-location "
+        "fields.")]], attribute_checker.normalize_and_verify(struct_ir))
+
+  def test_rejects_size_attribute_with_wrong_large_value_on_struct(self):
+    struct_ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                                  "struct Foo:\n"
+                                  "  [fixed_size_in_bits: 80]\n"
+                                  "  0 [+2] UInt field1\n"
+                                  "  2 [+2] UInt field2\n"
+                                  "  4 [+4] UInt field3\n")
+    struct_type_ir = struct_ir.module[0].type[0]
+    self.assertEqual([
+        [error.error("m.emb", struct_type_ir.attribute[0].value.source_location,
+                     "Struct is 64 bits, but is marked as 80 bits.")]
+    ], attribute_checker.normalize_and_verify(struct_ir))
+
+  def test_rejects_size_attribute_with_wrong_small_value_on_struct(self):
+    struct_ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                                  "struct Foo:\n"
+                                  "  [fixed_size_in_bits: 40]\n"
+                                  "  0 [+2] UInt field1\n"
+                                  "  2 [+2] UInt field2\n"
+                                  "  4 [+4] UInt field3\n")
+    struct_type_ir = struct_ir.module[0].type[0]
+    self.assertEqual([
+        [error.error("m.emb", struct_type_ir.attribute[0].value.source_location,
+                     "Struct is 64 bits, but is marked as 40 bits.")]
+    ], attribute_checker.normalize_and_verify(struct_ir))
+
+  def test_accepts_variable_size_external(self):
+    external_ir = _make_ir_from_emb("external Foo:\n"
+                                    "  [addressable_unit_size: 1]\n")
+    self.assertEqual([], attribute_checker.normalize_and_verify(external_ir))
+
+  def test_accepts_fixed_size_external(self):
+    external_ir = _make_ir_from_emb("external Foo:\n"
+                                    "  [fixed_size_in_bits: 32]\n"
+                                    "  [addressable_unit_size: 1]\n")
+    self.assertEqual([], attribute_checker.normalize_and_verify(external_ir))
+
+  def test_rejects_external_with_no_addressable_unit_size_attribute(self):
+    external_ir = _make_ir_from_emb("external Foo:\n"
+                                    "  [is_integer: false]\n")
+    external_type_ir = external_ir.module[0].type[0]
+    self.assertEqual([
+        [error.error(
+            "m.emb", external_type_ir.source_location,
+            "Expected 'addressable_unit_size' attribute for external type.")]
+    ], attribute_checker.normalize_and_verify(external_ir))
+
+  def test_rejects_is_integer_with_non_constant_value(self):
+    external_ir = _make_ir_from_emb(
+        "external Foo:\n"
+        "  [is_integer: $static_size_in_bits == 1]\n"
+        "  [addressable_unit_size: 1]\n")
+    external_type_ir = external_ir.module[0].type[0]
+    self.assertEqual([
+        [error.error(
+            "m.emb", external_type_ir.attribute[0].value.source_location,
+            "Attribute 'is_integer' must have a constant boolean value.")]
+    ], attribute_checker.normalize_and_verify(external_ir))
+
+  def test_rejects_addressable_unit_size_with_non_constant_value(self):
+    external_ir = _make_ir_from_emb(
+        "external Foo:\n"
+        "  [is_integer: true]\n"
+        "  [addressable_unit_size: $static_size_in_bits]\n")
+    external_type_ir = external_ir.module[0].type[0]
+    self.assertEqual([
+        [error.error(
+            "m.emb", external_type_ir.attribute[1].value.source_location,
+            "Attribute 'addressable_unit_size' must have a constant value.")]
+    ], attribute_checker.normalize_and_verify(external_ir))
+
+  def test_rejects_external_with_wrong_addressable_unit_size_attribute(self):
+    external_ir = _make_ir_from_emb("external Foo:\n"
+                                    "  [addressable_unit_size: 4]\n")
+    external_type_ir = external_ir.module[0].type[0]
+    self.assertEqual([
+        [error.error(
+            "m.emb", external_type_ir.source_location,
+            "Only values '1' (bit) and '8' (byte) are allowed for the "
+            "'addressable_unit_size' attribute")]
+    ], attribute_checker.normalize_and_verify(external_ir))
+
+  def test_rejects_duplicate_attribute(self):
+    ir = _make_ir_from_emb("external Foo:\n"
+                           "  [is_integer: true]\n"
+                           "  [is_integer: true]\n")
+    self.assertEqual([[
+        error.error("m.emb", ir.module[0].type[0].attribute[1].source_location,
+                    "Duplicate attribute 'is_integer'."),
+        error.note("m.emb", ir.module[0].type[0].attribute[0].source_location,
+                   "Original attribute"),
+    ]], attribute_checker.normalize_and_verify(ir))
+
+  def test_rejects_duplicate_default_attribute(self):
+    ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                           '[$default byte_order: "LittleEndian"]\n')
+    self.assertEqual(
+        [[
+            error.error("m.emb", ir.module[0].attribute[1].source_location,
+                        "Duplicate attribute 'byte_order'."),
+            error.note("m.emb", ir.module[0].attribute[0].source_location,
+                       "Original attribute"),
+        ]], attribute_checker.normalize_and_verify(ir))
+
+  def test_rejects_unknown_attribute(self):
+    ir = _make_ir_from_emb("[gibberish: true]\n")
+    attr = ir.module[0].attribute[0]
+    self.assertEqual([[
+        error.error("m.emb", attr.name.source_location,
+                    "Unknown attribute 'gibberish' on module 'm.emb'.")
+    ]], attribute_checker.normalize_and_verify(ir))
+
+  def test_rejects_non_constant_attribute(self):
+    ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                           "struct Foo:\n"
+                           "  [fixed_size_in_bits: field1]\n"
+                           "  0 [+2] UInt field1\n")
+    attr = ir.module[0].type[0].attribute[0]
+    self.assertEqual(
+        [[
+            error.error(
+                "m.emb", attr.value.source_location,
+                "Attribute 'fixed_size_in_bits' must have a constant value.")
+        ]],
+        attribute_checker.normalize_and_verify(ir))
+
+  def test_accepts_string_attribute(self):
+    ir = _make_ir_from_emb('[(cpp) namespace: "foo"]\n')
+    self.assertEqual([], attribute_checker.normalize_and_verify(ir))
+
+  def test_rejects_wrong_type_for_string_attribute(self):
+    ir = _make_ir_from_emb("[(cpp) namespace: 9]\n")
+    attr = ir.module[0].attribute[0]
+    self.assertEqual([[
+        error.error("m.emb", attr.value.source_location,
+                    "Attribute 'namespace' must have a string value.")
+    ]], attribute_checker.normalize_and_verify(ir))
+
+  def test_accepts_back_end_qualified_attribute(self):
+    ir = _make_ir_from_emb('[(cpp) namespace: "abc"]\n')
+    self.assertEqual([], attribute_checker.normalize_and_verify(ir))
+
+  def test_rejects_attribute_missing_required_back_end_specifier(self):
+    ir = _make_ir_from_emb('[namespace: "abc"]\n')
+    attr = ir.module[0].attribute[0]
+    self.assertEqual([[
+        error.error("m.emb", attr.name.source_location,
+                    "Unknown attribute 'namespace' on module 'm.emb'.")
+    ]], attribute_checker.normalize_and_verify(ir))
+
+  def test_rejects_attribute_with_wrong_back_end_specifier(self):
+    ir = _make_ir_from_emb('[(c) namespace: "abc"]\n')
+    attr = ir.module[0].attribute[0]
+    self.assertEqual([[
+        error.error("m.emb", attr.name.source_location,
+                    "Unknown attribute '(c) namespace' on module 'm.emb'.")
+    ]], attribute_checker.normalize_and_verify(ir))
+
+  def test_rejects_emboss_internal_attribute_with_back_end_specifier(self):
+    ir = _make_ir_from_emb('[(cpp) byte_order: "LittleEndian"]\n')
+    attr = ir.module[0].attribute[0]
+    self.assertEqual([[
+        error.error("m.emb", attr.name.source_location,
+                    "Unknown attribute '(cpp) byte_order' on module 'm.emb'.")
+    ]], attribute_checker.normalize_and_verify(ir))
+
+  def test_adds_byte_order_attributes_from_default(self):
+    ir = _make_ir_from_emb('[$default byte_order: "BigEndian"]\n'
+                           "struct Foo:\n"
+                           "  0 [+2] UInt bar\n"
+                           "  2 [+2] UInt baz\n"
+                           '    [byte_order: "LittleEndian"]\n')
+    self.assertEqual([], attribute_checker.normalize_and_verify(ir))
+    byte_order_attr = ir_util.get_attribute(
+        ir.module[0].type[0].structure.field[0].attribute, _BYTE_ORDER)
+    self.assertTrue(byte_order_attr.HasField("string_constant"))
+    self.assertEqual("BigEndian", byte_order_attr.string_constant.text)
+    byte_order_attr = ir_util.get_attribute(
+        ir.module[0].type[0].structure.field[1].attribute, _BYTE_ORDER)
+    self.assertTrue(byte_order_attr.HasField("string_constant"))
+    self.assertEqual("LittleEndian", byte_order_attr.string_constant.text)
+
+  def test_adds_null_byte_order_attributes(self):
+    ir = _make_ir_from_emb("struct Foo:\n"
+                           "  0 [+1] UInt bar\n"
+                           "  1 [+1] UInt baz\n"
+                           '    [byte_order: "LittleEndian"]\n'
+                           "  2 [+2] UInt:8[] baseball\n"
+                           "  4 [+2] UInt:8[] bat\n"
+                           '    [byte_order: "LittleEndian"]\n')
+    self.assertEqual([], attribute_checker.normalize_and_verify(ir))
+    structure = ir.module[0].type[0].structure
+    byte_order_attr = ir_util.get_attribute(
+        structure.field[0].attribute, _BYTE_ORDER)
+    self.assertTrue(byte_order_attr.HasField("string_constant"))
+    self.assertEqual("Null", byte_order_attr.string_constant.text)
+    self.assertEqual(structure.field[0].source_location,
+                     byte_order_attr.source_location)
+    byte_order_attr = ir_util.get_attribute(structure.field[1].attribute,
+                                            _BYTE_ORDER)
+    self.assertTrue(byte_order_attr.HasField("string_constant"))
+    self.assertEqual("LittleEndian", byte_order_attr.string_constant.text)
+    byte_order_attr = ir_util.get_attribute(structure.field[2].attribute,
+                                            _BYTE_ORDER)
+    self.assertTrue(byte_order_attr.HasField("string_constant"))
+    self.assertEqual("Null", byte_order_attr.string_constant.text)
+    self.assertEqual(structure.field[2].source_location,
+                     byte_order_attr.source_location)
+    byte_order_attr = ir_util.get_attribute(structure.field[3].attribute,
+                                            _BYTE_ORDER)
+    self.assertTrue(byte_order_attr.HasField("string_constant"))
+    self.assertEqual("LittleEndian", byte_order_attr.string_constant.text)
+
+  def test_disallows_default_byte_order_on_field(self):
+    ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                           "struct Foo:\n"
+                           "  0 [+2] UInt bar\n"
+                           '    [$default byte_order: "LittleEndian"]\n')
+    default_byte_order = ir.module[0].type[0].structure.field[0].attribute[0]
+    self.assertEqual(
+        [[error.error(
+            "m.emb", default_byte_order.name.source_location,
+            "Attribute 'byte_order' may not be defaulted on struct field 'bar'."
+        )]],
+        attribute_checker.normalize_and_verify(ir))
+
+  def test_disallows_default_byte_order_on_bits(self):
+    ir = _make_ir_from_emb("bits Foo:\n"
+                           '  [$default byte_order: "LittleEndian"]\n'
+                           "  0 [+2] UInt bar\n")
+    default_byte_order = ir.module[0].type[0].attribute[0]
+    self.assertEqual(
+        [[error.error(
+            "m.emb", default_byte_order.name.source_location,
+            "Attribute 'byte_order' may not be defaulted on bits 'Foo'.")]],
+        attribute_checker.normalize_and_verify(ir))
+
+  def test_disallows_default_byte_order_on_enum(self):
+    ir = _make_ir_from_emb("enum Foo:\n"
+                           '  [$default byte_order: "LittleEndian"]\n'
+                           "  BAR = 1\n")
+    default_byte_order = ir.module[0].type[0].attribute[0]
+    self.assertEqual(
+        [[error.error(
+            "m.emb", default_byte_order.name.source_location,
+            "Attribute 'byte_order' may not be defaulted on enum 'Foo'.")]],
+        attribute_checker.normalize_and_verify(ir))
+
+  def test_adds_byte_order_from_scoped_default(self):
+    ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                           "struct Foo:\n"
+                           '  [$default byte_order: "BigEndian"]\n'
+                           "  0 [+2] UInt bar\n")
+    self.assertEqual([], attribute_checker.normalize_and_verify(ir))
+    byte_order_attr = ir_util.get_attribute(
+        ir.module[0].type[0].structure.field[0].attribute, _BYTE_ORDER)
+    self.assertTrue(byte_order_attr.HasField("string_constant"))
+    self.assertEqual("BigEndian", byte_order_attr.string_constant.text)
+
+  def test_disallows_unknown_byte_order(self):
+    ir = _make_ir_from_emb("struct Foo:\n"
+                           "  0 [+2] UInt bar\n"
+                           '    [byte_order: "NoEndian"]\n')
+    byte_order = ir.module[0].type[0].structure.field[0].attribute[0]
+    self.assertEqual(
+        [[error.error(
+            "m.emb", byte_order.value.source_location,
+            "Attribute 'byte_order' must be 'BigEndian' or 'LittleEndian' or "
+            "'Null'.")]],
+        attribute_checker.normalize_and_verify(ir))
+
+  def test_disallows_unknown_default_byte_order(self):
+    ir = _make_ir_from_emb('[$default byte_order: "NoEndian"]\n')
+    default_byte_order = ir.module[0].attribute[0]
+    self.assertEqual(
+        [[error.error(
+            "m.emb", default_byte_order.value.source_location,
+            "Attribute 'byte_order' must be 'BigEndian' or 'LittleEndian' or "
+            "'Null'.")]],
+        attribute_checker.normalize_and_verify(ir))
+
+  def test_disallows_byte_order_on_non_byte_order_dependent_fields(self):
+    ir = _make_ir_from_emb("struct Foo:\n"
+                           '  [$default byte_order: "LittleEndian"]\n'
+                           "  0 [+2] UInt uint\n"
+                           "struct Bar:\n"
+                           "  0 [+2] Foo foo\n"
+                           '    [byte_order: "LittleEndian"]\n')
+    byte_order = ir.module[0].type[1].structure.field[0].attribute[0]
+    self.assertEqual(
+        [[error.error(
+            "m.emb", byte_order.value.source_location,
+            "Attribute 'byte_order' not allowed on field which is not byte "
+            "order dependent.")]],
+        attribute_checker.normalize_and_verify(ir))
+
+  def test_disallows_byte_order_on_virtual_field(self):
+    ir = _make_ir_from_emb("struct Foo:\n"
+                           "  let x = 10\n"
+                           '    [byte_order: "LittleEndian"]\n')
+    byte_order = ir.module[0].type[0].structure.field[0].attribute[0]
+    self.assertEqual(
+        [[error.error(
+            "m.emb", byte_order.name.source_location,
+            "Unknown attribute 'byte_order' on virtual struct field 'x'.")]],
+        attribute_checker.normalize_and_verify(ir))
+
+  def test_disallows_null_byte_order_on_multibyte_fields(self):
+    ir = _make_ir_from_emb("struct Foo:\n"
+                           "  0 [+2] UInt uint\n"
+                           '    [byte_order: "Null"]\n')
+    byte_order = ir.module[0].type[0].structure.field[0].attribute[0]
+    self.assertEqual(
+        [[error.error(
+            "m.emb", byte_order.value.source_location,
+            "Attribute 'byte_order' may only be 'Null' for one-byte fields.")]],
+        attribute_checker.normalize_and_verify(ir))
+
+  def test_disallows_null_byte_order_on_multibyte_array_elements(self):
+    ir = _make_ir_from_emb("struct Foo:\n"
+                           "  0 [+4] UInt:16[] uint\n"
+                           '    [byte_order: "Null"]\n')
+    byte_order = ir.module[0].type[0].structure.field[0].attribute[0]
+    self.assertEqual(
+        [[error.error(
+            "m.emb", byte_order.value.source_location,
+            "Attribute 'byte_order' may only be 'Null' for one-byte fields.")]],
+        attribute_checker.normalize_and_verify(ir))
+
+  def test_requires_byte_order_on_byte_order_dependent_fields(self):
+    ir = _make_ir_from_emb("struct Foo:\n"
+                           "  0 [+2] UInt uint\n")
+    field = ir.module[0].type[0].structure.field[0]
+    self.assertEqual(
+        [[error.error(
+            "m.emb", field.source_location,
+            "Attribute 'byte_order' required on field which is byte order "
+            "dependent.")]],
+        attribute_checker.normalize_and_verify(ir))
+
+  def test_disallows_unknown_text_output_attribute(self):
+    ir = _make_ir_from_emb("struct Foo:\n"
+                           "  0 [+2] UInt bar\n"
+                           '    [text_output: "None"]\n')
+    text_output_attr = ir.module[0].type[0].structure.field[0].attribute[0]
+    self.assertEqual(
+        [[error.error(
+            "m.emb", text_output_attr.value.source_location,
+            "Attribute 'text_output' must be 'Emit' or 'Skip'.")]],
+        attribute_checker.normalize_and_verify(ir))
+
+  def test_disallows_non_string_text_output_attribute(self):
+    ir = _make_ir_from_emb("struct Foo:\n"
+                           "  0 [+2] UInt bar\n"
+                           "    [text_output: 0]\n")
+    text_output_attr = ir.module[0].type[0].structure.field[0].attribute[0]
+    self.assertEqual(
+        [[error.error(
+            "m.emb", text_output_attr.value.source_location,
+            "Attribute 'text_output' must be 'Emit' or 'Skip'.")]],
+        attribute_checker.normalize_and_verify(ir))
+
+  def test_allows_skip_text_output_attribute_on_physical_field(self):
+    ir = _make_ir_from_emb("struct Foo:\n"
+                           "  0 [+1] UInt bar\n"
+                           '    [text_output: "Skip"]\n')
+    self.assertEqual([], attribute_checker.normalize_and_verify(ir))
+
+  def test_allows_skip_text_output_attribute_on_virtual_field(self):
+    ir = _make_ir_from_emb("struct Foo:\n"
+                           "  let x = 10\n"
+                           '    [text_output: "Skip"]\n')
+    self.assertEqual([], attribute_checker.normalize_and_verify(ir))
+
+  def test_allows_emit_text_output_attribute_on_physical_field(self):
+    ir = _make_ir_from_emb("struct Foo:\n"
+                           "  0 [+1] UInt bar\n"
+                           '    [text_output: "Emit"]\n')
+    self.assertEqual([], attribute_checker.normalize_and_verify(ir))
+
+  def test_adds_bit_addressable_unit_to_external(self):
+    external_ir = _make_ir_from_emb("external Foo:\n"
+                                    "  [addressable_unit_size: 1]\n")
+    self.assertEqual([], attribute_checker.normalize_and_verify(external_ir))
+    self.assertEqual(ir_pb2.TypeDefinition.BIT,
+                     external_ir.module[0].type[0].addressable_unit)
+
+  def test_adds_byte_addressable_unit_to_external(self):
+    external_ir = _make_ir_from_emb("external Foo:\n"
+                                    "  [addressable_unit_size: 8]\n")
+    self.assertEqual([], attribute_checker.normalize_and_verify(external_ir))
+    self.assertEqual(ir_pb2.TypeDefinition.BYTE,
+                     external_ir.module[0].type[0].addressable_unit)
+
+  def test_rejects_requires_using_array(self):
+    ir = _make_ir_from_emb("struct Foo:\n"
+                           "  0 [+4] UInt:8[] array\n"
+                           "    [requires: this]\n")
+    field_ir = ir.module[0].type[0].structure.field[0]
+    self.assertEqual(
+        [[error.error("m.emb", field_ir.attribute[0].value.source_location,
+                      "Attribute 'requires' must have a boolean value.")]],
+        attribute_checker.normalize_and_verify(ir))
+
+  def test_rejects_requires_on_array(self):
+    ir = _make_ir_from_emb("struct Foo:\n"
+                           "  0 [+4] UInt:8[] array\n"
+                           "    [requires: false]\n")
+    field_ir = ir.module[0].type[0].structure.field[0]
+    self.assertEqual(
+        [[
+            error.error("m.emb", field_ir.attribute[0].value.source_location,
+                        "Attribute 'requires' is only allowed on integer, "
+                        "enumeration, or boolean fields, not arrays."),
+            error.note("m.emb", field_ir.type.source_location,
+                       "Field type."),
+        ]],
+        error.filter_errors(attribute_checker.normalize_and_verify(ir)))
+
+  def test_rejects_requires_on_struct(self):
+    ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                           "struct Foo:\n"
+                           "  0 [+4] Bar bar\n"
+                           "    [requires: false]\n"
+                           "struct Bar:\n"
+                           "  0 [+4] UInt uint\n")
+    field_ir = ir.module[0].type[0].structure.field[0]
+    self.assertEqual(
+        [[error.error("m.emb", field_ir.attribute[0].value.source_location,
+                      "Attribute 'requires' is only allowed on integer, "
+                      "enumeration, or boolean fields.")]],
+        error.filter_errors(attribute_checker.normalize_and_verify(ir)))
+
+  def test_rejects_requires_on_float(self):
+    ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+                           "struct Foo:\n"
+                           "  0 [+4] Float float\n"
+                           "    [requires: false]\n")
+    field_ir = ir.module[0].type[0].structure.field[0]
+    self.assertEqual(
+        [[error.error("m.emb", field_ir.attribute[0].value.source_location,
+                      "Attribute 'requires' is only allowed on integer, "
+                      "enumeration, or boolean fields.")]],
+        error.filter_errors(attribute_checker.normalize_and_verify(ir)))
+
+
+if __name__ == "__main__":
+  unittest.main()
diff --git a/front_end/attributes.py b/front_end/attributes.py
new file mode 100644
index 0000000..fd6e31c
--- /dev/null
+++ b/front_end/attributes.py
@@ -0,0 +1,25 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Well-known attribute names."""
+
+# Attribute names which may be used by other parts of the front end.
+ADDRESSABLE_UNIT_SIZE = "addressable_unit_size"
+BYTE_ORDER = "byte_order"
+RANGE = "range"
+FIXED_SIZE = "fixed_size_in_bits"
+IS_INTEGER = "is_integer"
+REQUIRES = "requires"
+STATIC_REQUIREMENTS = "static_requirements"
+TEXT_OUTPUT = "text_output"
diff --git a/front_end/constraints.py b/front_end/constraints.py
new file mode 100644
index 0000000..a8628a1
--- /dev/null
+++ b/front_end/constraints.py
@@ -0,0 +1,610 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Routines to check miscellaneous constraints on the IR."""
+
+import pkgutil
+
+from front_end import attributes
+from public import ir_pb2
+from util import error
+from util import ir_util
+from util import traverse_ir
+
+
+def _render_type(type_ir, ir):
+  """Returns the human-readable notation of the given type."""
+  assert type_ir.HasField("atomic_type"), (
+      "TODO(bolms): Implement _render_type for array types.")
+  if type_ir.HasField("size_in_bits"):
+    return _render_atomic_type_name(
+        type_ir,
+        ir,
+        suffix=":" + str(ir_util.constant_value(type_ir.size_in_bits)))
+  else:
+    return _render_atomic_type_name(type_ir, ir)
+
+
+def _render_atomic_type_name(type_ir, ir, suffix=None):
+  assert type_ir.HasField("atomic_type"), (
+      "_render_atomic_type_name() requires an atomic type")
+  if not suffix:
+    suffix = ""
+  type_definition = ir_util.find_object(type_ir.atomic_type.reference, ir)
+  if type_definition.name.is_anonymous:
+    return "anonymous type"
+  else:
+    return "type '{}{}'".format(type_definition.name.name.text, suffix)
+
+
+def _check_that_inner_array_dimensions_are_constant(
+    type_ir, source_file_name, errors):
+  """Checks that inner array dimensions are constant."""
+  if type_ir.WhichOneof("size") == "automatic":
+    errors.append([error.error(
+        source_file_name, type_ir.element_count.source_location,
+        "Array dimensions can only be omitted for the outermost dimension.")])
+  elif type_ir.WhichOneof("size") == "element_count":
+    if not ir_util.is_constant(type_ir.element_count):
+      errors.append([error.error(source_file_name,
+                                 type_ir.element_count.source_location,
+                                 "Inner array dimensions must be constant.")])
+  else:
+    assert False, 'Expected "element_count" or "automatic" array size.'
+
+
+def _check_that_array_base_types_are_fixed_size(type_ir, source_file_name,
+ errors, ir):
+ """Checks that the sizes of array elements are known at compile time."""
+ if type_ir.base_type.HasField("array_type"):
+ # An array is fixed size if its base_type is fixed size and its array
+ # dimension is constant. This function will be called again on the inner
+ # array, and we do not want to cascade errors if the inner array's base_type
+ # is not fixed size. The array dimensions are separately checked by
+ # _check_that_inner_array_dimensions_are_constant, which will provide an
+ # appropriate error message for that case.
+ return
+ assert type_ir.base_type.HasField("atomic_type")
+ if type_ir.base_type.HasField("size_in_bits"):
+ # If the base_type has a size_in_bits, then it is fixed size.
+ return
+ base_type = ir_util.find_object(type_ir.base_type.atomic_type.reference, ir)
+ base_type_fixed_size = ir_util.get_integer_attribute(
+ base_type.attribute, attributes.FIXED_SIZE)
+ if base_type_fixed_size is None:
+ errors.append([error.error(source_file_name,
+ type_ir.base_type.atomic_type.source_location,
+ "Array elements must be fixed size.")])
+
+
+def _check_that_array_base_types_in_structs_are_multiples_of_bytes(
+ type_ir, type_definition, source_file_name, errors, ir):
+ """Checks that the sizes of array elements are multiples of 8 bits."""
+ # TODO(bolms): Remove this limitation.
+ if type_ir.base_type.HasField("array_type"):
+ # Only check the innermost array for multidimensional arrays.
+ return
+ assert type_ir.base_type.HasField("atomic_type")
+ if type_ir.base_type.HasField("size_in_bits"):
+ assert ir_util.is_constant(type_ir.base_type.size_in_bits)
+ base_type_size = ir_util.constant_value(type_ir.base_type.size_in_bits)
+ else:
+ fixed_size = ir_util.fixed_size_of_type_in_bits(type_ir.base_type, ir)
+ if fixed_size is None:
+ # Variable-sized elements are checked elsewhere.
+ return
+ base_type_size = fixed_size
+ if base_type_size % type_definition.addressable_unit != 0:
+ assert type_definition.addressable_unit == ir_pb2.TypeDefinition.BYTE
+ errors.append([error.error(source_file_name,
+ type_ir.base_type.source_location,
+ "Array elements in structs must have sizes "
+ "which are a multiple of 8 bits.")])
+
+
+def _check_constancy_of_constant_references(expression, source_file_name,
+ errors, ir):
+ """Checks that constant_references are constant."""
+ if expression.WhichOneof("expression") != "constant_reference":
+ return
+ # This is a bit of a hack: really, we want to know that the referred-to object
+ # has no dependencies on any instance variables of its parent structure; i.e.,
+ # that its value does not depend on having a view of the structure.
+ if not ir_util.is_constant_type(expression.type):
+ referred_name = expression.constant_reference.canonical_name
+ referred_object = ir_util.find_object(referred_name, ir)
+ errors.append([
+ error.error(
+ source_file_name, expression.source_location,
+ "Static references must refer to constants."),
+ error.note(
+ referred_name.module_file, referred_object.source_location,
+ "{} is not constant.".format(referred_name.object_path[-1]))
+ ])
+
+
+def _check_that_enum_values_are_representable(enum_type, source_file_name,
+ errors):
+ """Checks that enumeration values can fit in an int64 or uint64."""
+ values = []
+ for value in enum_type.value:
+ values.append((ir_util.constant_value(value.value), value))
+ # Guess if the user intended a signed or unsigned enumeration based on how
+ # many values would be out of range given either type.
+ signed_out_of_range = [v for v in values if not -2**63 <= v[0] <= 2**63-1]
+ unsigned_out_of_range = [v for v in values if not 0 <= v[0] <= 2**64-1]
+ if len(signed_out_of_range) < len(unsigned_out_of_range):
+ out_of_range = signed_out_of_range
+ range_name = "signed "
+ else:
+ out_of_range = unsigned_out_of_range
+ range_name = "unsigned "
+ # If all values are in range for either a signed or an unsigned enumeration,
+ # this loop will have zero iterations.
+ for value in out_of_range:
+ errors.append([
+ error.error(
+ source_file_name, value[1].value.source_location,
+ "Value {} is out of range for {}enumeration.".format(
+ value[0], range_name if -2**63 <= value[0] <= 2**64-1 else ""))
+ ])
+
+
+def _field_size(field, type_definition):
+ """Calculates the size of the given field in bits, if it is constant."""
+ size = ir_util.constant_value(field.location.size)
+ if size is None:
+ return None
+ return size * type_definition.addressable_unit
+
+
+def _check_type_requirements_for_field(type_ir, type_definition, field, ir,
+ source_file_name, errors):
+ """Checks a field's type against its size and `static_requirements`."""
+ if not type_ir.HasField("atomic_type"):
+ return
+
+ if field.type.HasField("atomic_type"):
+ field_min_size = (int(field.location.size.type.integer.minimum_value) *
+ type_definition.addressable_unit)
+ field_max_size = (int(field.location.size.type.integer.maximum_value) *
+ type_definition.addressable_unit)
+ field_is_atomic = True
+ else:
+ field_is_atomic = False
+
+ if type_ir.HasField("size_in_bits"):
+ element_size = ir_util.constant_value(type_ir.size_in_bits)
+ else:
+ element_size = None
+
+ referenced_type_definition = ir_util.find_object(
+ type_ir.atomic_type.reference, ir)
+ type_is_anonymous = referenced_type_definition.name.is_anonymous
+ type_size_attr = ir_util.get_attribute(
+ referenced_type_definition.attribute, attributes.FIXED_SIZE)
+ if type_size_attr:
+ type_size = ir_util.constant_value(type_size_attr.expression)
+ else:
+ type_size = None
+
+ if (element_size is not None and type_size is not None and
+ element_size != type_size):
+ errors.append([
+ error.error(
+ source_file_name, type_ir.size_in_bits.source_location,
+ "Explicit size of {} bits does not match fixed size ({} bits) of "
+ "{}.".format(element_size, type_size,
+ _render_atomic_type_name(type_ir, ir))),
+ error.note(
+ type_ir.atomic_type.reference.canonical_name.module_file,
+ type_size_attr.source_location,
+ "Size specified here.")
+ ])
+ return
+
+ # If the type had no size specifier (the ':32' in 'UInt:32'), but the type is
+ # fixed size, then continue as if the type's size were explicitly stated.
+ if element_size is None:
+ element_size = type_size
+
+ # TODO(bolms): When the full dynamic size expression for types is generated,
+ # add a check that dynamically-sized types can, at least potentially, fit in
+ # their fields.
+
+ if field_is_atomic and element_size is not None:
+ # If the field has a fixed size, and the (atomic) type contained therein is
+ # also fixed size, then the sizes should match.
+ #
+ # TODO(bolms): Maybe change the case where the field is bigger than
+ # necessary into a warning?
+ if (field_max_size == field_min_size and
+ (element_size > field_max_size or
+ (element_size < field_min_size and not type_is_anonymous))):
+ errors.append([
+ error.error(
+ source_file_name, type_ir.source_location,
+ "Fixed-size {} cannot be placed in field of size {} bits; "
+ "requires {} bits.".format(
+ _render_type(type_ir, ir), field_max_size, element_size))
+ ])
+ return
+ elif element_size > field_max_size:
+ errors.append([
+ error.error(
+ source_file_name, type_ir.source_location,
+ "Field of maximum size {} bits cannot hold fixed-size {}, which "
+ "requires {} bits.".format(
+ field_max_size, _render_type(type_ir, ir), element_size))
+ ])
+ return
+
+ # If we're here, then field/type sizes are consistent.
+ if (element_size is None and field_is_atomic and
+ field_min_size == field_max_size):
+ # From here down, we just use element_size.
+ element_size = field_min_size
+
+ errors.extend(_check_physical_type_requirements(
+ type_ir, field.source_location, element_size, ir, source_file_name))
+
+
+def _check_type_requirements_for_parameter_type(
+ runtime_parameter, ir, source_file_name, errors):
+ """Checks that the type of a parameter is valid."""
+ physical_type = runtime_parameter.physical_type_alias
+ logical_type = runtime_parameter.type
+ size = ir_util.constant_value(physical_type.size_in_bits)
+ if logical_type.WhichOneof("type") == "integer":
+ integer_errors = _integer_bounds_errors(
+ logical_type.integer, "parameter", source_file_name,
+ physical_type.source_location)
+ if integer_errors:
+ errors.extend(integer_errors)
+ return
+ errors.extend(_check_physical_type_requirements(
+ physical_type, runtime_parameter.source_location,
+ size, ir, source_file_name))
+ elif logical_type.WhichOneof("type") == "enumeration":
+ if physical_type.HasField("size_in_bits"):
+ # This seems a little weird: for `UInt`, `Int`, etc., the explicit size is
+ # required, but for enums it is banned. This is because enums have a
+ # "native" 64-bit size in expressions, so the physical size is just
+ # ignored.
+ errors.extend([[
+ error.error(
+ source_file_name, physical_type.size_in_bits.source_location,
+ "Parameters with enum type may not have explicit size.")
+
+ ]])
+ else:
+ assert False, "Non-integer/enum parameters should have been caught earlier."
+
+
+def _check_physical_type_requirements(
+ type_ir, usage_source_location, size, ir, source_file_name):
+ """Checks that the given atomic `type_ir` is allowed to be `size` bits."""
+ referenced_type_definition = ir_util.find_object(
+ type_ir.atomic_type.reference, ir)
+ # TODO(bolms): replace this with a check against an automatically-generated
+ # `static_requirements` attribute on enum types. (The main problem is that
+ # the generated attribute would have no source text, so there would be a crash
+ # when trying to display the error.)
+ if referenced_type_definition.HasField("enumeration"):
+ if size is None:
+ return [[
+ error.error(
+ source_file_name, type_ir.source_location,
+ "Enumeration {} cannot be placed in a dynamically-sized "
+ "field.".format(_render_type(type_ir, ir)))
+ ]]
+ elif size < 1 or size > 64:
+ return [[
+ error.error(
+ source_file_name, type_ir.source_location,
+ "Enumeration {} cannot be {} bits; enumerations must be between "
+ "1 and 64 bits, inclusive.".format(
+ _render_atomic_type_name(type_ir, ir), size))
+ ]]
+
+ if size is None:
+ bindings = {"$is_statically_sized": False}
+ else:
+ bindings = {
+ "$is_statically_sized": True,
+ "$static_size_in_bits": size
+ }
+ requires_attr = ir_util.get_attribute(
+ referenced_type_definition.attribute, attributes.STATIC_REQUIREMENTS)
+ if requires_attr and not ir_util.constant_value(requires_attr.expression,
+ bindings):
+ # TODO(bolms): Figure out a better way to build this error message.
+ # The "Requirements specified here." message should print out the actual
+ # source text of the requires attribute, so that should help, but it's still
+ # a bit generic and unfriendly.
+ return [[
+ error.error(
+ source_file_name, usage_source_location,
+ "Requirements of {} not met.".format(
+ type_ir.atomic_type.reference.canonical_name.object_path[-1])),
+ error.note(
+ type_ir.atomic_type.reference.canonical_name.module_file,
+ requires_attr.source_location,
+ "Requirements specified here.")
+ ]]
+ return []
+
+
+def _check_allowed_in_bits(type_ir, type_definition, source_file_name, ir,
+ errors):
+ if not type_ir.HasField("atomic_type"):
+ return
+ referenced_type_definition = ir_util.find_object(
+ type_ir.atomic_type.reference, ir)
+ if (type_definition.addressable_unit %
+ referenced_type_definition.addressable_unit != 0):
+ assert type_definition.addressable_unit == ir_pb2.TypeDefinition.BIT
+ assert (referenced_type_definition.addressable_unit ==
+ ir_pb2.TypeDefinition.BYTE)
+ errors.append([
+ error.error(source_file_name, type_ir.source_location,
+ "Byte-oriented {} cannot be used in a bits field.".format(
+ _render_type(type_ir, ir)))
+ ])
+
+
+def _check_size_of_bits(type_ir, type_definition, source_file_name, errors):
+ """Checks that `bits` types are fixed size and no larger than 64 bits."""
+ del type_ir # Unused
+ if type_definition.addressable_unit != ir_pb2.TypeDefinition.BIT:
+ return
+ fixed_size = ir_util.get_integer_attribute(
+ type_definition.attribute, attributes.FIXED_SIZE)
+ if fixed_size is None:
+ errors.append([error.error(source_file_name,
+ type_definition.source_location,
+ "`bits` types must be fixed size.")])
+ return
+ if fixed_size > 64:
+ errors.append([error.error(source_file_name,
+ type_definition.source_location,
+ "`bits` types must be 64 bits or smaller.")])
+
+
+_RESERVED_WORDS = None
+
+
+def get_reserved_word_list():
+ if _RESERVED_WORDS is None:
+ _initialize_reserved_word_list()
+ return _RESERVED_WORDS
+
+
+def _initialize_reserved_word_list():
+ global _RESERVED_WORDS
+ _RESERVED_WORDS = {}
+ language = None
+ for line in pkgutil.get_data(
+ "front_end",
+ "reserved_words").decode(encoding="UTF-8").splitlines():
+ stripped_line = line.partition("#")[0].strip()
+ if not stripped_line:
+ continue
+ if stripped_line.startswith("--"):
+ language = stripped_line.partition("--")[2].strip()
+ else:
+ # For brevity's sake, only use the first language for error messages.
+ if stripped_line not in _RESERVED_WORDS:
+ _RESERVED_WORDS[stripped_line] = language
+
+
+def _check_name_for_reserved_words(obj, source_file_name, errors, context_name):
+ if obj.name.name.text in get_reserved_word_list():
+ errors.append([
+ error.error(
+ source_file_name, obj.name.name.source_location,
+ "{} reserved word may not be used as {}.".format(
+ get_reserved_word_list()[obj.name.name.text],
+ context_name))
+ ])
+
+
+def _check_field_name_for_reserved_words(field, source_file_name, errors):
+ return _check_name_for_reserved_words(field, source_file_name, errors,
+ "a field name")
+
+
+def _check_enum_name_for_reserved_words(enum, source_file_name, errors):
+ return _check_name_for_reserved_words(enum, source_file_name, errors,
+ "an enum name")
+
+
+def _check_type_name_for_reserved_words(type_definition, source_file_name,
+ errors):
+ return _check_name_for_reserved_words(
+ type_definition, source_file_name, errors, "a type name")
+
+
+def _bounds_can_fit_64_bit_unsigned(minimum, maximum):
+ return minimum >= 0 and maximum <= 2**64 - 1
+
+
+def _bounds_can_fit_64_bit_signed(minimum, maximum):
+ return minimum >= -(2**63) and maximum <= 2**63 - 1
+
+
+def _bounds_can_fit_any_64_bit_integer_type(minimum, maximum):
+ return (_bounds_can_fit_64_bit_unsigned(minimum, maximum) or
+ _bounds_can_fit_64_bit_signed(minimum, maximum))
+
+
+def _integer_bounds_errors_for_expression(expression, source_file_name):
+ """Checks that `expression` is in range for int64_t or uint64_t."""
+ # Only check non-constant subexpressions.
+ if (expression.WhichOneof("expression") == "function" and
+ not ir_util.is_constant_type(expression.type)):
+ errors = []
+ for arg in expression.function.args:
+ errors += _integer_bounds_errors_for_expression(arg, source_file_name)
+ if errors:
+ # Don't cascade bounds errors: report them at the lowest level they
+ # appear.
+ return errors
+ if expression.type.WhichOneof("type") == "integer":
+ errors = _integer_bounds_errors(expression.type.integer, "expression",
+ source_file_name,
+ expression.source_location)
+ if errors:
+ return errors
+ if (expression.WhichOneof("expression") == "function" and
+ not ir_util.is_constant_type(expression.type)):
+ int64_only_clauses = []
+ uint64_only_clauses = []
+ for clause in [expression] + list(expression.function.args):
+ if clause.type.WhichOneof("type") == "integer":
+ arg_minimum = int(clause.type.integer.minimum_value)
+ arg_maximum = int(clause.type.integer.maximum_value)
+ if not _bounds_can_fit_64_bit_signed(arg_minimum, arg_maximum):
+ uint64_only_clauses.append(clause)
+ elif not _bounds_can_fit_64_bit_unsigned(arg_minimum, arg_maximum):
+ int64_only_clauses.append(clause)
+ if int64_only_clauses and uint64_only_clauses:
+ error_set = [
+ error.error(
+ source_file_name, expression.source_location,
+ "Either all arguments to '{}' and its result must fit in a "
+ "64-bit unsigned integer, or all must fit in a 64-bit signed "
+ "integer.".format(expression.function.function_name.text))
+ ]
+ for signedness, clause_list in (("unsigned", uint64_only_clauses),
+ ("signed", int64_only_clauses)):
+ for clause in clause_list:
+ error_set.append(error.note(
+ source_file_name, clause.source_location,
+ "Requires {} 64-bit integer.".format(signedness)))
+ return [error_set]
+ return []
+
+
+def _integer_bounds_errors(bounds, name, source_file_name,
+ error_source_location):
+ """Returns appropriate errors, if any, for the given integer bounds."""
+ assert bounds.minimum_value, "{}".format(bounds)
+ assert bounds.maximum_value, "{}".format(bounds)
+ if (bounds.minimum_value == "-infinity" or
+ bounds.maximum_value == "infinity"):
+ return [[
+ error.error(
+ source_file_name, error_source_location,
+ "Integer range of {} must not be unbounded; it must fit "
+ "in a 64-bit signed or unsigned integer.".format(name))
+ ]]
+ if not _bounds_can_fit_any_64_bit_integer_type(int(bounds.minimum_value),
+ int(bounds.maximum_value)):
+ if int(bounds.minimum_value) == int(bounds.maximum_value):
+ return [[
+ error.error(
+ source_file_name, error_source_location,
+ "Constant value {} of {} cannot fit in a 64-bit signed or "
+ "unsigned integer.".format(bounds.minimum_value, name))
+ ]]
+ else:
+ return [[
+ error.error(
+ source_file_name, error_source_location,
+ "Potential range of {} is {} to {}, which cannot fit "
+ "in a 64-bit signed or unsigned integer.".format(
+ name, bounds.minimum_value, bounds.maximum_value))
+ ]]
+ return []
+
+
+def _check_bounds_on_runtime_integer_expressions(expression, source_file_name,
+ in_attribute, errors):
+ if in_attribute and in_attribute.name.text == attributes.STATIC_REQUIREMENTS:
+ # [static_requirements] is never evaluated at runtime, and
+ # $static_size_in_bits is unbounded, so it should not be checked.
+ return
+ # The logic for gathering errors and suppressing cascades is simpler if
+ # errors are just returned, rather than appended to a shared list.
+ errors += _integer_bounds_errors_for_expression(expression, source_file_name)
+
+
+def check_constraints(ir):
+ """Checks miscellaneous validity constraints in ir.
+
+ Checks that auto array sizes are only used for the outermost size of
+ multidimensional arrays. That is, Type[3][] is OK, but Type[][3] is not.
+
+ Checks that fixed-size fields are a correct size to hold statically-sized
+ types.
+
+ Checks that inner array dimensions are constant.
+
+ Checks that only constant-size types are used in arrays.
+
+ Arguments:
+ ir: An ir_pb2.EmbossIr object to check.
+
+ Returns:
+ A list of error lists, or an empty list if there are none.
+ """
+ errors = []
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Structure, ir_pb2.Type], _check_allowed_in_bits,
+ parameters={"errors": errors})
+ traverse_ir.fast_traverse_ir_top_down(
+ # TODO(bolms): look for [ir_pb2.ArrayType], [ir_pb2.AtomicType], and
+ # simplify _check_that_array_base_types_are_fixed_size.
+ ir, [ir_pb2.ArrayType], _check_that_array_base_types_are_fixed_size,
+ parameters={"errors": errors})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Structure, ir_pb2.ArrayType],
+ _check_that_array_base_types_in_structs_are_multiples_of_bytes,
+ parameters={"errors": errors})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.ArrayType, ir_pb2.ArrayType],
+ _check_that_inner_array_dimensions_are_constant,
+ parameters={"errors": errors})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Structure], _check_size_of_bits,
+ parameters={"errors": errors})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Structure, ir_pb2.Type], _check_type_requirements_for_field,
+ parameters={"errors": errors})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Field], _check_field_name_for_reserved_words,
+ parameters={"errors": errors})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.EnumValue], _check_enum_name_for_reserved_words,
+ parameters={"errors": errors})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.TypeDefinition], _check_type_name_for_reserved_words,
+ parameters={"errors": errors})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Expression], _check_constancy_of_constant_references,
+ parameters={"errors": errors})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Enum], _check_that_enum_values_are_representable,
+ parameters={"errors": errors})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Expression], _check_bounds_on_runtime_integer_expressions,
+ incidental_actions={ir_pb2.Attribute: lambda a: {"in_attribute": a}},
+ skip_descendants_of={ir_pb2.EnumValue, ir_pb2.Expression},
+ parameters={"errors": errors, "in_attribute": None})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.RuntimeParameter],
+ _check_type_requirements_for_parameter_type,
+ parameters={"errors": errors})
+ return errors
diff --git a/front_end/constraints_test.py b/front_end/constraints_test.py
new file mode 100644
index 0000000..1f12dc8
--- /dev/null
+++ b/front_end/constraints_test.py
@@ -0,0 +1,751 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for constraints.py."""
+
+import unittest
+from front_end import attributes
+from front_end import constraints
+from front_end import glue
+from front_end import test_util
+from util import error
+from util import ir_util
+
+
+def _make_ir_from_emb(emb_text, name="m.emb"):
+ ir, unused_debug_info, errors = glue.parse_emboss_file(
+ name,
+ test_util.dict_file_reader({name: emb_text}),
+ stop_before_step="check_constraints")
+ assert not errors, repr(errors)
+ return ir
+
+
+class ConstraintsTest(unittest.TestCase):
+ """Tests constraints.check_constraints and helpers."""
+
+ def test_error_on_missing_inner_array_size(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+1] UInt:8[][1] one_byte\n")
+ error_array = ir.module[0].type[0].structure.field[0].type.array_type
+ self.assertEqual([[
+ error.error(
+ "m.emb",
+ error_array.base_type.array_type.element_count.source_location,
+ "Array dimensions can only be omitted for the outermost dimension.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_no_error_on_ok_array_size(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+1] UInt:8[1][1] one_byte\n")
+ self.assertEqual([], constraints.check_constraints(ir))
+
+ def test_no_error_on_ok_missing_outer_array_size(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+1] UInt:8[1][] one_byte\n")
+ self.assertEqual([], constraints.check_constraints(ir))
+
+ def test_no_error_on_dynamically_sized_struct_in_dynamically_sized_field(
+ self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+1] UInt size\n"
+ " 1 [+size] Bar bar\n"
+ "struct Bar:\n"
+ " 0 [+1] UInt size\n"
+ " 1 [+size] UInt:8[] payload\n")
+ self.assertEqual([], constraints.check_constraints(ir))
+
+ def test_no_error_on_dynamically_sized_struct_in_statically_sized_field(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+10] Bar bar\n"
+ "struct Bar:\n"
+ " 0 [+1] UInt size\n"
+ " 1 [+size] UInt:8[] payload\n")
+ self.assertEqual([], constraints.check_constraints(ir))
+
+ def test_no_error_non_fixed_size_outer_array_dimension(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+1] UInt size\n"
+ " 1 [+size] UInt:8[1][size-1] one_byte\n")
+ self.assertEqual([], constraints.check_constraints(ir))
+
+ def test_error_non_fixed_size_inner_array_dimension(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+1] UInt size\n"
+ " 1 [+size] UInt:8[size-1][1] one_byte\n")
+ error_array = ir.module[0].type[0].structure.field[1].type.array_type
+ self.assertEqual([[
+ error.error(
+ "m.emb",
+ error_array.base_type.array_type.element_count.source_location,
+ "Inner array dimensions must be constant.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_error_non_constant_inner_array_dimensions(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+1] Bar[1] one_byte\n"
+ # There is no dynamically-sized byte-oriented type in
+ # the Prelude, so this test has to make its own.
+ "external Bar:\n"
+ " [is_integer: true]\n"
+ " [addressable_unit_size: 8]\n")
+ error_array = ir.module[0].type[0].structure.field[0].type.array_type
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_array.base_type.atomic_type.source_location,
+ "Array elements must be fixed size.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_error_dynamically_sized_array_elements(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+1] Bar[1] bar\n"
+ "struct Bar:\n"
+ " 0 [+1] UInt size\n"
+ " 1 [+size] UInt:8[] payload\n")
+ error_array = ir.module[0].type[0].structure.field[0].type.array_type
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_array.base_type.atomic_type.source_location,
+ "Array elements must be fixed size.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_field_too_small_for_type(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+1] Bar bar\n"
+ "struct Bar:\n"
+ " 0 [+2] UInt value\n")
+ error_type = ir.module[0].type[0].structure.field[0].type
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_type.source_location,
+ "Fixed-size type 'Bar' cannot be placed in field of size 8 bits; "
+ "requires 16 bits.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_dynamically_sized_field_always_too_small_for_type(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+1] bits:\n"
+ " 0 [+1] UInt x\n"
+ " 0 [+x] Bar bar\n"
+ "struct Bar:\n"
+ " 0 [+2] UInt value\n")
+ error_type = ir.module[0].type[0].structure.field[2].type
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_type.source_location,
+ "Field of maximum size 8 bits cannot hold fixed-size type 'Bar', "
+ "which requires 16 bits.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_struct_field_too_big_for_type(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+2] Byte double_byte\n"
+ "struct Byte:\n"
+ " 0 [+1] UInt b\n")
+ error_type = ir.module[0].type[0].structure.field[0].type
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_type.source_location,
+ "Fixed-size type 'Byte' cannot be placed in field of size 16 bits; "
+ "requires 8 bits.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_bits_field_too_big_for_type(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+9] UInt uint72\n"
+ ' [byte_order: "LittleEndian"]\n')
+ error_field = ir.module[0].type[0].structure.field[0]
+ uint_type = ir_util.find_object(error_field.type.atomic_type.reference, ir)
+ uint_requirements = ir_util.get_attribute(uint_type.attribute,
+ attributes.STATIC_REQUIREMENTS)
+ self.assertEqual([[
+ error.error("m.emb", error_field.source_location,
+ "Requirements of UInt not met."),
+ error.note("", uint_requirements.source_location,
+ "Requirements specified here."),
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_field_type_not_allowed_in_bits(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "bits Foo:\n"
+ " 0 [+16] Bar bar\n"
+ "external Bar:\n"
+ " [addressable_unit_size: 8]\n")
+ error_type = ir.module[0].type[0].structure.field[0].type
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_type.source_location,
+ "Byte-oriented type 'Bar' cannot be used in a bits field.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_arrays_allowed_in_bits(self):
+ ir = _make_ir_from_emb("bits Foo:\n"
+ " 0 [+16] Flag[16] bar\n")
+ self.assertEqual([], constraints.check_constraints(ir))
+
+ def test_oversized_anonymous_bit_field(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+4] bits:\n"
+ " 0 [+8] UInt field\n")
+ self.assertEqual([], constraints.check_constraints(ir))
+
+ def test_undersized_anonymous_bit_field(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+1] bits:\n"
+ " 0 [+32] UInt field\n")
+ error_type = ir.module[0].type[0].structure.field[0].type
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_type.source_location,
+ "Fixed-size anonymous type cannot be placed in field of size 8 "
+ "bits; requires 32 bits.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_reserved_field_name(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+8] UInt restrict\n")
+ error_name = ir.module[0].type[0].structure.field[0].name.name
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_name.source_location,
+ "C reserved word may not be used as a field name.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_reserved_type_name(self):
+ ir = _make_ir_from_emb("struct False:\n"
+ " 0 [+1] UInt foo\n")
+ error_name = ir.module[0].type[0].name.name
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_name.source_location,
+ "Python 3 reserved word may not be used as a type name.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_reserved_enum_name(self):
+ ir = _make_ir_from_emb("enum Foo:\n"
+ " NULL = 1\n")
+ error_name = ir.module[0].type[0].enumeration.value[0].name.name
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_name.source_location,
+ "C reserved word may not be used as an enum name.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_bits_type_in_struct_array(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+10] UInt:8[10] array\n")
+ self.assertEqual([], constraints.check_constraints(ir))
+
+ def test_bits_type_in_bits_array(self):
+ ir = _make_ir_from_emb("bits Foo:\n"
+ " 0 [+10] UInt:8[10] array\n")
+ self.assertEqual([], constraints.check_constraints(ir))
+
+ def test_explicit_size_too_small(self):
+ ir = _make_ir_from_emb("bits Foo:\n"
+ " 0 [+0] UInt:0 zero_bit\n")
+ error_field = ir.module[0].type[0].structure.field[0]
+ uint_type = ir_util.find_object(error_field.type.atomic_type.reference, ir)
+ uint_requirements = ir_util.get_attribute(uint_type.attribute,
+ attributes.STATIC_REQUIREMENTS)
+ self.assertEqual([[
+ error.error("m.emb", error_field.source_location,
+ "Requirements of UInt not met."),
+ error.note("", uint_requirements.source_location,
+ "Requirements specified here."),
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_explicit_enumeration_size_too_small(self):
+ ir = _make_ir_from_emb('[$default byte_order: "BigEndian"]\n'
+ "bits Foo:\n"
+ " 0 [+0] Bar:0 zero_bit\n"
+ "enum Bar:\n"
+ " BAZ = 0\n")
+ error_type = ir.module[0].type[0].structure.field[0].type
+ self.assertEqual([[
+ error.error("m.emb", error_type.source_location,
+ "Enumeration type 'Bar' cannot be 0 bits; enumerations "
+ "must be between 1 and 64 bits, inclusive."),
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_explicit_size_too_big_for_field(self):
+ ir = _make_ir_from_emb("bits Foo:\n"
+ " 0 [+8] UInt:32 thirty_two_bit\n")
+ error_type = ir.module[0].type[0].structure.field[0].type
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_type.source_location,
+ "Fixed-size type 'UInt:32' cannot be placed in field of size 8 "
+ "bits; requires 32 bits.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_explicit_size_too_small_for_field(self):
+ ir = _make_ir_from_emb("bits Foo:\n"
+ " 0 [+64] UInt:32 thirty_two_bit\n")
+ error_type = ir.module[0].type[0].structure.field[0].type
+ self.assertEqual([[
+ error.error("m.emb", error_type.source_location,
+ "Fixed-size type 'UInt:32' cannot be placed in field of "
+ "size 64 bits; requires 32 bits.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_explicit_size_too_big(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+16] UInt:128 one_twenty_eight_bit\n"
+ ' [byte_order: "LittleEndian"]\n')
+ error_field = ir.module[0].type[0].structure.field[0]
+ uint_type = ir_util.find_object(error_field.type.atomic_type.reference, ir)
+ uint_requirements = ir_util.get_attribute(uint_type.attribute,
+ attributes.STATIC_REQUIREMENTS)
+ self.assertEqual([[
+ error.error("m.emb", error_field.source_location,
+ "Requirements of UInt not met."),
+ error.note("", uint_requirements.source_location,
+ "Requirements specified here."),
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_explicit_enumeration_size_too_big(self):
+ ir = _make_ir_from_emb('[$default byte_order: "BigEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+9] Bar seventy_two_bit\n"
+ "enum Bar:\n"
+ " BAZ = 0\n")
+ error_type = ir.module[0].type[0].structure.field[0].type
+ self.assertEqual([[
+ error.error("m.emb", error_type.source_location,
+ "Enumeration type 'Bar' cannot be 72 bits; enumerations "
+ "must be between 1 and 64 bits, inclusive."),
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_explicit_size_on_fixed_size_type(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+1] Byte:8 one_byte\n"
+ "struct Byte:\n"
+ " 0 [+1] UInt b\n")
+ self.assertEqual([], constraints.check_constraints(ir))
+
+ def test_explicit_size_too_small_on_fixed_size_type(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+0] Byte:0 null_byte\n"
+ "struct Byte:\n"
+ " 0 [+1] UInt b\n")
+ error_type = ir.module[0].type[0].structure.field[0].type
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_type.size_in_bits.source_location,
+ "Explicit size of 0 bits does not match fixed size (8 bits) of "
+ "type 'Byte'."),
+ error.note("m.emb", ir.module[0].type[1].source_location,
+ "Size specified here."),
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_explicit_size_too_big_on_fixed_size_type(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+2] Byte:16 double_byte\n"
+ "struct Byte:\n"
+ " 0 [+1] UInt b\n")
+ error_type = ir.module[0].type[0].structure.field[0].type
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_type.size_in_bits.source_location,
+ "Explicit size of 16 bits does not match fixed size (8 bits) of "
+ "type 'Byte'."),
+ error.note(
+ "m.emb", ir.module[0].type[1].source_location,
+ "Size specified here."),
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_explicit_size_ignored_on_variable_size_type(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+1] UInt n\n"
+ " 1 [+n] UInt:8[] d\n"
+ "struct Bar:\n"
+ " 0 [+10] Foo:80 foo\n")
+ self.assertEqual([], constraints.check_constraints(ir))
+
+ def test_fixed_size_type_in_dynamically_sized_field(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+1] UInt bar\n"
+ " 0 [+bar] Byte one_byte\n"
+ "struct Byte:\n"
+ " 0 [+1] UInt b\n")
+ self.assertEqual([], constraints.check_constraints(ir))
+
+ def test_enum_in_dynamically_sized_field(self):
+ ir = _make_ir_from_emb('[$default byte_order: "BigEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+1] UInt bar\n"
+ " 0 [+bar] Baz baz\n"
+ "enum Baz:\n"
+ " QUX = 0\n")
+ error_type = ir.module[0].type[0].structure.field[1].type
+ self.assertEqual(
+ [[
+ error.error("m.emb", error_type.source_location,
+ "Enumeration type 'Baz' cannot be placed in a "
+ "dynamically-sized field.")
+ ]],
+ error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_enum_value_too_high(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "enum Foo:\n"
+ " HIGH = 0x1_0000_0000_0000_0000\n")
+ error_value = ir.module[0].type[0].enumeration.value[0].value
+ self.assertEqual([
+ [error.error(
+ "m.emb", error_value.source_location,
+ # TODO(bolms): Try to print numbers like 2**64 in hex? (I.e., if a
+ # number is a round number in hex, but not in decimal, print in
+ # hex?)
+ "Value 18446744073709551616 is out of range for enumeration.")]
+ ], constraints.check_constraints(ir))
+
+ def test_enum_value_too_low(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "enum Foo:\n"
+ " LOW = -0x8000_0000_0000_0001\n")
+ error_value = ir.module[0].type[0].enumeration.value[0].value
+ self.assertEqual([
+ [error.error(
+ "m.emb", error_value.source_location,
+ "Value -9223372036854775809 is out of range for enumeration.")]
+ ], constraints.check_constraints(ir))
+
+ def test_enum_value_too_wide(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "enum Foo:\n"
+ " LOW = -1\n"
+ " HIGH = 0x8000_0000_0000_0000\n")
+ error_value = ir.module[0].type[0].enumeration.value[0].value
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_value.source_location,
+ "Value -1 is out of range for unsigned enumeration.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_enum_value_too_wide_signed_error_message(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "enum Foo:\n"
+ " LOW = -2\n"
+ " LOW2 = -1\n"
+ " HIGH = 0x8000_0000_0000_0000\n")
+ error_value = ir.module[0].type[0].enumeration.value[2].value
+ self.assertEqual([[
+ error.error(
+ "m.emb", error_value.source_location,
+ "Value 9223372036854775808 is out of range for signed enumeration.")
+ ]], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_enum_value_too_wide_multiple(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "enum Foo:\n"
+ " LOW = -2\n"
+ " LOW2 = -1\n"
+ " HIGH = 0x8000_0000_0000_0000\n"
+ " HIGH2 = 0x8000_0000_0000_0001\n")
+ error_value = ir.module[0].type[0].enumeration.value[0].value
+ error_value2 = ir.module[0].type[0].enumeration.value[1].value
+ self.assertEqual([
+ [error.error(
+ "m.emb", error_value.source_location,
+ "Value -2 is out of range for unsigned enumeration.")],
+ [error.error(
+ "m.emb", error_value2.source_location,
+ "Value -1 is out of range for unsigned enumeration.")]
+ ], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_enum_value_too_wide_multiple_signed_error_message(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "enum Foo:\n"
+ " LOW = -3\n"
+ " LOW2 = -2\n"
+ " LOW3 = -1\n"
+ " HIGH = 0x8000_0000_0000_0000\n"
+ " HIGH2 = 0x8000_0000_0000_0001\n")
+ error_value = ir.module[0].type[0].enumeration.value[3].value
+ error_value2 = ir.module[0].type[0].enumeration.value[4].value
+ self.assertEqual([
+ [error.error(
+ "m.emb", error_value.source_location,
+ "Value 9223372036854775808 is out of range for signed "
+ "enumeration.")],
+ [error.error(
+ "m.emb", error_value2.source_location,
+ "Value 9223372036854775809 is out of range for signed "
+ "enumeration.")]
+ ], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_enum_value_mixed_error_message(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "enum Foo:\n"
+ " LOW = -1\n"
+ " HIGH = 0x8000_0000_0000_0000\n"
+ " HIGH2 = 0x1_0000_0000_0000_0000\n")
+ error_value = ir.module[0].type[0].enumeration.value[0].value
+ error_value2 = ir.module[0].type[0].enumeration.value[2].value
+ self.assertEqual([
+ [error.error(
+ "m.emb", error_value.source_location,
+ "Value -1 is out of range for unsigned enumeration.")],
+ [error.error(
+ "m.emb", error_value2.source_location,
+ "Value 18446744073709551616 is out of range for enumeration.")]
+ ], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_explicit_non_byte_size_array_element(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+2] UInt:4[4] nibbles\n")
+ error_type = ir.module[0].type[0].structure.field[0].type.array_type
+ self.assertEqual([
+ [error.error(
+ "m.emb", error_type.base_type.source_location,
+ "Array elements in structs must have sizes which are a multiple of "
+ "8 bits.")]
+ ], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_implicit_non_byte_size_array_element(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "bits Nibble:\n"
+ " 0 [+4] UInt nibble\n"
+ "struct Foo:\n"
+ " 0 [+2] Nibble[4] nibbles\n")
+ error_type = ir.module[0].type[1].structure.field[0].type.array_type
+ self.assertEqual([
+ [error.error(
+ "m.emb", error_type.base_type.source_location,
+ "Array elements in structs must have sizes which are a multiple of "
+ "8 bits.")]
+ ], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_bits_must_be_fixed_size(self):
+ ir = _make_ir_from_emb("bits Dynamic:\n"
+ " 0 [+3] UInt x\n"
+ " 3 [+3 * x] UInt:3[x] a\n")
+ error_type = ir.module[0].type[0]
+ self.assertEqual([
+ [error.error("m.emb", error_type.source_location,
+ "`bits` types must be fixed size.")]
+ ], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_bits_must_be_small(self):
+ ir = _make_ir_from_emb("bits Big:\n"
+ " 0 [+64] UInt x\n"
+ " 64 [+1] UInt y\n")
+ error_type = ir.module[0].type[0]
+ self.assertEqual([
+ [error.error("m.emb", error_type.source_location,
+ "`bits` types must be 64 bits or smaller.")]
+ ], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_constant_expressions_must_be_small(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+8] UInt x\n"
+ " if x < 0x1_0000_0000_0000_0000:\n"
+ " 8 [+1] UInt y\n")
+ condition = ir.module[0].type[0].structure.field[1].existence_condition
+ error_location = condition.function.args[1].source_location
+ self.assertEqual([
+ [error.error(
+ "m.emb", error_location,
+ "Constant value {} of expression cannot fit in a 64-bit signed or "
+ "unsigned integer.".format(2**64))]
+ ], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_variable_expression_out_of_range_for_uint64(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+8] UInt x\n"
+ " if x + 1 < 0xffff_ffff_ffff_ffff:\n"
+ " 8 [+1] UInt y\n")
+ condition = ir.module[0].type[0].structure.field[1].existence_condition
+ error_location = condition.function.args[0].source_location
+ self.assertEqual([
+ [error.error(
+ "m.emb", error_location,
+ "Potential range of expression is {} to {}, which cannot fit in a "
+ "64-bit signed or unsigned integer.".format(1, 2**64))]
+ ], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_variable_expression_out_of_range_for_int64(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+8] UInt x\n"
+ " if x - 0x8000_0000_0000_0001 < 0:\n"
+ " 8 [+1] UInt y\n")
+ condition = ir.module[0].type[0].structure.field[1].existence_condition
+ error_location = condition.function.args[0].source_location
+ self.assertEqual([
+ [error.error(
+ "m.emb", error_location,
+ "Potential range of expression is {} to {}, which cannot fit in a "
+ "64-bit signed or unsigned integer.".format(-(2**63) - 1,
+ 2**63 - 2))]
+ ], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_requires_expression_out_of_range_for_uint64(self):
+ ir = _make_ir_from_emb('[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+8] UInt x\n"
+ " [requires: this * 2 < 0x1_0000]\n")
+ attribute_list = ir.module[0].type[0].structure.field[0].attribute
+ error_arg = attribute_list[0].value.expression.function.args[0]
+ error_location = error_arg.source_location
+ self.assertEqual(
+ [[
+ error.error(
+ "m.emb", error_location,
+ "Potential range of expression is {} to {}, which cannot fit "
+ "in a 64-bit signed or unsigned integer.".format(0, 2**65-2))
+ ]],
+ error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_arguments_require_different_signedness_64_bits(self):
+ ir = _make_ir_from_emb(
+ '[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ # Left side requires uint64, right side requires int64.
+ " if (x + 0x8000_0000_0000_0000) + (x - 0x7fff_ffff_ffff_ffff) < 10:\n"
+ " 1 [+1] UInt y\n")
+ condition = ir.module[0].type[0].structure.field[1].existence_condition
+ error_expression = condition.function.args[0]
+ error_location = error_expression.source_location
+ arg0_location = error_expression.function.args[0].source_location
+ arg1_location = error_expression.function.args[1].source_location
+ self.assertEqual([
+ [error.error(
+ "m.emb", error_location,
+ "Either all arguments to '+' and its result must fit in a 64-bit "
+ "unsigned integer, or all must fit in a 64-bit signed integer."),
+ error.note("m.emb", arg0_location,
+ "Requires unsigned 64-bit integer."),
+ error.note("m.emb", arg1_location,
+ "Requires signed 64-bit integer.")]
+ ], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_return_value_requires_different_signedness_from_arguments(self):
+ ir = _make_ir_from_emb(
+ '[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ # Both arguments require uint64; result fits in int64.
+ " if (x + 0x7fff_ffff_ffff_ffff) - 0x8000_0000_0000_0000 < 10:\n"
+ " 1 [+1] UInt y\n")
+ condition = ir.module[0].type[0].structure.field[1].existence_condition
+ error_expression = condition.function.args[0]
+ error_location = error_expression.source_location
+ arg0_location = error_expression.function.args[0].source_location
+ arg1_location = error_expression.function.args[1].source_location
+ self.assertEqual([
+ [error.error(
+ "m.emb", error_location,
+ "Either all arguments to '-' and its result must fit in a 64-bit "
+ "unsigned integer, or all must fit in a 64-bit signed integer."),
+ error.note("m.emb", arg0_location,
+ "Requires unsigned 64-bit integer."),
+ error.note("m.emb", arg1_location,
+ "Requires unsigned 64-bit integer."),
+ error.note("m.emb", error_location,
+ "Requires signed 64-bit integer.")]
+ ], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_return_value_requires_different_signedness_from_one_argument(self):
+ ir = _make_ir_from_emb(
+ '[$default byte_order: "LittleEndian"]\n'
+ "struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ # One argument requires uint64; result fits in int64.
+ " if (x + 0x7fff_ffff_ffff_fff0) - 0x7fff_ffff_ffff_ffff < 10:\n"
+ " 1 [+1] UInt y\n")
+ condition = ir.module[0].type[0].structure.field[1].existence_condition
+ error_expression = condition.function.args[0]
+ error_location = error_expression.source_location
+ arg0_location = error_expression.function.args[0].source_location
+ self.assertEqual([
+ [error.error(
+ "m.emb", error_location,
+ "Either all arguments to '-' and its result must fit in a 64-bit "
+ "unsigned integer, or all must fit in a 64-bit signed integer."),
+ error.note("m.emb", arg0_location,
+ "Requires unsigned 64-bit integer."),
+ error.note("m.emb", error_location,
+ "Requires signed 64-bit integer.")]
+ ], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_checks_constancy_of_constant_references(self):
+ ir = _make_ir_from_emb("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " let y = x\n"
+ " let z = Foo.y\n")
+ error_expression = ir.module[0].type[0].structure.field[2].read_transform
+ error_location = error_expression.source_location
+ note_field = ir.module[0].type[0].structure.field[1]
+ note_location = note_field.source_location
+ self.assertEqual([
+ [error.error("m.emb", error_location,
+ "Static references must refer to constants."),
+ error.note("m.emb", note_location, "y is not constant.")]
+ ], error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_checks_for_explicit_size_on_parameters(self):
+ ir = _make_ir_from_emb("struct Foo(y: UInt):\n"
+ " 0 [+1] UInt x\n")
+ error_parameter = ir.module[0].type[0].runtime_parameter[0]
+ error_location = error_parameter.physical_type_alias.source_location
+ self.assertEqual(
+ [[error.error("m.emb", error_location,
+ "Integer range of parameter must not be unbounded; it "
+ "must fit in a 64-bit signed or unsigned integer.")]],
+ error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_checks_for_correct_explicit_size_on_parameters(self):
+ ir = _make_ir_from_emb("struct Foo(y: UInt:300):\n"
+ " 0 [+1] UInt x\n")
+ error_parameter = ir.module[0].type[0].runtime_parameter[0]
+ error_location = error_parameter.physical_type_alias.source_location
+ self.assertEqual(
+ [[error.error("m.emb", error_location,
+ "Potential range of parameter is 0 to {}, which cannot "
+ "fit in a 64-bit signed or unsigned integer.".format(
+ 2**300-1))]],
+ error.filter_errors(constraints.check_constraints(ir)))
+
+ def test_checks_for_explicit_enum_size_on_parameters(self):
+ ir = _make_ir_from_emb("struct Foo(y: Bar:8):\n"
+ " 0 [+1] UInt x\n"
+ "enum Bar:\n"
+ " QUX = 1\n")
+ error_parameter = ir.module[0].type[0].runtime_parameter[0]
+ error_size = error_parameter.physical_type_alias.size_in_bits
+ error_location = error_size.source_location
+ self.assertEqual(
+ [[error.error(
+ "m.emb", error_location,
+ "Parameters with enum type may not have explicit size.")]],
+ error.filter_errors(constraints.check_constraints(ir)))
+
+
+if __name__ == "__main__":
+  unittest.main()
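
Reviewer note on the enum range tests above: the expected messages suggest a rule that can be sketched in a few lines. The sketch below is inferred only from those expected messages; the real check lives in constraints.py, which is not shown here, and may be structured differently. The name check_enum_values and the (name, value) input shape are hypothetical. The assumed rule: a value outside [-2**63, 2**64 - 1] fits no 64-bit integer; otherwise, if the enum mixes negative values with values above 2**63 - 1, the smaller of the two groups is blamed, and a tie blames the negative values.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
UINT64_MAX = 2**64 - 1


def check_enum_values(values):
  """Returns (name, message) pairs, in source order, for out-of-range values.

  Hypothetical helper: `values` is a list of (name, int) pairs.  The rule is
  an assumption reverse-engineered from the test expectations, not a copy of
  the actual Emboss implementation.
  """
  representable = [(n, v) for n, v in values if INT64_MIN <= v <= UINT64_MAX]
  negative = {n for n, v in representable if v < 0}
  above_int64 = {n for n, v in representable if v > INT64_MAX}

  blamed, kind = set(), ""
  if negative and above_int64:
    # The enum fits neither int64 nor uint64.  Blame the smaller group; on a
    # tie, blame the negative values (assumed from the tests).
    if len(negative) <= len(above_int64):
      blamed, kind = negative, "unsigned"
    else:
      blamed, kind = above_int64, "signed"

  problems = []
  for name, value in values:
    if not INT64_MIN <= value <= UINT64_MAX:
      problems.append(
          (name, "Value {} is out of range for enumeration.".format(value)))
    elif name in blamed:
      problems.append(
          (name, "Value {} is out of range for {} enumeration.".format(
              value, kind)))
  return problems
```

Under these assumptions, an enum with one negative value and one value of 2**63 reports the negative value against the unsigned range, while two negative values and one value of 2**63 report the large value against the signed range.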
diff --git a/front_end/dependency_checker.py b/front_end/dependency_checker.py
new file mode 100644
index 0000000..7a1fa7d
--- /dev/null
+++ b/front_end/dependency_checker.py
@@ -0,0 +1,261 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Checks for dependency cycles in Emboss IR."""
+
+from public import ir_pb2
+from util import error
+from util import ir_util
+from util import traverse_ir
+
+
+def _add_reference_to_dependencies(reference, dependencies, name):
+  dependencies[name] |= {ir_util.hashable_form_of_reference(reference)}
+
+
+def _add_field_reference_to_dependencies(reference, dependencies, name):
+  dependencies[name] |= {ir_util.hashable_form_of_reference(reference.path[0])}
+
+
+def _add_name_to_dependencies(proto, dependencies):
+  name = ir_util.hashable_form_of_reference(proto.name)
+  dependencies.setdefault(name, set())
+  return {"name": name}
+
+
+def _find_dependencies(ir):
+  """Constructs a dependency graph for the entire IR."""
+  dependencies = {}
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.Reference], _add_reference_to_dependencies,
+      # TODO(bolms): Add handling for references inside of attributes, once
+      # there are attributes with non-constant values.
+      skip_descendants_of={
+          ir_pb2.AtomicType, ir_pb2.Attribute, ir_pb2.FieldReference
+      },
+      incidental_actions={
+          ir_pb2.Field: _add_name_to_dependencies,
+          ir_pb2.EnumValue: _add_name_to_dependencies,
+          ir_pb2.RuntimeParameter: _add_name_to_dependencies,
+      },
+      parameters={"dependencies": dependencies})
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.FieldReference], _add_field_reference_to_dependencies,
+      skip_descendants_of={ir_pb2.Attribute},
+      incidental_actions={
+          ir_pb2.Field: _add_name_to_dependencies,
+          ir_pb2.EnumValue: _add_name_to_dependencies,
+          ir_pb2.RuntimeParameter: _add_name_to_dependencies,
+      },
+      parameters={"dependencies": dependencies})
+  return dependencies
+
+
+def _find_dependency_ordering_for_fields_in_structure(
+    structure, type_definition, dependencies):
+  """Populates structure.fields_in_dependency_order."""
+  # For fields which appear before their dependencies in the original source
+  # text, this algorithm moves them to immediately after their dependencies.
+  #
+  # This is one of many possible schemes for constructing a dependency ordering;
+  # it has the advantage that all of the generated fields (e.g., $size_in_bytes)
+  # stay at the end of the ordering, which makes testing easier.
+  order = []
+  added = set()
+  for parameter in type_definition.runtime_parameter:
+    added.add(ir_util.hashable_form_of_reference(parameter.name))
+  needed = list(range(len(structure.field)))
+  while True:
+    for i in range(len(needed)):
+      field_number = needed[i]
+      field = ir_util.hashable_form_of_reference(
+          structure.field[field_number].name)
+      assert field in dependencies, "dependencies = {}".format(dependencies)
+      if all(dependency in added for dependency in dependencies[field]):
+        order.append(field_number)
+        added.add(field)
+        del needed[i]
+        break
+    else:
+      break
+  # If a non-local-field dependency were in dependencies[field], then not all
+  # fields would be added to the dependency ordering. This shouldn't happen.
+  assert len(order) == len(structure.field), (
+      "order: {}\nlen(structure.field): {}".format(order, len(structure.field)))
+  del structure.fields_in_dependency_order[:]
+  structure.fields_in_dependency_order.extend(order)
+
+
+def _find_dependency_ordering_for_fields(ir):
+  """Populates the fields_in_dependency_order fields throughout ir."""
+  dependencies = {}
+  # TODO(bolms): This duplicates work in _find_dependencies that could be
+  # shared.
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.FieldReference], _add_field_reference_to_dependencies,
+      skip_descendants_of={ir_pb2.Attribute},
+      incidental_actions={
+          ir_pb2.Field: _add_name_to_dependencies,
+          ir_pb2.EnumValue: _add_name_to_dependencies,
+          ir_pb2.RuntimeParameter: _add_name_to_dependencies,
+      },
+      parameters={"dependencies": dependencies})
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.Structure],
+      _find_dependency_ordering_for_fields_in_structure,
+      parameters={"dependencies": dependencies})
+
+
+def _find_module_import_dependencies(ir):
+  """Constructs a dependency graph of module imports."""
+  dependencies = {}
+  for module in ir.module:
+    foreign_imports = set()
+    for foreign_import in module.foreign_import:
+      # The prelude gets an automatic self-import that shouldn't cause any
+      # problems. No other self-imports are allowed, however.
+      if foreign_import.file_name.text or module.source_file_name:
+        foreign_imports |= {(foreign_import.file_name.text,)}
+    dependencies[module.source_file_name,] = foreign_imports
+  return dependencies
+
+
+def _find_cycles(graph):
+  """Finds cycles in graph.
+
+  The graph does not need to be fully connected.
+
+  Arguments:
+    graph: A dictionary whose keys are node labels. Values are sets of node
+      labels, representing edges from the key node to the value nodes.
+
+  Returns:
+    A set of sets of nodes which form strongly-connected components (subgraphs
+    where every node is directly or indirectly reachable from every other node).
+    No node will be included in more than one strongly-connected component, by
+    definition. Strongly-connected components of size 1, where the node in the
+    component does not have a self-edge, are not included in the result.
+
+    Note that a strongly-connected component may have a more complex structure
+    than a single loop. For example:
+
+        +-- A <-+ +-> B --+
+        |       | |       |
+        v        C        v
+        D       ^ ^       E
+        |       | |       |
+        +-> F --+ +-- G <-+
+  """
+  # This uses Tarjan's strongly-connected components algorithm, as described by
+  # Wikipedia. This is a depth-first traversal of the graph with a node stack
+  # that is independent of the call stack; nodes are added to the stack when
+  # they are first encountered, but not removed until all nodes they can reach
+  # have been checked.
+  next_index = [0]
+  node_indices = {}
+  node_lowlinks = {}
+  nodes_on_stack = set()
+  stack = []
+  nontrivial_components = set()
+
+  def strong_connect(node):
+    """Implements the STRONGCONNECT routine of Tarjan's algorithm."""
+    node_indices[node] = next_index[0]
+    node_lowlinks[node] = next_index[0]
+    next_index[0] += 1
+    stack.append(node)
+    nodes_on_stack.add(node)
+
+    for destination_node in graph[node]:
+      if destination_node not in node_indices:
+        strong_connect(destination_node)
+        node_lowlinks[node] = min(node_lowlinks[node],
+                                  node_lowlinks[destination_node])
+      elif destination_node in nodes_on_stack:
+        node_lowlinks[node] = min(node_lowlinks[node],
+                                  node_indices[destination_node])
+
+    strongly_connected_component = []
+    if node_lowlinks[node] == node_indices[node]:
+      while True:
+        popped_node = stack.pop()
+        nodes_on_stack.remove(popped_node)
+        strongly_connected_component.append(popped_node)
+        if popped_node == node:
+          break
+      if (len(strongly_connected_component) > 1 or
+          strongly_connected_component[0] in
+          graph[strongly_connected_component[0]]):
+        nontrivial_components.add(frozenset(strongly_connected_component))
+
+  for node in graph:
+    if node not in node_indices:
+      strong_connect(node)
+  return nontrivial_components
+
+
+def _find_object_dependency_cycles(ir):
+  """Finds dependency cycles in types in the ir."""
+  dependencies = _find_dependencies(ir)
+  cycles = _find_cycles(dict(dependencies))
+  errors = []
+  for cycle in cycles:
+    # TODO(bolms): This lists the entire strongly-connected component in a
+    # fairly arbitrary order. This is simple, and handles components that
+    # aren't simple cycles, but may not be the most user-friendly way to
+    # present this information.
+    cycle_list = sorted(cycle)
+    node_object = ir_util.find_object(cycle_list[0], ir)
+    error_group = [
+        error.error(cycle_list[0][0], node_object.source_location,
+                    "Dependency cycle\n" + node_object.name.name.text)
+    ]
+    for node in cycle_list[1:]:
+      node_object = ir_util.find_object(node, ir)
+      error_group.append(error.note(node[0], node_object.source_location,
+                                    node_object.name.name.text))
+    errors.append(error_group)
+  return errors
+
+
+def _find_module_dependency_cycles(ir):
+  """Finds dependency cycles in modules in the ir."""
+  dependencies = _find_module_import_dependencies(ir)
+  cycles = _find_cycles(dict(dependencies))
+  errors = []
+  for cycle in cycles:
+    cycle_list = sorted(cycle)
+    module = ir_util.find_object(cycle_list[0], ir)
+    error_group = [
+        error.error(cycle_list[0][0], module.source_location,
+                    "Import dependency cycle\n" + module.source_file_name)
+    ]
+    for module_name in cycle_list[1:]:
+      module = ir_util.find_object(module_name, ir)
+      error_group.append(error.note(module_name[0], module.source_location,
+                                    module.source_file_name))
+    errors.append(error_group)
+  return errors
+
+
+def find_dependency_cycles(ir):
+  """Finds any dependency cycles in the ir."""
+  errors = _find_module_dependency_cycles(ir)
+  return errors + _find_object_dependency_cycles(ir)
+
+
+def set_dependency_order(ir):
+  """Sets the fields_in_dependency_order member of Structures."""
+  _find_dependency_ordering_for_fields(ir)
+  return []
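
Reviewer note on _find_cycles: the figure in its docstring can be checked with a small standalone run. The sketch below is a compact re-statement of Tarjan's algorithm for illustration, not the module's code; find_cycles and example_graph are hypothetical names, and the edge list is my reading of the ASCII figure (A -> D -> F -> C -> A and C -> B -> E -> G -> C), plus two extra nodes H and I added here to show the size-1 rule.

```python
def find_cycles(graph):
  """Returns the nontrivial strongly-connected components of `graph`.

  `graph` maps each node to the set of nodes it has edges to.  A component of
  size 1 is reported only if its node has a self-edge.
  """
  index = {}
  lowlink = {}
  stack = []
  on_stack = set()
  components = set()

  def strong_connect(node):
    # len(index) is evaluated before `node` is inserted, so it is the next
    # unused visit number.
    index[node] = lowlink[node] = len(index)
    stack.append(node)
    on_stack.add(node)
    for successor in graph[node]:
      if successor not in index:
        strong_connect(successor)
        lowlink[node] = min(lowlink[node], lowlink[successor])
      elif successor in on_stack:
        lowlink[node] = min(lowlink[node], index[successor])
    if lowlink[node] == index[node]:
      # `node` is the root of a component: pop the component off the stack.
      component = []
      while True:
        popped = stack.pop()
        on_stack.remove(popped)
        component.append(popped)
        if popped == node:
          break
      if len(component) > 1 or node in graph[node]:
        components.add(frozenset(component))

  for node in graph:
    if node not in index:
      strong_connect(node)
  return components


example_graph = {
    "A": {"D"}, "D": {"F"}, "F": {"C"}, "C": {"A", "B"},
    "B": {"E"}, "E": {"G"}, "G": {"C"},
    "H": {"H"},  # Self-edge: reported as a component of size 1.
    "I": {"A"},  # Reaches the cycle but is not on one: not reported.
}
components = find_cycles(example_graph)
```

With this reading of the figure, the two loops share node C, so all seven nodes A through G land in one component. This is why the error reporting lists a whole component instead of a single loop.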
diff --git a/front_end/dependency_checker_test.py b/front_end/dependency_checker_test.py
new file mode 100644
index 0000000..2639db7
--- /dev/null
+++ b/front_end/dependency_checker_test.py
@@ -0,0 +1,308 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for dependency_checker.py."""
+
+import unittest
+from front_end import dependency_checker
+from front_end import glue
+from front_end import test_util
+from util import error
+
+
+def _parse_snippet(emb_file):
+  ir, unused_debug_info, errors = glue.parse_emboss_file(
+      "m.emb",
+      test_util.dict_file_reader({"m.emb": emb_file}),
+      stop_before_step="find_dependency_cycles")
+  assert not errors, errors
+  return ir
+
+
+def _find_dependencies_for_snippet(emb_file):
+  ir, unused_debug_info, errors = glue.parse_emboss_file(
+      "m.emb",
+      test_util.dict_file_reader({
+          "m.emb": emb_file
+      }),
+      stop_before_step="set_dependency_order")
+  assert not errors, errors
+  return ir
+
+
+class DependencyCheckerTest(unittest.TestCase):
+
+ def test_error_on_simple_field_cycle(self):
+ ir = _parse_snippet("struct Foo:\n"
+ " 0 [+field2] UInt field1\n"
+ " 0 [+field1] UInt field2\n")
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("m.emb", struct.field[0].source_location,
+ "Dependency cycle\nfield1"),
+ error.note("m.emb", struct.field[1].source_location, "field2")
+ ]], dependency_checker.find_dependency_cycles(ir))
+
+ def test_error_on_self_cycle(self):
+ ir = _parse_snippet("struct Foo:\n"
+ " 0 [+field1] UInt field1\n")
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("m.emb", struct.field[0].source_location,
+ "Dependency cycle\nfield1")
+ ]], dependency_checker.find_dependency_cycles(ir))
+
+ def test_error_on_triple_field_cycle(self):
+ ir = _parse_snippet("struct Foo:\n"
+ " 0 [+field2] UInt field1\n"
+ " 0 [+field3] UInt field2\n"
+ " 0 [+field1] UInt field3\n")
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("m.emb", struct.field[0].source_location,
+ "Dependency cycle\nfield1"),
+ error.note("m.emb", struct.field[1].source_location, "field2"),
+ error.note("m.emb", struct.field[2].source_location, "field3"),
+ ]], dependency_checker.find_dependency_cycles(ir))
+
+ def test_error_on_complex_field_cycle(self):
+ ir = _parse_snippet("struct Foo:\n"
+ " 0 [+field2] UInt field1\n"
+ " 0 [+field3+field4] UInt field2\n"
+ " 0 [+field1] UInt field3\n"
+ " 0 [+field2] UInt field4\n")
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("m.emb", struct.field[0].source_location,
+ "Dependency cycle\nfield1"),
+ error.note("m.emb", struct.field[1].source_location, "field2"),
+ error.note("m.emb", struct.field[2].source_location, "field3"),
+ error.note("m.emb", struct.field[3].source_location, "field4"),
+ ]], dependency_checker.find_dependency_cycles(ir))
+
+ def test_error_on_simple_enum_value_cycle(self):
+ ir = _parse_snippet("enum Foo:\n"
+ " XX = YY\n"
+ " YY = XX\n")
+ enum = ir.module[0].type[0].enumeration
+ self.assertEqual([[
+ error.error("m.emb", enum.value[0].source_location,
+ "Dependency cycle\nXX"),
+ error.note("m.emb", enum.value[1].source_location, "YY")
+ ]], dependency_checker.find_dependency_cycles(ir))
+
+ def test_no_error_on_no_cycle(self):
+ ir = _parse_snippet("enum Foo:\n"
+ " XX = 0\n"
+ " YY = XX\n")
+ self.assertEqual([], dependency_checker.find_dependency_cycles(ir))
+
+ def test_error_on_cycle_nested(self):
+ ir = _parse_snippet("struct Foo:\n"
+ " struct Bar:\n"
+ " 0 [+field2] UInt field1\n"
+ " 0 [+field1] UInt field2\n"
+ " 0 [+1] UInt field\n")
+ struct = ir.module[0].type[0].subtype[0].structure
+ self.assertEqual([[
+ error.error("m.emb", struct.field[0].source_location,
+ "Dependency cycle\nfield1"),
+ error.note("m.emb", struct.field[1].source_location, "field2")
+ ]], dependency_checker.find_dependency_cycles(ir))
+
+ def test_error_on_import_cycle(self):
+ ir, unused_debug_info, errors = glue.parse_emboss_file(
+ "m.emb",
+ test_util.dict_file_reader({"m.emb": 'import "n.emb" as n\n',
+ "n.emb": 'import "m.emb" as m\n'}),
+ stop_before_step="find_dependency_cycles")
+ assert not errors, errors
+ self.assertEqual([[
+ error.error("m.emb", ir.module[0].source_location,
+ "Import dependency cycle\nm.emb"),
+ error.note("n.emb", ir.module[2].source_location, "n.emb")
+ ]], dependency_checker.find_dependency_cycles(ir))
+
+ def test_error_on_import_cycle_and_field_cycle(self):
+ ir, unused_debug_info, errors = glue.parse_emboss_file(
+ "m.emb",
+ test_util.dict_file_reader({"m.emb": 'import "n.emb" as n\n'
+ "struct Foo:\n"
+ " 0 [+field1] UInt field1\n",
+ "n.emb": 'import "m.emb" as m\n'}),
+ stop_before_step="find_dependency_cycles")
+ self.assertFalse(errors)
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("m.emb", ir.module[0].source_location,
+ "Import dependency cycle\nm.emb"),
+ error.note("n.emb", ir.module[2].source_location, "n.emb")
+ ], [
+ error.error("m.emb", struct.field[0].source_location,
+ "Dependency cycle\nfield1")
+ ]], dependency_checker.find_dependency_cycles(ir))
+
+ def test_error_on_field_existence_self_cycle(self):
+ ir = _parse_snippet("struct Foo:\n"
+ " if x == 1:\n"
+ " 0 [+1] UInt x\n")
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("m.emb", struct.field[0].source_location,
+ "Dependency cycle\nx")
+ ]], dependency_checker.find_dependency_cycles(ir))
+
+ def test_error_on_field_existence_cycle(self):
+ ir = _parse_snippet("struct Foo:\n"
+ " if y == 1:\n"
+ " 0 [+1] UInt x\n"
+ " if x == 0:\n"
+ " 1 [+1] UInt y\n")
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("m.emb", struct.field[0].source_location,
+ "Dependency cycle\nx"),
+ error.note("m.emb", struct.field[1].source_location, "y")
+ ]], dependency_checker.find_dependency_cycles(ir))
+
+ def test_error_on_virtual_field_cycle(self):
+ ir = _parse_snippet("struct Foo:\n"
+ " let x = y\n"
+ " let y = x\n")
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("m.emb", struct.field[0].source_location,
+ "Dependency cycle\nx"),
+ error.note("m.emb", struct.field[1].source_location, "y")
+ ]], dependency_checker.find_dependency_cycles(ir))
+
+ def test_error_on_virtual_non_virtual_field_cycle(self):
+ ir = _parse_snippet("struct Foo:\n"
+ " let x = y\n"
+ " x [+4] UInt y\n")
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("m.emb", struct.field[0].source_location,
+ "Dependency cycle\nx"),
+ error.note("m.emb", struct.field[1].source_location, "y")
+ ]], dependency_checker.find_dependency_cycles(ir))
+
+ def test_error_on_non_virtual_virtual_field_cycle(self):
+ ir = _parse_snippet("struct Foo:\n"
+ " y [+4] UInt x\n"
+ " let y = x\n")
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("m.emb", struct.field[0].source_location,
+ "Dependency cycle\nx"),
+ error.note("m.emb", struct.field[1].source_location, "y")
+ ]], dependency_checker.find_dependency_cycles(ir))
+
+ def test_error_on_cycle_involving_subfield(self):
+ ir = _parse_snippet("struct Bar:\n"
+ " foo_b.x [+4] Foo foo_a\n"
+ " foo_a.x [+4] Foo foo_b\n"
+ "struct Foo:\n"
+ " 0 [+4] UInt x\n")
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("m.emb", struct.field[0].source_location,
+ "Dependency cycle\nfoo_a"),
+ error.note("m.emb", struct.field[1].source_location, "foo_b")
+ ]], dependency_checker.find_dependency_cycles(ir))
+
+ def test_dependency_ordering_with_no_dependencies(self):
+ ir = _find_dependencies_for_snippet("struct Foo:\n"
+ " 0 [+4] UInt a\n"
+ " 4 [+4] UInt b\n")
+ self.assertEqual([], dependency_checker.set_dependency_order(ir))
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([0, 1], struct.fields_in_dependency_order[:2])
+
+ def test_dependency_ordering_with_dependency_in_order(self):
+ ir = _find_dependencies_for_snippet("struct Foo:\n"
+ " 0 [+4] UInt a\n"
+ " a [+4] UInt b\n")
+ self.assertEqual([], dependency_checker.set_dependency_order(ir))
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([0, 1], struct.fields_in_dependency_order[:2])
+
+ def test_dependency_ordering_with_dependency_in_reverse_order(self):
+ ir = _find_dependencies_for_snippet("struct Foo:\n"
+ " b [+4] UInt a\n"
+ " 0 [+4] UInt b\n")
+ self.assertEqual([], dependency_checker.set_dependency_order(ir))
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([1, 0], struct.fields_in_dependency_order[:2])
+
+ def test_dependency_ordering_with_extra_fields(self):
+ ir = _find_dependencies_for_snippet("struct Foo:\n"
+ " d [+4] UInt a\n"
+ " 4 [+4] UInt b\n"
+ " 8 [+4] UInt c\n"
+ " 12 [+4] UInt d\n")
+ self.assertEqual([], dependency_checker.set_dependency_order(ir))
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([1, 2, 3, 0], struct.fields_in_dependency_order[:4])
+
+ def test_dependency_ordering_scrambled(self):
+ ir = _find_dependencies_for_snippet("struct Foo:\n"
+ " d [+4] UInt a\n"
+ " c [+4] UInt b\n"
+ " a [+4] UInt c\n"
+ " 12 [+4] UInt d\n")
+ self.assertEqual([], dependency_checker.set_dependency_order(ir))
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([3, 0, 2, 1], struct.fields_in_dependency_order[:4])
+
+ def test_dependency_ordering_multiple_dependents(self):
+ ir = _find_dependencies_for_snippet("struct Foo:\n"
+ " d [+4] UInt a\n"
+ " d [+4] UInt b\n"
+ " d [+4] UInt c\n"
+ " 12 [+4] UInt d\n")
+ self.assertEqual([], dependency_checker.set_dependency_order(ir))
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([3, 0, 1, 2], struct.fields_in_dependency_order[:4])
+
+ def test_dependency_ordering_multiple_dependencies(self):
+ ir = _find_dependencies_for_snippet("struct Foo:\n"
+ " b+c [+4] UInt a\n"
+ " 4 [+4] UInt b\n"
+ " 8 [+4] UInt c\n"
+ " a [+4] UInt d\n")
+ self.assertEqual([], dependency_checker.set_dependency_order(ir))
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([1, 2, 0, 3], struct.fields_in_dependency_order[:4])
+
+ def test_dependency_ordering_with_parameter(self):
+ ir = _find_dependencies_for_snippet("struct Foo:\n"
+ " 0 [+1] Bar(x) b\n"
+ " 1 [+1] UInt x\n"
+ "struct Bar(x: UInt:8):\n"
+ " x [+1] UInt y\n")
+ self.assertEqual([], dependency_checker.set_dependency_order(ir))
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([1, 0], struct.fields_in_dependency_order[:2])
+
+ def test_dependency_ordering_with_local_parameter(self):
+ ir = _find_dependencies_for_snippet("struct Foo(x: Int:13):\n"
+ " 0 [+x] Int b\n")
+ self.assertEqual([], dependency_checker.set_dependency_order(ir))
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([0], struct.fields_in_dependency_order[:1])
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/front_end/docs_are_up_to_date_test.py b/front_end/docs_are_up_to_date_test.py
new file mode 100644
index 0000000..ded2b55
--- /dev/null
+++ b/front_end/docs_are_up_to_date_test.py
@@ -0,0 +1,42 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests that g3doc/grammar.md is up to date."""
+
+import pkgutil
+
+import unittest
+from front_end import generate_grammar_md
+
+
+class DocsAreUpToDateTest(unittest.TestCase):
+ """Tests that auto-generated, checked-in documentation is up to date."""
+
+ def test_grammar_md(self):
+ doc_md = pkgutil.get_data("g3doc", "grammar.md").decode(encoding="UTF-8")
+ correct_md = generate_grammar_md.generate_grammar_md()
+ # If this fails, run:
+ #
+ # bazel run //front_end:generate_grammar_md > g3doc/grammar.md
+ #
+ # Be sure to check that the results look good before committing!
+ doc_md_lines = doc_md.splitlines()
+ correct_md_lines = correct_md.splitlines()
+ for correct_line, doc_line in zip(correct_md_lines, doc_md_lines):
+ self.assertEqual(correct_line, doc_line)
+ self.assertEqual(correct_md, doc_md)
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/front_end/emboss_front_end.py b/front_end/emboss_front_end.py
new file mode 100644
index 0000000..a6ae13c
--- /dev/null
+++ b/front_end/emboss_front_end.py
@@ -0,0 +1,167 @@
+#!/usr/bin/python3
+
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Emboss front end.
+
+This is an internal tool, normally called by the "embossc" driver, rather than
+directly by a user. It parses a .emb file and its dependencies, and prints an
+intermediate representation of the parse trees and symbol tables to stdout, or
+prints various bits of debug info, depending on which flags are passed.
+"""
+
+from __future__ import print_function
+
+import argparse
+import os
+from os import path
+import sys
+
+from front_end import glue
+from front_end import module_ir
+from util import error
+
+def _parse_command_line(argv):
+ """Parses the given command-line arguments."""
+ parser = argparse.ArgumentParser(description="Emboss compiler front end.",
+ prog=argv[0])
+ parser.add_argument("input_file",
+ type=str,
+ nargs=1,
+ help=".emb file to compile.")
+ parser.add_argument("--debug-show-tokenization",
+ action="store_true",
+ help="Show the tokenization of the main input file.")
+ parser.add_argument("--debug-show-parse-tree",
+ action="store_true",
+ help="Show the parse tree of the main input file.")
+ parser.add_argument("--debug-show-module-ir",
+ action="store_true",
+ help="Show the module-level IR of the main input file "
+ "before symbol resolution.")
+ parser.add_argument("--debug-show-full-ir",
+ action="store_true",
+ help="Show the final IR of the main input file.")
+ parser.add_argument("--debug-show-used-productions",
+ action="store_true",
+ help="Show all of the grammar productions used in "
+ "parsing the main input file.")
+ parser.add_argument("--debug-show-unused-productions",
+ action="store_true",
+ help="Show all of the grammar productions not used in "
+ "parsing the main input file.")
+ parser.add_argument("--output-ir-to-stdout",
+ action="store_true",
+ help="Dump serialized IR to stdout.")
+ parser.add_argument("--no-debug-show-header-lines",
+ dest="debug_show_header_lines",
+ action="store_false",
+ help="Do not print header lines before debug output.")
+ parser.add_argument("--color-output",
+ default="if_tty",
+ choices=["always", "never", "if_tty", "auto"],
+ help="Print error messages using color. 'auto' is a "
+ "synonym for 'if_tty'.")
+ parser.add_argument("--import-dir", "-I",
+ dest="import_dirs",
+ action="append",
+ default=["."],
+ help="A directory to use when searching for imported "
+ "embs. May be repeated. The current directory is "
+ "always searched first.")
+ return parser.parse_args(argv[1:])
+
+
+def _show_errors(errors, debug_info, flags):
+ """Prints errors with source code snippets."""
+ source_codes = {}
+ for source_file in debug_info.modules:
+ source_codes[source_file] = debug_info.modules[source_file].source_code
+ use_color = (flags.color_output == "always" or
+ (flags.color_output in ("auto", "if_tty") and
+ os.isatty(sys.stderr.fileno())))
+ print(error.format_errors(errors, source_codes, use_color), file=sys.stderr)
+
+
+def _find_in_dirs_and_read(import_dirs):
+ """Returns a function which will search import_dirs for a file."""
+
+ def _find_and_read(file_name):
+ """Searches import_dirs for file_name and returns the contents."""
+ errors = []
+ # *All* source files, including the one specified on the command line, will
+ # be searched for in the import_dirs. This may be surprising, especially if
+ # the current directory is *not* an import_dir.
+ # TODO(bolms): Determine if this is really the desired behavior.
+ for import_dir in import_dirs:
+ full_name = path.join(import_dir, file_name)
+ try:
+ with open(full_name) as f:
+ # As written, this follows the typical compiler convention of checking
+ # the include/import directories in the order specified by the user,
+ # and always reading the first matching file, even if other files
+ # might match in later directories. This lets files shadow other
+ # files, which can be useful in some cases (to override things), but
+ # can also cause accidental shadowing, which can be tricky to fix.
+ #
+ # TODO(bolms): Check if any other files with the same name are in the
+ # import path, and give a warning or error?
+ return f.read(), None
+ except IOError as e:
+ errors.append(str(e))
+ return None, errors + ["import path " + ":".join(import_dirs)]
+
+ return _find_and_read
+
+
+def main(flags):
+ ir, debug_info, errors = glue.parse_emboss_file(
+ flags.input_file[0], _find_in_dirs_and_read(flags.import_dirs))
+ if errors:
+ _show_errors(errors, debug_info, flags)
+ return 1
+ main_module_debug_info = debug_info.modules[flags.input_file[0]]
+ if flags.debug_show_tokenization:
+ if flags.debug_show_header_lines:
+ print("Tokenization:")
+ print(main_module_debug_info.format_tokenization())
+ if flags.debug_show_parse_tree:
+ if flags.debug_show_header_lines:
+ print("Parse Tree:")
+ print(main_module_debug_info.format_parse_tree())
+ if flags.debug_show_module_ir:
+ if flags.debug_show_header_lines:
+ print("Module IR:")
+ print(main_module_debug_info.format_module_ir())
+ if flags.debug_show_full_ir:
+ if flags.debug_show_header_lines:
+ print("Full IR:")
+ print(str(ir))
+ if flags.debug_show_used_productions:
+ if flags.debug_show_header_lines:
+ print("Used Productions:")
+ print(glue.format_production_set(main_module_debug_info.used_productions))
+ if flags.debug_show_unused_productions:
+ if flags.debug_show_header_lines:
+ print("Unused Productions:")
+ print(glue.format_production_set(
+ set(module_ir.PRODUCTIONS) - main_module_debug_info.used_productions))
+ if flags.output_ir_to_stdout:
+ print(ir.to_json())
+ return 0
+
+
+if __name__ == "__main__":
+ sys.exit(main(_parse_command_line(sys.argv)))
diff --git a/front_end/error_examples b/front_end/error_examples
new file mode 100644
index 0000000..eb7722e
--- /dev/null
+++ b/front_end/error_examples
@@ -0,0 +1,783 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Error definitions.
+#
+# Each error definition consists of an error message section, followed by one or
+# more examples of that error. The examples are used to train the parser to
+# recognize the error.
+#
+# Each error example should contain the token '$ERR' immediately before the
+# token on which the error is triggered.
+#
+# An example may optionally contain the token '$ANY'. If so, the corresponding
+# error message will be used as a default message for parser errors which occur
+# in the corresponding parser state. This is roughly equivalent to repeating
+# the example with every erroneous token in place of the '$ANY'.
+
+================================================================================
+A name is required for a struct field.
+--------------------------------------------------------------------------------
+# First example.
+struct Foo:
+ 0 [+1] UInt $ERR # Missing name (end of struct).
+---
+# Second example -- error on newline token, not comment token.
+struct Foo:
+ 0 [+1] UInt $ERR
+
+
+================================================================================
+Module attributes must appear before type definitions.
+--------------------------------------------------------------------------------
+struct Foo:
+ 0 [+1] UInt field
+
+$ERR [attrib: VALUE] # Attribute after types.
+
+
+================================================================================
+A type name must be CamelCase.
+--------------------------------------------------------------------------------
+struct $ERR foo: # snake_case
+---
+struct $ERR FOO: # SHOUTY
+---
+struct $ERR Foo_: # Not a valid identifier
+---
+bits $ERR foo:
+---
+bits $ERR FOO:
+---
+bits $ERR Foo_:
+---
+enum $ERR foo:
+---
+enum $ERR FOO:
+---
+enum $ERR Foo_:
+---
+external $ERR foo:
+---
+external $ERR FOO:
+---
+external $ERR Foo_:
+---
+bits Type:
+ bits $ERR foo:
+---
+bits Type:
+ bits $ERR FOO:
+---
+bits Type:
+ bits $ERR Foo_:
+---
+bits Type:
+ struct $ERR foo:
+---
+bits Type:
+ struct $ERR FOO:
+---
+bits Type:
+ struct $ERR Foo_:
+---
+bits Type:
+ enum $ERR foo:
+---
+bits Type:
+ enum $ERR FOO:
+---
+bits Type:
+ enum $ERR Foo_:
+---
+bits Type:
+ external $ERR foo:
+---
+bits Type:
+ external $ERR FOO:
+---
+bits Type:
+ external $ERR Foo_:
+
+
+================================================================================
+Expected type name.
+--------------------------------------------------------------------------------
+struct $ERR $ANY
+---
+bits $ERR $ANY
+---
+enum $ERR $ANY
+---
+external $ERR $ANY
+---
+bits Type:
+ bits $ERR $ANY
+---
+bits Type:
+ struct $ERR $ANY
+---
+bits Type:
+ enum $ERR $ANY
+---
+bits Type:
+ external $ERR $ANY
+
+
+================================================================================
+Expected ':' after type name.
+--------------------------------------------------------------------------------
+struct Type $ERR $ANY
+---
+bits Type $ERR $ANY
+---
+enum Type $ERR $ANY
+---
+external Type $ERR $ANY
+
+
+================================================================================
+Expected end of line.
+--------------------------------------------------------------------------------
+struct Type: $ERR $ANY
+---
+bits Type: $ERR $ANY
+---
+enum Type: $ERR $ANY
+---
+external Type: $ERR $ANY
+---
+bits Type:
+ enum Type: $ERR $ANY
+---
+bits Type:
+ struct Type: $ERR $ANY
+---
+bits Type:
+ external Type: $ERR $ANY
+---
+struct Type:
+ enum Type: $ERR $ANY
+---
+struct Type:
+ external Type: $ERR $ANY
+---
+struct Type:
+ bits Type: $ERR $ANY
+---
+bits Type:
+ if false: $ERR $ANY
+---
+struct Type:
+ if false: $ERR $ANY
+---
+struct Type:
+ 0 [+0] bits: $ERR $ANY
+---
+struct Type:
+ if false:
+ 0 [+0] bits: $ERR $ANY
+---
+struct Type:
+ 0 [+0] enum name: $ERR $ANY
+---
+struct Type:
+ if false:
+ 0 [+0] enum name: $ERR $ANY
+
+
+================================================================================
+Expected documentation, an import, a module attribute, or a type at top level.
+--------------------------------------------------------------------------------
+$ERR $ANY
+
+
+================================================================================
+Expected an import, a module attribute, or a type definition.
+--------------------------------------------------------------------------------
+-- doc
+$ERR $ANY
+
+
+================================================================================
+Expected an import, a module attribute, or a type definition.
+--------------------------------------------------------------------------------
+import "string" as name
+$ERR $ANY
+
+
+================================================================================
+Imports should follow the form 'import "file_name.emb" as name'.
+--------------------------------------------------------------------------------
+import $ERR $ANY
+---
+import "string" $ERR $ANY
+---
+import "string" as $ERR $ANY
+
+
+================================================================================
+An import statement must be on its own line.
+--------------------------------------------------------------------------------
+import "string" as name $ERR $ANY
+
+
+================================================================================
+Expected inline type definition or field definition.
+--------------------------------------------------------------------------------
+struct Type:
+ external Type:
+ --
+ $ERR $ANY
+
+
+================================================================================
+Nested type definitions must come before fields.
+--------------------------------------------------------------------------------
+struct Type:
+ VALUE [+0] Type name
+ $ERR struct
+---
+struct Type:
+ VALUE [+0] Type name
+ $ERR enum
+---
+struct Type:
+ VALUE [+0] Type name
+ $ERR bits
+
+
+================================================================================
+Type documentation must come before fields.
+--------------------------------------------------------------------------------
+struct Type:
+ VALUE [+0] Type name
+ $ERR -- doc
+
+
+================================================================================
+Expected field definition.
+--------------------------------------------------------------------------------
+struct Type:
+ VALUE [+0] Type name
+ $ERR $ANY
+
+
+================================================================================
+Expected documentation, attribute, or enum value for inline enum.
+--------------------------------------------------------------------------------
+struct Type:
+ VALUE [+0] enum name:
+ $ERR $ANY
+
+
+================================================================================
+Attributes must be of the form '[name: value]', '[(domain) name: value]', or '[$default name: value]'
+--------------------------------------------------------------------------------
+[ $ERR $ANY
+---
+struct Type:
+ VALUE [+0] Type name [name: $ERR $ANY
+---
+[( $ERR $ANY
+---
+[(name $ERR $ANY
+---
+[(name) $ERR $ANY
+---
+[$default $ERR $ANY
+---
+[name $ERR $ANY
+---
+[name: $ERR $ANY
+---
+[name: VALUE $ERR $ANY
+---
+[name: false $ERR $ANY
+---
+[name: name $ERR $ANY
+---
+[name: "string" $ERR $ANY
+---
+[name: Type $ERR $ANY
+---
+[name: 0 $ERR $ANY
+
+
+================================================================================
+Expected a variable, constant, or expression.
+--------------------------------------------------------------------------------
+# The top-of-stack parser states here are common to expression parsing; the
+# module-attribute prefix is just the shortest prefix that gets to an
+# expression.
+[name: ( $ERR $ANY
+---
+# Yes, '((' puts the parser in a slightly different state than '('.
+# Fortunately, '(((' and on are equivalent to '(('.
+[name: (( $ERR $ANY
+---
+[name: + $ERR $ANY
+---
+[name: - $ERR $ANY
+---
+[name: 0 != ( $ERR $ANY
+---
+struct Type:
+ ( $ERR $ANY
+
+
+================================================================================
+Documentation must come before any enumeration values.
+--------------------------------------------------------------------------------
+enum Type:
+ VALUE = VALUE
+ --
+ $ERR -- doc
+
+================================================================================
+Attributes must come before any enumeration values.
+--------------------------------------------------------------------------------
+enum Type:
+ VALUE = VALUE
+ --
+ $ERR [attribute: value]
+
+
+================================================================================
+Expected 'NAME = value' pair.
+--------------------------------------------------------------------------------
+enum Type:
+ VALUE = VALUE
+ --
+ $ERR $ANY
+
+
+================================================================================
+Documentation must come before attributes.
+--------------------------------------------------------------------------------
+[name: VALUE]
+$ERR -- doc
+---
+enum Type:
+ [name: "string"]
+ $ERR -- doc
+---
+external Type:
+ [name: "string"]
+ $ERR -- doc
+---
+struct Type:
+ [name: "string"]
+ $ERR -- doc
+---
+bits Type:
+ [name: "string"]
+ $ERR -- doc
+
+
+================================================================================
+Imports must come before module attributes.
+--------------------------------------------------------------------------------
+[name: VALUE]
+$ERR import
+
+
+================================================================================
+Expected an attribute or a type definition.
+--------------------------------------------------------------------------------
+[name: VALUE]
+$ERR $ANY
+
+
+================================================================================
+Expected documentation or attribute for field. (Wrong indent?)
+--------------------------------------------------------------------------------
+struct Type :
+ VALUE [+0] Type name
+ $ERR $ANY
+
+
+================================================================================
+Documentation must come before nested types.
+--------------------------------------------------------------------------------
+struct Type:
+ enum Type:
+ VALUE = VALUE
+ $ERR -- doc
+
+
+================================================================================
+Attributes must come before nested types.
+--------------------------------------------------------------------------------
+struct Type:
+ enum Type:
+ VALUE = VALUE
+ $ERR [
+
+
+================================================================================
+Expected nested type or field definition.
+--------------------------------------------------------------------------------
+struct Type:
+ enum Type:
+ VALUE = VALUE
+ $ERR $ANY
+
+
+================================================================================
+Documentation must come before field definitions.
+--------------------------------------------------------------------------------
+struct Type:
+ VALUE [+0] Type name
+ --
+ $ERR -- doc
+
+
+================================================================================
+Attributes must come before field definitions.
+--------------------------------------------------------------------------------
+struct Type:
+ VALUE [+0] Type name
+ --
+ $ERR [
+
+
+================================================================================
+Nested types must come before field definitions.
+--------------------------------------------------------------------------------
+struct Type :
+ VALUE [+0] Type name
+ --
+ $ERR enum
+---
+struct Type :
+ VALUE [+0] Type name
+ --
+ $ERR struct
+---
+struct Type :
+ VALUE [+0] Type name
+ --
+ $ERR bits
+---
+struct Type :
+ VALUE [+0] Type name
+ --
+ $ERR external
+
+
+================================================================================
+Expected field definition.
+--------------------------------------------------------------------------------
+struct Type :
+ VALUE [+0] Type name
+ --
+ $ERR $ANY
+
+
+================================================================================
+Expected attribute or end of line.
+--------------------------------------------------------------------------------
+struct Type:
+ VALUE [+0] Type name [name: VALUE] $ERR $ANY
+
+
+================================================================================
+Module documentation must come before imports.
+--------------------------------------------------------------------------------
+import "string" as name
+$ERR -- doc
+
+
+================================================================================
+Body of a type definition must be non-empty and indented.
+--------------------------------------------------------------------------------
+bits Type:
+$ERR $ANY
+---
+external Type:
+$ERR $ANY
+---
+struct Type:
+$ERR $ANY
+---
+enum Type:
+$ERR $ANY
+
+
+================================================================================
+Expected documentation, attribute, nested type, or field definition.
+--------------------------------------------------------------------------------
+bits Type:
+ $ERR $ANY
+---
+struct Type:
+ $ERR $ANY
+
+
+================================================================================
+Expected documentation or attribute.
+--------------------------------------------------------------------------------
+external Type:
+ $ERR $ANY
+
+
+================================================================================
+Expected documentation, attribute, or enumeration value.
+--------------------------------------------------------------------------------
+enum Type:
+ $ERR $ANY
+
+
+================================================================================
+Enumerations may not contain subtypes.
+--------------------------------------------------------------------------------
+enum Type:
+ $ERR bits
+---
+enum Type:
+ $ERR struct
+---
+enum Type:
+ $ERR external
+---
+enum Type:
+ $ERR enum
+
+
+================================================================================
+Expected operator or closing parenthesis.
+--------------------------------------------------------------------------------
+[name: (name $ERR $ANY
+---
+[name: (VALUE $ERR $ANY
+---
+[name: (false $ERR $ANY
+---
+[name: (0 $ERR $ANY
+---
+[name: ((false) $ERR $ANY
+
+
+================================================================================
+Expected arithmetic operator or closing parenthesis.
+--------------------------------------------------------------------------------
+[name: (0 != 0 $ERR $ANY
+---
+[name: (0 != VALUE $ERR $ANY
+---
+[name: (0 != false $ERR $ANY
+
+
+================================================================================
+Expected arithmetic operator or end of attribute.
+--------------------------------------------------------------------------------
+[name: 0 != VALUE $ERR $ANY
+---
+[name: 0 != 0 $ERR $ANY
+---
+[name: 0 != false $ERR $ANY
+
+
+================================================================================
+Expected operator or '.'.
+--------------------------------------------------------------------------------
+[name: 0 != name $ERR $ANY
+
+
+================================================================================
+Expected operator or closing parenthesis; enumerated values do not have subfields.
+--------------------------------------------------------------------------------
+[name: (VALUE $ERR .
+
+
+================================================================================
+Expected name.
+--------------------------------------------------------------------------------
+[name: Type. $ERR $ANY
+
+
+================================================================================
+Field locations must be of the form 'start [+length]'. Missing '+'?
+--------------------------------------------------------------------------------
+struct Foo:
+ 0 [ $ERR 0
+---
+struct Foo:
+ 0 [ $ERR name
+---
+struct Foo:
+ 0 [ $ERR VALUE
+---
+struct Foo:
+ 0 [ $ERR Type
+---
+struct Foo:
+ 0 [ $ERR 0
+---
+struct Foo:
+ 0 [ $ERR name
+---
+struct Foo:
+ 0 [ $ERR VALUE
+---
+struct Foo:
+ 0 [ $ERR Type
+
+
+================================================================================
+Right operand must be an expression.
+--------------------------------------------------------------------------------
+[name: 0 * $ERR $ANY
+---
+[name: 0 + $ERR $ANY
+---
+[name: 0 - $ERR $ANY
+---
+[name: 0 == $ERR $ANY
+---
+[name: 0 != $ERR $ANY
+---
+[name: VALUE * $ERR $ANY
+---
+[name: VALUE + $ERR $ANY
+---
+[name: VALUE - $ERR $ANY
+---
+[name: VALUE == $ERR $ANY
+---
+[name: VALUE != $ERR $ANY
+
+
+================================================================================
+Attribute must appear on its own line.
+--------------------------------------------------------------------------------
+[name: false] $ERR $ANY
+
+
+================================================================================
+Expected subtype or enumeration name.
+--------------------------------------------------------------------------------
+[name: (Type. $ERR $ANY
+
+
+================================================================================
+Attribute missing closing ']'.
+--------------------------------------------------------------------------------
+[name: (0) $ERR
+
+
+================================================================================
+Expected operator or end of attribute.
+--------------------------------------------------------------------------------
+[name: (0) $ERR $ANY
+
+
+================================================================================
+Expected subfield name.
+--------------------------------------------------------------------------------
+[name: (name. $ERR $ANY
+---
+[name: name. $ERR $ANY
+---
+[name: 0 == name. $ERR $ANY
+
+
+================================================================================
+Expected expression for size of field.
+--------------------------------------------------------------------------------
+struct Foo:
+ 0 [+ $ERR $ANY
+
+
+================================================================================
+Unexpected enum name; possibly incorrect indentation.
+--------------------------------------------------------------------------------
+enum Type:
+ VALUE = VALUE
+ --
+ $ERR VALUE = VALUE
+
+
+================================================================================
+Space is required after '--'.
+--------------------------------------------------------------------------------
+enum Type:
+ VALUE = VALUE
+ --
+ $ERR --No space
+
+
+================================================================================
+Expected documentation or next value.
+--------------------------------------------------------------------------------
+enum Type:
+ VALUE = VALUE
+ --
+ $ERR $ANY
+
+
+================================================================================
+Expected '=' after enumerated name.
+--------------------------------------------------------------------------------
+enum Type:
+ VALUE $ERR $ANY
+
+
+================================================================================
+Parentheses are required to mix '&&' and '||'.
+--------------------------------------------------------------------------------
+struct Foo:
+ if a && b && c $ERR ||
+---
+struct Foo:
+ if a || b || c $ERR &&
+
+
+================================================================================
+Greater-than comparisons may only be chained with equality and greater-than.
+--------------------------------------------------------------------------------
+struct Foo:
+ if a > b $ERR <
+---
+struct Foo:
+ if a > b $ERR <=
+
+
+================================================================================
+Less-than comparisons may only be chained with equality and less-than.
+--------------------------------------------------------------------------------
+struct Foo:
+ if a < b $ERR >
+---
+struct Foo:
+ if a < b $ERR >=
+
+
+================================================================================
+Chained or nested choice operations ('?:') require explicit parentheses.
+--------------------------------------------------------------------------------
+struct Foo:
+ if a ? b : c $ERR ?
+---
+struct Foo:
+ if a ? b $ERR ?
+
+
+================================================================================
+Virtual fields are not allowed in anonymous 'bits' constructs; place them in the enclosing type.
+--------------------------------------------------------------------------------
+struct Foo:
+ 0 [+1] bits:
+ $ERR let x = 0
diff --git a/front_end/expression_bounds.py b/front_end/expression_bounds.py
new file mode 100644
index 0000000..8ea53f3
--- /dev/null
+++ b/front_end/expression_bounds.py
@@ -0,0 +1,715 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Functions for proving mathematical properties of expressions."""
+
+import fractions
+import operator
+
+from public import ir_pb2
+from util import ir_util
+from util import traverse_ir
+
+
+def compute_constraints_of_expression(expression, ir):
+ """Adds appropriate bounding constraints to the given expression."""
+ if ir_util.is_constant_type(expression.type):
+ return
+ expression_variety = expression.WhichOneof("expression")
+ if expression_variety == "constant":
+ _compute_constant_value_of_constant(expression)
+ elif expression_variety == "constant_reference":
+ _compute_constant_value_of_constant_reference(expression, ir)
+ elif expression_variety == "function":
+ _compute_constraints_of_function(expression, ir)
+ elif expression_variety == "field_reference":
+ _compute_constraints_of_field_reference(expression, ir)
+ elif expression_variety == "builtin_reference":
+ _compute_constraints_of_builtin_value(expression)
+ elif expression_variety == "boolean_constant":
+ _compute_constant_value_of_boolean_constant(expression)
+ else:
+ assert False, "Unknown expression variety {!r}".format(expression_variety)
+ if expression.type.WhichOneof("type") == "integer":
+ _assert_integer_constraints(expression)
+
+
+def _compute_constant_value_of_constant(expression):
+ value = expression.constant.value
+ expression.type.integer.modular_value = value
+ expression.type.integer.minimum_value = value
+ expression.type.integer.maximum_value = value
+ expression.type.integer.modulus = "infinity"
+
+
+def _compute_constant_value_of_constant_reference(expression, ir):
+ referred_object = ir_util.find_object(
+ expression.constant_reference.canonical_name, ir)
+ if isinstance(referred_object, ir_pb2.EnumValue):
+ compute_constraints_of_expression(referred_object.value, ir)
+ assert ir_util.is_constant(referred_object.value)
+ new_value = str(ir_util.constant_value(referred_object.value))
+ expression.type.enumeration.value = new_value
+ elif isinstance(referred_object, ir_pb2.Field):
+ assert ir_util.field_is_virtual(referred_object), (
+ "Non-virtual non-enum-value constant reference should have been caught "
+ "in type_check.py")
+ compute_constraints_of_expression(referred_object.read_transform, ir)
+ expression.type.CopyFrom(referred_object.read_transform.type)
+ else:
+ assert False, "Unexpected constant reference type."
+
+
+def _compute_constraints_of_function(expression, ir):
+ """Computes the known constraints of the result of a function."""
+ for arg in expression.function.args:
+ compute_constraints_of_expression(arg, ir)
+ op = expression.function.function
+ if op in (ir_pb2.Function.ADDITION, ir_pb2.Function.SUBTRACTION):
+ _compute_constraints_of_additive_operator(expression)
+ elif op == ir_pb2.Function.MULTIPLICATION:
+ _compute_constraints_of_multiplicative_operator(expression)
+ elif op in (ir_pb2.Function.EQUALITY, ir_pb2.Function.INEQUALITY,
+ ir_pb2.Function.LESS, ir_pb2.Function.LESS_OR_EQUAL,
+ ir_pb2.Function.GREATER, ir_pb2.Function.GREATER_OR_EQUAL,
+ ir_pb2.Function.AND, ir_pb2.Function.OR):
+ _compute_constant_value_of_comparison_operator(expression)
+ elif op == ir_pb2.Function.CHOICE:
+ _compute_constraints_of_choice_operator(expression)
+ elif op == ir_pb2.Function.MAXIMUM:
+ _compute_constraints_of_maximum_function(expression)
+ elif op == ir_pb2.Function.PRESENCE:
+ _compute_constraints_of_existence_function(expression, ir)
+ elif op in (ir_pb2.Function.UPPER_BOUND, ir_pb2.Function.LOWER_BOUND):
+ _compute_constraints_of_bound_function(expression)
+ else:
+ assert False, "Unknown operator {!r}".format(op)
+
+
+def _compute_constraints_of_existence_function(expression, ir):
+ """Computes the constraints of a $has(field) expression."""
+ field_path = expression.function.args[0].field_reference.path[-1]
+ field = ir_util.find_object(field_path, ir)
+ compute_constraints_of_expression(field.existence_condition, ir)
+ expression.type.CopyFrom(field.existence_condition.type)
+
+
+def _compute_constraints_of_field_reference(expression, ir):
+ """Computes the constraints of a reference to a structure's field."""
+ field_path = expression.field_reference.path[-1]
+ field = ir_util.find_object(field_path, ir)
+ if isinstance(field, ir_pb2.Field) and ir_util.field_is_virtual(field):
+ # References to virtual fields should have the virtual field's constraints
+ # copied over.
+ compute_constraints_of_expression(field.read_transform, ir)
+ expression.type.CopyFrom(field.read_transform.type)
+ return
+ # Non-virtual non-integer fields do not (yet) have constraints.
+ if expression.type.WhichOneof("type") == "integer":
+ # TODO(bolms): These lines will need to change when support is added for
+ # fixed-point types.
+ expression.type.integer.modulus = "1"
+ expression.type.integer.modular_value = "0"
+ type_definition = ir_util.find_parent_object(field_path, ir)
+ if isinstance(field, ir_pb2.Field):
+ referrent_type = field.type
+ else:
+ referrent_type = field.physical_type_alias
+ if referrent_type.HasField("size_in_bits"):
+ type_size = ir_util.constant_value(referrent_type.size_in_bits)
+ else:
+ field_size = ir_util.constant_value(field.location.size)
+ if field_size is None:
+ type_size = None
+ else:
+ type_size = field_size * type_definition.addressable_unit
+ assert referrent_type.HasField("atomic_type"), field
+ assert not referrent_type.atomic_type.reference.canonical_name.module_file
+ _set_integer_constraints_from_physical_type(
+ expression, referrent_type, type_size)
+
+
+def _set_integer_constraints_from_physical_type(
+ expression, physical_type, type_size):
+ """Copies the integer constraints of an expression from a physical type."""
+ # SCAFFOLDING HACK: In order to keep changelists manageable, this hardcodes
+ # the ranges for all of the Emboss Prelude integer types. This would break
+ # any user-defined `external` integer types, but that feature isn't fully
+ # implemented in the C++ backend, so it doesn't matter for now.
+ #
+ # Adding the attribute(s) for integer bounds will require new operators:
+ # integer/flooring division, remainder, and exponentiation (2**N, 10**N).
+ #
+ # (Technically, there are a few sets of operators that would work: for
+ # example, just the choice operator `?:` is sufficient, but very ugly.
+ # Bitwise AND, bitshift, and exponentiation would also work, but `10**($bits
+ # >> 2) * 2**($bits & 0b11) - 1` isn't quite as clear as `10**($bits // 4) *
+ # 2**($bits % 4) - 1`, in my (bolms@) opinion.)
+ #
+ # TODO(bolms): Add a scheme for defining integer bounds on user-defined
+ # external types.
+ if type_size is None:
+ # If the type_size is unknown, then we can't actually say anything about the
+ # minimum and maximum values of the type. For UInt, Int, and Bcd, an error
+ # will be thrown during the constraints check stage.
+ expression.type.integer.minimum_value = "-infinity"
+ expression.type.integer.maximum_value = "infinity"
+ return
+ name = tuple(physical_type.atomic_type.reference.canonical_name.object_path)
+ if name == ("UInt",):
+ expression.type.integer.minimum_value = "0"
+ expression.type.integer.maximum_value = str(2**type_size - 1)
+ elif name == ("Int",):
+ expression.type.integer.minimum_value = str(-(2**(type_size - 1)))
+ expression.type.integer.maximum_value = str(2**(type_size - 1) - 1)
+ elif name == ("Bcd",):
+ expression.type.integer.minimum_value = "0"
+ expression.type.integer.maximum_value = str(
+ 10**(type_size // 4) * 2**(type_size % 4) - 1)
+ else:
+ assert False, "Unknown integral type " + ".".join(name)
+
+
+def _compute_constraints_of_parameter(parameter):
+ if parameter.type.WhichOneof("type") == "integer":
+ type_size = ir_util.constant_value(
+ parameter.physical_type_alias.size_in_bits)
+ _set_integer_constraints_from_physical_type(
+ parameter, parameter.physical_type_alias, type_size)
+
+
+def _compute_constraints_of_builtin_value(expression):
+ """Computes the constraints of a builtin (like $static_size_in_bits)."""
+ name = expression.builtin_reference.canonical_name.object_path[0]
+ if name == "$static_size_in_bits":
+ expression.type.integer.modulus = "1"
+ expression.type.integer.modular_value = "0"
+ expression.type.integer.minimum_value = "0"
+ # The maximum theoretically-supported size of something is 2**64 bytes,
+ # which is 2**64 * 8 bits.
+ #
+ # Really, $static_size_in_bits is only valid in expressions that have to be
+ # evaluated at compile time anyway, so it doesn't really matter if the
+ # bounds are excessive.
+ expression.type.integer.maximum_value = "infinity"
+ elif name == "$is_statically_sized":
+ # No bounds on a boolean variable.
+ pass
+ elif name == "$logical_value":
+ # $logical_value is the placeholder used in inferred write-through
+ # transformations.
+ #
+ # Only integers (currently) have "real" write-through transformations, but
+ # fields that would otherwise be straight aliases, but which have a
+ # [requires] attribute, are elevated to write-through fields, so that the
+ # [requires] clause can be checked in Write, CouldWriteValue, TryToWrite,
+ # Read, and Ok.
+ if expression.type.WhichOneof("type") == "integer":
+ assert expression.type.integer.modulus
+ assert expression.type.integer.modular_value
+ assert expression.type.integer.minimum_value
+ assert expression.type.integer.maximum_value
+ elif expression.type.WhichOneof("type") == "enumeration":
+ assert expression.type.enumeration.name
+ elif expression.type.WhichOneof("type") == "boolean":
+ pass
+ else:
+ assert False, "Unexpected type for $logical_value"
+ else:
+ assert False, "Unknown builtin " + name
+
+
+def _compute_constant_value_of_boolean_constant(expression):
+ expression.type.boolean.value = expression.boolean_constant.value
+
+
+def _add(a, b):
+ """Adds a and b, where a and b are ints, "infinity", or "-infinity"."""
+ if a in ("infinity", "-infinity"):
+ a, b = b, a
+ if b == "infinity":
+ assert a != "-infinity"
+ return "infinity"
+ if b == "-infinity":
+ assert a != "infinity"
+ return "-infinity"
+ return int(a) + int(b)
+
+
+def _sub(a, b):
+ """Subtracts b from a, where a and b are ints, "infinity", or "-infinity"."""
+ if b == "infinity":
+ return _add(a, "-infinity")
+ if b == "-infinity":
+ return _add(a, "infinity")
+ return _add(a, -int(b))
+
+
+def _sign(a):
+ """Returns 1 if a > 0, 0 if a == 0, and -1 if a < 0."""
+ if a == "infinity":
+ return 1
+ if a == "-infinity":
+ return -1
+ if int(a) > 0:
+ return 1
+ if int(a) < 0:
+ return -1
+ return 0
+
+
+def _mul(a, b):
+ """Multiplies a and b, where a and b are ints, "infinity", or "-infinity"."""
+ if _is_infinite(a):
+ a, b = b, a
+ if _is_infinite(b):
+ sign = _sign(a) * _sign(b)
+ if sign > 0:
+ return "infinity"
+ if sign < 0:
+ return "-infinity"
+ return 0
+ return int(a) * int(b)
+
+
+def _is_infinite(a):
+ return a in ("infinity", "-infinity")
+
+
+def _max(a):
+ """Returns max of a, where elements are ints, "infinity", or "-infinity"."""
+ if any(n == "infinity" for n in a):
+ return "infinity"
+ if all(n == "-infinity" for n in a):
+ return "-infinity"
+ return max(int(n) for n in a if not _is_infinite(n))
+
+
+def _min(a):
+ """Returns min of a, where elements are ints, "infinity", or "-infinity"."""
+ if any(n == "-infinity" for n in a):
+ return "-infinity"
+ if all(n == "infinity" for n in a):
+ return "infinity"
+ return min(int(n) for n in a if not _is_infinite(n))
+
+
+def _compute_constraints_of_additive_operator(expression):
+ """Computes the modular value of an additive expression."""
+ funcs = {
+ ir_pb2.Function.ADDITION: _add,
+ ir_pb2.Function.SUBTRACTION: _sub,
+ }
+ func = funcs[expression.function.function]
+ args = expression.function.args
+ for arg in args:
+ assert arg.type.integer.modular_value, str(expression)
+ left, right = args
+ unadjusted_modular_value = func(left.type.integer.modular_value,
+ right.type.integer.modular_value)
+ new_modulus = _greatest_common_divisor(left.type.integer.modulus,
+ right.type.integer.modulus)
+ expression.type.integer.modulus = str(new_modulus)
+ if new_modulus == "infinity":
+ expression.type.integer.modular_value = str(unadjusted_modular_value)
+ else:
+ expression.type.integer.modular_value = str(unadjusted_modular_value %
+ new_modulus)
+ lmax = left.type.integer.maximum_value
+ lmin = left.type.integer.minimum_value
+ if expression.function.function == ir_pb2.Function.SUBTRACTION:
+ rmax = right.type.integer.minimum_value
+ rmin = right.type.integer.maximum_value
+ else:
+ rmax = right.type.integer.maximum_value
+ rmin = right.type.integer.minimum_value
+ expression.type.integer.minimum_value = str(func(lmin, rmin))
+ expression.type.integer.maximum_value = str(func(lmax, rmax))
+
+
+def _compute_constraints_of_multiplicative_operator(expression):
+ """Computes the modular value of a multiplicative expression."""
+ bounds = [arg.type.integer for arg in expression.function.args]
+
+ # The minimum and maximum values can come from any of the four pairings of
+ # (left min, left max) with (right min, right max), depending on the signs and
+ # magnitudes of the minima and maxima. E.g.:
+ #
+ # max = left max * right max: [ 2, 3] * [ 2, 3]
+ # max = left min * right min: [-3, -2] * [-3, -2]
+ # max = left max * right min: [-3, -2] * [ 2, 3]
+ # max = left min * right max: [ 2, 3] * [-3, -2]
+ # max = left max * right max: [-2, 3] * [-2, 3]
+ # max = left min * right min: [-3, 2] * [-3, 2]
+ #
+ # For uncorrelated multiplication, the minimum and maximum will always come
+ # from multiplying one extreme by another: if x is nonzero, then
+ #
+ # (y + e) * x > y * x || (y - e) * x > y * x
+ #
+ # for arbitrary nonzero e, so the extrema can only occur when we either cannot
+ # add or cannot subtract e.
+ #
+ # Correlated multiplication (e.g., `x * x`) can have tighter bounds, but
+ # Emboss is not currently trying to be that smart.
+ lmin, lmax = bounds[0].minimum_value, bounds[0].maximum_value
+ rmin, rmax = bounds[1].minimum_value, bounds[1].maximum_value
+ extrema = [_mul(lmax, rmax), _mul(lmin, rmax), #
+ _mul(lmax, rmin), _mul(lmin, rmin)]
+ expression.type.integer.minimum_value = str(_min(extrema))
+ expression.type.integer.maximum_value = str(_max(extrema))
+
+ if all(bound.modulus == "infinity" for bound in bounds):
+ # If both sides are constant, the result is constant.
+ expression.type.integer.modulus = "infinity"
+ expression.type.integer.modular_value = str(int(bounds[0].modular_value) *
+ int(bounds[1].modular_value))
+ return
+
+ if any(bound.modulus == "infinity" for bound in bounds):
+ # If one side is constant and the other is not, then the non-constant
+ # modulus and modular_value can both be multiplied by the constant. E.g.,
+ # if `a` is congruent to 3 mod 5, then `4 * a` will be congruent to 12 mod
+ # 20:
+ #
+ # a = ... | 4 * a = ... | 4 * a mod 20 = ...
+ # 3 | 12 | 12
+ # 8 | 32 | 12
+ # 13 | 52 | 12
+ # 18 | 72 | 12
+ # 23 | 92 | 12
+ # 28 | 112 | 12
+ # 33 | 132 | 12
+ #
+ # This is trivially shown by noting that consecutive possible values for
+ # `4 * a` always differ by 20.
+ if bounds[0].modulus == "infinity":
+ constant, variable = bounds
+ else:
+ variable, constant = bounds
+ if int(constant.modular_value) == 0:
+ # If the constant is 0, the result is 0, no matter what the variable side
+ # is.
+ expression.type.integer.modulus = "infinity"
+ expression.type.integer.modular_value = "0"
+ return
+ new_modulus = int(variable.modulus) * abs(int(constant.modular_value))
+ expression.type.integer.modulus = str(new_modulus)
+ # The `% new_modulus` will force the `modular_value` to be positive, even
+ # when `constant.modular_value` is negative.
+ expression.type.integer.modular_value = str(
+ int(variable.modular_value) * int(constant.modular_value) % new_modulus)
+ return
+
+ # If neither side is constant, then the result is more complex. Full proof is
+ # available in g3doc/modular_congruence_multiplication_proof.md
+ #
+ # Essentially, if:
+ #
+ # l == _ * l_mod + l_mv
+ # r == _ * r_mod + r_mv
+ #
+ # Then we find l_mod0 and r_mod0 in:
+ #
+ # l == (_ * l_mod_nz + l_mv_nz) * l_mod0
+ # r == (_ * r_mod_nz + r_mv_nz) * r_mod0
+ #
+ # And finally conclude:
+ #
+ # l * r == _ * GCD(l_mod_nz, r_mod_nz) * l_mod0 * r_mod0 + l_mv * r_mv
+ product_of_zero_congruence_moduli = 1
+ product_of_modular_values = 1
+ nonzero_congruence_moduli = []
+ for bound in bounds:
+ zero_congruence_modulus = _greatest_common_divisor(bound.modulus,
+ bound.modular_value)
+ assert int(bound.modulus) % zero_congruence_modulus == 0
+ product_of_zero_congruence_moduli *= zero_congruence_modulus
+ product_of_modular_values *= int(bound.modular_value)
+ nonzero_congruence_moduli.append(int(bound.modulus) //
+ zero_congruence_modulus)
+ shared_nonzero_congruence_modulus = _greatest_common_divisor(
+ nonzero_congruence_moduli[0], nonzero_congruence_moduli[1])
+ final_modulus = (shared_nonzero_congruence_modulus *
+ product_of_zero_congruence_moduli)
+ expression.type.integer.modulus = str(final_modulus)
+ expression.type.integer.modular_value = str(product_of_modular_values %
+ final_modulus)
+
+
+def _assert_integer_constraints(expression):
+ """Asserts that the integer bounds of expression are self-consistent.
+
+ Asserts that `minimum_value` and `maximum_value` are congruent to
+ `modular_value` modulo `modulus`.
+
+ If `modulus` is "infinity", asserts that `minimum_value`, `maximum_value`, and
+ `modular_value` are all equal.
+
+ If `minimum_value` is equal to `maximum_value`, asserts that `modular_value`
+ is equal to both, and that `modulus` is "infinity".
+
+ Arguments:
+ expression: an expression with type.integer
+
+ Returns:
+ None
+ """
+ bounds = expression.type.integer
+ if bounds.modulus == "infinity":
+ assert bounds.minimum_value == bounds.modular_value
+ assert bounds.maximum_value == bounds.modular_value
+ return
+ modulus = int(bounds.modulus)
+ assert modulus > 0
+ if bounds.minimum_value != "-infinity":
+ assert int(bounds.minimum_value) % modulus == int(bounds.modular_value)
+ if bounds.maximum_value != "infinity":
+ assert int(bounds.maximum_value) % modulus == int(bounds.modular_value)
+ if bounds.minimum_value == bounds.maximum_value:
+ # TODO(bolms): I believe there are situations using the not-yet-implemented
+ # integer division operator that would trigger these asserts, so they should
+ # be turned into assignments (with corresponding tests) when implementing
+ # division.
+ assert bounds.modular_value == bounds.minimum_value
+ assert bounds.modulus == "infinity"
+ if bounds.minimum_value != "-infinity" and bounds.maximum_value != "infinity":
+ assert int(bounds.minimum_value) <= int(bounds.maximum_value)
+
+
+def _compute_constant_value_of_comparison_operator(expression):
+ """Computes the constant value, if any, of a comparison operator."""
+ args = expression.function.args
+ if all(ir_util.is_constant(arg) for arg in args):
+ functions = {
+ ir_pb2.Function.EQUALITY: operator.eq,
+ ir_pb2.Function.INEQUALITY: operator.ne,
+ ir_pb2.Function.LESS: operator.lt,
+ ir_pb2.Function.LESS_OR_EQUAL: operator.le,
+ ir_pb2.Function.GREATER: operator.gt,
+ ir_pb2.Function.GREATER_OR_EQUAL: operator.ge,
+ ir_pb2.Function.AND: operator.and_,
+ ir_pb2.Function.OR: operator.or_,
+ }
+ func = functions[expression.function.function]
+ expression.type.boolean.value = func(
+ *[ir_util.constant_value(arg) for arg in args])
+
+
+def _compute_constraints_of_bound_function(expression):
+ """Computes the constraints of $upper_bound or $lower_bound."""
+ if expression.function.function == ir_pb2.Function.UPPER_BOUND:
+ value = expression.function.args[0].type.integer.maximum_value
+ elif expression.function.function == ir_pb2.Function.LOWER_BOUND:
+ value = expression.function.args[0].type.integer.minimum_value
+ else:
+ assert False, "Non-bound function"
+ expression.type.integer.minimum_value = value
+ expression.type.integer.maximum_value = value
+ expression.type.integer.modular_value = value
+ expression.type.integer.modulus = "infinity"
+
+
+def _compute_constraints_of_maximum_function(expression):
+ """Computes the constraints of the $max function."""
+ assert expression.type.WhichOneof("type") == "integer"
+ args = expression.function.args
+ assert args[0].type.WhichOneof("type") == "integer"
+ # The minimum value of the result occurs when every argument takes its minimum
+ # value, which means that the minimum result is the maximum-of-minimums.
+ expression.type.integer.minimum_value = str(_max(
+ [arg.type.integer.minimum_value for arg in args]))
+ # The maximum result is the maximum-of-maximums.
+ expression.type.integer.maximum_value = str(_max(
+ [arg.type.integer.maximum_value for arg in args]))
+ # If the expression is dominated by a constant factor, then the result is
+ # constant. I (bolms@) believe this is the only case where
+ # _compute_constraints_of_maximum_function might violate the assertions in
+ # _assert_integer_constraints.
+ if (expression.type.integer.minimum_value ==
+ expression.type.integer.maximum_value):
+ expression.type.integer.modular_value = (
+ expression.type.integer.minimum_value)
+ expression.type.integer.modulus = "infinity"
+ return
+ result_modulus = args[0].type.integer.modulus
+ result_modular_value = args[0].type.integer.modular_value
+ # The result of $max(a, b) could be either a or b, which means that the result
+ # of $max(a, b) uses the _shared_modular_value() of a and b, just like the
+ # choice operator '?:'.
+ #
+ # This also takes advantage of the fact that $max(a, b, c, d, ...) is
+ # equivalent to $max(a, $max(b, $max(c, $max(d, ...)))), so it is valid to
+ # call _shared_modular_value() in a loop.
+ for arg in args[1:]:
+ # TODO(bolms): I think the bounds could be tighter in some cases where
+ # arg.maximum_value is less than the new expression.minimum_value, and
+ # in some very specific cases where arg.maximum_value is greater than the
+ # new expression.minimum_value, but arg.maximum_value - arg.modulus is less
+ # than expression.minimum_value.
+ result_modulus, result_modular_value = _shared_modular_value(
+ (result_modulus, result_modular_value),
+ (arg.type.integer.modulus, arg.type.integer.modular_value))
+ expression.type.integer.modulus = str(result_modulus)
+ expression.type.integer.modular_value = str(result_modular_value)
+
+
+def _shared_modular_value(left, right):
+ """Returns the shared modulus and modular value of left and right.
+
+ Arguments:
+ left: A tuple of (modulus, modular value)
+ right: A tuple of (modulus, modular value)
+
+ Returns:
+ A tuple of (modulus, modular_value) such that:
+
+ left.modulus % result.modulus == 0
+ right.modulus % result.modulus == 0
+ left.modular_value % result.modulus == result.modular_value
+ right.modular_value % result.modulus == result.modular_value
+
+ That is, the result.modulus and result.modular_value will be compatible
+ with, but (possibly) less restrictive than both left.(modulus,
+ modular_value) and right.(modulus, modular_value).
+ """
+ left_modulus, left_modular_value = left
+ right_modulus, right_modular_value = right
+ # The combined modulus is gcd(gcd(left_modulus, right_modulus),
+ # left_modular_value - right_modular_value).
+ #
+ # The inner gcd normalizes the left_modulus and right_modulus, but can leave
+ # incompatible modular_values. The outer gcd finds a modulus to which both
+ # modular_values are congruent. Some examples:
+ #
+ # left | right | res
+ # --------------+----------------+--------------------
+ # l % 12 == 7 | r % 20 == 15 | res % 4 == 3
+ # l == 35 | r % 20 == 15 | res % 20 == 15
+ # l % 24 == 15 | r % 12 == 7 | res % 4 == 3
+ # l % 20 == 15 | r % 20 == 10 | res % 5 == 0
+ # l % 20 == 16 | r % 20 == 11 | res % 5 == 1
+ # l == 10 | r == 7 | res % 3 == 1
+ # l == 4 | r == 4 | res == 4
+ #
+ # The cases where one side or the other is constant are handled
+ # automatically by the fact that _greatest_common_divisor("infinity", x)
+ # is x.
+ common_modulus = _greatest_common_divisor(left_modulus, right_modulus)
+ new_modulus = _greatest_common_divisor(
+ common_modulus, abs(int(left_modular_value) - int(right_modular_value)))
+ if new_modulus == "infinity":
+ # The only way for the new_modulus to come out as "infinity" *should* be
+ # if both left and right have the same constant value.
+ assert left_modular_value == right_modular_value
+ assert left_modulus == right_modulus == "infinity"
+ return new_modulus, left_modular_value
+ else:
+ assert (int(left_modular_value) % new_modulus ==
+ int(right_modular_value) % new_modulus)
+ return new_modulus, int(left_modular_value) % new_modulus
+
+
+def _compute_constraints_of_choice_operator(expression):
+ """Computes the constraints of a choice operation '?:'."""
+ condition, if_true, if_false = expression.function.args
+ if condition.type.boolean.HasField("value"):
+ # The generated expressions for $size_in_bits and $size_in_bytes look like
+ #
+ # $max((field1_existence_condition ? field1_start + field1_size : 0),
+ # (field2_existence_condition ? field2_start + field2_size : 0),
+ # (field3_existence_condition ? field3_start + field3_size : 0),
+ # ...)
+ #
+ # Since most existence_conditions are just "true", it is important to select
+ # the tighter bounds in those cases -- otherwise, only zero-length
+ # structures could have a constant $size_in_bits or $size_in_bytes.
+ side = if_true if condition.type.boolean.value else if_false
+ expression.type.CopyFrom(side.type)
+ return
+ # The type.integer minimum_value/maximum_value bounding code is needed since
+ # constraints.check_constraints() will complain if minimum and maximum are not
+ # set correctly. I'm (bolms@) not sure if the modulus/modular_value pulls its
+ # weight, but for completeness I've left it in.
+ if if_true.type.WhichOneof("type") == "integer":
+ # The minimum value of the choice is the minimum value of either side, and
+ # the maximum is the maximum value of either side.
+ expression.type.integer.minimum_value = str(_min([
+ if_true.type.integer.minimum_value,
+ if_false.type.integer.minimum_value]))
+ expression.type.integer.maximum_value = str(_max([
+ if_true.type.integer.maximum_value,
+ if_false.type.integer.maximum_value]))
+ new_modulus, new_modular_value = _shared_modular_value(
+ (if_true.type.integer.modulus, if_true.type.integer.modular_value),
+ (if_false.type.integer.modulus, if_false.type.integer.modular_value))
+ expression.type.integer.modulus = str(new_modulus)
+ expression.type.integer.modular_value = str(new_modular_value)
+ else:
+ assert if_true.type.WhichOneof("type") in ("boolean", "enumeration"), (
+ "Unknown type {} for expression".format(
+ if_true.type.WhichOneof("type")))
+
+
+def _greatest_common_divisor(a, b):
+ """Returns the greatest common divisor of a and b.
+
+ Arguments:
+ a: an integer, a stringified integer, or the string "infinity"
+ b: an integer, a stringified integer, or the string "infinity"
+
+ Returns:
+ Conceptually, "infinity" is treated as the product of all integers.
+
+ If both a and b are 0, returns "infinity".
+
+ Otherwise, if either a or b is "infinity", and the other is 0, returns
+ "infinity".
+
+ Otherwise, if either a or b is "infinity", returns the other.
+
+ Otherwise, returns the greatest common divisor of a and b.
+ """
+ if a != "infinity": a = int(a)
+ if b != "infinity": b = int(b)
+ assert a == "infinity" or a >= 0
+ assert b == "infinity" or b >= 0
+ if a == b == 0: return "infinity"
+ # GCD(0, x) is always x, so it's safe to shortcut when a == 0 or b == 0.
+ if a == 0: return b
+ if b == 0: return a
+ if a == "infinity": return b
+ if b == "infinity": return a
+ return fractions.gcd(a, b)
+
+
+def compute_constants(ir):
+ """Computes constant values for all expressions in ir.
+
+ compute_constants calculates all constant values and adds them to the type
+ information for each expression and subexpression.
+
+ Arguments:
+ ir: an IR on which to compute constants
+
+ Returns:
+ A list of errors (currently always empty).
+ """
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Expression], compute_constraints_of_expression,
+ skip_descendants_of={ir_pb2.Expression})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.RuntimeParameter], _compute_constraints_of_parameter,
+ skip_descendants_of={ir_pb2.Expression})
+ return []
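
The "infinity"-aware GCD above, and the way it could feed the modulus of an if/else choice, can be sketched in standalone form. This is only an illustration under stated assumptions: `gcd_with_infinity` and `merge_congruences` are hypothetical names, `math.gcd` is swapped in for `fractions.gcd` (which was removed in Python 3.9), and the merge rule (GCD of both moduli and of the difference of the modular values) is one plausible approach, not necessarily what `_shared_modular_value` actually does.

```python
import math

INFINITY = "infinity"  # Hypothetical constant name, used only in this sketch.


def gcd_with_infinity(a, b):
    """GCD where "infinity" is treated as the product of all integers."""
    if a != INFINITY:
        a = int(a)
    if b != INFINITY:
        b = int(b)
    assert a == INFINITY or a >= 0
    assert b == INFINITY or b >= 0
    if a == 0 and b == 0:
        return INFINITY
    # GCD(0, x) is x, including when x is "infinity".
    if a == 0:
        return b
    if b == 0:
        return a
    # Every positive integer divides "infinity", so the other argument wins.
    if a == INFINITY:
        return b
    if b == INFINITY:
        return a
    return math.gcd(a, b)


def merge_congruences(left, right):
    """Returns one (modulus, modular_value) pair that covers both inputs.

    Each argument is a (modulus, modular_value) pair; a constant has the
    modulus "infinity" and its own value as modular_value.  This is an
    assumed merge rule for illustration only.
    """
    (m1, v1), (m2, v2) = left, right
    v1, v2 = int(v1), int(v2)
    modulus = gcd_with_infinity(gcd_with_infinity(m1, m2), abs(v1 - v2))
    if modulus == INFINITY:
        # Both sides are the same constant.
        return INFINITY, v1
    return modulus, v1 % modulus


if __name__ == "__main__":
    print(gcd_with_infinity("infinity", "12"))
    print(merge_congruences((12, 7), (20, 15)))
```

Under this rule, choosing between two different constants such as 5 and 9 yields modulus 4 and modular value 1, while choosing between two copies of the same constant keeps the modulus at "infinity".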
diff --git a/front_end/expression_bounds_test.py b/front_end/expression_bounds_test.py
new file mode 100644
index 0000000..64f1530
--- /dev/null
+++ b/front_end/expression_bounds_test.py
@@ -0,0 +1,1126 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for expression_bounds."""
+
+import unittest
+from front_end import expression_bounds
+from front_end import glue
+from front_end import test_util
+
+
+class ComputeConstantsTest(unittest.TestCase):
+
+ def _make_ir(self, emb_text):
+ ir, unused_debug_info, errors = glue.parse_emboss_file(
+ "m.emb",
+ test_util.dict_file_reader({"m.emb": emb_text}),
+ stop_before_step="compute_constants")
+ assert not errors, errors
+ return ir
+
+ def test_constant_integer(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 10 [+1] UInt x\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ start = ir.module[0].type[0].structure.field[0].location.start
+ self.assertEqual("10", start.type.integer.minimum_value)
+ self.assertEqual("10", start.type.integer.maximum_value)
+ self.assertEqual("10", start.type.integer.modular_value)
+ self.assertEqual("infinity", start.type.integer.modulus)
+
+ def test_boolean_constant(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if true:\n"
+ " 0 [+1] UInt x\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ expression = ir.module[0].type[0].structure.field[0].existence_condition
+ self.assertTrue(expression.type.boolean.HasField("value"))
+ self.assertTrue(expression.type.boolean.value)
+
+ def test_constant_equality(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if 5 == 5:\n"
+ " 0 [+1] UInt x\n"
+ " if 5 == 6:\n"
+ " 0 [+1] UInt y\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ structure = ir.module[0].type[0].structure
+ true_condition = structure.field[0].existence_condition
+ false_condition = structure.field[1].existence_condition
+ self.assertTrue(true_condition.type.boolean.HasField("value"))
+ self.assertTrue(true_condition.type.boolean.value)
+ self.assertTrue(false_condition.type.boolean.HasField("value"))
+ self.assertFalse(false_condition.type.boolean.value)
+
+ def test_constant_inequality(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if 5 != 5:\n"
+ " 0 [+1] UInt x\n"
+ " if 5 != 6:\n"
+ " 0 [+1] UInt y\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ structure = ir.module[0].type[0].structure
+ false_condition = structure.field[0].existence_condition
+ true_condition = structure.field[1].existence_condition
+ self.assertTrue(false_condition.type.boolean.HasField("value"))
+ self.assertFalse(false_condition.type.boolean.value)
+ self.assertTrue(true_condition.type.boolean.HasField("value"))
+ self.assertTrue(true_condition.type.boolean.value)
+
+ def test_constant_less_than(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if 5 < 4:\n"
+ " 0 [+1] UInt x\n"
+ " if 5 < 5:\n"
+ " 0 [+1] UInt y\n"
+ " if 5 < 6:\n"
+ " 0 [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ structure = ir.module[0].type[0].structure
+ greater_than_condition = structure.field[0].existence_condition
+ equal_condition = structure.field[1].existence_condition
+ less_than_condition = structure.field[2].existence_condition
+ self.assertTrue(greater_than_condition.type.boolean.HasField("value"))
+ self.assertFalse(greater_than_condition.type.boolean.value)
+ self.assertTrue(equal_condition.type.boolean.HasField("value"))
+ self.assertFalse(equal_condition.type.boolean.value)
+ self.assertTrue(less_than_condition.type.boolean.HasField("value"))
+ self.assertTrue(less_than_condition.type.boolean.value)
+
+ def test_constant_less_than_or_equal(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if 5 <= 4:\n"
+ " 0 [+1] UInt x\n"
+ " if 5 <= 5:\n"
+ " 0 [+1] UInt y\n"
+ " if 5 <= 6:\n"
+ " 0 [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ structure = ir.module[0].type[0].structure
+ greater_than_condition = structure.field[0].existence_condition
+ equal_condition = structure.field[1].existence_condition
+ less_than_condition = structure.field[2].existence_condition
+ self.assertTrue(greater_than_condition.type.boolean.HasField("value"))
+ self.assertFalse(greater_than_condition.type.boolean.value)
+ self.assertTrue(equal_condition.type.boolean.HasField("value"))
+ self.assertTrue(equal_condition.type.boolean.value)
+ self.assertTrue(less_than_condition.type.boolean.HasField("value"))
+ self.assertTrue(less_than_condition.type.boolean.value)
+
+ def test_constant_greater_than(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if 5 > 4:\n"
+ " 0 [+1] UInt x\n"
+ " if 5 > 5:\n"
+ " 0 [+1] UInt y\n"
+ " if 5 > 6:\n"
+ " 0 [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ structure = ir.module[0].type[0].structure
+ greater_than_condition = structure.field[0].existence_condition
+ equal_condition = structure.field[1].existence_condition
+ less_than_condition = structure.field[2].existence_condition
+ self.assertTrue(greater_than_condition.type.boolean.HasField("value"))
+ self.assertTrue(greater_than_condition.type.boolean.value)
+ self.assertTrue(equal_condition.type.boolean.HasField("value"))
+ self.assertFalse(equal_condition.type.boolean.value)
+ self.assertTrue(less_than_condition.type.boolean.HasField("value"))
+ self.assertFalse(less_than_condition.type.boolean.value)
+
+ def test_constant_greater_than_or_equal(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if 5 >= 4:\n"
+ " 0 [+1] UInt x\n"
+ " if 5 >= 5:\n"
+ " 0 [+1] UInt y\n"
+ " if 5 >= 6:\n"
+ " 0 [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ structure = ir.module[0].type[0].structure
+ greater_than_condition = structure.field[0].existence_condition
+ equal_condition = structure.field[1].existence_condition
+ less_than_condition = structure.field[2].existence_condition
+ self.assertTrue(greater_than_condition.type.boolean.HasField("value"))
+ self.assertTrue(greater_than_condition.type.boolean.value)
+ self.assertTrue(equal_condition.type.boolean.HasField("value"))
+ self.assertTrue(equal_condition.type.boolean.value)
+ self.assertTrue(less_than_condition.type.boolean.HasField("value"))
+ self.assertFalse(less_than_condition.type.boolean.value)
+
+ def test_constant_and(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if false && false:\n"
+ " 0 [+1] UInt x\n"
+ " if true && false:\n"
+ " 0 [+1] UInt y\n"
+ " if false && true:\n"
+ " 0 [+1] UInt z\n"
+ " if true && true:\n"
+ " 0 [+1] UInt w\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ structure = ir.module[0].type[0].structure
+ false_false_condition = structure.field[0].existence_condition
+ true_false_condition = structure.field[1].existence_condition
+ false_true_condition = structure.field[2].existence_condition
+ true_true_condition = structure.field[3].existence_condition
+ self.assertTrue(false_false_condition.type.boolean.HasField("value"))
+ self.assertFalse(false_false_condition.type.boolean.value)
+ self.assertTrue(true_false_condition.type.boolean.HasField("value"))
+ self.assertFalse(true_false_condition.type.boolean.value)
+ self.assertTrue(false_true_condition.type.boolean.HasField("value"))
+ self.assertFalse(false_true_condition.type.boolean.value)
+ self.assertTrue(true_true_condition.type.boolean.HasField("value"))
+ self.assertTrue(true_true_condition.type.boolean.value)
+
+ def test_constant_or(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if false || false:\n"
+ " 0 [+1] UInt x\n"
+ " if true || false:\n"
+ " 0 [+1] UInt y\n"
+ " if false || true:\n"
+ " 0 [+1] UInt z\n"
+ " if true || true:\n"
+ " 0 [+1] UInt w\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ structure = ir.module[0].type[0].structure
+ false_false_condition = structure.field[0].existence_condition
+ true_false_condition = structure.field[1].existence_condition
+ false_true_condition = structure.field[2].existence_condition
+ true_true_condition = structure.field[3].existence_condition
+ self.assertTrue(false_false_condition.type.boolean.HasField("value"))
+ self.assertFalse(false_false_condition.type.boolean.value)
+ self.assertTrue(true_false_condition.type.boolean.HasField("value"))
+ self.assertTrue(true_false_condition.type.boolean.value)
+ self.assertTrue(false_true_condition.type.boolean.HasField("value"))
+ self.assertTrue(false_true_condition.type.boolean.value)
+ self.assertTrue(true_true_condition.type.boolean.HasField("value"))
+ self.assertTrue(true_true_condition.type.boolean.value)
+
+ def test_enum_constant(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if Bar.QUX == Bar.QUX:\n"
+ " 0 [+1] Bar x\n"
+ "enum Bar:\n"
+ " QUX = 12\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ condition = ir.module[0].type[0].structure.field[0].existence_condition
+ left = condition.function.args[0]
+ self.assertEqual("12", left.type.enumeration.value)
+
+ def test_non_constant_field_reference(self):
+ ir = self._make_ir("struct Foo:\n"
+ " y [+1] UInt x\n"
+ " 0 [+1] UInt y\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ start = ir.module[0].type[0].structure.field[0].location.start
+ self.assertEqual("0", start.type.integer.minimum_value)
+ self.assertEqual("255", start.type.integer.maximum_value)
+ self.assertEqual("0", start.type.integer.modular_value)
+ self.assertEqual("1", start.type.integer.modulus)
+
+ def test_field_reference_bounds_are_uncomputable(self):
+ # Variable-sized UInt/Int/Bcd should not cause an error here: they are
+ # handled in the constraints pass.
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 0 [+x] UInt y\n"
+ " y [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+
+ def test_field_references_references_bounds_are_uncomputable(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 0 [+x] UInt y\n"
+ " 0 [+y] UInt z\n"
+ " z [+1] UInt q\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+
+ def test_non_constant_equality(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if 5 == y:\n"
+ " 0 [+1] UInt x\n"
+ " 0 [+1] UInt y\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ structure = ir.module[0].type[0].structure
+ condition = structure.field[0].existence_condition
+ self.assertFalse(condition.type.boolean.HasField("value"))
+
+ def test_constant_addition(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 7+5 [+1] UInt x\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ start = ir.module[0].type[0].structure.field[0].location.start
+ self.assertEqual("12", start.type.integer.minimum_value)
+ self.assertEqual("12", start.type.integer.maximum_value)
+ self.assertEqual("12", start.type.integer.modular_value)
+ self.assertEqual("infinity", start.type.integer.modulus)
+ self.assertEqual("7", start.function.args[0].type.integer.minimum_value)
+ self.assertEqual("7", start.function.args[0].type.integer.maximum_value)
+ self.assertEqual("7", start.function.args[0].type.integer.modular_value)
+ self.assertEqual("infinity", start.function.args[0].type.integer.modulus)
+ self.assertEqual("5", start.function.args[1].type.integer.minimum_value)
+ self.assertEqual("5", start.function.args[1].type.integer.maximum_value)
+ self.assertEqual("5", start.function.args[1].type.integer.modular_value)
+ self.assertEqual("infinity", start.function.args[1].type.integer.modulus)
+
+ def test_constant_subtraction(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 7-5 [+1] UInt x\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ start = ir.module[0].type[0].structure.field[0].location.start
+ self.assertEqual("2", start.type.integer.minimum_value)
+ self.assertEqual("2", start.type.integer.maximum_value)
+ self.assertEqual("2", start.type.integer.modular_value)
+ self.assertEqual("infinity", start.type.integer.modulus)
+ self.assertEqual("7", start.function.args[0].type.integer.minimum_value)
+ self.assertEqual("7", start.function.args[0].type.integer.maximum_value)
+ self.assertEqual("7", start.function.args[0].type.integer.modular_value)
+ self.assertEqual("infinity", start.function.args[0].type.integer.modulus)
+ self.assertEqual("5", start.function.args[1].type.integer.minimum_value)
+ self.assertEqual("5", start.function.args[1].type.integer.maximum_value)
+ self.assertEqual("5", start.function.args[1].type.integer.modular_value)
+ self.assertEqual("infinity", start.function.args[1].type.integer.modulus)
+
+ def test_constant_multiplication(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 7*5 [+1] UInt x\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ start = ir.module[0].type[0].structure.field[0].location.start
+ self.assertEqual("35", start.type.integer.minimum_value)
+ self.assertEqual("35", start.type.integer.maximum_value)
+ self.assertEqual("35", start.type.integer.modular_value)
+ self.assertEqual("infinity", start.type.integer.modulus)
+ self.assertEqual("7", start.function.args[0].type.integer.minimum_value)
+ self.assertEqual("7", start.function.args[0].type.integer.maximum_value)
+ self.assertEqual("7", start.function.args[0].type.integer.modular_value)
+ self.assertEqual("infinity", start.function.args[0].type.integer.modulus)
+ self.assertEqual("5", start.function.args[1].type.integer.minimum_value)
+ self.assertEqual("5", start.function.args[1].type.integer.maximum_value)
+ self.assertEqual("5", start.function.args[1].type.integer.modular_value)
+ self.assertEqual("infinity", start.function.args[1].type.integer.modulus)
+
+ def test_nested_constant_expression(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if 7*(3+1) == 28:\n"
+ " 0 [+1] UInt x\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ condition = ir.module[0].type[0].structure.field[0].existence_condition
+ self.assertTrue(condition.type.boolean.value)
+ condition_left = condition.function.args[0]
+ self.assertEqual("28", condition_left.type.integer.minimum_value)
+ self.assertEqual("28", condition_left.type.integer.maximum_value)
+ self.assertEqual("28", condition_left.type.integer.modular_value)
+ self.assertEqual("infinity", condition_left.type.integer.modulus)
+ condition_left_left = condition_left.function.args[0]
+ self.assertEqual("7", condition_left_left.type.integer.minimum_value)
+ self.assertEqual("7", condition_left_left.type.integer.maximum_value)
+ self.assertEqual("7", condition_left_left.type.integer.modular_value)
+ self.assertEqual("infinity", condition_left_left.type.integer.modulus)
+ condition_left_right = condition_left.function.args[1]
+ self.assertEqual("4", condition_left_right.type.integer.minimum_value)
+ self.assertEqual("4", condition_left_right.type.integer.maximum_value)
+ self.assertEqual("4", condition_left_right.type.integer.modular_value)
+ self.assertEqual("infinity", condition_left_right.type.integer.modulus)
+ condition_left_right_left = condition_left_right.function.args[0]
+ self.assertEqual("3", condition_left_right_left.type.integer.minimum_value)
+ self.assertEqual("3", condition_left_right_left.type.integer.maximum_value)
+ self.assertEqual("3", condition_left_right_left.type.integer.modular_value)
+ self.assertEqual("infinity", condition_left_right_left.type.integer.modulus)
+ condition_left_right_right = condition_left_right.function.args[1]
+ self.assertEqual("1", condition_left_right_right.type.integer.minimum_value)
+ self.assertEqual("1", condition_left_right_right.type.integer.maximum_value)
+ self.assertEqual("1", condition_left_right_right.type.integer.modular_value)
+ self.assertEqual("infinity",
+ condition_left_right_right.type.integer.modulus)
+
+ def test_constant_plus_non_constant(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 5+(4*x) [+1] UInt y\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ y_start = ir.module[0].type[0].structure.field[1].location.start
+ self.assertEqual("4", y_start.type.integer.modulus)
+ self.assertEqual("1", y_start.type.integer.modular_value)
+ self.assertEqual("5", y_start.type.integer.minimum_value)
+ self.assertEqual("1025", y_start.type.integer.maximum_value)
+
+ def test_constant_minus_non_constant(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 5-(4*x) [+1] UInt y\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ y_start = ir.module[0].type[0].structure.field[1].location.start
+ self.assertEqual("4", y_start.type.integer.modulus)
+ self.assertEqual("1", y_start.type.integer.modular_value)
+ self.assertEqual("-1015", y_start.type.integer.minimum_value)
+ self.assertEqual("5", y_start.type.integer.maximum_value)
+
+ def test_non_constant_minus_constant(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " (4*x)-5 [+1] UInt y\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ y_start = ir.module[0].type[0].structure.field[1].location.start
+ self.assertEqual(str((4 * 0) - 5), y_start.type.integer.minimum_value)
+ self.assertEqual(str((4 * 255) - 5), y_start.type.integer.maximum_value)
+ self.assertEqual("4", y_start.type.integer.modulus)
+ self.assertEqual("3", y_start.type.integer.modular_value)
+
+ def test_non_constant_plus_non_constant(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] UInt y\n"
+ " (4*x)+(6*y+3) [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ z_start = ir.module[0].type[0].structure.field[2].location.start
+ self.assertEqual("3", z_start.type.integer.minimum_value)
+ self.assertEqual(str(4 * 255 + 6 * 255 + 3),
+ z_start.type.integer.maximum_value)
+ self.assertEqual("2", z_start.type.integer.modulus)
+ self.assertEqual("1", z_start.type.integer.modular_value)
+
+ def test_non_constant_minus_non_constant(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] UInt y\n"
+ " (x*3)-(y*3) [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ z_start = ir.module[0].type[0].structure.field[2].location.start
+ self.assertEqual("3", z_start.type.integer.modulus)
+ self.assertEqual("0", z_start.type.integer.modular_value)
+ self.assertEqual(str(-3 * 255), z_start.type.integer.minimum_value)
+ self.assertEqual(str(3 * 255), z_start.type.integer.maximum_value)
+
+ def test_non_constant_times_constant(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " (4*x+1)*5 [+1] UInt y\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ y_start = ir.module[0].type[0].structure.field[1].location.start
+ self.assertEqual("20", y_start.type.integer.modulus)
+ self.assertEqual("5", y_start.type.integer.modular_value)
+ self.assertEqual("5", y_start.type.integer.minimum_value)
+ self.assertEqual(str((4 * 255 + 1) * 5), y_start.type.integer.maximum_value)
+
+ def test_non_constant_times_negative_constant(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " (4*x+1)*-5 [+1] UInt y\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ y_start = ir.module[0].type[0].structure.field[1].location.start
+ self.assertEqual("20", y_start.type.integer.modulus)
+ self.assertEqual("15", y_start.type.integer.modular_value)
+ self.assertEqual(str((4 * 255 + 1) * -5),
+ y_start.type.integer.minimum_value)
+ self.assertEqual("-5", y_start.type.integer.maximum_value)
+
+ def test_non_constant_times_zero(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " (4*x+1)*0 [+1] UInt y\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ y_start = ir.module[0].type[0].structure.field[1].location.start
+ self.assertEqual("infinity", y_start.type.integer.modulus)
+ self.assertEqual("0", y_start.type.integer.modular_value)
+ self.assertEqual("0", y_start.type.integer.minimum_value)
+ self.assertEqual("0", y_start.type.integer.maximum_value)
+
+ def test_non_constant_times_non_constant_shared_modulus(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] UInt y\n"
+ " (4*x+3)*(4*y+3) [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ z_start = ir.module[0].type[0].structure.field[2].location.start
+ self.assertEqual("4", z_start.type.integer.modulus)
+ self.assertEqual("1", z_start.type.integer.modular_value)
+ self.assertEqual("9", z_start.type.integer.minimum_value)
+ self.assertEqual(str((4 * 255 + 3)**2), z_start.type.integer.maximum_value)
+
+ def test_non_constant_times_non_constant_congruent_to_zero(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] UInt y\n"
+ " (4*x)*(4*y) [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ z_start = ir.module[0].type[0].structure.field[2].location.start
+ self.assertEqual("16", z_start.type.integer.modulus)
+ self.assertEqual("0", z_start.type.integer.modular_value)
+ self.assertEqual("0", z_start.type.integer.minimum_value)
+ self.assertEqual(str((4 * 255)**2), z_start.type.integer.maximum_value)
+
+ def test_non_constant_times_non_constant_partially_shared_modulus(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] UInt y\n"
+ " (4*x+3)*(8*y+3) [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ z_start = ir.module[0].type[0].structure.field[2].location.start
+ self.assertEqual("4", z_start.type.integer.modulus)
+ self.assertEqual("1", z_start.type.integer.modular_value)
+ self.assertEqual("9", z_start.type.integer.minimum_value)
+ self.assertEqual(str((4 * 255 + 3) * (8 * 255 + 3)),
+ z_start.type.integer.maximum_value)
+
+ def test_non_constant_times_non_constant_full_complexity(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] UInt y\n"
+ " (12*x+9)*(40*y+15) [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ z_start = ir.module[0].type[0].structure.field[2].location.start
+ self.assertEqual("60", z_start.type.integer.modulus)
+ self.assertEqual("15", z_start.type.integer.modular_value)
+ self.assertEqual(str(9 * 15), z_start.type.integer.minimum_value)
+ self.assertEqual(str((12 * 255 + 9) * (40 * 255 + 15)),
+ z_start.type.integer.maximum_value)
+
+ def test_signed_non_constant_times_signed_non_constant_full_complexity(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] Int x\n"
+ " 1 [+1] Int y\n"
+ " (12*x+9)*(40*y+15) [+1] Int z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ z_start = ir.module[0].type[0].structure.field[2].location.start
+ self.assertEqual("60", z_start.type.integer.modulus)
+ self.assertEqual("15", z_start.type.integer.modular_value)
+ # Max x/min y is slightly lower than min x/max y (-7825965 vs -7780065).
+ self.assertEqual(str((12 * 127 + 9) * (40 * -128 + 15)),
+ z_start.type.integer.minimum_value)
+ # Max x/max y is slightly higher than min x/min y (7810635 vs 7795335).
+ self.assertEqual(str((12 * 127 + 9) * (40 * 127 + 15)),
+ z_start.type.integer.maximum_value)
+
+ def test_non_constant_times_non_constant_flipped_min_max(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] UInt y\n"
+ " (-x*3)*(y*3) [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ z_start = ir.module[0].type[0].structure.field[2].location.start
+ self.assertEqual("9", z_start.type.integer.modulus)
+ self.assertEqual("0", z_start.type.integer.modular_value)
+ self.assertEqual(str(-((3 * 255)**2)), z_start.type.integer.minimum_value)
+ self.assertEqual("0", z_start.type.integer.maximum_value)
+
+ # Currently, only `$static_size_in_bits` has an infinite bound, so all of the
+ # examples below use `$static_size_in_bits`. Unfortunately, this also means
+ # that these tests rely on the fact that Emboss doesn't try to do any term
+ # rewriting or smart correlation between the arguments of various operators:
+ # for example, several tests rely on `$static_size_in_bits -
+ # $static_size_in_bits` having the range `-infinity` to `infinity`, when a
+ # trivial term rewrite would turn that expression into `0`.
+ #
+ # Unbounded expressions are only allowed at compile-time anyway, so these
+ # tests cover some fairly unlikely uses of the Emboss expression language.
+ def test_unbounded_plus_constant(self):
+ ir = self._make_ir("external Foo:\n"
+ " [requires: $static_size_in_bits + 2 > 0]\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ expr = ir.module[0].type[0].attribute[0].value.expression.function.args[0]
+ self.assertEqual("1", expr.type.integer.modulus)
+ self.assertEqual("0", expr.type.integer.modular_value)
+ self.assertEqual("2", expr.type.integer.minimum_value)
+ self.assertEqual("infinity", expr.type.integer.maximum_value)
+
+ def test_negative_unbounded_plus_constant(self):
+ ir = self._make_ir("external Foo:\n"
+ " [requires: -$static_size_in_bits + 2 > 0]\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ expr = ir.module[0].type[0].attribute[0].value.expression.function.args[0]
+ self.assertEqual("1", expr.type.integer.modulus)
+ self.assertEqual("0", expr.type.integer.modular_value)
+ self.assertEqual("-infinity", expr.type.integer.minimum_value)
+ self.assertEqual("2", expr.type.integer.maximum_value)
+
+ def test_negative_unbounded_plus_unbounded(self):
+ ir = self._make_ir(
+ "external Foo:\n"
+ " [requires: -$static_size_in_bits + $static_size_in_bits > 0]\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ expr = ir.module[0].type[0].attribute[0].value.expression.function.args[0]
+ self.assertEqual("1", expr.type.integer.modulus)
+ self.assertEqual("0", expr.type.integer.modular_value)
+ self.assertEqual("-infinity", expr.type.integer.minimum_value)
+ self.assertEqual("infinity", expr.type.integer.maximum_value)
+
+ def test_unbounded_minus_unbounded(self):
+ ir = self._make_ir(
+ "external Foo:\n"
+ " [requires: $static_size_in_bits - $static_size_in_bits > 0]\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ expr = ir.module[0].type[0].attribute[0].value.expression.function.args[0]
+ self.assertEqual("1", expr.type.integer.modulus)
+ self.assertEqual("0", expr.type.integer.modular_value)
+ self.assertEqual("-infinity", expr.type.integer.minimum_value)
+ self.assertEqual("infinity", expr.type.integer.maximum_value)
+
+ def test_unbounded_minus_negative_unbounded(self):
+ ir = self._make_ir(
+ "external Foo:\n"
+ " [requires: $static_size_in_bits - -$static_size_in_bits > 0]\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ expr = ir.module[0].type[0].attribute[0].value.expression.function.args[0]
+ self.assertEqual("1", expr.type.integer.modulus)
+ self.assertEqual("0", expr.type.integer.modular_value)
+ self.assertEqual("0", expr.type.integer.minimum_value)
+ self.assertEqual("infinity", expr.type.integer.maximum_value)
+
+ def test_unbounded_times_constant(self):
+ ir = self._make_ir("external Foo:\n"
+ " [requires: ($static_size_in_bits + 1) * 2 > 0]\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ expr = ir.module[0].type[0].attribute[0].value.expression.function.args[0]
+ self.assertEqual("2", expr.type.integer.modulus)
+ self.assertEqual("0", expr.type.integer.modular_value)
+ self.assertEqual("2", expr.type.integer.minimum_value)
+ self.assertEqual("infinity", expr.type.integer.maximum_value)
+
+ def test_unbounded_times_negative_constant(self):
+ ir = self._make_ir("external Foo:\n"
+ " [requires: ($static_size_in_bits + 1) * -2 > 0]\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ expr = ir.module[0].type[0].attribute[0].value.expression.function.args[0]
+ self.assertEqual("2", expr.type.integer.modulus)
+ self.assertEqual("0", expr.type.integer.modular_value)
+ self.assertEqual("-infinity", expr.type.integer.minimum_value)
+ self.assertEqual("-2", expr.type.integer.maximum_value)
+
+ def test_unbounded_times_zero(self):
+ ir = self._make_ir("external Foo:\n"
+ " [requires: ($static_size_in_bits + 1) * 0 > 0]\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ expr = ir.module[0].type[0].attribute[0].value.expression.function.args[0]
+ self.assertEqual("infinity", expr.type.integer.modulus)
+ self.assertEqual("0", expr.type.integer.modular_value)
+ self.assertEqual("0", expr.type.integer.minimum_value)
+ self.assertEqual("0", expr.type.integer.maximum_value)
+
+ def test_negative_unbounded_times_constant(self):
+ ir = self._make_ir("external Foo:\n"
+ " [requires: (-$static_size_in_bits + 1) * 2 > 0]\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ expr = ir.module[0].type[0].attribute[0].value.expression.function.args[0]
+ self.assertEqual("2", expr.type.integer.modulus)
+ self.assertEqual("0", expr.type.integer.modular_value)
+ self.assertEqual("-infinity", expr.type.integer.minimum_value)
+ self.assertEqual("2", expr.type.integer.maximum_value)
+
+ def test_double_unbounded_minus_unbounded(self):
+ ir = self._make_ir(
+ "external Foo:\n"
+ " [requires: 2 * $static_size_in_bits - $static_size_in_bits > 0]\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ expr = ir.module[0].type[0].attribute[0].value.expression.function.args[0]
+ self.assertEqual("1", expr.type.integer.modulus)
+ self.assertEqual("0", expr.type.integer.modular_value)
+ self.assertEqual("-infinity", expr.type.integer.minimum_value)
+ self.assertEqual("infinity", expr.type.integer.maximum_value)
+
+ def test_double_unbounded_times_negative_unbounded(self):
+ ir = self._make_ir(
+ "external Foo:\n"
+ " [requires: 2 * $static_size_in_bits * -$static_size_in_bits > 0]\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ expr = ir.module[0].type[0].attribute[0].value.expression.function.args[0]
+ self.assertEqual("2", expr.type.integer.modulus)
+ self.assertEqual("0", expr.type.integer.modular_value)
+ self.assertEqual("-infinity", expr.type.integer.minimum_value)
+ self.assertEqual("0", expr.type.integer.maximum_value)
+
+ def test_upper_bound_of_field(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] Int x\n"
+ " let u = $upper_bound(x)\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ u_type = ir.module[0].type[0].structure.field[1].read_transform.type
+ self.assertEqual("infinity", u_type.integer.modulus)
+ self.assertEqual("127", u_type.integer.maximum_value)
+ self.assertEqual("127", u_type.integer.minimum_value)
+ self.assertEqual("127", u_type.integer.modular_value)
+
+ def test_lower_bound_of_field(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] Int x\n"
+ " let l = $lower_bound(x)\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ l_type = ir.module[0].type[0].structure.field[1].read_transform.type
+ self.assertEqual("infinity", l_type.integer.modulus)
+ self.assertEqual("-128", l_type.integer.maximum_value)
+ self.assertEqual("-128", l_type.integer.minimum_value)
+ self.assertEqual("-128", l_type.integer.modular_value)
+
+ def test_upper_bound_of_max(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] Int x\n"
+ " 1 [+1] UInt y\n"
+ " let u = $upper_bound($max(x, y))\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ u_type = ir.module[0].type[0].structure.field[2].read_transform.type
+ self.assertEqual("infinity", u_type.integer.modulus)
+ self.assertEqual("255", u_type.integer.maximum_value)
+ self.assertEqual("255", u_type.integer.minimum_value)
+ self.assertEqual("255", u_type.integer.modular_value)
+
+ def test_lower_bound_of_max(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] Int x\n"
+ " 1 [+1] UInt y\n"
+ " let l = $lower_bound($max(x, y))\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ l_type = ir.module[0].type[0].structure.field[2].read_transform.type
+ self.assertEqual("infinity", l_type.integer.modulus)
+ self.assertEqual("0", l_type.integer.maximum_value)
+ self.assertEqual("0", l_type.integer.minimum_value)
+ self.assertEqual("0", l_type.integer.modular_value)
+
+ def test_double_unbounded_both_ends_times_negative_unbounded(self):
+ ir = self._make_ir(
+ "external Foo:\n"
+ " [requires: (2 * ($static_size_in_bits - $static_size_in_bits) + 1) "
+ " * -$static_size_in_bits > 0]\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ expr = ir.module[0].type[0].attribute[0].value.expression.function.args[0]
+ self.assertEqual("1", expr.type.integer.modulus)
+ self.assertEqual("0", expr.type.integer.modular_value)
+ self.assertEqual("-infinity", expr.type.integer.minimum_value)
+ self.assertEqual("infinity", expr.type.integer.maximum_value)
+
+ def test_choice_two_non_constant_integers(self):
+ cases = [
+ # t % 12 == 7 and f % 20 == 15 ==> r % 4 == 3
+ (12, 7, 20, 15, 4, 3, -128 * 20 + 15, 127 * 20 + 15),
+ # t % 24 == 15 and f % 12 == 7 ==> r % 4 == 3
+ (24, 15, 12, 7, 4, 3, -128 * 24 + 15, 127 * 24 + 15),
+ # t % 20 == 15 and f % 20 == 10 ==> r % 5 == 0
+ (20, 15, 20, 10, 5, 0, -128 * 20 + 10, 127 * 20 + 15),
+ # t % 20 == 16 and f % 20 == 11 ==> r % 5 == 1
+ (20, 16, 20, 11, 5, 1, -128 * 20 + 11, 127 * 20 + 16),
+ ]
+ for (t_mod, t_val, f_mod, f_val, r_mod, r_val, r_min, r_max) in cases:
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] Int y\n"
+ " if (x == 0 ? y * {} + {} : y * {} + {}) == 0:\n"
+ " 1 [+1] UInt z\n".format(
+ t_mod, t_val, f_mod, f_val))
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field = ir.module[0].type[0].structure.field[2]
+ expr = field.existence_condition.function.args[0]
+ self.assertEqual(str(r_mod), expr.type.integer.modulus)
+ self.assertEqual(str(r_val), expr.type.integer.modular_value)
+ self.assertEqual(str(r_min), expr.type.integer.minimum_value)
+ self.assertEqual(str(r_max), expr.type.integer.maximum_value)
+
+ def test_choice_one_non_constant_integer(self):
+ cases = [
+ # t == 35 and f % 20 == 15 ==> res % 20 == 15
+ (35, 20, 15, 20, 15, -128 * 20 + 15, 127 * 20 + 15),
+ # t == 200035 and f % 20 == 15 ==> res % 20 == 15
+ (200035, 20, 15, 20, 15, -128 * 20 + 15, 200035),
+ # t == 21 and f % 20 == 16 ==> res % 5 == 1
+ (21, 20, 16, 5, 1, -128 * 20 + 16, 127 * 20 + 16),
+ ]
+ for (t_val, f_mod, f_val, r_mod, r_val, r_min, r_max) in cases:
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] Int y\n"
+ " if (x == 0 ? {0} : y * {1} + {2}) == 0:\n"
+ " 1 [+1] UInt z\n"
+ " if (x == 0 ? y * {1} + {2} : {0}) == 0:\n"
+ " 1 [+1] UInt q\n".format(t_val, f_mod, f_val))
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field_constant_true = ir.module[0].type[0].structure.field[2]
+ constant_true = field_constant_true.existence_condition.function.args[0]
+ field_constant_false = ir.module[0].type[0].structure.field[3]
+ constant_false = field_constant_false.existence_condition.function.args[0]
+ self.assertEqual(str(r_mod), constant_true.type.integer.modulus)
+ self.assertEqual(str(r_val), constant_true.type.integer.modular_value)
+ self.assertEqual(str(r_min), constant_true.type.integer.minimum_value)
+ self.assertEqual(str(r_max), constant_true.type.integer.maximum_value)
+ self.assertEqual(str(r_mod), constant_false.type.integer.modulus)
+ self.assertEqual(str(r_val), constant_false.type.integer.modular_value)
+ self.assertEqual(str(r_min), constant_false.type.integer.minimum_value)
+ self.assertEqual(str(r_max), constant_false.type.integer.maximum_value)
+
+ def test_choice_two_constant_integers(self):
+ cases = [
+ # t == 10 and f == 7 ==> res % 3 == 1
+ (10, 7, 3, 1, 7, 10),
+ # t == 4 and f == 4 ==> res == 4
+ (4, 4, "infinity", 4, 4, 4),
+ ]
+ for (t_val, f_val, r_mod, r_val, r_min, r_max) in cases:
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] Int y\n"
+ " if (x == 0 ? {} : {}) == 0:\n"
+ " 1 [+1] UInt z\n".format(t_val, f_val))
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field_constant_true = ir.module[0].type[0].structure.field[2]
+ constant_true = field_constant_true.existence_condition.function.args[0]
+ self.assertEqual(str(r_mod), constant_true.type.integer.modulus)
+ self.assertEqual(str(r_val), constant_true.type.integer.modular_value)
+ self.assertEqual(str(r_min), constant_true.type.integer.minimum_value)
+ self.assertEqual(str(r_max), constant_true.type.integer.maximum_value)
+
+ def test_constant_true_has(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if $present(x):\n"
+ " 1 [+1] UInt q\n"
+ " 0 [+1] UInt x\n"
+ " if x > 10:\n"
+ " 1 [+1] Int y\n"
+ " if false:\n"
+ " 2 [+1] Int z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field = ir.module[0].type[0].structure.field[0]
+ has_func = field.existence_condition
+ self.assertTrue(has_func.type.boolean.value)
+
+ def test_constant_false_has(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if $present(z):\n"
+ " 1 [+1] UInt q\n"
+ " 0 [+1] UInt x\n"
+ " if x > 10:\n"
+ " 1 [+1] Int y\n"
+ " if false:\n"
+ " 2 [+1] Int z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field = ir.module[0].type[0].structure.field[0]
+ has_func = field.existence_condition
+ self.assertTrue(has_func.type.boolean.HasField("value"))
+ self.assertFalse(has_func.type.boolean.value)
+
+ def test_variable_has(self):
+ ir = self._make_ir("struct Foo:\n"
+ " if $present(y):\n"
+ " 1 [+1] UInt q\n"
+ " 0 [+1] UInt x\n"
+ " if x > 10:\n"
+ " 1 [+1] Int y\n"
+ " if false:\n"
+ " 2 [+1] Int z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field = ir.module[0].type[0].structure.field[0]
+ has_func = field.existence_condition
+ self.assertFalse(has_func.type.boolean.HasField("value"))
+
+ def test_max_of_constants(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] Int y\n"
+ " if $max(0, 1, 2) == 0:\n"
+ " 1 [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field = ir.module[0].type[0].structure.field[2]
+ max_func = field.existence_condition.function.args[0]
+ self.assertEqual("infinity", max_func.type.integer.modulus)
+ self.assertEqual("2", max_func.type.integer.modular_value)
+ self.assertEqual("2", max_func.type.integer.minimum_value)
+ self.assertEqual("2", max_func.type.integer.maximum_value)
+
+ def test_max_dominated_by_constant(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] Int y\n"
+ " if $max(x, y, 255) == 0:\n"
+ " 1 [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field = ir.module[0].type[0].structure.field[2]
+ max_func = field.existence_condition.function.args[0]
+ self.assertEqual("infinity", max_func.type.integer.modulus)
+ self.assertEqual("255", max_func.type.integer.modular_value)
+ self.assertEqual("255", max_func.type.integer.minimum_value)
+ self.assertEqual("255", max_func.type.integer.maximum_value)
+
+ def test_max_of_variables(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] Int y\n"
+ " if $max(x, y) == 0:\n"
+ " 1 [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field = ir.module[0].type[0].structure.field[2]
+ max_func = field.existence_condition.function.args[0]
+ self.assertEqual("1", max_func.type.integer.modulus)
+ self.assertEqual("0", max_func.type.integer.modular_value)
+ self.assertEqual("0", max_func.type.integer.minimum_value)
+ self.assertEqual("255", max_func.type.integer.maximum_value)
+
+ def test_max_of_variables_with_shared_modulus(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] Int y\n"
+ " if $max(x * 8 + 5, y * 4 + 3) == 0:\n"
+ " 1 [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field = ir.module[0].type[0].structure.field[2]
+ max_func = field.existence_condition.function.args[0]
+ self.assertEqual("2", max_func.type.integer.modulus)
+ self.assertEqual("1", max_func.type.integer.modular_value)
+ self.assertEqual("5", max_func.type.integer.minimum_value)
+ self.assertEqual("2045", max_func.type.integer.maximum_value)
+
+ def test_max_of_three_variables(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] Int y\n"
+ " 2 [+2] Int z\n"
+ " if $max(x, y, z) == 0:\n"
+ " 1 [+1] UInt q\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field = ir.module[0].type[0].structure.field[3]
+ max_func = field.existence_condition.function.args[0]
+ self.assertEqual("1", max_func.type.integer.modulus)
+ self.assertEqual("0", max_func.type.integer.modular_value)
+ self.assertEqual("0", max_func.type.integer.minimum_value)
+ self.assertEqual("32767", max_func.type.integer.maximum_value)
+
+ def test_max_of_one_variable(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] Int y\n"
+ " 2 [+2] Int z\n"
+ " if $max(x * 2 + 3) == 0:\n"
+ " 1 [+1] UInt q\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field = ir.module[0].type[0].structure.field[3]
+ max_func = field.existence_condition.function.args[0]
+ self.assertEqual("2", max_func.type.integer.modulus)
+ self.assertEqual("1", max_func.type.integer.modular_value)
+ self.assertEqual("3", max_func.type.integer.minimum_value)
+ self.assertEqual("513", max_func.type.integer.maximum_value)
+
+ def test_max_of_one_variable_and_one_constant(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] Int y\n"
+ " 2 [+2] Int z\n"
+ " if $max(x * 2 + 3, 311) == 0:\n"
+ " 1 [+1] UInt q\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field = ir.module[0].type[0].structure.field[3]
+ max_func = field.existence_condition.function.args[0]
+ self.assertEqual("2", max_func.type.integer.modulus)
+ self.assertEqual("1", max_func.type.integer.modular_value)
+ self.assertEqual("311", max_func.type.integer.minimum_value)
+ self.assertEqual("513", max_func.type.integer.maximum_value)
+
+ def test_choice_non_integer_arguments(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " if x == 0 ? false : true:\n"
+ " 1 [+1] UInt y\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ expr = ir.module[0].type[0].structure.field[1].existence_condition
+ self.assertEqual("boolean", expr.type.WhichOneof("type"))
+ self.assertFalse(expr.type.boolean.HasField("value"))
+
+ def test_uint_value_range_for_explicit_size(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+x] UInt:16 y\n"
+ " y [+1] UInt z\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ z_start = ir.module[0].type[0].structure.field[2].location.start
+ self.assertEqual("1", z_start.type.integer.modulus)
+ self.assertEqual("0", z_start.type.integer.modular_value)
+ self.assertEqual("0", z_start.type.integer.minimum_value)
+ self.assertEqual("65535", z_start.type.integer.maximum_value)
+
+ def test_uint_value_ranges(self):
+ cases = [
+ (1, 1),
+ (2, 3),
+ (3, 7),
+ (4, 15),
+ (8, 255),
+ (12, 4095),
+ (15, 32767),
+ (16, 65535),
+ (32, 4294967295),
+ (48, 281474976710655),
+ (64, 18446744073709551615),
+ ]
+ for bits, upper in cases:
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+8] bits:\n"
+ " 0 [+{}] UInt x\n"
+ " x [+1] UInt z\n".format(bits))
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ z_start = ir.module[0].type[0].structure.field[2].location.start
+ self.assertEqual("1", z_start.type.integer.modulus)
+ self.assertEqual("0", z_start.type.integer.modular_value)
+ self.assertEqual("0", z_start.type.integer.minimum_value)
+ self.assertEqual(str(upper), z_start.type.integer.maximum_value)
+
+ def test_int_value_ranges(self):
+ cases = [
+ (1, -1, 0),
+ (2, -2, 1),
+ (3, -4, 3),
+ (4, -8, 7),
+ (8, -128, 127),
+ (12, -2048, 2047),
+ (15, -16384, 16383),
+ (16, -32768, 32767),
+ (32, -2147483648, 2147483647),
+ (48, -140737488355328, 140737488355327),
+ (64, -9223372036854775808, 9223372036854775807),
+ ]
+ for bits, lower, upper in cases:
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+8] bits:\n"
+ " 0 [+{}] Int x\n"
+ " x [+1] UInt z\n".format(bits))
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ z_start = ir.module[0].type[0].structure.field[2].location.start
+ self.assertEqual("1", z_start.type.integer.modulus)
+ self.assertEqual("0", z_start.type.integer.modular_value)
+ self.assertEqual(str(lower), z_start.type.integer.minimum_value)
+ self.assertEqual(str(upper), z_start.type.integer.maximum_value)
+
+ def test_bcd_value_ranges(self):
+ cases = [
+ (1, 1),
+ (2, 3),
+ (3, 7),
+ (4, 9),
+ (8, 99),
+ (12, 999),
+ (15, 7999),
+ (16, 9999),
+ (32, 99999999),
+ (48, 999999999999),
+ (64, 9999999999999999),
+ ]
+ for bits, upper in cases:
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+8] bits:\n"
+ " 0 [+{}] Bcd x\n"
+ " x [+1] UInt z\n".format(bits))
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ z_start = ir.module[0].type[0].structure.field[2].location.start
+ self.assertEqual("1", z_start.type.integer.modulus)
+ self.assertEqual("0", z_start.type.integer.modular_value)
+ self.assertEqual("0", z_start.type.integer.minimum_value)
+ self.assertEqual(str(upper), z_start.type.integer.maximum_value)
+
+ def test_virtual_field_bounds(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " let y = x + 10\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field_y = ir.module[0].type[0].structure.field[1]
+ self.assertEqual("1", field_y.read_transform.type.integer.modulus)
+ self.assertEqual("0", field_y.read_transform.type.integer.modular_value)
+ self.assertEqual("10", field_y.read_transform.type.integer.minimum_value)
+ self.assertEqual("265", field_y.read_transform.type.integer.maximum_value)
+
+ def test_virtual_field_bounds_copied(self):
+ ir = self._make_ir("struct Foo:\n"
+ " let z = y + 100\n"
+ " let y = x + 10\n"
+ " 0 [+1] UInt x\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field_z = ir.module[0].type[0].structure.field[0]
+ self.assertEqual("1", field_z.read_transform.type.integer.modulus)
+ self.assertEqual("0", field_z.read_transform.type.integer.modular_value)
+ self.assertEqual("110", field_z.read_transform.type.integer.minimum_value)
+ self.assertEqual("365", field_z.read_transform.type.integer.maximum_value)
+ y_reference = field_z.read_transform.function.args[0]
+ self.assertEqual("1", y_reference.type.integer.modulus)
+ self.assertEqual("0", y_reference.type.integer.modular_value)
+ self.assertEqual("10", y_reference.type.integer.minimum_value)
+ self.assertEqual("265", y_reference.type.integer.maximum_value)
+
+ def test_constant_reference_to_virtual_bounds_copied(self):
+ ir = self._make_ir("struct Foo:\n"
+ " let ten = Bar.ten\n"
+ " let truth = Bar.truth\n"
+ "struct Bar:\n"
+ " let ten = 10\n"
+ " let truth = true\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field_ten = ir.module[0].type[0].structure.field[0]
+ self.assertEqual("infinity", field_ten.read_transform.type.integer.modulus)
+ self.assertEqual("10", field_ten.read_transform.type.integer.modular_value)
+ self.assertEqual("10", field_ten.read_transform.type.integer.minimum_value)
+ self.assertEqual("10", field_ten.read_transform.type.integer.maximum_value)
+ field_truth = ir.module[0].type[0].structure.field[1]
+ self.assertTrue(field_truth.read_transform.type.boolean.value)
+
+ def test_forward_reference_to_reference_to_enum_correctly_calculated(self):
+ ir = self._make_ir("struct Foo:\n"
+ " let ten = Bar.TEN\n"
+ "enum Bar:\n"
+ " TEN = TEN2\n"
+ " TEN2 = 5 + 5\n")
+ self.assertEqual([], expression_bounds.compute_constants(ir))
+ field_ten = ir.module[0].type[0].structure.field[0]
+ self.assertEqual("10", field_ten.read_transform.type.enumeration.value)
+
+
+class InfinityAugmentedArithmeticTest(unittest.TestCase):
+
+ # TODO(bolms): Will there ever be any situations where all elements of the arg
+ # to _min would be "infinity"?
+ def test_min_of_infinities(self):
+ self.assertEqual("infinity",
+ expression_bounds._min(["infinity", "infinity"]))
+
+ # TODO(bolms): Will there ever be any situations where all elements of the arg
+ # to _max would be "-infinity"?
+ def test_max_of_negative_infinities(self):
+ self.assertEqual("-infinity",
+ expression_bounds._max(["-infinity", "-infinity"]))
+
+ def test_shared_modular_value_of_identical_modulus_and_value(self):
+ self.assertEqual((10, 8),
+ expression_bounds._shared_modular_value((10, 8), (10, 8)))
+
+ def test_shared_modular_value_of_identical_modulus(self):
+ self.assertEqual((5, 3),
+ expression_bounds._shared_modular_value((10, 8), (10, 3)))
+
+ def test_shared_modular_value_of_identical_value(self):
+ self.assertEqual((6, 2),
+ expression_bounds._shared_modular_value((18, 2), (12, 2)))
+
+ def test_shared_modular_value_of_different_arguments(self):
+ self.assertEqual((7, 4),
+ expression_bounds._shared_modular_value((21, 11), (14, 4)))
+
+ def test_shared_modular_value_of_infinity_and_non(self):
+ self.assertEqual((7, 4),
+ expression_bounds._shared_modular_value(("infinity", 25),
+ (14, 4)))
+
+ def test_shared_modular_value_of_infinity_and_infinity(self):
+ self.assertEqual((14, 5),
+ expression_bounds._shared_modular_value(("infinity", 19),
+ ("infinity", 5)))
+
+ def test_shared_modular_value_of_infinity_and_identical_value(self):
+ self.assertEqual(("infinity", 5),
+ expression_bounds._shared_modular_value(("infinity", 5),
+ ("infinity", 5)))
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/front_end/format.py b/front_end/format.py
new file mode 100644
index 0000000..c04718a
--- /dev/null
+++ b/front_end/format.py
@@ -0,0 +1,126 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Formatter for Emboss source files.
+
+This program formats the Emboss source file(s) given on the command line.
+"""
+
+from __future__ import absolute_import
+from __future__ import print_function
+
+import argparse
+import os
+import sys
+
+from front_end import format_emb
+from front_end import parser
+from front_end import tokenizer
+from util import error
+
+def _parse_command_line(argv):
+ """Parses the given command-line arguments."""
+ parser = argparse.ArgumentParser(description="Emboss source formatter.",
+ prog=argv[0])
+ parser.add_argument("input_file",
+ type=str,
+ nargs='+',
+ help=".emb file(s) to format.")
+ parser.add_argument('--no-check-result',
+ default=True,
+ action='store_false',
+ dest='check_result',
+ help='Skip verifying that the formatted text contains only '
+ 'whitespace changes.')
+ parser.add_argument('--debug-show-line-types',
+ default=False, action='store_true',
+ help='Show the computed type of each line.')
+ parser.add_argument('--no-edit-in-place',
+ default=True,
+ action='store_false',
+ dest='edit_in_place',
+ help='Write the formatted text to stdout, not the input file.')
+ parser.add_argument('--indent',
+ type=int,
+ default=2,
+ help='Number of spaces to use for each level of '
+ 'indentation.')
+ parser.add_argument('--color-output',
+ default='if-tty',
+ choices=['always', 'never', 'if-tty', 'auto'],
+ help="Print error messages using color. 'auto' is a "
+ "synonym for 'if-tty'.")
+ return parser.parse_args(argv[1:])
+
+
+def _print_errors(errors, source_codes, flags):
+ use_color = (flags.color_output == 'always' or
+ (flags.color_output in ('auto', 'if-tty') and
+ os.isatty(sys.stderr.fileno())))
+ print(error.format_errors(errors, source_codes, use_color), file=sys.stderr)
+
+
+def main(argv=()):
+ flags = _parse_command_line(argv)
+
+ if not flags.edit_in_place and len(flags.input_file) > 1:
+ print('Multiple files may only be formatted without --no-edit-in-place.',
+ file=sys.stderr)
+ return 1
+
+ if flags.edit_in_place and flags.debug_show_line_types:
+ print('The flag --debug-show-line-types requires --no-edit-in-place.',
+ file=sys.stderr)
+ return 1
+
+ for file_name in flags.input_file:
+ with open(file_name) as f:
+ source_code = f.read()
+
+ tokens, errors = tokenizer.tokenize(source_code, file_name)
+ if errors:
+ _print_errors(errors, {file_name: source_code}, flags)
+ continue
+
+ parse_result = parser.parse_module(tokens)
+ if parse_result.error:
+ _print_errors(
+ [error.make_error_from_parse_error(file_name, parse_result.error)],
+ {file_name: source_code}, flags)
+ continue
+
+ formatted_text = format_emb.format_emboss_parse_tree(
+ parse_result.parse_tree,
+ format_emb.Config(show_line_types=flags.debug_show_line_types,
+ indent_width=flags.indent))
+
+ if flags.check_result and not flags.debug_show_line_types:
+ errors = format_emb.sanity_check_format_result(formatted_text,
+ source_code)
+ if errors:
+ for e in errors:
+ print(e, file=sys.stderr)
+ continue
+
+ if flags.edit_in_place:
+ with open(file_name, 'w') as f:
+ f.write(formatted_text)
+ else:
+ sys.stdout.write(formatted_text)
+
+ return 0
+
+
+if __name__ == '__main__':
+ sys.exit(main(sys.argv))
diff --git a/front_end/format_emb.py b/front_end/format_emb.py
new file mode 100644
index 0000000..1824c85
--- /dev/null
+++ b/front_end/format_emb.py
@@ -0,0 +1,893 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Formatter for Emboss source files.
+
+This module exports a single function, format_emboss_parse_tree(), which
+pretty-prints an Emboss parse tree.
+"""
+
+from __future__ import print_function
+
+import collections
+import itertools
+
+from front_end import module_ir
+from front_end import tokenizer
+from util import parser_types
+
+
+class Config(collections.namedtuple('Config',
+ ['indent_width', 'show_line_types'])):
+ """Configuration for formatting."""
+
+ def __new__(cls, indent_width=2, show_line_types=False):
+ return super(Config, cls).__new__(cls, indent_width, show_line_types)
+
+
+class _Row(collections.namedtuple('Row', ['name', 'columns', 'indent'])):
+ """Structured contents of a single line."""
+
+ def __new__(cls, name, columns=None, indent=0):
+ return super(_Row, cls).__new__(cls, name, tuple(columns or []), indent)
+
+
+class _Block(collections.namedtuple('Block', ['prefix', 'header', 'body'])):
+ """Structured block of multiple lines."""
+
+ def __new__(cls, prefix, header, body):
+ assert header
+ return super(_Block, cls).__new__(cls, prefix, header, body)
+
+
+# Map of productions to their formatters.
+_formatters = {}
+
+
+def format_emboss_parse_tree(parse_tree, config, used_productions=None):
+ """Formats Emboss source code.
+
+ Arguments:
+ parse_tree: A parse tree of an Emboss source file.
+ config: A Config tuple with formatting options.
+ used_productions: An optional set to which all used productions will be
+ added. Intended for use by test code to ensure full production
+ coverage.
+
+ Returns:
+ A string of the reformatted source text.
+ """
+ if hasattr(parse_tree, 'children'):
+ parsed_children = [format_emboss_parse_tree(child, config, used_productions)
+ for child in parse_tree.children]
+ args = parsed_children + [config]
+ if used_productions is not None:
+ used_productions.add(parse_tree.production)
+ return _formatters[parse_tree.production](*args)
+ else:
+ assert isinstance(parse_tree, parser_types.Token), str(parse_tree)
+ return parse_tree.text
+
+
+def sanity_check_format_result(formatted_text, original_text):
+ """Checks that the given texts are equivalent."""
+ # The texts are considered equivalent if they tokenize to the same token
+ # stream, except that:
+ #
+ # Multiple consecutive newline tokens are equivalent to a single newline
+ # token.
+ #
+ # Extra newline tokens at the start of the stream should be ignored.
+ #
+ # Whitespace at the start or end of a token should be ignored. This matters
+ # for documentation and comment tokens, which may have had trailing whitespace
+ # in the original text, and for indent tokens, which may contain a different
+ # number of space and/or tab characters.
+ original_tokens, errors = tokenizer.tokenize(original_text, '')
+ if errors:
+ return ['BUG: original text is not tokenizable: {!r}'.format(errors)]
+
+ formatted_tokens, errors = tokenizer.tokenize(formatted_text, '')
+ if errors:
+ return ['BUG: formatted text is not tokenizable: {!r}'.format(errors)]
+
+ o_tokens = _collapse_newline_tokens(original_tokens)
+ f_tokens = _collapse_newline_tokens(formatted_tokens)
+ for i in range(len(o_tokens)):
+ if (o_tokens[i].symbol != f_tokens[i].symbol or
+ o_tokens[i].text.strip() != f_tokens[i].text.strip()):
+ return ['BUG: Symbol {} differs: {!r} vs {!r}'.format(i, o_tokens[i],
+ f_tokens[i])]
+ return []
+
+
+def _collapse_newline_tokens(token_list):
+ r"""Collapses multiple consecutive "\\n" tokens into a single newline."""
+ result = []
+ for symbol, group in itertools.groupby(token_list, lambda x: x.symbol):
+ if symbol == '"\\n"':
+ # Skip all newlines if they are at the start, otherwise add a single
+ # newline for each consecutive run of newlines.
+ if result:
+ result.append(list(group)[0])
+ else:
+ result.extend(group)
+ return result
+
+
+def _indent_row(row):
+ """Adds one level of indent to the given row, returning a new row."""
+ assert isinstance(row, _Row), repr(row)
+ return _Row(name=row.name,
+ columns=row.columns,
+ indent=row.indent + 1)
+
+
+def _indent_rows(rows):
+ """Adds one level of indent to the given rows, returning a new list."""
+ return list(map(_indent_row, rows))
+
+
+def _indent_blocks(blocks):
+ """Adds one level of indent to the given blocks, returning a new list."""
+ return [_Block(prefix=_indent_rows(block.prefix),
+ header=_indent_row(block.header),
+ body=_indent_rows(block.body))
+ for block in blocks]
+
+
+def _intersperse(interspersed, sections):
+ """Intersperses `interspersed` between non-empty `sections`."""
+ result = []
+ for section in sections:
+ if section:
+ if result:
+ result.extend(interspersed)
+ result.extend(section)
+ return result
+
+
+def _should_add_blank_lines(blocks):
+ """Returns true if blank lines should be added between blocks."""
+ other_non_empty_lines = 0
+ last_non_empty_lines = 0
+ for block in blocks:
+ last_non_empty_lines = len([line for line in
+ block.body + block.prefix
+ if line.columns])
+ other_non_empty_lines += last_non_empty_lines
+ # Blank lines should be added if there are at least as many interior
+ # non-empty, non-header lines as header lines.
+ return len(blocks) <= other_non_empty_lines - last_non_empty_lines
+
+
+def _columnize(blocks, indent_width, indent_columns=1):
+ """Aligns columns in the header rows of the given blocks.
+
+ The `indent_columns` argument is used to determine how many columns should be
+ indented. With `indent_columns == 1`, the result would be:
+
+ AA     BB   CC
+   AAA  BBB  CCC
+ A      B    C
+
+ With `indent_columns == 2`:
+
+ AA   BB     CC
+   AAA  BBB  CCC
+ A    B      C
+
+ With `indent_columns == 1`, only the first column is indented compared to
+ surrounding rows; with `indent_columns == 2`, both the first and second
+ columns are indented.
+
+ Arguments:
+ blocks: A list of _Blocks to columnize.
+ indent_width: The number of spaces per level of indent.
+ indent_columns: The number of columns to indent.
+
+ Returns:
+ A list of _Rows of the prefix, header, and body _Rows of each block, where
+ the header _Rows of each type have had their columns aligned.
+ """
+ single_width_separators = {'enum-value': {0, 1}, 'field': {0}}
+ # For each type of row, figure out how many characters each column needs.
+ row_types = collections.defaultdict(
+ lambda: collections.defaultdict(lambda: 0))
+ for block in blocks:
+ max_lengths = row_types[block.header.name]
+ for i in range(len(block.header.columns)):
+ if i == indent_columns - 1:
+ adjustment = block.header.indent * indent_width
+ else:
+ adjustment = 0
+ max_lengths[i] = max(max_lengths[i],
+ len(block.header.columns[i]) + adjustment)
+
+ assert len(row_types) < 3
+
+ # Then, for each row, actually columnize it.
+ result = []
+ for block in blocks:
+ columns = []
+ for i in range(len(block.header.columns)):
+ column_width = row_types[block.header.name][i]
+ if column_width == 0:
+ # Zero-width columns are entirely omitted, including their column
+ # separators.
+ pass
+ else:
+ if i == indent_columns - 1:
+ # This function only performs the right padding for each column.
+ # Since the left padding for indent will be added later, the
+ # corresponding space needs to be removed from the right padding of
+ # the indented column (column `indent_columns - 1`).
+ column_width -= block.header.indent * indent_width
+ if i in single_width_separators.get(block.header.name, []):
+ # Only one space around the "=" in enum values and between the start
+ # and size in field locations.
+ column_width += 1
+ else:
+ column_width += 2
+ columns.append(block.header.columns[i].ljust(column_width))
+ result.append(block.prefix + [_Row(block.header.name,
+ [''.join(columns).rstrip()],
+ block.header.indent)] + block.body)
+ return result
+
+
+def _indent_blanks_and_comments(rows):
+ """Indents blank and comment lines to match the next non-blank line."""
+ result = []
+ previous_indent = 0
+ for row in reversed(rows):
+ if not ''.join(row.columns) or row.name == 'comment':
+ result.append(_Row(row.name, row.columns, previous_indent))
+ else:
+ result.append(row)
+ previous_indent = row.indent
+ return reversed(result)
+
+
+def _add_blank_rows_on_dedent(rows):
+ """Adds blank rows before dedented lines, where needed."""
+ result = []
+ previous_indent = 0
+ previous_row_was_blank = True
+ for row in rows:
+ row_is_blank = not ''.join(row.columns)
+ found_dedent = previous_indent > row.indent
+ if found_dedent and not previous_row_was_blank and not row_is_blank:
+ result.append(_Row('dedent-space', [], row.indent))
+ result.append(row)
+ previous_indent = row.indent
+ previous_row_was_blank = row_is_blank
+ return result
+
+
+def _render_row_to_text(row, indent_width):
+ assert len(row.columns) < 2, '{!r}'.format(row)
+ text = ' ' * indent_width * row.indent
+ text += ''.join(row.columns)
+ return text.rstrip()
+
+
+def _render_rows_to_text(rows, indent_width, show_line_types):
+ max_row_name_len = max([0] + [len(row.name) for row in rows])
+ flattened_rows = []
+ for row in rows:
+ row_text = _render_row_to_text(row, indent_width)
+ if show_line_types:
+ row_text = row.name.ljust(max_row_name_len) + '|' + row_text
+ flattened_rows.append(row_text)
+ return '\n'.join(flattened_rows + [''])
+
+
+def _check_productions():
+ """Asserts that the productions in this module match those in module_ir."""
+ productions_ok = True
+ for production in module_ir.PRODUCTIONS:
+ if production not in _formatters:
+ productions_ok = False
+ print('@_formats({!r})'.format(str(production)))
+
+ for production in _formatters:
+ if production not in module_ir.PRODUCTIONS:
+ productions_ok = False
+ print('not @_formats({!r})'.format(str(production)))
+
+ assert productions_ok, 'Grammar mismatch.'
+
+
+def _formats_with_config(production_text):
+ """Marks a function as a formatter requiring a config argument."""
+ production = parser_types.Production.parse(production_text)
+
+ def formats(f):
+ assert production not in _formatters, production
+ _formatters[production] = f
+ return f
+
+ return formats
+
+
+def _formats(production_text):
+ """Marks a function as the formatter for a particular production."""
+
+ def strip_config_argument(f):
+ _formats_with_config(production_text)(lambda *a, **kw: f(*a[:-1], **kw))
+ return f
+
+ return strip_config_argument
+
+
+################################################################################
+# From here to the end of the file are functions which recursively format an
+# Emboss parse tree.
+#
+# The format_emboss_parse_tree() function will call formatters, bottom-up, for
+# the entire parse tree. Each formatter will be called with the results of the
+# formatters for each child node. (The "formatter" for leaf nodes is the
+# original text of the token.)
+#
+# Formatters can be roughly divided into three types:
+#
+# The _module formatter is the top-level formatter. It handles final rendering
+# into text, and returns a string.
+#
+# Formatters for productions that are at least one full line return lists of
+# _Rows. The production 'attribute-line' falls into this category, but
+# 'attribute' does not. This form allows parallel constructs in separate lines
+# to be lined up column-wise, even when there are intervening lines that should
+# not be lined up -- for example, the types and names of struct fields will be
+# aligned, even if there are documentation, comment, or attribute lines mixed
+# in.
+#
+# Formatters for productions that are smaller than one full line just return
+# strings.
+
+
+@_formats_with_config('module -> comment-line* doc-line* import-line*'
+ ' attribute-line* type-definition*')
+def _module(comments, docs, imports, attributes, types, config):
+ """Performs top-level formatting for an Emboss source file."""
+
+ # Top-level sections other than types are separated by single blank lines.
+ header_rows = _intersperse(
+ [_Row('section-break')],
+ [_strip_empty_leading_trailing_comment_lines(comments), docs, imports,
+ attributes])
+
+ # Top-level types should be separated by two blank lines from each other and
+ # from the header rows.
+ rows = _intersperse(
+ [_Row('top-type-separator'), _Row('top-type-separator')],
+ [header_rows] + types)
+
+ # Final fixups.
+ rows = _indent_blanks_and_comments(rows)
+ rows = _add_blank_rows_on_dedent(rows)
+ return _render_rows_to_text(rows, config.indent_width, config.show_line_types)
+
+
+@_formats('doc-line -> doc Comment? eol')
+def _doc_line(doc, comment, eol):
+ assert not comment, 'Comment should not be possible on the same line as doc.'
+ return [_Row('doc', [doc])] + eol
+
+
+@_formats('import-line -> "import" string-constant "as" snake-word Comment?'
+ ' eol')
+def _import_line(import_, filename, as_, name, comment, eol):
+ return [_Row('import', ['{} {} {} {} {}'.format(
+ import_, filename, as_, name, comment)])] + eol
+
+
+@_formats('attribute-line -> attribute Comment? eol')
+def _attribute_line(attribute, comment, eol):
+ return [_Row('attribute', ['{} {}'.format(attribute, comment)])] + eol
+
+
+@_formats('attribute -> "[" attribute-context? "$default"? snake-word ":"'
+ ' attribute-value "]"')
+def _attribute(open_, context, default, name, colon, value, close):
+ return ''.join([open_,
+ _concatenate_with_spaces(context, default, name + colon,
+ value),
+ close])
+
+
+@_formats('parameter-definition -> snake-name ":" type')
+def _parameter_definition(name, colon, type_specifier):
+ return '{}{} {}'.format(name, colon, type_specifier)
+
+
+@_formats('type-definition* -> type-definition type-definition*')
+def _type_definitions(definition, definitions):
+ return [definition] + definitions
+
+
+@_formats('bits -> "bits" type-name delimited-parameter-definition-list? ":"'
+ ' Comment? eol bits-body')
+@_formats('struct -> "struct" type-name delimited-parameter-definition-list?'
+ ' ":" Comment? eol struct-body')
+def _structure_type(struct, name, parameters, colon, comment, eol, body):
+ return ([_Row('type-header',
+ ['{} {}{}{} {}'.format(
+ struct, name, parameters, colon, comment)])] +
+ eol + body)
+
+
+@_formats('enum -> "enum" type-name ":" Comment? eol enum-body')
+@_formats('external -> "external" type-name ":" Comment? eol external-body')
+def _type(struct, name, colon, comment, eol, body):
+ return ([_Row('type-header',
+ ['{} {}{} {}'.format(struct, name, colon, comment)])] +
+ eol + body)
+
+
+@_formats_with_config('bits-body -> Indent doc-line* attribute-line*'
+ ' type-definition* bits-field-block Dedent')
+@_formats_with_config(
+ 'struct-body -> Indent doc-line* attribute-line*'
+ ' type-definition* struct-field-block Dedent')
+def _structure_body(indent, docs, attributes, type_definitions, fields, dedent,
+ config):
+ del indent, dedent # Unused.
+ spacing = [_Row('field-separator')] if _should_add_blank_lines(fields) else []
+ columnized_fields = _columnize(fields, config.indent_width, indent_columns=2)
+ return _indent_rows(_intersperse(
+ spacing, [docs, attributes] + type_definitions + columnized_fields))
+
+
+@_formats('field-location -> expression "[" "+" expression "]"')
+def _field_location(start, open_bracket, plus, size, close_bracket):
+ return [start, open_bracket + plus + size + close_bracket]
+
+
+@_formats('anonymous-bits-field-block -> conditional-anonymous-bits-field-block'
+ ' anonymous-bits-field-block')
+@_formats('anonymous-bits-field-block -> unconditional-anonymous-bits-field'
+ ' anonymous-bits-field-block')
+@_formats('bits-field-block -> conditional-bits-field-block bits-field-block')
+@_formats('bits-field-block -> unconditional-bits-field bits-field-block')
+@_formats('struct-field-block -> conditional-struct-field-block'
+ ' struct-field-block')
+@_formats('struct-field-block -> unconditional-struct-field struct-field-block')
+@_formats('unconditional-anonymous-bits-field* ->'
+ ' unconditional-anonymous-bits-field'
+ ' unconditional-anonymous-bits-field*')
+@_formats('unconditional-anonymous-bits-field+ ->'
+ ' unconditional-anonymous-bits-field'
+ ' unconditional-anonymous-bits-field*')
+@_formats('unconditional-bits-field* -> unconditional-bits-field'
+ ' unconditional-bits-field*')
+@_formats('unconditional-bits-field+ -> unconditional-bits-field'
+ ' unconditional-bits-field*')
+@_formats('unconditional-struct-field* -> unconditional-struct-field'
+ ' unconditional-struct-field*')
+@_formats('unconditional-struct-field+ -> unconditional-struct-field'
+ ' unconditional-struct-field*')
+def _structure_block(field, block):
+ """Prepends field to block."""
+ return field + block
+
+
+@_formats('virtual-field -> "let" snake-name "=" expression Comment? eol'
+ ' field-body?')
+def _virtual_field(let_keyword, name, equals, value, comment, eol, body):
+ # This formatting doesn't look the best when there are blocks of several
+ # virtual fields next to each other, but works pretty well when they're
+ # intermixed with physical fields. It's probably good enough for now, since
+ # there aren't (yet) any virtual fields in real .embs, and there will
+ # probably only be a few in the near future.
+ return [_Block([],
+ _Row('virtual-field',
+ [_concatenate_with(
+ ' ',
+ _concatenate_with_spaces(let_keyword, name, equals,
+ value),
+ comment)]),
+ eol + body)]
+
+
+@_formats('field -> field-location type snake-name abbreviation?'
+ ' attribute* doc? Comment? eol field-body?')
+def _unconditional_field(location, type_, name, abbreviation, attributes, doc,
+ comment, eol, body):
+ return [_Block([],
+ _Row('field',
+ location + [type_,
+ _concatenate_with_spaces(name, abbreviation),
+ attributes, doc, comment]),
+ eol + body)]
+
+
+@_formats('field-body -> Indent doc-line* attribute-line* Dedent')
+def _field_body(indent, docs, attributes, dedent):
+ del indent, dedent # Unused
+ return _indent_rows(docs + attributes)
+
+
+@_formats('anonymous-bits-field-definition ->'
+ ' field-location "bits" ":" Comment? eol anonymous-bits-body')
+def _inline_bits(location, bits, colon, comment, eol, body):
+ # Even though an anonymous bits field technically defines a new, anonymous
+ # type, conceptually it's more like defining a bunch of fields on the
+ # surrounding type, so it is treated as an inline list of blocks, instead of
+ # being separately formatted.
+ header_row = _Row('field', [location[0], location[1] + ' ' + bits + colon,
+ '', '', '', '', comment])
+ return ([_Block([], header_row, eol + body.header_lines)] +
+ body.field_blocks)
+
+
+@_formats('inline-enum-field-definition ->'
+ ' field-location "enum" snake-name abbreviation? ":" Comment? eol'
+ ' enum-body')
+@_formats(
+ 'inline-struct-field-definition ->'
+ ' field-location "struct" snake-name abbreviation? ":" Comment? eol'
+ ' struct-body')
+@_formats('inline-bits-field-definition ->'
+ ' field-location "bits" snake-name abbreviation? ":" Comment? eol'
+ ' bits-body')
+def _inline_type(location, keyword, name, abbreviation, colon, comment, eol,
+ body):
+ """Formats an inline type in a struct or bits."""
+ header_row = _Row(
+ 'field', location + [keyword,
+ _concatenate_with_spaces(name, abbreviation) + colon,
+ '', '', comment])
+ return [_Block([], header_row, eol + body)]
+
+
+@_formats('conditional-struct-field-block -> "if" expression ":" Comment? eol'
+ ' Indent unconditional-struct-field+'
+ ' Dedent')
+@_formats('conditional-bits-field-block -> "if" expression ":" Comment? eol'
+ ' Indent unconditional-bits-field+'
+ ' Dedent')
+@_formats('conditional-anonymous-bits-field-block ->'
+ ' "if" expression ":" Comment? eol'
+ ' Indent unconditional-anonymous-bits-field+ Dedent')
+def _conditional_field(if_, condition, colon, comment, eol, indent, body,
+ dedent):
+ """Formats an `if` construct."""
+ del indent, dedent # Unused
+ # The body of an 'if' should be columnized with the surrounding blocks, so
+ # much like an inline 'bits', its body is treated as an inline list of blocks.
+ header_row = _Row('if',
+ ['{} {}{} {}'.format(if_, condition, colon, comment)])
+ indented_body = _indent_blocks(body)
+ assert indented_body, 'Expected body of if condition.'
+ return [_Block([header_row] + eol + indented_body[0].prefix,
+ indented_body[0].header,
+ indented_body[0].body)] + indented_body[1:]
+
+
+_InlineBitsBodyType = collections.namedtuple('InlineBitsBodyType',
+ ['header_lines', 'field_blocks'])
+
+
+@_formats('anonymous-bits-body ->'
+ ' Indent attribute-line* anonymous-bits-field-block Dedent')
+def _inline_bits_body(indent, attributes, fields, dedent):
+ del indent, dedent # Unused
+ return _InlineBitsBodyType(header_lines=_indent_rows(attributes),
+ field_blocks=_indent_blocks(fields))
+
+
+@_formats_with_config(
+ 'enum-body -> Indent doc-line* attribute-line* enum-value+'
+ ' Dedent')
+def _enum_body(indent, docs, attributes, values, dedent, config):
+ del indent, dedent # Unused
+ spacing = [_Row('value-separator')] if _should_add_blank_lines(values) else []
+ columnized_values = _columnize(values, config.indent_width)
+ return _indent_rows(_intersperse(spacing,
+ [docs, attributes] + columnized_values))
+
+
+@_formats('enum-value* -> enum-value enum-value*')
+@_formats('enum-value+ -> enum-value enum-value*')
+def _enum_values(value, block):
+ return value + block
+
+
+@_formats('enum-value -> constant-name "=" expression doc? Comment? eol'
+ ' enum-value-body?')
+def _enum_value(name, equals, value, docs, comment, eol, body):
+ return [_Block([], _Row('enum-value', [name, equals, value, docs, comment]),
+ eol + body)]
+
+
+@_formats('enum-value-body -> Indent doc-line* Dedent')
+def _enum_value_body(indent, docs, dedent):
+ del indent, dedent # Unused
+ return _indent_rows(docs)
+
+
+@_formats('external-body -> Indent doc-line* attribute-line* Dedent')
+def _external_body(indent, docs, attributes, dedent):
+ del indent, dedent # Unused
+ return _indent_rows(_intersperse([_Row('section-break')], [docs, attributes]))
+
+
+@_formats('comment-line -> Comment? "\\n"')
+def _comment_line(comment, eol):
+ del eol # Unused
+ if comment:
+ return [_Row('comment', [comment])]
+ else:
+ return [_Row('comment')]
+
+
+@_formats('eol -> "\\n" comment-line*')
+def _eol(eol, comments):
+ del eol # Unused
+ return _strip_empty_leading_trailing_comment_lines(comments)
+
+
+def _strip_empty_leading_trailing_comment_lines(comments):
+ first_non_empty_line = None
+ last_non_empty_line = None
+ for i, comment in enumerate(comments):
+ if comment.columns:
+ if first_non_empty_line is None:
+ first_non_empty_line = i
+ last_non_empty_line = i
+ if first_non_empty_line is None:
+ return []
+ else:
+ return comments[first_non_empty_line:last_non_empty_line + 1]
+
+
+@_formats('attribute-line* -> ')
+@_formats('anonymous-bits-field-block -> ')
+@_formats('bits-field-block -> ')
+@_formats('comment-line* -> ')
+@_formats('doc-line* -> ')
+@_formats('enum-value* -> ')
+@_formats('enum-value-body? -> ')
+@_formats('field-body? -> ')
+@_formats('import-line* -> ')
+@_formats('struct-field-block -> ')
+@_formats('type-definition* -> ')
+@_formats('unconditional-anonymous-bits-field* -> ')
+@_formats('unconditional-bits-field* -> ')
+@_formats('unconditional-struct-field* -> ')
+def _empty_list():
+ return []
+
+
+@_formats('abbreviation? -> ')
+@_formats('additive-expression-right* -> ')
+@_formats('and-expression-right* -> ')
+@_formats('argument-list -> ')
+@_formats('array-length-specifier* -> ')
+@_formats('attribute* -> ')
+@_formats('attribute-context? -> ')
+@_formats('comma-then-expression* -> ')
+@_formats('Comment? -> ')
+@_formats('"$default"? -> ')
+@_formats('delimited-argument-list? -> ')
+@_formats('delimited-parameter-definition-list? -> ')
+@_formats('doc? -> ')
+@_formats('equality-expression-right* -> ')
+@_formats('equality-or-greater-expression-right* -> ')
+@_formats('equality-or-less-expression-right* -> ')
+@_formats('field-reference-tail* -> ')
+@_formats('or-expression-right* -> ')
+@_formats('parameter-definition-list -> ')
+@_formats('parameter-definition-list-tail* -> ')
+@_formats('times-expression-right* -> ')
+@_formats('type-size-specifier? -> ')
+def _empty_string():
+ return ''
+
+
+@_formats('abbreviation? -> abbreviation')
+@_formats('additive-operator -> "-"')
+@_formats('additive-operator -> "+"')
+@_formats('and-operator -> "&&"')
+@_formats('attribute-context? -> attribute-context')
+@_formats('attribute-value -> expression')
+@_formats('attribute-value -> string-constant')
+@_formats('boolean-constant -> BooleanConstant')
+@_formats('bottom-expression -> boolean-constant')
+@_formats('bottom-expression -> builtin-reference')
+@_formats('bottom-expression -> constant-reference')
+@_formats('bottom-expression -> field-reference')
+@_formats('bottom-expression -> numeric-constant')
+@_formats('builtin-field-word -> "$max_size_in_bits"')
+@_formats('builtin-field-word -> "$max_size_in_bytes"')
+@_formats('builtin-field-word -> "$min_size_in_bits"')
+@_formats('builtin-field-word -> "$min_size_in_bytes"')
+@_formats('builtin-field-word -> "$size_in_bits"')
+@_formats('builtin-field-word -> "$size_in_bytes"')
+@_formats('builtin-reference -> builtin-word')
+@_formats('builtin-word -> "$is_statically_sized"')
+@_formats('builtin-word -> "$static_size_in_bits"')
+@_formats('choice-expression -> logical-expression')
+@_formats('Comment? -> Comment')
+@_formats('comparison-expression -> additive-expression')
+@_formats('constant-name -> constant-word')
+@_formats('constant-reference -> constant-reference-tail')
+@_formats('constant-reference-tail -> constant-word')
+@_formats('constant-word -> ShoutyWord')
+@_formats('"$default"? -> "$default"')
+@_formats('delimited-argument-list? -> delimited-argument-list')
+@_formats('doc? -> doc')
+@_formats('doc -> Documentation')
+@_formats('enum-value-body? -> enum-value-body')
+@_formats('equality-operator -> "=="')
+@_formats('equality-or-greater-expression-right -> equality-expression-right')
+@_formats('equality-or-greater-expression-right -> greater-expression-right')
+@_formats('equality-or-less-expression-right -> equality-expression-right')
+@_formats('equality-or-less-expression-right -> less-expression-right')
+@_formats('expression -> choice-expression')
+@_formats('field-body? -> field-body')
+@_formats('function-name -> "$lower_bound"')
+@_formats('function-name -> "$present"')
+@_formats('function-name -> "$max"')
+@_formats('function-name -> "$upper_bound"')
+@_formats('greater-operator -> ">="')
+@_formats('greater-operator -> ">"')
+@_formats('inequality-operator -> "!="')
+@_formats('less-operator -> "<="')
+@_formats('less-operator -> "<"')
+@_formats('logical-expression -> and-expression')
+@_formats('logical-expression -> comparison-expression')
+@_formats('logical-expression -> or-expression')
+@_formats('multiplicative-operator -> "*"')
+@_formats('negation-expression -> bottom-expression')
+@_formats('numeric-constant -> Number')
+@_formats('or-operator -> "||"')
+@_formats('snake-name -> snake-word')
+@_formats('snake-reference -> builtin-field-word')
+@_formats('snake-reference -> snake-word')
+@_formats('snake-word -> SnakeWord')
+@_formats('string-constant -> String')
+@_formats('type-definition -> bits')
+@_formats('type-definition -> enum')
+@_formats('type-definition -> external')
+@_formats('type-definition -> struct')
+@_formats('type-name -> type-word')
+@_formats('type-reference-tail -> type-word')
+@_formats('type-reference -> type-reference-tail')
+@_formats('type-size-specifier? -> type-size-specifier')
+@_formats('type-word -> CamelWord')
+@_formats('unconditional-anonymous-bits-field -> field')
+@_formats('unconditional-anonymous-bits-field -> inline-bits-field-definition')
+@_formats('unconditional-anonymous-bits-field -> inline-enum-field-definition')
+@_formats('unconditional-bits-field -> unconditional-anonymous-bits-field')
+@_formats('unconditional-bits-field -> virtual-field')
+@_formats('unconditional-struct-field -> anonymous-bits-field-definition')
+@_formats('unconditional-struct-field -> field')
+@_formats('unconditional-struct-field -> inline-bits-field-definition')
+@_formats('unconditional-struct-field -> inline-enum-field-definition')
+@_formats('unconditional-struct-field -> inline-struct-field-definition')
+@_formats('unconditional-struct-field -> virtual-field')
+def _identity(x):
+ return x
+
+
+@_formats('argument-list -> expression comma-then-expression*')
+@_formats('times-expression -> negation-expression times-expression-right*')
+@_formats('type -> type-reference delimited-argument-list? type-size-specifier?'
+ ' array-length-specifier*')
+@_formats('array-length-specifier -> "[" expression "]"')
+@_formats('array-length-specifier* -> array-length-specifier'
+ ' array-length-specifier*')
+@_formats('type-size-specifier -> ":" numeric-constant')
+@_formats('attribute-context -> "(" snake-word ")"')
+@_formats('constant-reference -> snake-reference "." constant-reference-tail')
+@_formats('constant-reference-tail -> type-word "." constant-reference-tail')
+@_formats('constant-reference-tail -> type-word "." snake-reference')
+@_formats('type-reference-tail -> type-word "." type-reference-tail')
+@_formats('field-reference -> snake-reference field-reference-tail*')
+@_formats('abbreviation -> "(" snake-word ")"')
+@_formats('additive-expression-right -> additive-operator times-expression')
+@_formats('additive-expression-right* -> additive-expression-right'
+ ' additive-expression-right*')
+@_formats('additive-expression -> times-expression additive-expression-right*')
+@_formats('array-length-specifier -> "[" "]"')
+@_formats('delimited-argument-list -> "(" argument-list ")"')
+@_formats('delimited-parameter-definition-list? ->'
+ ' delimited-parameter-definition-list')
+@_formats('delimited-parameter-definition-list ->'
+ ' "(" parameter-definition-list ")"')
+@_formats('parameter-definition-list -> parameter-definition'
+ ' parameter-definition-list-tail*')
+@_formats('parameter-definition-list-tail* -> parameter-definition-list-tail'
+ ' parameter-definition-list-tail*')
+@_formats('times-expression-right -> multiplicative-operator'
+ ' negation-expression')
+@_formats('times-expression-right* -> times-expression-right'
+ ' times-expression-right*')
+@_formats('field-reference-tail -> "." snake-reference')
+@_formats('field-reference-tail* -> field-reference-tail field-reference-tail*')
+@_formats('negation-expression -> additive-operator bottom-expression')
+@_formats('type-reference -> snake-word "." type-reference-tail')
+@_formats('bottom-expression -> "(" expression ")"')
+@_formats('bottom-expression -> function-name "(" argument-list ")"')
+@_formats('comma-then-expression* -> comma-then-expression'
+ ' comma-then-expression*')
+@_formats('or-expression-right* -> or-expression-right or-expression-right*')
+@_formats('less-expression-right-list -> equality-expression-right*'
+ ' less-expression-right'
+ ' equality-or-less-expression-right*')
+@_formats('or-expression-right+ -> or-expression-right or-expression-right*')
+@_formats('and-expression -> comparison-expression and-expression-right+')
+@_formats('comparison-expression -> additive-expression'
+ ' greater-expression-right-list')
+@_formats('comparison-expression -> additive-expression'
+ ' equality-expression-right+')
+@_formats('or-expression -> comparison-expression or-expression-right+')
+@_formats('equality-expression-right+ -> equality-expression-right'
+ ' equality-expression-right*')
+@_formats('and-expression-right* -> and-expression-right and-expression-right*')
+@_formats('equality-or-greater-expression-right* ->'
+ ' equality-or-greater-expression-right'
+ ' equality-or-greater-expression-right*')
+@_formats('and-expression-right+ -> and-expression-right and-expression-right*')
+@_formats('equality-or-less-expression-right* ->'
+ ' equality-or-less-expression-right'
+ ' equality-or-less-expression-right*')
+@_formats('equality-expression-right* -> equality-expression-right'
+ ' equality-expression-right*')
+@_formats('greater-expression-right-list ->'
+ ' equality-expression-right* greater-expression-right'
+ ' equality-or-greater-expression-right*')
+@_formats('comparison-expression -> additive-expression'
+ ' less-expression-right-list')
+def _concatenate(*elements):
+ """Concatenates all arguments with no delimiters."""
+ return ''.join(elements)
+
+
+@_formats('equality-expression-right -> equality-operator additive-expression')
+@_formats('less-expression-right -> less-operator additive-expression')
+@_formats('greater-expression-right -> greater-operator additive-expression')
+@_formats('or-expression-right -> or-operator comparison-expression')
+@_formats('and-expression-right -> and-operator comparison-expression')
+def _concatenate_with_prefix_spaces(*elements):
+ return ''.join(' ' + element for element in elements if element)
+
+
+@_formats('attribute* -> attribute attribute*')
+@_formats('comma-then-expression -> "," expression')
+@_formats('comparison-expression -> additive-expression inequality-operator'
+ ' additive-expression')
+@_formats('choice-expression -> logical-expression "?" logical-expression'
+ ' ":" logical-expression')
+@_formats('parameter-definition-list-tail -> "," parameter-definition')
+def _concatenate_with_spaces(*elements):
+ return _concatenate_with(' ', *elements)
+
+
+def _concatenate_with(joiner, *elements):
+ return joiner.join(element for element in elements if element)
+
+
+@_formats('attribute-line* -> attribute-line attribute-line*')
+@_formats('comment-line* -> comment-line comment-line*')
+@_formats('doc-line* -> doc-line doc-line*')
+@_formats('import-line* -> import-line import-line*')
+def _concatenate_lists(head, tail):
+ return head + tail
+
+
+_check_productions()
diff --git a/front_end/format_emb_test.py b/front_end/format_emb_test.py
new file mode 100644
index 0000000..75ce19d
--- /dev/null
+++ b/front_end/format_emb_test.py
@@ -0,0 +1,192 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for front_end.format_emb."""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import pkgutil
+import re
+import sys
+
+import unittest
+from front_end import format_emb
+from front_end import module_ir
+from front_end import parser
+from front_end import tokenizer
+
+
+class SanityCheckerTest(unittest.TestCase):
+
+ def test_text_does_not_tokenize(self):
+ self.assertTrue(format_emb.sanity_check_format_result("-- doc", "~ bad"))
+
+ def test_original_text_does_not_tokenize(self):
+ self.assertTrue(format_emb.sanity_check_format_result("~ bad", "-- doc"))
+
+ def test_text_matches(self):
+ self.assertFalse(format_emb.sanity_check_format_result("-- doc", "-- doc"))
+
+ def test_text_has_extra_eols(self):
+ self.assertFalse(
+ format_emb.sanity_check_format_result("-- doc\n\n-- doc",
+ "-- doc\n\n\n-- doc"))
+
+ def test_text_has_fewer_eols(self):
+ self.assertFalse(format_emb.sanity_check_format_result("-- doc\n\n-- doc",
+ "-- doc\n-- doc"))
+
+ def test_original_text_has_leading_eols(self):
+ self.assertFalse(format_emb.sanity_check_format_result("\n\n-- doc\n",
+ "-- doc\n"))
+
+ def test_original_text_has_extra_doc_whitespace(self):
+ self.assertFalse(format_emb.sanity_check_format_result("-- doc \n",
+ "-- doc\n"))
+
+ def test_comments_differ(self):
+ self.assertTrue(format_emb.sanity_check_format_result("#c\n-- doc\n",
+ "#d\n-- doc\n"))
+
+ def test_comment_missing(self):
+ self.assertTrue(format_emb.sanity_check_format_result("#c\n-- doc\n",
+ "\n-- doc\n"))
+
+ def test_comment_added(self):
+ self.assertTrue(format_emb.sanity_check_format_result("\n-- doc\n",
+ "#d\n-- doc\n"))
+
+ def test_token_text_differs(self):
+ self.assertTrue(format_emb.sanity_check_format_result("-- doc\n",
+ "-- bad doc\n"))
+
+ def test_token_type_differs(self):
+ self.assertTrue(format_emb.sanity_check_format_result("-- doc\n",
+ "abc\n"))
+
+ def test_eol_missing(self):
+ self.assertTrue(format_emb.sanity_check_format_result("abc\n-- doc\n",
+ "abc -- doc\n"))
+
+
+class FormatEmbTest(unittest.TestCase):
+ pass
+
+
+def _make_golden_file_tests():
+ """Generates test cases from the golden files in the resource bundle."""
+
+ package = "testdata.format"
+ path_prefix = ""
+
+ def make_test_case(name, unformatted_text, expected_text, indent_width):
+
+ def test_case(self):
+ self.maxDiff = 100000
+ unformatted_tokens, errors = tokenizer.tokenize(unformatted_text, name)
+ self.assertFalse(errors)
+ parsed_unformatted = parser.parse_module(unformatted_tokens)
+ self.assertFalse(parsed_unformatted.error)
+ formatted_text = format_emb.format_emboss_parse_tree(
+ parsed_unformatted.parse_tree,
+ format_emb.Config(indent_width=indent_width))
+ self.assertEqual(expected_text, formatted_text)
+ annotated_text = format_emb.format_emboss_parse_tree(
+ parsed_unformatted.parse_tree,
+ format_emb.Config(indent_width=indent_width, show_line_types=True))
+ self.assertEqual(expected_text, re.sub(r"^.*?\|", "", annotated_text,
+ flags=re.MULTILINE))
+ self.assertFalse(re.search("^[^|]+$", annotated_text, flags=re.MULTILINE))
+
+ return test_case
+
+ all_unformatted_texts = []
+
+ for filename in (
+ "abbreviations",
+ "anonymous_bits_formatting",
+ "arithmetic_expressions",
+ "array_length",
+ "attributes",
+ "choice_expression",
+ "comparison_expressions",
+ "conditional_field_formatting",
+ "conditional_inline_bits_formatting",
+ "dotted_names",
+ "empty",
+ "enum_value_bodies",
+ "enum_values_aligned",
+ "equality_expressions",
+ "external",
+ "extra_newlines",
+ "fields_aligned",
+ "functions",
+ "header_and_type",
+ "indent",
+ "inline_attributes_get_a_column",
+ "inline_bits",
+ "inline_documentation_gets_a_column",
+ "inline_enum",
+ "inline_struct",
+ "lines_not_spaced_out_with_excess_trailing_noise_lines",
+ "lines_not_spaced_out_with_not_enough_noise_lines",
+ "lines_spaced_out_with_noise_lines",
+ "logical_expressions",
+ "multiline_ifs",
+ "multiple_header_sections",
+ "nested_types_are_columnized_independently",
+ "one_type",
+ "parameterized_struct",
+ "sanity_check",
+ "spacing_between_types",
+ "trailing_spaces",
+ "virtual_fields"):
+ for suffix, width in ((".emb.formatted", 2),
+ (".emb.formatted_indent_4", 4)):
+ unformatted_name = path_prefix + filename + ".emb"
+ expected_name = path_prefix + filename + suffix
+ unformatted_text = pkgutil.get_data(package,
+ unformatted_name).decode("utf-8")
+ expected_text = pkgutil.get_data(package, expected_name).decode("utf-8")
+ setattr(FormatEmbTest, "test {} indent {}".format(filename, width),
+ make_test_case(filename, unformatted_text, expected_text, width))
+
+ all_unformatted_texts.append(unformatted_text)
+
+ def test_all_productions_used(self):
+ used_productions = set()
+ for unformatted_text in all_unformatted_texts:
+ unformatted_tokens, errors = tokenizer.tokenize(unformatted_text, "")
+ self.assertFalse(errors)
+ parsed_unformatted = parser.parse_module(unformatted_tokens)
+ self.assertFalse(parsed_unformatted.error)
+ format_emb.format_emboss_parse_tree(parsed_unformatted.parse_tree,
+ format_emb.Config(), used_productions)
+ unused_productions = set(module_ir.PRODUCTIONS) - used_productions
+ if unused_productions:
+ print("Used production total:", len(used_productions), file=sys.stderr)
+ for production in unused_productions:
+ print("Unused production:", str(production), file=sys.stderr)
+ print("Total:", len(unused_productions), file=sys.stderr)
+ self.assertEqual(set(module_ir.PRODUCTIONS), used_productions)
+
+ FormatEmbTest.testAllProductionsUsed = test_all_productions_used
+
+
+_make_golden_file_tests()
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/front_end/generate_grammar_md.py b/front_end/generate_grammar_md.py
new file mode 100644
index 0000000..f4b7d58
--- /dev/null
+++ b/front_end/generate_grammar_md.py
@@ -0,0 +1,236 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Generates a Markdown file documenting the raw Emboss grammar."""
+
+from __future__ import print_function
+
+import re
+import sys
+
+from front_end import constraints
+from front_end import module_ir
+from front_end import tokenizer
+
+# Keep the output to less than 80 columns, so that the preformatted sections are
+# not cut off.
+_MAX_OUTPUT_WIDTH = 80
+
+_HEADER = """
+This is the context-free grammar for Emboss. Terminal symbols are in `"quotes"`
+or are named in `CamelCase`; nonterminal symbols are named in `snake_case`. The
+term `<empty>` to the right of the `->` indicates an empty production (a rule
+where the left-hand-side may be parsed from an empty string).
+
+This listing is auto-generated from the grammar defined in `module_ir.py`.
+
+Note that, unlike in many languages, comments are included in the grammar. This
+is so that comments can be handled more easily by the autoformatter; comments
+are ignored by the compiler. This is distinct from *documentation*, which is
+included in the IR for use by documentation generators.
+
+""".lstrip()
+
+_BOILERPLATE_PRODUCTION_HEADER = """
+The following productions are automatically generated to handle zero-or-more,
+one-or-more, and zero-or-one repeated lists (`foo*`, `foo+`, and `foo?`
+nonterminals) in LR(1). They are included for completeness, but may be ignored
+if you just want to understand the grammar.
+
+"""
+
+_TOKENIZER_RULE_HEADER = """
+The following regexes are used to tokenize input into the corresponding symbols.
+Note that the `Indent`, `Dedent`, and `EndOfLine` symbols are generated using
+separate logic.
+
+"""
+
+_KEYWORDS_HEADER = """
+The following {} keywords are reserved, but not used, by Emboss. They may not
+be used as field, type, or enum value names.
+
+"""
+
+
+def _sort_productions(productions, start_symbol):
+  """Sorts the given productions in a human-friendly order."""
+  productions_by_lhs = {}
+  for p in productions:
+    if p.lhs not in productions_by_lhs:
+      productions_by_lhs[p.lhs] = set()
+    productions_by_lhs[p.lhs].add(p)
+
+  queue = [start_symbol]
+  previously_queued_symbols = set(queue)
+  main_production_list = []
+  # This sorts productions depth-first. I'm not sure if it is better to sort
+  # them breadth-first or depth-first, or with some hybrid.
+  while queue:
+    symbol = queue.pop(-1)
+    if symbol not in productions_by_lhs:
+      continue
+    for production in sorted(productions_by_lhs[symbol]):
+      main_production_list.append(production)
+      for rhs_symbol in production.rhs:
+        # Skip boilerplate productions for now, but include their base
+        # production.
+        if rhs_symbol and rhs_symbol[-1] in "*+?":
+          rhs_symbol = rhs_symbol[0:-1]
+        if rhs_symbol not in previously_queued_symbols:
+          queue.append(rhs_symbol)
+          previously_queued_symbols.add(rhs_symbol)
+
+  # It's not particularly important to put boilerplate productions in any
+  # particular order.
+  boilerplate_production_list = sorted(
+      set(productions) - set(main_production_list))
+  for production in boilerplate_production_list:
+    assert production.lhs[-1] in "*+?", "Found orphaned production {}".format(
+        production.lhs)
+  assert set(productions) == set(
+      main_production_list + boilerplate_production_list)
+  assert len(productions) == len(main_production_list) + len(
+      boilerplate_production_list)
+  return main_production_list, boilerplate_production_list
+
+
+def _word_wrap_at_column(words, width):
+ """Wraps words to the specified width, and returns a list of wrapped lines."""
+ result = []
+ in_progress = []
+ for word in words:
+ if len(" ".join(in_progress + [word])) > width:
+ result.append(" ".join(in_progress))
+ assert len(result[-1]) <= width
+ in_progress = []
+ in_progress.append(word)
+ result.append(" ".join(in_progress))
+ assert len(result[-1]) <= width
+ return result
+
+
+def _format_productions(productions):
+ """Formats a list of productions for inclusion in a Markdown document."""
+ max_lhs_len = max([len(production.lhs) for production in productions])
+
+ # TODO(bolms): This highlighting is close for now, but not actually right.
+ result = ["```shell\n"]
+ last_lhs = None
+ for production in productions:
+ if last_lhs == production.lhs:
+ lhs = ""
+ delimiter = " |"
+ else:
+ lhs = production.lhs
+ delimiter = "->"
+ leader = "{lhs:{width}} {delimiter}".format(
+ lhs=lhs,
+ width=max_lhs_len,
+ delimiter=delimiter)
+ for rhs_block in _word_wrap_at_column(
+ production.rhs or ["<empty>"], _MAX_OUTPUT_WIDTH - len(leader)):
+ result.append("{leader} {rhs}\n".format(leader=leader, rhs=rhs_block))
+ leader = " " * len(leader)
+ last_lhs = production.lhs
+ result.append("```\n")
+ return "".join(result)
+
+
+def _normalize_literal_patterns(literals):
+ """Normalizes a list of strings to a list of (regex, symbol) pairs."""
+ return [(re.sub(r"(\W)", r"\\\1", literal), '"' + literal + '"')
+ for literal in literals]
+
+
+def _normalize_regex_patterns(regexes):
+ """Normalizes a list of tokenizer regexes to a list of (regex, symbol)."""
+ # g3doc breaks up patterns containing '|' when they are inserted into a table,
+ # unless they're preceded by '\'. Note that other special characters,
+ # including '\', should *not* be escaped with '\'.
+ return [(re.sub(r"\|", r"\\|", r.regex.pattern), r.symbol) for r in regexes]
+
+
+def _normalize_reserved_word_list(reserved_words):
+ """Returns words that would be allowed as names if they were not reserved."""
+ interesting_reserved_words = []
+ for word in reserved_words:
+ tokens, errors = tokenizer.tokenize(word, "")
+ assert tokens and not errors, "Failed to tokenize " + word
+ if tokens[0].symbol in ["SnakeWord", "CamelWord", "ShoutyWord"]:
+ interesting_reserved_words.append(word)
+ return sorted(interesting_reserved_words)
+
+
+def _format_token_rules(token_rules):
+ """Formats a list of (pattern, symbol) pairs as a table."""
+ pattern_width = max([len(rule[0]) for rule in token_rules])
+ pattern_width += 2 # For the `` characters.
+ result = ["{pat_header:{width}} | Symbol\n"
+ "{empty:-<{width}} | {empty:-<30}\n".format(pat_header="Pattern",
+ width=pattern_width,
+ empty="")]
+ for rule in token_rules:
+ if rule[1]:
+ symbol_name = "`" + rule[1] + "`"
+ else:
+ symbol_name = "*no symbol emitted*"
+ result.append(
+ "{pattern:{width}} | {symbol}\n".format(pattern="`" + rule[0] + "`",
+ width=pattern_width,
+ symbol=symbol_name))
+ return "".join(result)
+
+
+def _format_keyword_list(reserved_words):
+  """Formats a list of reserved words."""
+  lines = []
+  current_line = ""
+  for word in reserved_words:
+    if len(current_line) + len(word) + 2 > _MAX_OUTPUT_WIDTH:
+      lines.append(current_line)
+      current_line = ""
+    current_line += "`{}` ".format(word)
+  return "".join([line[:-1] + "\n" for line in lines + [current_line] if line])
+
+
+def generate_grammar_md():
+ """Generates up-to-date text for grammar.md."""
+ main_productions, boilerplate_productions = _sort_productions(
+ module_ir.PRODUCTIONS, module_ir.START_SYMBOL)
+ result = [_HEADER, _format_productions(main_productions),
+ _BOILERPLATE_PRODUCTION_HEADER,
+ _format_productions(boilerplate_productions)]
+
+ main_tokens = _normalize_literal_patterns(tokenizer.LITERAL_TOKEN_PATTERNS)
+ main_tokens += _normalize_regex_patterns(tokenizer.REGEX_TOKEN_PATTERNS)
+ result.append(_TOKENIZER_RULE_HEADER)
+ result.append(_format_token_rules(main_tokens))
+
+ reserved_words = _normalize_reserved_word_list(
+ constraints.get_reserved_word_list())
+ result.append(_KEYWORDS_HEADER.format(len(reserved_words)))
+ result.append(_format_keyword_list(reserved_words))
+
+ return "".join(result)
+
+
+def main(argv):
+ del argv # Unused.
+ print(generate_grammar_md(), end="")
+ return 0
+
+
+if __name__ == "__main__":
+ sys.exit(main(sys.argv))
diff --git a/front_end/glue.py b/front_end/glue.py
new file mode 100644
index 0000000..3a2f0fa
--- /dev/null
+++ b/front_end/glue.py
@@ -0,0 +1,369 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Main driver for the Emboss front-end.
+
+The parse_emboss_file function performs a complete parse of the specified file,
+and returns an IR or formatted error message.
+"""
+
+import collections
+import pkgutil
+
+from front_end import attribute_checker
+from front_end import constraints
+from front_end import dependency_checker
+from front_end import expression_bounds
+from front_end import lr1
+from front_end import module_ir
+from front_end import parser
+from front_end import symbol_resolver
+from front_end import synthetics
+from front_end import tokenizer
+from front_end import type_check
+from front_end import write_inference
+from public import ir_pb2
+from util import error
+from util import parser_types
+
+_IrDebugInfo = collections.namedtuple("IrDebugInfo", ["ir", "debug_info",
+ "errors"])
+
+
+class DebugInfo(object):
+  """Debug information about Emboss parsing."""
+  __slots__ = ("modules",)
+
+  def __init__(self):
+    self.modules = {}
+
+  def __eq__(self, other):
+    return self.modules == other.modules
+
+  def __ne__(self, other):
+    return not self == other
+
+
+class ModuleDebugInfo(object):
+ """Debug information about the parse of a single file.
+
+ Attributes:
+ file_name: The name of the file from which this module came.
+ tokens: The tokenization of this module's source text.
+ parse_tree: The raw parse tree for this module.
+ ir: The intermediate representation of this module, before additional
+ processing such as symbol resolution.
+ used_productions: The set of grammar productions used when parsing this
+ module.
+ source_code: The source text of the module.
+ """
+ __slots__ = ("file_name", "tokens", "parse_tree", "ir", "used_productions",
+ "source_code")
+
+ def __init__(self, file_name):
+ self.file_name = file_name
+ self.tokens = None
+ self.parse_tree = None
+ self.ir = None
+ self.used_productions = None
+ self.source_code = None
+
+ def __eq__(self, other):
+ return (self.file_name == other.file_name and self.tokens == other.tokens
+ and self.parse_tree == other.parse_tree and self.ir == other.ir and
+ self.used_productions == other.used_productions and
+ self.source_code == other.source_code)
+
+ def __ne__(self, other):
+ return not self == other
+
+ def format_tokenization(self):
+ """Renders self.tokens in a human-readable format."""
+ return "\n".join([str(token) for token in self.tokens])
+
+ def format_parse_tree(self, parse_tree=None, indent=""):
+ """Renders self.parse_tree in a human-readable format."""
+ if parse_tree is None:
+ parse_tree = self.parse_tree
+ result = []
+ if isinstance(parse_tree, lr1.Reduction):
+ result.append(indent + parse_tree.symbol)
+ if parse_tree.children:
+ result.append(":\n")
+ for child in parse_tree.children:
+ result.append(self.format_parse_tree(child, indent + " "))
+ else:
+ result.append("\n")
+ else:
+ result.append("{}{}\n".format(indent, parse_tree))
+ return "".join(result)
+
+ def format_module_ir(self):
+ """Renders self.ir in a human-readable format."""
+ return repr(self.ir)
+
+
+def format_production_set(productions):
+ """Renders a set of productions in a human-readable format."""
+ return "\n".join([str(production) for production in sorted(productions)])
+
+
+_cached_modules = {}
+
+
+def parse_module_text(source_code, file_name):
+  """Parses the text of a module, returning a module-level IR.
+
+  Arguments:
+    source_code: The text of the module to parse.
+    file_name: The name of the module's source file (will be included in the
+      resulting IR).
+
+  Returns:
+    (ir, debug_info, errors), where ir is a module-level intermediate
+    representation (IR), prior to import and symbol resolution; debug_info
+    is a ModuleDebugInfo, for debugging the parser; and errors is a list of
+    tokenization or parse errors. Errors are returned in this list rather
+    than raised. If errors is not an empty list, ir will be None; the
+    returned debug_info is still populated as far as parsing progressed.
+  """
+  # This is strictly an optimization to speed up tests, mostly by avoiding the
+  # need to re-parse the prelude for every test .emb.
+  if (source_code, file_name) in _cached_modules:
+    debug_info = _cached_modules[source_code, file_name]
+    ir = ir_pb2.Module()
+    ir.CopyFrom(debug_info.ir)
+  else:
+    debug_info = ModuleDebugInfo(file_name)
+    debug_info.source_code = source_code
+    tokens, errors = tokenizer.tokenize(source_code, file_name)
+    if errors:
+      return _IrDebugInfo(None, debug_info, errors)
+    debug_info.tokens = tokens
+    parse_result = parser.parse_module(tokens)
+    if parse_result.error:
+      return _IrDebugInfo(
+          None,
+          debug_info,
+          [error.make_error_from_parse_error(file_name, parse_result.error)])
+    debug_info.parse_tree = parse_result.parse_tree
+    used_productions = set()
+    ir = module_ir.build_ir(parse_result.parse_tree, used_productions)
+    debug_info.used_productions = used_productions
+    debug_info.ir = ir_pb2.Module()
+    debug_info.ir.CopyFrom(ir)
+    _cached_modules[source_code, file_name] = debug_info
+  ir.source_file_name = file_name
+  return _IrDebugInfo(ir, debug_info, [])
+
+
+def parse_module(file_name, file_reader):
+  """Parses a module, returning a module-level IR.
+
+  Arguments:
+    file_name: The name of the module's source file.
+    file_reader: A callable that returns either:
+      (file_contents, None) or
+      (None, list_of_error_detail_strings)
+
+  Returns:
+    (ir, debug_info, errors), where ir is a module-level intermediate
+    representation (IR), debug_info is a ModuleDebugInfo containing the
+    tokenization, parse tree, and original source text of the module, and
+    errors is a list of errors from reading, tokenizing, or parsing the
+    file. If errors is not an empty list, ir will be None.
+
+    Errors are reported through the errors list rather than raised. If the
+    file cannot be read, both ir and debug_info will be None, and errors
+    will contain a single "Unable to read file." error with attached notes.
+  """
+  source_code, errors = file_reader(file_name)
+  if errors:
+    location = parser_types.make_location((1, 1), (1, 1))
+    return None, None, [
+        [error.error(file_name, location, "Unable to read file.")] +
+        [error.note(file_name, location, e) for e in errors]
+    ]
+  return parse_module_text(source_code, file_name)
+
+
+def get_prelude():
+ """Returns the module IR and debug info of the Emboss Prelude."""
+ return parse_module_text(
+ pkgutil.get_data("front_end",
+ "prelude.emb").decode(encoding="UTF-8"),
+ "")
+
+
+def parse_emboss_file(file_name, file_reader, stop_before_step=None):
+  """Fully parses an .emb, and returns an IR suitable for passing to a back end.
+
+  parse_emboss_file is a convenience function which calls only_parse_emboss_file
+  and process_ir.
+
+  Arguments:
+    file_name: The name of the module's source file.
+    file_reader: A callable that returns either (file_contents, None) or
+      (None, list_of_error_detail_strings).
+    stop_before_step: If set, parse_emboss_file will stop normalizing the IR
+      just before the specified step. This parameter should be None for
+      non-test code.
+
+  Returns:
+    (ir, debug_info, errors), where ir is a complete IR, ready for consumption
+    by an Emboss back end, debug_info is a DebugInfo containing the
+    tokenization, parse tree, and original source text of all modules, and
+    errors is a list of tokenization, parse, or compilation errors. If errors
+    is not an empty list, ir will be None.
+  """
+  ir, debug_info, errors = only_parse_emboss_file(file_name, file_reader)
+  if errors:
+    return _IrDebugInfo(None, debug_info, errors)
+  ir, errors = process_ir(ir, stop_before_step)
+  if errors:
+    return _IrDebugInfo(None, debug_info, errors)
+  return _IrDebugInfo(ir, debug_info, errors)
+
+
+def only_parse_emboss_file(file_name, file_reader):
+  """Parses an .emb, and returns an IR suitable for process_ir.
+
+  only_parse_emboss_file parses the given file and all of its transitive
+  imports, and returns a first-stage intermediate representation, which can be
+  passed to process_ir.
+
+  Arguments:
+    file_name: The name of the module's source file.
+    file_reader: A callable that returns either (file_contents, None) or
+      (None, list_of_error_detail_strings).
+
+  Returns:
+    (ir, debug_info, errors), where ir is an intermediate representation (IR),
+    debug_info is a DebugInfo containing the tokenization, parse tree, and
+    original source text of all modules, and errors is a list of tokenization or
+    parse errors. If errors is not an empty list, ir will be None.
+  """
+  file_queue = [file_name]
+  files = {file_name}
+  debug_info = DebugInfo()
+  ir = ir_pb2.EmbossIr(module=[])
+  while file_queue:
+    file_to_parse = file_queue[0]
+    del file_queue[0]
+    if file_to_parse:
+      module, module_debug_info, errors = parse_module(file_to_parse,
+                                                       file_reader)
+    else:
+      module, module_debug_info, errors = get_prelude()
+    if module_debug_info:
+      debug_info.modules[file_to_parse] = module_debug_info
+    if errors:
+      return _IrDebugInfo(None, debug_info, errors)
+    ir.module.extend([module])  # Proto supports extend but not append here.
+    for import_ in module.foreign_import:
+      if import_.file_name.text not in files:
+        file_queue.append(import_.file_name.text)
+        files.add(import_.file_name.text)
+  return _IrDebugInfo(ir, debug_info, [])
+
+
+def process_ir(ir, stop_before_step):
+  """Turns a first-stage IR into a fully-processed IR.
+
+  process_ir performs all of the semantic processing steps on `ir`: resolving
+  symbols, checking dependencies, adding type annotations, normalizing
+  attributes, etc. process_ir is generally meant to be called with the result
+  of only_parse_emboss_file(), but in theory could be called with a first-stage
+  intermediate representation (IR) from another source.
+
+  Arguments:
+    ir: The IR to process. This structure will be modified during processing.
+    stop_before_step: If set, process_ir will stop normalizing the IR just
+      before the specified step. This parameter should be None for non-test
+      code.
+
+  Returns:
+    (ir, errors), where ir is a complete IR, ready for consumption by an Emboss
+    back end, and errors is a list of compilation errors. If errors is not an
+    empty list, ir will be None.
+  """
+  passes = (synthetics.synthesize_fields,
+            symbol_resolver.resolve_symbols,
+            dependency_checker.find_dependency_cycles,
+            dependency_checker.set_dependency_order,
+            symbol_resolver.resolve_field_references,
+            type_check.annotate_types,
+            type_check.check_types,
+            expression_bounds.compute_constants,
+            attribute_checker.normalize_and_verify,
+            constraints.check_constraints,
+            write_inference.set_write_methods)
+  assert stop_before_step in [None] + [f.__name__ for f in passes], (
+      "Bad value for stop_before_step.")
+  # Some parts of the IR are synthesized from "natural" parts of the IR, before
+  # the natural parts have been fully error checked. Because of this, the
+  # synthesized parts can have errors; in a couple of cases, they can have
+  # errors that show up in an earlier pass than the errors in the natural parts
+  # of the IR. As an example:
+  #
+  #     struct Foo:
+  #       0 [+1] bits:
+  #         0 [+1] Flag flag
+  #       1 [+flag] UInt:8 field
+  #
+  # In this case, the use of `flag` as the size of `field` is incorrect, because
+  # `flag` is a boolean, but the size of a field must be an integer.
+  #
+  # Type checking occurs in two passes: in the first pass, expressions are
+  # checked for internal consistency. In the second pass, expression types are
+  # checked against their location. The use of `flag` would be caught in the
+  # second pass.
+  #
+  # However, the synthesize_fields pass will generate a $size_in_bytes virtual
+  # field that would look like:
+  #
+  #     struct Foo:
+  #       0 [+1] bits:
+  #         0 [+1] Flag flag
+  #       1 [+flag] UInt:8 field
+  #       let $size_in_bytes = $max(true ? 0 + 1 : 0, true ? 1 + flag : 0)
+  #
+  # Since `1 + flag` is not internally consistent, this type error would be
+  # caught in the first pass, and the user would see a very strange error
+  # message that "the right-hand argument of operator `+` must be an integer."
+  #
+  # In order to avoid showing these kinds of errors to the user, we defer any
+  # errors in synthetic parts of the IR. Unless there is a compiler bug, those
+  # errors will show up as errors in the natural parts of the IR, which should
+  # be much more comprehensible to end users.
+  #
+  # If, for some reason, there is an error in the synthetic IR, but no error in
+  # the natural IR, the synthetic errors will be shown. In this case, the
+  # formatting for the synthetic errors will show '[compiler bug]' for the
+  # error location, which (hopefully) will provide the end user with a cue that
+  # the error is a compiler bug.
+  deferred_errors = []
+  for function in passes:
+    if stop_before_step == function.__name__:
+      return (ir, [])
+    errors, hidden_errors = error.split_errors(function(ir))
+    if errors:
+      return (None, errors)
+    deferred_errors.extend(hidden_errors)
+
+  if deferred_errors:
+    return (None, deferred_errors)
+
+  assert stop_before_step is None, "Bad value for stop_before_step."
+  return (ir, [])
diff --git a/front_end/glue_test.py b/front_end/glue_test.py
new file mode 100644
index 0000000..1435922
--- /dev/null
+++ b/front_end/glue_test.py
@@ -0,0 +1,300 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for glue."""
+
+import pkgutil
+import unittest
+
+from front_end import glue
+from front_end import test_util
+from public import ir_pb2
+from util import error
+from util import parser_types
+
+_location = parser_types.make_location
+
+_ROOT_PACKAGE = "testdata.golden"
+_GOLDEN_PATH = ""
+
+_SPAN_SE_LOG_FILE_PATH = _GOLDEN_PATH + "span_se_log_file_status.emb"
+_SPAN_SE_LOG_FILE_EMB = pkgutil.get_data(
+ _ROOT_PACKAGE, _SPAN_SE_LOG_FILE_PATH).decode(encoding="UTF-8")
+_SPAN_SE_LOG_FILE_READER = test_util.dict_file_reader(
+ {_SPAN_SE_LOG_FILE_PATH: _SPAN_SE_LOG_FILE_EMB})
+_SPAN_SE_LOG_FILE_IR = ir_pb2.Module.from_json(
+ pkgutil.get_data(
+ _ROOT_PACKAGE,
+ _GOLDEN_PATH + "span_se_log_file_status.ir.txt"
+ ).decode(encoding="UTF-8"))
+_SPAN_SE_LOG_FILE_PARSE_TREE_TEXT = pkgutil.get_data(
+ _ROOT_PACKAGE,
+ _GOLDEN_PATH + "span_se_log_file_status.parse_tree.txt"
+).decode(encoding="UTF-8")
+_SPAN_SE_LOG_FILE_TOKENIZATION_TEXT = pkgutil.get_data(
+ _ROOT_PACKAGE,
+ _GOLDEN_PATH + "span_se_log_file_status.tokens.txt"
+).decode(encoding="UTF-8")
+
+
+class FrontEndGlueTest(unittest.TestCase):
+ """Tests for front_end.glue."""
+
+ def test_parse_module(self):
+ # parse_module(file) should return the same thing as
+ # parse_module_text(text), assuming file can be read.
+ main_module, debug_info, errors = glue.parse_module(
+ _SPAN_SE_LOG_FILE_PATH, _SPAN_SE_LOG_FILE_READER)
+ main_module2, debug_info2, errors2 = glue.parse_module_text(
+ _SPAN_SE_LOG_FILE_EMB, _SPAN_SE_LOG_FILE_PATH)
+ self.assertEqual([], errors)
+ self.assertEqual([], errors2)
+ self.assertEqual(main_module, main_module2)
+ self.assertEqual(debug_info, debug_info2)
+
+ def test_parse_module_no_such_file(self):
+ file_name = "nonexistent.emb"
+ ir, debug_info, errors = glue.parse_emboss_file(
+ file_name, test_util.dict_file_reader({}))
+ self.assertEqual([[
+ error.error("nonexistent.emb", _location((1, 1), (1, 1)),
+ "Unable to read file."),
+ error.note("nonexistent.emb", _location((1, 1), (1, 1)),
+ "File 'nonexistent.emb' not found."),
+ ]], errors)
+ self.assertFalse(file_name in debug_info.modules)
+ self.assertFalse(ir)
+
+ def test_parse_module_tokenization_error(self):
+ file_name = "tokens.emb"
+ ir, debug_info, errors = glue.parse_emboss_file(
+ file_name, test_util.dict_file_reader({file_name: "@"}))
+ self.assertTrue(debug_info.modules[file_name].source_code)
+ self.assertTrue(errors)
+ self.assertEqual("Unrecognized token", errors[0][0].message)
+ self.assertFalse(ir)
+
+  def test_parse_module_indentation_error(self):
+    file_name = "indent.emb"
+    ir, debug_info, errors = glue.parse_emboss_file(
+        file_name, test_util.dict_file_reader(
+            {file_name: "struct Foo:\n"
+                        "  1 [+1] Int x\n"
+                        " 2 [+1] Int y\n"}))
+    self.assertTrue(debug_info.modules[file_name].source_code)
+    self.assertTrue(errors)
+    self.assertEqual("Bad indentation", errors[0][0].message)
+    self.assertFalse(ir)
+
+ def test_parse_module_parse_error(self):
+ file_name = "parse.emb"
+ ir, debug_info, errors = glue.parse_emboss_file(
+ file_name, test_util.dict_file_reader(
+ {file_name: "struct foo:\n"
+ " 1 [+1] Int x\n"
+ " 3 [+1] Int y\n"}))
+ self.assertTrue(debug_info.modules[file_name].source_code)
+ self.assertEqual([[
+ error.error(file_name, _location((1, 8), (1, 11)),
+ "A type name must be CamelCase.\n"
+ "Found 'foo' (SnakeWord), expected CamelWord.")
+ ]], errors)
+ self.assertFalse(ir)
+
+ def test_parse_error(self):
+ file_name = "parse.emb"
+ ir, debug_info, errors = glue.parse_emboss_file(
+ file_name, test_util.dict_file_reader(
+ {file_name: "struct foo:\n"
+ " 1 [+1] Int x\n"
+ " 2 [+1] Int y\n"}))
+ self.assertTrue(debug_info.modules[file_name].source_code)
+ self.assertEqual([[
+ error.error(file_name, _location((1, 8), (1, 11)),
+ "A type name must be CamelCase.\n"
+ "Found 'foo' (SnakeWord), expected CamelWord.")
+ ]], errors)
+ self.assertFalse(ir)
+
+ def test_circular_dependency_error(self):
+ file_name = "cycle.emb"
+ ir, debug_info, errors = glue.parse_emboss_file(
+ file_name, test_util.dict_file_reader({
+ file_name: "struct Foo:\n"
+ " 0 [+field1] UInt field1\n"
+ }))
+ self.assertTrue(debug_info.modules[file_name].source_code)
+ self.assertTrue(errors)
+ self.assertEqual("Dependency cycle\nfield1", errors[0][0].message)
+ self.assertFalse(ir)
+
+ def test_ir_from_parse_module(self):
+ log_file_path_ir = ir_pb2.Module()
+ log_file_path_ir.CopyFrom(_SPAN_SE_LOG_FILE_IR)
+ log_file_path_ir.source_file_name = _SPAN_SE_LOG_FILE_PATH
+ self.assertEqual(log_file_path_ir, glue.parse_module(
+ _SPAN_SE_LOG_FILE_PATH, _SPAN_SE_LOG_FILE_READER).ir)
+
+ def test_debug_info_from_parse_module(self):
+ debug_info = glue.parse_module(_SPAN_SE_LOG_FILE_PATH,
+ _SPAN_SE_LOG_FILE_READER).debug_info
+ self.maxDiff = 200000
+ self.assertEqual(_SPAN_SE_LOG_FILE_TOKENIZATION_TEXT.strip(),
+ debug_info.format_tokenization().strip())
+ self.assertEqual(_SPAN_SE_LOG_FILE_PARSE_TREE_TEXT.strip(),
+ debug_info.format_parse_tree().strip())
+ self.assertEqual(_SPAN_SE_LOG_FILE_IR, debug_info.ir)
+ self.assertEqual(repr(_SPAN_SE_LOG_FILE_IR), debug_info.format_module_ir())
+
+ def test_parse_emboss_file(self):
+ # parse_emboss_file calls parse_module, wraps its results, and calls
+ # symbol_resolver.resolve_symbols() on the resulting IR.
+ ir, debug_info, errors = glue.parse_emboss_file(_SPAN_SE_LOG_FILE_PATH,
+ _SPAN_SE_LOG_FILE_READER)
+ module_ir, module_debug_info, module_errors = glue.parse_module(
+ _SPAN_SE_LOG_FILE_PATH, _SPAN_SE_LOG_FILE_READER)
+ self.assertEqual([], errors)
+ self.assertEqual([], module_errors)
+ self.assertTrue(test_util.proto_is_superset(ir.module[0], module_ir))
+ self.assertEqual(module_debug_info,
+ debug_info.modules[_SPAN_SE_LOG_FILE_PATH])
+ self.assertEqual(2, len(debug_info.modules))
+ self.assertEqual(2, len(ir.module))
+ self.assertEqual(_SPAN_SE_LOG_FILE_PATH, ir.module[0].source_file_name)
+ self.assertEqual("", ir.module[1].source_file_name)
+
+ def test_synthetic_error(self):
+ file_name = "missing_byte_order_attribute.emb"
+ ir, unused_debug_info, errors = glue.only_parse_emboss_file(
+ file_name, test_util.dict_file_reader({
+ file_name: "struct Foo:\n"
+ " 0 [+8] UInt field\n"
+ }))
+ self.assertFalse(errors)
+ # Artificially mark the first field as is_synthetic.
+ first_field = ir.module[0].type[0].structure.field[0]
+ first_field.source_location.is_synthetic = True
+ ir, errors = glue.process_ir(ir, None)
+ self.assertTrue(errors)
+ self.assertEqual("Attribute 'byte_order' required on field which is byte "
+ "order dependent.", errors[0][0].message)
+ self.assertTrue(errors[0][0].location.is_synthetic)
+ self.assertFalse(ir)
+
+ def test_suppressed_synthetic_error(self):
+ file_name = "triplicate_symbol.emb"
+ ir, unused_debug_info, errors = glue.only_parse_emboss_file(
+ file_name, test_util.dict_file_reader({
+ file_name: "struct Foo:\n"
+ " 0 [+1] UInt field\n"
+ " 1 [+1] UInt field\n"
+ " 2 [+1] UInt field\n"
+ }))
+ self.assertFalse(errors)
+ # Artificially mark the name of the second field as is_synthetic.
+ second_field = ir.module[0].type[0].structure.field[1]
+ second_field.name.source_location.is_synthetic = True
+ second_field.name.name.source_location.is_synthetic = True
+ ir, errors = glue.process_ir(ir, None)
+ self.assertEqual(1, len(errors))
+ self.assertEqual("Duplicate name 'field'", errors[0][0].message)
+ self.assertFalse(errors[0][0].location.is_synthetic)
+ self.assertFalse(errors[0][1].location.is_synthetic)
+ self.assertFalse(ir)
+
+
+class DebugInfoTest(unittest.TestCase):
+ """Tests for DebugInfo and ModuleDebugInfo classes."""
+
+ def test_debug_info_initialization(self):
+ debug_info = glue.DebugInfo()
+ self.assertEqual({}, debug_info.modules)
+
+ def test_debug_info_invalid_attribute_set(self):
+ debug_info = glue.DebugInfo()
+ with self.assertRaises(AttributeError):
+ debug_info.foo = "foo"
+
+ def test_debug_info_equality(self):
+ debug_info = glue.DebugInfo()
+ debug_info2 = glue.DebugInfo()
+ self.assertEqual(debug_info, debug_info2)
+ debug_info.modules["foo"] = glue.ModuleDebugInfo("foo")
+ self.assertNotEqual(debug_info, debug_info2)
+ debug_info2.modules["foo"] = glue.ModuleDebugInfo("foo")
+ self.assertEqual(debug_info, debug_info2)
+
+ def test_module_debug_info_initialization(self):
+ module_info = glue.ModuleDebugInfo("bar.emb")
+ self.assertEqual("bar.emb", module_info.file_name)
+ self.assertEqual(None, module_info.tokens)
+ self.assertEqual(None, module_info.parse_tree)
+ self.assertEqual(None, module_info.ir)
+ self.assertEqual(None, module_info.used_productions)
+
+ def test_module_debug_info_attribute_set(self):
+ module_info = glue.ModuleDebugInfo("bar.emb")
+ module_info.tokens = "a"
+ module_info.parse_tree = "b"
+ module_info.ir = "c"
+ module_info.used_productions = "d"
+ module_info.source_code = "e"
+ self.assertEqual("a", module_info.tokens)
+ self.assertEqual("b", module_info.parse_tree)
+ self.assertEqual("c", module_info.ir)
+ self.assertEqual("d", module_info.used_productions)
+ self.assertEqual("e", module_info.source_code)
+
+ def test_module_debug_info_bad_attribute_set(self):
+ module_info = glue.ModuleDebugInfo("bar.emb")
+ with self.assertRaises(AttributeError):
+ module_info.foo = "foo"
+
+ def test_module_debug_info_equality(self):
+ module_info = glue.ModuleDebugInfo("foo")
+ module_info2 = glue.ModuleDebugInfo("foo")
+ module_info_bar = glue.ModuleDebugInfo("bar")
+ self.assertEqual(module_info, module_info2)
+ module_info_bar = glue.ModuleDebugInfo("bar")
+ self.assertNotEqual(module_info, module_info_bar)
+ module_info.tokens = []
+ self.assertNotEqual(module_info, module_info2)
+ module_info2.tokens = []
+ self.assertEqual(module_info, module_info2)
+ module_info.parse_tree = []
+ self.assertNotEqual(module_info, module_info2)
+ module_info2.parse_tree = []
+ self.assertEqual(module_info, module_info2)
+ module_info.ir = []
+ self.assertNotEqual(module_info, module_info2)
+ module_info2.ir = []
+ self.assertEqual(module_info, module_info2)
+ module_info.used_productions = []
+ self.assertNotEqual(module_info, module_info2)
+ module_info2.used_productions = []
+ self.assertEqual(module_info, module_info2)
+
+
+class TestFormatProductionSet(unittest.TestCase):
+ """Tests for format_production_set."""
+
+ def test_format_production_set(self):
+ production_texts = ["A -> B", "B -> C", "A -> C", "C -> A"]
+ productions = [parser_types.Production.parse(p) for p in production_texts]
+ self.assertEqual("\n".join(sorted(production_texts)),
+ glue.format_production_set(set(productions)))
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/front_end/lr1.py b/front_end/lr1.py
new file mode 100644
index 0000000..112bf12
--- /dev/null
+++ b/front_end/lr1.py
@@ -0,0 +1,759 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""LR(1) parser generator.
+
+The primary class in this module, Grammar, takes a list of context-free grammar
+productions, and produces the corresponding LR(1) shift-reduce parser. This is
+an implementation of the algorithm on pages 221 and 261-265 of "Compilers:
+Principles, Techniques, & Tools" (Second Edition) by Aho, Lam, Sethi, and
+Ullman, also known as "The Dragon Book," hereafter referred to as "ALSU."
+
+This module only implements the LR(1) algorithms; unlike tools such as yacc, it
+does not implement the various bits of glue necessary to actually use a parser.
+Clients are expected to provide their own tokenizers and handle turning a raw
+parse tree into an intermediate representation on their own.
+"""
+
+import collections
+
+from util import parser_types
+
+
+class Item(collections.namedtuple("Item", ["production", "dot", "terminal",
+ "next_symbol"])):
+ """An Item is an LR(1) Item: a production, a cursor location, and a terminal.
+
+ An Item represents a partially-parsed production, and a lookahead symbol. The
+ position of the dot indicates what portion of the production has been parsed.
+ Generally, Items are an internal implementation detail, but they can be useful
+ elsewhere, particularly for debugging.
+
+ Attributes:
+ production: The Production this Item covers.
+ dot: The index of the "dot" in production's rhs.
+ terminal: The terminal lookahead symbol that follows the production in the
+ input stream.
+ """
+
+ def __str__(self):
+ """__str__ generates ALSU notation."""
+ return (str(self.production.lhs) + " -> " + " ".join(
+ [str(r) for r in self.production.rhs[0:self.dot] + (".",) +
+ self.production.rhs[self.dot:]]) + ", " + str(self.terminal))
+
+ @staticmethod
+ def parse(text):
+ """Parses an Item in ALSU notation.
+
+ Parses an Item from notation like:
+
+ symbol -> foo . bar baz, qux
+
+ where "symbol -> foo bar baz" will be taken as the production, the position
+ of the "." is taken as "dot" (in this case 1), and the symbol after "," is
+ taken as the "terminal". The following are also valid items:
+
+ sym -> ., foo
+ sym -> . foo bar, baz
+ sym -> foo bar ., baz
+
+ Symbols on the right-hand side of the production should be separated by
+ whitespace.
+
+ Arguments:
+ text: The text to parse into an Item.
+
+ Returns:
+ An Item.
+ """
+ production, terminal = text.split(",")
+ terminal = terminal.strip()
+ if terminal == "$":
+ terminal = END_OF_INPUT
+ lhs, rhs = production.split("->")
+ lhs = lhs.strip()
+ if lhs == "S'":
+ lhs = START_PRIME
+ before_dot, after_dot = rhs.split(".")
+ handle = before_dot.split()
+ tail = after_dot.split()
+ return make_item(parser_types.Production(lhs, tuple(handle + tail)),
+ len(handle), terminal)
+
+
+def make_item(production, dot, symbol):
+ return Item(production, dot, symbol,
+ None if dot >= len(production.rhs) else production.rhs[dot])
+
+
+class Conflict(
+ collections.namedtuple("Conflict", ["state", "symbol", "actions"])
+):
+ """Conflict represents a parse conflict."""
+
+ def __str__(self):
+ return "Conflict for {} in state {}: ".format(
+ self.symbol, self.state) + " vs ".join([str(a) for a in self.actions])
+
+
+Shift = collections.namedtuple("Shift", ["state", "items"])
+Reduce = collections.namedtuple("Reduce", ["rule"])
+Accept = collections.namedtuple("Accept", [])
+Error = collections.namedtuple("Error", ["code"])
+
+Symbol = collections.namedtuple("Symbol", ["symbol"])
+
+# START_PRIME is the implicit 'real' root symbol for the grammar.
+START_PRIME = "S'"
+
+# END_OF_INPUT is the implicit symbol at the end of input.
+END_OF_INPUT = "$"
+
+# ANY_TOKEN is used by mark_error as a "wildcard" token that should be replaced
+# by every other token.
+ANY_TOKEN = parser_types.Token(object(), "*",
+ parser_types.parse_location("0:0-0:0"))
+
+
+class Reduction(collections.namedtuple("Reduction",
+ ["symbol", "children", "production",
+ "source_location"])):
+ """A Reduction is a non-leaf node in a parse tree.
+
+ Attributes:
+ symbol: The name of this element in the parse.
+ children: The child elements of this parse.
+ production: The grammar production to which this reduction corresponds.
+ source_location: If known, the range in the source text corresponding to the
+ tokens from which this reduction was parsed. May be 'None' if this
+ reduction was produced from no symbols, or if the tokens fed to `parse`
+ did not include source_location.
+ """
+ pass
+
+
+class Grammar(object):
+ """Grammar is an LR(1) context-free grammar.
+
+ Attributes:
+ start: The start symbol for the grammar.
+ productions: A list of productions in the grammar, including the S' -> start
+ production.
+ symbols: A set of all symbols in the grammar, including $ and S'.
+ nonterminals: A set of all nonterminal symbols in the grammar, including S'.
+ terminals: A set of all terminal symbols in the grammar, including $.
+ """
+
+ def __init__(self, start_symbol, productions):
+ """Constructs a Grammar object.
+
+ Arguments:
+ start_symbol: The start symbol for the grammar.
+ productions: A list of productions (not including the "S' -> start_symbol"
+ production).
+ """
+ object.__init__(self)
+ self.start = start_symbol
+ self._seed_production = parser_types.Production(START_PRIME, (self.start,))
+ self.productions = productions + [self._seed_production]
+
+ self._single_level_closure_of_item_cache = {}
+ self._closure_of_item_cache = {}
+ self._compute_symbols()
+ self._compute_seed_firsts()
+ self._set_productions_by_lhs()
+ self._populate_item_cache()
+
+ def _set_productions_by_lhs(self):
+ # Prepopulating _productions_by_lhs speeds up _closure_of_item by about 30%,
+ # which is significant on medium-to-large grammars.
+ self._productions_by_lhs = {}
+ for production in self.productions:
+ self._productions_by_lhs.setdefault(production.lhs, list()).append(
+ production)
+
+ def _populate_item_cache(self):
+ # There are a relatively small number of possible Items for a grammar, and
+ # the algorithm needs to get Items from their constituent components very
+ # frequently. As it turns out, pre-caching all possible Items results in a
+ # ~35% overall speedup to Grammar.parser().
+ self._item_cache = {}
+ for symbol in self.terminals:
+ for production in self.productions:
+ for dot in range(len(production.rhs) + 1):
+ self._item_cache[production, dot, symbol] = make_item(
+ production, dot, symbol)
+
+ def _compute_symbols(self):
+ """Finds all grammar symbols, and sorts them into terminal and non-terminal.
+
+ Nonterminal symbols are those which appear on the left side of any
+ production. Terminal symbols are those which do not.
+
+ _compute_symbols is used during __init__.
+ """
+ self.symbols = {END_OF_INPUT}
+ self.nonterminals = set()
+ for production in self.productions:
+ self.symbols.add(production.lhs)
+ self.nonterminals.add(production.lhs)
+ for symbol in production.rhs:
+ self.symbols.add(symbol)
+ self.terminals = self.symbols - self.nonterminals
+
+ def _compute_seed_firsts(self):
+ """Computes FIRST (ALSU p221) for all terminal and nonterminal symbols.
+
+ The algorithm for computing FIRST is an iterative one that terminates when
+ it reaches a fixed point (that is, when further iterations stop changing
+ state). _compute_seed_firsts computes the fixed point for all single-symbol
+ strings, by repeatedly calling _first and updating the firsts table with
+ the results.
+
+ Once _compute_seed_firsts has completed, _first will return correct results
+ for both single- and multi-symbol strings.
+
+ _compute_seed_firsts is used during __init__.
+ """
+ self.firsts = {}
+ # FIRST for a terminal symbol is always just that terminal symbol.
+ for terminal in self.terminals:
+ self.firsts[terminal] = set([terminal])
+ for nonterminal in self.nonterminals:
+ self.firsts[nonterminal] = set()
+ while True:
+ # The first iteration picks up all the productions that start with
+ # terminal symbols. The second iteration picks up productions that start
+ # with nonterminals that the first iteration picked up. The third
+ # iteration picks up nonterminals that the first and second picked up, and
+ # so on.
+ #
+ # This is guaranteed to end, in the worst case, when every terminal
+ # symbol and epsilon has been added to the firsts set for every
+ # nonterminal symbol. This would be slow, but requires a pathological
+ # grammar; useful grammars should complete in only a few iterations.
+ firsts_to_add = {}
+ for production in self.productions:
+ for first in self._first(production.rhs):
+ if first not in self.firsts[production.lhs]:
+ if production.lhs not in firsts_to_add:
+ firsts_to_add[production.lhs] = set()
+ firsts_to_add[production.lhs].add(first)
+ if not firsts_to_add:
+ break
+ for symbol in firsts_to_add:
+ self.firsts[symbol].update(firsts_to_add[symbol])
+
+ def _first(self, symbols):
+ """The FIRST function from ALSU p221.
+
+ _first takes a string of symbols (both terminals and nonterminals) and
+ returns the set of terminal symbols which could be the first terminal symbol
+ of a string produced by the given list of symbols.
+
+ _first will not give fully-correct results until _compute_seed_firsts
+ finishes, but is called by _compute_seed_firsts, and must provide partial
+ results during that method's execution.
+
+ Args:
+ symbols: A list of symbols.
+
+ Returns:
+ A set of terminals which could be the first terminal in "symbols."
+ """
+ result = set()
+ all_contain_epsilon = True
+ for symbol in symbols:
+ for first in self.firsts[symbol]:
+ if first:
+ result.add(first)
+ if None not in self.firsts[symbol]:
+ all_contain_epsilon = False
+ break
+ if all_contain_epsilon:
+ # "None" seems like a Pythonic way of representing epsilon (no symbol).
+ result.add(None)
+ return result
+
+ def _closure_of_item(self, root_item):
+ """Modified implementation of CLOSURE from ALSU p261.
+
+ _closure_of_item performs the CLOSURE function with a single seed item, with
+ memoization. In the algorithm as presented in ALSU, CLOSURE is called with
+ a different set of items every time, which is unhelpful for memoization.
+ Instead, we let _parallel_goto merge the sets returned by _closure_of_item,
+ which results in a ~40% speedup.
+
+ CLOSURE, roughly, computes the set of LR(1) Items which might be active when
+ a "seed" set of Items is active.
+
+ Technically, it is the epsilon-closure of the NFA states represented by
+ "items," where an epsilon transition (a transition that does not consume any
+ symbols) occurs from a->Z.bY,q to b->.X,p when p is in FIRST(Yq). (a and b
+ are nonterminals, X, Y, and Z are arbitrary strings of symbols, and p and q
+ are terminals.) That is, it is the set of all NFA states which can be
+ reached from "items" without consuming any input. This set corresponds to a
+ single DFA state.
+
+ Args:
+ root_item: The initial LR(1) Item.
+
+ Returns:
+ A set of LR(1) items which may be active at the time when the provided
+ item is active.
+ """
+ if root_item in self._closure_of_item_cache:
+ return self._closure_of_item_cache[root_item]
+ item_set = set([root_item])
+ item_list = [root_item]
+ i = 0
+ # Each newly-added Item may trigger the addition of further Items, so
+ # iterate until no new Items are added. In the worst case, a new Item will
+ # be added for each production.
+ #
+ # This algorithm is really looking for "next" nonterminals in the existing
+ # items, and adding new items corresponding to their productions.
+ while i < len(item_list):
+ item = item_list[i]
+ i += 1
+ if not item.next_symbol:
+ continue
+ # If _closure_of_item_cache contains the full closure of item, then we can
+ # add its full closure to the result set, and skip checking any of its
+ # items: any item that would be added by any item in the cached result
+ # will already be in the _closure_of_item_cache entry.
+ if item in self._closure_of_item_cache:
+ item_set |= self._closure_of_item_cache[item]
+ continue
+ # Even if we don't have the full closure of item, we may have the
+ # immediate closure of item. It turns out that memoizing just this step
+ # speeds up this function by about 50%, even after the
+ # _closure_of_item_cache check.
+ if item not in self._single_level_closure_of_item_cache:
+ new_items = set()
+ for production in self._productions_by_lhs.get(item.next_symbol, []):
+ for terminal in self._first(item.production.rhs[item.dot + 1:] +
+ (item.terminal,)):
+ new_items.add(self._item_cache[production, 0, terminal])
+ self._single_level_closure_of_item_cache[item] = new_items
+ for new_item in self._single_level_closure_of_item_cache[item]:
+ if new_item not in item_set:
+ item_set.add(new_item)
+ item_list.append(new_item)
+ self._closure_of_item_cache[root_item] = item_set
+ # Typically, _closure_of_item() will be called on items whose closures
+ # bring in the greatest number of additional items, then on items which
+ # close over fewer and fewer other items. Since items are not added to
+ # _closure_of_item_cache unless _closure_of_item() is called directly on
+ # them, this means that it is unlikely that items brought in will (without
+ # intervention) have entries in _closure_of_item_cache, which slows down the
+ # computation of the larger closures.
+ #
+ # Although it is not guaranteed, items added to item_list last will tend to
+ # close over fewer items, and therefore be easier to compute. By forcibly
+ # re-calculating closures from last to first, and adding the results to
+ # _closure_of_item_cache at each step, we get a modest performance
+ # improvement: roughly 50% less time spent in _closure_of_item, which
+ # translates to about 5% less time in parser().
+ for item in item_list[::-1]:
+ self._closure_of_item(item)
+ return item_set
+
+ def _parallel_goto(self, items):
+ """The GOTO function from ALSU p261, executed on all symbols.
+
+ _parallel_goto takes a set of Items, and returns a dict from every symbol in
+ self.symbols to the set of Items that would be active after a shift
+ operation (if symbol is a terminal) or after a reduction operation (if
+ symbol is a nonterminal).
+
+ _parallel_goto is used in lieu of the single-symbol GOTO from ALSU because
+ it eliminates the outer loop over self.terminals, and thereby reduces the
+ number of next_symbol calls by a factor of len(self.terminals).
+
+ Args:
+ items: The set of items representing the initial DFA state.
+
+ Returns:
+ A dict from symbols to sets of items representing the new DFA states.
+ """
+ results = collections.defaultdict(set)
+ for item in items:
+ next_symbol = item.next_symbol
+ if next_symbol is None:
+ continue
+ item = self._item_cache[item.production, item.dot + 1, item.terminal]
+ # Inlining the cache check results in a ~25% speedup in this function, and
+ # about 10% overall speedup to parser().
+ if item in self._closure_of_item_cache:
+ closure = self._closure_of_item_cache[item]
+ else:
+ closure = self._closure_of_item(item)
+ # _closure_of_item adds newly-started Items (Items with dot=0) to the
+ # result set. After this operation, the result set will correspond to the
+ # new state.
+ results[next_symbol].update(closure)
+ return results
+
+ def _items(self):
+ """The items function from ALSU p261.
+
+ _items computes the set of sets of LR(1) items for a shift-reduce parser
+ that matches the grammar. Each set of LR(1) items corresponds to a single
+ DFA state.
+
+ Returns:
+ A tuple.
+
+ The first element of the tuple is a list of sets of LR(1) items (each set
+ corresponding to a DFA state).
+
+ The second element of the tuple is a dictionary from (int, symbol) pairs
+ to ints, where all the ints are indexes into the list of sets of LR(1)
+ items. This dictionary is based on the results of _parallel_goto, where
+ item_sets[dict[i, sym]] == self._parallel_goto(item_sets[i])[sym].
+ """
+ # The list of states is seeded with the marker S' production.
+ item_list = [
+ frozenset(self._closure_of_item(
+ self._item_cache[self._seed_production, 0, END_OF_INPUT]))
+ ]
+ items = {item_list[0]: 0}
+ goto_table = {}
+ i = 0
+ # For each state, figure out the new state reached when each symbol is added
+ # to the top of the parsing stack (see the comments in Parser._parse). See
+ # _parallel_goto for an explanation of how that is actually computed.
+ while i < len(item_list):
+ item_set = item_list[i]
+ gotos = self._parallel_goto(item_set)
+ for symbol, goto in gotos.items():
+ goto = frozenset(goto)
+ if goto not in items:
+ items[goto] = len(item_list)
+ item_list.append(goto)
+ goto_table[i, symbol] = items[goto]
+ i += 1
+ return item_list, goto_table
+
+ def parser(self):
+ """parser returns an LR(1) parser for the Grammar.
+
+ This implements the Canonical LR(1) ("LR(1)") parser algorithm ("Algorithm
+ 4.56", ALSU p265), rather than the more common Lookahead LR(1) ("LALR(1)")
+ algorithm. LALR(1) produces smaller tables, but is more complex and does
+ not cover all LR(1) grammars. When the LR(1) and LALR(1) algorithms were
+ invented, table sizes were an important consideration; now, the difference
+ between a few hundred and a few thousand entries is unlikely to matter.
+
+ At this time, Grammar does not handle ambiguous grammars, which are commonly
+ used to handle precedence, associativity, and the "dangling else" problem.
+ Formally, these can always be handled by an unambiguous grammar, though
+ doing so can be cumbersome, particularly for expression languages with many
+ levels of precedence. ALSU section 4.8 (pp278-287) contains some techniques
+ for handling these kinds of ambiguity.
+
+ Returns:
+ A Parser.
+ """
+ item_sets, goto = self._items()
+ action = {}
+ conflicts = set()
+ end_item = self._item_cache[self._seed_production, 1, END_OF_INPUT]
+ for i in range(len(item_sets)):
+ for item in item_sets[i]:
+ new_action = None
+ if (item.next_symbol is None and
+ item.production != self._seed_production):
+ terminal = item.terminal
+ new_action = Reduce(item.production)
+ elif item.next_symbol in self.terminals:
+ terminal = item.next_symbol
+ assert goto[i, terminal] is not None
+ new_action = Shift(goto[i, terminal], item_sets[goto[i, terminal]])
+ if new_action:
+ if (i, terminal) in action and action[i, terminal] != new_action:
+ conflicts.add(
+ Conflict(i, terminal,
+ frozenset([action[i, terminal], new_action])))
+ action[i, terminal] = new_action
+ if item == end_item:
+ new_action = Accept()
+ assert (i, END_OF_INPUT
+ ) not in action or action[i, END_OF_INPUT] == new_action
+ action[i, END_OF_INPUT] = new_action
+ trimmed_goto = {}
+ for k in goto:
+ if k[1] in self.nonterminals:
+ trimmed_goto[k] = goto[k]
+ expected = {}
+ for state, terminal in action:
+ if state not in expected:
+ expected[state] = set()
+ expected[state].add(terminal)
+ return Parser(item_sets, trimmed_goto, action, expected, conflicts,
+ self.terminals, self.nonterminals, self.productions)
+
+
+ParseError = collections.namedtuple("ParseError", ["code", "index", "token",
+ "state", "expected_tokens"])
+ParseResult = collections.namedtuple("ParseResult", ["parse_tree", "error"])
+
+
+class Parser(object):
+ """Parser is a shift-reduce LR(1) parser.
+
+ Generally, clients will want to get a Parser from a Grammar, rather than
+ directly instantiating one.
+
+ Parser exposes the raw tables needed to feed into a Shift-Reduce parser,
+ but can also be used directly for parsing.
+
+ Attributes:
+ item_sets: A list of item sets which correspond to the state numbers in
+ the action and goto tables. This is not necessary for parsing, but is
+ useful for debugging parsers.
+ goto: The GOTO table for this parser.
+ action: The ACTION table for this parser.
+ expected: A table of terminal symbols that are expected (that is, that
+ have a non-Error action) for each state. This can be used to provide
+ more helpful error messages for parse errors.
+ conflicts: A set of unresolved conflicts found during table generation.
+ terminals: A set of terminal symbols in the grammar.
+ nonterminals: A set of nonterminal symbols in the grammar.
+ productions: A list of productions in the grammar.
+ default_errors: A dict of states to default error codes to use when
+ encountering an error in that state, when a more-specific Error for the
+ state/terminal pair has not been set.
+ """
+
+ def __init__(self, item_sets, goto, action, expected, conflicts, terminals,
+ nonterminals, productions):
+ super(Parser, self).__init__()
+ self.item_sets = item_sets
+ self.goto = goto
+ self.action = action
+ self.expected = expected
+ self.conflicts = conflicts
+ self.terminals = terminals
+ self.nonterminals = nonterminals
+ self.productions = productions
+ self.default_errors = {}
+
+ def _parse(self, tokens):
+ """_parse implements the shift-reduce parsing algorithm.
+
+ _parse implements the standard shift-reduce algorithm outlined on ALSU
+ pp236-237.
+
+ Arguments:
+ tokens: the list of token objects to parse.
+
+ Returns:
+ A ParseResult.
+ """
+ # The END_OF_INPUT token is explicitly added to avoid explicit "cursor <
+ # len(tokens)" checks.
+ tokens = list(tokens) + [Symbol(END_OF_INPUT)]
+
+ # Each element of stack is a parse state and a (possibly partial) parse
+ # tree. The state at the top of the stack encodes which productions are
+ # "active" (that is, those for which the parser has seen partial input that
+ # matches some prefix of the production, in a place where that production
+ # might be valid), and, for each active production, how much of the
+ # production has been completed.
+ stack = [(0, None)]
+
+ def state():
+ return stack[-1][0]
+
+ cursor = 0
+
+ # On each iteration, look at the next symbol and the current state, and
+ # perform the corresponding action.
+ while True:
+ if (state(), tokens[cursor].symbol) not in self.action:
+ # Most state/symbol entries would be Errors, so rather than exhaustively
+ # adding error entries, we just check here.
+ if state() in self.default_errors:
+ next_action = Error(self.default_errors[state()])
+ else:
+ next_action = Error(None)
+ else:
+ next_action = self.action[state(), tokens[cursor].symbol]
+
+ if isinstance(next_action, Shift):
+ # Shift means that there are no "complete" productions on the stack,
+ # and so the current token should be shifted onto the stack, with a new
+ # state indicating the new set of "active" productions.
+ stack.append((next_action.state, tokens[cursor]))
+ cursor += 1
+ elif isinstance(next_action, Accept):
+ # Accept means that parsing is over, successfully.
+ assert len(stack) == 2, "Accepted incompletely-reduced input."
+ assert tokens[cursor].symbol == END_OF_INPUT, ("Accepted parse before "
+ "end of input.")
+ return ParseResult(stack[-1][1], None)
+ elif isinstance(next_action, Reduce):
+ # Reduce means that there is a complete production on the stack, and
+ # that the next symbol implies that the completed production is the
+ # correct production.
+ #
+ # Per ALSU, we would simply pop an element off the state stack for each
+ # symbol on the rhs of the production, and then push a new state by
+ # looking up the (post-pop) current state and the lhs of the production
+ # in GOTO. The GOTO table, in some sense, is equivalent to shift
+ # actions for nonterminal symbols.
+ #
+ # Here, we attach a new partial parse tree, with the production lhs as
+ # the "name" of the tree, and the popped trees as the "children" of the
+ # new tree.
+ children = [
+ item[1] for item in stack[len(stack) - len(next_action.rule.rhs):]
+ ]
+ # Attach source_location, if known. The source location will not be
+ # known if the reduction consumes no symbols (empty rhs) or if the
+ # client did not specify source_locations for tokens.
+ #
+ # It is necessary to loop in order to handle cases like:
+ #
+ # C -> c D
+ # D ->
+ #
+ # The D child of the C reduction will not have a source location
+ # (because it is not produced from any source), so it is necessary to
+ # scan backwards through C's children to find the end position. The
+ # opposite is required in the case where initial children have no
+ # source.
+ #
+ # These loops implicitly handle the case where the reduction has no
+ # children, setting the source_location to None in that case.
+ start_position = None
+ end_position = None
+ for child in children:
+ if hasattr(child,
+ "source_location") and child.source_location is not None:
+ start_position = child.source_location.start
+ break
+ for child in reversed(children):
+ if hasattr(child,
+ "source_location") and child.source_location is not None:
+ end_position = child.source_location.end
+ break
+ if start_position is None:
+ source_location = None
+ else:
+ source_location = parser_types.make_location(start_position,
+ end_position)
+ reduction = Reduction(next_action.rule.lhs, children, next_action.rule,
+ source_location)
+ del stack[len(stack) - len(next_action.rule.rhs):]
+ stack.append((self.goto[state(), next_action.rule.lhs], reduction))
+ elif isinstance(next_action, Error):
+ # Error means that the parse is impossible. For typical grammars and
+ # texts, this usually happens within a few tokens after the mistake in
+ # the input stream, which is convenient (though imperfect) for error
+ # reporting.
+ return ParseResult(None,
+ ParseError(next_action.code, cursor, tokens[cursor],
+ state(), self.expected[state()]))
+ else:
+ assert False, "Shouldn't be here."
+
+ def mark_error(self, tokens, error_token, error_code):
+ """Marks an error state with the given error code.
+
+ mark_error implements the equivalent of the "Merr" system presented in
+ "Generating LR Syntax Error Messages from Examples" (Jeffery, 2003).
+ This system has limitations, but has the primary advantage that error
+ messages can be specified by giving an example of the error and the
+ message itself.
+
+ Arguments:
+ tokens: a list of tokens to parse.
+ error_token: the token where the parse should fail, or None if the parse
+ should fail at the implicit end-of-input token.
+
+ If the error_token is the special ANY_TOKEN, then the error will be
+ recorded as the default error for the error state.
+ error_code: a value to record for the error state reached by parsing
+ tokens.
+
+ Returns:
+ None if error_code was successfully recorded, or an error message if there
+ was a problem.
+ """
+ result = self._parse(tokens)
+
+ # There is no error state to mark on a successful parse.
+ if not result.error:
+ return "Input successfully parsed."
+
+ # Check if the error occurred at the specified token; if not, then this was
+ # not the expected error.
+ if error_token is None:
+ error_symbol = END_OF_INPUT
+ if result.error.token.symbol != END_OF_INPUT:
+ return "error occurred on {} token, not end of input.".format(
+ result.error.token.symbol)
+ else:
+ error_symbol = error_token.symbol
+ if result.error.token != error_token:
+ return "error occurred on {} token, not {} token.".format(
+ result.error.token.symbol, error_token.symbol)
+
+ # If the expected error was found, attempt to mark it. It is acceptable if
+ # the given error_code is already set as the error code for the given parse,
+ # but not if a different code is set.
+ if result.error.token == ANY_TOKEN:
+ # For ANY_TOKEN, mark it as a default error.
+ if result.error.state in self.default_errors:
+ if self.default_errors[result.error.state] == error_code:
+ return None
+ else:
+ return ("Attempted to overwrite existing default error code {!r} "
+ "with new error code {!r} for state {}".format(
+ self.default_errors[result.error.state], error_code,
+ result.error.state))
+ else:
+ self.default_errors[result.error.state] = error_code
+ return None
+ else:
+ if (result.error.state, error_symbol) in self.action:
+ existing_error = self.action[result.error.state, error_symbol]
+ assert isinstance(existing_error, Error), "Bug"
+ if existing_error.code == error_code:
+ return None
+ else:
+ return ("Attempted to overwrite existing error code {!r} with new "
+ "error code {!r} for state {}, terminal {}".format(
+ existing_error.code, error_code, result.error.state,
+ error_symbol))
+ else:
+ self.action[result.error.state, error_symbol] = Error(error_code)
+ return None
+ assert False, "All other paths should lead to return."
+
+ def parse(self, tokens):
+ """Parses a list of tokens.
+
+ Arguments:
+ tokens: a list of tokens to parse.
+
+ Returns:
+ A ParseResult.
+ """
+ result = self._parse(tokens)
+ return result
diff --git a/front_end/lr1_test.py b/front_end/lr1_test.py
new file mode 100644
index 0000000..7573856
--- /dev/null
+++ b/front_end/lr1_test.py
@@ -0,0 +1,317 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for lr1."""
+
+import collections
+import unittest
+
+from front_end import lr1
+from util import parser_types
+
+
+def _make_items(text):
+ """Makes a list of lr1.Items from the lines in text."""
+ return frozenset([lr1.Item.parse(line.strip()) for line in text.splitlines()])
+
+
+Token = collections.namedtuple("Token", ["symbol", "source_location"])
+
+
+def _tokenize(text):
+ """"Tokenizes" text by making each character into a token."""
+ result = []
+ for i in range(len(text)):
+ result.append(Token(text[i], parser_types.make_location(
+ (1, i + 1), (1, i + 2))))
+ return result
+
+
+def _parse_productions(text):
+ """Parses text into a grammar by calling Production.parse on each line."""
+ return [parser_types.Production.parse(line) for line in text.splitlines()]
+
+# Example grammar 4.54 from Aho, Lam, Sethi, Ullman (ALSU) p263.
+_alsu_grammar = lr1.Grammar("S", _parse_productions("""S -> C C
+ C -> c C
+ C -> d"""))
+
+# Item sets corresponding to the above grammar, ALSU pp263-264.
+_alsu_items = [
+ _make_items("""S' -> . S, $
+ S -> . C C, $
+ C -> . c C, c
+ C -> . c C, d
+ C -> . d, c
+ C -> . d, d"""),
+ _make_items("""S' -> S ., $"""),
+ _make_items("""S -> C . C, $
+ C -> . c C, $
+ C -> . d, $"""),
+ _make_items("""C -> c . C, c
+ C -> c . C, d
+ C -> . c C, c
+ C -> . c C, d
+ C -> . d, c
+ C -> . d, d"""),
+ _make_items("""C -> d ., c
+ C -> d ., d"""),
+ _make_items("""S -> C C ., $"""),
+ _make_items("""C -> c . C, $
+ C -> . c C, $
+ C -> . d, $"""),
+ _make_items("""C -> d ., $"""),
+ _make_items("""C -> c C ., c
+ C -> c C ., d"""),
+ _make_items("""C -> c C ., $"""),
+]
+
+# ACTION table corresponding to the above grammar, ALSU p266.
+_alsu_action = {
+ (0, "c"): lr1.Shift(3, _alsu_items[3]),
+ (0, "d"): lr1.Shift(4, _alsu_items[4]),
+ (1, lr1.END_OF_INPUT): lr1.Accept(),
+ (2, "c"): lr1.Shift(6, _alsu_items[6]),
+ (2, "d"): lr1.Shift(7, _alsu_items[7]),
+ (3, "c"): lr1.Shift(3, _alsu_items[3]),
+ (3, "d"): lr1.Shift(4, _alsu_items[4]),
+ (4, "c"): lr1.Reduce(parser_types.Production("C", ("d",))),
+ (4, "d"): lr1.Reduce(parser_types.Production("C", ("d",))),
+ (5, lr1.END_OF_INPUT): lr1.Reduce(parser_types.Production("S", ("C", "C"))),
+ (6, "c"): lr1.Shift(6, _alsu_items[6]),
+ (6, "d"): lr1.Shift(7, _alsu_items[7]),
+ (7, lr1.END_OF_INPUT): lr1.Reduce(parser_types.Production("C", ("d",))),
+ (8, "c"): lr1.Reduce(parser_types.Production("C", ("c", "C"))),
+ (8, "d"): lr1.Reduce(parser_types.Production("C", ("c", "C"))),
+ (9, lr1.END_OF_INPUT): lr1.Reduce(parser_types.Production("C", ("c", "C"))),
+}
+
+# GOTO table corresponding to the above grammar, ALSU p266.
+_alsu_goto = {(0, "S"): 1, (0, "C"): 2, (2, "C"): 5, (3, "C"): 8, (6, "C"): 9,}
+
+
+def _normalize_table(items, table):
+ """Returns a canonical-form version of items and table, for comparisons."""
+ item_to_original_index = {}
+ for i in range(len(items)):
+ item_to_original_index[items[i]] = i
+ sorted_items = items[0:1] + sorted(items[1:], key=sorted)
+ original_index_to_index = {}
+ for i in range(len(sorted_items)):
+ original_index_to_index[item_to_original_index[sorted_items[i]]] = i
+ updated_table = {}
+ for k in table:
+ new_k = original_index_to_index[k[0]], k[1]
+ new_value = table[k]
+ if isinstance(new_value, int):
+ new_value = original_index_to_index[new_value]
+ elif isinstance(new_value, lr1.Shift):
+ new_value = lr1.Shift(original_index_to_index[new_value.state],
+ new_value.items)
+ updated_table[new_k] = new_value
+ return sorted_items, updated_table
+
+
+class Lr1Test(unittest.TestCase):
+ """Tests for lr1."""
+
+ def test_parse_lr1item(self):
+ self.assertEqual(lr1.Item.parse("S' -> . S, $"),
+ lr1.Item(parser_types.Production(lr1.START_PRIME, ("S",)),
+ 0, lr1.END_OF_INPUT, "S"))
+
+ def test_symbol_extraction(self):
+ self.assertEqual(_alsu_grammar.terminals, set(["c", "d", lr1.END_OF_INPUT]))
+ self.assertEqual(_alsu_grammar.nonterminals, set(["S", "C",
+ lr1.START_PRIME]))
+ self.assertEqual(_alsu_grammar.symbols,
+ set(["c", "d", "S", "C", lr1.END_OF_INPUT,
+ lr1.START_PRIME]))
+
+ def test_items(self):
+ self.assertEqual(set(_alsu_grammar._items()[0]), frozenset(_alsu_items))
+
+ def test_terminal_nonterminal_production_tables(self):
+ parser = _alsu_grammar.parser()
+ self.assertEqual(parser.terminals, _alsu_grammar.terminals)
+ self.assertEqual(parser.nonterminals, _alsu_grammar.nonterminals)
+ self.assertEqual(parser.productions, _alsu_grammar.productions)
+
+ def test_action_table(self):
+ parser = _alsu_grammar.parser()
+ norm_items, norm_action = _normalize_table(parser.item_sets, parser.action)
+ test_items, test_action = _normalize_table(_alsu_items, _alsu_action)
+ self.assertEqual(norm_items, test_items)
+ self.assertEqual(norm_action, test_action)
+
+ def test_goto_table(self):
+ parser = _alsu_grammar.parser()
+ norm_items, norm_goto = _normalize_table(parser.item_sets, parser.goto)
+ test_items, test_goto = _normalize_table(_alsu_items, _alsu_goto)
+ self.assertEqual(norm_items, test_items)
+ self.assertEqual(norm_goto, test_goto)
+
+ def test_successful_parse(self):
+ parser = _alsu_grammar.parser()
+ loc = parser_types.parse_location
+ s_to_c_c = parser_types.Production.parse("S -> C C")
+ c_to_c_c = parser_types.Production.parse("C -> c C")
+ c_to_d = parser_types.Production.parse("C -> d")
+ self.assertEqual(
+ lr1.Reduction("S", [lr1.Reduction("C", [
+ Token("c", loc("1:1-1:2")), lr1.Reduction(
+ "C", [Token("c", loc("1:2-1:3")),
+ lr1.Reduction("C",
+ [Token("c", loc("1:3-1:4")), lr1.Reduction(
+ "C", [Token("d", loc("1:4-1:5"))],
+ c_to_d, loc("1:4-1:5"))], c_to_c_c,
+ loc("1:3-1:5"))], c_to_c_c, loc("1:2-1:5"))
+ ], c_to_c_c, loc("1:1-1:5")), lr1.Reduction(
+ "C", [Token("c", loc("1:5-1:6")),
+ lr1.Reduction("C", [Token("d", loc("1:6-1:7"))], c_to_d,
+ loc("1:6-1:7"))], c_to_c_c, loc("1:5-1:7"))],
+ s_to_c_c, loc("1:1-1:7")),
+ parser.parse(_tokenize("cccdcd")).parse_tree)
+ self.assertEqual(
+ lr1.Reduction("S", [
+ lr1.Reduction("C", [Token("d", loc("1:1-1:2"))], c_to_d, loc(
+ "1:1-1:2")), lr1.Reduction("C", [Token("d", loc("1:2-1:3"))],
+ c_to_d, loc("1:2-1:3"))
+ ], s_to_c_c, loc("1:1-1:3")), parser.parse(_tokenize("dd")).parse_tree)
+
+ def test_parse_with_no_source_information(self):
+ parser = _alsu_grammar.parser()
+ s_to_c_c = parser_types.Production.parse("S -> C C")
+ c_to_d = parser_types.Production.parse("C -> d")
+ self.assertEqual(
+ lr1.Reduction("S", [
+ lr1.Reduction("C", [Token("d", None)], c_to_d, None),
+ lr1.Reduction("C", [Token("d", None)], c_to_d, None)
+ ], s_to_c_c, None),
+ parser.parse([Token("d", None), Token("d", None)]).parse_tree)
+
+ def test_failed_parses(self):
+ parser = _alsu_grammar.parser()
+ self.assertIsNone(parser.parse(_tokenize("d")).parse_tree)
+ self.assertIsNone(parser.parse(_tokenize("cccd")).parse_tree)
+ self.assertIsNone(parser.parse(_tokenize("")).parse_tree)
+ self.assertIsNone(parser.parse(_tokenize("cccdc")).parse_tree)
+
+ def test_mark_error(self):
+ parser = _alsu_grammar.parser()
+ self.assertIsNone(parser.mark_error(_tokenize("cccdc"), None,
+ "missing last d"))
+ self.assertIsNone(parser.mark_error(_tokenize("d"), None, "missing last C"))
+ # Marking an already-marked error with the same error code should succeed.
+ self.assertIsNone(parser.mark_error(_tokenize("d"), None, "missing last C"))
+ # Marking an already-marked error with a different error code should fail.
+ self.assertRegexpMatches(
+ parser.mark_error(_tokenize("d"), None, "different message"),
+ r"^Attempted to overwrite existing error code 'missing last C' with "
+ r"new error code 'different message' for state \d+, terminal \$$")
+ self.assertEqual(
+ "Input successfully parsed.",
+ parser.mark_error(_tokenize("dd"), None, "good parse"))
+ self.assertEqual(
+ parser.mark_error(_tokenize("x"), None, "wrong location"),
+ "error occurred on x token, not end of input.")
+ self.assertEqual(
+ parser.mark_error([], _tokenize("x")[0], "wrong location"),
+ "error occurred on $ token, not x token.")
+ self.assertIsNone(
+ parser.mark_error([lr1.ANY_TOKEN], lr1.ANY_TOKEN, "default error"))
+ # Marking an already-marked error with the same error code should succeed.
+ self.assertIsNone(
+ parser.mark_error([lr1.ANY_TOKEN], lr1.ANY_TOKEN, "default error"))
+ # Marking an already-marked error with a different error code should fail.
+ self.assertRegexpMatches(
+ parser.mark_error([lr1.ANY_TOKEN], lr1.ANY_TOKEN, "default error 2"),
+ r"^Attempted to overwrite existing default error code 'default error' "
+ r"with new error code 'default error 2' for state \d+$")
+
+ self.assertEqual(
+ "missing last d", parser.parse(_tokenize("cccdc")).error.code)
+ self.assertEqual("missing last d", parser.parse(_tokenize("dc")).error.code)
+ self.assertEqual("missing last C", parser.parse(_tokenize("d")).error.code)
+ self.assertEqual("default error", parser.parse(_tokenize("z")).error.code)
+ self.assertEqual(
+ "missing last C", parser.parse(_tokenize("ccccd")).error.code)
+ self.assertEqual(None, parser.parse(_tokenize("ccc")).error.code)
+
+ def test_grammar_with_empty_rhs(self):
+ grammar = lr1.Grammar("S", _parse_productions("""S -> A B
+ A -> a A
+ A ->
+ B -> b"""))
+ parser = grammar.parser()
+ self.assertFalse(parser.conflicts)
+ self.assertTrue(parser.parse(_tokenize("ab")).parse_tree)
+ self.assertTrue(parser.parse(_tokenize("b")).parse_tree)
+ self.assertTrue(parser.parse(_tokenize("aab")).parse_tree)
+
+ def test_grammar_with_reduce_reduce_conflicts(self):
+ grammar = lr1.Grammar("S", _parse_productions("""S -> A c
+ S -> B c
+ A -> a
+ B -> a"""))
+ parser = grammar.parser()
+ self.assertEqual(len(parser.conflicts), 1)
+ # parser.conflicts is a set
+ for conflict in parser.conflicts:
+ for action in conflict.actions:
+ self.assertTrue(isinstance(action, lr1.Reduce))
+
+ def test_grammar_with_shift_reduce_conflicts(self):
+ grammar = lr1.Grammar("S", _parse_productions("""S -> A B
+ A -> a
+ A ->
+ B -> a
+ B ->"""))
+ parser = grammar.parser()
+ self.assertEqual(len(parser.conflicts), 1)
+ # parser.conflicts is a set
+ for conflict in parser.conflicts:
+ reduces = 0
+ shifts = 0
+ for action in conflict.actions:
+ if isinstance(action, lr1.Reduce):
+ reduces += 1
+ elif isinstance(action, lr1.Shift):
+ shifts += 1
+ self.assertEqual(1, reduces)
+ self.assertEqual(1, shifts)
+
+ def test_item_str(self):
+ self.assertEqual(
+ "a -> b c ., d",
+ str(lr1.make_item(parser_types.Production.parse("a -> b c"), 2, "d")))
+ self.assertEqual(
+ "a -> b . c, d",
+ str(lr1.make_item(parser_types.Production.parse("a -> b c"), 1, "d")))
+ self.assertEqual(
+ "a -> . b c, d",
+ str(lr1.make_item(parser_types.Production.parse("a -> b c"), 0, "d")))
+ self.assertEqual(
+ "a -> ., d",
+ str(lr1.make_item(parser_types.Production.parse("a ->"), 0, "d")))
+
+ def test_conflict_str(self):
+ self.assertEqual("Conflict for 'A' in state 12: R vs S",
+ str(lr1.Conflict(12, "'A'", ["R", "S"])))
+ self.assertEqual("Conflict for 'A' in state 12: R vs S vs T",
+ str(lr1.Conflict(12, "'A'", ["R", "S", "T"])))
+
+
+if __name__ == "__main__":
+ unittest.main()
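A quick sanity check for the inputs used in the tests above, offered as a sketch and not as part of the test suite: the ALSU example grammar (`S -> C C`, `C -> c C`, `C -> d`) derives exactly the strings matching `c*dc*d`, so a regular expression can predict which inputs should yield a parse tree. The helper name `in_alsu_language` is hypothetical and is not part of lr1.

```python
import re

# Hypothetical helper (not part of lr1): the grammar S -> C C, C -> c C | d
# derives exactly the strings of the form c*d c*d, which is a regular language.
_ALSU_LANGUAGE = re.compile(r"c*dc*d")


def in_alsu_language(text):
  """Returns True if text is derivable from S in the ALSU example grammar."""
  return _ALSU_LANGUAGE.fullmatch(text) is not None


# Inputs taken from test_successful_parse and test_failed_parses, respectively.
accepted = [s for s in ("cccdcd", "dd") if in_alsu_language(s)]
rejected = [s for s in ("d", "cccd", "", "cccdc") if not in_alsu_language(s)]
```

Every input in `test_successful_parse` lands in `accepted`, and every input in `test_failed_parses` lands in `rejected`, which is what the LR(1) parser is expected to reproduce.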
diff --git a/front_end/module_ir.py b/front_end/module_ir.py
new file mode 100644
index 0000000..c42d570
--- /dev/null
+++ b/front_end/module_ir.py
@@ -0,0 +1,1394 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""module_ir contains code for generating module-level IRs from parse trees.
+
+The primary export is build_ir(), which takes a parse tree (as returned by a
+parser from lr1.py), and returns a module-level intermediate representation
+("module IR").
+
+This module also notably exports PRODUCTIONS and START_SYMBOL, which should be
+fed to lr1.Grammar in order to create a parser for the Emboss language.
+"""
+
+import re
+
+from public import ir_pb2
+from util import name_conversion
+from util import parser_types
+
+
+# Intermediate types; should not be found in the final IR.
+class _List(object):
+ """A list with source location information."""
+ __slots__ = ('list', 'source_location')
+
+ def __init__(self, l):
+ assert isinstance(l, list), "_List object must wrap list, not '%r'" % l
+ self.list = l
+ self.source_location = ir_pb2.Location()
+
+
+class _ExpressionTail(object):
+ """A fragment of an expression with an operator and right-hand side.
+
+ _ExpressionTail is the tail of an expression, consisting of an operator and
+ the right-hand argument to the operator; for example, in the expression (6+8),
+ the _ExpressionTail would be "+8".
+
+ This is used as a temporary object while converting the right-recursive
+ "expression" and "times-expression" productions into left-associative
+ Expressions.
+
+ Attributes:
+ operator: An ir_pb2.Word of the operator's name.
+ expression: The expression on the right side of the operator.
+ source_location: The source location of the operation fragment.
+ """
+ __slots__ = ('operator', 'expression', 'source_location')
+
+ def __init__(self, operator, expression):
+ self.operator = operator
+ self.expression = expression
+ self.source_location = ir_pb2.Location()
+
+
+class _FieldWithType(object):
+ """A field with zero or more types defined inline with that field."""
+ __slots__ = ('field', 'subtypes', 'source_location')
+
+ def __init__(self, field, subtypes=None):
+ self.field = field
+ self.subtypes = subtypes or []
+ self.source_location = ir_pb2.Location()
+
+
+def build_ir(parse_tree, used_productions=None):
+ r"""Builds a module-level intermediate representation from a valid parse tree.
+
+ The parse tree is precisely dictated by the exact productions in the grammar
+ used by the parser, with no semantic information. build_ir transforms this
+ "raw" form into a stable, cooked representation, thereby isolating subsequent
+ steps from the exact details of the grammar.
+
+ (Probably incomplete) list of transformations:
+
+ * ParseResult and Token nodes are replaced with Module, Attribute, Struct,
+ Type, etc. objects.
+
+ * Purely syntactic tokens ('"["', '"struct"', etc.) are discarded.
+
+ * Repeated elements are transformed from tree form to list form:
+
+ a*
+ / \
+ b a*
+ / \
+ c a*
+ / \
+ d a*
+
+ (where b, c, and d are nodes of type "a") becomes [b, c, d].
+
+ * The values of numeric constants (Number, etc. tokens) are parsed.
+
+ * Different classes of names (snake_names, CamelNames, ShoutyNames) are
+ folded into a single "Name" type, since they are guaranteed to appear in
+ the correct places in the parse tree.
+
+ Arguments:
+ parse_tree: A parse tree. Each leaf node should be a parser_types.Token
+ object, and each non-leaf node should have a 'symbol' attribute specifying
+ which grammar symbol it represents, and a 'children' attribute containing
+ a list of child nodes. This is the format returned by the parsers
+ produced by the lr1 module, when run against tokens from the tokenizer
+ module.
+ used_productions: If specified, used_productions.add() will be called with
+ each production actually used in parsing. This can be useful when
+ developing the grammar and writing tests; in particular, it can be used to
+ figure out which productions are *not* used when parsing a particular
+ file.
+
+ Returns:
+ A module-level intermediate representation (module IR) for an Emboss module
+ (source file). This IR will not have symbols resolved; that must be done on
+ a forest of module IRs so that names from other modules can be resolved.
+ """
+ if used_productions is None:
+ used_productions = set()
+ if hasattr(parse_tree, 'children'):
+ parsed_children = [build_ir(child, used_productions)
+ for child in parse_tree.children]
+ used_productions.add(parse_tree.production)
+ result = _handlers[parse_tree.production](*parsed_children)
+ if parse_tree.source_location is not None:
+ result.source_location.CopyFrom(parse_tree.source_location)
+ return result
+ else:
+ # For leaf nodes, the temporary "IR" is just the token. Higher-level rules
+ # will translate it to a real IR.
+ assert isinstance(parse_tree, parser_types.Token), str(parse_tree)
+ return parse_tree
+
+# Map of productions to their handlers.
+_handlers = {}
+
+_anonymous_name_counter = 0
+
+
+def _get_anonymous_field_name():
+ global _anonymous_name_counter
+ _anonymous_name_counter += 1
+ return 'emboss_reserved_anonymous_field_{}'.format(_anonymous_name_counter)
+
+
+def _handles(production_text):
+ """_handles marks a function as the handler for a particular production."""
+ production = parser_types.Production.parse(production_text)
+
+ def handles(f):
+ _handlers[production] = f
+ return f
+
+ return handles
+
+
+def _make_prelude_import(position):
+ """Helper function to construct a synthetic ir_pb2.Import for the prelude."""
+ location = parser_types.make_location(position, position)
+ return ir_pb2.Import(
+ file_name=ir_pb2.String(text='', source_location=location),
+ local_name=ir_pb2.Word(text='', source_location=location),
+ source_location=location)
+
+
+def _text_to_operator(text):
+ """Converts an operator's textual name to its corresponding enum."""
+ operations = {
+ '+': ir_pb2.Function.ADDITION,
+ '-': ir_pb2.Function.SUBTRACTION,
+ '*': ir_pb2.Function.MULTIPLICATION,
+ '==': ir_pb2.Function.EQUALITY,
+ '!=': ir_pb2.Function.INEQUALITY,
+ '&&': ir_pb2.Function.AND,
+ '||': ir_pb2.Function.OR,
+ '>': ir_pb2.Function.GREATER,
+ '>=': ir_pb2.Function.GREATER_OR_EQUAL,
+ '<': ir_pb2.Function.LESS,
+ '<=': ir_pb2.Function.LESS_OR_EQUAL,
+ }
+ return operations[text]
+
+
+def _text_to_function(text):
+ """Converts a function's textual name to its corresponding enum."""
+ functions = {
+ '$max': ir_pb2.Function.MAXIMUM,
+ '$present': ir_pb2.Function.PRESENCE,
+ '$upper_bound': ir_pb2.Function.UPPER_BOUND,
+ '$lower_bound': ir_pb2.Function.LOWER_BOUND,
+ }
+ return functions[text]
+
+
+################################################################################
+# Grammar & parse tree to IR translation.
+#
+# From here to (almost) the end of the file are functions which recursively
+# build an IR. The @_handles annotations indicate the exact grammar
+# production(s) handled by each function. The handler function should take
+# exactly one argument for each symbol in the production's RHS.
+#
+# The actual Emboss grammar is extracted directly from the @_handles
+# annotations, so this is also the grammar definition. For convenience, the
+# grammar can be viewed separately in g3doc/grammar.md.
+#
+# At the end, symbols whose names end in "*", "+", or "?" are extracted from the
+# grammar, and appropriate productions are added for zero-or-more, one-or-more,
+# or zero-or-one lists, respectively. (This is analogous to the *, +, and ?
+# operators in regex.) It is necessary for this to happen here (and not in
+# lr1.py) because the generated productions must be associated with
+# IR-generation functions.
+
+
+# A module file is a list of leading comment lines, then documentation, then
+# imports, then top-level attributes, then type definitions. Any section may be
+# missing.
+# TODO(bolms): Should Emboss disallow completely empty files?
+@_handles('module -> comment-line* doc-line* import-line* attribute-line*'
+ ' type-definition*')
+def _file(leading_newlines, docs, imports, attributes, type_definitions):
+ """Assembles the top-level IR for a module."""
+ del leading_newlines # Unused.
+ # Figure out the best synthetic source_location for the synthesized prelude
+ # import.
+ if imports.list:
+ position = imports.list[0].source_location.start
+ elif docs.list:
+ position = docs.list[0].source_location.end
+ elif attributes.list:
+ position = attributes.list[0].source_location.start
+ elif type_definitions.list:
+ position = type_definitions.list[0].source_location.start
+ else:
+ position = 1, 1
+
+ # If the source file is completely empty, build_ir won't automatically
+ # populate the source_location attribute for the module.
+ if (not docs.list and not imports.list and not attributes.list and
+ not type_definitions.list):
+ module_source_location = parser_types.make_location((1, 1), (1, 1))
+ else:
+ module_source_location = None
+
+ return ir_pb2.Module(
+ documentation=docs.list,
+ foreign_import=[_make_prelude_import(position)] + imports.list,
+ attribute=attributes.list,
+ type=type_definitions.list,
+ source_location=module_source_location)
+
+
+@_handles('import-line ->'
+ ' "import" string-constant "as" snake-word Comment? eol')
+def _import(import_, file_name, as_, local_name, comment, eol):
+ del import_, as_, comment, eol # Unused
+ return ir_pb2.Import(file_name=file_name, local_name=local_name)
+
+
+@_handles('doc-line -> doc Comment? eol')
+def _doc_line(doc, comment, eol):
+ del comment, eol # Unused.
+ return doc
+
+
+@_handles('doc -> Documentation')
+def _doc(documentation):
+ # As a special case, an empty documentation string may omit the trailing
+ # space.
+ if documentation.text == '--':
+ doc_text = '-- '
+ else:
+ doc_text = documentation.text
+ assert doc_text[0:3] == '-- ', (
+ "Documentation token '{}' in unknown format.".format(
+ documentation.text))
+ return ir_pb2.Documentation(text=doc_text[3:])
+
+
+# An attribute-line is just an attribute on its own line.
+@_handles('attribute-line -> attribute Comment? eol')
+def _attribute_line(attr, comment, eol):
+ del comment, eol # Unused.
+ return attr
+
+
+# An attribute is [name: value], with an optional (context) and $default.
+@_handles('attribute -> "[" attribute-context? "$default"?'
+ ' snake-word ":" attribute-value "]"')
+def _attribute(open_bracket, context_specifier, default_specifier, name, colon,
+ attribute_value, close_bracket):
+ del open_bracket, colon, close_bracket # Unused.
+ if context_specifier.list:
+ return ir_pb2.Attribute(name=name,
+ value=attribute_value,
+ is_default=bool(default_specifier.list),
+ back_end=context_specifier.list[0])
+ else:
+ return ir_pb2.Attribute(name=name,
+ value=attribute_value,
+ is_default=bool(default_specifier.list))
+
+
+@_handles('attribute-context -> "(" snake-word ")"')
+def _attribute_context(open_paren, context_name, close_paren):
+ del open_paren, close_paren # Unused.
+ return context_name
+
+
+@_handles('attribute-value -> expression')
+def _attribute_value_expression(expression):
+ return ir_pb2.AttributeValue(expression=expression)
+
+
+@_handles('attribute-value -> string-constant')
+def _attribute_value_string(string):
+ return ir_pb2.AttributeValue(string_constant=string)
+
+
+@_handles('boolean-constant -> BooleanConstant')
+def _boolean_constant(boolean):
+ return ir_pb2.BooleanConstant(value=(boolean.text == 'true'))
+
+
+@_handles('string-constant -> String')
+def _string_constant(string):
+ """Turns a String token into an ir_pb2.String, with proper unescaping.
+
+ Arguments:
+ string: A String token.
+
+ Returns:
+ An ir_pb2.String with the "text" field set to the unescaped value of
+ string.text.
+ """
+ # TODO(bolms): If/when this logic becomes more complex (e.g., to handle \NNN
+ # or \xNN escapes), extract this into a separate module with separate tests.
+ assert string.text[0] == '"'
+ assert string.text[-1] == '"'
+ assert len(string.text) >= 2
+ result = []
+ for substring in re.split(r'(\\.)', string.text[1:-1]):
+ if substring and substring[0] == '\\':
+ assert len(substring) == 2
+ result.append({'\\': '\\', '"': '"', 'n': '\n'}[substring[1]])
+ else:
+ result.append(substring)
+ return ir_pb2.String(text=''.join(result))
+
+
+# In Emboss, '&&' and '||' may not be mixed without parentheses. These are all
+# fine:
+#
+# x && y && z
+# x || y || z
+# (x || y) && z
+# x || (y && z)
+#
+# These are syntax errors:
+#
+# x || y && z
+# x && y || z
+#
+# This is accomplished by making && and || separate-but-equal in the precedence
+# hierarchy. Instead of the more traditional:
+#
+# logical-expression -> or-expression
+# or-expression -> and-expression or-expression-right*
+# or-expression-right -> '||' and-expression
+# and-expression -> equality-expression and-expression-right*
+# and-expression-right -> '&&' equality-expression
+#
+# Or, using yacc-style precedence specifiers:
+#
+# %left "||"
+# %left "&&"
+# expression -> expression
+# | expression '||' expression
+# | expression '&&' expression
+#
+# Emboss uses a slightly more complex grammar, in which '&&' and '||' are
+# parallel, but unmixable:
+#
+# logical-expression -> and-expression
+# | or-expression
+# | comparison-expression
+# or-expression -> comparison-expression or-expression-right+
+# or-expression-right -> '||' comparison-expression
+# and-expression -> comparison-expression and-expression-right+
+# and-expression-right -> '&&' comparison-expression
+#
+# In either case, explicit parenthesization is handled elsewhere in the grammar.
+@_handles('logical-expression -> and-expression')
+@_handles('logical-expression -> or-expression')
+@_handles('logical-expression -> comparison-expression')
+@_handles('choice-expression -> logical-expression')
+@_handles('expression -> choice-expression')
+def _expression(expression):
+ return expression
+
+
+# The `logical-expression`s here mean that ?: can't be chained without
+# parentheses. `x < 0 ? -1 : (x == 0 ? 0 : 1)` is OK, but `x < 0 ? -1 : x == 0
+# ? 0 : 1` is not. Parentheses are also needed in the middle: `x <= 0 ? x < 0 ?
+# -1 : 0 : 1` is not syntactically valid.
+@_handles('choice-expression -> logical-expression "?" logical-expression'
+ ' ":" logical-expression')
+def _choice_expression(condition, question, if_true, colon, if_false):
+ location = parser_types.make_location(
+ condition.source_location.start, if_false.source_location.end)
+ operator_location = parser_types.make_location(
+ question.source_location.start, colon.source_location.end)
+ # The function_name is a bit weird, but should suffice for any error messages
+ # that might need it.
+ return ir_pb2.Expression(
+ function=ir_pb2.Function(function=ir_pb2.Function.CHOICE,
+ args=[condition, if_true, if_false],
+ function_name=ir_pb2.Word(
+ text='?:',
+ source_location=operator_location),
+ source_location=location))
+
+
+@_handles('comparison-expression -> additive-expression')
+def _no_op_comparative_expression(expression):
+ return expression
+
+
+@_handles('comparison-expression ->'
+ ' additive-expression inequality-operator additive-expression')
+def _comparative_expression(left, operator, right):
+ location = parser_types.make_location(
+ left.source_location.start, right.source_location.end)
+ return ir_pb2.Expression(
+ function=ir_pb2.Function(function=_text_to_operator(operator.text),
+ args=[left, right],
+ function_name=operator,
+ source_location=location))
+
+
+@_handles('additive-expression -> times-expression additive-expression-right*')
+@_handles('times-expression -> negation-expression times-expression-right*')
+@_handles('and-expression -> comparison-expression and-expression-right+')
+@_handles('or-expression -> comparison-expression or-expression-right+')
+def _binary_operator_expression(expression, expression_right):
+ """Builds the IR for a chain of equal-precedence left-associative operations.
+
+ _binary_operator_expression transforms a right-recursive list of expression
+ tails into a left-associative Expression tree. For example, given the
+ arguments:
+
+ 6, (Tail("+", 7), Tail("-", 8), Tail("+", 10))
+
+ _binary_operator_expression produces a structure like:
+
+ Expression(Expression(Expression(6, "+", 7), "-", 8), "+", 10)
+
+ This transformation is necessary because the tails arrive as a flat,
+ right-recursive list, while the operators themselves are left-associative.
+
+ Note that this function is used for several productions; each of those
+ productions handles a different precedence level, but all are identical in
+ form.
+
+ Arguments:
+ expression: An ir_pb2.Expression which is the head of the (expr, operator,
+ expr, operator, expr, ...) list.
+ expression_right: A list of _ExpressionTails corresponding to the (operator,
+ expr, operator, expr, ...) list that comes after expression.
+
+ Returns:
+ An ir_pb2.Expression with the correct recursive structure to represent a
+ list of left-associative operations.
+ """
+ e = expression
+ for right in expression_right.list:
+ location = parser_types.make_location(
+ e.source_location.start, right.source_location.end)
+ e = ir_pb2.Expression(
+ function=ir_pb2.Function(
+ function=_text_to_operator(right.operator.text),
+ args=[e, right.expression],
+ function_name=right.operator,
+ source_location=location),
+ source_location=location)
+ return e
+
+
+@_handles('comparison-expression ->'
+ ' additive-expression equality-expression-right+')
+@_handles('comparison-expression ->'
+ ' additive-expression less-expression-right-list')
+@_handles('comparison-expression ->'
+ ' additive-expression greater-expression-right-list')
+def _chained_comparison_expression(expression, expression_right):
+ """Builds the IR for a chain of comparisons, like a == b == c.
+
+ Like _binary_operator_expression, _chained_comparison_expression transforms a
+ right-recursive list of expression tails into a left-associative Expression
+ tree. Unlike _binary_operator_expression, extra AND nodes are added. For
+ example, the following expression:
+
+ 0 <= b <= 64
+
+ must be translated to the conceptually-equivalent expression:
+
+ 0 <= b && b <= 64
+
+ (The middle subexpression is duplicated -- this would be a problem in a
+ programming language like C where expressions like `x++` have side effects,
+ but side effects do not make sense in a data definition language like Emboss.)
+
+ _chained_comparison_expression receives a left-hand head expression and a list
+ of tails, like:
+
+ 0, (Tail("<=", b), Tail("<=", 64))
+
+ which it translates to a structure like:
+
+ Expression(Expression(0, "<=", b), "&&", Expression(b, "<=", 64))
+
+ The Emboss grammar is constructed such that sequences of "<", "<=", and "=="
+ comparisons may be chained, and sequences of ">", ">=", and "==" may be
+ chained, but greater-than and less-than comparisons may not be mixed; e.g.,
+ "b < 64 > a" is not allowed.
+
+ Arguments:
+ expression: An ir_pb2.Expression which is the head of the (expr, operator,
+ expr, operator, expr, ...) list.
+ expression_right: A list of _ExpressionTails corresponding to the (operator,
+ expr, operator, expr, ...) list that comes after expression.
+
+ Returns:
+ An ir_pb2.Expression with the correct recursive structure to represent a
+ chain of left-associative comparison operations.
+ """
+ sequence = [expression]
+ for right in expression_right.list:
+ sequence.append(right.operator)
+ sequence.append(right.expression)
+ comparisons = []
+ for i in range(0, len(sequence) - 1, 2):
+ left, operator, right = sequence[i:i+3]
+ location = parser_types.make_location(
+ left.source_location.start, right.source_location.end)
+ comparisons.append(ir_pb2.Expression(
+ function=ir_pb2.Function(
+ function=_text_to_operator(operator.text),
+ args=[left, right],
+ function_name=operator,
+ source_location=location),
+ source_location=location))
+ e = comparisons[0]
+ for comparison in comparisons[1:]:
+ location = parser_types.make_location(
+ e.source_location.start, comparison.source_location.end)
+ e = ir_pb2.Expression(
+ function=ir_pb2.Function(
+ function=ir_pb2.Function.AND,
+ args=[e, comparison],
+ function_name=ir_pb2.Word(
+ text='&&',
+ source_location=comparison.function.args[0].source_location),
+ source_location=location),
+ source_location=location)
+ return e
+
+
+# _chained_comparison_expression, above, handles three types of chains: `a == b
+# == c`, `a < b <= c`, and `a > b >= c`.
+#
+# This requires a bit of subtlety in the productions for
+# `x-expression-right-list`, because the `==` operator may be freely mixed into
+# greater-than or less-than chains, like `a < b == c <= d` or `a > b == c >= d`,
+# but greater-than and less-than may not be mixed; i.e., `a < b >= c` is
+# disallowed.
+#
+# In order to keep the grammar unambiguous -- that is, in order to ensure that
+# every valid input can only be parsed in exactly one way -- the languages
+# defined by `equality-expression-right+`, `greater-expression-right-list`, and
+# `less-expression-right-list` cannot overlap.
+#
+# `equality-expression-right+`, by definition, only contains `== n` elements.
+# By forcing `greater-expression-right-list` to contain at least one
+# `greater-expression-right`, we can ensure that a chain like `== n == m` cannot
+# be parsed as a `greater-expression-right-list`. Similar logic applies in the
+# less-than case.
+#
+# There is another potential source of ambiguity here: if
+# `greater-expression-right-list` were
+#
+# greater-expression-right-list ->
+# equality-or-greater-expression-right* greater-expression-right
+# equality-or-greater-expression-right*
+#
+# then a sequence like '> b > c > d' could be parsed as any of:
+#
+# () (> b) ((> c) (> d))
+# ((> b)) (> c) ((> d))
+# ((> b) (> c)) (> d) ()
+#
+# By using `equality-expression-right*` for the first symbol, only the first
+# parse is possible.
+@_handles('greater-expression-right-list ->'
+ ' equality-expression-right* greater-expression-right'
+ ' equality-or-greater-expression-right*')
+@_handles('less-expression-right-list ->'
+ ' equality-expression-right* less-expression-right'
+ ' equality-or-less-expression-right*')
+def _chained_comparison_tails(start, middle, end):
+ return _List(start.list + [middle] + end.list)
+
+
+@_handles('equality-or-greater-expression-right -> equality-expression-right')
+@_handles('equality-or-greater-expression-right -> greater-expression-right')
+@_handles('equality-or-less-expression-right -> equality-expression-right')
+@_handles('equality-or-less-expression-right -> less-expression-right')
+def _equality_or_less_or_greater(right):
+ return right
+
+
+@_handles('and-expression-right -> and-operator comparison-expression')
+@_handles('or-expression-right -> or-operator comparison-expression')
+@_handles('additive-expression-right -> additive-operator times-expression')
+@_handles('equality-expression-right -> equality-operator additive-expression')
+@_handles('greater-expression-right -> greater-operator additive-expression')
+@_handles('less-expression-right -> less-operator additive-expression')
+@_handles('times-expression-right ->'
+ ' multiplicative-operator negation-expression')
+def _expression_right_production(operator, expression):
+ return _ExpressionTail(operator, expression)
+
+
+# This supports a single layer of unary plus/minus, so "+5" and "-value" are
+# allowed, but "+-5" or "-+-something" are not.
+@_handles('negation-expression -> additive-operator bottom-expression')
+def _negation_expression_with_operator(operator, expression):
+ phantom_zero_location = ir_pb2.Location(start=operator.source_location.start,
+ end=operator.source_location.start)
+ return ir_pb2.Expression(
+ function=ir_pb2.Function(
+ function=_text_to_operator(operator.text),
+ args=[ir_pb2.Expression(
+ constant=ir_pb2.NumericConstant(
+ value='0',
+ source_location=phantom_zero_location),
+ source_location=phantom_zero_location), expression],
+ function_name=operator,
+ source_location=ir_pb2.Location(
+ start=operator.source_location.start,
+ end=expression.source_location.end)))
+
+
+@_handles('negation-expression -> bottom-expression')
+def _negation_expression(expression):
+ return expression
+
+
+@_handles('bottom-expression -> "(" expression ")"')
+def _bottom_expression_parentheses(open_paren, expression, close_paren):
+ del open_paren, close_paren # Unused.
+ return expression
+
+
+@_handles('bottom-expression -> function-name "(" argument-list ")"')
+def _bottom_expression_function(function, open_paren, arguments, close_paren):
+ del open_paren # Unused.
+ return ir_pb2.Expression(
+ function=ir_pb2.Function(
+ function=_text_to_function(function.text),
+ args=arguments.list,
+ function_name=function,
+ source_location=ir_pb2.Location(
+ start=function.source_location.start,
+ end=close_paren.source_location.end)))
+
+
+@_handles('comma-then-expression -> "," expression')
+def _comma_then_expression(comma, expression):
+ del comma # Unused.
+ return expression
+
+
+@_handles('argument-list -> expression comma-then-expression*')
+def _argument_list(head, tail):
+ tail.list.insert(0, head)
+ return tail
+
+
+@_handles('argument-list ->')
+def _empty_argument_list():
+ return _List([])
+
+
+@_handles('bottom-expression -> numeric-constant')
+def _bottom_expression_from_numeric_constant(constant):
+ return ir_pb2.Expression(constant=constant)
+
+
+@_handles('bottom-expression -> constant-reference')
+def _bottom_expression_from_constant_reference(reference):
+ return ir_pb2.Expression(constant_reference=reference)
+
+
+@_handles('bottom-expression -> builtin-reference')
+def _bottom_expression_from_builtin(reference):
+ return ir_pb2.Expression(builtin_reference=reference)
+
+
+@_handles('bottom-expression -> boolean-constant')
+def _bottom_expression_from_boolean_constant(boolean):
+ return ir_pb2.Expression(boolean_constant=boolean)
+
+
+@_handles('bottom-expression -> field-reference')
+def _bottom_expression_from_reference(reference):
+ return reference
+
+
+@_handles('field-reference -> snake-reference field-reference-tail*')
+def _indirect_field_reference(field_reference, field_references):
+ if field_references.source_location.HasField('end'):
+ end_location = field_references.source_location.end
+ else:
+ end_location = field_reference.source_location.end
+ return ir_pb2.Expression(field_reference=ir_pb2.FieldReference(
+ path=[field_reference] + field_references.list,
+ source_location=parser_types.make_location(
+ field_reference.source_location.start, end_location)))
+
+
+# If "Type.field" ever becomes syntactically valid, it will be necessary to
+# check that enum values are compile-time constants.
+@_handles('field-reference-tail -> "." snake-reference')
+def _field_reference_tail(dot, reference):
+ del dot # Unused.
+ return reference
+
+
+@_handles('numeric-constant -> Number')
+def _numeric_constant(number):
+ # All types of numeric constant tokenize to the same symbol, because they are
+ # interchangeable in source code.
+ if number.text[0:2] == '0b':
+ n = int(number.text.replace('_', '')[2:], 2)
+ elif number.text[0:2] == '0x':
+ n = int(number.text.replace('_', '')[2:], 16)
+ else:
+ n = int(number.text.replace('_', ''), 10)
+ return ir_pb2.NumericConstant(value=str(n))
+
+
+@_handles('type-definition -> struct')
+@_handles('type-definition -> bits')
+@_handles('type-definition -> enum')
+@_handles('type-definition -> external')
+def _type_definition(type_definition):
+ return type_definition
+
+
+# struct StructureName:
+# ... fields ...
+# bits BitName:
+# ... fields ...
+@_handles('struct -> "struct" type-name delimited-parameter-definition-list?'
+ ' ":" Comment? eol struct-body')
+@_handles('bits -> "bits" type-name delimited-parameter-definition-list? ":"'
+ ' Comment? eol bits-body')
+def _structure(struct, name, parameters, colon, comment, newline, struct_body):
+ """Composes the top-level IR for an Emboss structure."""
+ del colon, comment, newline # Unused.
+ struct_body.structure.source_location.start.CopyFrom(
+ struct.source_location.start)
+ struct_body.structure.source_location.end.CopyFrom(
+ struct_body.source_location.end)
+ struct_body.name.CopyFrom(name)
+ if parameters.list:
+ struct_body.runtime_parameter.extend(parameters.list[0].list)
+ return struct_body
+
+
+@_handles('delimited-parameter-definition-list ->'
+ ' "(" parameter-definition-list ")"')
+def _delimited_parameter_definition_list(open_paren, parameters, close_paren):
+ del open_paren, close_paren # Unused
+ return parameters
+
+
+@_handles('parameter-definition -> snake-name ":" type')
+def _parameter_definition(name, colon, parameter_type):
+ del colon # Unused
+ return ir_pb2.RuntimeParameter(name=name, physical_type_alias=parameter_type)
+
+
+@_handles('parameter-definition-list-tail -> "," parameter-definition')
+def _parameter_definition_list_tail(comma, parameter):
+ del comma # Unused.
+ return parameter
+
+
+@_handles('parameter-definition-list -> parameter-definition'
+ ' parameter-definition-list-tail*')
+def _parameter_definition_list(head, tail):
+ tail.list.insert(0, head)
+ return tail
+
+
+@_handles('parameter-definition-list ->')
+def _empty_parameter_definition_list():
+ return _List([])
+
+
+# The body of a struct: basically, the part after the first line.
+@_handles('struct-body -> Indent doc-line* attribute-line*'
+ ' type-definition* struct-field-block Dedent')
+def _struct_body(indent, docs, attributes, types, fields, dedent):
+ del indent, dedent # Unused.
+ return _structure_body(docs, attributes, types, fields,
+ ir_pb2.TypeDefinition.BYTE)
+
+
+def _structure_body(docs, attributes, types, fields, addressable_unit):
+ """Constructs the body of a structure (bits or struct) definition."""
+ return ir_pb2.TypeDefinition(
+ structure=ir_pb2.Structure(field=[field.field for field in fields.list]),
+ documentation=docs.list,
+ attribute=attributes.list,
+ subtype=types.list + [subtype for field in fields.list for subtype in
+ field.subtypes],
+ addressable_unit=addressable_unit)
+
+
+@_handles('struct-field-block ->')
+@_handles('bits-field-block ->')
+@_handles('anonymous-bits-field-block ->')
+def _empty_field_block():
+ return _List([])
+
+
+@_handles('struct-field-block ->'
+ ' conditional-struct-field-block struct-field-block')
+@_handles('bits-field-block ->'
+ ' conditional-bits-field-block bits-field-block')
+@_handles('anonymous-bits-field-block -> conditional-anonymous-bits-field-block'
+ ' anonymous-bits-field-block')
+def _conditional_block_plus_field_block(conditional_block, block):
+ return _List(conditional_block.list + block.list)
+
+
+@_handles('struct-field-block ->'
+ ' unconditional-struct-field struct-field-block')
+@_handles('bits-field-block ->'
+ ' unconditional-bits-field bits-field-block')
+@_handles('anonymous-bits-field-block ->'
+ ' unconditional-anonymous-bits-field anonymous-bits-field-block')
+def _unconditional_block_plus_field_block(field, block):
+ """Prepends an unconditional field to block."""
+ field.field.existence_condition.source_location.CopyFrom(
+ field.source_location)
+ field.field.existence_condition.boolean_constant.source_location.CopyFrom(
+ field.source_location)
+ field.field.existence_condition.boolean_constant.value = True
+ return _List([field] + block.list)
+
+
+# Struct "fields" are regular fields, inline enums, bits, or structs, anonymous
+# inline bits, or virtual fields.
+@_handles('unconditional-struct-field -> field')
+@_handles('unconditional-struct-field -> inline-enum-field-definition')
+@_handles('unconditional-struct-field -> inline-bits-field-definition')
+@_handles('unconditional-struct-field -> inline-struct-field-definition')
+@_handles('unconditional-struct-field -> anonymous-bits-field-definition')
+@_handles('unconditional-struct-field -> virtual-field')
+# Bits fields are "regular" fields, inline enums or bits, or virtual fields.
+#
+# Inline structs and anonymous inline bits are not allowed inside of bits:
+# anonymous inline bits are pointless, and inline structs do not make sense,
+# since a struct cannot be a part of a bits.
+#
+# Anonymous inline bits may not include virtual fields; instead, the virtual
+# field should be a direct part of the enclosing structure.
+@_handles('unconditional-anonymous-bits-field -> field')
+@_handles('unconditional-anonymous-bits-field -> inline-enum-field-definition')
+@_handles('unconditional-anonymous-bits-field -> inline-bits-field-definition')
+@_handles('unconditional-bits-field -> unconditional-anonymous-bits-field')
+@_handles('unconditional-bits-field -> virtual-field')
+def _unconditional_field(field):
+ """Handles the unifying grammar production for a struct or bits field."""
+ return field
+
+
+# TODO(bolms): Add 'elif' and 'else' support.
+# TODO(bolms): Should nested 'if' blocks be allowed?
+@_handles('conditional-struct-field-block ->'
+ ' "if" expression ":" Comment? eol'
+ ' Indent unconditional-struct-field+ Dedent')
+@_handles('conditional-bits-field-block ->'
+ ' "if" expression ":" Comment? eol'
+ ' Indent unconditional-bits-field+ Dedent')
+@_handles('conditional-anonymous-bits-field-block ->'
+ ' "if" expression ":" Comment? eol'
+ ' Indent unconditional-anonymous-bits-field+ Dedent')
+def _conditional_field_block(if_keyword, expression, colon, comment, newline,
+ indent, fields, dedent):
+ """Applies an existence_condition to each element of fields."""
+ del if_keyword, newline, colon, comment, indent, dedent # Unused.
+ for field in fields.list:
+ condition = field.field.existence_condition
+ condition.CopyFrom(expression)
+ condition.source_location.is_disjoint_from_parent = True
+ return fields
+
+
+# The body of a bit field definition: basically, the part after the first line.
+@_handles('bits-body -> Indent doc-line* attribute-line*'
+ ' type-definition* bits-field-block Dedent')
+def _bits_body(indent, docs, attributes, types, fields, dedent):
+ del indent, dedent # Unused.
+ return _structure_body(docs, attributes, types, fields,
+ ir_pb2.TypeDefinition.BIT)
+
+
+# Inline bits (defined as part of a field) are more restricted than standalone
+# bits.
+@_handles('anonymous-bits-body ->'
+ ' Indent attribute-line* anonymous-bits-field-block Dedent')
+def _anonymous_bits_body(indent, attributes, fields, dedent):
+ del indent, dedent # Unused.
+ return _structure_body(_List([]), attributes, _List([]), fields,
+ ir_pb2.TypeDefinition.BIT)
+
+
+# A field is:
+# range type name (abbr) [attr: value] [attr2: value] -- doc
+# -- doc
+# -- doc
+# [attr3: value]
+# [attr4: value]
+@_handles('field ->'
+ ' field-location type snake-name abbreviation? attribute* doc?'
+ ' Comment? eol field-body?')
+def _field(location, field_type, name, abbreviation, attributes, doc, comment,
+ newline, field_body):
+ """Constructs an ir_pb2.Field from the given components."""
+ del comment # Unused
+ field = ir_pb2.Field(location=location,
+ type=field_type,
+ name=name,
+ attribute=attributes.list,
+ documentation=doc.list)
+ if field_body.list:
+ field.attribute.extend(field_body.list[0].attribute)
+ field.documentation.extend(field_body.list[0].documentation)
+ if abbreviation.list:
+ field.abbreviation.CopyFrom(abbreviation.list[0])
+ field.source_location.start.CopyFrom(location.source_location.start)
+ if field_body.source_location.HasField('end'):
+ field.source_location.end.CopyFrom(field_body.source_location.end)
+ else:
+ field.source_location.end.CopyFrom(newline.source_location.end)
+ return _FieldWithType(field=field)
+
+
+# A "virtual field" is:
+# let name = value
+# -- doc
+# -- doc
+# [attr1: value]
+# [attr2: value]
+@_handles('virtual-field ->'
+ ' "let" snake-name "=" expression Comment? eol field-body?')
+def _virtual_field(let, name, equals, value, comment, newline, field_body):
+ """Constructs an ir_pb2.Field from the given components."""
+ del equals, comment # Unused
+ field = ir_pb2.Field(read_transform=value, name=name)
+ if field_body.list:
+ field.attribute.extend(field_body.list[0].attribute)
+ field.documentation.extend(field_body.list[0].documentation)
+ field.source_location.start.CopyFrom(let.source_location.start)
+ if field_body.source_location.HasField('end'):
+ field.source_location.end.CopyFrom(field_body.source_location.end)
+ else:
+ field.source_location.end.CopyFrom(newline.source_location.end)
+ return _FieldWithType(field=field)
+
+
+# An inline enum is:
+# range "enum" name (abbr):
+# -- doc
+# -- doc
+# [attr3: value]
+# [attr4: value]
+# NAME = 10
+# NAME2 = 20
+@_handles('inline-enum-field-definition ->'
+ ' field-location "enum" snake-name abbreviation? ":" Comment? eol'
+ ' enum-body')
+def _inline_enum_field(location, enum, name, abbreviation, colon, comment,
+ newline, enum_body):
+ """Constructs an ir_pb2.Field for an inline enum field."""
+ del enum, colon, comment, newline # Unused.
+ return _inline_type_field(location, name, abbreviation, enum_body)
+
+
+@_handles(
+ 'inline-struct-field-definition ->'
+ ' field-location "struct" snake-name abbreviation? ":" Comment? eol'
+ ' struct-body')
+def _inline_struct_field(location, struct, name, abbreviation, colon, comment,
+ newline, struct_body):
+ del struct, colon, comment, newline # Unused.
+ return _inline_type_field(location, name, abbreviation, struct_body)
+
+
+@_handles('inline-bits-field-definition ->'
+ ' field-location "bits" snake-name abbreviation? ":" Comment? eol'
+ ' bits-body')
+def _inline_bits_field(location, bits, name, abbreviation, colon, comment,
+ newline, bits_body):
+ del bits, colon, comment, newline # Unused.
+ return _inline_type_field(location, name, abbreviation, bits_body)
+
+
+def _inline_type_field(location, name, abbreviation, body):
+ """Builds the field and subtypes for inline and anonymous type fields."""
+ field = ir_pb2.Field(location=location,
+ name=name,
+ attribute=body.attribute,
+ documentation=body.documentation)
+ # All attributes should be attached to the field, not the type definition: if
+ # the user wants to use type attributes, they should create a separate type
+ # definition and reference it.
+ del body.attribute[:]
+ type_name = ir_pb2.NameDefinition()
+ type_name.CopyFrom(name)
+ type_name.name.text = name_conversion.snake_to_camel(type_name.name.text)
+ field.type.atomic_type.reference.source_name.extend([type_name.name])
+ field.type.atomic_type.reference.source_location.CopyFrom(
+ type_name.source_location)
+ field.type.atomic_type.reference.is_local_name = True
+ field.type.atomic_type.source_location.CopyFrom(type_name.source_location)
+ field.type.source_location.CopyFrom(type_name.source_location)
+ if abbreviation.list:
+ field.abbreviation.CopyFrom(abbreviation.list[0])
+ field.source_location.start.CopyFrom(location.source_location.start)
+ body.source_location.start.CopyFrom(location.source_location.start)
+ if body.HasField('enumeration'):
+ body.enumeration.source_location.CopyFrom(body.source_location)
+ else:
+ assert body.HasField('structure')
+ body.structure.source_location.CopyFrom(body.source_location)
+ body.name.CopyFrom(type_name)
+ field.source_location.end.CopyFrom(body.source_location.end)
+ subtypes = [body] + list(body.subtype)
+ del body.subtype[:]
+ return _FieldWithType(field=field, subtypes=subtypes)
+
+
+@_handles('anonymous-bits-field-definition ->'
+ ' field-location "bits" ":" Comment? eol anonymous-bits-body')
+def _anonymous_bit_field(location, bits_keyword, colon, comment, newline,
+ bits_body):
+ """Constructs an ir_pb2.Field for an anonymous bit field."""
+ del colon, comment, newline # Unused.
+ name = ir_pb2.NameDefinition(
+ name=ir_pb2.Word(
+ text=_get_anonymous_field_name(),
+ source_location=bits_keyword.source_location),
+ source_location=bits_keyword.source_location,
+ is_anonymous=True)
+ return _inline_type_field(location, name, _List([]), bits_body)
+
+
+@_handles('field-body -> Indent doc-line* attribute-line* Dedent')
+def _field_body(indent, docs, attributes, dedent):
+ del indent, dedent # Unused.
+ return ir_pb2.Field(documentation=docs.list, attribute=attributes.list)
+
+
+# A parenthetically-denoted abbreviation.
+@_handles('abbreviation -> "(" snake-word ")"')
+def _abbreviation(open_paren, word, close_paren):
+ del open_paren, close_paren # Unused.
+ return word
+
+
+# enum EnumName:
+# ... values ...
+@_handles('enum -> "enum" type-name ":" Comment? eol enum-body')
+def _enum(enum, name, colon, comment, newline, enum_body):
+ del colon, comment, newline # Unused.
+ enum_body.enumeration.source_location.start.CopyFrom(
+ enum.source_location.start)
+ enum_body.enumeration.source_location.end.CopyFrom(
+ enum_body.source_location.end)
+ enum_body.name.CopyFrom(name)
+ return enum_body
+
+
+# [enum Foo:]
+# name = value
+# name = value
+@_handles('enum-body -> Indent doc-line* attribute-line* enum-value+ Dedent')
+def _enum_body(indent, docs, attributes, values, dedent):
+ del indent, dedent # Unused.
+ return ir_pb2.TypeDefinition(
+ enumeration=ir_pb2.Enum(value=values.list),
+ documentation=docs.list,
+ attribute=attributes.list,
+ addressable_unit=ir_pb2.TypeDefinition.BIT)
+
+
+# name = value
+@_handles('enum-value -> '
+ ' constant-name "=" expression doc? Comment? eol enum-value-body?')
+def _enum_value(name, equals, expression, documentation, comment, newline,
+ body):
+ del equals, comment, newline # Unused.
+ result = ir_pb2.EnumValue(name=name,
+ value=expression,
+ documentation=documentation.list)
+ if body.list:
+ result.documentation.extend(body.list[0].list)
+ return result
+
+
+@_handles('enum-value-body -> Indent doc-line* Dedent')
+def _enum_value_body(indent, docs, dedent):
+ del indent, dedent # Unused.
+ return docs
+
+
+# An external is just a declaration that a type exists and has certain
+# attributes.
+@_handles('external -> "external" type-name ":" Comment? eol external-body')
+def _external(external, name, colon, comment, newline, external_body):
+ del colon, comment, newline # Unused.
+ external_body.source_location.start.CopyFrom(external.source_location.start)
+ external_body.name.CopyFrom(name)
+ return external_body
+
+
+# This syntax implicitly requires either a documentation line or an attribute
+# line, or it won't parse (because no Indent/Dedent tokens will be emitted).
+@_handles('external-body -> Indent doc-line* attribute-line* Dedent')
+def _external_body(indent, docs, attributes, dedent):
+ return ir_pb2.TypeDefinition(
+ external=ir_pb2.External(
+ # Set source_location here, since it won't be set automatically.
+ source_location=ir_pb2.Location(start=indent.source_location.start,
+ end=dedent.source_location.end)),
+ documentation=docs.list,
+ attribute=attributes.list)
+
+
+@_handles('field-location -> expression "[" "+" expression "]"')
+def _field_location(start, open_bracket, plus, size, close_bracket):
+ del open_bracket, plus, close_bracket # Unused.
+ return ir_pb2.FieldLocation(start=start, size=size)
+
+
+@_handles('delimited-argument-list -> "(" argument-list ")"')
+def _type_argument_list(open_paren, arguments, close_paren):
+ del open_paren, close_paren # Unused
+ return arguments
+
+
+# A type is "TypeName" or "TypeName[length]" or "TypeName[length][length]", etc.
+# An array type may have an empty length ("Type[]"). This is only valid for the
+# outermost length (the last set of brackets), but that must be checked
+# elsewhere.
+@_handles('type -> type-reference delimited-argument-list? type-size-specifier?'
+ ' array-length-specifier*')
+def _type(reference, parameters, size, array_spec):
+ """Builds the IR for a type specifier."""
+ base_type_source_location_end = reference.source_location.end
+ atomic_type_source_location_end = reference.source_location.end
+ if parameters.list:
+ base_type_source_location_end = parameters.source_location.end
+ atomic_type_source_location_end = parameters.source_location.end
+ if size.list:
+ base_type_source_location_end = size.source_location.end
+ base_type_location = parser_types.make_location(
+ reference.source_location.start,
+ base_type_source_location_end)
+ atomic_type_location = parser_types.make_location(
+ reference.source_location.start,
+ atomic_type_source_location_end)
+ t = ir_pb2.Type(
+ atomic_type=ir_pb2.AtomicType(
+ reference=reference,
+ source_location=atomic_type_location,
+ runtime_parameter=parameters.list[0].list if parameters.list else []),
+ size_in_bits=size.list[0] if size.list else None,
+ source_location=base_type_location)
+ for length in array_spec.list:
+ location = parser_types.make_location(
+ t.source_location.start, length.source_location.end)
+ if isinstance(length, ir_pb2.Expression):
+ t = ir_pb2.Type(
+ array_type=ir_pb2.ArrayType(base_type=t,
+ element_count=length,
+ source_location=location),
+ source_location=location)
+ elif isinstance(length, ir_pb2.Empty):
+ t = ir_pb2.Type(
+ array_type=ir_pb2.ArrayType(base_type=t,
+ automatic=length,
+ source_location=location),
+ source_location=location)
+ else:
+ assert False, "Shouldn't be here."
+ return t
+
+
+# TODO(bolms): Should symbolic names or expressions be allowed? E.g.,
+# UInt:FIELD_SIZE or UInt:(16 + 16)?
+@_handles('type-size-specifier -> ":" numeric-constant')
+def _type_size_specifier(colon, numeric_constant):
+ """Handles the ":32" part of a type specifier like "UInt:32"."""
+ del colon
+ return ir_pb2.Expression(constant=numeric_constant)
+
+
+# The distinctions between different formats of NameDefinitions, Words, and
+# References are enforced during parsing, but not propagated to the IR.
+@_handles('type-name -> type-word')
+@_handles('snake-name -> snake-word')
+@_handles('constant-name -> constant-word')
+def _name(word):
+ return ir_pb2.NameDefinition(name=word)
+
+
+@_handles('type-word -> CamelWord')
+@_handles('snake-word -> SnakeWord')
+@_handles('builtin-field-word -> "$size_in_bits"')
+@_handles('builtin-field-word -> "$size_in_bytes"')
+@_handles('builtin-field-word -> "$max_size_in_bits"')
+@_handles('builtin-field-word -> "$max_size_in_bytes"')
+@_handles('builtin-field-word -> "$min_size_in_bits"')
+@_handles('builtin-field-word -> "$min_size_in_bytes"')
+@_handles('builtin-word -> "$is_statically_sized"')
+@_handles('builtin-word -> "$static_size_in_bits"')
+@_handles('constant-word -> ShoutyWord')
+@_handles('and-operator -> "&&"')
+@_handles('or-operator -> "||"')
+@_handles('less-operator -> "<="')
+@_handles('less-operator -> "<"')
+@_handles('greater-operator -> ">="')
+@_handles('greater-operator -> ">"')
+@_handles('equality-operator -> "=="')
+@_handles('inequality-operator -> "!="')
+@_handles('additive-operator -> "+"')
+@_handles('additive-operator -> "-"')
+@_handles('multiplicative-operator -> "*"')
+@_handles('function-name -> "$max"')
+@_handles('function-name -> "$present"')
+@_handles('function-name -> "$upper_bound"')
+@_handles('function-name -> "$lower_bound"')
+def _word(word):
+ return ir_pb2.Word(text=word.text)
+
+
+@_handles('type-reference -> type-reference-tail')
+@_handles('constant-reference -> constant-reference-tail')
+def _un_module_qualified_type_reference(reference):
+ return reference
+
+
+@_handles('constant-reference-tail -> constant-word')
+@_handles('type-reference-tail -> type-word')
+@_handles('snake-reference -> snake-word')
+@_handles('snake-reference -> builtin-field-word')
+def _reference(word):
+ return ir_pb2.Reference(source_name=[word])
+
+
+@_handles('builtin-reference -> builtin-word')
+def _builtin_reference(word):
+ return ir_pb2.Reference(source_name=[word],
+ canonical_name=ir_pb2.CanonicalName(
+ object_path=[word.text]))
+
+
+# Because constant-references ("Enum.NAME") are used in the same contexts as
+# field-references ("field.subfield"), module-qualified constant references
+# ("module.Enum.VALUE") have to take snake-reference, not snake-word, on the
+# left side of the dot. Otherwise, when a "snake_word" is followed by a "." in
+# an expression context, the LR(1) parser cannot determine whether to reduce the
+# snake-word to snake-reference (to eventually become field-reference), or to
+# shift the dot onto the stack (to eventually become constant-reference). By
+# using snake-reference as the head of both, the parser can always reduce, then
+# shift the dot, then determine whether to proceed with constant-reference if it
+# sees "snake_name.TypeName" or field-reference if it sees
+# "snake_name.snake_name".
+@_handles('constant-reference -> snake-reference "." constant-reference-tail')
+def _module_qualified_constant_reference(new_head, dot, reference):
+ del dot # Unused.
+ new_source_name = list(new_head.source_name) + list(reference.source_name)
+ del reference.source_name[:]
+ reference.source_name.extend(new_source_name)
+ return reference
+
+
+@_handles('constant-reference-tail -> type-word "." constant-reference-tail')
+# module.Type.SubType.name is a reference to something that *must* be a
+# constant.
+@_handles('constant-reference-tail -> type-word "." snake-reference')
+@_handles('type-reference-tail -> type-word "." type-reference-tail')
+@_handles('type-reference -> snake-word "." type-reference-tail')
+def _qualified_reference(word, dot, reference):
+ """Adds a name. or Type. qualification to the head of a reference."""
+ del dot # Unused.
+ new_source_name = [word] + list(reference.source_name)
+ del reference.source_name[:]
+ reference.source_name.extend(new_source_name)
+ return reference
+
+
+# Arrays are properly translated to IR in _type().
+@_handles('array-length-specifier -> "[" expression "]"')
+def _array_length_specifier(open_bracket, length, close_bracket):
+ del open_bracket, close_bracket # Unused.
+ return length
+
+
+# An array specifier can end with empty brackets ("arr[3][]"), in which case the
+# array's size is inferred from the size of its enclosing field.
+@_handles('array-length-specifier -> "[" "]"')
+def _auto_array_length_specifier(open_bracket, close_bracket):
+ # Note that the Empty's source_location is the space between the brackets (if
+ # any).
+ return ir_pb2.Empty(
+ source_location=ir_pb2.Location(start=open_bracket.source_location.end,
+ end=close_bracket.source_location.start))
+
+
+@_handles('eol -> "\\n" comment-line*')
+def _eol(eol, comments):
+ del comments # Unused
+ return eol
+
+
+@_handles('comment-line -> Comment? "\\n"')
+def _comment_line(comment, eol):
+ del comment # Unused
+ return eol
+
+
+def _finalize_grammar():
+ """Adds productions for foo*, foo+, and foo? symbols."""
+ star_symbols = set()
+ plus_symbols = set()
+ option_symbols = set()
+ for production in _handlers:
+ for symbol in production.rhs:
+ if symbol[-1] == '*':
+ star_symbols.add(symbol[:-1])
+ elif symbol[-1] == '+':
+ # symbol+ relies on the rule for symbol*
+ star_symbols.add(symbol[:-1])
+ plus_symbols.add(symbol[:-1])
+ elif symbol[-1] == '?':
+ option_symbols.add(symbol[:-1])
+ for symbol in star_symbols:
+ _handles('{s}* -> {s} {s}*'.format(s=symbol))(
+ lambda e, r: _List([e] + r.list))
+ _handles('{s}* ->'.format(s=symbol))(lambda: _List([]))
+ for symbol in plus_symbols:
+ _handles('{s}+ -> {s} {s}*'.format(s=symbol))(
+ lambda e, r: _List([e] + r.list))
+ for symbol in option_symbols:
+ _handles('{s}? -> {s}'.format(s=symbol))(lambda e: _List([e]))
+ _handles('{s}? ->'.format(s=symbol))(lambda: _List([]))
+
+
+_finalize_grammar()
+
+# End of grammar.
+################################################################################
+
+# These export the grammar used by module_ir so that parser_generator can build
+# a parser for the same language.
+START_SYMBOL = 'module'
+EXPRESSION_START_SYMBOL = 'expression'
+PRODUCTIONS = list(_handlers.keys())
diff --git a/front_end/module_ir_test.py b/front_end/module_ir_test.py
new file mode 100644
index 0000000..d2fc27f
--- /dev/null
+++ b/front_end/module_ir_test.py
@@ -0,0 +1,4087 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for module_ir."""
+
+from __future__ import print_function
+
+import collections
+import pkgutil
+import unittest
+
+from front_end import module_ir
+from front_end import parser
+from front_end import test_util
+from front_end import tokenizer
+from public import ir_pb2
+
+_TESTDATA_PATH = "testdata.golden"
+_MINIMAL_SAMPLE = parser.parse_module(
+ tokenizer.tokenize(
+ pkgutil.get_data(_TESTDATA_PATH, "span_se_log_file_status.emb").decode(
+ encoding="UTF-8"),
+ "")[0]).parse_tree
+_MINIMAL_SAMPLE_IR = ir_pb2.Module.from_json(
+ pkgutil.get_data(_TESTDATA_PATH, "span_se_log_file_status.ir.txt").decode(
+ encoding="UTF-8")
+)
+
+# _TEST_CASES contains test cases, separated by '===', that ensure that specific
+# results show up in the IR for .embs.
+#
+# Each test case is of the form:
+#
+# name
+# ---
+# .emb text
+# ---
+# (incomplete) IR text format
+#
+# For each test case, the .emb is parsed into a parse tree, which is fed into
+# module_ir.build_ir(), which should successfully return an IR. The generated
+# IR is then compared against the incomplete IR in the test case to ensure that
+# the generated IR is a strict superset of the test case IR -- that is, it is OK
+# if the generated IR contains fields that are not in the test case, but not if
+# the test case contains fields that are not in the generated IR, and not if the
+# test case contains fields whose values differ from the generated IR.
+#
+# Additionally, for each test case, a pass is executed to ensure that the source
+# code location for each node in the IR is strictly contained within the source
+# location for its parent node.
+_TEST_CASES = r"""
+prelude
+---
+external UInt:
+ [fixed_size: false]
+ [byte_order_dependent: true]
+
+external Byte:
+ [size: 1]
+ [byte_order_dependent: false]
+---
+{
+ "type": [
+ {
+ "external": {},
+ "name": { "name": { "text": "UInt" } },
+ "attribute": [
+ {
+ "name": { "text": "fixed_size" },
+ "value": { "expression": { "boolean_constant": { "value": false } } }
+ },
+ {
+ "name": { "text": "byte_order_dependent" },
+ "value": { "expression": { "boolean_constant": { "value": true } } }
+ }
+ ]
+ },
+ {
+ "external": {},
+ "name": { "name": { "text": "Byte" } },
+ "attribute": [
+ {
+ "name": { "text": "size" },
+ "value": { "expression": { "constant": { "value": "1" } } }
+ },
+ {
+ "name": { "text": "byte_order_dependent" },
+ "value": { "expression": { "boolean_constant": { "value": false } } }
+ }
+ ]
+ }
+ ]
+}
+
+===
+numbers
+---
+bits Foo:
+ 0000000000 [+0_000_000_003] UInt decimal
+ 0b00000100 [+0b0000_0111] UInt binary
+ 0b00000000_00001000 [+0b0_00001011] UInt binary2
+ 0b_0_00001100 [+0b_00001111] UInt binary3
+ 0x00000010 [+0x0000_0013] UInt hex
+ 0x00000000_00000014 [+0x0_00000017] UInt hex2
+ 0x_0_00000018 [+0x_0000001b] UInt hex3
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "decimal" } },
+ "location": {
+ "start": { "constant": { "value": "0" } },
+ "size": { "constant": { "value": "3" } }
+ }
+ },
+ {
+ "name": { "name": { "text": "binary" } },
+ "location": {
+ "start": { "constant": { "value": "4" } },
+ "size": { "constant": { "value": "7" }
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "binary2" } },
+ "location": {
+ "start": { "constant": { "value": "8" } },
+ "size": { "constant": { "value": "11" } }
+ }
+ },
+ {
+ "name": { "name": { "text": "binary3" } },
+ "location": {
+ "start": { "constant": { "value": "12" } },
+ "size": { "constant": { "value": "15" } }
+ }
+ },
+ {
+ "name": { "name": { "text": "hex" } },
+ "location": {
+ "start": { "constant": { "value": "16" } },
+ "size": { "constant": { "value": "19" } }
+ }
+ },
+ {
+ "name": { "name": { "text": "hex2" } },
+ "location": {
+ "start": { "constant": { "value": "20" } },
+ "size": { "constant": { "value": "23" } }
+ }
+ },
+ {
+ "name": { "name": { "text": "hex3" } },
+ "location": {
+ "start": { "constant": { "value": "24" } },
+ "size": { "constant": { "value": "27" } }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+enum
+---
+enum Kind:
+ WIDGET = 0
+ SPROCKET = 1
+ GEEGAW = 2 # Comment.
+ MAX32 = 4294967295
+ MAX64 = 9223372036854775807
+---
+{
+ "type": [
+ {
+ "enumeration": {
+ "value": [
+ {
+ "name": { "name": { "text": "WIDGET" } },
+ "value": { "constant": { "value": "0" } }
+ },
+ {
+ "name": { "name": { "text": "SPROCKET" } },
+ "value": { "constant": { "value": "1" } }
+ },
+ {
+ "name": { "name": { "text": "GEEGAW" } },
+ "value": { "constant": { "value": "2" } }
+ },
+ {
+ "name": { "name": { "text": "MAX32" } },
+ "value": { "constant": { "value": "4294967295" } }
+ },
+ {
+ "name": { "name": { "text": "MAX64" } },
+ "value": { "constant": { "value": "9223372036854775807" } }
+ }
+ ]
+ },
+ "name": { "name": { "text": "Kind" } }
+ }
+ ]
+}
+
+===
+struct attribute
+---
+struct Foo:
+ [size: 10]
+ 0 [+0] UInt field
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [ { "name": { "name": { "text": "field" } } } ]
+ },
+ "name": { "name": { "text": "Foo" } },
+ "attribute": [
+ {
+ "name": { "text": "size" },
+ "value": { "expression": { "constant": { "value": "10" } } },
+ "is_default": false
+ }
+ ]
+ }
+ ]
+}
+
+===
+$default attribute
+---
+[$default byte_order: "LittleEndian"]
+---
+{
+ "attribute": [
+ {
+ "name": { "text": "byte_order" },
+ "value": { "string_constant": { "text": "LittleEndian" } },
+ "is_default": true
+ }
+ ]
+}
+
+===
+abbreviations
+---
+struct Foo:
+ 0 [+1] UInt size (s)
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "size" } },
+ "abbreviation": { "text": "s" }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+expressions
+---
+struct Foo:
+ 0+1 [+2*3] UInt plus_times
+ 4-5 [+(6)] UInt minus_paren
+ nn [+7*(8+9)] UInt name_complex
+ 10+11+12 [+13*14*15] UInt associativity
+ 16+17*18 [+19*20-21] UInt precedence
+ -(+1) [+0-(-10)] UInt unary_plus_minus
+ 1 + + 2 [+3 - -4 - 5] UInt unary_plus_minus_2
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "plus_times" } },
+ "location": {
+ "start": {
+ "function": {
+ "function": "ADDITION",
+ "function_name": { "text": "+" },
+ "args": [
+ { "constant": { "value": "0" } },
+ { "constant": { "value": "1" } }
+ ]
+ }
+ },
+ "size": {
+ "function": {
+ "function": "MULTIPLICATION",
+ "function_name": { "text": "*" },
+ "args": [
+ { "constant": { "value": "2" } },
+ { "constant": { "value": "3" } }
+ ]
+ }
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "minus_paren" } },
+ "location": {
+ "start": {
+ "function": {
+ "function": "SUBTRACTION",
+ "args": [
+ { "constant": { "value": "4" } },
+ { "constant": { "value": "5" } }
+ ]
+ }
+ },
+ "size": { "constant": { "value": "6" } }
+ }
+ },
+ {
+ "name": { "name": { "text": "name_complex" } },
+ "location": {
+ "start": {
+ "field_reference": {
+ "path": [ { "source_name": [ { "text": "nn" } ] } ]
+ }
+ },
+ "size": {
+ "function": {
+ "function": "MULTIPLICATION",
+ "args": [
+ { "constant": { "value": "7" } },
+ {
+ "function": {
+ "function": "ADDITION",
+ "args": [
+ { "constant": { "value": "8" } },
+ { "constant": { "value": "9" } }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "associativity" } },
+ "location": {
+ "start": {
+ "function": {
+ "function": "ADDITION",
+ "args": [
+ {
+ "function": {
+ "function": "ADDITION",
+ "args": [
+ { "constant": { "value": "10" } },
+ { "constant": { "value": "11" } }
+ ]
+ }
+ },
+ { "constant": { "value": "12" } }
+ ]
+ }
+ },
+ "size": {
+ "function": {
+ "function": "MULTIPLICATION",
+ "args": [
+ {
+ "function": {
+ "function": "MULTIPLICATION",
+ "args": [
+ { "constant": { "value": "13" } },
+ { "constant": { "value": "14" } }
+ ]
+ }
+ },
+ { "constant": { "value": "15" } }
+ ]
+ }
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "precedence" } },
+ "location": {
+ "start": {
+ "function": {
+ "function": "ADDITION",
+ "args": [
+ { "constant": { "value": "16" } },
+ {
+ "function": {
+ "function": "MULTIPLICATION",
+ "args": [
+ { "constant": { "value": "17" } },
+ { "constant": { "value": "18" } }
+ ]
+ }
+ }
+ ]
+ }
+ },
+ "size": {
+ "function": {
+ "function": "SUBTRACTION",
+ "args": [
+ {
+ "function": {
+ "function": "MULTIPLICATION",
+ "args": [
+ { "constant": { "value": "19" } },
+ { "constant": { "value": "20" } }
+ ]
+ }
+ },
+ { "constant": { "value": "21" } }
+ ]
+ }
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "unary_plus_minus" } },
+ "location": {
+ "start": {
+ "function": {
+ "function": "SUBTRACTION",
+ "function_name": {
+ "text": "-",
+ "source_location": {
+ "start": { "line": 8, "column": 3 },
+ "end": { "line": 8, "column": 4 }
+ }
+ },
+ "args": [
+ {
+ "constant": {
+ "value": "0",
+ "source_location": {
+ "start": { "line": 8, "column": 3 },
+ "end": { "line": 8, "column": 3 }
+ }
+ },
+ "source_location": {
+ "start": { "line": 8, "column": 3 },
+ "end": { "line": 8, "column": 3 }
+ }
+ },
+ {
+ "function": {
+ "function": "ADDITION",
+ "function_name": {
+ "text": "+",
+ "source_location": {
+ "start": { "line": 8, "column": 5 },
+ "end": { "line": 8, "column": 6 }
+ }
+ },
+ "args": [
+ {
+ "constant": { "value": "0" },
+ "source_location": {
+ "start": { "line": 8, "column": 5 },
+ "end": { "line": 8, "column": 5 }
+ }
+ },
+ {
+ "constant": { "value": "1" },
+ "source_location": {
+ "start": { "line": 8, "column": 6 },
+ "end": { "line": 8, "column": 7 }
+ }
+ }
+ ]
+ },
+ "source_location": {
+ "start": { "line": 8, "column": 4 },
+ "end": { "line": 8, "column": 8 }
+ }
+ }
+ ]
+ }
+ },
+ "size": {
+ "function": {
+ "function": "SUBTRACTION",
+ "function_name": {
+ "text": "-",
+ "source_location": {
+ "start": { "line": 8, "column": 12 },
+ "end": { "line": 8, "column": 13 }
+ }
+ },
+ "args": [
+ {
+ "constant": {
+ "value": "0",
+ "source_location": {
+ "start": { "line": 8, "column": 11 },
+ "end": { "line": 8, "column": 12 }
+ }
+ },
+ "source_location": {
+ "start": { "line": 8, "column": 11 },
+ "end": { "line": 8, "column": 12 }
+ }
+ },
+ {
+ "function": {
+ "function": "SUBTRACTION",
+ "function_name": {
+ "text": "-",
+ "source_location": {
+ "start": { "line": 8, "column": 14 },
+ "end": { "line": 8, "column": 15 }
+ }
+ },
+ "args": [
+ {
+ "constant": { "value": "0" },
+ "source_location": {
+ "start": { "line": 8, "column": 14 },
+ "end": { "line": 8, "column": 14 }
+ }
+ },
+ {
+ "constant": { "value": "10" },
+ "source_location": {
+ "start": { "line": 8, "column": 15 },
+ "end": { "line": 8, "column": 17 }
+ }
+ }
+ ]
+ },
+ "source_location": {
+ "start": { "line": 8, "column": 13 },
+ "end": { "line": 8, "column": 18 }
+ }
+ }
+ ]
+ }
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "unary_plus_minus_2" } },
+ "location": {
+ "start": {
+ "function": {
+ "function": "ADDITION",
+ "args": [
+ { "constant": { "value": "1" } },
+ {
+ "function": {
+ "function": "ADDITION",
+ "args": [
+ { "constant": { "value": "0" } },
+ { "constant": { "value": "2" } }
+ ]
+ }
+ }
+ ]
+ }
+ },
+ "size": {
+ "function": {
+ "function": "SUBTRACTION",
+ "args": [
+ {
+ "function": {
+ "function": "SUBTRACTION",
+ "args": [
+ { "constant": { "value": "3" } },
+ {
+ "function": {
+ "function": "SUBTRACTION",
+ "args": [
+ { "constant": { "value": "0" } },
+ { "constant": { "value": "4" } }
+ ]
+ }
+ }
+ ]
+ }
+ },
+ { "constant": { "value": "5" } }
+ ]
+ }
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+auto array size
+---
+struct TenElementArray:
+ 0 [+10] Byte[] bytes
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "type": {
+ "array_type": {
+ "base_type": {
+ "atomic_type": {
+ "reference": { "source_name": [ { "text": "Byte" } ] }
+ }
+ },
+ "automatic": {
+ "source_location": {
+ "start": { "line": 3, "column": 16 },
+ "end": { "line": 3, "column": 18 }
+ }
+ }
+ }
+ },
+ "name": { "name": { "text": "bytes" } }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+start [+size] ranges
+---
+struct Foo:
+ 0 [ + 1 ] UInt zero_plus_one
+ s [+2] UInt s_plus_two
+ s [+t] Byte[t] s_plus_t
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "zero_plus_one" } },
+ "location": {
+ "start": {
+ "constant": { "value": "0" },
+ "source_location": {
+ "start": { "line": 3, "column": 3 },
+ "end": { "line": 3, "column": 4 }
+ }
+ },
+ "size": {
+ "constant": { "value": "1" },
+ "source_location": {
+ "start": { "line": 3, "column": 9 },
+ "end": { "line": 3, "column": 10 }
+ }
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "s_plus_two" } },
+ "location": {
+ "start": {
+ "field_reference": {
+ "path": [ { "source_name": [ { "text": "s" } ] } ]
+ }
+ },
+ "size": { "constant": { "value": "2" } }
+ }
+ },
+ {
+ "name": { "name": { "text": "s_plus_t" } },
+ "location": {
+ "start": {
+ "field_reference": {
+ "path": [ { "source_name": [ { "text": "s" } ] } ]
+ }
+ },
+ "size": {
+ "field_reference": {
+ "path": [ { "source_name": [ { "text": "t" } ] } ]
+ }
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+Using Enum.VALUEs in expressions
+---
+struct Foo:
+ 0 [+0+Number.FOUR] UInt length_four
+ Number.FOUR [+8] UInt start_four
+ 8 [+3*Number.FOUR] UInt end_four
+ 12 [+16] Byte[Number.FOUR] array_size_four
+
+enum Number:
+ FOUR = 4
+ EIGHT = FOUR + Number.FOUR
+ SIXTEEN = Number.FOUR * FOUR
+ INVALID = Number.NaN.FOUR
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "length_four" } },
+ "location": {
+ "size": {
+ "function": {
+ "function": "ADDITION",
+ "args": [
+ { "constant": { "value": "0" } },
+ {
+ "constant_reference": {
+ "source_name": [
+ { "text": "Number" },
+ { "text": "FOUR" }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "start_four" } },
+ "location": {
+ "start": {
+ "constant_reference": {
+ "source_name": [
+ { "text": "Number" },
+ { "text": "FOUR" }
+ ]
+ }
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "end_four" } },
+ "location": {
+ "size": {
+ "function": {
+ "function": "MULTIPLICATION",
+ "args": [
+ { "constant": { "value": "3" } },
+ {
+ "constant_reference": {
+ "source_name": [
+ { "text": "Number" },
+ { "text": "FOUR" }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ },
+ {
+ "type": {
+ "array_type": {
+ "element_count": {
+ "constant_reference": {
+ "source_name": [
+ { "text": "Number" },
+ { "text": "FOUR" }
+ ]
+ }
+ }
+ }
+ },
+ "name": { "name": { "text": "array_size_four" } }
+ }
+ ]
+ }
+ },
+ {
+ "enumeration": {
+ "value": [
+ {
+ "name": { "name": { "text": "FOUR" } },
+ "value": { "constant": { "value": "4" } }
+ },
+ {
+ "name": { "name": { "text": "EIGHT" } },
+ "value": {
+ "function": {
+ "function": "ADDITION",
+ "args": [
+ {
+ "constant_reference": {
+ "source_name": [ { "text": "FOUR" } ]
+ }
+ },
+ {
+ "constant_reference": {
+ "source_name": [
+ { "text": "Number" },
+ { "text": "FOUR" }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "SIXTEEN" } },
+ "value": {
+ "function": {
+ "function": "MULTIPLICATION",
+ "args": [
+ {
+ "constant_reference": {
+ "source_name": [
+ { "text": "Number" },
+ { "text": "FOUR" }
+ ]
+ }
+ },
+ {
+ "constant_reference": {
+ "source_name": [ { "text": "FOUR" } ]
+ }
+ }
+ ]
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "INVALID" } },
+ "value": {
+ "constant_reference": {
+ "source_name": [
+ { "text": "Number" },
+ { "text": "NaN" },
+ { "text": "FOUR" }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+Using Type.constants in expressions
+---
+struct Foo:
+ 0 [+Bar.four] UInt length_four
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "length_four" } },
+ "location": {
+ "size": {
+ "constant_reference": {
+ "source_name": [ { "text": "Bar" }, { "text": "four" } ]
+ }
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+using Type.Subtype
+---
+struct Foo:
+ 0 [+0] Bar.Baz bar_baz
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "type": {
+ "atomic_type": {
+ "reference": {
+ "source_name": [ { "text": "Bar" }, { "text": "Baz" } ]
+ }
+ }
+ },
+ "name": { "name": { "text": "bar_baz" } }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+module.Type
+---
+struct Foo:
+ 0 [+0] bar.Baz bar_baz
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "type": {
+ "atomic_type": {
+ "reference": {
+ "source_name": [ { "text": "bar" }, { "text": "Baz" } ]
+ }
+ }
+ },
+ "name": { "name": { "text": "bar_baz" } }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+module.Type.ENUM_VALUE
+---
+struct Foo:
+ bar.Baz.QUX [+0] UInt i
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "i" } },
+ "location": {
+ "start": {
+ "constant_reference": {
+ "source_name": [
+ { "text": "bar" },
+ { "text": "Baz" },
+ { "text": "QUX" }
+ ]
+ }
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+field attributes
+---
+struct Foo:
+ 0 [+1] UInt field [fixed_size: true]
+ [size: 1]
+ 1 [+2] UInt field2
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "field"
+ }
+ },
+ "attribute": [
+ {
+ "name": {
+ "text": "fixed_size"
+ },
+ "value": {
+ "expression": {
+ "boolean_constant": {
+ "value": true
+ }
+ }
+ }
+ },
+ {
+ "name": {
+ "text": "size"
+ },
+ "value": {
+ "expression": {
+ "constant": {
+ "value": "1"
+ }
+ }
+ }
+ }
+ ]
+ },
+ {
+ "name": {
+ "name": {
+ "text": "field2"
+ }
+ }
+ }
+ ]
+ },
+ "name": {
+ "name": {
+ "text": "Foo"
+ }
+ }
+ }
+ ]
+}
+
+===
+enum attribute
+---
+enum Foo:
+ [fixed_size: false]
+ NAME = 1
+---
+{
+ "type": [
+ {
+ "enumeration": {
+ "value": [ { "name": { "name": { "text": "NAME" } } } ]
+ },
+ "name": { "name": { "text": "Foo" } },
+ "attribute": [
+ {
+ "name": { "text": "fixed_size" },
+ "value": {
+ "expression": { "boolean_constant": { "value": false } }
+ }
+ }
+ ]
+ }
+ ]
+}
+
+===
+string attribute
+---
+[abc: "abc"]
+[bs: "abc\\"]
+[bsbs: "abc\\\\"]
+[nl: "abc\nd"]
+[q: "abc\"d"]
+[qq: "abc\"\""]
+---
+{
+ "attribute": [
+ {
+ "name": { "text": "abc" },
+ "value": { "string_constant": { "text": "abc" } }
+ },
+ {
+ "name": { "text": "bs" },
+ "value": { "string_constant": { "text": "abc\\" } }
+ },
+ {
+ "name": { "text": "bsbs" },
+ "value": { "string_constant": { "text": "abc\\\\" } }
+ },
+ {
+ "name": { "text": "nl" },
+ "value": { "string_constant": { "text": "abc\nd" } }
+ },
+ {
+ "name": { "text": "q" },
+ "value": { "string_constant": { "text": "abc\"d" } }
+ },
+ {
+ "name": { "text": "qq" },
+ "value": { "string_constant": { "text": "abc\"\"" } }
+ }
+ ]
+}
+
+===
+back-end-specific attribute
+---
+[(cpp) namespace: "a::b::c"]
+---
+{
+ "attribute": [
+ {
+ "name": { "text": "namespace" },
+ "value": { "string_constant": { "text": "a::b::c" } },
+ "back_end": { "text": "cpp" }
+ }
+ ]
+}
+
+===
+documentation
+---
+-- module doc
+--
+-- module doc 2
+struct Foo:
+ -- foo doc
+ -- foo doc 2
+ 0 [+1] UInt bar -- bar inline doc
+ -- bar continued doc
+ -- bar continued doc 2
+enum Baz:
+ -- baz doc
+ -- baz doc 2
+ QUX = 1 -- qux inline doc
+ -- qux continued doc
+ -- qux continued doc 2
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "documentation": [
+ {
+ "text": "bar inline doc"
+ },
+ {
+ "text": "bar continued doc"
+ },
+ {
+ "text": "bar continued doc 2"
+ }
+ ]
+ }
+ ]
+ },
+ "name": {
+ "name": {
+ "text": "Foo"
+ }
+ },
+ "documentation": [
+ {
+ "text": "foo doc"
+ },
+ {
+ "text": "foo doc 2"
+ }
+ ]
+ },
+ {
+ "enumeration": {
+ "value": [
+ {
+ "name": {
+ "name": {
+ "text": "QUX"
+ }
+ },
+ "documentation": [
+ {
+ "text": "qux inline doc"
+ },
+ {
+ "text": "qux continued doc"
+ },
+ {
+ "text": "qux continued doc 2"
+ }
+ ]
+ }
+ ]
+ },
+ "name": {
+ "name": {
+ "text": "Baz"
+ }
+ },
+ "documentation": [
+ {
+ "text": "baz doc"
+ },
+ {
+ "text": "baz doc 2"
+ }
+ ]
+ }
+ ],
+ "documentation": [
+ {
+ "text": "module doc"
+ },
+ {
+ "text": ""
+ },
+ {
+ "text": "module doc 2"
+ }
+ ]
+}
+
+===
+inline enum
+---
+struct Foo:
+ 0 [+1] enum baz_qux_gibble (bqg):
+ [q: 5]
+ BAR = 1
+ FOO = 2
+bits Bar:
+ 0 [+1] enum baz_qux_gibble (bqg):
+ [q: 5]
+ BAR = 1
+ FOO = 2
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "type": {
+ "atomic_type": {
+ "reference": {
+ "source_name": [ { "text": "BazQuxGibble" } ],
+ "is_local_name": true
+ }
+ }
+ },
+ "name": { "name": { "text": "baz_qux_gibble" } },
+ "abbreviation": { "text": "bqg" },
+ "attribute": [
+ {
+ "name": { "text": "q" },
+ "value": { "expression": { "constant": { "value": "5" } } }
+ }
+ ]
+ }
+ ]
+ },
+ "name": { "name": { "text": "Foo" } },
+ "subtype": [
+ {
+ "enumeration": {
+ "value": [
+ {
+ "name": { "name": { "text": "BAR" } },
+ "value": { "constant": { "value": "1" } }
+ },
+ {
+ "name": { "name": { "text": "FOO" } },
+ "value": { "constant": { "value": "2" } }
+ }
+ ]
+ },
+ "name": { "name": { "text": "BazQuxGibble" } }
+ }
+ ]
+ },
+ {
+ "structure": {
+ "field": [
+ {
+ "type": {
+ "atomic_type": {
+ "reference": {
+ "source_name": [ { "text": "BazQuxGibble" } ],
+ "is_local_name": true
+ }
+ }
+ },
+ "name": { "name": { "text": "baz_qux_gibble" } },
+ "abbreviation": { "text": "bqg" },
+ "attribute": [
+ {
+ "name": { "text": "q" },
+ "value": { "expression": { "constant": { "value": "5" } } }
+ }
+ ]
+ }
+ ]
+ },
+ "name": { "name": { "text": "Bar" } },
+ "subtype": [
+ {
+ "enumeration": {
+ "value": [
+ {
+ "name": { "name": { "text": "BAR" } },
+ "value": { "constant": { "value": "1" } }
+ },
+ {
+ "name": { "name": { "text": "FOO" } },
+ "value": { "constant": { "value": "2" } }
+ }
+ ]
+ },
+ "name": { "name": { "text": "BazQuxGibble" } }
+ }
+ ]
+ }
+ ]
+}
+
+===
+inline struct
+---
+struct Foo:
+ 0 [+1] struct baz_qux_gibble (bqg):
+ [q: 5]
+ 0 [+1] UInt bar
+ 1 [+1] UInt foo
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "type": {
+ "atomic_type": {
+ "reference": {
+ "source_name": [ { "text": "BazQuxGibble" } ],
+ "is_local_name": true
+ }
+ }
+ },
+ "name": { "name": { "text": "baz_qux_gibble" } },
+ "abbreviation": { "text": "bqg" },
+ "attribute": [
+ {
+ "name": { "text": "q" },
+ "value": { "expression": { "constant": { "value": "5" } } }
+ }
+ ]
+ }
+ ]
+ },
+ "name": { "name": { "text": "Foo" } },
+ "subtype": [
+ {
+ "structure": {
+ "field": [
+ {
+ "type": {
+ "atomic_type": {
+ "reference": { "source_name": [ { "text": "UInt" } ] }
+ }
+ },
+ "name": { "name": { "text": "bar" } }
+ },
+ {
+ "type": {
+ "atomic_type": {
+ "reference": { "source_name": [ { "text": "UInt" } ] }
+ }
+ },
+ "name": { "name": { "text": "foo" } }
+ }
+ ]
+ },
+ "name": { "name": { "text": "BazQuxGibble" } }
+ }
+ ]
+ }
+ ]
+}
+
+===
+inline bits
+---
+struct Foo:
+ 0 [+1] bits baz_qux_gibble (bqg):
+ [q: 5]
+ 0 [+1] UInt bar
+ 1 [+1] UInt foo
+bits Bar:
+ 0 [+8] bits baz_qux_gibble (bqg):
+ [q: 5]
+ 0 [+1] UInt bar
+ 1 [+1] UInt foo
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "type": {
+ "atomic_type": {
+ "reference": {
+ "source_name": [
+ {
+ "text": "BazQuxGibble"
+ }
+ ]
+ }
+ }
+ },
+ "name": {
+ "name": {
+ "text": "baz_qux_gibble"
+ }
+ },
+ "abbreviation": {
+ "text": "bqg"
+ },
+ "attribute": [
+ {
+ "name": {
+ "text": "q"
+ },
+ "value": {
+ "expression": {
+ "constant": {
+ "value": "5"
+ }
+ }
+ }
+ }
+ ]
+ }
+ ]
+ },
+ "name": {
+ "name": {
+ "text": "Foo"
+ }
+ },
+ "subtype": [
+ {
+ "structure": {
+ "field": [
+ {
+ "type": {
+ "atomic_type": {
+ "reference": {
+ "source_name": [
+ {
+ "text": "UInt"
+ }
+ ]
+ }
+ }
+ },
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ }
+ },
+ {
+ "type": {
+ "atomic_type": {
+ "reference": {
+ "source_name": [
+ {
+ "text": "UInt"
+ }
+ ]
+ }
+ }
+ },
+ "name": {
+ "name": {
+ "text": "foo"
+ }
+ }
+ }
+ ]
+ },
+ "name": {
+ "name": {
+ "text": "BazQuxGibble"
+ }
+ }
+ }
+ ]
+ },
+ {
+ "structure": {
+ "field": [
+ {
+ "type": {
+ "atomic_type": {
+ "reference": {
+ "source_name": [
+ {
+ "text": "BazQuxGibble"
+ }
+ ]
+ }
+ }
+ },
+ "name": {
+ "name": {
+ "text": "baz_qux_gibble"
+ }
+ },
+ "abbreviation": {
+ "text": "bqg"
+ },
+ "attribute": [
+ {
+ "name": {
+ "text": "q"
+ },
+ "value": {
+ "expression": {
+ "constant": {
+ "value": "5"
+ }
+ }
+ }
+ }
+ ]
+ }
+ ]
+ },
+ "name": {
+ "name": {
+ "text": "Bar"
+ }
+ },
+ "subtype": [
+ {
+ "structure": {
+ "field": [
+ {
+ "type": {
+ "atomic_type": {
+ "reference": {
+ "source_name": [
+ {
+ "text": "UInt"
+ }
+ ]
+ }
+ }
+ },
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ }
+ },
+ {
+ "type": {
+ "atomic_type": {
+ "reference": {
+ "source_name": [
+ {
+ "text": "UInt"
+ }
+ ]
+ }
+ }
+ },
+ "name": {
+ "name": {
+ "text": "foo"
+ }
+ }
+ }
+ ]
+ },
+ "name": {
+ "name": {
+ "text": "BazQuxGibble"
+ }
+ }
+ }
+ ]
+ }
+ ]
+}
+
+===
+subfield
+---
+struct Foo:
+ foo.bar [+1] UInt x
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "x"
+ }
+ },
+ "location": {
+ "start": {
+ "field_reference": {
+ "path": [
+ {
+ "source_name": [
+ {
+ "text": "foo"
+ }
+ ]
+ },
+ {
+ "source_name": [
+ {
+ "text": "bar"
+ }
+ ]
+ }
+ ]
+ }
+ }
+ }
+ }
+ ]
+ },
+ "name": {
+ "name": {
+ "text": "Foo"
+ }
+ }
+ }
+ ]
+}
+
+===
+anonymous bits
+---
+struct Foo:
+ 0 [+1] bits:
+ 31 [+1] enum high_bit:
+ OFF = 0
+ ON = 1
+ 0 [+1] Flag low_bit
+ if false:
+ 16 [+1] UInt mid_high
+ 15 [+1] UInt mid_low
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "is_anonymous": true
+ },
+ "location": {
+ "start": {
+ "constant": {
+ "value": "0"
+ }
+ },
+ "size": {
+ "constant": {
+ "value": "1"
+ }
+ }
+ }
+ }
+ ]
+ },
+ "name": {
+ "name": {
+ "text": "Foo"
+ }
+ },
+ "subtype": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "high_bit"
+ }
+ }
+ },
+ {
+ "name": {
+ "name": {
+ "text": "low_bit"
+ }
+ }
+ },
+ {
+ "name": {
+ "name": {
+ "text": "mid_high"
+ }
+ },
+ "existence_condition": {
+ "boolean_constant": {
+ "value": false
+ }
+ }
+ },
+ {
+ "name": {
+ "name": {
+ "text": "mid_low"
+ }
+ },
+ "existence_condition": {
+ "boolean_constant": {
+ "value": false
+ }
+ }
+ }
+ ]
+ },
+ "name": { "is_anonymous": true }
+ },
+ {
+ "enumeration": {
+ "value": [
+ {
+ "name": { "name": { "text": "OFF" } },
+ "value": { "constant": { "value": "0" } }
+ },
+ {
+ "name": { "name": { "text": "ON" } },
+ "value": { "constant": { "value": "1" } }
+ }
+ ]
+ },
+ "name": { "name": { "text": "HighBit" } }
+ }
+ ]
+ }
+ ]
+}
+
+===
+explicit type size
+---
+struct Foo:
+ 0 [+1] Bar:8 bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ { "type": { "size_in_bits": { "constant": { "value": "8" } } } }
+ ]
+ },
+ "name": { "name": { "text": "Foo" } }
+ }
+ ]
+}
+
+===
+import
+---
+import "xyz.emb" as yqf
+---
+{
+ "foreign_import": [
+ { "file_name": { "text": "" }, "local_name": { "text": "" } },
+ { "file_name": { "text": "xyz.emb" }, "local_name": { "text": "yqf" } }
+ ]
+}
+
+===
+empty file
+---
+---
+{
+ "foreign_import": [
+ {
+ "file_name": {
+ "text": "",
+ "source_location": {
+ "start": { "line": 1, "column": 1 },
+ "end": { "line": 1, "column": 1 }
+ }
+ },
+ "local_name": {
+ "text": "",
+ "source_location": {
+ "start": { "line": 1, "column": 1 },
+ "end": { "line": 1, "column": 1 }
+ }
+ },
+ "source_location": {
+ "start": { "line": 1, "column": 1 },
+ "end": { "line": 1, "column": 1 }
+ }
+ }
+ ],
+ "source_location": {
+ "start": { "line": 1, "column": 1 },
+ "end": { "line": 1, "column": 1 }
+ }
+}
+
+===
+existence_condition on unconditional field
+---
+struct Foo:
+ 0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "bar" } },
+ "existence_condition": { "boolean_constant": { "value": true } }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+conditional struct fields
+---
+struct Foo:
+ if true == false:
+ 0 [+1] UInt bar
+ 1 [+1] bits:
+ 0 [+1] UInt xx
+ 1 [+1] UInt yy
+ 2 [+1] enum baz:
+ XX = 1
+ YY = 2
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "bar" } },
+ "existence_condition": {
+ "function": {
+ "function": "EQUALITY",
+ "args": [
+ { "boolean_constant": { "value": true } },
+ { "boolean_constant": { "value": false } }
+ ]
+ }
+ }
+ },
+ {
+ "existence_condition": {
+ "function": {
+ "function": "EQUALITY",
+ "args": [
+ { "boolean_constant": { "value": true } },
+ { "boolean_constant": { "value": false } }
+ ]
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "baz" } },
+ "existence_condition": {
+ "function": {
+ "function": "EQUALITY",
+ "args": [
+ { "boolean_constant": { "value": true } },
+ { "boolean_constant": { "value": false } }
+ ]
+ }
+ }
+ }
+ ]
+ },
+ "subtype": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "xx" } },
+ "existence_condition": { "boolean_constant": { "value": true } }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ ]
+}
+
+===
+negative condition
+---
+struct Foo:
+ if true != false:
+ 0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "bar" } },
+ "existence_condition": {
+ "function": {
+ "function": "INEQUALITY",
+ "args": [
+ { "boolean_constant": { "value": true } },
+ { "boolean_constant": { "value": false } }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+conditional bits fields
+---
+bits Foo:
+ if true == false:
+ 0 [+1] UInt bar
+ 1 [+1] enum baz:
+ XX = 1
+ YY = 2
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "bar" } },
+ "existence_condition": {
+ "function": {
+ "function": "EQUALITY",
+ "args": [
+ { "boolean_constant": { "value": true } },
+ { "boolean_constant": { "value": false } }
+ ]
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "baz" } },
+ "existence_condition": {
+ "function": {
+ "function": "EQUALITY",
+ "args": [
+ { "boolean_constant": { "value": true } },
+ { "boolean_constant": { "value": false } }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+conditional with logical and
+---
+struct Foo:
+ if true && false:
+ 0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "boolean_constant": {
+ "value": true
+ }
+ },
+ {
+ "boolean_constant": {
+ "value": false
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+conditional with logical or
+---
+struct Foo:
+ if true || false:
+ 0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "OR",
+ "args": [
+ {
+ "boolean_constant": {
+ "value": true
+ }
+ },
+ {
+ "boolean_constant": {
+ "value": false
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+conditional with multiple logical ands
+---
+struct Foo:
+ if true && false && true:
+ 0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "boolean_constant": {
+ "value": true
+ }
+ },
+ {
+ "boolean_constant": {
+ "value": false
+ }
+ }
+ ]
+ }
+ },
+ {
+ "boolean_constant": {
+ "value": true
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+conditional with multiple logical ors
+---
+struct Foo:
+ if true || false || true:
+ 0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "OR",
+ "args": [
+ {
+ "function": {
+ "function": "OR",
+ "args": [
+ {
+ "boolean_constant": {
+ "value": true
+ }
+ },
+ {
+ "boolean_constant": {
+ "value": false
+ }
+ }
+ ]
+ }
+ },
+ {
+ "boolean_constant": {
+ "value": true
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+conditional with comparisons and logical or
+---
+struct Foo:
+ if 5 == 6 || 6 == 6:
+ 0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "OR",
+ "args": [
+ {
+ "function": {
+ "function": "EQUALITY",
+ "args": [
+ {
+ "constant": {
+ "value": "5"
+ }
+ },
+ {
+ "constant": {
+ "value": "6"
+ }
+ }
+ ]
+ }
+ },
+ {
+ "function": {
+ "function": "EQUALITY",
+ "args": [
+ {
+ "constant": {
+ "value": "6"
+ }
+ },
+ {
+ "constant": {
+ "value": "6"
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+conditional with or-of-ands
+---
+struct Foo:
+ if true || (false && true):
+ 0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "OR",
+ "args": [
+ {
+ "boolean_constant": {
+ "value": true
+ }
+ },
+ {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "boolean_constant": {
+ "value": false
+ }
+ },
+ {
+ "boolean_constant": {
+ "value": true
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+less-than comparison
+---
+struct Foo:
+ if 1 < 2:
+ 0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "LESS",
+ "args": [
+ {
+ "constant": {
+ "value": "1"
+ }
+ },
+ {
+ "constant": {
+ "value": "2"
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+less-than-or-equal comparison
+---
+struct Foo:
+ if 1 <= 2:
+ 0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "LESS_OR_EQUAL",
+ "args": [
+ {
+ "constant": {
+ "value": "1"
+ }
+ },
+ {
+ "constant": {
+ "value": "2"
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+greater-than comparison
+---
+struct Foo:
+ if 1 > 2:
+ 0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "GREATER",
+ "args": [
+ {
+ "constant": {
+ "value": "1"
+ }
+ },
+ {
+ "constant": {
+ "value": "2"
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+greater-than-or-equal comparison
+---
+struct Foo:
+ if 1 >= 2:
+ 0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "GREATER_OR_EQUAL",
+ "args": [
+ {
+ "constant": {
+ "value": "1"
+ }
+ },
+ {
+ "constant": {
+ "value": "2"
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+chained less-than comparison
+---
+struct Foo:
+  if 1 < 2 < 3:
+    0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "function": {
+ "function": "LESS",
+ "args": [
+ {
+ "constant": {
+ "value": "1"
+ }
+ },
+ {
+ "constant": {
+ "value": "2"
+ }
+ }
+ ]
+ }
+ },
+ {
+ "function": {
+ "function": "LESS",
+ "args": [
+ {
+ "constant": {
+ "value": "2"
+ }
+ },
+ {
+ "constant": {
+ "value": "3"
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+chained greater-than comparison
+---
+struct Foo:
+  if 1 > 2 > 3:
+    0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "function": {
+ "function": "GREATER",
+ "args": [
+ {
+ "constant": {
+ "value": "1"
+ }
+ },
+ {
+ "constant": {
+ "value": "2"
+ }
+ }
+ ]
+ }
+ },
+ {
+ "function": {
+ "function": "GREATER",
+ "args": [
+ {
+ "constant": {
+ "value": "2"
+ }
+ },
+ {
+ "constant": {
+ "value": "3"
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+longer chained less-than comparison
+---
+struct Foo:
+  if 1 < 2 < 3 <= 4:
+    0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "function": {
+ "function": "LESS",
+ "args": [
+ {
+ "constant": {
+ "value": "1"
+ }
+ },
+ {
+ "constant": {
+ "value": "2"
+ }
+ }
+ ]
+ }
+ },
+ {
+ "function": {
+ "function": "LESS",
+ "args": [
+ {
+ "constant": {
+ "value": "2"
+ }
+ },
+ {
+ "constant": {
+ "value": "3"
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ },
+ {
+ "function": {
+ "function": "LESS_OR_EQUAL",
+ "args": [
+ {
+ "constant": {
+ "value": "3"
+ }
+ },
+ {
+ "constant": {
+ "value": "4"
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+longer chained greater-than comparison
+---
+struct Foo:
+  if 1 > 2 > 3 >= 4:
+    0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "function": {
+ "function": "GREATER",
+ "args": [
+ {
+ "constant": {
+ "value": "1"
+ }
+ },
+ {
+ "constant": {
+ "value": "2"
+ }
+ }
+ ]
+ }
+ },
+ {
+ "function": {
+ "function": "GREATER",
+ "args": [
+ {
+ "constant": {
+ "value": "2"
+ }
+ },
+ {
+ "constant": {
+ "value": "3"
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ },
+ {
+ "function": {
+ "function": "GREATER_OR_EQUAL",
+ "args": [
+ {
+ "constant": {
+ "value": "3"
+ }
+ },
+ {
+ "constant": {
+ "value": "4"
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+chained less-than and equal comparison
+---
+struct Foo:
+  if 1 < 2 == 3:
+    0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "function": {
+ "function": "LESS",
+ "args": [
+ {
+ "constant": {
+ "value": "1"
+ }
+ },
+ {
+ "constant": {
+ "value": "2"
+ }
+ }
+ ]
+ }
+ },
+ {
+ "function": {
+ "function": "EQUALITY",
+ "args": [
+ {
+ "constant": {
+ "value": "2"
+ }
+ },
+ {
+ "constant": {
+ "value": "3"
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+chained greater-than and equal comparison
+---
+struct Foo:
+  if 1 > 2 == 3:
+    0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "function": {
+ "function": "GREATER",
+ "args": [
+ {
+ "constant": {
+ "value": "1"
+ }
+ },
+ {
+ "constant": {
+ "value": "2"
+ }
+ }
+ ]
+ }
+ },
+ {
+ "function": {
+ "function": "EQUALITY",
+ "args": [
+ {
+ "constant": {
+ "value": "2"
+ }
+ },
+ {
+ "constant": {
+ "value": "3"
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+chained equal and less-than comparison
+---
+struct Foo:
+  if 1 == 2 < 3:
+    0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "function": {
+ "function": "EQUALITY",
+ "args": [
+ {
+ "constant": {
+ "value": "1"
+ }
+ },
+ {
+ "constant": {
+ "value": "2"
+ }
+ }
+ ]
+ }
+ },
+ {
+ "function": {
+ "function": "LESS",
+ "args": [
+ {
+ "constant": {
+ "value": "2"
+ }
+ },
+ {
+ "constant": {
+ "value": "3"
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+chained equal and greater-than comparison
+---
+struct Foo:
+  if 1 == 2 > 3:
+    0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "function": {
+ "function": "EQUALITY",
+ "args": [
+ {
+ "constant": {
+ "value": "1"
+ }
+ },
+ {
+ "constant": {
+ "value": "2"
+ }
+ }
+ ]
+ }
+ },
+ {
+ "function": {
+ "function": "GREATER",
+ "args": [
+ {
+ "constant": {
+ "value": "2"
+ }
+ },
+ {
+ "constant": {
+ "value": "3"
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+chained equality comparison
+---
+struct Foo:
+  if 1 == 2 == 3:
+    0 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "AND",
+ "args": [
+ {
+ "function": {
+ "function": "EQUALITY",
+ "args": [
+ {
+ "constant": {
+ "value": "1"
+ }
+ },
+ {
+ "constant": {
+ "value": "2"
+ }
+ }
+ ]
+ }
+ },
+ {
+ "function": {
+ "function": "EQUALITY",
+ "args": [
+ {
+ "constant": {
+ "value": "2"
+ }
+ },
+ {
+ "constant": {
+ "value": "3"
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+choice operator
+---
+struct Foo:
+  true ? 0 : 1 [+1] UInt bar
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "bar"
+ }
+ },
+ "location": {
+ "start": {
+ "function": {
+ "function": "CHOICE",
+ "args": [
+ {
+ "boolean_constant": {
+ "value": true
+ }
+ },
+ {
+ "constant": {
+ "value": "0"
+ }
+ },
+ {
+ "constant": {
+ "value": "1"
+ }
+ }
+ ]
+ }
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+max function
+---
+struct Foo:
+  $max() [+1] UInt no_arg
+  $max(0) [+1] UInt one_arg
+  $max(2 * 3) [+1] UInt mul_arg
+  $max(2, 3) [+1] UInt two_arg
+  $max(2, 3, 4, 5, 6) [+1] UInt five_arg
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "no_arg"
+ }
+ },
+ "location": {
+ "start": {
+ "function": {
+ "function": "MAXIMUM"
+ }
+ }
+ }
+ },
+ {
+ "name": {
+ "name": {
+ "text": "one_arg"
+ }
+ },
+ "location": {
+ "start": {
+ "function": {
+ "function": "MAXIMUM",
+ "args": [
+ {
+ "constant": {
+ "value": "0"
+ }
+ }
+ ]
+ }
+ }
+ }
+ },
+ {
+ "name": {
+ "name": {
+ "text": "mul_arg"
+ }
+ },
+ "location": {
+ "start": {
+ "function": {
+ "function": "MAXIMUM",
+ "args": [
+ {
+ "function": {
+ "function": "MULTIPLICATION",
+ "args": [
+ {
+ "constant": {
+ "value": "2"
+ }
+ },
+ {
+ "constant": {
+ "value": "3"
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ },
+ {
+ "name": {
+ "name": {
+ "text": "two_arg"
+ }
+ },
+ "location": {
+ "start": {
+ "function": {
+ "function": "MAXIMUM",
+ "args": [
+ {
+ "constant": {
+ "value": "2"
+ }
+ },
+ {
+ "constant": {
+ "value": "3"
+ }
+ }
+ ]
+ }
+ }
+ }
+ },
+ {
+ "name": {
+ "name": {
+ "text": "five_arg"
+ }
+ },
+ "location": {
+ "start": {
+ "function": {
+ "function": "MAXIMUM",
+ "args": [
+ {
+ "constant": {
+ "value": "2"
+ }
+ },
+ {
+ "constant": {
+ "value": "3"
+ }
+ },
+ {
+ "constant": {
+ "value": "4"
+ }
+ },
+ {
+ "constant": {
+ "value": "5"
+ }
+ },
+ {
+ "constant": {
+ "value": "6"
+ }
+ }
+ ]
+ }
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+has function
+---
+struct Foo:
+  if $present(x):
+    0 [+1] UInt field
+  if $present(x.y.z):
+    0 [+1] UInt field2
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "field"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "PRESENCE",
+ "args": [
+ {
+ "field_reference": {
+ "path": [
+ {
+ "source_name": [
+ {
+ "text": "x"
+ }
+ ]
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ },
+ {
+ "name": {
+ "name": {
+ "text": "field2"
+ }
+ },
+ "existence_condition": {
+ "function": {
+ "function": "PRESENCE",
+ "args": [
+ {
+ "field_reference": {
+ "path": [
+ {
+ "source_name": [
+ {
+ "text": "x"
+ }
+ ]
+ },
+ {
+ "source_name": [
+ {
+ "text": "y"
+ }
+ ]
+ },
+ {
+ "source_name": [
+ {
+ "text": "z"
+ }
+ ]
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+upper_bound function
+---
+struct Foo:
+  $upper_bound(0) [+1] UInt one
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "one"
+ }
+ },
+ "location": {
+ "start": {
+ "function": {
+ "function": "UPPER_BOUND",
+ "args": [
+ {
+ "constant": {
+ "value": "0"
+ }
+ }
+ ]
+ }
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+lower_bound function
+---
+struct Foo:
+  $lower_bound(0) [+1] UInt one
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": {
+ "name": {
+ "text": "one"
+ }
+ },
+ "location": {
+ "start": {
+ "function": {
+ "function": "LOWER_BOUND",
+ "args": [
+ {
+ "constant": {
+ "value": "0"
+ }
+ }
+ ]
+ }
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+struct addressable_unit
+---
+struct Foo:
+  0 [+1] UInt size
+---
+{ "type": [ { "structure": {}, "addressable_unit": "BYTE" } ] }
+
+===
+bits addressable_unit
+---
+bits Foo:
+  0 [+1] UInt size
+---
+{ "type": [ { "structure": {}, "addressable_unit": "BIT" } ] }
+
+===
+enum addressable_unit
+---
+enum Foo:
+  BAR = 0
+---
+{ "type": [ { "enumeration": {}, "addressable_unit": "BIT" } ] }
+
+===
+type size source_location
+---
+struct Foo:
+  0 [+4]  UInt:32  field
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "type": {
+ "atomic_type": {
+ "reference": { "source_name": [ { "text": "UInt" } ] }
+ },
+ "size_in_bits": {
+ "source_location": {
+ "start": { "line": 3, "column": 15 },
+ "end": { "line": 3, "column": 18 }
+ }
+ },
+ "source_location": {
+ "start": { "line": 3, "column": 11 },
+ "end": { "line": 3, "column": 18 }
+ }
+ },
+ "name": { "name": { "text": "field" } }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+builtin references
+---
+external Foo:
+  [requires: $is_statically_sized && $static_size_in_bits == 64]
+---
+{
+ "type": [
+ {
+ "external": {},
+ "attribute": [
+ {
+ "name": { "text": "requires" },
+ "value": {
+ "expression": {
+ "function": {
+ "args": [
+ {
+ "builtin_reference": {
+ "canonical_name": {
+ "module_file": "",
+ "object_path": [ "$is_statically_sized" ]
+ },
+ "source_name": [ { "text": "$is_statically_sized" } ]
+ }
+ },
+ {
+ "function": {
+ "args": [
+ {
+ "builtin_reference": {
+ "canonical_name": {
+ "module_file": "",
+ "object_path": [ "$static_size_in_bits" ]
+ },
+ "source_name": [
+ { "text": "$static_size_in_bits" }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ ]
+ }
+ }
+ }
+ }
+ ]
+ }
+ ]
+}
+
+===
+virtual fields
+---
+struct Foo:
+  let x = 10
+bits Bar:
+  let y = 100
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "x" } },
+ "read_transform": { "constant": { "value": "10" } }
+ }
+ ]
+ }
+ },
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "y" } },
+ "read_transform": { "constant": { "value": "100" } }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+builtin fields
+---
+struct Foo:
+  let x = $size_in_bytes
+  let y = $max_size_in_bytes
+  let z = $min_size_in_bytes
+bits Bar:
+  let x = $size_in_bits
+  let y = $max_size_in_bits
+  let z = $min_size_in_bits
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "x" } },
+ "read_transform": {
+ "field_reference": {
+ "path": [ { "source_name": [ { "text": "$size_in_bytes" } ] } ]
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "y" } },
+ "read_transform": {
+ "field_reference": {
+ "path": [
+ { "source_name": [ { "text": "$max_size_in_bytes" } ] }
+ ]
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "z" } },
+ "read_transform": {
+ "field_reference": {
+ "path": [
+ { "source_name": [ { "text": "$min_size_in_bytes" } ] }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ },
+ {
+ "structure": {
+ "field": [
+ {
+ "name": { "name": { "text": "x" } },
+ "read_transform": {
+ "field_reference": {
+ "path": [ { "source_name": [ { "text": "$size_in_bits" } ] } ]
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "y" } },
+ "read_transform": {
+ "field_reference": {
+ "path": [
+ { "source_name": [ { "text": "$max_size_in_bits" } ] }
+ ]
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "z" } },
+ "read_transform": {
+ "field_reference": {
+ "path": [
+ { "source_name": [ { "text": "$min_size_in_bits" } ] }
+ ]
+ }
+ }
+ }
+ ]
+ }
+ }
+ ]
+}
+
+===
+parameterized type definitions
+---
+struct Foo(a: Flag, b: UInt:32):
+  let x = 10
+bits Bar(c: UInt:16):
+  let y = 100
+struct Baz():
+  let x = 10
+---
+{
+ "type": [
+ {
+ "runtime_parameter": [
+ {
+ "name": { "name": { "text": "a" } },
+ "physical_type_alias": {
+ "atomic_type": {
+ "reference": { "source_name": [ { "text": "Flag" } ] }
+ }
+ }
+ },
+ {
+ "name": { "name": { "text": "b" } },
+ "physical_type_alias": {
+ "atomic_type": {
+ "reference": { "source_name": [ { "text": "UInt" } ] }
+ },
+ "size_in_bits": { "constant": { "value": "32" } }
+ }
+ }
+ ]
+ },
+ {
+ "runtime_parameter": [
+ {
+ "name": { "name": { "text": "c" } },
+ "physical_type_alias": {
+ "atomic_type": {
+ "reference": { "source_name": [ { "text": "UInt" } ] }
+ },
+ "size_in_bits": { "constant": { "value": "16" } }
+ }
+ }
+ ]
+ },
+ {}
+ ]
+}
+
+===
+parameterized type usages
+---
+struct Foo:
+  0 [+1] Two(1, 2) two
+  1 [+1] One(3) one
+  2 [+1] Zero() zero
+---
+{
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "type": {
+ "atomic_type": {
+ "reference": { "source_name": [ { "text": "Two" } ] },
+ "runtime_parameter": [
+ { "constant": { "value": "1" } },
+ { "constant": { "value": "2" } }
+ ]
+ }
+ },
+ "name": { "name": { "text": "two" } }
+ },
+ {
+ "type": {
+ "atomic_type": {
+ "reference": { "source_name": [ { "text": "One" } ] },
+ "runtime_parameter": [ { "constant": { "value": "3" } } ]
+ }
+ },
+ "name": { "name": { "text": "one" } }
+ },
+ {
+ "type": {
+ "atomic_type": {
+ "reference": { "source_name": [ { "text": "Zero" } ] }
+ }
+ },
+ "name": { "name": { "text": "zero" } }
+ }
+ ]
+ }
+ }
+ ]
+}
+"""
+
+
+# For each test in _NEGATIVE_TEST_CASES, parsing should fail, and the failure
+# should indicate the specified token.
+_NEGATIVE_TEST_CASES = """
+anonymous bits does not allow documentation
+---
+-- doc
+---
+struct Foo:
+  0 [+1] bits:
+    -- doc
+    0 [+2] UInt bar
+===
+anonymous bits does not allow subtypes
+---
+enum
+---
+struct Foo:
+  0 [+1] bits:
+    enum Bar:
+      X = 1
+    0 [+2] Bar bar
+"""
+
+
+def _get_test_cases():
+  test_case = collections.namedtuple("test_case", ["name", "parse_tree", "ir"])
+  result = []
+  for case in _TEST_CASES.split("==="):
+    name, emb, ir_text = case.split("---")
+    name = name.strip()
+    try:
+      ir = ir_pb2.Module.from_json(ir_text)
+    except Exception:
+      print(name)
+      raise
+    parse_result = parser.parse_module(tokenizer.tokenize(emb, "")[0])
+    assert not parse_result.error, "{}:\n{}".format(name, parse_result.error)
+    result.append(test_case(name, parse_result.parse_tree, ir))
+  return result
+
+
+def _get_negative_test_cases():
+  test_case = collections.namedtuple("test_case",
+                                     ["name", "text", "error_token"])
+  result = []
+  for case in _NEGATIVE_TEST_CASES.split("==="):
+    name, error_token, text = case.split("---")
+    name = name.strip()
+    error_token = error_token.strip()
+    result.append(test_case(name, text, error_token))
+  return result
+
+
+def _check_source_location(source_location, path, min_start, max_end):
+  """Performs sanity checks on a source_location field.
+
+  Arguments:
+    source_location: The source_location to check.
+    path: The path, to use in error messages.
+    min_start: A minimum value for source_location.start, or None.
+    max_end: A maximum value for source_location.end, or None.
+
+  Returns:
+    A list of error messages, or an empty list if no errors.
+  """
+  if source_location.is_disjoint_from_parent:
+    # If source_location.is_disjoint_from_parent, then this source_location is
+    # allowed to be outside of the parent's source_location.
+    return []
+
+  result = []
+  start = None
+  end = None
+  if not source_location.HasField("start"):
+    result.append("{}.start missing".format(path))
+  else:
+    start = source_location.start
+  if not source_location.HasField("end"):
+    result.append("{}.end missing".format(path))
+  else:
+    end = source_location.end
+
+  if start and end:
+    if start.HasField("line") and end.HasField("line"):
+      if start.line > end.line:
+        result.append("{}.start.line > {}.end.line ({} vs {})".format(
+            path, path, start.line, end.line))
+      elif start.line == end.line:
+        if (start.HasField("column") and end.HasField("column") and
+            start.column > end.column):
+          result.append("{}.start.column > {}.end.column ({} vs {})".format(
+              path, path, start.column, end.column))
+
+  for name, field in (("start", start), ("end", end)):
+    if not field:
+      continue
+    if field.HasField("line"):
+      if field.line <= 0:
+        result.append("{}.{}.line <= 0 ({})".format(path, name, field.line))
+    else:
+      result.append("{}.{}.line missing".format(path, name))
+    if field.HasField("column"):
+      if field.column <= 0:
+        result.append("{}.{}.column <= 0 ({})".format(path, name, field.column))
+    else:
+      result.append("{}.{}.column missing".format(path, name))
+
+  if min_start and start:
+    if min_start.line > start.line or (
+        min_start.line == start.line and min_start.column > start.column):
+      result.append("{}.start before parent start".format(path))
+
+  if max_end and end:
+    if max_end.line < end.line or (
+        max_end.line == end.line and max_end.column < end.column):
+      result.append("{}.end after parent end".format(path))
+
+  return result
+
+
+def _check_all_source_locations(proto, path="", min_start=None, max_end=None):
+  """Performs sanity checks on all source_locations in proto.
+
+  Arguments:
+    proto: The proto to recursively check.
+    path: The path, to use in error messages.
+    min_start: A minimum value for source_location.start, or None.
+    max_end: A maximum value for source_location.end, or None.
+
+  Returns:
+    A list of error messages, or an empty list if no errors.
+  """
+  if path:
+    path += "."
+
+  errors = []
+
+  child_start = None
+  child_end = None
+  # Only check the source_location value if this proto message actually has a
+  # source_location field.
+  if "source_location" in proto.raw_fields:
+    errors.extend(_check_source_location(proto.source_location,
+                                         path + "source_location",
+                                         min_start, max_end))
+    child_start = proto.source_location.start
+    child_end = proto.source_location.end
+
+  for name, spec in proto.field_specs.items():
+    if name == "source_location":
+      continue
+    if not proto.HasField(name):
+      continue
+    field_path = "{}{}".format(path, name)
+    if isinstance(spec, ir_pb2.Repeated):
+      if issubclass(spec.type, ir_pb2.Message):
+        index = 0
+        for i in getattr(proto, name):
+          item_path = "{}[{}]".format(field_path, index)
+          index += 1
+          errors.extend(
+              _check_all_source_locations(i, item_path, child_start, child_end))
+    else:
+      if issubclass(spec.type, ir_pb2.Message):
+        errors.extend(_check_all_source_locations(getattr(proto, name),
+                                                  field_path, child_start,
+                                                  child_end))
+
+  return errors
+
+
+class ModuleIrTest(unittest.TestCase):
+  """Tests the module_ir.build_ir() function."""
+
+  def test_build_ir(self):
+    self.assertEqual(module_ir.build_ir(_MINIMAL_SAMPLE), _MINIMAL_SAMPLE_IR)
+
+  def test_production_coverage(self):
+    """Checks that all grammar productions are used somewhere in tests."""
+    used_productions = set()
+    module_ir.build_ir(_MINIMAL_SAMPLE, used_productions)
+    for test in _get_test_cases():
+      module_ir.build_ir(test.parse_tree, used_productions)
+    self.assertEqual(set(module_ir.PRODUCTIONS) - used_productions, set([]))
+
+  def test_double_negative_non_compilation(self):
+    """Checks that unparenthesized double unary minus/plus is a parse error."""
+    for example in ("[x: - -3]", "[x: + -3]", "[x: - +3]", "[x: + +3]"):
+      parse_result = parser.parse_module(tokenizer.tokenize(example, "")[0])
+      self.assertTrue(parse_result.error)
+      self.assertEqual(7, parse_result.error.token.source_location.start.column)
+    for example in ("[x:-(-3)]", "[x:+(-3)]", "[x:-(+3)]", "[x:+(+3)]"):
+      parse_result = parser.parse_module(tokenizer.tokenize(example, "")[0])
+      self.assertFalse(parse_result.error)
+
+
+def _make_superset_tests():
+
+  def _make_superset_test(test):
+
+    def test_case(self):
+      ir = module_ir.build_ir(test.parse_tree)
+      is_superset, error_message = test_util.proto_is_superset(ir, test.ir)
+      self.assertTrue(
+          is_superset,
+          error_message + "\n" + ir.to_json(indent=2) + "\n" + test.ir.to_json(indent=2))
+
+    return test_case
+
+  for test in _get_test_cases():
+    test_name = "test " + test.name + " proto superset"
+    assert not hasattr(ModuleIrTest, test_name)
+    setattr(ModuleIrTest, test_name, _make_superset_test(test))
+
+
+def _make_source_location_tests():
+
+  def _make_source_location_test(test):
+
+    def test_case(self):
+      error_list = _check_all_source_locations(
+          module_ir.build_ir(test.parse_tree))
+      self.assertFalse(error_list, "\n".join([test.name] + error_list))
+
+    return test_case
+
+  for test in _get_test_cases():
+    test_name = "test " + test.name + " source location"
+    assert not hasattr(ModuleIrTest, test_name)
+    setattr(ModuleIrTest, test_name, _make_source_location_test(test))
+
+
+def _make_negative_tests():
+
+  def _make_negative_test(test):
+
+    def test_case(self):
+      parse_result = parser.parse_module(tokenizer.tokenize(test.text, "")[0])
+      self.assertEqual(test.error_token, parse_result.error.token.text.strip())
+
+    return test_case
+
+  for test in _get_negative_test_cases():
+    test_name = "test " + test.name + " compilation failure"
+    assert not hasattr(ModuleIrTest, test_name)
+    setattr(ModuleIrTest, test_name, _make_negative_test(test))
+
+
+_make_superset_tests()
+_make_source_location_tests()
+
+
+if __name__ == "__main__":
+  unittest.main()
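Review note on the source_location checks in module_ir_test.py: _check_source_location compares positions by line first and then by column. The sketch below expresses the same ordering with tuple comparison. It is only an illustration under assumptions: Position and is_within are hypothetical names that are not part of this commit, and the sample values are taken from the "type size source_location" test case above.

```python
import collections

# Hypothetical stand-in for the IR position message; not part of Emboss.
Position = collections.namedtuple("Position", ["line", "column"])


def is_within(start, end, min_start, max_end):
  """Returns True if [start, end] lies inside [min_start, max_end]."""
  # Tuples compare element by element, so (line, column) pairs order the same
  # way as an explicit line-then-column comparison.
  return (tuple(min_start) <= tuple(start) and
          tuple(start) <= tuple(end) and
          tuple(end) <= tuple(max_end))


# Parent spans 3:11 to 3:18; the child size spans 3:15 to 3:18.
parent = (Position(3, 11), Position(3, 18))
child = (Position(3, 15), Position(3, 18))
print(is_within(child[0], child[1], parent[0], parent[1]))
```

A child that starts before its parent, such as one beginning at 3:9, is rejected by the same comparison.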
diff --git a/front_end/parser.py b/front_end/parser.py
new file mode 100644
index 0000000..9ef5eb1
--- /dev/null
+++ b/front_end/parser.py
@@ -0,0 +1,127 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Routines to generate a shift-reduce parser from the module_ir module."""
+
+import pkgutil
+
+from front_end import lr1
+from front_end import module_ir
+from front_end import tokenizer
+from util import simple_memoizer
+
+
+class ParserGenerationError(Exception):
+  """An error occurred during parser generation."""
+  pass
+
+
+def parse_error_examples(error_example_text):
+  """Parses error examples from error_example_text.
+
+  Arguments:
+    error_example_text: The text of an error example file.
+
+  Returns:
+    A list of tuples, suitable for passing into generate_parser.
+
+  Raises:
+    ParserGenerationError: There is a problem parsing the error examples.
+  """
+  error_examples = error_example_text.split("\n" + "=" * 80 + "\n")
+  result = []
+  # Everything before the first "======" line is explanatory text: ignore it.
+  for error_example in error_examples[1:]:
+    message_and_examples = error_example.split("\n" + "-" * 80 + "\n")
+    if len(message_and_examples) != 2:
+      raise ParserGenerationError(
+          "Expected one error message and one example section in:\n" +
+          error_example)
+    message, example_text = message_and_examples
+    examples = example_text.split("\n---\n")
+    for example in examples:
+      # TODO(bolms): feed a line number into tokenize, so that tokenization
+      # failures refer to the correct line within error_example_text.
+      tokens, errors = tokenizer.tokenize(example, "")
+      if errors:
+        raise ParserGenerationError(str(errors))
+
+      for i in range(len(tokens)):
+        if tokens[i].symbol == "BadWord" and tokens[i].text == "$ANY":
+          tokens[i] = lr1.ANY_TOKEN
+
+      error_token = None
+      for i in range(len(tokens)):
+        if tokens[i].symbol == "BadWord" and tokens[i].text == "$ERR":
+          error_token = tokens[i + 1]
+          del tokens[i]
+          break
+      else:
+        raise ParserGenerationError(
+            "No error token marker '$ERR' in:\n" + error_example)
+
+      result.append((tokens, error_token, message.strip(), example))
+  return result
+
+
+def generate_parser(start_symbol, productions, error_examples):
+  """Generates a parser from grammar, and applies error_examples.
+
+  Arguments:
+    start_symbol: the start symbol of the grammar (a string)
+    productions: a list of parser_types.Production in the grammar
+    error_examples: A list of (source tokens, error token, error message,
+        source text) tuples.
+
+  Returns:
+    A parser.
+
+  Raises:
+    ParserGenerationError: There is a problem generating the parser.
+  """
+  parser = lr1.Grammar(start_symbol, productions).parser()
+  if parser.conflicts:
+    raise ParserGenerationError("\n".join([str(c) for c in parser.conflicts]))
+  for example in error_examples:
+    mark_result = parser.mark_error(example[0], example[1], example[2])
+    if mark_result:
+      raise ParserGenerationError(
+          "error marking example: {}\nExample:\n{}".format(
+              mark_result, example[3]))
+  return parser
+
+
+@simple_memoizer.memoize
+def _load_module_parser():
+  path = "front_end"
+  error_examples = parse_error_examples(
+      pkgutil.get_data(path, "error_examples").decode("utf-8"))
+  return generate_parser(module_ir.START_SYMBOL, module_ir.PRODUCTIONS,
+                         error_examples)
+
+
+@simple_memoizer.memoize
+def _load_expression_parser():
+  return generate_parser(module_ir.EXPRESSION_START_SYMBOL,
+                         module_ir.PRODUCTIONS, [])
+
+
+def parse_module(tokens):
+  """Parses the provided Emboss token list into an Emboss module parse tree."""
+  return _load_module_parser().parse(tokens)
+
+
+def parse_expression(tokens):
+  """Parses the provided Emboss token list into an expression parse tree."""
+  return _load_expression_parser().parse(tokens)
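Review note on the error example format read by parse_error_examples in front_end/parser.py: the text is split on a line of 80 "=" characters, each section is split on a line of 80 "-" characters into a message and an example block, and the examples are split on "---" lines. The sketch below mirrors only that splitting step and does no tokenizing or $ERR/$ANY handling. split_error_examples and the sample text are hypothetical names made up for illustration; they are not part of this commit.

```python
# Illustrative sketch only: mirrors the divider handling in
# parse_error_examples, without tokenization or $ERR/$ANY handling.
# split_error_examples is a hypothetical helper, not part of Emboss.

EXAMPLE_DIVIDER = "\n" + "=" * 80 + "\n"
MESSAGE_DIVIDER = "\n" + "-" * 80 + "\n"
ERROR_DIVIDER = "\n---\n"


def split_error_examples(text):
  """Returns a list of (message, [example, ...]) pairs found in text."""
  result = []
  # Everything before the first "=" divider is explanatory text: skip it.
  for section in text.split(EXAMPLE_DIVIDER)[1:]:
    parts = section.split(MESSAGE_DIVIDER)
    if len(parts) != 2:
      raise ValueError("expected one message and one example section")
    message, examples = parts
    result.append((message.strip(), examples.split(ERROR_DIVIDER)))
  return result


sample = (
    "Free-form notes before the first divider are ignored."
    + EXAMPLE_DIVIDER + "structure names must be Camel"
    + MESSAGE_DIVIDER + "struct $ERR FOO"
    + ERROR_DIVIDER + "struct $ERR foo")
pairs = split_error_examples(sample)
print(pairs)
```

Because each divider includes its surrounding newlines, a divider that is not on its own line is not recognized, which is the behavior the divider tests in parser_test.py exercise.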
diff --git a/front_end/parser_test.py b/front_end/parser_test.py
new file mode 100644
index 0000000..e3f6579
--- /dev/null
+++ b/front_end/parser_test.py
@@ -0,0 +1,207 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for parser."""
+
+import unittest
+from front_end import lr1
+from front_end import parser
+from front_end import tokenizer
+from util import parser_types
+
+
+# TODO(bolms): This is repeated in lr1_test.py; separate into test utils?
+def _parse_productions(*productions):
+  """Parses text into a grammar by calling Production.parse on each line."""
+  return [parser_types.Production.parse(p) for p in productions]
+
+
+_EXAMPLE_DIVIDER = "\n" + "=" * 80 + "\n"
+_MESSAGE_ERROR_DIVIDER = "\n" + "-" * 80 + "\n"
+_ERROR_DIVIDER = "\n---\n"
+
+
+class ParserGeneratorTest(unittest.TestCase):
+  """Tests parser.parse_error_examples and generate_parser."""
+
+  def test_parse_good_error_examples(self):
+    errors = parser.parse_error_examples(
+        _EXAMPLE_DIVIDER +  # ======...
+        "structure names must be Camel" +  # Message.
+        _MESSAGE_ERROR_DIVIDER +  # ------...
+        "struct $ERR FOO" +  # First example.
+        _ERROR_DIVIDER +  # ---
+        "struct $ERR foo" +  # Second example.
+        _EXAMPLE_DIVIDER +  # ======...
+        ' \n struct must be followed by ":" \n\n' +  # Second message.
+        _MESSAGE_ERROR_DIVIDER +  # ------...
+        "struct Foo $ERR")  # Example for second message.
+    self.assertEqual(tokenizer.tokenize("struct FOO", "")[0], errors[0][0])
+    self.assertEqual("structure names must be Camel", errors[0][2])
+    self.assertEqual(tokenizer.tokenize("struct foo", "")[0], errors[1][0])
+    self.assertEqual("structure names must be Camel", errors[1][2])
+    self.assertEqual(tokenizer.tokenize("struct Foo ", "")[0], errors[2][0])
+    self.assertEqual('struct must be followed by ":"', errors[2][2])
+
+  def test_parse_good_wildcard_example(self):
+    errors = parser.parse_error_examples(
+        _EXAMPLE_DIVIDER +  # ======...
+        ' \n struct must be followed by ":" \n\n' +  # Message.
+        _MESSAGE_ERROR_DIVIDER +  # ------...
+        "struct Foo $ERR $ANY")
+    tokens = tokenizer.tokenize("struct Foo ", "")[0]
+    # The $ANY token should come just before the end-of-line token in the parsed
+    # result.
+    tokens.insert(-1, lr1.ANY_TOKEN)
+    self.assertEqual(tokens, errors[0][0])
+    self.assertEqual('struct must be followed by ":"', errors[0][2])
+
+  def test_parse_with_no_error_marker(self):
+    self.assertRaises(
+        parser.ParserGenerationError,
+        parser.parse_error_examples,
+        _EXAMPLE_DIVIDER + "msg" + _MESSAGE_ERROR_DIVIDER + "-- doc")
+
+  def test_that_no_error_example_fails(self):
+    self.assertRaises(parser.ParserGenerationError,
+                      parser.parse_error_examples,
+                      _EXAMPLE_DIVIDER + "msg" + _EXAMPLE_DIVIDER + "msg" +
+                      _MESSAGE_ERROR_DIVIDER + "example")
+
+  def test_that_message_example_divider_must_be_on_its_own_line(self):
+    self.assertRaises(parser.ParserGenerationError,
+                      parser.parse_error_examples,
+                      _EXAMPLE_DIVIDER + "msg" + "-" * 80 + "example")
+    self.assertRaises(parser.ParserGenerationError,
+                      parser.parse_error_examples,
+                      _EXAMPLE_DIVIDER + "msg\n" + "-" * 80 + "example")
+    self.assertRaises(parser.ParserGenerationError,
+                      parser.parse_error_examples,
+                      _EXAMPLE_DIVIDER + "msg" + "-" * 80 + "\nexample")
+    self.assertRaises(parser.ParserGenerationError,
+                      parser.parse_error_examples,
+                      _EXAMPLE_DIVIDER + "msg\n" + "-" * 80 + " \nexample")
+
+  def test_that_example_divider_must_be_on_its_own_line(self):
+    self.assertRaises(
+        parser.ParserGenerationError,
+        parser.parse_error_examples,
+        _EXAMPLE_DIVIDER + "msg" + _MESSAGE_ERROR_DIVIDER + "example" + "=" * 80
+        + "msg" + _MESSAGE_ERROR_DIVIDER + "example")
+    self.assertRaises(
+        parser.ParserGenerationError,
+        parser.parse_error_examples,
+        _EXAMPLE_DIVIDER + "msg" + _MESSAGE_ERROR_DIVIDER + "example\n" + "=" *
+        80 + "msg" + _MESSAGE_ERROR_DIVIDER + "example")
+    self.assertRaises(
+        parser.ParserGenerationError,
+        parser.parse_error_examples,
+        _EXAMPLE_DIVIDER + "msg" + _MESSAGE_ERROR_DIVIDER + "example" + "=" * 80
+        + "\nmsg" + _MESSAGE_ERROR_DIVIDER + "example")
+    self.assertRaises(
+        parser.ParserGenerationError,
+        parser.parse_error_examples,
+        _EXAMPLE_DIVIDER + "msg" + _MESSAGE_ERROR_DIVIDER + "example\n" + "=" *
+        80 + " \nmsg" + _MESSAGE_ERROR_DIVIDER + "example")
+
+  def test_that_tokenization_failure_results_in_failure(self):
+    self.assertRaises(
+        parser.ParserGenerationError,
+        parser.parse_error_examples,
+        _EXAMPLE_DIVIDER + "message" + _MESSAGE_ERROR_DIVIDER + "|")
+
+  def test_generate_parser(self):
+    self.assertTrue(parser.generate_parser("C", _parse_productions("C -> s"),
+                                           []))
+    self.assertTrue(parser.generate_parser(
+        "C", _parse_productions("C -> s", "C -> d"), []))
+
+  def test_generated_parser_error(self):
+    test_parser = parser.generate_parser(
+        "C", _parse_productions("C -> s", "C -> d"),
+        [([parser_types.Token("s", "s", None),
+           parser_types.Token("s", "s", None)],
+          parser_types.Token("s", "s", None),
+          "double s", "ss")])
+    parse_result = test_parser.parse([parser_types.Token("s", "s", None),
+                                      parser_types.Token("s", "s", None)])
+    self.assertEqual(None, parse_result.parse_tree)
+    self.assertEqual("double s", parse_result.error.code)
+
+  def test_conflict_error(self):
+    self.assertRaises(
+        parser.ParserGenerationError,
+        parser.generate_parser,
+        "C", _parse_productions("C -> S", "C -> D", "S -> a", "D -> a"), [])
+
+  def test_bad_mark_error(self):
+    self.assertRaises(parser.ParserGenerationError,
+                      parser.generate_parser,
+                      "C", _parse_productions("C -> s", "C -> d"),
+                      [([parser_types.Token("s", "s", None),
+                         parser_types.Token("s", "s", None)],
+                        parser_types.Token("s", "s", None),
+                        "double s", "ss"),
+                       ([parser_types.Token("s", "s", None),
+                         parser_types.Token("s", "s", None)],
+                        parser_types.Token("s", "s", None),
+                        "double 's'", "ss")])
+    self.assertRaises(parser.ParserGenerationError,
+                      parser.generate_parser,
+                      "C", _parse_productions("C -> s", "C -> d"),
+                      [([parser_types.Token("s", "s", None)],
+                        parser_types.Token("s", "s", None),
+                        "single s", "s")])
+
+
+class ModuleParserTest(unittest.TestCase):
+ """Tests for parser.parse_module().
+
+ Correct parses should mostly be checked in conjunction with
+ module_ir.build_ir, as the exact data structure returned by
+ parser.parse_module() is determined by the grammar defined in module_ir.
+ These tests only need to cover errors and sanity checking.
+ """
+
+ def test_error_reporting_by_example(self):
+ parse_result = parser.parse_module(
+ tokenizer.tokenize("struct LogFileStatus:\n"
+ " 0 [+4] UInt\n", "")[0])
+ self.assertEqual(None, parse_result.parse_tree)
+ self.assertEqual("A name is required for a struct field.",
+ parse_result.error.code)
+ self.assertEqual('"\\n"', parse_result.error.token.symbol)
+ self.assertEqual(set(['"["', "SnakeWord", '"."', '":"', '"("']),
+ parse_result.error.expected_tokens)
+
+ def test_error_reporting_without_example(self):
+ parse_result = parser.parse_module(
+ tokenizer.tokenize("struct LogFileStatus:\n"
+ " 0 [+4] UInt foo +\n", "")[0])
+ self.assertEqual(None, parse_result.parse_tree)
+ self.assertEqual(None, parse_result.error.code)
+ self.assertEqual('"+"', parse_result.error.token.symbol)
+ self.assertEqual(set(['"("', '"\\n"', '"["', "Documentation", "Comment"]),
+ parse_result.error.expected_tokens)
+
+ def test_ok_parse(self):
+ parse_result = parser.parse_module(
+ tokenizer.tokenize("struct LogFileStatus:\n"
+ " 0 [+4] UInt foo\n", "")[0])
+ self.assertTrue(parse_result.parse_tree)
+ self.assertEqual(None, parse_result.error)
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/front_end/prelude.emb b/front_end/prelude.emb
new file mode 100644
index 0000000..5a54251
--- /dev/null
+++ b/front_end/prelude.emb
@@ -0,0 +1,71 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- This is the Emboss Prelude.
+--
+-- This is a special file whose names are imported into (and therefore usable
+-- in) every Emboss module. The IR for the Prelude module is included in every
+-- Emboss IR as the second element of the `module` list.
+
+# This namespace needs to match the namespace in emboss_prelude.h.
+# TODO(bolms): Move back-end-specific declarations to a separate file.
+[(cpp) namespace: "emboss::prelude"]
+
+
+external UInt:
+ -- UInt is an automatically-sized unsigned integer.
+ [static_requirements: $is_statically_sized && 1 <= $static_size_in_bits <= 64]
+ [is_integer: true]
+ [addressable_unit_size: 1]
+
+
+external Int:
+ -- Int is an automatically-sized signed 2's-complement integer.
+ [static_requirements: $is_statically_sized && 1 <= $static_size_in_bits <= 64]
+ [is_integer: true]
+ [addressable_unit_size: 1]
+
+
+external Bcd:
+ -- `Bcd` is an automatically-sized unsigned integer stored in Binary-Coded
+ -- Decimal (BCD) format. https://en.wikipedia.org/wiki/Binary-coded_decimal
+ --
+ -- `Bcd` can be used in `bits` constructs. If its size is not a multiple of
+ -- 4 bits, the value will be padded out to a multiple of 4 bits using 0s, and
+ -- then the resulting bit pattern will be treated as BCD. Thus, a 7-bit `Bcd`
+ -- will have a range of 0..79, a 10-bit `Bcd` will have a range of 0..399, and
+ -- so on.
+ [static_requirements: $is_statically_sized && 1 <= $static_size_in_bits <= 64]
+ [is_integer: true]
+ [addressable_unit_size: 1]
+
+
+external Flag:
+ -- `Flag` is a boolean value, with `0` meaning `false` and `1` meaning `true`.
+ [static_requirements: $is_statically_sized && $static_size_in_bits == 1]
+ [fixed_size_in_bits: 1]
+ # Flags are not integers; if a user wants a 1-bit integer, they should use a
+ # 1-bit UInt.
+ [is_integer: false]
+ [addressable_unit_size: 1]
+
+
+external Float:
+ -- `Float` is a number in an IEEE 754 binaryNN format.
+ [static_requirements: $is_statically_sized && ($static_size_in_bits == 32 || $static_size_in_bits == 64)]
+ [is_integer: false]
+ [addressable_unit_size: 1]
+
+
+# TODO(bolms): Add Fixed-point.
diff --git a/front_end/reserved_words b/front_end/reserved_words
new file mode 100644
index 0000000..57dba44
--- /dev/null
+++ b/front_end/reserved_words
@@ -0,0 +1,992 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Reserved words for Emboss.
+#
+# In the interest of avoiding problems during codegen, Emboss disallows fields,
+# types, and enum values that would collide with reserved words from a number
+# of languages. This (mostly) avoids cases where the back-end code generator
+# would want to emit a field accessor with the same name as a keyword. (Proto,
+# for example, handles this case by appending "_" to the field name if it
+# collides with a keyword; however, this is not documented, and Proto still
+# breaks if you happen to have fields named "struct" and "struct_" in the same
+# message.)
+#
+# Emboss reserves words from many languages, on the off chance that they will
+# someday have code generators, but there is an emphasis on systems languages
+# (such as C), hardware definition languages (such as Verilog), and languages
+# which can easily be used for handling binary data (such as Erlang).
+#
+# Non-blank, non-comment lines in this file take one of two forms:
+#
+# -- Source
+# word
+#
+# e.g.:
+#
+# -- C
+# int
+# long
+# _Bool # C99
+#
+# -- Verilog
+# always
+# case
+#
+# A word may appear in multiple language sections. The first language name for
+# a word will be used in error messages when names matching that word are
+# found.
+
+# TODO(bolms): There still needs to be a way to override field names in
+# generated code. This list is *not* a complete list of every possible
+# reserved word -- such a list is impossible, given that language designers
+# continue to add new keywords to existing languages, new languages pop up,
+# standards such as POSIX reserve huge swathes of namespace for future
+# expansion (e.g., anything starting with E[A-Z0-9]), and different C and C++
+# code bases may have custom preprocessor macros that collide with *anything*.
+
+-- C
+asm
+auto
+break
+case
+char
+const
+continue
+default
+do
+double
+else
+extern
+float
+for
+fortran
+goto
+if
+inline # C99
+int
+long
+register
+restrict # C99
+return
+short
+signed
+sizeof
+static
+switch
+typedef
+unsigned
+void
+volatile
+while
+_Alignas # C11
+_Alignof # C11
+_Atomic # C11
+_Bool # C99
+_Complex # C99
+_Generic # C11
+_Imaginary # C99
+_Noreturn # C11
+_Pragma # C99
+_Static_assert # C11
+_Thread_local # C11
+
+# The following are *macros* defined in the C standard library. For the most
+# part, I do not think banning these will inconvenience many people.
+
+# <assert.h>
+NDEBUG
+static_assert
+assert
+
+# <complex.h>
+__STDC_NO_COMPLEX__
+complex
+_Complex_I
+imaginary
+_Imaginary_I
+I
+CMPLX
+CMPLXF
+CMPLXL
+
+# <errno.h>
+errno
+EDOM
+EILSEQ
+ERANGE
+
+# <fenv.h>
+FE_DIVBYZERO
+FE_INEXACT
+FE_INVALID
+FE_OVERFLOW
+FE_UNDERFLOW
+FE_ALL_EXCEPT
+FE_DOWNWARD
+FE_TONEAREST
+FE_TOWARDZERO
+FE_UPWARD
+FE_DFL_ENV
+
+# <float.h>
+FLT_ROUNDS
+FLT_EVAL_METHOD
+FLT_HAS_SUBNORM
+DBL_HAS_SUBNORM
+LDBL_HAS_SUBNORM
+FLT_RADIX
+FLT_MANT_DIG
+DBL_MANT_DIG
+LDBL_MANT_DIG
+FLT_DECIMAL_DIG
+DBL_DECIMAL_DIG
+LDBL_DECIMAL_DIG
+DECIMAL_DIG
+FLT_DIG
+DBL_DIG
+LDBL_DIG
+FLT_MIN_EXP
+DBL_MIN_EXP
+LDBL_MIN_EXP
+FLT_MIN_10_EXP
+DBL_MIN_10_EXP
+LDBL_MIN_10_EXP
+FLT_MAX_EXP
+DBL_MAX_EXP
+LDBL_MAX_EXP
+FLT_MAX_10_EXP
+DBL_MAX_10_EXP
+LDBL_MAX_10_EXP
+FLT_MAX
+DBL_MAX
+LDBL_MAX
+FLT_EPSILON
+DBL_EPSILON
+LDBL_EPSILON
+FLT_MIN
+DBL_MIN
+LDBL_MIN
+FLT_TRUE_MIN
+DBL_TRUE_MIN
+LDBL_TRUE_MIN
+
+# <iso646.h>
+# These are not frequently used in real C code, but then, it is hard to think
+# of a good reason to use these as field names.
+and
+and_eq
+bitand
+bitor
+compl
+not
+not_eq
+or
+or_eq
+xor
+xor_eq
+
+# <limits.h>
+CHAR_BIT
+SCHAR_MIN
+SCHAR_MAX
+UCHAR_MAX
+CHAR_MIN
+CHAR_MAX
+MB_LEN_MAX
+SHRT_MIN
+SHRT_MAX
+USHRT_MAX
+INT_MIN
+INT_MAX
+UINT_MAX
+LONG_MIN
+LONG_MAX
+ULONG_MAX
+LLONG_MIN
+LLONG_MAX
+ULLONG_MAX
+
+# <locale.h>
+NULL
+LC_ALL
+LC_COLLATE
+LC_CTYPE
+LC_MONETARY
+LC_NUMERIC
+LC_TIME
+
+# <math.h>
+HUGE_VAL
+HUGE_VALF
+HUGE_VALL
+INFINITY
+NAN
+FP_INFINITE
+FP_NAN
+FP_NORMAL
+FP_SUBNORMAL
+FP_ZERO
+FP_FAST_FMA
+FP_FAST_FMAF
+FP_FAST_FMAL
+FP_ILOGB0
+FP_ILOGBNAN
+MATH_ERRNO
+MATH_ERREXCEPT
+math_errhandling
+fpclassify
+isfinite
+isinf
+isnan
+isnormal
+signbit
+isgreater
+isgreaterequal
+isless
+islessequal
+islessgreater
+isunordered
+
+# <setjmp.h>
+setjmp
+# Oddly, setjmp is a macro, but longjmp is not.
+
+# <signal.h>
+SIG_DFL
+SIG_ERR
+SIG_IGN
+SIGABRT
+SIGFPE
+SIGILL
+SIGINT
+SIGSEGV
+SIGTERM
+
+# <stdalign.h>
+alignas
+__alignas_is_defined
+
+# <stdarg.h>
+va_arg
+va_copy
+va_end
+va_start
+
+# <stdatomic.h>
+__STDC_NO_ATOMICS__
+ATOMIC_BOOL_LOCK_FREE
+ATOMIC_CHAR_LOCK_FREE
+ATOMIC_CHAR16_T_LOCK_FREE
+ATOMIC_CHAR32_T_LOCK_FREE
+ATOMIC_WCHAR_T_LOCK_FREE
+ATOMIC_SHORT_LOCK_FREE
+ATOMIC_INT_LOCK_FREE
+ATOMIC_LONG_LOCK_FREE
+ATOMIC_LLONG_LOCK_FREE
+ATOMIC_POINTER_LOCK_FREE
+ATOMIC_FLAG_INIT
+ATOMIC_VAR_INIT
+# Many of the following are listed by the standard as "generic functions"
+# instead of explicitly calling them out as macros.
+atomic_init
+kill_dependency
+atomic_is_lock_free
+atomic_store
+atomic_store_explicit
+atomic_load
+atomic_load_explicit
+atomic_exchange
+atomic_exchange_explicit
+atomic_compare_exchange_strong
+atomic_compare_exchange_strong_explicit
+atomic_compare_exchange_weak
+atomic_compare_exchange_weak_explicit
+atomic_fetch_add
+atomic_fetch_sub
+atomic_fetch_or
+atomic_fetch_xor
+atomic_fetch_and
+atomic_fetch_add_explicit
+atomic_fetch_sub_explicit
+atomic_fetch_or_explicit
+atomic_fetch_xor_explicit
+atomic_fetch_and_explicit
+
+# <stdbool.h>
+bool
+__bool_true_false_are_defined
+
+# <stddef.h>
+NULL
+offsetof
+
+# <stdint.h>
+INT16_C
+INT16_MAX
+INT16_MIN
+INT32_C
+INT32_MAX
+INT32_MIN
+INT64_C
+INT64_MAX
+INT64_MIN
+INT8_C
+INT8_MAX
+INT8_MIN
+INT_FAST16_MAX
+INT_FAST16_MIN
+INT_FAST32_MAX
+INT_FAST32_MIN
+INT_FAST64_MAX
+INT_FAST64_MIN
+INT_FAST8_MAX
+INT_FAST8_MIN
+INT_LEAST16_MAX
+INT_LEAST16_MIN
+INT_LEAST32_MAX
+INT_LEAST32_MIN
+INT_LEAST64_MAX
+INT_LEAST64_MIN
+INT_LEAST8_MAX
+INT_LEAST8_MIN
+INTMAX_C
+INTMAX_MAX
+INTMAX_MIN
+INTPTR_MAX
+INTPTR_MIN
+PTRDIFF_MAX
+PTRDIFF_MIN
+SIG_ATOMIC_MAX
+SIG_ATOMIC_MIN
+SIZE_MAX
+UINT16_C
+UINT16_MAX
+UINT32_C
+UINT32_MAX
+UINT64_C
+UINT64_MAX
+UINT8_C
+UINT8_MAX
+UINT_FAST16_MAX
+UINT_FAST32_MAX
+UINT_FAST64_MAX
+UINT_FAST8_MAX
+UINT_LEAST16_MAX
+UINT_LEAST32_MAX
+UINT_LEAST64_MAX
+UINT_LEAST8_MAX
+UINTMAX_C
+UINTMAX_MAX
+UINTPTR_MAX
+WCHAR_MAX
+WCHAR_MIN
+WINT_MAX
+WINT_MIN
+
+# <stdio.h>
+NULL
+_IOFBF
+_IOLBF
+_IONBF
+BUFSIZ
+EOF
+FOPEN_MAX
+FILENAME_MAX
+L_tmpnam
+SEEK_CUR
+SEEK_END
+SEEK_SET
+TMP_MAX
+stderr
+stdin
+stdout
+L_tmpnam_s
+TMP_MAX_S
+
+# <stdlib.h>
+NULL
+EXIT_FAILURE
+EXIT_SUCCESS
+RAND_MAX
+MB_CUR_MAX
+
+# <stdnoreturn.h>
+noreturn
+
+# <string.h>
+NULL
+
+# <tgmath.h>
+acos
+asin
+atan
+acosh
+asinh
+atanh
+cos
+sin
+tan
+cosh
+sinh
+tanh
+exp
+log
+pow
+sqrt
+fabs
+atan2
+cbrt
+ceil
+copysign
+erf
+erfc
+exp2
+expm1
+fdim
+floor
+fma
+fmax
+fmin
+fmod
+frexp
+hypot
+ilogb
+ldexp
+lgamma
+llrint
+llround
+log10
+log1p
+log2
+logb
+lrint
+lround
+nearbyint
+nextafter
+nexttoward
+remainder
+remquo
+rint
+round
+scalbn
+scalbln
+tgamma
+trunc
+carg
+cimag
+conj
+cproj
+creal
+
+# <threads.h>
+__STDC_NO_THREADS__
+thread_local
+ONCE_FLAG_INIT
+TSS_DTOR_ITERATIONS
+
+# <time.h>
+NULL
+CLOCKS_PER_SEC
+TIME_UTC
+
+# <uchar.h> has no macros.
+
+# <wchar.h>
+NULL
+WCHAR_MAX
+WCHAR_MIN
+WEOF
+
+# <wctype.h>
+WEOF
+
+
+-- C++
+alignas # C++11
+alignof # C++11
+and
+and_eq
+asm
+auto
+bitand
+bitor
+bool
+break
+case
+catch
+char
+char16_t # C++11
+char32_t # C++11
+class
+compl
+concept # concepts TS
+const
+constexpr # C++11
+const_cast
+continue
+decltype # C++11
+default
+delete
+do
+double
+dynamic_cast
+else
+enum
+explicit
+export
+extern
+false
+float
+for
+friend
+goto
+if
+inline
+int
+long
+mutable
+namespace
+new
+noexcept # C++11
+not
+not_eq
+nullptr # C++11
+operator
+or
+or_eq
+private
+protected
+public
+register
+reinterpret_cast
+requires # concepts TS
+return
+short
+signed
+sizeof
+static
+static_assert # C++11
+static_cast
+struct
+switch
+template
+this
+thread_local # C++11
+throw
+true
+try
+typedef
+typeid
+typename
+union
+unsigned
+using
+virtual
+void
+volatile
+wchar_t
+while
+xor
+xor_eq
+
+
+-- System V libc
+# <math.h>
+DOMAIN
+SING
+OVERFLOW
+UNDERFLOW
+TLOSS
+PLOSS
+
+
+-- BSD libc
+# <math.h>
+MAXFLOAT
+M_E
+M_LOG2E
+M_LOG10E
+M_LN2
+M_LN10
+M_PI
+M_PI_2
+M_PI_4
+M_1_PI
+M_2_PI
+M_2_SQRTPI
+M_SQRT2
+M_SQRT1_2
+M_TWOPI
+M_3PI_4
+M_SQRTPI
+M_LN2LO
+M_LN2HI
+M_SQRT3
+M_IVLN10
+M_LOG2_E
+M_INVLN2
+
+
+-- Verilog
+# Verilog and System Verilog allow any keyword to be used as an identifier as
+# long as it is prefixed with '\' and followed by whitespace; e.g., \if .
+
+
+-- VHDL
+# VHDL allows any keyword to be used as an identifier by surrounding it with
+# '\' characters; e.g. \if\.
+
+
+-- Go
+break
+case
+chan
+const
+continue
+default
+defer
+else
+fallthrough
+for
+func
+go
+goto
+if
+import
+interface
+map
+package
+range
+return
+select
+switch
+type
+var
+
+
+-- Python 2
+# Python 2-only keywords.
+exec
+print
+
+
+-- Python
+# Keywords in both Python 2 and 3.
+and
+as
+assert
+break
+class
+continue
+def
+del
+elif
+else
+except
+finally
+for
+from
+global
+if
+import
+in
+is
+lambda
+not
+or
+pass
+raise
+return
+try
+while
+with
+yield
+
+
+-- Python 3
+# Python 3-only keywords, plus `print`, which is a built-in function in Python 3.
+False
+None
+nonlocal
+print
+True
+
+
+-- Java
+abstract
+assert
+boolean
+break
+byte
+case
+catch
+char
+class
+const
+continue
+default
+do
+double
+else
+extends
+final
+finally
+float
+for
+goto
+if
+implements
+import
+instanceof
+int
+interface
+long
+native
+new
+package
+private
+protected
+public
+return
+short
+static
+strictfp
+super
+switch
+synchronized
+this
+throw
+throws
+transient
+try
+void
+volatile
+while
+
+
+-- Protocol Buffers
+# The protobuf compiler does not reserve *any* words. The following is a
+# perfectly valid .proto file:
+#
+# message message {
+# optional optional optional = 1;
+# };
+#
+# message optional {
+# optional message message = 1;
+# };
+#
+# Unsurprisingly, the same appears to be true of Cap'n Proto.
+
+
+-- Dart
+assert
+break
+case
+catch
+class
+const
+continue
+default
+do
+else
+enum
+extends
+false
+final
+finally
+for
+if
+in
+is
+new
+null
+rethrow
+return
+super
+switch
+this
+throw
+true
+try
+var
+void
+while
+with
+
+
+-- Objective C
+auto
+break
+case
+CGFloat
+char
+const
+continue
+default
+do
+double
+else
+enum
+extern
+float
+for
+goto
+if
+implementation
+int
+interface
+long
+nonatomic
+NSInteger
+NSNumber
+NSObject
+_Packed
+property
+protocol
+readonly
+readwrite
+register
+retain
+return
+short
+signed
+sizeof
+static
+strong
+struct
+switch
+typedef
+union
+unsafe_unretained
+unsigned
+void
+volatile
+weak
+while
+
+
+-- Swift
+# Swift allows any name to be used as an identifier if it is enclosed in
+# backticks (e.g., `if`).
+
+
+-- Erlang
+after
+and
+andalso
+band
+begin
+bnot
+bor
+bsl
+bsr
+bxor
+case
+catch
+cond
+div
+end
+fun
+if
+let
+not
+of
+or
+orelse
+receive
+rem
+try
+when
+xor
+
+
+-- Rust
+abstract
+alignof
+as
+become
+box
+break
+const
+continue
+crate
+do
+else
+extern
+final
+fn
+for
+if
+impl
+in
+let
+loop
+macro
+match
+mod
+move
+mut
+offsetof
+override
+priv
+proc
+pub
+pure
+ref
+return
+self
+Self
+sizeof
+static
+super
+trait
+type
+typeof
+unsafe
+unsized
+use
+virtual
+where
+while
+yield
+
+
+-- C#
+# C# allows any name to be used as an identifier if it is prefixed with '@';
+# e.g., @if.
+
+
+-- MATLAB
+break
+case
+catch
+classdef
+continue
+else
+elseif
+end
+for
+function
+global
+if
+otherwise
+parfor
+persistent
+return
+spmd
+switch
+try
+while
diff --git a/front_end/symbol_resolver.py b/front_end/symbol_resolver.py
new file mode 100644
index 0000000..e059275
--- /dev/null
+++ b/front_end/symbol_resolver.py
@@ -0,0 +1,530 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Symbol resolver for Emboss IR.
+
+The resolve_symbols function should be used to generate canonical resolutions
+for all symbol references in an Emboss IR.
+"""
+
+import collections
+
+from public import ir_pb2
+from util import error
+from util import ir_util
+from util import traverse_ir
+
+# TODO(bolms): Symbol resolution raises an exception at the first error, but
+# this is one place where it can make sense to report multiple errors.
+
+FileLocation = collections.namedtuple("FileLocation", ["file", "location"])
+
+
+def ambiguous_name_error(file_name, location, name, candidate_locations):
+ """A name cannot be resolved because there are two or more candidates."""
+ result = [error.error(file_name, location, "Ambiguous name '{}'".format(name))
+ ]
+ for candidate in sorted(candidate_locations):
+ result.append(error.note(candidate.file, candidate.location,
+ "Possible resolution"))
+ return result
+
+
+def duplicate_name_error(file_name, location, name, original_location):
+ """A name is defined two or more times."""
+ return [error.error(file_name, location, "Duplicate name '{}'".format(name)),
+ error.note(original_location.file, original_location.location,
+ "Original definition")]
+
+
+def missing_name_error(file_name, location, name):
+ return [error.error(file_name, location, "No candidate for '{}'".format(name))
+ ]
+
+
+def array_subfield_error(file_name, location, name):
+ return [error.error(file_name, location,
+ "Cannot access member of array '{}'".format(name))]
+
+
+def noncomposite_subfield_error(file_name, location, name):
+ return [error.error(file_name, location,
+ "Cannot access member of noncomposite field '{}'".format(
+ name))]
+
+
+def _nested_name(canonical_name, name):
+ """Creates a new CanonicalName with name appended to the object_path."""
+ return ir_pb2.CanonicalName(
+ module_file=canonical_name.module_file,
+ object_path=list(canonical_name.object_path) + [name])
+
+
+class _Scope(dict):
+ """A _Scope holds data for a symbol.
+
+ A _Scope is a dict with some additional attributes. Lexically nested names
+ are kept in the dict, and bookkeeping is kept in the additional attributes.
+
+ For example, each module should have a child _Scope for each type contained in
+ the module. `struct` and `bits` types should have nested _Scopes for each
+ field; `enum` types should have nested scopes for each enumerated name.
+
+ Attributes:
+ canonical_name: The absolute name of this symbol; e.g. ("file.emb",
+ "TypeName", "SubTypeName", "field_name")
+ source_location: The ir_pb2.SourceLocation where this symbol is defined.
+ visibility: LOCAL, PRIVATE, or SEARCHABLE; see below.
+ alias: If set, this name is merely a pointer to another name.
+ """
+ __slots__ = ("canonical_name", "source_location", "visibility", "alias")
+
+ # A LOCAL name is visible outside of its enclosing scope, but should not be
+ # found when searching for a name. That is, this name should be matched in
+ # the tail of a qualified reference (the 'bar' in 'foo.bar'), but not when
+ # searching for names (the 'foo' in 'foo.bar' should not match outside of
+ # 'foo's scope). This applies to public field names.
+ LOCAL = object()
+
+ # A PRIVATE name is similar to LOCAL except that it is never visible outside
+ # its enclosing scope. This applies to abbreviations of field names: if 'a'
+ # is an abbreviation for field 'apple', then 'foo.a' is not a valid reference;
+ # instead it should be 'foo.apple'.
+ PRIVATE = object()
+
+ # A SEARCHABLE name is visible as long as it is in a scope in the search list.
+ # This applies to type names ('Foo'), which may be found from many scopes.
+ SEARCHABLE = object()
+
+ def __init__(self, canonical_name, source_location, visibility, alias=None):
+ super(_Scope, self).__init__()
+ self.canonical_name = canonical_name
+ self.source_location = source_location
+ self.visibility = visibility
+ self.alias = alias
+
+
+def _add_name_to_scope(name_ir, scope, canonical_name, visibility, errors):
+ """Adds the given name_ir to the given scope."""
+ name = name_ir.text
+ new_scope = _Scope(canonical_name, name_ir.source_location, visibility)
+ if name in scope:
+ errors.append(duplicate_name_error(
+ scope.canonical_name.module_file, name_ir.source_location, name,
+ FileLocation(scope[name].canonical_name.module_file,
+ scope[name].source_location)))
+ else:
+ scope[name] = new_scope
+ return new_scope
+
+
+def _add_name_to_scope_and_normalize(name_ir, scope, visibility, errors):
+ """Adds the given name_ir to scope and sets its canonical_name."""
+ name = name_ir.name.text
+ canonical_name = _nested_name(scope.canonical_name, name)
+ name_ir.canonical_name.CopyFrom(canonical_name)
+ return _add_name_to_scope(name_ir.name, scope, canonical_name, visibility,
+ errors)
+
+
+def _add_struct_field_to_scope(field, scope, errors):
+ """Adds the name of the given field to the scope."""
+ new_scope = _add_name_to_scope_and_normalize(field.name, scope, _Scope.LOCAL,
+ errors)
+ if field.HasField("abbreviation"):
+ _add_name_to_scope(field.abbreviation, scope, new_scope.canonical_name,
+ _Scope.PRIVATE, errors)
+
+ value_builtin_name = ir_pb2.Word(
+ text="this",
+ source_location=ir_pb2.Location(is_synthetic=True),
+ )
+ # In "inside field" scope, the name `this` maps back to the field itself.
+ # This is important for attributes like `[requires]`.
+ _add_name_to_scope(value_builtin_name, new_scope,
+ field.name.canonical_name, _Scope.PRIVATE, errors)
+
+
+def _add_parameter_name_to_scope(parameter, scope, errors):
+ """Adds the name of the given parameter to the scope."""
+ _add_name_to_scope_and_normalize(parameter.name, scope, _Scope.LOCAL, errors)
+
+
+def _add_enum_value_to_scope(value, scope, errors):
+ """Adds the name of the enum value to scope."""
+ _add_name_to_scope_and_normalize(value.name, scope, _Scope.LOCAL, errors)
+
+
+def _add_type_name_to_scope(type_definition, scope, errors):
+ """Adds the name of type_definition to the given scope."""
+ new_scope = _add_name_to_scope_and_normalize(type_definition.name, scope,
+ _Scope.SEARCHABLE, errors)
+ return {"scope": new_scope}
+
+
+def _set_scope_for_type_definition(type_definition, scope):
+ """Sets the current scope for an ir_pb2.TypeDefinition."""
+ return {"scope": scope[type_definition.name.name.text]}
+
+
+def _add_module_to_scope(module, scope):
+ """Adds the name of the module to the given scope."""
+ module_symbol_table = _Scope(
+ ir_pb2.CanonicalName(module_file=module.source_file_name,
+ object_path=[]),
+ None,
+ _Scope.SEARCHABLE)
+ scope[module.source_file_name] = module_symbol_table
+ return {"scope": scope[module.source_file_name]}
+
+
+def _set_scope_for_module(module, scope):
+ """Sets the current scope for an ir_pb2.Module."""
+ return {"scope": scope[module.source_file_name]}
+
+
+def _add_import_to_scope(foreign_import, table, module, errors):
+ if not foreign_import.local_name.text:
+ # This is the prelude import; ignore it.
+ return
+ _add_alias_to_scope(foreign_import.local_name, table, module.canonical_name,
+ [foreign_import.file_name.text], _Scope.SEARCHABLE,
+ errors)
+
+
+def _construct_symbol_tables(ir):
+ """Constructs per-module symbol tables for each module in ir."""
+ symbol_tables = {}
+ errors = []
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Module], _add_module_to_scope,
+ parameters={"errors": errors, "scope": symbol_tables})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.TypeDefinition], _add_type_name_to_scope,
+ incidental_actions={ir_pb2.Module: _set_scope_for_module},
+ parameters={"errors": errors, "scope": symbol_tables})
+ if errors:
+ # Ideally, we would find duplicate field names elsewhere in the module, even
+ # if there are duplicate type names, but field/enum names in the colliding
+ # types also end up colliding, leading to spurious errors. E.g., if you
+ # have two `struct Foo`s, then the field check will also discover a
+ # collision for `$size_in_bytes`, since there are two `Foo.$size_in_bytes`.
+ return symbol_tables, errors
+
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.EnumValue], _add_enum_value_to_scope,
+ incidental_actions={
+ ir_pb2.Module: _set_scope_for_module,
+ ir_pb2.TypeDefinition: _set_scope_for_type_definition,
+ },
+ parameters={"errors": errors, "scope": symbol_tables})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Field], _add_struct_field_to_scope,
+ incidental_actions={
+ ir_pb2.Module: _set_scope_for_module,
+ ir_pb2.TypeDefinition: _set_scope_for_type_definition,
+ },
+ parameters={"errors": errors, "scope": symbol_tables})
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.RuntimeParameter], _add_parameter_name_to_scope,
+ incidental_actions={
+ ir_pb2.Module: _set_scope_for_module,
+ ir_pb2.TypeDefinition: _set_scope_for_type_definition,
+ },
+ parameters={"errors": errors, "scope": symbol_tables})
+ return symbol_tables, errors
+
+
+def _add_alias_to_scope(name_ir, table, scope, alias, visibility, errors):
+ """Adds the given name to the scope as an alias."""
+ name = name_ir.text
+ new_scope = _Scope(_nested_name(scope, name), name_ir.source_location,
+ visibility, alias)
+ scoped_table = table[scope.module_file]
+ for path_element in scope.object_path:
+ scoped_table = scoped_table[path_element]
+ if name in scoped_table:
+ errors.append(duplicate_name_error(
+ scoped_table.canonical_name.module_file, name_ir.source_location, name,
+ FileLocation(scoped_table[name].canonical_name.module_file,
+ scoped_table[name].source_location)))
+ else:
+ scoped_table[name] = new_scope
+ return new_scope
+
+
+def _resolve_head_of_field_reference(field_reference, table, current_scope,
+ visible_scopes, source_file_name, errors):
+ return _resolve_reference(
+ field_reference.path[0], table, current_scope,
+ visible_scopes, source_file_name, errors)
+
+
+def _resolve_reference(reference, table, current_scope, visible_scopes,
+ source_file_name, errors):
+ """Sets the canonical name of the given reference."""
+ if reference.HasField("canonical_name"):
+ # This reference has already been resolved by the _resolve_field_reference
+ # pass.
+ return
+ target = _find_target_of_reference(reference, table, current_scope,
+ visible_scopes, source_file_name, errors)
+ if target is not None:
+ assert not target.alias
+ reference.canonical_name.CopyFrom(target.canonical_name)
+
+
+def _find_target_of_reference(reference, table, current_scope, visible_scopes,
+ source_file_name, errors):
+ """Returns the _Scope that the given reference resolves to, or None."""
+ found_in_table = None
+ name = reference.source_name[0].text
+ for scope in visible_scopes:
+ scoped_table = table[scope.module_file]
+ for path_element in scope.object_path:
+ scoped_table = scoped_table[path_element]
+ if (name in scoped_table and
+ (scope == current_scope or
+ scoped_table[name].visibility == _Scope.SEARCHABLE)):
+ # Prelude is "", so explicitly check for None.
+ if found_in_table is not None:
+ # TODO(bolms): Currently, this catches the case where a module tries to
+ # use a name that is defined (at the same scope) in two different
+ # modules. It may make sense to raise duplicate_name_error whenever two
+ # modules define the same name (whether it is used or not), and reserve
+ # ambiguous_name_error for cases where a name is found in multiple
+ # scopes.
+ errors.append(ambiguous_name_error(
+ source_file_name, reference.source_location, name, [FileLocation(
+ found_in_table[name].canonical_name.module_file,
+ found_in_table[name].source_location), FileLocation(
+ scoped_table[name].canonical_name.module_file, scoped_table[
+ name].source_location)]))
+ continue
+ found_in_table = scoped_table
+ if reference.is_local_name:
+ # This is a little hacky. When "is_local_name" is True, the name refers
+ # to a type that was defined inline. In many cases, the type should be
+ # found at the same scope as the field; e.g.:
+ #
+ # struct Foo:
+ # 0 [+1] enum bar:
+ # BAZ = 1
+ #
+ # In this case, `Foo.bar` has type `Foo.Bar`. Unfortunately, things
+ # break down a little bit when there is an inline type in an anonymous
+ # `bits`:
+ #
+ # struct Foo:
+ # 0 [+1] bits:
+ # 0 [+7] enum bar:
+ # BAZ = 1
+ #
+ # Types inside of anonymous `bits` are hoisted into their parent type,
+ # so instead of `Foo.EmbossReservedAnonymous1.Bar`, `bar`'s type is just
+ # `Foo.Bar`. Unfortunately, the field is still
+ # `Foo.EmbossReservedAnonymous1.bar`, so `bar`'s type won't be found in
+ # `bar`'s `current_scope`.
+ #
+ # (The name `bar` is exposed from `Foo` as an alias virtual field, so
+ # perhaps the correct answer is to allow type aliases, so that `Bar` can
+ # be found in both `Foo` and `Foo.EmbossReservedAnonymous1`. That would
+ # involve an entirely new feature, though.)
+ #
+ # The workaround here is to search scopes from the innermost outward,
+ # and just stop as soon as a match is found. This isn't ideal, because
+ # it relies on other bits of the front end having correctly added the
+ # inline type to the correct scope before symbol resolution, but it does
+ # work. Names whose `is_local_name` is False are still checked for
+ # ambiguity.
+ break
+ if found_in_table is None:
+ errors.append(missing_name_error(
+ source_file_name, reference.source_name[0].source_location, name))
+ if not errors:
+ for subname in reference.source_name:
+ if subname.text not in found_in_table:
+ errors.append(missing_name_error(source_file_name,
+ subname.source_location, subname.text))
+ return None
+ found_in_table = found_in_table[subname.text]
+ while found_in_table.alias:
+ referenced_table = table
+ for name in found_in_table.alias:
+ referenced_table = referenced_table[name]
+ # TODO(bolms): This section should really be a recursive lookup
+ # function, which would be able to handle arbitrary aliases through
+ # other aliases.
+ #
+ # This should be fine for now, since the only aliases here should be
+ # imports, which can't refer to other imports.
+ assert not referenced_table.alias, "Alias found to contain alias."
+ found_in_table = referenced_table
+ return found_in_table
+ return None
+
+
+def _resolve_field_reference(field_reference, source_file_name, errors, ir):
+ """Resolves the References inside of a FieldReference."""
+ if field_reference.path[-1].HasField("canonical_name"):
+ # Already done.
+ return
+ previous_field = ir_util.find_object_or_none(field_reference.path[0], ir)
+ previous_reference = field_reference.path[0]
+ for ref in field_reference.path[1:]:
+ while ir_util.field_is_virtual(previous_field):
+ if (previous_field.read_transform.WhichOneof("expression") ==
+ "field_reference"):
+ # Pass a separate error list into the recursive _resolve_field_reference
+ # call so that only one copy of the error for a particular reference
+ # will actually surface: in particular, the one that results from a
+ # direct call from traverse_ir_top_down into _resolve_field_reference.
+ new_errors = []
+ _resolve_field_reference(
+ previous_field.read_transform.field_reference,
+ previous_field.name.canonical_name.module_file, new_errors, ir)
+ # If the recursive _resolve_field_reference was unable to resolve the
+ # field, then bail. Otherwise we get a cascade of errors, where an
+ # error in `x` leads to errors in anything trying to reach a member of
+ # `x`.
+ if not previous_field.read_transform.field_reference.path[-1].HasField(
+ "canonical_name"):
+ return
+ previous_field = ir_util.find_object(
+ previous_field.read_transform.field_reference.path[-1], ir)
+ else:
+ errors.append(
+ noncomposite_subfield_error(source_file_name,
+ previous_reference.source_location,
+ previous_reference.source_name[0].text))
+ return
+ if previous_field.type.WhichOneof("type") == "array_type":
+ errors.append(
+ array_subfield_error(source_file_name,
+ previous_reference.source_location,
+ previous_reference.source_name[0].text))
+ return
+ assert previous_field.type.WhichOneof("type") == "atomic_type"
+ member_name = ir_pb2.CanonicalName()
+ member_name.CopyFrom(
+ previous_field.type.atomic_type.reference.canonical_name)
+ member_name.object_path.extend([ref.source_name[0].text])
+ previous_field = ir_util.find_object_or_none(member_name, ir)
+ if previous_field is None:
+ errors.append(
+ missing_name_error(source_file_name,
+ ref.source_name[0].source_location,
+ ref.source_name[0].text))
+ return
+ ref.canonical_name.CopyFrom(member_name)
+ previous_reference = ref
+
+
+def _set_visible_scopes_for_type_definition(type_definition, visible_scopes):
+ """Sets current_scope and visible_scopes for the given type_definition."""
+ return {
+ "current_scope": type_definition.name.canonical_name,
+
+ # In order to ensure that the iteration through scopes in
+ # _find_target_of_reference will go from innermost to outermost, it is
+ # important that the current scope (type_definition.name.canonical_name)
+ # precedes the previous visible_scopes here.
+ "visible_scopes": (type_definition.name.canonical_name,) + visible_scopes,
+ }
+
+
+def _set_visible_scopes_for_module(module):
+ """Sets visible_scopes for the given module."""
+ self_scope = ir_pb2.CanonicalName(module_file=module.source_file_name)
+ extra_visible_scopes = []
+ for foreign_import in module.foreign_import:
+ # Anonymous imports are searched for top-level names; named imports are not.
+ # As of right now, only the prelude should be imported anonymously; other
+ # modules must be imported with names.
+ if not foreign_import.local_name.text:
+ extra_visible_scopes.append(
+ ir_pb2.CanonicalName(module_file=foreign_import.file_name.text))
+ return {"visible_scopes": (self_scope,) + tuple(extra_visible_scopes)}
+
+
+def _set_visible_scopes_for_attribute(attribute, field, visible_scopes):
+ """Sets current_scope and visible_scopes for the attribute."""
+ del attribute # Unused
+ if field is None:
+ return None
+ return {
+ "current_scope": field.name.canonical_name,
+ "visible_scopes": (field.name.canonical_name,) + visible_scopes,
+ }
+
+
+def _resolve_symbols_from_table(ir, table):
+ """Resolves all references in the given IR, given the constructed table."""
+ errors = []
+ # This function resolves symbols in three passes. First, it resolves any
+ # imports and adds import aliases to modules.
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Import], _add_import_to_scope,
+ incidental_actions={
+ ir_pb2.Module: lambda m, table: {"module": table[m.source_file_name]},
+ },
+ parameters={"errors": errors, "table": table})
+ if errors:
+ return errors
+ # Next, this resolves all absolute references (e.g., it resolves "UInt" in
+ # "0 [+1] UInt field" to [prelude]::UInt).
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Reference], _resolve_reference,
+ skip_descendants_of=(ir_pb2.FieldReference,),
+ incidental_actions={
+ ir_pb2.TypeDefinition: _set_visible_scopes_for_type_definition,
+ ir_pb2.Module: _set_visible_scopes_for_module,
+ ir_pb2.Field: lambda f: {"field": f},
+ ir_pb2.Attribute: _set_visible_scopes_for_attribute,
+ },
+ parameters={"table": table, "errors": errors, "field": None})
+ # Lastly, head References to fields (e.g., the `a` of `a.b.c`) are resolved.
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.FieldReference], _resolve_head_of_field_reference,
+ incidental_actions={
+ ir_pb2.TypeDefinition: _set_visible_scopes_for_type_definition,
+ ir_pb2.Module: _set_visible_scopes_for_module,
+ ir_pb2.Field: lambda f: {"field": f},
+ ir_pb2.Attribute: _set_visible_scopes_for_attribute,
+ },
+ parameters={"table": table, "errors": errors, "field": None})
+ return errors
+
+
+def resolve_field_references(ir):
+ """Resolves structure member accesses ("field.subfield") in ir."""
+ errors = []
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.FieldReference], _resolve_field_reference,
+ incidental_actions={
+ ir_pb2.TypeDefinition: _set_visible_scopes_for_type_definition,
+ ir_pb2.Module: _set_visible_scopes_for_module,
+ ir_pb2.Field: lambda f: {"field": f},
+ ir_pb2.Attribute: _set_visible_scopes_for_attribute,
+ },
+ parameters={"errors": errors, "field": None})
+ return errors
+
+
+def resolve_symbols(ir):
+ """Resolves the symbols in all modules in ir."""
+ symbol_tables, errors = _construct_symbol_tables(ir)
+ if errors:
+ return errors
+ return _resolve_symbols_from_table(ir, symbol_tables)
diff --git a/front_end/symbol_resolver_test.py b/front_end/symbol_resolver_test.py
new file mode 100644
index 0000000..a2f7f4e
--- /dev/null
+++ b/front_end/symbol_resolver_test.py
@@ -0,0 +1,748 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for emboss.front_end.symbol_resolver."""
+
+import unittest
+from front_end import glue
+from front_end import symbol_resolver
+from front_end import test_util
+from util import error
+
+_HAPPY_EMB = """
+struct Foo:
+ 0 [+4] UInt uint_field
+ 4 [+4] Bar bar_field
+ 8 [+16] UInt[4] array_field
+
+struct Bar:
+ 0 [+4] Qux bar
+
+enum Qux:
+ ABC = 1
+ DEF = 2
+
+struct FieldRef:
+ n-4 [+n] UInt:8[n] data
+ offset-4 [+offset] UInt:8[offset] data2
+ 0 [+4] UInt offset (n)
+
+struct VoidLength:
+ 0 [+10] UInt:8[] ten_bytes
+
+enum Quux:
+ ABC = 1
+ DEF = ABC
+
+struct UsesParameter(x: UInt:8):
+ 0 [+x] UInt:8[] block
+"""
+
+
+class ResolveSymbolsTest(unittest.TestCase):
+ """Tests for symbol_resolver.resolve_symbols()."""
+
+ def _construct_ir_multiple(self, file_dict, primary_emb_name):
+ ir, unused_debug_info, errors = glue.parse_emboss_file(
+ primary_emb_name,
+ test_util.dict_file_reader(file_dict),
+ stop_before_step="resolve_symbols")
+ assert not errors
+ return ir
+
+ def _construct_ir(self, emb_text, name="happy.emb"):
+ return self._construct_ir_multiple({name: emb_text}, name)
+
+ def test_struct_field_atomic_type_resolution(self):
+ ir = self._construct_ir(_HAPPY_EMB)
+ self.assertEqual([], symbol_resolver.resolve_symbols(ir))
+ struct_ir = ir.module[0].type[0].structure
+ atomic_field1_reference = struct_ir.field[0].type.atomic_type.reference
+ self.assertEqual(atomic_field1_reference.canonical_name.object_path, ["UInt"
+ ])
+ self.assertEqual(atomic_field1_reference.canonical_name.module_file, "")
+ atomic_field2_reference = struct_ir.field[1].type.atomic_type.reference
+ self.assertEqual(atomic_field2_reference.canonical_name.object_path, ["Bar"
+ ])
+ self.assertEqual(atomic_field2_reference.canonical_name.module_file,
+ "happy.emb")
+
+ def test_struct_field_enum_type_resolution(self):
+ ir = self._construct_ir(_HAPPY_EMB)
+ self.assertEqual([], symbol_resolver.resolve_symbols(ir))
+ struct_ir = ir.module[0].type[1].structure
+ atomic_field_reference = struct_ir.field[0].type.atomic_type.reference
+ self.assertEqual(atomic_field_reference.canonical_name.object_path, ["Qux"])
+ self.assertEqual(atomic_field_reference.canonical_name.module_file,
+ "happy.emb")
+
+ def test_struct_field_array_type_resolution(self):
+ ir = self._construct_ir(_HAPPY_EMB)
+ self.assertEqual([], symbol_resolver.resolve_symbols(ir))
+ array_field_type = ir.module[0].type[0].structure.field[2].type.array_type
+ array_field_reference = array_field_type.base_type.atomic_type.reference
+ self.assertEqual(array_field_reference.canonical_name.object_path, ["UInt"])
+ self.assertEqual(array_field_reference.canonical_name.module_file, "")
+
+ def test_inner_type_resolution(self):
+ ir = self._construct_ir(_HAPPY_EMB)
+ self.assertEqual([], symbol_resolver.resolve_symbols(ir))
+ array_field_type = ir.module[0].type[0].structure.field[2].type.array_type
+ array_field_reference = array_field_type.base_type.atomic_type.reference
+ self.assertEqual(array_field_reference.canonical_name.object_path, ["UInt"])
+ self.assertEqual(array_field_reference.canonical_name.module_file, "")
+
+ def test_struct_field_resolution_in_expression_in_location(self):
+ ir = self._construct_ir(_HAPPY_EMB)
+ self.assertEqual([], symbol_resolver.resolve_symbols(ir))
+ struct_ir = ir.module[0].type[3].structure
+ field0_loc = struct_ir.field[0].location
+ abbreviation_reference = field0_loc.size.field_reference.path[0]
+ self.assertEqual(abbreviation_reference.canonical_name.object_path,
+ ["FieldRef", "offset"])
+ self.assertEqual(abbreviation_reference.canonical_name.module_file,
+ "happy.emb")
+ field0_start_left = field0_loc.start.function.args[0]
+ nested_abbreviation_reference = field0_start_left.field_reference.path[0]
+ self.assertEqual(nested_abbreviation_reference.canonical_name.object_path,
+ ["FieldRef", "offset"])
+ self.assertEqual(nested_abbreviation_reference.canonical_name.module_file,
+ "happy.emb")
+ field1_loc = struct_ir.field[1].location
+ direct_reference = field1_loc.size.field_reference.path[0]
+ self.assertEqual(direct_reference.canonical_name.object_path, ["FieldRef",
+ "offset"])
+ self.assertEqual(direct_reference.canonical_name.module_file, "happy.emb")
+ field1_start_left = field1_loc.start.function.args[0]
+ nested_direct_reference = field1_start_left.field_reference.path[0]
+ self.assertEqual(nested_direct_reference.canonical_name.object_path,
+ ["FieldRef", "offset"])
+ self.assertEqual(nested_direct_reference.canonical_name.module_file,
+ "happy.emb")
+
+ def test_struct_field_resolution_in_expression_in_array_length(self):
+ ir = self._construct_ir(_HAPPY_EMB)
+ self.assertEqual([], symbol_resolver.resolve_symbols(ir))
+ struct_ir = ir.module[0].type[3].structure
+ field0_array_type = struct_ir.field[0].type.array_type
+ field0_array_element_count = field0_array_type.element_count
+ abbreviation_reference = field0_array_element_count.field_reference.path[0]
+ self.assertEqual(abbreviation_reference.canonical_name.object_path,
+ ["FieldRef", "offset"])
+ self.assertEqual(abbreviation_reference.canonical_name.module_file,
+ "happy.emb")
+ field1_array_type = struct_ir.field[1].type.array_type
+ direct_reference = field1_array_type.element_count.field_reference.path[0]
+ self.assertEqual(direct_reference.canonical_name.object_path, ["FieldRef",
+ "offset"])
+ self.assertEqual(direct_reference.canonical_name.module_file, "happy.emb")
+
+ def test_struct_parameter_resolution(self):
+ ir = self._construct_ir(_HAPPY_EMB)
+ self.assertEqual([], symbol_resolver.resolve_symbols(ir))
+ struct_ir = ir.module[0].type[6].structure
+ size_ir = struct_ir.field[0].location.size
+ self.assertTrue(size_ir.HasField("field_reference"))
+ self.assertEqual(size_ir.field_reference.path[0].canonical_name.object_path,
+ ["UsesParameter", "x"])
+
+ def test_enum_value_resolution_in_expression_in_enum_field(self):
+ ir = self._construct_ir(_HAPPY_EMB)
+ self.assertEqual([], symbol_resolver.resolve_symbols(ir))
+ enum_ir = ir.module[0].type[5].enumeration
+ value_reference = enum_ir.value[1].value.constant_reference
+ self.assertEqual(value_reference.canonical_name.object_path,
+ ["Quux", "ABC"])
+ self.assertEqual(value_reference.canonical_name.module_file, "happy.emb")
+
+ def test_symbol_resolution_in_expression_in_void_array_length(self):
+ ir = self._construct_ir(_HAPPY_EMB)
+ self.assertEqual([], symbol_resolver.resolve_symbols(ir))
+ struct_ir = ir.module[0].type[4].structure
+ array_type = struct_ir.field[0].type.array_type
+ # The symbol resolver should ignore void fields.
+ self.assertEqual("automatic", array_type.WhichOneof("size"))
+
+ def test_name_definitions_have_correct_canonical_names(self):
+ ir = self._construct_ir(_HAPPY_EMB)
+ self.assertEqual([], symbol_resolver.resolve_symbols(ir))
+ foo_name = ir.module[0].type[0].name
+ self.assertEqual(foo_name.canonical_name.object_path, ["Foo"])
+ self.assertEqual(foo_name.canonical_name.module_file, "happy.emb")
+ uint_field_name = ir.module[0].type[0].structure.field[0].name
+ self.assertEqual(uint_field_name.canonical_name.object_path, ["Foo",
+ "uint_field"])
+ self.assertEqual(uint_field_name.canonical_name.module_file, "happy.emb")
+ qux_name = ir.module[0].type[2].name
+ self.assertEqual(qux_name.canonical_name.object_path, ["Qux"])
+ self.assertEqual(qux_name.canonical_name.module_file, "happy.emb")
+
+ def test_duplicate_type_name(self):
+ ir = self._construct_ir("struct Foo:\n"
+ " 0 [+4] UInt field\n"
+ "struct Foo:\n"
+ " 0 [+4] UInt bar\n", "duplicate_type.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_symbols(ir))
+ self.assertEqual([
+ [error.error("duplicate_type.emb",
+ ir.module[0].type[1].name.source_location,
+ "Duplicate name 'Foo'"),
+ error.note("duplicate_type.emb",
+ ir.module[0].type[0].name.source_location,
+ "Original definition")]
+ ], errors)
+
+ def test_duplicate_field_name_in_struct(self):
+ ir = self._construct_ir("struct Foo:\n"
+ " 0 [+4] UInt field\n"
+ " 4 [+4] UInt field\n", "duplicate_field.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_symbols(ir))
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("duplicate_field.emb",
+ struct.field[1].name.source_location,
+ "Duplicate name 'field'"),
+ error.note("duplicate_field.emb",
+ struct.field[0].name.source_location,
+ "Original definition")
+ ]], errors)
+
+ def test_duplicate_abbreviation_in_struct(self):
+ ir = self._construct_ir("struct Foo:\n"
+ " 0 [+4] UInt field1 (f)\n"
+ " 4 [+4] UInt field2 (f)\n",
+ "duplicate_field.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_symbols(ir))
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("duplicate_field.emb",
+ struct.field[1].abbreviation.source_location,
+ "Duplicate name 'f'"),
+ error.note("duplicate_field.emb",
+ struct.field[0].abbreviation.source_location,
+ "Original definition")
+ ]], errors)
+
+ def test_abbreviation_duplicates_field_name_in_struct(self):
+ ir = self._construct_ir("struct Foo:\n"
+ " 0 [+4] UInt field\n"
+ " 4 [+4] UInt field2 (field)\n",
+ "duplicate_field.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_symbols(ir))
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("duplicate_field.emb",
+ struct.field[1].abbreviation.source_location,
+ "Duplicate name 'field'"),
+ error.note("duplicate_field.emb",
+ struct.field[0].name.source_location,
+ "Original definition")
+ ]], errors)
+
+ def test_field_name_duplicates_abbreviation_in_struct(self):
+ ir = self._construct_ir("struct Foo:\n"
+ " 0 [+4] UInt field (field2)\n"
+ " 4 [+4] UInt field2\n", "duplicate_field.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_symbols(ir))
+ struct = ir.module[0].type[0].structure
+ self.assertEqual([[
+ error.error("duplicate_field.emb",
+ struct.field[1].name.source_location,
+ "Duplicate name 'field2'"),
+ error.note("duplicate_field.emb",
+ struct.field[0].abbreviation.source_location,
+ "Original definition")
+ ]], errors)
+
+ def test_duplicate_value_name_in_enum(self):
+ ir = self._construct_ir("enum Foo:\n"
+ " BAR = 1\n"
+ " BAR = 1\n", "duplicate_enum.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_symbols(ir))
+ self.assertEqual([[
+ error.error(
+ "duplicate_enum.emb",
+ ir.module[0].type[0].enumeration.value[1].name.source_location,
+ "Duplicate name 'BAR'"),
+ error.note(
+ "duplicate_enum.emb",
+ ir.module[0].type[0].enumeration.value[0].name.source_location,
+ "Original definition")
+ ]], errors)
+
+ def test_ambiguous_name(self):
+ # struct UInt will be ambiguous with the external UInt in the prelude.
+ ir = self._construct_ir("struct UInt:\n"
+ " 0 [+4] Int:8[4] field\n"
+ "struct Foo:\n"
+ " 0 [+4] UInt bar\n", "ambiguous.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_symbols(ir))
+ # Find the UInt definition in the prelude.
+ for type_ir in ir.module[1].type:
+ if type_ir.name.name.text == "UInt":
+ prelude_uint = type_ir
+ break
+ ambiguous_type_ir = ir.module[0].type[1].structure.field[0].type.atomic_type
+ self.assertEqual([[
+ error.error("ambiguous.emb",
+ ambiguous_type_ir.reference.source_name[0].source_location,
+ "Ambiguous name 'UInt'"), error.note(
+ "", prelude_uint.name.source_location,
+ "Possible resolution"),
+ error.note("ambiguous.emb", ir.module[0].type[0].name.source_location,
+ "Possible resolution")
+ ]], errors)
+
+ def test_missing_name(self):
+ ir = self._construct_ir("struct Foo:\n"
+ " 0 [+4] Bar field\n",
+ "missing.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_symbols(ir))
+ missing_type_ir = ir.module[0].type[0].structure.field[0].type.atomic_type
+ self.assertEqual([
+ [error.error("missing.emb",
+ missing_type_ir.reference.source_name[0].source_location,
+ "No candidate for 'Bar'")]
+ ], errors)
+
+ def test_missing_leading_name(self):
+ ir = self._construct_ir("struct Foo:\n"
+ " 0 [+Num.FOUR] UInt field\n", "missing.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_symbols(ir))
+ missing_expr_ir = ir.module[0].type[0].structure.field[0].location.size
+ self.assertEqual([
+ [error.error(
+ "missing.emb",
+ missing_expr_ir.constant_reference.source_name[0].source_location,
+ "No candidate for 'Num'")]
+ ], errors)
+
+ def test_missing_trailing_name(self):
+ ir = self._construct_ir("struct Foo:\n"
+ " 0 [+Num.FOUR] UInt field\n"
+ "enum Num:\n"
+ " THREE = 3\n", "missing.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_symbols(ir))
+ missing_expr_ir = ir.module[0].type[0].structure.field[0].location.size
+ self.assertEqual([
+ [error.error(
+ "missing.emb",
+ missing_expr_ir.constant_reference.source_name[1].source_location,
+ "No candidate for 'FOUR'")]
+ ], errors)
+
+ def test_missing_middle_name(self):
+ ir = self._construct_ir("struct Foo:\n"
+ " 0 [+Num.NaN.FOUR] UInt field\n"
+ "enum Num:\n"
+ " FOUR = 4\n", "missing.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_symbols(ir))
+ missing_expr_ir = ir.module[0].type[0].structure.field[0].location.size
+ self.assertEqual([
+ [error.error(
+ "missing.emb",
+ missing_expr_ir.constant_reference.source_name[1].source_location,
+ "No candidate for 'NaN'")]
+ ], errors)
+
+ def test_inner_resolution(self):
+ ir = self._construct_ir(
+ "struct OuterStruct:\n"
+ "\n"
+ " struct InnerStruct2:\n"
+ " 0 [+1] InnerStruct.InnerEnum inner_enum\n"
+ "\n"
+ " struct InnerStruct:\n"
+ " enum InnerEnum:\n"
+ " ONE = 1\n"
+ "\n"
+ " 0 [+1] InnerEnum inner_enum\n"
+ "\n"
+ " 0 [+InnerStruct.InnerEnum.ONE] InnerStruct.InnerEnum inner_enum\n",
+ "nested.emb")
+ errors = symbol_resolver.resolve_symbols(ir)
+ self.assertFalse(errors)
+ outer_struct = ir.module[0].type[0]
+ inner_struct = outer_struct.subtype[1]
+ inner_struct_2 = outer_struct.subtype[0]
+ inner_enum = inner_struct.subtype[0]
+ self.assertEqual(["OuterStruct", "InnerStruct"],
+ list(inner_struct.name.canonical_name.object_path))
+ self.assertEqual(["OuterStruct", "InnerStruct", "InnerEnum"],
+ list(inner_enum.name.canonical_name.object_path))
+ self.assertEqual(["OuterStruct", "InnerStruct2"],
+ list(inner_struct_2.name.canonical_name.object_path))
+ outer_field = outer_struct.structure.field[0]
+ outer_field_end_ref = outer_field.location.size.constant_reference
+ self.assertEqual(
+ ["OuterStruct", "InnerStruct", "InnerEnum", "ONE"], list(
+ outer_field_end_ref.canonical_name.object_path))
+ self.assertEqual(
+ ["OuterStruct", "InnerStruct", "InnerEnum"],
+ list(outer_field.type.atomic_type.reference.canonical_name.object_path))
+ inner_field_2_type = inner_struct_2.structure.field[0].type.atomic_type
+ self.assertEqual(
+ ["OuterStruct", "InnerStruct", "InnerEnum"
+ ], list(inner_field_2_type.reference.canonical_name.object_path))
+
+ def test_resolution_against_anonymous_bits(self):
+ ir = self._construct_ir("struct Struct:\n"
+ " 0 [+1] bits:\n"
+ " 7 [+1] Flag last_packet\n"
+ " 5 [+2] enum inline_inner_enum:\n"
+ " AA = 0\n"
+ " BB = 1\n"
+ " CC = 2\n"
+ " DD = 3\n"
+ " 0 [+5] UInt header_size (h)\n"
+ " 0 [+h] UInt:8[] header_bytes\n"
+ "\n"
+ "struct Struct2:\n"
+ " 0 [+1] Struct.InlineInnerEnum value\n",
+ "anonymity.emb")
+ errors = symbol_resolver.resolve_symbols(ir)
+ self.assertFalse(errors)
+ struct1 = ir.module[0].type[0]
+ struct1_bits_field = struct1.structure.field[0]
+ struct1_bits_field_type = struct1_bits_field.type.atomic_type.reference
+ struct1_byte_field = struct1.structure.field[4]
+ inner_bits = struct1.subtype[0]
+ inner_enum = struct1.subtype[1]
+ self.assertTrue(inner_bits.HasField("structure"))
+ self.assertTrue(inner_enum.HasField("enumeration"))
+ self.assertTrue(inner_bits.name.is_anonymous)
+ self.assertFalse(inner_enum.name.is_anonymous)
+ self.assertEqual(["Struct", "InlineInnerEnum"],
+ list(inner_enum.name.canonical_name.object_path))
+ self.assertEqual(
+ ["Struct", "InlineInnerEnum", "AA"],
+ list(inner_enum.enumeration.value[0].name.canonical_name.object_path))
+ self.assertEqual(
+ list(inner_bits.name.canonical_name.object_path),
+ list(struct1_bits_field_type.canonical_name.object_path))
+ self.assertEqual(2, len(inner_bits.name.canonical_name.object_path))
+ self.assertEqual(
+ ["Struct", "header_size"],
+ list(struct1_byte_field.location.size.field_reference.path[0].
+ canonical_name.object_path))
+
+ def test_duplicate_name_in_different_inline_bits(self):
+ ir = self._construct_ir(
+ "struct Struct:\n"
+ " 0 [+1] bits:\n"
+ " 7 [+1] Flag a\n"
+ " 1 [+1] bits:\n"
+ " 0 [+1] Flag a\n", "duplicate_in_anon.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_symbols(ir))
+ supertype = ir.module[0].type[0]
+ self.assertEqual([[
+ error.error(
+ "duplicate_in_anon.emb",
+ supertype.structure.field[3].name.source_location,
+ "Duplicate name 'a'"),
+ error.note(
+ "duplicate_in_anon.emb",
+ supertype.structure.field[1].name.source_location,
+ "Original definition")
+ ]], errors)
+
+ def test_duplicate_name_in_same_inline_bits(self):
+ ir = self._construct_ir(
+ "struct Struct:\n"
+ " 0 [+1] bits:\n"
+ " 7 [+1] Flag a\n"
+ " 0 [+1] Flag a\n", "duplicate_in_anon.emb")
+ errors = symbol_resolver.resolve_symbols(ir)
+ supertype = ir.module[0].type[0]
+ self.assertEqual([[
+ error.error(
+ "duplicate_in_anon.emb",
+ supertype.structure.field[2].name.source_location,
+ "Duplicate name 'a'"),
+ error.note(
+ "duplicate_in_anon.emb",
+ supertype.structure.field[1].name.source_location,
+ "Original definition")
+ ]], error.filter_errors(errors))
+
+ def test_import_type_resolution(self):
+ importer = ('import "ed.emb" as ed\n'
+ "struct Ff:\n"
+ " 0 [+1] ed.Gg gg\n")
+ imported = ("struct Gg:\n"
+ " 0 [+1] UInt qq\n")
+ ir = self._construct_ir_multiple({"ed.emb": imported, "er.emb": importer},
+ "er.emb")
+ errors = symbol_resolver.resolve_symbols(ir)
+ self.assertEqual([], errors)
+
+ def test_duplicate_import_name(self):
+ importer = ('import "ed.emb" as ed\n'
+ 'import "ed.emb" as ed\n'
+ "struct Ff:\n"
+ " 0 [+1] ed.Gg gg\n")
+ imported = ("struct Gg:\n"
+ " 0 [+1] UInt qq\n")
+ ir = self._construct_ir_multiple({"ed.emb": imported, "er.emb": importer},
+ "er.emb")
+ errors = symbol_resolver.resolve_symbols(ir)
+ # Note: the error is on import[2] duplicating import[1] because the implicit
+ # prelude import is import[0].
+ self.assertEqual([
+ [error.error("er.emb",
+ ir.module[0].foreign_import[2].local_name.source_location,
+ "Duplicate name 'ed'"),
+ error.note("er.emb",
+ ir.module[0].foreign_import[1].local_name.source_location,
+ "Original definition")]
+ ], errors)
+
+ def test_import_enum_resolution(self):
+ importer = ('import "ed.emb" as ed\n'
+ "struct Ff:\n"
+ " if ed.Gg.GG == ed.Gg.GG:\n"
+ " 0 [+1] UInt gg\n")
+ imported = ("enum Gg:\n"
+ " GG = 0\n")
+ ir = self._construct_ir_multiple({"ed.emb": imported, "er.emb": importer},
+ "er.emb")
+ errors = symbol_resolver.resolve_symbols(ir)
+ self.assertEqual([], errors)
+
+ def test_that_double_import_names_are_syntactically_invalid(self):
+ # resolve_symbols currently has no check that rejects references to symbols
+ # imported by another module (e.g., `ed.ed2.Gg`), because such references
+ # are syntactically invalid and fail before symbol resolution. If the syntax
+ # ever allows them, this test should be fixed by adding an explicit check to
+ # resolve_symbols and checking the error message here.
+ importer = ('import "ed.emb" as ed\n'
+ "struct Ff:\n"
+ " 0 [+1] ed.ed2.Gg gg\n")
+ imported = 'import "ed2.emb" as ed2\n'
+ imported2 = ("struct Gg:\n"
+ " 0 [+1] UInt qq\n")
+ unused_ir, unused_debug_info, errors = glue.parse_emboss_file(
+ "er.emb",
+ test_util.dict_file_reader({"ed.emb": imported,
+ "ed2.emb": imported2,
+ "er.emb": importer}),
+ stop_before_step="resolve_symbols")
+ assert errors
+
+ def test_no_error_when_inline_name_aliases_outer_name(self):
+ # The inline enum's complete type should be Foo.Foo. During parsing, the
+ # name is set to just "Foo", but symbol resolution should a) select the
+ # correct Foo, and b) not complain that multiple Foos could match.
+ ir = self._construct_ir(
+ "struct Foo:\n"
+ " 0 [+1] enum foo:\n"
+ " BAR = 0\n")
+ errors = symbol_resolver.resolve_symbols(ir)
+ self.assertEqual([], errors)
+ field = ir.module[0].type[0].structure.field[0]
+ self.assertEqual(
+ ["Foo", "Foo"],
+ list(field.type.atomic_type.reference.canonical_name.object_path))
+
+ def test_no_error_when_inline_name_in_anonymous_bits_aliases_outer_name(self):
+ # There is an extra layer of complexity when an inline type appears inside
+ # of an inline bits.
+ ir = self._construct_ir(
+ "struct Foo:\n"
+ " 0 [+1] bits:\n"
+ " 0 [+4] enum foo:\n"
+ " BAR = 0\n")
+ errors = symbol_resolver.resolve_symbols(ir)
+ self.assertEqual([], error.filter_errors(errors))
+ field = ir.module[0].type[0].subtype[0].structure.field[0]
+ self.assertEqual(
+ ["Foo", "Foo"],
+ list(field.type.atomic_type.reference.canonical_name.object_path))
+
+
+class ResolveFieldReferencesTest(unittest.TestCase):
+ """Tests for symbol_resolver.resolve_field_references()."""
+
+ def _construct_ir_multiple(self, file_dict, primary_emb_name):
+ ir, unused_debug_info, errors = glue.parse_emboss_file(
+ primary_emb_name,
+ test_util.dict_file_reader(file_dict),
+ stop_before_step="resolve_field_references")
+ assert not errors
+ return ir
+
+ def _construct_ir(self, emb_text, name="happy.emb"):
+ return self._construct_ir_multiple({name: emb_text}, name)
+
+ def test_subfield_resolution(self):
+ ir = self._construct_ir(
+ "struct Ff:\n"
+ " 0 [+1] Gg gg\n"
+ " 1 [+gg.qq] UInt:8[] data\n"
+ "struct Gg:\n"
+ " 0 [+1] UInt qq\n", "subfield.emb")
+ errors = symbol_resolver.resolve_field_references(ir)
+ self.assertFalse(errors)
+ ff = ir.module[0].type[0]
+ location_end_path = ff.structure.field[1].location.size.field_reference.path
+ self.assertEqual(["Ff", "gg"],
+ list(location_end_path[0].canonical_name.object_path))
+ self.assertEqual(["Gg", "qq"],
+ list(location_end_path[1].canonical_name.object_path))
+
+ def test_aliased_subfield_resolution(self):
+ ir = self._construct_ir(
+ "struct Ff:\n"
+ " 0 [+1] Gg real_gg\n"
+ " 1 [+gg.qq] UInt:8[] data\n"
+ " let gg = real_gg\n"
+ "struct Gg:\n"
+ " 0 [+1] UInt real_qq\n"
+ " let qq = real_qq", "subfield.emb")
+ errors = symbol_resolver.resolve_field_references(ir)
+ self.assertFalse(errors)
+ ff = ir.module[0].type[0]
+ location_end_path = ff.structure.field[1].location.size.field_reference.path
+ self.assertEqual(["Ff", "gg"],
+ list(location_end_path[0].canonical_name.object_path))
+ self.assertEqual(["Gg", "qq"],
+ list(location_end_path[1].canonical_name.object_path))
+
+ def test_aliased_aliased_subfield_resolution(self):
+ ir = self._construct_ir(
+ "struct Ff:\n"
+ " 0 [+1] Gg really_real_gg\n"
+ " 1 [+gg.qq] UInt:8[] data\n"
+ " let gg = real_gg\n"
+ " let real_gg = really_real_gg\n"
+ "struct Gg:\n"
+ " 0 [+1] UInt qq\n", "subfield.emb")
+ errors = symbol_resolver.resolve_field_references(ir)
+ self.assertFalse(errors)
+ ff = ir.module[0].type[0]
+ location_end_path = ff.structure.field[1].location.size.field_reference.path
+ self.assertEqual(["Ff", "gg"],
+ list(location_end_path[0].canonical_name.object_path))
+ self.assertEqual(["Gg", "qq"],
+ list(location_end_path[1].canonical_name.object_path))
+
+ def test_subfield_resolution_fails(self):
+ ir = self._construct_ir(
+ "struct Ff:\n"
+ " 0 [+1] Gg gg\n"
+ " 1 [+gg.rr] UInt:8[] data\n"
+ "struct Gg:\n"
+ " 0 [+1] UInt qq\n", "subfield.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_field_references(ir))
+ self.assertEqual([
+ [error.error("subfield.emb", ir.module[0].type[0].structure.field[
+ 1].location.size.field_reference.path[1].source_name[
+ 0].source_location, "No candidate for 'rr'")]
+ ], errors)
+
+ def test_subfield_resolution_failure_shortcuts_further_resolution(self):
+ ir = self._construct_ir(
+ "struct Ff:\n"
+ " 0 [+1] Gg gg\n"
+ " 1 [+gg.rr.qq] UInt:8[] data\n"
+ "struct Gg:\n"
+ " 0 [+1] UInt qq\n", "subfield.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_field_references(ir))
+ self.assertEqual([
+ [error.error("subfield.emb", ir.module[0].type[0].structure.field[
+ 1].location.size.field_reference.path[1].source_name[
+ 0].source_location, "No candidate for 'rr'")]
+ ], errors)
+
+ def test_subfield_resolution_failure_with_aliased_name(self):
+ ir = self._construct_ir(
+ "struct Ff:\n"
+ " 0 [+1] Gg gg\n"
+ " 1 [+gg.gg] UInt:8[] data\n"
+ "struct Gg:\n"
+ " 0 [+1] UInt qq\n", "subfield.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_field_references(ir))
+ self.assertEqual([
+ [error.error("subfield.emb", ir.module[0].type[0].structure.field[
+ 1].location.size.field_reference.path[1].source_name[
+ 0].source_location, "No candidate for 'gg'")]
+ ], errors)
+
+ def test_subfield_resolution_failure_with_array(self):
+ ir = self._construct_ir(
+ "struct Ff:\n"
+ " 0 [+1] Gg[1] gg\n"
+ " 1 [+gg.qq] UInt:8[] data\n"
+ "struct Gg:\n"
+ " 0 [+1] UInt qq\n", "subfield.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_field_references(ir))
+ self.assertEqual([
+ [error.error("subfield.emb", ir.module[0].type[0].structure.field[
+ 1].location.size.field_reference.path[0].source_name[
+ 0].source_location, "Cannot access member of array 'gg'")]
+ ], errors)
+
+ def test_subfield_resolution_failure_with_int(self):
+ ir = self._construct_ir(
+ "struct Ff:\n"
+ " 0 [+1] UInt gg_source\n"
+ " 1 [+gg.qq] UInt:8[] data\n"
+ " let gg = gg_source + 1\n",
+ "subfield.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_field_references(ir))
+ error_field = ir.module[0].type[0].structure.field[1]
+ error_reference = error_field.location.size.field_reference
+ error_location = error_reference.path[0].source_name[0].source_location
+ self.assertEqual([
+ [error.error("subfield.emb", error_location,
+ "Cannot access member of noncomposite field 'gg'")]
+ ], errors)
+
+ def test_subfield_resolution_failure_with_int_no_cascade(self):
+ ir = self._construct_ir(
+ "struct Ff:\n"
+ " 0 [+1] UInt gg_source\n"
+ " 1 [+qqx] UInt:8[] data\n"
+ " let gg = gg_source + 1\n"
+ " let yy = gg.no_field\n"
+ " let qqx = yy.x\n"
+ " let qqy = yy.y\n",
+ "subfield.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_field_references(ir))
+ error_field = ir.module[0].type[0].structure.field[3]
+ error_reference = error_field.read_transform.field_reference
+ error_location = error_reference.path[0].source_name[0].source_location
+ self.assertEqual([
+ [error.error("subfield.emb", error_location,
+ "Cannot access member of noncomposite field 'gg'")]
+ ], errors)
+
+ def test_subfield_resolution_failure_with_abbreviation(self):
+ ir = self._construct_ir(
+ "struct Ff:\n"
+ " 0 [+1] Gg gg\n"
+ " 1 [+gg.q] UInt:8[] data\n"
+ "struct Gg:\n"
+ " 0 [+1] UInt qq (q)\n", "subfield.emb")
+ errors = error.filter_errors(symbol_resolver.resolve_field_references(ir))
+ self.assertEqual([
+ # TODO(bolms): Make the error message clearer, in this case.
+ [error.error("subfield.emb", ir.module[0].type[0].structure.field[
+ 1].location.size.field_reference.path[1].source_name[
+ 0].source_location, "No candidate for 'q'")]
+ ], errors)
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/front_end/synthetics.py b/front_end/synthetics.py
new file mode 100644
index 0000000..b47f1c6
--- /dev/null
+++ b/front_end/synthetics.py
@@ -0,0 +1,235 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Adds auto-generated virtual fields to the IR."""
+
+from front_end import attributes
+from public import ir_pb2
+from util import expression_parser
+from util import ir_util
+from util import traverse_ir
+
+
+def _mark_as_synthetic(proto):
+ """Marks all source_locations in proto with is_synthetic=True."""
+ if not isinstance(proto, ir_pb2.Message):
+ return
+ if hasattr(proto, "source_location"):
+ proto.source_location.is_synthetic = True
+ for name, value in proto.raw_fields.items():
+ if name != "source_location":
+ if isinstance(value, ir_pb2.TypedScopedList):
+ for i in range(len(value)):
+ _mark_as_synthetic(value[i])
+ else:
+ _mark_as_synthetic(value)
+
+
+def _skip_text_output_attribute():
+ """Returns the IR for a [text_output: "Skip"] attribute."""
+ result = ir_pb2.Attribute(
+ name=ir_pb2.Word(text=attributes.TEXT_OUTPUT),
+ value=ir_pb2.AttributeValue(string_constant=ir_pb2.String(text="Skip")))
+ _mark_as_synthetic(result)
+ return result
+
+
+# The existence condition for an alias for an anonymous bits' field is the
+# conjunction of the existence condition for the anonymous bits and the
+# existence condition for the field within. The 'x' and 'x.y' are placeholders
+# here; they'll be overwritten in _add_anonymous_aliases.
+_ANONYMOUS_BITS_ALIAS_EXISTENCE_SKELETON = expression_parser.parse(
+ "$present(x) && $present(x.y)")
+
+
+def _add_anonymous_aliases(structure, type_definition):
+ """Adds synthetic alias fields for all fields in anonymous fields.
+
+ This essentially completes the rewrite of this:
+
+ struct Foo:
+ 0 [+4] bits:
+ 0 [+1] Flag low
+ 31 [+1] Flag high
+
+ Into this:
+
+ struct Foo:
+ bits EmbossReservedAnonymous0:
+ [text_output: "Skip"]
+ 0 [+1] Flag low
+ 31 [+1] Flag high
+ 0 [+4] EmbossReservedAnonymous0 emboss_reserved_anonymous_1
+ let low = emboss_reserved_anonymous_1.low
+ let high = emboss_reserved_anonymous_1.high
+
+ Note that this pass runs very, very early -- even before symbols have been
+ resolved -- so very little in ir_util will work at this point.
+
+ Arguments:
+ structure: The ir_pb2.Structure on which to synthesize fields.
+ type_definition: The ir_pb2.TypeDefinition containing structure.
+
+ Returns:
+ None
+ """
+ new_fields = []
+ for field in structure.field:
+ new_fields.append(field)
+ if not field.name.is_anonymous:
+ continue
+ field.attribute.extend([_skip_text_output_attribute()])
+ for subtype in type_definition.subtype:
+ if (subtype.name.name.text ==
+ field.type.atomic_type.reference.source_name[-1].text):
+ field_type = subtype
+ break
+ else:
+ assert False, "Unable to find corresponding type {} for anonymous field in {}.".format(field.type.atomic_type.reference, type_definition)
+ anonymous_reference = ir_pb2.Reference(source_name=[field.name.name])
+ anonymous_field_reference = ir_pb2.FieldReference(
+ path=[anonymous_reference])
+ for subfield in field_type.structure.field:
+ alias_field_reference = ir_pb2.FieldReference(
+ path=[
+ anonymous_reference,
+ ir_pb2.Reference(source_name=[subfield.name.name]),
+ ]
+ )
+ new_existence_condition = ir_pb2.Expression()
+ new_existence_condition.CopyFrom(_ANONYMOUS_BITS_ALIAS_EXISTENCE_SKELETON)
+ existence_clauses = new_existence_condition.function.args
+ existence_clauses[0].function.args[0].field_reference.CopyFrom(
+ anonymous_field_reference)
+ existence_clauses[1].function.args[0].field_reference.CopyFrom(
+ alias_field_reference)
+ new_read_transform = ir_pb2.Expression(
+ field_reference=alias_field_reference)
+ # This treats *most* of the alias field as synthetic, but not its name(s):
+ # leaving the name(s) as "real" means that symbol collisions with the
+ # surrounding structure will be properly reported to the user.
+ _mark_as_synthetic(new_existence_condition)
+ _mark_as_synthetic(new_read_transform)
+ new_alias = ir_pb2.Field(
+ read_transform=new_read_transform,
+ existence_condition=new_existence_condition,
+ name=subfield.name)
+ if subfield.HasField("abbreviation"):
+ new_alias.abbreviation.CopyFrom(subfield.abbreviation)
+ _mark_as_synthetic(new_alias.existence_condition)
+ _mark_as_synthetic(new_alias.read_transform)
+ new_fields.append(new_alias)
+ # Since the alias field's name(s) are "real," it is important to mark the
+ # original field's name(s) as synthetic, to avoid duplicate error
+ # messages.
+ _mark_as_synthetic(subfield.name)
+ if subfield.HasField("abbreviation"):
+ _mark_as_synthetic(subfield.abbreviation)
+ del structure.field[:]
+ structure.field.extend(new_fields)
+
+
+_SIZE_BOUNDS = {
+ "$max_size_in_bits": expression_parser.parse("$upper_bound($size_in_bits)"),
+ "$min_size_in_bits": expression_parser.parse("$lower_bound($size_in_bits)"),
+ "$max_size_in_bytes": expression_parser.parse(
+ "$upper_bound($size_in_bytes)"),
+ "$min_size_in_bytes": expression_parser.parse(
+ "$lower_bound($size_in_bytes)"),
+}
+
+
+def _add_size_bound_virtuals(structure, type_definition):
+ """Adds ${min,max}_size_in_{bits,bytes} virtual fields to structure."""
+ names = {
+ ir_pb2.TypeDefinition.BIT: ("$max_size_in_bits", "$min_size_in_bits"),
+ ir_pb2.TypeDefinition.BYTE: ("$max_size_in_bytes", "$min_size_in_bytes"),
+ }
+ for name in names[type_definition.addressable_unit]:
+ bound_field = ir_pb2.Field(
+ read_transform=_SIZE_BOUNDS[name],
+ name=ir_pb2.NameDefinition(name=ir_pb2.Word(text=name)),
+ existence_condition=expression_parser.parse("true"),
+ attribute=[_skip_text_output_attribute()]
+ )
+ _mark_as_synthetic(bound_field.read_transform)
+ structure.field.extend([bound_field])
+
+
+# Each non-virtual field in a structure generates a clause that is passed to
+# `$max()` in the definition of `$size_in_bits`/`$size_in_bytes`. Additionally,
+# the `$max()` call is seeded with a `0` argument: this ensures that
+# `$size_in_units` is never negative, and ensures that structures with no
+# physical fields don't end up with a zero-argument `$max()` call, which would
+# fail type checking.
+_SIZE_CLAUSE_SKELETON = expression_parser.parse(
+ "existence_condition ? start + size : 0")
+_SIZE_SKELETON = expression_parser.parse("$max(0)")
+
+
+def _add_size_virtuals(structure, type_definition):
+ """Adds a $size_in_bits or $size_in_bytes virtual field to structure."""
+ names = {
+ ir_pb2.TypeDefinition.BIT: "$size_in_bits",
+ ir_pb2.TypeDefinition.BYTE: "$size_in_bytes",
+ }
+ size_field_name = names[type_definition.addressable_unit]
+ size_clauses = []
+ for field in structure.field:
+ # Virtual fields do not have a physical location, and thus do not contribute
+ # to the size of the structure.
+ if ir_util.field_is_virtual(field):
+ continue
+ size_clause = ir_pb2.Expression()
+ size_clause.CopyFrom(_SIZE_CLAUSE_SKELETON)
+ # Copy the appropriate clauses into `existence_condition ? start + size : 0`
+ size_clause.function.args[0].CopyFrom(field.existence_condition)
+ size_clause.function.args[1].function.args[0].CopyFrom(field.location.start)
+ size_clause.function.args[1].function.args[1].CopyFrom(field.location.size)
+ size_clauses.append(size_clause)
+ size_expression = ir_pb2.Expression()
+ size_expression.CopyFrom(_SIZE_SKELETON)
+ size_expression.function.args.extend(size_clauses)
+ _mark_as_synthetic(size_expression)
+ size_field = ir_pb2.Field(
+ read_transform=size_expression,
+ name=ir_pb2.NameDefinition(name=ir_pb2.Word(text=size_field_name)),
+ existence_condition=ir_pb2.Expression(
+ boolean_constant=ir_pb2.BooleanConstant(value=True)
+ ),
+ attribute=[_skip_text_output_attribute()]
+ )
+ structure.field.extend([size_field])
+
+
+def _add_virtuals_to_structure(structure, type_definition):
+ _add_anonymous_aliases(structure, type_definition)
+ _add_size_virtuals(structure, type_definition)
+ _add_size_bound_virtuals(structure, type_definition)
+
+
+def synthesize_fields(ir):
+ """Adds synthetic fields to all structures.
+
+ Adds aliases for fields in anonymous `bits`, and size-related virtual fields.
+
+ Arguments:
+ ir: The IR to which to add fields.
+
+ Returns:
+ A list of errors, or an empty list.
+ """
+ traverse_ir.fast_traverse_ir_top_down(
+ ir, [ir_pb2.Structure], _add_virtuals_to_structure)
+ return []
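The size computation that `_add_size_virtuals` builds as an expression tree can be sketched with plain Python values. This is only an illustration of the `$max(0, existence_condition ? start + size : 0, ...)` shape described in the comments above; the `(exists, start, size)` tuple layout and the name `structure_size` are assumptions made for this example and are not part of the Emboss IR.

```python
def structure_size(fields):
    """Evaluates $max(0, exists ? start + size : 0, ...) over physical fields.

    `fields` is a list of (exists, start, size) tuples.  This layout is a
    hypothetical stand-in for the real IR, used only for illustration.
    """
    # One clause per physical field: a field that does not exist contributes 0.
    clauses = [start + size if exists else 0 for exists, start, size in fields]
    # Seeding with 0 keeps the result non-negative and makes the call valid
    # even for a structure with no physical fields.
    return max([0] + clauses)


if __name__ == "__main__":
    # Roughly: "1 [+l] UInt:8[] bytes" and "0 [+1] UInt length (l)" with l == 3.
    print(structure_size([(True, 1, 3), (True, 0, 1)]))
    # A structure with no physical fields.
    print(structure_size([]))
```

Seeding the list with `0` mirrors the `$max(0)` skeleton: without it, an empty structure would produce a zero-argument maximum.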
diff --git a/front_end/synthetics_test.py b/front_end/synthetics_test.py
new file mode 100644
index 0000000..10a23d3
--- /dev/null
+++ b/front_end/synthetics_test.py
@@ -0,0 +1,228 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for front_end.synthetics."""
+
+import unittest
+from front_end import glue
+from front_end import synthetics
+from front_end import test_util
+from public import ir_pb2
+
+
+class SyntheticsTest(unittest.TestCase):
+
+ def _find_attribute(self, field, name):
+ result = None
+ for attribute in field.attribute:
+ if attribute.name.text == name:
+ self.assertIsNone(result)
+ result = attribute
+ self.assertIsNotNone(result)
+ return result
+
+ def _make_ir(self, emb_text):
+ ir, unused_debug_info, errors = glue.parse_emboss_file(
+ "m.emb",
+ test_util.dict_file_reader({"m.emb": emb_text}),
+ stop_before_step="synthesize_fields")
+ assert not errors, errors
+ return ir
+
+ def test_nothing_to_do(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " 1 [+1] UInt:8[] y\n")
+ self.assertEqual([], synthetics.synthesize_fields(ir))
+
+ def test_adds_anonymous_bits_fields(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] bits:\n"
+ " 0 [+4] Bar bar\n"
+ " 4 [+4] UInt uint\n"
+ " 1 [+1] bits:\n"
+ " 0 [+4] Bits nested_bits\n"
+ "enum Bar:\n"
+ " BAR = 0\n"
+ "bits Bits:\n"
+ " 0 [+4] UInt uint\n")
+ self.assertEqual([], synthetics.synthesize_fields(ir))
+ structure = ir.module[0].type[0].structure
+ # The first field should be the anonymous bits structure.
+ self.assertTrue(structure.field[0].HasField("location"))
+ # Then the aliases generated for those structures.
+ self.assertEqual("bar", structure.field[1].name.name.text)
+ self.assertEqual("uint", structure.field[2].name.name.text)
+ # Then the second anonymous bits.
+ self.assertTrue(structure.field[3].HasField("location"))
+ # Then the alias from the second anonymous bits.
+ self.assertEqual("nested_bits", structure.field[4].name.name.text)
+
+ def test_adds_correct_existence_condition(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] bits:\n"
+ " 0 [+4] UInt bar\n")
+ self.assertEqual([], synthetics.synthesize_fields(ir))
+ bits_field = ir.module[0].type[0].structure.field[0]
+ alias_field = ir.module[0].type[0].structure.field[1]
+ self.assertEqual("bar", alias_field.name.name.text)
+ self.assertEqual(bits_field.name.name.text,
+ alias_field.existence_condition.function.args[0].function.
+ args[0].field_reference.path[0].source_name[-1].text)
+ self.assertEqual(bits_field.name.name.text,
+ alias_field.existence_condition.function.args[1].function.
+ args[0].field_reference.path[0].source_name[-1].text)
+ self.assertEqual("bar",
+ alias_field.existence_condition.function.args[1].function.
+ args[0].field_reference.path[1].source_name[-1].text)
+ self.assertEqual(
+ ir_pb2.Function.PRESENCE,
+ alias_field.existence_condition.function.args[0].function.function)
+ self.assertEqual(
+ ir_pb2.Function.PRESENCE,
+ alias_field.existence_condition.function.args[1].function.function)
+ self.assertEqual(ir_pb2.Function.AND,
+ alias_field.existence_condition.function.function)
+
+ def test_adds_correct_read_transform(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] bits:\n"
+ " 0 [+4] UInt bar\n")
+ self.assertEqual([], synthetics.synthesize_fields(ir))
+ bits_field = ir.module[0].type[0].structure.field[0]
+ alias_field = ir.module[0].type[0].structure.field[1]
+ self.assertEqual("bar", alias_field.name.name.text)
+ self.assertEqual(
+ bits_field.name.name.text,
+ alias_field.read_transform.field_reference.path[0].source_name[-1].text)
+ self.assertEqual(
+ "bar",
+ alias_field.read_transform.field_reference.path[1].source_name[-1].text)
+
+ def test_adds_correct_abbreviation(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] bits:\n"
+ " 0 [+4] UInt bar\n"
+ " 4 [+4] UInt baz (qux)\n")
+ self.assertEqual([], synthetics.synthesize_fields(ir))
+ bar_alias = ir.module[0].type[0].structure.field[1]
+ baz_alias = ir.module[0].type[0].structure.field[2]
+ self.assertFalse(bar_alias.HasField("abbreviation"))
+ self.assertEqual("qux", baz_alias.abbreviation.text)
+
+ def test_anonymous_bits_sets_correct_is_synthetic(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] bits:\n"
+ " 0 [+4] UInt bar (b)\n")
+ self.assertEqual([], synthetics.synthesize_fields(ir))
+ bits_field = ir.module[0].type[0].subtype[0].structure.field[0]
+ alias_field = ir.module[0].type[0].structure.field[1]
+ self.assertFalse(alias_field.name.source_location.is_synthetic)
+ self.assertTrue(alias_field.HasField("abbreviation"))
+ self.assertFalse(alias_field.abbreviation.source_location.is_synthetic)
+ self.assertTrue(alias_field.HasField("read_transform"))
+ read_alias = alias_field.read_transform
+ self.assertTrue(read_alias.source_location.is_synthetic)
+ self.assertTrue(
+ read_alias.field_reference.path[0].source_location.is_synthetic)
+ alias_condition = alias_field.existence_condition
+ self.assertTrue(alias_condition.source_location.is_synthetic)
+ self.assertTrue(
+ alias_condition.function.args[0].source_location.is_synthetic)
+ self.assertTrue(bits_field.name.source_location.is_synthetic)
+ self.assertTrue(bits_field.name.name.source_location.is_synthetic)
+ self.assertTrue(bits_field.abbreviation.source_location.is_synthetic)
+
+ def test_adds_text_output_skip_attribute_to_anonymous_bits(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] bits:\n"
+ " 0 [+4] UInt bar (b)\n")
+ self.assertEqual([], synthetics.synthesize_fields(ir))
+ bits_field = ir.module[0].type[0].structure.field[0]
+ text_output_attribute = self._find_attribute(bits_field, "text_output")
+ self.assertEqual("Skip", text_output_attribute.value.string_constant.text)
+
+ def test_skip_attribute_is_marked_as_synthetic(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] bits:\n"
+ " 0 [+4] UInt bar\n")
+ self.assertEqual([], synthetics.synthesize_fields(ir))
+ bits_field = ir.module[0].type[0].structure.field[0]
+ attribute = self._find_attribute(bits_field, "text_output")
+ self.assertTrue(attribute.source_location.is_synthetic)
+ self.assertTrue(attribute.name.source_location.is_synthetic)
+ self.assertTrue(attribute.value.source_location.is_synthetic)
+ self.assertTrue(
+ attribute.value.string_constant.source_location.is_synthetic)
+
+ def test_adds_size_in_bytes(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 1 [+l] UInt:8[] bytes\n"
+ " 0 [+1] UInt length (l)\n")
+ self.assertEqual([], synthetics.synthesize_fields(ir))
+ structure = ir.module[0].type[0].structure
+ size_in_bytes_field = structure.field[2]
+ max_size_in_bytes_field = structure.field[3]
+ min_size_in_bytes_field = structure.field[4]
+ self.assertEqual("$size_in_bytes", size_in_bytes_field.name.name.text)
+ self.assertEqual(ir_pb2.Function.MAXIMUM,
+ size_in_bytes_field.read_transform.function.function)
+ self.assertEqual("$max_size_in_bytes",
+ max_size_in_bytes_field.name.name.text)
+ self.assertEqual(ir_pb2.Function.UPPER_BOUND,
+ max_size_in_bytes_field.read_transform.function.function)
+ self.assertEqual("$min_size_in_bytes",
+ min_size_in_bytes_field.name.name.text)
+ self.assertEqual(ir_pb2.Function.LOWER_BOUND,
+ min_size_in_bytes_field.read_transform.function.function)
+ # The correctness of $size_in_bytes et al. is tested much further
+ # downstream, in tests of the generated C++ code.
+
+ def test_adds_size_in_bits(self):
+ ir = self._make_ir("bits Foo:\n"
+ " 1 [+9] UInt hi\n"
+ " 0 [+1] Flag lo\n")
+ self.assertEqual([], synthetics.synthesize_fields(ir))
+ structure = ir.module[0].type[0].structure
+ size_in_bits_field = structure.field[2]
+ max_size_in_bits_field = structure.field[3]
+ min_size_in_bits_field = structure.field[4]
+ self.assertEqual("$size_in_bits", size_in_bits_field.name.name.text)
+ self.assertEqual(ir_pb2.Function.MAXIMUM,
+ size_in_bits_field.read_transform.function.function)
+ self.assertEqual("$max_size_in_bits",
+ max_size_in_bits_field.name.name.text)
+ self.assertEqual(ir_pb2.Function.UPPER_BOUND,
+ max_size_in_bits_field.read_transform.function.function)
+ self.assertEqual("$min_size_in_bits",
+ min_size_in_bits_field.name.name.text)
+ self.assertEqual(ir_pb2.Function.LOWER_BOUND,
+ min_size_in_bits_field.read_transform.function.function)
+ # The correctness of $size_in_bits et al. is tested much further
+ # downstream, in tests of the generated C++ code.
+
+ def test_adds_text_output_skip_attribute_to_size_in_bytes(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 1 [+l] UInt:8[] bytes\n"
+ " 0 [+1] UInt length (l)\n")
+ self.assertEqual([], synthetics.synthesize_fields(ir))
+ size_in_bytes_field = ir.module[0].type[0].structure.field[2]
+ self.assertEqual("$size_in_bytes", size_in_bytes_field.name.name.text)
+ text_output_attribute = self._find_attribute(size_in_bytes_field,
+ "text_output")
+ self.assertEqual("Skip", text_output_attribute.value.string_constant.text)
+
+
+if __name__ == "__main__":
+ unittest.main()
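The field ordering and existence conditions checked by the tests above can be sketched on plain dicts. This is a simplified illustration, not the real pass: the dict keys (`name`, `type`, `is_anonymous`), the string-valued expressions, and the function name `add_anonymous_aliases` are all assumptions made for this example.

```python
def add_anonymous_aliases(fields, subtypes):
    """Returns a new field list with aliases placed after each anonymous field.

    `fields` is a list of dicts and `subtypes` maps a type name to the names of
    its subfields.  Both shapes are hypothetical stand-ins for the IR.
    """
    new_fields = []
    for field in fields:
        new_fields.append(field)
        if not field.get("is_anonymous"):
            continue
        outer = field["name"]
        for inner in subtypes[field["type"]]:
            new_fields.append({
                "name": inner,
                # The alias reads through the anonymous field.
                "read_transform": "{}.{}".format(outer, inner),
                # The alias exists only if both the anonymous field and the
                # subfield within it exist.
                "existence_condition":
                    "$present({0}) && $present({0}.{1})".format(outer, inner),
            })
    return new_fields
```

Each alias lands immediately after the anonymous field it came from, which is the ordering `test_adds_anonymous_bits_fields` asserts.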
diff --git a/front_end/test_util.py b/front_end/test_util.py
new file mode 100644
index 0000000..1bbdd35
--- /dev/null
+++ b/front_end/test_util.py
@@ -0,0 +1,105 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Utilities for test code."""
+
+from public import ir_pb2
+
+
+def proto_is_superset(proto, expected_values, path=""):
+ """Returns true if every value in expected_values is set in proto.
+
+ This is intended to be used in assertTrue in a unit test, like so:
+
+ self.assertTrue(*proto_is_superset(proto, expected))
+
+ Arguments:
+ proto: The proto to check.
+ expected_values: The reference proto.
+ path: The path to the elements being compared. Clients can generally leave
+ this at default.
+
+ Returns:
+ A tuple; the first element is True if the fields set in proto are a superset
+ (not necessarily strict) of the fields set in expected_values. The second
+ element is an informational string specifying the path of a value found in
+ expected_values but not in proto, or the empty string on success.
+
+ Every atomic field that is set in expected_values must be set to the same
+ value in proto; every message field set in expected_values must have a
+ matching field in proto, such that proto_is_superset(proto.field,
+ expected_values.field) is true.
+
+ For repeated fields in expected_values, each element in the expected_values
+ proto must have a corresponding element at the same index in proto; proto
+ may have additional elements.
+ """
+ if path:
+ path += "."
+ for name, expected_value in expected_values.raw_fields.items():
+ field_path = "{}{}".format(path, name)
+ value = getattr(proto, name)
+ if issubclass(proto.field_specs[name].type, ir_pb2.Message):
+ if isinstance(proto.field_specs[name], ir_pb2.Repeated):
+ if len(expected_value) > len(value):
+ return False, "{}[{}] missing".format(field_path,
+ len(getattr(proto, name)))
+ for i in range(len(expected_value)):
+ result = proto_is_superset(value[i], expected_value[i],
+ "{}[{}]".format(field_path, i))
+ if not result[0]:
+ return result
+ else:
+ if (expected_values.HasField(name) and
+ not proto.HasField(name)):
+ return False, "{} missing".format(field_path)
+ result = proto_is_superset(value, expected_value, field_path)
+ if not result[0]:
+ return result
+ else:
+ # Zero-length repeated fields and not-there repeated fields are "the
+ # same."
+ if (expected_value != value and
+ (isinstance(proto.field_specs[name], ir_pb2.Optional) or
+ len(expected_value))):
+ if isinstance(proto.field_specs[name], ir_pb2.Repeated):
+ return False, "{} differs: found {}, expected {}".format(
+ field_path, list(value), list(expected_value))
+ else:
+ return False, "{} differs: found {}, expected {}".format(
+ field_path, value, expected_value)
+ return True, ""
+
+
+def dict_file_reader(file_dict):
+ """Returns a callable that retrieves entries from file_dict as files.
+
+ This can be used to call glue.parse_emboss_file with file text declared
+ inline.
+
+ Arguments:
+ file_dict: A dictionary from "file names" to "contents."
+
+ Returns:
+ A callable that can be passed to glue.parse_emboss_file in place of the
+ "read" builtin.
+ """
+
+ def read(file_name):
+ try:
+ return file_dict[file_name], None
+ except KeyError:
+ return None, ["File '{}' not found.".format(file_name)]
+
+ return read
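The comparison rules in the `proto_is_superset` docstring can be sketched on nested dicts, which avoids any dependency on `ir_pb2`. This is an approximation under stated assumptions: dicts stand in for messages, lists of dicts for repeated message fields, and lists of scalars for repeated atomic fields. The name `is_superset` is hypothetical.

```python
def is_superset(actual, expected, path=""):
    """Returns (ok, message), following proto_is_superset's rules on dicts."""
    prefix = path + "." if path else ""
    for name, want in expected.items():
        field_path = prefix + name
        if name not in actual:
            # An empty expected list is treated the same as an absent field.
            if want == []:
                continue
            return False, "{} missing".format(field_path)
        got = actual[name]
        if isinstance(want, dict):
            # Nested "message": recurse.
            ok, message = is_superset(got, want, field_path)
            if not ok:
                return ok, message
        elif isinstance(want, list) and want and isinstance(want[0], dict):
            # Repeated "message": match by index; extra actual elements are OK.
            if len(want) > len(got):
                return False, "{}[{}] missing".format(field_path, len(got))
            for i, (got_item, want_item) in enumerate(zip(got, want)):
                ok, message = is_superset(
                    got_item, want_item, "{}[{}]".format(field_path, i))
                if not ok:
                    return ok, message
        elif want != got and want != []:
            # Atomic values, and lists of atomic values, compare as a unit.
            return False, "{} differs: found {}, expected {}".format(
                field_path, got, want)
    return True, ""
```

As in the original, a list of scalars is compared as a single unit, so an extra element in `actual` counts as a difference, while an extra element in a list of nested dicts does not.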
diff --git a/front_end/test_util_test.py b/front_end/test_util_test.py
new file mode 100644
index 0000000..89c7c9f
--- /dev/null
+++ b/front_end/test_util_test.py
@@ -0,0 +1,170 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for front_end.test_util."""
+
+import unittest
+
+from front_end import test_util
+from public import ir_pb2
+from util import parser_types
+
+
+class ProtoIsSupersetTest(unittest.TestCase):
+ """Tests for test_util.proto_is_superset."""
+
+ def test_superset_extra_optional_field(self):
+ self.assertEqual(
+ (True, ""),
+ test_util.proto_is_superset(
+ ir_pb2.Structure(
+ field=[ir_pb2.Field()],
+ source_location=parser_types.parse_location("1:2-3:4")),
+ ir_pb2.Structure(field=[ir_pb2.Field()])))
+
+ def test_superset_extra_repeated_field(self):
+ self.assertEqual(
+ (True, ""),
+ test_util.proto_is_superset(
+ ir_pb2.Structure(
+ field=[ir_pb2.Field(), ir_pb2.Field()],
+ source_location=parser_types.parse_location("1:2-3:4")),
+ ir_pb2.Structure(field=[ir_pb2.Field()])))
+
+ def test_superset_missing_empty_repeated_field(self):
+ self.assertEqual(
+ (False, "field[0] missing"),
+ test_util.proto_is_superset(
+ ir_pb2.Structure(
+ field=[],
+ source_location=parser_types.parse_location("1:2-3:4")),
+ ir_pb2.Structure(field=[ir_pb2.Field(), ir_pb2.Field()])))
+
+ def test_superset_missing_empty_optional_field(self):
+ self.assertEqual((False, "source_location missing"),
+ test_util.proto_is_superset(
+ ir_pb2.Structure(field=[]),
+ ir_pb2.Structure(source_location=ir_pb2.Location())))
+
+ def test_array_element_differs(self):
+ self.assertEqual(
+ (False,
+ "field[0].source_location.start.line differs: found 1, expected 2"),
+ test_util.proto_is_superset(
+ ir_pb2.Structure(
+ field=[ir_pb2.Field(source_location=parser_types.parse_location(
+ "1:2-3:4"))]),
+ ir_pb2.Structure(
+ field=[ir_pb2.Field(source_location=parser_types.parse_location(
+ "2:2-3:4"))])))
+
+ def test_equal(self):
+ self.assertEqual(
+ (True, ""),
+ test_util.proto_is_superset(parser_types.parse_location("1:2-3:4"),
+ parser_types.parse_location("1:2-3:4")))
+
+ def test_superset_missing_optional_field(self):
+ self.assertEqual(
+ (False, "source_location missing"),
+ test_util.proto_is_superset(
+ ir_pb2.Structure(field=[ir_pb2.Field()]),
+ ir_pb2.Structure(
+ field=[ir_pb2.Field()],
+ source_location=parser_types.parse_location("1:2-3:4"))))
+
+ def test_optional_field_differs(self):
+ self.assertEqual((False, "end.line differs: found 4, expected 3"),
+ test_util.proto_is_superset(
+ parser_types.parse_location("1:2-4:4"),
+ parser_types.parse_location("1:2-3:4")))
+
+ def test_non_message_repeated_field_equal(self):
+ self.assertEqual((True, ""),
+ test_util.proto_is_superset(
+ ir_pb2.CanonicalName(object_path=[]),
+ ir_pb2.CanonicalName(object_path=[])))
+
+ def test_non_message_repeated_field_missing_element(self):
+ self.assertEqual(
+ (False, "object_path differs: found {none!r}, expected {a!r}".format(
+ none=[],
+ a=[u"a"])),
+ test_util.proto_is_superset(
+ ir_pb2.CanonicalName(object_path=[]),
+ ir_pb2.CanonicalName(object_path=[u"a"])))
+
+ def test_non_message_repeated_field_element_differs(self):
+ self.assertEqual(
+ (False, "object_path differs: found {aa!r}, expected {ab!r}".format(
+ aa=[u"a", u"a"],
+ ab=[u"a", u"b"])),
+ test_util.proto_is_superset(
+ ir_pb2.CanonicalName(object_path=[u"a", u"a"]),
+ ir_pb2.CanonicalName(object_path=[u"a", u"b"])))
+
+ def test_non_message_repeated_field_extra_element(self):
+ # For repeated fields of int/bool/str values, the entire list is treated as
+ # an atomic unit, and should be equal.
+ self.assertEqual(
+ (False,
+ "object_path differs: found {!r}, expected {!r}".format(
+ [u"a", u"a"], [u"a"])),
+ test_util.proto_is_superset(
+ ir_pb2.CanonicalName(object_path=["a", "a"]),
+ ir_pb2.CanonicalName(object_path=["a"])))
+
+ def test_non_message_repeated_field_no_expected_value(self):
+ # When a repeated field is empty, it is the same as if it were entirely
+ # missing -- there is no way to differentiate those two conditions.
+ self.assertEqual((True, ""),
+ test_util.proto_is_superset(
+ ir_pb2.CanonicalName(object_path=["a", "a"]),
+ ir_pb2.CanonicalName(object_path=[])))
+
+
+class DictFileReaderTest(unittest.TestCase):
+ """Tests for dict_file_reader."""
+
+ def test_empty_dict(self):
+ reader = test_util.dict_file_reader({})
+ self.assertEqual((None, ["File 'anything' not found."]), reader("anything"))
+ self.assertEqual((None, ["File '' not found."]), reader(""))
+
+ def test_one_element_dict(self):
+ reader = test_util.dict_file_reader({"m": "abc"})
+ self.assertEqual((None, ["File 'not_there' not found."]),
+ reader("not_there"))
+ self.assertEqual((None, ["File '' not found."]), reader(""))
+ self.assertEqual(("abc", None), reader("m"))
+
+ def test_two_element_dict(self):
+ reader = test_util.dict_file_reader({"m": "abc", "n": "def"})
+ self.assertEqual((None, ["File 'not_there' not found."]),
+ reader("not_there"))
+ self.assertEqual((None, ["File '' not found."]), reader(""))
+ self.assertEqual(("abc", None), reader("m"))
+ self.assertEqual(("def", None), reader("n"))
+
+ def test_dict_with_empty_key(self):
+ reader = test_util.dict_file_reader({"m": "abc", "": "def"})
+ self.assertEqual((None, ["File 'not_there' not found."]),
+ reader("not_there"))
+ self.assertEqual((None, ["File 'None' not found."]), reader(None))
+ self.assertEqual(("abc", None), reader("m"))
+ self.assertEqual(("def", None), reader(""))
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/front_end/tokenizer.py b/front_end/tokenizer.py
new file mode 100644
index 0000000..a9d005a
--- /dev/null
+++ b/front_end/tokenizer.py
@@ -0,0 +1,223 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tokenization for the Emboss definition language.
+
+This module exports the tokenize function and various errors.
+
+In addition, a couple of lists are exported for the use of
+generate_grammar_md.py:
+
+LITERAL_TOKEN_PATTERNS: A list of literal strings which are matched against
+ input.
+REGEX_TOKEN_PATTERNS: A list of regexes used for tokenization.
+ REGEX_TOKEN_PATTERNS[n].regex is a compiled regular expression object
+ (REGEX_TOKEN_PATTERNS[n].regex.pattern contains the text of the pattern), and
+ REGEX_TOKEN_PATTERNS[n].symbol is the name of the symbol assigned to tokens
+ which match the pattern.
+"""
+
+import collections
+import re
+
+from util import error
+from util import parser_types
+
+
+def tokenize(text, file_name):
+ # TODO(bolms): suppress end-of-line, indent, and dedent tokens between matched
+ # delimiters ([], (), and {}).
+ """Tokenizes its argument.
+
+ Arguments:
+ text: The raw text of a .emb file.
+ file_name: The name of the file to use in errors.
+
+ Returns:
+ A tuple of:
+ a list of parser_types.Tokens or None
+ a possibly-empty list of errors.
+ """
+ tokens = []
+ indent_stack = [""]
+ line_number = 0
+ for line in text.splitlines():
+ line_number += 1
+
+ # _tokenize_line splits the actual text into tokens.
+ line_tokens, errors = _tokenize_line(line, line_number, file_name)
+ if errors:
+ return None, errors
+
+ # Lines with only whitespace and comments are not used for Indent/Dedent
+ # calculation, but they still produce end-of-line tokens.
+ for token in line_tokens:
+ if token.symbol != "Comment":
+ break
+ else:
+ tokens.extend(line_tokens)
+ tokens.append(parser_types.Token(
+ '"\\n"', "\n", parser_types.make_location(
+ (line_number, len(line) + 1), (line_number, len(line) + 1))))
+ continue
+
+ # Leading whitespace is whatever .lstrip() removes.
+ leading_whitespace = line[0:len(line) - len(line.lstrip())]
+ if leading_whitespace == indent_stack[-1]:
+ # If the current leading whitespace is equal to the last leading
+ # whitespace, do not emit an Indent or Dedent token.
+ pass
+ elif leading_whitespace.startswith(indent_stack[-1]):
+ # If the current leading whitespace is longer than the last leading
+ # whitespace, emit an Indent token. For the token text, take the new
+ # part of the whitespace.
+ tokens.append(
+ parser_types.Token(
+ "Indent", leading_whitespace[len(indent_stack[-1]):],
+ parser_types.make_location(
+ (line_number, len(indent_stack[-1]) + 1),
+ (line_number, len(leading_whitespace) + 1))))
+ indent_stack.append(leading_whitespace)
+ else:
+ # Otherwise, search for the unclosed indentation level that matches
+ # the current indentation level. Emit a Dedent token for each
+ # newly-closed indentation level.
+ for i in range(len(indent_stack) - 1, -1, -1):
+ if leading_whitespace == indent_stack[i]:
+ break
+ tokens.append(
+ parser_types.Token("Dedent", "", parser_types.make_location(
+ (line_number, len(leading_whitespace) + 1),
+ (line_number, len(leading_whitespace) + 1))))
+ del indent_stack[i]
+ else:
+ return None, [[error.error(
+ file_name, parser_types.make_location(
+ (line_number, 1), (line_number, len(leading_whitespace) + 1)),
+ "Bad indentation")]]
+
+ tokens.extend(line_tokens)
+
+ # Append an end-of-line token (for non-whitespace lines).
+ tokens.append(parser_types.Token(
+ '"\\n"', "\n", parser_types.make_location(
+ (line_number, len(line) + 1), (line_number, len(line) + 1))))
+ for i in range(len(indent_stack) - 1):
+ tokens.append(parser_types.Token("Dedent", "", parser_types.make_location(
+ (line_number + 1, 1), (line_number + 1, 1))))
+ return tokens, []
+
+# Token patterns used by _tokenize_line.
+LITERAL_TOKEN_PATTERNS = (
+ "[ ] ( ) : = + - * . ? == != && || < > <= >= , "
+ "$static_size_in_bits $is_statically_sized "
+ "$max $present $upper_bound $lower_bound "
+ "$size_in_bits $size_in_bytes "
+ "$max_size_in_bits $max_size_in_bytes $min_size_in_bits $min_size_in_bytes "
+ "$default struct bits enum external import as if let").split()
+_T = collections.namedtuple("T", ["regex", "symbol"])
+REGEX_TOKEN_PATTERNS = [
+ # Words starting with variations of "emboss reserved" are reserved for
+ # internal use by the Emboss compiler.
+ _T(re.compile(r"EmbossReserved[A-Za-z0-9]*"), "BadWord"),
+ _T(re.compile(r"emboss_reserved[_a-z0-9]*"), "BadWord"),
+ _T(re.compile(r"EMBOSS_RESERVED[_A-Z0-9]*"), "BadWord"),
+ _T(re.compile(r'"(?:[^"\n\\]|\\[n\\"])*"'), "String"),
+ _T(re.compile("[0-9]+"), "Number"),
+ _T(re.compile("[0-9]{1,3}(?:_[0-9]{3})*"), "Number"),
+ _T(re.compile("0x[0-9a-fA-F]+"), "Number"),
+ _T(re.compile("0x_?[0-9a-fA-F]{1,4}(?:_[0-9a-fA-F]{4})*"), "Number"),
+ _T(re.compile("0x_?[0-9a-fA-F]{1,8}(?:_[0-9a-fA-F]{8})*"), "Number"),
+ _T(re.compile("0b[01]+"), "Number"),
+ _T(re.compile("0b_?[01]{1,4}(?:_[01]{4})*"), "Number"),
+ _T(re.compile("0b_?[01]{1,8}(?:_[01]{8})*"), "Number"),
+ _T(re.compile("true|false"), "BooleanConstant"),
+ _T(re.compile("[a-z][a-z_0-9]*"), "SnakeWord"),
+ # Single-letter ShoutyWords (like "A") and single-letter-followed-by-number
+ # ShoutyWords ("A100") are disallowed due to ambiguity with CamelWords. A
+ # ShoutyWord must start with an upper case letter and contain at least one
+ # more upper case letter or '_'.
+ _T(re.compile("[A-Z][A-Z_0-9]*[A-Z_][A-Z_0-9]*"), "ShoutyWord"),
+ # A CamelWord starts with A-Z and contains at least one a-z, and no _.
+ _T(re.compile("[A-Z][a-zA-Z0-9]*[a-z][a-zA-Z0-9]*"), "CamelWord"),
+ _T(re.compile("-- .*"), "Documentation"),
+ _T(re.compile("--$"), "Documentation"),
+ _T(re.compile("--.*"), "BadDocumentation"),
+ _T(re.compile(r"\s+"), None),
+ _T(re.compile("#.*"), "Comment"),
+ # BadWord and BadNumber are catch-alls for words and numbers so that
+ # something like "abcDef" doesn't tokenize to [SnakeWord, CamelWord].
+ #
+ # This is preferable to returning an error because the BadWord and BadNumber
+ # token types can be used in example-based errors.
+ _T(re.compile("[0-9][bxBX]?[0-9a-fA-F_]*"), "BadNumber"),
+ _T(re.compile("[a-zA-Z_$0-9]+"), "BadWord"),
+]
+del _T
+
+
+def _tokenize_line(line, line_number, file_name):
+ """Tokenizes a single line of input.
+
+ Arguments:
+ line: The line of text to tokenize.
+ line_number: The line number (used when constructing token objects).
+ file_name: The name of a file to use in errors.
+
+ Returns:
+ A tuple of:
+ A list of token objects or None.
+ A possibly-empty list of errors.
+ """
+ tokens = []
+ offset = 0
+ while offset < len(line):
+ best_candidate = ""
+ best_candidate_symbol = None
+ # Find the longest match. Ties go to the first match. This way, keywords
+ # ("struct") are matched as themselves, but words that only happen to start
+ # with keywords ("structure") are matched as words.
+ #
+ # There is never a reason to try to match a literal after a regex that
+ # could also match that literal, so check literals first.
+ for literal in LITERAL_TOKEN_PATTERNS:
+ if line[offset:].startswith(literal) and len(literal) > len(
+ best_candidate):
+ best_candidate = literal
+ # For Emboss, the name of a literal token is just the literal in quotes,
+ # so that the grammar can read a little more naturally, e.g.:
+ #
+ # expression -> expression "+" expression
+ #
+ # instead of
+ #
+ # expression -> expression Plus expression
+ best_candidate_symbol = '"' + literal + '"'
+ for pattern in REGEX_TOKEN_PATTERNS:
+ match_result = pattern.regex.match(line[offset:])
+ if match_result and len(match_result.group(0)) > len(best_candidate):
+ best_candidate = match_result.group(0)
+ best_candidate_symbol = pattern.symbol
+ if not best_candidate:
+ return None, [[error.error(
+ file_name, parser_types.make_location(
+ (line_number, offset + 1), (line_number, offset + 2)),
+ "Unrecognized token")]]
+ if best_candidate_symbol:
+ tokens.append(parser_types.Token(
+ best_candidate_symbol, best_candidate, parser_types.make_location(
+ (line_number, offset + 1),
+ (line_number, offset + len(best_candidate) + 1))))
+ offset += len(best_candidate)
+ return tokens, []
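Note on the matching rule in _tokenize_line above: the longest match wins, and ties go to whichever pattern was tried first, with literals tried before regexes. The following is a minimal standalone sketch of that rule, not the Emboss implementation. The reduced tables LITERALS and REGEXES and the helpers longest_match and symbols are hypothetical names introduced only for illustration.

```python
import re

# Hypothetical, reduced pattern tables for illustration only; the real lists
# are LITERAL_TOKEN_PATTERNS and REGEX_TOKEN_PATTERNS in tokenizer.py.
LITERALS = ["struct", "=", "=="]
REGEXES = [
    (re.compile(r"[a-z][a-z_0-9]*"), "SnakeWord"),
    (re.compile(r"[0-9]+"), "Number"),
    (re.compile(r"\s+"), None),  # Whitespace is consumed but not emitted.
]


def longest_match(text):
  """Returns (symbol, matched_text) for the longest token starting text."""
  best, best_symbol = "", None
  # Literals are checked first, so a literal wins a tie against a regex.
  for literal in LITERALS:
    if text.startswith(literal) and len(literal) > len(best):
      best, best_symbol = literal, '"' + literal + '"'
  for regex, symbol in REGEXES:
    match = regex.match(text)
    # Strictly longer matches replace the current best; equal lengths do not.
    if match and len(match.group(0)) > len(best):
      best, best_symbol = match.group(0), symbol
  return best_symbol, best


def symbols(line):
  """Returns the symbol names of the tokens in a single line."""
  result = []
  offset = 0
  while offset < len(line):
    symbol, text = longest_match(line[offset:])
    if not text:
      raise ValueError("Unrecognized token at column {}".format(offset + 1))
    if symbol:
      result.append(symbol)
    offset += len(text)
  return result
```

With these assumed tables, "struct" ties between the literal and the SnakeWord regex and resolves to the literal, while "structure" resolves to SnakeWord because the regex match is strictly longer.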
diff --git a/front_end/tokenizer_test.py b/front_end/tokenizer_test.py
new file mode 100644
index 0000000..97b1494
--- /dev/null
+++ b/front_end/tokenizer_test.py
@@ -0,0 +1,388 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for tokenizer."""
+
+import unittest
+from front_end import tokenizer
+from util import error
+from util import parser_types
+
+
+def _token_symbols(token_list):
+ """Given a list of tokens, returns a list of their symbol names."""
+ return [token.symbol for token in token_list]
+
+
+class TokenizerTest(unittest.TestCase):
+ """Tests for the tokenizer.tokenize function."""
+
+ def test_bad_indent_tab_versus_space(self):
+ # A bad indent is one that doesn't match a previous unmatched indent.
+ tokens, errors = tokenizer.tokenize(" a\n\tb", "file")
+ self.assertFalse(tokens)
+ self.assertEqual([[error.error("file", parser_types.make_location(
+ (2, 1), (2, 2)), "Bad indentation")]], errors)
+
+ def test_bad_indent_tab_versus_eight_spaces(self):
+ tokens, errors = tokenizer.tokenize(" a\n\tb", "file")
+ self.assertFalse(tokens)
+ self.assertEqual([[error.error("file", parser_types.make_location(
+ (2, 1), (2, 2)), "Bad indentation")]], errors)
+
+ def test_bad_indent_tab_versus_four_spaces(self):
+ tokens, errors = tokenizer.tokenize(" a\n\tb", "file")
+ self.assertFalse(tokens)
+ self.assertEqual([[error.error("file", parser_types.make_location(
+ (2, 1), (2, 2)), "Bad indentation")]], errors)
+
+ def test_bad_indent_two_spaces_versus_one_space(self):
+ tokens, errors = tokenizer.tokenize(" a\n b", "file")
+ self.assertFalse(tokens)
+ self.assertEqual([[error.error("file", parser_types.make_location(
+ (2, 1), (2, 2)), "Bad indentation")]], errors)
+
+ def test_bad_indent_matches_closed_indent(self):
+ tokens, errors = tokenizer.tokenize(" a\nb\n c\n d", "file")
+ self.assertFalse(tokens)
+ self.assertEqual([[error.error("file", parser_types.make_location(
+ (4, 1), (4, 2)), "Bad indentation")]], errors)
+
+ def test_bad_string_after_string_with_escaped_backslash_at_end(self):
+ tokens, errors = tokenizer.tokenize(r'"\\""', "name")
+ self.assertFalse(tokens)
+ self.assertEqual([[error.error("name", parser_types.make_location(
+ (1, 5), (1, 6)), "Unrecognized token")]], errors)
+
+
+def _make_short_token_match_tests():
+ """Makes tests for short, simple tokenization cases."""
+ eol = '"\\n"'
+ cases = {
+ "Cam": ["CamelWord", eol],
+ "Ca9": ["CamelWord", eol],
+ "CanB": ["CamelWord", eol],
+ "CanBee": ["CamelWord", eol],
+ "CBa": ["CamelWord", eol],
+ "cam": ["SnakeWord", eol],
+ "ca9": ["SnakeWord", eol],
+ "can_b": ["SnakeWord", eol],
+ "can_bee": ["SnakeWord", eol],
+ "c_ba": ["SnakeWord", eol],
+ "cba_": ["SnakeWord", eol],
+ "c_b_a_": ["SnakeWord", eol],
+ "CAM": ["ShoutyWord", eol],
+ "CA9": ["ShoutyWord", eol],
+ "CAN_B": ["ShoutyWord", eol],
+ "CAN_BEE": ["ShoutyWord", eol],
+ "C_BA": ["ShoutyWord", eol],
+ "C": ["BadWord", eol],
+ "C1": ["BadWord", eol],
+ "c": ["SnakeWord", eol],
+ "$": ["BadWord", eol],
+ "_": ["BadWord", eol],
+ "_a": ["BadWord", eol],
+ "_A": ["BadWord", eol],
+ "Cb_A": ["BadWord", eol],
+ "aCb": ["BadWord", eol],
+ "a b": ["SnakeWord", "SnakeWord", eol],
+ "a\tb": ["SnakeWord", "SnakeWord", eol],
+ "a \t b ": ["SnakeWord", "SnakeWord", eol],
+ " \t ": [eol],
+ "a #b": ["SnakeWord", "Comment", eol],
+ "a#": ["SnakeWord", "Comment", eol],
+ "# b": ["Comment", eol],
+ " # b": ["Comment", eol],
+ " #": ["Comment", eol],
+ "": [],
+ "\n": [eol],
+ "\na": [eol, "SnakeWord", eol],
+ "a--example": ["SnakeWord", "BadDocumentation", eol],
+ "a ---- example": ["SnakeWord", "BadDocumentation", eol],
+ "a --- example": ["SnakeWord", "BadDocumentation", eol],
+ "a-- example": ["SnakeWord", "Documentation", eol],
+ "a -- -- example": ["SnakeWord", "Documentation", eol],
+ "a -- - example": ["SnakeWord", "Documentation", eol],
+ "--": ["Documentation", eol],
+ "-- ": ["Documentation", eol],
+ "-- ": ["Documentation", eol],
+ "$default": ['"$default"', eol],
+ "$defaultx": ["BadWord", eol],
+ "$def": ["BadWord", eol],
+ "x$default": ["BadWord", eol],
+ "9$default": ["BadWord", eol],
+ "struct": ['"struct"', eol],
+ "external": ['"external"', eol],
+ "bits": ['"bits"', eol],
+ "enum": ['"enum"', eol],
+ "as": ['"as"', eol],
+ "import": ['"import"', eol],
+ "true": ["BooleanConstant", eol],
+ "false": ["BooleanConstant", eol],
+ "truex": ["SnakeWord", eol],
+ "falsex": ["SnakeWord", eol],
+ "structx": ["SnakeWord", eol],
+ "bitsx": ["SnakeWord", eol],
+ "enumx": ["SnakeWord", eol],
+ "0b": ["BadNumber", eol],
+ "0x": ["BadNumber", eol],
+ "0b011101": ["Number", eol],
+ "0b0": ["Number", eol],
+ "0b0111_1111_0000": ["Number", eol],
+ "0b00_000_00": ["BadNumber", eol],
+ "0b0_0_0": ["BadNumber", eol],
+ "0b0111012": ["BadNumber", eol],
+ "0b011101x": ["BadWord", eol],
+ "0b011101b": ["BadNumber", eol],
+ "0B0": ["BadNumber", eol],
+ "0X0": ["BadNumber", eol],
+ "0b_": ["BadNumber", eol],
+ "0x_": ["BadNumber", eol],
+ "0b__": ["BadNumber", eol],
+ "0x__": ["BadNumber", eol],
+ "0b_0000": ["Number", eol],
+ "0b0000_": ["BadNumber", eol],
+ "0b00_____00": ["BadNumber", eol],
+ "0x00_000_00": ["BadNumber", eol],
+ "0x0_0_0": ["BadNumber", eol],
+ "0b____0____": ["BadNumber", eol],
+ "0b00000000000000000000": ["Number", eol],
+ "0b_00000000": ["Number", eol],
+ "0b0000_0000_0000": ["Number", eol],
+ "0b000_0000_0000": ["Number", eol],
+ "0b00_0000_0000": ["Number", eol],
+ "0b0_0000_0000": ["Number", eol],
+ "0b_0000_0000_0000": ["Number", eol],
+ "0b_000_0000_0000": ["Number", eol],
+ "0b_00_0000_0000": ["Number", eol],
+ "0b_0_0000_0000": ["Number", eol],
+ "0b00000000_00000000_00000000": ["Number", eol],
+ "0b0000000_00000000_00000000": ["Number", eol],
+ "0b000000_00000000_00000000": ["Number", eol],
+ "0b00000_00000000_00000000": ["Number", eol],
+ "0b0000_00000000_00000000": ["Number", eol],
+ "0b000_00000000_00000000": ["Number", eol],
+ "0b00_00000000_00000000": ["Number", eol],
+ "0b0_00000000_00000000": ["Number", eol],
+ "0b_00000000_00000000_00000000": ["Number", eol],
+ "0b_0000000_00000000_00000000": ["Number", eol],
+ "0b_000000_00000000_00000000": ["Number", eol],
+ "0b_00000_00000000_00000000": ["Number", eol],
+ "0b_0000_00000000_00000000": ["Number", eol],
+ "0b_000_00000000_00000000": ["Number", eol],
+ "0b_00_00000000_00000000": ["Number", eol],
+ "0b_0_00000000_00000000": ["Number", eol],
+ "0x0": ["Number", eol],
+ "0x00000000000000000000": ["Number", eol],
+ "0x_0000": ["Number", eol],
+ "0x_00000000": ["Number", eol],
+ "0x0000_0000_0000": ["Number", eol],
+ "0x000_0000_0000": ["Number", eol],
+ "0x00_0000_0000": ["Number", eol],
+ "0x0_0000_0000": ["Number", eol],
+ "0x_0000_0000_0000": ["Number", eol],
+ "0x_000_0000_0000": ["Number", eol],
+ "0x_00_0000_0000": ["Number", eol],
+ "0x_0_0000_0000": ["Number", eol],
+ "0x00000000_00000000_00000000": ["Number", eol],
+ "0x0000000_00000000_00000000": ["Number", eol],
+ "0x000000_00000000_00000000": ["Number", eol],
+ "0x00000_00000000_00000000": ["Number", eol],
+ "0x0000_00000000_00000000": ["Number", eol],
+ "0x000_00000000_00000000": ["Number", eol],
+ "0x00_00000000_00000000": ["Number", eol],
+ "0x0_00000000_00000000": ["Number", eol],
+ "0x_00000000_00000000_00000000": ["Number", eol],
+ "0x_0000000_00000000_00000000": ["Number", eol],
+ "0x_000000_00000000_00000000": ["Number", eol],
+ "0x_00000_00000000_00000000": ["Number", eol],
+ "0x_0000_00000000_00000000": ["Number", eol],
+ "0x_000_00000000_00000000": ["Number", eol],
+ "0x_00_00000000_00000000": ["Number", eol],
+ "0x_0_00000000_00000000": ["Number", eol],
+ "0x__00000000_00000000": ["BadNumber", eol],
+ "0x00000000_00000000_0000": ["BadNumber", eol],
+ "0x00000000_0000_0000": ["BadNumber", eol],
+ "0x_00000000000000000000": ["BadNumber", eol],
+ "0b_00000000000000000000": ["BadNumber", eol],
+ "0b00000000_00000000_0000": ["BadNumber", eol],
+ "0b00000000_0000_0000": ["BadNumber", eol],
+ "0x0000_": ["BadNumber", eol],
+ "0x00_____00": ["BadNumber", eol],
+ "0x____0____": ["BadNumber", eol],
+ "EmbossReserved": ["BadWord", eol],
+ "EmbossReservedA": ["BadWord", eol],
+ "EmbossReserved_": ["BadWord", eol],
+ "EMBOSS_RESERVED": ["BadWord", eol],
+ "EMBOSS_RESERVED_": ["BadWord", eol],
+ "EMBOSS_RESERVEDA": ["BadWord", eol],
+ "emboss_reserved": ["BadWord", eol],
+ "emboss_reserved_": ["BadWord", eol],
+ "emboss_reserveda": ["BadWord", eol],
+ "0x0123456789abcdefABCDEF": ["Number", eol],
+ "0": ["Number", eol],
+ "1": ["Number", eol],
+ "1a": ["BadNumber", eol],
+ "1g": ["BadWord", eol],
+ "1234567890": ["Number", eol],
+ "1_234_567_890": ["Number", eol],
+ "234_567_890": ["Number", eol],
+ "34_567_890": ["Number", eol],
+ "4_567_890": ["Number", eol],
+ "1_2_3_4_5_6_7_8_9_0": ["BadNumber", eol],
+ "1234567890_": ["BadNumber", eol],
+ "1__234567890": ["BadNumber", eol],
+ "_1234567890": ["BadWord", eol],
+ "[]": ['"["', '"]"', eol],
+ "()": ['"("', '")"', eol],
+ "..": ['"."', '"."', eol],
+ "...": ['"."', '"."', '"."', eol],
+ "....": ['"."', '"."', '"."', '"."', eol],
+ '"abc"': ["String", eol],
+ '""': ["String", eol],
+ r'"\\"': ["String", eol],
+ r'"\""': ["String", eol],
+ r'"\n"': ["String", eol],
+ r'"\\n"': ["String", eol],
+ r'"\\xyz"': ["String", eol],
+ r'"\\\\"': ["String", eol],
+ }
+ for c in ("[ ] ( ) ? : = + - * . == != < <= > >= && || , $max $present "
+ "$upper_bound $lower_bound $size_in_bits $size_in_bytes "
+ "$max_size_in_bits $max_size_in_bytes $min_size_in_bits "
+ "$min_size_in_bytes "
+ "$default struct bits enum external import as if let").split():
+ cases[c] = ['"' + c + '"', eol]
+
+ def make_test_case(case):
+
+ def test_case(self):
+ tokens, errors = tokenizer.tokenize(case, "name")
+ symbols = _token_symbols(tokens)
+ self.assertFalse(errors)
+ self.assertEqual(symbols, cases[case])
+
+ return test_case
+
+ for c in cases:
+ setattr(TokenizerTest, "testShortTokenMatch{!r}".format(c),
+ make_test_case(c))
+
+
+def _make_bad_char_tests():
+ """Makes tests that an error is returned for bad characters."""
+
+ def make_test_case(case):
+
+ def test_case(self):
+ tokens, errors = tokenizer.tokenize(case, "name")
+ self.assertFalse(tokens)
+ self.assertEqual([[error.error("name", parser_types.make_location(
+ (1, 1), (1, 2)), "Unrecognized token")]], errors)
+
+ return test_case
+
+ for c in "~`!@%^&\\|;'\"/{}":
+ setattr(TokenizerTest, "testBadChar{!r}".format(c), make_test_case(c))
+
+
+def _make_bad_string_tests():
+ """Makes tests that an error is returned for bad strings."""
+ bad_strings = (r'"\"', '"\\\n"', r'"\\\"', r'"', r'"\q"', r'"\\\q"')
+
+ def make_test_case(string):
+
+ def test_case(self):
+ tokens, errors = tokenizer.tokenize(string, "name")
+ self.assertFalse(tokens)
+ self.assertEqual([[error.error("name", parser_types.make_location(
+ (1, 1), (1, 2)), "Unrecognized token")]], errors)
+
+ return test_case
+
+ for s in bad_strings:
+ setattr(TokenizerTest, "testBadString{!r}".format(s), make_test_case(s))
+
+
+def _make_multiline_tests():
+ """Makes tests for indent/dedent insertion and eol insertion."""
+
+ c = "Comment"
+ eol = '"\\n"'
+ sw = "SnakeWord"
+ ind = "Indent"
+ ded = "Dedent"
+ cases = {
+ "a\nb\n": [sw, eol, sw, eol],
+ "a\n\nb\n": [sw, eol, eol, sw, eol],
+ "a\n#foo\nb\n": [sw, eol, c, eol, sw, eol],
+ "a\n #foo\nb\n": [sw, eol, c, eol, sw, eol],
+ "a\n b\n": [sw, eol, ind, sw, eol, ded],
+ "a\n b\n\n": [sw, eol, ind, sw, eol, eol, ded],
+ "a\n b\n c\n": [sw, eol, ind, sw, eol, ind, sw, eol, ded, ded],
+ "a\n b\n c\n": [sw, eol, ind, sw, eol, sw, eol, ded],
+ "a\n b\n\n c\n": [sw, eol, ind, sw, eol, eol, sw, eol, ded],
+ "a\n b\n #\n c\n": [sw, eol, ind, sw, eol, c, eol, sw, eol, ded],
+ "a\n\tb\n #\n\tc\n": [sw, eol, ind, sw, eol, c, eol, sw, eol, ded],
+ " a\n b\n c\n d\n": [ind, sw, eol, ind, sw, eol, ind, sw, eol, ded,
+ ded, sw, eol, ded],
+ }
+
+ def make_test_case(case):
+
+ def test_case(self):
+ tokens, errors = tokenizer.tokenize(case, "file")
+ self.assertFalse(errors)
+ self.assertEqual(_token_symbols(tokens), cases[case])
+
+ return test_case
+
+ for c in cases:
+ setattr(TokenizerTest, "testMultiline{!r}".format(c), make_test_case(c))
+
+
+def _make_offset_tests():
+ """Makes tests that the tokenizer fills in correct source locations."""
+ cases = {
+ "a+": ["1:1-1:2", "1:2-1:3", "1:3-1:3"],
+ "a + ": ["1:1-1:2", "1:5-1:6", "1:9-1:9"],
+ "a\n\nb": ["1:1-1:2", "1:2-1:2", "2:1-2:1", "3:1-3:2", "3:2-3:2"],
+ "a\n b": ["1:1-1:2", "1:2-1:2", "2:1-2:3", "2:3-2:4", "2:4-2:4",
+ "3:1-3:1"],
+ "a\n b\nc": ["1:1-1:2", "1:2-1:2", "2:1-2:3", "2:3-2:4", "2:4-2:4",
+ "3:1-3:1", "3:1-3:2", "3:2-3:2"],
+ "a\n b\n c": ["1:1-1:2", "1:2-1:2", "2:1-2:2", "2:2-2:3", "2:3-2:3",
+ "3:2-3:3", "3:3-3:4", "3:4-3:4", "4:1-4:1", "4:1-4:1"],
+ }
+
+ def make_test_case(case):
+
+ def test_case(self):
+ self.assertEqual([parser_types.format_location(l.source_location)
+ for l in tokenizer.tokenize(case, "file")[0]],
+ cases[case])
+
+ return test_case
+
+ for c in cases:
+ setattr(TokenizerTest, "testOffset{!r}".format(c), make_test_case(c))
+
+_make_short_token_match_tests()
+_make_bad_char_tests()
+_make_bad_string_tests()
+_make_multiline_tests()
+_make_offset_tests()
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/front_end/type_check.py b/front_end/type_check.py
new file mode 100644
index 0000000..141880f
--- /dev/null
+++ b/front_end/type_check.py
@@ -0,0 +1,477 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Functions for checking expression types."""
+
+from front_end import attributes
+from public import ir_pb2
+from util import error
+from util import ir_util
+from util import traverse_ir
+
+
+def _type_check_expression(expression, source_file_name, ir, errors):
+ """Checks and annotates the type of an expression and all subexpressions."""
+ if expression.type.WhichOneof("type"):
+ # This expression has already been type checked.
+ return
+ expression_variety = expression.WhichOneof("expression")
+ if expression_variety == "constant":
+ _type_check_integer_constant(expression)
+ elif expression_variety == "constant_reference":
+ _type_check_constant_reference(expression, source_file_name, ir, errors)
+ elif expression_variety == "function":
+ _type_check_operation(expression, source_file_name, ir, errors)
+ elif expression_variety == "field_reference":
+ _type_check_local_reference(expression, ir, errors)
+ elif expression_variety == "boolean_constant":
+ _type_check_boolean_constant(expression)
+ elif expression_variety == "builtin_reference":
+ _type_check_builtin_reference(expression)
+ else:
+ assert False, "Unknown expression variety {!r}".format(expression_variety)
+
+
+def _annotate_as_integer(expression):
+ expression.type.integer.CopyFrom(ir_pb2.IntegerType())
+
+
+def _annotate_as_boolean(expression):
+ expression.type.boolean.CopyFrom(ir_pb2.BooleanType())
+
+
+def _type_check(expression, source_file_name, errors, type_oneof, type_name,
+ expression_name):
+ if expression.type.WhichOneof("type") != type_oneof:
+ errors.append([
+ error.error(source_file_name, expression.source_location,
+ "{} must be {}.".format(expression_name, type_name))
+ ])
+
+
+def _type_check_integer(expression, source_file_name, errors, expression_name):
+ _type_check(expression, source_file_name, errors, "integer",
+ "an integer", expression_name)
+
+
+def _type_check_boolean(expression, source_file_name, errors, expression_name):
+ _type_check(expression, source_file_name, errors, "boolean", "a boolean",
+ expression_name)
+
+
+def _kind_check_field_reference(expression, source_file_name, errors,
+ expression_name):
+ if expression.WhichOneof("expression") != "field_reference":
+ errors.append([
+ error.error(source_file_name, expression.source_location,
+ "{} must be a field.".format(expression_name))
+ ])
+
+
+def _type_check_integer_constant(expression):
+ _annotate_as_integer(expression)
+
+
+def _type_check_constant_reference(expression, source_file_name, ir, errors):
+ """Annotates the type of a constant reference."""
+ referred_name = expression.constant_reference.canonical_name
+ referred_object = ir_util.find_object(referred_name, ir)
+ if isinstance(referred_object, ir_pb2.EnumValue):
+ expression.type.enumeration.name.CopyFrom(expression.constant_reference)
+ del expression.type.enumeration.name.canonical_name.object_path[-1]
+ elif isinstance(referred_object, ir_pb2.Field):
+ if not ir_util.field_is_virtual(referred_object):
+ errors.append([
+ error.error(source_file_name, expression.source_location,
+ "Static references to physical fields are not allowed."),
+ error.note(referred_name.module_file, referred_object.source_location,
+ "{} is a physical field.".format(
+ referred_name.object_path[-1])),
+ ])
+ return
+ _type_check_expression(referred_object.read_transform,
+ referred_name.module_file, ir, errors)
+ expression.type.CopyFrom(referred_object.read_transform.type)
+ else:
+ assert False, "Unexpected constant reference type."
+
+
+def _type_check_operation(expression, source_file_name, ir, errors):
+ for arg in expression.function.args:
+ _type_check_expression(arg, source_file_name, ir, errors)
+ function = expression.function.function
+ if function in (ir_pb2.Function.EQUALITY, ir_pb2.Function.INEQUALITY):
+ _type_check_equality_operator(expression, source_file_name, errors)
+ elif function == ir_pb2.Function.CHOICE:
+ _type_check_choice_operator(expression, source_file_name, errors)
+ else:
+ _type_check_monomorphic_operator(expression, source_file_name, errors)
+
+
+def _type_check_monomorphic_operator(expression, source_file_name, errors):
+ """Type checks an operator that accepts only one set of argument types."""
+ args = expression.function.args
+ int_args = _type_check_integer
+ bool_args = _type_check_boolean
+ field_args = _kind_check_field_reference
+ int_result = _annotate_as_integer
+ bool_result = _annotate_as_boolean
+ binary = ("Left argument", "Right argument")
+ n_ary = ("Argument {}".format(n) for n in range(len(args)))
+ functions = {
+ ir_pb2.Function.ADDITION: (int_result, int_args, binary, 2, 2,
+ "operator"),
+ ir_pb2.Function.SUBTRACTION: (int_result, int_args, binary, 2, 2,
+ "operator"),
+ ir_pb2.Function.MULTIPLICATION: (int_result, int_args, binary, 2, 2,
+ "operator"),
+ ir_pb2.Function.AND: (bool_result, bool_args, binary, 2, 2, "operator"),
+ ir_pb2.Function.OR: (bool_result, bool_args, binary, 2, 2, "operator"),
+ ir_pb2.Function.LESS: (bool_result, int_args, binary, 2, 2, "operator"),
+ ir_pb2.Function.LESS_OR_EQUAL: (bool_result, int_args, binary, 2, 2,
+ "operator"),
+ ir_pb2.Function.GREATER: (bool_result, int_args, binary, 2, 2,
+ "operator"),
+ ir_pb2.Function.GREATER_OR_EQUAL: (bool_result, int_args, binary, 2, 2,
+ "operator"),
+ ir_pb2.Function.MAXIMUM: (int_result, int_args, n_ary, 1, None,
+ "function"),
+ ir_pb2.Function.PRESENCE: (bool_result, field_args, n_ary, 1, 1,
+ "function"),
+ ir_pb2.Function.UPPER_BOUND: (int_result, int_args, n_ary, 1, 1,
+ "function"),
+ ir_pb2.Function.LOWER_BOUND: (int_result, int_args, n_ary, 1, 1,
+ "function"),
+ }
+ function = expression.function.function
+ (set_result_type, check_arg, arg_names, min_args, max_args,
+ kind) = functions[function]
+ for argument, name in zip(args, arg_names):
+ assert name is not None, "Too many arguments to function!"
+ check_arg(argument, source_file_name, errors,
+ "{} of {} '{}'".format(name, kind,
+ expression.function.function_name.text))
+ if len(args) < min_args:
+ errors.append([
+ error.error(source_file_name, expression.source_location,
+ "{} '{}' requires {} {} argument{}.".format(
+ kind.title(), expression.function.function_name.text,
+ "exactly" if min_args == max_args else "at least",
+ min_args, "s" if min_args > 1 else ""))
+ ])
+ if max_args is not None and len(args) > max_args:
+ errors.append([
+ error.error(source_file_name, expression.source_location,
+ "{} '{}' requires {} {} argument{}.".format(
+ kind.title(), expression.function.function_name.text,
+ "exactly" if min_args == max_args else "at most",
+ max_args, "s" if max_args > 1 else ""))
+ ])
+ set_result_type(expression)
+
+
+def _type_check_local_reference(expression, ir, errors):
+ """Annotates the type of a local reference."""
+ referent = ir_util.find_object(expression.field_reference.path[-1], ir)
+ assert referent, "Local reference should be non-None after name resolution."
+ if isinstance(referent, ir_pb2.RuntimeParameter):
+ parameter = referent
+ _set_expression_type_from_physical_type_reference(
+ expression, parameter.physical_type_alias.atomic_type.reference, ir)
+ return
+ field = referent
+ if ir_util.field_is_virtual(field):
+ _type_check_expression(field.read_transform,
+ expression.field_reference.path[0], ir, errors)
+ expression.type.CopyFrom(field.read_transform.type)
+ return
+ if not field.type.HasField("atomic_type"):
+ expression.type.opaque.CopyFrom(ir_pb2.OpaqueType())
+ else:
+ _set_expression_type_from_physical_type_reference(
+ expression, field.type.atomic_type.reference, ir)
+
+
+def unbounded_expression_type_for_physical_type(type_definition):
+ """Gets the ExpressionType for a field of the given TypeDefinition.
+
+ Arguments:
+ type_definition: an ir_pb2.TypeDefinition.
+
+ Returns:
+ An ir_pb2.ExpressionType with the corresponding expression type filled in:
+ for example, [prelude].UInt will result in an ExpressionType with the
+ `integer` field filled in.
+
+ The returned ExpressionType will not have any bounds set.
+ """
+ # TODO(bolms): Add a `[value_type]` attribute for `external`s.
+ if ir_util.get_boolean_attribute(type_definition.attribute,
+ attributes.IS_INTEGER):
+ return ir_pb2.ExpressionType(integer=ir_pb2.IntegerType())
+ elif tuple(type_definition.name.canonical_name.object_path) == ("Flag",):
+ # This is a hack: the Flag type should say that it is a boolean.
+ return ir_pb2.ExpressionType(boolean=ir_pb2.BooleanType())
+ elif type_definition.HasField("enumeration"):
+ return ir_pb2.ExpressionType(
+ enumeration=ir_pb2.EnumType(
+ name=ir_pb2.Reference(
+ canonical_name=type_definition.name.canonical_name)))
+ else:
+ return ir_pb2.ExpressionType(opaque=ir_pb2.OpaqueType())
+
+
+def _set_expression_type_from_physical_type_reference(expression,
+ type_reference, ir):
+ """Sets the type of an expression to match a physical type."""
+ field_type = ir_util.find_object(type_reference, ir)
+ assert field_type, "Field type should be non-None after name resolution."
+ expression.type.CopyFrom(
+ unbounded_expression_type_for_physical_type(field_type))
+
+
+def _annotate_parameter_type(parameter, ir, source_file_name, errors):
+ if parameter.physical_type_alias.WhichOneof("type") != "atomic_type":
+ errors.append([
+ error.error(
+ source_file_name, parameter.physical_type_alias.source_location,
+ "Parameters cannot be arrays.")
+ ])
+ return
+ _set_expression_type_from_physical_type_reference(
+ parameter, parameter.physical_type_alias.atomic_type.reference, ir)
+
+
+def _types_are_compatible(a, b):
+  """Returns true if a and b have compatible types."""
+  if a.type.WhichOneof("type") != b.type.WhichOneof("type"):
+    return False
+  elif a.type.WhichOneof("type") == "enumeration":
+    return (ir_util.hashable_form_of_reference(a.type.enumeration.name) ==
+            ir_util.hashable_form_of_reference(b.type.enumeration.name))
+  elif a.type.WhichOneof("type") in ("integer", "boolean"):
+    # All integers are compatible with integers; booleans are compatible with
+    # booleans
+    return True
+  else:
+    assert False, "_types_are_compatible works with enums, integers, booleans."
+
+
+def _type_check_equality_operator(expression, source_file_name, errors):
+  """Checks the type of an equality operator (== or !=)."""
+  left = expression.function.args[0]
+  if left.type.WhichOneof("type") not in ("integer", "boolean", "enumeration"):
+    errors.append([
+        error.error(source_file_name, left.source_location,
+                    "Left argument of operator '{}' must be an integer, "
+                    "boolean, or enum.".format(
+                        expression.function.function_name.text))
+    ])
+    return
+  right = expression.function.args[1]
+  if not _types_are_compatible(left, right):
+    errors.append([
+        error.error(source_file_name, expression.source_location,
+                    "Both arguments of operator '{}' must have the same "
+                    "type.".format(expression.function.function_name.text))
+    ])
+  _annotate_as_boolean(expression)
+
+
+def _type_check_choice_operator(expression, source_file_name, errors):
+  """Checks the type of the choice operator cond ? if_true : if_false."""
+  condition = expression.function.args[0]
+  if condition.type.WhichOneof("type") != "boolean":
+    errors.append([
+        error.error(source_file_name, condition.source_location,
+                    "Condition of operator '?:' must be a boolean.")
+    ])
+  if_true = expression.function.args[1]
+  if if_true.type.WhichOneof("type") not in ("integer", "boolean",
+                                             "enumeration"):
+    errors.append([
+        error.error(source_file_name, if_true.source_location,
+                    "If-true clause of operator '?:' must be an integer, "
+                    "boolean, or enum.")
+    ])
+    return
+  if_false = expression.function.args[2]
+  if not _types_are_compatible(if_true, if_false):
+    errors.append([
+        error.error(source_file_name, expression.source_location,
+                    "The if-true and if-false clauses of operator '?:' must "
+                    "have the same type.")
+    ])
+  if if_true.type.WhichOneof("type") == "integer":
+    _annotate_as_integer(expression)
+  elif if_true.type.WhichOneof("type") == "boolean":
+    _annotate_as_boolean(expression)
+  elif if_true.type.WhichOneof("type") == "enumeration":
+    expression.type.enumeration.name.CopyFrom(if_true.type.enumeration.name)
+  else:
+    assert False, "Unexpected type for if_true."
+
+
+def _type_check_boolean_constant(expression):
+  _annotate_as_boolean(expression)
+
+
+def _type_check_builtin_reference(expression):
+  name = expression.builtin_reference.canonical_name.object_path[0]
+  if name == "$is_statically_sized":
+    _annotate_as_boolean(expression)
+  elif name == "$static_size_in_bits":
+    _annotate_as_integer(expression)
+  else:
+    assert False, "Unknown builtin '{}'.".format(name)
+
+
+def _type_check_array_size(expression, source_file_name, errors):
+  _type_check_integer(expression, source_file_name, errors, "Array size")
+
+
+def _type_check_field_location(location, source_file_name, errors):
+  _type_check_integer(location.start, source_file_name, errors,
+                      "Start of field")
+  _type_check_integer(location.size, source_file_name, errors, "Size of field")
+
+
+def _type_check_field_existence_condition(field, source_file_name, errors):
+  _type_check_boolean(field.existence_condition, source_file_name, errors,
+                      "Existence condition")
+
+
+def _type_name_for_error_messages(expression_type):
+  if expression_type.WhichOneof("type") in ("integer", "boolean"):
+    return expression_type.WhichOneof("type")
+  elif expression_type.WhichOneof("type") == "enumeration":
+    # TODO(bolms): Should this be the fully-qualified name?
+    return expression_type.enumeration.name.canonical_name.object_path[-1]
+  assert False, "Shouldn't be here."
+
+
+def _type_check_passed_parameters(atomic_type, ir, source_file_name, errors):
+  """Checks the types of parameters to a parameterized physical type."""
+  referenced_type = ir_util.find_object(atomic_type.reference.canonical_name,
+                                        ir)
+  if (len(referenced_type.runtime_parameter) !=
+      len(atomic_type.runtime_parameter)):
+    errors.append([
+        error.error(
+            source_file_name, atomic_type.source_location,
+            "Type {} requires {} parameter{}; {} parameter{} given.".format(
+                referenced_type.name.name.text,
+                len(referenced_type.runtime_parameter),
+                "" if len(referenced_type.runtime_parameter) == 1 else "s",
+                len(atomic_type.runtime_parameter),
+                "" if len(atomic_type.runtime_parameter) == 1 else "s")),
+        error.note(
+            atomic_type.reference.canonical_name.module_file,
+            referenced_type.source_location,
+            "Definition of type {}.".format(referenced_type.name.name.text))
+    ])
+    return
+  for i in range(len(referenced_type.runtime_parameter)):
+    if referenced_type.runtime_parameter[i].type.WhichOneof("type") not in (
+        "integer", "boolean", "enumeration"):
+      # _type_check_parameter will catch invalid parameter types at the
+      # definition site; no need for another, probably-confusing error at any
+      # usage sites.
+      continue
+    if (atomic_type.runtime_parameter[i].type.WhichOneof("type") !=
+        referenced_type.runtime_parameter[i].type.WhichOneof("type")):
+      errors.append([
+          error.error(
+              source_file_name,
+              atomic_type.runtime_parameter[i].source_location,
+              "Parameter {} of type {} must be {}, not {}.".format(
+                  i, referenced_type.name.name.text,
+                  _type_name_for_error_messages(
+                      referenced_type.runtime_parameter[i].type),
+                  _type_name_for_error_messages(
+                      atomic_type.runtime_parameter[i].type))),
+          error.note(
+              atomic_type.reference.canonical_name.module_file,
+              referenced_type.runtime_parameter[i].source_location,
+              "Parameter {} of {}.".format(i, referenced_type.name.name.text))
+      ])
+
+
+def _type_check_parameter(runtime_parameter, source_file_name, errors):
+  """Checks the type of a parameter to a physical type."""
+  if runtime_parameter.type.WhichOneof("type") not in ("integer",
+                                                       "enumeration"):
+    errors.append([
+        error.error(source_file_name,
+                    runtime_parameter.physical_type_alias.source_location,
+                    "Runtime parameters must be integer or enum.")
+    ])
+
+
+def annotate_types(ir):
+  """Adds type annotations to all expressions in ir.
+
+  annotate_types adds type information to all expressions (and subexpressions)
+  in the IR. Additionally, it checks expressions for internal type consistency:
+  it will generate an error for constructs like "1 + true", where the types of
+  the operands are not accepted by the operator.
+
+  Arguments:
+    ir: an IR to which to add type annotations
+
+  Returns:
+    A (possibly empty) list of errors.
+  """
+  errors = []
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.Expression], _type_check_expression,
+      skip_descendants_of={ir_pb2.Expression},
+      parameters={"errors": errors})
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.RuntimeParameter], _annotate_parameter_type,
+      parameters={"errors": errors})
+  return errors
+
+
+def check_types(ir):
+  """Checks that expressions within the IR have the correct top-level types.
+
+  check_types ensures that expressions at the top level have correct types; in
+  particular, it ensures that array sizes are integers ("UInt[true]" is not a
+  valid array type) and that the starts and sizes of fields are integers.
+
+  Arguments:
+    ir: an IR to type check.
+
+  Returns:
+    A (possibly empty) list of errors.
+  """
+  errors = []
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.FieldLocation], _type_check_field_location,
+      parameters={"errors": errors})
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.ArrayType, ir_pb2.Expression], _type_check_array_size,
+      skip_descendants_of={ir_pb2.AtomicType},
+      parameters={"errors": errors})
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.Field], _type_check_field_existence_condition,
+      parameters={"errors": errors})
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.RuntimeParameter], _type_check_parameter,
+      parameters={"errors": errors})
+  traverse_ir.fast_traverse_ir_top_down(
+      ir, [ir_pb2.AtomicType], _type_check_passed_parameters,
+      parameters={"errors": errors})
+  return errors
diff --git a/front_end/type_check_test.py b/front_end/type_check_test.py
new file mode 100644
index 0000000..16defc9
--- /dev/null
+++ b/front_end/type_check_test.py
@@ -0,0 +1,650 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for front_end.type_check."""
+
+import unittest
+from front_end import glue
+from front_end import test_util
+from front_end import type_check
+from util import error
+
+
+class TypeAnnotationTest(unittest.TestCase):
+
+  def _make_ir(self, emb_text):
+    ir, unused_debug_info, errors = glue.parse_emboss_file(
+        "m.emb",
+        test_util.dict_file_reader({"m.emb": emb_text}),
+        stop_before_step="annotate_types")
+    assert not errors, errors
+    return ir
+
+  def test_adds_integer_constant_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+1] UInt:8[] y\n")
+    self.assertEqual([], type_check.annotate_types(ir))
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual(expression.type.WhichOneof("type"), "integer")
+
+  def test_adds_boolean_constant_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+true] UInt:8[] y\n")
+    self.assertEqual([], error.filter_errors(type_check.annotate_types(ir)),
+                     ir.to_json(indent=2))
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual(expression.type.WhichOneof("type"), "boolean")
+
+  def test_adds_enum_constant_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+Enum.VALUE] UInt x\n"
+                       "enum Enum:\n"
+                       "  VALUE = 1\n")
+    self.assertEqual([], error.filter_errors(type_check.annotate_types(ir)))
+    expression = ir.module[0].type[0].structure.field[0].location.size
+    self.assertEqual(expression.type.WhichOneof("type"), "enumeration")
+    enum_type_name = expression.type.enumeration.name.canonical_name
+    self.assertEqual(enum_type_name.module_file, "m.emb")
+    self.assertEqual(enum_type_name.object_path[0], "Enum")
+
+  def test_adds_enum_field_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] Enum x\n"
+                       "  1 [+x] UInt y\n"
+                       "enum Enum:\n"
+                       "  VALUE = 1\n")
+    self.assertEqual([], error.filter_errors(type_check.annotate_types(ir)))
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual(expression.type.WhichOneof("type"), "enumeration")
+    enum_type_name = expression.type.enumeration.name.canonical_name
+    self.assertEqual(enum_type_name.module_file, "m.emb")
+    self.assertEqual(enum_type_name.object_path[0], "Enum")
+
+  def test_adds_integer_operation_types(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+1+1] UInt:8[] y\n")
+    self.assertEqual([], type_check.annotate_types(ir))
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual(expression.type.WhichOneof("type"), "integer")
+    self.assertEqual(expression.function.args[0].type.WhichOneof("type"),
+                     "integer")
+    self.assertEqual(expression.function.args[1].type.WhichOneof("type"),
+                     "integer")
+
+  def test_adds_enum_operation_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+Enum.VAL==Enum.VAL] UInt:8[] y\n"
+                       "enum Enum:\n"
+                       "  VAL = 1\n")
+    self.assertEqual([], error.filter_errors(type_check.annotate_types(ir)))
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual(expression.type.WhichOneof("type"), "boolean")
+    self.assertEqual(expression.function.args[0].type.WhichOneof("type"),
+                     "enumeration")
+    self.assertEqual(expression.function.args[1].type.WhichOneof("type"),
+                     "enumeration")
+
+  def test_adds_integer_field_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+x] UInt:8[] y\n")
+    self.assertEqual([], type_check.annotate_types(ir))
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual(expression.type.WhichOneof("type"), "integer")
+
+  def test_adds_opaque_field_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] Bar x\n"
+                       "  1 [+x] UInt:8[] y\n"
+                       "struct Bar:\n"
+                       "  0 [+1] UInt z\n")
+    self.assertEqual([], error.filter_errors(type_check.annotate_types(ir)))
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual(expression.type.WhichOneof("type"), "opaque")
+
+  def test_adds_opaque_field_type_for_array(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  1 [+x] UInt:8[] y\n")
+    self.assertEqual([], error.filter_errors(type_check.annotate_types(ir)))
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual(expression.type.WhichOneof("type"), "opaque")
+
+  def test_error_on_bad_plus_operand_types(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+1+true] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.function.args[1].source_location,
+                     "Right argument of operator '+' must be an integer.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_bad_minus_operand_types(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+1-true] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.function.args[1].source_location,
+                     "Right argument of operator '-' must be an integer.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_bad_times_operand_types(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+1*true] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.function.args[1].source_location,
+                     "Right argument of operator '*' must be an integer.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_bad_equality_left_operand(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  1 [+x==x] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.function.args[0].source_location,
+                     "Left argument of operator '==' must be an integer, "
+                     "boolean, or enum.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_bad_equality_right_operand(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  1 [+1==x] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Both arguments of operator '==' must have the same "
+                     "type.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_equality_mismatched_operands_int_bool(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+1==true] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Both arguments of operator '==' must have the same "
+                     "type.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_equality_mismatched_operands_bool_int(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+true==1] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Both arguments of operator '==' must have the same "
+                     "type.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_equality_mismatched_operands_enum_enum(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+Bar.BAR==Baz.BAZ] UInt:8[] y\n"
+                       "enum Bar:\n"
+                       "  BAR = 1\n"
+                       "enum Baz:\n"
+                       "  BAZ = 1\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Both arguments of operator '==' must have the same "
+                     "type.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_bad_choice_condition_operand(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  1 [+5 ? 0 : 1] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    condition_arg = expression.function.args[0]
+    self.assertEqual([
+        [error.error("m.emb", condition_arg.source_location,
+                     "Condition of operator '?:' must be a boolean.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_bad_choice_if_true_operand(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  1 [+true ? x : x] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    if_true_arg = expression.function.args[1]
+    self.assertEqual([
+        [error.error("m.emb", if_true_arg.source_location,
+                     "If-true clause of operator '?:' must be an integer, "
+                     "boolean, or enum.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_choice_of_bools(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  1 [+true ? true : false] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([], error.filter_errors(type_check.annotate_types(ir)))
+    self.assertEqual("boolean", expression.type.WhichOneof("type"))
+
+  def test_choice_of_integers(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  1 [+true ? 0 : 100] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([], type_check.annotate_types(ir))
+    self.assertEqual("integer", expression.type.WhichOneof("type"))
+
+  def test_choice_of_enums(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] enum xx:\n"
+                       "    XX = 1\n"
+                       "    YY = 1\n"
+                       "  1 [+true ? Xx.XX : Xx.YY] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([], error.filter_errors(type_check.annotate_types(ir)))
+    self.assertEqual("enumeration", expression.type.WhichOneof("type"))
+    self.assertFalse(expression.type.enumeration.HasField("value"))
+    self.assertEqual(
+        "m.emb", expression.type.enumeration.name.canonical_name.module_file)
+    self.assertEqual(
+        "Foo", expression.type.enumeration.name.canonical_name.object_path[0])
+    self.assertEqual(
+        "Xx", expression.type.enumeration.name.canonical_name.object_path[1])
+
+  def test_error_on_bad_choice_mismatched_operands(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  1 [+true ? 0 : true] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "The if-true and if-false clauses of operator '?:' must "
+                     "have the same type.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_bad_choice_mismatched_enum_operands(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  1 [+true ? Baz.BAZ : Bar.BAR] UInt:8[] y\n"
+                       "enum Bar:\n"
+                       "  BAR = 1\n"
+                       "enum Baz:\n"
+                       "  BAZ = 1\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "The if-true and if-false clauses of operator '?:' must "
+                     "have the same type.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_bad_left_operand_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+true+1] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.function.args[0].source_location,
+                     "Left argument of operator '+' must be an integer.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_opaque_operand_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  1 [+x+1] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.function.args[0].source_location,
+                     "Left argument of operator '+' must be an integer.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_bad_left_comparison_operand_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+true<1] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.function.args[0].source_location,
+                     "Left argument of operator '<' must be an integer.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_bad_right_comparison_operand_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+1>=true] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.function.args[1].source_location,
+                     "Right argument of operator '>=' must be an integer.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_bad_boolean_operand_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  1 [+1&&true] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.function.args[0].source_location,
+                     "Left argument of operator '&&' must be a boolean.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_max_return_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  $max(1, 2, 3) [+1] UInt:8[] x\n")
+    expression = ir.module[0].type[0].structure.field[0].location.start
+    self.assertEqual([], type_check.annotate_types(ir))
+    self.assertEqual("integer", expression.type.WhichOneof("type"))
+
+  def test_error_on_bad_max_argument(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  $max(Bar.XX) [+1] UInt:8[] x\n"
+                       "enum Bar:\n"
+                       "  XX = 0\n")
+    expression = ir.module[0].type[0].structure.field[0].location.start
+    self.assertEqual([
+        [error.error("m.emb", expression.function.args[0].source_location,
+                     "Argument 0 of function '$max' must be an integer.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_no_max_argument(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  $max() [+1] UInt:8[] x\n")
+    expression = ir.module[0].type[0].structure.field[0].location.start
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Function '$max' requires at least 1 argument.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_upper_bound_return_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  $upper_bound(3) [+1] UInt:8[] x\n")
+    expression = ir.module[0].type[0].structure.field[0].location.start
+    self.assertEqual([], type_check.annotate_types(ir))
+    self.assertEqual("integer", expression.type.WhichOneof("type"))
+
+  def test_upper_bound_too_few_arguments(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  $upper_bound() [+1] UInt:8[] x\n")
+    expression = ir.module[0].type[0].structure.field[0].location.start
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Function '$upper_bound' requires exactly 1 argument.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_upper_bound_too_many_arguments(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  $upper_bound(1, 2) [+1] UInt:8[] x\n")
+    expression = ir.module[0].type[0].structure.field[0].location.start
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Function '$upper_bound' requires exactly 1 argument.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_upper_bound_wrong_argument_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  $upper_bound(Bar.XX) [+1] UInt:8[] x\n"
+                       "enum Bar:\n"
+                       "  XX = 0\n")
+    expression = ir.module[0].type[0].structure.field[0].location.start
+    self.assertEqual([
+        [error.error(
+            "m.emb", expression.function.args[0].source_location,
+            "Argument 0 of function '$upper_bound' must be an integer.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_lower_bound_return_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  $lower_bound(3) [+1] UInt:8[] x\n")
+    expression = ir.module[0].type[0].structure.field[0].location.start
+    self.assertEqual([], type_check.annotate_types(ir))
+    self.assertEqual("integer", expression.type.WhichOneof("type"))
+
+  def test_lower_bound_too_few_arguments(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  $lower_bound() [+1] UInt:8[] x\n")
+    expression = ir.module[0].type[0].structure.field[0].location.start
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Function '$lower_bound' requires exactly 1 argument.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_lower_bound_too_many_arguments(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  $lower_bound(1, 2) [+1] UInt:8[] x\n")
+    expression = ir.module[0].type[0].structure.field[0].location.start
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Function '$lower_bound' requires exactly 1 argument.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_lower_bound_wrong_argument_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  $lower_bound(Bar.XX) [+1] UInt:8[] x\n"
+                       "enum Bar:\n"
+                       "  XX = 0\n")
+    expression = ir.module[0].type[0].structure.field[0].location.start
+    self.assertEqual([
+        [error.error(
+            "m.emb", expression.function.args[0].source_location,
+            "Argument 0 of function '$lower_bound' must be an integer.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_static_reference_to_physical_field(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt x\n"
+                       "  let y = Foo.x\n")
+    static_ref = ir.module[0].type[0].structure.field[1].read_transform
+    physical_field = ir.module[0].type[0].structure.field[0]
+    self.assertEqual([
+        [error.error("m.emb", static_ref.source_location,
+                     "Static references to physical fields are not allowed."),
+         error.note("m.emb", physical_field.source_location,
+                    "x is a physical field.")]
+    ], type_check.annotate_types(ir))
+
+  def test_error_on_non_field_argument_to_has(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  if $present(0):\n"
+                       "    0 [+1] UInt x\n")
+    expression = ir.module[0].type[0].structure.field[0].existence_condition
+    self.assertEqual([
+        [error.error("m.emb", expression.function.args[0].source_location,
+                     "Argument 0 of function '$present' must be a field.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_no_argument_has(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  if $present():\n"
+                       "    0 [+1] UInt x\n")
+    expression = ir.module[0].type[0].structure.field[0].existence_condition
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Function '$present' requires exactly 1 argument.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_error_on_too_many_argument_has(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  if $present(y, y):\n"
+                       "    0 [+1] UInt x\n"
+                       "  1 [+1] UInt y\n")
+    expression = ir.module[0].type[0].structure.field[0].existence_condition
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Function '$present' requires exactly 1 argument.")]
+    ], error.filter_errors(type_check.annotate_types(ir)))
+
+  def test_checks_that_parameters_are_atomic_types(self):
+    ir = self._make_ir("struct Foo(y: UInt:8[1]):\n"
+                       "  0 [+1] UInt x\n")
+    error_parameter = ir.module[0].type[0].runtime_parameter[0]
+    error_location = error_parameter.physical_type_alias.source_location
+    self.assertEqual(
+        [[error.error("m.emb", error_location,
+                      "Parameters cannot be arrays.")]],
+        error.filter_errors(type_check.annotate_types(ir)))
+
+
+class TypeCheckTest(unittest.TestCase):
+
+  def _make_ir(self, emb_text):
+    ir, unused_debug_info, errors = glue.parse_emboss_file(
+        "m.emb",
+        test_util.dict_file_reader({"m.emb": emb_text}),
+        stop_before_step="check_types")
+    assert not errors, errors
+    return ir
+
+  def test_error_on_opaque_type_in_field_start(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  x [+10] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.start
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Start of field must be an integer.")]
+    ], type_check.check_types(ir))
+
+  def test_error_on_boolean_type_in_field_start(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  true [+10] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.start
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Start of field must be an integer.")]
+    ], type_check.check_types(ir))
+
+  def test_error_on_opaque_type_in_field_size(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  1 [+x] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Size of field must be an integer.")]
+    ], type_check.check_types(ir))
+
+  def test_error_on_boolean_type_in_field_size(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  1 [+true] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].location.size
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Size of field must be an integer.")]
+    ], type_check.check_types(ir))
+
+  def test_error_on_opaque_type_in_array_size(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  1 [+9] UInt:8[x] y\n")
+    expression = (ir.module[0].type[0].structure.field[1].type.array_type.
+                  element_count)
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Array size must be an integer.")]
+    ], type_check.check_types(ir))
+
+  def test_error_on_boolean_type_in_array_size(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  1 [+9] UInt:8[true] y\n")
+    expression = (ir.module[0].type[0].structure.field[1].type.array_type.
+                  element_count)
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Array size must be an integer.")]
+    ], type_check.check_types(ir))
+
+  def test_error_on_integer_type_in_existence_condition(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] UInt:8[] x\n"
+                       "  if 1:\n"
+                       "    1 [+9] UInt:8[] y\n")
+    expression = ir.module[0].type[0].structure.field[1].existence_condition
+    self.assertEqual([
+        [error.error("m.emb", expression.source_location,
+                     "Existence condition must be a boolean.")]
+    ], type_check.check_types(ir))
+
+  def test_error_on_non_integer_non_enum_parameter(self):
+    ir = self._make_ir("struct Foo(f: Flag):\n"
+                       "  0 [+1] UInt:8[] x\n")
+    parameter = ir.module[0].type[0].runtime_parameter[0]
+    self.assertEqual(
+        [[error.error("m.emb", parameter.physical_type_alias.source_location,
+                      "Runtime parameters must be integer or enum.")]],
+        type_check.check_types(ir))
+
+  def test_error_on_failure_to_pass_parameter(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] Bar b\n"
+                       "struct Bar(f: UInt:6):\n"
+                       "  0 [+1] UInt:8[] x\n")
+    type_ir = ir.module[0].type[0].structure.field[0].type
+    bar = ir.module[0].type[1]
+    self.assertEqual(
+        [[
+            error.error("m.emb", type_ir.source_location,
+                        "Type Bar requires 1 parameter; 0 parameters given."),
+            error.note("m.emb", bar.source_location,
+                       "Definition of type Bar.")
+        ]],
+        type_check.check_types(ir))
+
+  def test_error_on_passing_unneeded_parameter(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] Bar(1) b\n"
+                       "struct Bar:\n"
+                       "  0 [+1] UInt:8[] x\n")
+    type_ir = ir.module[0].type[0].structure.field[0].type
+    bar = ir.module[0].type[1]
+    self.assertEqual(
+        [[
+            error.error("m.emb", type_ir.source_location,
+                        "Type Bar requires 0 parameters; 1 parameter given."),
+            error.note("m.emb", bar.source_location,
+                       "Definition of type Bar.")
+        ]],
+        type_check.check_types(ir))
+
+  def test_error_on_passing_wrong_parameter_type(self):
+    ir = self._make_ir("struct Foo:\n"
+                       "  0 [+1] Bar(1) b\n"
+                       "enum Baz:\n"
+                       "  QUX = 1\n"
+                       "struct Bar(n: Baz):\n"
+                       "  0 [+1] UInt:8[] x\n")
+    type_ir = ir.module[0].type[0].structure.field[0].type
+    usage_parameter_ir = type_ir.atomic_type.runtime_parameter[0]
+    source_parameter_ir = ir.module[0].type[2].runtime_parameter[0]
+    self.assertEqual(
+        [[
+            error.error("m.emb", usage_parameter_ir.source_location,
+                        "Parameter 0 of type Bar must be Baz, not integer."),
+            error.note("m.emb", source_parameter_ir.source_location,
+                       "Parameter 0 of Bar.")
+        ]],
+        type_check.check_types(ir))
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/front_end/write_inference.py b/front_end/write_inference.py
new file mode 100644
index 0000000..bef4ac6
--- /dev/null
+++ b/front_end/write_inference.py
@@ -0,0 +1,280 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Infers and sets the write_method of each field in the IR."""
+
+from front_end import attributes
+from front_end import expression_bounds
+from public import ir_pb2
+from util import ir_util
+from util import traverse_ir
+
+
+def _find_field_reference_path(expression):
+ """Returns a path to a field reference, or None.
+
+ If the provided expression contains exactly one field_reference,
+ _find_field_reference_path will return a list of indexes, such that
+ recursively reading the index'th element of expression.function.args will find
+ the field_reference. For example, for:
+
+ 5 + (x * 2)
+
+ _find_field_reference_path will return [1, 0]: from the top-level `+`
+ expression, arg 1 is the `x * 2` expression, and `x` is arg 0 of the `*`
+ expression.
+
+ Arguments:
+ expression: an ir_pb2.Expression to walk
+
+ Returns:
+ A list of indexes to find a field_reference, or None.
+ """
+ found, indexes = _recursively_find_field_reference_path(expression)
+ if found == 1:
+ return indexes
+ else:
+ return None
+
+
+def _recursively_find_field_reference_path(expression):
+ """Recursive implementation of _find_field_reference_path."""
+ if expression.WhichOneof("expression") == "field_reference":
+ return 1, []
+ elif expression.WhichOneof("expression") == "function":
+ field_count = 0
+ path = []
+ for index in range(len(expression.function.args)):
+ arg = expression.function.args[index]
+ arg_result = _recursively_find_field_reference_path(arg)
+ arg_field_count, arg_path = arg_result
+ if arg_field_count == 1 and field_count == 0:
+ path = [index] + arg_path
+ field_count += arg_field_count
+ if field_count == 1:
+ return field_count, path
+ else:
+ return field_count, []
+ else:
+ return 0, []
+
+
+def _invert_expression(expression, ir):
+ """For the given expression, searches for an algebraic inverse expression.
+
+ That is, it takes the notional equation:
+
+ $logical_value = expression
+
+ and, if there is exactly one `field_reference` in `expression`, it will
+ attempt to solve the equation for that field. For example, if the expression
+ is `x + 1`, it will iteratively transform:
+
+ $logical_value = x + 1
+ $logical_value - 1 = x + 1 - 1
+ $logical_value - 1 = x
+
+ and finally return `x` and `$logical_value - 1`.
+
+ The purpose of this transformation is to find an assignment statement that can
+ be used to write back through certain virtual fields. E.g., given:
+
+ struct Foo:
+ 0 [+1] UInt raw_value
+ let actual_value = raw_value + 100
+
+ it should be possible to write a value to the `actual_value` field, and have
+ it set `raw_value` to the appropriate value.
+
+ Arguments:
+ expression: an ir_pb2.Expression to be inverted.
+ ir: the full IR, for looking up symbols.
+
+ Returns:
+ (field_reference, inverse_expression) if expression can be inverted,
+ otherwise None.
+ """
+ reference_path = _find_field_reference_path(expression)
+ if reference_path is None:
+ return None
+ subexpression = expression
+ result = ir_pb2.Expression(
+ builtin_reference=ir_pb2.Reference(
+ canonical_name=ir_pb2.CanonicalName(
+ module_file="",
+ object_path=["$logical_value"]
+ ),
+ source_name=[ir_pb2.Word(
+ text="$logical_value",
+ source_location=ir_pb2.Location(is_synthetic=True)
+ )],
+ source_location=ir_pb2.Location(is_synthetic=True)
+ ),
+ type=expression.type,
+ source_location=ir_pb2.Location(is_synthetic=True)
+ )
+
+ # This loop essentially starts with:
+ #
+ # f(g(x)) == $logical_value
+ #
+ # and ends with
+ #
+ # x == g_inv(f_inv($logical_value))
+ #
+ # At each step, `subexpression` has one layer removed, and `result` has a
+ # corresponding inverse function applied. So, for example, it might start
+ # with:
+ #
+ # 2 + ((3 - x) - 10) == $logical_value
+ #
+ # On each iteration, `subexpression` and `result` will become:
+ #
+ # (3 - x) - 10 == $logical_value - 2 [subtract 2 from both sides]
+ # (3 - x) == ($logical_value - 2) + 10 [add 10 to both sides]
+ # x == 3 - (($logical_value - 2) + 10) [subtract both sides from 3]
+ #
+ # This is an extremely limited algebraic solver, but it covers common-enough
+ # cases.
+ #
+ # Note that any equation that can be solved here becomes part of Emboss's
+ # contract, forever, so be conservative in expanding its solving capabilities!
+ for index in reference_path:
+ if subexpression.function.function == ir_pb2.Function.ADDITION:
+ result = ir_pb2.Expression(
+ function=ir_pb2.Function(
+ function=ir_pb2.Function.SUBTRACTION,
+ args=[
+ result,
+ subexpression.function.args[1 - index],
+ ]
+ ),
+ type=ir_pb2.ExpressionType(integer=ir_pb2.IntegerType())
+ )
+ elif subexpression.function.function == ir_pb2.Function.SUBTRACTION:
+ if index == 0:
+ result = ir_pb2.Expression(
+ function=ir_pb2.Function(
+ function=ir_pb2.Function.ADDITION,
+ args=[
+ result,
+ subexpression.function.args[1],
+ ]
+ ),
+ type=ir_pb2.ExpressionType(integer=ir_pb2.IntegerType())
+ )
+ else:
+ result = ir_pb2.Expression(
+ function=ir_pb2.Function(
+ function=ir_pb2.Function.SUBTRACTION,
+ args=[
+ subexpression.function.args[0],
+ result,
+ ]
+ ),
+ type=ir_pb2.ExpressionType(integer=ir_pb2.IntegerType())
+ )
+ else:
+ return None
+ subexpression = subexpression.function.args[index]
+ expression_bounds.compute_constraints_of_expression(result, ir)
+ return subexpression, result
+
+
+def _add_write_method(field, ir):
+ """Adds an appropriate write_method to field, if applicable.
+
+ Currently, the "alias" write_method will be added for virtual fields of the
+ form `let v = some_field_reference` when `some_field_reference` is a physical
+ field or a writeable alias. The "physical" write_method will be added for
+ physical fields. The "transform" write_method will be added when the virtual
+ field's value is an easily-invertible function of a single writeable field.
+ All other fields will have the "read_only" write_method; i.e., they will not
+ be writeable.
+
+ Arguments:
+ field: an ir_pb2.Field to which to add a write_method.
+ ir: The IR in which to look up field_references.
+
+ Returns:
+ None
+ """
+ if field.HasField("write_method"):
+ # Do not recompute anything.
+ return
+
+ if not ir_util.field_is_virtual(field):
+ # If the field is not virtual, writes are physical.
+ field.write_method.physical = True
+ return
+
+ # A virtual field cannot be a direct alias if it has an additional
+ # requirement.
+ requires_attr = ir_util.get_attribute(field.attribute, attributes.REQUIRES)
+ if (field.read_transform.WhichOneof("expression") != "field_reference" or
+ requires_attr is not None):
+ inverse = _invert_expression(field.read_transform, ir)
+ if inverse:
+ field_reference, function_body = inverse
+ referenced_field = ir_util.find_object(
+ field_reference.field_reference.path[-1], ir)
+ if not isinstance(referenced_field, ir_pb2.Field):
+ reference_is_read_only = True
+ else:
+ _add_write_method(referenced_field, ir)
+ reference_is_read_only = referenced_field.write_method.read_only
+ if not reference_is_read_only:
+ field.write_method.transform.destination.CopyFrom(
+ field_reference.field_reference)
+ field.write_method.transform.function_body.CopyFrom(function_body)
+ else:
+ # If the virtual field's expression is invertible, but its target field
+ # is read-only, it is also read-only.
+ field.write_method.read_only = True
+ else:
+ # If the virtual field's expression is not invertible, it is
+ # read-only.
+ field.write_method.read_only = True
+ return
+
+ referenced_field = ir_util.find_object(
+ field.read_transform.field_reference.path[-1], ir)
+ if not isinstance(referenced_field, ir_pb2.Field):
+ # If the virtual field aliases a non-field (i.e., a parameter), it is
+ # read-only.
+ field.write_method.read_only = True
+ return
+
+ _add_write_method(referenced_field, ir)
+ if referenced_field.write_method.read_only:
+ # If the virtual field directly aliases a read-only field, it is read-only.
+ field.write_method.read_only = True
+ return
+
+ # Otherwise, it can be written as a direct alias.
+ field.write_method.alias.CopyFrom(
+ field.read_transform.field_reference)
+
+
+def set_write_methods(ir):
+ """Sets the write_method member of all ir_pb2.Fields in ir.
+
+ Arguments:
+ ir: The IR to which to add write_methods.
+
+ Returns:
+ A list of errors, or an empty list.
+ """
+ traverse_ir.fast_traverse_ir_top_down(ir, [ir_pb2.Field], _add_write_method)
+ return []
diff --git a/front_end/write_inference_test.py b/front_end/write_inference_test.py
new file mode 100644
index 0000000..71cfc53
--- /dev/null
+++ b/front_end/write_inference_test.py
@@ -0,0 +1,216 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for ...emboss.front_end.write_inference."""
+
+import unittest
+from front_end import glue
+from front_end import test_util
+from front_end import write_inference
+from public import ir_pb2
+
+
+class WriteInferenceTest(unittest.TestCase):
+
+ def _make_ir(self, emb_text):
+ ir, unused_debug_info, errors = glue.parse_emboss_file(
+ "m.emb",
+ test_util.dict_file_reader({"m.emb": emb_text}),
+ stop_before_step="set_write_methods")
+ assert not errors, errors
+ return ir
+
+ def test_adds_physical_write_method(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n")
+ self.assertEqual([], write_inference.set_write_methods(ir))
+ self.assertTrue(
+ ir.module[0].type[0].structure.field[0].write_method.physical)
+
+ def test_adds_read_only_write_method_to_non_alias_virtual(self):
+ ir = self._make_ir("struct Foo:\n"
+ " let x = 5\n")
+ self.assertEqual([], write_inference.set_write_methods(ir))
+ self.assertTrue(
+ ir.module[0].type[0].structure.field[0].write_method.read_only)
+
+ def test_adds_alias_write_method_to_alias_of_physical_field(self):
+ ir = self._make_ir("struct Foo:\n"
+ " let x = y\n"
+ " 0 [+1] UInt y\n")
+ self.assertEqual([], write_inference.set_write_methods(ir))
+ field = ir.module[0].type[0].structure.field[0]
+ self.assertTrue(field.write_method.HasField("alias"))
+ self.assertEqual(
+ "y", field.write_method.alias.path[0].canonical_name.object_path[-1])
+
+ def test_adds_alias_write_method_to_alias_of_alias_of_physical_field(self):
+ ir = self._make_ir("struct Foo:\n"
+ " let x = z\n"
+ " let z = y\n"
+ " 0 [+1] UInt y\n")
+ self.assertEqual([], write_inference.set_write_methods(ir))
+ field = ir.module[0].type[0].structure.field[0]
+ self.assertTrue(field.write_method.HasField("alias"))
+ self.assertEqual(
+ "z", field.write_method.alias.path[0].canonical_name.object_path[-1])
+
+ def test_adds_read_only_write_method_to_alias_of_read_only(self):
+ ir = self._make_ir("struct Foo:\n"
+ " let x = y\n"
+ " let y = 5\n")
+ self.assertEqual([], write_inference.set_write_methods(ir))
+ field = ir.module[0].type[0].structure.field[0]
+ self.assertTrue(field.write_method.read_only)
+
+ def test_adds_read_only_write_method_to_alias_of_alias_of_read_only(self):
+ ir = self._make_ir("struct Foo:\n"
+ " let x = z\n"
+ " let z = y\n"
+ " let y = 5\n")
+ self.assertEqual([], write_inference.set_write_methods(ir))
+ field = ir.module[0].type[0].structure.field[0]
+ self.assertTrue(field.write_method.read_only)
+
+ def test_adds_read_only_write_method_to_alias_of_parameter(self):
+ ir = self._make_ir("struct Foo(x: UInt:8):\n"
+ " let y = x\n")
+ self.assertEqual([], write_inference.set_write_methods(ir))
+ field = ir.module[0].type[0].structure.field[0]
+ self.assertTrue(field.write_method.read_only)
+
+ def test_adds_transform_write_method_to_base_value_field(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " let y = x + 50\n")
+ self.assertEqual([], write_inference.set_write_methods(ir))
+ field = ir.module[0].type[0].structure.field[1]
+ transform = field.write_method.transform
+ self.assertTrue(transform)
+ self.assertEqual(
+ "x",
+ transform.destination.path[0].canonical_name.object_path[-1])
+ self.assertEqual(ir_pb2.Function.SUBTRACTION,
+ transform.function_body.function.function)
+ arg0, arg1 = transform.function_body.function.args
+ self.assertEqual("$logical_value",
+ arg0.builtin_reference.canonical_name.object_path[0])
+ self.assertEqual("50", arg1.constant.value)
+
+ def test_adds_transform_write_method_to_negative_base_value_field(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " let y = x - 50\n")
+ self.assertEqual([], write_inference.set_write_methods(ir))
+ field = ir.module[0].type[0].structure.field[1]
+ transform = field.write_method.transform
+ self.assertTrue(transform)
+ self.assertEqual(
+ "x",
+ transform.destination.path[0].canonical_name.object_path[-1])
+ self.assertEqual(ir_pb2.Function.ADDITION,
+ transform.function_body.function.function)
+ arg0, arg1 = transform.function_body.function.args
+ self.assertEqual("$logical_value",
+ arg0.builtin_reference.canonical_name.object_path[0])
+ self.assertEqual("50", arg1.constant.value)
+
+ def test_adds_transform_write_method_to_reversed_base_value_field(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " let y = 50 + x\n")
+ self.assertEqual([], write_inference.set_write_methods(ir))
+ field = ir.module[0].type[0].structure.field[1]
+ transform = field.write_method.transform
+ self.assertTrue(transform)
+ self.assertEqual(
+ "x",
+ transform.destination.path[0].canonical_name.object_path[-1])
+ self.assertEqual(ir_pb2.Function.SUBTRACTION,
+ transform.function_body.function.function)
+ arg0, arg1 = transform.function_body.function.args
+ self.assertEqual("$logical_value",
+ arg0.builtin_reference.canonical_name.object_path[0])
+ self.assertEqual("50", arg1.constant.value)
+
+ def test_adds_transform_write_method_to_reversed_negative_base_value_field(
+ self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " let y = 50 - x\n")
+ self.assertEqual([], write_inference.set_write_methods(ir))
+ field = ir.module[0].type[0].structure.field[1]
+ transform = field.write_method.transform
+ self.assertTrue(transform)
+ self.assertEqual(
+ "x",
+ transform.destination.path[0].canonical_name.object_path[-1])
+ self.assertEqual(ir_pb2.Function.SUBTRACTION,
+ transform.function_body.function.function)
+ arg0, arg1 = transform.function_body.function.args
+ self.assertEqual("50", arg0.constant.value)
+ self.assertEqual("$logical_value",
+ arg1.builtin_reference.canonical_name.object_path[0])
+
+ def test_adds_transform_write_method_to_nested_invertible_field(self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " let y = 30 + (50 - x)\n")
+ self.assertEqual([], write_inference.set_write_methods(ir))
+ field = ir.module[0].type[0].structure.field[1]
+ transform = field.write_method.transform
+ self.assertTrue(transform)
+ self.assertEqual(
+ "x",
+ transform.destination.path[0].canonical_name.object_path[-1])
+ self.assertEqual(ir_pb2.Function.SUBTRACTION,
+ transform.function_body.function.function)
+ arg0, arg1 = transform.function_body.function.args
+ self.assertEqual("50", arg0.constant.value)
+ self.assertEqual(ir_pb2.Function.SUBTRACTION, arg1.function.function)
+ arg10, arg11 = arg1.function.args
+ self.assertEqual("$logical_value",
+ arg10.builtin_reference.canonical_name.object_path[0])
+ self.assertEqual("30", arg11.constant.value)
+
+ def test_does_not_add_transform_write_method_for_parameter_target(self):
+ ir = self._make_ir("struct Foo(x: UInt:8):\n"
+ " let y = 50 + x\n")
+ self.assertEqual([], write_inference.set_write_methods(ir))
+ field = ir.module[0].type[0].structure.field[0]
+ self.assertEqual("read_only", field.write_method.WhichOneof("method"))
+
+ def test_adds_transform_write_method_with_complex_auxiliary_subexpression(
+ self):
+ ir = self._make_ir("struct Foo:\n"
+ " 0 [+1] UInt x\n"
+ " let y = x - $max(Foo.$size_in_bytes, Foo.z)\n"
+ " let z = 500\n")
+ self.assertEqual([], write_inference.set_write_methods(ir))
+ field = ir.module[0].type[0].structure.field[1]
+ transform = field.write_method.transform
+ self.assertTrue(transform)
+ self.assertEqual(
+ "x",
+ transform.destination.path[0].canonical_name.object_path[-1])
+ self.assertEqual(ir_pb2.Function.ADDITION,
+ transform.function_body.function.function)
+ args = transform.function_body.function.args
+ self.assertEqual("$logical_value",
+ args[0].builtin_reference.canonical_name.object_path[0])
+ self.assertEqual(field.read_transform.function.args[1], args[1])
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/g3doc/BUILD b/g3doc/BUILD
new file mode 100644
index 0000000..3ee8e01
--- /dev/null
+++ b/g3doc/BUILD
@@ -0,0 +1,29 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Documentation for Emboss.
+#
+# This BUILD file only exists to export grammar.md for use in a test in
+# emboss/misc.
+
+filegroup(
+ name = "grammar_md",
+ srcs = [
+ "grammar.md",
+ "__init__.py",
+ ],
+ # This should only be needed by docs_are_up_to_date_test, but there is no
+ # way to specify a narrower visibility.
+ visibility = ["//front_end:__pkg__"],
+)
diff --git a/g3doc/BogoNEL_BN-P-6000404_User_Guide.pdf b/g3doc/BogoNEL_BN-P-6000404_User_Guide.pdf
new file mode 100644
index 0000000..0a4f19e
--- /dev/null
+++ b/g3doc/BogoNEL_BN-P-6000404_User_Guide.pdf
Binary files differ
diff --git a/g3doc/__init__.py b/g3doc/__init__.py
new file mode 100644
index 0000000..2c31d84
--- /dev/null
+++ b/g3doc/__init__.py
@@ -0,0 +1,14 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
diff --git a/g3doc/cpp-guide.md b/g3doc/cpp-guide.md
new file mode 100644
index 0000000..2be9a8b
--- /dev/null
+++ b/g3doc/cpp-guide.md
@@ -0,0 +1,314 @@
+# Emboss C++ User Guide
+
+[TOC]
+
+## General Principles
+
+In C++, Emboss generates *view* classes which *do not* take ownership of any
+data. Application code is expected to manage the actual binary data. However,
+Emboss views are extremely cheap to construct (often free when optimizations are
+turned on), so it is expected that applications can pass around pointers to
+binary data and instantiate views as needed.
+
+All of the generated C++ code is in templates, so only code that is actually
+called will be linked into your application.
+
+Unless otherwise noted, all code for a given Emboss module will be generated in
+the namespace given by the module's `[(cpp) namespace]` attribute.
+
+
+### Read-Only vs Read-Write vs C++ `const`
+
+Emboss views can be applied to read-only or read-write storage:
+
+```c++
+void CopyX(const std::vector<char> &src, std::vector<char> *dest) {
+ auto source_view = MakeXView(&src);
+ auto dest_view = MakeXView(dest);
+ dest_view.x().Write(source_view.x().Read());
+}
+```
+
+When applied to read-only storage, methods like `Write()` or
+`UpdateFromTextStream()` won't compile:
+
+```c++
+void WontCompile(const std::vector<char> &data) {
+ auto view = MakeXView(&data);
+ view.x().Write(10); // Won't compile.
+}
+```
+
+This is separate from the C++ `const`ness of the view itself! For example, the
+following will work with no issue:
+
+```c++
+void WillCompileAndRun(std::vector<char> *data) {
+ const auto view = MakeXView(data);
+ view.x().Write(10);
+}
+```
+
+This works because views are like pointers. In C++, you can have any
+combination of `const`/non-`const` pointer to `const`/non-`const` data:
+
+```c++
+char * ncnc; // Pointer is mutable, and points to mutable data.
+const char * ncc; // Pointer is mutable, but points to const data.
+char const * ncc2; // Another way of writing const char *
+char *const cnc; // Pointer is constant, but points to mutable data.
+using char_p = char *;
+const char_p cnc2; // Another way of writing char *const
+const char *const cc; // Pointer is constant, and points to constant data.
+using c_char_p = const char *;
+const c_char_p cc2; // Another way of writing const char *const
+```
+
+The Emboss view equivalents are:
+
+```c++
+GenericMyStructView<ContiguousBuffer<char, ...>> ncnc;
+GenericMyStructView<ContiguousBuffer<const char, ...>> ncc;
+GenericMyStructView<ContiguousBuffer<char const, ...>> ncc2;
+GenericMyStructView<ContiguousBuffer<char, ...>> const cnc;
+const GenericMyStructView<ContiguousBuffer<char, ...>> cnc2;
+GenericMyStructView<ContiguousBuffer<const char, ...>> const cc;
+const GenericMyStructView<ContiguousBuffer<const char, ...>> cc2;
+```
+
+For this reason, `const` methods of views work on `const` *views*, not
+necessarily on `const` data: for example, `UpdateFromTextStream()` is a `const`
+method, because it does not modify the view itself, but it will not work if the
+view points to `const` data. This is analogous to writing through a constant
+pointer, like: `char *const p = &some_char; *p = 'z';`.
+
+Conversely, non-`const` methods, such as `operator=`, still work on views of
+`const` data. This is analogous to `pointer_to_const_char =
+other_pointer_to_const_char`.
+
+
+## Example: Fixed-Size `struct`
+
+Given a simple, fixed-size `struct`:
+
+```
+[(cpp) namespace = "example"]
+
+struct MyStruct:
+ 0 [+4] UInt field_a
+ 4 [+4] Int field_b
+ 8 [+4] Bcd field_c
+```
+
+Emboss will generate code with this public C++ interface:
+
+```c++
+namespace example {
+
+// The view class for the struct. Views are like pointers: they do not own
+// their storage.
+//
+// `Storage` is typically some ::emboss::support::ContiguousBuffer (which uses
+// contiguous memory as backing storage), but you would typically just use
+// `auto`:
+//
+// auto view = MakeMyStructView(&container);
+//
+// If you need to make a view of some non-RAM backing storage (e.g., a register
+// file on a remote device, accessed via SPI), you can provide your own Storage.
+template <class Storage>
+class GenericMyStructView final {
+ public:
+ // Typically, you do not need to explicitly call any of the constructors.
+
+ // The default constructor gives you a "null" view: you cannot read or write
+ // through the view, Ok() and IsComplete() return false, and so on.
+ GenericMyStructView();
+
+ // A non-"null" view must be constructed with an appropriate Storage.
+ explicit GenericMyStructView(Storage bytes);
+
+ // Views can be copy-constructed and assigned from views of "compatible"
+ // Storage. For ContiguousBuffer, that means ContiguousBuffer over any of the
+ // char types -- char, unsigned char, and signed char. std::uint8_t and
+ // std::int8_t are typically aliases of char types, but are not required to
+ // be by the C++ standard.
+ template <typename OtherStorage>
+ GenericMyStructView(const GenericMyStructView<OtherStorage> &other);
+
+ template <typename OtherStorage>
+ GenericMyStructView<Storage> &operator=(
+ const GenericMyStructView<OtherStorage> &other);
+
+
+ // Ok() returns true if the Storage is big enough for the struct (for
+ // MyStruct, at least 12 bytes), and all fields are Ok(). For this struct,
+ // the Int and UInt fields are always Ok(), and the Bcd field is Ok() if none
+ // of its nibbles has a value greater than 9.
+ bool Ok() const;
+
+ // IsComplete() returns true if the Storage is big enough for the struct.
+ // This is most useful when you are reading bytes from some stream: you can
+ // read until IsComplete() is true, and then use IntrinsicSizeInBytes() to
+ // find out how many bytes are actually used by the struct, and Ok() to find
+ // out if the bytes are correct.
+ //
+ // An alternate way of thinking about it is: Ok() tells you if you can read a
+ // structure; IsComplete() tells you if you can write to it.
+ bool IsComplete() const;
+
+
+ // The Equals() and UncheckedEquals() methods check if two structs are
+ // *logically* equal. Equals() performs Ok() and bounds checks,
+ // UncheckedEquals() does not: UncheckedEquals() is useful when you need
+ // maximum performance, and can guarantee that your structures are Ok()
+ // before calling UncheckedEquals().
+ template <typename OtherStorage>
+ bool Equals(GenericMyStructView<OtherStorage> other) const;
+ template <typename OtherStorage>
+ bool UncheckedEquals(GenericMyStructView<OtherStorage> other) const;
+
+ // CopyFrom() and UncheckedCopyFrom() copy the bytes of the source structure
+ // directly from its Storage. CopyFrom() performs bounds checks to ensure
+ // that there are enough bytes available in the source; UncheckedCopyFrom()
+ // does not. With ContiguousBuffer storage, these should have essentially
+ // identical performance to memcpy().
+ template <typename OtherStorage>
+ void CopyFrom(GenericMyStructView<OtherStorage> other) const;
+ template <typename OtherStorage>
+ void UncheckedCopyFrom(GenericMyStructView<OtherStorage> other) const;
+
+
+ // UpdateFromTextStream() attempts to update the structure from text format.
+ // The Stream class provides a simple interface for getting and ungetting
+ // characters; typically, you would use ::emboss::UpdateFromText(view,
+ // some_string) instead of calling this yourself.
+ template <class Stream>
+ bool UpdateFromTextStream(Stream *stream) const;
+
+ // WriteToTextStream() writes a textual representation of the structure to the
+ // provided stream. Typically, you would use ::emboss::WriteToString(view)
+ // instead.
+ template <class Stream>
+ void WriteToTextStream(Stream *stream,
+ ::emboss::TextOutputOptions options) const;
+
+
+ // Each field in the struct will have a method to get its corresponding view.
+ //
+ // The exact types of the returned views are not contractual.
+ ::emboss::prelude::UIntView<...> field_a() const;
+ ::emboss::prelude::IntView<...> field_b() const;
+ ::emboss::prelude::BcdView<...> field_c() const;
+
+
+ // The built-in virtual fields also have methods to get their views:
+ // $size_in_bytes has IntrinsicSizeInBytes(), $max_size_in_bytes has
+ // MaxSizeInBytes(), and $min_size_in_bytes has MinSizeInBytes().
+ //
+ // Because $min_size_in_bytes and $max_size_in_bytes are always constant,
+ // their corresponding field methods are always static constexpr. Because
+ // $size_in_bytes is also constant for MyStruct, IntrinsicSizeInBytes() will
+ // also be static constexpr for GenericMyStructView.
+ //
+ // For any virtual field, you can use its Ok() method to find out if you can
+ // Read() its value:
+ //
+ // if (view.IntrinsicSizeInBytes().Ok()) {
+ // // The size of the struct is known.
+ // DoSomethingWithNBytes(view.IntrinsicSizeInBytes().Read());
+ // }
+ //
+ // For constant values, Ok() will always return true.
+ //
+ // For MyStruct, my_struct_view.IntrinsicSizeInBytes().Read(),
+ // my_struct_view.MinSizeInBytes().Read(), and
+ // my_struct_view.MaxSizeInBytes().Read() will all return 12.
+ //
+ // For constexpr fields, you can also get their values from functions in the
+ // structure's namespace, which also lets you skip the Read():
+ //
+ // MyStruct::IntrinsicSizeInBytes()
+ // MyStruct::MaxSizeInBytes()
+ // MyStruct::MinSizeInBytes()
+ static constexpr IntrinsicSizeInBytesView IntrinsicSizeInBytes();
+ static constexpr MinSizeInBytesView MinSizeInBytes();
+ static constexpr MaxSizeInBytesView MaxSizeInBytes();
+
+ // The IntrinsicSizeInBytes() method returns the view of the $size_in_bytes
+ // virtual field. Because $size_in_bytes is constant, this is a static
+ // constexpr method.
+ //
+ // Typically, you would use IntrinsicSizeInBytes().Ok() and
+ // IntrinsicSizeInBytes().Read():
+ //
+ // if (view.IntrinsicSizeInBytes().Ok()) {
+ // // The size of the struct is known.
+ // DoSomethingWithNBytes(view.IntrinsicSizeInBytes().Read());
+ // }
+ //
+ // Because MyStruct is always 12 bytes,
+ // GenericMyStructView::IntrinsicSizeInBytes().Ok() will always be true.
+ static constexpr UIntView<...> IntrinsicSizeInBytes();
+
+ // If you need to get at the raw bytes underneath the view, you can get the
+ // view's Storage.
+ Storage BackingStorage() const;
+};
+
+
+// An overload of MakeMyStructView is provided which accepts a pointer to a
+// container type: this generally works with STL and STL-like containers of
+// chars, that have size() and data() methods. This is known to work with
+// std::vector<char>, std::array<char>, std::string, absl::string_view and
+// std::string_view, and some others. Note that you need to call this with a
+// pointer to the container:
+//
+// auto view = MakeMyStructView(&container);
+//
+// IMPORTANT: this does *not* keep a reference to the actual container, so if
+// you call a container method that invalidates data() (such as
+// std::vector<>::reserve()), you will have to make a new view.
+template <typename Container>
+inline GenericMyStructView<...> MakeMyStructView(Container *arg);
+
+// Alternately, a "C-style" overload is provided, if you just have a pointer and
+// length:
+template <typename CharType>
+inline GenericMyStructView<...> MakeMyStructView(CharType *buffer,
+ std::size_t length);
+
+
+// In addition to the View class, a namespace will be generated with the
+// compile-time constant elements of the class. This is a convenience, so that
+// you can write something like:
+//
+// std::array<char, MyStruct::IntrinsicSizeInBytes()>
+//
+// instead of:
+//
+// std::array<char, GenericMyStructView<ContiguousBuffer<
+// char>>::IntrinsicSizeInBytes().Read()>
+namespace MyStruct {
+
+// Because MyStruct only has some constant virtual fields, the namespace
+// MyStruct only contains a few corresponding functions. Note that the
+// functions here return values, not views:
+inline constexpr unsigned int IntrinsicSizeInBytes();
+inline constexpr unsigned int MaxSizeInBytes();
+inline constexpr unsigned int MinSizeInBytes();
+
+} // namespace MyStruct
+} // namespace example
+```
+
+
+## TODO(bolms): Example: Variable-Size `struct`
+
+
+## TODO(bolms): Example: `enum`
+
+
+## TODO(bolms): Example: `bits`
+
+
diff --git a/g3doc/cpp-reference.md b/g3doc/cpp-reference.md
new file mode 100644
index 0000000..98c70b8
--- /dev/null
+++ b/g3doc/cpp-reference.md
@@ -0,0 +1,1973 @@
+# Emboss C++ Generated Code Reference
+
+[TOC]
+
+## `struct`s
+
+A `struct` will have a corresponding view class, and functions to create views.
+
+### `Make`*`Struct`*`View` free function
+
+```c++
+template <typename T>
+auto MakeStructView(/* view parameters, */ T *data, size_t size);
+```
+
+```c++
+template <typename T>
+auto MakeStructView(/* view parameters, */ T *container);
+```
+
+*`Struct`* will be replaced by the name of the specific `struct` whose view
+will be constructed; for example, to make a view for `struct Message`, call the
+`MakeMessageView` function.
+
+*View parameters* will be replaced by one argument for each parameter attached
+to the `struct`. E.g., for:
+
+```
+struct Foo(x: UInt:8):
+ --
+```
+
+`MakeFooView` will be:
+
+```c++
+template <typename T>
+auto MakeFooView(std::uint8_t x, T *data, size_t size);
+```
+
+```c++
+template <typename T>
+auto MakeFooView(std::uint8_t x, T *container);
+```
+
+And for:
+
+```
+struct Bar(x: UInt:8, y: Int:32):
+ --
+```
+
+`MakeBarView` will be:
+
+```c++
+template <typename T>
+auto MakeBarView(std::uint8_t x, std::int32_t y, T *data, size_t size);
+```
+
+```c++
+template <typename T>
+auto MakeBarView(std::uint8_t x, std::int32_t y, T *container);
+```
+
+The `Make`*`Struct`*`View` functions construct a view for *`Struct`* over the
+given bytes. For the data/size form, the type `T` must be a character type:
+`char`, `const char`, `unsigned char`, `const unsigned char`, `signed char`, or
+`const signed char`. For the container form, the container can be a
+`std::vector`, `std::array`, or `std::basic_string` of a character type, or any
+other type with a `data()` method that returns a possibly-`const` `char *`,
+`signed char *`, or `unsigned char *`, and a `size()` method that returns a size
+in bytes. Google's `absl::string_view` is one example of such a type.
+
+If given a pointer to a `const` character type or a pointer to a `const`
+container, `Make`*`Struct`*`View` will return a read-only view; otherwise
+it will return a read-write view.
+
+The result of `Make`*`Struct`*`View` should be stored in an `auto` variable:
+
+```c++
+auto view = MakeFooView(byte_buffer, available_byte_count);
+```
+
+The specific type returned by `Make`*`Struct`*`View` is subject to change.
+
+
+### `CopyFrom` method
+
+```c++
+template <typename OtherStorage>
+void CopyFrom(GenericStructView<OtherStorage> other) const;
+```
+
+The `CopyFrom` method copies data from the view `other` into the current view.
+When complete, the current view's backing storage will contain the same bytes
+as `other`. This works even if the two views' backing storage overlaps, in
+which case `other`'s backing storage is modified by the operation.
+
+### `UncheckedCopyFrom` method
+
+```c++
+template <typename OtherStorage>
+void UncheckedCopyFrom(GenericStructView<OtherStorage> other) const;
+```
+
+The `UncheckedCopyFrom` method performs the same operation as `CopyFrom` but
+without any checks on the integrity of or the compatibility of the two views.
+
+### `TryToCopyFrom` method
+
+```c++
+template <typename OtherStorage>
+bool TryToCopyFrom(GenericStructView<OtherStorage> other) const;
+```
+
+`TryToCopyFrom` copies data from `other` into the current view, if `other` is
+`Ok()` and the current backing storage is large enough to hold `other`'s data.
+
+### `Equals` method
+
+```c++
+template <typename OtherStorage>
+bool Equals(GenericStructView<OtherStorage> other);
+```
+
+The `Equals` method returns `true` if and only if the current view and `other`
+contain the same fields yielding equivalent values (as measured by the `==`
+operator). `Equals()` should only be called if `Ok()` is true on both views.
+
+### `UncheckedEquals` method
+
+```c++
+template <typename OtherStorage>
+bool UncheckedEquals(GenericStructView<OtherStorage> other);
+```
+
+The `UncheckedEquals` method performs the same operation as `Equals`, but
+without any checks on the integrity of or the compatibility of the two views
+when reading values. `UncheckedEquals()` should only be called if `Ok()` is
+true on both views.
+
+### `Ok` method
+
+```c++
+bool Ok() const;
+```
+
+The `Ok` method returns `true` if and only if there are enough bytes in the
+backing store, and the `Ok` methods of all active fields return `true`.
+
+
+### `IsComplete` method
+
+```c++
+bool IsComplete() const;
+```
+
+The `IsComplete` method returns `true` if there are enough bytes in the backing
+store to fully contain the `struct`. If `IsComplete()` returns `true` but
+`Ok()` returns `false`, then the structure is broken in some way that cannot be
+fixed by adding more bytes.
+
+
+### `IntrinsicSizeInBytes` method
+
+```c++
+auto IntrinsicSizeInBytes() const;
+```
+
+or
+
+```c++
+static constexpr auto IntrinsicSizeInBytes();
+```
+
+The `IntrinsicSizeInBytes` method is the [field method](#struct-field-methods)
+for [`$size_in_bytes`](language-reference.md#size-in-bytes). The `Read` method
+of the result returns the size of the `struct`, and the `Ok` method returns
+`true` if the `struct`'s intrinsic size is known; i.e.:
+
+```c++
+if (view.IntrinsicSizeInBytes().Ok()) {
+ // The exact return type of view.IntrinsicSizeInBytes().Read() may vary, but
+ // it will always be implicitly convertible to std::uint64_t.
+ std::uint64_t view_size = view.IntrinsicSizeInBytes().Read();
+}
+```
+
+Alternately, if you are sure the size is known:
+
+```c++
+std::uint64_t view_size = view.IntrinsicSizeInBytes().UncheckedRead();
+```
+
+Or, if the size is a compile-time constant:
+
+```c++
+constexpr std::uint64_t view_size = StructView::IntrinsicSizeInBytes().Read();
+constexpr std::uint64_t view_size2 = Struct::IntrinsicSizeInBytes();
+```
+
+
+### `MaxSizeInBytes` method
+
+```c++
+auto MaxSizeInBytes() const;
+```
+
+or
+
+```c++
+static constexpr auto MaxSizeInBytes();
+```
+
+The `MaxSizeInBytes` method is the [field method](#struct-field-methods)
+for [`$max_size_in_bytes`](language-reference.md#max-size-in-bytes). The `Read`
+method of the result returns the maximum size of the `struct`, and the `Ok`
+method always returns `true`.
+
+```c++
+assert(view.MaxSizeInBytes().Ok());
+// The exact return type of view.MaxSizeInBytes().Read() may vary, but it will
+// always be implicitly convertible to std::uint64_t.
+std::uint64_t view_size = view.MaxSizeInBytes().Read();
+```
+
+Alternately:
+
+```c++
+std::uint64_t view_size = view.MaxSizeInBytes().UncheckedRead();
+```
+
+Or:
+
+```c++
+constexpr std::uint64_t view_size = StructView::MaxSizeInBytes().Read();
+constexpr std::uint64_t view_size2 = Struct::MaxSizeInBytes();
+```
+
+
+### `MinSizeInBytes` method
+
+```c++
+auto MinSizeInBytes() const;
+```
+
+or
+
+```c++
+static constexpr auto MinSizeInBytes();
+```
+
+The `MinSizeInBytes` method is the [field method](#struct-field-methods)
+for [`$min_size_in_bytes`](language-reference.md#min-size-in-bytes). The `Read`
+method of the result returns the minimum size of the `struct`, and the `Ok`
+method always returns `true`.
+
+```c++
+assert(view.MinSizeInBytes().Ok());
+// The exact return type of view.MinSizeInBytes().Read() may vary, but it will
+// always be implicitly convertible to std::uint64_t.
+std::uint64_t view_size = view.MinSizeInBytes().Read();
+```
+
+Alternately:
+
+```c++
+std::uint64_t view_size = view.MinSizeInBytes().UncheckedRead();
+```
+
+Or:
+
+```c++
+constexpr std::uint64_t view_size = StructView::MinSizeInBytes().Read();
+constexpr std::uint64_t view_size2 = Struct::MinSizeInBytes();
+```
+
+
+### `SizeIsKnown` method
+
+```c++
+bool SizeIsKnown() const;
+```
+
+or
+
+```c++
+static constexpr bool SizeIsKnown();
+```
+
+The `SizeIsKnown` method is an alias of `IntrinsicSizeInBytes().Ok()`.
+
+The `SizeIsKnown` method returns `true` if the size of the `struct` can be
+determined from the bytes that are available. For example, consider a `struct`
+like:
+
+```
+struct Message:
+ 0 [+4] UInt payload_length (pl)
+ 4 [+pl] UInt:8[pl] payload
+```
+
+The `Message`'s view's `SizeIsKnown` method will return `true` if at least four
+bytes are available in the backing store, because it can determine the actual
+size of the message if at least four bytes can be read. If the backing store
+contains three or fewer bytes, then `SizeIsKnown` will be false.
+
+Note that if the `struct` contains no dynamically-sized or dynamically-located
+fields, then `SizeIsKnown` will be a `static constexpr` method that always
+returns `true`.
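+
+The `Message` reasoning above can be modeled by hand. The sketch below is an
+illustration only, not the generated code: the function names are hypothetical,
+and it assumes `payload_length` is stored little-endian, which the `.emb`
+snippet does not state.
+
+```c++
+#include <cstddef>
+#include <cstdint>
+
+// Hand-written model of Message; assumes little-endian payload_length.
+inline bool MessageSizeIsKnown(std::size_t available) {
+  return available >= 4;  // payload_length occupies bytes 0 through 3.
+}
+
+// Precondition: at least four bytes are readable at `data`.
+inline std::uint64_t MessageSizeInBytes(const unsigned char *data) {
+  const std::uint64_t payload_length =
+      static_cast<std::uint64_t>(data[0]) |
+      (static_cast<std::uint64_t>(data[1]) << 8) |
+      (static_cast<std::uint64_t>(data[2]) << 16) |
+      (static_cast<std::uint64_t>(data[3]) << 24);
+  return 4 + payload_length;  // Header plus payload.
+}
+
+// Complete only if the size is known and all of those bytes are available.
+inline bool MessageIsComplete(const unsigned char *data,
+                              std::size_t available) {
+  return MessageSizeIsKnown(available) &&
+         available >= MessageSizeInBytes(data);
+}
+```
+
+With a two-byte payload, the model reports an unknown size for three available
+bytes, a known size of six for four or more, and completeness only once all six
+bytes are present.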
+
+
+### `SizeInBytes` method
+
+```c++
+std::size_t SizeInBytes() const;
+```
+
+or
+
+```c++
+static constexpr std::size_t SizeInBytes();
+```
+
+The `SizeInBytes` method returns
+`static_cast<std::size_t>(IntrinsicSizeInBytes().Read())`.
+
+The `SizeInBytes` method returns the size of the `struct` in bytes.
+`SizeInBytes` asserts that `SizeIsKnown()`, so applications should ensure that
+`SizeIsKnown()` before calling `SizeInBytes`.
+
+If the `struct` contains no dynamically-sized or dynamically-located fields,
+then `SizeInBytes` will be a `static constexpr` method, and can always be called
+safely.
+
+
+### `UpdateFromTextStream` method
+
+```c++
+template <class Stream>
+bool UpdateFromTextStream(Stream *stream) const;
+```
+
+`UpdateFromTextStream` will read a text-format representation of the structure
+from the given `stream` and update fields. Generally, applications would not
+call this directly; instead, use the global `UpdateFromText` method, which
+handles setting up a stream from a `std::string`.
+
+### `WriteToTextStream` method
+
+```c++
+template <class Stream>
+bool WriteToTextStream(Stream *stream, const TextOutputOptions &options) const;
+```
+
+`WriteToTextStream` will write a text representation of the current value in a
+form that can be decoded by `UpdateFromTextStream`. Generally, applications
+would not call this directly; instead, use the global `WriteToString` method,
+which handles setting up the stream and returning the resulting string.
+
+### `BackingStorage` method
+
+```c++
+Storage BackingStorage() const;
+```
+
+Returns the backing storage for the view. The return type of `BackingStorage()`
+is a template parameter on the view.
+
+
+### Field methods {#struct-field-methods}
+
+Each physical field and virtual field in the `struct` will have a corresponding
+method in the generated view for that `struct`, which returns a subview of that
+field. For example, take the `struct` definition:
+
+```
+struct Foo:
+ 0 [+4] UInt bar
+ 4 [+4] Int baz
+ let qux = 2 * bar
+ let bar_alias = bar
+```
+
+In this case, the generated code will have methods
+
+```c++
+auto bar() const;
+auto baz() const;
+auto qux() const;
+auto bar_alias() const;
+```
+
+The `bar` method will return a `UInt` view, and `baz()` will return an `Int`
+view. The `qux` method will return a pseudo-`UInt` view which can only be read.
+The `bar_alias` method actually forwards to `bar`, and can be both read and
+written:
+
+```c++
+auto foo_view = MakeFooView(&vector_of_foo_bytes);
+uint32_t bar_value = foo_view.bar().Read();
+int32_t baz_value = foo_view.baz().Read();
+int64_t qux_value = foo_view.qux().Read();
+uint32_t bar_alias_value = foo_view.bar_alias().Read();
+foo_view.bar_alias().Write(100);
+assert(foo_view.bar().Read() == 100);
+```
+
+As with `Make`*`Struct`*`View`, the exact return type of field methods is
+subject to change; if a field's view must be stored, use an `auto` variable.
+
+Fields in anonymous `bits` are treated as if they were fields of the enclosing
+`struct` in the generated code. Take this `struct`:
+
+```
+struct Foo:
+ 0 [+4] bits:
+ 5 [+5] UInt bar
+```
+
+In C++, `bar` would be read like so:
+
+```c++
+auto foo_view = MakeFooView(&vector_of_foo_bytes);
+uint8_t bar_value = foo_view.bar().Read();
+```
+
+For each field, there is a `has_`*`field`*`()` method, which returns an object
+with `Known()`, `Value()`, and `ValueOr()` methods. `has_` methods are
+typically used for conditional fields. Suppose you have a `struct` like:
+
+```
+struct Foo:
+ 0 [+1] enum message_type:
+ BAR = 1
+ if message_type == MessageType.BAR:
+ 1 [+25] Bar bar
+```
+
+When you have a view of a `Foo`, you can call `foo_view.has_bar().Known()` to
+find out whether `foo_view` has enough information to determine if the field
+`bar` should exist. If `.Known()` returns `true`, you may call
+`foo_view.has_bar().Value()` to find out if `bar` should exist. You can also
+call `foo_view.has_bar().ValueOr(false)`, which will return `true` only if
+`bar`'s status is known and `bar` exists.
+
+Every field will have a corresponding `has_` method. In the example above,
+`foo_view.has_message_type().Known()` and `foo_view.has_message_type().Value()`
+are both supported calls; both will always return `true`.
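+
+The relationship between `Known()`, `Value()`, and `ValueOr()` can be pictured
+with a small stand-in class. `MaybeBool` below is a hypothetical illustration
+written for this document, not the type that `has_` methods actually return,
+and it omits the assertion that the real `Value()` performs when the result is
+not known.
+
+```c++
+// Hypothetical stand-in for the object returned by has_field().
+class MaybeBool {
+ public:
+  // An unknown result: the existence condition could not be evaluated.
+  constexpr MaybeBool() : known_(false), value_(false) {}
+  // A known result.
+  constexpr explicit MaybeBool(bool value) : known_(true), value_(value) {}
+
+  constexpr bool Known() const { return known_; }
+  // Only meaningful when Known() is true; this sketch skips the assertion.
+  constexpr bool Value() const { return value_; }
+  // Returns the value if it is known, and `fallback` otherwise.
+  constexpr bool ValueOr(bool fallback) const {
+    return known_ ? value_ : fallback;
+  }
+
+ private:
+  bool known_;
+  bool value_;
+};
+```
+
+Under this model, `ValueOr(false)` is `true` only for a known `true`, which is
+why it is a convenient single check for "the field definitely exists."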
+
+Note that just because a field "exists," that does not mean that it can be read
+from or written to the current message: the field's bytes might be missing, or
+present but contain a non-`Ok()` value. You can use `view.field().Ok()` to
+determine if the field can be *read*, and `view.field().IsComplete()` to
+determine if the field can be *written*.
+
+
+### Constant Virtual Fields
+
+Virtual fields whose values are compile-time constants can be read without
+instantiating a view:
+
+```
+struct Foo:
+ let register_number = 0xf8
+ 0 [+4] UInt foo
+```
+
+```c++
+// Foo::register_number() is a constexpr function.
+static_assert(Foo::register_number() == 0xf8,
+              "register_number should be 0xf8");
+```
+
+
+### *`field`*`().Ok()` vs *`field`*`().IsComplete()` vs `has_`*`field`*`()`
+
+Emboss provides a number of methods to query different kinds of validity.
+
+`has_`*`field`*`()` is used for checking the existence condition specified in
+the `.emb` file:
+
+```
+struct Foo:
+ 0 [+1] UInt x
+ if x < 10:
+ 1 [+1] UInt y
+```
+
+In the .cc file:
+
+```c++
+::std::array<char, 2> bytes = { 5, 7 };
+auto foo = MakeFooView(&bytes);
+assert(foo.x().Read() == 5);
+
+// foo.x() is readable, so the existence condition on y is known.
+assert(foo.has_y().Known());
+
+// foo.x().Read() < 10, so y exists in foo.
+assert(foo.has_y().Value());
+
+foo.x().Write(15);
+
+// foo.x().Read() >= 10, so y no longer exists in foo.
+assert(foo.has_y().Known());
+assert(!foo.has_y().Value());
+
+// foo.has_x() is always true, since x's existence condition is just "true".
+assert(foo.has_x().Known());
+assert(foo.has_x().Value());
+
+// incomplete_foo has 0 bytes of backing storage, so x is unreadable.
+auto incomplete_foo = MakeFooView(&bytes[0], 0);
+
+// incomplete_foo.has_x() is known, since it does not depend on anything.
+assert(incomplete_foo.has_x().Known());
+assert(incomplete_foo.has_x().Value());
+
+// incomplete_foo.x().Ok() is false, since x cannot be read.
+assert(!incomplete_foo.x().Ok());
+
+// Since x cannot be read, incomplete_foo.has_y().Known() is false.
+assert(!incomplete_foo.has_y().Known());
+
+// Since has_y() is not Known(), calling has_y().Value() will crash if Emboss
+// assertions are enabled.
+// incomplete_foo.has_y().Value() // Would crash
+
+// It is safe to call has_y().ValueOr(false).
+assert(!incomplete_foo.has_y().ValueOr(false));
+```
+
+`has_`*`field`*`()` is notional: it queries whether *`field`* *should* be
+present in the view. Even if `has_`*`field`*`().Value()` is `true`,
+*`field`*`().IsComplete()` and *`field`*`().Ok()` might return `false`.
+
+*`field`*`().IsComplete()` tests if there are enough bytes in the backing
+storage to hold *`field`*. If *`field`*`().IsComplete()`, it is safe to call
+`Write()` on the field with a valid value for that field. *`field`*`().Ok()`
+tests if there are enough bytes in the backing storage to hold *`field`*, *and*
+that those bytes contain a valid value for *`field`*:
+
+```
+struct Bar:
+ 0 [+1] Bcd x
+ 1 [+1] Bcd y
+```
+
+```c++
+::std::array<unsigned char, 1> bytes = { 0xbb };  // Not a valid BCD number.
+auto bar = MakeBarView(&bytes);
+
+// There are enough bytes to read and write x.
+assert(bar.x().IsComplete());
+
+// The value in x is not correct.
+assert(!bar.x().Ok());
+
+// Read() would crash if assertions are enabled.
+// bar.x().Read();
+
+// Writing a valid value is safe.
+bar.x().Write(99);
+assert(bar.x().Ok());
+
+// Notionally, bar should have y, even though y's byte is not available:
+assert(bar.has_y().Value());
+
+// Since there is no byte to read y from, y is not complete:
+assert(!bar.y().IsComplete());
+```
+
+Note that all views have `Ok()` and `IsComplete()` methods. A view of a
+structure is `Ok()` if all of its fields are either `Ok()` or not present, and
+`has_`*`field`*`().Known()` is `true` for all fields.
+
+A structure view `IsComplete()` if `SizeIsKnown()` is true and its backing
+storage contains at least `SizeInBytes()` bytes (or, for a `bits`, at least
+`SizeInBits()` bits). In other words: `IsComplete()` is true if Emboss can
+determine that (just) adding more bytes to the view's backing storage won't
+help. Note that just because `IsComplete()` is false, that does not mean that
+adding more bytes *will* help.
+It is possible to define incoherent structures that will confuse Emboss, such
+as:
+
+```
+struct SizeNeverKnown:
+ if false:
+ 0 [+1] UInt x_loc
+ x_loc [+1] UInt x
+```
+
+<!-- TODO(bolms): Rename "existence condition" to "presence condition." -->
+
+
+## `bits` Views
+
+The code generated for a `bits` construct is very similar to the code generated
+for a `struct`. The primary differences are that there is no
+`Make`*`Bits`*`View` function and that `SizeInBytes` is replaced by
+`SizeInBits`.
+
+
+### `Ok` method
+
+```c++
+bool Ok() const;
+```
+
+The `Ok` method returns `true` if and only if there are enough bytes in the
+backing store, and the `Ok` methods of all active fields return `true`.
+
+
+### `IsComplete` method
+
+```c++
+bool IsComplete() const;
+```
+
+The `IsComplete` method returns `true` if there are enough bytes in the backing
+store to fully contain the `bits`. If `IsComplete()` returns `true` but
+`Ok()` returns `false`, then the structure is broken in some way that cannot be
+fixed by adding more bytes.
+
+
+### `IntrinsicSizeInBits` method
+
+```c++
+auto IntrinsicSizeInBits() const;
+```
+
+or
+
+```c++
+static constexpr auto IntrinsicSizeInBits();
+```
+
+The `IntrinsicSizeInBits` method is the [field method](#bits-field-methods) for
+[`$size_in_bits`](language-reference.md#size-in-bits). The `Read` method of
+the result returns the size of the `bits`, and the `Ok` method returns `true`
+if the `bits`'s intrinsic size is known; i.e.:
+
+```c++
+if (view.IntrinsicSizeInBits().Ok()) {
+ std::uint64_t view_size = view.IntrinsicSizeInBits().Read();
+}
+```
+
+Since the intrinsic size of a `bits` is always a compile-time constant:
+
+```c++
+constexpr std::uint64_t view_size = BitsView::IntrinsicSizeInBits().Read();
+constexpr std::uint64_t view_size2 = Bits::IntrinsicSizeInBits();
+```
+
+
+### `MaxSizeInBits` method
+
+```c++
+auto MaxSizeInBits() const;
+```
+
+or
+
+```c++
+static constexpr auto MaxSizeInBits();
+```
+
+The `MaxSizeInBits` method is the [field method](#bits-field-methods)
+for [`$max_size_in_bits`](language-reference.md#max-size-in-bits). The `Read`
+method of the result returns the maximum size of the `bits`, and the `Ok`
+method always returns `true`.
+
+```c++
+assert(view.MaxSizeInBits().Ok());
+// The exact return type of view.MaxSizeInBits().Read() may vary, but it will
+// always be implicitly convertible to std::uint64_t.
+std::uint64_t view_size = view.MaxSizeInBits().Read();
+```
+
+Alternately:
+
+```c++
+std::uint64_t view_size = view.MaxSizeInBits().UncheckedRead();
+```
+
+Or:
+
+```c++
+constexpr std::uint64_t view_size = BitsView::MaxSizeInBits().Read();
+constexpr std::uint64_t view_size2 = Bits::MaxSizeInBits();
+```
+
+
+### `MinSizeInBits` method
+
+```c++
+auto MinSizeInBits() const;
+```
+
+or
+
+```c++
+static constexpr auto MinSizeInBits();
+```
+
+The `MinSizeInBits` method is the [field method](#bits-field-methods)
+for [`$min_size_in_bits`](language-reference.md#min-size-in-bits). The `Read`
+method of the result returns the minimum size of the `bits`, and the `Ok`
+method always returns `true`.
+
+```c++
+assert(view.MinSizeInBits().Ok());
+// The exact return type of view.MinSizeInBits().Read() may vary, but it will
+// always be implicitly convertible to std::uint64_t.
+std::uint64_t view_size = view.MinSizeInBits().Read();
+```
+
+Alternately:
+
+```c++
+std::uint64_t view_size = view.MinSizeInBits().UncheckedRead();
+```
+
+Or:
+
+```c++
+constexpr std::uint64_t view_size = BitsView::MinSizeInBits().Read();
+constexpr std::uint64_t view_size2 = Bits::MinSizeInBits();
+```
+
+
+### `SizeIsKnown` method
+
+```c++
+static constexpr bool SizeIsKnown();
+```
+
+For a `bits` construct, `SizeIsKnown()` always returns `true`, because the size
+of a `bits` construct is always statically known at compilation time.
+
+
+### `SizeInBits` method
+
+```c++
+static constexpr std::size_t SizeInBits();
+```
+
+The `SizeInBits` method returns the size of the `bits` in bits. It is
+equivalent to `static_cast<std::size_t>(IntrinsicSizeInBits().Read())`.
+
+
+### `UpdateFromTextStream` method
+
+```c++
+template <class Stream>
+bool UpdateFromTextStream(Stream *stream) const;
+```
+
+`UpdateFromTextStream` will read a text-format representation of the structure
+from the given `stream` and update fields. Generally, applications would not
+call this directly; instead, use the global `UpdateFromText` method, which
+handles setting up a stream from a `std::string`.
+
+### `WriteToTextStream` method
+
+```c++
+template <class Stream>
+bool WriteToTextStream(Stream *stream, const TextOutputOptions &options) const;
+```
+
+`WriteToTextStream` will write a text representation of the current value in a
+form that can be decoded by `UpdateFromTextStream`. Generally, applications
+would not call this directly; instead, use the global `WriteToString` method,
+which handles setting up the stream and returning the resulting string.
+
+### Field methods {#bits-field-methods}
+
+As with `struct`, each field in a `bits` will have a corresponding method of the
+same name generated, and each such method will return a view of the given field.
+Take the module:
+
+```
+bits Bar:
+ 0 [+12] UInt baz
+ 31 [+1] Flag qux
+ let two_baz = baz * 2
+
+struct Foo:
+ 0 [+4] Bar bar
+```
+
+In this case, the generated code in the `Bar` view will have methods
+
+```c++
+auto baz() const;
+auto qux() const;
+auto two_baz() const;
+```
+
+The `baz` method will return a `UInt` view, and `qux()` will return a `Flag`
+view:
+
+```c++
+auto foo_view = MakeFooView(&vector_of_foo_bytes);
+uint16_t baz_value = foo_view.bar().baz().Read();
+bool qux_value = foo_view.bar().qux().Read();
+uint32_t two_baz_value = foo_view.bar().two_baz().Read();
+```
+
+The exact return type of field methods is subject to change; if a field's view
+must be stored, use an `auto` variable.
+
+
+## `enum`s
+
+For each `enum` in an `.emb`, the Emboss compiler will generate a corresponding
+C++11-style `enum class`. Take the following Emboss `enum`:
+
+```
+enum Foo:
+ BAR = 1
+ BAZ = 1000
+```
+
+Emboss will generate something equivalent to the following C++:
+
+```c++
+enum class Foo : uint64_t {
+ BAR = 1,
+ BAZ = 1000,
+};
+```
+
+Additionally, like other Emboss entities, `enum`s have corresponding view
+classes.
+
+
+### `TryToGetEnumFromName` free function
+
+```c++
+static inline bool TryToGetEnumFromName(const char *name, EnumType *result);
+```
+
+The `TryToGetEnumFromName` function will try to match `name` against the names
+in the Emboss `enum` definition. If it finds an exact match, it will return
+`true` and update `result` with the corresponding enum value. If it does not
+find a match, it will return `false` and leave `result` unchanged.
+
+Note that `TryToGetEnumFromName` will not match the text of the numeric value of
+an enum; given the `Foo` enum above, `TryToGetEnumFromName("1000", &my_foo)`
+would return `false`.
+
+
+### `TryToGetNameFromEnum` free function
+
+```c++
+static inline const char *TryToGetNameFromEnum(EnumType value);
+```
+
+`TryToGetNameFromEnum` will attempt to find the textual name for the
+corresponding enum value. If a name is found, it will be returned; otherwise
+`TryToGetNameFromEnum` will return `nullptr`. (Note that C++ enums are allowed
+to contain numeric values that are not explicitly listed in the enum definition,
+as long as they are in range for the underlying integral type.) If the given
+value has more than one name, the first name that appears in the Emboss
+definition will be returned.
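+
+For the `Foo` enum above, the behavior of the two functions can be approximated
+by hand. This is a sketch of the documented contract only; the code Emboss
+actually generates may be structured differently.
+
+```c++
+#include <cstdint>
+#include <cstring>
+
+enum class Foo : std::uint64_t {
+  BAR = 1,
+  BAZ = 1000,
+};
+
+// Hand-written approximations; the generated code may differ.
+static inline bool TryToGetEnumFromName(const char *name, Foo *result) {
+  if (name == nullptr) return false;
+  if (!std::strcmp(name, "BAR")) { *result = Foo::BAR; return true; }
+  if (!std::strcmp(name, "BAZ")) { *result = Foo::BAZ; return true; }
+  return false;  // No exact match: *result is left unchanged.
+}
+
+static inline const char *TryToGetNameFromEnum(Foo value) {
+  switch (value) {
+    case Foo::BAR: return "BAR";
+    case Foo::BAZ: return "BAZ";
+    default: return nullptr;  // The value has no name in the definition.
+  }
+}
+```
+
+In this sketch, looking up `"1000"` fails and leaves the output untouched, and
+asking for the name of an unlisted value such as `static_cast<Foo>(7)` yields
+`nullptr`.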
+
+
+### `Read` method
+
+```c++
+EnumType Read() const;
+```
+
+The `Read` method reads the enum from the underlying bytes and returns its
+value as a C++ enum. `Read` will assert that there are enough bytes to read.
+If the application cannot tolerate a failed assertion, it should first call
+`Ok()` to ensure that it can safely read the enum. If performance is critical
+and the application can assure that there will always be enough bytes to read
+the enum, it can call `UncheckedRead` instead.
+
+
+### `UncheckedRead` method
+
+```c++
+EnumType UncheckedRead() const;
+```
+
+Like `Read`, `UncheckedRead` reads the enum from the underlying bytes and
+returns its value as a C++ enum. Unlike `Read`, `UncheckedRead` does not attempt
+to validate that there are enough bytes in the backing store to actually perform
+the read. In performance-critical situations, if the application is otherwise
+able to ensure that there are sufficient bytes in the backing store to read the
+enum, `UncheckedRead` may be used.
+
+
+### `Write` method
+
+```c++
+void Write(EnumType value) const;
+```
+
+`Write` writes the `value` into the backing store. Like `Read`, `Write` asserts
+that there are enough bytes in the backing store to safely write the enum. If
+the application cannot tolerate an assertion failure, it can use `TryToWrite` or
+the combination of `IsComplete` and `CouldWriteValue`.
+
+
+### `TryToWrite` method
+
+```c++
+bool TryToWrite(EnumType value) const;
+```
+
+`TryToWrite` attempts to write the `value` into the backing store. If the
+backing store does not have enough bytes to hold the enum field, or `value` is
+too large for the specific enum field, then `TryToWrite` will return `false` and
+not update anything.
+
+
+### `CouldWriteValue` method
+
+```c++
+static constexpr bool CouldWriteValue(EnumType value);
+```
+
+`CouldWriteValue` returns `true` if the given `value` could be written into the
+enum field, assuming that there were enough bytes in the backing store to cover
+the field.
+
+Although `CouldWriteValue` is `static constexpr`, it is tricky to call
+statically; client code that wishes to call it statically must use `decltype`
+and `declval` to get the specific type for the specific enum *field* in
+question.
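+
+A sketch of that `decltype`/`declval` pattern follows. `MessageView`,
+`FooFieldView`, and `kind()` are hypothetical stand-ins written for this
+example, and the 8-bit field width is an assumption; a real program would name
+the generated view type for its own `struct` instead.
+
+```c++
+#include <cstdint>
+#include <utility>
+
+enum class Foo : std::uint64_t { BAR = 1, BAZ = 1000 };
+
+// Hypothetical stand-ins for generated views; real Emboss types differ.
+struct FooFieldView {  // Models a view of an 8-bit Foo field.
+  static constexpr bool CouldWriteValue(Foo value) {
+    return static_cast<std::uint64_t>(value) <= 0xff;
+  }
+};
+struct MessageView {
+  FooFieldView kind() const { return FooFieldView(); }
+};
+
+// Recover the field view's type without constructing a MessageView.
+using KindView = decltype(std::declval<MessageView>().kind());
+
+static_assert(KindView::CouldWriteValue(Foo::BAR), "BAR fits in 8 bits");
+static_assert(!KindView::CouldWriteValue(Foo::BAZ), "BAZ does not fit");
+```
+
+Because `std::declval` is only used inside `decltype`, no view object and no
+backing storage are needed to perform the check at compile time.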
+
+
+### `UncheckedWrite` method
+
+```c++
+void UncheckedWrite(EnumType value) const;
+```
+
+Like `Write`, `UncheckedWrite` writes the given value to the backing store.
+Unlike `Write`, `UncheckedWrite` does not check that there are actually enough
+bytes in the backing store to safely write; it should only be used if the
+application has ensured that there are sufficient bytes in the backing store in
+some other way, and performance is a concern.
+
+
+### `Ok` method
+
+```c++
+bool Ok() const;
+```
+
+`Ok` returns `true` if there are enough bytes in the backing store for the enum
+field to be read or written.
+
+In the future, Emboss may add a "known values only" annotation to enum fields,
+in which case `Ok` would also check that the given field contains a known value.
+
+
+### `IsComplete` method
+
+```c++
+bool IsComplete() const;
+```
+
+`IsComplete` returns `true` if there are enough bytes in the backing store for
+the enum field to be read or written.
+
+
+### `UpdateFromTextStream` method
+
+```c++
+template <class Stream>
+bool UpdateFromTextStream(Stream *stream) const;
+```
+
+`UpdateFromTextStream` will read a text-format representation of the enum from
+the given `stream` and write it into the backing store. Generally, applications
+would not call this directly; instead, use the global `UpdateFromText` method,
+which handles setting up a stream from a `std::string`.
+
+### `WriteToTextStream` method
+
+```c++
+template <class Stream>
+bool WriteToTextStream(Stream *stream, const TextOutputOptions &options) const;
+```
+
+`WriteToTextStream` will write a text representation of the current value in a
+form that can be decoded by `UpdateFromTextStream`. Generally, applications
+would not call this directly; instead, use the global `WriteToString` method,
+which handles setting up the stream and returning the resulting string.
+
+## Arrays
+
+### `operator[]` method
+
+```c++
+ElementView operator[](size_t index) const;
+```
+
+The `operator[]` method of an array view returns a view of the array element at
+`index`.
+
+### `begin()`/`rbegin()` and `end()`/`rend()` methods
+
+```c++
+ElementViewIterator<> begin();
+ElementViewIterator<> end();
+ElementViewIterator<> rbegin();
+ElementViewIterator<> rend();
+```
+
+The `begin()` and `end()` methods of an array view return view iterators to the
+beginning and past-the-end of the array, respectively. They may be used with
+arrays in range-based for loops, for example:
+
+```c++
+ auto view = MakeArrayView(...);
+ for(auto element : view){
+ int a = view.member().Read();
+ ...
+ }
+```
+
+The `rbegin()` and `rend()` methods of an array view return reverse view
+iterators to the last element and to the position preceding the first element,
+respectively.
+
+### `SizeInBytes` or `SizeInBits` method
+
+```c++
+size_t SizeInBytes() const;
+```
+
+or
+
+```c++
+size_t SizeInBits() const;
+```
+
+Arrays in `struct`s have the `SizeInBytes` method; arrays in `bits` have the
+`SizeInBits` method. `SizeInBytes` returns the size of the array in bytes;
+`SizeInBits` returns the size of the array in bits.
+
+
+### `ElementCount` method
+
+```c++
+size_t ElementCount() const;
+```
+
+`ElementCount` returns the number of elements in the array.
+
+
+### `Ok` method
+
+```c++
+bool Ok() const;
+```
+
+`Ok` returns `true` if there are enough bytes in the backing store to hold the
+entire array, and every element's `Ok` method returns `true`.
+
+
+### `IsComplete` method
+
+```c++
+bool IsComplete() const;
+```
+
+`IsComplete` returns `true` if there are sufficient bytes in the backing store
+to hold the entire array.
+
+
+### `UpdateFromTextStream` method
+
+```c++
+template <class Stream>
+bool UpdateFromTextStream(Stream *stream) const;
+```
+
+`UpdateFromTextStream` will read a text-format representation of the array
+from the given `stream` and update array elements. Generally, applications
+would not call this directly; instead, use the global `UpdateFromText` method,
+which handles setting up a stream from a `std::string`.
+
+### `WriteToTextStream` method
+
+```c++
+template <class Stream>
+bool WriteToTextStream(Stream *stream, const TextOutputOptions &options) const;
+```
+
+`WriteToTextStream` will write a text representation of the current value in a
+form that can be decoded by `UpdateFromTextStream`. Generally, applications
+would not call this directly; instead, use the global `WriteToString` method,
+which handles setting up the stream and returning the resulting string.
+
+### `BackingStorage` method
+
+```c++
+Storage BackingStorage() const;
+```
+
+Returns the backing storage for the view. The return type of `BackingStorage()`
+is a template parameter on the view.
+
+## `UInt`
+
+### Type `ValueType`
+
+```c++
+using ValueType = ...;
+```
+
+The `ValueType` type alias maps to the least-width C++ unsigned integer type
+that contains enough bits to hold any value of the given `UInt`. For example:
+
+* a `UInt:32`'s `ValueType` would be `uint32_t`
+* a `UInt:64`'s `ValueType` would be `uint64_t`
+* a `UInt:12`'s `ValueType` would be `uint16_t`
+* a `UInt:2`'s `ValueType` would be `uint8_t`
+
+The `Read` and `Write` families of methods use `ValueType` to return or accept
+values, respectively.
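+
+The least-width mapping above can be sketched as follows. This is a minimal
+illustration, not the Emboss implementation: `LeastWidthUnsigned` is a
+hypothetical helper invented here, and the sketch only handles widths from 1 to
+64 bits.
+
+```c++
+#include <cstdint>
+#include <type_traits>
+
+// Hypothetical helper (not an Emboss API): selects the narrowest standard
+// unsigned integer type that has at least kBits bits.
+template <int kBits>
+struct LeastWidthUnsigned {
+  static_assert(kBits >= 1 && kBits <= 64, "this sketch handles 1..64 bits");
+  using Type = typename std::conditional<
+      kBits <= 8, std::uint8_t,
+      typename std::conditional<
+          kBits <= 16, std::uint16_t,
+          typename std::conditional<kBits <= 32, std::uint32_t,
+                                    std::uint64_t>::type>::type>::type;
+};
+
+// The four examples from the list above.
+static_assert(
+    std::is_same<LeastWidthUnsigned<32>::Type, std::uint32_t>::value, "");
+static_assert(
+    std::is_same<LeastWidthUnsigned<64>::Type, std::uint64_t>::value, "");
+static_assert(
+    std::is_same<LeastWidthUnsigned<12>::Type, std::uint16_t>::value, "");
+static_assert(
+    std::is_same<LeastWidthUnsigned<2>::Type, std::uint8_t>::value, "");
+```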
+
+
+### `Read` method
+
+```c++
+ValueType Read() const;
+```
+
+The `Read` method reads the `UInt` from the underlying bytes and returns its
+value as a C++ unsigned integer type. `Read` will assert that there are enough
+bytes to read. If the application cannot tolerate a failed assertion, it should
+first call `Ok()` to ensure that it can safely read the `UInt`. If performance
+is critical and the application can ensure that there will always be enough
+bytes to read the `UInt`, it can call `UncheckedRead` instead.
+
+
+### `UncheckedRead` method
+
+```c++
+ValueType UncheckedRead();
+```
+
+Like `Read`, `UncheckedRead` reads the `UInt` from the underlying bytes and
+returns its value as a C++ unsigned integer type. Unlike `Read`, `UncheckedRead`
+does not attempt to validate that there are enough bytes in the backing store to
+actually perform the read. In performance-critical situations, if the
+application is otherwise able to ensure that there are sufficient bytes in the
+backing store to read the `UInt`, `UncheckedRead` may be used.
+
+
+### `Write` method
+
+```c++
+void Write(ValueType value);
+```
+
+`Write` writes the `value` into the backing store. Like `Read`, `Write` asserts
+that there are enough bytes in the backing store to safely write the `UInt`. If
+the application cannot tolerate an assertion failure, it can use `TryToWrite` or
+the combination of `IsComplete` and `CouldWriteValue`.
+
+
+### `TryToWrite` method
+
+```c++
+bool TryToWrite(ValueType value);
+```
+
+`TryToWrite` attempts to write the `value` into the backing store. If the
+backing store does not have enough bytes to hold the `UInt` field, or `value` is
+too large for the `UInt` field, then `TryToWrite` will return `false` and not
+update anything.
+
+
+### `CouldWriteValue` method
+
+```c++
+static constexpr bool CouldWriteValue(ValueType value);
+```
+
+`CouldWriteValue` returns `true` if the given `value` could be written into the
+`UInt` field, assuming that there were enough bytes in the backing store to
+cover the field.
+
+Although `CouldWriteValue` is `static constexpr`, it is tricky to call
+statically; client code that wishes to call it statically must use `decltype`
+and `declval` to get the specific type for the specific `UInt` field in
+question.
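+
+A sketch of the `decltype`/`declval` pattern, under stated assumptions:
+`TwelveBitUIntView` and `MessageView` are hypothetical stand-ins for generated
+view types, written here only so that the example compiles on its own. Real
+generated views have different names and more template parameters.
+
+```c++
+#include <cstdint>
+#include <utility>
+
+// Hypothetical stand-in for the view of a 12-bit unsigned field.
+struct TwelveBitUIntView {
+  using ValueType = std::uint16_t;
+  static constexpr bool CouldWriteValue(ValueType value) {
+    return value < (1u << 12);
+  }
+};
+
+// Hypothetical stand-in for a structure view with a `length` field.
+struct MessageView {
+  TwelveBitUIntView length() const { return TwelveBitUIntView(); }
+};
+
+// decltype and declval name the field's view type without constructing a
+// MessageView, so CouldWriteValue can be evaluated at compile time.
+using LengthView = decltype(std::declval<MessageView>().length());
+
+static_assert(LengthView::CouldWriteValue(4095), "4095 fits in 12 bits");
+static_assert(!LengthView::CouldWriteValue(4096), "4096 needs 13 bits");
+```
+
+When a view object is already at hand, calling `CouldWriteValue` on the object
+itself is simpler; the alias is only needed for purely static use.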
+
+
+### `UncheckedWrite` method
+
+```c++
+void UncheckedWrite(ValueType value);
+```
+
+Like `Write`, `UncheckedWrite` writes the given value to the backing store.
+Unlike `Write`, `UncheckedWrite` does not check that there are actually enough
+bytes in the backing store to safely write; it should only be used if the
+application has ensured that there are sufficient bytes in the backing store in
+some other way, and performance is a concern.
+
+
+### `Ok` method
+
+```c++
+bool Ok() const;
+```
+
+The `Ok` method returns `true` if there are enough bytes in the backing store to
+hold the given `UInt` field.
+
+
+### `IsComplete` method
+
+```c++
+bool IsComplete();
+```
+
+The `IsComplete` method returns `true` if there are enough bytes in the backing
+store to hold the given `UInt` field.
+
+
+### `UpdateFromTextStream` method
+
+```c++
+template <class Stream>
+bool UpdateFromTextStream(Stream *stream) const;
+```
+
+`UpdateFromTextStream` will read a text-format representation of the `UInt` from
+the given `stream` and update fields. Generally, applications would not call
+this directly; instead, use the global `UpdateFromText` method, which handles
+setting up a stream from a `std::string`.
+
+### `WriteToTextStream` method
+
+```c++
+template <class Stream>
+bool WriteToTextStream(Stream *stream, const TextOutputOptions &options) const;
+```
+
+`WriteToTextStream` will write a text representation of the current value in a
+form that can be decoded by `UpdateFromTextStream`. Generally, applications
+would not call this directly; instead, use the global `WriteToString` method,
+which handles setting up the stream and returning the resulting string.
+
+### `SizeInBits` method
+
+```c++
+static constexpr int SizeInBits();
+```
+
+The `SizeInBits` method returns the size of this specific `UInt` field, in bits.
+
+
+## `Int`
+
+### Type `ValueType`
+
+```c++
+using ValueType = ...;
+```
+
+The `ValueType` type alias maps to the least-width C++ signed integer type
+that contains enough bits to hold any value of the given `Int`. For example:
+
+* an `Int:32`'s `ValueType` would be `int32_t`
+* an `Int:64`'s `ValueType` would be `int64_t`
+* an `Int:12`'s `ValueType` would be `int16_t`
+* an `Int:2`'s `ValueType` would be `int8_t`
+
+The `Read` and `Write` families of methods use `ValueType` to return or accept
+values, respectively.
+
+
+### `Read` method
+
+```c++
+ValueType Read() const;
+```
+
+The `Read` method reads the `Int` from the underlying bytes and returns its
+value as a C++ signed integer type. `Read` will assert that there are enough
+bytes to read. If the application cannot tolerate a failed assertion, it should
+first call `Ok()` to ensure that it can safely read the `Int`. If performance
+is critical and the application can ensure that there will always be enough
+bytes to read the `Int`, it can call `UncheckedRead` instead.
+
+
+### `UncheckedRead` method
+
+```c++
+ValueType UncheckedRead();
+```
+
+Like `Read`, `UncheckedRead` reads the `Int` from the underlying bytes and
+returns its value as a C++ signed integer type. Unlike `Read`, `UncheckedRead`
+does not attempt to validate that there are enough bytes in the backing store to
+actually perform the read. In performance-critical situations, if the
+application is otherwise able to ensure that there are sufficient bytes in the
+backing store to read the `Int`, `UncheckedRead` may be used.
+
+
+### `Write` method
+
+```c++
+void Write(ValueType value);
+```
+
+`Write` writes the `value` into the backing store. Like `Read`, `Write` asserts
+that there are enough bytes in the backing store to safely write the `Int`. If
+the application cannot tolerate an assertion failure, it can use `TryToWrite` or
+the combination of `IsComplete` and `CouldWriteValue`.
+
+
+### `TryToWrite` method
+
+```c++
+bool TryToWrite(ValueType value);
+```
+
+`TryToWrite` attempts to write the `value` into the backing store. If the
+backing store does not have enough bytes to hold the `Int` field, or `value` is
+outside the range of the `Int` field (too large or too small), then
+`TryToWrite` will return `false` and not update anything.
+
+
+### `CouldWriteValue` method
+
+```c++
+static constexpr bool CouldWriteValue(ValueType value);
+```
+
+`CouldWriteValue` returns `true` if the given `value` could be written into the
+`Int` field, assuming that there were enough bytes in the backing store to cover
+the field.
+
+Although `CouldWriteValue` is `static constexpr`, it is tricky to call
+statically; client code that wishes to call it statically must use `decltype`
+and `declval` to get the specific type for the specific `Int` field in question.
+
+
+### `UncheckedWrite` method
+
+```c++
+void UncheckedWrite(ValueType value);
+```
+
+Like `Write`, `UncheckedWrite` writes the given value to the backing store.
+Unlike `Write`, `UncheckedWrite` does not check that there are actually enough
+bytes in the backing store to safely write; it should only be used if the
+application has ensured that there are sufficient bytes in the backing store in
+some other way, and performance is a concern.
+
+
+### `Ok` method
+
+```c++
+bool Ok() const;
+```
+
+The `Ok` method returns `true` if there are enough bytes in the backing store to
+hold the given `Int` field.
+
+
+### `IsComplete` method
+
+```c++
+bool IsComplete();
+```
+
+The `IsComplete` method returns `true` if there are enough bytes in the backing
+store to hold the given `Int` field.
+
+
+### `UpdateFromTextStream` method
+
+```c++
+template <class Stream>
+bool UpdateFromTextStream(Stream *stream) const;
+```
+
+`UpdateFromTextStream` will read a text-format representation of the `Int` from
+the given `stream` and update fields. Generally, applications would not call
+this directly; instead, use the global `UpdateFromText` method, which handles
+setting up a stream from a `std::string`.
+
+### `WriteToTextStream` method
+
+```c++
+template <class Stream>
+bool WriteToTextStream(Stream *stream, const TextOutputOptions &options) const;
+```
+
+`WriteToTextStream` will write a text representation of the current value in a
+form that can be decoded by `UpdateFromTextStream`. Generally, applications
+would not call this directly; instead, use the global `WriteToString` method,
+which handles setting up the stream and returning the resulting string.
+
+### `SizeInBits` method
+
+```c++
+static constexpr int SizeInBits();
+```
+
+The `SizeInBits` method returns the size of this specific `Int` field, in bits.
+
+
+## `Bcd`
+
+### Type `ValueType`
+
+```c++
+using ValueType = ...;
+```
+
+The `ValueType` type alias maps to a C++ unsigned integer type that contains
+at least enough bits to hold any value of the given `Bcd`. For example:
+
+* a `Bcd:32`'s `ValueType` would be `uint32_t`
+* a `Bcd:64`'s `ValueType` would be `uint64_t`
+* a `Bcd:12`'s `ValueType` would be `uint16_t`
+* a `Bcd:2`'s `ValueType` would be `uint8_t`
+
+The `Read` and `Write` families of methods use `ValueType` to return or accept
+values, respectively.
+
+
+### `Read` method
+
+```c++
+ValueType Read() const;
+```
+
+The `Read` method reads the `Bcd` from the underlying bytes and returns its
+value as a C++ unsigned integer type. `Read` will assert that there are enough
+bytes to read, and that the binary representation is a valid BCD integer. If
+the application cannot tolerate a failed assertion, it should first call `Ok()`
+to ensure that it can safely read the `Bcd`. If performance is critical and the
+application can ensure that there will always be enough bytes to read the `Bcd`,
+and that the bytes will be a valid BCD value, it can call `UncheckedRead`
+instead.
+
+
+### `UncheckedRead` method
+
+```c++
+ValueType UncheckedRead();
+```
+
+Like `Read`, `UncheckedRead` reads the `Bcd` from the underlying bytes and
+returns its value as a C++ unsigned integer type. Unlike `Read`, `UncheckedRead`
+does not attempt to validate that there are enough bytes in the backing store to
+actually perform the read, nor that the bytes contain an actual BCD number. In
+performance-critical situations, if the application is otherwise able to ensure
+that there are sufficient bytes in the backing store to read the `Bcd`, and that
+those bytes hold a valid BCD value, `UncheckedRead` may be used.
+
+
+### `Write` method
+
+```c++
+void Write(ValueType value);
+```
+
+`Write` writes the `value` into the backing store. Like `Read`, `Write` asserts
+that there are enough bytes in the backing store to safely write the `Bcd`. If
+the application cannot tolerate an assertion failure, it can use `TryToWrite` or
+the combination of `IsComplete` and `CouldWriteValue`.
+
+
+### `TryToWrite` method
+
+```c++
+bool TryToWrite(ValueType value);
+```
+
+`TryToWrite` attempts to write the `value` into the backing store. If the
+backing store does not have enough bytes to hold the `Bcd` field, or `value` is
+too large for the `Bcd` field, then `TryToWrite` will return `false` and not
+update anything.
+
+
+### `CouldWriteValue` method
+
+```c++
+static constexpr bool CouldWriteValue(ValueType value);
+```
+
+`CouldWriteValue` returns `true` if the given `value` could be written into the
+`Bcd` field, assuming that there were enough bytes in the backing store to cover
+the field.
+
+Although `CouldWriteValue` is `static constexpr`, it is tricky to call
+statically; client code that wishes to call it statically must use `decltype`
+and `declval` to get the specific type for the specific `Bcd` field in question.
+
+
+### `UncheckedWrite` method
+
+```c++
+void UncheckedWrite(ValueType value);
+```
+
+Like `Write`, `UncheckedWrite` writes the given value to the backing store.
+Unlike `Write`, `UncheckedWrite` does not check that there are actually enough
+bytes in the backing store to safely write; it should only be used if the
+application has ensured that there are sufficient bytes in the backing store in
+some other way, and performance is a concern.
+
+
+### `Ok` method
+
+```c++
+bool Ok() const;
+```
+
+The `Ok` method returns `true` if there are enough bytes in the backing store to
+hold the given `Bcd` field, and the bytes contain a valid BCD number: that is,
+that every nibble in the backing store contains a value between 0 and 9,
+inclusive.
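+
+The nibble check can be sketched as below. This illustrates the rule only and is
+not the Emboss implementation: `AllNibblesAreDecimal` is a hypothetical helper,
+and it assumes a `Bcd` field that occupies whole bytes.
+
+```c++
+#include <cstddef>
+#include <cstdint>
+
+// Hypothetical helper: returns true if every 4-bit nibble in `bytes` holds a
+// decimal digit, 0 through 9.
+inline bool AllNibblesAreDecimal(const std::uint8_t *bytes, std::size_t size) {
+  for (std::size_t i = 0; i < size; ++i) {
+    const unsigned low = bytes[i] & 0x0Fu;
+    const unsigned high = (bytes[i] >> 4) & 0x0Fu;
+    if (low > 9 || high > 9) return false;
+  }
+  return true;
+}
+```
+
+For example, the bytes `0x12 0x90` pass the check, while `0x1A` fails because
+its low nibble is 10.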
+
+
+### `IsComplete` method
+
+```c++
+bool IsComplete();
+```
+
+The `IsComplete` method returns `true` if there are enough bytes in the backing
+store to hold the given `Bcd` field.
+
+
+### `UpdateFromTextStream` method
+
+```c++
+template <class Stream>
+bool UpdateFromTextStream(Stream *stream) const;
+```
+
+`UpdateFromTextStream` will read a text-format representation of the `Bcd` from
+the given `stream` and update fields. Generally, applications would not call
+this directly; instead, use the global `UpdateFromText` method, which handles
+setting up a stream from a `std::string`.
+
+### `WriteToTextStream` method
+
+```c++
+template <class Stream>
+bool WriteToTextStream(Stream *stream, const TextOutputOptions &options) const;
+```
+
+`WriteToTextStream` will write a text representation of the current value in a
+form that can be decoded by `UpdateFromTextStream`. Generally, applications
+would not call this directly; instead, use the global `WriteToString` method,
+which handles setting up the stream and returning the resulting string.
+
+### `SizeInBits` method
+
+```c++
+static constexpr int SizeInBits();
+```
+
+The `SizeInBits` method returns the size of this specific `Bcd` field, in bits.
+
+
+## `Flag`
+
+### `Read` method
+
+```c++
+bool Read() const;
+```
+
+The `Read` method reads the `Flag` from the underlying bit and returns its
+value as a C++ `bool`. `Read` will assert that the underlying bit is in the
+backing store. If the application cannot tolerate a failed assertion, it should
+first call `Ok()` to ensure that it can safely read the `Flag`. If performance
+is critical and the application can ensure that there will always be enough
+bytes to read the `Flag`, it can call `UncheckedRead` instead.
+
+
+### `UncheckedRead` method
+
+```c++
+bool UncheckedRead();
+```
+
+Like `Read`, `UncheckedRead` reads the `Flag` from the underlying bit and
+returns its value as a C++ `bool`. Unlike `Read`, `UncheckedRead` does not attempt
+to validate that the backing bit is actually in the backing store. In
+performance-critical situations, if the application is otherwise able to ensure
+that there are sufficient bytes in the backing store to read the `Flag`,
+`UncheckedRead` may be used.
+
+
+### `Write` method
+
+```c++
+void Write(bool value);
+```
+
+`Write` writes the `value` into the backing store. Like `Read`, `Write` asserts
+that there are enough bytes in the backing store to safely write the `Flag`. If
+the application cannot tolerate an assertion failure, it can use `TryToWrite` or
+the combination of `IsComplete` and `CouldWriteValue`.
+
+
+### `TryToWrite` method
+
+```c++
+bool TryToWrite(bool value);
+```
+
+`TryToWrite` attempts to write the `value` into the backing store. If the
+backing store does not contain the `Flag`'s bit, then `TryToWrite` will return
+`false` and not update anything.
+
+
+### `CouldWriteValue` method
+
+```c++
+static constexpr bool CouldWriteValue(bool value);
+```
+
+`CouldWriteValue` returns `true`, as both C++ `bool` values can be written to
+any `Flag`.
+
+
+### `UncheckedWrite` method
+
+```c++
+void UncheckedWrite(bool value);
+```
+
+Like `Write`, `UncheckedWrite` writes the given value to the backing store.
+Unlike `Write`, `UncheckedWrite` does not check that there are actually enough
+bytes in the backing store to safely write; it should only be used if the
+application has ensured that there are sufficient bytes in the backing store in
+some other way, and performance is a concern.
+
+
+### `Ok` method
+
+```c++
+bool Ok() const;
+```
+
+The `Ok` method returns `true` if the backing store contains the `Flag`'s bit.
+
+
+### `IsComplete` method
+
+```c++
+bool IsComplete();
+```
+
+The `IsComplete` method returns `true` if the backing store contains the
+`Flag`'s bit.
+
+
+### `UpdateFromTextStream` method
+
+```c++
+template <class Stream>
+bool UpdateFromTextStream(Stream *stream) const;
+```
+
+`UpdateFromTextStream` will read a text-format representation of the `Flag` from
+the given `stream` and update fields. Generally, applications would not call
+this directly; instead, use the global `UpdateFromText` method, which handles
+setting up a stream from a `std::string`.
+
+### `WriteToTextStream` method
+
+```c++
+template <class Stream>
+bool WriteToTextStream(Stream *stream, const TextOutputOptions &options) const;
+```
+
+`WriteToTextStream` will write a text representation of the current value in a
+form that can be decoded by `UpdateFromTextStream`. Generally, applications
+would not call this directly; instead, use the global `WriteToString` method,
+which handles setting up the stream and returning the resulting string.
+
+## `Float`
+
+### Type `ValueType`
+
+```c++
+using ValueType = ...;
+```
+
+The `ValueType` type alias maps to the C++ floating-point type that matches the
+`Float` field's type; generally `float` for 32-bit `Float`s and `double` for
+64-bit `Float`s.
+
+The `Read` and `Write` families of methods use `ValueType` to return or accept
+values, respectively.
+
+
+### `Read` method
+
+```c++
+ValueType Read() const;
+```
+
+The `Read` method reads the `Float` from the underlying bytes and returns its
+value as a C++ floating point type. `Read` will assert that there are enough
+bytes to read. If the application cannot tolerate a failed assertion, it should
+first call `Ok()` to ensure that it can safely read the `Float`. If performance
+is critical and the application can ensure that there will always be enough
+bytes to read the `Float`, it can call `UncheckedRead` instead.
+
+
+### `UncheckedRead` method
+
+```c++
+ValueType UncheckedRead();
+```
+
+Like `Read`, `UncheckedRead` reads the `Float` from the underlying bytes and
+returns its value as a C++ floating-point type. Unlike `Read`, `UncheckedRead`
+does not attempt to validate that there are enough bytes in the backing store to
+actually perform the read. In performance-critical situations, if the
+application is otherwise able to ensure that there are sufficient bytes in the
+backing store to read the `Float`, `UncheckedRead` may be used.
+
+
+### `Write` method
+
+```c++
+void Write(ValueType value);
+```
+
+`Write` writes the `value` into the backing store. Like `Read`, `Write` asserts
+that there are enough bytes in the backing store to safely write the `Float`.
+If the application cannot tolerate an assertion failure, it can use `TryToWrite`
+or the combination of `IsComplete` and `CouldWriteValue`.
+
+
+### `TryToWrite` method
+
+```c++
+bool TryToWrite(ValueType value);
+```
+
+`TryToWrite` attempts to write the `value` into the backing store. If the
+backing store does not have enough bytes to hold the `Float` field, then
+`TryToWrite` will return `false` and not update anything.
+
+
+### `CouldWriteValue` method
+
+```c++
+static constexpr bool CouldWriteValue(ValueType value);
+```
+
+`CouldWriteValue` returns `true`.
+
+
+### `UncheckedWrite` method
+
+```c++
+void UncheckedWrite(ValueType value);
+```
+
+Like `Write`, `UncheckedWrite` writes the given value to the backing store.
+Unlike `Write`, `UncheckedWrite` does not check that there are actually enough
+bytes in the backing store to safely write; it should only be used if the
+application has ensured that there are sufficient bytes in the backing store in
+some other way, and performance is a concern.
+
+
+### `Ok` method
+
+```c++
+bool Ok() const;
+```
+
+The `Ok` method returns `true` if there are enough bytes in the backing store to
+hold the given `Float` field.
+
+
+### `IsComplete` method
+
+```c++
+bool IsComplete();
+```
+
+The `IsComplete` method returns `true` if there are enough bytes in the backing
+store to hold the given `Float` field.
+
+
+### `UpdateFromTextStream` method
+
+```c++
+template <class Stream>
+bool UpdateFromTextStream(Stream *stream) const;
+```
+
+`UpdateFromTextStream` will read a text-format representation of the `Float`
+from the given `stream` and update fields. Generally, applications would not
+call this directly; instead, use the global `UpdateFromText` method, which
+handles setting up a stream from a `std::string`.
+
+*Note: this method is not yet implemented.*
+
+### `WriteToTextStream` method
+
+```c++
+template <class Stream>
+bool WriteToTextStream(Stream *stream, const TextOutputOptions &options) const;
+```
+
+`WriteToTextStream` will write a text representation of the current value in a
+form that can be decoded by `UpdateFromTextStream`. Generally, applications
+would not call this directly; instead, use the global `WriteToString` method,
+which handles setting up the stream and returning the resulting string.
+
+*Note: this method is not yet implemented.*
+
+
+## `::emboss::UpdateFromText` function
+
+```c++
+template <typename EmbossViewType>
+bool UpdateFromText(EmbossViewType view, const ::std::string &text);
+```
+
+The `::emboss::UpdateFromText` function constructs an appropriate text stream
+object from the given `text` and calls `view`'s `UpdateFromTextStream` method.
+This is the preferred way to read Emboss text format in C++.
+
+## `::emboss::WriteToString` function
+
+```c++
+template <typename EmbossViewType>
+::std::string WriteToString(EmbossViewType view);
+template <typename EmbossViewType>
+::std::string WriteToString(EmbossViewType view, TextOutputOptions options);
+```
+
+The `::emboss::WriteToString` function constructs a string stream, passes it
+into the `view`'s `WriteToTextStream` method, and finally returns the text
+format of the `view`.
+
+The single-argument form `WriteToString(view)` will return a single line of
+text. For more readable, multi-line output, pass `::emboss::MultilineText()` as
+the second argument: `WriteToString(view, ::emboss::MultilineText())`.
+
+## `::emboss::TextOutputOptions` class
+
+The `TextOutputOptions` class is used to set options for text output, such as numeric
+base, whether or not to use multiple lines, etc.
+
+### `PlusOneIndent` method
+
+```c++
+TextOutputOptions PlusOneIndent() const;
+```
+
+`PlusOneIndent` returns a new `TextOutputOptions` with one more level of
+indentation than the current `TextOutputOptions`. This is primarily intended for
+use inside of `WriteToTextStream` methods, as a way to get an indented
+`TextOutputOptions` to pass to the `WriteToTextStream` methods of child objects.
+
+### `Multiline` method
+
+```c++
+TextOutputOptions Multiline(bool new_value) const;
+```
+
+Returns a new `TextOutputOptions` with the same options as the current
+`TextOutputOptions`, except for a new value for `multiline()`.
+
+### `WithIndent` method
+
+```c++
+TextOutputOptions WithIndent(::std::string new_value) const;
+```
+
+Returns a new `TextOutputOptions` with the same options as the current
+`TextOutputOptions`, except for a new value for `indent()`.
+
+### `WithComments` method
+
+```c++
+TextOutputOptions WithComments(bool new_value) const;
+```
+
+Returns a new `TextOutputOptions` with the same options as the current
+`TextOutputOptions`, except for a new value for `comments()`.
+
+### `WithDigitGrouping` method
+
+```c++
+TextOutputOptions WithDigitGrouping(bool new_value) const;
+```
+
+Returns a new `TextOutputOptions` with the same options as the current
+`TextOutputOptions`, except for a new value for `digit_grouping()`.
+
+### `WithNumericBase` method
+
+```c++
+TextOutputOptions WithNumericBase(int new_value) const;
+```
+
+Returns a new `TextOutputOptions` with the same options as the current
+`TextOutputOptions`, except for a new value for `numeric_base()`. The new
+numeric base should be 2, 10, or 16.
+
+### `current_indent` method
+
+```c++
+::std::string current_indent() const;
+```
+
+Returns the current indent string.
+
+### `indent` method
+
+```c++
+::std::string indent() const;
+```
+
+Returns the indent string. The indent string is the string used for a single
+level of indentation; most callers will prefer `current_indent`.
+
+### `multiline` method
+
+```c++
+bool multiline() const;
+```
+
+Returns `true` if text output should use multiple lines, or `false` if text
+output should be single-line only.
+
+### `digit_grouping` method
+
+```c++
+bool digit_grouping() const;
+```
+
+Returns `true` if text output should include digit separators on numbers; e.g.,
+`1_000_000` instead of `1000000`.
+
+### `comments` method
+
+```c++
+bool comments() const;
+```
+
+Returns `true` if text output should include comments, e.g., to show numbers in
+multiple bases.
+
+### `numeric_base` method
+
+```c++
+uint8_t numeric_base() const;
+```
+
+Returns the numeric base that should be used for formatting numbers. This should
+always be 2, 10, or 16.
diff --git a/g3doc/design.md b/g3doc/design.md
new file mode 100644
index 0000000..62a1f71
--- /dev/null
+++ b/g3doc/design.md
@@ -0,0 +1,162 @@
+# Design of the Emboss Tool
+
+This document describes the internals of Emboss. End users do not need to read
+this document.
+
+*TODO(bolms): Update this doc to include the newer passes.*
+
+The Emboss compiler is divided into separate "front end" and "back end"
+programs. The front end parses Emboss files (`.emb` files) and produces a
+stable intermediate representation (IR), which is consumed by the back ends.
+This IR is defined in [public/ir_pb2.py][ir_pb2.py].
+
+[ir_pb2.py]: public/ir_pb2.py
+
+The back ends read the IR and emit code to view and manipulate Emboss-defined
+data structures. Currently, only a C++ back end exists.
+
+*TODO(bolms): Split the symbol resolution and validation steps into a separate
+"middle" component, to allow external code generators to generate undecorated
+Emboss IR instead of Emboss source text?*
+
+## Front End
+
+*Implemented in [front_end/...][front_end]*
+
+[front_end]: front_end/
+
+The front end is responsible for reading in Emboss definitions and producing a
+normalized intermediate representation (IR). It is divided into several steps:
+roughly, parsing, import resolution, symbol resolution, and validation.
+
+The front end is orchestrated by [glue.py][glue_py], which runs each front end
+component in the proper order to construct an IR suitable for consumption by the
+back end.
+
+[glue_py]: front_end/glue.py
+
+The actual driver program is [emboss_front_end.py][emboss_front_end_py], which
+just calls `glue.ParseEmbossFile` and prints the results.
+
+[emboss_front_end_py]: front_end/emboss_front_end.py
+
+### File Parsing
+
+Per-file parsing consumes the text of a single Emboss module, and produces an
+"undecorated" IR for the module, containing only syntactic-level information
+from the module.
+
+This "undecorated" IR is (almost) a subset of the final IR: later steps will add
+information and perform validation, but will rarely remove anything from the IR
+before it is emitted.
+
+#### Tokenization
+
+*Implemented in [tokenizer.py][tokenizer_py]*
+
+[tokenizer_py]: front_end/tokenizer.py
+
+The tokenizer is a fairly standard tokenizer, with Indent/Dedent insertion a la
+Python. It divides source text into `parse_types.Symbol` objects, suitable for
+feeding into the parser.
+
+#### Syntax Tree Generation
+
+*Implemented in [lr1.py][lr1_py] and [parser_generator.py][parser_generator_py], with a façade in [structure_parser.py][structure_parser_py]*
+
+[lr1_py]: front_end/lr1.py
+[parser_generator_py]: front_end/parser_generator.py
+[structure_parser_py]: front_end/structure_parser.py
+
+Emboss uses a pretty standard Shift-Reduce LR(1) parser. This is implemented in
+three parts in Emboss:
+
+* A generic parser generator implementing the table generation algorithms from
+ *[Compilers: Principles, Techniques, & Tools][dragon_book]* and the
+ error-marking algorithm from *[Generating LR Syntax Error Messages from
+ Examples][jeffery_2003]*.
+* An Emboss-specific parser builder which glues the Emboss tokenizer, grammar,
+ and error examples to the parser generator, producing an Emboss parser.
+* The Emboss grammar, which is extracted from the file normalizer
+ (*[module_ir.py][module_ir_py]*).
+
+[dragon_book]: http://www.amazon.com/Compilers-Principles-Techniques-Tools-2nd/dp/0321486811
+[jeffery_2003]: http://dl.acm.org/citation.cfm?id=937566
+
+#### Normalization
+
+*Implemented in [module_ir.py][module_ir_py]*
+
+[module_ir_py]: front_end/module_ir.py
+
+Once a parse tree has been generated, it is fed into a normalizer which
+recursively turns the raw syntax tree into a "first stage" intermediate
+representation (IR). The first stage IR serves to isolate later stages from
+minor changes in the grammar, but only contains information from a single file,
+and does not perform any semantic checking.
+
+### Import Resolution
+
+*TODO(bolms): Implement imports.*
+
+After each file is parsed, any new imports it has are added to a work queue.
+Each file in the work queue is parsed, potentially adding more imports to the
+queue, until the queue is empty.
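+
+The work-queue loop described above can be sketched as follows. The front end
+itself is written in Python and, per the TODO, imports are not yet implemented,
+so this is only an illustration of the algorithm: `ParseFn` and
+`ParseWithImports` are hypothetical names, and the sketch assumes that "new"
+means "not already queued or parsed".
+
+```c++
+#include <deque>
+#include <functional>
+#include <set>
+#include <string>
+#include <vector>
+
+// Hypothetical callback: parses one file and returns the files it imports.
+using ParseFn = std::function<std::vector<std::string>(const std::string &)>;
+
+// Parses `root` and, transitively, everything it imports. Returns the files in
+// the order in which they were parsed; each file is parsed at most once.
+inline std::vector<std::string> ParseWithImports(const std::string &root,
+                                                 const ParseFn &parse) {
+  std::vector<std::string> parsed;
+  std::set<std::string> seen;
+  std::deque<std::string> queue;
+  seen.insert(root);
+  queue.push_back(root);
+  while (!queue.empty()) {
+    const std::string file = queue.front();
+    queue.pop_front();
+    parsed.push_back(file);
+    for (const std::string &imported : parse(file)) {
+      // Only imports that have not been seen before join the queue, so
+      // circular imports cannot loop forever.
+      if (seen.insert(imported).second) queue.push_back(imported);
+    }
+  }
+  return parsed;
+}
+```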
+
+### Symbol Resolution
+
+*Implemented in [symbol_resolver.py][symbol_resolver_py]*
+
+[symbol_resolver_py]: front_end/symbol_resolver.py
+
+Symbol resolution is the process of correlating names in the IR. At the end of
+symbol resolution, every named entity (type definition, field definition, enum
+name, etc.) has a `CanonicalName`, and every reference in the IR has a
+`Reference` to the entity to which it refers.
+
+This assignment occurs in two passes. First, the full IR is scanned, generating
+scoped symbol tables (nested dictionaries of names to `CanonicalName`), and
+assigning identities to each `Name` in the IR. Then the IR is fully scanned a
+second time, and each `Reference` in the IR is resolved: all scopes visible to
+the reference are scanned for the name, and the corresponding `CanonicalName` is
+assigned to the reference.
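+
+A simplified sketch of the two passes follows. It is not the code in
+`symbol_resolver.py` (which is Python): the `Definition` and `Reference` structs
+are hypothetical, scopes are modeled as dotted paths, and "visible scopes" is
+assumed to mean the reference's own scope plus each enclosing scope.
+
+```c++
+#include <map>
+#include <string>
+#include <vector>
+
+// Hypothetical, simplified IR nodes. A scope is a dotted path such as
+// "Outer.Inner"; the empty string is the module scope.
+struct Definition {
+  std::string scope;
+  std::string name;
+};
+struct Reference {
+  std::string scope;
+  std::string name;
+  std::string canonical_name;  // Filled in by Resolve().
+};
+
+using SymbolTables =
+    std::map<std::string, std::map<std::string, std::string>>;
+
+// Pass 1: build scope -> (name -> canonical name) tables.
+inline SymbolTables BuildSymbolTables(const std::vector<Definition> &defs) {
+  SymbolTables tables;
+  for (const Definition &def : defs) {
+    tables[def.scope][def.name] =
+        def.scope.empty() ? def.name : def.scope + "." + def.name;
+  }
+  return tables;
+}
+
+// Pass 2: search the reference's scope, then each enclosing scope in turn.
+inline bool Resolve(const SymbolTables &tables, Reference *ref) {
+  std::string scope = ref->scope;
+  while (true) {
+    const auto table = tables.find(scope);
+    if (table != tables.end()) {
+      const auto entry = table->second.find(ref->name);
+      if (entry != table->second.end()) {
+        ref->canonical_name = entry->second;
+        return true;
+      }
+    }
+    if (scope.empty()) return false;  // Searched every scope without a match.
+    const std::string::size_type dot = scope.rfind('.');
+    scope = (dot == std::string::npos) ? std::string() : scope.substr(0, dot);
+  }
+}
+```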
+
+### Validation
+
+*TODO(bolms): other validations?*
+
+#### Size Checking
+
+*TODO(bolms): describe*
+
+#### Overlap Checking
+
+*TODO(bolms): describe*
+
+## Back End
+
+*Implemented in [back_end/...][back_end]*
+
+[back_end]: back_end/
+
+Currently, only a C++ back end is implemented.
+
+A back end takes Emboss IR and produces code in a specific language for
+manipulating the Emboss-defined data structures.
+
+### C++
+
+*Implemented in [header_generator.py][header_generator_py] with templates in
+[generated_code_templates][generated_code_templates], support code in
+[emboss_cpp_util.h][emboss_cpp_util_h], and a driver program in
+[emboss_codegen_cpp.py][emboss_codegen_cpp_py]*
+
+[header_generator_py]: back_end/cpp/header_generator.py
+[generated_code_templates]: back_end/cpp/generated_code_templates
+[emboss_cpp_util_h]: back_end/cpp/emboss_cpp_util.h
+[emboss_codegen_cpp_py]: back_end/cpp/emboss_codegen_cpp.py
+
+The C++ code generator is currently very minimal. `header_generator.py`
+essentially inserts values from the IR into text templates.
+
+*TODO(bolms): add more documentation once the C++ back end has more features.*
diff --git a/g3doc/grammar.md b/g3doc/grammar.md
new file mode 100644
index 0000000..7bc99fa
--- /dev/null
+++ b/g3doc/grammar.md
@@ -0,0 +1,474 @@
+This is the context-free grammar for Emboss. Terminal symbols are in `"quotes"`
+or are named in `CamelCase`; nonterminal symbols are named in `snake_case`. The
+term `<empty>` to the right of the `->` indicates an empty production (a rule
+where the left-hand-side may be parsed from an empty string).
+
+This listing is auto-generated from the grammar defined in `module_ir.py`.
+
+Note that, unlike in many languages, comments are included in the grammar. This
+is so that comments can be handled more easily by the autoformatter; comments
+are ignored by the compiler. This is distinct from *documentation*, which is
+included in the IR for use by documentation generators.
+
+```shell
+module -> comment-line* doc-line* import-line*
+ attribute-line* type-definition*
+type-definition -> bits
+ | enum
+ | external
+ | struct
+struct -> "struct" type-name
+ delimited-parameter-definition-list?
+ ":" Comment? eol struct-body
+struct-body -> Indent doc-line* attribute-line*
+ type-definition* struct-field-block
+ Dedent
+struct-field-block -> <empty>
+ | conditional-struct-field-block
+ struct-field-block
+ | unconditional-struct-field
+ struct-field-block
+unconditional-struct-field -> anonymous-bits-field-definition
+ | field
+ | inline-bits-field-definition
+ | inline-enum-field-definition
+ | inline-struct-field-definition
+ | virtual-field
+virtual-field -> "let" snake-name "=" expression
+ Comment? eol field-body?
+field-body -> Indent doc-line* attribute-line* Dedent
+expression -> choice-expression
+choice-expression -> logical-expression
+ | logical-expression "?"
+ logical-expression ":"
+ logical-expression
+logical-expression -> and-expression
+ | comparison-expression
+ | or-expression
+or-expression -> comparison-expression
+ or-expression-right+
+or-expression-right -> or-operator comparison-expression
+or-operator -> "||"
+comparison-expression -> additive-expression
+ | additive-expression
+ equality-expression-right+
+ | additive-expression
+ greater-expression-right-list
+ | additive-expression inequality-operator
+ additive-expression
+ | additive-expression
+ less-expression-right-list
+less-expression-right-list -> equality-expression-right*
+ less-expression-right
+ equality-or-less-expression-right*
+equality-or-less-expression-right -> equality-expression-right
+ | less-expression-right
+less-expression-right -> less-operator additive-expression
+less-operator -> "<"
+ | "<="
+inequality-operator -> "!="
+greater-expression-right-list -> equality-expression-right*
+ greater-expression-right
+ equality-or-greater-expression-right*
+equality-or-greater-expression-right -> equality-expression-right
+ | greater-expression-right
+greater-expression-right -> greater-operator additive-expression
+greater-operator -> ">"
+ | ">="
+equality-expression-right -> equality-operator additive-expression
+equality-operator -> "=="
+additive-expression -> times-expression
+ additive-expression-right*
+additive-expression-right -> additive-operator times-expression
+additive-operator -> "+"
+ | "-"
+times-expression -> negation-expression
+ times-expression-right*
+times-expression-right -> multiplicative-operator
+ negation-expression
+multiplicative-operator -> "*"
+negation-expression -> additive-operator bottom-expression
+ | bottom-expression
+bottom-expression -> "(" expression ")"
+ | boolean-constant
+ | builtin-reference
+ | constant-reference
+ | field-reference
+ | function-name "(" argument-list ")"
+ | numeric-constant
+numeric-constant -> Number
+argument-list -> <empty>
+ | expression comma-then-expression*
+comma-then-expression -> "," expression
+function-name -> "$lower_bound"
+ | "$max"
+ | "$present"
+ | "$upper_bound"
+field-reference -> snake-reference field-reference-tail*
+field-reference-tail -> "." snake-reference
+snake-reference -> builtin-field-word
+ | snake-word
+snake-word -> SnakeWord
+builtin-field-word -> "$max_size_in_bits"
+ | "$max_size_in_bytes"
+ | "$min_size_in_bits"
+ | "$min_size_in_bytes"
+ | "$size_in_bits"
+ | "$size_in_bytes"
+constant-reference -> constant-reference-tail
+ | snake-reference "."
+ constant-reference-tail
+constant-reference-tail -> constant-word
+ | type-word "." constant-reference-tail
+ | type-word "." snake-reference
+type-word -> CamelWord
+constant-word -> ShoutyWord
+builtin-reference -> builtin-word
+builtin-word -> "$is_statically_sized"
+ | "$static_size_in_bits"
+boolean-constant -> BooleanConstant
+and-expression -> comparison-expression
+ and-expression-right+
+and-expression-right -> and-operator comparison-expression
+and-operator -> "&&"
+snake-name -> snake-word
+inline-struct-field-definition -> field-location "struct" snake-name
+ abbreviation? ":" Comment? eol
+ struct-body
+abbreviation -> "(" snake-word ")"
+field-location -> expression "[" "+" expression "]"
+inline-enum-field-definition -> field-location "enum" snake-name
+ abbreviation? ":" Comment? eol
+ enum-body
+enum-body -> Indent doc-line* attribute-line*
+ enum-value+ Dedent
+enum-value -> constant-name "=" expression doc?
+ Comment? eol enum-value-body?
+enum-value-body -> Indent doc-line* Dedent
+doc -> Documentation
+constant-name -> constant-word
+inline-bits-field-definition -> field-location "bits" snake-name
+ abbreviation? ":" Comment? eol
+ bits-body
+bits-body -> Indent doc-line* attribute-line*
+ type-definition* bits-field-block
+ Dedent
+bits-field-block -> <empty>
+ | conditional-bits-field-block
+ bits-field-block
+ | unconditional-bits-field
+ bits-field-block
+unconditional-bits-field -> unconditional-anonymous-bits-field
+ | virtual-field
+unconditional-anonymous-bits-field -> field
+ | inline-bits-field-definition
+ | inline-enum-field-definition
+conditional-bits-field-block -> "if" expression ":" Comment? eol Indent
+ unconditional-bits-field+ Dedent
+field -> field-location type snake-name
+ abbreviation? attribute* doc? Comment?
+ eol field-body?
+attribute -> "[" attribute-context? "$default"?
+ snake-word ":" attribute-value "]"
+attribute-value -> expression
+ | string-constant
+string-constant -> String
+attribute-context -> "(" snake-word ")"
+type -> type-reference delimited-argument-list?
+ type-size-specifier?
+ array-length-specifier*
+array-length-specifier -> "[" "]"
+ | "[" expression "]"
+type-size-specifier -> ":" numeric-constant
+delimited-argument-list -> "(" argument-list ")"
+type-reference -> snake-word "." type-reference-tail
+ | type-reference-tail
+type-reference-tail -> type-word
+ | type-word "." type-reference-tail
+anonymous-bits-field-definition -> field-location "bits" ":" Comment? eol
+ anonymous-bits-body
+anonymous-bits-body -> Indent attribute-line*
+ anonymous-bits-field-block Dedent
+anonymous-bits-field-block -> <empty>
+ | conditional-anonymous-bits-field-block
+ anonymous-bits-field-block
+ | unconditional-anonymous-bits-field
+ anonymous-bits-field-block
+conditional-anonymous-bits-field-block -> "if" expression ":" Comment? eol Indent
+ unconditional-anonymous-bits-field+
+ Dedent
+conditional-struct-field-block -> "if" expression ":" Comment? eol Indent
+ unconditional-struct-field+ Dedent
+eol -> "\n" comment-line*
+delimited-parameter-definition-list -> "(" parameter-definition-list ")"
+parameter-definition-list -> <empty>
+ | parameter-definition
+ parameter-definition-list-tail*
+parameter-definition-list-tail -> "," parameter-definition
+parameter-definition -> snake-name ":" type
+type-name -> type-word
+external -> "external" type-name ":" Comment? eol
+ external-body
+external-body -> Indent doc-line* attribute-line* Dedent
+enum -> "enum" type-name ":" Comment? eol
+ enum-body
+bits -> "bits" type-name
+ delimited-parameter-definition-list?
+ ":" Comment? eol bits-body
+attribute-line -> attribute Comment? eol
+import-line -> "import" string-constant "as"
+ snake-word Comment? eol
+doc-line -> doc Comment? eol
+comment-line -> Comment? "\n"
+```
+
+The following productions are automatically generated to handle zero-or-more,
+one-or-more, and zero-or-one repeated lists (`foo*`, `foo+`, and `foo?`
+nonterminals) in LR(1). They are included for completeness, but may be ignored
+if you just want to understand the grammar.
+
+```shell
+"$default"? -> <empty>
+ | "$default"
+Comment? -> <empty>
+ | Comment
+abbreviation? -> <empty>
+ | abbreviation
+additive-expression-right* -> <empty>
+ | additive-expression-right
+ additive-expression-right*
+and-expression-right* -> <empty>
+ | and-expression-right
+ and-expression-right*
+and-expression-right+ -> and-expression-right
+ and-expression-right*
+array-length-specifier* -> <empty>
+ | array-length-specifier
+ array-length-specifier*
+attribute* -> <empty>
+ | attribute attribute*
+attribute-context? -> <empty>
+ | attribute-context
+attribute-line* -> <empty>
+ | attribute-line attribute-line*
+comma-then-expression* -> <empty>
+ | comma-then-expression
+ comma-then-expression*
+comment-line* -> <empty>
+ | comment-line comment-line*
+delimited-argument-list? -> <empty>
+ | delimited-argument-list
+delimited-parameter-definition-list? -> <empty>
+ | delimited-parameter-definition-list
+doc-line* -> <empty>
+ | doc-line doc-line*
+doc? -> <empty>
+ | doc
+enum-value* -> <empty>
+ | enum-value enum-value*
+enum-value+ -> enum-value enum-value*
+enum-value-body? -> <empty>
+ | enum-value-body
+equality-expression-right* -> <empty>
+ | equality-expression-right
+ equality-expression-right*
+equality-expression-right+ -> equality-expression-right
+ equality-expression-right*
+equality-or-greater-expression-right* -> <empty>
+ | equality-or-greater-expression-right
+ equality-or-greater-expression-right*
+equality-or-less-expression-right* -> <empty>
+ | equality-or-less-expression-right
+ equality-or-less-expression-right*
+field-body? -> <empty>
+ | field-body
+field-reference-tail* -> <empty>
+ | field-reference-tail
+ field-reference-tail*
+import-line* -> <empty>
+ | import-line import-line*
+or-expression-right* -> <empty>
+ | or-expression-right or-expression-right*
+or-expression-right+ -> or-expression-right or-expression-right*
+parameter-definition-list-tail* -> <empty>
+ | parameter-definition-list-tail
+ parameter-definition-list-tail*
+times-expression-right* -> <empty>
+ | times-expression-right
+ times-expression-right*
+type-definition* -> <empty>
+ | type-definition type-definition*
+type-size-specifier? -> <empty>
+ | type-size-specifier
+unconditional-anonymous-bits-field* -> <empty>
+ | unconditional-anonymous-bits-field
+ unconditional-anonymous-bits-field*
+unconditional-anonymous-bits-field+ -> unconditional-anonymous-bits-field
+ unconditional-anonymous-bits-field*
+unconditional-bits-field* -> <empty>
+ | unconditional-bits-field
+ unconditional-bits-field*
+unconditional-bits-field+ -> unconditional-bits-field
+ unconditional-bits-field*
+unconditional-struct-field* -> <empty>
+ | unconditional-struct-field
+ unconditional-struct-field*
+unconditional-struct-field+ -> unconditional-struct-field
+ unconditional-struct-field*
+```
+
+The following regexes are used to tokenize input into the corresponding symbols.
+Note that the `Indent`, `Dedent`, and `EndOfLine` symbols are generated using
+separate logic.
+
+Pattern | Symbol
+------------------------------------------ | ------------------------------
+`\[` | `"["`
+`\]` | `"]"`
+`\(` | `"("`
+`\)` | `")"`
+`\:` | `":"`
+`\=` | `"="`
+`\+` | `"+"`
+`\-` | `"-"`
+`\*` | `"*"`
+`\.` | `"."`
+`\?` | `"?"`
+`\=\=` | `"=="`
+`\!\=` | `"!="`
+`\&\&` | `"&&"`
+`\|\|` | `"||"`
+`\<` | `"<"`
+`\>` | `">"`
+`\<\=` | `"<="`
+`\>\=` | `">="`
+`\,` | `","`
+`\$static_size_in_bits` | `"$static_size_in_bits"`
+`\$is_statically_sized` | `"$is_statically_sized"`
+`\$max` | `"$max"`
+`\$present` | `"$present"`
+`\$upper_bound` | `"$upper_bound"`
+`\$lower_bound` | `"$lower_bound"`
+`\$size_in_bits` | `"$size_in_bits"`
+`\$size_in_bytes` | `"$size_in_bytes"`
+`\$max_size_in_bits` | `"$max_size_in_bits"`
+`\$max_size_in_bytes` | `"$max_size_in_bytes"`
+`\$min_size_in_bits` | `"$min_size_in_bits"`
+`\$min_size_in_bytes` | `"$min_size_in_bytes"`
+`\$default` | `"$default"`
+`struct` | `"struct"`
+`bits` | `"bits"`
+`enum` | `"enum"`
+`external` | `"external"`
+`import` | `"import"`
+`as` | `"as"`
+`if` | `"if"`
+`let` | `"let"`
+`EmbossReserved[A-Za-z0-9]*` | `BadWord`
+`emboss_reserved[_a-z0-9]*` | `BadWord`
+`EMBOSS_RESERVED[_A-Z0-9]*` | `BadWord`
+`"(?:[^"\n\\]\|\\[n\\"])*"` | `String`
+`[0-9]+` | `Number`
+`[0-9]{1,3}(?:_[0-9]{3})*` | `Number`
+`0x[0-9a-fA-F]+` | `Number`
+`0x_?[0-9a-fA-F]{1,4}(?:_[0-9a-fA-F]{4})*` | `Number`
+`0x_?[0-9a-fA-F]{1,8}(?:_[0-9a-fA-F]{8})*` | `Number`
+`0b[01]+` | `Number`
+`0b_?[01]{1,4}(?:_[01]{4})*` | `Number`
+`0b_?[01]{1,8}(?:_[01]{8})*` | `Number`
+`true\|false` | `BooleanConstant`
+`[a-z][a-z_0-9]*` | `SnakeWord`
+`[A-Z][A-Z_0-9]*[A-Z_][A-Z_0-9]*` | `ShoutyWord`
+`[A-Z][a-zA-Z0-9]*[a-z][a-zA-Z0-9]*` | `CamelWord`
+`-- .*` | `Documentation`
+`--$` | `Documentation`
+`--.*` | `BadDocumentation`
+`\s+` | *no symbol emitted*
+`#.*` | `Comment`
+`[0-9][bxBX]?[0-9a-fA-F_]*` | `BadNumber`
+`[a-zA-Z_$0-9]+` | `BadWord`
+
+The following 534 keywords are reserved, but not used, by Emboss. They may not
+be used as field, type, or enum value names.
+
+`ATOMIC_BOOL_LOCK_FREE` `ATOMIC_CHAR16_T_LOCK_FREE` `ATOMIC_CHAR32_T_LOCK_FREE`
+`ATOMIC_CHAR_LOCK_FREE` `ATOMIC_FLAG_INIT` `ATOMIC_INT_LOCK_FREE`
+`ATOMIC_LLONG_LOCK_FREE` `ATOMIC_LONG_LOCK_FREE` `ATOMIC_POINTER_LOCK_FREE`
+`ATOMIC_SHORT_LOCK_FREE` `ATOMIC_VAR_INIT` `ATOMIC_WCHAR_T_LOCK_FREE` `BUFSIZ`
+`CGFloat` `CHAR_BIT` `CHAR_MAX` `CHAR_MIN` `CLOCKS_PER_SEC` `CMPLX` `CMPLXF`
+`CMPLXL` `DBL_DECIMAL_DIG` `DBL_DIG` `DBL_EPSILON` `DBL_HAS_SUBNORM`
+`DBL_MANT_DIG` `DBL_MAX` `DBL_MAX_10_EXP` `DBL_MAX_EXP` `DBL_MIN`
+`DBL_MIN_10_EXP` `DBL_MIN_EXP` `DBL_TRUE_MIN` `DECIMAL_DIG` `DOMAIN` `EDOM`
+`EILSEQ` `EOF` `ERANGE` `EXIT_FAILURE` `EXIT_SUCCESS` `FE_ALL_EXCEPT`
+`FE_DFL_ENV` `FE_DIVBYZERO` `FE_DOWNWARD` `FE_INEXACT` `FE_INVALID`
+`FE_OVERFLOW` `FE_TONEAREST` `FE_TOWARDZERO` `FE_UNDERFLOW` `FE_UPWARD`
+`FILENAME_MAX` `FLT_DECIMAL_DIG` `FLT_DIG` `FLT_EPSILON` `FLT_EVAL_METHOD`
+`FLT_HAS_SUBNORM` `FLT_MANT_DIG` `FLT_MAX` `FLT_MAX_10_EXP` `FLT_MAX_EXP`
+`FLT_MIN` `FLT_MIN_10_EXP` `FLT_MIN_EXP` `FLT_RADIX` `FLT_ROUNDS` `FLT_TRUE_MIN`
+`FOPEN_MAX` `FP_FAST_FMA` `FP_FAST_FMAF` `FP_FAST_FMAL` `FP_ILOGB0`
+`FP_ILOGBNAN` `FP_INFINITE` `FP_NAN` `FP_NORMAL` `FP_SUBNORMAL` `FP_ZERO`
+`False` `HUGE_VAL` `HUGE_VALF` `HUGE_VALL` `INFINITY` `INT16_C` `INT16_MAX`
+`INT16_MIN` `INT32_C` `INT32_MAX` `INT32_MIN` `INT64_C` `INT64_MAX` `INT64_MIN`
+`INT8_C` `INT8_MAX` `INT8_MIN` `INTMAX_C` `INTMAX_MAX` `INTMAX_MIN` `INTPTR_MAX`
+`INTPTR_MIN` `INT_FAST16_MAX` `INT_FAST16_MIN` `INT_FAST32_MAX` `INT_FAST32_MIN`
+`INT_FAST64_MAX` `INT_FAST64_MIN` `INT_FAST8_MAX` `INT_FAST8_MIN`
+`INT_LEAST16_MAX` `INT_LEAST16_MIN` `INT_LEAST32_MAX` `INT_LEAST32_MIN`
+`INT_LEAST64_MAX` `INT_LEAST64_MIN` `INT_LEAST8_MAX` `INT_LEAST8_MIN` `INT_MAX`
+`INT_MIN` `LC_ALL` `LC_COLLATE` `LC_CTYPE` `LC_MONETARY` `LC_NUMERIC` `LC_TIME`
+`LDBL_DECIMAL_DIG` `LDBL_DIG` `LDBL_EPSILON` `LDBL_HAS_SUBNORM` `LDBL_MANT_DIG`
+`LDBL_MAX` `LDBL_MAX_10_EXP` `LDBL_MAX_EXP` `LDBL_MIN` `LDBL_MIN_10_EXP`
+`LDBL_MIN_EXP` `LDBL_TRUE_MIN` `LLONG_MAX` `LLONG_MIN` `LONG_MAX` `LONG_MIN`
+`MATH_ERREXCEPT` `MATH_ERRNO` `MAXFLOAT` `MB_CUR_MAX` `MB_LEN_MAX` `M_1_PI`
+`M_2_PI` `M_2_SQRTPI` `M_3PI_4` `M_E` `M_INVLN2` `M_IVLN10` `M_LN10` `M_LN2`
+`M_LN2HI` `M_LN2LO` `M_LOG10E` `M_LOG2E` `M_LOG2_E` `M_PI` `M_PI_2` `M_PI_4`
+`M_SQRT1_2` `M_SQRT2` `M_SQRT3` `M_SQRTPI` `M_TWOPI` `NAN` `NDEBUG` `NSInteger`
+`NSNumber` `NSObject` `NULL` `None` `ONCE_FLAG_INIT` `OVERFLOW` `PLOSS`
+`PTRDIFF_MAX` `PTRDIFF_MIN` `RAND_MAX` `SCHAR_MAX` `SCHAR_MIN` `SEEK_CUR`
+`SEEK_END` `SEEK_SET` `SHRT_MAX` `SHRT_MIN` `SIGABRT` `SIGFPE` `SIGILL` `SIGINT`
+`SIGSEGV` `SIGTERM` `SIG_ATOMIC_MAX` `SIG_ATOMIC_MIN` `SIG_DFL` `SIG_ERR`
+`SIG_IGN` `SING` `SIZE_MAX` `Self` `TIME_UTC` `TLOSS` `TMP_MAX` `TMP_MAX_S`
+`TSS_DTOR_ITERATIONS` `True` `UCHAR_MAX` `UINT16_C` `UINT16_MAX` `UINT32_C`
+`UINT32_MAX` `UINT64_C` `UINT64_MAX` `UINT8_C` `UINT8_MAX` `UINTMAX_C`
+`UINTMAX_MAX` `UINTPTR_MAX` `UINT_FAST16_MAX` `UINT_FAST32_MAX`
+`UINT_FAST64_MAX` `UINT_FAST8_MAX` `UINT_LEAST16_MAX` `UINT_LEAST32_MAX`
+`UINT_LEAST64_MAX` `UINT_LEAST8_MAX` `UINT_MAX` `ULLONG_MAX` `ULONG_MAX`
+`UNDERFLOW` `USHRT_MAX` `WCHAR_MAX` `WCHAR_MIN` `WEOF` `WINT_MAX` `WINT_MIN`
+`abstract` `acos` `acosh` `after` `alignas` `alignof` `and` `and_eq` `andalso`
+`asin` `asinh` `asm` `assert` `atan` `atan2` `atanh`
+`atomic_compare_exchange_strong` `atomic_compare_exchange_strong_explicit`
+`atomic_compare_exchange_weak` `atomic_compare_exchange_weak_explicit`
+`atomic_exchange` `atomic_exchange_explicit` `atomic_fetch_add`
+`atomic_fetch_add_explicit` `atomic_fetch_and` `atomic_fetch_and_explicit`
+`atomic_fetch_or` `atomic_fetch_or_explicit` `atomic_fetch_sub`
+`atomic_fetch_sub_explicit` `atomic_fetch_xor` `atomic_fetch_xor_explicit`
+`atomic_init` `atomic_is_lock_free` `atomic_load` `atomic_load_explicit`
+`atomic_store` `atomic_store_explicit` `auto` `band` `become` `begin` `bitand`
+`bitor` `bnot` `bool` `boolean` `bor` `box` `break` `bsl` `bsr` `bxor` `byte`
+`carg` `case` `catch` `cbrt` `ceil` `chan` `char` `char16_t` `char32_t` `cimag`
+`class` `classdef` `compl` `complex` `concept` `cond` `conj` `const`
+`const_cast` `constexpr` `continue` `copysign` `cos` `cosh` `cproj` `crate`
+`creal` `decltype` `def` `default` `defer` `del` `delete` `div` `do` `double`
+`dynamic_cast` `elif` `else` `elseif` `end` `erf` `erfc` `errno` `except` `exec`
+`exp` `exp2` `explicit` `expm1` `export` `extends` `extern` `fabs` `fallthrough`
+`fdim` `final` `finally` `float` `floor` `fma` `fmax` `fmin` `fmod` `fn` `for`
+`fortran` `fpclassify` `frexp` `friend` `from` `fun` `func` `function` `global`
+`go` `goto` `hypot` `ilogb` `imaginary` `impl` `implementation` `implements`
+`in` `inline` `instanceof` `int` `interface` `is` `isfinite` `isgreater`
+`isgreaterequal` `isinf` `isless` `islessequal` `islessgreater` `isnan`
+`isnormal` `isunordered` `kill_dependency` `lambda` `ldexp` `lgamma` `llrint`
+`llround` `log` `log10` `log1p` `log2` `logb` `long` `loop` `lrint` `lround`
+`macro` `map` `match` `math_errhandling` `mod` `move` `mut` `mutable`
+`namespace` `native` `nearbyint` `new` `nextafter` `nexttoward` `noexcept`
+`nonatomic` `nonlocal` `noreturn` `not` `not_eq` `null` `nullptr` `of`
+`offsetof` `operator` `or` `or_eq` `orelse` `otherwise` `override` `package`
+`parfor` `pass` `persistent` `pow` `print` `priv` `private` `proc` `property`
+`protected` `protocol` `pub` `public` `pure` `raise` `range` `readonly`
+`readwrite` `receive` `ref` `register` `reinterpret_cast` `rem` `remainder`
+`remquo` `requires` `restrict` `retain` `rethrow` `return` `rint` `round`
+`scalbln` `scalbn` `select` `self` `setjmp` `short` `signbit` `signed` `sin`
+`sinh` `sizeof` `spmd` `sqrt` `static` `static_assert` `static_cast` `stderr`
+`stdin` `stdout` `strictfp` `strong` `super` `switch` `synchronized` `tan`
+`tanh` `template` `tgamma` `this` `thread_local` `throw` `throws` `trait`
+`transient` `trunc` `try` `type` `typedef` `typeid` `typename` `typeof` `union`
+`unsafe` `unsafe_unretained` `unsigned` `unsized` `use` `using` `va_arg`
+`va_copy` `va_end` `va_start` `var` `virtual` `void` `volatile` `wchar_t` `weak`
diff --git a/g3doc/guide.md b/g3doc/guide.md
new file mode 100644
index 0000000..d52ebd5
--- /dev/null
+++ b/g3doc/guide.md
@@ -0,0 +1,460 @@
+# Emboss User Guide
+
+[TOC]
+
+
+## Getting Started
+
+First, you must identify a data structure you want to read and write. These are
+often documented in hardware manuals a bit like [this one, for the fictional
+BN-P-6000404 illuminated button panel](BogoNEL_BN-P-6000404_User_Guide.pdf). We
+will use the BN-P-6000404 as an example.
+
+
+### A Caution
+
+Emboss is still beta software. While we believe that we will not need to make
+any more breaking changes before 1.0, you may still encounter bugs and there are
+many missing features.
+
+You can contact `emboss-dev@google.com` with any issues. Emboss is not an
+officially supported Google product, but the Emboss authors will try to answer
+emails.
+
+
+### System Requirements
+
+#### Running the Emboss Compiler
+
+The Emboss compiler requires Python 3.6 or later.
+
+
+#### Using the Generated Code
+
+The code generated by Emboss requires a C++11-compliant compiler, and a
+reasonably up-to-date standard library. Emboss has been tested with GCC and
+Clang, libc++ and libstdc++. In theory, it should work with MSVC, ICC, etc., but
+it has not been tested, so there are likely to be bugs.
+
+
+#### Contributing to the Compiler
+
+If you want to contribute features or bugfixes to the Emboss compiler itself,
+you will need Bazel to run the Emboss test suite.
+
+
+### Create an `.emb` file
+
+Next, you will need to translate your structures into Emboss.
+
+```
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "bogonel::bnp6000404"]
+```
+
+The BN-P-6000404 uses little-endian numbers, so we can set the default byte
+order to `LittleEndian`. There is no particular C++ namespace implied by the
+BN-P-6000404 user guide, so we use one that is specific to the BN-P-6000404.
+
+The BN-P-6000404, like many devices with serial interfaces, uses a framed
+message system, with a fixed header and a variable message body depending on a
+message ID. For the BN-P-6000404, this framing looks like this:
+
+<!-- TODO(bolms): finalize the "magic value initialization" feature, document it
+here. -->
+
+```
+struct Message:
+ -- Top-level message structure, specified in section 5.3 of the BN-P-6000404
+ -- user guide.
+
+ 0 [+1] UInt sync_1
+ [requires: this == 0x42]
+
+ 1 [+1] UInt sync_2
+ [requires: this == 0x4E]
+
+ 2 [+1] MessageId message_id
+ -- Type of message
+
+ 3 [+1] UInt message_length (ml)
+ -- Length of message, including header and checksum
+
+ # ... body fields to follow ...
+```
+
+We could have chosen to put the header fields into a separate `Header` structure
+instead of placing them directly in the `Message` structure.
+
+The `sync_1` and `sync_2` fields are required to have specific magic values, so
+we add the appropriate `[requires: ...]` attributes to them. This tells Emboss
+that if those fields do not have those values, then the `Message` `struct` is
+ill-formed: in the client code, the `Message` will not be `Ok()` if those fields
+have the wrong values, and Emboss will not allow wrong values to be written into
+those fields using the checked (default) APIs.
+
+Unfortunately, BogoNEL does not provide a nice table of message IDs, but
+fortunately there are only a few, so we can gather them from the individual
+messages:
+
+```
+enum MessageId:
+ -- Message type identifiers for the BN-P-6000404.
+ IDENTIFICATION = 0x01
+ INTERACTION = 0x02
+ QUERY_IDENTIFICATION = 0x10
+ QUERY_BUTTONS = 0x11
+ SET_ILLUMINATION = 0x12
+```
+
+Next, we should translate the individual messages to Emboss.
+
+```
+struct Identification:
+ -- IDENTIFICATION message, specified in section 5.3.3.
+
+ 0 [+4] UInt vendor
+ # 0x4F474F42 is "BOGO" in ASCII, interpreted as a 4-byte little-endian
+ # value.
+ [requires: this == 0x4F47_4F42]
+
+ 0 [+4] UInt:8[4] vendor_ascii
+ -- "BOGO" for BogoNEL Corp
+ # The `vendor` field really contains the four ASCII characters "BOGO", so we
+ # could use a byte array instead of a single UInt. Since it is valid to
+ # have overlapping fields, we can have both `vendor` and `vendor_ascii` in
+ # our Emboss specification.
+
+ 4 [+2] UInt firmware_major
+ -- Firmware major version
+
+ 6 [+2] UInt firmware_minor
+ -- Firmware minor version
+```
+
+<!-- TODO(bolms): fixed-length, ASCIIZ, and variable-length string support? -->
+
+The `Identification` structure is fairly straightforward. In this case, we
+provide an alternate view of the `vendor` field via `vendor_ascii`: 0x4F474F42
+in little-endian works out to the ASCII characters "BOGO".
+
+Note that `vendor_ascii` uses `UInt:8[4]` for its type, and not `UInt[4]`. For
+most fields, we can use plain `UInt` and Emboss will figure out how big the
+`UInt` should be, but for an array we must be explicit that we want 8-bit
+elements.
+
+```
+struct Interaction:
+ -- INTERACTION message, specified in section 5.3.4.
+
+ 0 [+1] UInt number_of_buttons (n)
+ -- Number of buttons currently depressed by user
+
+ 4 [+n] ButtonId:8[n] button_id
+ -- ID of pressed button. A number of entries equal to number_of_buttons
+ -- will be provided.
+```
+
+<!-- TODO(bolms): reserved field support -->
+
+`Interaction` is also fairly straightforward. The only tricky bit is the
+`button_id` field: since `Interaction` can return a variable number of button
+IDs, depending on how many buttons are currently pressed, the `button_id` field
+must have length `n`. It would have been OK to use `[+number_of_buttons]`, but
+full field names can get cumbersome, particularly when the length involves a
+more complex expression. Instead, we set an *alias* for `number_of_buttons`
+using `(n)`, and then use the alias in `button_id`'s length. The `n` alias is
+not visible outside of the `Interaction` message, and won't be available in the
+generated code, so the short name is not likely to cause confusion.
+
+```
+enum ButtonId:
+ -- Button IDs, specified in table 5-6.
+ BUTTON_A = 0x00
+ BUTTON_B = 0x04
+ BUTTON_C = 0x08
+ BUTTON_D = 0x0C
+ BUTTON_E = 0x01
+ BUTTON_F = 0x05
+ BUTTON_G = 0x09
+ BUTTON_H = 0x0D
+ BUTTON_I = 0x02
+ BUTTON_J = 0x06
+ BUTTON_K = 0x0A
+ BUTTON_L = 0x0E
+ BUTTON_M = 0x03
+ BUTTON_N = 0x07
+ BUTTON_O = 0x0B
+ BUTTON_P = 0x0F
+```
+
+We had to prefix all of the button names with `BUTTON_` because Emboss does not
+allow single-character enum names.
+
+The QUERY IDENTIFICATION and QUERY BUTTONS messages don't have any fields other
+than `checksum`, so we will handle them a bit differently.
+
+```
+struct SetIllumination:
+ -- SET ILLUMINATION message, specified in section 5.3.7.
+
+ 0 [+1] bits:
+ 0 [+1] Flag red_channel_enable
+ -- Enables setting the RED channel.
+
+ 1 [+1] Flag blue_channel_enable
+ -- Enables setting the BLUE channel.
+
+ 2 [+1] Flag green_channel_enable
+ -- Enables setting the GREEN channel.
+
+ 1 [+1] UInt blink_duty
+ -- Sets the proportion of time between time on and time off for blink
+ -- feature.
+ --
+ -- Minimum value = 0 (no illumination)
+ --
+ -- Maximum value = 240 (constant illumination)
+ [requires: 0 <= this <= 240]
+
+ 2 [+2] UInt blink_period
+ -- Sets the blink period, in milliseconds.
+ --
+ -- Minimum value = 10
+ --
+ -- Maximum value = 10000
+ [requires: 10 <= this <= 10_000]
+
+ 4 [+4] bits:
+ 0 [+32] UInt:2[16] intensity
+ -- Intensity values for the unmasked channels. 2 bits of intensity for
+ -- each button.
+```
+
+`SetIllumination` requires us to use bitfields. The first bitfield is in the
+CHANNEL MASK field: rather than making a single `channel_mask` field, Emboss
+lets us specify the red, green, and blue channel masks separately.
+
+As with `sync_1` and `sync_2`, we have added `[requires: ...]` to the
+`blink_duty` and `blink_period` fields: this time, specifying a range of valid
+values. `[requires: ...]` accepts an arbitrary expression, which can be as
+simple or as complex as desired.
+
+It is not clear from BogoNEL's documentation whether "bit 0" means the least
+significant or most significant bit of its byte, but a little experimentation
+with the device shows that setting the least significant bit causes
+`SetIllumination` to set its red channel. Emboss always numbers bits in
+bitfields from least significant (bit 0) to most significant.
+
+The other bitfield is the `intensity` array. The BN-P-6000404 uses an array of
+2-bit intensity values, so we specify that array.
+
+Finally, we should add all of the sub-messages into `Message`, and also take
+care of `checksum`. After making those changes, `Message` looks like:
+
+```
+struct Message:
+ -- Top-level message structure, specified in section 5.3 of the BN-P-6000404
+ -- user guide.
+
+ 0 [+1] UInt sync_1
+ [requires: this == 0x42]
+
+ 1 [+1] UInt sync_2
+ [requires: this == 0x4E]
+
+ 2 [+1] MessageId message_id
+ -- Type of message
+
+ 3 [+1] UInt message_length (ml)
+ -- Length of message, including header and checksum
+
+ if message_id == MessageId.IDENTIFICATION:
+ 4 [+ml-8] Identification identification
+
+ if message_id == MessageId.INTERACTION:
+ 4 [+ml-8] Interaction interaction
+
+ if message_id == MessageId.SET_ILLUMINATION:
+ 4 [+ml-8] SetIllumination set_illumination
+
+ 0 [+ml-4] UInt:8[] checksummed_bytes
+
+ ml-4 [+4] UInt checksum
+```
+
+By wrapping the various message types in `if message_id == ...` constructs,
+those substructures will only be available when the `message_id` field is set to
+the corresponding message type. This kind of selection is used for any
+structure field that is only valid some of the time.
+
+The substructures all have the length `ml-8`. The `ml` is a short alias for the
+`message_length` field; these short aliases are available so that the field
+types and names don't have to be pushed far to the right. Aliases may only be
+used directly in the same structure definition where they are created; they may
+not be used elsewhere in an Emboss file, and they are not available in the
+generated code. The length is `ml-8` in this case because the `message_length`
+includes the header and checksum, which are left out of the substructures.
+
+Note that we simply don't have any subfield for QUERY IDENTIFICATION or QUERY
+BUTTONS: since those messages do not have any fields, there is no need for a
+zero-byte structure.
+
+We also added the `checksummed_bytes` field as a convenience for computing the
+checksum.
+
+
+### Generate code
+
+Once you have an `.emb`, you will need to generate code from it.
+
+The simplest way to do so is to run the `embossc` tool:
+
+```
+embossc -I src --generate cc --output-path generated bogonel.emb
+```
+
+The `-I` option adds a directory to the *include path*. The input file -- in
+this case, `bogonel.emb` -- must be found somewhere on the include path.
+
+The `--generate` option specifies which back end to use; `cc` is the C++ back
+end.
+
+The `--output-path` option specifies where the generated file should be placed.
+Note that the output path will include all of the path components of the input
+file: if the input file is `x/y/z.emb`, then the path `x/y/z.emb.h` will be
+appended to the `--output-path`. Missing directories will be created.
+
+
+<!-- #### Using Bazel -->
+
+<!-- TODO(bolms): Make this usable from Bazel. -->
+
+
+### Include the generated C++ code
+
+Emboss generates a single C++ header file from your `.emb` by appending `.h` to
+the file name: to use the BogoNEL definitions, you would `#include
+"path/to/bogonel.emb.h"` in your C++ code.
+
+Currently, Emboss does not generate a corresponding `.cc` file: the code that
+Emboss generates is all templates, which exist in the `.h`. Although the Emboss
+maintainers (e.g., bolms@) like the simplicity of generating a single file, this
+could change at some point.
+
+
+### Use the generated C++ code
+
+Emboss generates *views*, which your program can use to read and write existing
+arrays of bytes, and which do not take ownership. For example:
+
+```c++
+#include "path/to/bogonel.emb.h"
+
+template <typename View>
+bool ChecksumIsCorrect(View message_view);
+
+// Handles BogoNEL BN-P-6000404 device messages from a byte stream. Returns
+// the number of bytes that were processed. Unprocessed bytes should be
+// passed into the next call.
+int HandleBogonelPanelMessages(const char *bytes, int byte_count) {
+ auto message_view = bogonel::bnp6000404::MakeMessageView(bytes, byte_count);
+
+ // IsComplete() will return true if the view has enough bytes to fully
+ // contain the message; i.e., that byte_count is at least
+ // message_view.message_length().Read() + 4.
+ if (!message_view->IsComplete()) {
+ return 0;
+ }
+
+ // If Emboss is happy with the message, we still need to check the checksum:
+ // Emboss does not (yet) have support for automatically checking checksums and
+ // CRCs.
+ if (!message_view->Ok() || !ChecksumIsCorrect(message_view)) {
+ // If the message is complete, but not correct, we need to log an error.
+ HandleBrokenMessage(message_view);
+ return message_view->Size();
+ }
+
+
+ // At this point, we know the message is complete and (basically) OK, so
+ // we dispatch it to a message-type-specific handler.
+ switch (message_view->message_id().Read()) {
+ case bogonel::bnp6000404::MessageId::IDENTIFICATION:
+ HandleIdentificationMessage(message_view);
+ break;
+
+ case bogonel::bnp6000404::MessageId::INTERACTION:
+ HandleInteractionMessage(message_view);
+ break;
+
+ case bogonel::bnp6000404::MessageId::QUERY_IDENTIFICATION:
+ case bogonel::bnp6000404::MessageId::QUERY_BUTTONS:
+ case bogonel::bnp6000404::MessageId::SET_ILLUMINATION:
+ Log("Unexpected host to device message type.");
+ break;
+
+ default:
+ Log("Unknown message type.");
+ break;
+ }
+
+ return message_view->Size();
+}
+
+template <typename View>
+bool ChecksumIsCorrect(View message_view) {
+ uint32_t checksum = 0;
+  for (int i = 0; i < message_view.checksummed_bytes().ElementCount(); ++i) {
+    checksum += message_view.checksummed_bytes()[i].Read();
+ }
+ return checksum == message_view.checksum().Read();
+}
+```
+
+<!-- TODO(bolms): solidify support for checksums, so that the Ok() call in the
+example actually checks them. -->
+
+The `message_view` object in this example is a lightweight object that simply
+provides *access* to the bytes in `bytes`. Emboss views are very cheap to
+construct because they only contain a couple of pointers and a length -- they do
+not copy or take ownership of the underlying bytes. This also means that you
+have to keep the underlying bytes alive as long as you are using a view -- you
+can't let them go out of scope or delete them.
+
+Views can also be used for writing, if they are given pointers to mutable
+memory:
+
+```c++
+void ConstructSetIlluminationMessage(const vector<bool> &lit_buttons,
+ vector<char> *result) {
+ // The SetIllumination message has a constant size, so SizeInBytes() is
+ // available as a static method.
+ int length = bogonel::bnp6000404::SetIllumination::SizeInBytes() + 8;
+ result->clear();
+ result->resize(length);
+
+ auto view = bogonel::bnp6000404::MakeMessageView(result);
+ view->sync_1().Write(0x42);
+ view->sync_2().Write(0x4E);
+ view->message_id().Write(bogonel::bnp6000404::MessageId::SET_ILLUMINATION);
+ view->message_length().Write(length);
+ view->set_illumination().red_channel_enable().Write(true);
+ view->set_illumination().blue_channel_enable().Write(true);
+ view->set_illumination().green_channel_enable().Write(true);
+ view->set_illumination().blink_duty().Write(240);
+ view->set_illumination().blink_period().Write(10000);
+ for (int i = 0; i < view->set_illumination().intensity().ElementCount();
+ ++i) {
+ view->set_illumination().intensity()[i].Write(lit_buttons[i] ? 3 : 0);
+ }
+}
+```
+
+
+### Use the `.emb` Autoformatter
+
+You can use the `.emb` autoformatter to avoid manual formatting. For now, it is
+available at `front_end/format.py`.
+
+*TODO(bolms): Package the Emboss tools for easy workstation installation.*
diff --git a/g3doc/index.md b/g3doc/index.md
new file mode 100644
index 0000000..6c383e8
--- /dev/null
+++ b/g3doc/index.md
@@ -0,0 +1,18 @@
+Welcome to Emboss, the Embedded Systems Binary Structure Tool.
+
+If you are new to Emboss, a good place to start would be the [User
+Guide](guide.md).
+
+The [C++ User Guide](cpp-guide.md) has an (incomplete) explanation of the
+generated C++ code.
+
+Details of the Emboss language can be found in the [Emboss Language
+Reference](language-reference.md).
+
+A reference to the C++ code that Emboss generates can be found in the [Emboss
+C++ Generated Code Reference](cpp-reference.md).
+
+Details of the textual representation Emboss uses for structures can be found in
+the [Emboss Text Format Reference](text-format.md).
+
+There is a tentative [roadmap of future development](roadmap.md).
diff --git a/g3doc/language-reference.md b/g3doc/language-reference.md
new file mode 100644
index 0000000..3a07bce
--- /dev/null
+++ b/g3doc/language-reference.md
@@ -0,0 +1,1390 @@
+# Emboss Language Reference
+
+[TOC]
+
+## Top Level Structure
+
+An `.emb` file contains up to four sections, in this order: a documentation
+block, imports, an attribute block whose attributes apply to the whole module,
+and a list of type definitions:
+
+```
+# Documentation block (optional)
+-- This is an example of an .emb file, with every section.
+
+# Imports (optional)
+import "other.emb" as other
+import "project/more.emb" as project_more
+
+# Attribute block (optional)
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "foo::bar::baz"]
+[(java) namespace: "com.example.foo.bar.baz"]
+
+# Type definitions
+enum Foo:
+ ONE = 1
+ TEN = 10
+ PURPLE = 12
+
+struct Bar:
+ 0 [+4] Foo purple
+ 4 [+4] UInt payload_size (s)
+ 8 [+s] UInt:8[] payload
+```
+
+The documentation block, imports, and attribute block are each optional, and
+may be omitted if they are not necessary.
+
+
+### Comments
+
+Comments start with `#` and extend to the end of the line:
+
+```
+struct Foo: # This is a comment
+ # This is a comment
+ 0 [+1] UInt field # This is a comment
+```
+
+Comments are ignored. They should not be confused with
+[*documentation*](#documentation), which is intended to be used by some back
+ends.
+
+
+## Documentation
+
+Documentation blocks may be attached to modules, types, fields, or enum values.
+They are different from comments in that they will be used by the
+(not-yet-ready) documentation generator back-end.
+
+Documentation blocks take the form of any number of lines starting with `-- `:
+
+```
+-- This is a module documentation block. Text in this block will be attached to
+-- the module as documentation.
+--
+-- This is a new paragraph in the same module documentation block.
+--
+-- Module-level documentation should describe the purpose of the module, and may
+-- point out the most salient features of the module.
+
+struct Message:
+ -- This is a documentation block attached to the Message structure. It should
+ -- describe the purpose of Message, and how it should be used.
+ 0 [+4] UInt header_length
+ -- This is documentation for the header_length field. Again, it should
+ -- describe this specific field.
+ 4 [+4] MessageType message_type -- Short docs can go on the same line.
+```
+
+Documentation should be written in CommonMark format, ignoring the leading
+`-- `.
+
+
+## Imports
+
+An `import` line tells Emboss to read another `.emb` file and make its types
+available to the current file under the given name. For example, given the
+import line:
+
+```
+import "other.emb" as helper
+```
+
+then the type `Type` from `other.emb` may be referenced as `helper.Type`.
+
+The `--import-dir` command-line flag tells Emboss which directories to search
+for imported files; it may be specified multiple times. If no `--import-dir` is
+specified, Emboss will search the current working directory.
+
+
+## Attributes
+
+Attributes are an extensible way of adding arbitrary information to a module,
+type, field, or enum value. Currently, only whitelisted attributes are allowed
+by the Emboss compiler, but this may change in the future.
+
+Attributes take a form like:
+
+```
+[name: value] # name has value for the current entity.
+[$default name: value] # Default name to value for all sub-entities.
+[(backend) name: value] # Attribute for a specific back end.
+```
+
+
+### `byte_order`
+
+The `byte_order` attribute is used to specify the byte order of `bits` fields
+and of fields with an atomic type, such as `UInt`.
+
+`byte_order` takes a string value, which must be one of `"BigEndian"`,
+`"LittleEndian"`, or `"Null"`:
+
+```
+[$default byte_order: "LittleEndian"]
+
+struct Foo:
+ [$default byte_order: "Null"]
+
+ 0 [+4] UInt bar
+ [byte_order: "BigEndian"]
+
+ 4 [+4] bits:
+ [byte_order: "LittleEndian"]
+
+ 0 [+23] UInt baz
+ 23 [+9] UInt qux
+
+ 8 [+1] UInt froble
+```
+
+A `$default` byte order may be set on a module or structure.
+
+The `"BigEndian"` and `"LittleEndian"` byte orders set the byte order to big or
+little endian, respectively. That is, for little endian:
+
+```
+ byte 0 byte 1 byte 2 byte 3
++--------+--------+--------+--------+
+|76543210|76543210|76543210|76543210|
++--------+--------+--------+--------+
+ ^ ^ ^ ^ ^ ^ ^ ^
+ 07 00 15 08 23 16 31 24
+ ^^^^^^^^^^^^^^^ bit ^^^^^^^^^^^^^^^
+```
+
+And for big endian:
+
+```
+ byte 0 byte 1 byte 2 byte 3
++--------+--------+--------+--------+
+|76543210|76543210|76543210|76543210|
++--------+--------+--------+--------+
+ ^ ^ ^ ^ ^ ^ ^ ^
+ 31 24 23 16 15 08 07 00
+ ^^^^^^^^^^^^^^^ bit ^^^^^^^^^^^^^^^
+```
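+
+As a rough illustration (not Emboss's generated code), the two layouts above
+correspond to the following hand-written C++ reads of a 32-bit unsigned value.
+The helper names `ReadLittleEndian32` and `ReadBigEndian32` are hypothetical:
+
+```c++
+#include <cstdint>
+
+// Hypothetical helpers, not part of Emboss.  Each assembles a 32-bit unsigned
+// value from four consecutive bytes.
+
+// Little endian: byte 0 holds bits 0-7, byte 3 holds bits 24-31.
+inline uint32_t ReadLittleEndian32(const uint8_t *bytes) {
+  return static_cast<uint32_t>(bytes[0]) |
+         (static_cast<uint32_t>(bytes[1]) << 8) |
+         (static_cast<uint32_t>(bytes[2]) << 16) |
+         (static_cast<uint32_t>(bytes[3]) << 24);
+}
+
+// Big endian: byte 0 holds bits 24-31, byte 3 holds bits 0-7.
+inline uint32_t ReadBigEndian32(const uint8_t *bytes) {
+  return (static_cast<uint32_t>(bytes[0]) << 24) |
+         (static_cast<uint32_t>(bytes[1]) << 16) |
+         (static_cast<uint32_t>(bytes[2]) << 8) |
+         static_cast<uint32_t>(bytes[3]);
+}
+```
+
+Given the bytes `01 02 03 04`, the little-endian read yields `0x04030201` and
+the big-endian read yields `0x01020304`.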
+
+The `"Null"` byte order is used if no `byte_order` attribute is specified.
+`"Null"` indicates that the byte order is unknown; it is an error if a
+byte-order-dependent field that is not exactly 8 bits has the `"Null"` byte
+order.
+
+
+### `requires`
+
+The `requires` attribute may be placed on an atomic field (e.g., type `UInt`,
+`Int`, `Flag`, etc.) to specify a predicate that values of that field must
+satisfy, or on a `struct` or `bits` to specify relationships between fields that
+must be satisfied.
+
+```
+struct Foo:
+ [requires: bar < qux]
+
+ 0 [+4] UInt bar
+ [requires: this <= 999_999_999]
+
+ 4 [+4] UInt qux
+ [requires: 100 <= this <= 1_000_000_000]
+
+ let bar_plus_qux = bar + qux
+ [requires: this >= 199]
+```
+
+For `[requires]` on a field, other fields may not be referenced, and the value
+of the current field must be referred to as `this`.
+
+For `[requires]` on a `struct` or `bits`, any atomic field in the structure may
+be referenced.
+
+
+### `(cpp) namespace`
+
+The `namespace` attribute is used by the C++ back end to determine which
+namespace to place the generated code in:
+
+```
+[(cpp) namespace: "foo::bar::baz"]
+```
+
+A leading `::` is allowed, but not required; the previous example could also be
+written as:
+
+```
+[(cpp) namespace: "::foo::bar::baz"]
+```
+
+Internally, Emboss will translate either of these into a nested `namespace foo {
+namespace bar { namespace baz { ... } } }` wrapping the generated C++ code for
+this module.
+
+The `namespace` attribute may only be used at the module level; all structures
+and enums within a module will be placed in the same namespace.
+
+
+### `text_output`
+
+The `text_output` attribute may be attached to a `struct` or `bits` field to
+control whether or not the field is included when emitting the text format
+version of the structure. For example:
+
+```
+struct SuppressedField:
+ 0 [+1] UInt a
+ 1 [+1] UInt b
+ [text_output: "Skip"]
+```
+
+The text format output (as from `emboss::WriteToString()` in C++) would be of
+the form:
+
+```
+{ a: 1 }
+```
+
+instead of the default:
+
+```
+{ a: 1, b: 2 }
+```
+
+For completeness, `[text_output: "Emit"]` may be used to explicitly specify that
+a field should be included in text output.
+
+
+### `external` specifier attributes
+
+The `addressable_unit_size`, `type_requires`, `fixed_size_in_bits`, and
+`is_integer` attributes are used on `external` types to tell the compiler what
+it needs to know about the `external` types. They are currently
+unstable, and should only be used internally.
+
+
+## Type Definitions
+
+Emboss allows you to define structs, bits, and enums, and uses externals
+to define "basic types." Types may be defined in any order, and may freely
+reference other types in the same module or any imported modules (including the
+implicitly-imported prelude).
+
+### `struct`
+
+A `struct` defines a view of a sequence of bytes. Each field of a `struct` is a
+view of some particular subsequence of the `struct`'s bytes, whose
+interpretation is determined by the field's type.
+
+For example:
+
+```
+struct FramedMessage:
+ -- A FramedMessage wraps a Message with magic bytes, lengths, and CRC.
+ [$default byte_order: "LittleEndian"]
+ 0 [+4] UInt magic_value
+ 4 [+4] UInt header_length (h)
+ 8 [+4] UInt message_length (m)
+ h [+m] Message message
+ h+m [+4] UInt crc32
+ [byte_order: "BigEndian"]
+```
+
+The first line introduces the `struct` and gives it a name. This name may be
+used in field definitions to specify that the field has a structured type, and
+is used in the generated code. For example, to read the `message_length` from a
+sequence of bytes in C++, you would construct a `FramedMessageView` over the
+bytes:
+
+```c++
+// vector<uint8_t> bytes;
+auto framed_message_view = FramedMessageView(&bytes[0], bytes.size());
+uint32_t message_length = framed_message_view.message_length().Read();
+```
+
+(Note that the `FramedMessageView` does not take ownership of the bytes: it only
+provides a view of them.)
+
+Each field starts with a byte range (`0 [+4]`) that indicates *where* the field
+sits in the struct. For example, the `magic_value` field covers the first four
+bytes of the struct.
+
+Field locations *do not have to be constants*. In the example above, the
+`message` field starts at the end of the header (as determined by the
+`header_length` field) and covers `message_length` bytes.
+
+After the field's location is the field's *type*. The type determines how the
+field's bytes are interpreted: the `header_length` field will be interpreted as
+an unsigned integer (`UInt`), while the `message` field is interpreted as a
+`Message` -- another `struct` type defined elsewhere.
+
+After the type is the field's *name*: this is a name used in the generated code
+to access that field, as in `framed_message_view.message_length()`. The name
+may be followed by an optional *abbreviation*, like the `(h)` after
+`header_length`. The abbreviation can be used elsewhere in the `struct`, but is
+not available in the generated code: `framed_message_view.h()` wouldn't compile.
+
+Finally, fields may have attributes and documentation, just like any other
+Emboss construct.
+
+
+#### Parameters
+
+`struct`s and `bits` can take runtime parameters:
+
+```
+struct Foo(x: Int:8, y: Int:8):
+ 0 [+x] UInt:8[] xs
+ x [+y] UInt:8[] ys
+
+enum Version:
+ VERSION_1 = 10
+ VERSION_2 = 20
+
+struct Bar(version: Version):
+ 0 [+1] UInt payload
+ if payload == 1 && version == Version.VERSION_1:
+ 1 [+10] OldPayload1 old_payload_1
+ if payload == 1 && version == Version.VERSION_2:
+ 1 [+12] NewPayload1 new_payload_1
+```
+
+Each parameter must have the form *name`:` type*. Currently, the *type* can
+be:
+
+* `UInt:`*`n`*, where *`n`* is a number from 1 to 64, inclusive.
+* `Int:`*`n`*, where *`n`* is a number from 1 to 64, inclusive.
+* The name of an Emboss `enum` type.
+
+`UInt`- and `Int`-typed parameters are integers with the corresponding range:
+for example, an `Int:4` parameter can have any integer value from -8 to +7.
+
+`enum`-typed parameters can take any value in the `enum`'s native range. Note
+that Emboss `enum`s are *open*, so unnamed values are allowed.
+
+Parameterized structures can be included in other structures by passing their
+parameters:
+
+```
+struct Baz:
+ 0 [+1] Version version
+ 1 [+1] UInt:8 size
+ 2 [+size] Bar(version) bar
+```
+
+
+#### Virtual "Fields"
+
+It is possible to define a non-physical "field" whose value is an expression:
+
+```
+struct Foo:
+ 0 [+4] UInt bar
+ let two_bar = 2 * bar
+```
+
+These virtual "fields" may be used like any other field in most circumstances:
+
+```
+struct Bar:
+ 0 [+4] Foo foo
+ if foo.two_bar < 100:
+ foo.two_bar [+4] UInt uint_at_offset_two_bar
+```
+
+Virtual fields may be integers, booleans, or enums:
+
+```
+enum Size:
+ SMALL = 1
+ LARGE = 2
+
+struct Qux:
+ 0 [+4] UInt x
+ let x_is_big = x > 100
+ let x_size = x_is_big ? Size.LARGE : Size.SMALL
+```
+
+When a virtual field has a constant value, you may refer to it using its type:
+
+```
+struct Foo:
+ let foo_offset = 0x120
+ 0 [+4] UInt foo
+
+struct Bar:
+ Foo.foo_offset [+4] Foo foo
+```
+
+This does not work for non-constant virtual fields:
+
+```
+struct Foo:
+ 0 [+4] UInt foo
+ let foo_offset = foo + 10
+
+struct Bar:
+ Foo.foo_offset [+4] Foo foo # Won't compile.
+```
+
+Note that, in some cases, you *must* use `Type.field`, and not `field.field`:
+
+```
+struct Foo:
+ 0 [+4] UInt foo
+ let foo_offset = 10
+
+struct Bar:
+ # Won't compile: foo.foo_offset depends on foo, which depends on
+ # foo.foo_offset.
+ foo.foo_offset [+4] Foo foo
+
+ # Will compile: Foo.foo_offset is a static constant.
+ Foo.foo_offset [+4] Foo foo
+```
+
+This limitation may be lifted in the future, but it has no practical effect.
+
+
+##### Aliases
+
+Virtual fields of the form `let x = y` or `let x = y.z.q` are allowed even when
+`y` or `q` are composite fields. Virtuals of this form are considered to be
+*aliases* of the referred field; in generated code, they may be written as well
+as read, and writing through them is equivalent to writing to the aliased field.
+
+
+##### Simple Transforms
+
+Virtual fields of the forms `let x1 = y + 1`, `let x2 = 2 + y`, `let x3 = y -
+3`, and `let x4 = 4 - y`, where `y` is a writeable field, will be writeable in
+the generated code. When writing through these fields, the transformed field
+will be set to an appropriate value. For example, writing `5` to `x1` will
+actually write `4` to `y`, and writing `6` to `x4` will write `-2` to `y`. This
+can be used to model fields whose raw values should be adjusted by some constant
+value, e.g.:
+
+```
+struct PosixDate:
+ 0 [+1] Int raw_year
+ -- Number of years since 1900.
+
+ let year = raw_year + 1900
+ -- Gregorian year number.
+
+ 1 [+1] Int zero_based_month
+ -- Month number, from 0-11. Good for looking up a month name in a table.
+
+ let month = zero_based_month + 1
+ -- Month number, from 1-12. Good for printing directly.
+
+ 2 [+1] Int day
+ -- Day number, one-based.
+```
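+
+The read and write behavior described above can be sketched in plain C++.
+This is only an illustration of the arithmetic, not the code Emboss generates:
+`PosixDateSketch`, `ReadYear`, and `TryWriteYear` are hypothetical names, and
+the range check is an assumption about what a reasonable implementation would
+do when the inverse-transformed value does not fit in the physical field.
+
+```c++
+#include <cstdint>
+
+// Hypothetical stand-in for a view over PosixDate; not the real Emboss API.
+struct PosixDateSketch {
+  int8_t raw_year;  // Physical field: number of years since 1900.
+
+  // Reading the virtual field `year = raw_year + 1900` applies the transform.
+  int ReadYear() const { return raw_year + 1900; }
+
+  // Writing inverts the transform and stores the result in `raw_year`.
+  // Returns false, leaving `raw_year` unchanged, if the result does not fit
+  // in the one-byte signed physical field.
+  bool TryWriteYear(int year) {
+    const int raw = year - 1900;
+    if (raw < INT8_MIN || raw > INT8_MAX) return false;
+    raw_year = static_cast<int8_t>(raw);
+    return true;
+  }
+};
+```
+
+With this sketch, writing `2019` through `TryWriteYear` stores `119` in
+`raw_year`, mirroring how writing to `year` would update `raw_year`.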
+
+
+#### Subtypes
+
+A `struct` definition may contain other type definitions:
+
+```
+struct Foo:
+ struct Bar:
+ 0 [+2] UInt baz
+ 2 [+2] UInt qux
+
+ 0 [+4] Bar bar
+ 4 [+4] Bar bar2
+```
+
+
+#### Conditional fields
+
+A `struct` may have fields which are only present under some circumstances.
+For example:
+
+```
+struct FramedMessage:
+ 0 [+4] enum message_id:
+ TYPE1 = 1
+ TYPE2 = 2
+
+ if message_id == MessageId.TYPE1:
+ 4 [+16] Type1Message type_1_message
+
+ if message_id == MessageId.TYPE2:
+ 4 [+8] Type2Message type_2_message
+```
+
+The `type_1_message` field will only be available if `message_id` is `TYPE1`,
+and similarly the `type_2_message` field will only be available if `message_id`
+is `TYPE2`. If `message_id` is some other value, then neither field will be
+available.
+
+
+#### Inline `struct`
+
+It is possible to define a `struct` inline in a `struct` field. For example:
+
+```
+struct Message:
+ [$default byte_order: "BigEndian"]
+ 0 [+4] UInt message_length
+ 4 [+4] struct payload:
+ 0 [+1] UInt incoming
+ 2 [+2] UInt scale_factor
+```
+
+This is equivalent to:
+
+```
+struct Message:
+ [$default byte_order: "BigEndian"]
+
+ struct Payload:
+ 0 [+1] UInt incoming
+ 2 [+2] UInt scale_factor
+
+ 0 [+4] UInt message_length
+ 4 [+4] Payload payload
+```
+
+This can be useful as a way to group related fields together.
+
+
+#### Automatically-Generated Fields
+
+A `struct` will have `$size_in_bytes`, `$max_size_in_bytes`, and
+`$min_size_in_bytes` virtual fields automatically generated. These virtual
+fields can be referenced inside the Emboss language just like any other virtual
+field:
+
+```
+struct Inner:
+ 0 [+4] UInt field_a
+ 4 [+4] UInt field_b
+
+struct Outer:
+ 0 [+1] UInt message_type
+ if message_type == 4:
+ 4 [+Inner.$size_in_bytes] Inner payload
+```
+
+
+##### `$size_in_bytes` {#size-in-bytes}
+
+An Emboss `struct` has an *intrinsic* size, which is the size required to hold
+every field in the `struct`, regardless of how many bytes are in the buffer that
+backs the `struct`. For example:
+
+```
+struct FixedSize:
+ 0 [+4] UInt long_field
+ 4 [+2] UInt short_field
+```
+
+In this case, `FixedSize.$size_in_bytes` will always be `6`, even if a
+`FixedSize` is placed in a larger field:
+
+```
+struct Envelope:
+ # padded_payload.$size_in_bytes == FixedSize.$size_in_bytes == 6
+ 0 [+8] FixedSize padded_payload
+```
+
+The intrinsic size of a `struct` might not be constant:
+
+```
+struct DynamicallySizedField:
+ 0 [+1] UInt length
+ 1 [+length] UInt:8[] payload
+ # $size_in_bytes == 1 + length
+
+struct DynamicallyPlacedField:
+ 0 [+1] UInt offset
+ offset [+1] UInt payload
+ # $size_in_bytes == offset + 1
+
+struct OptionalField:
+ 0 [+1] UInt version
+ if version > 3:
+ 1 [+1] UInt optional_field
+ # $size_in_bytes == (version > 3 ? 2 : 1)
+```
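+
+The size comments in the example above amount to the following arithmetic,
+written here as hand-rolled C++ rather than Emboss-generated code. The function
+names are hypothetical, and each function assumes that at least the first byte
+of the structure is readable:
+
+```c++
+#include <cstddef>
+#include <cstdint>
+
+// DynamicallySizedField: one length byte followed by `length` payload bytes.
+inline std::size_t DynamicallySizedFieldSize(const uint8_t *bytes) {
+  return 1u + bytes[0];
+}
+
+// DynamicallyPlacedField: a one-byte payload located at byte `offset`.
+inline std::size_t DynamicallyPlacedFieldSize(const uint8_t *bytes) {
+  return bytes[0] + 1u;
+}
+
+// OptionalField: the second byte exists only when `version` is above 3.
+inline std::size_t OptionalFieldSize(const uint8_t *bytes) {
+  return bytes[0] > 3 ? 2u : 1u;
+}
+```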
+
+If the intrinsic size is dynamic, it can still be read dynamically from a field:
+
+```
+struct Envelope2:
+ 0 [+1] UInt payload_size
+ 1 [+payload_size] DynamicallySizedField payload
+ let padding_bytes = payload_size - payload.$size_in_bytes
+```
+
+
+##### `$max_size_in_bytes` {#max-size-in-bytes}
+
+The `$max_size_in_bytes` virtual field is a constant value that is at least as
+large as the largest possible value for `$size_in_bytes`. In most cases, it
+will exactly equal the largest possible message size, but it is possible to
+outsmart Emboss's bounds checker.
+
+```
+struct DynamicallySizedStruct:
+ 0 [+1] UInt length
+ 1 [+length] UInt:8[] payload
+
+struct PaddedContainer:
+ 0 [+DynamicallySizedStruct.$max_size_in_bytes] DynamicallySizedStruct s
+ # s will be 256 bytes long.
+```
+
+
+##### `$min_size_in_bytes` {#min-size-in-bytes}
+
+The `$min_size_in_bytes` virtual field is a constant value that is no larger
+than the smallest possible value for `$size_in_bytes`. In most cases, it will
+exactly equal the smallest possible message size, but it is possible to
+outsmart Emboss's bounds checker.
+
+```
+struct DynamicallySizedStruct:
+ 0 [+1] UInt length
+ 1 [+length] UInt:8[] payload
+
+struct PaddedContainer:
+ 0 [+DynamicallySizedStruct.$min_size_in_bytes] DynamicallySizedStruct s
+ # s will be 1 byte long.
+```
+
+
+### `enum`
+
+An `enum` defines a set of named integers.
+
+```
+enum Color:
+ BLACK = 0
+ RED = 1
+ GREEN = 2
+ YELLOW = 3
+ BLUE = 4
+ MAGENTA = 5
+ CYAN = 6
+ WHITE = 7
+
+struct PaletteEntry:
+ 0 [+1] UInt id
+ 1 [+1] Color color
+```
+
+Enum values are always read the same way as `Int` or `UInt` -- that is, as an
+unsigned integer or as a 2's-complement signed integer, depending on whether the
+`enum` contains any negative values or not.
+
+Enum values do not have to be contiguous, and may repeat:
+
+```
+enum Baud:
+ B300 = 300
+ B600 = 600
+ B1200 = 1200
+ STANDARD = 1200
+```
+
+All values in a single `enum` must either be between -9223372036854775808
+(-2^63) and 9223372036854775807 (2^(63)-1), inclusive, or between 0 and
+18446744073709551615 (2^(64)-1), inclusive.
+
+It is valid to have an `enum` field that is too small to contain some values in
+the `enum`:
+
+```
+enum LittleAndBig:
+ LITTLE = 1
+ BIG = 0x1_0000_0000
+
+struct LittleOnly:
+ 0 [+1] LittleAndBig:8 little_only # Too small to hold LittleAndBig.BIG
+```
+
+
+#### Inline `enum`
+
+It is possible to provide an enum definition directly in a field definition in a
+`struct` or `bits`:
+
+```
+struct TurnSpecification:
+ 0 [+1] UInt degrees
+ 1 [+1] enum direction:
+ LEFT = 0
+ RIGHT = 1
+```
+
+This example creates a nested `enum` `TurnSpecification.Direction`, exactly as
+if it were written:
+
+```
+struct TurnSpecification:
+ enum Direction:
+ LEFT = 0
+ RIGHT = 1
+
+ 0 [+1] UInt degrees
+ 1 [+1] Direction direction
+```
+
+This can be useful when a particular `enum` is short and only used in one place.
+
+
+### `bits`
+
+A `bits` defines a view of an ordered sequence of bits. Each field is a view of
+some particular subsequence of the `bits`'s bits, whose interpretation is
+determined by the field's type.
+
+The structure of a `bits` definition is very similar to a `struct`, except that
+a `struct` provides a structured view of bytes, where a `bits` provides a
+structured view of bits. Fields in a `bits` must have bit-oriented types (such
+as other `bits`, `UInt`, `Bcd`, `Flag`). Byte-oriented types, such as
+`struct`s, may not be embedded in a `bits`.
+
+For example:
+
+```
+bits ControlRegister:
+ -- The `ControlRegister` holds basic control values.
+
+ 4 [+12] UInt horizontal_start_offset
+ -- The number of pixel clock ticks to wait after the start of a line
+ -- before starting to draw pixel data.
+
+ 3 [+1] Flag horizontal_overscan_disable
+ -- If set, the electron gun will be disabled during the overscan period,
+ -- otherwise the overscan color will be used.
+
+ 0 [+3] UInt horizontal_overscan_color
+ -- The palette index of the overscan color to use.
+
+struct RegisterPage:
+ -- The registers of the BGA (Bogus Graphics Array) card.
+
+ 0 [+2] ControlRegister control_register
+ [byte_order: "LittleEndian"]
+```
+
+The first line introduces the `bits` and gives it a name. This name may be
+used in field definitions to specify that the field has a structured type, and
+is used in the generated code.
+
+For example, to write a `horizontal_overscan_color` of 7 to a pair of bytes in
+C++, you would use:
+
+```c++
+// vector<uint8_t> bytes;
+auto register_page_view = RegisterPageWriter(&bytes[0], bytes.size());
+register_page_view.control_register().horizontal_overscan_color().Write(7);
+```
+
+Similar to `struct`, each field starts with a *bit* range (`4 [+12]`) that
+indicates which bits it covers. For example, the `horizontal_overscan_disable`
+field only covers bit 3. Bit 0 always corresponds to the lowest-order bit of
+the bitfield; that is, if a `UInt` covers the same bits as the `bits` construct,
+then bit 0 in the `bits` will be the same as the `UInt` mod 2. This is often,
+but not always, how bits are numbered in protocol specifications.
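+
+To make the bit numbering concrete, here is a hand-written sketch of decoding
+the `ControlRegister` fields from the two little-endian bytes of
+`RegisterPage.control_register`. It is not the code Emboss generates;
+`ControlRegisterValues` and `DecodeControlRegister` are hypothetical names used
+only for illustration:
+
+```c++
+#include <cstdint>
+
+// Hypothetical plain-struct copy of the ControlRegister fields.
+struct ControlRegisterValues {
+  uint16_t horizontal_start_offset;   // bits 4-15
+  bool horizontal_overscan_disable;   // bit 3
+  uint8_t horizontal_overscan_color;  // bits 0-2
+};
+
+inline ControlRegisterValues DecodeControlRegister(const uint8_t *bytes) {
+  // Assemble the whole 16-bit little-endian register first.  Bit 0 is the
+  // lowest-order bit of this value, i.e. `reg % 2`.
+  const uint16_t reg = static_cast<uint16_t>(bytes[0] | (bytes[1] << 8));
+
+  ControlRegisterValues values;
+  values.horizontal_overscan_color = static_cast<uint8_t>(reg & 0x7);
+  values.horizontal_overscan_disable = ((reg >> 3) & 0x1) != 0;
+  values.horizontal_start_offset = static_cast<uint16_t>((reg >> 4) & 0xFFF);
+  return values;
+}
+```
+
+For the bytes `2F 01`, the register value is `0x012F`, so the sketch yields an
+overscan color of 7, the overscan-disable flag set, and a start offset of 18.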
+
+After the field's location is the field's *type*. The type determines how the
+field's bits are interpreted: typical choices are `UInt` (for unsigned
+integers), `Flag` (for boolean flags), and `enum`s. Other `bits` may also be
+used, as well as any `external` types declared with `[addressable_unit_size:
+1]`.
+
+Fields may have attributes and documentation, just like any other Emboss
+construct.
+
+In generated code, reading or writing any field of a `bits` construct will cause
+the entire `bits` to be read or written -- something to keep in mind when
+reading or writing a memory-mapped register space.
+
+
+#### Automatically-Generated Fields
+
+A `bits` will have `$size_in_bits`, `$max_size_in_bits`, and `$min_size_in_bits`
+virtual fields automatically generated. These virtual fields can be referenced
+inside the Emboss language just like any other virtual field:
+
+```
+bits Inner:
+ 0 [+4] UInt field_a
+ 4 [+4] UInt field_b
+
+struct Outer:
+ 0 [+1] UInt message_type
+ if message_type == 4:
+ 4 [+Inner.$size_in_bits] Inner payload
+```
+
+
+##### `$size_in_bits` {#size-in-bits}
+
+Like a `struct`, an Emboss `bits` has an *intrinsic* size, which is the size
+required to hold every field in the `bits`, regardless of how many bits are
+in the buffer that backs the `bits`. For example:
+
+```
+bits FixedSize:
+ 0 [+3] UInt long_field
+ 3 [+1] Flag short_field
+```
+
+In this case, `FixedSize.$size_in_bits` will always be `4`, even if a
+`FixedSize` is placed in a larger field:
+
+```
+struct Envelope:
+ # padded_payload.$size_in_bits == FixedSize.$size_in_bits == 4
+ 0 [+8] FixedSize padded_payload
+```
+
+Unlike `struct`s, the size of `bits` must be known at compile time; there are no
+dynamic `$size_in_bits` fields.
+
+
+##### `$max_size_in_bits` {#max-size-in-bits}
+
+Since `bits` must be fixed size, the `$max_size_in_bits` field has the same
+value as `$size_in_bits`. It is provided for consistency with
+`$max_size_in_bytes`.
+
+
+##### `$min_size_in_bits` {#min-size-in-bits}
+
+Since `bits` must be fixed size, the `$min_size_in_bits` field has the same
+value as `$size_in_bits`. It is provided for consistency with
+`$min_size_in_bytes`.
+
+
+#### Anonymous `bits`
+
+It is possible to use an anonymous `bits` definition directly in a `struct`;
+for example:
+
+```
+struct Message:
+ [$default byte_order: "BigEndian"]
+ 0 [+4] UInt message_length
+ 4 [+4] bits:
+ 0 [+1] Flag incoming
+ 1 [+1] Flag last_fragment
+ 2 [+4] UInt scale_factor
+ 31 [+1] Flag error
+```
+
+In this case, the fields of the `bits` will be treated as though they are fields
+of the outer struct.
+
+
+#### Inline `bits`
+
+Like `enum`s, it is also possible to define a named `bits` inline in a `struct`
+or `bits`. For example:
+
+```
+struct Message:
+ [$default byte_order: "BigEndian"]
+ 0 [+4] UInt message_length
+ 4 [+4] bits payload:
+ 0 [+1] Flag incoming
+ 1 [+1] Flag last_fragment
+ 2 [+4] UInt scale_factor
+ 31 [+1] Flag error
+```
+
+This is equivalent to:
+
+```
+struct Message:
+ [$default byte_order: "BigEndian"]
+
+ bits Payload:
+ 0 [+1] Flag incoming
+ 1 [+1] Flag last_fragment
+ 2 [+4] UInt scale_factor
+ 31 [+1] Flag error
+
+ 0 [+4] UInt message_length
+ 4 [+4] Payload payload
+```
+
+This can be useful as a way to group related fields together.
+
+
+### `external`
+
+An `external` type is used when a type cannot be defined in Emboss itself;
+instead, external code must be provided to manipulate the type.
+
+Emboss's built-in types, such as `UInt`, `Bcd`, and `Flag`, are defined this way
+in a special file called the *prelude*. For example, `UInt` is defined as:
+
+```
+external UInt:
+ -- UInt is an automatically-sized unsigned integer.
+ [type_requires: $is_statically_sized && 1 <= $static_size_in_bits <= 64]
+ [is_integer: true]
+ [addressable_unit_size: 1]
+```
+
+`external` types are an unstable feature. Contact `emboss-dev` if you would
+like to add your own `external`s.
+
+
+## Builtin Types and the Prelude
+
+Emboss has a built-in module called the *Prelude*, which contains types that are
+automatically usable from any module. In particular, types like `Int` and
+`UInt` are defined in the Prelude.
+
+The Prelude is (more or less) a standard Emboss file, called `prelude.emb`, that
+is embedded in the Emboss compiler.
+
+<!-- TODO(bolms): When the documentation generator backend is built, generate
+the Prelude documentation from prelude.emb. -->
+
+
+### `UInt`
+
+A `UInt` is an unsigned integer. `UInt` can be anywhere from 1 to 64 bits in
+size, and may be used both in `struct`s and in `bits`. `UInt` fields may be
+referenced in integer expressions.
+
+
+### `Int`
+
+An `Int` is a signed two's-complement integer. `Int` can be anywhere from 1 to
+64 bits in size, and may be used both in `struct`s and in `bits`. `Int` fields
+may be referenced in integer expressions.
+
+
+### `Bcd`
+
+(Note: `Bcd` is subject to change.)
+
+A `Bcd` is an unsigned binary-coded decimal integer. `Bcd` can be anywhere from
+1 to 64 bits in size, and may be used both in `struct`s and in `bits`. `Bcd`
+fields may be referenced in integer expressions.
+
+When a `Bcd`'s size is not a multiple of 4 bits, the high-order "digit" is
+treated as if it were zero-extended to a multiple of 4 bits. For example, a
+7-bit `Bcd` value can store any number from 0 to 79.
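+
+The zero-extension rule can be sketched as a small decoding function. This is
+an illustration only, not Emboss's implementation: `DecodeBcd` is a
+hypothetical helper, and it assumes every 4-bit group holds a valid decimal
+digit (it does not reject groups above 9).
+
+```c++
+#include <cstdint>
+
+// Hypothetical helper: interprets the low `bit_count` bits of `raw` as a
+// binary-coded decimal number.  `bit_count` must be between 1 and 64.
+inline uint64_t DecodeBcd(uint64_t raw, int bit_count) {
+  // Discard bits above the field; this zero-extends a partial high digit.
+  if (bit_count < 64) raw &= (uint64_t{1} << bit_count) - 1;
+
+  uint64_t result = 0;
+  uint64_t place = 1;
+  while (raw != 0) {
+    result += (raw & 0xF) * place;  // Lowest 4 bits are the current digit.
+    place *= 10;
+    raw >>= 4;
+  }
+  return result;
+}
+```
+
+A 7-bit field with all-valid digits tops out at binary `111 1001`, which this
+sketch decodes as 79, matching the range given above.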
+
+
+### `Flag`
+
+A `Flag` is a 1-bit boolean value. A stored value of `0` means `false`, and a
+stored value of `1` means `true`.
+
+
+### `Float`
+
+A `Float` is a floating-point value in an IEEE 754 binaryNN format, where NN is
+the bit width.
+
+Only 32- and 64-bit `Float`s are supported. There are no current plans to
+support 16- or 128-bit `Float`s, nor the nonstandard x86 80-bit `Float`s.
+
+IEEE 754 does not specify which NaN bit patterns are signalling NaNs and which
+are quiet NaNs, and thus Emboss also does not specify which NaNs are which.
+This means that a quiet NaN written through an Emboss view on one system could
+be read out as a signalling NaN through an Emboss view on a different system.
+If this is a concern, the application must explicitly check for NaN before
+doing arithmetic on any floating-point value read from a `Float` field.
+
+
+## General Syntax
+
+### Names
+
+All names in Emboss must be ASCII, for compatibility with languages such as C
+and C++ that do not support Unicode identifiers.
+
+Type names in Emboss are always `CamelCase`. They must start with a capital
+letter, contain at least one lower-case letter, and contain only letters and
+digits. They are required to match the regex
+`[A-Z][a-zA-Z0-9]*[a-z][a-zA-Z0-9]*`.
+
+Imported module names and field names are always `snake_case`. They must start
+with a lower-case letter, and may only contain lower-case letters, numbers, and
+underscore. They must match the regex `[a-z][a-z_0-9]*`.
+
+Enum value names are always `SHOUTY_CASE`. They must start with a capital
+letter, may only contain capital letters, numbers, and underscore, and must be
+at least two characters long. They must match the regex
+`[A-Z][A-Z_0-9]*[A-Z_][A-Z_0-9]*`.
+
+Additionally, names that are used as keywords in common programming languages
+are disallowed. A complete list can be found in the [Grammar
+Reference](grammar.md).
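+
+A minimal sketch of the three naming conventions side by side; the `LinkState`
+and `PortStatus` types and their members are hypothetical names chosen only for
+illustration:
+
+```
+enum LinkState:                   # Type name: CamelCase
+  LINK_DOWN = 0                   # Enum value name: SHOUTY_CASE
+  LINK_UP   = 1
+
+struct PortStatus:                # Type name: CamelCase
+  0 [+1]  LinkState  link_state   # Field name: snake_case
+  1 [+1]  UInt       error_count
+```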
+
+
+### Expressions
+
+#### Primary expressions
+
+Emboss primary expressions are field names (like `field` or `field.subfield`),
+numeric constants (like `9` or `0x1_0000_0000`), enum value names (like
+`Enum.VALUE`), and the boolean constants `true` and `false`.
+
+Subfields may be specified using `.`; e.g., `foo.bar` references the `bar`
+subfield of the `foo` field. Emboss parses `.` before any expressions: unlike
+many languages, something like `(foo).bar` is a syntax error in Emboss.
+
+Enum values generally must be qualified by their type; e.g., `Color.RED` rather
+than just `RED`. Enums defined in other modules must use the imported module
+name, as in `styles.Color.RED`.
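+
+The following sketch shows each kind of primary expression in context. All of
+the type and field names are hypothetical, and the layout is only meant to
+illustrate the syntax:
+
+```
+enum Color:
+  RED   = 0
+  GREEN = 1
+
+struct Header:
+  0 [+1]  UInt   length
+  1 [+1]  Color  color
+
+struct Packet:
+  0 [+2]  Header  header
+  if header.color == Color.RED:            # Subfield and qualified enum value
+    2 [+header.length]  UInt:8[]  payload  # Subfield used as a size
+  if header.length > 0x10:                 # Numeric constant
+    2 [+1]  UInt  first_byte
+```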
+
+
+#### Operators and Functions
+
+Note: Emboss currently has a relatively limited set of operators because
+operators have been implemented as needed. If you could use an operator that is
+not on the list, email `emboss-dev@`, and we'll see about adding it.
+
+Emboss operators have the following precedence (tightest binding to loosest
+binding):
+
+1. `()` `$max()` `$present()` `$upper_bound()` `$lower_bound()`
+2. unary `+` and `-` ([see note 1](#precedence-note-unary-plus-minus))
+3. `*`
+4. `+` `-`
+5. `<` `>` `==` `!=` `>=` `<=` ([see note 2](#precedence-note-comparisons))
+6. `&&` `||` ([see note 3](#precedence-note-and-or))
+7. `?:` ([see note 4](#precedence-note-choice))
+
+
+###### Note 1 {#precedence-note-unary-plus-minus}
+
+Only one unary `+` or `-` may be applied to an expression without parentheses.
+These expressions are valid:
+
+```
+-5
++6
+-(-x)
+```
+
+These are not:
+
+```
+- -5
+-+5
++ +5
++-5
+```
+
+
+###### Note 2 {#precedence-note-comparisons}
+
+The relational operators may be chained like so:
+
+```
+10 <= x < 50 # 10 <= x && x < 50
+10 <= x == y < 50 # 10 <= x && x == y && y < 50
+100 > y >= 2 # 100 > y && y >= 2
+x == y == 15 # x == y && y == 15
+```
+
+These chains are not allowed:
+
+```
+10 < x > 50
+10 < x == y >= z
+x == y >= z <= 50
+```
+
+If one specifically wants to compare the result of a comparison, parentheses
+must be used:
+
+```
+(x > 15) == (y > 15)
+(x > 15) == true
+```
+
+The `!=` operator may not be chained.
+
+A chain may contain either `<`, `<=`, and/or `==`, or `>`, `>=`, and/or `==`.
+Greater-than comparisons may not be mixed with less-than comparisons.
+
+
+###### Note 3 {#precedence-note-and-or}
+
+The boolean logical operators have the same precedence, but may not be mixed
+without parentheses. The following are allowed:
+
+```
+x && y && z
+x || y || z
+(x || y) && z
+x || (y && z)
+```
+
+The following are not allowed:
+
+```
+x || y && z
+x && y || z
+```
+
+
+###### Note 4 {#precedence-note-choice}
+
+The choice operator `?:` may not be chained without parentheses. These are OK:
+
+```
+q ? x : (r ? y : z)
+q ? (r ? x : y) : z
+```
+
+These are not:
+
+```
+q ? x : r ? y : z # Is this `(q?x:r)?y:z` or `q?x:(r?y:z)`?
+q ? r ? x : y : z # Technically unambiguous, but visually confusing
+```
+
+
+##### `()`
+
+Parentheses are used to override precedence. The subexpression inside the
+parentheses will be evaluated as a unit:
+
+```
+3 * 4 + 5 == 17
+3 * (4 + 5) == 27
+```
+
+The value inside the parentheses can have any type; the value of the resulting
+expression will have the same type.
+
+
+##### `$present()`
+
+The `$present()` function takes a field as an argument, and returns `true` if
+the field is present in its structure.
+
+```
+struct PresentExample:
+  0 [+1]  UInt  x
+  if false:
+    1 [+1]  UInt  y
+  if x > 10:
+    2 [+1]  UInt  z
+  if $present(x):  # Always true
+    0 [+1]  Int  x2
+  if $present(y):  # Always false
+    1 [+1]  Int  y2
+  if $present(z):  # Equivalent to `if x > 10`
+    2 [+1]  Int  z2
+```
+
+`$present()` takes exactly one argument.
+
+The argument to `$present()` must be a reference to a field. It can be a nested
+reference, like `$present(x.y.z.q.r)`. The type of the field does not matter.
+
+`$present()` returns a boolean.
+
+
+##### `$max()`
+
+The `$max()` function returns the maximum value out of its arguments:
+
+```
+$max(1) == 1
+$max(-10, -5) == -5
+$max(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) == 10
+```
+
+`$max()` requires at least one argument. There is no explicit limit on the
+number of arguments, but at some point the Emboss compiler will run out of
+memory.
+
+All arguments to `$max()` must be integers, and it returns an integer.
+
+
+##### `$upper_bound()`
+
+The `$upper_bound()` function returns a value that is at least as high as the
+maximum possible value of its argument:
+
+```
+$upper_bound(1) == 1
+$upper_bound(-10) == -10
+$upper_bound(foo) == 255 # If foo is UInt:8
+$upper_bound($max(foo, 500)) == 500 # If foo is UInt:8
+```
+
+Generally, `$upper_bound()` will return a tight bound, but it is possible to
+outsmart Emboss's bounds checker.
+
+`$upper_bound()` takes a single integer argument, and returns a single
+integer.
+
+
+##### `$lower_bound()`
+
+The `$lower_bound()` function returns a value that is no greater than the
+minimum possible value of its argument:
+
+```
+$lower_bound(1) == 1
+$lower_bound(-10) == -10
+$lower_bound(foo) == -128 # If foo is Int:8
+$lower_bound($min(foo, -500)) == -500 # If foo is Int:8
+```
+
+Generally, `$lower_bound()` will return a tight bound, but it is possible to
+outsmart Emboss's bounds checker.
+
+`$lower_bound()` takes a single integer argument, and returns a single
+integer.
+
+
+##### Unary `+` and `-`
+
+The unary `+` operator returns its argument unchanged.
+
+The unary `-` operator subtracts its argument from 0:
+
+```
+3 * -4 == 0 - 12
+-(3 * 4) == -12
+```
+
+Unary `+` and `-` require an integer argument, and return an integer result.
+
+
+##### `*`
+
+`*` is the multiplication operator:
+
+```
+3 * 4 == 12
+10 * 10 == 100
+```
+
+The `*` operator requires two integer arguments, and returns an integer.
+
+
+##### `+` and `-`
+
+`+` and `-` are the addition and subtraction operators, respectively:
+
+```
+3 + 4 == 7
+3 - 4 == -1
+```
+
+The `+` and `-` operators require two integer arguments, and return an integer
+result.
+
+
+##### `==` and `!=`
+
+The `==` operator returns `true` if its arguments are equal, and `false` if not.
+
+The `!=` operator returns `false` if its arguments are equal, and `true` if not.
+
+Both operators take two boolean arguments, two integer arguments, or two
+arguments of the same enum type, and return a boolean result.
+
+
+##### `<`, `<=`, `>`, and `>=`
+
+The `<` operator returns `true` if its first argument is numerically less than
+its second argument.
+
+The `>` operator returns `true` if its first argument is numerically greater
+than its second argument.
+
+The `<=` operator returns `true` if its first argument is numerically less than
+or equal to its second argument.
+
+The `>=` operator returns `true` if its first argument is numerically greater
+than or equal to its second argument.
+
+All of these operators take two integer arguments, and return a boolean value.
+
+
+##### `&&` and `||`
+
+The `&&` operator returns `false` if either of its arguments is `false`, even
+if the other argument cannot be computed. `&&` returns `true` if both arguments
+are `true`.
+
+The `||` operator returns `true` if either of its arguments is `true`, even if
+the other argument cannot be computed. `||` returns `false` if both arguments
+are `false`.
+
+The `&&` and `||` operators require two boolean arguments, and return a boolean
+result.
+
+
+##### `?:`
+
+The `?:` operator, used like *`condition`*` ? `*`if_true`*` : `*`if_false`*,
+returns *`if_true`* if *`condition`* is `true`, otherwise *`if_false`*.
+
+Other than having stricter type requirements for its arguments, it behaves like
+the C, C++, Java, JavaScript, C#, etc. conditional operator `?:` (sometimes
+called the "ternary operator").
+
+The `?:` operator's *`condition`* argument must be a boolean, and the
+*`if_true`* and *`if_false`* arguments must have the same type. It returns the
+same type as *`if_true`* and *`if_false`*.
+
+
+### Numeric Constant Formats
+
+Numeric constants in Emboss may be written in decimal, hexadecimal, or binary
+format:
+
+```
+12 # The decimal value of 6 + 6.
+012 # The same value; NOT interpreted as octal.
+0xc # The same value, written in hexadecimal.
+0xC # Hex digits may be written in capital letters.
+ # Note that the 'x' must be lower-case: 0XC is not allowed.
+0b1100 # The same value, in binary.
+```
+
+Decimal numbers may use `_` as a thousands separator:
+
+```
+1_000_000 # 1e6
+123_456_789
+```
+
+Hexadecimal and binary numbers may use `_` as a separator every 4 or 8 digits:
+
+```
+0x1234_5678_9abc_def0
+0x12345678_9abcdef0
+0b1010_0101_1010_0101
+0b10100101_10100101
+```
+
+If separators are used, they *must* be thousands separators (for decimal
+numbers) or 4- or 8-digit separators (for binary or hexadecimal numbers); `_`
+may *not* be placed arbitrarily. Binary and hexadecimal numbers must be
+consistent about whether they use 4- or 8-digit separators; they cannot be
+mixed in the same constant:
+
+```
+1000_000 # Not allowed: missing the separator after 1.
+1_000_00 # Not allowed: separators must be followed by a multiple
+ # of 3 digits.
+0x1234_567 # Not allowed: separators must be followed by a multiple
+ # of 4 or 8 digits.
+0x1234_5678_9abcdef0 # Not allowed: cannot mix 4- and 8-digit separators.
+```
diff --git a/g3doc/modular_congruence_multiplication_proof.md b/g3doc/modular_congruence_multiplication_proof.md
new file mode 100644
index 0000000..e69edfd
--- /dev/null
+++ b/g3doc/modular_congruence_multiplication_proof.md
@@ -0,0 +1,100 @@
+# Modular Congruence of the Product of Two Values with Known Modular Congruences
+
+(Draft)
+
+TODO(webstera): Try to simplify this proof.
+
+$$\text{If}$$
+
+$${a} \equiv {r} \pmod{{m}}$$
+
+$${b} \equiv {s} \pmod{{n}}$$
+
+$${a}, {r}, {m}, {b}, {s}, {n} \in ℤ$$
+
+$$\text{then}$$
+
+$${a}{b} \equiv {r}{s} \pmod{G\left(\dfrac{{m}}{G\left({m}, {r}\right)},
+\dfrac{{n}}{G\left({n}, {s}\right)}\right) \cdot G\left({m}, {r}\right) \cdot
+G\left({n}, {s}\right)}$$
+
+$$\text{where }G\text{ is the greatest common divisor function.}$$
+
+$$\text{Proof:}$$
+
+1. $$\exists {x} \in ℤ : {a} = {m}{x} + {r} \text{ by the definition of modular
+ congruence}$$
+
+2. $$\exists {y} \in ℤ : {b} = {n}{y} + {s} \text{ by the definition of modular
+ congruence}$$
+
+3. $$\text{Let }{q} = G\left({m}, {r}\right)$$
+
+4. $$\text{Let }{p} = G\left({n}, {s}\right)$$
+
+5. $$\text{Let }{z} = G\left(\dfrac{{m}}{{q}}, \dfrac{{n}}{{p}}\right) =
+ G\left(\dfrac{{m}}{G\left({m}, {r}\right)}, \dfrac{{n}}{G\left({n},
+ {s}\right)}\right)$$
+
+6. $${a} = {q}\left(\dfrac{{m}{x}}{q} + \dfrac{{r}}{q}\right) \text{ by
+ multiplying } \dfrac{{q}}{{q}} \text{ and distributing }
+ \dfrac{1}{{q}}$$
+
+7. $$\dfrac{{m}{x}}{q}, \dfrac{{r}}{q} \in ℤ \text{ by the definition of
+ } {q} \text{ in (3) }$$
+
+8. $${b} = {p}\left(\dfrac{{n}{y}}{{p}} + \dfrac{{s}}{{p}}\right) \text{ by
+ multiplying } \dfrac{{p}}{{p}} \text{ and distributing }
+ \dfrac{1}{{p}}$$
+
+9. $$\dfrac{{n}{y}}{{p}}, \dfrac{{s}}{{p}} \in ℤ \text{ by the definition of
+ } {p} \text{ in (4) }$$
+
+10. $${a} = {q}\left({z} \cdot \dfrac{{m}{x}}{{q}{z}} +
+ \dfrac{{r}}{{q}}\right) \text{ by multiplying } \dfrac{{z}}{{z}}$$
+
+11. $$\dfrac{{m}{x}}{{q}{z}} \in ℤ \text{ by the definition of } {z} \text{ in
+ (5) }$$
+
+12. $${b} = {p}\left({z} \cdot \dfrac{{n}{y}}{{p}{z}} +
+ \dfrac{{s}}{{p}}\right) \text{ by multiplying } \dfrac{{z}}{{z}}$$
+
+13. $$\dfrac{{n}{y}}{{p}{z}} \in ℤ \text{ by the definition of } {z} \text{ in
+ (5)}$$
+
+14. $${a}{b} = {q}{p}\left({z} \cdot \dfrac{{m}{x}}{{q}{z}} +
+ \dfrac{{r}}{{q}}\right)\left({z} \cdot \dfrac{{n}{y}}{{p}{z}} +
+ \dfrac{{s}}{{p}}\right) \text{ by (10) and (12)}$$
+
+15. $${a}{b} = {q}{p}\left({z}^2 \cdot \dfrac{{m}{x}}{{q}{z}} \cdot
+ \dfrac{{n}{y}}{{p}{z}} + {z} \cdot \dfrac{{r}}{{q}} \cdot
+ \dfrac{{n}{y}}{{p}{z}} + {z} \cdot \dfrac{{m}{x}}{{q}{z}} \cdot
+ \dfrac{{s}}{{p}} + \dfrac{{r}}{{q}} \cdot \dfrac{{s}}{{p}}\right) \text{ by
+ partially distributing (14)}$$
+
+16. $${a}{b} = {q}{p}\left({z}^2 \cdot \dfrac{{m}{x}}{{q}{z}} \cdot
+ \dfrac{{n}{y}}{{p}{z}} + {z} \cdot \dfrac{{r}}{{q}} \cdot
+ \dfrac{{n}{y}}{{p}{z}} + {z} \cdot \dfrac{{m}{x}}{{q}{z}} \cdot
+ \dfrac{{s}}{{p}}\right) + {r}{s} \text{ by extracting the
+ } \dfrac{{r}{s}}{{q}{p}} \text{ term from (15) and cancelling
+ } \dfrac{{q}{p}}{{q}{p}}$$
+
+17. $${a}{b} = {q}{p}{z}\left({z} \cdot \dfrac{{m}{x}}{{q}{z}} \cdot
+ \dfrac{{n}{y}}{{p}{z}} + \dfrac{{r}}{{q}} \cdot \dfrac{{n}{y}}{{p}{z}} +
+ \dfrac{{m}{x}}{{q}{z}} \cdot \dfrac{{s}}{{p}}\right) + {r}{s} \text{ by
+ factoring } {z} \text{ from (16)}$$
+
+18. $${z} \cdot \dfrac{{m}{x}}{{q}{z}} \cdot
+ \dfrac{{n}{y}}{{p}{z}} + \dfrac{{r}}{{q}} \cdot \dfrac{{n}{y}}{{p}{z}} +
+ \dfrac{{m}{x}}{{q}{z}} \cdot \dfrac{{s}}{{p}} \in ℤ \text{ because
+ } {z}, \dfrac{{r}}{q}, \dfrac{{s}}{{p}}, \dfrac{{m}{x}}{{q}{z}},
+ \dfrac{{n}{y}}{{p}{z}} \in ℤ \text{ per (5), (7), (9), (11),
+ (13)}$$
+
+19. $${a}{b} \equiv {r}{s} \pmod{{q}{p}{z}} \text{ by (17), (18), and the
+ definition of modular congruence}$$
+
+20. $${a}{b} \equiv {r}{s} \pmod{G\left(\dfrac{{m}}{G\left({m}, {r}\right)},
+ \dfrac{{n}}{G\left({n}, {s}\right)}\right) \cdot G\left({m}, {r}\right)
+ \cdot G\left({n}, {s}\right)} \text{ by the definitions of } {q} \text{,
+ } {p} \text{, and } {z} \text{ in (3), (4), and (5)}$$
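+
+As an illustrative numeric check (not part of the proof; the values below are
+chosen arbitrarily), take $${m} = 4$$, $${r} = 2$$, $${n} = 6$$, and
+$${s} = 3$$, so that $${a} = 4{x} + 2$$ and $${b} = 6{y} + 3$$. Then:
+
+$${q} = G\left(4, 2\right) = 2, \quad {p} = G\left(6, 3\right) = 3, \quad {z} =
+G\left(\dfrac{4}{2}, \dfrac{6}{3}\right) = G\left(2, 2\right) = 2$$
+
+$${q}{p}{z} = 2 \cdot 3 \cdot 2 = 12$$
+
+$${a}{b} = \left(4{x} + 2\right)\left(6{y} + 3\right) = 12\left(2{x}{y} + {x} +
+{y}\right) + 6 \equiv 6 = {r}{s} \pmod{12}$$
+
+For instance, $${x} = {y} = 1$$ gives $${a}{b} = 6 \cdot 9 = 54 = 4 \cdot 12 +
+6$$.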
diff --git a/g3doc/roadmap.md b/g3doc/roadmap.md
new file mode 100644
index 0000000..87ee2b3
--- /dev/null
+++ b/g3doc/roadmap.md
@@ -0,0 +1,193 @@
+# Emboss Roadmap
+
+Arrows indicate implementation order; that is, "A -> B -> C" means "C depends on
+B which depends on A."
+
+```dot {layout_engine=dot}
+digraph {
+ node [ shape=box target="_top" ]
+ edge [ dir="back" ]
+ rankdir="RL"
+
+ {
+ rank="sink"
+ misc [ label="Other" URL="#misc" ]
+ strings [ label="String Type" URL="#strings" ]
+ proto_back [ label="Proto Backend" URL="#proto_back" ]
+ arr_of_bits [ label="Packed Arrays" URL="#arr_of_bits" ]
+ array_stride [ label="Arbitrary Array Stride" URL="#array_stride" ]
+ dyn_array_elem_size [ label="Dynamic Elements" URL="#dyn_array_elem_size" ]
+ shift_ops [ label="<< and >>" URL="#shift_ops" ]
+ private [ label="Private Fields" URL="#private" ]
+ type_syn [ label="Type Syntax" URL="#type_syn" ]
+ division [ label="//" URL="#division" ]
+ exponent [ label="**" URL="#exponent" ]
+ requires [ label="[requires]" URL="#requires" ]
+ }
+
+ del_range [ label="rm [range]" URL="#del_range" ]
+ del_is_integer [ label="rm [is_integer]" URL="#del_is_integer" ]
+
+ {
+ rank="source"
+
+ del_anon_hack [ label="Unhack Anon Bits" URL="#del_anon_hack" ]
+ checksums [ label="CRC/Checksum" URL="#checksums" ]
+ }
+
+ del_is_integer -> type_syn
+
+ del_range -> requires
+ checksums -> requires
+ del_anon_hack -> private
+ del_is_integer -> exponent
+ del_is_integer -> division
+
+ open_source -> del_range
+ open_source -> del_is_integer
+
+ edge [style=dashed]
+}
+```
+
+[TOC]
+
+
+## Type Syntax {#type_syn}
+
+A syntax for expressing Emboss expression types. Likely some variation of set
+builder notation:
+
+```
+ { x in $integer | 0 <= x <= 510 && x % 2 == 0 }
+```
+
+This is needed to replace the `[is_integer]` attribute.
+
+
+## Remove `[is_integer]` {#del_is_integer}
+
+Replace the `[is_integer]` attribute on `external`s with a `[type]` attribute.
+This also lets us remove the hack in `expression_bounds.py`. In passing, it
+should also allow `Flag` fields to be used as booleans in expressions.
+
+
+## Private Fields {#private}
+
+Provide some annotation -- likely an attribute -- that indicates that a field is
+"private": that is, it should not appear in text format, and it cannot be
+referenced from outside the structure. From an end user perspective, this is
+primarily useful for virtual fields; internally, the automatic structure that
+wraps anonymous bits can also be marked private so that the backend can stop
+caring about `is_anonymous`.
+
+
+## Finishing Refactoring Hacky Anonymous Bits Code {#del_anon_hack}
+
+Replace the final uses of *`name`*`.is_anonymous` with something that marks
+"anonymous" fields as "private."
+
+
+
+## `[requires]` on Fields {#requires}
+
+Allow the `[requires]` attribute on fields, which can provide an arbitrary
+expression whose value must be `true` for the field to be `Ok()`.
+
+
+## Remove `[range]` {#del_range}
+
+Replace all extant uses of `[range: low..high]` with `[requires: low <= field <
+high]`, and remove support for `[range]`.
+
+
+## Native Checksum/CRC Support {#checksums}
+
+Add support for checking standard checksums/CRCs, and for automatically
+computing them.
+
+
+## Shift Operators (`<<` and `>>`) {#shift_ops}
+
+Left and right shift operators for use in expressions.
+
+
+## Flooring Integer Division (`//`) and Modulus (`%`) {#division}
+
+Flooring (not truncating) integer division for use in expressions, and the
+corresponding modulus operator.
+
+
+## Exponentiation (`**`) {#exponent}
+
+Exponentiation operator. Mostly needed for `[type]` on `UInt`, `Int`, and
+`Bcd`.
+
+
+## Arrays with Runtime-Sized Elements {#dyn_array_elem_size}
+
+Support for arrays where the element size is not known until runtime; e.g.,
+arrays of arrays, where the inner array's size is determined by some variable.
+
+
+## Arbitrary Array Stride {#array_stride}
+
+Support for arrays where the stride (distance between the starts of successive
+elements) is different from the element size. Needed to support padding between
+elements and interlaced elements.
+
+
+## Large Arrays of `bits` {#arr_of_bits}
+
+Support for large, packed arrays of, e.g., 12-bit integers. Requires such
+arrays to be byte-order-aware, and there are potential readability concerns.
+
+
+## Proto Backend {#proto_back}
+
+A backend which generates:
+
+1. A `.proto` file with an "equivalent" set of types.
+2. C++ code that can populate the protos from Emboss views and vice versa.
+
+Essentially, you would be able to serialize to and deserialize from the "proto
+form" the same way that you can currently serialize to and deserialize from
+text.
+
+
+## String Type {#strings}
+
+A physical type for strings. This is complex, because strings can be stored in
+many formats. It is likely that Emboss will only support byte strings, and will
+not directly handle character encodings, but that still leaves a number of
+common formats:
+
+1. Length-specified: length is determined by another field.
+2. Delimited: string runs until some delimiter byte; usually zero, but
+ sometimes ASCII space or '$'.
+3. Optionally-delimited: string runs until either a delimiter byte *or* some
+ maximum size.
+4. Right-padded: string bytes run to some maximum size, but padding bytes
+ (usually ASCII space) should be chopped off the end. Bytes with the
+ padding value can appear in the middle.
+
+
+## Miscellaneous/Potential Features {#misc}
+
+These features could happen if there is interest, but there is no current plan
+to implement them.
+
+
+### Fixed Point
+
+Support for fixed-point arithmetic, both in expressions and physical formats.
+
+
+### Documentation Backend
+
+Support for an HTML/PDF/???-format documentation generator.
+
+
+### Python Backend
+
+A code generator for Python.
diff --git a/g3doc/sitemap.md b/g3doc/sitemap.md
new file mode 100644
index 0000000..2d2520d
--- /dev/null
+++ b/g3doc/sitemap.md
@@ -0,0 +1,7 @@
+* [Home](index.md)
+* [User Guide](guide.md)
+* [Emboss Language Reference](language-reference.md)
+* [Emboss C++ Generated Code Reference](cpp-reference.md)
+* [Grammar Specification](grammar.md)
+* [Design of the Emboss tool](design.md)
+* [Emboss Text Format Reference](text-format.md)
diff --git a/g3doc/text-format.md b/g3doc/text-format.md
new file mode 100644
index 0000000..02cbe05
--- /dev/null
+++ b/g3doc/text-format.md
@@ -0,0 +1,178 @@
+<!-- TODO(bolms): this file could use a review to make sure it is still correct
+(as of 2017 December). -->
+
+# Text Format
+
+[TOC]
+
+## Background
+
+Emboss messages may be automatically converted between a human-readable text
+format and machine-readable bytes. For example, if you have the following
+`.emb` file:
+
+```
+struct Foo:
+  0 [+1]  UInt  a
+  1 [+1]  UInt  b
+
+struct Bar:
+  0 [+2]  Foo  c
+  2 [+2]  Foo  d
+```
+
+You may set the contents of a `Bar` from text like so:
+
+```c++
+uint8_t buffer[4];
+auto bar_writer = BarWriter(buffer, sizeof buffer);
+bar_writer.UpdateFromText(R"(
+ {
+ c: {
+ a: 12
+ b: 0x20 # Hex numbers are supported.
+ }
+ d: {
+ a: 33
+ b: 0b10110011 # ... as are binary.
+ }
+ }
+)");
+assert(bar_writer.c().a().Read() == 12);
+assert(bar_writer.c().b().Read() == 32);
+assert(bar_writer.d().a().Read() == 33);
+assert(bar_writer.d().b().Read() == 0xb3);
+```
+
+Note that you can use `#`-style comments inside of the text format.
+
+It is also acceptable to omit fields, in which case they will not be updated:
+
+```c++
+bar_writer.UpdateFromText("{ d: { a: 123 } }");
+assert(bar_writer.c().a().Read() == 12);
+assert(bar_writer.d().a().Read() == 123);
+```
+
+Because Emboss does not enforce dependencies or duplicate field sets in
+`UpdateFromText`, it is currently possible to do something like this:
+
+```
+# memory_selector.emb
+struct MemorySelector:
+  0    [+1]  UInt    addr
+  addr [+1]  UInt:8  byte
+```
+
+```c++
+// memory_select_writer.cc
+uint8_t buffer[4];
+auto memory_writer = MemorySelectorWriter(buffer, sizeof buffer);
+memory_writer.UpdateFromText(R"(
+ {
+ addr: 1
+ byte: 10
+ addr: 2
+ byte: 20
+ addr: 3
+ byte: 30
+ addr: 0
+ }
+)");
+assert(buffer[1] == 10);
+assert(buffer[2] == 20);
+assert(buffer[3] == 30);
+assert(buffer[0] == 0);
+```
+
+*Do not rely on this behavior.* A future version of Emboss may add tracking to
+ensure that this example is an error.
+
+
+## Text Format Details
+
+The exact text format accepted by an Emboss view depends on the view type.
+Extra whitespace is ignored between tokens. Any place where whitespace is
+allowed, the `#` character denotes a comment which extends to the end of the
+line.
+
+
+### `struct` and `bits`
+
+The text format of a `struct` or `bits` is a sequence of name/value pairs
+surrounded by braces, where field names are separated from field values by
+colons:
+
+    {
+      field_name: FIELD_VALUE
+      field_name_2: FIELD_VALUE_2
+      substructure: {
+        subfield: 123
+      }
+    }
+
+Only fields which are actually listed in the text will be set.
+
+If a field's address depends on another field's value, then the order in which
+they are listed in the text format becomes important. When setting both,
+always make sure to set the dependee field before the dependent field.
+
+It is currently possible to specify a field more than once, but this may not be
+supported in the future.
+
+
+### `UInt` and `Int`
+
+`UInt`s and `Int`s accept numeric values in the same formats that are allowed
+in Emboss source files:
+
+    123456
+    123_456
+    0x1234cdef
+    0x1234_cdef
+    0b10100101
+    0b1010_0101
+    -123
+    -0b111
+
+
+### `Flag`
+
+`Flag`s expect either `true` or `false`.
+
+
+### `enum`
+
+An `enum`'s value may be either a name listed in the enum definition, or a
+numeric value:
+
+    FOO
+    2
+    100
+
+
+### Arrays
+
+An array is a list of values (in the appropriate format for the type of the
+array), separated by commas and surrounded by braces. Values may be optionally
+prefixed with index markers of the form `[0]:`, where `0` may be any unsigned
+integer. An extra comma at the end of the list is allowed, but not required:
+
+    { 0, 1, 2, 3, 4, 5, 6, 7 }
+    { 0, 1, 2, 3, 4, 5, 6, 7, }
+    { 0, 1, 2, 3, 4, [7]: 7, [6]: 6, [5]: 5 }
+
+When no index marker is specified, values are written to the index which is one
+greater than the previous value's index:
+
+    { [4]: 4, 5, 6, 7, [0]: 0, 1, 2, 3 }
+
+It is currently possible to specify multiple values for a single index, but
+this may not be supported in the future.
+
+*TODO(bolms): In the future section about creating new `external` types, make
+sure to note that the `external`'s text format should not start with `[` or
+`}`.*
+
+
+
diff --git a/g3doc/todo.md b/g3doc/todo.md
new file mode 100644
index 0000000..611f02e
--- /dev/null
+++ b/g3doc/todo.md
@@ -0,0 +1,2 @@
+* Syntax files:
+ * TODO(bolms): Emacs syntax file
diff --git a/license_header b/license_header
new file mode 100644
index 0000000..d446874
--- /dev/null
+++ b/license_header
@@ -0,0 +1,13 @@
+Copyright 2019 Google LLC
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+ https://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
diff --git a/public/BUILD b/public/BUILD
new file mode 100644
index 0000000..45810a9
--- /dev/null
+++ b/public/BUILD
@@ -0,0 +1,169 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Emboss public definitions.
+
+load(
+ ":build_defs.bzl",
+ "emboss_cc_util_test",
+)
+
+py_library(
+ name = "ir_pb2",
+ srcs = ["ir_pb2.py"],
+ visibility = ["//visibility:public"],
+)
+
+cc_library(
+ name = "cpp_utils",
+ hdrs = [
+ "emboss_arithmetic.h",
+ "emboss_array_view.h",
+ "emboss_bit_util.h",
+ "emboss_constant_view.h",
+ "emboss_cpp_types.h",
+ "emboss_cpp_util.h",
+ "emboss_defines.h",
+ "emboss_enum_view.h",
+ "emboss_maybe.h",
+ "emboss_memory_util.h",
+ "emboss_prelude.h",
+ "emboss_text_util.h",
+ "emboss_view_parameters.h",
+ ],
+ deps = [
+ ],
+ visibility = ["//visibility:public"],
+)
+
+emboss_cc_util_test(
+ name = "emboss_prelude_test",
+ srcs = [
+ "emboss_prelude_test.cc",
+ ],
+ copts = ["-DEMBOSS_FORCE_ALL_CHECKS"],
+ deps = [
+ ":cpp_utils",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_util_test(
+ name = "emboss_arithmetic_test",
+ srcs = [
+ "emboss_arithmetic_test.cc",
+ ],
+ copts = ["-DEMBOSS_FORCE_ALL_CHECKS"],
+ deps = [
+ ":cpp_utils",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_util_test(
+ name = "emboss_array_view_test",
+ srcs = [
+ "emboss_array_view_test.cc",
+ ],
+ copts = ["-DEMBOSS_FORCE_ALL_CHECKS"],
+ deps = [
+ ":cpp_utils",
+ "@com_google_googletest//:gtest_main",
+ "@com_google_absl//absl/strings:str_format",
+ ],
+)
+
+emboss_cc_util_test(
+ name = "emboss_bit_util_test",
+ srcs = [
+ "emboss_bit_util_test.cc",
+ ],
+ copts = ["-DEMBOSS_FORCE_ALL_CHECKS"],
+ deps = [
+ ":cpp_utils",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_util_test(
+ name = "emboss_constant_view_test",
+ srcs = [
+ "emboss_constant_view_test.cc",
+ ],
+ copts = ["-DEMBOSS_FORCE_ALL_CHECKS"],
+ deps = [
+ ":cpp_utils",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_util_test(
+ name = "emboss_cpp_types_test",
+ srcs = [
+ "emboss_cpp_types_test.cc",
+ ],
+ copts = ["-DEMBOSS_FORCE_ALL_CHECKS"],
+ deps = [
+ ":cpp_utils",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_util_test(
+ name = "emboss_defines_test",
+ srcs = [
+ "emboss_defines_test.cc",
+ ],
+ copts = ["-DEMBOSS_FORCE_ALL_CHECKS"],
+ deps = [
+ ":cpp_utils",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_util_test(
+ name = "emboss_maybe_test",
+ srcs = [
+ "emboss_maybe_test.cc",
+ ],
+ copts = ["-DEMBOSS_FORCE_ALL_CHECKS"],
+ deps = [
+ ":cpp_utils",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_util_test(
+ name = "emboss_memory_util_test",
+ srcs = [
+ "emboss_memory_util_test.cc",
+ ],
+ copts = ["-DEMBOSS_FORCE_ALL_CHECKS"],
+ deps = [
+ ":cpp_utils",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
+
+emboss_cc_util_test(
+ name = "emboss_text_util_test",
+ srcs = [
+ "emboss_text_util_test.cc",
+ ],
+ copts = ["-DEMBOSS_FORCE_ALL_CHECKS"],
+ deps = [
+ ":cpp_utils",
+ "@com_google_googletest//:gtest_main",
+ ],
+)
diff --git a/public/build_defs.bzl b/public/build_defs.bzl
new file mode 100644
index 0000000..47ed4e1
--- /dev/null
+++ b/public/build_defs.bzl
@@ -0,0 +1,93 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# -*- mode: python; -*-
+# vim:set ft=blazebuild:
+"""Build defs for Emboss.
+
+This file exports the emboss_cc_library rule, which accepts an .emb file and
+produces a corresponding C++ library.
+"""
+
+def emboss_cc_library(name, srcs, deps = [], visibility = None):
+ """Constructs a C++ library from an .emb file."""
+ if len(srcs) != 1:
+ fail(
+ "Must specify exactly one Emboss source file for emboss_cc_library.",
+ "srcs",
+ )
+
+ native.filegroup(
+ # The original .emb file must be visible to any other emboss_cc_library
+ # that specifies this emboss_cc_library in its deps. This rule makes the
+ # original .emb available to dependent rules.
+ # TODO(bolms): As an optimization, use the precompiled IR instead of
+ # reparsing the raw .embs.
+ name = name + "__emb",
+ srcs = srcs,
+ visibility = visibility,
+ )
+
+ native.genrule(
+ # The generated header may be used in non-cc_library rules.
+ name = name + "_header",
+ tools = [
+ # TODO(bolms): Make "emboss" driver program.
+ "//front_end:emboss_front_end",
+ "//back_end/cpp:emboss_codegen_cpp",
+ ],
+ srcs = srcs + [dep + "__emb" for dep in deps],
+ cmd = ("$(location //front_end:emboss_front_end) " +
+ "--output-ir-to-stdout " +
+ "--import-dir=. " +
+ "--import-dir='$(GENDIR)' " +
+ "$(location {}) > $(@D)/$$(basename $(OUTS) .h).ir; " +
+ "$(location //back_end/cpp:emboss_codegen_cpp) " +
+ "< $(@D)/$$(basename $(OUTS) .h).ir > " +
+ "$(OUTS); " +
+ "rm $(@D)/$$(basename $(OUTS) .h).ir").format(") $(location ".join(srcs)),
+ outs = [src + ".h" for src in srcs],
+ # This rule should only be visible to the following rule.
+ visibility = ["//visibility:private"],
+ )
+
+ native.cc_library(
+ name = name,
+ hdrs = [
+ ":" + name + "_header",
+ ],
+ deps = deps + [
+ "//public:cpp_utils",
+ ],
+ visibility = visibility,
+ )
+
+# TODO(bolms): Maybe move this to a non-public build_defs?
+def emboss_cc_util_test(name, copts = [], **kwargs):
+ """Constructs two cc_test targets, with and without optimizations."""
+ native.cc_test(
+ name = name,
+ copts = copts,
+ **kwargs
+ )
+ native.cc_test(
+ name = name + "_no_opts",
+ copts = copts + [
+ # This is generally a dangerous flag for an individual target, but
+ # these tests do not depend on any other .cc files that might
+ # #include any Emboss headers.
+ "-DEMBOSS_NO_OPTIMIZATIONS",
+ ],
+ **kwargs
+ )
diff --git a/public/emboss_arithmetic.h b/public/emboss_arithmetic.h
new file mode 100644
index 0000000..bb73b7f
--- /dev/null
+++ b/public/emboss_arithmetic.h
@@ -0,0 +1,325 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Implementations for the operations and builtin functions in the Emboss
+// expression language.
+#ifndef EMBOSS_PUBLIC_EMBOSS_ARITHMETIC_H_
+#define EMBOSS_PUBLIC_EMBOSS_ARITHMETIC_H_
+
+#include <cstdint>
+#include <type_traits>
+
+#include "public/emboss_bit_util.h"
+#include "public/emboss_maybe.h"
+
+namespace emboss {
+namespace support {
+
+// Arithmetic operations
+//
+// Emboss arithmetic is performed by special-purpose functions, not (directly)
+// using C++ operators. This allows Emboss to handle the minor differences
+// between the ways that Emboss operations are defined and the way that C++
+// operations are defined, and provides a convenient way to handle arithmetic on
+// values that might not be readable.
+//
+// The biggest differences are:
+//
+// Emboss's And and Or are defined to return false or true, respectively, if at
+// least one operand is false or true, respectively, even if the other operand
+// is not Known(). This is similar to C/C++ short-circuit evaluation, except
+// that it is symmetric.
+//
+// Emboss's expression type system uses (notionally) infinite-size integers, but
+// it is an error in Emboss if the full range of any subexpression cannot fit in
+// either [-(2**63), 2**63 - 1] or [0, 2**64 - 1]. Additionally, either all
+// arguments to and the return type of an operation, if integers, must fit in
+// int64_t, or they must all fit in uint64_t. This means that C++ integer types
+// can be used directly for each operation, but casting may be required in
+// between operations.
+
+inline constexpr bool AllKnown() { return true; }
+
+template <typename T, typename... RestT>
+inline constexpr bool AllKnown(T value, RestT... rest) {
+ return value.Known() && AllKnown(rest...);
+}
+
+// MaybeDo implements the logic of checking for known values, unwrapping the
+// known values, passing the unwrapped values to OperatorT, and then rewrapping
+// the result.
+template <typename IntermediateT, typename ResultT, typename OperatorT,
+ typename... ArgsT>
+inline constexpr Maybe<ResultT> MaybeDo(Maybe<ArgsT>... args) {
+ return AllKnown(args...)
+ ? Maybe<ResultT>(static_cast<ResultT>(OperatorT::Do(
+ static_cast<IntermediateT>(args.ValueOrDefault())...)))
+ : Maybe<ResultT>();
+}
+
+//// Operations intended to be passed to MaybeDo:
+
+struct SumOperation {
+ template <typename T>
+ static inline constexpr T Do(T l, T r) {
+ return l + r;
+ }
+};
+
+struct DifferenceOperation {
+ template <typename T>
+ static inline constexpr T Do(T l, T r) {
+ return l - r;
+ }
+};
+
+struct ProductOperation {
+ template <typename T>
+ static inline constexpr T Do(T l, T r) {
+ return l * r;
+ }
+};
+
+// Assertions for the template types of comparisons.
+template <typename ResultT, typename LeftT, typename RightT>
+inline constexpr bool AssertComparisonInPartsTypes() {
+ static_assert(::std::is_same<ResultT, bool>::value,
+ "EMBOSS BUG: Comparisons must return bool.");
+ static_assert(
+ ::std::is_signed<LeftT>::value || ::std::is_signed<RightT>::value,
+ "EMBOSS BUG: Comparisons in parts expect one side to be signed.");
+ static_assert(
+ ::std::is_unsigned<LeftT>::value || ::std::is_unsigned<RightT>::value,
+ "EMBOSS BUG: Comparisons in parts expect one side to be unsigned.");
+ return true; // A literal return type is required for a constexpr function.
+}
+
+struct EqualOperation {
+ template <typename T>
+ static inline constexpr bool Do(T l, T r) {
+ return l == r;
+ }
+};
+
+struct NotEqualOperation {
+ template <typename T>
+ static inline constexpr bool Do(T l, T r) {
+ return l != r;
+ }
+};
+
+struct LessThanOperation {
+ template <typename T>
+ static inline constexpr bool Do(T l, T r) {
+ return l < r;
+ }
+};
+
+struct LessThanOrEqualOperation {
+ template <typename T>
+ static inline constexpr bool Do(T l, T r) {
+ return l <= r;
+ }
+};
+
+struct GreaterThanOperation {
+ template <typename T>
+ static inline constexpr bool Do(T l, T r) {
+ return l > r;
+ }
+};
+
+struct GreaterThanOrEqualOperation {
+ template <typename T>
+ static inline constexpr bool Do(T l, T r) {
+ return l >= r;
+ }
+};
+
+// MaximumOperation is a bit more complex, in order to handle the variable
+// number of parameters.
+struct MaximumOperation {
+ template <typename T>
+ static inline constexpr T Do(T arg) {
+ // Base case for recursive template.
+ return arg;
+ }
+
+ // Ideally, this would only use template<typename T>, but C++11 requires a
+ // full variadic template or C-style variadic function in order to accept a
+ // variable number of arguments. C-style variadic functions have no intrinsic
+ // way of figuring out how many arguments they receive, so we have to use a
+ // variadic template.
+ //
+ // The static_assert ensures that all arguments are actually the same type.
+ template <typename T1, typename T2, typename... T>
+ static inline constexpr T1 Do(T1 l, T2 r, T... rest) {
+ // C++11 std::max is not constexpr, so we can't just call it.
+ static_assert(::std::is_same<T1, T2>::value,
+ "Expected Do to be called with a proper intermediate type.");
+ return Do(l < r ? r : l, rest...);
+ }
+};
+
+//// Special operations, where either un-Known() operands do not always result
+//// in un-Known() results, or where Known() operands do not always result in
+//// Known() results.
+
+// Assertions for And and Or.
+template <typename IntermediateT, typename ResultT, typename LeftT,
+ typename RightT>
+inline constexpr bool AssertBooleanOperationTypes() {
+ // And and Or are templates so that the Emboss code generator doesn't have to
+ // special-case them, but they should only be instantiated with <bool, bool,
+ // bool, bool>. This pushes a bit of extra work onto the C++ compiler.
+ static_assert(::std::is_same<IntermediateT, bool>::value,
+ "EMBOSS BUG: Boolean operations must have bool IntermediateT.");
+ static_assert(::std::is_same<ResultT, bool>::value,
+ "EMBOSS BUG: Boolean operations must return bool.");
+ static_assert(::std::is_same<LeftT, bool>::value,
+ "EMBOSS BUG: Boolean operations require boolean operands.");
+ static_assert(::std::is_same<RightT, bool>::value,
+ "EMBOSS BUG: Boolean operations require boolean operands.");
+ return true; // A literal return type is required for a constexpr function.
+}
+
+template <typename IntermediateT, typename ResultT, typename LeftT,
+ typename RightT>
+inline constexpr Maybe<ResultT> And(Maybe<LeftT> l, Maybe<RightT> r) {
+ // If either value is false, the result is false, even if the other value is
+ // unknown. Otherwise, if either value is unknown, the result is unknown.
+ // Otherwise, both values are true, and the result is true.
+ return AssertBooleanOperationTypes<IntermediateT, ResultT, LeftT, RightT>(),
+ !l.ValueOr(true) || !r.ValueOr(true)
+ ? Maybe<ResultT>(false)
+ : (!l.Known() || !r.Known() ? Maybe<ResultT>()
+ : Maybe<ResultT>(true));
+}
+
+template <typename IntermediateT, typename ResultT, typename LeftT,
+ typename RightT>
+inline constexpr Maybe<ResultT> Or(Maybe<LeftT> l, Maybe<RightT> r) {
+ // If either value is true, the result is true, even if the other value is
+ // unknown. Otherwise, if either value is unknown, the result is unknown.
+ // Otherwise, both values are false, and the result is false.
+ return AssertBooleanOperationTypes<IntermediateT, ResultT, LeftT, RightT>(),
+ l.ValueOr(false) || r.ValueOr(false)
+ ? Maybe<ResultT>(true)
+ : (!l.Known() || !r.Known() ? Maybe<ResultT>()
+ : Maybe<ResultT>(false));
+}
+
+template <typename ResultT, typename ValueT>
+inline constexpr Maybe<ResultT> MaybeStaticCast(Maybe<ValueT> value) {
+ return value.Known()
+ ? Maybe<ResultT>(static_cast<ResultT>(value.ValueOrDefault()))
+ : Maybe<ResultT>();
+}
+
+template <typename IntermediateT, typename ResultT, typename ConditionT,
+ typename TrueT, typename FalseT>
+inline constexpr Maybe<ResultT> Choice(Maybe<ConditionT> condition,
+ Maybe<TrueT> if_true,
+ Maybe<FalseT> if_false) {
+ // Since the result of a Choice could be any value from either if_true or
+ // if_false, ResultT should be the same type as IntermediateT.
+ static_assert(::std::is_same<IntermediateT, ResultT>::value,
+ "Choice's IntermediateT should be the same as ResultT.");
+ static_assert(::std::is_same<ConditionT, bool>::value,
+ "Choice operation requires a boolean condition.");
+ // If the condition is un-Known(), then the result is un-Known(). Otherwise,
+ // the result is if_true if condition, or if_false if not condition. For
+ // integral types, ResultT may differ from TrueT or FalseT, so Known() results
+ // must be unwrapped, cast to ResultT, and re-wrapped in Maybe<ResultT>. For
+ // non-integral TrueT/FalseT/ResultT, the cast is unnecessary, but safe.
+ return condition.Known() ? condition.ValueOrDefault()
+ ? MaybeStaticCast<ResultT, TrueT>(if_true)
+ : MaybeStaticCast<ResultT, FalseT>(if_false)
+ : Maybe<ResultT>();
+}
+
+//// From here down: boilerplate instantiations of the various operations, which
+//// only forward to MaybeDo:
+
+template <typename IntermediateT, typename ResultT, typename LeftT,
+ typename RightT>
+inline constexpr Maybe<ResultT> Sum(Maybe<LeftT> l, Maybe<RightT> r) {
+ return MaybeDo<IntermediateT, ResultT, SumOperation, LeftT, RightT>(l, r);
+}
+
+template <typename IntermediateT, typename ResultT, typename LeftT,
+ typename RightT>
+inline constexpr Maybe<ResultT> Difference(Maybe<LeftT> l, Maybe<RightT> r) {
+ return MaybeDo<IntermediateT, ResultT, DifferenceOperation, LeftT, RightT>(l,
+ r);
+}
+
+template <typename IntermediateT, typename ResultT, typename LeftT,
+ typename RightT>
+inline constexpr Maybe<ResultT> Product(Maybe<LeftT> l, Maybe<RightT> r) {
+ return MaybeDo<IntermediateT, ResultT, ProductOperation, LeftT, RightT>(l, r);
+}
+
+template <typename IntermediateT, typename ResultT, typename LeftT,
+ typename RightT>
+inline constexpr Maybe<ResultT> Equal(Maybe<LeftT> l, Maybe<RightT> r) {
+ return MaybeDo<IntermediateT, ResultT, EqualOperation, LeftT, RightT>(l, r);
+}
+
+template <typename IntermediateT, typename ResultT, typename LeftT,
+ typename RightT>
+inline constexpr Maybe<ResultT> NotEqual(Maybe<LeftT> l, Maybe<RightT> r) {
+ return MaybeDo<IntermediateT, ResultT, NotEqualOperation, LeftT, RightT>(l,
+ r);
+}
+
+template <typename IntermediateT, typename ResultT, typename LeftT,
+ typename RightT>
+inline constexpr Maybe<ResultT> LessThan(Maybe<LeftT> l, Maybe<RightT> r) {
+ return MaybeDo<IntermediateT, ResultT, LessThanOperation, LeftT, RightT>(l,
+ r);
+}
+
+template <typename IntermediateT, typename ResultT, typename LeftT,
+ typename RightT>
+inline constexpr Maybe<ResultT> LessThanOrEqual(Maybe<LeftT> l,
+ Maybe<RightT> r) {
+ return MaybeDo<IntermediateT, ResultT, LessThanOrEqualOperation, LeftT,
+ RightT>(l, r);
+}
+
+template <typename IntermediateT, typename ResultT, typename LeftT,
+ typename RightT>
+inline constexpr Maybe<ResultT> GreaterThan(Maybe<LeftT> l, Maybe<RightT> r) {
+ return MaybeDo<IntermediateT, ResultT, GreaterThanOperation, LeftT, RightT>(
+ l, r);
+}
+
+template <typename IntermediateT, typename ResultT, typename LeftT,
+ typename RightT>
+inline constexpr Maybe<ResultT> GreaterThanOrEqual(Maybe<LeftT> l,
+ Maybe<RightT> r) {
+ return MaybeDo<IntermediateT, ResultT, GreaterThanOrEqualOperation, LeftT,
+ RightT>(l, r);
+}
+
+template <typename IntermediateT, typename ResultT, typename... ArgsT>
+inline constexpr Maybe<ResultT> Maximum(Maybe<ArgsT>... args) {
+ return MaybeDo<IntermediateT, ResultT, MaximumOperation, ArgsT...>(args...);
+}
+
+} // namespace support
+} // namespace emboss
+
+#endif // EMBOSS_PUBLIC_EMBOSS_ARITHMETIC_H_
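The symmetric And semantics described in emboss_arithmetic.h can be illustrated with a minimal, self-contained sketch. `SketchMaybe` and `SketchAnd` are hypothetical names introduced only for illustration; the real `Maybe` class is defined in public/emboss_maybe.h, which is not shown in this excerpt, and may differ in detail.

```cpp
#include <cassert>

// Hypothetical stand-in for emboss::support::Maybe<T>. This is an assumption
// made for illustration, not the class from public/emboss_maybe.h.
template <typename T>
class SketchMaybe {
 public:
  constexpr SketchMaybe() : value_(), known_(false) {}
  constexpr explicit SketchMaybe(T value) : value_(value), known_(true) {}

  constexpr bool Known() const { return known_; }
  constexpr T ValueOr(T fallback) const { return known_ ? value_ : fallback; }

 private:
  T value_;
  bool known_;
};

// Symmetric short-circuit And: a known false on either side decides the
// result, even when the other side is unknown. Otherwise any unknown operand
// makes the result unknown.
constexpr SketchMaybe<bool> SketchAnd(SketchMaybe<bool> l,
                                      SketchMaybe<bool> r) {
  return (!l.ValueOr(true) || !r.ValueOr(true))
             ? SketchMaybe<bool>(false)
             : ((!l.Known() || !r.Known()) ? SketchMaybe<bool>()
                                           : SketchMaybe<bool>(true));
}
```

Under these assumptions, `SketchAnd(SketchMaybe<bool>(false), SketchMaybe<bool>())` is Known() and false, while `SketchAnd(SketchMaybe<bool>(true), SketchMaybe<bool>())` is not Known(), which mirrors the expectations in TEST(And, And).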
diff --git a/public/emboss_arithmetic_test.cc b/public/emboss_arithmetic_test.cc
new file mode 100644
index 0000000..3f41315
--- /dev/null
+++ b/public/emboss_arithmetic_test.cc
@@ -0,0 +1,291 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "public/emboss_arithmetic.h"
+
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace support {
+
+// EXPECT_EQ uses operator==. For un-Known() Maybes, this follows the semantics
+// for operator==(std::optional<T>, std::optional<T>), which returns true if
+// neither argument has_value(). (It also matches Rust's Option and Haskell's
+// Maybe.)
+//
+// Given the name "Known", it arguably should follow NaN != NaN semantics
+// instead, but this is more useful for tests.
+template <typename T>
+constexpr inline bool operator==(const Maybe<T> &l, const Maybe<T> &r) {
+ return l.Known() == r.Known() && l.ValueOrDefault() == r.ValueOrDefault();
+}
+
+namespace test {
+
+using ::std::int32_t;
+using ::std::int64_t;
+using ::std::uint32_t;
+using ::std::uint64_t;
+
+TEST(Sum, Sum) {
+ EXPECT_EQ(Maybe<int32_t>(0), (Sum<int32_t, int32_t, int32_t, int32_t>(
+ Maybe<int32_t>(0), Maybe<int32_t>(0))));
+ EXPECT_EQ(Maybe<int32_t>(2147483647),
+ (Sum<int32_t, int32_t, int32_t, int32_t>(Maybe<int32_t>(2147483646),
+ Maybe<int32_t>(1))));
+ EXPECT_EQ(Maybe<int32_t>(-2147483647 - 1),
+ (Sum<int32_t, int32_t, int32_t, int32_t>(
+ Maybe<int32_t>(-2147483647), Maybe<int32_t>(-1))));
+ EXPECT_EQ(Maybe<uint32_t>(2147483648U),
+ (Sum<uint32_t, uint32_t, int32_t, int32_t>(
+ Maybe<int32_t>(2147483647), Maybe<int32_t>(1))));
+ EXPECT_EQ(Maybe<int32_t>(2147483647),
+ (Sum<int64_t, int32_t, uint32_t, int32_t>(
+ Maybe<uint32_t>(2147483648U), Maybe<int32_t>(-1))));
+ EXPECT_EQ(Maybe<int32_t>(), (Sum<int64_t, int32_t, uint32_t, int32_t>(
+ Maybe<uint32_t>(), Maybe<int32_t>(-1))));
+}
+
+TEST(Difference, Difference) {
+ EXPECT_EQ(Maybe<int32_t>(0), (Difference<int32_t, int32_t, int32_t, int32_t>(
+ Maybe<int32_t>(0), Maybe<int32_t>(0))));
+ EXPECT_EQ(Maybe<int32_t>(2147483647),
+ (Difference<int32_t, int32_t, int32_t, int32_t>(
+ Maybe<int32_t>(2147483646), Maybe<int32_t>(-1))));
+ EXPECT_EQ(Maybe<int32_t>(-2147483647 - 1),
+ (Difference<int32_t, int32_t, int32_t, int32_t>(
+ Maybe<int32_t>(-2147483647), Maybe<int32_t>(1))));
+ EXPECT_EQ(Maybe<uint32_t>(2147483648U),
+ (Difference<uint32_t, uint32_t, int32_t, int32_t>(
+ Maybe<int32_t>(2147483647), Maybe<int32_t>(-1))));
+ EXPECT_EQ(Maybe<int32_t>(2147483647),
+ (Difference<uint32_t, int32_t, uint32_t, int32_t>(
+ Maybe<uint32_t>(2147483648U), Maybe<int32_t>(1))));
+ EXPECT_EQ(Maybe<int32_t>(-2147483647 - 1),
+ (Difference<int64_t, int32_t, int32_t, uint32_t>(
+ Maybe<int32_t>(1), Maybe<uint32_t>(2147483649U))));
+ EXPECT_EQ(Maybe<int32_t>(), (Difference<int64_t, int32_t, int32_t, uint32_t>(
+ Maybe<int32_t>(1), Maybe<uint32_t>())));
+}
+
+TEST(Product, Product) {
+ EXPECT_EQ(Maybe<int32_t>(0), (Product<int32_t, int32_t, int32_t, int32_t>(
+ Maybe<int32_t>(0), Maybe<int32_t>(0))));
+ EXPECT_EQ(Maybe<int32_t>(-2147483646),
+ (Product<int32_t, int32_t, int32_t, int32_t>(
+ Maybe<int32_t>(2147483646), Maybe<int32_t>(-1))));
+ EXPECT_EQ(Maybe<int32_t>(-2147483647 - 1),
+ (Product<int32_t, int32_t, int32_t, int32_t>(
+ Maybe<int32_t>(-2147483647 - 1), Maybe<int32_t>(1))));
+ EXPECT_EQ(Maybe<uint32_t>(2147483648U),
+ (Product<uint32_t, uint32_t, int32_t, int32_t>(
+ Maybe<int32_t>(1073741824), Maybe<int32_t>(2))));
+ EXPECT_EQ(Maybe<uint32_t>(), (Product<uint32_t, uint32_t, int32_t, int32_t>(
+ Maybe<int32_t>(), Maybe<int32_t>(2))));
+}
+
+TEST(Equal, Equal) {
+ EXPECT_EQ(Maybe<bool>(true), (Equal<int32_t, bool, int32_t, int32_t>(
+ Maybe<int32_t>(0), Maybe<int32_t>(0))));
+ EXPECT_EQ(Maybe<bool>(false),
+ (Equal<int32_t, bool, int32_t, int32_t>(Maybe<int32_t>(2147483646),
+ Maybe<int32_t>(-1))));
+ EXPECT_EQ(Maybe<bool>(true),
+ (Equal<int32_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(2147483647), Maybe<uint32_t>(2147483647))));
+ EXPECT_EQ(Maybe<bool>(false),
+ (Equal<int64_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(-2147483648LL), Maybe<uint32_t>(2147483648U))));
+ EXPECT_EQ(Maybe<bool>(),
+ (Equal<int64_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(), Maybe<uint32_t>(2147483648U))));
+}
+
+TEST(NotEqual, NotEqual) {
+ EXPECT_EQ(Maybe<bool>(false), (NotEqual<int32_t, bool, int32_t, int32_t>(
+ Maybe<int32_t>(0), Maybe<int32_t>(0))));
+ EXPECT_EQ(Maybe<bool>(true),
+ (NotEqual<int32_t, bool, int32_t, int32_t>(
+ Maybe<int32_t>(2147483646), Maybe<int32_t>(-1))));
+ EXPECT_EQ(Maybe<bool>(false),
+ (NotEqual<int32_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(2147483647), Maybe<uint32_t>(2147483647))));
+ EXPECT_EQ(Maybe<bool>(true),
+ (NotEqual<int64_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(-2147483648LL), Maybe<uint32_t>(2147483648U))));
+ EXPECT_EQ(Maybe<bool>(),
+ (NotEqual<int64_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(-2147483648LL), Maybe<uint32_t>())));
+}
+
+TEST(LessThan, LessThan) {
+ EXPECT_EQ(Maybe<bool>(false), (LessThan<int32_t, bool, int32_t, int32_t>(
+ Maybe<int32_t>(0), Maybe<int32_t>(0))));
+ EXPECT_EQ(Maybe<bool>(false),
+ (LessThan<int32_t, bool, int32_t, int32_t>(
+ Maybe<int32_t>(2147483646), Maybe<int32_t>(-1))));
+ EXPECT_EQ(Maybe<bool>(false),
+ (LessThan<int32_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(2147483647), Maybe<uint32_t>(2147483647))));
+ EXPECT_EQ(Maybe<bool>(true),
+ (LessThan<int64_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(-2147483648LL), Maybe<uint32_t>(2147483648U))));
+ EXPECT_EQ(Maybe<bool>(),
+ (LessThan<int64_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(), Maybe<uint32_t>(2147483648U))));
+}
+
+TEST(LessThanOrEqual, LessThanOrEqual) {
+ EXPECT_EQ(Maybe<bool>(true),
+ (LessThanOrEqual<int32_t, bool, int32_t, int32_t>(
+ Maybe<int32_t>(0), Maybe<int32_t>(0))));
+ EXPECT_EQ(Maybe<bool>(false),
+ (LessThanOrEqual<int32_t, bool, int32_t, int32_t>(
+ Maybe<int32_t>(2147483646), Maybe<int32_t>(-1))));
+ EXPECT_EQ(Maybe<bool>(true),
+ (LessThanOrEqual<int32_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(2147483647), Maybe<uint32_t>(2147483647))));
+ EXPECT_EQ(Maybe<bool>(true),
+ (LessThanOrEqual<int64_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(-2147483648LL), Maybe<uint32_t>(2147483648U))));
+ EXPECT_EQ(Maybe<bool>(),
+ (LessThanOrEqual<int64_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(), Maybe<uint32_t>(2147483648U))));
+}
+
+TEST(GreaterThan, GreaterThan) {
+ EXPECT_EQ(Maybe<bool>(false), (GreaterThan<int32_t, bool, int32_t, int32_t>(
+ Maybe<int32_t>(0), Maybe<int32_t>(0))));
+ EXPECT_EQ(Maybe<bool>(true),
+ (GreaterThan<int32_t, bool, int32_t, int32_t>(
+ Maybe<int32_t>(2147483646), Maybe<int32_t>(-1))));
+ EXPECT_EQ(Maybe<bool>(false),
+ (GreaterThan<int32_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(2147483647), Maybe<uint32_t>(2147483647))));
+ EXPECT_EQ(Maybe<bool>(false),
+ (GreaterThan<int64_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(-2147483648LL), Maybe<uint32_t>(2147483648U))));
+ EXPECT_EQ(Maybe<bool>(),
+ (GreaterThan<int64_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(), Maybe<uint32_t>(2147483648U))));
+}
+
+TEST(GreaterThanOrEqual, GreaterThanOrEqual) {
+ EXPECT_EQ(Maybe<bool>(true),
+ (GreaterThanOrEqual<int32_t, bool, int32_t, int32_t>(
+ Maybe<int32_t>(0), Maybe<int32_t>(0))));
+ EXPECT_EQ(Maybe<bool>(true),
+ (GreaterThanOrEqual<int32_t, bool, int32_t, int32_t>(
+ Maybe<int32_t>(2147483646), Maybe<int32_t>(-1))));
+ EXPECT_EQ(Maybe<bool>(true),
+ (GreaterThanOrEqual<int32_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(2147483647), Maybe<uint32_t>(2147483647))));
+ EXPECT_EQ(Maybe<bool>(false),
+ (GreaterThanOrEqual<int64_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(-2147483648LL), Maybe<uint32_t>(2147483648U))));
+ EXPECT_EQ(Maybe<bool>(),
+ (GreaterThanOrEqual<int64_t, bool, int32_t, uint32_t>(
+ Maybe<int32_t>(), Maybe<uint32_t>(2147483648U))));
+}
+
+TEST(And, And) {
+ EXPECT_EQ(Maybe<bool>(true), (And<bool, bool, bool, bool>(
+ Maybe<bool>(true), Maybe<bool>(true))));
+ EXPECT_EQ(Maybe<bool>(),
+ (And<bool, bool, bool, bool>(Maybe<bool>(), Maybe<bool>(true))));
+ EXPECT_EQ(Maybe<bool>(),
+ (And<bool, bool, bool, bool>(Maybe<bool>(), Maybe<bool>())));
+ EXPECT_EQ(Maybe<bool>(),
+ (And<bool, bool, bool, bool>(Maybe<bool>(true), Maybe<bool>())));
+ EXPECT_EQ(Maybe<bool>(false), (And<bool, bool, bool, bool>(
+ Maybe<bool>(false), Maybe<bool>(true))));
+ EXPECT_EQ(Maybe<bool>(false),
+ (And<bool, bool, bool, bool>(Maybe<bool>(false), Maybe<bool>())));
+ EXPECT_EQ(Maybe<bool>(false), (And<bool, bool, bool, bool>(
+ Maybe<bool>(false), Maybe<bool>(false))));
+ EXPECT_EQ(Maybe<bool>(false), (And<bool, bool, bool, bool>(
+ Maybe<bool>(true), Maybe<bool>(false))));
+ EXPECT_EQ(Maybe<bool>(false),
+ (And<bool, bool, bool, bool>(Maybe<bool>(), Maybe<bool>(false))));
+}
+
+TEST(Or, Or) {
+ EXPECT_EQ(Maybe<bool>(false), (Or<bool, bool, bool, bool>(
+ Maybe<bool>(false), Maybe<bool>(false))));
+ EXPECT_EQ(Maybe<bool>(),
+ (Or<bool, bool, bool, bool>(Maybe<bool>(), Maybe<bool>(false))));
+ EXPECT_EQ(Maybe<bool>(),
+ (Or<bool, bool, bool, bool>(Maybe<bool>(), Maybe<bool>())));
+ EXPECT_EQ(Maybe<bool>(),
+ (Or<bool, bool, bool, bool>(Maybe<bool>(false), Maybe<bool>())));
+ EXPECT_EQ(Maybe<bool>(true), (Or<bool, bool, bool, bool>(Maybe<bool>(false),
+ Maybe<bool>(true))));
+ EXPECT_EQ(Maybe<bool>(true),
+ (Or<bool, bool, bool, bool>(Maybe<bool>(true), Maybe<bool>())));
+ EXPECT_EQ(Maybe<bool>(true),
+ (Or<bool, bool, bool, bool>(Maybe<bool>(true), Maybe<bool>(true))));
+ EXPECT_EQ(Maybe<bool>(true), (Or<bool, bool, bool, bool>(
+ Maybe<bool>(true), Maybe<bool>(false))));
+ EXPECT_EQ(Maybe<bool>(true),
+ (Or<bool, bool, bool, bool>(Maybe<bool>(), Maybe<bool>(true))));
+}
+
+TEST(Choice, Choice) {
+ EXPECT_EQ(Maybe<int>(), (Choice<int, int, bool, int, int>(
+ Maybe<bool>(), Maybe<int>(1), Maybe<int>(2))));
+ EXPECT_EQ(Maybe<int>(1),
+ (Choice<int, int, bool, int, int>(Maybe<bool>(true), Maybe<int>(1),
+ Maybe<int>(2))));
+ EXPECT_EQ(Maybe<int>(2),
+ (Choice<int, int, bool, int, int>(Maybe<bool>(false), Maybe<int>(1),
+ Maybe<int>(2))));
+ EXPECT_EQ(Maybe<int>(), (Choice<int, int, bool, int, int>(
+ Maybe<bool>(true), Maybe<int>(), Maybe<int>(2))));
+ EXPECT_EQ(Maybe<int>(),
+ (Choice<int, int, bool, int, int>(Maybe<bool>(false), Maybe<int>(1),
+ Maybe<int>())));
+ EXPECT_EQ(Maybe<int64_t>(2),
+ (Choice<int64_t, int64_t, bool, int32_t, int32_t>(
+ Maybe<bool>(false), Maybe<int32_t>(1), Maybe<int32_t>(2))));
+ EXPECT_EQ(Maybe<int64_t>(2),
+ (Choice<int64_t, int64_t, bool, int32_t, uint32_t>(
+ Maybe<bool>(false), Maybe<int32_t>(-1), Maybe<uint32_t>(2))));
+ EXPECT_EQ(Maybe<int64_t>(-1),
+ (Choice<int64_t, int64_t, bool, int32_t, uint32_t>(
+ Maybe<bool>(true), Maybe<int32_t>(-1), Maybe<uint32_t>(2))));
+ EXPECT_EQ(Maybe<bool>(true),
+ (Choice<bool, bool, bool, bool, bool>(
+ Maybe<bool>(false), Maybe<bool>(false), Maybe<bool>(true))));
+}
+
+TEST(Maximum, Maximum) {
+ EXPECT_EQ(Maybe<int>(100), (Maximum<int, int, int>(Maybe<int>(100))));
+ EXPECT_EQ(Maybe<int>(99),
+ (Maximum<int, int, int, int>(Maybe<int>(99), Maybe<int>(50))));
+ EXPECT_EQ(Maybe<int>(98),
+ (Maximum<int, int, int, int>(Maybe<int>(50), Maybe<int>(98))));
+ EXPECT_EQ(Maybe<int>(97),
+ (Maximum<int, int, int, int, int>(Maybe<int>(50), Maybe<int>(70),
+ Maybe<int>(97))));
+ EXPECT_EQ(Maybe<int>(), (Maximum<int, int, int, int, int>(
+ Maybe<int>(50), Maybe<int>(), Maybe<int>(97))));
+ EXPECT_EQ(Maybe<int>(-100),
+ (Maximum<int, int, int, int, int>(
+ Maybe<int>(-120), Maybe<int>(-150), Maybe<int>(-100))));
+ EXPECT_EQ(Maybe<int>(), (Maximum<int, int, int>(Maybe<int>())));
+}
+
+} // namespace test
+} // namespace support
+} // namespace emboss
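The mixed-signedness cases in TEST(Sum, Sum) depend on the IntermediateT/ResultT split: operands are first converted to IntermediateT, the operation runs in that type, and the result is then converted to ResultT. The following is a rough sketch of that idea without the Maybe wrapper; `SketchSum` is a hypothetical name and is not part of Emboss.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: widen both operands to IntermediateT, add, then convert
// the result to ResultT. This mirrors the casts MaybeDo applies around
// SumOperation, but skips the Known() handling entirely.
template <typename IntermediateT, typename ResultT, typename LeftT,
          typename RightT>
constexpr ResultT SketchSum(LeftT l, RightT r) {
  return static_cast<ResultT>(static_cast<IntermediateT>(l) +
                              static_cast<IntermediateT>(r));
}

// A uint32_t operand and a negative int32_t operand cannot share a 32-bit
// type, so the addition is carried out in int64_t; the result fits in int32_t.
static_assert(SketchSum<std::int64_t, std::int32_t, std::uint32_t,
                        std::int32_t>(2147483648U, -1) == 2147483647,
              "2**31 + (-1) should be 2**31 - 1");
```

The design point is that the code generator, not the C++ usual arithmetic conversions, picks the type in which each operation is evaluated.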
diff --git a/public/emboss_array_view.h b/public/emboss_array_view.h
new file mode 100644
index 0000000..12f0a59
--- /dev/null
+++ b/public/emboss_array_view.h
@@ -0,0 +1,382 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// View classes for arrays and bit arrays.
+#ifndef EMBOSS_PUBLIC_EMBOSS_ARRAY_VIEW_H_
+#define EMBOSS_PUBLIC_EMBOSS_ARRAY_VIEW_H_
+
+#include <cstddef>
+#include <iterator>
+#include <tuple>
+#include <type_traits>
+
+#include "public/emboss_arithmetic.h"
+#include "public/emboss_array_view.h"
+#include "public/emboss_text_util.h"
+
+namespace emboss {
+
+// Forward declarations for use by WriteShorthandArrayCommentToTextStream.
+namespace prelude {
+template <class Parameters, class BitViewType>
+class UIntView;
+template <class Parameters, class BitViewType>
+class IntView;
+} // namespace prelude
+
+namespace support {
+
+// Advance direction for ElementViewIterator.
+enum class ElementViewIteratorDirection { kForward, kReverse };
+
+// Iterator adapter for elements in a GenericArrayView.
+template <class GenericArrayView, ElementViewIteratorDirection kDirection>
+class ElementViewIterator {
+ public:
+ using iterator_category = ::std::random_access_iterator_tag;
+ using value_type = typename GenericArrayView::ViewType;
+ using difference_type = ::std::ptrdiff_t;
+ using pointer = typename ::std::add_pointer<value_type>::type;
+ using reference = typename ::std::add_lvalue_reference<value_type>::type;
+
+ explicit ElementViewIterator(const GenericArrayView *array_view,
+ ::std::ptrdiff_t index)
+ : array_view_(array_view), view_((*array_view)[index]), index_(index) {}
+
+ ElementViewIterator() = default;
+
+ reference operator*() { return view_; }
+
+ pointer operator->() { return &view_; }
+
+ ElementViewIterator &operator+=(difference_type d) {
+ index_ += (kDirection == ElementViewIteratorDirection::kForward ? d : -d);
+ view_ = (*array_view_)[index_];
+ return *this;
+ }
+
+ ElementViewIterator &operator-=(difference_type d) { return *this += (-d); }
+
+ ElementViewIterator &operator++() {
+ *this += 1;
+ return *this;
+ }
+
+ ElementViewIterator &operator--() {
+ *this -= 1;
+ return *this;
+ }
+
+ ElementViewIterator operator++(int) {
+ auto copy = *this;
+ ++(*this);
+ return copy;
+ }
+
+ ElementViewIterator operator--(int) {
+ auto copy = *this;
+ --(*this);
+ return copy;
+ }
+
+ ElementViewIterator operator+(difference_type d) const {
+ auto copy = *this;
+ copy += d;
+ return copy;
+ }
+
+ ElementViewIterator operator-(difference_type d) const {
+ return *this + (-d);
+ }
+
+ difference_type operator-(const ElementViewIterator &other) const {
+ return kDirection == ElementViewIteratorDirection::kForward
+ ? index_ - other.index_
+ : other.index_ - index_;
+ }
+
+ bool operator==(const ElementViewIterator &other) const {
+ return array_view_ == other.array_view_ && index_ == other.index_;
+ }
+
+ bool operator!=(const ElementViewIterator &other) const {
+ return !(*this == other);
+ }
+
+ bool operator<(const ElementViewIterator &other) const {
+ return kDirection == ElementViewIteratorDirection::kForward
+ ? index_ < other.index_
+ : other.index_ < index_;
+ }
+
+ bool operator<=(const ElementViewIterator &other) const {
+ return kDirection == ElementViewIteratorDirection::kForward
+ ? index_ <= other.index_
+ : other.index_ <= index_;
+ }
+
+ bool operator>(const ElementViewIterator &other) const {
+ return !(*this <= other);
+ }
+
+ bool operator>=(const ElementViewIterator &other) const {
+ return !(*this < other);
+ }
+
+ private:
+ const GenericArrayView *array_view_;
+ typename GenericArrayView::ViewType view_;
+ ::std::ptrdiff_t index_;
+};
+
+// View for an array in a structure.
+//
+// ElementView should be the view class for a single array element (e.g.,
+// UIntView<...> or ArrayView<...>).
+//
+// BufferType is the storage type that will be passed into the array.
+//
+// kElementSize is the fixed size of a single element, in addressable units.
+//
+// kAddressableUnitSize is the size of a single addressable unit. It should be
+// either 1 (one bit) or 8 (one byte).
+//
+// ElementViewParameterTypes is a list of the types of parameters which must be
+// passed down to each element of the array. ElementViewParameterTypes can be
+// empty.
+template <class ElementView, class BufferType, ::std::size_t kElementSize,
+ ::std::size_t kAddressableUnitSize,
+ typename... ElementViewParameterTypes>
+class GenericArrayView final {
+ public:
+ using ViewType = ElementView;
+ using ForwardIterator =
+ ElementViewIterator<GenericArrayView,
+ ElementViewIteratorDirection::kForward>;
+ using ReverseIterator =
+ ElementViewIterator<GenericArrayView,
+ ElementViewIteratorDirection::kReverse>;
+
+ GenericArrayView() : buffer_() {}
+ explicit GenericArrayView(const ElementViewParameterTypes &... parameters,
+ BufferType buffer)
+ : parameters_{parameters...}, buffer_{buffer} {}
+
+ ElementView operator[](::std::size_t index) const {
+ return IndexOperatorHelper<sizeof...(ElementViewParameterTypes) ==
+ 0>::ConstructElement(parameters_, buffer_,
+ index);
+ }
+
+ ForwardIterator begin() const { return ForwardIterator(this, 0); }
+ ForwardIterator end() const { return ForwardIterator(this, ElementCount()); }
+ ReverseIterator rbegin() const {
+ return ReverseIterator(this, ElementCount() - 1);
+ }
+ ReverseIterator rend() const { return ReverseIterator(this, -1); }
+
+ // In order to selectively enable SizeInBytes and SizeInBits, it is
+ // necessary to make them into templates. Further, it is necessary for
+ // ::std::enable_if to have a dependency on the template parameter, otherwise
+ // SFINAE won't kick in. Thus, these are templated on an int, and that int
+ // is (spuriously) used as the left argument to `,` in the enable_if
+ // condition. The explicit cast to void is needed to silence GCC's
+ // -Wunused-value.
+ template <int N = 0>
+ typename ::std::enable_if<((void)N, kAddressableUnitSize == 8),
+ ::std::size_t>::type
+ SizeInBytes() const {
+ return buffer_.SizeInBytes();
+ }
+ template <int N = 0>
+ typename ::std::enable_if<((void)N, kAddressableUnitSize == 1),
+ ::std::size_t>::type
+ SizeInBits() const {
+ return buffer_.SizeInBits();
+ }
+
+ ::std::size_t ElementCount() const { return SizeOfBuffer() / kElementSize; }
+ bool Ok() const {
+ if (!buffer_.Ok()) return false;
+ if (SizeOfBuffer() % kElementSize != 0) return false;
+ for (::std::size_t i = 0; i < ElementCount(); ++i) {
+ if (!(*this)[i].Ok()) return false;
+ }
+ return true;
+ }
+ template <class OtherElementView, class OtherBufferType>
+ bool Equals(
+ const GenericArrayView<OtherElementView, OtherBufferType, kElementSize,
+ kAddressableUnitSize> &other) const {
+ if (ElementCount() != other.ElementCount()) return false;
+ for (::std::size_t i = 0; i < ElementCount(); ++i) {
+ if (!(*this)[i].Equals(other[i])) return false;
+ }
+ return true;
+ }
+ template <class OtherElementView, class OtherBufferType>
+ bool UncheckedEquals(
+ const GenericArrayView<OtherElementView, OtherBufferType, kElementSize,
+ kAddressableUnitSize> &other) const {
+ if (ElementCount() != other.ElementCount()) return false;
+ for (::std::size_t i = 0; i < ElementCount(); ++i) {
+ if (!(*this)[i].UncheckedEquals(other[i])) return false;
+ }
+ return true;
+ }
+ bool IsComplete() const { return buffer_.Ok(); }
+
+ template <class Stream>
+ bool UpdateFromTextStream(Stream *stream) const {
+ return ReadArrayFromTextStream(this, stream);
+ }
+
+ template <class Stream>
+ void WriteToTextStream(Stream *stream,
+ const TextOutputOptions &options) const {
+ WriteArrayToTextStream(this, stream, options);
+ }
+
+ BufferType BackingStorage() const { return buffer_; }
+
+ private:
+ // This uses the same technique to select the correct definition of
+ // SizeOfBuffer() as in the SizeInBits()/SizeInBytes() selection above.
+ template <int N = 0>
+ typename ::std::enable_if<((void)N, kAddressableUnitSize == 8),
+ ::std::size_t>::type
+ SizeOfBuffer() const {
+ return SizeInBytes();
+ }
+ template <int N = 0>
+ typename ::std::enable_if<((void)N, kAddressableUnitSize == 1),
+ ::std::size_t>::type
+ SizeOfBuffer() const {
+ return SizeInBits();
+ }
+
+ // This mess is needed to expand the parameters_ tuple into individual
+ // arguments to the ElementView constructor. If parameters_ has M elements,
+ // then:
+ //
+ // IndexOperatorHelper<false>::ConstructElement() calls
+ // IndexOperatorHelper<false, 0>::ConstructElement(), which calls
+ // IndexOperatorHelper<false, 0, 1>::ConstructElement(), and so on, up to
+ // IndexOperatorHelper<false, 0, 1, ..., M-2>::ConstructElement(), which calls
+ // IndexOperatorHelper<true, 0, 1, ..., M-1>::ConstructElement()
+ //
+ // That last call will resolve to the second, specialized version of
+ // IndexOperatorHelper. That version's ConstructElement() uses
+ // `std::get<N>(parameters)...`, which will be expanded into
+ // `std::get<0>(parameters), std::get<1>(parameters), std::get<2>(parameters),
+ // ..., std::get<M-1>(parameters)`.
+ //
+ // If there are 0 parameters, then operator[]() will call
+ // IndexOperatorHelper<true>::ConstructElement(), which still works --
+ // `std::get<N>(parameters)...,` will be replaced by ``.
+ //
+ // In C++14, a lot of this can be replaced by std::index_sequence_for, and in
+ // C++17 it can be replaced with std::apply and a lambda.
+ //
+ // An alternate solution would be to force each parameterized view to have a
+ // constructor that accepts a tuple, instead of individual parameters, but
+ // that (further) complicates the matrix of constructors for view types.
+ template <bool, ::std::size_t... N>
+ struct IndexOperatorHelper {
+ static ElementView ConstructElement(
+ const ::std::tuple<ElementViewParameterTypes...> &parameters,
+ BufferType buffer, ::std::size_t index) {
+ return IndexOperatorHelper<
+ sizeof...(ElementViewParameterTypes) == 1 + sizeof...(N), N...,
+ sizeof...(N)>::ConstructElement(parameters, buffer, index);
+ }
+ };
+
+ template </**/ ::std::size_t... N>
+ struct IndexOperatorHelper<true, N...> {
+ static ElementView ConstructElement(
+ const ::std::tuple<ElementViewParameterTypes...> &parameters,
+ BufferType buffer, ::std::size_t index) {
+ return ElementView(::std::get<N>(parameters)...,
+ buffer.template GetOffsetStorage<kElementSize, 0>(
+ kElementSize * index, kElementSize));
+ }
+ };
+
+ ::std::tuple<ElementViewParameterTypes...> parameters_;
+ BufferType buffer_;
+};
+
+// Optionally prints a shorthand representation of a BitArray in a comment.
+template <class ElementView, class BufferType, size_t kElementSize,
+ size_t kAddressableUnitSize, class Stream>
+void WriteShorthandArrayCommentToTextStream(
+ const GenericArrayView<ElementView, BufferType, kElementSize,
+ kAddressableUnitSize> *array,
+ Stream *stream, const TextOutputOptions &options) {
+ // Intentionally empty. Overload for specific element types.
+}
+
+// Prints out the elements of an 8-bit Int or UInt array as characters.
+template <class Array, class Stream>
+void WriteShorthandAsciiArrayCommentToTextStream(
+ const Array *array, Stream *stream, const TextOutputOptions &options) {
+ if (!options.multiline()) return;
+ if (!options.comments()) return;
+ if (array->ElementCount() == 0) return;
+ static constexpr int kCharsPerBlock = 64;
+ static constexpr char kStandInForNonPrintableChar = '.';
+ auto start_new_line = [&]() {
+ stream->Write("\n");
+ stream->Write(options.current_indent());
+ stream->Write("# ");
+ };
+ for (int i = 0, n = array->ElementCount(); i < n; ++i) {
+ const int c = (*array)[i].Read();
+ const bool c_is_printable = (c >= 32 && c <= 126);
+ const bool starting_new_block = ((i % kCharsPerBlock) == 0);
+ if (starting_new_block) start_new_line();
+ stream->Write(c_is_printable ? static_cast<char>(c)
+ : kStandInForNonPrintableChar);
+ }
+}
+
+// Overload for arrays of UInt.
+// Prints out the elements as ASCII characters for arrays of UInt:8.
+template <class BufferType, class BitViewType, class Stream,
+ size_t kElementSize, class Parameters,
+ class = typename ::std::enable_if<Parameters::kBits == 8>::type>
+void WriteShorthandArrayCommentToTextStream(
+ const GenericArrayView<prelude::UIntView<Parameters, BitViewType>,
+ BufferType, kElementSize, 8> *array,
+ Stream *stream, const TextOutputOptions &options) {
+ WriteShorthandAsciiArrayCommentToTextStream(array, stream, options);
+}
+
+// Overload for arrays of Int.
+// Prints out the elements as ASCII characters for arrays of Int:8.
+template <class BufferType, class BitViewType, class Stream,
+ size_t kElementSize, class Parameters,
+ class = typename ::std::enable_if<Parameters::kBits == 8>::type>
+void WriteShorthandArrayCommentToTextStream(
+ const GenericArrayView<prelude::IntView<Parameters, BitViewType>,
+ BufferType, kElementSize, 8> *array,
+ Stream *stream, const TextOutputOptions &options) {
+ WriteShorthandAsciiArrayCommentToTextStream(array, stream, options);
+}
+
+} // namespace support
+} // namespace emboss
+
+#endif // EMBOSS_PUBLIC_EMBOSS_ARRAY_VIEW_H_
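Note on the IndexOperatorHelper technique in emboss_array_view.h: the following is a minimal standalone sketch of the same C++11 recursion, not the Emboss implementation. The names `Element`, `Expander`, and `MakeElement` are hypothetical and exist only for illustration. The primary template appends one index per step until the pack holds 0..M-1, and the `true` specialization then expands the tuple into constructor arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

// Hypothetical element type standing in for an element view: its constructor
// takes two parameters followed by the element index.
struct Element {
  Element(int scale, int offset, std::size_t index)
      : value(scale * static_cast<int>(index) + offset) {}
  int value;
};

// Primary template: appends the next index, sizeof...(N), to the pack and
// recurses. The bool argument becomes true once the pack holds one index per
// tuple element.
template <class Tuple, bool kDone, std::size_t... N>
struct Expander {
  static Element Make(const Tuple &params, std::size_t index) {
    return Expander<Tuple, std::tuple_size<Tuple>::value == 1 + sizeof...(N),
                    N..., sizeof...(N)>::Make(params, index);
  }
};

// Terminal specialization: N... is now 0, 1, ..., M-1, so the tuple can be
// expanded into individual constructor arguments.
template <class Tuple, std::size_t... N>
struct Expander<Tuple, true, N...> {
  static Element Make(const Tuple &params, std::size_t index) {
    return Element(std::get<N>(params)..., index);
  }
};

// Entry point: starts with an empty index pack. With zero parameters the
// terminal specialization is selected immediately.
template <class... Params>
Element MakeElement(const std::tuple<Params...> &params, std::size_t index) {
  return Expander<std::tuple<Params...>, sizeof...(Params) == 0>::Make(params,
                                                                      index);
}
```

For a two-element tuple the chain is `Expander<T, false>`, then `Expander<T, false, 0>`, then `Expander<T, true, 0, 1>`, which is where the expansion happens.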
diff --git a/public/emboss_array_view_test.cc b/public/emboss_array_view_test.cc
new file mode 100644
index 0000000..1868d65
--- /dev/null
+++ b/public/emboss_array_view_test.cc
@@ -0,0 +1,280 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "public/emboss_array_view.h"
+
+#include <string>
+#include <type_traits>
+
+#include "absl/strings/str_format.h"
+#include <gtest/gtest.h>
+
+#include "public/emboss_prelude.h"
+
+namespace emboss {
+namespace support {
+namespace test {
+
+using ::emboss::prelude::IntView;
+using ::emboss::prelude::UIntView;
+
+template <class ElementView, class BufferType, ::std::size_t kElementSize>
+using ArrayView = GenericArrayView<ElementView, BufferType, kElementSize, 8>;
+
+template <class ElementView, class BufferType, ::std::size_t kElementSize>
+using BitArrayView = GenericArrayView<ElementView, BufferType, kElementSize, 1>;
+
+template <size_t kBits>
+using LittleEndianBitBlockN =
+ BitBlock<LittleEndianByteOrderer<ReadWriteContiguousBuffer>, kBits>;
+
+template <size_t kBits>
+using FixedUIntView = UIntView<FixedSizeViewParameters<kBits, AllValuesAreOk>,
+ LittleEndianBitBlockN<kBits>>;
+
+template <size_t kBits>
+using FixedIntView = IntView<FixedSizeViewParameters<kBits, AllValuesAreOk>,
+ LittleEndianBitBlockN<kBits>>;
+
+TEST(ArrayView, Methods) {
+ uint8_t bytes[] = {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09,
+ 0x08, 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01};
+ auto byte_array = ArrayView<FixedUIntView<8>, ReadWriteContiguousBuffer, 1>{
+ ReadWriteContiguousBuffer{bytes, sizeof bytes - 4}};
+ EXPECT_EQ(sizeof bytes - 4, byte_array.SizeInBytes());
+ EXPECT_EQ(bytes[0], byte_array[0].Read());
+ EXPECT_EQ(bytes[1], byte_array[1].Read());
+ EXPECT_EQ(bytes[2], byte_array[2].Read());
+ EXPECT_DEATH(byte_array[sizeof bytes - 4].Read(), "");
+ EXPECT_EQ(bytes[sizeof bytes - 4],
+ byte_array[sizeof bytes - 4].UncheckedRead());
+ EXPECT_TRUE(byte_array[sizeof bytes - 5].IsComplete());
+ EXPECT_FALSE(byte_array[sizeof bytes - 4].IsComplete());
+ EXPECT_TRUE(byte_array.Ok());
+ EXPECT_TRUE(byte_array.IsComplete());
+ EXPECT_FALSE((ArrayView<FixedUIntView<8>, ReadWriteContiguousBuffer, 1>{
+ ReadWriteContiguousBuffer{
+ nullptr}}.Ok()));
+ EXPECT_TRUE(byte_array.IsComplete());
+
+ auto uint32_array =
+ ArrayView<FixedUIntView<32>, ReadWriteContiguousBuffer, 4>{
+ ReadWriteContiguousBuffer{bytes, sizeof bytes - 4}};
+ EXPECT_EQ(sizeof bytes - 4, uint32_array.SizeInBytes());
+ EXPECT_TRUE(uint32_array[0].Ok());
+ EXPECT_EQ(0x0d0e0f10, uint32_array[0].Read());
+ EXPECT_EQ(0x090a0b0c, uint32_array[1].Read());
+ EXPECT_EQ(0x05060708, uint32_array[2].Read());
+ EXPECT_DEATH(uint32_array[3].Read(), "");
+ EXPECT_EQ(0x01020304, uint32_array[3].UncheckedRead());
+ EXPECT_TRUE(uint32_array[2].IsComplete());
+ EXPECT_FALSE(uint32_array[3].IsComplete());
+ EXPECT_TRUE(uint32_array.Ok());
+ EXPECT_TRUE(uint32_array.IsComplete());
+ EXPECT_FALSE((ArrayView<FixedUIntView<32>, ReadWriteContiguousBuffer, 1>{
+ ReadWriteContiguousBuffer{
+ nullptr}}.Ok()));
+}
+
+TEST(ArrayView, Ok) {
+ uint8_t bytes[] = {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09,
+ 0x08, 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01};
+ // All elements are complete and, themselves, Ok(), so the array should be
+ // Ok().
+ auto byte_array = ArrayView<FixedUIntView<16>, ReadWriteContiguousBuffer, 2>(
+ ReadWriteContiguousBuffer(bytes, sizeof bytes - 4));
+ EXPECT_TRUE(byte_array.Ok());
+
+ // An array with a partial element at the end should not be Ok().
+ byte_array = ArrayView<FixedUIntView<16>, ReadWriteContiguousBuffer, 2>(
+ ReadWriteContiguousBuffer(bytes, sizeof bytes - 3));
+ EXPECT_FALSE(byte_array.Ok());
+
+ // An empty array should be Ok().
+ byte_array = ArrayView<FixedUIntView<16>, ReadWriteContiguousBuffer, 2>(
+ ReadWriteContiguousBuffer(bytes, 0));
+ EXPECT_TRUE(byte_array.Ok());
+}
+
+TEST(ArrayView, TextFormatInput) {
+ uint8_t bytes[16] = {0};
+ auto byte_array = ArrayView<FixedUIntView<8>, ReadWriteContiguousBuffer, 1>{
+ ReadWriteContiguousBuffer{bytes, sizeof bytes}};
+ EXPECT_FALSE(UpdateFromText(byte_array, ""));
+ EXPECT_FALSE(UpdateFromText(byte_array, "[]"));
+ EXPECT_FALSE(UpdateFromText(byte_array, "{"));
+ EXPECT_FALSE(UpdateFromText(byte_array, "{[0"));
+ EXPECT_FALSE(UpdateFromText(byte_array, "{[0:0}"));
+ EXPECT_FALSE(UpdateFromText(byte_array, "{[]:0}"));
+ EXPECT_FALSE(UpdateFromText(byte_array, "{[0] 0}"));
+ EXPECT_FALSE(UpdateFromText(byte_array, "{[0] 0}"));
+ EXPECT_TRUE(UpdateFromText(byte_array, "{}"));
+ EXPECT_FALSE(UpdateFromText(byte_array, "{,1}"));
+ EXPECT_FALSE(UpdateFromText(byte_array, "{1,,}"));
+ EXPECT_FALSE(UpdateFromText(byte_array, "{ a }"));
+ EXPECT_TRUE(UpdateFromText(byte_array, "{1}"));
+ EXPECT_EQ(1, bytes[0]);
+ EXPECT_TRUE(UpdateFromText(byte_array, " {2}"));
+ EXPECT_EQ(2, bytes[0]);
+ EXPECT_TRUE(UpdateFromText(byte_array, " {\t\r\n4 } junk"));
+ EXPECT_EQ(4, bytes[0]);
+ EXPECT_TRUE(UpdateFromText(byte_array, "{3,}"));
+ EXPECT_EQ(3, bytes[0]);
+ EXPECT_FALSE(UpdateFromText(byte_array, "{4 5}"));
+ EXPECT_TRUE(UpdateFromText(byte_array, "{4, 5}"));
+ EXPECT_EQ(4, bytes[0]);
+ EXPECT_EQ(5, bytes[1]);
+ EXPECT_TRUE(UpdateFromText(byte_array, "{5, [6]: 5}"));
+ EXPECT_EQ(5, bytes[0]);
+ EXPECT_EQ(5, bytes[1]);
+ EXPECT_EQ(5, bytes[6]);
+ EXPECT_TRUE(UpdateFromText(byte_array, "{6, [7]:6, 6}"));
+ EXPECT_EQ(6, bytes[0]);
+ EXPECT_EQ(5, bytes[1]);
+ EXPECT_EQ(5, bytes[6]);
+ EXPECT_EQ(6, bytes[7]);
+ EXPECT_EQ(6, bytes[8]);
+ EXPECT_TRUE(UpdateFromText(byte_array, "{[7]: 7, 7, [0]: 7, 7}"));
+ EXPECT_EQ(7, bytes[0]);
+ EXPECT_EQ(7, bytes[1]);
+ EXPECT_EQ(7, bytes[7]);
+ EXPECT_EQ(7, bytes[8]);
+ EXPECT_FALSE(UpdateFromText(byte_array, "{[16]: 0}"));
+ EXPECT_FALSE(UpdateFromText(byte_array, "{[15]: 0, 0}"));
+}
+
+TEST(ArrayView, TextFormatOutput_WithAndWithoutComments) {
+ signed char bytes[16] = {-3, 2, -1, 1, 0, 1, 1, 2,
+ 3, 5, 8, 13, 21, 34, 55, 89};
+ auto buffer = ReadWriteContiguousBuffer{reinterpret_cast<uint8_t *>(bytes),
+ sizeof bytes};
+ auto byte_array =
+ ArrayView<FixedIntView<8>, ReadWriteContiguousBuffer, 1>{buffer};
+ EXPECT_EQ(
+ "{ [0]: -3, 2, -1, 1, 0, 1, 1, 2, [8]: 3, 5, 8, 13, 21, 34, 55, 89 }",
+ WriteToString(byte_array));
+ EXPECT_EQ(WriteToString(byte_array, MultilineText()),
+ R"({
+ # ............."7Y
+ [0]: -3 # -0x3
+ [1]: 2 # 0x2
+ [2]: -1 # -0x1
+ [3]: 1 # 0x1
+ [4]: 0 # 0x0
+ [5]: 1 # 0x1
+ [6]: 1 # 0x1
+ [7]: 2 # 0x2
+ [8]: 3 # 0x3
+ [9]: 5 # 0x5
+ [10]: 8 # 0x8
+ [11]: 13 # 0xd
+ [12]: 21 # 0x15
+ [13]: 34 # 0x22
+ [14]: 55 # 0x37
+ [15]: 89 # 0x59
+})");
+ EXPECT_EQ(
+ WriteToString(byte_array,
+ MultilineText().WithIndent(" ").WithComments(false)),
+ R"({
+ [0]: -3
+ [1]: 2
+ [2]: -1
+ [3]: 1
+ [4]: 0
+ [5]: 1
+ [6]: 1
+ [7]: 2
+ [8]: 3
+ [9]: 5
+ [10]: 8
+ [11]: 13
+ [12]: 21
+ [13]: 34
+ [14]: 55
+ [15]: 89
+})");
+ EXPECT_EQ(
+ WriteToString(byte_array, TextOutputOptions().WithNumericBase(16)),
+ "{ [0x0]: -0x3, 0x2, -0x1, 0x1, 0x0, 0x1, 0x1, 0x2, [0x8]: 0x3, 0x5, "
+ "0x8, 0xd, 0x15, 0x22, 0x37, 0x59 }");
+}
+
+TEST(ArrayView, TextFormatOutput_8BitIntElementTypes) {
+ uint8_t bytes[1] = {65};
+ auto buffer = ReadWriteContiguousBuffer{bytes, sizeof bytes};
+ const ::std::string expected_text = R"({
+ # A
+ [0]: 65 # 0x41
+})";
+ EXPECT_EQ(
+ WriteToString(
+ ArrayView<FixedIntView<8>, ReadWriteContiguousBuffer, 1>{buffer},
+ MultilineText()),
+ expected_text);
+ EXPECT_EQ(
+ WriteToString(
+ ArrayView<FixedUIntView<8>, ReadWriteContiguousBuffer, 1>{buffer},
+ MultilineText()),
+ expected_text);
+}
+
+TEST(ArrayView, TextFormatOutput_16BitIntElementTypes) {
+ uint16_t bytes[1] = {65};
+ auto buffer = ReadWriteContiguousBuffer{reinterpret_cast<uint8_t *>(bytes),
+ sizeof bytes};
+ const ::std::string expected_text = R"({
+ [0]: 65 # 0x41
+})";
+ EXPECT_EQ(
+ WriteToString(
+ ArrayView<FixedIntView<16>, ReadWriteContiguousBuffer, 2>{buffer},
+ MultilineText()),
+ expected_text);
+ EXPECT_EQ(
+ WriteToString(
+ ArrayView<FixedUIntView<16>, ReadWriteContiguousBuffer, 2>{buffer},
+ MultilineText()),
+ expected_text);
+}
+
+TEST(ArrayView, TextFormatOutput_MultilineComment) {
+ uint8_t bytes[65];
+ for (::std::size_t i = 0; i < sizeof bytes; ++i) {
+ bytes[i] = '0' + (i % 10);
+ }
+ for (const ::std::size_t length : {63, 64, 65}) {
+ auto buffer = ReadWriteContiguousBuffer{bytes, length};
+ ::std::string expected_text =
+ "{\n # "
+ "012345678901234567890123456789012345678901234567890123456789012";
+ if (length > 63) expected_text += "3";
+ if (length > 64) expected_text += "\n # 4";
+ expected_text += "\n";
+ for (::std::size_t i = 0; i < length; ++i) {
+ expected_text +=
+ absl::StrFormat(" [%d]: %d # 0x%02x\n", i, bytes[i], bytes[i]);
+ }
+ expected_text += "}";
+ EXPECT_EQ(
+ WriteToString(
+ ArrayView<FixedIntView<8>, ReadWriteContiguousBuffer, 1>{buffer},
+ MultilineText()),
+ expected_text);
+ }
+}
+
+} // namespace test
+} // namespace support
+} // namespace emboss
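Note on the ASCII shorthand comment exercised by the TextFormatOutput tests above: the sketch below restates the rule those tests check, assuming a hypothetical free function `AsciiShorthandComment` that is not part of Emboss. Printable characters (32 through 126) are emitted as-is, everything else becomes `.`, and a new `# ` comment line starts every 64 characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not an Emboss API): builds the ASCII shorthand comment
// for a sequence of 8-bit element values.
std::string AsciiShorthandComment(const std::vector<int> &values,
                                  const std::string &indent) {
  const std::size_t kCharsPerBlock = 64;
  std::string out;
  for (std::size_t i = 0; i < values.size(); ++i) {
    // Each block of 64 characters starts on a fresh comment line.
    if (i % kCharsPerBlock == 0) out += "\n" + indent + "# ";
    const int c = values[i];
    const bool printable = (c >= 32 && c <= 126);
    // Non-printable values are replaced by a stand-in character.
    out += printable ? static_cast<char>(c) : '.';
  }
  return out;
}
```

An empty input produces no comment at all, which matches the early return on `ElementCount() == 0` in the header.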
diff --git a/public/emboss_bit_util.h b/public/emboss_bit_util.h
new file mode 100644
index 0000000..24ac132
--- /dev/null
+++ b/public/emboss_bit_util.h
@@ -0,0 +1,79 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// This file contains various utility routines for manipulating values at a low
+// level, such as byte swaps and safe casts.
+#ifndef EMBOSS_PUBLIC_EMBOSS_BIT_UTIL_H_
+#define EMBOSS_PUBLIC_EMBOSS_BIT_UTIL_H_
+
+#include <cstdint>
+#include <type_traits>
+
+#include "public/emboss_defines.h"
+
+namespace emboss {
+namespace support {
+
+// Where possible, it is best to use byte swap builtins, but if they are not
+// available ByteSwap can fall back to portable code.
+inline constexpr ::std::uint8_t ByteSwap(::std::uint8_t x) { return x; }
+inline constexpr ::std::uint16_t ByteSwap(::std::uint16_t x) {
+#ifdef EMBOSS_BYTESWAP16
+ return EMBOSS_BYTESWAP16(x);
+#else
+ return (x << 8) | (x >> 8);
+#endif
+}
+inline constexpr ::std::uint32_t ByteSwap(::std::uint32_t x) {
+#ifdef EMBOSS_BYTESWAP32
+ return EMBOSS_BYTESWAP32(x);
+#else
+ return (static_cast</**/ ::std::uint32_t>(
+ ByteSwap(static_cast</**/ ::std::uint16_t>(x)))
+ << 16) |
+ ByteSwap(static_cast</**/ ::std::uint16_t>(x >> 16));
+#endif
+}
+inline constexpr ::std::uint64_t ByteSwap(::std::uint64_t x) {
+#ifdef EMBOSS_BYTESWAP64
+ return EMBOSS_BYTESWAP64(x);
+#else
+ return (static_cast</**/ ::std::uint64_t>(
+ ByteSwap(static_cast</**/ ::std::uint32_t>(x)))
+ << 32) |
+ ByteSwap(static_cast</**/ ::std::uint32_t>(x >> 32));
+#endif
+}
+
+// Masks the given value to the given number of bits.
+template <typename T>
+inline constexpr T MaskToNBits(T value, unsigned bits) {
+ static_assert(!::std::is_signed<T>::value,
+ "MaskToNBits only works on unsigned values.");
+ return bits < sizeof value * 8 ? value & ((static_cast<T>(1) << bits) - 1)
+ : value;
+}
+
+template <typename T>
+inline constexpr bool IsPowerOfTwo(T value) {
+ // This check relies on an old bit-counting trick; x & (x - 1) always has one
+ // fewer bit set to 1 than x (if x is nonzero), and powers of 2 always have
+ // exactly one 1 bit, thus x & (x - 1) == 0 if x is a power of 2.
+ return value > 0 && (value & (value - 1)) == 0;
+}
+
+} // namespace support
+} // namespace emboss
+
+#endif // EMBOSS_PUBLIC_EMBOSS_BIT_UTIL_H_
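Note on the portable fallbacks in emboss_bit_util.h: a small standalone sketch of the same three ideas (halving byte swap, masking to N bits, and the `x & (x - 1)` power-of-two check). The function names `Swap16`, `Swap32`, `LowBits`, and `IsPow2` are hypothetical and fixed to 16/32-bit types for brevity; the header's versions are overloaded or templated.

```cpp
#include <cassert>
#include <cstdint>

// Swaps the two bytes of a 16-bit value. The operands are promoted to int, so
// the result is cast back to drop the bits shifted above bit 15.
constexpr std::uint16_t Swap16(std::uint16_t x) {
  return static_cast<std::uint16_t>((x << 8) | (x >> 8));
}

// Swaps a 32-bit value by swapping each 16-bit half and exchanging the halves.
constexpr std::uint32_t Swap32(std::uint32_t x) {
  return (static_cast<std::uint32_t>(Swap16(static_cast<std::uint16_t>(x)))
          << 16) |
         Swap16(static_cast<std::uint16_t>(x >> 16));
}

// Keeps the low `bits` bits. The bits < 32 guard avoids shifting by the full
// width of the type, which would be undefined behavior.
constexpr std::uint32_t LowBits(std::uint32_t value, unsigned bits) {
  return bits < 32 ? value & ((std::uint32_t{1} << bits) - 1) : value;
}

// x & (x - 1) clears the lowest set bit, so it is zero exactly when x has a
// single set bit. Zero is excluded explicitly.
constexpr bool IsPow2(std::uint32_t value) {
  return value > 0 && (value & (value - 1)) == 0;
}

static_assert(Swap16(0x0201) == 0x0102, "16-bit swap");
static_assert(Swap32(0x04030201u) == 0x01020304u, "32-bit swap");
```

Because everything is `constexpr`, the `static_assert` lines check the swaps at compile time.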
diff --git a/public/emboss_bit_util_test.cc b/public/emboss_bit_util_test.cc
new file mode 100644
index 0000000..8986383
--- /dev/null
+++ b/public/emboss_bit_util_test.cc
@@ -0,0 +1,186 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "public/emboss_bit_util.h"
+
+#include <limits>
+
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace support {
+namespace test {
+
+TEST(ByteSwap, ByteSwap) {
+ EXPECT_EQ(0x01, ByteSwap(uint8_t{0x01}));
+ EXPECT_EQ(0x0102, ByteSwap(uint16_t{0x0201}));
+ EXPECT_EQ(0x01020304, ByteSwap(uint32_t{0x04030201}));
+ EXPECT_EQ(0x0102030405060708UL, ByteSwap(uint64_t{0x0807060504030201UL}));
+}
+
+TEST(MaskToNBits, MaskToNBits) {
+ EXPECT_EQ(0xff, MaskToNBits(0xffffffffU, 8));
+ EXPECT_EQ(0x00, MaskToNBits(0xffffff00U, 8));
+ EXPECT_EQ(0x01, MaskToNBits(0xffffffffU, 1));
+ EXPECT_EQ(0x00, MaskToNBits(0xfffffffeU, 1));
+ EXPECT_EQ(0xffffffffU, MaskToNBits(0xffffffffU, 32));
+ EXPECT_EQ(0xffffffffffffffffU, MaskToNBits(0xffffffffffffffffU, 64));
+ EXPECT_EQ(0xf, MaskToNBits(uint8_t{0xff}, 4));
+}
+
+TEST(IsPowerOfTwo, IsPowerOfTwo) {
+ EXPECT_TRUE(IsPowerOfTwo(1U));
+ EXPECT_TRUE(IsPowerOfTwo(2U));
+ EXPECT_TRUE(IsPowerOfTwo(1UL << 63));
+ EXPECT_TRUE(IsPowerOfTwo(uint8_t{128}));
+ EXPECT_TRUE(IsPowerOfTwo(1));
+ EXPECT_TRUE(IsPowerOfTwo(2));
+ EXPECT_TRUE(IsPowerOfTwo(1L << 62));
+ EXPECT_TRUE(IsPowerOfTwo(int8_t{64}));
+
+ EXPECT_FALSE(IsPowerOfTwo(0U));
+ EXPECT_FALSE(IsPowerOfTwo(3U));
+ EXPECT_FALSE(IsPowerOfTwo((1UL << 63) - 1));
+ EXPECT_FALSE(IsPowerOfTwo((1UL << 62) + 1));
+ EXPECT_FALSE(IsPowerOfTwo((3UL << 62)));
+ EXPECT_FALSE(IsPowerOfTwo(::std::numeric_limits<uint64_t>::max()));
+ EXPECT_FALSE(IsPowerOfTwo(uint8_t{129}));
+ EXPECT_FALSE(IsPowerOfTwo(uint8_t{255}));
+ EXPECT_FALSE(IsPowerOfTwo(-1));
+ EXPECT_FALSE(IsPowerOfTwo(-2));
+ EXPECT_FALSE(IsPowerOfTwo(-3));
+ EXPECT_FALSE(IsPowerOfTwo(::std::numeric_limits<int64_t>::min()));
+ EXPECT_FALSE(IsPowerOfTwo(::std::numeric_limits<int64_t>::max()));
+ EXPECT_FALSE(IsPowerOfTwo(0));
+ EXPECT_FALSE(IsPowerOfTwo(3));
+ EXPECT_FALSE(IsPowerOfTwo((1L << 62) - 1));
+ EXPECT_FALSE(IsPowerOfTwo((1L << 61) + 1));
+ EXPECT_FALSE(IsPowerOfTwo((3L << 61)));
+ EXPECT_FALSE(IsPowerOfTwo(int8_t{-1}));
+ EXPECT_FALSE(IsPowerOfTwo(int8_t{-128}));
+ EXPECT_FALSE(IsPowerOfTwo(int8_t{65}));
+ EXPECT_FALSE(IsPowerOfTwo(int8_t{127}));
+}
+
+#if defined(EMBOSS_LITTLE_ENDIAN_TO_NATIVE)
+TEST(EndianConversion, LittleEndianToNative) {
+ ::std::uint16_t data16 = 0;
+ reinterpret_cast<char *>(&data16)[0] = 0x01;
+ reinterpret_cast<char *>(&data16)[1] = 0x02;
+ EXPECT_EQ(0x0201, EMBOSS_LITTLE_ENDIAN_TO_NATIVE(data16));
+
+ ::std::uint32_t data32 = 0;
+ reinterpret_cast<char *>(&data32)[0] = 0x01;
+ reinterpret_cast<char *>(&data32)[1] = 0x02;
+ reinterpret_cast<char *>(&data32)[2] = 0x03;
+ reinterpret_cast<char *>(&data32)[3] = 0x04;
+ EXPECT_EQ(0x04030201, EMBOSS_LITTLE_ENDIAN_TO_NATIVE(data32));
+
+ ::std::uint64_t data64 = 0;
+ reinterpret_cast<char *>(&data64)[0] = 0x01;
+ reinterpret_cast<char *>(&data64)[1] = 0x02;
+ reinterpret_cast<char *>(&data64)[2] = 0x03;
+ reinterpret_cast<char *>(&data64)[3] = 0x04;
+ reinterpret_cast<char *>(&data64)[4] = 0x05;
+ reinterpret_cast<char *>(&data64)[5] = 0x06;
+ reinterpret_cast<char *>(&data64)[6] = 0x07;
+ reinterpret_cast<char *>(&data64)[7] = 0x08;
+ EXPECT_EQ(0x0807060504030201, EMBOSS_LITTLE_ENDIAN_TO_NATIVE(data64));
+}
+#endif // defined(EMBOSS_LITTLE_ENDIAN_TO_NATIVE)
+
+#if defined(EMBOSS_BIG_ENDIAN_TO_NATIVE)
+TEST(EndianConversion, BigEndianToNative) {
+ ::std::uint16_t data16 = 0;
+ reinterpret_cast<char *>(&data16)[0] = 0x01;
+ reinterpret_cast<char *>(&data16)[1] = 0x02;
+ EXPECT_EQ(0x0102, EMBOSS_BIG_ENDIAN_TO_NATIVE(data16));
+
+ ::std::uint32_t data32 = 0;
+ reinterpret_cast<char *>(&data32)[0] = 0x01;
+ reinterpret_cast<char *>(&data32)[1] = 0x02;
+ reinterpret_cast<char *>(&data32)[2] = 0x03;
+ reinterpret_cast<char *>(&data32)[3] = 0x04;
+ EXPECT_EQ(0x01020304, EMBOSS_BIG_ENDIAN_TO_NATIVE(data32));
+
+ ::std::uint64_t data64 = 0;
+ reinterpret_cast<char *>(&data64)[0] = 0x01;
+ reinterpret_cast<char *>(&data64)[1] = 0x02;
+ reinterpret_cast<char *>(&data64)[2] = 0x03;
+ reinterpret_cast<char *>(&data64)[3] = 0x04;
+ reinterpret_cast<char *>(&data64)[4] = 0x05;
+ reinterpret_cast<char *>(&data64)[5] = 0x06;
+ reinterpret_cast<char *>(&data64)[6] = 0x07;
+ reinterpret_cast<char *>(&data64)[7] = 0x08;
+ EXPECT_EQ(0x0102030405060708, EMBOSS_BIG_ENDIAN_TO_NATIVE(data64));
+}
+#endif // defined(EMBOSS_BIG_ENDIAN_TO_NATIVE)
+
+#if defined(EMBOSS_NATIVE_TO_LITTLE_ENDIAN)
+TEST(EndianConversion, NativeToLittleEndian) {
+ ::std::uint16_t data16 =
+ EMBOSS_NATIVE_TO_LITTLE_ENDIAN(static_cast</**/ ::std::uint16_t>(0x0201));
+ EXPECT_EQ(0x01, reinterpret_cast<char *>(&data16)[0]);
+ EXPECT_EQ(0x02, reinterpret_cast<char *>(&data16)[1]);
+
+ ::std::uint32_t data32 = EMBOSS_NATIVE_TO_LITTLE_ENDIAN(
+ static_cast</**/ ::std::uint32_t>(0x04030201));
+ EXPECT_EQ(0x01, reinterpret_cast<char *>(&data32)[0]);
+ EXPECT_EQ(0x02, reinterpret_cast<char *>(&data32)[1]);
+ EXPECT_EQ(0x03, reinterpret_cast<char *>(&data32)[2]);
+ EXPECT_EQ(0x04, reinterpret_cast<char *>(&data32)[3]);
+
+ ::std::uint64_t data64 = EMBOSS_NATIVE_TO_LITTLE_ENDIAN(
+ static_cast</**/ ::std::uint64_t>(0x0807060504030201));
+ EXPECT_EQ(0x01, reinterpret_cast<char *>(&data64)[0]);
+ EXPECT_EQ(0x02, reinterpret_cast<char *>(&data64)[1]);
+ EXPECT_EQ(0x03, reinterpret_cast<char *>(&data64)[2]);
+ EXPECT_EQ(0x04, reinterpret_cast<char *>(&data64)[3]);
+ EXPECT_EQ(0x05, reinterpret_cast<char *>(&data64)[4]);
+ EXPECT_EQ(0x06, reinterpret_cast<char *>(&data64)[5]);
+ EXPECT_EQ(0x07, reinterpret_cast<char *>(&data64)[6]);
+ EXPECT_EQ(0x08, reinterpret_cast<char *>(&data64)[7]);
+}
+#endif // defined(EMBOSS_NATIVE_TO_LITTLE_ENDIAN)
+
+#if defined(EMBOSS_NATIVE_TO_BIG_ENDIAN)
+TEST(EndianConversion, NativeToBigEndian) {
+ ::std::uint16_t data16 =
+ EMBOSS_NATIVE_TO_BIG_ENDIAN(static_cast</**/ ::std::uint16_t>(0x0102));
+ EXPECT_EQ(0x01, reinterpret_cast<char *>(&data16)[0]);
+ EXPECT_EQ(0x02, reinterpret_cast<char *>(&data16)[1]);
+
+ ::std::uint32_t data32 = EMBOSS_NATIVE_TO_BIG_ENDIAN(
+ static_cast</**/ ::std::uint32_t>(0x01020304));
+ EXPECT_EQ(0x01, reinterpret_cast<char *>(&data32)[0]);
+ EXPECT_EQ(0x02, reinterpret_cast<char *>(&data32)[1]);
+ EXPECT_EQ(0x03, reinterpret_cast<char *>(&data32)[2]);
+ EXPECT_EQ(0x04, reinterpret_cast<char *>(&data32)[3]);
+
+ ::std::uint64_t data64 = EMBOSS_NATIVE_TO_BIG_ENDIAN(
+ static_cast</**/ ::std::uint64_t>(0x0102030405060708));
+ EXPECT_EQ(0x01, reinterpret_cast<char *>(&data64)[0]);
+ EXPECT_EQ(0x02, reinterpret_cast<char *>(&data64)[1]);
+ EXPECT_EQ(0x03, reinterpret_cast<char *>(&data64)[2]);
+ EXPECT_EQ(0x04, reinterpret_cast<char *>(&data64)[3]);
+ EXPECT_EQ(0x05, reinterpret_cast<char *>(&data64)[4]);
+ EXPECT_EQ(0x06, reinterpret_cast<char *>(&data64)[5]);
+ EXPECT_EQ(0x07, reinterpret_cast<char *>(&data64)[6]);
+ EXPECT_EQ(0x08, reinterpret_cast<char *>(&data64)[7]);
+}
+#endif // defined(EMBOSS_NATIVE_TO_BIG_ENDIAN)
+
+} // namespace test
+} // namespace support
+} // namespace emboss
diff --git a/public/emboss_constant_view.h b/public/emboss_constant_view.h
new file mode 100644
index 0000000..41785e5
--- /dev/null
+++ b/public/emboss_constant_view.h
@@ -0,0 +1,51 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#ifndef EMBOSS_PUBLIC_EMBOSS_CONSTANT_VIEW_H_
+#define EMBOSS_PUBLIC_EMBOSS_CONSTANT_VIEW_H_
+
+#include "public/emboss_maybe.h"
+
+namespace emboss {
+namespace support {
+
+// MaybeConstantView is a "view" type that "reads" a value passed into its
+// constructor.
+//
+// This is used internally by generated structure view classes to provide views
+// of parameters; in this way, parameters can be treated like fields in the
+// generated code.
+template <typename ValueT>
+class MaybeConstantView {
+ public:
+ MaybeConstantView() : value_() {}
+ constexpr explicit MaybeConstantView(ValueT value) : value_(value) {}
+ MaybeConstantView(const MaybeConstantView &) = default;
+ MaybeConstantView(MaybeConstantView &&) = default;
+ MaybeConstantView &operator=(const MaybeConstantView &) = default;
+ MaybeConstantView &operator=(MaybeConstantView &&) = default;
+ ~MaybeConstantView() = default;
+
+ constexpr ValueT Read() const { return value_.Value(); }
+ constexpr ValueT UncheckedRead() const { return value_.ValueOrDefault(); }
+ constexpr bool Ok() const { return value_.Known(); }
+
+ private:
+ ::emboss::support::Maybe<ValueT> value_;
+};
+
+} // namespace support
+} // namespace emboss
+
+#endif // EMBOSS_PUBLIC_EMBOSS_CONSTANT_VIEW_H_
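Note on MaybeConstantView: the class depends on `Maybe` from emboss_maybe.h, which is not shown in this part of the patch. The sketch below assumes a hypothetical stand-in, `SketchMaybe`, with only `Known()` and `ValueOrDefault()`, to illustrate how a value passed to the constructor can be read back through the same `Ok()` / `UncheckedRead()` surface that field views expose. It omits the checked `Read()`, and none of these names are Emboss APIs.

```cpp
#include <cassert>

// Hypothetical stand-in for emboss::support::Maybe: a value plus a flag that
// records whether the value is known.
template <typename T>
class SketchMaybe {
 public:
  constexpr SketchMaybe() : value_(), known_(false) {}
  constexpr explicit SketchMaybe(T value) : value_(value), known_(true) {}
  constexpr bool Known() const { return known_; }
  constexpr T ValueOrDefault() const { return known_ ? value_ : T(); }

 private:
  T value_;
  bool known_;
};

// Hypothetical counterpart of MaybeConstantView: "reads" the value it was
// constructed with, so a parameter can be treated like a field.
template <typename T>
class SketchConstantView {
 public:
  constexpr SketchConstantView() : value_() {}
  constexpr explicit SketchConstantView(T value) : value_(value) {}
  constexpr bool Ok() const { return value_.Known(); }
  constexpr T UncheckedRead() const { return value_.ValueOrDefault(); }

 private:
  SketchMaybe<T> value_;
};
```

A default-constructed view reports `Ok() == false` and reads as the default value, mirroring the expectations in emboss_constant_view_test.cc.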
diff --git a/public/emboss_constant_view_test.cc b/public/emboss_constant_view_test.cc
new file mode 100644
index 0000000..aafc14e
--- /dev/null
+++ b/public/emboss_constant_view_test.cc
@@ -0,0 +1,62 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "public/emboss_constant_view.h"
+
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace support {
+namespace test {
+
+TEST(MaybeConstantViewTest, Read) {
+ EXPECT_EQ(7, MaybeConstantView</**/ ::std::uint8_t>(7).Read());
+ EXPECT_DEATH(MaybeConstantView</**/ ::std::uint8_t>().Read(), "Known\\(\\)");
+}
+
+TEST(MaybeConstantViewTest, UncheckedRead) {
+ EXPECT_EQ(7, MaybeConstantView</**/ ::std::uint8_t>(7).UncheckedRead());
+ EXPECT_EQ(0, MaybeConstantView</**/ ::std::uint8_t>().UncheckedRead());
+}
+
+TEST(MaybeConstantViewTest, Ok) {
+ EXPECT_TRUE(MaybeConstantView</**/ ::std::uint8_t>(7).Ok());
+ EXPECT_FALSE(MaybeConstantView</**/ ::std::uint8_t>().Ok());
+}
+
+TEST(MaybeConstantViewTest, CopyConstruction) {
+ auto with_value = MaybeConstantView</**/ ::std::uint8_t>(7);
+ auto copied_with_value = with_value;
+ EXPECT_EQ(7, copied_with_value.Read());
+
+ auto without_value = MaybeConstantView</**/ ::std::uint8_t>();
+ auto copied_without_value = without_value;
+ EXPECT_FALSE(copied_without_value.Ok());
+}
+
+TEST(MaybeConstantViewTest, Assignment) {
+ auto with_value = MaybeConstantView</**/ ::std::uint8_t>(7);
+ MaybeConstantView</**/ ::std::uint8_t> copied_with_value;
+ copied_with_value = with_value;
+ EXPECT_EQ(7, copied_with_value.Read());
+
+ auto without_value = MaybeConstantView</**/ ::std::uint8_t>();
+ MaybeConstantView</**/ ::std::uint8_t> copied_without_value;
+ copied_without_value = without_value;
+ EXPECT_FALSE(copied_without_value.Ok());
+}
+
+} // namespace test
+} // namespace support
+} // namespace emboss
diff --git a/public/emboss_cpp_types.h b/public/emboss_cpp_types.h
new file mode 100644
index 0000000..a4aa3b2
--- /dev/null
+++ b/public/emboss_cpp_types.h
@@ -0,0 +1,125 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// This file contains various C++ type aliases for use in Emboss.
+#ifndef EMBOSS_PUBLIC_EMBOSS_CPP_TYPES_H_
+#define EMBOSS_PUBLIC_EMBOSS_CPP_TYPES_H_
+
+#include <climits>
+#include <cstdint>
+#include <type_traits>
+
+namespace emboss {
+namespace support {
+
+static_assert(sizeof(long long) * CHAR_BIT >= 64, // NOLINT
+ "Emboss requires that long long is at least 64 bits.");
+
+// FloatType<kBits>::Type is the C++ floating-point type with exactly kBits
+// bits; FloatType<kBits>::UIntType is the same-sized unsigned integer type.
+template <int kBits>
+struct FloatType final {
+ static_assert(kBits == 32 || kBits == 64, "Unknown floating-point size.");
+};
+template <>
+struct FloatType<64> final {
+ static_assert(sizeof(double) * CHAR_BIT == 64,
+ "C++ double type must be 64 bits!");
+ using Type = double;
+ using UIntType = ::std::uint64_t;
+};
+template <>
+struct FloatType<32> final {
+ static_assert(sizeof(float) * CHAR_BIT == 32,
+ "C++ float type must be 32 bits!");
+ using Type = float;
+ using UIntType = ::std::uint32_t;
+};
+
+// LeastWidthInteger<kBits>::Unsigned is the smallest uintNN_t type that can
+// hold kBits or more bits. LeastWidthInteger<kBits>::Signed is the
+// corresponding signed type.
+template <int kBits>
+struct LeastWidthInteger final {
+ static_assert(kBits <= 64, "Only bit sizes up to 64 are supported.");
+ using Unsigned = typename LeastWidthInteger<kBits + 1>::Unsigned;
+ using Signed = typename LeastWidthInteger<kBits + 1>::Signed;
+};
+template <>
+struct LeastWidthInteger<64> final {
+ using Unsigned = ::std::uint64_t;
+ using Signed = ::std::int64_t;
+};
+template <>
+struct LeastWidthInteger<32> final {
+ using Unsigned = ::std::uint32_t;
+ using Signed = ::std::int32_t;
+};
+template <>
+struct LeastWidthInteger<16> final {
+ using Unsigned = ::std::uint16_t;
+ using Signed = ::std::int16_t;
+};
+template <>
+struct LeastWidthInteger<8> final {
+ using Unsigned = ::std::uint8_t;
+ using Signed = ::std::int8_t;
+};
+
+// IsChar<T>::value is true if T is a character type; i.e. const? volatile?
+// (signed|unsigned)? char.
+template <typename T>
+struct IsChar {
+ // Note that 'char' is a distinct type from 'signed char' and 'unsigned char'.
+ static constexpr bool value =
+ ::std::is_same<char, typename ::std::remove_cv<T>::type>::value ||
+ ::std::is_same<unsigned char,
+ typename ::std::remove_cv<T>::type>::value ||
+ ::std::is_same<signed char, typename ::std::remove_cv<T>::type>::value;
+};
+
+// The static member variable requires a definition.
+template <typename T>
+constexpr bool IsChar<T>::value;
+
+// AddSourceConst<SourceT, DestT>::Type is DestT's base type with const added if
+// SourceT is const.
+template <typename SourceT, typename DestT>
+struct AddSourceConst {
+ using Type = typename ::std::conditional<
+ /**/ ::std::is_const<SourceT>::value,
+ typename ::std::add_const<DestT>::type, DestT>::type;
+};
+
+// AddSourceVolatile<SourceT, DestT>::Type is DestT's base type with volatile
+// added if SourceT is volatile.
+template <typename SourceT, typename DestT>
+struct AddSourceVolatile {
+ using Type = typename ::std::conditional<
+ /**/ ::std::is_volatile<SourceT>::value,
+ typename ::std::add_volatile<DestT>::type, DestT>::type;
+};
+
+// AddSourceCV<SourceT, DestT>::Type is DestT's base type with SourceT's const
+// and volatile qualifiers added, if any.
+template <typename SourceT, typename DestT>
+struct AddSourceCV {
+ using Type = typename AddSourceConst<
+ SourceT, typename AddSourceVolatile<SourceT, DestT>::Type>::Type;
+};
+
+} // namespace support
+} // namespace emboss
+
+#endif // EMBOSS_PUBLIC_EMBOSS_CPP_TYPES_H_
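
Editorial sketch (not part of the patch): the recursive lookup used by LeastWidthInteger and the qualifier propagation used by AddSourceCV can be tried in isolation. The names `sketch`, `LeastWidth`, and `CopyCV` below are hypothetical stand-ins chosen for this illustration; only the technique is taken from the header above, and the signed variant is omitted for brevity.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {  // Hypothetical namespace; not part of Emboss.

// Primary template: forward to the next-larger bit count until one of the
// specializations below (8, 16, 32, 64) ends the recursion.
template <int kBits>
struct LeastWidth {
  static_assert(kBits >= 1 && kBits <= 64, "Only 1 to 64 bits are supported.");
  using Unsigned = typename LeastWidth<kBits + 1>::Unsigned;
};
template <> struct LeastWidth<8> { using Unsigned = std::uint8_t; };
template <> struct LeastWidth<16> { using Unsigned = std::uint16_t; };
template <> struct LeastWidth<32> { using Unsigned = std::uint32_t; };
template <> struct LeastWidth<64> { using Unsigned = std::uint64_t; };

// Copies SourceT's const and volatile qualifiers onto DestT, keeping any
// qualifiers DestT already has.
template <typename SourceT, typename DestT>
struct CopyCV {
  using WithVolatile = typename std::conditional<
      std::is_volatile<SourceT>::value,
      typename std::add_volatile<DestT>::type, DestT>::type;
  using Type = typename std::conditional<
      std::is_const<SourceT>::value,
      typename std::add_const<WithVolatile>::type, WithVolatile>::type;
};

static_assert(std::is_same<LeastWidth<12>::Unsigned, std::uint16_t>::value,
              "12 bits should round up to a 16-bit type.");
static_assert(std::is_same<CopyCV<const int, char>::Type, const char>::value,
              "const should carry over from the source type.");

}  // namespace sketch
```

Because every step is a type computation, a wrong mapping shows up as a compile error rather than a runtime failure, which is presumably why the accompanying tests only need std::is_same checks.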
diff --git a/public/emboss_cpp_types_test.cc b/public/emboss_cpp_types_test.cc
new file mode 100644
index 0000000..e56858a
--- /dev/null
+++ b/public/emboss_cpp_types_test.cc
@@ -0,0 +1,199 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "public/emboss_cpp_types.h"
+
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace support {
+namespace test {
+
+TEST(FloatTypes, Types) {
+ EXPECT_EQ(32 / CHAR_BIT, sizeof(FloatType<32>::Type));
+ EXPECT_EQ(64 / CHAR_BIT, sizeof(FloatType<64>::Type));
+ EXPECT_EQ(32 / CHAR_BIT, sizeof(FloatType<32>::UIntType));
+ EXPECT_EQ(64 / CHAR_BIT, sizeof(FloatType<64>::UIntType));
+ EXPECT_TRUE(::std::is_floating_point<FloatType<32>::Type>::value);
+ EXPECT_TRUE(::std::is_floating_point<FloatType<64>::Type>::value);
+ EXPECT_TRUE(
+ (::std::is_same<FloatType<32>::UIntType, ::std::uint32_t>::value));
+ EXPECT_TRUE(
+ (::std::is_same<FloatType<64>::UIntType, ::std::uint64_t>::value));
+}
+
+TEST(LeastWidthInteger, Types) {
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<1>::Unsigned, ::std::uint8_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<1>::Signed, ::std::int8_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<4>::Unsigned, ::std::uint8_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<4>::Signed, ::std::int8_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<7>::Unsigned, ::std::uint8_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<7>::Signed, ::std::int8_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<8>::Unsigned, ::std::uint8_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<8>::Signed, ::std::int8_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<9>::Unsigned, ::std::uint16_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<9>::Signed, ::std::int16_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<12>::Unsigned, ::std::uint16_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<12>::Signed, ::std::int16_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<15>::Unsigned, ::std::uint16_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<15>::Signed, ::std::int16_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<16>::Unsigned, ::std::uint16_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<16>::Signed, ::std::int16_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<17>::Unsigned, ::std::uint32_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<17>::Signed, ::std::int32_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<28>::Unsigned, ::std::uint32_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<28>::Signed, ::std::int32_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<31>::Unsigned, ::std::uint32_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<31>::Signed, ::std::int32_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<32>::Unsigned, ::std::uint32_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<32>::Signed, ::std::int32_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<33>::Unsigned, ::std::uint64_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<33>::Signed, ::std::int64_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<60>::Unsigned, ::std::uint64_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<60>::Signed, ::std::int64_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<63>::Unsigned, ::std::uint64_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<63>::Signed, ::std::int64_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<64>::Unsigned, ::std::uint64_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<LeastWidthInteger<64>::Signed, ::std::int64_t>::value));
+}
+
+TEST(IsChar, CharTypes) {
+ EXPECT_TRUE(IsChar<char>::value);
+ EXPECT_TRUE(IsChar<unsigned char>::value);
+ EXPECT_TRUE(IsChar<signed char>::value);
+ EXPECT_TRUE(IsChar<const char>::value);
+ EXPECT_TRUE(IsChar<const unsigned char>::value);
+ EXPECT_TRUE(IsChar<const signed char>::value);
+ EXPECT_TRUE(IsChar<volatile char>::value);
+ EXPECT_TRUE(IsChar<volatile unsigned char>::value);
+ EXPECT_TRUE(IsChar<volatile signed char>::value);
+ EXPECT_TRUE(IsChar<const volatile char>::value);
+ EXPECT_TRUE(IsChar<const volatile unsigned char>::value);
+ EXPECT_TRUE(IsChar<const volatile signed char>::value);
+}
+
+TEST(IsChar, NonCharTypes) {
+ struct OneByte { char c; };
+ EXPECT_EQ(1, sizeof(OneByte));
+ EXPECT_FALSE(IsChar<int>::value);
+ EXPECT_FALSE(IsChar<unsigned>::value);
+ EXPECT_FALSE(IsChar<const int>::value);
+ EXPECT_FALSE(IsChar<OneByte>::value);
+}
+
+TEST(AddSourceConst, AddSourceConst) {
+ EXPECT_TRUE(
+ (::std::is_same<const char,
+ typename AddSourceConst<const int, char>::Type>::value));
+ EXPECT_TRUE(
+ (::std::is_same<
+ const volatile char,
+ typename AddSourceConst<const int, volatile char>::Type>::value));
+ EXPECT_TRUE(
+ (::std::is_same<char, typename AddSourceConst<int, char>::Type>::value));
+ EXPECT_TRUE(
+ (::std::is_same<
+ char, typename AddSourceConst<volatile int, char>::Type>::value));
+ EXPECT_TRUE(
+ (::std::is_same<const char,
+ typename AddSourceConst<int, const char>::Type>::value));
+ EXPECT_TRUE(
+ (::std::is_same<const char, typename AddSourceConst<
+ const int, const char>::Type>::value));
+}
+
+TEST(AddSourceVolatile, AddSourceVolatile) {
+ EXPECT_TRUE(
+ (::std::is_same<volatile char, typename AddSourceVolatile<
+ volatile int, char>::Type>::value));
+ EXPECT_TRUE(
+ (::std::is_same<
+ const volatile char,
+ typename AddSourceVolatile<volatile int, const char>::Type>::value));
+ EXPECT_TRUE(
+ (::std::is_same<char,
+ typename AddSourceVolatile<int, char>::Type>::value));
+ EXPECT_TRUE(
+ (::std::is_same<
+ char, typename AddSourceVolatile<const int, char>::Type>::value));
+ EXPECT_TRUE(
+ (::std::is_same<volatile char, typename AddSourceVolatile<
+ int, volatile char>::Type>::value));
+ EXPECT_TRUE(
+ (::std::is_same<volatile char,
+ typename AddSourceVolatile<volatile int,
+ volatile char>::Type>::value));
+}
+
+TEST(AddSourceCV, AddSourceCV) {
+ EXPECT_TRUE(
+ (::std::is_same<const char,
+ typename AddSourceCV<const int, char>::Type>::value));
+ EXPECT_TRUE(
+ (::std::is_same<volatile char,
+ typename AddSourceCV<volatile int, char>::Type>::value));
+ EXPECT_TRUE((::std::is_same<
+ const volatile char,
+ typename AddSourceCV<volatile int, const char>::Type>::value));
+ EXPECT_TRUE((::std::is_same<
+ const volatile char,
+ typename AddSourceCV<const int, volatile char>::Type>::value));
+ EXPECT_TRUE((::std::is_same<
+ const volatile char,
+ typename AddSourceCV<const volatile int, char>::Type>::value));
+ EXPECT_TRUE(
+ (::std::is_same<char, typename AddSourceCV<int, char>::Type>::value));
+ EXPECT_TRUE(
+ (::std::is_same<volatile char,
+ typename AddSourceCV<int, volatile char>::Type>::value));
+ EXPECT_TRUE(
+ (::std::is_same<
+ volatile char,
+ typename AddSourceCV<volatile int, volatile char>::Type>::value));
+}
+} // namespace test
+} // namespace support
+} // namespace emboss
diff --git a/public/emboss_cpp_util.h b/public/emboss_cpp_util.h
new file mode 100644
index 0000000..624f931
--- /dev/null
+++ b/public/emboss_cpp_util.h
@@ -0,0 +1,31 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// This header exports utilities that are needed for all Emboss-generated C++
+// code.
+#ifndef EMBOSS_PUBLIC_EMBOSS_CPP_UTIL_H_
+#define EMBOSS_PUBLIC_EMBOSS_CPP_UTIL_H_
+
+#include "public/emboss_arithmetic.h"
+#include "public/emboss_array_view.h"
+#include "public/emboss_bit_util.h"
+#include "public/emboss_constant_view.h"
+#include "public/emboss_cpp_types.h"
+#include "public/emboss_defines.h"
+#include "public/emboss_enum_view.h"
+#include "public/emboss_memory_util.h"
+#include "public/emboss_text_util.h"
+#include "public/emboss_view_parameters.h"
+
+#endif // EMBOSS_PUBLIC_EMBOSS_CPP_UTIL_H_
diff --git a/public/emboss_cpp_util_google_integration_test.cc b/public/emboss_cpp_util_google_integration_test.cc
new file mode 100644
index 0000000..c634b10
--- /dev/null
+++ b/public/emboss_cpp_util_google_integration_test.cc
@@ -0,0 +1,37 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "public/emboss_cpp_util.h"
+
+#include <gtest/gtest.h>
+#include "third_party/absl/strings/string_view.h"
+
+namespace emboss {
+namespace support {
+namespace test {
+
+TEST(TextStream, Construction) {
+ absl::string_view view_text = "gh";
+ auto text_stream = TextStream(view_text);
+ char result;
+ EXPECT_TRUE(text_stream.Read(&result));
+ EXPECT_EQ('g', result);
+ EXPECT_TRUE(text_stream.Read(&result));
+ EXPECT_EQ('h', result);
+ EXPECT_FALSE(text_stream.Read(&result));
+}
+
+} // namespace test
+} // namespace support
+} // namespace emboss
diff --git a/public/emboss_cpp_util_nc.cc b/public/emboss_cpp_util_nc.cc
new file mode 100644
index 0000000..7e72db7
--- /dev/null
+++ b/public/emboss_cpp_util_nc.cc
@@ -0,0 +1,43 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "public/emboss_cpp_util.h"
+
+#include <string>
+
+namespace emboss {
+namespace {
+
+void X() {
+#ifdef TEST_CANNOT_CONSTRUCT_READ_WRITE_CONTIGUOUS_BUFFER_FROM_STRING
+ ::std::string foo = "string";
+
+ // Read-only ContiguousBuffer should be fine.
+ (void)ContiguousBuffer<const char>(foo);
+
+ // Read-write ContiguousBuffer should fail to compile.
+ (void)ContiguousBuffer<char>(foo);
+#endif // TEST_CANNOT_CONSTRUCT_READ_WRITE_CONTIGUOUS_BUFFER_FROM_STRING
+
+#ifdef TEST_CANNOT_CONSTRUCT_NON_BYTE_CONTIGUOUS_BUFFER
+ // ContiguousBuffer<char>(nullptr) should be fine...
+ (void)ContiguousBuffer<char>(nullptr);
+
+ // ... but ContiguousBuffer<int>(nullptr) should not.
+ (void)ContiguousBuffer<int>(nullptr);
+#endif // TEST_CANNOT_CONSTRUCT_NON_BYTE_CONTIGUOUS_BUFFER
+}
+
+} // namespace
+} // namespace emboss
diff --git a/public/emboss_defines.h b/public/emboss_defines.h
new file mode 100644
index 0000000..3be4d66
--- /dev/null
+++ b/public/emboss_defines.h
@@ -0,0 +1,167 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// This header contains #defines that are used to control Emboss's generated
+// code.
+#ifndef EMBOSS_PUBLIC_EMBOSS_DEFINES_H_
+#define EMBOSS_PUBLIC_EMBOSS_DEFINES_H_
+
+// TODO(bolms): Add an explicit extension point for these macros.
+#include <assert.h>
+#define EMBOSS_CHECK(x) assert((x))
+#define EMBOSS_CHECK_LE(x, y) assert((x) <= (y))
+#define EMBOSS_CHECK_GE(x, y) assert((x) >= (y))
+#define EMBOSS_CHECK_GT(x, y) assert((x) > (y))
+#define EMBOSS_CHECK_EQ(x, y) assert((x) == (y))
+
+// If EMBOSS_FORCE_ALL_CHECKS is #defined, EMBOSS_DCHECK is meant to run even in
+// optimized modes. For now, both branches expand to the same assert() call.
+#ifdef EMBOSS_FORCE_ALL_CHECKS
+#define EMBOSS_DCHECK_EQ(x, y) assert((x) == (y))
+#define EMBOSS_DCHECK_GE(x, y) assert((x) >= (y))
+#else
+#define EMBOSS_DCHECK_EQ(x, y) assert((x) == (y))
+#define EMBOSS_DCHECK_GE(x, y) assert((x) >= (y))
+#endif // EMBOSS_FORCE_ALL_CHECKS
+
+// Technically, the mapping from pointers to integers is implementation-defined,
+// but the standard states "[ Note: It is intended to be unsurprising to those
+// who know the addressing structure of the underlying machine. - end note ],"
+// so this should be a reasonably safe way to check that a pointer is aligned.
+#define EMBOSS_DCHECK_POINTER_ALIGNMENT(p, align, offset) \
+ EMBOSS_DCHECK_EQ(reinterpret_cast</**/ ::std::uintptr_t>((p)) % (align), \
+ (offset))
+#define EMBOSS_CHECK_POINTER_ALIGNMENT(p, align, offset) \
+ EMBOSS_CHECK_EQ(reinterpret_cast</**/ ::std::uintptr_t>((p)) % (align), \
+ (offset))
+
+// !! WARNING !!
+//
+// It is possible to pre-#define a number of macros used below to influence
+// Emboss's system-specific optimizations. If so, they *must* be #defined the
+// same way in every compilation unit that #includes any Emboss-related header
+// or generated code, before any such headers are #included, or else there is a
+// real risk of ODR violations. It is recommended that any EMBOSS_* #defines
+// are added to the global C++ compiler options in your build system, rather
+// than being individually specified in source files.
+//
+// TODO(bolms): Add an #include for a site-specific header file, where
+// site-specific customizations can be placed, and recommend that any overrides
+// be placed in that header. Further, use that header for the EMBOSS_CHECK_*
+// #defines, above.
+
+// EMBOSS_NO_OPTIMIZATIONS is used to turn off all system-specific
+// optimizations. This is mostly intended for testing, but could be used if
+// optimizations are causing problems.
+#if !defined(EMBOSS_NO_OPTIMIZATIONS)
+#if defined(__GNUC__) // GCC and "compatible" compilers, such as Clang.
+
+// GCC, Clang, and ICC only support two's-complement systems, so it is safe to
+// assume two's-complement for those systems. In particular, this means that
+// static_cast<int>() will treat its argument as a two's-complement bit pattern,
+// which means that it is reasonable to static_cast<int>(some_unsigned_value).
+//
+// TODO(bolms): Are there actually any non-archaic systems that use any integer
+// types other than 2's-complement?
+#ifndef EMBOSS_SYSTEM_IS_TWOS_COMPLEMENT
+#define EMBOSS_SYSTEM_IS_TWOS_COMPLEMENT 1
+#endif
+
+#if !defined(__INTEL_COMPILER)
+// On systems with known host byte order, Emboss can always use memcpy to safely
+// and relatively efficiently read and write values from and to memory.
+// However, memcpy cannot assume that its pointers are aligned. On common
+// platforms, particularly x86, this almost never matters; however, on some
+// systems this can add considerable overhead, as memcpy must either perform
+// byte-by-byte copies or perform tests to determine pointer alignment and then
+// dispatch to alignment-specific code.
+//
+// Platforms with no alignment restrictions:
+//
+// * x86 (except for a few SSE instructions like movdqa: see
+// http://pzemtsov.github.io/2016/11/06/bug-story-alignment-on-x86.html)
+// * ARM systems with ARMv6 and later ISAs
+// * High-end POWER-based systems
+// * POWER-based systems with an alignment exception handler installed (but note
+// that emulated unaligned reads are *very* slow)
+//
+// Platforms with alignment restrictions:
+//
+// * MicroBlaze
+// * Emscripten
+// * Low-end bare-metal POWER-based systems
+// * ARM systems with ARMv5 and earlier ISAs
+// * x86 with the AC bit of EFLAGS enabled (but note that this is never enabled
+// on any normal system, and, e.g., you will get crashes in glibc if you try
+// to enable it)
+//
+// The naive solution is to reinterpret_cast to a type like uint32_t, then read
+// or write through that pointer; however, this can easily run afoul of C++'s
+// type aliasing rules and result in undefined behavior.
+//
+// On GCC, there is a solution to this: use the "__may_alias__" type attribute,
+// which essentially forces the type to have the same aliasing rules as char;
+// i.e., it is safe to read and write through a pointer derived from
+// reinterpret_cast<T __attribute__((__may_alias__)) *>, just as it is safe to
+// read and write through a pointer derived from reinterpret_cast<char *>.
+//
+// Note that even though ICC pretends to be compatible with GCC by defining
+// __GNUC__, it does *not* appear to support the __may_alias__ attribute.
+// (TODO(bolms): verify this if/when Emboss explicitly supports ICC.)
+//
+// Note the lack of parentheses around 't' in the expansion: unfortunately,
+// GCC's attribute syntax disallows parentheses in that particular position.
+#define EMBOSS_ALIAS_SAFE_POINTER_CAST(t, x) \
+ reinterpret_cast<t __attribute__((__may_alias__)) *>((x))
+#endif // !defined(__INTEL_COMPILER)
+
+// GCC supports __BYTE_ORDER__ of __ORDER_LITTLE_ENDIAN__, __ORDER_BIG_ENDIAN__,
+// and __ORDER_PDP_ENDIAN__. Since all available test systems are
+// __ORDER_LITTLE_ENDIAN__, only little-endian hosts get optimized code paths;
+// however, big-endian support ought to be trivial to add.
+//
+// There are no plans to support PDP-endian systems.
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+// EMBOSS_LITTLE_ENDIAN_TO_NATIVE and EMBOSS_BIG_ENDIAN_TO_NATIVE can be used to
+// fix up integers after a little- or big-endian value has been memcpy'ed into
+// them.
+//
+// On little-endian systems, no fixup is needed for little-endian sources, but
+// big-endian sources require a byte swap.
+#define EMBOSS_LITTLE_ENDIAN_TO_NATIVE(x) (x)
+#define EMBOSS_NATIVE_TO_LITTLE_ENDIAN(x) (x)
+#define EMBOSS_BIG_ENDIAN_TO_NATIVE(x) (::emboss::support::ByteSwap((x)))
+#define EMBOSS_NATIVE_TO_BIG_ENDIAN(x) (::emboss::support::ByteSwap((x)))
+// TODO(bolms): Find a way to test on a big-endian architecture, and add support
+// for __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+#endif // __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+
+// Prior to GCC 4.8, __builtin_bswap16 was not available on all platforms.
+// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52624
+//
+// Clang pretends to be an earlier GCC, but does support __builtin_bswap16.
+// Clang recommends using __has_builtin(__builtin_bswap16), but unfortunately
+// that fails to compile on GCC, even with defined(__has_builtin) &&
+// __has_builtin(__builtin_bswap16), so instead Emboss just checks for
+// defined(__clang__).
+#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8) || defined(__clang__)
+#define EMBOSS_BYTESWAP16(x) __builtin_bswap16((x))
+#endif // GCC >= 4.8 || defined(__clang__)
+#define EMBOSS_BYTESWAP32(x) __builtin_bswap32((x))
+#define EMBOSS_BYTESWAP64(x) __builtin_bswap64((x))
+
+#endif // defined(__GNUC__)
+#endif // !defined(EMBOSS_NO_OPTIMIZATIONS)
+
+#endif // EMBOSS_PUBLIC_EMBOSS_DEFINES_H_
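
Editorial sketch (not part of the patch): the comments above describe reading values with memcpy and then fixing up byte order. A minimal portable version of that idea is shown here. The helper names (`ByteSwap32`, `HostIsLittleEndian`, `ReadBigEndian32`, `ReadLittleEndian32`) are hypothetical, the byte order is detected at run time instead of through __BYTE_ORDER__, and PDP-endian hosts are ignored, as in the header.

```cpp
#include <cstdint>
#include <cstring>

namespace sketch {  // Hypothetical namespace; not part of Emboss.

// Portable 32-bit byte swap; a stand-in for __builtin_bswap32.
inline std::uint32_t ByteSwap32(std::uint32_t x) {
  return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8) |
         ((x & 0x00ff0000u) >> 8) | ((x & 0xff000000u) >> 24);
}

// Returns true if the host stores the least significant byte first.
inline bool HostIsLittleEndian() {
  const std::uint16_t probe = 1;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

// Reads 4 bytes through memcpy, which makes no alignment assumption and does
// not violate the aliasing rules, then swaps if the host order differs.
inline std::uint32_t ReadBigEndian32(const unsigned char *bytes) {
  std::uint32_t value;
  std::memcpy(&value, bytes, sizeof value);
  return HostIsLittleEndian() ? ByteSwap32(value) : value;
}

inline std::uint32_t ReadLittleEndian32(const unsigned char *bytes) {
  std::uint32_t value;
  std::memcpy(&value, bytes, sizeof value);
  return HostIsLittleEndian() ? value : ByteSwap32(value);
}

}  // namespace sketch
```

Reading from `buffer + 1` works with these helpers even though that address is usually misaligned for a 32-bit load; the alias-safe pointer cast described above trades that safety for speed on platforms where alignment is known.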
diff --git a/public/emboss_defines_test.cc b/public/emboss_defines_test.cc
new file mode 100644
index 0000000..198b0c4
--- /dev/null
+++ b/public/emboss_defines_test.cc
@@ -0,0 +1,59 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "public/emboss_defines.h"
+
+#include <cstdint>
+
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace support {
+namespace test {
+
+TEST(CheckPointerAlignment, Aligned) {
+ ::std::uint32_t t;
+ EMBOSS_DCHECK_POINTER_ALIGNMENT(&t, sizeof t, 0);
+ EMBOSS_DCHECK_POINTER_ALIGNMENT(&t, 1, 0);
+ EMBOSS_DCHECK_POINTER_ALIGNMENT(reinterpret_cast<char *>(&t) + 1, sizeof t,
+ 1);
+ EMBOSS_DCHECK_POINTER_ALIGNMENT(reinterpret_cast<char *>(&t) + 1, 1, 0);
+}
+
+TEST(CheckPointerAlignment, Misaligned) {
+ ::std::uint32_t t;
+ EXPECT_DEATH(EMBOSS_DCHECK_POINTER_ALIGNMENT(&t, sizeof t, 1), "");
+ EXPECT_DEATH(EMBOSS_DCHECK_POINTER_ALIGNMENT(reinterpret_cast<char *>(&t) + 1,
+ sizeof t, 0),
+ "");
+}
+
+#if EMBOSS_SYSTEM_IS_TWOS_COMPLEMENT
+TEST(SystemIsTwosComplement, CastToSigned) {
+ EXPECT_EQ(-static_cast</**/ ::std::int64_t>(0x80000000),
+ static_cast</**/ ::std::int32_t>(0x80000000));
+}
+#endif // EMBOSS_SYSTEM_IS_TWOS_COMPLEMENT
+
+// Note: I (bolms@) can't think of a way to truly test
+// EMBOSS_ALIAS_SAFE_POINTER_CAST, since the compiler might let it work even if
+// it's not "supposed" to. (E.g., even with -fstrict-aliasing, GCC doesn't
+// always take advantage of strict aliasing to do any optimizations.)
+
+// The native <=> fixed endian macros are tested in emboss_bit_util_test.cc,
+// since their expansions rely on emboss_bit_util.h.
+
+} // namespace test
+} // namespace support
+} // namespace emboss
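
Editorial sketch (not part of the patch): the alignment macros exercised by the tests above reduce to one modulo operation on the pointer's integer value. The function `HasAlignment` is a hypothetical name for this illustration, and, as the header notes, the pointer-to-integer mapping it relies on is implementation-defined.

```cpp
#include <cstdint>

namespace sketch {  // Hypothetical namespace; not part of Emboss.

// Returns true if the address of `p`, taken modulo `align`, equals `offset`.
// An offset of 0 means "aligned to `align`"; a non-zero offset describes a
// pointer that sits a known distance past an aligned address.
inline bool HasAlignment(const void *p, std::uintptr_t align,
                         std::uintptr_t offset) {
  return reinterpret_cast<std::uintptr_t>(p) % align == offset;
}

}  // namespace sketch
```

Returning a bool instead of asserting makes the check usable both in tests and in code that wants to pick a fast path only when the alignment holds.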
diff --git a/public/emboss_enum_view.h b/public/emboss_enum_view.h
new file mode 100644
index 0000000..6ed9453
--- /dev/null
+++ b/public/emboss_enum_view.h
@@ -0,0 +1,159 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// View class template for enums.
+#ifndef EMBOSS_PUBLIC_EMBOSS_ENUM_VIEW_H_
+#define EMBOSS_PUBLIC_EMBOSS_ENUM_VIEW_H_
+
+#include <cctype>
+#include <cstdint>
+#include <string>
+#include <utility>
+
+#include "public/emboss_text_util.h"
+#include "public/emboss_view_parameters.h"
+
+namespace emboss {
+namespace support {
+
+// EnumView is a view for Enums inside of bitfields.
+template <class Enum, class Parameters, class BitViewType>
+class EnumView final {
+ public:
+ using ValueType = typename ::std::remove_cv<Enum>::type;
+ static_assert(
+ Parameters::kBits <= sizeof(ValueType) * 8,
+ "EnumView requires sizeof(ValueType) * 8 >= Parameters::kBits.");
+ template <typename... Args>
+ explicit EnumView(Args &&... args) : buffer_{::std::forward<Args>(args)...} {}
+ EnumView() : buffer_() {}
+ EnumView(const EnumView &) = default;
+ EnumView(EnumView &&) = default;
+ EnumView &operator=(const EnumView &) = default;
+ EnumView &operator=(EnumView &&) = default;
+ ~EnumView() = default;
+
+ // TODO(bolms): Here and in CouldWriteValue(), the static_casts to ValueType
+ // rely on implementation-defined behavior when ValueType is signed.
+ ValueType Read() const {
+ ValueType result = static_cast<ValueType>(buffer_.ReadUInt());
+ EMBOSS_CHECK(Parameters::ValueIsOk(result));
+ return result;
+ }
+ ValueType UncheckedRead() const {
+ return static_cast<ValueType>(buffer_.UncheckedReadUInt());
+ }
+ void Write(ValueType value) const { EMBOSS_CHECK(TryToWrite(value)); }
+ bool TryToWrite(ValueType value) const {
+ if (!CouldWriteValue(value)) return false;
+ if (!IsComplete()) return false;
+ buffer_.WriteUInt(static_cast<typename BitViewType::ValueType>(value));
+ return true;
+ }
+ static constexpr bool CouldWriteValue(ValueType value) {
+ // The value can be written if:
+ //
+ // a) it can fit in BitViewType::ValueType (verified by casting to
+ // BitViewType::ValueType and back, and making sure that the value is
+ // unchanged)
+ //
+ // and either:
+ //
+ // b1) the field size is large enough to hold all values, or
+ // b2) the value is less than 2**(field size in bits)
+ return value == static_cast<ValueType>(
+ static_cast<typename BitViewType::ValueType>(value)) &&
+ ((Parameters::kBits ==
+ sizeof(typename BitViewType::ValueType) * 8) ||
+ (static_cast<typename BitViewType::ValueType>(value) <
+ ((static_cast<typename BitViewType::ValueType>(1)
+ << (Parameters::kBits - 1))
+ << 1))) &&
+ Parameters::ValueIsOk(value);
+ }
+ void UncheckedWrite(ValueType value) const {
+ buffer_.UncheckedWriteUInt(
+ static_cast<typename BitViewType::ValueType>(value));
+ }
+
+ template <typename OtherView>
+ void CopyFrom(const OtherView &other) const {
+ Write(other.Read());
+ }
+ template <typename OtherView>
+ void UncheckedCopyFrom(const OtherView &other) const {
+ UncheckedWrite(other.UncheckedRead());
+ }
+ template <typename OtherView>
+ bool TryToCopyFrom(const OtherView &other) const {
+ return other.Ok() && TryToWrite(other.Read());
+ }
+
+ // Ok() is true if the view IsComplete() and the stored value passes
+ // Parameters::ValueIsOk().
+ bool Ok() const {
+ return IsComplete() && Parameters::ValueIsOk(UncheckedRead());
+ }
+ template <class OtherBitViewType>
+ bool Equals(const EnumView<Enum, Parameters, OtherBitViewType> &other) const {
+ return Read() == other.Read();
+ }
+ template <class OtherBitViewType>
+ bool UncheckedEquals(
+ const EnumView<Enum, Parameters, OtherBitViewType> &other) const {
+ return UncheckedRead() == other.UncheckedRead();
+ }
+ bool IsComplete() const {
+ return buffer_.Ok() && buffer_.SizeInBits() >= Parameters::kBits;
+ }
+
+ template <class Stream>
+ bool UpdateFromTextStream(Stream *stream) const {
+ ::std::string token;
+ if (!ReadToken(stream, &token)) return false;
+ if (token.empty()) return false;
+ if (::std::isdigit(token[0])) {
+ ::std::uint64_t value;
+ if (!DecodeInteger(token, &value)) return false;
+ // TODO(bolms): Fix the static_cast<ValueType> for signed ValueType.
+ // TODO(bolms): Should values between 2**63 and 2**64-1 actually be
+ // allowed in the text format when ValueType is signed?
+ return TryToWrite(static_cast<ValueType>(value));
+ } else if (token[0] == '-') {
+ ::std::int64_t value;
+ if (!DecodeInteger(token, &value)) return false;
+ return TryToWrite(static_cast<ValueType>(value));
+ } else {
+ ValueType value;
+ if (!TryToGetEnumFromName(token.c_str(), &value)) return false;
+ return TryToWrite(value);
+ }
+ }
+
+ template <class Stream>
+ void WriteToTextStream(Stream *stream,
+ const TextOutputOptions &options) const {
+ ::emboss::support::WriteEnumViewToTextStream(this, stream, options);
+ }
+
+ static constexpr int SizeInBits() { return Parameters::kBits; }
+
+ private:
+ BitViewType buffer_;
+};
+
+} // namespace support
+} // namespace emboss
+
+#endif // EMBOSS_PUBLIC_EMBOSS_ENUM_VIEW_H_
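Review note on UpdateFromTextStream above: the method picks a parser from the first character of the token (a digit selects an unsigned parse, '-' a signed parse, anything else a name lookup). The sketch below illustrates that dispatch in isolation. It is not Emboss code: `Color`, `TryToGetColorFromName`, and `ParseColorToken` are hypothetical names, and `std::stoull`/`std::stoll` stand in for DecodeInteger, so digit separators such as `0x8000_0000` are not accepted here.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical enum standing in for a generated Emboss enum.
enum class Color : std::int64_t { kRed = 1, kGreen = 2 };

// Hypothetical name lookup, shaped like TryToGetEnumFromName in the test file.
inline bool TryToGetColorFromName(const char *name, Color *result) {
  if (!std::strcmp(name, "kRed")) {
    *result = Color::kRed;
    return true;
  }
  if (!std::strcmp(name, "kGreen")) {
    *result = Color::kGreen;
    return true;
  }
  return false;
}

// Parses a token as a number or an enum name. Returns false, leaving *result
// untouched, when the token is empty, malformed, or an unknown name.
inline bool ParseColorToken(const std::string &token, Color *result) {
  if (token.empty()) return false;
  const unsigned char first = static_cast<unsigned char>(token[0]);
  if (!std::isdigit(first) && first != '-') {
    return TryToGetColorFromName(token.c_str(), result);
  }
  try {
    std::size_t used = 0;
    if (first == '-') {
      // Negative tokens go through the signed parser.
      const long long value = std::stoll(token, &used, 0);
      if (used != token.size()) return false;
      *result = static_cast<Color>(value);
    } else {
      // Non-negative tokens go through the unsigned parser; base 0 accepts
      // decimal, octal, and 0x-prefixed hexadecimal.
      const unsigned long long value = std::stoull(token, &used, 0);
      if (used != token.size()) return false;
      *result = static_cast<Color>(static_cast<std::int64_t>(value));
    }
    return true;
  } catch (const std::exception &) {
    // std::stoll and std::stoull throw on unparsable or out-of-range input.
    return false;
  }
}
```

Checking `used` against the token length is what rejects inputs like `0y0`, where a valid numeric prefix is followed by junk.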
diff --git a/public/emboss_enum_view_test.cc b/public/emboss_enum_view_test.cc
new file mode 100644
index 0000000..c6c90d4
--- /dev/null
+++ b/public/emboss_enum_view_test.cc
@@ -0,0 +1,275 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "public/emboss_enum_view.h"
+
+#include "public/emboss_prelude.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace support {
+namespace test {
+
+template <size_t kBits>
+using LittleEndianBitBlockN =
+ BitBlock<LittleEndianByteOrderer<ReadWriteContiguousBuffer>, kBits>;
+
+enum class Foo : int64_t {
+ kMin = -0x7fffffffffffffffL - 1,
+ kOne = 1,
+ kTwo = 2,
+ kBig = 0x0e0f10,
+ kBigBackwards = 0x100f0e,
+ kReallyBig = 0x090a0b0c0d0e0f10L,
+ kReallyBigBackwards = 0x100f0e0d0c0b0a09L,
+ k2to24MinusOne = (1L << 24) - 1,
+ k2to24 = 1L << 24,
+ kMax = 0x7fffffffffffffffL,
+};
+
+static inline bool TryToGetEnumFromName(const char *name, Foo *result) {
+ if (!strcmp("kMin", name)) {
+ *result = Foo::kMin;
+ return true;
+ }
+ if (!strcmp("kOne", name)) {
+ *result = Foo::kOne;
+ return true;
+ }
+ if (!strcmp("kTwo", name)) {
+ *result = Foo::kTwo;
+ return true;
+ }
+ if (!strcmp("kBig", name)) {
+ *result = Foo::kBig;
+ return true;
+ }
+ if (!strcmp("kBigBackwards", name)) {
+ *result = Foo::kBigBackwards;
+ return true;
+ }
+ if (!strcmp("kReallyBig", name)) {
+ *result = Foo::kReallyBig;
+ return true;
+ }
+ if (!strcmp("kReallyBigBackwards", name)) {
+ *result = Foo::kReallyBigBackwards;
+ return true;
+ }
+ if (!strcmp("k2to24MinusOne", name)) {
+ *result = Foo::k2to24MinusOne;
+ return true;
+ }
+ if (!strcmp("k2to24", name)) {
+ *result = Foo::k2to24;
+ return true;
+ }
+ if (!strcmp("kMax", name)) {
+ *result = Foo::kMax;
+ return true;
+ }
+ return false;
+}
+
+static inline const char *TryToGetNameFromEnum(Foo value) {
+ switch (value) {
+ case Foo::kMin:
+ return "kMin";
+ case Foo::kOne:
+ return "kOne";
+ case Foo::kTwo:
+ return "kTwo";
+ case Foo::kBig:
+ return "kBig";
+ case Foo::kBigBackwards:
+ return "kBigBackwards";
+ case Foo::kReallyBig:
+ return "kReallyBig";
+ case Foo::kReallyBigBackwards:
+ return "kReallyBigBackwards";
+ case Foo::k2to24MinusOne:
+ return "k2to24MinusOne";
+ case Foo::k2to24:
+ return "k2to24";
+ case Foo::kMax:
+ return "kMax";
+ default:
+ return nullptr;
+ }
+}
+
+template <size_t kBits>
+using FooViewN = EnumView<Foo, FixedSizeViewParameters<kBits, AllValuesAreOk>,
+ LittleEndianBitBlockN<kBits>>;
+
+template <int kMaxBits>
+void CheckEnumViewSizeInBits() {
+ const int size_in_bits =
+ EnumView<Foo, FixedSizeViewParameters<kMaxBits, AllValuesAreOk>,
+ OffsetBitBlock<LittleEndianBitBlockN<64>>>::SizeInBits();
+ EXPECT_EQ(size_in_bits, kMaxBits);
+ return CheckEnumViewSizeInBits<kMaxBits - 1>();
+}
+
+template <>
+void CheckEnumViewSizeInBits<0>() {
+ return;
+}
+
+TEST(EnumView, SizeInBits) { CheckEnumViewSizeInBits<64>(); }
+
+TEST(EnumView, ValueType) {
+ using BitBlockType = OffsetBitBlock<LittleEndianBitBlockN<64>>;
+ EXPECT_TRUE(
+ (::std::is_same<Foo,
+ EnumView<Foo, FixedSizeViewParameters<8, AllValuesAreOk>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<Foo,
+ EnumView<Foo, FixedSizeViewParameters<6, AllValuesAreOk>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<Foo,
+ EnumView<Foo, FixedSizeViewParameters<33, AllValuesAreOk>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<Foo,
+ EnumView<Foo, FixedSizeViewParameters<64, AllValuesAreOk>,
+ BitBlockType>::ValueType>::value));
+}
+
+TEST(EnumView, CouldWriteValue) {
+ EXPECT_TRUE(FooViewN<64>::CouldWriteValue(Foo::kMax));
+ EXPECT_TRUE(FooViewN<64>::CouldWriteValue(Foo::kMax));
+ EXPECT_TRUE(FooViewN<24>::CouldWriteValue(Foo::k2to24MinusOne));
+ EXPECT_FALSE(FooViewN<24>::CouldWriteValue(Foo::k2to24));
+}
+
+TEST(EnumView, ReadAndWriteWithSufficientBuffer) {
+ ::std::vector<uint8_t> bytes = {
+ {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08}};
+ auto enum64_view = FooViewN<64>{ReadWriteContiguousBuffer{bytes.data(), 8}};
+ EXPECT_EQ(Foo::kReallyBig, enum64_view.Read());
+ EXPECT_EQ(Foo::kReallyBig, enum64_view.UncheckedRead());
+ enum64_view.Write(Foo::kReallyBigBackwards);
+ EXPECT_EQ((::std::vector<uint8_t>{0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
+ 0x10, 0x08}),
+ bytes);
+ enum64_view.UncheckedWrite(Foo::kReallyBig);
+ EXPECT_EQ((::std::vector<uint8_t>{0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a,
+ 0x09, 0x08}),
+ bytes);
+ EXPECT_TRUE(enum64_view.TryToWrite(Foo::kReallyBigBackwards));
+ EXPECT_EQ((::std::vector<uint8_t>{0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
+ 0x10, 0x08}),
+ bytes);
+ EXPECT_TRUE(enum64_view.Ok());
+ EXPECT_TRUE(enum64_view.IsComplete());
+}
+
+TEST(EnumView, ReadAndWriteWithInsufficientBuffer) {
+ ::std::vector<uint8_t> bytes = {
+ {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08}};
+ auto enum64_view = FooViewN<64>{ReadWriteContiguousBuffer{bytes.data(), 4}};
+ EXPECT_DEATH(enum64_view.Read(), "");
+ EXPECT_EQ(Foo::kReallyBig, enum64_view.UncheckedRead());
+ EXPECT_DEATH(enum64_view.Write(Foo::kReallyBigBackwards), "");
+ EXPECT_FALSE(enum64_view.TryToWrite(Foo::kReallyBigBackwards));
+ EXPECT_EQ((::std::vector<uint8_t>{0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a,
+ 0x09, 0x08}),
+ bytes);
+ enum64_view.UncheckedWrite(Foo::kReallyBigBackwards);
+ EXPECT_EQ((::std::vector<uint8_t>{0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
+ 0x10, 0x08}),
+ bytes);
+ EXPECT_FALSE(enum64_view.Ok());
+ EXPECT_FALSE(enum64_view.IsComplete());
+}
+
+TEST(EnumView, NonPowerOfTwoSize) {
+ ::std::vector<uint8_t> bytes = {{0x10, 0x0f, 0x0e, 0x0d}};
+ auto enum24_view = FooViewN<24>{ReadWriteContiguousBuffer{bytes.data(), 3}};
+ EXPECT_EQ(Foo::kBig, enum24_view.Read());
+ EXPECT_EQ(Foo::kBig, enum24_view.UncheckedRead());
+ enum24_view.Write(Foo::kBigBackwards);
+ EXPECT_EQ((::std::vector<uint8_t>{0x0e, 0x0f, 0x10, 0x0d}), bytes);
+ EXPECT_DEATH(enum24_view.Write(Foo::k2to24), "");
+ enum24_view.UncheckedWrite(Foo::k2to24);
+ EXPECT_EQ((::std::vector<uint8_t>{0x00, 0x00, 0x00, 0x0d}), bytes);
+ EXPECT_TRUE(enum24_view.Ok());
+ EXPECT_TRUE(enum24_view.IsComplete());
+}
+
+TEST(EnumView, NonPowerOfTwoSizeInsufficientBuffer) {
+ ::std::vector<uint8_t> bytes = {{0x10, 0x0f, 0x0e, 0x0d}};
+ auto enum24_view = FooViewN<24>{ReadWriteContiguousBuffer{bytes.data(), 2}};
+ EXPECT_DEATH(enum24_view.Read(), "");
+ EXPECT_EQ(Foo::kBig, enum24_view.UncheckedRead());
+ EXPECT_DEATH(enum24_view.Write(Foo::kBigBackwards), "");
+ enum24_view.UncheckedWrite(Foo::kBigBackwards);
+ EXPECT_EQ((::std::vector<uint8_t>{0x0e, 0x0f, 0x10, 0x0d}), bytes);
+ EXPECT_FALSE(enum24_view.Ok());
+ EXPECT_FALSE(enum24_view.IsComplete());
+}
+
+TEST(EnumView, UpdateFromText) {
+ ::std::vector<uint8_t> bytes = {
+ {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08}};
+ const auto enum64_view =
+ FooViewN<64>{ReadWriteContiguousBuffer{bytes.data(), 8}};
+ EXPECT_TRUE(UpdateFromText(enum64_view, "kBig"));
+ EXPECT_EQ(Foo::kBig, enum64_view.Read());
+ EXPECT_TRUE(UpdateFromText(enum64_view, "k2to24"));
+ EXPECT_EQ(Foo::k2to24, enum64_view.Read());
+ EXPECT_FALSE(UpdateFromText(enum64_view, "k2to24M"));
+ EXPECT_EQ(Foo::k2to24, enum64_view.Read());
+ EXPECT_TRUE(UpdateFromText(enum64_view, "k2to24MinusOne"));
+ EXPECT_EQ(Foo::k2to24MinusOne, enum64_view.Read());
+ EXPECT_TRUE(UpdateFromText(enum64_view, "0x0e0f10"));
+ EXPECT_EQ(Foo::kBig, enum64_view.Read());
+ EXPECT_TRUE(UpdateFromText(enum64_view, "0x7654321"));
+ EXPECT_EQ(static_cast<Foo>(0x7654321), enum64_view.Read());
+ EXPECT_FALSE(UpdateFromText(enum64_view, "0y0"));
+ EXPECT_EQ(static_cast<Foo>(0x7654321), enum64_view.Read());
+ EXPECT_FALSE(UpdateFromText(enum64_view, "-x"));
+ EXPECT_EQ(static_cast<Foo>(0x7654321), enum64_view.Read());
+ EXPECT_TRUE(UpdateFromText(enum64_view, "-0x8000_0000_0000_0000"));
+ EXPECT_EQ(Foo::kMin, enum64_view.Read());
+}
+
+TEST(EnumView, WriteToText) {
+ ::std::vector<uint8_t> bytes = {
+ {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08}};
+ const auto enum64_view =
+ FooViewN<64>{ReadWriteContiguousBuffer{bytes.data(), 8}};
+ EXPECT_EQ("kReallyBig", WriteToString(enum64_view));
+ EXPECT_EQ("kReallyBig # 651345242494996240",
+ WriteToString(enum64_view, TextOutputOptions().WithComments(true)));
+ EXPECT_EQ("kReallyBig # 0x90a0b0c0d0e0f10",
+ WriteToString(
+ enum64_view,
+ TextOutputOptions().WithComments(true).WithNumericBase(16)));
+ enum64_view.Write(static_cast<Foo>(123));
+ EXPECT_EQ("123", WriteToString(enum64_view));
+ EXPECT_EQ("123",
+ WriteToString(enum64_view, TextOutputOptions().WithComments(true)));
+ EXPECT_EQ("0x7b",
+ WriteToString(
+ enum64_view,
+ TextOutputOptions().WithComments(true).WithNumericBase(16)));
+}
+
+} // namespace test
+} // namespace support
+} // namespace emboss
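Review note on the tests above: the expected byte vectors follow from little-endian layout, where the least significant byte is stored first. The sketch below reproduces that arithmetic with two hypothetical helpers, `ReadLittleEndian` and `WriteLittleEndian`. They are illustrations only and assume at most 8 bytes per value; Emboss's own accessors live in emboss_memory_util.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: assembles `size` bytes (at most 8), least significant
// byte first, into an unsigned value.
inline std::uint64_t ReadLittleEndian(const std::uint8_t *bytes,
                                      std::size_t size) {
  std::uint64_t result = 0;
  for (std::size_t i = 0; i < size; ++i) {
    result |= static_cast<std::uint64_t>(bytes[i]) << (i * 8);
  }
  return result;
}

// Hypothetical helper: stores the low `size` bytes of `value`, least
// significant byte first. Higher bytes of `value` are silently dropped.
inline void WriteLittleEndian(std::uint8_t *bytes, std::size_t size,
                              std::uint64_t value) {
  for (std::size_t i = 0; i < size; ++i) {
    bytes[i] = static_cast<std::uint8_t>(value >> (i * 8));
  }
}
```

Reading the first eight test bytes {0x10, 0x0f, ..., 0x09} this way gives 0x090a0b0c0d0e0f10 (kReallyBig), and reading only three gives 0x0e0f10 (kBig). Writing 1 << 24 into three bytes leaves all three zero, which matches what the NonPowerOfTwoSize test expects from UncheckedWrite(Foo::k2to24).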
diff --git a/public/emboss_maybe.h b/public/emboss_maybe.h
new file mode 100644
index 0000000..a11ba39
--- /dev/null
+++ b/public/emboss_maybe.h
@@ -0,0 +1,76 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Definition of the Maybe<T> template class.
+#ifndef EMBOSS_PUBLIC_EMBOSS_MAYBE_H_
+#define EMBOSS_PUBLIC_EMBOSS_MAYBE_H_
+
+#include <utility>
+
+#include "public/emboss_defines.h"
+
+namespace emboss {
+// TODO(bolms): Should Maybe be a public type (i.e., live in ::emboss)?
+namespace support {
+
+// Maybe<T> is similar to, but much more restricted than, C++17's std::optional.
+// It is intended for use in Emboss's expression system, wherein a non-Known()
+// Maybe<T> will usually (but not always) poison the result of an operation.
+//
+// As such, Maybe<> is intended for use with small, copyable T's: specifically,
+// integers, enums, and booleans. It may not perform well with other types.
+template <typename T>
+class Maybe final {
+ public:
+ constexpr Maybe() : value_(), known_(false) {}
+ constexpr explicit Maybe(T value)
+ : value_(::std::move(value)), known_(true) {}
+ constexpr Maybe(const Maybe<T> &) = default;
+ ~Maybe() = default;
+ Maybe &operator=(const Maybe &) = default;
+ Maybe &operator=(T &&value) {
+ value_ = ::std::move(value);
+ known_ = true;
+ return *this;
+ }
+ Maybe &operator=(const T &value) {
+ value_ = value;
+ known_ = true;
+ return *this;
+ }
+
+ constexpr bool Known() const { return known_; }
+ T Value() const {
+ EMBOSS_CHECK(Known());
+ return value_;
+ }
+ constexpr T ValueOr(T default_value) const {
+ return known_ ? value_ : default_value;
+ }
+ // A non-Known() Maybe value-initializes value_ to a default (by explicitly
+ // calling the nullary constructor on value_ in the initializer list), so it
+ // is safe to just return value_ here. For integral types and enums, value_
+ // will be 0, for bool it will be false, and for other types it depends on the
+ // constructor's behavior.
+ constexpr T ValueOrDefault() const { return value_; }
+
+ private:
+ T value_;
+ bool known_;
+};
+
+} // namespace support
+} // namespace emboss
+
+#endif // EMBOSS_PUBLIC_EMBOSS_MAYBE_H_
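Review note on Maybe<T> above: the class comment says an unknown operand "usually (but not always)" poisons the result of an operation. The sketch below shows one way that could look. `SimpleMaybe`, `Sum`, and `And` are hypothetical stand-ins written for this note, not Emboss's expression functions. `Sum` is unknown if either operand is unknown, while `And` can still produce a known `false` when one operand is a known `false`.

```cpp
#include <cassert>

// Hypothetical, trimmed-down stand-in for Maybe<T>.
template <typename T>
class SimpleMaybe {
 public:
  constexpr SimpleMaybe() : value_(), known_(false) {}
  constexpr explicit SimpleMaybe(T value) : value_(value), known_(true) {}
  constexpr bool Known() const { return known_; }
  constexpr T ValueOr(T fallback) const { return known_ ? value_ : fallback; }
  constexpr T ValueOrDefault() const { return value_; }

 private:
  T value_;
  bool known_;
};

// Addition is poisoned by any unknown operand.
constexpr SimpleMaybe<int> Sum(SimpleMaybe<int> a, SimpleMaybe<int> b) {
  return (a.Known() && b.Known())
             ? SimpleMaybe<int>(a.ValueOrDefault() + b.ValueOrDefault())
             : SimpleMaybe<int>();
}

// Logical AND is not always poisoned: a known false operand decides the
// result regardless of the other operand.
constexpr SimpleMaybe<bool> And(SimpleMaybe<bool> a, SimpleMaybe<bool> b) {
  return ((a.Known() && !a.ValueOrDefault()) ||
          (b.Known() && !b.ValueOrDefault()))
             ? SimpleMaybe<bool>(false)
             : (a.Known() && b.Known()) ? SimpleMaybe<bool>(true)
                                        : SimpleMaybe<bool>();
}
```

Each function is a single return statement so that it stays valid as a C++11 constexpr function, the dialect selected in .bazelrc.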
diff --git a/public/emboss_maybe_test.cc b/public/emboss_maybe_test.cc
new file mode 100644
index 0000000..acbf917
--- /dev/null
+++ b/public/emboss_maybe_test.cc
@@ -0,0 +1,61 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "public/emboss_maybe.h"
+
+#include <gmock/gmock.h>
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace support {
+namespace test {
+
+enum class Foo : ::std::int64_t {
+ BAR = 1,
+ BAZ = 2,
+};
+
+TEST(Maybe, Known) {
+ EXPECT_TRUE(Maybe<int>(10).Known());
+ EXPECT_EQ(10, Maybe<int>(10).ValueOr(3));
+ EXPECT_EQ(10, Maybe<int>(10).ValueOrDefault());
+ EXPECT_EQ(10, Maybe<int>(10).Value());
+ EXPECT_TRUE(Maybe<bool>(true).Value());
+ EXPECT_EQ(Foo::BAZ, Maybe<Foo>(Foo::BAZ).ValueOrDefault());
+
+ Maybe<int> x = Maybe<int>(1000);
+ Maybe<int> y = Maybe<int>();
+ y = x;
+ EXPECT_TRUE(y.Known());
+ EXPECT_EQ(1000, y.Value());
+}
+
+TEST(Maybe, Unknown) {
+ EXPECT_FALSE(Maybe<int>().Known());
+ EXPECT_EQ(3, Maybe<int>().ValueOr(3));
+ EXPECT_EQ(0, Maybe<int>().ValueOrDefault());
+ EXPECT_FALSE(Maybe<bool>().ValueOrDefault());
+ EXPECT_DEATH(Maybe<int>().Value(), "Known()");
+ EXPECT_TRUE(Maybe<bool>().ValueOr(true));
+ EXPECT_EQ(static_cast<Foo>(0), Maybe<Foo>().ValueOrDefault());
+
+ Maybe<int> x = Maybe<int>();
+ Maybe<int> y = Maybe<int>(1000);
+ y = x;
+ EXPECT_FALSE(y.Known());
+}
+
+} // namespace test
+} // namespace support
+} // namespace emboss
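Review note for the emboss_memory_util.h header that follows in this patch: its OffsetStorageType alias derives a sub-buffer's alignment as gcd(kAlignment, kSubAlignment) and its offset as (kOffset + kSubOffset) modulo that result. The sketch below works through that arithmetic at runtime-checkable scale. `Alignment`, `Gcd`, and `SubBufferAlignment` are hypothetical names for this note; only the formula is taken from the header.

```cpp
#include <cassert>
#include <cstddef>

// Euclidean GCD in C++11-constexpr form, mirroring GreatestCommonDivisor.
constexpr std::size_t Gcd(std::size_t a, std::size_t b) {
  return a == 0 ? b : Gcd(b % a, a);
}

// Hypothetical pair: an address is `offset` bytes past a multiple of
// `alignment` bytes.
struct Alignment {
  std::size_t alignment;
  std::size_t offset;
};

// Combines what is known about the parent buffer's address with what is known
// about the byte offset into it.
constexpr Alignment SubBufferAlignment(Alignment parent, Alignment sub) {
  return Alignment{
      Gcd(parent.alignment, sub.alignment),
      (parent.offset + sub.offset) % Gcd(parent.alignment, sub.alignment)};
}
```

For example, an 8-byte-aligned parent combined with an offset known to be 2 past a multiple of 4 yields alignment 4, offset 2. A sub-alignment of 0 (the "offset is exactly kSubOffset" case) keeps the parent's alignment, because gcd(8, 0) is 8.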
diff --git a/public/emboss_memory_util.h b/public/emboss_memory_util.h
new file mode 100644
index 0000000..eb91138
--- /dev/null
+++ b/public/emboss_memory_util.h
@@ -0,0 +1,968 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Utilities for efficiently reading and writing to/from memory.
+#ifndef EMBOSS_PUBLIC_EMBOSS_MEMORY_UTIL_H_
+#define EMBOSS_PUBLIC_EMBOSS_MEMORY_UTIL_H_
+
+#include <algorithm>
+#include <cstddef>
+#include <cstring>
+
+#include "public/emboss_bit_util.h"
+#include "public/emboss_cpp_types.h"
+#include "public/emboss_defines.h"
+
+namespace emboss {
+namespace support {
+
+// MemoryAccessor reads and writes big- and little-endian unsigned integers in
+// and out of memory, using optimized routines where possible.
+//
+// The default MemoryAccessor just proxies to the MemoryAccessor with the
+// next-smallest alignment and equivalent offset: MemoryAccessor<C, 8, 0, 32>
+// and MemoryAccessor<C, 8, 4, 32> will proxy to MemoryAccessor<C, 4, 0, 32>,
+// since an 8-byte-aligned pointer is also 4-byte-aligned, as is a pointer that
+// is 4 bytes away from 8-byte alignment.
+template <typename CharT, ::std::size_t kAlignment, ::std::size_t kOffset,
+ ::std::size_t kBits>
+struct MemoryAccessor {
+ static_assert(IsPowerOfTwo(kAlignment),
+ "MemoryAccessor requires power-of-two alignment.");
+ static_assert(
+ kOffset < kAlignment,
+ "MemoryAccessor requires offset to be strictly less than alignment.");
+
+ using ChainedAccessor =
+ MemoryAccessor<CharT, kAlignment / 2, kOffset % (kAlignment / 2), kBits>;
+ using Unsigned = typename LeastWidthInteger<kBits>::Unsigned;
+ static inline Unsigned ReadLittleEndianUInt(const CharT *bytes) {
+ return ChainedAccessor::ReadLittleEndianUInt(bytes);
+ }
+ static inline void WriteLittleEndianUInt(CharT *bytes, Unsigned value) {
+ ChainedAccessor::WriteLittleEndianUInt(bytes, value);
+ }
+ static inline Unsigned ReadBigEndianUInt(const CharT *bytes) {
+ return ChainedAccessor::ReadBigEndianUInt(bytes);
+ }
+ static inline void WriteBigEndianUInt(CharT *bytes, Unsigned value) {
+ ChainedAccessor::WriteBigEndianUInt(bytes, value);
+ }
+};
+
+// The least-aligned case for MemoryAccessor is 8-bit alignment, and the default
+// version of MemoryAccessor will devolve to this one if there is no more
+// specific override.
+//
+// If the system byte order is known, then these routines can use memcpy and
+// (possibly) a byte swap; otherwise they can read individual bytes and
+// shift+or them together in the appropriate order. I (bolms@) haven't found a
+// compiler that will optimize the multiple reads, shifts, and ors into a single
+// read, so the memcpy version is *much* faster for 32-bit and larger reads.
+template <typename CharT, ::std::size_t kBits>
+struct MemoryAccessor<CharT, 1, 0, kBits> {
+ static_assert(kBits % 8 == 0,
+ "MemoryAccessor can only read and write whole-byte values.");
+ static_assert(IsChar<CharT>::value,
+ "MemoryAccessor can only be used on pointers to char types.");
+
+ using Unsigned = typename LeastWidthInteger<kBits>::Unsigned;
+
+#if defined(EMBOSS_LITTLE_ENDIAN_TO_NATIVE)
+ static inline Unsigned ReadLittleEndianUInt(const CharT *bytes) {
+ Unsigned result = 0;
+ ::std::memcpy(&result, bytes, kBits / 8);
+ return EMBOSS_LITTLE_ENDIAN_TO_NATIVE(result);
+ }
+#else
+ static inline Unsigned ReadLittleEndianUInt(const CharT *bytes) {
+ Unsigned result = 0;
+ for (decltype(kBits) i = 0; i < kBits / 8; ++i) {
+ result |= static_cast<Unsigned>(static_cast<uint8_t>(bytes[i])) << i * 8;
+ }
+ return result;
+ }
+#endif
+
+#if defined(EMBOSS_NATIVE_TO_LITTLE_ENDIAN)
+ static inline void WriteLittleEndianUInt(CharT *bytes, Unsigned value) {
+ value = EMBOSS_NATIVE_TO_LITTLE_ENDIAN(value);
+ ::std::memcpy(bytes, &value, kBits / 8);
+ }
+#else
+ static inline void WriteLittleEndianUInt(CharT *bytes, Unsigned value) {
+ for (decltype(kBits) i = 0; i < kBits / 8; ++i) {
+ bytes[i] = static_cast<CharT>(static_cast<uint8_t>(value));
+ if (sizeof value > 1) {
+ // Shifting an 8-bit type by 8 bits is undefined behavior, so skip this
+ // step for uint8_t.
+ value >>= 8;
+ }
+ }
+ }
+#endif
+
+#if defined(EMBOSS_BIG_ENDIAN_TO_NATIVE)
+ static inline Unsigned ReadBigEndianUInt(const CharT *bytes) {
+ Unsigned result = 0;
+ // When a big-endian source integer is smaller than the result, the source
+ // bytes must be copied into the final bytes of the destination. This is
+ // true whether the host is big- or little-endian.
+ //
+ // For a little-endian host:
+ //
+ // source (big-endian value 0x112233):
+ //
+ // byte 0 byte 1 byte 2
+ // +--------+--------+--------+
+ // | 0x11 | 0x22 | 0x33 |
+ // +--------+--------+--------+
+ //
+ // result after memcpy (host-interpreted value 0x33221100):
+ //
+ // byte 0 byte 1 byte 2 byte 3
+ // +--------+--------+--------+--------+
+ // | 0x00 | 0x11 | 0x22 | 0x33 |
+ // +--------+--------+--------+--------+
+ //
+ // result after 32-bit byte swap (host-interpreted value 0x112233):
+ //
+ // byte 0 byte 1 byte 2 byte 3
+ // +--------+--------+--------+--------+
+ // | 0x33 | 0x22 | 0x11 | 0x00 |
+ // +--------+--------+--------+--------+
+ //
+ // For a big-endian host:
+ //
+ // source (value 0x112233):
+ //
+ // byte 0 byte 1 byte 2
+ // +--------+--------+--------+
+ // | 0x11 | 0x22 | 0x33 |
+ // +--------+--------+--------+
+ //
+ // result after memcpy (value 0x112233) -- no byte swap needed:
+ //
+ // byte 0 byte 1 byte 2 byte 3
+ // +--------+--------+--------+--------+
+ // | 0x00 | 0x11 | 0x22 | 0x33 |
+ // +--------+--------+--------+--------+
+ ::std::memcpy(reinterpret_cast<char *>(&result) + sizeof result - kBits / 8,
+ bytes, kBits / 8);
+ result = EMBOSS_BIG_ENDIAN_TO_NATIVE(result);
+ return result;
+ }
+#else
+ static inline Unsigned ReadBigEndianUInt(const CharT *bytes) {
+ Unsigned result = 0;
+ for (decltype(kBits) i = 0; i < kBits / 8; ++i) {
+ result |= static_cast<Unsigned>(static_cast</**/::std::uint8_t>(bytes[i]))
+ << (kBits - 8 - i * 8);
+ }
+ return result;
+ }
+#endif
+
+#if defined(EMBOSS_NATIVE_TO_BIG_ENDIAN)
+ static inline void WriteBigEndianUInt(CharT *bytes, Unsigned value) {
+ value = EMBOSS_NATIVE_TO_BIG_ENDIAN(value);
+ ::std::memcpy(bytes,
+ reinterpret_cast<char *>(&value) + sizeof value - kBits / 8,
+ kBits / 8);
+ }
+#else
+ static inline void WriteBigEndianUInt(CharT *bytes, Unsigned value) {
+ for (decltype(kBits) i = 0; i < kBits / 8; ++i) {
+ bytes[kBits / 8 - 1 - i] =
+ static_cast<CharT>(static_cast</**/ ::std::uint8_t>(value));
+ if (sizeof value > 1) {
+ // Shifting an 8-bit type by 8 bits is undefined behavior, so skip this
+ // step for uint8_t.
+ value >>= 8;
+ }
+ }
+ }
+#endif
+};
+
+// Specializations of MemoryAccessor for 16-, 32-, and 64-bit-aligned reads and
+// writes, using EMBOSS_ALIAS_SAFE_POINTER_CAST instead of memcpy.
+#if defined(EMBOSS_ALIAS_SAFE_POINTER_CAST) && \
+ defined(EMBOSS_LITTLE_ENDIAN_TO_NATIVE) && \
+ defined(EMBOSS_BIG_ENDIAN_TO_NATIVE) && \
+ defined(EMBOSS_NATIVE_TO_LITTLE_ENDIAN) && \
+ defined(EMBOSS_NATIVE_TO_BIG_ENDIAN)
+template <typename CharT>
+struct MemoryAccessor<CharT, 8, 0, 64> {
+ static inline ::std::uint64_t ReadLittleEndianUInt(const CharT *bytes) {
+ return EMBOSS_LITTLE_ENDIAN_TO_NATIVE(
+ *EMBOSS_ALIAS_SAFE_POINTER_CAST(const ::std::uint64_t, bytes));
+ }
+
+ static inline void WriteLittleEndianUInt(CharT *bytes,
+ ::std::uint64_t value) {
+ *EMBOSS_ALIAS_SAFE_POINTER_CAST(::std::uint64_t, bytes) =
+ EMBOSS_NATIVE_TO_LITTLE_ENDIAN(value);
+ }
+
+ static inline ::std::uint64_t ReadBigEndianUInt(const CharT *bytes) {
+ return EMBOSS_BIG_ENDIAN_TO_NATIVE(
+ *EMBOSS_ALIAS_SAFE_POINTER_CAST(const ::std::uint64_t, bytes));
+ }
+
+ static inline void WriteBigEndianUInt(CharT *bytes, ::std::uint64_t value) {
+ *EMBOSS_ALIAS_SAFE_POINTER_CAST(::std::uint64_t, bytes) =
+ EMBOSS_NATIVE_TO_BIG_ENDIAN(value);
+ }
+};
+
+template <typename CharT>
+struct MemoryAccessor<CharT, 4, 0, 32> {
+ static inline ::std::uint32_t ReadLittleEndianUInt(const CharT *bytes) {
+ return EMBOSS_LITTLE_ENDIAN_TO_NATIVE(
+ *EMBOSS_ALIAS_SAFE_POINTER_CAST(const ::std::uint32_t, bytes));
+ }
+
+ static inline void WriteLittleEndianUInt(CharT *bytes,
+ ::std::uint32_t value) {
+ *EMBOSS_ALIAS_SAFE_POINTER_CAST(::std::uint32_t, bytes) =
+ EMBOSS_NATIVE_TO_LITTLE_ENDIAN(value);
+ }
+
+ static inline ::std::uint32_t ReadBigEndianUInt(const CharT *bytes) {
+ return EMBOSS_BIG_ENDIAN_TO_NATIVE(
+ *EMBOSS_ALIAS_SAFE_POINTER_CAST(const ::std::uint32_t, bytes));
+ }
+
+ static inline void WriteBigEndianUInt(CharT *bytes, ::std::uint32_t value) {
+ *EMBOSS_ALIAS_SAFE_POINTER_CAST(::std::uint32_t, bytes) =
+ EMBOSS_NATIVE_TO_BIG_ENDIAN(value);
+ }
+};
+
+template <typename CharT>
+struct MemoryAccessor<CharT, 2, 0, 16> {
+ static inline ::std::uint16_t ReadLittleEndianUInt(const CharT *bytes) {
+ return EMBOSS_LITTLE_ENDIAN_TO_NATIVE(
+ *EMBOSS_ALIAS_SAFE_POINTER_CAST(const ::std::uint16_t, bytes));
+ }
+
+ static inline void WriteLittleEndianUInt(CharT *bytes,
+ ::std::uint16_t value) {
+ *EMBOSS_ALIAS_SAFE_POINTER_CAST(::std::uint16_t, bytes) =
+ EMBOSS_NATIVE_TO_LITTLE_ENDIAN(value);
+ }
+
+ static inline ::std::uint16_t ReadBigEndianUInt(const CharT *bytes) {
+ return EMBOSS_BIG_ENDIAN_TO_NATIVE(
+ *EMBOSS_ALIAS_SAFE_POINTER_CAST(const ::std::uint16_t, bytes));
+ }
+
+ static inline void WriteBigEndianUInt(CharT *bytes, ::std::uint16_t value) {
+ *EMBOSS_ALIAS_SAFE_POINTER_CAST(::std::uint16_t, bytes) =
+ EMBOSS_NATIVE_TO_BIG_ENDIAN(value);
+ }
+};
+#endif // defined(EMBOSS_ALIAS_SAFE_POINTER_CAST) &&
+ // defined(EMBOSS_LITTLE_ENDIAN_TO_NATIVE) &&
+ // defined(EMBOSS_BIG_ENDIAN_TO_NATIVE) &&
+ // defined(EMBOSS_NATIVE_TO_LITTLE_ENDIAN) &&
+ // defined(EMBOSS_NATIVE_TO_BIG_ENDIAN)
+
+// This is the Euclidean GCD algorithm, in C++11-constexpr-safe form. The
+// initial is-b-greater-than-a-if-so-swap is omitted, since gcd(b % a, a) is the
+// same as gcd(b, a) when a > b.
+inline constexpr ::std::size_t GreatestCommonDivisor(::std::size_t a,
+ ::std::size_t b) {
+ return a == 0 ? b : GreatestCommonDivisor(b % a, a);
+}
+
+// ContiguousBuffer is a direct view of a fixed number of contiguous bytes in
+// memory. If Byte is a const type, it will be a read-only view; if Byte is
+// non-const, then writes will be allowed.
+//
+// The kAlignment and kOffset parameters are used to optimize certain reads and
+// writes. reinterpret_cast<uintptr_t>(bytes_) % kAlignment must equal kOffset.
+//
+// This class is used extensively by generated code, and is not intended to be
+// heavily used by hand-written code -- some interfaces can be tricky to call
+// correctly.
+template <typename Byte, ::std::size_t kAlignment, ::std::size_t kOffset>
+class ContiguousBuffer final {
+ // There aren't many systems with non-8-bit chars, and a quirk of POSIX
+ // requires that POSIX C systems have CHAR_BIT == 8, but some DSPs use wider
+ // chars.
+ static_assert(CHAR_BIT == 8, "ContiguousBuffer requires 8-bit chars.");
+
+ // ContiguousBuffer assumes that its backing store is byte-oriented. The
+ // previous check ensures that chars are 8 bits, and this one ensures that the
+ // backing store uses chars.
+ //
+ // Note that this check is explicitly that Byte is one of the three standard
+ // char types, and not that (say) it is a one-byte type with an assignment
+ // operator that can be static_cast<> to and from uint8_t. I (bolms@) have
+ // chosen to lock it down to just char types to avoid running afoul of strict
+ // aliasing rules anywhere.
+ //
+ // Of somewhat academic interest, uint8_t is not required to be a char type
+ // (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66110#c10), though it is
+ // unlikely that any compiler vendor will actually change it, as there is
+ // probably enough real-world code that relies on uint8_t being allowed to
+ // alias.
+ static_assert(IsChar<Byte>::value, "ContiguousBuffer requires char type.");
+
+ // Because real-world processors only care about power-of-2 alignments,
+ // ContiguousBuffer only supports power-of-2 alignments. Note that
+ // GetOffsetStorage can handle non-power-of-2 alignments.
+ static_assert(IsPowerOfTwo(kAlignment),
+ "ContiguousBuffer requires power-of-two alignment.");
+
+ // To avoid template variant explosion, ContiguousBuffer requires kOffset to
+ // be strictly less than kAlignment. Users of ContiguousBuffer are expected
+ // to take the modulus of kOffset by kAlignment before passing it in as a
+ // parameter.
+ static_assert(
+ kOffset < kAlignment,
+ "ContiguousBuffer requires offset to be strictly less than alignment.");
+
+ public:
+ using ByteType = Byte;
+ // OffsetStorageType<kSubAlignment, kSubOffset> is the return type of
+ // GetOffsetStorage<kSubAlignment, kSubOffset>(...). This is used in a number
+ // of places in generated code to specify deeply-nested template values.
+ //
+ // In theory, anything that cared about this type could use
+ // decltype(declval<ContiguousBuffer<...>>().GetOffsetStorage<kSubAlignment,
+ // kSubOffset>(0, 0)) instead, but that is much more cumbersome, and it
+ // appears that at least some versions of GCC do not handle it correctly.
+ template </**/ ::std::size_t kSubAlignment, ::std::size_t kSubOffset>
+ using OffsetStorageType =
+ ContiguousBuffer<Byte, GreatestCommonDivisor(kAlignment, kSubAlignment),
+ (kOffset + kSubOffset) %
+ GreatestCommonDivisor(kAlignment, kSubAlignment)>;
+
+ // Constructs a default ContiguousBuffer.
+ ContiguousBuffer() : bytes_(nullptr), size_(0) {}
+
+ // Constructs a ContiguousBuffer from a contiguous container type over some
+ // `char` type, such as std::string, std::vector<signed char>,
+ // std::array<unsigned char, N>, or std::string_view.
+ //
+ // This template is only enabled if:
+ //
+ // 1. bytes->data() returns a pointer to some char type.
+ // 2. Byte is at least as cv-qualified as decltype(*bytes->data()).
+ //
+ // The first requirement means that this constructor won't work on, e.g.,
+ // std::vector<int> -- this is mostly a precautionary measure, since
+ // ContiguousBuffer only uses alias-safe operations anyway.
+ //
+ // The second requirement means that const and volatile are respected in the
+ // expected way: a ContiguousBuffer<const unsigned char, ...> may be
+ // initialized from std::vector<char>, but a ContiguousBuffer<unsigned char,
+ // ...> may not be initialized from std::string_view.
+ template <
+ typename T,
+ typename = typename ::std::enable_if<
+ IsChar<typename ::std::remove_cv<typename ::std::remove_reference<
+ decltype(*(::std::declval<T>().data()))>::type>::type>::value
+ && ::std::is_same<
+ typename AddSourceCV<decltype(*::std::declval<T>().data()),
+ Byte>::Type,
+ Byte>::value>::type>
+ explicit ContiguousBuffer(T *bytes)
+ : bytes_{reinterpret_cast<Byte *>(bytes->data())}, size_{bytes->size()} {
+ if (bytes_ != nullptr)
+ EMBOSS_DCHECK_POINTER_ALIGNMENT(bytes_, kAlignment, kOffset);
+ }
+
+ // Constructs a ContiguousBuffer from a pointer to a char type and a size. As
+ // with the constructor from a container, above, Byte must be at least as
+ // cv-qualified as T.
+ template <
+ typename T,
+ typename = typename ::std::enable_if<IsChar<T>::value && ::std::is_same<
+ typename AddSourceCV<T, Byte>::Type, Byte>::value>::type>
+ explicit ContiguousBuffer(T *bytes, ::std::size_t size)
+ : bytes_{reinterpret_cast<Byte *>(bytes)},
+ size_{bytes == nullptr ? 0 : size} {
+ if (bytes != nullptr)
+ EMBOSS_DCHECK_POINTER_ALIGNMENT(bytes, kAlignment, kOffset);
+ }
+
+ // Constructs a ContiguousBuffer from nullptr. Equivalent to
+ // ContiguousBuffer().
+ //
+ // TODO(bolms): Update callers and remove this constructor.
+ explicit ContiguousBuffer(::std::nullptr_t) : bytes_{nullptr}, size_{0} {}
+
+ // Implicitly constructs a ContiguousBuffer from an identical
+ // ContiguousBuffer.
+ ContiguousBuffer(const ContiguousBuffer &other) = default;
+
+ // Explicitly constructs a ContiguousBuffer from another, compatible
+ // ContiguousBuffer. A compatible ContiguousBuffer has an
+ // equally-or-less-cv-qualified Byte type, an alignment that is an exact
+ // multiple of this ContiguousBuffer's alignment, and an offset that is the
+ // same when reduced to this ContiguousBuffer's alignment.
+ //
+ // The final !::std::is_same<...> clause prevents this constructor from
+ // overlapping with the *implicit* copy constructor.
+ template <
+ typename OtherByte, size_t kOtherAlignment, size_t kOtherOffset,
+ typename = typename ::std::enable_if<
+ kOtherAlignment % kAlignment == 0 &&
+ kOtherOffset % kAlignment ==
+ kOffset && ::std::is_same<
+ typename AddSourceCV<OtherByte, Byte>::Type, Byte>::value &&
+ !::std::is_same<ContiguousBuffer,
+ ContiguousBuffer<OtherByte, kOtherAlignment,
+ kOtherOffset>>::value>::type>
+ explicit ContiguousBuffer(
+ const ContiguousBuffer<OtherByte, kOtherAlignment, kOtherOffset> &other)
+ : bytes_{reinterpret_cast<Byte *>(other.data())},
+ size_{other.SizeInBytes()} {}
+
+ // Assignment from a compatible ContiguousBuffer.
+ template <typename OtherByte, size_t kOtherAlignment, size_t kOtherOffset,
+ typename = typename ::std::enable_if<
+ kOtherAlignment % kAlignment == 0 &&
+ kOtherOffset % kAlignment ==
+ kOffset && ::std::is_same<
+ typename AddSourceCV<OtherByte, Byte>::Type,
+ Byte>::value>::type>
+ ContiguousBuffer &operator=(
+ const ContiguousBuffer<OtherByte, kOtherAlignment, kOtherOffset> &other) {
+ bytes_ = reinterpret_cast<Byte *>(other.data());
+ size_ = other.SizeInBytes();
+ return *this;
+ }
+
+ // GetOffsetStorage returns a new ContiguousBuffer that is a subsection of
+ // this ContiguousBuffer, with appropriate alignment assertions. The new
+ // ContiguousBuffer will point to a region `offset` bytes into the original
+ // ContiguousBuffer, with a size of `min(size, original_size - offset)`.
+ //
+ // The kSubAlignment and kSubOffset template parameters act as assertions
+ // about the value of `offset`: `offset % kSubAlignment` must equal
+ // `kSubOffset`, both in bytes. That is, if `kSubAlignment` is 2 and
+ // `kSubOffset` is 1, then `offset` may be 1, 3, 5, 7, etc.
+ //
+ // As a special case, if `kSubAlignment` is 0, then `offset` must exactly
+ // equal `kSubOffset`.
+ //
+ // This method is used by generated structure views to get backing buffers for
+ // views of their fields; the code generator can determine proper values for
+ // `kSubAlignment` and `kSubOffset`.
+ template </**/ ::std::size_t kSubAlignment, ::std::size_t kSubOffset>
+ OffsetStorageType<kSubAlignment, kSubOffset> GetOffsetStorage(
+ ::std::size_t offset, ::std::size_t size) const {
+ static_assert(kSubAlignment == 0 || kSubAlignment > kSubOffset,
+ "kSubAlignment must be greater than kSubOffset.");
+ // Emboss provides a fast, unchecked path for reads and writes like:
+ //
+ // view.field().subfield().UncheckedWrite().
+ //
+ // Each of .field() and .subfield() calls GetOffsetStorage(), so
+ // GetOffsetStorage() must be small and fast.
+ if (kSubAlignment == 0) {
+ EMBOSS_DCHECK_EQ(offset, kSubOffset);
+ } else {
+ // The weird ?:, below, silences -Werror=div-by-zero on versions of GCC
+ // that aren't smart enough to figure out that kSubAlignment can't be zero
+ // in this branch.
+ EMBOSS_DCHECK_EQ(offset % (kSubAlignment == 0 ? 1 : kSubAlignment),
+ kSubOffset);
+ }
+ using ResultStorageType = OffsetStorageType<kSubAlignment, kSubOffset>;
+ return bytes_ == nullptr
+ ? ResultStorageType{nullptr}
+ : ResultStorageType{
+ bytes_ + offset,
+ size_ < offset ? 0 : ::std::min(size, size_ - offset)};
+ }
+
+ // ReadLittleEndianUInt, ReadBigEndianUInt, and the unchecked versions thereof
+ // provide efficient multibyte read access to the underlying buffer. The
+ // kBits template parameter should always equal the buffer size, in bits, when
+ // these are called.
+ //
+ // Generally, types other than unsigned integers can be relatively efficiently
+ // converted from unsigned integers, and views should use Read...UInt to read
+ // the raw value, then convert.
+ //
+ // Read...UInt always reads the entire buffer; to read a smaller section, use
+ // GetOffsetStorage first.
+ template </**/ ::std::size_t kBits>
+ typename LeastWidthInteger<kBits>::Unsigned ReadLittleEndianUInt() const {
+ EMBOSS_CHECK_EQ(SizeInBytes() * 8, kBits);
+ EMBOSS_CHECK_POINTER_ALIGNMENT(bytes_, kAlignment, kOffset);
+ return UncheckedReadLittleEndianUInt<kBits>();
+ }
+ template </**/ ::std::size_t kBits>
+ typename LeastWidthInteger<kBits>::Unsigned UncheckedReadLittleEndianUInt()
+ const {
+ static_assert(kBits % 8 == 0,
+ "ContiguousBuffer::ReadLittleEndianUInt() can only read "
+ "whole-byte values.");
+ return MemoryAccessor<Byte, kAlignment, kOffset,
+ kBits>::ReadLittleEndianUInt(bytes_);
+ }
+ template </**/ ::std::size_t kBits>
+ typename LeastWidthInteger<kBits>::Unsigned ReadBigEndianUInt() const {
+ EMBOSS_CHECK_EQ(SizeInBytes() * 8, kBits);
+ EMBOSS_CHECK_POINTER_ALIGNMENT(bytes_, kAlignment, kOffset);
+ return UncheckedReadBigEndianUInt<kBits>();
+ }
+ template </**/ ::std::size_t kBits>
+ typename LeastWidthInteger<kBits>::Unsigned UncheckedReadBigEndianUInt()
+ const {
+ static_assert(kBits % 8 == 0,
+ "ContiguousBuffer::ReadBigEndianUInt() can only read "
+ "whole-byte values.");
+ return MemoryAccessor<Byte, kAlignment, kOffset, kBits>::ReadBigEndianUInt(
+ bytes_);
+ }
+
+ // WriteLittleEndianUInt, WriteBigEndianUInt, and the unchecked versions
+ // thereof provide efficient write access to the buffer. Similar to the Read
+ // methods above, they write the entire buffer from an unsigned integer;
+ // non-unsigned values should be converted to the equivalent bit pattern, then
+ // written, and to write a subsection of the buffer use GetOffsetStorage
+ // first.
+ template </**/ ::std::size_t kBits>
+ void WriteLittleEndianUInt(
+ typename LeastWidthInteger<kBits>::Unsigned value) const {
+ EMBOSS_CHECK_EQ(SizeInBytes() * 8, kBits);
+ EMBOSS_CHECK_POINTER_ALIGNMENT(bytes_, kAlignment, kOffset);
+ UncheckedWriteLittleEndianUInt<kBits>(value);
+ }
+ template </**/ ::std::size_t kBits>
+ void UncheckedWriteLittleEndianUInt(
+ typename LeastWidthInteger<kBits>::Unsigned value) const {
+ static_assert(kBits % 8 == 0,
+ "ContiguousBuffer::WriteLittleEndianUInt() can only write "
+ "whole-byte values.");
+ MemoryAccessor<Byte, kAlignment, kOffset, kBits>::WriteLittleEndianUInt(
+ bytes_, value);
+ }
+ template </**/ ::std::size_t kBits>
+ void WriteBigEndianUInt(
+ typename LeastWidthInteger<kBits>::Unsigned value) const {
+ EMBOSS_CHECK_EQ(SizeInBytes() * 8, kBits);
+ EMBOSS_CHECK_POINTER_ALIGNMENT(bytes_, kAlignment, kOffset);
+ UncheckedWriteBigEndianUInt<kBits>(value);
+ }
+ template </**/ ::std::size_t kBits>
+ void UncheckedWriteBigEndianUInt(
+ typename LeastWidthInteger<kBits>::Unsigned value) const {
+ static_assert(kBits % 8 == 0,
+ "ContiguousBuffer::WriteBigEndianUInt() can only write "
+ "whole-byte values.");
+ MemoryAccessor<Byte, kAlignment, kOffset, kBits>::WriteBigEndianUInt(bytes_,
+ value);
+ }
+
+ template <typename OtherByte, ::std::size_t kOtherAlignment,
+ ::std::size_t kOtherOffset>
+ void UncheckedCopyFrom(
+ const ContiguousBuffer<OtherByte, kOtherAlignment, kOtherOffset> &other,
+ ::std::size_t size) const {
+ memmove(data(), other.data(), size);
+ }
+ template <typename OtherByte, ::std::size_t kOtherAlignment,
+ ::std::size_t kOtherOffset>
+ void CopyFrom(
+ const ContiguousBuffer<OtherByte, kOtherAlignment, kOtherOffset> &other,
+ ::std::size_t size) const {
+ EMBOSS_CHECK(Ok());
+ EMBOSS_CHECK(other.Ok());
+ // It is OK if either buffer contains extra bytes that are not being copied.
+ EMBOSS_CHECK_GE(SizeInBytes(), size);
+ EMBOSS_CHECK_GE(other.SizeInBytes(), size);
+ UncheckedCopyFrom(other, size);
+ }
+ template <typename OtherByte, ::std::size_t kOtherAlignment,
+ ::std::size_t kOtherOffset>
+ bool TryToCopyFrom(
+ const ContiguousBuffer<OtherByte, kOtherAlignment, kOtherOffset> &other,
+ ::std::size_t size) const {
+ if (Ok() && other.Ok() && SizeInBytes() >= size &&
+ other.SizeInBytes() >= size) {
+ UncheckedCopyFrom(other, size);
+ return true;
+ }
+ return false;
+ }
+ ::std::size_t SizeInBytes() const { return size_; }
+ bool Ok() const { return bytes_ != nullptr; }
+ Byte *data() const { return bytes_; }
+ Byte *begin() const { return bytes_; }
+ Byte *end() const { return bytes_ + size_; }
+
+ private:
+ Byte *bytes_ = nullptr;
+ ::std::size_t size_ = 0;
+};
+
+// TODO(bolms): Remove these aliases.
+using ReadWriteContiguousBuffer = ContiguousBuffer<unsigned char, 1, 0>;
+using ReadOnlyContiguousBuffer = ContiguousBuffer<const unsigned char, 1, 0>;
+
+// LittleEndianByteOrderer is a pass-through adapter for a byte buffer class.
+// It is used to implement little-endian bit blocks.
+//
+// When used by BitBlock, the resulting bits are numbered as if they are
+// little-endian:
+//
+// bit addresses of each bit in each byte
+// +----+----+----+----+----+----+----+----+----+----+----+----+----
+// bit in 7 | 7 | 15 | 23 | 31 | 39 | 47 | 55 | 63 | 71 | 79 | 87 | 95 |
+// byte 6 | 6 | 14 | 22 | 30 | 38 | 46 | 54 | 62 | 70 | 78 | 86 | 94 |
+// 5 | 5 | 13 | 21 | 29 | 37 | 45 | 53 | 61 | 69 | 77 | 85 | 93 |
+// 4 | 4 | 12 | 20 | 28 | 36 | 44 | 52 | 60 | 68 | 76 | 84 | 92 |
+// 3 | 3 | 11 | 19 | 27 | 35 | 43 | 51 | 59 | 67 | 75 | 83 | 91 | ...
+// 2 | 2 | 10 | 18 | 26 | 34 | 42 | 50 | 58 | 66 | 74 | 82 | 90 |
+// 1 | 1 | 9 | 17 | 25 | 33 | 41 | 49 | 57 | 65 | 73 | 81 | 89 |
+// 0 | 0 | 8 | 16 | 24 | 32 | 40 | 48 | 56 | 64 | 72 | 80 | 88 |
+// +----+----+----+----+----+----+----+----+----+----+----+----+----
+// 0 1 2 3 4 5 6 7 8 9 10 11 ...
+// byte address
+//
+// Because endian-specific reads and writes are handled in ContiguousBuffer,
+// this class exists mostly to translate VerbUInt calls to VerbLittleEndianUInt.
+template <class BufferT>
+class LittleEndianByteOrderer final {
+ public:
+ // Type declaration so that BitBlock can use BufferType::BufferType.
+ using BufferType = BufferT;
+
+ LittleEndianByteOrderer() : buffer_() {}
+ explicit LittleEndianByteOrderer(BufferType buffer) : buffer_{buffer} {}
+ LittleEndianByteOrderer(const LittleEndianByteOrderer &other) = default;
+ LittleEndianByteOrderer(LittleEndianByteOrderer &&other) = default;
+ LittleEndianByteOrderer &operator=(const LittleEndianByteOrderer &other) =
+ default;
+
+ // LittleEndianByteOrderer just passes straight through to the underlying
+ // buffer.
+ bool Ok() const { return buffer_.Ok(); }
+ size_t SizeInBytes() const { return buffer_.SizeInBytes(); }
+
+ template <size_t kBits>
+ typename LeastWidthInteger<kBits>::Unsigned ReadUInt() const {
+ return buffer_.template ReadLittleEndianUInt<kBits>();
+ }
+ template <size_t kBits>
+ typename LeastWidthInteger<kBits>::Unsigned UncheckedReadUInt() const {
+ return buffer_.template UncheckedReadLittleEndianUInt<kBits>();
+ }
+ template <size_t kBits>
+ void WriteUInt(typename LeastWidthInteger<kBits>::Unsigned value) const {
+ buffer_.template WriteLittleEndianUInt<kBits>(value);
+ }
+ template <size_t kBits>
+ void UncheckedWriteUInt(
+ typename LeastWidthInteger<kBits>::Unsigned value) const {
+ buffer_.template UncheckedWriteLittleEndianUInt<kBits>(value);
+ }
+
+ private:
+ BufferType buffer_;
+};
+
+// BigEndianByteOrderer is an adapter for a byte buffer class which reverses
+// the addresses of the underlying byte buffer. It is used to implement
+// big-endian bit blocks.
+//
+// When used by BitBlock, the resulting bits are numbered with "bit 0" as the
+// lowest-order bit of the *last* byte in the buffer. For example, for a
+// 12-byte buffer, the bit ordering looks like:
+//
+// bit addresses of each bit in each byte
+// +----+----+----+----+----+----+----+----+----+----+----+----+
+// bit in 7 | 95 | 87 | 79 | 71 | 63 | 55 | 47 | 39 | 31 | 23 | 15 | 7 |
+// byte 6 | 94 | 86 | 78 | 70 | 62 | 54 | 46 | 38 | 30 | 22 | 14 | 6 |
+// 5 | 93 | 85 | 77 | 69 | 61 | 53 | 45 | 37 | 29 | 21 | 13 | 5 |
+// 4 | 92 | 84 | 76 | 68 | 60 | 52 | 44 | 36 | 28 | 20 | 12 | 4 |
+// 3 | 91 | 83 | 75 | 67 | 59 | 51 | 43 | 35 | 27 | 19 | 11 | 3 |
+// 2 | 90 | 82 | 74 | 66 | 58 | 50 | 42 | 34 | 26 | 18 | 10 | 2 |
+// 1 | 89 | 81 | 73 | 65 | 57 | 49 | 41 | 33 | 25 | 17 | 9 | 1 |
+// 0 | 88 | 80 | 72 | 64 | 56 | 48 | 40 | 32 | 24 | 16 | 8 | 0 |
+// +----+----+----+----+----+----+----+----+----+----+----+----+
+// 0 1 2 3 4 5 6 7 8 9 10 11
+// byte address
+//
+// Note that some big-endian protocols are documented with "bit 0" being the
+// *high-order* bit of a number, in which case "bit 0" would be the
+// highest-order bit of the first byte in the buffer. The "bit 0 is the
+// high-order bit" style seems to be more common in older documents (e.g., RFCs
+// 791 and 793, for IP and TCP), while the Emboss-style "bit 0 is in the last
+// byte" seems to be more common in newer documents (e.g., the hardware user
+// manuals bolms@ examined).
+// TODO(bolms): Examine more documents to see if the old vs new pattern holds.
+//
+// Because endian-specific reads and writes are handled in ContiguousBuffer,
+// this class exists mostly to translate VerbUInt calls to VerbBigEndianUInt.
+template <class BufferT>
+class BigEndianByteOrderer final {
+ public:
+ // Type declaration so that BitBlock can use BufferType::BufferType.
+ using BufferType = BufferT;
+
+ BigEndianByteOrderer() : buffer_() {}
+ explicit BigEndianByteOrderer(BufferType buffer) : buffer_{buffer} {}
+ BigEndianByteOrderer(const BigEndianByteOrderer &other) = default;
+ BigEndianByteOrderer(BigEndianByteOrderer &&other) = default;
+ BigEndianByteOrderer &operator=(const BigEndianByteOrderer &other) = default;
+
+ // Ok() and SizeInBytes() get passed through with no changes.
+ bool Ok() const { return buffer_.Ok(); }
+ size_t SizeInBytes() const { return buffer_.SizeInBytes(); }
+
+ template <size_t kBits>
+ typename LeastWidthInteger<kBits>::Unsigned ReadUInt() const {
+ return buffer_.template ReadBigEndianUInt<kBits>();
+ }
+ template <size_t kBits>
+ typename LeastWidthInteger<kBits>::Unsigned UncheckedReadUInt() const {
+ return buffer_.template UncheckedReadBigEndianUInt<kBits>();
+ }
+ template <size_t kBits>
+ void WriteUInt(typename LeastWidthInteger<kBits>::Unsigned value) const {
+ buffer_.template WriteBigEndianUInt<kBits>(value);
+ }
+ template <size_t kBits>
+ void UncheckedWriteUInt(
+ typename LeastWidthInteger<kBits>::Unsigned value) const {
+ buffer_.template UncheckedWriteBigEndianUInt<kBits>(value);
+ }
+
+ private:
+ BufferType buffer_;
+};
+
+// NullByteOrderer is a pass-through adapter for a byte buffer class. It is
+// used to implement single-byte bit blocks, where byte order does not matter.
+//
+// Technically, it should be valid to swap in BigEndianByteOrderer or
+// LittleEndianByteOrderer anywhere that NullByteOrderer is used, but
+// NullByteOrderer contains a few extra CHECKs to ensure it is being used
+// correctly.
+template <class BufferT>
+class NullByteOrderer final {
+ public:
+ // Type declaration so that BitBlock can use BufferType::BufferType.
+ using BufferType = BufferT;
+
+ NullByteOrderer() : buffer_() {}
+ explicit NullByteOrderer(BufferType buffer) : buffer_{buffer} {}
+ NullByteOrderer(const NullByteOrderer &other) = default;
+ NullByteOrderer(NullByteOrderer &&other) = default;
+ NullByteOrderer &operator=(const NullByteOrderer &other) = default;
+
+ bool Ok() const { return buffer_.Ok(); }
+ size_t SizeInBytes() const { return Ok() ? 1 : 0; }
+
+ template <size_t kBits>
+ typename LeastWidthInteger<kBits>::Unsigned ReadUInt() const {
+ static_assert(kBits == 8, "NullByteOrderer may only read 8-bit values.");
+ return buffer_.template ReadLittleEndianUInt<kBits>();
+ }
+ template <size_t kBits>
+ typename LeastWidthInteger<kBits>::Unsigned UncheckedReadUInt() const {
+ static_assert(kBits == 8, "NullByteOrderer may only read 8-bit values.");
+ return buffer_.template UncheckedReadLittleEndianUInt<kBits>();
+ }
+ template <size_t kBits>
+ void WriteUInt(typename LeastWidthInteger<kBits>::Unsigned value) const {
+ static_assert(kBits == 8, "NullByteOrderer may only write 8-bit values.");
+ buffer_.template WriteBigEndianUInt<kBits>(value);
+ }
+ template <size_t kBits>
+ void UncheckedWriteUInt(
+ typename LeastWidthInteger<kBits>::Unsigned value) const {
+ static_assert(kBits == 8, "NullByteOrderer may only write 8-bit values.");
+ buffer_.template UncheckedWriteBigEndianUInt<kBits>(value);
+ }
+
+ private:
+ BufferType buffer_;
+};
+
+// OffsetBitBlock is a filter on another BitBlock class, which adds a fixed
+// offset to reads from the underlying bit block. This is used by
+// Emboss-generated classes to read bitfields: the parent provides an
+// OffsetBitBlock of its buffer to the child's view.
+//
+// OffsetBitBlock is always statically sized, but because
+// BitBlock::GetOffsetStorage and OffsetBitBlock::GetOffsetStorage must have the
+// same signature as ContiguousBuffer::GetOffsetStorage, OffsetBitBlock's size
+// parameter must be a runtime value.
+//
+// TODO(bolms): Figure out how to add size as a template parameter to
+// OffsetBitBlock.
+template <class UnderlyingBitBlockType>
+class OffsetBitBlock final {
+ public:
+ using ValueType = typename UnderlyingBitBlockType::ValueType;
+ // Bit blocks do not use alignment information, but generated code expects bit
+ // blocks to have the same methods and types as byte blocks, so even though
+ // kNewAlignment and kNewOffset are unused, they must be present as template
+ // parameters.
+ template <size_t kNewAlignment, size_t kNewOffset>
+ using OffsetStorageType = OffsetBitBlock<UnderlyingBitBlockType>;
+
+ OffsetBitBlock() : bit_block_(), offset_(0), size_(0), ok_(false) {}
+ explicit OffsetBitBlock(UnderlyingBitBlockType bit_block, size_t offset,
+ size_t size, bool ok)
+ : bit_block_{bit_block},
+ offset_{static_cast<uint8_t>(offset)},
+ size_{static_cast<uint8_t>(size)},
+ ok_{offset == offset_ && size == size_ && ok} {}
+ OffsetBitBlock(const OffsetBitBlock &other) = default;
+ OffsetBitBlock &operator=(const OffsetBitBlock &other) = default;
+
+ template <size_t kNewAlignment, size_t kNewOffset>
+ OffsetStorageType<kNewAlignment, kNewOffset> GetOffsetStorage(
+ size_t offset, size_t size) const {
+ return OffsetStorageType<kNewAlignment, kNewOffset>{
+ bit_block_, offset_ + offset, size, ok_ && offset + size <= size_};
+ }
+
+ // ReadUInt reads the entire underlying bit block, then shifts and masks to
+ // the appropriate size.
+ ValueType ReadUInt() const {
+ EMBOSS_CHECK_GE(bit_block_.SizeInBits(), offset_ + size_);
+ EMBOSS_CHECK(Ok());
+ return MaskToNBits(bit_block_.ReadUInt(), offset_ + size_) >> offset_;
+ }
+ ValueType UncheckedReadUInt() const {
+ return MaskToNBits(bit_block_.UncheckedReadUInt(), offset_ + size_) >>
+ offset_;
+ }
+
+ // WriteUInt writes the entire underlying bit block; in order to only write
+ // the specific bits that should be changed, the current value is first read,
+ // then masked out and or'ed with the new value, and finally the result is
+ // written back to memory.
+ void WriteUInt(ValueType value) const {
+ EMBOSS_CHECK_EQ(value, MaskToNBits(value, size_));
+ EMBOSS_CHECK(Ok());
+ // OffsetBitBlock::WriteUInt *always* does a read-modify-write because it is
+ // assumed that if the user wanted to read or write the entire value they
+ // would just use the underlying BitBlock directly. This is mostly true for
+ // code generated by Emboss, which only uses OffsetBitBlock for subfields of
+ // `bits` types; bit-oriented types such as `UInt` will use BitBlock
+ // directly when they are placed directly in a `struct`.
+ bit_block_.WriteUInt(MaskInValue(bit_block_.ReadUInt(), value));
+ }
+ void UncheckedWriteUInt(ValueType value) const {
+ bit_block_.UncheckedWriteUInt(
+ MaskInValue(bit_block_.UncheckedReadUInt(), value));
+ }
+
+ size_t SizeInBits() const { return size_; }
+ bool Ok() const { return ok_; }
+
+ private:
+ ValueType MaskInValue(ValueType original_value, ValueType new_value) const {
+ ValueType original_mask =
+ ~(MaskToNBits(static_cast<ValueType>(~ValueType{0}), size_) << offset_);
+ return (original_value & original_mask) | (new_value << offset_);
+ }
+
+ const UnderlyingBitBlockType bit_block_;
+ const uint8_t offset_;
+ const uint8_t size_;
+ const uint8_t ok_;
+};
+
+// BitBlock is a view of a short, fixed-size sequence of bits somewhere in
+// memory. Big- and little-endian values are handled by BufferType, which is
+// typically BigEndianByteOrderer<ContiguousBuffer<...>> or
+// LittleEndianByteOrderer<ContiguousBuffer<...>>.
+//
+// BitBlock is implemented such that it always reads and writes its entire
+// buffer; unlike ContiguousBuffer for bytes, there is no way to modify part of
+// the underlying data without doing a read-modify-write of the full value.
+// This sidesteps a lot of weirdness with converting between bit addresses and
+// byte addresses for big-endian values, though it does mean that in certain
+// cases excess bits will be read or written, particularly if care is not taken
+// in the .emb definition to keep `bits` types to a minimum size.
+template <class BufferType, size_t kBufferSizeInBits>
+class BitBlock final {
+ static_assert(kBufferSizeInBits % 8 == 0,
+ "BitBlock can only operate on byte buffers.");
+ static_assert(kBufferSizeInBits <= 64,
+ "BitBlock can only operate on small buffers.");
+
+ public:
+ using ValueType = typename LeastWidthInteger<kBufferSizeInBits>::Unsigned;
+ // As with OffsetBitBlock::OffsetStorageType, the kNewAlignment and kNewOffset
+ // values are not used, but they must be template parameters so that generated
+ // code can work with both BitBlock and ContiguousBuffer.
+ template <size_t kNewAlignment, size_t kNewOffset>
+ using OffsetStorageType =
+ OffsetBitBlock<BitBlock<BufferType, kBufferSizeInBits>>;
+
+ explicit BitBlock() : buffer_() {}
+ explicit BitBlock(BufferType buffer) : buffer_{buffer} {}
+ explicit BitBlock(typename BufferType::BufferType buffer) : buffer_{buffer} {}
+ BitBlock(const BitBlock &) = default;
+ BitBlock(BitBlock &&) = default;
+ BitBlock &operator=(const BitBlock &) = default;
+ BitBlock &operator=(BitBlock &&) = default;
+ ~BitBlock() = default;
+
+ static constexpr size_t Bits() { return kBufferSizeInBits; }
+
+ template <size_t kNewAlignment, size_t kNewOffset>
+ OffsetStorageType<kNewAlignment, kNewOffset> GetOffsetStorage(
+ size_t offset, size_t size) const {
+ return OffsetStorageType<kNewAlignment, kNewOffset>{
+ *this, offset, size, Ok() && offset + size <= kBufferSizeInBits};
+ }
+
+ // BitBlock clients must read or write the entire BitBlock value as an
+ // unsigned integer. OffsetBitBlock can be used to extract a portion of the
+ // value via shift and mask, and individual view types such as IntView or
+ // BcdView are expected to convert ValueType to/from their desired types.
+ ValueType ReadUInt() const {
+ return buffer_.template ReadUInt<kBufferSizeInBits>();
+ }
+ ValueType UncheckedReadUInt() const {
+ return buffer_.template UncheckedReadUInt<kBufferSizeInBits>();
+ }
+ void WriteUInt(ValueType value) const {
+ EMBOSS_CHECK_EQ(value, MaskToNBits(value, kBufferSizeInBits));
+ buffer_.template WriteUInt<kBufferSizeInBits>(value);
+ }
+ void UncheckedWriteUInt(ValueType value) const {
+ buffer_.template UncheckedWriteUInt<kBufferSizeInBits>(value);
+ }
+
+ size_t SizeInBits() const { return kBufferSizeInBits; }
+ bool Ok() const {
+ return buffer_.Ok() && buffer_.SizeInBytes() * 8 == kBufferSizeInBits;
+ }
+
+ private:
+ BufferType buffer_;
+};
+
+} // namespace support
+} // namespace emboss
+
+#endif // EMBOSS_PUBLIC_EMBOSS_MEMORY_UTIL_H_
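Note on the OffsetBitBlock read-modify-write logic in public/emboss_memory_util.h above. The block below is a minimal sketch, not part of the patch, of the shift-and-mask arithmetic that OffsetBitBlock::ReadUInt and OffsetBitBlock::WriteUInt (via MaskInValue) appear to perform. It assumes a plain 64-bit integer in place of the underlying BitBlock. The names KeepLowBits, ReadField, and WriteField are hypothetical and do not exist in Emboss; KeepLowBits only stands in for the library's MaskToNBits, whose definition is not shown in this part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for MaskToNBits: keeps the low `bits` bits of `value`.
// The bits == 64 case is handled separately because shifting a 64-bit value
// by 64 is undefined behavior.
inline std::uint64_t KeepLowBits(std::uint64_t value, unsigned bits) {
  assert(bits <= 64);
  return bits == 64 ? value : value & ((std::uint64_t{1} << bits) - 1);
}

// Reads the `size`-bit field that starts `offset` bits into `block`:
// mask away everything above the field, then shift the field down to bit 0.
inline std::uint64_t ReadField(std::uint64_t block, unsigned offset,
                               unsigned size) {
  assert(offset < 64 && offset + size <= 64);
  return KeepLowBits(block, offset + size) >> offset;
}

// Returns `block` with the field replaced by `value`. All bits outside the
// field are preserved, which is why a write needs the current block value.
inline std::uint64_t WriteField(std::uint64_t block, unsigned offset,
                                unsigned size, std::uint64_t value) {
  assert(offset < 64 && offset + size <= 64);
  assert(value == KeepLowBits(value, size));  // value must fit in the field
  const std::uint64_t keep_mask =
      ~(KeepLowBits(~std::uint64_t{0}, size) << offset);
  return (block & keep_mask) | (value << offset);
}
```

For example, with a block value of 0xABCD, ReadField(block, 4, 8) yields 0xBC and WriteField(block, 4, 8, 0x12) yields 0xA12D. The low and high nibbles are untouched, mirroring how a write to one bitfield leaves its neighbors in the same `bits` block unchanged.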
diff --git a/public/emboss_memory_util_test.cc b/public/emboss_memory_util_test.cc
new file mode 100644
index 0000000..9ac20db
--- /dev/null
+++ b/public/emboss_memory_util_test.cc
@@ -0,0 +1,641 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "public/emboss_memory_util.h"
+
+#include "public/emboss_prelude.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace support {
+namespace test {
+
+using ::emboss::prelude::IntView;
+using ::emboss::prelude::UIntView;
+
+template <size_t kBits>
+using BigEndianBitBlockN =
+ BitBlock<BigEndianByteOrderer<ReadWriteContiguousBuffer>, kBits>;
+
+template <size_t kBits>
+using LittleEndianBitBlockN =
+ BitBlock<LittleEndianByteOrderer<ReadWriteContiguousBuffer>, kBits>;
+
+TEST(GreatestCommonDivisor, GreatestCommonDivisor) {
+ EXPECT_EQ(4, GreatestCommonDivisor(12, 20));
+ EXPECT_EQ(4, GreatestCommonDivisor(20, 12));
+ EXPECT_EQ(4, GreatestCommonDivisor(20, 4));
+ EXPECT_EQ(6, GreatestCommonDivisor(12, 78));
+ EXPECT_EQ(6, GreatestCommonDivisor(6, 0));
+ EXPECT_EQ(6, GreatestCommonDivisor(0, 6));
+ EXPECT_EQ(3, GreatestCommonDivisor(9, 6));
+ EXPECT_EQ(0, GreatestCommonDivisor(0, 0));
+}
+
+// Because MemoryAccessor's parameters are template parameters, it is not
+// possible to loop through them directly. Instead, TestMemoryAccessor tests
+// a particular MemoryAccessor's methods, then calls the next template to test
+// the next set of template parameters to MemoryAccessor.
+template <typename CharT, size_t kAlignment, size_t kOffset, size_t kBits>
+void TestMemoryAccessor() {
+ alignas(kAlignment)
+ CharT bytes[] = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08};
+ EXPECT_EQ(
+ 0x0807060504030201UL & (~0x0UL >> (64 - kBits)),
+ (MemoryAccessor<CharT, kAlignment, kOffset, kBits>::ReadLittleEndianUInt(
+ bytes)))
+ << "kAlignment = " << kAlignment << "; kOffset = " << kOffset
+ << "; kBits = " << kBits;
+ EXPECT_EQ(
+ 0x0102030405060708UL >> (64 - kBits),
+ (MemoryAccessor<CharT, kAlignment, kOffset, kBits>::ReadBigEndianUInt(
+ bytes)))
+ << "kAlignment = " << kAlignment << "; kOffset = " << kOffset
+ << "; kBits = " << kBits;
+
+ MemoryAccessor<CharT, kAlignment, kOffset, kBits>::WriteLittleEndianUInt(
+ bytes, 0x7172737475767778UL & (~0x0UL >> (64 - kBits)));
+ ::std::vector<CharT> expected_vector_after_write = {
+ {0x78, 0x77, 0x76, 0x75, 0x74, 0x73, 0x72, 0x71}};
+ for (int i = kBits / 8; i < 8; ++i) {
+ expected_vector_after_write[i] = i + 1;
+ }
+ EXPECT_EQ(expected_vector_after_write,
+ ::std::vector<CharT>(bytes, bytes + sizeof bytes))
+ << "kAlignment = " << kAlignment << "; kOffset = " << kOffset
+ << "; kBits = " << kBits;
+
+ MemoryAccessor<CharT, kAlignment, kOffset, kBits>::WriteBigEndianUInt(
+ bytes, 0x7172737475767778UL >> (64 - kBits));
+ expected_vector_after_write = {
+ {0x71, 0x72, 0x73, 0x74, 0x75, 0x76, 0x77, 0x78}};
+ for (int i = kBits / 8; i < 8; ++i) {
+ expected_vector_after_write[i] = i + 1;
+ }
+ EXPECT_EQ(expected_vector_after_write,
+ ::std::vector<CharT>(bytes, bytes + sizeof bytes))
+ << "kAlignment = " << kAlignment << "; kOffset = " << kOffset
+ << "; kBits = " << kBits;
+
+ // Recursively iterate the template:
+ //
+ // For every kAlignment/kOffset pair, check kBits from 64 to 8 in increments
+ // of 8.
+ //
+ // If kBits is 8, reset kBits to 64 and go to the next kAlignment/kOffset
+ // pair.
+ //
+ // For each kAlignment, try all kOffsets from 0 to kAlignment - 1.
+ //
+ // If kBits is 8 and kOffset is kAlignment - 1, reset kBits to 64, kOffset to
+ // 0, and halve kAlignment.
+ //
+ // Base cases below handle kAlignment == 0, terminating the recursion.
+ TestMemoryAccessor<
+ CharT,
+ kBits == 8 && kAlignment == kOffset + 1 ? kAlignment / 2 : kAlignment,
+ kBits == 8 ? kAlignment == kOffset + 1 ? 0 : kOffset + 1 : kOffset,
+ kBits == 8 ? 64 : kBits - 8>();
+}
+
+template <>
+void TestMemoryAccessor<char, 0, 0, 64>() {}
+
+template <>
+void TestMemoryAccessor<signed char, 0, 0, 64>() {}
+
+template <>
+void TestMemoryAccessor<unsigned char, 0, 0, 64>() {}
+
+TEST(MemoryAccessor, LittleEndianReads) {
+ TestMemoryAccessor<char, 8, 0, 64>();
+ TestMemoryAccessor<signed char, 8, 0, 64>();
+ TestMemoryAccessor<unsigned char, 8, 0, 64>();
+}
+
+TEST(ContiguousBuffer, OffsetStorageType) {
+ EXPECT_TRUE((::std::is_same<
+ ContiguousBuffer<char, 2, 0>,
+ ContiguousBuffer<char, 2, 0>::OffsetStorageType<2, 0>>::value));
+ EXPECT_TRUE((::std::is_same<
+ ContiguousBuffer<char, 2, 0>,
+ ContiguousBuffer<char, 2, 0>::OffsetStorageType<0, 0>>::value));
+ EXPECT_TRUE((::std::is_same<
+ ContiguousBuffer<char, 2, 0>,
+ ContiguousBuffer<char, 2, 0>::OffsetStorageType<4, 0>>::value));
+ EXPECT_TRUE((::std::is_same<
+ ContiguousBuffer<char, 2, 0>,
+ ContiguousBuffer<char, 4, 0>::OffsetStorageType<2, 0>>::value));
+ EXPECT_TRUE((::std::is_same<
+ ContiguousBuffer<char, 2, 0>,
+ ContiguousBuffer<char, 4, 2>::OffsetStorageType<2, 0>>::value));
+ EXPECT_TRUE((::std::is_same<
+ ContiguousBuffer<char, 2, 0>,
+ ContiguousBuffer<char, 4, 1>::OffsetStorageType<2, 1>>::value));
+ EXPECT_TRUE((::std::is_same<
+ ContiguousBuffer<char, 4, 2>,
+ ContiguousBuffer<char, 4, 1>::OffsetStorageType<4, 1>>::value));
+ EXPECT_TRUE((::std::is_same<
+ ContiguousBuffer<char, 4, 1>,
+ ContiguousBuffer<char, 4, 3>::OffsetStorageType<0, 2>>::value));
+ EXPECT_TRUE((::std::is_same<
+ ContiguousBuffer<char, 4, 1>,
+ ContiguousBuffer<char, 4, 3>::OffsetStorageType<4, 2>>::value));
+ EXPECT_TRUE((::std::is_same<
+ ContiguousBuffer<char, 4, 1>,
+ ContiguousBuffer<char, 4, 3>::OffsetStorageType<8, 6>>::value));
+ EXPECT_TRUE((::std::is_same<
+ ContiguousBuffer<char, 4, 1>,
+ ContiguousBuffer<char, 4, 3>::OffsetStorageType<12, 6>>::value));
+ EXPECT_TRUE((::std::is_same<
+ ContiguousBuffer<char, 1, 0>,
+ ContiguousBuffer<char, 4, 1>::OffsetStorageType<3, 1>>::value));
+}
+
+// Minimal class that forwards to std::allocator. Used to test that
+// ReadOnlyContiguousBuffer can be constructed from std::vector<> and
+// std::basic_string<> with non-default trailing template parameters.
+template <class T>
+struct NonstandardAllocator {
+ using value_type = typename ::std::allocator<T>::value_type;
+ using pointer = typename ::std::allocator<T>::pointer;
+ using const_pointer = typename ::std::allocator<T>::const_pointer;
+ using reference = typename ::std::allocator<T>::reference;
+ using const_reference = typename ::std::allocator<T>::const_reference;
+ using size_type = typename ::std::allocator<T>::size_type;
+ using difference_type = typename ::std::allocator<T>::difference_type;
+
+ template <class U>
+ struct rebind {
+ using other = NonstandardAllocator<U>;
+ };
+
+ NonstandardAllocator() = default;
+ // This constructor is *not* explicit in order to conform to the requirements
+ // for an allocator.
+ template <class U>
+ NonstandardAllocator(const NonstandardAllocator<U> &) {} // NOLINT
+
+ T *allocate(size_t n) { return ::std::allocator<T>().allocate(n); }
+ void deallocate(T *p, size_t n) { ::std::allocator<T>().deallocate(p, n); }
+
+ static size_type max_size() {
+ return ::std::numeric_limits<size_type>::max() / sizeof(value_type);
+ }
+};
+
+template <class T, class U>
+bool operator==(const NonstandardAllocator<T> &,
+ const NonstandardAllocator<U> &) {
+ return true;
+}
+
+template <class T, class U>
+bool operator!=(const NonstandardAllocator<T> &,
+ const NonstandardAllocator<U> &) {
+ return false;
+}
+
+// ContiguousBuffer tests for std::vector, std::array, and std::string types.
+template <typename T>
+class ReadOnlyContiguousBufferTest : public ::testing::Test {};
+typedef ::testing::Types<
+ /**/ ::std::vector<char>, ::std::array<char, 8>,
+ ::std::vector<unsigned char>, ::std::vector<signed char>, ::std::string,
+ ::std::basic_string<signed char>, ::std::basic_string<unsigned char>,
+ ::std::vector<unsigned char, NonstandardAllocator<unsigned char>>,
+ ::std::basic_string<char, ::std::char_traits<char>,
+ NonstandardAllocator<char>>>
+ ReadOnlyContiguousContainerTypes;
+TYPED_TEST_SUITE(ReadOnlyContiguousBufferTest,
+ ReadOnlyContiguousContainerTypes);
+
+TYPED_TEST(ReadOnlyContiguousBufferTest, ConstructionFromContainers) {
+ const TypeParam bytes = {{0x08, 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01}};
+ using CharType =
+ typename ::std::remove_reference<decltype(*bytes.data())>::type;
+ const auto buffer = ContiguousBuffer<const CharType, 1, 0>{&bytes};
+ EXPECT_EQ(bytes.size(), buffer.SizeInBytes());
+ EXPECT_TRUE(buffer.Ok());
+ EXPECT_EQ(0x0807060504030201UL, buffer.template ReadBigEndianUInt<64>());
+
+ const auto offset_buffer = buffer.template GetOffsetStorage<1, 0>(4, 4);
+ EXPECT_EQ(4, offset_buffer.SizeInBytes());
+ EXPECT_EQ(0x04030201U, offset_buffer.template ReadBigEndianUInt<32>());
+
+ // The size of the resulting buffer should be the minimum of the available
+ // size and the requested size.
+ EXPECT_EQ(bytes.size() - 4,
+ (buffer.template GetOffsetStorage<1, 0>(2, bytes.size() - 4)
+ .SizeInBytes()));
+ EXPECT_EQ(
+ 0,
+ (buffer.template GetOffsetStorage<1, 0>(bytes.size(), 4).SizeInBytes()));
+}
+
+// ContiguousBuffer tests for std::vector and std::array types.
+template <typename T>
+class ReadWriteContiguousBufferTest : public ::testing::Test {};
+typedef ::testing::Types</**/ ::std::vector<char>, ::std::array<char, 8>,
+ ::std::vector<unsigned char>,
+ ::std::vector<signed char>>
+ ReadWriteContiguousContainerTypes;
+TYPED_TEST_SUITE(ReadWriteContiguousBufferTest,
+ ReadWriteContiguousContainerTypes);
+
+TYPED_TEST(ReadWriteContiguousBufferTest, ConstructionFromContainers) {
+ TypeParam bytes = {{0x08, 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01}};
+ using CharType =
+ typename ::std::remove_reference<decltype(*bytes.data())>::type;
+ const auto buffer = ContiguousBuffer<CharType, 1, 0>{&bytes};
+
+ // Read and Ok methods should work just as in ReadOnlyContiguousBuffer.
+ EXPECT_EQ(bytes.size(), buffer.SizeInBytes());
+ EXPECT_TRUE(buffer.Ok());
+ EXPECT_EQ(0x0807060504030201UL, buffer.template ReadBigEndianUInt<64>());
+
+ buffer.template WriteBigEndianUInt<64>(0x0102030405060708UL);
+ EXPECT_EQ((TypeParam{{0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08}}),
+ bytes);
+
+ bytes[4] = static_cast<CharType>(255);
+ EXPECT_EQ(0x1020304ff060708, buffer.template ReadBigEndianUInt<64>());
+}
+
+TEST(ContiguousBuffer, ReturnTypeOfReadUInt) {
+ const auto buffer = ContiguousBuffer<char, 1, 0>();
+
+ EXPECT_TRUE((::std::is_same<decltype(buffer.ReadBigEndianUInt<64>()),
+ uint64_t>::value));
+ EXPECT_TRUE((::std::is_same<decltype(buffer.ReadBigEndianUInt<48>()),
+ uint64_t>::value));
+ EXPECT_TRUE((::std::is_same<decltype(buffer.ReadBigEndianUInt<32>()),
+ uint32_t>::value));
+ EXPECT_TRUE((::std::is_same<decltype(buffer.ReadBigEndianUInt<16>()),
+ uint16_t>::value));
+ EXPECT_TRUE((
+ ::std::is_same<decltype(buffer.ReadBigEndianUInt<8>()), uint8_t>::value));
+
+ EXPECT_TRUE((::std::is_same<decltype(buffer.ReadLittleEndianUInt<64>()),
+ uint64_t>::value));
+ EXPECT_TRUE((::std::is_same<decltype(buffer.ReadLittleEndianUInt<48>()),
+ uint64_t>::value));
+ EXPECT_TRUE((::std::is_same<decltype(buffer.ReadLittleEndianUInt<32>()),
+ uint32_t>::value));
+ EXPECT_TRUE((::std::is_same<decltype(buffer.ReadLittleEndianUInt<16>()),
+ uint16_t>::value));
+ EXPECT_TRUE((::std::is_same<decltype(buffer.ReadLittleEndianUInt<8>()),
+ uint8_t>::value));
+
+ EXPECT_TRUE((::std::is_same<decltype(buffer.UncheckedReadBigEndianUInt<64>()),
+ uint64_t>::value));
+ EXPECT_TRUE((::std::is_same<decltype(buffer.UncheckedReadBigEndianUInt<48>()),
+ uint64_t>::value));
+ EXPECT_TRUE((::std::is_same<decltype(buffer.UncheckedReadBigEndianUInt<32>()),
+ uint32_t>::value));
+ EXPECT_TRUE((::std::is_same<decltype(buffer.UncheckedReadBigEndianUInt<16>()),
+ uint16_t>::value));
+ EXPECT_TRUE((::std::is_same<decltype(buffer.UncheckedReadBigEndianUInt<8>()),
+ uint8_t>::value));
+
+ EXPECT_TRUE(
+ (::std::is_same<decltype(buffer.UncheckedReadLittleEndianUInt<64>()),
+ uint64_t>::value));
+ EXPECT_TRUE(
+ (::std::is_same<decltype(buffer.UncheckedReadLittleEndianUInt<48>()),
+ uint64_t>::value));
+ EXPECT_TRUE(
+ (::std::is_same<decltype(buffer.UncheckedReadLittleEndianUInt<32>()),
+ uint32_t>::value));
+ EXPECT_TRUE(
+ (::std::is_same<decltype(buffer.UncheckedReadLittleEndianUInt<16>()),
+ uint16_t>::value));
+ EXPECT_TRUE(
+ (::std::is_same<decltype(buffer.UncheckedReadLittleEndianUInt<8>()),
+ uint8_t>::value));
+}
+
+TEST(ReadOnlyContiguousBuffer, Methods) {
+ const ::std::vector<uint8_t> bytes = {{0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b,
+ 0x0a, 0x09, 0x08, 0x07, 0x06, 0x05,
+ 0x04, 0x03, 0x02, 0x01}};
+ const auto buffer = ReadOnlyContiguousBuffer{bytes.data(), bytes.size() - 4};
+ EXPECT_DEATH(buffer.ReadBigEndianUInt<64>(), "");
+ EXPECT_TRUE(buffer.Ok());
+ EXPECT_EQ(bytes.size() - 4, buffer.SizeInBytes());
+ EXPECT_EQ(0x100f0e0d0c0b0a09, buffer.UncheckedReadBigEndianUInt<64>());
+ EXPECT_EQ(0x090a0b0c0d0e0f10, buffer.UncheckedReadLittleEndianUInt<64>());
+
+ const auto offset_buffer = buffer.GetOffsetStorage<1, 0>(4, 4);
+ EXPECT_EQ(0x0c0b0a09, offset_buffer.ReadBigEndianUInt<32>());
+ EXPECT_EQ(0x090a0b0c, offset_buffer.ReadLittleEndianUInt<32>());
+ EXPECT_EQ(0x0c0b0a0908070605, offset_buffer.UncheckedReadBigEndianUInt<64>());
+ EXPECT_EQ(4, offset_buffer.SizeInBytes());
+ EXPECT_TRUE(offset_buffer.Ok());
+
+ const auto small_offset_buffer = buffer.GetOffsetStorage<1, 0>(4, 1);
+ EXPECT_EQ(0x0c, small_offset_buffer.ReadBigEndianUInt<8>());
+ EXPECT_EQ(0x0c, small_offset_buffer.ReadLittleEndianUInt<8>());
+ EXPECT_EQ(1, small_offset_buffer.SizeInBytes());
+ EXPECT_TRUE(small_offset_buffer.Ok());
+
+ EXPECT_FALSE(ReadOnlyContiguousBuffer().Ok());
+ EXPECT_FALSE(
+ (ReadOnlyContiguousBuffer{static_cast<char *>(nullptr), 12}.Ok()));
+ EXPECT_DEATH((ReadOnlyContiguousBuffer{static_cast<char *>(nullptr), 4}
+ .ReadBigEndianUInt<32>()),
+ "");
+ EXPECT_EQ(0, ReadOnlyContiguousBuffer().SizeInBytes());
+ EXPECT_EQ(0, (ReadOnlyContiguousBuffer{static_cast<char *>(nullptr), 12}
+ .SizeInBytes()));
+ EXPECT_DEATH(
+ (ReadOnlyContiguousBuffer{bytes.data(), 0}.ReadBigEndianUInt<8>()), "");
+
+ // The size of the resulting buffer should be the minimum of the available
+ // size and the requested size.
+ EXPECT_EQ(bytes.size() - 8,
+ (buffer.GetOffsetStorage<1, 0>(4, bytes.size() - 4).SizeInBytes()));
+ EXPECT_EQ(4, (buffer.GetOffsetStorage<1, 0>(0, 4).SizeInBytes()));
+ EXPECT_EQ(0, (buffer.GetOffsetStorage<1, 0>(bytes.size(), 4).SizeInBytes()));
+ EXPECT_FALSE((ReadOnlyContiguousBuffer().GetOffsetStorage<1, 0>(0, 0).Ok()));
+}
+
+TEST(ReadWriteContiguousBuffer, Methods) {
+ ::std::vector<uint8_t> bytes = {
+ {0x0c, 0x0b, 0x0a, 0x09, 0x08, 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01}};
+ const auto buffer = ReadWriteContiguousBuffer{bytes.data(), bytes.size() - 4};
+ // Read and Ok methods should work just as in ReadOnlyContiguousBuffer.
+ EXPECT_TRUE(buffer.Ok());
+ EXPECT_EQ(bytes.size() - 4, buffer.SizeInBytes());
+ EXPECT_EQ(0x0c0b0a0908070605, buffer.ReadBigEndianUInt<64>());
+
+ buffer.WriteBigEndianUInt<64>(0x05060708090a0b0c);
+ EXPECT_EQ((::std::vector<uint8_t>{0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b,
+ 0x0c, 0x04, 0x03, 0x02, 0x01}),
+ bytes);
+ buffer.WriteLittleEndianUInt<64>(0x05060708090a0b0c);
+ EXPECT_EQ((::std::vector<uint8_t>{0x0c, 0x0b, 0x0a, 0x09, 0x08, 0x07, 0x06,
+ 0x05, 0x04, 0x03, 0x02, 0x01}),
+ bytes);
+
+ const auto offset_buffer = buffer.GetOffsetStorage<1, 0>(4, 4);
+ offset_buffer.WriteBigEndianUInt<32>(0x05060708);
+ EXPECT_EQ((::std::vector<uint8_t>{0x0c, 0x0b, 0x0a, 0x09, 0x05, 0x06, 0x07,
+ 0x08, 0x04, 0x03, 0x02, 0x01}),
+ bytes);
+ offset_buffer.WriteLittleEndianUInt<32>(0x05060708);
+ EXPECT_EQ((::std::vector<uint8_t>{0x0c, 0x0b, 0x0a, 0x09, 0x08, 0x07, 0x06,
+ 0x05, 0x04, 0x03, 0x02, 0x01}),
+ bytes);
+
+ const auto small_offset_buffer = buffer.GetOffsetStorage<1, 0>(4, 1);
+ small_offset_buffer.WriteBigEndianUInt<8>(0x80);
+ EXPECT_EQ((::std::vector<uint8_t>{0x0c, 0x0b, 0x0a, 0x09, 0x80, 0x07, 0x06,
+ 0x05, 0x04, 0x03, 0x02, 0x01}),
+ bytes);
+ small_offset_buffer.WriteLittleEndianUInt<8>(0x08);
+ EXPECT_EQ((::std::vector<uint8_t>{0x0c, 0x0b, 0x0a, 0x09, 0x08, 0x07, 0x06,
+ 0x05, 0x04, 0x03, 0x02, 0x01}),
+ bytes);
+
+ EXPECT_DEATH(ReadWriteContiguousBuffer().ReadLittleEndianUInt<8>(), "");
+ EXPECT_DEATH(
+ (ReadWriteContiguousBuffer{static_cast<unsigned char *>(nullptr), 1}
+ .ReadLittleEndianUInt<8>()),
+ "");
+ EXPECT_DEATH(
+ (ReadWriteContiguousBuffer{static_cast<unsigned char *>(nullptr), 1}
+ .WriteLittleEndianUInt<8>(0xff)),
+ "");
+}
+
+TEST(ContiguousBuffer, AssignmentFromCompatibleContiguousBuffers) {
+ alignas(4) char data[8];
+ ContiguousBuffer<const unsigned char, 1, 0> buffer;
+ buffer = ContiguousBuffer<char, 4, 1>(data + 1, sizeof data - 1);
+ EXPECT_TRUE(buffer.Ok());
+ EXPECT_EQ(buffer.data(), reinterpret_cast<unsigned char *>(data + 1));
+
+ ContiguousBuffer<const signed char, 2, 1> aligned_buffer;
+ aligned_buffer =
+ ContiguousBuffer<unsigned char, 4, 3>(data + 3, sizeof data - 3);
+ EXPECT_TRUE(aligned_buffer.Ok());
+ EXPECT_EQ(aligned_buffer.data(), reinterpret_cast<signed char *>(data + 3));
+}
+
+TEST(ContiguousBuffer, ConstructionFromCompatibleContiguousBuffers) {
+ alignas(4) char data[8];
+ ContiguousBuffer<const unsigned char, 1, 0> buffer{
+ ContiguousBuffer<char, 4, 1>(data + 1, sizeof data - 1)};
+ EXPECT_TRUE(buffer.Ok());
+ EXPECT_EQ(buffer.data(), reinterpret_cast<unsigned char *>(data + 1));
+
+ ContiguousBuffer<const signed char, 2, 1> aligned_buffer{
+ ContiguousBuffer<unsigned char, 4, 3>(data + 3, sizeof data - 3)};
+ EXPECT_TRUE(aligned_buffer.Ok());
+ EXPECT_EQ(aligned_buffer.data(), reinterpret_cast<signed char *>(data + 3));
+}
+
+TEST(LittleEndianByteOrderer, Methods) {
+ ::std::vector<uint8_t> bytes = {{21, 22, 1, 2, 3, 4, 5, 6, 7, 8, 23, 24}};
+ const int buffer_start = 2;
+ const auto buffer = LittleEndianByteOrderer<ReadWriteContiguousBuffer>{
+ ReadWriteContiguousBuffer{bytes.data() + buffer_start, 8}};
+ EXPECT_EQ(8, buffer.SizeInBytes());
+ EXPECT_TRUE(buffer.Ok());
+ EXPECT_EQ(0x0807060504030201, buffer.ReadUInt<64>());
+ EXPECT_EQ(0x0807060504030201, buffer.UncheckedReadUInt<64>());
+ EXPECT_DEATH(buffer.ReadUInt<56>(), "");
+ EXPECT_EQ(0x07060504030201, buffer.UncheckedReadUInt<56>());
+ buffer.WriteUInt<64>(0x0102030405060708);
+ EXPECT_EQ((::std::vector<uint8_t>{21, 22, 8, 7, 6, 5, 4, 3, 2, 1, 23, 24}),
+ bytes);
+ buffer.UncheckedWriteUInt<64>(0x0807060504030201);
+ EXPECT_EQ((::std::vector<uint8_t>{21, 22, 1, 2, 3, 4, 5, 6, 7, 8, 23, 24}),
+ bytes);
+ EXPECT_DEATH(buffer.WriteUInt<56>(0x77777777777777), "");
+
+ EXPECT_FALSE(LittleEndianByteOrderer<ReadOnlyContiguousBuffer>().Ok());
+ EXPECT_EQ(0,
+ LittleEndianByteOrderer<ReadOnlyContiguousBuffer>().SizeInBytes());
+ EXPECT_EQ(bytes[1], (LittleEndianByteOrderer<ReadOnlyContiguousBuffer>{
+ ReadOnlyContiguousBuffer{bytes.data() + 1, 0}}
+ .UncheckedReadUInt<8>()));
+ EXPECT_TRUE((LittleEndianByteOrderer<ReadOnlyContiguousBuffer>{
+ ReadOnlyContiguousBuffer{bytes.data(), 0}}
+ .Ok()));
+}
+
+TEST(BigEndianByteOrderer, Methods) {
+ ::std::vector<uint8_t> bytes = {{21, 22, 1, 2, 3, 4, 5, 6, 7, 8, 23, 24}};
+ const int buffer_start = 2;
+ const auto buffer = BigEndianByteOrderer<ReadWriteContiguousBuffer>{
+ ReadWriteContiguousBuffer{bytes.data() + buffer_start, 8}};
+ EXPECT_EQ(8, buffer.SizeInBytes());
+ EXPECT_TRUE(buffer.Ok());
+ EXPECT_EQ(0x0102030405060708, buffer.ReadUInt<64>());
+ EXPECT_EQ(0x0102030405060708, buffer.UncheckedReadUInt<64>());
+ EXPECT_DEATH(buffer.ReadUInt<56>(), "");
+ EXPECT_EQ(0x01020304050607, buffer.UncheckedReadUInt<56>());
+ buffer.WriteUInt<64>(0x0807060504030201);
+ EXPECT_EQ((::std::vector<uint8_t>{21, 22, 8, 7, 6, 5, 4, 3, 2, 1, 23, 24}),
+ bytes);
+ buffer.UncheckedWriteUInt<64>(0x0102030405060708);
+ EXPECT_EQ((::std::vector<uint8_t>{21, 22, 1, 2, 3, 4, 5, 6, 7, 8, 23, 24}),
+ bytes);
+ EXPECT_DEATH(buffer.WriteUInt<56>(0x77777777777777), "");
+
+ EXPECT_FALSE(BigEndianByteOrderer<ReadOnlyContiguousBuffer>().Ok());
+ EXPECT_EQ(0, BigEndianByteOrderer<ReadOnlyContiguousBuffer>().SizeInBytes());
+ EXPECT_EQ(bytes[1], (BigEndianByteOrderer<ReadOnlyContiguousBuffer>{
+ ReadOnlyContiguousBuffer{bytes.data() + 1, 0}}
+ .UncheckedReadUInt<8>()));
+ EXPECT_TRUE((BigEndianByteOrderer<ReadOnlyContiguousBuffer>{
+ ReadOnlyContiguousBuffer{bytes.data(), 0}}
+ .Ok()));
+}
+
+TEST(NullByteOrderer, Methods) {
+ uint8_t bytes[] = {0xdb, 0x0f, 0x0e, 0x0d};
+ const auto buffer = NullByteOrderer<ReadWriteContiguousBuffer>{
+ ReadWriteContiguousBuffer{bytes, 1}};
+ EXPECT_EQ(bytes[0], buffer.ReadUInt<8>());
+ EXPECT_EQ(bytes[0], buffer.UncheckedReadUInt<8>());
+ // NullByteOrderer::UncheckedRead ignores its argument.
+ EXPECT_EQ(bytes[0], buffer.UncheckedReadUInt<8>());
+ buffer.WriteUInt<8>(0x24);
+ EXPECT_EQ(0x24, bytes[0]);
+ buffer.UncheckedWriteUInt<8>(0x25);
+ EXPECT_EQ(0x25, bytes[0]);
+ EXPECT_EQ(1, buffer.SizeInBytes());
+ EXPECT_TRUE(buffer.Ok());
+
+ EXPECT_FALSE(NullByteOrderer<ReadOnlyContiguousBuffer>().Ok());
+ EXPECT_EQ(0, NullByteOrderer<ReadOnlyContiguousBuffer>().SizeInBytes());
+ EXPECT_DEATH((NullByteOrderer<ReadOnlyContiguousBuffer>{
+ ReadOnlyContiguousBuffer{bytes, 0}}
+ .ReadUInt<8>()),
+ "");
+ EXPECT_DEATH((NullByteOrderer<ReadOnlyContiguousBuffer>{
+ ReadOnlyContiguousBuffer{bytes, 2}}
+ .ReadUInt<8>()),
+ "");
+ EXPECT_EQ(bytes[0], (NullByteOrderer<ReadOnlyContiguousBuffer>{
+ ReadOnlyContiguousBuffer{bytes, 0}}
+ .UncheckedReadUInt<8>()));
+ EXPECT_TRUE((NullByteOrderer<ReadOnlyContiguousBuffer>{
+ ReadOnlyContiguousBuffer{bytes, 0}}
+ .Ok()));
+}
+
+TEST(BitBlock, BigEndianMethods) {
+ uint8_t bytes[] = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
+ 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10};
+ const auto big_endian =
+ BigEndianBitBlockN<64>{ReadWriteContiguousBuffer{bytes + 4, 8}};
+ EXPECT_EQ(64, big_endian.SizeInBits());
+ EXPECT_TRUE(big_endian.Ok());
+ EXPECT_EQ(0x05060708090a0b0cUL, big_endian.ReadUInt());
+ EXPECT_EQ(0x05060708090a0b0cUL, big_endian.UncheckedReadUInt());
+ EXPECT_FALSE(BigEndianBitBlockN<64>().Ok());
+ EXPECT_EQ(64, BigEndianBitBlockN<64>().SizeInBits());
+ EXPECT_FALSE(
+ (BigEndianBitBlockN<64>{ReadWriteContiguousBuffer{bytes, 0}}.Ok()));
+}
+
+TEST(BitBlock, LittleEndianMethods) {
+ uint8_t bytes[] = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
+ 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10};
+ const auto little_endian =
+ LittleEndianBitBlockN<64>{ReadWriteContiguousBuffer{bytes + 4, 8}};
+ EXPECT_EQ(64, little_endian.SizeInBits());
+ EXPECT_TRUE(little_endian.Ok());
+ EXPECT_EQ(0x0c0b0a0908070605UL, little_endian.ReadUInt());
+ EXPECT_EQ(0x0c0b0a0908070605UL, little_endian.UncheckedReadUInt());
+ EXPECT_FALSE(LittleEndianBitBlockN<64>().Ok());
+ EXPECT_EQ(64, LittleEndianBitBlockN<64>().SizeInBits());
+ EXPECT_FALSE(
+ (LittleEndianBitBlockN<64>{ReadWriteContiguousBuffer{bytes, 0}}.Ok()));
+}
+
+TEST(BitBlock, GetOffsetStorage) {
+ uint8_t bytes[] = {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09,
+ 0x08, 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01};
+ const auto bit_block =
+ LittleEndianBitBlockN<64>{ReadWriteContiguousBuffer{bytes, 8}};
+ const OffsetBitBlock<LittleEndianBitBlockN<64>> offset_block =
+ bit_block.GetOffsetStorage<1, 0>(4, 8);
+ EXPECT_EQ(8, offset_block.SizeInBits());
+ EXPECT_EQ(0xf1, offset_block.ReadUInt());
+ EXPECT_EQ(bit_block.SizeInBits(),
+ (bit_block.GetOffsetStorage<1, 0>(8, bit_block.SizeInBits())
+ .SizeInBits()));
+ EXPECT_FALSE(
+ (bit_block.GetOffsetStorage<1, 0>(8, bit_block.SizeInBits()).Ok()));
+ EXPECT_EQ(10, (bit_block.GetOffsetStorage<1, 0>(bit_block.SizeInBits(), 10)
+ .SizeInBits()));
+}
+
+TEST(OffsetBitBlock, Methods) {
+ ::std::vector<uint8_t> bytes = {
+ {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09}};
+ const auto bit_block =
+ LittleEndianBitBlockN<64>{ReadWriteContiguousBuffer{&bytes}};
+ EXPECT_FALSE((bit_block.GetOffsetStorage<1, 0>(0, 96).Ok()));
+ EXPECT_TRUE((bit_block.GetOffsetStorage<1, 0>(0, 64).Ok()));
+
+ const auto offset_block = bit_block.GetOffsetStorage<1, 0>(8, 48);
+ EXPECT_FALSE((offset_block.GetOffsetStorage<1, 0>(40, 16).Ok()));
+ EXPECT_EQ(0x0a0b0c0d0e0f, offset_block.ReadUInt());
+ EXPECT_EQ(0x0a0b0c0d0e0f, offset_block.UncheckedReadUInt());
+ offset_block.WriteUInt(0x0f0e0d0c0b0a);
+ EXPECT_EQ(
+ (::std::vector<uint8_t>{0x10, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x09}),
+ bytes);
+ offset_block.UncheckedWriteUInt(0x0a0b0c0d0e0f);
+ EXPECT_EQ(
+ (::std::vector<uint8_t>{0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09}),
+ bytes);
+ EXPECT_DEATH(offset_block.WriteUInt(0x10f0e0d0c0b0a), "");
+ offset_block.UncheckedWriteUInt(0x10f0e0d0c0b0a);
+ EXPECT_EQ(
+ (::std::vector<uint8_t>{0x10, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x09}),
+ bytes);
+
+ const auto offset_offset_block = offset_block.GetOffsetStorage<1, 0>(16, 16);
+ EXPECT_FALSE((offset_offset_block.GetOffsetStorage<1, 0>(8, 16).Ok()));
+ EXPECT_EQ(0x0d0c, offset_offset_block.ReadUInt());
+ EXPECT_EQ(0x0d0c, offset_offset_block.UncheckedReadUInt());
+ offset_offset_block.WriteUInt(0x0c0d);
+ EXPECT_EQ(
+ (::std::vector<uint8_t>{0x10, 0x0a, 0x0b, 0x0d, 0x0c, 0x0e, 0x0f, 0x09}),
+ bytes);
+ offset_offset_block.UncheckedWriteUInt(0x0d0c);
+ EXPECT_EQ(
+ (::std::vector<uint8_t>{0x10, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x09}),
+ bytes);
+ EXPECT_DEATH(offset_offset_block.WriteUInt(0x10c0d), "");
+ offset_offset_block.UncheckedWriteUInt(0x20c0d);
+ EXPECT_EQ(
+ (::std::vector<uint8_t>{0x10, 0x0a, 0x0b, 0x0d, 0x0c, 0x0e, 0x0f, 0x09}),
+ bytes);
+
+ const auto null_offset_block = OffsetBitBlock<BigEndianBitBlockN<32>>();
+ EXPECT_FALSE(null_offset_block.Ok());
+ EXPECT_EQ(0, null_offset_block.SizeInBits());
+}
+
+} // namespace test
+} // namespace support
+} // namespace emboss
diff --git a/public/emboss_prelude.h b/public/emboss_prelude.h
new file mode 100644
index 0000000..85eeba8
--- /dev/null
+++ b/public/emboss_prelude.h
@@ -0,0 +1,803 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// This header contains implementations of the types in the Emboss Prelude
+// (UInt, Int, Flag, etc.)
+#ifndef EMBOSS_PUBLIC_EMBOSS_PRELUDE_H_
+#define EMBOSS_PUBLIC_EMBOSS_PRELUDE_H_
+
+#include <stddef.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <limits>
+#include <type_traits>
+#include <utility>
+
+#include "public/emboss_cpp_util.h"
+
+// This namespace must match the [(cpp) namespace] in the Emboss prelude.
+namespace emboss {
+namespace prelude {
+
+// FlagView is the C++ implementation of the Emboss "Flag" type, which is a
+// 1-bit value.
+template <class Parameters, class BitBlock>
+class FlagView final {
+ public:
+ static_assert(Parameters::kBits == 1, "FlagView must be 1 bit.");
+
+ explicit FlagView(BitBlock bits) : bit_block_{bits} {}
+ FlagView() : bit_block_() {}
+ FlagView(const FlagView &) = default;
+ FlagView(FlagView &&) = default;
+ FlagView &operator=(const FlagView &) = default;
+ FlagView &operator=(FlagView &&) = default;
+ ~FlagView() = default;
+
+ bool Read() const {
+ bool result = bit_block_.ReadUInt();
+ EMBOSS_CHECK(Parameters::ValueIsOk(result));
+ return result;
+ }
+ bool UncheckedRead() const { return bit_block_.UncheckedReadUInt(); }
+ void Write(bool value) const { EMBOSS_CHECK(TryToWrite(value)); }
+ bool TryToWrite(bool value) const {
+ if (!CouldWriteValue(value)) return false;
+ if (!IsComplete()) return false;
+ bit_block_.WriteUInt(value);
+ return true;
+ }
+ static constexpr bool CouldWriteValue(bool value) {
+ return Parameters::ValueIsOk(value);
+ }
+ void UncheckedWrite(bool value) const {
+ bit_block_.UncheckedWriteUInt(value);
+ }
+
+ template <typename OtherView>
+ void CopyFrom(const OtherView &other) const {
+ Write(other.Read());
+ }
+ template <typename OtherView>
+ void UncheckedCopyFrom(const OtherView &other) const {
+ UncheckedWrite(other.UncheckedRead());
+ }
+ template <typename OtherView>
+ bool TryToCopyFrom(const OtherView &other) const {
+ return other.Ok() && TryToWrite(other.Read());
+ }
+
+ bool Ok() const {
+ return IsComplete() && Parameters::ValueIsOk(UncheckedRead());
+ }
+ template <class OtherBitBlock>
+ bool Equals(const FlagView<Parameters, OtherBitBlock> &other) const {
+ return Read() == other.Read();
+ }
+ template <class OtherBitBlock>
+ bool UncheckedEquals(const FlagView<Parameters, OtherBitBlock> &other) const {
+ return UncheckedRead() == other.UncheckedRead();
+ }
+ bool IsComplete() const {
+ return bit_block_.Ok() && bit_block_.SizeInBits() > 0;
+ }
+
+ template <class Stream>
+ bool UpdateFromTextStream(Stream *stream) const {
+ ::std::string token;
+ if (!::emboss::support::ReadToken(stream, &token)) return false;
+ if (token == "true") {
+ return TryToWrite(true);
+ } else if (token == "false") {
+ return TryToWrite(false);
+ }
+ // TODO(bolms): Provide a way to get an error message on parse failure.
+ return false;
+ }
+
+ template <class Stream>
+ void WriteToTextStream(Stream *stream,
+ const ::emboss::TextOutputOptions &options) const {
+ ::emboss::support::WriteBooleanViewToTextStream(this, stream, options);
+ }
+
+ private:
+ BitBlock bit_block_;
+};
+
+// UIntView is a view for UInts inside of bitfields.
+template <class Parameters, class BitViewType>
+class UIntView final {
+ public:
+ using ValueType = typename ::emboss::support::LeastWidthInteger<
+ Parameters::kBits>::Unsigned;
+
+ static_assert(
+ Parameters::kBits <= sizeof(ValueType) * 8,
+ "UIntView requires sizeof(ValueType) * 8 >= Parameters::kBits.");
+
+ template <typename... Args>
+ explicit UIntView(Args &&... args) : buffer_{::std::forward<Args>(args)...} {}
+ UIntView() : buffer_() {}
+ UIntView(const UIntView &) = default;
+ UIntView(UIntView &&) = default;
+ UIntView &operator=(const UIntView &) = default;
+ UIntView &operator=(UIntView &&) = default;
+ ~UIntView() = default;
+
+ ValueType Read() const {
+ ValueType result = buffer_.ReadUInt();
+ EMBOSS_CHECK(Parameters::ValueIsOk(result));
+ return result;
+ }
+ ValueType UncheckedRead() const { return buffer_.UncheckedReadUInt(); }
+
+ // The Write, TryToWrite, and CouldWriteValue methods are templated in order
+ // to avoid surprises due to implicit narrowing.
+ //
+ // In C++, you can pass (say) an `int` to a function expecting `uint8_t`, and
+ // the compiler will silently cast the `int` to `uint8_t`, which can change
+ // the value. Even with fairly aggressive warnings, something like this will
+ // silently compile, and print `256 is not >= 128!`:
+ //
+ // bool is_big_uint8(uint8_t value) { return value >= 128; }
+ // bool is_big(uint32_t value) { return is_big_uint8(value); }
+ // int main() {
+ // assert(!is_big(256)); // 256 is truncated to 0.
+ // std::cout << 256 << " is not >= 128!\n";
+ // return 0;
+ // }
+ //
+ // (Most compilers will give a warning when directly passing a *constant* that
+ // gets truncated; for example, GCC will emit a -Woverflow warning for
+ // `is_big_uint8(256U)`.)
+ template <typename IntT,
+ typename = typename ::std::enable_if<
+ (::std::numeric_limits<typename ::std::remove_cv<
+ typename ::std::remove_reference<IntT>::type>::type>::
+ is_integer &&
+ !::std::is_same<bool, typename ::std::remove_cv<
+ typename ::std::remove_reference<
+ IntT>::type>::type>::value) ||
+ ::std::is_enum<IntT>::value>::type>
+ void Write(IntT value) const {
+ EMBOSS_CHECK(TryToWrite(value));
+ }
+
+ template <typename IntT,
+ typename = typename ::std::enable_if<
+ (::std::numeric_limits<typename ::std::remove_cv<
+ typename ::std::remove_reference<IntT>::type>::type>::
+ is_integer &&
+ !::std::is_same<bool, typename ::std::remove_cv<
+ typename ::std::remove_reference<
+ IntT>::type>::type>::value) ||
+ ::std::is_enum<IntT>::value>::type>
+ bool TryToWrite(IntT value) const {
+ if (!CouldWriteValue(value)) return false;
+ if (!IsComplete()) return false;
+ buffer_.WriteUInt(value);
+ return true;
+ }
+
+ template <typename IntT,
+ typename = typename ::std::enable_if<
+ (::std::numeric_limits<typename ::std::remove_cv<
+ typename ::std::remove_reference<IntT>::type>::type>::
+ is_integer &&
+ !::std::is_same<bool, typename ::std::remove_cv<
+ typename ::std::remove_reference<
+ IntT>::type>::type>::value) ||
+ ::std::is_enum<IntT>::value>::type>
+ static constexpr bool CouldWriteValue(IntT value) {
+ // Implicit conversions are doing some work here, but the upshot is that the
+ // value must be at least 0, and at most (2**kBits)-1. The clause to
+ // compute (2**kBits)-1 should not be "simplified" further.
+ //
+ // Because of C++ implicit integer promotions, the (2**kBits)-1 computation
+ // works differently when `ValueType` is smaller than `unsigned int` than it
+ // does when `ValueType` is at least as big as `unsigned int`.
+ //
+ // For example, when `ValueType` is `uint8_t` and `kBits` is 8:
+ //
+ // 1. `static_cast<ValueType>(1)` becomes `uint8_t(1)`.
+ // 2. `uint8_t(1) << (kBits - 1)` is `uint8_t(1) << 7`.
+ // 3. The shift operator `<<` promotes its left operand to `int`,
+ // giving `int(1) << 7`.
+ // 4. `int(1) << 7` becomes `int(0x80)`.
+ // 5. `int(0x80) << 1` becomes `int(0x100)`.
+ // 6. Finally, `int(0x100) - 1` is `int(0xff)`.
+ //
+ // (Note that the cases where `kBits` is less than `sizeof(ValueType) * 8`
+ // are very similar.)
+ //
+ // When `ValueType` is `uint32_t`, `unsigned` is 32 bits, and `kBits` is 32:
+ //
+ // 1. `static_cast<ValueType>(1)` becomes `uint32_t(1)`.
+ // 2. `uint32_t(1) << (kBits - 1)` is `uint32_t(1) << 31`.
+ // 3. The shift operator `<<` does *not* further promote `uint32_t`.
+ // 4. `uint32_t(1) << 31` becomes `uint32_t(0x80000000)`. Note that
+ // `uint32_t(1) << 32` would be undefined behavior (shift of >= the
+ // size of the left operand type), which is why the shift is broken
+ // into two parts.
+ // 5. `uint32_t(0x80000000) << 1` overflows, leaving `uint32_t(0)`.
+ // 6. `uint32_t(0) - 1` underflows, leaving `uint32_t(0xffffffff)`.
+ //
+ // Because unsigned overflow and underflow are defined to be modulo 2**N,
+ // where N is the number of bits in the type, this is entirely
+ // standards-compliant.
+ return value >= 0 &&
+ static_cast</**/ ::std::uint64_t>(value) <=
+ ((static_cast<ValueType>(1) << (Parameters::kBits - 1)) << 1) -
+ 1 &&
+ Parameters::ValueIsOk(value);
+ }
+ void UncheckedWrite(ValueType value) const {
+ buffer_.UncheckedWriteUInt(value);
+ }
+
+ template <typename OtherView>
+ void CopyFrom(const OtherView &other) const {
+ Write(other.Read());
+ }
+ template <typename OtherView>
+ void UncheckedCopyFrom(const OtherView &other) const {
+ UncheckedWrite(other.UncheckedRead());
+ }
+ template <typename OtherView>
+ bool TryToCopyFrom(const OtherView &other) const {
+ return other.Ok() && TryToWrite(other.Read());
+ }
+
+ // Ok() is true if IsComplete() is true and the stored value satisfies
+ // Parameters::ValueIsOk().
+ bool Ok() const {
+ return IsComplete() && Parameters::ValueIsOk(UncheckedRead());
+ }
+ template <class OtherBitViewType>
+ bool Equals(const UIntView<Parameters, OtherBitViewType> &other) const {
+ return Read() == other.Read();
+ }
+ template <class OtherBitViewType>
+ bool UncheckedEquals(
+ const UIntView<Parameters, OtherBitViewType> &other) const {
+ return UncheckedRead() == other.UncheckedRead();
+ }
+ bool IsComplete() const {
+ return buffer_.Ok() && buffer_.SizeInBits() >= Parameters::kBits;
+ }
+
+ template <class Stream>
+ bool UpdateFromTextStream(Stream *stream) const {
+ return support::ReadIntegerFromTextStream(this, stream);
+ }
+
+ template <class Stream>
+ void WriteToTextStream(Stream *stream,
+ ::emboss::TextOutputOptions options) const {
+ support::WriteIntegerViewToTextStream(this, stream, options);
+ }
+
+ static constexpr int SizeInBits() { return Parameters::kBits; }
+
+ private:
+ BitViewType buffer_;
+};
+
+// IntView is a view for Ints inside of bitfields.
+template <class Parameters, class BitViewType>
+class IntView final {
+ public:
+ using ValueType =
+ typename ::emboss::support::LeastWidthInteger<Parameters::kBits>::Signed;
+
+ static_assert(Parameters::kBits <= sizeof(ValueType) * 8,
+ "IntView requires sizeof(ValueType) * 8 >= Parameters::kBits.");
+
+ template <typename... Args>
+ explicit IntView(Args &&... args) : buffer_{::std::forward<Args>(args)...} {}
+ IntView() : buffer_() {}
+ IntView(const IntView &) = default;
+ IntView(IntView &&) = default;
+ IntView &operator=(const IntView &) = default;
+ IntView &operator=(IntView &&) = default;
+ ~IntView() = default;
+
+ ValueType Read() const {
+ ValueType value = ConvertToSigned(buffer_.ReadUInt());
+ EMBOSS_CHECK(Parameters::ValueIsOk(value));
+ return value;
+ }
+ ValueType UncheckedRead() const {
+ return ConvertToSigned(buffer_.UncheckedReadUInt());
+ }
+ // As with UIntView, above, Write, TryToWrite, and CouldWriteValue need to be
+ // templated in order to avoid surprises due to implicit narrowing
+ // conversions.
+ template <typename IntT,
+ typename = typename ::std::enable_if<
+ (::std::numeric_limits<typename ::std::remove_cv<
+ typename ::std::remove_reference<IntT>::type>::type>::
+ is_integer &&
+ !::std::is_same<bool, typename ::std::remove_cv<
+ typename ::std::remove_reference<
+ IntT>::type>::type>::value) ||
+ ::std::is_enum<IntT>::value>::type>
+ void Write(IntT value) const {
+ EMBOSS_CHECK(TryToWrite(value));
+ }
+
+ template <typename IntT,
+ typename = typename ::std::enable_if<
+ (::std::numeric_limits<typename ::std::remove_cv<
+ typename ::std::remove_reference<IntT>::type>::type>::
+ is_integer &&
+ !::std::is_same<bool, typename ::std::remove_cv<
+ typename ::std::remove_reference<
+ IntT>::type>::type>::value) ||
+ ::std::is_enum<IntT>::value>::type>
+ bool TryToWrite(IntT value) const {
+ if (!CouldWriteValue(value)) return false;
+ if (!IsComplete()) return false;
+ buffer_.WriteUInt(::emboss::support::MaskToNBits(
+ static_cast<typename BitViewType::ValueType>(value),
+ Parameters::kBits));
+ return true;
+ }
+
+ template <typename IntT,
+ typename = typename ::std::enable_if<
+ (::std::numeric_limits<typename ::std::remove_cv<
+ typename ::std::remove_reference<IntT>::type>::type>::
+ is_integer &&
+ !::std::is_same<bool, typename ::std::remove_cv<
+ typename ::std::remove_reference<
+ IntT>::type>::type>::value) ||
+ ::std::is_enum<IntT>::value>::type>
+ static constexpr bool CouldWriteValue(IntT value) {
+ // This effectively checks that value >= -(2**(kBits-1)) and value <=
+ // (2**(kBits-1))-1.
+ //
+ // This has to be done somewhat piecemeal, in order to avoid various bits of
+ // undefined and implementation-defined behavior.
+ //
+ // First, if IntT is an unsigned type, the check that value >=
+ // -(2**(kBits-1)) is skipped, in order to avoid any signed <-> unsigned
+ // conversions.
+ //
+ // Second, if kBits is 1, then the limits -1 and 0 are explicit, so that
+ // there is never a shift by -1 (which is undefined behavior).
+ //
+ // Third, the shifts are by (kBits - 2), so that they do not alter sign
+ // bits. To get the final bounds, we use a bit of addition and
+ // multiplication. For example, for 8 bits, the lower bound is (1 << 6) *
+ // -2, which is 64 * -2, which is -128. The corresponding upper bound is
+ // ((1 << 6) - 1) * 2 + 1, which is (64 - 1) * 2 + 1, which is 63 * 2 + 1,
+ // which is 126 + 1, which is 127. The upper bound must be computed in
+ // multiple steps like this in order to avoid overflow.
+ return (!::std::is_signed<typename ::std::remove_cv<
+ typename ::std::remove_reference<IntT>::type>::type>::value ||
+ static_cast</**/ ::std::int64_t>(value) >=
+ (Parameters::kBits == 1
+ ? -1
+ : (static_cast<ValueType>(1) << (Parameters::kBits - 2)) *
+ -2)) &&
+ value <=
+ (Parameters::kBits == 1
+ ? 0
+ : ((static_cast<ValueType>(1) << (Parameters::kBits - 2)) -
+ 1) * 2 +
+ 1) &&
+ Parameters::ValueIsOk(value);
+ }
+
+ void UncheckedWrite(ValueType value) const {
+ buffer_.UncheckedWriteUInt(::emboss::support::MaskToNBits(
+ static_cast<typename BitViewType::ValueType>(value),
+ Parameters::kBits));
+ }
+
+ template <typename OtherView>
+ void CopyFrom(const OtherView &other) const {
+ Write(other.Read());
+ }
+ template <typename OtherView>
+ void UncheckedCopyFrom(const OtherView &other) const {
+ UncheckedWrite(other.UncheckedRead());
+ }
+ template <typename OtherView>
+ bool TryToCopyFrom(const OtherView &other) const {
+ return other.Ok() && TryToWrite(other.Read());
+ }
+
+ // Ok() is true if IsComplete() is true and the stored value satisfies
+ // Parameters::ValueIsOk().
+ bool Ok() const {
+ return IsComplete() && Parameters::ValueIsOk(UncheckedRead());
+ }
+ template <class OtherBitViewType>
+ bool Equals(const IntView<Parameters, OtherBitViewType> &other) const {
+ return Read() == other.Read();
+ }
+ template <class OtherBitViewType>
+ bool UncheckedEquals(
+ const IntView<Parameters, OtherBitViewType> &other) const {
+ return UncheckedRead() == other.UncheckedRead();
+ }
+ bool IsComplete() const {
+ return buffer_.Ok() && buffer_.SizeInBits() >= Parameters::kBits;
+ }
+
+ template <class Stream>
+ bool UpdateFromTextStream(Stream *stream) const {
+ return support::ReadIntegerFromTextStream(this, stream);
+ }
+
+ template <class Stream>
+ void WriteToTextStream(Stream *stream,
+ ::emboss::TextOutputOptions options) const {
+ support::WriteIntegerViewToTextStream(this, stream, options);
+ }
+
+ static constexpr int SizeInBits() { return Parameters::kBits; }
+
+ private:
+ static ValueType ConvertToSigned(typename BitViewType::ValueType data) {
+ static_assert(sizeof(ValueType) <= sizeof(typename BitViewType::ValueType),
+ "Integer types wider than BitViewType::ValueType are not "
+ "supported.");
+#if EMBOSS_SYSTEM_IS_TWOS_COMPLEMENT
+ // static_cast from unsigned to signed is implementation-defined when the
+ // value does not fit in the signed type (in this case, when the final value
+ // should be negative). Most implementations use a reasonable definition,
+ // so on most systems we can just cast.
+ //
+ // If the integer does not take up the full width of ValueType, it needs to
+ // be sign-extended until it does. The easiest way to do this is to shift
+ // until the sign bit is in the topmost position, then cast to signed, then
+ // shift back. The shift back will copy the sign bit.
+ return static_cast<ValueType>(
+ data << (sizeof(ValueType) * 8 - Parameters::kBits)) >>
+ (sizeof(ValueType) * 8 - Parameters::kBits);
+#else
+ // Otherwise, in order to convert without running into
+ // implementation-defined behavior, first mask out the sign bit. This
+ // results in (final result MOD 2 ** (width of int in bits - 1)). That
+ // value can be safely converted to the signed ValueType.
+ //
+ // Finally, if the sign bit was set, subtract (2 ** (width of int in bits -
+ // 2)) twice.
+ //
+ // The 1-bit signed integer case must be handled separately, but it is
+ // (fortunately) quite easy to enumerate both possible values.
+ if (Parameters::kBits == 1) {
+ if (data == 0) {
+ return 0;
+ } else if (data == 1) {
+ return -1;
+ } else {
+ EMBOSS_CHECK(false);
+ }
+ } else {
+ typename BitViewType::ValueType sign_bit =
+ static_cast<typename BitViewType::ValueType>(1)
+ << (Parameters::kBits - 1);
+ typename BitViewType::ValueType mask = sign_bit - 1;
+ typename BitViewType::ValueType data_mod2_to_n = mask & data;
+ ValueType result_sign_bit =
+ static_cast<ValueType>((data & sign_bit) >> 1);
+ return data_mod2_to_n - result_sign_bit - result_sign_bit;
+ }
+#endif
+ }
+
+ BitViewType buffer_;
+};
+
+// The maximum Binary-Coded Decimal (BCD) value that fits in a particular number
+// of bits.
+template <typename ValueType>
+constexpr inline ValueType MaxBcd(int bits) {
+ return bits < 4 ? (1 << bits) - 1
+ : 10 * (MaxBcd<ValueType>(bits - 4) + 1) - 1;
+}
+
+template <typename ValueType>
+inline bool IsBcd(ValueType x) {
+ // Adapted from:
+ // https://graphics.stanford.edu/~seander/bithacks.html#HasLessInWord
+ //
+ // This determines if any nibble has a value greater than 9. It does
+ // this by treating operations on the n-bit value as parallel operations
+ // on n/4 4-bit values.
+ //
+ // The result is computed in the high bit of each nibble: if any of those
+ // bits is set in the end, then at least one nibble had a value in the
+ // range 10-15.
+ //
+ // The first check is subtle: ~x is equivalent to (nibble = 15 - nibble).
+ // Then, 6 is subtracted from each nibble. This operation will underflow
+ // if the original value was more than 9, leaving the high bit of the
+ // nibble set. It will also leave the high bit of the nibble set
+ // (without underflow) if the original value was 0 or 1.
+ //
+ // The second check is just x: the high bit of each nibble in x is set if
+ // that nibble's value is 8-15.
+ //
+ // Thus, the first check leaves the high bit set in any nibble with the
+ // value 0, 1, or 10-15, and the second check leaves the high bit set in
+ // any nibble with the value 8-15. Bitwise-anding these results, high
+ // bits are only set if the original value was 10-15.
+ //
+ // An underflow in the first check can corrupt the result for nibbles
+ // in higher positions than the underflowing nibble. This
+ // cannot affect the overall boolean result, because the underflow
+ // condition only happens if a nibble was greater than 9, and therefore
+ // *that* nibble's final value will be nonzero, and therefore the whole
+ // result will be nonzero, no matter what happens in the higher-order
+ // nibbles.
+ //
+ // Two examples with 16-bit values:
+ //
+ // x = 0x09a8
+ // (~0x09a8 - 0x6666) & 0x09a8 & 0x8888
+ // ( 0xf657 - 0x6666) & 0x09a8 & 0x8888
+ // 0x8ff1 & 0x09a8 & 0x8888
+ // 0x09a0 & 0x8888
+ // 0x0880 Note the underflow into nibble 2
+ //
+ // x = 0x1289
+ // (~0x1289 - 0x6666) & 0x1289 & 0x8888
+ // ( 0xed76 - 0x6666) & 0x1289 & 0x8888
+ // 0x8710 & 0x1289 & 0x8888
+ // 0x0200 & 0x8888
+ // 0x0000
+ static_assert(!::std::is_signed<ValueType>::value,
+ "IsBcd only works on unsigned values.");
+ if (sizeof(ValueType) < sizeof(unsigned)) {
+ // For types with lower integer conversion rank than unsigned int, integer
+ // promotion rules cause many implicit conversions to signed int in the math
+ // below, which makes the math go wrong. Rather than add a dozen explicit
+ // casts back to ValueType, just do the math as 'unsigned'.
+ return IsBcd<unsigned>(x);
+ } else {
+ return ((~x - (~ValueType{0} / 0xf * 0x6 /* 0x6666...6666 */)) & x &
+ (~ValueType{0} / 0xf * 0x8 /* 0x8888...8888 */)) == 0;
+ }
+}
+
+// Base template for Binary-Coded Decimal (BCD) unsigned integer readers.
+template <class Parameters, class BitViewType>
+class BcdView final {
+ public:
+ using ValueType = typename ::emboss::support::LeastWidthInteger<
+ Parameters::kBits>::Unsigned;
+
+ static_assert(Parameters::kBits <= sizeof(ValueType) * 8,
+ "BcdView requires sizeof(ValueType) * 8 >= Parameters::kBits.");
+
+ template <typename... Args>
+ explicit BcdView(Args &&... args) : buffer_{::std::forward<Args>(args)...} {}
+ BcdView() : buffer_() {}
+ BcdView(const BcdView &) = default;
+ BcdView(BcdView &&) = default;
+ BcdView &operator=(const BcdView &) = default;
+ BcdView &operator=(BcdView &&) = default;
+ ~BcdView() = default;
+
+ ValueType Read() const {
+ EMBOSS_CHECK(Ok());
+ return ConvertToBinary(buffer_.ReadUInt());
+ }
+ ValueType UncheckedRead() const {
+ return ConvertToBinary(buffer_.UncheckedReadUInt());
+ }
+ void Write(ValueType value) const { EMBOSS_CHECK(TryToWrite(value)); }
+ bool TryToWrite(ValueType value) const {
+ if (!CouldWriteValue(value)) return false;
+ if (!IsComplete()) return false;
+ buffer_.WriteUInt(ConvertToBcd(value));
+ return true;
+ }
+ static constexpr bool CouldWriteValue(ValueType value) {
+ return value <= MaxValue() && Parameters::ValueIsOk(value);
+ }
+ void UncheckedWrite(ValueType value) const {
+ buffer_.UncheckedWriteUInt(ConvertToBcd(value));
+ }
+
+ template <class Stream>
+ bool UpdateFromTextStream(Stream *stream) const {
+ return support::ReadIntegerFromTextStream(this, stream);
+ }
+
+ template <class Stream>
+ void WriteToTextStream(Stream *stream,
+ ::emboss::TextOutputOptions options) const {
+ // TODO(bolms): This shares the numeric_base() option with IntView and
+ // UIntView (and EnumView, for unknown enum values). It seems like an end
+ // user might prefer to see BCD values in decimal, even if they want to see
+ // values of other numeric types in hex or binary. It seems like there
+ // could be some fancy C++ trickery to allow separate options for separate
+ // view types.
+ support::WriteIntegerViewToTextStream(this, stream, options);
+ }
+
+ template <typename OtherView>
+ void CopyFrom(const OtherView &other) const {
+ Write(other.Read());
+ }
+ template <typename OtherView>
+ void UncheckedCopyFrom(const OtherView &other) const {
+ UncheckedWrite(other.UncheckedRead());
+ }
+ template <typename OtherView>
+ bool TryToCopyFrom(const OtherView &other) const {
+ return other.Ok() && TryToWrite(other.Read());
+ }
+
+ bool Ok() const {
+ if (!IsComplete()) return false;
+ if (!IsBcd(buffer_.ReadUInt())) return false;
+ if (!Parameters::ValueIsOk(UncheckedRead())) return false;
+ return true;
+ }
+ template <class OtherBitViewType>
+ bool Equals(const BcdView<Parameters, OtherBitViewType> &other) const {
+ return Read() == other.Read();
+ }
+ template <class OtherBitViewType>
+ bool UncheckedEquals(
+ const BcdView<Parameters, OtherBitViewType> &other) const {
+ return UncheckedRead() == other.UncheckedRead();
+ }
+ bool IsComplete() const {
+ return buffer_.Ok() && buffer_.SizeInBits() >= Parameters::kBits;
+ }
+
+ static constexpr int SizeInBits() { return Parameters::kBits; }
+
+ private:
+ static ValueType ConvertToBinary(ValueType bcd_value) {
+ ValueType result = 0;
+ ValueType multiplier = 1;
+ for (int shift = 0; shift < Parameters::kBits; shift += 4) {
+ result += ((bcd_value >> shift) & 0xf) * multiplier;
+ multiplier *= 10;
+ }
+ return result;
+ }
+
+ static ValueType ConvertToBcd(ValueType value) {
+ ValueType bcd_value = 0;
+ for (int shift = 0; shift < Parameters::kBits; shift += 4) {
+ bcd_value |= (value % 10) << shift;
+ value /= 10;
+ }
+ return bcd_value;
+ }
+
+ static constexpr ValueType MaxValue() {
+ return MaxBcd<ValueType>(Parameters::kBits);
+ }
+
+ BitViewType buffer_;
+};
+
+// FloatView is the view for the Emboss Float type.
+template <class Parameters, class BitViewType>
+class FloatView final {
+ static_assert(Parameters::kBits == 32 || Parameters::kBits == 64,
+ "Only 32- and 64-bit floats are currently supported.");
+
+ public:
+ using ValueType = typename support::FloatType<Parameters::kBits>::Type;
+
+ template <typename... Args>
+ explicit FloatView(Args &&... args)
+ : buffer_{::std::forward<Args>(args)...} {}
+ FloatView() : buffer_() {}
+ FloatView(const FloatView &) = default;
+ FloatView(FloatView &&) = default;
+ FloatView &operator=(const FloatView &) = default;
+ FloatView &operator=(FloatView &&) = default;
+ ~FloatView() = default;
+
+ ValueType Read() const { return ConvertToFloat(buffer_.ReadUInt()); }
+ ValueType UncheckedRead() const {
+ return ConvertToFloat(buffer_.UncheckedReadUInt());
+ }
+ void Write(ValueType value) const { EMBOSS_CHECK(TryToWrite(value)); }
+ bool TryToWrite(ValueType value) const {
+ if (!CouldWriteValue(value)) return false;
+ if (!IsComplete()) return false;
+ buffer_.WriteUInt(ConvertToUInt(value));
+ return true;
+ }
+ static constexpr bool CouldWriteValue(ValueType value) { return true; }
+ void UncheckedWrite(ValueType value) const {
+ buffer_.UncheckedWriteUInt(ConvertToUInt(value));
+ }
+
+ template <typename OtherView>
+ void CopyFrom(const OtherView &other) const {
+ Write(other.Read());
+ }
+ template <typename OtherView>
+ void UncheckedCopyFrom(const OtherView &other) const {
+ UncheckedWrite(other.UncheckedRead());
+ }
+ template <typename OtherView>
+ bool TryToCopyFrom(const OtherView &other) const {
+ return other.Ok() && TryToWrite(other.Read());
+ }
+
+ // All bit patterns in the underlying buffer are valid, so Ok() is always
+ // true if IsComplete() is true.
+ bool Ok() const { return IsComplete(); }
+ template <class OtherBitViewType>
+ bool Equals(const FloatView<Parameters, OtherBitViewType> &other) const {
+ return Read() == other.Read();
+ }
+ template <class OtherBitViewType>
+ bool UncheckedEquals(
+ const FloatView<Parameters, OtherBitViewType> &other) const {
+ return UncheckedRead() == other.UncheckedRead();
+ }
+ bool IsComplete() const {
+ return buffer_.Ok() && buffer_.SizeInBits() >= Parameters::kBits;
+ }
+
+ template <class Stream>
+ bool UpdateFromTextStream(Stream *stream) const {
+ return support::ReadFloatFromTextStream(this, stream);
+ }
+
+ template <class Stream>
+ void WriteToTextStream(Stream *stream,
+ ::emboss::TextOutputOptions options) const {
+ support::WriteFloatToTextStream(Read(), stream, options);
+ }
+
+ static constexpr int SizeInBits() { return Parameters::kBits; }
+
+ private:
+ using UIntType = typename support::FloatType<Parameters::kBits>::UIntType;
+ static ValueType ConvertToFloat(UIntType bits) {
+ // TODO(bolms): This method assumes a few things that are not always
+ // strictly true; e.g., that uint32_t and float have the same endianness.
+ ValueType result;
+ memcpy(static_cast<void *>(&result), static_cast<void *>(&bits),
+ sizeof result);
+ return result;
+ }
+
+ static UIntType ConvertToUInt(ValueType value) {
+ // TODO(bolms): This method assumes a few things that are not always
+ // strictly true; e.g., that uint32_t and float have the same endianness.
+ UIntType bits;
+ memcpy(static_cast<void *>(&bits), static_cast<void *>(&value),
+ sizeof bits);
+ return bits;
+ }
+
+ BitViewType buffer_;
+};
+
+} // namespace prelude
+} // namespace emboss
+
+#endif // EMBOSS_PUBLIC_EMBOSS_PRELUDE_H_
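
Reviewer note on the BCD helpers in the header above: the nibble-parallel check described in the IsBcd comments, together with the digit-by-digit conversions used by BcdView, can be sketched in a standalone form as follows. This is a minimal illustration, not the Emboss implementation. The names IsPackedBcd, BcdToBinary, BinaryToBcd, MaxBcdValue and BcdExamples are hypothetical and are not part of the Emboss API. The sketch assumes it is acceptable to widen every input to 64 bits, which sidesteps the integer-promotion problem that the header handles by recursing into IsBcd<unsigned>.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: true if every 4-bit nibble of x holds a digit 0-9.
template <typename T>
bool IsPackedBcd(T x) {
  static_assert(std::is_unsigned<T>::value, "unsigned types only");
  // Widen to 64 bits so no promotion to signed int can occur.  The extra
  // high nibbles are all zero, which is valid BCD.
  const std::uint64_t v = x;
  const std::uint64_t sixes = ~std::uint64_t{0} / 0xf * 0x6;   // 0x6666...
  const std::uint64_t eights = ~std::uint64_t{0} / 0xf * 0x8;  // 0x8888...
  // (~v - sixes) sets a nibble's high bit for values 0, 1 and 10-15;
  // v itself has the high bit set for values 8-15.  The AND keeps 10-15.
  return (((~v - sixes) & v) & eights) == 0;
}

// Hypothetical helper: decode `bits` bits of packed BCD, lowest digit first.
inline std::uint64_t BcdToBinary(std::uint64_t bcd, int bits) {
  std::uint64_t result = 0;
  std::uint64_t multiplier = 1;
  for (int shift = 0; shift < bits; shift += 4) {
    result += ((bcd >> shift) & 0xf) * multiplier;
    multiplier *= 10;
  }
  return result;
}

// Hypothetical helper: encode a binary value as packed BCD in `bits` bits.
// Digits that do not fit in `bits` are silently dropped.
inline std::uint64_t BinaryToBcd(std::uint64_t value, int bits) {
  std::uint64_t bcd = 0;
  for (int shift = 0; shift < bits; shift += 4) {
    bcd |= (value % 10) << shift;
    value /= 10;
  }
  return bcd;
}

// Hypothetical helper: largest value representable as BCD in `bits` bits.
// A partial top nibble (bits % 4 != 0) holds a plain binary digit.
constexpr std::uint64_t MaxBcdValue(int bits) {
  return bits < 4 ? (std::uint64_t{1} << bits) - 1
                  : 10 * (MaxBcdValue(bits - 4) + 1) - 1;
}

// The two worked examples from the IsBcd comments, plus a round trip.
inline void BcdExamples() {
  assert(!IsPackedBcd<std::uint16_t>(0x09a8));  // nibble 0xa is not a digit
  assert(IsPackedBcd<std::uint16_t>(0x1289));
  assert(BcdToBinary(BinaryToBcd(1289, 16), 16) == 1289);
}
```

Widening to 64 bits trades a little work on narrow types for a single code path. The header instead keeps the computation in ValueType where it can, which is why it needs the explicit sizeof(ValueType) < sizeof(unsigned) branch.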
diff --git a/public/emboss_prelude_test.cc b/public/emboss_prelude_test.cc
new file mode 100644
index 0000000..97658b1
--- /dev/null
+++ b/public/emboss_prelude_test.cc
@@ -0,0 +1,727 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "public/emboss_prelude.h"
+
+#include <type_traits>
+
+#include "public/emboss_cpp_util.h"
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace prelude {
+namespace test {
+
+using ::emboss::support::OffsetBitBlock;
+using ::emboss::support::ReadWriteContiguousBuffer;
+
+template <size_t kBits>
+using BitBlockN = ::emboss::support::BitBlock<
+ ::emboss::support::LittleEndianByteOrderer<ReadWriteContiguousBuffer>,
+ kBits>;
+
+template <size_t kBits>
+using ViewParameters = ::emboss::support::FixedSizeViewParameters<
+ kBits, ::emboss::support::AllValuesAreOk>;
+
+TEST(FlagView, Methods) {
+ uint8_t byte = 0;
+ auto flag_view =
+ FlagView<ViewParameters<1>, OffsetBitBlock<BitBlockN<8>>>{BitBlockN<8>{
+ ReadWriteContiguousBuffer{&byte, 1}}.GetOffsetStorage<1, 0>(0, 1)};
+ EXPECT_FALSE(flag_view.Read());
+ byte = 0xfe;
+ EXPECT_FALSE(flag_view.Read());
+ byte = 0x01;
+ EXPECT_TRUE(flag_view.Read());
+ byte = 0xff;
+ EXPECT_TRUE(flag_view.Read());
+ EXPECT_TRUE(flag_view.CouldWriteValue(false));
+ EXPECT_TRUE(flag_view.CouldWriteValue(true));
+ flag_view.Write(false);
+ EXPECT_EQ(0xfe, byte);
+ byte = 0xaa;
+ flag_view.Write(true);
+ EXPECT_EQ(0xab, byte);
+}
+
+TEST(FlagView, TextDecode) {
+ uint8_t byte = 0;
+ const auto flag_view =
+ FlagView<ViewParameters<1>, OffsetBitBlock<BitBlockN<8>>>{BitBlockN<8>{
+ ReadWriteContiguousBuffer{&byte, 1}}.GetOffsetStorage<1, 0>(0, 1)};
+ EXPECT_FALSE(UpdateFromText(flag_view, ""));
+ EXPECT_FALSE(UpdateFromText(flag_view, "FALSE"));
+ EXPECT_FALSE(UpdateFromText(flag_view, "TRUE"));
+ EXPECT_FALSE(UpdateFromText(flag_view, "+true"));
+ EXPECT_TRUE(UpdateFromText(flag_view, "true"));
+ EXPECT_EQ(0x01, byte);
+ EXPECT_TRUE(UpdateFromText(flag_view, "false"));
+ EXPECT_EQ(0x00, byte);
+ EXPECT_TRUE(UpdateFromText(flag_view, " true"));
+ EXPECT_EQ(0x01, byte);
+ {
+ auto stream = support::TextStream{" false xxx"};
+ EXPECT_TRUE(flag_view.UpdateFromTextStream(&stream));
+ EXPECT_EQ(0x00, byte);
+ ::std::string token;
+ EXPECT_TRUE(::emboss::support::ReadToken(&stream, &token));
+ EXPECT_EQ("xxx", token);
+ }
+}
+
+TEST(FlagView, TextEncode) {
+ uint8_t byte = 0;
+ const auto flag_view =
+ FlagView<ViewParameters<1>, OffsetBitBlock<BitBlockN<8>>>{BitBlockN<8>{
+ ReadWriteContiguousBuffer{&byte, 1}}.GetOffsetStorage<1, 0>(0, 1)};
+ EXPECT_EQ("false", WriteToString(flag_view));
+ byte = 1;
+ EXPECT_EQ("true", WriteToString(flag_view));
+}
+
+template <template <typename, typename> class ViewType, int kMaxBits>
+void CheckViewSizeInBits() {
+ const int size_in_bits =
+ ViewType<ViewParameters<kMaxBits>, BitBlockN<64>>::SizeInBits();
+ EXPECT_EQ(size_in_bits, kMaxBits);
+ return CheckViewSizeInBits<ViewType, kMaxBits - 1>();
+}
+
+template <>
+void CheckViewSizeInBits<UIntView, 0>() {
+ return;
+}
+
+template <>
+void CheckViewSizeInBits<IntView, 0>() {
+ return;
+}
+
+template <>
+void CheckViewSizeInBits<BcdView, 0>() {
+ return;
+}
+
+TEST(UIntView, SizeInBits) { CheckViewSizeInBits<UIntView, 64>(); }
+
+TEST(IntView, SizeInBits) { CheckViewSizeInBits<IntView, 64>(); }
+
+TEST(BcdView, SizeInBits) { CheckViewSizeInBits<BcdView, 64>(); }
+
+template <size_t kBits>
+using UIntViewN = UIntView<ViewParameters<kBits>, BitBlockN<kBits>>;
+
+TEST(UIntView, ValueType) {
+ using BitBlockType = BitBlockN<64>;
+ EXPECT_TRUE(
+ (::std::is_same<uint8_t, UIntView<ViewParameters<8>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<uint8_t, UIntView<ViewParameters<6>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<uint16_t, UIntView<ViewParameters<9>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<uint16_t, UIntView<ViewParameters<16>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<uint32_t, UIntView<ViewParameters<17>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<uint32_t, UIntView<ViewParameters<32>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<uint64_t, UIntView<ViewParameters<33>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<uint64_t, UIntView<ViewParameters<64>,
+ BitBlockType>::ValueType>::value));
+}
+
+TEST(UIntView, CouldWriteValue) {
+ EXPECT_TRUE(UIntViewN<8>::CouldWriteValue(0xff));
+ EXPECT_TRUE(UIntViewN<8>::CouldWriteValue(0));
+ EXPECT_FALSE(UIntViewN<8>::CouldWriteValue(0x100));
+ EXPECT_FALSE(UIntViewN<8>::CouldWriteValue(-1));
+ EXPECT_TRUE(UIntViewN<16>::CouldWriteValue(0xffff));
+ EXPECT_TRUE(UIntViewN<16>::CouldWriteValue(0));
+ EXPECT_FALSE(UIntViewN<16>::CouldWriteValue(0x10000));
+ EXPECT_FALSE(UIntViewN<16>::CouldWriteValue(-1));
+ EXPECT_TRUE(UIntViewN<32>::CouldWriteValue(0xffffffffU));
+ EXPECT_TRUE(UIntViewN<32>::CouldWriteValue(0xffffffffL));
+ EXPECT_TRUE(UIntViewN<32>::CouldWriteValue(0));
+ EXPECT_FALSE(UIntViewN<32>::CouldWriteValue(0x100000000L));
+ EXPECT_FALSE(UIntViewN<32>::CouldWriteValue(-1));
+ EXPECT_TRUE(UIntViewN<48>::CouldWriteValue(0x0000ffffffffffffUL));
+ EXPECT_TRUE(UIntViewN<48>::CouldWriteValue(0x0000ffffffffffffL));
+ EXPECT_TRUE(UIntViewN<48>::CouldWriteValue(0));
+ EXPECT_FALSE(UIntViewN<48>::CouldWriteValue(0x1000000000000UL));
+ EXPECT_FALSE(UIntViewN<48>::CouldWriteValue(0x1000000000000L));
+ EXPECT_FALSE(UIntViewN<48>::CouldWriteValue(-1));
+ EXPECT_TRUE(UIntViewN<64>::CouldWriteValue(0xffffffffffffffffUL));
+ EXPECT_TRUE(UIntViewN<64>::CouldWriteValue(0));
+ EXPECT_FALSE(UIntViewN<64>::CouldWriteValue(-1));
+}
+
+TEST(UIntView, CouldWriteValueNarrowing) {
+ auto narrowing_could_write = [](int value) {
+ return UIntViewN<8>::CouldWriteValue(value);
+ };
+ EXPECT_TRUE(narrowing_could_write(0));
+ EXPECT_TRUE(narrowing_could_write(255));
+ EXPECT_FALSE(narrowing_could_write(-1));
+ EXPECT_FALSE(narrowing_could_write(256));
+}
+
+TEST(UIntView, ReadAndWriteWithSufficientBuffer) {
+ ::std::vector<uint8_t> bytes = {
+ {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08}};
+ auto uint64_view =
+ UIntViewN<64>{BitBlockN<64>{ReadWriteContiguousBuffer{bytes.data(), 8}}};
+ EXPECT_EQ(0x090a0b0c0d0e0f10UL, uint64_view.Read());
+ EXPECT_EQ(0x090a0b0c0d0e0f10UL, uint64_view.UncheckedRead());
+ uint64_view.Write(0x100f0e0d0c0b0a09UL);
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10, 0x08}}),
+ bytes);
+ uint64_view.UncheckedWrite(0x090a0b0c0d0e0f10UL);
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08}}),
+ bytes);
+ EXPECT_TRUE(uint64_view.TryToWrite(0x100f0e0d0c0b0a09UL));
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10, 0x08}}),
+ bytes);
+ EXPECT_TRUE(uint64_view.TryToWrite(0x090a0b0c0d0e0f10UL));
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08}}),
+ bytes);
+ EXPECT_TRUE(uint64_view.Ok());
+ EXPECT_TRUE(uint64_view.IsComplete());
+}
+
+TEST(UIntView, ReadAndWriteWithInsufficientBuffer) {
+ ::std::vector<uint8_t> bytes = {
+ {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08}};
+ auto uint64_view =
+ UIntViewN<64>{BitBlockN<64>{ReadWriteContiguousBuffer{bytes.data(), 4}}};
+ EXPECT_DEATH(uint64_view.Read(), "");
+ EXPECT_EQ(0x090a0b0c0d0e0f10UL, uint64_view.UncheckedRead());
+ EXPECT_DEATH(uint64_view.Write(0x100f0e0d0c0b0a09UL), "");
+ EXPECT_FALSE(uint64_view.TryToWrite(0x100f0e0d0c0b0a09UL));
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08}}),
+ bytes);
+ uint64_view.UncheckedWrite(0x100f0e0d0c0b0a09UL);
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10, 0x08}}),
+ bytes);
+ EXPECT_FALSE(uint64_view.Ok());
+ EXPECT_FALSE(uint64_view.IsComplete());
+ uint64_view.UncheckedWrite(0x090a0b0c0d0e0f10UL);
+}
+
+TEST(UIntView, NonPowerOfTwoSize) {
+ ::std::vector<uint8_t> bytes = {{0x10, 0x0f, 0x0e, 0x0d}};
+ auto uint24_view =
+ UIntViewN<24>{BitBlockN<24>{ReadWriteContiguousBuffer{bytes.data(), 3}}};
+ EXPECT_EQ(0x0e0f10, uint24_view.Read());
+ EXPECT_EQ(0x0e0f10, uint24_view.UncheckedRead());
+ EXPECT_DEATH(uint24_view.Write(0x1000000), "");
+ uint24_view.Write(0x100f0e);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x0e, 0x0f, 0x10, 0x0d}}), bytes);
+ uint24_view.UncheckedWrite(0x1000000);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x00, 0x00, 0x00, 0x0d}}), bytes);
+ EXPECT_TRUE(uint24_view.Ok());
+ EXPECT_TRUE(uint24_view.IsComplete());
+}
+
+TEST(UIntView, NonPowerOfTwoSizeInsufficientBuffer) {
+ ::std::vector<uint8_t> bytes = {{0x10, 0x0f, 0x0e, 0x0d}};
+ auto uint24_view =
+ UIntViewN<24>{BitBlockN<24>{ReadWriteContiguousBuffer{bytes.data(), 2}}};
+ EXPECT_DEATH(uint24_view.Read(), "");
+ EXPECT_EQ(0x0e0f10, uint24_view.UncheckedRead());
+ EXPECT_DEATH(uint24_view.Write(0x100f0e), "");
+ uint24_view.UncheckedWrite(0x100f0e);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x0e, 0x0f, 0x10, 0x0d}}), bytes);
+ uint24_view.UncheckedWrite(0x1000000);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x00, 0x00, 0x00, 0x0d}}), bytes);
+ EXPECT_FALSE(uint24_view.Ok());
+ EXPECT_FALSE(uint24_view.IsComplete());
+}
+
+TEST(UIntView, NonByteSize) {
+ ::std::vector<uint8_t> bytes = {{0x00, 0x00, 0x80, 0x80}};
+ auto uint23_view =
+ UIntView<ViewParameters<23>, OffsetBitBlock<BitBlockN<24>>>{BitBlockN<24>{
+ ReadWriteContiguousBuffer{bytes.data(),
+ 3}}.GetOffsetStorage<1, 0>(0, 23)};
+ EXPECT_EQ(0x0, uint23_view.Read());
+ EXPECT_FALSE(uint23_view.CouldWriteValue(0x800f0e));
+ EXPECT_FALSE(uint23_view.CouldWriteValue(0x800000));
+ EXPECT_TRUE(uint23_view.CouldWriteValue(0x7fffff));
+ EXPECT_DEATH(uint23_view.Write(0x800f0e), "");
+ uint23_view.Write(0x400f0e);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x0e, 0x0f, 0xc0, 0x80}}), bytes);
+ uint23_view.UncheckedWrite(0x1000000);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x00, 0x00, 0x80, 0x80}}), bytes);
+ EXPECT_TRUE(uint23_view.Ok());
+ EXPECT_TRUE(uint23_view.IsComplete());
+}
+
+TEST(UIntView, TextDecode) {
+ ::std::vector<uint8_t> bytes = {{0x00, 0x00, 0x00, 0xff}};
+ const auto uint24_view =
+ UIntViewN<24>{BitBlockN<24>{ReadWriteContiguousBuffer{bytes.data(), 3}}};
+ EXPECT_TRUE(UpdateFromText(uint24_view, "23"));
+ EXPECT_EQ((::std::vector<uint8_t>{{23, 0x00, 0x00, 0xff}}), bytes);
+ EXPECT_EQ(23, uint24_view.Read());
+ EXPECT_FALSE(UpdateFromText(uint24_view, "16777216"));
+ EXPECT_EQ((::std::vector<uint8_t>{{23, 0x00, 0x00, 0xff}}), bytes);
+ EXPECT_TRUE(UpdateFromText(uint24_view, "16777215"));
+ EXPECT_EQ((::std::vector<uint8_t>{{0xff, 0xff, 0xff, 0xff}}), bytes);
+ EXPECT_TRUE(UpdateFromText(uint24_view, "0x01_0203"));
+ EXPECT_EQ((::std::vector<uint8_t>{{0x03, 0x02, 0x01, 0xff}}), bytes);
+}
+
+template <size_t kBits>
+using IntViewN = IntView<ViewParameters<kBits>, BitBlockN<kBits>>;
+
+TEST(IntView, ValueType) {
+ using BitBlockType = BitBlockN<64>;
+ EXPECT_TRUE(
+ (::std::is_same<
+ int8_t, IntView<ViewParameters<8>, BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<
+ int8_t, IntView<ViewParameters<6>, BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<int16_t, IntView<ViewParameters<9>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<int16_t, IntView<ViewParameters<16>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<int32_t, IntView<ViewParameters<17>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<int32_t, IntView<ViewParameters<32>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<int64_t, IntView<ViewParameters<33>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<int64_t, IntView<ViewParameters<64>,
+ BitBlockType>::ValueType>::value));
+}
+
+TEST(IntView, CouldWriteValue) {
+ // Note that many values are in decimal in order to avoid C++'s implicit
+ // conversions to unsigned for hex constants.
+ EXPECT_TRUE(IntViewN<8>::CouldWriteValue(0x7f));
+ EXPECT_TRUE(IntViewN<8>::CouldWriteValue(-0x80));
+ EXPECT_FALSE(IntViewN<8>::CouldWriteValue(0x80));
+ EXPECT_FALSE(IntViewN<8>::CouldWriteValue(0x8000000000000000UL));
+ EXPECT_FALSE(IntViewN<8>::CouldWriteValue(-0x81));
+ EXPECT_TRUE(IntViewN<16>::CouldWriteValue(32767));
+ EXPECT_TRUE(IntViewN<16>::CouldWriteValue(0));
+ EXPECT_FALSE(IntViewN<16>::CouldWriteValue(0x8000));
+ EXPECT_FALSE(IntViewN<16>::CouldWriteValue(-0x8001));
+ EXPECT_TRUE(IntViewN<32>::CouldWriteValue(0x7fffffffU));
+ EXPECT_TRUE(IntViewN<32>::CouldWriteValue(0x7fffffffL));
+ EXPECT_FALSE(IntViewN<32>::CouldWriteValue(0x80000000U));
+ EXPECT_FALSE(IntViewN<32>::CouldWriteValue(-2147483649L));
+ EXPECT_TRUE(IntViewN<48>::CouldWriteValue(0x00007fffffffffffUL));
+ EXPECT_FALSE(IntViewN<48>::CouldWriteValue(140737488355328L));
+ EXPECT_FALSE(IntViewN<48>::CouldWriteValue(-140737488355329L));
+ EXPECT_TRUE(IntViewN<64>::CouldWriteValue(0x7fffffffffffffffUL));
+ EXPECT_TRUE(IntViewN<64>::CouldWriteValue(9223372036854775807L));
+ EXPECT_TRUE(IntViewN<64>::CouldWriteValue(-9223372036854775807L - 1));
+ EXPECT_FALSE(IntViewN<64>::CouldWriteValue(0x8000000000000000UL));
+}
+
+TEST(IntView, CouldWriteValueNarrowing) {
+ auto narrowing_could_write = [](int value) {
+ return IntViewN<8>::CouldWriteValue(value);
+ };
+ EXPECT_TRUE(narrowing_could_write(-128));
+ EXPECT_TRUE(narrowing_could_write(127));
+ EXPECT_FALSE(narrowing_could_write(-129));
+ EXPECT_FALSE(narrowing_could_write(128));
+}
+
+TEST(IntView, ReadAndWriteWithSufficientBuffer) {
+ ::std::vector<uint8_t> bytes = {
+ {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08}};
+ auto int64_view =
+ IntViewN<64>{BitBlockN<64>{ReadWriteContiguousBuffer{bytes.data(), 8}}};
+ EXPECT_EQ(0x090a0b0c0d0e0f10L, int64_view.Read());
+ EXPECT_EQ(0x090a0b0c0d0e0f10L, int64_view.UncheckedRead());
+ int64_view.Write(0x100f0e0d0c0b0a09L);
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10, 0x08}}),
+ bytes);
+ int64_view.UncheckedWrite(0x090a0b0c0d0e0f10L);
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08}}),
+ bytes);
+ EXPECT_TRUE(int64_view.TryToWrite(0x100f0e0d0c0b0a09L));
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10, 0x08}}),
+ bytes);
+ int64_view.Write(-0x100f0e0d0c0b0a09L);
+ EXPECT_EQ(-0x100f0e0d0c0b0a09L, int64_view.Read());
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0xf7, 0xf5, 0xf4, 0xf3, 0xf2, 0xf1, 0xf0, 0xef, 0x08}}),
+ bytes);
+ EXPECT_TRUE(int64_view.Ok());
+ EXPECT_TRUE(int64_view.IsComplete());
+}
+
+TEST(IntView, ReadAndWriteWithInsufficientBuffer) {
+ ::std::vector<uint8_t> bytes = {
+ {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08}};
+ auto int64_view =
+ IntViewN<64>{BitBlockN<64>{ReadWriteContiguousBuffer{bytes.data(), 4}}};
+ EXPECT_DEATH(int64_view.Read(), "");
+ EXPECT_EQ(0x090a0b0c0d0e0f10L, int64_view.UncheckedRead());
+ EXPECT_DEATH(int64_view.Write(0x100f0e0d0c0b0a09L), "");
+ EXPECT_FALSE(int64_view.TryToWrite(0x100f0e0d0c0b0a09L));
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x10, 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08}}),
+ bytes);
+ int64_view.UncheckedWrite(0x100f0e0d0c0b0a09L);
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10, 0x08}}),
+ bytes);
+ EXPECT_FALSE(int64_view.Ok());
+ EXPECT_FALSE(int64_view.IsComplete());
+}
+
+TEST(IntView, NonPowerOfTwoSize) {
+ ::std::vector<uint8_t> bytes = {{0x10, 0x0f, 0x0e, 0x0d}};
+ auto int24_view =
+ IntViewN<24>{BitBlockN<24>{ReadWriteContiguousBuffer{bytes.data(), 3}}};
+ EXPECT_EQ(0x0e0f10, int24_view.Read());
+ EXPECT_EQ(0x0e0f10, int24_view.UncheckedRead());
+ EXPECT_DEATH(int24_view.Write(0x1000000), "");
+ int24_view.Write(0x100f0e);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x0e, 0x0f, 0x10, 0x0d}}), bytes);
+ int24_view.Write(-0x100f0e);
+ EXPECT_EQ((::std::vector<uint8_t>{{0xf2, 0xf0, 0xef, 0x0d}}), bytes);
+ EXPECT_DEATH(int24_view.Write(0x1000000), "");
+ int24_view.UncheckedWrite(0x1000000);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x00, 0x00, 0x00, 0x0d}}), bytes);
+ EXPECT_TRUE(int24_view.Ok());
+ EXPECT_TRUE(int24_view.IsComplete());
+}
+
+TEST(IntView, NonPowerOfTwoSizeInsufficientBuffer) {
+ ::std::vector<uint8_t> bytes = {{0x10, 0x0f, 0x0e, 0x0d}};
+ auto int24_view =
+ IntViewN<24>{BitBlockN<24>{ReadWriteContiguousBuffer{bytes.data(), 2}}};
+ EXPECT_DEATH(int24_view.Read(), "");
+ EXPECT_EQ(0x0e0f10, int24_view.UncheckedRead());
+ EXPECT_DEATH(int24_view.Write(0x100f0e), "");
+ int24_view.UncheckedWrite(0x100f0e);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x0e, 0x0f, 0x10, 0x0d}}), bytes);
+ int24_view.UncheckedWrite(0x1000000);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x00, 0x00, 0x00, 0x0d}}), bytes);
+ EXPECT_FALSE(int24_view.Ok());
+ EXPECT_FALSE(int24_view.IsComplete());
+}
+
+TEST(IntView, NonByteSize) {
+ ::std::vector<uint8_t> bytes = {{0x00, 0x00, 0x80, 0x80}};
+ auto int23_view =
+ IntView<ViewParameters<23>, OffsetBitBlock<BitBlockN<24>>>{BitBlockN<24>{
+ ReadWriteContiguousBuffer{bytes.data(),
+ 3}}.GetOffsetStorage<1, 0>(0, 23)};
+ EXPECT_EQ(0x0, int23_view.Read());
+ EXPECT_FALSE(int23_view.CouldWriteValue(0x400f0e));
+ EXPECT_DEATH(int23_view.Write(0x400f0e), "");
+ int23_view.Write(0x200f0e);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x0e, 0x0f, 0xa0, 0x80}}), bytes);
+ int23_view.Write(-0x400000);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x00, 0x00, 0xc0, 0x80}}), bytes);
+ int23_view.UncheckedWrite(0x1000000);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x00, 0x00, 0x80, 0x80}}), bytes);
+ EXPECT_TRUE(int23_view.Ok());
+ EXPECT_TRUE(int23_view.IsComplete());
+}
+
+TEST(IntView, OneBit) {
+ uint8_t bytes[] = {0xfe};
+ auto int1_view =
+ IntView<ViewParameters<1>, OffsetBitBlock<BitBlockN<8>>>{BitBlockN<8>{
+ ReadWriteContiguousBuffer{bytes, 1}}.GetOffsetStorage<1, 0>(0, 1)};
+ EXPECT_TRUE(int1_view.Ok());
+ EXPECT_TRUE(int1_view.IsComplete());
+ EXPECT_EQ(0, int1_view.Read());
+ EXPECT_FALSE(int1_view.CouldWriteValue(1));
+ EXPECT_TRUE(int1_view.CouldWriteValue(0));
+ EXPECT_TRUE(int1_view.CouldWriteValue(-1));
+ EXPECT_DEATH(int1_view.Write(1), "");
+ int1_view.Write(-1);
+ EXPECT_EQ(0xff, bytes[0]);
+ EXPECT_EQ(-1, int1_view.Read());
+ int1_view.Write(0);
+ EXPECT_EQ(0xfe, bytes[0]);
+ bytes[0] = 0;
+ int1_view.Write(-1);
+ EXPECT_EQ(0x01, bytes[0]);
+}
+
+TEST(IntView, TextDecode) {
+ ::std::vector<uint8_t> bytes = {{0x00, 0x00, 0x00, 0xff}};
+ const auto int24_view =
+ IntViewN<24>{BitBlockN<24>{ReadWriteContiguousBuffer{bytes.data(), 3}}};
+ EXPECT_TRUE(UpdateFromText(int24_view, "23"));
+ EXPECT_EQ((::std::vector<uint8_t>{{23, 0x00, 0x00, 0xff}}), bytes);
+ EXPECT_EQ(23, int24_view.Read());
+ EXPECT_FALSE(UpdateFromText(int24_view, "16777216"));
+ EXPECT_EQ((::std::vector<uint8_t>{{23, 0x00, 0x00, 0xff}}), bytes);
+ EXPECT_FALSE(UpdateFromText(int24_view, "16777215"));
+ EXPECT_EQ((::std::vector<uint8_t>{{23, 0x00, 0x00, 0xff}}), bytes);
+ EXPECT_FALSE(UpdateFromText(int24_view, "8388608"));
+ EXPECT_EQ((::std::vector<uint8_t>{{23, 0x00, 0x00, 0xff}}), bytes);
+ EXPECT_TRUE(UpdateFromText(int24_view, "8388607"));
+ EXPECT_EQ((::std::vector<uint8_t>{{0xff, 0xff, 0x7f, 0xff}}), bytes);
+ EXPECT_TRUE(UpdateFromText(int24_view, "-8388608"));
+ EXPECT_EQ((::std::vector<uint8_t>{{0x00, 0x00, 0x80, 0xff}}), bytes);
+ EXPECT_TRUE(UpdateFromText(int24_view, "-1"));
+ EXPECT_EQ((::std::vector<uint8_t>{{0xff, 0xff, 0xff, 0xff}}), bytes);
+ EXPECT_TRUE(UpdateFromText(int24_view, "0x01_0203"));
+ EXPECT_EQ((::std::vector<uint8_t>{{0x03, 0x02, 0x01, 0xff}}), bytes);
+ EXPECT_TRUE(UpdateFromText(int24_view, "-0x01_0203"));
+ EXPECT_EQ((::std::vector<uint8_t>{{0xfd, 0xfd, 0xfe, 0xff}}), bytes);
+ EXPECT_FALSE(UpdateFromText(int24_view, "- 0x01_0203"));
+ EXPECT_EQ((::std::vector<uint8_t>{{0xfd, 0xfd, 0xfe, 0xff}}), bytes);
+}
+
+TEST(MaxBcd, Values) {
+ EXPECT_EQ(0, MaxBcd<uint64_t>(0));
+ EXPECT_EQ(1, MaxBcd<uint64_t>(1));
+ EXPECT_EQ(3, MaxBcd<uint64_t>(2));
+ EXPECT_EQ(7, MaxBcd<uint64_t>(3));
+ EXPECT_EQ(9, MaxBcd<uint64_t>(4));
+ EXPECT_EQ(19, MaxBcd<uint64_t>(5));
+ EXPECT_EQ(39, MaxBcd<uint64_t>(6));
+ EXPECT_EQ(79, MaxBcd<uint64_t>(7));
+ EXPECT_EQ(99, MaxBcd<uint64_t>(8));
+ EXPECT_EQ(199, MaxBcd<uint64_t>(9));
+ EXPECT_EQ(999, MaxBcd<uint64_t>(12));
+ EXPECT_EQ(9999, MaxBcd<uint64_t>(16));
+ EXPECT_EQ(999999, MaxBcd<uint64_t>(24));
+ EXPECT_EQ(3999999999999999UL, MaxBcd<uint64_t>(62));
+ EXPECT_EQ(7999999999999999UL, MaxBcd<uint64_t>(63));
+ EXPECT_EQ(9999999999999999UL, MaxBcd<uint64_t>(64));
+ // Max uint64_t is 18446744073709551615, which is big enough to hold a 76-bit
+ // BCD value.
+ EXPECT_EQ(19999999999999999UL, MaxBcd<uint64_t>(65));
+ EXPECT_EQ(39999999999999999UL, MaxBcd<uint64_t>(66));
+ EXPECT_EQ(99999999999999999UL, MaxBcd<uint64_t>(68));
+ EXPECT_EQ(999999999999999999UL, MaxBcd<uint64_t>(72));
+ EXPECT_EQ(9999999999999999999UL, MaxBcd<uint64_t>(76));
+}
+
+TEST(IsBcd, Values) {
+ EXPECT_TRUE(IsBcd(0x00U));
+ EXPECT_TRUE(IsBcd(0x12U));
+ EXPECT_TRUE(IsBcd(0x91U));
+ EXPECT_TRUE(IsBcd(0x99U));
+ EXPECT_TRUE(IsBcd(uint8_t{0x00}));
+ EXPECT_TRUE(IsBcd(uint8_t{0x99}));
+ EXPECT_TRUE(IsBcd(uint16_t{0x0000}));
+ EXPECT_TRUE(IsBcd(uint16_t{0x9999}));
+ EXPECT_TRUE(IsBcd(0x9999999999999999UL));
+ EXPECT_FALSE(IsBcd(uint8_t{0x0a}));
+ EXPECT_FALSE(IsBcd(uint8_t{0xa0}));
+ EXPECT_FALSE(IsBcd(uint8_t{0xff}));
+ EXPECT_FALSE(IsBcd(uint16_t{0x0a00}));
+ EXPECT_FALSE(IsBcd(uint16_t{0x000a}));
+ EXPECT_FALSE(IsBcd(0x999999999999999aUL));
+ EXPECT_FALSE(IsBcd(0xaUL));
+ EXPECT_FALSE(IsBcd(0xa000000000000000UL));
+ EXPECT_FALSE(IsBcd(0xf000000000000000UL));
+ EXPECT_FALSE(IsBcd(0xffffffffffffffffUL));
+}
+
+TEST(BcdView, ValueType) {
+ using BitBlockType = BitBlockN<64>;
+ EXPECT_TRUE(
+ (::std::is_same<uint8_t, BcdView<ViewParameters<8>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<uint8_t, BcdView<ViewParameters<6>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<uint16_t, BcdView<ViewParameters<9>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<uint16_t, BcdView<ViewParameters<16>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<uint32_t, BcdView<ViewParameters<17>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<uint32_t, BcdView<ViewParameters<32>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<uint64_t, BcdView<ViewParameters<33>,
+ BitBlockType>::ValueType>::value));
+ EXPECT_TRUE(
+ (::std::is_same<uint64_t, BcdView<ViewParameters<64>,
+ BitBlockType>::ValueType>::value));
+}
+
+TEST(BcdView, CouldWriteValue) {
+ EXPECT_TRUE((BcdView<ViewParameters<64>, int>::CouldWriteValue(0)));
+ EXPECT_TRUE(
+ (BcdView<ViewParameters<64>, int>::CouldWriteValue(9999999999999999)));
+ EXPECT_FALSE(
+ (BcdView<ViewParameters<64>, int>::CouldWriteValue(10000000000000000)));
+ EXPECT_FALSE((
+ BcdView<ViewParameters<64>, int>::CouldWriteValue(0xffffffffffffffffUL)));
+ EXPECT_FALSE(
+ (BcdView<ViewParameters<48>, int>::CouldWriteValue(9999999999999999)));
+ EXPECT_TRUE(
+ (BcdView<ViewParameters<48>, int>::CouldWriteValue(999999999999)));
+ EXPECT_TRUE((BcdView<ViewParameters<48>, int>::CouldWriteValue(0)));
+ EXPECT_FALSE((BcdView<ViewParameters<48>, int>::CouldWriteValue(
+ (0xffUL << 48) + 999999999999)));
+ EXPECT_FALSE(
+ (BcdView<ViewParameters<48>, int>::CouldWriteValue(10000000000000000)));
+ EXPECT_FALSE((
+ BcdView<ViewParameters<48>, int>::CouldWriteValue(0xffffffffffffffffUL)));
+}
+
+template <size_t kBits>
+using BcdViewN = BcdView<ViewParameters<kBits>, BitBlockN<kBits>>;
+
+TEST(BcdView, ReadAndWriteWithSufficientBuffer) {
+ ::std::vector<uint8_t> bytes = {
+ {0x16, 0x15, 0x14, 0x13, 0x12, 0x11, 0x10, 0x09, 0x08}};
+ auto bcd64_view =
+ BcdViewN<64>{BitBlockN<64>{ReadWriteContiguousBuffer{bytes.data(), 8}}};
+ EXPECT_EQ(910111213141516UL, bcd64_view.Read());
+ EXPECT_EQ(910111213141516UL, bcd64_view.UncheckedRead());
+ bcd64_view.Write(1615141312111009);
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x09, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x08}}),
+ bytes);
+ bcd64_view.UncheckedWrite(910111213141516UL);
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x16, 0x15, 0x14, 0x13, 0x12, 0x11, 0x10, 0x09, 0x08}}),
+ bytes);
+ EXPECT_TRUE(bcd64_view.TryToWrite(1615141312111009));
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x09, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x08}}),
+ bytes);
+ EXPECT_TRUE(bcd64_view.Ok());
+ EXPECT_TRUE(bcd64_view.IsComplete());
+}
+
+TEST(BcdView, ReadAndWriteWithInsufficientBuffer) {
+ ::std::vector<uint8_t> bytes = {
+ {0x16, 0x15, 0x14, 0x13, 0x12, 0x11, 0x10, 0x09, 0x08}};
+ auto bcd64_view =
+ BcdViewN<64>{BitBlockN<64>{ReadWriteContiguousBuffer{bytes.data(), 4}}};
+ EXPECT_DEATH(bcd64_view.Read(), "");
+ EXPECT_EQ(910111213141516UL, bcd64_view.UncheckedRead());
+ EXPECT_DEATH(bcd64_view.Write(1615141312111009), "");
+ EXPECT_FALSE(bcd64_view.TryToWrite(1615141312111009));
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x16, 0x15, 0x14, 0x13, 0x12, 0x11, 0x10, 0x09, 0x08}}),
+ bytes);
+ bcd64_view.UncheckedWrite(1615141312111009);
+ EXPECT_EQ((::std::vector<uint8_t>{
+ {0x09, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x08}}),
+ bytes);
+ EXPECT_FALSE(bcd64_view.Ok());
+ EXPECT_FALSE(bcd64_view.IsComplete());
+}
+
+TEST(BcdView, NonPowerOfTwoSize) {
+ ::std::vector<uint8_t> bytes = {{0x16, 0x15, 0x14, 0x13}};
+ auto bcd24_view =
+ BcdViewN<24>{BitBlockN<24>{ReadWriteContiguousBuffer{bytes.data(), 3}}};
+ EXPECT_EQ(141516, bcd24_view.Read());
+ EXPECT_EQ(141516, bcd24_view.UncheckedRead());
+ bcd24_view.Write(161514);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x14, 0x15, 0x16, 0x13}}), bytes);
+ EXPECT_DEATH(bcd24_view.Write(1000000), "");
+ bcd24_view.UncheckedWrite(1000000);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x00, 0x00, 0x00, 0x13}}), bytes);
+ bcd24_view.UncheckedWrite(141516);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x16, 0x15, 0x14, 0x13}}), bytes);
+ EXPECT_TRUE(bcd24_view.Ok());
+ EXPECT_TRUE(bcd24_view.IsComplete());
+}
+
+TEST(BcdView, NonPowerOfTwoSizeInsufficientBuffer) {
+ ::std::vector<uint8_t> bytes = {{0x16, 0x15, 0x14, 0x13}};
+ auto bcd24_view =
+ BcdViewN<24>{BitBlockN<24>{ReadWriteContiguousBuffer{bytes.data(), 2}}};
+ EXPECT_DEATH(bcd24_view.Read(), "");
+ EXPECT_EQ(141516, bcd24_view.UncheckedRead());
+ EXPECT_DEATH(bcd24_view.Write(161514), "");
+ bcd24_view.UncheckedWrite(161514);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x14, 0x15, 0x16, 0x13}}), bytes);
+ bcd24_view.UncheckedWrite(1000000);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x00, 0x00, 0x00, 0x13}}), bytes);
+ EXPECT_FALSE(bcd24_view.Ok());
+ EXPECT_FALSE(bcd24_view.IsComplete());
+}
+
+TEST(BcdView, NonByteSize) {
+ ::std::vector<uint8_t> bytes = {{0x00, 0x00, 0x80, 0x80}};
+ auto bcd23_view =
+ BcdView<ViewParameters<23>, OffsetBitBlock<BitBlockN<24>>>{BitBlockN<24>{
+ ReadWriteContiguousBuffer{bytes.data(),
+ 3}}.GetOffsetStorage<1, 0>(0, 23)};
+ EXPECT_EQ(0x0, bcd23_view.Read());
+ EXPECT_FALSE(bcd23_view.CouldWriteValue(800000));
+ EXPECT_TRUE(bcd23_view.CouldWriteValue(799999));
+ EXPECT_DEATH(bcd23_view.Write(800000), "");
+ bcd23_view.Write(432198);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x98, 0x21, 0xc3, 0x80}}), bytes);
+ bcd23_view.UncheckedWrite(800000);
+ EXPECT_EQ((::std::vector<uint8_t>{{0x00, 0x00, 0x80, 0x80}}), bytes);
+ EXPECT_TRUE(bcd23_view.Ok());
+ EXPECT_TRUE(bcd23_view.IsComplete());
+}
+
+TEST(BcdLittleEndianView, AllByteValues) {
+ uint8_t byte = 0;
+ auto bcd8_view =
+ BcdViewN<8>{BitBlockN<8>{ReadWriteContiguousBuffer{&byte, 1}}};
+ for (int i = 0; i < 16; ++i) {
+ for (int j = 0; j < 16; ++j) {
+ byte = i * 16 + j;
+ if (i > 9 || j > 9) {
+ EXPECT_FALSE(bcd8_view.Ok()) << i << ", " << j;
+ } else {
+ EXPECT_TRUE(bcd8_view.Ok()) << i << ", " << j;
+ }
+ }
+ }
+}
+
+} // namespace test
+} // namespace prelude
+} // namespace emboss
diff --git a/public/emboss_test_util.h b/public/emboss_test_util.h
new file mode 100644
index 0000000..89f9855
--- /dev/null
+++ b/public/emboss_test_util.h
@@ -0,0 +1,98 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#ifndef EMBOSS_PUBLIC_EMBOSS_TEST_UTIL_H_
+#define EMBOSS_PUBLIC_EMBOSS_TEST_UTIL_H_
+
+#include <cctype>
+#include <iterator>
+#include <ostream>
+#include <string>
+
+#include "public/emboss_text_util.h"
+#include <gmock/gmock.h>
+#include <gtest/gtest.h>
+#include "third_party/absl/memory/memory.h"
+#include "third_party/googletest/googletest/include/gtest/internal/gtest-internal.h"
+
+namespace emboss {
+
+class EmbMatcher {
+ public:
+ template <typename ViewType>
+ explicit EmbMatcher(ViewType compare_to)
+ : compare_to_ok_(compare_to.Ok()),
+ compare_to_lines_(SplitToLines(
+ compare_to_ok_ ? WriteToString(compare_to, MultilineText()) : "")) {
+ }
+
+ template <typename ViewType>
+ bool MatchAndExplain(ViewType compare_from,
+ ::testing::MatchResultListener* listener) const {
+ if (!compare_to_ok_) {
+ *listener << "View for comparison to is not OK.";
+ return false;
+ }
+
+ if (!compare_from.Ok()) {
+ *listener << "View for comparison from is not OK.";
+ return false;
+ }
+
+ const auto compare_from_lines =
+ SplitToLines(WriteToString(compare_from, MultilineText()));
+ if (compare_from_lines != compare_to_lines_) {
+ *listener << "\n"
+ << ::testing::internal::edit_distance::CreateUnifiedDiff(
+ compare_to_lines_, compare_from_lines);
+ return false;
+ }
+
+ return true;
+ }
+
+ // Describes the property of a value matching this matcher.
+ void DescribeTo(::std::ostream* os) const { *os << "are equal"; }
+
+ // Describes the property of a value NOT matching this matcher.
+ void DescribeNegationTo(::std::ostream* os) const { *os << "are NOT equal"; }
+
+ private:
+ // Splits the given string on '\n' boundaries and returns a vector of the
+ // resulting lines. Everything up to the first '\n' is skipped.
+ ::std::vector<::std::string> SplitToLines(const ::std::string& input) const {
+ constexpr char kNewLine = '\n';
+
+ ::std::stringstream ss(input);
+ ss.ignore(::std::numeric_limits<::std::streamsize>::max(), kNewLine);
+
+ ::std::vector<::std::string> lines;
+ for (::std::string line; ::std::getline(ss, line, kNewLine);) {
+ lines.push_back(::std::move(line));
+ }
+ return lines;
+ }
+
+ const bool compare_to_ok_;
+ const ::std::vector<::std::string> compare_to_lines_;
+};
+
+template <typename ViewType>
+::testing::PolymorphicMatcher<EmbMatcher> EqualsEmb(ViewType view) {
+ return ::testing::MakePolymorphicMatcher(EmbMatcher(view));
+}
+
+} // namespace emboss
+
+#endif // EMBOSS_PUBLIC_EMBOSS_TEST_UTIL_H_
diff --git a/public/emboss_test_util_test.cc b/public/emboss_test_util_test.cc
new file mode 100644
index 0000000..71a34fe
--- /dev/null
+++ b/public/emboss_test_util_test.cc
@@ -0,0 +1,134 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "public/emboss_test_util.h"
+
+#include "testdata/complex_structure.emb.h"
+#include <gmock/gmock.h>
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace test {
+namespace {
+
+class EmbossTestUtilTest : public ::testing::Test {
+ protected:
+ EmbossTestUtilTest() { b_.s().Write(1); }
+ std::array<uint8_t, 64> buf_a_{};
+ ::emboss_test::ComplexWriter a_{&buf_a_};
+ std::array<char, 64> buf_b_{};
+ ::emboss_test::ComplexWriter b_{&buf_b_};
+};
+
+TEST_F(EmbossTestUtilTest, EqualsEmb) {
+ EXPECT_THAT(a_, EqualsEmb(a_));
+ EXPECT_THAT(b_, EqualsEmb(b_));
+
+ EXPECT_THAT(a_, ::testing::Not(EqualsEmb(b_)));
+ EXPECT_THAT(b_, ::testing::Not(EqualsEmb(a_)));
+}
+
+TEST_F(EmbossTestUtilTest, NotOkView) {
+ auto null_view = ::emboss_test::ComplexView(nullptr);
+ EXPECT_THAT(a_, ::testing::Not(EqualsEmb(null_view)));
+ EXPECT_THAT(b_, ::testing::Not(EqualsEmb(null_view)));
+ EXPECT_THAT(null_view, ::testing::Not(EqualsEmb(null_view)));
+ EXPECT_THAT(null_view, ::testing::Not(EqualsEmb(a_)));
+ EXPECT_THAT(null_view, ::testing::Not(EqualsEmb(b_)));
+}
+
+TEST_F(EmbossTestUtilTest, NotOkViewMatcherDescribe) {
+ auto null_view = ::emboss_test::ComplexView(nullptr);
+
+ ::testing::StringMatchResultListener listener;
+ EqualsEmb(a_).impl().MatchAndExplain(null_view, &listener);
+ EXPECT_EQ(listener.str(), "View for comparison from is not OK.");
+
+ listener.Clear();
+ EqualsEmb(null_view).impl().MatchAndExplain(a_, &listener);
+ EXPECT_EQ(listener.str(), "View for comparison to is not OK.");
+}
+
+TEST_F(EmbossTestUtilTest, MatcherDescribeEquivalent) {
+ ::std::stringstream ss;
+ EqualsEmb(a_).impl().DescribeTo(&ss);
+ EXPECT_EQ(ss.str(), "are equal");
+}
+
+TEST_F(EmbossTestUtilTest, MatcherDescribeNotEquivalent) {
+ ::std::stringstream ss;
+ EqualsEmb(a_).impl().DescribeNegationTo(&ss);
+ EXPECT_EQ(ss.str(), "are NOT equal");
+}
+
+TEST_F(EmbossTestUtilTest, MatcherExplainEquivalent) {
+ ::testing::StringMatchResultListener listener;
+
+ EqualsEmb(a_).impl().MatchAndExplain(a_, &listener);
+ EXPECT_EQ(listener.str(), "");
+
+ EqualsEmb(b_).impl().MatchAndExplain(b_, &listener);
+ EXPECT_EQ(listener.str(), "");
+}
+
+TEST_F(EmbossTestUtilTest, MatcherExplainNotEquivalent) {
+ ::testing::StringMatchResultListener listener;
+ EqualsEmb(a_).impl().MatchAndExplain(b_, &listener);
+ EXPECT_EQ(listener.str(), R"(
+@@ -1,3 +1,3 @@
+- s: 0 # 0x0
++ s: 1 # 0x1
+ u: 0 # 0x0
+ i: 0 # 0x0
+@@ +4,34 @@
+ b: 0 # 0x0
+ a: {
++ [0]: {
++ [0]: {
++ a: {
++ x: 0 # 0x0
++ l: 0 # 0x0
++ h: 0 # 0x0
++ }
++ }
++ [1]: {
++ a: {
++ x: 0 # 0x0
++ l: 0 # 0x0
++ h: 0 # 0x0
++ }
++ }
++ [2]: {
++ a: {
++ x: 0 # 0x0
++ l: 0 # 0x0
++ h: 0 # 0x0
++ }
++ }
++ [3]: {
++ a: {
++ x: 0 # 0x0
++ l: 0 # 0x0
++ h: 0 # 0x0
++ }
++ }
++ }
+ }
+ a0: 0 # 0x0
+)");
+}
+
+} // namespace
+} // namespace test
+} // namespace emboss
diff --git a/public/emboss_text_util.h b/public/emboss_text_util.h
new file mode 100644
index 0000000..6ed1e05
--- /dev/null
+++ b/public/emboss_text_util.h
@@ -0,0 +1,798 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// This header contains functionality related to Emboss text output.
+#ifndef EMBOSS_PUBLIC_EMBOSS_TEXT_UTIL_H_
+#define EMBOSS_PUBLIC_EMBOSS_TEXT_UTIL_H_
+
+#include <array>
+#include <climits>
+#include <cmath>
+#include <cstdio>
+#include <cstring>
+#include <limits>
+#include <sstream>
+#include <string>
+#include <vector>
+
+#include "public/emboss_defines.h"
+
+namespace emboss {
+
+// TextOutputOptions are used to configure text output. Typically, one can just
+// use a default TextOutputOptions() (for compact output) or MultilineText()
+ // (for reasonably formatted output).
+class TextOutputOptions final {
+ public:
+ TextOutputOptions() = default;
+
+ TextOutputOptions PlusOneIndent() const {
+ TextOutputOptions result = *this;
+ result.current_indent_ += indent();
+ return result;
+ }
+
+ TextOutputOptions Multiline(bool new_value) const {
+ TextOutputOptions result = *this;
+ result.multiline_ = new_value;
+ return result;
+ }
+
+ TextOutputOptions WithIndent(::std::string new_value) const {
+ TextOutputOptions result = *this;
+ result.indent_ = ::std::move(new_value);
+ return result;
+ }
+
+ TextOutputOptions WithComments(bool new_value) const {
+ TextOutputOptions result = *this;
+ result.comments_ = new_value;
+ return result;
+ }
+
+ TextOutputOptions WithDigitGrouping(bool new_value) const {
+ TextOutputOptions result = *this;
+ result.digit_grouping_ = new_value;
+ return result;
+ }
+
+ TextOutputOptions WithNumericBase(int new_value) const {
+ TextOutputOptions result = *this;
+ result.numeric_base_ = new_value;
+ return result;
+ }
+
+ ::std::string current_indent() const { return current_indent_; }
+ ::std::string indent() const { return indent_; }
+ bool multiline() const { return multiline_; }
+ bool digit_grouping() const { return digit_grouping_; }
+ bool comments() const { return comments_; }
+ uint8_t numeric_base() const { return numeric_base_; }
+
+ private:
+ ::std::string indent_;
+ ::std::string current_indent_;
+ bool comments_ = false;
+ bool multiline_ = false;
+ bool digit_grouping_ = false;
+ uint8_t numeric_base_ = 10;
+};
+
+namespace support {
+
+// TextOutputStream puts a stream-like interface onto a std::string, for use by
+ // DumpToTextStream. It is used by WriteToString().
+class TextOutputStream final {
+ public:
+ inline explicit TextOutputStream() = default;
+
+ inline void Write(const ::std::string &text) {
+ text_.write(text.data(), text.size());
+ }
+
+ inline void Write(const char *text) { text_.write(text, strlen(text)); }
+
+ inline void Write(const char c) { text_.put(c); }
+
+ inline ::std::string Result() { return text_.str(); }
+
+ private:
+ ::std::ostringstream text_;
+};
+
+// DecodeInteger decodes an integer from a string. This is very similar to the
+// many, many existing integer decode routines in the world, except that a) it
+// accepts integers in any Emboss format, and b) it can run in environments that
+// do not support std::istream or Google's number conversion routines.
+//
+// Ideally, this would be replaced by someone else's code.
+template <class IntType>
+bool DecodeInteger(const ::std::string &text, IntType *result) {
+ IntType accumulator = 0;
+ IntType base = 10;
+ bool negative = false;
+ unsigned offset = 0;
+ if (::std::is_signed<IntType>::value && text.size() >= 1 + offset &&
+ text[offset] == '-') {
+ negative = true;
+ offset += 1;
+ }
+ if (text.size() >= 2 + offset && text[offset] == '0') {
+ if (text[offset + 1] == 'x' || text[offset + 1] == 'X') {
+ base = 16;
+ offset += 2;
+ } else if (text[offset + 1] == 'b' || text[offset + 1] == 'B') {
+ base = 2;
+ offset += 2;
+ }
+ }
+ // "", "0x", "0b", "-", "-0x", and "-0b" are not valid numbers.
+ if (offset == text.size()) return false;
+ for (; offset < text.size(); ++offset) {
+ char c = text[offset];
+ IntType digit = 0;
+ if (c == '_') {
+ if (offset == 0) {
+ return false;
+ }
+ continue;
+ } else if (c >= '0' && c <= '9') {
+ digit = c - '0';
+ } else if (c >= 'A' && c <= 'F') {
+ digit = c - 'A' + 10;
+ } else if (c >= 'a' && c <= 'f') {
+ digit = c - 'a' + 10;
+ } else {
+ return false;
+ }
+ if (digit >= base) {
+ return false;
+ }
+ if (negative) {
+ if (accumulator <
+ (::std::numeric_limits<IntType>::min() + digit) / base) {
+ return false;
+ }
+ accumulator = accumulator * base - digit;
+ } else {
+ if (accumulator >
+ (::std::numeric_limits<IntType>::max() - digit) / base) {
+ return false;
+ }
+ accumulator = accumulator * base + digit;
+ }
+ }
+ *result = accumulator;
+ return true;
+}
+
+template <class Stream>
+bool DiscardWhitespace(Stream *stream) {
+ char c;
+ bool in_comment = false;
+ do {
+ if (!stream->Read(&c)) return true;
+ if (c == '#') in_comment = true;
+ if (c == '\r' || c == '\n') in_comment = false;
+ } while (in_comment || c == ' ' || c == '\t' || c == '\n' || c == '\r');
+ return stream->Unread(c);
+}
+
+template <class Stream>
+bool ReadToken(Stream *stream, ::std::string *token) {
+ ::std::vector<char> result;
+ char c;
+ if (!DiscardWhitespace(stream)) return false;
+ if (!stream->Read(&c)) {
+ *token = "";
+ return true;
+ }
+
+ const char *const punctuation = ":{}[],";
+ if (strchr(punctuation, c) != nullptr) {
+ *token = ::std::string(1, c);
+ return true;
+ } else {
+ // TODO(bolms): Only allow alphanumeric characters here?
+ do {
+ result.push_back(c);
+ if (!stream->Read(&c)) {
+ *token = ::std::string(&result[0], result.size());
+ return true;
+ }
+ } while (c != ' ' && c != '\t' && c != '\n' && c != '\r' && c != '#' &&
+ strchr(punctuation, c) == nullptr);
+ if (!stream->Unread(c)) return false;
+ *token = ::std::string(&result[0], result.size());
+ return true;
+ }
+}
+
+template <class Stream, class View>
+bool ReadIntegerFromTextStream(View *view, Stream *stream) {
+ ::std::string token;
+ if (!::emboss::support::ReadToken(stream, &token)) return false;
+ if (token.empty()) return false;
+ typename View::ValueType value;
+ if (!::emboss::support::DecodeInteger(token, &value)) return false;
+ return view->TryToWrite(value);
+}
+
+// WriteIntegerToTextStream encodes the given value in base 2, 10, or 16, with
+// or without digit group separators ('_'), and then calls stream->Write() with
+// a char * argument that is a C-style null-terminated string of the encoded
+// number.
+//
+// As with DecodeInteger, above, it would be nice to be able to replace this
+// with someone else's code, but I (bolms@) was unable to find anything in
+// standard C++ that would encode numbers in binary, nothing that would add
+// digit separators to hex numbers, and nothing that would use '_' for digit
+// separators.
+template <class Stream, typename IntegralType>
+void WriteIntegerToTextStream(IntegralType value, Stream *stream, uint8_t base,
+ bool digit_grouping) {
+ static_assert(::std::numeric_limits<
+ typename ::std::remove_cv<IntegralType>::type>::is_integer,
+ "WriteIntegerToTextStream only supports integer types.");
+ static_assert(
+ !::std::is_same<bool,
+ typename ::std::remove_cv<IntegralType>::type>::value,
+ "WriteIntegerToTextStream does not support bool.");
+ EMBOSS_CHECK(base == 10 || base == 2 || base == 16);
+ const char *const digits = "0123456789abcdef";
+ const int grouping = base == 10 ? 3 : base == 16 ? 4 : 8;
+ // The maximum size 32-bit number is -2**31, which is:
+ //
+ // -0b10000000_00000000_00000000_00000000 (38 chars)
+ // -2_147_483_648 (14 chars)
+ // -0x8000_0000 (12 chars)
+ //
+ // Likewise, the maximum size 8-bit number is -128, which is:
+ // -0b10000000 (11 chars)
+ // -128 (4 chars)
+ // -0x80 (5 chars)
+ //
+ // Binary with separators is always the longest value: 9 chars per 8 bits,
+ // minus 1 char for the '_' that does not appear at the front of the number,
+ // plus 2 chars for "0b", plus 1 char for '-', plus 1 extra char for the
+ // trailing '\0', which is (sizeof value) * CHAR_BIT * 9 / 8 - 1 + 2 + 1 + 1.
+ const int buffer_size = (sizeof value) * CHAR_BIT * 9 / 8 + 3;
+ char buffer[buffer_size];
+ buffer[buffer_size - 1] = '\0';
+ int next_char = buffer_size - 2;
+ if (value == 0) {
+ EMBOSS_DCHECK_GE(next_char, 0);
+ buffer[next_char] = digits[0];
+ --next_char;
+ }
+ int sign = value < 0 ? -1 : 1;
+ int digit_count = 0;
+ auto buffer_char = [&](char c) {
+ EMBOSS_DCHECK_GE(next_char, 0);
+ buffer[next_char] = c;
+ --next_char;
+ };
+ if (value < 0) {
+ if (value == ::std::numeric_limits<decltype(value)>::lowest()) {
+ // The minimum negative two's-complement value has no corresponding
+ // positive value, so 'value = -value' is not useful in that case.
+ // Instead, we do some trickery to buffer the lowest-order digit here.
+ auto digit = -(value + 1) % base + 1;
+ value = -(value + 1) / base;
+ if (digit == base) {
+ digit = 0;
+ ++value;
+ }
+ buffer_char(digits[digit]);
+ ++digit_count;
+ } else {
+ value = -value;
+ }
+ }
+ while (value > 0) {
+ if (digit_count && digit_count % grouping == 0 && digit_grouping) {
+ buffer_char('_');
+ }
+ buffer_char(digits[value % base]);
+ value /= base;
+ ++digit_count;
+ }
+ if (base == 16) {
+ buffer_char('x');
+ buffer_char('0');
+ } else if (base == 2) {
+ buffer_char('b');
+ buffer_char('0');
+ }
+ if (sign < 0) {
+ buffer_char('-');
+ }
+
+ stream->Write(buffer + 1 + next_char);
+}
+
+// Writes an integer value in the base given in options, plus an optional
+// comment with the same value in a second base. This is used for the common
+// output format of IntView, UIntView, and BcdView.
+template <class Stream, class View>
+void WriteIntegerViewToTextStream(View *view, Stream *stream,
+ const TextOutputOptions &options) {
+ WriteIntegerToTextStream(view->Read(), stream, options.numeric_base(),
+ options.digit_grouping());
+ if (options.comments()) {
+ stream->Write(" # ");
+ WriteIntegerToTextStream(view->Read(), stream,
+ options.numeric_base() == 10 ? 16 : 10,
+ options.digit_grouping());
+ }
+}
+
+// The TextOutputOptions parameter is present so that it can be passed in by
+// generated code that uses the same form for WriteBooleanViewToTextStream,
+// WriteIntegerViewToTextStream, and WriteEnumViewToTextStream.
+template <class Stream, class View>
+void WriteBooleanViewToTextStream(View *view, Stream *stream,
+ const TextOutputOptions &) {
+ if (view->Read()) {
+ stream->Write("true");
+ } else {
+ stream->Write("false");
+ }
+}
+
+// FloatConstants holds various masks for working with IEEE754-compatible
+// floating-point values at a bit level. These are mostly used here to
+// implement text format for NaNs, preserving the NaN payload so that the text
+// format can (in theory) provide a bit-exact round-trip through the text
+// format.
+template <class Float>
+struct FloatConstants;
+
+template <>
+struct FloatConstants<float> {
+ static_assert(sizeof(float) == 4, "Emboss requires 32-bit float.");
+ using MatchingIntegerType = ::std::uint32_t;
+ static constexpr MatchingIntegerType kMantissaMask() { return 0x7fffffU; }
+ static constexpr MatchingIntegerType kExponentMask() { return 0x7f800000U; }
+ static constexpr MatchingIntegerType kSignMask() { return 0x80000000U; }
+ static constexpr int kPrintfPrecision() { return 9; }
+ static constexpr const char *kScanfFormat() { return "%f%n"; }
+};
+
+template <>
+struct FloatConstants<double> {
+ static_assert(sizeof(double) == 8, "Emboss requires 64-bit double.");
+ using MatchingIntegerType = ::std::uint64_t;
+ static constexpr MatchingIntegerType kMantissaMask() { return 0xfffffffffffffUL; }
+ static constexpr MatchingIntegerType kExponentMask() { return 0x7ff0000000000000UL; }
+ static constexpr MatchingIntegerType kSignMask() { return 0x8000000000000000UL; }
+ static constexpr int kPrintfPrecision() { return 17; }
+ static constexpr const char *kScanfFormat() { return "%lf%n"; }
+};
+
+// Decodes a floating-point number from text.
+template <class Float>
+bool DecodeFloat(const ::std::string &token, Float *result) {
+ // The state of the world for reading floating-point values is somewhat better
+ // than the situation for writing them, but there are still a few bits that
+ // are underspecified. This function is the mirror of WriteFloatToTextStream,
+ // below, so it specifically decodes infinities and NaNs in the formats that
+ // Emboss uses.
+ //
+ // Because of the use of scanf here, this function accepts hex floating-point
+ // values (0xh.hhhhpeee) *on some systems*. TODO(bolms): make hex float
+ // support universal.
+
+ using UInt = typename FloatConstants<Float>::MatchingIntegerType;
+
+ if (token.empty()) return false;
+
+ // First, check for negative.
+ bool negative = token[0] == '-';
+
+ // Second, check for NaN.
+ ::std::size_t i = token[0] == '-' || token[0] == '+' ? 1 : 0;
+ if (token.size() >= i + 3 && (token[i] == 'N' || token[i] == 'n') &&
+ (token[i + 1] == 'A' || token[i + 1] == 'a') &&
+ (token[i + 2] == 'N' || token[i + 2] == 'n')) {
+ UInt nan_payload;
+ if (token.size() >= i + 4) {
+ if (token[i + 3] == '(' && token[token.size() - 1] == ')') {
+ if (!DecodeInteger(token.substr(i + 4, token.size() - i - 5),
+ &nan_payload)) {
+ return false;
+ }
+ } else {
+ // NaN may not be followed by trailing characters other than a
+ // ()-enclosed payload.
+ return false;
+ }
+ } else {
+ // If no specific NaN was given, take a default NaN from the C++ standard
+ // library. Technically, a conformant C++ implementation might not have
+ // quiet_NaN(), but any IEEE754-based implementation should.
+ //
+ // It is tempting to just write the default NaN directly into the view and
+ // return success, but "-NaN" should have its sign bit set, and there
+ // is no direct way to set the sign bit of a NaN, so there are fewer code
+ // paths if we extract the default NaN payload, then use it in the
+ // reconstruction step, below.
+ Float default_nan = ::std::numeric_limits<Float>::quiet_NaN();
+ UInt bits;
+ ::std::memcpy(&bits, &default_nan, sizeof(bits));
+ nan_payload = bits & FloatConstants<Float>::kMantissaMask();
+ }
+ if (nan_payload == 0) {
+ // "NaN" with a payload of zero is actually the bit pattern for infinity;
+ // "NaN(0)" should not be an alias for "Inf".
+ return false;
+ }
+ if (nan_payload & (FloatConstants<Float>::kExponentMask() |
+ FloatConstants<Float>::kSignMask())) {
+ // The payload must be small enough to fit in the payload space; it must
+ // not overflow into the exponent or sign bits.
+ //
+ // Note that the DecodeInteger call which decoded the payload will return
+ // false if the payload would overflow the `UInt` type, so cases like
+ // "NaN(0x10000000000000000000000000000)" -- which are so big that they no
+ // longer interfere with the sign or exponent -- are caught above.
+ return false;
+ }
+ UInt bits = FloatConstants<Float>::kExponentMask();
+ bits |= nan_payload;
+ if (negative) {
+ bits |= FloatConstants<Float>::kSignMask();
+ }
+ ::std::memcpy(result, &bits, sizeof(bits));
+ return true;
+ }
+
+ // If the value is not NaN, check for infinity.
+ if (token.size() >= i + 3 && (token[i] == 'I' || token[i] == 'i') &&
+ (token[i + 1] == 'N' || token[i + 1] == 'n') &&
+ (token[i + 2] == 'F' || token[i + 2] == 'f')) {
+ if (token.size() > i + 3) {
+ // Infinity must be exactly "Inf" or "-Inf" (case insensitive). There
+ // must not be trailing characters.
+ return false;
+ }
+ // As with quiet_NaN(), a conforming C++ implementation might not have
+ // infinity(), but an IEEE 754-based implementation should.
+ if (negative) {
+ *result = -::std::numeric_limits<Float>::infinity();
+ return true;
+ } else {
+ *result = ::std::numeric_limits<Float>::infinity();
+ return true;
+ }
+ }
+
+ // For non-NaN, non-Inf values, use the C sscanf function, mirroring the use
+ // of snprintf for writing the value, below.
+ int chars_used = -1;
+ if (::std::sscanf(token.c_str(), FloatConstants<Float>::kScanfFormat(), result,
+ &chars_used) < 1) {
+ return false;
+ }
+ if (chars_used < 0 ||
+ static_cast</**/ ::std::size_t>(chars_used) < token.size()) {
+ return false;
+ }
+ return true;
+}
+
+// Decodes a floating-point number from a text stream and writes it to the
+// specified view.
+template <class Stream, class View>
+bool ReadFloatFromTextStream(View *view, Stream *stream) {
+ ::std::string token;
+ if (!ReadToken(stream, &token)) return false;
+ typename View::ValueType value;
+ if (!DecodeFloat(token, &value)) return false;
+ return view->TryToWrite(value);
+}
+
+template <class Stream, class Float>
+void WriteFloatToTextStream(Float n, Stream *stream,
+ const TextOutputOptions &options) {
+ static_assert(::std::is_same<Float, float>::value ||
+ ::std::is_same<Float, double>::value,
+ "WriteFloatToTextStream can only write float or double.");
+ // The state of the world w.r.t. rendering floating-points as decimal text is,
+ // ca. 2018, less than ideal.
+ //
+ // In C++ land, there is actually no stable facility in the standard library
+ // until to_chars() in C++17 -- which is not actually implemented yet in
+ // libc++. to_string(), the printf() family, and the iostreams system all
+ // respect the current locale. In most programs, the locale is permanently
+ // left on "C", but this is not guaranteed. to_string() also uses a fixed and
+ // rather unfortunate format.
+ //
+ // For integers, I (bolms@) chose to just implement custom read and write
+ // routines, but those routines are quite small and straightforward compared
+ // to floating point conversion. Even writing correct output is difficult,
+ // and writing correct and minimal output is the subject of a number of
+ // academic papers.
+ //
+ // For the moment, I'm just using snprintf("%.*g", 17, n), which is guaranteed
+ // to be read back as the same number, but can be longer than strictly
+ // necessary.
+ //
+ // TODO(bolms): Import a modified version of the double-to-string conversion
+ // from Swift's standard library, which appears to be the best implementation
+ // currently available.
+
+ if (::std::isnan(n)) {
+ // The printf format for NaN is just "NaN". In the interests of keeping
+ // things bit-exact, Emboss prints the exact NaN.
+ typename FloatConstants<Float>::MatchingIntegerType bits;
+ ::std::memcpy(&bits, &n, sizeof(bits));
+ ::std::uint64_t nan_payload = bits & FloatConstants<Float>::kMantissaMask();
+ ::std::uint64_t nan_sign = bits & FloatConstants<Float>::kSignMask();
+ if (nan_sign) {
+ // NaN still has a sign bit, which is generally treated differently from
+ // the payload. There is no real "standard" text format for NaNs, but
+ // "-NaN" appears to be a common way of indicating a NaN with the sign bit
+ // set.
+ stream->Write("-NaN(");
+ } else {
+ stream->Write("NaN(");
+ }
+ // NaN payloads are always dumped in hex. Note that Emboss is treating the
+ // is_quiet/is_signal bit as just another bit in the payload.
+ WriteIntegerToTextStream(nan_payload, stream, 16, options.digit_grouping());
+ stream->Write(")");
+ return;
+ }
+
+ if (::std::isinf(n)) {
+ if (n < 0.0) {
+ stream->Write("-Inf");
+ } else {
+ stream->Write("Inf");
+ }
+ return;
+ }
+
+ // TODO(bolms): Should the current numeric base be honored here? Should there
+ // be a separate Float numeric base?
+ ::std::array<char, 30> buffer;
+ // TODO(bolms): Figure out how to get ::std::snprintf to work on
+ // microcontroller builds.
+ EMBOSS_CHECK_LE(static_cast</**/ ::std::size_t>(
+ ::snprintf(&(buffer[0]), buffer.size(), "%.*g",
+ FloatConstants<Float>::kPrintfPrecision(),
+ static_cast<double>(n)) +
+ 1),
+ buffer.size());
+ stream->Write(&buffer[0]);
+
+ // TODO(bolms): Support digit grouping.
+}
+
+template <class Stream, class View>
+void WriteEnumViewToTextStream(View *view, Stream *stream,
+ const TextOutputOptions &options) {
+ const char *name = TryToGetNameFromEnum(view->Read());
+ if (name != nullptr) {
+ stream->Write(name);
+ }
+ // If the enum value has no known name, then write its numeric value
+ // instead. If it does have a known name, and comments are enabled on the
+ // output, then write the numeric value as a comment.
+ if (name == nullptr || options.comments()) {
+ if (name != nullptr) stream->Write(" # ");
+ WriteIntegerToTextStream(
+ static_cast<
+ typename ::std::underlying_type<typename View::ValueType>::type>(
+ view->Read()),
+ stream, options.numeric_base(), options.digit_grouping());
+ }
+}
+
+// Updates an array from a text stream. For an array of integers, the most
+// basic form of the text format looks like:
+//
+// { 0, 1, 2 }
+//
+// However, the following are all acceptable and equivalent:
+//
+// { 0, 1, 2, }
+// {0 1 2}
+// { [2]: 2, [1]: 1, [0]: 0 }
+// {[2]:2, [0]:0, 1}
+//
+// Formally, the array must be contained within braces ("{}"). Elements are
+// represented as an optional index surrounded by brackets ("[]") and followed
+// by a colon (":"), then the text format of the element, then a single
+// optional comma (","). If no index is present for the first element, the
+// index 0 will be used. If no index is present for any elements after the
+// first, the index one greater than the previous index will be used.
+template <class Array, class Stream>
+bool ReadArrayFromTextStream(Array *array, Stream *stream) {
+ // The text format allows any given index to be set more than once. In
+ // theory, this function could track indices and fail if an index were
+ // double-set, but doing so would require quite a bit of overhead, and
+ // O(array->ElementCount()) extra space in the worst case. It does not seem
+ // worth it to impose the runtime cost here.
+ size_t index = 0;
+ ::std::string brace;
+ // Read out the opening brace.
+ if (!ReadToken(stream, &brace)) return false;
+ if (brace != "{") return false;
+ for (;;) {
+ char c;
+ // Check for a closing brace; if present, success.
+ if (!DiscardWhitespace(stream)) return false;
+ if (!stream->Read(&c)) return false;
+ if (c == '}') return true;
+
+ // If the element has an index, read it.
+ if (c == '[') {
+ ::std::string index_text;
+ if (!ReadToken(stream, &index_text)) return false;
+ if (!::emboss::support::DecodeInteger(index_text, &index)) return false;
+ ::std::string closing_bracket;
+ if (!ReadToken(stream, &closing_bracket)) return false;
+ if (closing_bracket != "]") return false;
+ ::std::string colon;
+ if (!ReadToken(stream, &colon)) return false;
+ if (colon != ":") return false;
+ } else {
+ if (!stream->Unread(c)) return false;
+ }
+
+ // Read the element.
+ if (index >= array->ElementCount()) return false;
+ if (!(*array)[index].UpdateFromTextStream(stream)) return false;
+ ++index;
+
+ // If there is a trailing comma, discard it.
+ if (!DiscardWhitespace(stream)) return false;
+ if (!stream->Read(&c)) return false;
+ if (c != ',') {
+ if (c != '}') return false;
+ if (!stream->Unread(c)) return false;
+ }
+ }
+}
+
+// Writes an array to a text stream. This writes the array in a format
+// compatible with ReadArrayFromTextStream, above. For multiline output, writes
+// one element per line.
+//
+// TODO(bolms): Make the output for arrays of small elements (like bytes) much
+// more compact.
+//
+// This will require several support functions like `MaxTextLength` on every
+// view type, and will substantially increase the number of tests required for
+// this function, but will make arrays of small elements much more readable.
+template <class Array, class Stream>
+void WriteArrayToTextStream(Array *array, Stream *stream,
+ const TextOutputOptions &options) {
+ TextOutputOptions element_options = options.PlusOneIndent();
+ if (options.multiline()) {
+ stream->Write("{");
+ WriteShorthandArrayCommentToTextStream(array, stream, element_options);
+ for (::std::size_t i = 0; i < array->ElementCount(); ++i) {
+ stream->Write("\n");
+ stream->Write(element_options.current_indent());
+ stream->Write("[");
+ // TODO(bolms): Put padding in here so that array elements start at the
+ // same column.
+ //
+ // TODO(bolms): (Maybe) figure out how to get padding to work so that
+ // elements with comments can have their comments align to the same
+ // column.
+ WriteIntegerToTextStream(i, stream, options.numeric_base(),
+ options.digit_grouping());
+ stream->Write("]: ");
+ (*array)[i].WriteToTextStream(stream, element_options);
+ }
+ stream->Write("\n");
+ stream->Write(options.current_indent());
+ stream->Write("}");
+ } else {
+ stream->Write("{");
+ for (::std::size_t i = 0; i < array->ElementCount(); ++i) {
+ stream->Write(" ");
+ if (i % 8 == 0) {
+ stream->Write("[");
+ WriteIntegerToTextStream(i, stream, options.numeric_base(),
+ options.digit_grouping());
+ stream->Write("]: ");
+ }
+ (*array)[i].WriteToTextStream(stream, element_options);
+ if (i < array->ElementCount() - 1) {
+ stream->Write(",");
+ }
+ }
+ stream->Write(" }");
+ }
+}
+
+// TextStream puts a stream-like interface onto a std::string, for use by
+// UpdateFromTextStream. It is used by UpdateFromText().
+class TextStream final {
+ public:
+ // This template handles std::string, std::string_view, and absl::string_view.
+ template <class String>
+ inline explicit TextStream(const String &text)
+ : text_(text.data()), length_(text.size()) {}
+
+ inline explicit TextStream(const char *text)
+ : text_(text), length_(strlen(text)) {}
+
+ inline TextStream(const char *text, size_t length)
+ : text_(text), length_(length) {}
+
+ inline bool Read(char *result) {
+ if (index_ >= length_) return false;
+ *result = text_[index_];
+ ++index_;
+ return true;
+ }
+
+ inline bool Unread(char c) {
+ if (index_ < 1) return false;
+ if (text_[index_ - 1] != c) return false;
+ --index_;
+ return true;
+ }
+
+ private:
+ // It would be nice to use string_view here, but that's not available until
+ // C++17.
+ const char *text_ = nullptr;
+ size_t length_ = 0;
+ size_t index_ = 0;
+};
+
+} // namespace support
+
+// Returns a TextOutputOptions set for reasonable multi-line text output.
+static inline TextOutputOptions MultilineText() {
+ return TextOutputOptions()
+ .Multiline(true)
+ .WithIndent(" ")
+ .WithComments(true)
+ .WithDigitGrouping(true);
+}
+
+// TODO(bolms): Add corresponding ReadFromText*() verbs which enforce the
+// constraint that all of a field's dependencies must be present in the text
+// before the field itself is set.
+template <typename EmbossViewType>
+inline bool UpdateFromText(const EmbossViewType &view,
+ const ::std::string &text) {
+ auto text_stream = support::TextStream{text};
+ return view.UpdateFromTextStream(&text_stream);
+}
+
+template <typename EmbossViewType>
+inline ::std::string WriteToString(const EmbossViewType &view,
+ TextOutputOptions options) {
+ support::TextOutputStream text_stream;
+ view.WriteToTextStream(&text_stream, options);
+ return text_stream.Result();
+}
+
+template <typename EmbossViewType>
+inline ::std::string WriteToString(const EmbossViewType &view) {
+ return WriteToString(view, TextOutputOptions());
+}
+
+} // namespace emboss
+
+#endif // EMBOSS_PUBLIC_EMBOSS_TEXT_UTIL_H_
diff --git a/public/emboss_text_util_test.cc b/public/emboss_text_util_test.cc
new file mode 100644
index 0000000..4a0e4eb
--- /dev/null
+++ b/public/emboss_text_util_test.cc
@@ -0,0 +1,1114 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#include "public/emboss_text_util.h"
+
+#include <cmath>
+#include <limits>
+
+#include <gtest/gtest.h>
+
+namespace emboss {
+namespace support {
+namespace test {
+
+TEST(DecodeInteger, DecodeUInt8Decimal) {
+ uint8_t result;
+ EXPECT_TRUE(DecodeInteger("123", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_TRUE(DecodeInteger("0", &result));
+ EXPECT_EQ(0, result);
+ EXPECT_TRUE(DecodeInteger("0123", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_TRUE(DecodeInteger("0_123", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("_12", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("1234", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("12a", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("12A", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("12 ", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger(" 12", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("12.", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("12.0", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("256", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_TRUE(DecodeInteger("128", &result));
+ EXPECT_EQ(128, result);
+ EXPECT_FALSE(DecodeInteger("-0", &result));
+ EXPECT_EQ(128, result);
+ EXPECT_TRUE(DecodeInteger("255", &result));
+ EXPECT_EQ(255, result);
+}
+
+TEST(DecodeInteger, DecodeInt8Decimal) {
+ int8_t result;
+ EXPECT_TRUE(DecodeInteger("123", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_TRUE(DecodeInteger("0", &result));
+ EXPECT_EQ(0, result);
+ EXPECT_TRUE(DecodeInteger("0123", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_TRUE(DecodeInteger("0_123", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("_12", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("1234", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("12a", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("12A", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("12 ", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger(" 12", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("12.", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("12.0", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("256", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_FALSE(DecodeInteger("128", &result));
+ EXPECT_EQ(123, result);
+ EXPECT_TRUE(DecodeInteger("-0", &result));
+ EXPECT_EQ(0, result);
+ EXPECT_TRUE(DecodeInteger("127", &result));
+ EXPECT_EQ(127, result);
+ EXPECT_TRUE(DecodeInteger("-127", &result));
+ EXPECT_EQ(-127, result);
+ EXPECT_TRUE(DecodeInteger("-128", &result));
+ EXPECT_EQ(-128, result);
+ EXPECT_FALSE(DecodeInteger("0-127", &result));
+ EXPECT_EQ(-128, result);
+ EXPECT_FALSE(DecodeInteger("- 127", &result));
+ EXPECT_EQ(-128, result);
+}
+
+TEST(DecodeInteger, DecodeUInt8Hex) {
+ uint8_t result;
+ EXPECT_TRUE(DecodeInteger("0x23", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_TRUE(DecodeInteger("0x0", &result));
+ EXPECT_EQ(0x0, result);
+ EXPECT_TRUE(DecodeInteger("0xff", &result));
+ EXPECT_EQ(0xff, result);
+ EXPECT_TRUE(DecodeInteger("0xFE", &result));
+ EXPECT_EQ(0xfe, result);
+ EXPECT_TRUE(DecodeInteger("0xFd", &result));
+ EXPECT_EQ(0xfd, result);
+ EXPECT_TRUE(DecodeInteger("0XeC", &result));
+ EXPECT_EQ(0xec, result);
+ EXPECT_TRUE(DecodeInteger("0x012", &result));
+ EXPECT_EQ(0x12, result);
+ EXPECT_TRUE(DecodeInteger("0x0_0023", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_TRUE(DecodeInteger("0x_0023", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("0x100", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("0x", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("0x0x0", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("0x1g", &result));
+ EXPECT_EQ(0x23, result);
+}
+
+TEST(DecodeInteger, DecodeUInt8Binary) {
+ uint8_t result;
+ EXPECT_TRUE(DecodeInteger("0b10100101", &result));
+ EXPECT_EQ(0xa5, result);
+ EXPECT_TRUE(DecodeInteger("0b0", &result));
+ EXPECT_EQ(0x0, result);
+ EXPECT_TRUE(DecodeInteger("0B1", &result));
+ EXPECT_EQ(0x1, result);
+ EXPECT_TRUE(DecodeInteger("0b11111111", &result));
+ EXPECT_EQ(0xff, result);
+ EXPECT_TRUE(DecodeInteger("0b011111110", &result));
+ EXPECT_EQ(0xfe, result);
+ EXPECT_TRUE(DecodeInteger("0b00_0010_0011", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_TRUE(DecodeInteger("0b_0010_0011", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("0b100000000", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("0b", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("0b0b0", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("0b12", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("-0b0", &result));
+ EXPECT_EQ(0x23, result);
+}
+
+TEST(DecodeInteger, DecodeInt8Binary) {
+ int8_t result;
+ EXPECT_TRUE(DecodeInteger("0b01011010", &result));
+ EXPECT_EQ(0x5a, result);
+ EXPECT_TRUE(DecodeInteger("0b0", &result));
+ EXPECT_EQ(0x0, result);
+ EXPECT_TRUE(DecodeInteger("0B1", &result));
+ EXPECT_EQ(0x1, result);
+ EXPECT_TRUE(DecodeInteger("0b1111111", &result));
+ EXPECT_EQ(0x7f, result);
+ EXPECT_TRUE(DecodeInteger("0b01111110", &result));
+ EXPECT_EQ(0x7e, result);
+ EXPECT_TRUE(DecodeInteger("0b00_0010_0011", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_TRUE(DecodeInteger("0b_0010_0011", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("0b100000000", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("0b", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("-0b", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("0b0b0", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("0b12", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_FALSE(DecodeInteger("0b10000000", &result));
+ EXPECT_EQ(0x23, result);
+ EXPECT_TRUE(DecodeInteger("-0b1111111", &result));
+ EXPECT_EQ(-0x7f, result);
+ EXPECT_TRUE(DecodeInteger("-0b10000000", &result));
+ EXPECT_EQ(-0x80, result);
+ EXPECT_FALSE(DecodeInteger("-0b10000001", &result));
+ EXPECT_EQ(-0x80, result);
+ EXPECT_TRUE(DecodeInteger("-0b0", &result));
+ EXPECT_EQ(0x0, result);
+}
+
+TEST(DecodeInteger, DecodeUInt16) {
+ uint16_t result;
+ EXPECT_TRUE(DecodeInteger("65535", &result));
+ EXPECT_EQ(65535, result);
+ EXPECT_FALSE(DecodeInteger("65536", &result));
+ EXPECT_EQ(65535, result);
+}
+
+TEST(DecodeInteger, DecodeInt16) {
+ int16_t result;
+ EXPECT_TRUE(DecodeInteger("32767", &result));
+ EXPECT_EQ(32767, result);
+ EXPECT_FALSE(DecodeInteger("32768", &result));
+ EXPECT_EQ(32767, result);
+ EXPECT_TRUE(DecodeInteger("-32768", &result));
+ EXPECT_EQ(-32768, result);
+ EXPECT_FALSE(DecodeInteger("-32769", &result));
+ EXPECT_EQ(-32768, result);
+}
+
+TEST(DecodeInteger, DecodeUInt32) {
+ uint32_t result;
+ EXPECT_TRUE(DecodeInteger("4294967295", &result));
+ EXPECT_EQ(4294967295U, result);
+ EXPECT_FALSE(DecodeInteger("4294967296", &result));
+ EXPECT_EQ(4294967295U, result);
+}
+
+TEST(DecodeInteger, DecodeInt32) {
+ int32_t result;
+ EXPECT_TRUE(DecodeInteger("2147483647", &result));
+ EXPECT_EQ(2147483647, result);
+ EXPECT_FALSE(DecodeInteger("2147483648", &result));
+ EXPECT_EQ(2147483647, result);
+ EXPECT_FALSE(DecodeInteger("4294967295", &result));
+ EXPECT_EQ(2147483647, result);
+ EXPECT_TRUE(DecodeInteger("-2147483648", &result));
+ EXPECT_EQ(-2147483647 - 1, result);
+ EXPECT_FALSE(DecodeInteger("-2147483649", &result));
+ EXPECT_EQ(-2147483647 - 1, result);
+}
+
+TEST(DecodeInteger, DecodeUInt64) {
+ uint64_t result;
+ EXPECT_TRUE(DecodeInteger("18446744073709551615", &result));
+ EXPECT_EQ(18446744073709551615ULL, result);
+ EXPECT_FALSE(DecodeInteger("18446744073709551616", &result));
+ EXPECT_EQ(18446744073709551615ULL, result);
+}
+
+TEST(DecodeInteger, DecodeInt64) {
+ int64_t result;
+ EXPECT_TRUE(DecodeInteger("9223372036854775807", &result));
+ EXPECT_EQ(9223372036854775807LL, result);
+ EXPECT_FALSE(DecodeInteger("9223372036854775808", &result));
+ EXPECT_EQ(9223372036854775807LL, result);
+ EXPECT_FALSE(DecodeInteger("18446744073709551615", &result));
+ EXPECT_EQ(9223372036854775807LL, result);
+ EXPECT_TRUE(DecodeInteger("-9223372036854775808", &result));
+ EXPECT_EQ(-9223372036854775807LL - 1LL, result);
+ EXPECT_FALSE(DecodeInteger("-9223372036854775809", &result));
+ EXPECT_EQ(-9223372036854775807LL - 1LL, result);
+}
+
+TEST(TextStream, Construction) {
+ std::string string_text = "ab";
+ auto text_stream = TextStream(string_text);
+ char result;
+ EXPECT_TRUE(text_stream.Read(&result));
+ EXPECT_EQ('a', result);
+ EXPECT_TRUE(text_stream.Read(&result));
+ EXPECT_EQ('b', result);
+ EXPECT_FALSE(text_stream.Read(&result));
+
+ const char *c_string = "cd";
+ text_stream = TextStream(c_string);
+ EXPECT_TRUE(text_stream.Read(&result));
+ EXPECT_EQ('c', result);
+ EXPECT_TRUE(text_stream.Read(&result));
+ EXPECT_EQ('d', result);
+ EXPECT_FALSE(text_stream.Read(&result));
+
+ const char *long_c_string = "efghi";
+ text_stream = TextStream(long_c_string, 2);
+ EXPECT_TRUE(text_stream.Read(&result));
+ EXPECT_EQ('e', result);
+ EXPECT_TRUE(text_stream.Read(&result));
+ EXPECT_EQ('f', result);
+ EXPECT_FALSE(text_stream.Read(&result));
+}
+
+TEST(TextStream, Methods) {
+ auto text_stream = TextStream{"abc"};
+
+ EXPECT_FALSE(text_stream.Unread('d'));
+ char result;
+ EXPECT_TRUE(text_stream.Read(&result));
+ EXPECT_EQ('a', result);
+
+ EXPECT_FALSE(text_stream.Unread('e'));
+ EXPECT_TRUE(text_stream.Read(&result));
+ EXPECT_EQ('b', result);
+
+ EXPECT_TRUE(text_stream.Unread('b'));
+ result = 'f';
+ EXPECT_TRUE(text_stream.Read(&result));
+ EXPECT_EQ('b', result);
+
+ EXPECT_TRUE(text_stream.Read(&result));
+ EXPECT_EQ('c', result);
+
+ result = 'g';
+ EXPECT_FALSE(text_stream.Read(&result));
+ EXPECT_EQ('g', result);
+
+ auto empty_text_stream = TextStream{""};
+ EXPECT_FALSE(empty_text_stream.Read(&result));
+ EXPECT_EQ('g', result);
+}
+
+TEST(ReadToken, ReadsToken) {
+ auto text_stream = TextStream{"abc"};
+ ::std::string result;
+ EXPECT_TRUE(ReadToken(&text_stream, &result));
+ EXPECT_EQ("abc", result);
+ EXPECT_TRUE(ReadToken(&text_stream, &result));
+ EXPECT_EQ("", result);
+ EXPECT_TRUE(ReadToken(&text_stream, &result));
+ EXPECT_EQ("", result);
+}
+
+TEST(ReadToken, ReadsTwoTokens) {
+ auto text_stream = TextStream{"abc def"};
+ ::std::string result;
+ EXPECT_TRUE(ReadToken(&text_stream, &result));
+ EXPECT_EQ("abc", result);
+ EXPECT_TRUE(ReadToken(&text_stream, &result));
+ EXPECT_EQ("def", result);
+ EXPECT_TRUE(ReadToken(&text_stream, &result));
+ EXPECT_EQ("", result);
+}
+
+TEST(ReadToken, SkipsInitialWhitespace) {
+ auto text_stream = TextStream{" \t\r\r\n\t\r abc def"};
+ ::std::string result;
+ EXPECT_TRUE(ReadToken(&text_stream, &result));
+ EXPECT_EQ("abc", result);
+ EXPECT_TRUE(ReadToken(&text_stream, &result));
+ EXPECT_EQ("def", result);
+ EXPECT_TRUE(ReadToken(&text_stream, &result));
+ EXPECT_EQ("", result);
+}
+
+TEST(ReadToken, SkipsComments) {
+ auto text_stream = TextStream{" #comment##\r#comment\n abc #c\n def"};
+ ::std::string result;
+ EXPECT_TRUE(ReadToken(&text_stream, &result));
+ EXPECT_EQ("abc", result);
+ EXPECT_TRUE(ReadToken(&text_stream, &result));
+ EXPECT_EQ("def", result);
+ EXPECT_TRUE(ReadToken(&text_stream, &result));
+ EXPECT_EQ("", result);
+}
+
+TEST(TextOutputOptions, Defaults) {
+ TextOutputOptions options;
+ EXPECT_EQ("", options.current_indent());
+ EXPECT_EQ("", options.indent());
+ EXPECT_FALSE(options.multiline());
+ EXPECT_FALSE(options.comments());
+ EXPECT_FALSE(options.digit_grouping());
+ EXPECT_EQ(10, options.numeric_base());
+}
+
+TEST(TextOutputOptions, WithIndent) {
+ TextOutputOptions options;
+ TextOutputOptions new_options = options.WithIndent("xyz");
+ EXPECT_EQ("", options.current_indent());
+ EXPECT_EQ("", options.indent());
+ EXPECT_EQ("", new_options.current_indent());
+ EXPECT_EQ("xyz", new_options.indent());
+}
+
+TEST(TextOutputOptions, PlusOneIndent) {
+ TextOutputOptions options;
+ TextOutputOptions new_options = options.WithIndent("xyz").PlusOneIndent();
+ EXPECT_EQ("", options.current_indent());
+ EXPECT_EQ("", options.indent());
+ EXPECT_EQ("xyz", new_options.current_indent());
+ EXPECT_EQ("xyz", new_options.indent());
+ EXPECT_EQ("xyzxyz", new_options.PlusOneIndent().current_indent());
+}
+
+TEST(TextOutputOptions, WithComments) {
+ TextOutputOptions options;
+ TextOutputOptions new_options = options.WithComments(true);
+ EXPECT_FALSE(options.comments());
+ EXPECT_TRUE(new_options.comments());
+}
+
+TEST(TextOutputOptions, WithDigitGrouping) {
+ TextOutputOptions options;
+ TextOutputOptions new_options = options.WithDigitGrouping(true);
+ EXPECT_FALSE(options.digit_grouping());
+ EXPECT_TRUE(new_options.digit_grouping());
+}
+
+TEST(TextOutputOptions, Multiline) {
+ TextOutputOptions options;
+ TextOutputOptions new_options = options.Multiline(true);
+ EXPECT_FALSE(options.multiline());
+ EXPECT_TRUE(new_options.multiline());
+}
+
+TEST(TextOutputOptions, WithNumericBase) {
+ TextOutputOptions options;
+ TextOutputOptions new_options = options.WithNumericBase(2);
+ EXPECT_EQ(10, options.numeric_base());
+ EXPECT_EQ(2, new_options.numeric_base());
+}
+
+// Small helper function for the various WriteIntegerToTextStream tests; just
+// sets up a stream, forwards its arguments to WriteIntegerToTextStream, and
+// then returns the text from the stream.
+template <typename Arg0, typename... Args>
+::std::string WriteIntegerToString(Arg0 &&arg0, Args &&... args) {
+ TextOutputStream stream;
+ WriteIntegerToTextStream(::std::forward<Arg0>(arg0), &stream,
+ ::std::forward<Args>(args)...);
+ return stream.Result();
+}
+
+TEST(WriteIntegerToTextStream, Decimal) {
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<uint8_t>(0), 10, false));
+ EXPECT_EQ("100", WriteIntegerToString(static_cast<uint8_t>(100), 10, false));
+ EXPECT_EQ("255", WriteIntegerToString(static_cast<uint8_t>(255), 10, false));
+ EXPECT_EQ("-128", WriteIntegerToString(static_cast<int8_t>(-128), 10, false));
+ EXPECT_EQ("-100", WriteIntegerToString(static_cast<int8_t>(-100), 10, false));
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<int8_t>(0), 10, false));
+ EXPECT_EQ("100", WriteIntegerToString(static_cast<int8_t>(100), 10, false));
+ EXPECT_EQ("127", WriteIntegerToString(static_cast<int8_t>(127), 10, false));
+
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<uint8_t>(0), 10, true));
+ EXPECT_EQ("100", WriteIntegerToString(static_cast<uint8_t>(100), 10, true));
+ EXPECT_EQ("255", WriteIntegerToString(static_cast<uint8_t>(255), 10, true));
+ EXPECT_EQ("-128", WriteIntegerToString(static_cast<int8_t>(-128), 10, true));
+ EXPECT_EQ("-100", WriteIntegerToString(static_cast<int8_t>(-100), 10, true));
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<int8_t>(0), 10, true));
+ EXPECT_EQ("100", WriteIntegerToString(static_cast<int8_t>(100), 10, true));
+ EXPECT_EQ("127", WriteIntegerToString(static_cast<int8_t>(127), 10, true));
+
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<uint16_t>(0), 10, false));
+ EXPECT_EQ("1000",
+ WriteIntegerToString(static_cast<uint16_t>(1000), 10, false));
+ EXPECT_EQ("65535",
+ WriteIntegerToString(static_cast<uint16_t>(65535), 10, false));
+ EXPECT_EQ("-32768",
+ WriteIntegerToString(static_cast<int16_t>(-32768), 10, false));
+ EXPECT_EQ("-10000",
+ WriteIntegerToString(static_cast<int16_t>(-10000), 10, false));
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<int16_t>(0), 10, false));
+ EXPECT_EQ("32767",
+ WriteIntegerToString(static_cast<int16_t>(32767), 10, false));
+
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<uint16_t>(0), 10, true));
+ EXPECT_EQ("999", WriteIntegerToString(static_cast<uint16_t>(999), 10, true));
+ EXPECT_EQ("1_000",
+ WriteIntegerToString(static_cast<uint16_t>(1000), 10, true));
+ EXPECT_EQ("65_535",
+ WriteIntegerToString(static_cast<uint16_t>(65535), 10, true));
+ EXPECT_EQ("-32_768",
+ WriteIntegerToString(static_cast<int16_t>(-32768), 10, true));
+ EXPECT_EQ("-1_000",
+ WriteIntegerToString(static_cast<int16_t>(-1000), 10, true));
+ EXPECT_EQ("-999", WriteIntegerToString(static_cast<int16_t>(-999), 10, true));
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<int16_t>(0), 10, true));
+ EXPECT_EQ("32_767",
+ WriteIntegerToString(static_cast<int16_t>(32767), 10, true));
+
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<uint32_t>(0), 10, false));
+ EXPECT_EQ("1000000",
+ WriteIntegerToString(static_cast<uint32_t>(1000000), 10, false));
+ EXPECT_EQ("4294967295",
+ WriteIntegerToString(static_cast<uint32_t>(4294967295), 10, false));
+ EXPECT_EQ("-2147483648",
+ WriteIntegerToString(static_cast<int32_t>(-2147483648), 10, false));
+ EXPECT_EQ("-100000",
+ WriteIntegerToString(static_cast<int32_t>(-100000), 10, false));
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<int32_t>(0), 10, false));
+ EXPECT_EQ("2147483647",
+ WriteIntegerToString(static_cast<int32_t>(2147483647), 10, false));
+
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<uint32_t>(0), 10, true));
+ EXPECT_EQ("999_999",
+ WriteIntegerToString(static_cast<uint32_t>(999999), 10, true));
+ EXPECT_EQ("1_000_000",
+ WriteIntegerToString(static_cast<uint32_t>(1000000), 10, true));
+ EXPECT_EQ("4_294_967_295",
+ WriteIntegerToString(static_cast<uint32_t>(4294967295U), 10, true));
+ EXPECT_EQ("-2_147_483_648",
+ WriteIntegerToString(static_cast<int32_t>(-2147483648L), 10, true));
+ EXPECT_EQ("-999_999",
+ WriteIntegerToString(static_cast<int32_t>(-999999), 10, true));
+ EXPECT_EQ("-1_000_000",
+ WriteIntegerToString(static_cast<int32_t>(-1000000), 10, true));
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<int32_t>(0), 10, true));
+ EXPECT_EQ("2_147_483_647",
+ WriteIntegerToString(static_cast<int32_t>(2147483647), 10, true));
+
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<uint64_t>(0), 10, false));
+ EXPECT_EQ("1000000",
+ WriteIntegerToString(static_cast<uint64_t>(1000000), 10, false));
+ EXPECT_EQ("18446744073709551615",
+ WriteIntegerToString(static_cast<uint64_t>(18446744073709551615UL),
+ 10, false));
+ EXPECT_EQ("-9223372036854775808",
+ WriteIntegerToString(
+ static_cast<int64_t>(-9223372036854775807L - 1), 10, false));
+ EXPECT_EQ("-100000",
+ WriteIntegerToString(static_cast<int64_t>(-100000), 10, false));
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<int64_t>(0), 10, false));
+ EXPECT_EQ("9223372036854775807",
+ WriteIntegerToString(static_cast<int64_t>(9223372036854775807L), 10,
+ false));
+
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<uint64_t>(0), 10, true));
+ EXPECT_EQ("1_000_000",
+ WriteIntegerToString(static_cast<uint64_t>(1000000), 10, true));
+ EXPECT_EQ("18_446_744_073_709_551_615",
+ WriteIntegerToString(static_cast<uint64_t>(18446744073709551615UL),
+ 10, true));
+ EXPECT_EQ("-9_223_372_036_854_775_808",
+ WriteIntegerToString(
+ static_cast<int64_t>(-9223372036854775807L - 1), 10, true));
+ EXPECT_EQ("-100_000",
+ WriteIntegerToString(static_cast<int64_t>(-100000), 10, true));
+ EXPECT_EQ("0", WriteIntegerToString(static_cast<int64_t>(0), 10, true));
+ EXPECT_EQ("9_223_372_036_854_775_807",
+ WriteIntegerToString(static_cast<int64_t>(9223372036854775807L), 10,
+ true));
+}
+
+TEST(WriteIntegerToTextStream, Binary) {
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<uint8_t>(0), 2, false));
+ EXPECT_EQ("0b1100100",
+ WriteIntegerToString(static_cast<uint8_t>(100), 2, false));
+ EXPECT_EQ("0b11111111",
+ WriteIntegerToString(static_cast<uint8_t>(255), 2, false));
+ EXPECT_EQ("-0b10000000",
+ WriteIntegerToString(static_cast<int8_t>(-128), 2, false));
+ EXPECT_EQ("-0b1100100",
+ WriteIntegerToString(static_cast<int8_t>(-100), 2, false));
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<int8_t>(0), 2, false));
+ EXPECT_EQ("0b1100100",
+ WriteIntegerToString(static_cast<int8_t>(100), 2, false));
+ EXPECT_EQ("0b1111111",
+ WriteIntegerToString(static_cast<int8_t>(127), 2, false));
+
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<uint8_t>(0), 2, true));
+ EXPECT_EQ("0b1100100",
+ WriteIntegerToString(static_cast<uint8_t>(100), 2, true));
+ EXPECT_EQ("0b11111111",
+ WriteIntegerToString(static_cast<uint8_t>(255), 2, true));
+ EXPECT_EQ("-0b10000000",
+ WriteIntegerToString(static_cast<int8_t>(-128), 2, true));
+ EXPECT_EQ("-0b1100100",
+ WriteIntegerToString(static_cast<int8_t>(-100), 2, true));
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<int8_t>(0), 2, true));
+ EXPECT_EQ("0b1100100",
+ WriteIntegerToString(static_cast<int8_t>(100), 2, true));
+ EXPECT_EQ("0b1111111",
+ WriteIntegerToString(static_cast<int8_t>(127), 2, true));
+
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<uint16_t>(0), 2, false));
+ EXPECT_EQ("0b1111101000",
+ WriteIntegerToString(static_cast<uint16_t>(1000), 2, false));
+ EXPECT_EQ("0b1111111111111111",
+ WriteIntegerToString(static_cast<uint16_t>(65535), 2, false));
+ EXPECT_EQ("-0b1000000000000000",
+ WriteIntegerToString(static_cast<int16_t>(-32768), 2, false));
+ EXPECT_EQ("-0b10011100010000",
+ WriteIntegerToString(static_cast<int16_t>(-10000), 2, false));
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<int16_t>(0), 2, false));
+ EXPECT_EQ("0b111111111111111",
+ WriteIntegerToString(static_cast<int16_t>(32767), 2, false));
+
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<uint16_t>(0), 2, true));
+ EXPECT_EQ("0b11_11101000",
+ WriteIntegerToString(static_cast<uint16_t>(1000), 2, true));
+ EXPECT_EQ("0b11111111_11111111",
+ WriteIntegerToString(static_cast<uint16_t>(65535), 2, true));
+ EXPECT_EQ("-0b10000000_00000000",
+ WriteIntegerToString(static_cast<int16_t>(-32768), 2, true));
+ EXPECT_EQ("-0b11_11101000",
+ WriteIntegerToString(static_cast<int16_t>(-1000), 2, true));
+ EXPECT_EQ("-0b11_11100111",
+ WriteIntegerToString(static_cast<int16_t>(-999), 2, true));
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<int16_t>(0), 2, true));
+ EXPECT_EQ("0b1111111_11111111",
+ WriteIntegerToString(static_cast<int16_t>(32767), 2, true));
+
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<uint32_t>(0), 2, false));
+ EXPECT_EQ("0b11110100001001000000",
+ WriteIntegerToString(static_cast<uint32_t>(1000000), 2, false));
+ EXPECT_EQ("0b11111111111111111111111111111111",
+ WriteIntegerToString(static_cast<uint32_t>(4294967295), 2, false));
+ EXPECT_EQ("-0b10000000000000000000000000000000",
+ WriteIntegerToString(static_cast<int32_t>(-2147483648), 2, false));
+ EXPECT_EQ("-0b11000011010100000",
+ WriteIntegerToString(static_cast<int32_t>(-100000), 2, false));
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<int32_t>(0), 2, false));
+ EXPECT_EQ("0b1111111111111111111111111111111",
+ WriteIntegerToString(static_cast<int32_t>(2147483647), 2, false));
+
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<uint32_t>(0), 2, true));
+ EXPECT_EQ("0b1111_01000010_01000000",
+ WriteIntegerToString(static_cast<uint32_t>(1000000), 2, true));
+ EXPECT_EQ("0b11111111_11111111_11111111_11111111",
+ WriteIntegerToString(static_cast<uint32_t>(4294967295U), 2, true));
+ EXPECT_EQ("-0b10000000_00000000_00000000_00000000",
+ WriteIntegerToString(static_cast<int32_t>(-2147483648L), 2, true));
+ EXPECT_EQ("-0b1111_01000010_01000000",
+ WriteIntegerToString(static_cast<int32_t>(-1000000), 2, true));
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<int32_t>(0), 2, true));
+ EXPECT_EQ("0b1111111_11111111_11111111_11111111",
+ WriteIntegerToString(static_cast<int32_t>(2147483647), 2, true));
+
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<uint64_t>(0), 2, false));
+ EXPECT_EQ("0b11110100001001000000",
+ WriteIntegerToString(static_cast<uint64_t>(1000000), 2, false));
+ EXPECT_EQ(
+ "0b1111111111111111111111111111111111111111111111111111111111111111",
+ WriteIntegerToString(static_cast<uint64_t>(18446744073709551615UL), 2,
+ false));
+ EXPECT_EQ(
+ "-0b1000000000000000000000000000000000000000000000000000000000000000",
+ WriteIntegerToString(static_cast<int64_t>(-9223372036854775807L - 1), 2,
+ false));
+ EXPECT_EQ("-0b11000011010100000",
+ WriteIntegerToString(static_cast<int64_t>(-100000), 2, false));
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<int64_t>(0), 2, false));
+ EXPECT_EQ("0b111111111111111111111111111111111111111111111111111111111111111",
+ WriteIntegerToString(static_cast<int64_t>(9223372036854775807L), 2,
+ false));
+
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<uint64_t>(0), 2, true));
+ EXPECT_EQ("0b1111_01000010_01000000",
+ WriteIntegerToString(static_cast<uint64_t>(1000000), 2, true));
+ EXPECT_EQ(
+ "0b11111111_11111111_11111111_11111111_11111111_11111111_11111111_"
+ "11111111",
+ WriteIntegerToString(static_cast<uint64_t>(18446744073709551615UL), 2,
+ true));
+ EXPECT_EQ(
+ "-0b10000000_00000000_00000000_00000000_00000000_00000000_00000000_"
+ "00000000",
+ WriteIntegerToString(static_cast<int64_t>(-9223372036854775807L - 1), 2,
+ true));
+ EXPECT_EQ("-0b1_10000110_10100000",
+ WriteIntegerToString(static_cast<int64_t>(-100000), 2, true));
+ EXPECT_EQ("0b0", WriteIntegerToString(static_cast<int64_t>(0), 2, true));
+ EXPECT_EQ(
+ "0b1111111_11111111_11111111_11111111_11111111_11111111_11111111_"
+ "11111111",
+ WriteIntegerToString(static_cast<int64_t>(9223372036854775807L), 2,
+ true));
+}
+
+TEST(WriteIntegerToTextStream, Hexadecimal) {
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<uint8_t>(0), 16, false));
+ EXPECT_EQ("0x64", WriteIntegerToString(static_cast<uint8_t>(100), 16, false));
+ EXPECT_EQ("0xff", WriteIntegerToString(static_cast<uint8_t>(255), 16, false));
+ EXPECT_EQ("-0x80",
+ WriteIntegerToString(static_cast<int8_t>(-128), 16, false));
+ EXPECT_EQ("-0x64",
+ WriteIntegerToString(static_cast<int8_t>(-100), 16, false));
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<int8_t>(0), 16, false));
+ EXPECT_EQ("0x64", WriteIntegerToString(static_cast<int8_t>(100), 16, false));
+ EXPECT_EQ("0x7f", WriteIntegerToString(static_cast<int8_t>(127), 16, false));
+
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<uint8_t>(0), 16, true));
+ EXPECT_EQ("0x64", WriteIntegerToString(static_cast<uint8_t>(100), 16, true));
+ EXPECT_EQ("0xff", WriteIntegerToString(static_cast<uint8_t>(255), 16, true));
+ EXPECT_EQ("-0x80", WriteIntegerToString(static_cast<int8_t>(-128), 16, true));
+ EXPECT_EQ("-0x64", WriteIntegerToString(static_cast<int8_t>(-100), 16, true));
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<int8_t>(0), 16, true));
+ EXPECT_EQ("0x64", WriteIntegerToString(static_cast<int8_t>(100), 16, true));
+ EXPECT_EQ("0x7f", WriteIntegerToString(static_cast<int8_t>(127), 16, true));
+
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<uint16_t>(0), 16, false));
+ EXPECT_EQ("0x3e8",
+ WriteIntegerToString(static_cast<uint16_t>(1000), 16, false));
+ EXPECT_EQ("0xffff",
+ WriteIntegerToString(static_cast<uint16_t>(65535), 16, false));
+ EXPECT_EQ("-0x8000",
+ WriteIntegerToString(static_cast<int16_t>(-32768), 16, false));
+ EXPECT_EQ("-0x2710",
+ WriteIntegerToString(static_cast<int16_t>(-10000), 16, false));
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<int16_t>(0), 16, false));
+ EXPECT_EQ("0x7fff",
+ WriteIntegerToString(static_cast<int16_t>(32767), 16, false));
+
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<uint16_t>(0), 16, true));
+ EXPECT_EQ("0x3e8",
+ WriteIntegerToString(static_cast<uint16_t>(1000), 16, true));
+ EXPECT_EQ("0xffff",
+ WriteIntegerToString(static_cast<uint16_t>(65535), 16, true));
+ EXPECT_EQ("-0x8000",
+ WriteIntegerToString(static_cast<int16_t>(-32768), 16, true));
+ EXPECT_EQ("-0x3e8",
+ WriteIntegerToString(static_cast<int16_t>(-1000), 16, true));
+ EXPECT_EQ("-0x3e7",
+ WriteIntegerToString(static_cast<int16_t>(-999), 16, true));
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<int16_t>(0), 16, true));
+ EXPECT_EQ("0x7fff",
+ WriteIntegerToString(static_cast<int16_t>(32767), 16, true));
+
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<uint32_t>(0), 16, false));
+ EXPECT_EQ("0xf4240",
+ WriteIntegerToString(static_cast<uint32_t>(1000000), 16, false));
+ EXPECT_EQ("0xffffffff",
+ WriteIntegerToString(static_cast<uint32_t>(4294967295), 16, false));
+ EXPECT_EQ("-0x80000000",
+ WriteIntegerToString(static_cast<int32_t>(-2147483648), 16, false));
+ EXPECT_EQ("-0x186a0",
+ WriteIntegerToString(static_cast<int32_t>(-100000), 16, false));
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<int32_t>(0), 16, false));
+ EXPECT_EQ("0x7fffffff",
+ WriteIntegerToString(static_cast<int32_t>(2147483647), 16, false));
+
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<uint32_t>(0), 16, true));
+ EXPECT_EQ("0xf_4240",
+ WriteIntegerToString(static_cast<uint32_t>(1000000), 16, true));
+ EXPECT_EQ("0xffff_ffff",
+ WriteIntegerToString(static_cast<uint32_t>(4294967295U), 16, true));
+ EXPECT_EQ("-0x8000_0000",
+ WriteIntegerToString(static_cast<int32_t>(-2147483648L), 16, true));
+ EXPECT_EQ("-0xf_4240",
+ WriteIntegerToString(static_cast<int32_t>(-1000000), 16, true));
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<int32_t>(0), 16, true));
+ EXPECT_EQ("0x7fff_ffff",
+ WriteIntegerToString(static_cast<int32_t>(2147483647), 16, true));
+
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<uint64_t>(0), 16, false));
+ EXPECT_EQ("0xf4240",
+ WriteIntegerToString(static_cast<uint64_t>(1000000), 16, false));
+ EXPECT_EQ("0xffffffffffffffff",
+ WriteIntegerToString(static_cast<uint64_t>(18446744073709551615UL),
+ 16, false));
+ EXPECT_EQ("-0x8000000000000000",
+ WriteIntegerToString(
+ static_cast<int64_t>(-9223372036854775807L - 1), 16, false));
+ EXPECT_EQ("-0x186a0",
+ WriteIntegerToString(static_cast<int64_t>(-100000), 16, false));
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<int64_t>(0), 16, false));
+ EXPECT_EQ("0x7fffffffffffffff",
+ WriteIntegerToString(static_cast<int64_t>(9223372036854775807L), 16,
+ false));
+
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<uint64_t>(0), 16, true));
+ EXPECT_EQ("0xf_4240",
+ WriteIntegerToString(static_cast<uint64_t>(1000000), 16, true));
+ EXPECT_EQ("0xffff_ffff_ffff_ffff",
+ WriteIntegerToString(static_cast<uint64_t>(18446744073709551615UL),
+ 16, true));
+ EXPECT_EQ("-0x8000_0000_0000_0000",
+ WriteIntegerToString(
+ static_cast<int64_t>(-9223372036854775807L - 1), 16, true));
+ EXPECT_EQ("-0x1_86a0",
+ WriteIntegerToString(static_cast<int64_t>(-100000), 16, true));
+ EXPECT_EQ("0x0", WriteIntegerToString(static_cast<int64_t>(0), 16, true));
+ EXPECT_EQ("0x7fff_ffff_ffff_ffff",
+ WriteIntegerToString(static_cast<int64_t>(9223372036854775807L), 16,
+ true));
+}
+
+// Small helper function for the various WriteFloatToTextStream tests; just sets
+// up a stream, forwards its arguments to WriteFloatToTextStream, and then
+// returns the text from the stream.
+template <typename Arg0, typename... Args>
+::std::string WriteFloatToString(Arg0 &&arg0, Args &&... args) {
+ TextOutputStream stream;
+ WriteFloatToTextStream(::std::forward<Arg0>(arg0), &stream,
+ ::std::forward<Args>(args)...);
+ return stream.Result();
+}
+
+TEST(WriteFloatToTextStream, RegularNumbers) {
+ EXPECT_EQ("0", WriteFloatToString(0.0, TextOutputOptions()));
+ EXPECT_EQ("1", WriteFloatToString(1.0, TextOutputOptions()));
+ EXPECT_EQ("1.5", WriteFloatToString(1.5, TextOutputOptions()));
+ // TODO(bolms): Figure out how to get minimal-length output.
+ EXPECT_EQ("1.6000000000000001", WriteFloatToString(1.6, TextOutputOptions()));
+ EXPECT_EQ("123456789", WriteFloatToString(123456789.0, TextOutputOptions()));
+ EXPECT_EQ("12345678901234568",
+ WriteFloatToString(12345678901234567.0, TextOutputOptions()));
+ EXPECT_EQ("-12345678901234568",
+ WriteFloatToString(-12345678901234567.0, TextOutputOptions()));
+ EXPECT_EQ("-1.2345678901234568e+17",
+ WriteFloatToString(-123456789012345678.0, TextOutputOptions()));
+ EXPECT_EQ("4.9406564584124654e-324",
+ WriteFloatToString(::std::numeric_limits<double>::denorm_min(),
+ TextOutputOptions()));
+ EXPECT_EQ("1.7976931348623157e+308",
+ WriteFloatToString(::std::numeric_limits<double>::max(),
+ TextOutputOptions()));
+
+ EXPECT_EQ("0", WriteFloatToString(0.0f, TextOutputOptions()));
+ EXPECT_EQ("1", WriteFloatToString(1.0f, TextOutputOptions()));
+ EXPECT_EQ("1.5", WriteFloatToString(1.5f, TextOutputOptions()));
+ EXPECT_EQ("1.60000002", WriteFloatToString(1.6f, TextOutputOptions()));
+ EXPECT_EQ("123456792", WriteFloatToString(123456789.0f, TextOutputOptions()));
+ EXPECT_EQ("1.23456784e+16",
+ WriteFloatToString(12345678901234567.0f, TextOutputOptions()));
+ EXPECT_EQ("-1.23456784e+16",
+ WriteFloatToString(-12345678901234567.0f, TextOutputOptions()));
+ EXPECT_EQ("-1.00000003e+16",
+ WriteFloatToString(-10000000000000000.0f, TextOutputOptions()));
+ EXPECT_EQ("1.40129846e-45",
+ WriteFloatToString(::std::numeric_limits<float>::denorm_min(),
+ TextOutputOptions()));
+ EXPECT_EQ("3.40282347e+38",
+ WriteFloatToString(::std::numeric_limits<float>::max(),
+ TextOutputOptions()));
+}
+
+TEST(WriteFloatToTextStream, Infinities) {
+ EXPECT_EQ("Inf", WriteFloatToString(2 * ::std::numeric_limits<double>::max(),
+ TextOutputOptions()));
+ EXPECT_EQ("Inf", WriteFloatToString(2 * ::std::numeric_limits<float>::max(),
+ TextOutputOptions()));
+ EXPECT_EQ("-Inf",
+ WriteFloatToString(-2 * ::std::numeric_limits<double>::max(),
+ TextOutputOptions()));
+ EXPECT_EQ("-Inf", WriteFloatToString(-2 * ::std::numeric_limits<float>::max(),
+ TextOutputOptions()));
+}
+
+// C++ does not provide great low-level manipulation for NaNs, so we resort to
+// this mess.
+double MakeNanDouble(::std::uint64_t payload, int sign) {
+ payload |= 0x7ff0000000000000UL;
+ if (sign < 0) {
+ payload |= 0x8000000000000000UL;
+ }
+ double result;
+ ::std::memcpy(&result, &payload, sizeof result);
+ return result;
+}
+
+float MakeNanFloat(::std::uint32_t payload, int sign) {
+ payload |= 0x7f800000U;
+ if (sign < 0) {
+ payload |= 0x80000000U;
+ }
+ float result;
+ ::std::memcpy(&result, &payload, sizeof result);
+ return result;
+}
+
+TEST(WriteFloatToTextStream, Nans) {
+ EXPECT_EQ("NaN(0x1)",
+ WriteFloatToString(MakeNanDouble(1, 0), TextOutputOptions()));
+ EXPECT_EQ("NaN(0x1)",
+ WriteFloatToString(MakeNanFloat(1, 0), TextOutputOptions()));
+ EXPECT_EQ("NaN(0x10000)",
+ WriteFloatToString(MakeNanDouble(0x10000, 0), TextOutputOptions()));
+ EXPECT_EQ("NaN(0x7fffff)", WriteFloatToString(MakeNanFloat(0x7fffffU, 0),
+ TextOutputOptions()));
+ EXPECT_EQ("NaN(0xfffffffffffff)",
+ WriteFloatToString(MakeNanDouble(0xfffffffffffffUL, 0),
+ TextOutputOptions()));
+ EXPECT_EQ("-NaN(0x7fffff)", WriteFloatToString(MakeNanFloat(0x7fffffU, -1),
+ TextOutputOptions()));
+ EXPECT_EQ("-NaN(0xfffffffffffff)",
+ WriteFloatToString(MakeNanDouble(0xfffffffffffffUL, -1),
+ TextOutputOptions()));
+ EXPECT_EQ("NaN(0x10000)",
+ WriteFloatToString(MakeNanFloat(0x10000, 0), TextOutputOptions()));
+ EXPECT_EQ("-NaN(0x1)",
+ WriteFloatToString(MakeNanDouble(1, -1), TextOutputOptions()));
+ EXPECT_EQ("-NaN(0x1)",
+ WriteFloatToString(MakeNanFloat(1, -1), TextOutputOptions()));
+ EXPECT_EQ("-NaN(0x10000)", WriteFloatToString(MakeNanDouble(0x10000, -1),
+ TextOutputOptions()));
+ EXPECT_EQ("-NaN(0x10000)",
+ WriteFloatToString(MakeNanFloat(0x10000, -1), TextOutputOptions()));
+ EXPECT_EQ("-NaN(0x1_0000)",
+ WriteFloatToString(MakeNanDouble(0x10000, -1),
+ TextOutputOptions().WithDigitGrouping(true)));
+ EXPECT_EQ("-NaN(0x1_0000)",
+ WriteFloatToString(MakeNanFloat(0x10000, -1),
+ TextOutputOptions().WithDigitGrouping(true)));
+}
+
+TEST(DecodeFloat, RegularNumbers) {
+ double double_result;
+ EXPECT_TRUE(DecodeFloat("0", &double_result));
+ EXPECT_EQ(0.0, double_result);
+ EXPECT_FALSE(::std::signbit(double_result));
+ EXPECT_TRUE(DecodeFloat("-0", &double_result));
+ EXPECT_EQ(0.0, double_result);
+ EXPECT_TRUE(::std::signbit(double_result));
+ EXPECT_TRUE(DecodeFloat("0.0", &double_result));
+ EXPECT_EQ(0.0, double_result);
+ EXPECT_TRUE(DecodeFloat("0.0e100", &double_result));
+ EXPECT_EQ(0.0, double_result);
+ EXPECT_TRUE(DecodeFloat("0x0.0p100", &double_result));
+ EXPECT_EQ(0.0, double_result);
+ EXPECT_TRUE(DecodeFloat("1", &double_result));
+ EXPECT_EQ(1.0, double_result);
+ EXPECT_TRUE(DecodeFloat("1.5", &double_result));
+ EXPECT_EQ(1.5, double_result);
+ EXPECT_TRUE(DecodeFloat("1.6", &double_result));
+ EXPECT_EQ(1.6, double_result);
+ EXPECT_TRUE(DecodeFloat("1.6000000000000001", &double_result));
+ EXPECT_EQ(1.6, double_result);
+ EXPECT_TRUE(DecodeFloat("123456789", &double_result));
+ EXPECT_EQ(123456789.0, double_result);
+ EXPECT_TRUE(DecodeFloat("-1.234567890123458e+17", &double_result));
+ EXPECT_EQ(-1.234567890123458e+17, double_result);
+ EXPECT_TRUE(DecodeFloat("4.9406564584124654e-324", &double_result));
+ EXPECT_EQ(4.9406564584124654e-324, double_result);
+ EXPECT_TRUE(DecodeFloat("1.7976931348623157e+308", &double_result));
+ EXPECT_EQ(1.7976931348623157e+308, double_result);
+ EXPECT_TRUE(DecodeFloat(
+ "000000000000000000000000000004.9406564584124654e-324", &double_result));
+ EXPECT_EQ(4.9406564584124654e-324, double_result);
+
+ float float_result;
+ EXPECT_TRUE(DecodeFloat("0", &float_result));
+ EXPECT_EQ(0.0f, float_result);
+ EXPECT_FALSE(::std::signbit(float_result));
+ EXPECT_TRUE(DecodeFloat("-0", &float_result));
+ EXPECT_EQ(0.0f, float_result);
+ EXPECT_TRUE(::std::signbit(float_result));
+ EXPECT_TRUE(DecodeFloat("0.0", &float_result));
+ EXPECT_EQ(0.0f, float_result);
+ EXPECT_TRUE(DecodeFloat("0.0e100", &float_result));
+ EXPECT_EQ(0.0f, float_result);
+ EXPECT_TRUE(DecodeFloat("0x0.0p100", &float_result));
+ EXPECT_EQ(0.0f, float_result);
+ EXPECT_TRUE(DecodeFloat("1", &float_result));
+ EXPECT_EQ(1.0f, float_result);
+ EXPECT_TRUE(DecodeFloat("1.5", &float_result));
+ EXPECT_EQ(1.5f, float_result);
+ EXPECT_TRUE(DecodeFloat("1.6", &float_result));
+ EXPECT_EQ(1.6f, float_result);
+ EXPECT_TRUE(DecodeFloat("1.6000000000000001", &float_result));
+ EXPECT_EQ(1.6f, float_result);
+ EXPECT_TRUE(DecodeFloat("123456789", &float_result));
+ EXPECT_EQ(123456789.0f, float_result);
+ EXPECT_TRUE(DecodeFloat("-1.23456784e+16", &float_result));
+ EXPECT_EQ(-1.23456784e+16f, float_result);
+ EXPECT_TRUE(DecodeFloat("1.40129846e-45", &float_result));
+ EXPECT_EQ(1.40129846e-45f, float_result);
+ EXPECT_TRUE(DecodeFloat("3.40282347e+38", &float_result));
+ EXPECT_EQ(3.40282347e+38f, float_result);
+
+ // TODO(bolms): "_"-grouped numbers, like "123_456.789", should probably be
+ // allowed.
+}
+
+TEST(DecodeFloat, BadValues) {
+ double result;
+ float float_result;
+
+ // An empty string is not a value.
+ EXPECT_FALSE(DecodeFloat("", &result));
+
+ // Trailing characters after "Inf" are not allowed.
+ EXPECT_FALSE(DecodeFloat("INF+", &result));
+ EXPECT_FALSE(DecodeFloat("Infinity", &result));
+
+ // Trailing characters after "NaN" are not allowed.
+ EXPECT_FALSE(DecodeFloat("NaN(", &result));
+ EXPECT_FALSE(DecodeFloat("NaN(0]", &result));
+ EXPECT_FALSE(DecodeFloat("NaNaNaNa", &result));
+
+ // Non-number NaN payloads are not allowed.
+ EXPECT_FALSE(DecodeFloat("NaN()", &result));
+ EXPECT_FALSE(DecodeFloat("NaN(x)", &result));
+ EXPECT_FALSE(DecodeFloat("NaN(0x)", &result));
+
+ // Negative NaN payloads are not allowed.
+ EXPECT_FALSE(DecodeFloat("NaN(-1)", &result));
+ EXPECT_FALSE(DecodeFloat("NaN(-0)", &result));
+
+ // NaN with zero payload is infinity, and is thus not allowed.
+ EXPECT_FALSE(DecodeFloat("NaN(0)", &result));
+ EXPECT_FALSE(DecodeFloat("-NaN(0)", &result));
+
+ // NaN double payloads must be no more than 52 bits.
+ EXPECT_FALSE(DecodeFloat("NaN(0x10_0000_0000_0000)", &result));
+ EXPECT_FALSE(DecodeFloat("NaN(0x8000_0000_0000_0000)", &result));
+ EXPECT_FALSE(DecodeFloat("NaN(0x1_0000_0000_0000_0000)", &result));
+
+ // NaN float payloads must be no more than 23 bits.
+ EXPECT_FALSE(DecodeFloat("NaN(0x80_0000)", &float_result));
+ EXPECT_FALSE(DecodeFloat("NaN(0x8000_0000)", &float_result));
+ EXPECT_FALSE(DecodeFloat("NaN(0x1_0000_0000)", &float_result));
+
+ // Trailing characters after regular values are not allowed.
+ EXPECT_FALSE(DecodeFloat("0x", &result));
+ EXPECT_FALSE(DecodeFloat("0e0a", &result));
+ EXPECT_FALSE(DecodeFloat("0b0", &result));
+ EXPECT_FALSE(DecodeFloat("0a", &result));
+ EXPECT_FALSE(DecodeFloat("1..", &result));
+
+ // Grouping characters like "," should not be allowed.
+ EXPECT_FALSE(DecodeFloat("123,456", &result));
+ EXPECT_FALSE(DecodeFloat("123'456", &result));
+}
+
+TEST(DecodeFloat, Infinities) {
+ double double_result;
+ EXPECT_TRUE(DecodeFloat("Inf", &double_result));
+ EXPECT_TRUE(::std::isinf(double_result));
+ EXPECT_FALSE(::std::signbit(double_result));
+ EXPECT_TRUE(DecodeFloat("-Inf", &double_result));
+ EXPECT_TRUE(::std::isinf(double_result));
+ EXPECT_TRUE(::std::signbit(double_result));
+ EXPECT_TRUE(DecodeFloat("+Inf", &double_result));
+ EXPECT_TRUE(::std::isinf(double_result));
+ EXPECT_FALSE(::std::signbit(double_result));
+ EXPECT_TRUE(DecodeFloat("iNF", &double_result));
+ EXPECT_TRUE(::std::isinf(double_result));
+ EXPECT_FALSE(::std::signbit(double_result));
+ EXPECT_TRUE(DecodeFloat("-iNF", &double_result));
+ EXPECT_TRUE(::std::isinf(double_result));
+ EXPECT_TRUE(::std::signbit(double_result));
+ EXPECT_TRUE(DecodeFloat("+iNF", &double_result));
+ EXPECT_TRUE(::std::isinf(double_result));
+ EXPECT_FALSE(::std::signbit(double_result));
+}
+
+// Helper functions for converting NaNs to bit patterns, so that the exact bit
+// pattern result can be tested.
+::std::uint64_t DoubleBitPattern(double n) {
+ ::std::uint64_t result;
+ ::std::memcpy(&result, &n, sizeof(result));
+ return result;
+}
+
+::std::uint32_t FloatBitPattern(float n) {
+ ::std::uint32_t result;
+ ::std::memcpy(&result, &n, sizeof(result));
+ return result;
+}
+
+TEST(DecodeFloat, Nans) {
+ double double_result;
+ EXPECT_TRUE(DecodeFloat("nan", &double_result));
+ EXPECT_TRUE(::std::isnan(double_result));
+ EXPECT_FALSE(::std::signbit(double_result));
+ EXPECT_TRUE(DecodeFloat("-NAN", &double_result));
+ EXPECT_TRUE(::std::isnan(double_result));
+ EXPECT_TRUE(::std::signbit(double_result));
+ EXPECT_TRUE(DecodeFloat("NaN(1)", &double_result));
+ EXPECT_TRUE(::std::isnan(double_result));
+ EXPECT_EQ(0x7ff0000000000001UL, DoubleBitPattern(double_result));
+ EXPECT_TRUE(DecodeFloat("nAn(0x1000)", &double_result));
+ EXPECT_TRUE(::std::isnan(double_result));
+ EXPECT_EQ(0x7ff0000000001000UL, DoubleBitPattern(double_result));
+ EXPECT_TRUE(DecodeFloat("NaN(0b11000011)", &double_result));
+ EXPECT_TRUE(::std::isnan(double_result));
+ EXPECT_EQ(0x7ff00000000000c3UL, DoubleBitPattern(double_result));
+ EXPECT_TRUE(DecodeFloat("-NaN(0b11000011)", &double_result));
+ EXPECT_TRUE(::std::isnan(double_result));
+ EXPECT_EQ(0xfff00000000000c3UL, DoubleBitPattern(double_result));
+ EXPECT_TRUE(DecodeFloat("+NaN(0b11000011)", &double_result));
+ EXPECT_TRUE(::std::isnan(double_result));
+ EXPECT_EQ(0x7ff00000000000c3UL, DoubleBitPattern(double_result));
+ EXPECT_TRUE(DecodeFloat("NaN(0xf_ffff_ffff_ffff)", &double_result));
+ EXPECT_TRUE(::std::isnan(double_result));
+ EXPECT_EQ(0x7fffffffffffffffUL, DoubleBitPattern(double_result));
+ EXPECT_TRUE(DecodeFloat("-NaN(0xf_ffff_ffff_ffff)", &double_result));
+ EXPECT_TRUE(::std::isnan(double_result));
+ EXPECT_EQ(0xffffffffffffffffUL, DoubleBitPattern(double_result));
+
+ float float_result;
+ EXPECT_TRUE(DecodeFloat("nan", &float_result));
+ EXPECT_TRUE(::std::isnan(float_result));
+ EXPECT_FALSE(::std::signbit(float_result));
+ EXPECT_TRUE(DecodeFloat("-NAN", &float_result));
+ EXPECT_TRUE(::std::isnan(float_result));
+ EXPECT_TRUE(::std::signbit(float_result));
+ EXPECT_TRUE(DecodeFloat("NaN(1)", &float_result));
+ EXPECT_TRUE(::std::isnan(float_result));
+ EXPECT_EQ(0x7f800001U, FloatBitPattern(float_result));
+ EXPECT_TRUE(DecodeFloat("nAn(0x1000)", &float_result));
+ EXPECT_TRUE(::std::isnan(float_result));
+ EXPECT_EQ(0x7f801000U, FloatBitPattern(float_result));
+ EXPECT_TRUE(DecodeFloat("NaN(0b11000011)", &float_result));
+ EXPECT_TRUE(::std::isnan(float_result));
+ EXPECT_EQ(0x7f8000c3U, FloatBitPattern(float_result));
+ EXPECT_TRUE(DecodeFloat("-NaN(0b11000011)", &float_result));
+ EXPECT_TRUE(::std::isnan(float_result));
+ EXPECT_EQ(0xff8000c3U, FloatBitPattern(float_result));
+ EXPECT_TRUE(DecodeFloat("+NaN(0b11000011)", &float_result));
+ EXPECT_TRUE(::std::isnan(float_result));
+ EXPECT_EQ(0x7f8000c3U, FloatBitPattern(float_result));
+ EXPECT_TRUE(DecodeFloat("NaN(0x7f_ffff)", &float_result));
+ EXPECT_TRUE(::std::isnan(float_result));
+ EXPECT_EQ(0x7fffffffU, FloatBitPattern(float_result));
+ EXPECT_TRUE(DecodeFloat("-NaN(0x7f_ffff)", &float_result));
+ EXPECT_TRUE(::std::isnan(float_result));
+ EXPECT_EQ(0xffffffffU, FloatBitPattern(float_result));
+}
+
+} // namespace test
+} // namespace support
+} // namespace emboss
diff --git a/public/emboss_view_parameters.h b/public/emboss_view_parameters.h
new file mode 100644
index 0000000..f2a6e06
--- /dev/null
+++ b/public/emboss_view_parameters.h
@@ -0,0 +1,45 @@
+// Copyright 2019 Google LLC
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// https://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// Helper classes for constructing the `Parameters` template argument to view
+// classes.
+
+#ifndef EMBOSS_PUBLIC_EMBOSS_VIEW_PARAMETERS_H_
+#define EMBOSS_PUBLIC_EMBOSS_VIEW_PARAMETERS_H_
+
+namespace emboss {
+namespace support {
+
+template <int kBitsParam, typename Verifier>
+struct FixedSizeViewParameters {
+ static constexpr int kBits = kBitsParam;
+ template <typename ValueType>
+ static constexpr bool ValueIsOk(ValueType value) {
+ return Verifier::ValueIsOk(value);
+ }
+ // TODO(bolms): add AllValuesAreOk(), and use it to shortcut Ok() processing
+ // for arrays and other compound objects.
+};
+
+struct AllValuesAreOk {
+ template <typename ValueType>
+ static constexpr bool ValueIsOk(ValueType) {
+ return true;
+ }
+};
+
+} // namespace support
+} // namespace emboss
+
+#endif // EMBOSS_PUBLIC_EMBOSS_VIEW_PARAMETERS_H_
diff --git a/public/ir_pb2.py b/public/ir_pb2.py
new file mode 100644
index 0000000..b33a06b
--- /dev/null
+++ b/public/ir_pb2.py
@@ -0,0 +1,937 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Intermediate representation (IR) for Emboss.
+#
+# This was originally a Google Protocol Buffer file, but as of 2019 it turns
+# out that a) the public Google Python Protocol Buffer implementation is
+# extremely slow, and b) all the ways of getting Bazel+Python+Protocol Buffers
+# to play nice are hacky and fragile.
+#
+# Thus, this file, which presents a similar-enough interface that the rest of
+# Emboss can use it with minimal changes.
+#
+# Protobufs have a really, really strange, un-Pythonic interface, with tricky
+# implicit semantics -- mostly around magically instantiating protos when you
+# assign to some deeply-nested field. I (bolms@) would *strongly* prefer to
+# have a more explicit interface, but don't (currently) have time to refactor
+# everything that touches the IR (i.e., the entire compiler).
+
+import json
+import re
+import sys
+
+
+if sys.version_info[0] == 2:
+ _text = unicode
+ _text_types = (unicode, str)
+ _int = long
+ _int_types = (int, long)
+else:
+ _text = str
+ _text_types = (str,)
+ _int = int
+ _int_types = (int,)
+
+
+_BASIC_TYPES = _text_types + _int_types + (bool,)
+
+
+class Optional(object):
+ def __init__(self, type, oneof=None, names=None):
+ self._type = type
+ self._oneof = oneof
+ self._names = names
+
+ def __get__(self, obj, type=None):
+ result = obj.raw_fields.get(self.name, None)
+ if result is not None:
+ return result
+ if self.type in _BASIC_TYPES:
+ return self._type()
+ result = self._type()
+
+ def on_write():
+ self._set_value(obj, result)
+
+ result.set_on_write(on_write)
+ return result
+
+ def __set__(self, obj, value):
+ if issubclass(self._type, _BASIC_TYPES):
+ self.set(obj, value)
+ else:
+ raise AttributeError("Cannot set {} (type {}) for type {}".format(
+ value, value.__class__, self._type))
+
+ def _set_value(self, obj, value):
+ if self._oneof is not None:
+ current = obj.oneofs.get(self._oneof)
+ if current in obj.raw_fields:
+ del obj.raw_fields[current]
+ obj.oneofs[self._oneof] = self.name
+ obj.raw_fields[self.name] = value
+ obj.on_write()
+
+ def set(self, obj, value):
+ if value is None:
+ return
+ if isinstance(value, dict):
+ self._set_value(obj, self._type(**value))
+ elif isinstance(value, _text) and self._names:
+ self._set_value(obj, self._type(self._names(value)))
+ elif (not isinstance(value, self._type) and
+ not (self._type == _int and isinstance(value, _int_types)) and
+ not (self._type == _text and isinstance(value, _text_types))):
+ raise AttributeError("Cannot set {} (type {}) for type {}".format(
+ value, value.__class__, self._type))
+ elif issubclass(self._type, Message):
+ self._set_value(obj, self._type(**value.raw_fields))
+ else:
+ self._set_value(obj, self._type(value))
+
+ def resolve_type(self):
+ if isinstance(self._type, type(lambda: None)):
+ self._type = self._type()
+
+ @property
+ def type(self):
+ return self._type
+
+
+class Oneof(object):
+ def __init__(self, **members):
+ self.members = members
+
+ def __get__(self, obj, type=None):
+ return obj.oneofs[self.name]
+
+ def __set__(self, obj, value):
+ raise AttributeError("Cannot set {}".format(self.name))
+
+
+class TypedScopedList(object):
+ def __init__(self, type, on_write=lambda: None):
+ self._type = type
+ self._list = []
+ self._on_write = on_write
+
+ def __iter__(self):
+ return iter(self._list)
+
+ def __delitem__(self, key):
+ del self._list[key]
+
+ def __getitem__(self, key):
+ return self._list[key]
+
+ def extend(self, values):
+ for value in values:
+ if isinstance(value, dict):
+ self._list.append(self._type(**value))
+ elif (not isinstance(value, self._type) and
+ not (self._type == _int and isinstance(value, _int_types)) and
+ not (self._type == _text and isinstance(value, _text_types))):
+ raise TypeError(
+ "Needed {}, got {} ({!r})".format(
+ self._type, value.__class__, value))
+ else:
+ if self._type in _BASIC_TYPES:
+ self._list.append(self._type(value))
+ else:
+ self._list.append(self._type(**value.raw_fields))
+ self._on_write()
+
+ def __repr__(self):
+ return repr(self._list)
+
+ def __len__(self):
+ return len(self._list)
+
+ def __eq__(self, other):
+ return ((self.__class__ == other.__class__ and
+ self._list == other._list) or
+ (isinstance(other, list) and self._list == other))
+
+ def __ne__(self, other):
+ return not (self == other)
+
+
+class Repeated(object):
+ def __init__(self, type):
+ self._type = type
+
+ def __get__(self, obj, type=None):
+ return obj.raw_fields[self.name]
+
+ def __set__(self, obj, value):
+ raise AttributeError("Cannot set {}".format(self.name))
+
+ def set(self, obj, values):
+ typed_list = obj.raw_fields[self.name]
+ if not isinstance(values, (list, TypedScopedList)):
+ raise TypeError("Cannot initialize repeated field {} from {}".format(
+ self.name, values.__class__))
+ del typed_list[:]
+ typed_list.extend(values)
+
+ def resolve_type(self):
+ if isinstance(self._type, type(lambda: None)):
+ self._type = self._type()
+
+ @property
+ def type(self):
+ return self._type
+
+
+_deferred_specs = []
+
+def message(cls):
+ # TODO(bolms): move this into __init_subclass__ after dropping Python 2
+ # support.
+ _deferred_specs.append(cls)
+ return cls
+
+class Message(object):
+
+ def __init__(self, **field_values):
+ self.oneofs = {}
+ self._on_write = lambda: None
+ self._initialize_raw_fields_from(field_values)
+
+ def _initialize_raw_fields_from(self, field_values):
+ self.raw_fields = {}
+ for name, type in self.repeated_fields.items():
+ self.raw_fields[name] = TypedScopedList(type, self.on_write)
+ for k, v in field_values.items():
+ spec = self.field_specs.get(k)
+ if spec is None:
+ raise AttributeError("No field {} on {}.".format(
+ k, self.__class__.__name__))
+ spec.set(self, v)
+
+ @classmethod
+ def from_json(cls, text):
+ as_dict = json.loads(text)
+ return cls(**as_dict)
+
+ def on_write(self):
+ self._on_write()
+ self._on_write = lambda: None
+
+ def set_on_write(self, on_write):
+ self._on_write = on_write
+
+ def __eq__(self, other):
+ return (self.__class__ == other.__class__ and
+ self.raw_fields == other.raw_fields)
+
+ def CopyFrom(self, other):
+ if self.__class__ != other.__class__:
+ raise TypeError("{} cannot CopyFrom {}".format(
+ self.__class__.__name__, other.__class__.__name__))
+ self._initialize_raw_fields_from(other.raw_fields)
+ self.on_write()
+
+ def HasField(self, name):
+ return name in self.raw_fields
+
+ def WhichOneof(self, oneof_name):
+ return self.oneofs.get(oneof_name)
+
+ def to_dict(self):
+ result = {}
+ for k, v in self.raw_fields.items():
+ if isinstance(v, _BASIC_TYPES):
+ result[k] = v
+ elif isinstance(v, TypedScopedList):
+ if len(v):
+ # For compatibility with the proto world, empty lists are just
+ # elided.
+ result[k] = [
+ item if isinstance(item, _BASIC_TYPES) else item.to_dict()
+ for item in v
+ ]
+ else:
+ result[k] = v.to_dict()
+ return result
+
+ def __repr__(self):
+ return self.to_json(separators=(',', ':'), sort_keys=True)
+
+ def to_json(self, *args, **kwargs):
+ return json.dumps(self.to_dict(), *args, **kwargs)
+
+ def __str__(self):
+ return _text(self.to_dict())
+
+
+def _initialize_deferred_specs():
+ for cls in _deferred_specs:
+ field_specs = {}
+ repeated_fields = {}
+ for k, v in cls.__dict__.items():
+ if k[0] == "_":
+ continue
+ if isinstance(v, (Optional, Repeated)):
+ v.name = k
+ v.resolve_type()
+ field_specs[k] = v
+ if isinstance(v, Repeated):
+ repeated_fields[k] = v.type
+ cls.field_specs = field_specs
+ cls.repeated_fields = repeated_fields
+
+
+################################################################################
+
+
+@message
+class Position(Message):
+ """A zero-width position within a source file."""
+ line = Optional(int) # Line (starts from 1).
+ column = Optional(int) # Column (starts from 1).
+
+
+@message
+class Location(Message):
+ """A half-open start:end range within a source file."""
+ start = Optional(Position) # Beginning of the range.
+ end = Optional(Position) # One column past the end of the range.
+
+ # True if this Location is outside of the parent object's Location.
+ is_disjoint_from_parent = Optional(bool)
+
+ # True if this Location's parent was synthesized, and does not directly
+ # appear in the source file. The Emboss front end uses this field to cull
+ # irrelevant error messages.
+ is_synthetic = Optional(bool)
+
+
+@message
+class Word(Message):
+ """IR for a bare word in the source file.
+
+ This is used in NameDefinitions and References."""
+
+ text = Optional(_text)
+ source_location = Optional(Location)
+
+
+@message
+class String(Message):
+ """IR for a string in the source file."""
+ text = Optional(_text)
+ source_location = Optional(Location)
+
+
+@message
+class Documentation(Message):
+ text = Optional(_text)
+ source_location = Optional(Location)
+
+
+@message
+class BooleanConstant(Message):
+ """IR for a boolean constant."""
+ value = Optional(bool)
+ source_location = Optional(Location)
+
+
+@message
+class Empty(Message):
+ """Placeholder message for automatic element counts for arrays."""
+ source_location = Optional(Location)
+
+
+@message
+class NumericConstant(Message):
+ """IR for any numeric constant."""
+
+ # Numeric constants are stored as decimal strings; this is the simplest way
+ # to store the full -2**63..+2**64-1 range.
+ #
+ # TODO(bolms): switch back to int, and just use strings during
+ # serialization, now that we're free of proto.
+ value = Optional(_text)
+ source_location = Optional(Location)
+
+
+@message
+class Function(Message):
+ """IR for a single function (+, -, *, ==, $max, etc.) in an expression."""
+ UNKNOWN = 0
+ ADDITION = 1 # +
+ SUBTRACTION = 2 # -
+ MULTIPLICATION = 3 # *
+ EQUALITY = 4 # ==
+ INEQUALITY = 5 # !=
+ AND = 6 # &&
+ OR = 7 # ||
+ LESS = 8 # <
+ LESS_OR_EQUAL = 9 # <=
+ GREATER = 10 # >
+ GREATER_OR_EQUAL = 11 # >=
+ CHOICE = 12 # ?:
+ MAXIMUM = 13 # $max()
+ PRESENCE = 14 # $present()
+ UPPER_BOUND = 15 # $upper_bound()
+ LOWER_BOUND = 16 # $lower_bound()
+
+ function = Optional(int, names=lambda x: getattr(Function, x))
+ args = Repeated(lambda: Expression)
+ function_name = Optional(Word)
+ source_location = Optional(Location)
+
+
+@message
+class CanonicalName(Message):
+ """CanonicalName is the unique, absolute name for some object.
+
+ A CanonicalName is the unique, absolute name for some object (Type, field,
+ etc.) in the IR. It is used both in the definitions of objects ("struct
+ Foo"), and in references to objects (a field of type "Foo")."""
+
+ # The module_file is the Module.source_file_name of the Module in which this
+ # object's definition appears. Note that the Prelude always has a
+ # Module.source_file_name of "", and thus references to Prelude names will
+ # have module_file == "".
+ module_file = Optional(_text)
+
+ # The object_path is the canonical path to the object definition within its
+ # module file. For example, the field "bar" would have an object path of
+ # ["Foo", "bar"]:
+ #
+ # struct Foo:
+ # 0:3 UInt bar
+ #
+ #
+ # The enumerated name "BOB" would have an object path of ["Baz", "Qux",
+ # "BOB"]:
+ #
+ # struct Baz:
+ # 0:3 Qux qux
+ #
+ # enum Qux:
+ # BOB = 0
+ object_path = Repeated(_text)
+
+
+@message
+class NameDefinition(Message):
+ """NameDefinition is IR for the name of an object, within the object.
+
+ That is, a TypeDefinition or Field will hold a NameDefinition as its
+ name."""
+
+ # The name, as directly generated from the source text. name.text will
+ # match the last element of canonical_name.object_path. Note that in some
+ # cases, the exact string in name.text may not appear in the source text.
+ name = Optional(Word)
+
+ # The CanonicalName that will appear in References. This field is
+ # technically redundant: canonical_name.module_file should always match the
+ # source_file_name of the enclosing Module, and canonical_name.object_path
+ # should always match the names of parent nodes.
+ canonical_name = Optional(CanonicalName)
+
+ # If true, indicates that this is an automatically-generated name, which
+ # should not be visible outside of its immediate namespace.
+ is_anonymous = Optional(bool)
+
+ # The location of this NameDefinition in source code.
+ source_location = Optional(Location)
+
+
+@message
+class Reference(Message):
+ """A Reference holds the canonical name of something defined elsewhere.
+
+ For example, take this fragment:
+
+ struct Foo:
+ 0:3 UInt size (s)
+ 4:s Int:8[] payload
+
+ "Foo", "size", and "payload" will become NameDefinitions in their
+ corresponding Field and Message IR objects, while "UInt", the second "s",
+ and "Int" are References. Note that the second "s" will have a
+ canonical_name.object_path of ["Foo", "size"], not ["Foo", "s"]: the
+ Reference always holds the single "true" name of the object, regardless of
+ what appears in the .emb."""
+
+ # The canonical name of the object being referred to. This name should be
+ # used to find the object in the IR.
+ canonical_name = Optional(CanonicalName)
+
+ # The source_name is the name the user entered in the source file; it could
+ # be either relative or absolute, and may be an alias (and thus not match
+ # any part of the canonical_name). Back ends should use canonical_name for
+ # name lookup, and reserve source_name for error messages.
+ source_name = Repeated(Word)
+
+ # If true, then symbol resolution should only look at local names when
+ # resolving source_name. This is used so that the names of inline types
+ # aren't "ambiguous" if there happens to be another type with the same name
+ # at a parent scope.
+ is_local_name = Optional(bool)
+
+ # TODO(bolms): Allow absolute paths starting with ".".
+
+ # Note that this is the source_location of the *Reference*, not of the
+ # object to which it refers.
+ source_location = Optional(Location)
+
+
+@message
+class FieldReference(Message):
+ """IR for a "field" or "field.sub.subsub" reference in an expression.
+
+ The first element of "path" is the "base" field, which should be directly
+ readable in the (runtime) context of the expression. For example:
+
+ struct Foo:
+ 0:1 UInt header_size (h)
+ 0:h UInt:8[] header_bytes
+
+ The "h" will translate to ["Foo", "header_size"], which will be the first
+ (and in this case only) element of "path".
+
+ Subsequent path elements should be treated as subfields. For example, in:
+
+ struct Foo:
+ struct Sizes:
+ 0:1 UInt header_size
+ 1:2 UInt body_size
+ 0 [+2] Sizes sizes
+ 0 [+sizes.header_size] UInt:8[] header
+ sizes.header_size [+sizes.body_size] UInt:8[] body
+
+ The references to "sizes.header_size" will have a path of [["Foo",
+ "sizes"], ["Foo", "Sizes", "header_size"]]. Note that each path element is
+ a fully-qualified reference; some back ends (C++, Python) may only use the
+ last element, while others (C) may use the complete path.
+
+ This representation is a bit awkward, and is fundamentally limited to a
+ dotted list of static field names. It does not allow an expression like
+ `array[n]` on the left side of a `.`. At this point, it is an artifact of
+ the era during which I (bolms@) thought I could get away with skipping
+ compiler-y things."""
+
+ # TODO(bolms): Add composite types to the expression type system, and
+ # replace FieldReference with a "member access" Expression kind. Further,
+ # move the symbol resolution for FieldReferences that is currently in
+ # symbol_resolver.py into type_check.py.
+
+ # TODO(bolms): Make the above change before declaring the IR to be "stable".
+
+ path = Repeated(Reference)
+ source_location = Optional(Location)
+
+
+@message
+class OpaqueType(Message):
+ pass
+
+
+@message
+class IntegerType(Message):
+ """Type of an integer expression."""
+
+ # For optimization, the modular congruence of an integer expression is
+ # tracked. This consists of a modulus and a modular_value, such that for
+ # all possible values of expression, expression MOD modulus ==
+ # modular_value.
+ #
+ # The modulus may be the special value "infinity" to indicate that the
+ # expression's value is exactly modular_value; otherwise, it should be a
+ # positive integer.
+ #
+ # A modulus of 1 places no constraints on the value.
+ #
+ # The modular_value should always be a nonnegative integer that is smaller
+ # than the modulus.
+ #
+ # Note that this is specifically the *modulus*, which is not equivalent to
+ # the value from C's '%' operator when the dividend is negative: in C, -7 %
+ # 4 == -3, but the modular_value here would be 1. Python uses modulus: in
+ # Python, -7 % 4 == 1.
+ modulus = Optional(_text)
+ modular_value = Optional(_text)
+
+ # The minimum and maximum values of an integer are tracked and checked so
+ # that Emboss can implement reliable arithmetic with no operations
+ # overflowing either 64-bit unsigned or 64-bit signed 2's-complement
+ # integers.
+ #
+ # Note that constant subexpressions are allowed to overflow, as long as the
+ # final, computed constant value of the subexpression fits in a 64-bit
+ # value.
+ #
+ # The minimum_value may take the value "-infinity", and the maximum_value
+ # may take the value "infinity". These sentinel values indicate that
+ # Emboss has no bound information for the Expression, and therefore the
+ # Expression may only be evaluated during compilation; the back end should
+ # never need to compile such an expression into the target language (e.g.,
+ # C++).
+ minimum_value = Optional(_text)
+ maximum_value = Optional(_text)
+
+
+@message
+class BooleanType(Message):
+ value = Optional(bool)
+
+
+@message
+class EnumType(Message):
+ name = Optional(Reference)
+ value = Optional(_text)
+
+
+@message
+class ExpressionType(Message):
+ opaque = Optional(OpaqueType, "type")
+ integer = Optional(IntegerType, "type")
+ boolean = Optional(BooleanType, "type")
+ enumeration = Optional(EnumType, "type")
+
+
+@message
+class Expression(Message):
+ """IR for an expression.
+
+ An Expression is a potentially-recursive data structure. It can either
+ represent a leaf node (constant or reference) or an operation combining
+ other Expressions (function)."""
+
+ constant = Optional(NumericConstant, "expression")
+ constant_reference = Optional(Reference, "expression")
+ function = Optional(Function, "expression")
+ field_reference = Optional(FieldReference, "expression")
+ boolean_constant = Optional(BooleanConstant, "expression")
+ builtin_reference = Optional(Reference, "expression")
+
+ type = Optional(ExpressionType)
+ source_location = Optional(Location)
+
+
+@message
+class ArrayType(Message):
+ """IR for an array type ("Int:8[12]" or "Message[2]" or "UInt[3][2]")."""
+ base_type = Optional(lambda: Type)
+
+ element_count = Optional(Expression, "size")
+ automatic = Optional(Empty, "size")
+
+ source_location = Optional(Location)
+
+
+@message
+class AtomicType(Message):
+ """IR for a non-array type ("UInt" or "Foo(Version.SIX)")."""
+ reference = Optional(Reference)
+ runtime_parameter = Repeated(Expression)
+ source_location = Optional(Location)
+
+
+@message
+class Type(Message):
+ """IR for a type reference ("UInt", "Int:8[12]", etc.)."""
+ atomic_type = Optional(AtomicType, "type")
+ array_type = Optional(ArrayType, "type")
+
+ size_in_bits = Optional(Expression)
+ source_location = Optional(Location)
+
+
+@message
+class AttributeValue(Message):
+ """IR for an attribute value."""
+ # TODO(bolms): Make String a type of Expression, and replace
+ # AttributeValue with Expression.
+ expression = Optional(Expression, "value")
+ string_constant = Optional(String, "value")
+
+ source_location = Optional(Location)
+
+
+@message
+class Attribute(Message):
+ """IR for a [name = value] attribute."""
+ name = Optional(Word)
+ value = Optional(AttributeValue)
+ back_end = Optional(Word)
+ is_default = Optional(bool)
+ source_location = Optional(Location)
+
+
+@message
+class WriteTransform(Message):
+ """IR which defines an expression-based virtual field write scheme.
+
+ E.g., for a virtual field like `x_plus_one`:
+
+ struct Foo:
+ 0 [+1] UInt x
+ let x_plus_one = x + 1
+
+ ... the `WriteMethod` would be `transform`, with `$logical_value - 1` for
+ `function_body` and `x` for `destination`."""
+
+ function_body = Optional(Expression)
+ destination = Optional(FieldReference)
+
+
+@message
+class WriteMethod(Message):
+ """IR which defines the method used for writing to a virtual field."""
+
+ # A physical Field can be written directly.
+ physical = Optional(bool, "method")
+
+ # A read_only Field cannot be written.
+ read_only = Optional(bool, "method")
+
+ # An alias is a direct, untransformed forward of another field; it can be
+ # implemented by directly returning a reference to the aliased field.
+ #
+ # Aliases are the only kind of virtual field that may have an opaque type.
+ alias = Optional(FieldReference, "method")
+
+ # A transform is a way of turning a logical value into a value which should
+ # be written to another field: A virtual field like `let y = x + 1` would
+ # have a transform WriteMethod to subtract 1 from the new `y` value, and
+ # write that to `x`.
+ transform = Optional(WriteTransform, "method")
+
+
+@message
+class FieldLocation(Message):
+ """IR for a field location."""
+ start = Optional(Expression)
+ size = Optional(Expression)
+ source_location = Optional(Location)
+
+
+@message
+class Field(Message):
+ """IR for a field in a struct definition.
+
+ There are two kinds of Field: physical fields have location and (physical)
+ type; virtual fields have read_transform. Although there are differences,
+ in many situations physical and virtual fields are treated the same way,
+ and they can be freely intermingled in the source file."""
+ location = Optional(FieldLocation) # The physical location of the field.
+ type = Optional(Type) # The physical type of the field.
+
+ read_transform = Optional(Expression) # The value of a virtual field.
+
+ # How this virtual field should be written.
+ write_method = Optional(WriteMethod)
+
+ name = Optional(NameDefinition) # The name of the field.
+ abbreviation = Optional(Word) # An optional short name for the field, only
+ # visible inside the enclosing bits/struct.
+ attribute = Repeated(Attribute) # Field-specific attributes.
+ documentation = Repeated(Documentation) # Field-specific documentation.
+
+ # The field only exists when existence_condition evaluates to true. For
+ # example:
+ #
+ # struct Message:
+ # 0 [+4] UInt length
+ # 4 [+8] MessageType message_type
+ # if message_type == MessageType.FOO:
+ # 8 [+length] Foo foo
+ # if message_type == MessageType.BAR:
+ # 8 [+length] Bar bar
+ # 8+length [+4] UInt crc
+ #
+ # For length, message_type, and crc, existence_condition will be
+ # "boolean_constant { value: true }"
+ #
+ # For "foo", existence_condition will be:
+ # function { function: EQUALITY
+ # args: [reference to message_type]
+ # args: { [reference to MessageType.FOO] } }
+ #
+ # The "bar" field will have a similar existence_condition to "foo":
+ # function { function: EQUALITY
+ # args: [reference to message_type]
+ # args: { [reference to MessageType.BAR] } }
+ #
+ # When message_type is MessageType.BAR, the Message struct does not contain
+ # field "foo", and vice versa for message_type == MessageType.FOO and field
+ # "bar": those fields only conditionally exist in the structure.
+ #
+ # TODO(bolms): Document conditional fields better, and replace some of this
+ # explanation with a reference to the documentation.
+ existence_condition = Optional(Expression)
+ source_location = Optional(Location)
+
+
+@message
+class Structure(Message):
+ """IR for a bits or struct definition."""
+ field = Repeated(Field)
+
+ # The fields in `field` are listed in the order they appear in the original
+ # .emb.
+ #
+ # For text format output, this can lead to poor results. Take the following
+ # struct:
+ #
+ # struct Foo:
+ # b [+4] UInt a
+ # 0 [+4] UInt b
+ #
+ # Here, the location of `a` depends on the current value of `b`. Because of
+ # this, if someone calls
+ #
+ # emboss::UpdateFromText(foo_view, "{ a: 10, b: 4 }");
+ #
+ # then foo_view will not be updated the way one would expect: if `b`'s value
+ # was something other than 4 to start with, then `UpdateFromText` will write
+ # the 10 to some other location, then update `b` to 4.
+ #
+ # To avoid surprises, `emboss::DumpAsText` should return `"{ b: 4, a: 10
+ # }"`.
+ #
+ # The `fields_in_dependency_order` field provides a permutation of `field`
+ # such that each field appears after all of its dependencies. For example,
+ # `struct Foo`, above, would have `{ 1, 0 }` in
+ # `fields_in_dependency_order`.
+ #
+ # The exact ordering of `fields_in_dependency_order` is not guaranteed, but
+ # some effort is made to keep the order close to the order fields are listed
+ # in the original `.emb` file. In particular, if the ordering 0, 1, 2, 3,
+ # ... satisfies dependency ordering, then `fields_in_dependency_order` will
+ # be `{ 0, 1, 2, 3, ... }`.
+ fields_in_dependency_order = Repeated(int)
+
+ source_location = Optional(Location)
+
+
+@message
+class External(Message):
+ """IR for an external type declaration."""
+ # Externals have no values other than name and attribute list, which are
+ # common to all type definitions.
+
+ source_location = Optional(Location)
+
+
+@message
+class EnumValue(Message):
+ """IR for a single value within an enumerated type."""
+ name = Optional(NameDefinition) # The name of the enum value.
+ value = Optional(Expression) # The value of the enum value.
+ documentation = Repeated(Documentation) # Value-specific documentation.
+
+ source_location = Optional(Location)
+
+
+@message
+class Enum(Message):
+ """IR for an enumerated type definition."""
+ value = Repeated(EnumValue)
+ source_location = Optional(Location)
+
+
+@message
+class Import(Message):
+ """IR for an import statement in a module."""
+ file_name = Optional(String) # The file to import.
+ local_name = Optional(Word) # The name to use within this module.
+ source_location = Optional(Location)
+
+
+@message
+class RuntimeParameter(Message):
+ name = Optional(NameDefinition) # The name of the parameter.
+ type = Optional(ExpressionType) # The type of the parameter.
+
+ # For convenience and readability, physical types may be used in the .emb
+ # source instead of a full expression type. That way, users can write
+ # something like:
+ #
+ # struct Foo(version :: UInt:8):
+ #
+ # instead of:
+ #
+ # struct Foo(version :: {$int x |: 0 <= x <= 255}):
+ #
+ # In these cases, physical_type_alias holds the user-supplied type, and type
+ # is filled in after initial parsing is finished.
+ #
+ # TODO(bolms): Actually implement the set builder type notation.
+ physical_type_alias = Optional(Type)
+
+ source_location = Optional(Location)
+
+
+@message
+class TypeDefinition(Message):
+ """Container IR for a type definition (struct, union, etc.)"""
+
+ # The "addressable unit" is the size of the smallest unit that can be read
+ # from the backing store that this type expects. For `struct`s, this is
+ # BYTE; for `enum`s and `bits`, this is BIT, and for `external`s it depends
+ # on the specific type.
+
+ NONE = 0
+ BIT = 1
+ BYTE = 8
+
+ external = Optional(External, "type")
+ enumeration = Optional(Enum, "type")
+ structure = Optional(Structure, "type")
+
+ name = Optional(NameDefinition) # The name of the type.
+ attribute = Repeated(Attribute) # All attributes attached to the type.
+ documentation = Repeated(Documentation) # Docs for the type.
+ subtype = Repeated(lambda: TypeDefinition) # Subtypes of this type.
+ addressable_unit = Optional(
+ int, names=lambda x: getattr(TypeDefinition, x))
+
+ # If the type requires parameters at runtime, these are its parameters.
+ # These are currently only allowed on structures, but in the future they
+ # should be allowed on externals.
+ runtime_parameter = Repeated(RuntimeParameter)
+
+ source_location = Optional(Location)
+
+
+@message
+class Module(Message):
+ """The IR for an individual Emboss module (file)."""
+ attribute = Repeated(Attribute) # Module-level attributes.
+ type = Repeated(TypeDefinition) # Module-level type definitions.
+ documentation = Repeated(Documentation) # Module-level docs.
+ foreign_import = Repeated(Import) # Other modules imported.
+ source_location = Optional(Location) # Source code covered by this IR.
+ source_file_name = Optional(_text) # Name of the source file.
+
+
+@message
+class EmbossIr(Message):
+ """The top-level IR for an Emboss module and all of its dependencies."""
+ # All modules. The first entry will be the main module; back ends should
+ # generate code corresponding to that module. The second entry will be the
+ # prelude module.
+ module = Repeated(Module)
+
+
+_initialize_deferred_specs()
diff --git a/testdata/BUILD b/testdata/BUILD
new file mode 100644
index 0000000..5df3b38
--- /dev/null
+++ b/testdata/BUILD
@@ -0,0 +1,286 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Shared test data for Emboss.
+
+load("//public:build_defs.bzl", "emboss_cc_library")
+
+package(
+ default_visibility = ["//:__subpackages__"],
+)
+
+filegroup(
+ name = "golden_files",
+ srcs = [
+ "golden/span_se_log_file_status.emb",
+ "golden/span_se_log_file_status.ir.txt",
+ "golden/span_se_log_file_status.parse_tree.txt",
+ "golden/span_se_log_file_status.tokens.txt",
+ "golden/__init__.py",
+ ],
+)
+
+filegroup(
+ name = "test_embs",
+ srcs = [
+ "absolute_cpp_namespace.emb",
+ "anonymous_bits.emb",
+ "bcd.emb",
+ "bits.emb",
+ "complex_structure.emb",
+ "condition.emb",
+ "cpp_namespace.emb",
+ "dynamic_size.emb",
+ "enum.emb",
+ "explicit_sizes.emb",
+ "float.emb",
+ "imported.emb",
+ "imported_genfiles.emb",
+ "importer.emb",
+ "int_sizes.emb",
+ "nested_structure.emb",
+ "no_cpp_namespace.emb",
+ "parameters.emb",
+ "requires.emb",
+ "subtypes.emb",
+ "text_format.emb",
+ "uint_sizes.emb",
+ "virtual_field.emb",
+ "__init__.py",
+ ],
+)
+
+filegroup(
+ name = "format_embs",
+ srcs = glob(["format/**"]),
+)
+
+emboss_cc_library(
+ name = "span_se_log_file_status_emboss",
+ srcs = [
+ "golden/span_se_log_file_status.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "nested_structure_emboss",
+ srcs = [
+ "nested_structure.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "condition_emboss",
+ srcs = [
+ "condition.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "enum_emboss",
+ srcs = [
+ "enum.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "explicit_sizes_emboss",
+ srcs = [
+ "explicit_sizes.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "imported_emboss",
+ srcs = [
+ "imported.emb",
+ ],
+)
+
+# This rule is here to test that the Emboss Skylark macro sets everything up
+# correctly for the Emboss front end to read generated .embs.
+#
+# TODO(bolms): Should genrules with output_to_bindir = 1 be supported as inputs
+# to emboss_cc_library?
+genrule(
+ name = "imported_genfiles",
+ srcs = ["imported.emb"],
+ outs = ["imported_genfiles.emb"],
+ cmd = "sed -e 's/emboss::test/emboss::test::generated/g' $(SRCS) > $(@)",
+)
+
+emboss_cc_library(
+ name = "imported_genfiles_emboss",
+ srcs = [
+ "imported_genfiles.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "alignments_emboss",
+ srcs = [
+ "alignments.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "importer_emboss",
+ srcs = [
+ "importer.emb",
+ ],
+ deps = [
+ ":imported_emboss",
+ ":imported_genfiles_emboss",
+ ],
+)
+
+emboss_cc_library(
+ name = "float_emboss",
+ srcs = [
+ "float.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "large_array_emboss",
+ srcs = [
+ "large_array.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "uint_sizes_emboss",
+ srcs = [
+ "uint_sizes.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "int_sizes_emboss",
+ srcs = [
+ "int_sizes.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "dynamic_size_emboss",
+ srcs = [
+ "dynamic_size.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "auto_array_size_emboss",
+ srcs = [
+ "auto_array_size.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "start_size_range_emboss",
+ srcs = [
+ "start_size_range.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "bcd_emboss",
+ srcs = [
+ "bcd.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "no_cpp_namespace_emboss",
+ srcs = [
+ "no_cpp_namespace.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "cpp_namespace_emboss",
+ srcs = [
+ "cpp_namespace.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "absolute_cpp_namespace_emboss",
+ srcs = [
+ "absolute_cpp_namespace.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "requires_emboss",
+ srcs = [
+ "requires.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "subtypes_emboss",
+ srcs = [
+ "subtypes.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "inline_type_emboss",
+ srcs = [
+ "inline_type.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "bits_emboss",
+ srcs = [
+ "bits.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "complex_structure_emboss",
+ srcs = [
+ "complex_structure.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "anonymous_bits_emboss",
+ srcs = [
+ "anonymous_bits.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "text_format_emboss",
+ srcs = [
+ "text_format.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "parameters_emboss",
+ srcs = [
+ "parameters.emb",
+ ],
+)
+
+emboss_cc_library(
+ name = "virtual_field_emboss",
+ srcs = [
+ "virtual_field.emb",
+ ],
+)
diff --git a/testdata/__init__.py b/testdata/__init__.py
new file mode 100644
index 0000000..2c31d84
--- /dev/null
+++ b/testdata/__init__.py
@@ -0,0 +1,14 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
diff --git a/testdata/absolute_cpp_namespace.emb b/testdata/absolute_cpp_namespace.emb
new file mode 100644
index 0000000..11fff91
--- /dev/null
+++ b/testdata/absolute_cpp_namespace.emb
@@ -0,0 +1,23 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Test .emb to ensure that the generated type ends up in the given namespace
+-- when the [(cpp) namespace] attribute is set to a valid value with leading
+-- "::".
+
+[(cpp) namespace: "::emboss::test::leading_double_colon"]
+
+
+enum Foo:
+ VALUE = 12
diff --git a/testdata/alignments.emb b/testdata/alignments.emb
new file mode 100644
index 0000000..5edd61a
--- /dev/null
+++ b/testdata/alignments.emb
@@ -0,0 +1,46 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Structures which can be used by test code to check that alignment
+-- information is properly propagated through Emboss views.
+
+[$default byte_order: "BigEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct Alignments:
+ 0 [+4] Placeholder4 zero_offset
+ 0 [+6] Placeholder6 zero_offset_substructure
+ 2 [+6] Placeholder6 two_offset_substructure
+ 3 [+4] Placeholder4 three_offset
+ 4 [+4] Placeholder4 four_offset
+ 11 [+4] Placeholder4 eleven_offset
+ 12 [+4] Placeholder4 twelve_offset
+ 0 [+12] Placeholder4[3] zero_offset_four_stride_array
+ 0 [+24] Placeholder6[4] zero_offset_six_stride_array
+ 3 [+12] Placeholder4[3] three_offset_four_stride_array
+ 4 [+24] Placeholder6[4] four_offset_six_stride_array
+
+
+struct Placeholder4:
+ -- Four-byte structure used as a byte-oriented placeholder so that type
+ -- alignment can be tested without any of the bit/byte interface types.
+ 0 [+4] UInt dummy
+
+
+struct Placeholder6:
+ -- Six-byte structure. Includes Placeholder4 so that substructure alignments
+ -- can be checked.
+ 0 [+4] Placeholder4 zero_offset
+ 2 [+4] Placeholder4 two_offset
diff --git a/testdata/anonymous_bits.emb b/testdata/anonymous_bits.emb
new file mode 100644
index 0000000..4f9332f
--- /dev/null
+++ b/testdata/anonymous_bits.emb
@@ -0,0 +1,32 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct Foo:
+ 0 [+4] bits:
+ 31 [+1] Flag high_bit
+ 14 [+4] enum bar:
+ BAR = 0
+ BAZ = 1
+
+ 0 [+1] Flag first_bit
+
+ 4 [+4] bits:
+ # The last byte is intentionally unused, in order to test that Ok() checks
+ # the readability of all the bits, not just the ones that have names.
+ 23 [+1] Flag bit_23
+ 0 [+1] Flag low_bit
diff --git a/testdata/auto_array_size.emb b/testdata/auto_array_size.emb
new file mode 100644
index 0000000..4eef0a9
--- /dev/null
+++ b/testdata/auto_array_size.emb
@@ -0,0 +1,29 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct Element:
+ 0 [+1] UInt a
+ 1 [+1] UInt b
+
+
+struct AutoSize:
+ 0 [+1] UInt array_size (a)
+ 1 [+4] UInt:8[] four_byte_array
+ 5 [+8] Element[] four_struct_array
+ 13 [+a] UInt:8[] dynamic_byte_array
+ 13+a [+2*a] Element[] dynamic_struct_array
diff --git a/testdata/bcd.emb b/testdata/bcd.emb
new file mode 100644
index 0000000..95cb093
--- /dev/null
+++ b/testdata/bcd.emb
@@ -0,0 +1,39 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Tests for various sizes of Binary-Coded Decimal (BCD) integers.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct BcdSizes:
+ 0 [+1] Bcd one_byte
+ 1 [+2] Bcd two_byte
+ 3 [+3] Bcd three_byte
+ 6 [+4] Bcd four_byte
+ 10 [+5] Bcd five_byte
+ 15 [+6] Bcd six_byte
+ 21 [+7] Bcd seven_byte
+ 28 [+8] Bcd eight_byte
+ 36 [+4] bits:
+ 0 [+4] Bcd four_bit
+ 4 [+6] Bcd six_bit
+ 10 [+10] Bcd ten_bit
+ 20 [+12] Bcd twelve_bit
+
+
+struct BcdBigEndian:
+ [$default byte_order: "BigEndian"]
+ 0 [+4] Bcd four_byte
diff --git a/testdata/bits.emb b/testdata/bits.emb
new file mode 100644
index 0000000..f91603f
--- /dev/null
+++ b/testdata/bits.emb
@@ -0,0 +1,62 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Test `.emb` for the `bits` construct.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+bits OneByte:
+ 7 [+1] Flag high_bit
+ 6 [+1] Flag less_high_bit
+ 2 [+4] UInt mid_nibble
+ 1 [+1] Flag less_low_bit
+ 0 [+1] Flag low_bit
+
+
+bits FourByte:
+ bits TwoByteWithGaps:
+ 15 [+1] Flag high_bit
+ 6 [+4] UInt mid_nibble
+ 0 [+1] Flag low_bit
+
+ 28 [+4] UInt high_nibble
+ 20 [+8] OneByte one_byte
+ 4 [+16] TwoByteWithGaps two_byte
+ 0 [+4] UInt raw_low_nibble
+ # Check that the [requires] attribute works on bits fields just like it does
+ # on struct fields.
+ [requires: 1 <= this <= 15]
+ let low_nibble = raw_low_nibble + 100
+
+
+bits ArrayInBits:
+ 15 [+1] Flag lone_flag
+ 0 [+12] Flag[] flags
+
+
+struct ArrayInBitsInStruct:
+ 0 [+2] ArrayInBits array_in_bits
+
+
+struct StructOfBits:
+ 0 [+1] OneByte one_byte
+ 1 [+2] FourByte.TwoByteWithGaps two_byte
+ 3 [+4] FourByte four_byte
+ one_byte.mid_nibble [+1] UInt located_byte
+
+
+struct BitArray:
+ 0 [+8] OneByte[8] one_byte
diff --git a/testdata/complex_structure.emb b/testdata/complex_structure.emb
new file mode 100644
index 0000000..94a1bd3
--- /dev/null
+++ b/testdata/complex_structure.emb
@@ -0,0 +1,63 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Relatively complex structure intended for use in fuzz testing.
+--
+-- Note that field names are intentionally very short; this helps American
+-- Fuzzy Lop (go/afl) find new code paths more quickly.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss_test"]
+
+
+bits RegisterLayout:
+ 0 [+8] Int x
+ 0 [+4] UInt l
+ 4 [+4] UInt h
+
+
+struct ArrayElement:
+ 0 [+1] RegisterLayout a
+
+
+struct Complex:
+ 0 [+1] UInt s
+ 1 [+8] UInt u
+ 1 [+8] Int i
+ 1 [+8] Bcd b
+ 1 [+s*4] ArrayElement[4][] a
+ 1 [+1] bits:
+ 0 [+8] UInt a0
+ 7 [+1] Flag s0
+ 0 [+4] Int l0
+ 4 [+4] Int h0
+
+ 2 [+1] ArrayElement e1
+ if a0 >= 0x80:
+ 3 [+1] ArrayElement e2
+
+ if a0 < 0x80:
+ 3 [+1] Bcd b2
+
+ if b2 > 25:
+ 4 [+1] Int e3
+
+ if s >= 4 && (a0 >= 80 ? e3 >= 0x80 : b2 < 50):
+ 5 [+1] Int e4
+
+ if s >= 5 && e4 > 0:
+ 6 [+1] Int e5
+
+ if s < 2 || a0 < 4:
+ 1 [+1] Int e0
diff --git a/testdata/condition.emb b/testdata/condition.emb
new file mode 100644
index 0000000..acfa304
--- /dev/null
+++ b/testdata/condition.emb
@@ -0,0 +1,229 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct BasicConditional:
+ 0 [+1] UInt x
+ if x == 0:
+ 1 [+1] UInt xc
+
+
+struct NegativeConditional:
+ 0 [+1] UInt x
+ if x != 0:
+ 1 [+1] UInt xc
+
+
+struct ConditionalAndUnconditionalOverlappingFinalField:
+ 0 [+1] UInt x
+ if x == 0:
+ 1 [+1] UInt xc
+
+ 1 [+1] UInt z
+
+
+struct ConditionalBasicConditionalFieldFirst:
+ if x == 0:
+ 0 [+1] UInt xc
+
+ 1 [+1] UInt x
+
+
+struct ConditionalAndDynamicLocation:
+ 0 [+1] UInt x
+ 2 [+1] UInt y
+ if x == 0:
+ y [+1] UInt xc
+
+
+struct ConditionUsesMinInt:
+ 0 [+1] Int x
+ if x - 0x7fff_ffff_ffff_ff80 == -0x8000_0000_0000_0000:
+ 1 [+1] UInt xc
+
+
+struct NestedConditional:
+ 0 [+1] UInt x
+ if x == 0:
+ 1 [+1] UInt xc
+
+ if xc == 0:
+ 2 [+1] UInt xcc
+
+
+struct CorrectNestedConditional:
+ 0 [+1] UInt x
+ if x == 0:
+ 1 [+1] UInt xc
+
+ if x == 0 && xc == 0:
+ 2 [+1] UInt xcc
+
+
+struct AlwaysFalseCondition:
+ 0 [+1] UInt x
+ if false:
+ 1 [+1] UInt xc
+
+
+struct OnlyAlwaysFalseCondition:
+ if false:
+ 0 [+1] UInt xc
+
+
+struct EmptyStruct:
+ -- Empty structure.
+
+
+struct AlwaysFalseConditionDynamicSize:
+ 0 [+1] UInt x
+ x [+1] UInt y
+ if false:
+ 1 [+1] UInt xc
+
+
+struct ConditionDoesNotContributeToSize:
+ 0 [+1] UInt x
+ if x == 0:
+ 1 [+1] UInt xc
+ 2 [+1] UInt y
+
+
+enum OnOff:
+ OFF = 0
+ ON = 1
+
+
+struct EnumCondition:
+ 0 [+1] OnOff x
+ if x == OnOff.ON:
+ 1 [+1] UInt xc
+
+
+struct NegativeEnumCondition:
+ 0 [+1] OnOff x
+ if x != OnOff.ON:
+ 1 [+1] UInt xc
+
+
+struct LessThanCondition:
+ 0 [+1] UInt x
+ if x < 5:
+ 1 [+1] UInt xc
+
+
+struct LessThanOrEqualCondition:
+ 0 [+1] UInt x
+ if x <= 5:
+ 1 [+1] UInt xc
+
+
+struct GreaterThanOrEqualCondition:
+ 0 [+1] UInt x
+ if x >= 5:
+ 1 [+1] UInt xc
+
+
+struct GreaterThanCondition:
+ 0 [+1] UInt x
+ if x > 5:
+ 1 [+1] UInt xc
+
+
+struct RangeCondition:
+ 0 [+1] UInt x
+ 1 [+1] UInt y
+ if 5 < x <= y < 10:
+ 2 [+1] UInt xc
+
+
+struct ReverseRangeCondition:
+ 0 [+1] UInt x
+ 1 [+1] UInt y
+ if 10 > y >= x > 5:
+ 2 [+1] UInt xc
+
+
+struct AndCondition:
+ 0 [+1] UInt x
+ 1 [+1] UInt y
+ if x == 5 && y == 5:
+ 2 [+1] UInt xc
+
+
+struct OrCondition:
+ 0 [+1] UInt x
+ 1 [+1] UInt y
+ if x == 5 || y == 5:
+ 2 [+1] UInt xc
+
+
+struct ChoiceCondition:
+ 0 [+1] enum field:
+ USE_X = 1
+ USE_Y = 2
+
+ 1 [+1] UInt x
+ 2 [+1] UInt y
+ if (field == Field.USE_X ? x : y) == 5:
+ 3 [+1] UInt xyc
+
+
+struct ContainsBits:
+ 0 [+1] bits:
+ 7 [+1] UInt has_top
+ 0 [+1] UInt has_bottom
+
+
+struct ContainsContainsBits:
+ 0 [+1] ContainsBits condition
+ # TODO(bolms): allow Flags to be used as booleans in conditions.
+ if condition.has_top == 1:
+ 1 [+1] UInt top
+
+
+struct ConditionalInline:
+ 0 [+1] UInt payload_id
+
+ if payload_id == 0:
+ 1 [+3] struct type_0:
+ 0 [+1] UInt a
+ 1 [+1] UInt b
+ 2 [+1] UInt c
+
+ if payload_id == 1:
+ 1 [+3] struct type_1:
+ 0 [+1] UInt a
+ 1 [+1] UInt b
+ 2 [+1] UInt c
+
+
+struct ConditionalAnonymous:
+ 0 [+1] UInt x
+ if x > 10:
+ 1 [+1] bits:
+ 0 [+1] UInt low
+ if low == 1:
+ 3 [+2] UInt mid
+ 7 [+1] UInt high
+
+
+struct ConditionalOnFlag:
+ 0 [+1] bits:
+ 0 [+1] Flag enabled
+ if enabled:
+ 1 [+1] UInt value
diff --git a/testdata/cpp_namespace.emb b/testdata/cpp_namespace.emb
new file mode 100644
index 0000000..d71ebca
--- /dev/null
+++ b/testdata/cpp_namespace.emb
@@ -0,0 +1,24 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Test .emb to ensure that the generated type ends up in the given namespace
+-- when the [(cpp) namespace] attribute is set to a valid value with no leading
+-- "::".
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test::no_leading_double_colon"]
+
+
+enum Foo:
+ VALUE = 11
diff --git a/testdata/dynamic_size.emb b/testdata/dynamic_size.emb
new file mode 100644
index 0000000..0189e2b
--- /dev/null
+++ b/testdata/dynamic_size.emb
@@ -0,0 +1,96 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct Message:
+ 0 [+1] UInt header_length (h)
+ 1 [+1] UInt message_length (m)
+ 2 [+h-2] UInt:8[h-2] padding
+ h [+m] UInt:8[m] message
+ h+m [+4] UInt crc32
+
+
+struct Image:
+ 0 [+1] UInt size
+ 1 [+3*5*size] UInt:8[3][5][size] pixels
+
+
+struct TwoRegions:
+ # Fields are listed in reverse order to exercise a specific part of
+ # header_generator.py.
+ 3 [+1] UInt b_end
+ 2 [+1] UInt b_start
+ 1 [+1] UInt a_size
+ 0 [+1] UInt a_start
+ a_start [+a_size] UInt:8[a_size] region_a
+ b_start [+b_end-b_start] UInt:8[b_end-b_start] region_b
+
+
+struct MultipliedSize:
+ 0 [+1] UInt width (w)
+ 1 [+1] UInt height (h)
+ # Used to test that w*h is not performed with 8-bit arithmetic.
+ 2 [+w*h] UInt:8[w*h] data
+
+
+struct NegativeTermsInSizes:
+ 0 [+1] UInt a
+ 1 [+1] UInt b
+ 2 [+1] UInt c
+ # Used to test that the pruning logic for Size() doesn't miss anything.
+ 3 [+a-b-3] UInt:8[a-b-3] a_minus_b
+ 3 [+a-2*b-3] UInt:8[a-2*b-3] a_minus_2b
+ 3 [+a-b-c-3] UInt:8[a-b-c-3] a_minus_b_minus_c
+ 3 [+7-a] UInt:8[7-a] ten_minus_a
+ 3 [+a-2*c-3] UInt:8[a-2*c-3] a_minus_2c
+ 3 [+a-c-3] UInt:8[a-c-3] a_minus_c
+
+
+struct NegativeTermInLocation:
+ 0 [+1] UInt a
+ 10-a [+1] UInt b
+
+
+struct ChainedSize:
+ 0 [+1] UInt a
+ a [+1] UInt b
+ b [+1] UInt c
+ c [+1] UInt d
+
+
+struct FinalFieldOverlaps:
+ 0 [+1] UInt a
+ 1 [+4] UInt b
+ 3 [+2] UInt c
+
+
+struct DynamicFinalFieldOverlaps:
+ 0 [+1] UInt a
+ 9 [+1] UInt b
+ a [+2] UInt c
+ a+1 [+1] UInt d
+
+
+struct DynamicFieldDependsOnLaterField:
+ a+1 [+1] UInt b
+ 4 [+1] UInt a
+
+
+struct DynamicFieldDoesNotAffectSize:
+ 0 [+1] UInt a
+ a [+1] UInt b
+ 255 [+1] UInt c
diff --git a/testdata/enum.emb b/testdata/enum.emb
new file mode 100644
index 0000000..27c4618
--- /dev/null
+++ b/testdata/enum.emb
@@ -0,0 +1,49 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+struct Constants:
+ let sprocket = 1
+ let geegaw = 2
+
+enum Kind:
+ WIDGET = 0
+ SPROCKET = Constants.sprocket
+ GEEGAW = Constants.geegaw
+ COMPUTED = Constants.geegaw+Constants.sprocket
+ LARGE_VALUE = 2000
+ DUPLICATE_LARGE_VALUE = LARGE_VALUE
+ MAX32BIT = 4294967295
+ MAX64BIT = 0x1_0000_0000_0000_0000-1
+
+
+enum Signed:
+ MIN64BIT = -0x8000_0000_0000_0000
+ MAX64BIT = 0x8000_0000_0000_0000-1
+
+
+struct ManifestEntry:
+ 0 [+1] Kind kind
+ 1 [+4] UInt count
+ 5 [+4] Kind wide_kind
+ 9 [+5] bits:
+ 4 [+32] Kind wide_kind_in_bits
+
+struct StructContainingEnum:
+ enum Status:
+ OK = 0x00
+ FAILURE = 0x01
+ 0 [+1] UInt bar
diff --git a/testdata/explicit_sizes.emb b/testdata/explicit_sizes.emb
new file mode 100644
index 0000000..096fccf
--- /dev/null
+++ b/testdata/explicit_sizes.emb
@@ -0,0 +1,52 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Test structs for arrays of Ints, UInts, and enums.
+# TODO(bolms): Arrays of bit-level entities directly in structs.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+bits SizedUIntArrays:
+ 0 [+8] UInt:4[2] one_nibble
+ 8 [+16] UInt:8[2] two_nibble
+ 24 [+32] UInt:16[2] four_nibble
+
+
+bits SizedIntArrays:
+ 0 [+8] Int:4[2] one_nibble
+ 8 [+16] Int:8[2] two_nibble
+ 24 [+32] Int:16[2] four_nibble
+
+
+bits SizedEnumArrays:
+ 0 [+8] Enum:4[2] one_nibble
+ 8 [+16] Enum:8[2] two_nibble
+ 24 [+32] Enum:16[2] four_nibble
+
+
+struct BitArrayContainer:
+ 0 [+7] SizedUIntArrays uint_arrays
+
+
+enum Enum:
+ VALUE1 = 1
+ VALUE10 = 10
+ VALUE100 = 100
+ VALUE1000 = 1000
+ VALUE10000 = 10000
+ VALUE100000 = 100000
+ VALUE1000000 = 1000000
+ VALUE10000000 = 10000000
diff --git a/testdata/float.emb b/testdata/float.emb
new file mode 100644
index 0000000..7493af8
--- /dev/null
+++ b/testdata/float.emb
@@ -0,0 +1,33 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Test struct for IEEE floats.
+
+[(cpp) namespace: "emboss::test"]
+
+
+struct Floats:
+ 0 [+4] Float float_little_endian
+ [byte_order: "LittleEndian"]
+
+ 4 [+4] Float float_big_endian
+ [byte_order: "BigEndian"]
+
+
+struct Doubles:
+ 0 [+8] Float double_little_endian
+ [byte_order: "LittleEndian"]
+
+ 8 [+8] Float double_big_endian
+ [byte_order: "BigEndian"]
diff --git a/testdata/format/__init__.py b/testdata/format/__init__.py
new file mode 100644
index 0000000..2c31d84
--- /dev/null
+++ b/testdata/format/__init__.py
@@ -0,0 +1,14 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
diff --git a/testdata/format/abbreviations.emb b/testdata/format/abbreviations.emb
new file mode 100644
index 0000000..7dbd00e
--- /dev/null
+++ b/testdata/format/abbreviations.emb
@@ -0,0 +1,19 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Field abbreviations are treated as part of field names.
+
+struct Foo:
+ 0 [+1] UInt abcdefghijklmnopqrstuv (a)
+ 1 [+1] UInt short (s)
diff --git a/testdata/format/abbreviations.emb.formatted b/testdata/format/abbreviations.emb.formatted
new file mode 100644
index 0000000..5e223df
--- /dev/null
+++ b/testdata/format/abbreviations.emb.formatted
@@ -0,0 +1,20 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Field abbreviations are treated as part of field names.
+
+
+struct Foo:
+ 0 [+1] UInt abcdefghijklmnopqrstuv (a)
+ 1 [+1] UInt short (s)
diff --git a/testdata/format/abbreviations.emb.formatted_indent_4 b/testdata/format/abbreviations.emb.formatted_indent_4
new file mode 100644
index 0000000..1573ea5
--- /dev/null
+++ b/testdata/format/abbreviations.emb.formatted_indent_4
@@ -0,0 +1,20 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Field abbreviations are treated as part of field names.
+
+
+struct Foo:
+ 0 [+1] UInt abcdefghijklmnopqrstuv (a)
+ 1 [+1] UInt short (s)
diff --git a/testdata/format/anonymous_bits_formatting.emb b/testdata/format/anonymous_bits_formatting.emb
new file mode 100644
index 0000000..f73c33a
--- /dev/null
+++ b/testdata/format/anonymous_bits_formatting.emb
@@ -0,0 +1,29 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline `bits` should be formatted as if their fields were part of the
+-- surrounding `struct`, except that their full locations should be indented one
+-- indentation level.
+--
+-- Also, the `bits:` designator should be treated as part of the field size
+-- column on its line.
+struct Foo:
+ 0 [+4] bits:
+ 0 [+4] UInt offset
+ 4 [+32] UInt reserved
+ if offset<reserved:
+ 10 [+1]Flag flag
+ # comment
+ 10 [+10]Int value
+ offset [+10] UInt:8[] data
diff --git a/testdata/format/anonymous_bits_formatting.emb.formatted b/testdata/format/anonymous_bits_formatting.emb.formatted
new file mode 100644
index 0000000..01da43e
--- /dev/null
+++ b/testdata/format/anonymous_bits_formatting.emb.formatted
@@ -0,0 +1,32 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline `bits` should be formatted as if their fields were part of the
+-- surrounding `struct`, except that their full locations should be indented one
+-- indentation level.
+--
+-- Also, the `bits:` designator should be treated as part of the field size
+-- column on its line.
+
+
+struct Foo:
+ 0 [+4] bits:
+ 0 [+4] UInt offset
+ 4 [+32] UInt reserved
+ if offset < reserved:
+ 10 [+1] Flag flag
+ # comment
+ 10 [+10] Int value
+
+ offset [+10] UInt:8[] data
diff --git a/testdata/format/anonymous_bits_formatting.emb.formatted_indent_4 b/testdata/format/anonymous_bits_formatting.emb.formatted_indent_4
new file mode 100644
index 0000000..503c37c
--- /dev/null
+++ b/testdata/format/anonymous_bits_formatting.emb.formatted_indent_4
@@ -0,0 +1,32 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline `bits` should be formatted as if their fields were part of the
+-- surrounding `struct`, except that their full locations should be indented one
+-- indentation level.
+--
+-- Also, the `bits:` designator should be treated as part of the field size
+-- column on its line.
+
+
+struct Foo:
+ 0 [+4] bits:
+ 0 [+4] UInt offset
+ 4 [+32] UInt reserved
+ if offset < reserved:
+ 10 [+1] Flag flag
+ # comment
+ 10 [+10] Int value
+
+ offset [+10] UInt:8[] data
diff --git a/testdata/format/arithmetic_expressions.emb b/testdata/format/arithmetic_expressions.emb
new file mode 100644
index 0000000..148a4da
--- /dev/null
+++ b/testdata/format/arithmetic_expressions.emb
@@ -0,0 +1,22 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Arithmetic expressions should be formatted without spaces around operators.
+
+struct Foo:
+ 0+0 [+1] UInt a
+ 0+(0) [+1] UInt b
+ (1)*(1) [+1] UInt c
+ (1)-(1) [+1] UInt d
+ -1+2 [+1] UInt e
diff --git a/testdata/format/arithmetic_expressions.emb.formatted b/testdata/format/arithmetic_expressions.emb.formatted
new file mode 100644
index 0000000..a7e0395
--- /dev/null
+++ b/testdata/format/arithmetic_expressions.emb.formatted
@@ -0,0 +1,23 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Arithmetic expressions should be formatted without spaces around operators.
+
+
+struct Foo:
+ 0+0 [+1] UInt a
+ 0+(0) [+1] UInt b
+ (1)*(1) [+1] UInt c
+ (1)-(1) [+1] UInt d
+ -1+2 [+1] UInt e
diff --git a/testdata/format/arithmetic_expressions.emb.formatted_indent_4 b/testdata/format/arithmetic_expressions.emb.formatted_indent_4
new file mode 100644
index 0000000..68673f8
--- /dev/null
+++ b/testdata/format/arithmetic_expressions.emb.formatted_indent_4
@@ -0,0 +1,23 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Arithmetic expressions should be formatted without spaces around operators.
+
+
+struct Foo:
+ 0+0 [+1] UInt a
+ 0+(0) [+1] UInt b
+ (1)*(1) [+1] UInt c
+ (1)-(1) [+1] UInt d
+ -1+2 [+1] UInt e
diff --git a/testdata/format/array_length.emb b/testdata/format/array_length.emb
new file mode 100644
index 0000000..c6ba315
--- /dev/null
+++ b/testdata/format/array_length.emb
@@ -0,0 +1,18 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Array lengths should not have spaces inside the `[]`.
+struct Foo:
+ 0 [+4] UInt:8[4] a1
+ 4 [+4] UInt:8[] a2
diff --git a/testdata/format/array_length.emb.formatted b/testdata/format/array_length.emb.formatted
new file mode 100644
index 0000000..3c141f3
--- /dev/null
+++ b/testdata/format/array_length.emb.formatted
@@ -0,0 +1,20 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Array lengths should not have spaces inside the `[]`.
+
+
+struct Foo:
+ 0 [+4] UInt:8[4] a1
+ 4 [+4] UInt:8[] a2
diff --git a/testdata/format/array_length.emb.formatted_indent_4 b/testdata/format/array_length.emb.formatted_indent_4
new file mode 100644
index 0000000..d1ce407
--- /dev/null
+++ b/testdata/format/array_length.emb.formatted_indent_4
@@ -0,0 +1,20 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Array lengths should not have spaces inside the `[]`.
+
+
+struct Foo:
+ 0 [+4] UInt:8[4] a1
+ 4 [+4] UInt:8[] a2
diff --git a/testdata/format/attributes.emb b/testdata/format/attributes.emb
new file mode 100644
index 0000000..7d3954c
--- /dev/null
+++ b/testdata/format/attributes.emb
@@ -0,0 +1,20 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Attributes.
+
+[ byte_order: "LittleEndian"] # Comment
+[(cpp) namespace: "test::bar"] # Comment
+[$default byte_order: "LittleEndian"]# Comment
+[(cpp)$default namespace: "what"]# Comment
diff --git a/testdata/format/attributes.emb.formatted b/testdata/format/attributes.emb.formatted
new file mode 100644
index 0000000..0d2ea31
--- /dev/null
+++ b/testdata/format/attributes.emb.formatted
@@ -0,0 +1,20 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Attributes.
+
+[byte_order: "LittleEndian"] # Comment
+[(cpp) namespace: "test::bar"] # Comment
+[$default byte_order: "LittleEndian"] # Comment
+[(cpp) $default namespace: "what"] # Comment
diff --git a/testdata/format/attributes.emb.formatted_indent_4 b/testdata/format/attributes.emb.formatted_indent_4
new file mode 100644
index 0000000..0d2ea31
--- /dev/null
+++ b/testdata/format/attributes.emb.formatted_indent_4
@@ -0,0 +1,20 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Attributes.
+
+[byte_order: "LittleEndian"] # Comment
+[(cpp) namespace: "test::bar"] # Comment
+[$default byte_order: "LittleEndian"] # Comment
+[(cpp) $default namespace: "what"] # Comment
diff --git a/testdata/format/choice_expression.emb b/testdata/format/choice_expression.emb
new file mode 100644
index 0000000..dcf161b
--- /dev/null
+++ b/testdata/format/choice_expression.emb
@@ -0,0 +1,21 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- The choice operator `?:` should be formatted with spaces between its
+-- components, but not after.
+struct Foo:
+ 0 [+4] UInt data1
+ if data1 > 4?false : true :
+ 10 [+10] UInt:8[] data2
+
diff --git a/testdata/format/choice_expression.emb.formatted b/testdata/format/choice_expression.emb.formatted
new file mode 100644
index 0000000..ed61598
--- /dev/null
+++ b/testdata/format/choice_expression.emb.formatted
@@ -0,0 +1,22 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- The choice operator `?:` should be formatted with spaces between its
+-- components, but not after.
+
+
+struct Foo:
+ 0 [+4] UInt data1
+ if data1 > 4 ? false : true:
+ 10 [+10] UInt:8[] data2
diff --git a/testdata/format/choice_expression.emb.formatted_indent_4 b/testdata/format/choice_expression.emb.formatted_indent_4
new file mode 100644
index 0000000..d310106
--- /dev/null
+++ b/testdata/format/choice_expression.emb.formatted_indent_4
@@ -0,0 +1,22 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- The choice operator `?:` should be formatted with spaces between its
+-- components, but not after.
+
+
+struct Foo:
+ 0 [+4] UInt data1
+ if data1 > 4 ? false : true:
+ 10 [+10] UInt:8[] data2
diff --git a/testdata/format/comparison_expressions.emb b/testdata/format/comparison_expressions.emb
new file mode 100644
index 0000000..2270171
--- /dev/null
+++ b/testdata/format/comparison_expressions.emb
@@ -0,0 +1,24 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Comparison expressions should have one space around operators.
+struct Foo:
+ if 0<0:
+ 0 [+4] UInt a
+ if 0 <= 0:
+ 4 [+4] UInt b
+ if 0>=0 >0==0:
+ 0 [+4] UInt a
+ if 0==0<0<=0== 0:
+ 0 [+4] UInt a
diff --git a/testdata/format/comparison_expressions.emb.formatted b/testdata/format/comparison_expressions.emb.formatted
new file mode 100644
index 0000000..264801a
--- /dev/null
+++ b/testdata/format/comparison_expressions.emb.formatted
@@ -0,0 +1,29 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Comparison expressions should have one space around operators.
+
+
+struct Foo:
+ if 0 < 0:
+ 0 [+4] UInt a
+
+ if 0 <= 0:
+ 4 [+4] UInt b
+
+ if 0 >= 0 > 0 == 0:
+ 0 [+4] UInt a
+
+ if 0 == 0 < 0 <= 0 == 0:
+ 0 [+4] UInt a
diff --git a/testdata/format/comparison_expressions.emb.formatted_indent_4 b/testdata/format/comparison_expressions.emb.formatted_indent_4
new file mode 100644
index 0000000..4d11e49
--- /dev/null
+++ b/testdata/format/comparison_expressions.emb.formatted_indent_4
@@ -0,0 +1,29 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Comparison expressions should have one space around operators.
+
+
+struct Foo:
+ if 0 < 0:
+ 0 [+4] UInt a
+
+ if 0 <= 0:
+ 4 [+4] UInt b
+
+ if 0 >= 0 > 0 == 0:
+ 0 [+4] UInt a
+
+ if 0 == 0 < 0 <= 0 == 0:
+ 0 [+4] UInt a
diff --git a/testdata/format/conditional_field_formatting.emb b/testdata/format/conditional_field_formatting.emb
new file mode 100644
index 0000000..f8cea89
--- /dev/null
+++ b/testdata/format/conditional_field_formatting.emb
@@ -0,0 +1,20 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Conditional fields should be aligned with unconditional fields, except that
+-- their full locations should be indented one indentation level.
+struct Foo:
+ 0 [+4] UInt data1
+ if true:
+ 10 [+10] UInt:8[] data2
diff --git a/testdata/format/conditional_field_formatting.emb.formatted b/testdata/format/conditional_field_formatting.emb.formatted
new file mode 100644
index 0000000..741319f
--- /dev/null
+++ b/testdata/format/conditional_field_formatting.emb.formatted
@@ -0,0 +1,22 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Conditional fields should be aligned with unconditional fields, except that
+-- their full locations should be indented one indentation level.
+
+
+struct Foo:
+ 0 [+4] UInt data1
+ if true:
+ 10 [+10] UInt:8[] data2
diff --git a/testdata/format/conditional_field_formatting.emb.formatted_indent_4 b/testdata/format/conditional_field_formatting.emb.formatted_indent_4
new file mode 100644
index 0000000..e5a8be5
--- /dev/null
+++ b/testdata/format/conditional_field_formatting.emb.formatted_indent_4
@@ -0,0 +1,22 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Conditional fields should be aligned with unconditional fields, except that
+-- their full locations should be indented one indentation level.
+
+
+struct Foo:
+ 0 [+4] UInt data1
+ if true:
+ 10 [+10] UInt:8[] data2
diff --git a/testdata/format/conditional_inline_bits_formatting.emb b/testdata/format/conditional_inline_bits_formatting.emb
new file mode 100644
index 0000000..7ed1649
--- /dev/null
+++ b/testdata/format/conditional_inline_bits_formatting.emb
@@ -0,0 +1,23 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Conditional inline `bits` should be formatted as if their fields were part
+-- of the surrounding `struct`, except that their full locations should be
+-- indented two indentation levels.
+struct Foo:
+ if true:
+ 0 [+4] bits:
+ 0 [+4] UInt offset
+ 4 [+32] UInt reserved
+ offset [+10] UInt:8[] data
diff --git a/testdata/format/conditional_inline_bits_formatting.emb.formatted b/testdata/format/conditional_inline_bits_formatting.emb.formatted
new file mode 100644
index 0000000..3f003f0
--- /dev/null
+++ b/testdata/format/conditional_inline_bits_formatting.emb.formatted
@@ -0,0 +1,26 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Conditional inline `bits` should be formatted as if their fields were part
+-- of the surrounding `struct`, except that their full locations should be
+-- indented two indentation levels.
+
+
+struct Foo:
+ if true:
+ 0 [+4] bits:
+ 0 [+4] UInt offset
+ 4 [+32] UInt reserved
+
+ offset [+10] UInt:8[] data
diff --git a/testdata/format/conditional_inline_bits_formatting.emb.formatted_indent_4 b/testdata/format/conditional_inline_bits_formatting.emb.formatted_indent_4
new file mode 100644
index 0000000..52b82ae
--- /dev/null
+++ b/testdata/format/conditional_inline_bits_formatting.emb.formatted_indent_4
@@ -0,0 +1,26 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Conditional inline `bits` should be formatted as if their fields were part
+-- of the surrounding `struct`, except that their full locations should be
+-- indented two indentation levels.
+
+
+struct Foo:
+ if true:
+ 0 [+4] bits:
+ 0 [+4] UInt offset
+ 4 [+32] UInt reserved
+
+ offset [+10] UInt:8[] data
diff --git a/testdata/format/dotted_names.emb b/testdata/format/dotted_names.emb
new file mode 100644
index 0000000..b64a8e1
--- /dev/null
+++ b/testdata/format/dotted_names.emb
@@ -0,0 +1,30 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- '.'-separated names are formatted with no spaces.
+
+import "foo.emb" as f
+
+enum Foo:
+ FOUR = 4
+
+struct Bar:
+ struct Baz:
+ 0 [+4] UInt field
+
+struct Qux:
+ 0 [+Foo . FOUR] Bar .Baz field
+ field.field [+4] UInt thing
+ if f. Bar .BAZ==f . Bar.BAZ:
+ 0 [+4] f.Bar other
diff --git a/testdata/format/dotted_names.emb.formatted b/testdata/format/dotted_names.emb.formatted
new file mode 100644
index 0000000..f6f8452
--- /dev/null
+++ b/testdata/format/dotted_names.emb.formatted
@@ -0,0 +1,33 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- '.'-separated names are formatted with no spaces.
+
+import "foo.emb" as f
+
+
+enum Foo:
+ FOUR = 4
+
+
+struct Bar:
+ struct Baz:
+ 0 [+4] UInt field
+
+
+struct Qux:
+ 0 [+Foo.FOUR] Bar.Baz field
+ field.field [+4] UInt thing
+ if f.Bar.BAZ == f.Bar.BAZ:
+ 0 [+4] f.Bar other
diff --git a/testdata/format/dotted_names.emb.formatted_indent_4 b/testdata/format/dotted_names.emb.formatted_indent_4
new file mode 100644
index 0000000..771b018
--- /dev/null
+++ b/testdata/format/dotted_names.emb.formatted_indent_4
@@ -0,0 +1,33 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- '.'-separated names are formatted with no spaces.
+
+import "foo.emb" as f
+
+
+enum Foo:
+ FOUR = 4
+
+
+struct Bar:
+ struct Baz:
+ 0 [+4] UInt field
+
+
+struct Qux:
+ 0 [+Foo.FOUR] Bar.Baz field
+ field.field [+4] UInt thing
+ if f.Bar.BAZ == f.Bar.BAZ:
+ 0 [+4] f.Bar other
diff --git a/testdata/format/empty.emb b/testdata/format/empty.emb
new file mode 100644
index 0000000..2c31d84
--- /dev/null
+++ b/testdata/format/empty.emb
@@ -0,0 +1,14 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
diff --git a/testdata/format/empty.emb.formatted b/testdata/format/empty.emb.formatted
new file mode 100644
index 0000000..086a24e
--- /dev/null
+++ b/testdata/format/empty.emb.formatted
@@ -0,0 +1,13 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/testdata/format/empty.emb.formatted_indent_4 b/testdata/format/empty.emb.formatted_indent_4
new file mode 100644
index 0000000..086a24e
--- /dev/null
+++ b/testdata/format/empty.emb.formatted_indent_4
@@ -0,0 +1,13 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/testdata/format/enum_value_bodies.emb b/testdata/format/enum_value_bodies.emb
new file mode 100644
index 0000000..28b96fe
--- /dev/null
+++ b/testdata/format/enum_value_bodies.emb
@@ -0,0 +1,25 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Documentation on `enum` values should be properly indented.
+
+enum Foo:
+ BAR = 0
+ -- Bar is a bar.
+ BAZ = 1
+ -- Baz requires more documentation.
+ -- Baz requires more documentation.
+ -- Baz requires more documentation.
+ -- Baz requires more documentation.
+ -- Baz requires more documentation.
diff --git a/testdata/format/enum_value_bodies.emb.formatted b/testdata/format/enum_value_bodies.emb.formatted
new file mode 100644
index 0000000..9064602
--- /dev/null
+++ b/testdata/format/enum_value_bodies.emb.formatted
@@ -0,0 +1,27 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Documentation on `enum` values should be properly indented.
+
+
+enum Foo:
+ BAR = 0
+ -- Bar is a bar.
+
+ BAZ = 1
+ -- Baz requires more documentation.
+ -- Baz requires more documentation.
+ -- Baz requires more documentation.
+ -- Baz requires more documentation.
+ -- Baz requires more documentation.
diff --git a/testdata/format/enum_value_bodies.emb.formatted_indent_4 b/testdata/format/enum_value_bodies.emb.formatted_indent_4
new file mode 100644
index 0000000..1db058d
--- /dev/null
+++ b/testdata/format/enum_value_bodies.emb.formatted_indent_4
@@ -0,0 +1,27 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Documentation on `enum` values should be properly indented.
+
+
+enum Foo:
+ BAR = 0
+ -- Bar is a bar.
+
+ BAZ = 1
+ -- Baz requires more documentation.
+ -- Baz requires more documentation.
+ -- Baz requires more documentation.
+ -- Baz requires more documentation.
+ -- Baz requires more documentation.
diff --git a/testdata/format/enum_values_aligned.emb b/testdata/format/enum_values_aligned.emb
new file mode 100644
index 0000000..2feb0f3
--- /dev/null
+++ b/testdata/format/enum_values_aligned.emb
@@ -0,0 +1,19 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+enum Foo:
+ SHORT = 1
+ MEDIUM = 2
+ SOMEWHAT_LONG = 30
+ VERY_VERY_VERY_VERY_VERY_VERY_VERY_VERY_VERY_LONG = 1_000_000_000_000_000
diff --git a/testdata/format/enum_values_aligned.emb.formatted b/testdata/format/enum_values_aligned.emb.formatted
new file mode 100644
index 0000000..b9edab8
--- /dev/null
+++ b/testdata/format/enum_values_aligned.emb.formatted
@@ -0,0 +1,20 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+enum Foo:
+ SHORT = 1
+ MEDIUM = 2
+ SOMEWHAT_LONG = 30
+ VERY_VERY_VERY_VERY_VERY_VERY_VERY_VERY_VERY_LONG = 1_000_000_000_000_000
diff --git a/testdata/format/enum_values_aligned.emb.formatted_indent_4 b/testdata/format/enum_values_aligned.emb.formatted_indent_4
new file mode 100644
index 0000000..d13f82a
--- /dev/null
+++ b/testdata/format/enum_values_aligned.emb.formatted_indent_4
@@ -0,0 +1,20 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+enum Foo:
+ SHORT = 1
+ MEDIUM = 2
+ SOMEWHAT_LONG = 30
+ VERY_VERY_VERY_VERY_VERY_VERY_VERY_VERY_VERY_LONG = 1_000_000_000_000_000
diff --git a/testdata/format/equality_expressions.emb b/testdata/format/equality_expressions.emb
new file mode 100644
index 0000000..d5bd636
--- /dev/null
+++ b/testdata/format/equality_expressions.emb
@@ -0,0 +1,24 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Equality expressions should have one space around operators.
+struct Foo:
+ if 0==0:
+ 0 [+4] UInt a
+ if 0 != 0:
+ 4 [+4] UInt b
+ if 0==0 ==0:
+ 0 [+4] UInt a
+ if 0!=0:
+ 0 [+4] UInt a
diff --git a/testdata/format/equality_expressions.emb.formatted b/testdata/format/equality_expressions.emb.formatted
new file mode 100644
index 0000000..dce7839
--- /dev/null
+++ b/testdata/format/equality_expressions.emb.formatted
@@ -0,0 +1,29 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Equality expressions should have one space around operators.
+
+
+struct Foo:
+ if 0 == 0:
+ 0 [+4] UInt a
+
+ if 0 != 0:
+ 4 [+4] UInt b
+
+ if 0 == 0 == 0:
+ 0 [+4] UInt a
+
+ if 0 != 0:
+ 0 [+4] UInt a
diff --git a/testdata/format/equality_expressions.emb.formatted_indent_4 b/testdata/format/equality_expressions.emb.formatted_indent_4
new file mode 100644
index 0000000..b18d56a
--- /dev/null
+++ b/testdata/format/equality_expressions.emb.formatted_indent_4
@@ -0,0 +1,29 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Equality expressions should have one space around operators.
+
+
+struct Foo:
+ if 0 == 0:
+ 0 [+4] UInt a
+
+ if 0 != 0:
+ 4 [+4] UInt b
+
+ if 0 == 0 == 0:
+ 0 [+4] UInt a
+
+ if 0 != 0:
+ 0 [+4] UInt a
diff --git a/testdata/format/external.emb b/testdata/format/external.emb
new file mode 100644
index 0000000..5119a4a
--- /dev/null
+++ b/testdata/format/external.emb
@@ -0,0 +1,21 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Externals get minimal formatting.
+
+external Foo:
+ -- documentation
+ [requires: $is_statically_sized && 1 <= $static_size_in_bits <= 64]
+ [is_integer: true]
+ [addressable_unit_size: 1]
diff --git a/testdata/format/external.emb.formatted b/testdata/format/external.emb.formatted
new file mode 100644
index 0000000..fd02726
--- /dev/null
+++ b/testdata/format/external.emb.formatted
@@ -0,0 +1,23 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Externals get minimal formatting.
+
+
+external Foo:
+ -- documentation
+
+ [requires: $is_statically_sized && 1 <= $static_size_in_bits <= 64]
+ [is_integer: true]
+ [addressable_unit_size: 1]
diff --git a/testdata/format/external.emb.formatted_indent_4 b/testdata/format/external.emb.formatted_indent_4
new file mode 100644
index 0000000..55bed10
--- /dev/null
+++ b/testdata/format/external.emb.formatted_indent_4
@@ -0,0 +1,23 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Externals get minimal formatting.
+
+
+external Foo:
+ -- documentation
+
+ [requires: $is_statically_sized && 1 <= $static_size_in_bits <= 64]
+ [is_integer: true]
+ [addressable_unit_size: 1]
diff --git a/testdata/format/extra_newlines.emb b/testdata/format/extra_newlines.emb
new file mode 100644
index 0000000..14e9ed7
--- /dev/null
+++ b/testdata/format/extra_newlines.emb
@@ -0,0 +1,37 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+
+
+
+
+
+
+
+-- Leading and trailing newlines should be stripped.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/testdata/format/extra_newlines.emb.formatted b/testdata/format/extra_newlines.emb.formatted
new file mode 100644
index 0000000..0f47ec1
--- /dev/null
+++ b/testdata/format/extra_newlines.emb.formatted
@@ -0,0 +1,15 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Leading and trailing newlines should be stripped.
diff --git a/testdata/format/extra_newlines.emb.formatted_indent_4 b/testdata/format/extra_newlines.emb.formatted_indent_4
new file mode 100644
index 0000000..0f47ec1
--- /dev/null
+++ b/testdata/format/extra_newlines.emb.formatted_indent_4
@@ -0,0 +1,15 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Leading and trailing newlines should be stripped.
diff --git a/testdata/format/fields_aligned.emb b/testdata/format/fields_aligned.emb
new file mode 100644
index 0000000..2a1ccac
--- /dev/null
+++ b/testdata/format/fields_aligned.emb
@@ -0,0 +1,20 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+struct Foo:
+ # Field columns should be aligned, with one space between start and size, and
+ # two spaces between other columns.
+ 0 [+1] UInt short
+ 10 [+1] UInt:8 medium
+ 1_000_000 [+8] UInt:64 long
diff --git a/testdata/format/fields_aligned.emb.formatted b/testdata/format/fields_aligned.emb.formatted
new file mode 100644
index 0000000..ae2f28e
--- /dev/null
+++ b/testdata/format/fields_aligned.emb.formatted
@@ -0,0 +1,21 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+struct Foo:
+ # Field columns should be aligned, with one space between start and size, and
+ # two spaces between other columns.
+ 0 [+1] UInt short
+ 10 [+1] UInt:8 medium
+ 1_000_000 [+8] UInt:64 long
diff --git a/testdata/format/fields_aligned.emb.formatted_indent_4 b/testdata/format/fields_aligned.emb.formatted_indent_4
new file mode 100644
index 0000000..fd14ce2
--- /dev/null
+++ b/testdata/format/fields_aligned.emb.formatted_indent_4
@@ -0,0 +1,21 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+struct Foo:
+ # Field columns should be aligned, with one space between start and size, and
+ # two spaces between other columns.
+ 0 [+1] UInt short
+ 10 [+1] UInt:8 medium
+ 1_000_000 [+8] UInt:64 long
diff --git a/testdata/format/functions.emb b/testdata/format/functions.emb
new file mode 100644
index 0000000..6c0f1d1
--- /dev/null
+++ b/testdata/format/functions.emb
@@ -0,0 +1,25 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Functions should be formatted like `$func(arg)`, `$func(arg, arg, arg)`, or
+-- `$func()`.
+
+struct Foo:
+ $max ( ) [+1] UInt foo
+ if $present (foo ) :
+ $max ( 1 ) [ + 3 ] UInt bar
+ $max ( 1 , 2 ) [ + 3 ] UInt baz
+ $max ( 1 , 2,3,4,5,6 ) [ + 3 ] UInt qux
+ $upper_bound(foo ) [+1] UInt quux
+ $lower_bound(quux ) [ +1 ] UInt quuux
diff --git a/testdata/format/functions.emb.formatted b/testdata/format/functions.emb.formatted
new file mode 100644
index 0000000..7eafbac
--- /dev/null
+++ b/testdata/format/functions.emb.formatted
@@ -0,0 +1,27 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Functions should be formatted like `$func(arg)`, `$func(arg, arg, arg)`, or
+-- `$func()`.
+
+
+struct Foo:
+ $max() [+1] UInt foo
+ if $present(foo):
+ $max(1) [+3] UInt bar
+
+ $max(1, 2) [+3] UInt baz
+ $max(1, 2, 3, 4, 5, 6) [+3] UInt qux
+ $upper_bound(foo) [+1] UInt quux
+ $lower_bound(quux) [+1] UInt quuux
diff --git a/testdata/format/functions.emb.formatted_indent_4 b/testdata/format/functions.emb.formatted_indent_4
new file mode 100644
index 0000000..d6e4884
--- /dev/null
+++ b/testdata/format/functions.emb.formatted_indent_4
@@ -0,0 +1,27 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Functions should be formatted like `$func(arg)`, `$func(arg, arg, arg)`, or
+-- `$func()`.
+
+
+struct Foo:
+ $max() [+1] UInt foo
+ if $present(foo):
+ $max(1) [+3] UInt bar
+
+ $max(1, 2) [+3] UInt baz
+ $max(1, 2, 3, 4, 5, 6) [+3] UInt qux
+ $upper_bound(foo) [+1] UInt quux
+ $lower_bound(quux) [+1] UInt quuux
diff --git a/testdata/format/header_and_type.emb b/testdata/format/header_and_type.emb
new file mode 100644
index 0000000..83768f2
--- /dev/null
+++ b/testdata/format/header_and_type.emb
@@ -0,0 +1,17 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Header lines should be separated from type definitions by two blank lines.
+struct Foo:
+ 0 [+1] UInt bar
diff --git a/testdata/format/header_and_type.emb.formatted b/testdata/format/header_and_type.emb.formatted
new file mode 100644
index 0000000..7dff2fc
--- /dev/null
+++ b/testdata/format/header_and_type.emb.formatted
@@ -0,0 +1,19 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Header lines should be separated from type definitions by two blank lines.
+
+
+struct Foo:
+ 0 [+1] UInt bar
diff --git a/testdata/format/header_and_type.emb.formatted_indent_4 b/testdata/format/header_and_type.emb.formatted_indent_4
new file mode 100644
index 0000000..5d6b04c
--- /dev/null
+++ b/testdata/format/header_and_type.emb.formatted_indent_4
@@ -0,0 +1,19 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Header lines should be separated from type definitions by two blank lines.
+
+
+struct Foo:
+ 0 [+1] UInt bar
diff --git a/testdata/format/indent.emb b/testdata/format/indent.emb
new file mode 100644
index 0000000..5746d46
--- /dev/null
+++ b/testdata/format/indent.emb
@@ -0,0 +1,23 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# All indentation should be set to 2 spaces.
+struct Foo:
+ 0 [+1] bits:
+ 0 [+1] UInt x
+ 10 [+1] UInt:8 medium
+
+enum Bar:
+ VAL = 1
+ VV = 2
diff --git a/testdata/format/indent.emb.formatted b/testdata/format/indent.emb.formatted
new file mode 100644
index 0000000..c139be6
--- /dev/null
+++ b/testdata/format/indent.emb.formatted
@@ -0,0 +1,27 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# All indentation should be set to 2 spaces.
+
+
+struct Foo:
+ 0 [+1] bits:
+ 0 [+1] UInt x
+
+ 10 [+1] UInt:8 medium
+
+
+enum Bar:
+ VAL = 1
+ VV = 2
diff --git a/testdata/format/indent.emb.formatted_indent_4 b/testdata/format/indent.emb.formatted_indent_4
new file mode 100644
index 0000000..91f58ff
--- /dev/null
+++ b/testdata/format/indent.emb.formatted_indent_4
@@ -0,0 +1,27 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# All indentation should be set to 2 spaces.
+
+
+struct Foo:
+ 0 [+1] bits:
+ 0 [+1] UInt x
+
+ 10 [+1] UInt:8 medium
+
+
+enum Bar:
+ VAL = 1
+ VV = 2
diff --git a/testdata/format/inline_attributes_get_a_column.emb b/testdata/format/inline_attributes_get_a_column.emb
new file mode 100644
index 0000000..916a55b
--- /dev/null
+++ b/testdata/format/inline_attributes_get_a_column.emb
@@ -0,0 +1,20 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline attributes on fields are aligned in a column.
+bits Foo:
+ 0 [+4] UInt a [requires: 0 <= this <= 10]
+ 4 [+4] UInt bc [requires: 0 <= this <= 4] [byte_order: "LittleEndian"]
+ 8 [+4] UInt def [requires: 1000 <= this <= 1020]
+ 8 [+4] UInt ghij [requires: 10 <= this <= 10]
diff --git a/testdata/format/inline_attributes_get_a_column.emb.formatted b/testdata/format/inline_attributes_get_a_column.emb.formatted
new file mode 100644
index 0000000..3e09b70
--- /dev/null
+++ b/testdata/format/inline_attributes_get_a_column.emb.formatted
@@ -0,0 +1,22 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline attributes on fields are aligned in a column.
+
+
+bits Foo:
+ 0 [+4] UInt a [requires: 0 <= this <= 10]
+ 4 [+4] UInt bc [requires: 0 <= this <= 4] [byte_order: "LittleEndian"]
+ 8 [+4] UInt def [requires: 1000 <= this <= 1020]
+ 8 [+4] UInt ghij [requires: 10 <= this <= 10]
diff --git a/testdata/format/inline_attributes_get_a_column.emb.formatted_indent_4 b/testdata/format/inline_attributes_get_a_column.emb.formatted_indent_4
new file mode 100644
index 0000000..bb8429b
--- /dev/null
+++ b/testdata/format/inline_attributes_get_a_column.emb.formatted_indent_4
@@ -0,0 +1,22 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline attributes on fields are aligned in a column.
+
+
+bits Foo:
+ 0 [+4] UInt a [requires: 0 <= this <= 10]
+ 4 [+4] UInt bc [requires: 0 <= this <= 4] [byte_order: "LittleEndian"]
+ 8 [+4] UInt def [requires: 1000 <= this <= 1020]
+ 8 [+4] UInt ghij [requires: 10 <= this <= 10]
diff --git a/testdata/format/inline_bits.emb b/testdata/format/inline_bits.emb
new file mode 100644
index 0000000..ed95d37
--- /dev/null
+++ b/testdata/format/inline_bits.emb
@@ -0,0 +1,34 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline bits are columnized separately from their surrounding structure.
+
+bits Foo:
+ 0 [+5] bits bler:
+ 0[+1] Flag xxx # comment
+ 1 [+1]Flag yy
+ 2 [+1] Flag zzzzzzzzzzz
+ 10 [+6] UInt length
+ 16 [+16] UInt width
+ if width == 1000:
+ 32 [+10] UInt depth # comment
+ 42 [+6] UInt checksum # other comment
+
+struct Foo2:
+ 8 [+4] UInt length # comment
+ 12 [+4] UInt width # comment
+ 0 [+5] bits bler:
+ 0[+1] Flag xxx # comment
+ 1 [+1]Flag yy
+ 2 [+1] Flag zzzzzzzzzzz
diff --git a/testdata/format/inline_bits.emb.formatted b/testdata/format/inline_bits.emb.formatted
new file mode 100644
index 0000000..ee03cba
--- /dev/null
+++ b/testdata/format/inline_bits.emb.formatted
@@ -0,0 +1,37 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline bits are columnized separately from their surrounding structure.
+
+
+bits Foo:
+ 0 [+5] bits bler:
+ 0 [+1] Flag xxx # comment
+ 1 [+1] Flag yy
+ 2 [+1] Flag zzzzzzzzzzz
+
+ 10 [+6] UInt length
+ 16 [+16] UInt width
+ if width == 1000:
+ 32 [+10] UInt depth # comment
+ 42 [+6] UInt checksum # other comment
+
+
+struct Foo2:
+ 8 [+4] UInt length # comment
+ 12 [+4] UInt width # comment
+ 0 [+5] bits bler:
+ 0 [+1] Flag xxx # comment
+ 1 [+1] Flag yy
+ 2 [+1] Flag zzzzzzzzzzz
diff --git a/testdata/format/inline_bits.emb.formatted_indent_4 b/testdata/format/inline_bits.emb.formatted_indent_4
new file mode 100644
index 0000000..c6942cb
--- /dev/null
+++ b/testdata/format/inline_bits.emb.formatted_indent_4
@@ -0,0 +1,37 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline bits are columnized separately from their surrounding structure.
+
+
+bits Foo:
+ 0 [+5] bits bler:
+ 0 [+1] Flag xxx # comment
+ 1 [+1] Flag yy
+ 2 [+1] Flag zzzzzzzzzzz
+
+ 10 [+6] UInt length
+ 16 [+16] UInt width
+ if width == 1000:
+ 32 [+10] UInt depth # comment
+ 42 [+6] UInt checksum # other comment
+
+
+struct Foo2:
+ 8 [+4] UInt length # comment
+ 12 [+4] UInt width # comment
+ 0 [+5] bits bler:
+ 0 [+1] Flag xxx # comment
+ 1 [+1] Flag yy
+ 2 [+1] Flag zzzzzzzzzzz
diff --git a/testdata/format/inline_documentation_gets_a_column.emb b/testdata/format/inline_documentation_gets_a_column.emb
new file mode 100644
index 0000000..da8600a
--- /dev/null
+++ b/testdata/format/inline_documentation_gets_a_column.emb
@@ -0,0 +1,21 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline documentation on fields is aligned in a column.
+bits Foo:
+ 0 [+4] UInt a -- a
+ 4 [+4] UInt bc -- bc
+ 8 [+4] UInt def -- def
+ 8 [+4] UInt ghij -- ghi
+
diff --git a/testdata/format/inline_documentation_gets_a_column.emb.formatted b/testdata/format/inline_documentation_gets_a_column.emb.formatted
new file mode 100644
index 0000000..4d1184e
--- /dev/null
+++ b/testdata/format/inline_documentation_gets_a_column.emb.formatted
@@ -0,0 +1,22 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline documentation on fields is aligned in a column.
+
+
+bits Foo:
+ 0 [+4] UInt a -- a
+ 4 [+4] UInt bc -- bc
+ 8 [+4] UInt def -- def
+ 8 [+4] UInt ghij -- ghi
diff --git a/testdata/format/inline_documentation_gets_a_column.emb.formatted_indent_4 b/testdata/format/inline_documentation_gets_a_column.emb.formatted_indent_4
new file mode 100644
index 0000000..b91179a
--- /dev/null
+++ b/testdata/format/inline_documentation_gets_a_column.emb.formatted_indent_4
@@ -0,0 +1,22 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline documentation on fields is aligned in a column.
+
+
+bits Foo:
+ 0 [+4] UInt a -- a
+ 4 [+4] UInt bc -- bc
+ 8 [+4] UInt def -- def
+ 8 [+4] UInt ghij -- ghi
diff --git a/testdata/format/inline_enum.emb b/testdata/format/inline_enum.emb
new file mode 100644
index 0000000..268e584
--- /dev/null
+++ b/testdata/format/inline_enum.emb
@@ -0,0 +1,34 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline enums are columnized separately from their surrounding structure.
+
+bits Foo:
+ 0 [+5] enum bler:
+ XXX = 0 # comment
+ YY = 1
+ ZZZZZZZZZZZ = 2
+ 10 [+6] UInt length
+ 16 [+16] UInt width
+ if width == 1000:
+ 32 [+10] UInt depth # comment
+ 42 [+6] UInt checksum # other comment
+
+struct Foo2:
+ 8 [+4] UInt length # comment
+ 12 [+4] UInt width # comment
+ 0 [+5] enum bler:
+ XXX = 0 # comment
+ YY = 1
+ ZZZZZZZZZZZ = 2
diff --git a/testdata/format/inline_enum.emb.formatted b/testdata/format/inline_enum.emb.formatted
new file mode 100644
index 0000000..ca54592
--- /dev/null
+++ b/testdata/format/inline_enum.emb.formatted
@@ -0,0 +1,37 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline enums are columnized separately from their surrounding structure.
+
+
+bits Foo:
+ 0 [+5] enum bler:
+ XXX = 0 # comment
+ YY = 1
+ ZZZZZZZZZZZ = 2
+
+ 10 [+6] UInt length
+ 16 [+16] UInt width
+ if width == 1000:
+ 32 [+10] UInt depth # comment
+ 42 [+6] UInt checksum # other comment
+
+
+struct Foo2:
+ 8 [+4] UInt length # comment
+ 12 [+4] UInt width # comment
+ 0 [+5] enum bler:
+ XXX = 0 # comment
+ YY = 1
+ ZZZZZZZZZZZ = 2
diff --git a/testdata/format/inline_enum.emb.formatted_indent_4 b/testdata/format/inline_enum.emb.formatted_indent_4
new file mode 100644
index 0000000..052bccf
--- /dev/null
+++ b/testdata/format/inline_enum.emb.formatted_indent_4
@@ -0,0 +1,37 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline enums are columnized separately from their surrounding structure.
+
+
+bits Foo:
+ 0 [+5] enum bler:
+ XXX = 0 # comment
+ YY = 1
+ ZZZZZZZZZZZ = 2
+
+ 10 [+6] UInt length
+ 16 [+16] UInt width
+ if width == 1000:
+ 32 [+10] UInt depth # comment
+ 42 [+6] UInt checksum # other comment
+
+
+struct Foo2:
+ 8 [+4] UInt length # comment
+ 12 [+4] UInt width # comment
+ 0 [+5] enum bler:
+ XXX = 0 # comment
+ YY = 1
+ ZZZZZZZZZZZ = 2
diff --git a/testdata/format/inline_struct.emb b/testdata/format/inline_struct.emb
new file mode 100644
index 0000000..cca9d82
--- /dev/null
+++ b/testdata/format/inline_struct.emb
@@ -0,0 +1,23 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline structs are columnized separately from their surrounding structure.
+
+struct Foo2:
+ 8 [+4] UInt length # comment
+ 0 [+5] struct bler:
+ 0[+1] UInt xxx # comment
+ 1 [+1]UInt yy
+ 2 [+ 1] UInt zzzzzzzzzzz
+ 12 [+4] UInt width # comment
diff --git a/testdata/format/inline_struct.emb.formatted b/testdata/format/inline_struct.emb.formatted
new file mode 100644
index 0000000..4f467a2
--- /dev/null
+++ b/testdata/format/inline_struct.emb.formatted
@@ -0,0 +1,26 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline structs are columnized separately from their surrounding structure.
+
+
+struct Foo2:
+ 8 [+4] UInt length # comment
+
+ 0 [+5] struct bler:
+ 0 [+1] UInt xxx # comment
+ 1 [+1] UInt yy
+ 2 [+1] UInt zzzzzzzzzzz
+
+ 12 [+4] UInt width # comment
diff --git a/testdata/format/inline_struct.emb.formatted_indent_4 b/testdata/format/inline_struct.emb.formatted_indent_4
new file mode 100644
index 0000000..3adf356
--- /dev/null
+++ b/testdata/format/inline_struct.emb.formatted_indent_4
@@ -0,0 +1,26 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Inline structs are columnized separately from their surrounding structure.
+
+
+struct Foo2:
+ 8 [+4] UInt length # comment
+
+ 0 [+5] struct bler:
+ 0 [+1] UInt xxx # comment
+ 1 [+1] UInt yy
+ 2 [+1] UInt zzzzzzzzzzz
+
+ 12 [+4] UInt width # comment
diff --git a/testdata/format/lines_not_spaced_out_with_excess_trailing_noise_lines.emb b/testdata/format/lines_not_spaced_out_with_excess_trailing_noise_lines.emb
new file mode 100644
index 0000000..ce64f5f
--- /dev/null
+++ b/testdata/format/lines_not_spaced_out_with_excess_trailing_noise_lines.emb
@@ -0,0 +1,29 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- If there are fewer noise lines than primary lines in a block, *not counting
+-- trailing noise lines*, then extra newlines should not be inserted between
+-- other elements of the block.
+struct Foo:
+ 0 [+4] UInt a
+ 4 [+4] UInt b
+ -- some doc
+ -- some doc
+ 8 [+4] UInt c
+ -- some doc
+ -- some doc
+ -- some doc
+ -- some doc
+ -- some doc
+ -- some doc
diff --git a/testdata/format/lines_not_spaced_out_with_excess_trailing_noise_lines.emb.formatted b/testdata/format/lines_not_spaced_out_with_excess_trailing_noise_lines.emb.formatted
new file mode 100644
index 0000000..ef2994b
--- /dev/null
+++ b/testdata/format/lines_not_spaced_out_with_excess_trailing_noise_lines.emb.formatted
@@ -0,0 +1,32 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- If there are fewer noise lines than primary lines in a block, *not counting
+-- trailing noise lines*, then extra newlines should not be inserted between
+-- other elements of the block.
+
+
+struct Foo:
+ 0 [+4] UInt a
+ 4 [+4] UInt b
+ -- some doc
+ -- some doc
+
+ 8 [+4] UInt c
+ -- some doc
+ -- some doc
+ -- some doc
+ -- some doc
+ -- some doc
+ -- some doc
diff --git a/testdata/format/lines_not_spaced_out_with_excess_trailing_noise_lines.emb.formatted_indent_4 b/testdata/format/lines_not_spaced_out_with_excess_trailing_noise_lines.emb.formatted_indent_4
new file mode 100644
index 0000000..3f33616
--- /dev/null
+++ b/testdata/format/lines_not_spaced_out_with_excess_trailing_noise_lines.emb.formatted_indent_4
@@ -0,0 +1,32 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- If there are fewer noise lines than primary lines in a block, *not counting
+-- trailing noise lines*, then extra newlines should not be inserted between
+-- other elements of the block.
+
+
+struct Foo:
+ 0 [+4] UInt a
+ 4 [+4] UInt b
+ -- some doc
+ -- some doc
+
+ 8 [+4] UInt c
+ -- some doc
+ -- some doc
+ -- some doc
+ -- some doc
+ -- some doc
+ -- some doc
diff --git a/testdata/format/lines_not_spaced_out_with_not_enough_noise_lines.emb b/testdata/format/lines_not_spaced_out_with_not_enough_noise_lines.emb
new file mode 100644
index 0000000..4316166
--- /dev/null
+++ b/testdata/format/lines_not_spaced_out_with_not_enough_noise_lines.emb
@@ -0,0 +1,22 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- If there are fewer noise lines than primary lines in a block, then extra
+-- newlines should not be inserted between other elements of the block.
+struct Foo:
+ 0 [+4] UInt a
+ 4 [+4] UInt b
+ -- some doc
+ -- some doc
+ 8 [+4] UInt c
diff --git a/testdata/format/lines_not_spaced_out_with_not_enough_noise_lines.emb.formatted b/testdata/format/lines_not_spaced_out_with_not_enough_noise_lines.emb.formatted
new file mode 100644
index 0000000..1727937
--- /dev/null
+++ b/testdata/format/lines_not_spaced_out_with_not_enough_noise_lines.emb.formatted
@@ -0,0 +1,25 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- If there are fewer noise lines than primary lines in a block, then extra
+-- newlines should not be inserted between other elements of the block.
+
+
+struct Foo:
+ 0 [+4] UInt a
+ 4 [+4] UInt b
+ -- some doc
+ -- some doc
+
+ 8 [+4] UInt c
diff --git a/testdata/format/lines_not_spaced_out_with_not_enough_noise_lines.emb.formatted_indent_4 b/testdata/format/lines_not_spaced_out_with_not_enough_noise_lines.emb.formatted_indent_4
new file mode 100644
index 0000000..a62f908
--- /dev/null
+++ b/testdata/format/lines_not_spaced_out_with_not_enough_noise_lines.emb.formatted_indent_4
@@ -0,0 +1,25 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- If there are fewer noise lines than primary lines in a block, then extra
+-- newlines should not be inserted between other elements of the block.
+
+
+struct Foo:
+ 0 [+4] UInt a
+ 4 [+4] UInt b
+ -- some doc
+ -- some doc
+
+ 8 [+4] UInt c
diff --git a/testdata/format/lines_spaced_out_with_noise_lines.emb b/testdata/format/lines_spaced_out_with_noise_lines.emb
new file mode 100644
index 0000000..b38fcf8
--- /dev/null
+++ b/testdata/format/lines_spaced_out_with_noise_lines.emb
@@ -0,0 +1,24 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- If there are more non-primary lines than primary lines inside of a block,
+-- then all elements in the block should have blank lines between them.
+struct Foo:
+ 0 [+4] UInt a
+ 4 [+4] UInt b
+ -- some doc
+ -- some doc
+ -- some doc
+ -- some doc
+ 8 [+4] UInt c
diff --git a/testdata/format/lines_spaced_out_with_noise_lines.emb.formatted b/testdata/format/lines_spaced_out_with_noise_lines.emb.formatted
new file mode 100644
index 0000000..c681e96
--- /dev/null
+++ b/testdata/format/lines_spaced_out_with_noise_lines.emb.formatted
@@ -0,0 +1,28 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- If there are more non-primary lines than primary lines inside of a block,
+-- then all elements in the block should have blank lines between them.
+
+
+struct Foo:
+ 0 [+4] UInt a
+
+ 4 [+4] UInt b
+ -- some doc
+ -- some doc
+ -- some doc
+ -- some doc
+
+ 8 [+4] UInt c
diff --git a/testdata/format/lines_spaced_out_with_noise_lines.emb.formatted_indent_4 b/testdata/format/lines_spaced_out_with_noise_lines.emb.formatted_indent_4
new file mode 100644
index 0000000..764e080
--- /dev/null
+++ b/testdata/format/lines_spaced_out_with_noise_lines.emb.formatted_indent_4
@@ -0,0 +1,28 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- If there are more non-primary lines than primary lines inside of a block,
+-- then all elements in the block should have blank lines between them.
+
+
+struct Foo:
+ 0 [+4] UInt a
+
+ 4 [+4] UInt b
+ -- some doc
+ -- some doc
+ -- some doc
+ -- some doc
+
+ 8 [+4] UInt c
diff --git a/testdata/format/logical_expressions.emb b/testdata/format/logical_expressions.emb
new file mode 100644
index 0000000..a56ca03
--- /dev/null
+++ b/testdata/format/logical_expressions.emb
@@ -0,0 +1,26 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Equality expressions should have one space around operators.
+struct Foo:
+ if true&&false:
+ 0 [+4] UInt a
+ if false || true:
+ 4 [+4] UInt b
+ if true||false ||true:
+ 0 [+4] UInt a
+ if false&&true &&false:
+ 0 [+4] UInt a
+ if (false&&true) ||false:
+ 0 [+4] UInt a
diff --git a/testdata/format/logical_expressions.emb.formatted b/testdata/format/logical_expressions.emb.formatted
new file mode 100644
index 0000000..1a32e4c
--- /dev/null
+++ b/testdata/format/logical_expressions.emb.formatted
@@ -0,0 +1,32 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Equality expressions should have one space around operators.
+
+
+struct Foo:
+ if true && false:
+ 0 [+4] UInt a
+
+ if false || true:
+ 4 [+4] UInt b
+
+ if true || false || true:
+ 0 [+4] UInt a
+
+ if false && true && false:
+ 0 [+4] UInt a
+
+ if (false && true) || false:
+ 0 [+4] UInt a
diff --git a/testdata/format/logical_expressions.emb.formatted_indent_4 b/testdata/format/logical_expressions.emb.formatted_indent_4
new file mode 100644
index 0000000..5ecec9d
--- /dev/null
+++ b/testdata/format/logical_expressions.emb.formatted_indent_4
@@ -0,0 +1,32 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Equality expressions should have one space around operators.
+
+
+struct Foo:
+ if true && false:
+ 0 [+4] UInt a
+
+ if false || true:
+ 4 [+4] UInt b
+
+ if true || false || true:
+ 0 [+4] UInt a
+
+ if false && true && false:
+ 0 [+4] UInt a
+
+ if (false && true) || false:
+ 0 [+4] UInt a
diff --git a/testdata/format/multiline_ifs.emb b/testdata/format/multiline_ifs.emb
new file mode 100644
index 0000000..f80b031
--- /dev/null
+++ b/testdata/format/multiline_ifs.emb
@@ -0,0 +1,19 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Multiline ifs should have their bodies properly indented.
+struct Foo:
+ if 0==0:
+ 0 [+4] UInt a
+ 4 [+4] UInt b
diff --git a/testdata/format/multiline_ifs.emb.formatted b/testdata/format/multiline_ifs.emb.formatted
new file mode 100644
index 0000000..a362875
--- /dev/null
+++ b/testdata/format/multiline_ifs.emb.formatted
@@ -0,0 +1,21 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Multiline ifs should have their bodies properly indented.
+
+
+struct Foo:
+ if 0 == 0:
+ 0 [+4] UInt a
+ 4 [+4] UInt b
diff --git a/testdata/format/multiline_ifs.emb.formatted_indent_4 b/testdata/format/multiline_ifs.emb.formatted_indent_4
new file mode 100644
index 0000000..d022ab5
--- /dev/null
+++ b/testdata/format/multiline_ifs.emb.formatted_indent_4
@@ -0,0 +1,21 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Multiline ifs should have their bodies properly indented.
+
+
+struct Foo:
+ if 0 == 0:
+ 0 [+4] UInt a
+ 4 [+4] UInt b
diff --git a/testdata/format/multiple_header_sections.emb b/testdata/format/multiple_header_sections.emb
new file mode 100644
index 0000000..da09bd0
--- /dev/null
+++ b/testdata/format/multiple_header_sections.emb
@@ -0,0 +1,18 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# .Emb with multiple header sections. First, a comment.
+-- Some documentation.
+import "foo" as bar # An import
+[$default byte_order: "LittleEndian"] # An attribute.
diff --git a/testdata/format/multiple_header_sections.emb.formatted b/testdata/format/multiple_header_sections.emb.formatted
new file mode 100644
index 0000000..cf3f77f
--- /dev/null
+++ b/testdata/format/multiple_header_sections.emb.formatted
@@ -0,0 +1,21 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# .Emb with multiple header sections. First, a comment.
+
+-- Some documentation.
+
+import "foo" as bar # An import
+
+[$default byte_order: "LittleEndian"] # An attribute.
diff --git a/testdata/format/multiple_header_sections.emb.formatted_indent_4 b/testdata/format/multiple_header_sections.emb.formatted_indent_4
new file mode 100644
index 0000000..cf3f77f
--- /dev/null
+++ b/testdata/format/multiple_header_sections.emb.formatted_indent_4
@@ -0,0 +1,21 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# .Emb with multiple header sections. First, a comment.
+
+-- Some documentation.
+
+import "foo" as bar # An import
+
+[$default byte_order: "LittleEndian"] # An attribute.
diff --git a/testdata/format/nested_types_are_columnized_independently.emb b/testdata/format/nested_types_are_columnized_independently.emb
new file mode 100644
index 0000000..34bec1f
--- /dev/null
+++ b/testdata/format/nested_types_are_columnized_independently.emb
@@ -0,0 +1,24 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Nested types are columnized independently of each other and of the
+-- surrounding type.
+struct Foo:
+ struct Bar:
+ 0 [+4] UInt very_very_long
+ very_very_long [+4] UInt v
+ struct Baz:
+ 0 [+4] UInt:32 long
+ 4 [+long] UInt:8[long] data
+ 0 [+4] UInt field
diff --git a/testdata/format/nested_types_are_columnized_independently.emb.formatted b/testdata/format/nested_types_are_columnized_independently.emb.formatted
new file mode 100644
index 0000000..8d0095c
--- /dev/null
+++ b/testdata/format/nested_types_are_columnized_independently.emb.formatted
@@ -0,0 +1,28 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Nested types are columnized independently of each other and of the
+-- surrounding type.
+
+
+struct Foo:
+ struct Bar:
+ 0 [+4] UInt very_very_long
+ very_very_long [+4] UInt v
+
+ struct Baz:
+ 0 [+4] UInt:32 long
+ 4 [+long] UInt:8[long] data
+
+ 0 [+4] UInt field
diff --git a/testdata/format/nested_types_are_columnized_independently.emb.formatted_indent_4 b/testdata/format/nested_types_are_columnized_independently.emb.formatted_indent_4
new file mode 100644
index 0000000..9f08b57
--- /dev/null
+++ b/testdata/format/nested_types_are_columnized_independently.emb.formatted_indent_4
@@ -0,0 +1,28 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Nested types are columnized independently of each other and of the
+-- surrounding type.
+
+
+struct Foo:
+ struct Bar:
+ 0 [+4] UInt very_very_long
+ very_very_long [+4] UInt v
+
+ struct Baz:
+ 0 [+4] UInt:32 long
+ 4 [+long] UInt:8[long] data
+
+ 0 [+4] UInt field
diff --git a/testdata/format/one_type.emb b/testdata/format/one_type.emb
new file mode 100644
index 0000000..8243050
--- /dev/null
+++ b/testdata/format/one_type.emb
@@ -0,0 +1,17 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+struct Foo:
+ -- A single type with no header lines.
+ 0 [+1] UInt bar
diff --git a/testdata/format/one_type.emb.formatted b/testdata/format/one_type.emb.formatted
new file mode 100644
index 0000000..385ef62
--- /dev/null
+++ b/testdata/format/one_type.emb.formatted
@@ -0,0 +1,18 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+struct Foo:
+ -- A single type with no header lines.
+ 0 [+1] UInt bar
diff --git a/testdata/format/one_type.emb.formatted_indent_4 b/testdata/format/one_type.emb.formatted_indent_4
new file mode 100644
index 0000000..cfec541
--- /dev/null
+++ b/testdata/format/one_type.emb.formatted_indent_4
@@ -0,0 +1,18 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+struct Foo:
+ -- A single type with no header lines.
+ 0 [+1] UInt bar
diff --git a/testdata/format/parameterized_struct.emb b/testdata/format/parameterized_struct.emb
new file mode 100644
index 0000000..b6e8ae4
--- /dev/null
+++ b/testdata/format/parameterized_struct.emb
@@ -0,0 +1,29 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Test for parameterized structures: definition and use.
+
+struct TwoParameters(a:UInt:8,b:UInt:32):
+ 0 [+1] UInt a_field
+
+bits OneParameter(a:UInt:8):
+ 0 [+1] UInt a_field
+
+struct NoParameters():
+ 0 [+1] UInt a_field
+
+struct UsingParameters:
+ 0 [+1] TwoParameters(0, 1+10) two
+ 1 [+1] OneParameter(0) one
+ 2 [+5] NoParameters()[5] zero
diff --git a/testdata/format/parameterized_struct.emb.formatted b/testdata/format/parameterized_struct.emb.formatted
new file mode 100644
index 0000000..7f04456
--- /dev/null
+++ b/testdata/format/parameterized_struct.emb.formatted
@@ -0,0 +1,33 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Test for parameterized structures: definition and use.
+
+
+struct TwoParameters(a: UInt:8, b: UInt:32):
+ 0 [+1] UInt a_field
+
+
+bits OneParameter(a: UInt:8):
+ 0 [+1] UInt a_field
+
+
+struct NoParameters():
+ 0 [+1] UInt a_field
+
+
+struct UsingParameters:
+ 0 [+1] TwoParameters(0, 1+10) two
+ 1 [+1] OneParameter(0) one
+ 2 [+5] NoParameters()[5] zero
diff --git a/testdata/format/parameterized_struct.emb.formatted_indent_4 b/testdata/format/parameterized_struct.emb.formatted_indent_4
new file mode 100644
index 0000000..09ef2a9
--- /dev/null
+++ b/testdata/format/parameterized_struct.emb.formatted_indent_4
@@ -0,0 +1,33 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Test for parameterized structures: definition and use.
+
+
+struct TwoParameters(a: UInt:8, b: UInt:32):
+ 0 [+1] UInt a_field
+
+
+bits OneParameter(a: UInt:8):
+ 0 [+1] UInt a_field
+
+
+struct NoParameters():
+ 0 [+1] UInt a_field
+
+
+struct UsingParameters:
+ 0 [+1] TwoParameters(0, 1+10) two
+ 1 [+1] OneParameter(0) one
+ 2 [+5] NoParameters()[5] zero
diff --git a/testdata/format/sanity_check.emb b/testdata/format/sanity_check.emb
new file mode 100644
index 0000000..47c20c3
--- /dev/null
+++ b/testdata/format/sanity_check.emb
@@ -0,0 +1,15 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Minimal .emb.
diff --git a/testdata/format/sanity_check.emb.formatted b/testdata/format/sanity_check.emb.formatted
new file mode 100644
index 0000000..47c20c3
--- /dev/null
+++ b/testdata/format/sanity_check.emb.formatted
@@ -0,0 +1,15 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Minimal .emb.
diff --git a/testdata/format/sanity_check.emb.formatted_indent_4 b/testdata/format/sanity_check.emb.formatted_indent_4
new file mode 100644
index 0000000..47c20c3
--- /dev/null
+++ b/testdata/format/sanity_check.emb.formatted_indent_4
@@ -0,0 +1,15 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Minimal .emb.
diff --git a/testdata/format/spacing_between_types.emb b/testdata/format/spacing_between_types.emb
new file mode 100644
index 0000000..4451c50
--- /dev/null
+++ b/testdata/format/spacing_between_types.emb
@@ -0,0 +1,21 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Top-level types get two blank lines between them.
+
+struct Foo:
+ 0 [+1] UInt a
+
+struct Bar:
+ 0 [+1] UInt b
diff --git a/testdata/format/spacing_between_types.emb.formatted b/testdata/format/spacing_between_types.emb.formatted
new file mode 100644
index 0000000..63e15e2
--- /dev/null
+++ b/testdata/format/spacing_between_types.emb.formatted
@@ -0,0 +1,23 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Top-level types get two blank lines between them.
+
+
+struct Foo:
+ 0 [+1] UInt a
+
+
+struct Bar:
+ 0 [+1] UInt b
diff --git a/testdata/format/spacing_between_types.emb.formatted_indent_4 b/testdata/format/spacing_between_types.emb.formatted_indent_4
new file mode 100644
index 0000000..0fb50c7
--- /dev/null
+++ b/testdata/format/spacing_between_types.emb.formatted_indent_4
@@ -0,0 +1,23 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Top-level types get two blank lines between them.
+
+
+struct Foo:
+ 0 [+1] UInt a
+
+
+struct Bar:
+ 0 [+1] UInt b
diff --git a/testdata/format/trailing_spaces.emb b/testdata/format/trailing_spaces.emb
new file mode 100644
index 0000000..1cdb3a8
--- /dev/null
+++ b/testdata/format/trailing_spaces.emb
@@ -0,0 +1,20 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+
+ # Leading and trailing whitespace should be removed, even from comments
+-- ... and documentation
+
+import "x" as y
diff --git a/testdata/format/trailing_spaces.emb.formatted b/testdata/format/trailing_spaces.emb.formatted
new file mode 100644
index 0000000..f94ea7b
--- /dev/null
+++ b/testdata/format/trailing_spaces.emb.formatted
@@ -0,0 +1,21 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+
+# Leading and trailing whitespace should be removed, even from comments
+
+-- ... and documentation
+
+import "x" as y
diff --git a/testdata/format/trailing_spaces.emb.formatted_indent_4 b/testdata/format/trailing_spaces.emb.formatted_indent_4
new file mode 100644
index 0000000..f94ea7b
--- /dev/null
+++ b/testdata/format/trailing_spaces.emb.formatted_indent_4
@@ -0,0 +1,21 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+
+# Leading and trailing whitespace should be removed, even from comments
+
+-- ... and documentation
+
+import "x" as y
diff --git a/testdata/format/virtual_fields.emb b/testdata/format/virtual_fields.emb
new file mode 100644
index 0000000..0ac32ba
--- /dev/null
+++ b/testdata/format/virtual_fields.emb
@@ -0,0 +1,40 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Formatting of virtual fields, mixed in with nonvirtual fields.
+
+enum Values:
+ FOO = 1
+ BAR = 2
+
+struct Foo:
+ let important_constant = Values.FOO# comm
+ 0 [+1]UInt len # comment
+ # comment comment comment
+ 1[+2] LongTypeName long_value_name # comment?
+ let s = len - 2
+ if($size_in_bytes )> 10 :
+ 2[+s] SubMessage submessage
+ let truth=Bar . top
+ if Bar .$max_size_in_bits < 2:
+ Bar. $min_size_in_bits [+1] Int x
+
+bits Bar:
+ let top = true
+ 0 [+1]Flag allowed
+ if $size_in_bits>10:
+ let s = 100
+ 2[+s] SubMessage submessage
+ if Foo .$max_size_in_bytes < 2:
+ Foo. $min_size_in_bytes [+1] Int x
diff --git a/testdata/format/virtual_fields.emb.formatted b/testdata/format/virtual_fields.emb.formatted
new file mode 100644
index 0000000..16ed4c4
--- /dev/null
+++ b/testdata/format/virtual_fields.emb.formatted
@@ -0,0 +1,45 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Formatting of virtual fields, mixed in with nonvirtual fields.
+
+
+enum Values:
+ FOO = 1
+ BAR = 2
+
+
+struct Foo:
+ let important_constant = Values.FOO # comm
+ 0 [+1] UInt len # comment
+ # comment comment comment
+ 1 [+2] LongTypeName long_value_name # comment?
+ let s = len-2
+ if ($size_in_bytes) > 10:
+ 2 [+s] SubMessage submessage
+
+ let truth = Bar.top
+ if Bar.$max_size_in_bits < 2:
+ Bar.$min_size_in_bits [+1] Int x
+
+
+bits Bar:
+ let top = true
+ 0 [+1] Flag allowed
+ if $size_in_bits > 10:
+ let s = 100
+ 2 [+s] SubMessage submessage
+
+ if Foo.$max_size_in_bytes < 2:
+ Foo.$min_size_in_bytes [+1] Int x
diff --git a/testdata/format/virtual_fields.emb.formatted_indent_4 b/testdata/format/virtual_fields.emb.formatted_indent_4
new file mode 100644
index 0000000..f2ca36c
--- /dev/null
+++ b/testdata/format/virtual_fields.emb.formatted_indent_4
@@ -0,0 +1,45 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Formatting of virtual fields, mixed in with nonvirtual fields.
+
+
+enum Values:
+ FOO = 1
+ BAR = 2
+
+
+struct Foo:
+ let important_constant = Values.FOO # comm
+ 0 [+1] UInt len # comment
+ # comment comment comment
+ 1 [+2] LongTypeName long_value_name # comment?
+ let s = len-2
+ if ($size_in_bytes) > 10:
+ 2 [+s] SubMessage submessage
+
+ let truth = Bar.top
+ if Bar.$max_size_in_bits < 2:
+ Bar.$min_size_in_bits [+1] Int x
+
+
+bits Bar:
+ let top = true
+ 0 [+1] Flag allowed
+ if $size_in_bits > 10:
+ let s = 100
+ 2 [+s] SubMessage submessage
+
+ if Foo.$max_size_in_bytes < 2:
+ Foo.$min_size_in_bytes [+1] Int x
diff --git a/testdata/golden/README.md b/testdata/golden/README.md
new file mode 100644
index 0000000..6f89c8d
--- /dev/null
+++ b/testdata/golden/README.md
@@ -0,0 +1,47 @@
+This directory contains an `.emb` and a set of golden files which correspond to
+various parsing stages of the `.emb`. The primary purpose is to highlight
+changes to the parse tree, tokenization, or (uncooked) intermediate
+representation in a code review, where the before/after can be seen in
+side-by-side diffs. The golden files *are* checked by unit tests, but test
+failures generally just mean that the files need to be regenerated, not that
+there is an actual bug.
+
+
+## `span_se_log_file_status.emb`
+
+The `.emb` file from which the other files are derived.
+
+
+## `span_se_log_file_status.tokens.txt`
+
+The tokenization. This file should change very rarely. From the workspace root
+directory, it can be generated with:
+
+ bazel run //front_end:emboss_front_end \
+ -- --no-debug-show-header-lines --debug-show-tokenization \
+ $(pwd)/testdata/golden/span_se_log_file_status.emb \
+ > $(pwd)/testdata/golden/span_se_log_file_status.tokens.txt
+
+
+## `span_se_log_file_status.parse_tree.txt`
+
+The syntactic parse tree. From the workspace root directory, it can be
+generated with:
+
+ bazel run //front_end:emboss_front_end \
+ -- --no-debug-show-header-lines --debug-show-parse-tree \
+ $(pwd)/testdata/golden/span_se_log_file_status.emb \
+ > $(pwd)/testdata/golden/span_se_log_file_status.parse_tree.txt
+
+
+## `span_se_log_file_status.ir.txt`
+
+The "uncooked" module-level IR: that is, the IR of *only*
+`span_se_log_file_status.emb` (without the prelude or any imports), straight out
+of `module_ir.py` with no "middle end" transformations. From the workspace root
+directory, it can be generated with:
+
+ bazel run //front_end:emboss_front_end \
+ -- --no-debug-show-header-lines --debug-show-module-ir \
+ $(pwd)/testdata/golden/span_se_log_file_status.emb \
+ > $(pwd)/testdata/golden/span_se_log_file_status.ir.txt
diff --git a/testdata/golden/__init__.py b/testdata/golden/__init__.py
new file mode 100644
index 0000000..2c31d84
--- /dev/null
+++ b/testdata/golden/__init__.py
@@ -0,0 +1,14 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
diff --git a/testdata/golden/span_se_log_file_status.emb b/testdata/golden/span_se_log_file_status.emb
new file mode 100644
index 0000000..d45165d
--- /dev/null
+++ b/testdata/golden/span_se_log_file_status.emb
@@ -0,0 +1,25 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- This is a simple, real-world example structure.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct LogFileStatus:
+ 0 [+4] UInt file_state
+ 4 [+12] UInt:8[12] file_name
+ 16 [+4] UInt file_size_kb
+ 20 [+4] UInt media
diff --git a/testdata/golden/span_se_log_file_status.ir.txt b/testdata/golden/span_se_log_file_status.ir.txt
new file mode 100644
index 0000000..ad6fcaf
--- /dev/null
+++ b/testdata/golden/span_se_log_file_status.ir.txt
@@ -0,0 +1 @@
+{"attribute":[{"is_default":true,"name":{"source_location":{"end":{"column":21,"line":17},"is_synthetic":false,"start":{"column":11,"line":17}},"text":"byte_order"},"source_location":{"end":{"column":38,"line":17},"is_synthetic":false,"start":{"column":1,"line":17}},"value":{"source_location":{"end":{"column":37,"line":17},"is_synthetic":false,"start":{"column":23,"line":17}},"string_constant":{"source_location":{"end":{"column":37,"line":17},"is_synthetic":false,"start":{"column":23,"line":17}},"text":"LittleEndian"}}},{"back_end":{"source_location":{"end":{"column":7,"line":18},"is_synthetic":false,"start":{"column":2,"line":18}},"text":"cpp"},"is_default":false,"name":{"source_location":{"end":{"column":17,"line":18},"is_synthetic":false,"start":{"column":8,"line":18}},"text":"namespace"},"source_location":{"end":{"column":1,"line":20},"is_synthetic":false,"start":{"column":1,"line":18}},"value":{"source_location":{"end":{"column":33,"line":18},"is_synthetic":false,"start":{"column":19,"line":18}},"string_constant":{"source_location":{"end":{"column":33,"line":18},"is_synthetic":false,"start":{"column":19,"line":18}},"text":"emboss::test"}}}],"documentation":[{"source_location":{"end":{"column":1,"line":16},"is_synthetic":false,"start":{"column":1,"line":15}},"text":"This is a simple, real-world example 
structure."}],"foreign_import":[{"file_name":{"source_location":{"end":{"column":1,"line":16},"is_synthetic":false,"start":{"column":1,"line":16}},"text":""},"local_name":{"source_location":{"end":{"column":1,"line":16},"is_synthetic":false,"start":{"column":1,"line":16}},"text":""},"source_location":{"end":{"column":1,"line":16},"is_synthetic":false,"start":{"column":1,"line":16}}}],"source_location":{"end":{"column":1,"line":26},"is_synthetic":false,"start":{"column":1,"line":1}},"type":[{"addressable_unit":8,"name":{"name":{"source_location":{"end":{"column":21,"line":21},"is_synthetic":false,"start":{"column":8,"line":21}},"text":"LogFileStatus"},"source_location":{"end":{"column":21,"line":21},"is_synthetic":false,"start":{"column":8,"line":21}}},"source_location":{"end":{"column":1,"line":26},"is_synthetic":false,"start":{"column":1,"line":21}},"structure":{"field":[{"existence_condition":{"boolean_constant":{"source_location":{"end":{"column":35,"line":22},"is_synthetic":false,"start":{"column":3,"line":22}},"value":true},"source_location":{"end":{"column":35,"line":22},"is_synthetic":false,"start":{"column":3,"line":22}}},"location":{"size":{"constant":{"source_location":{"end":{"column":9,"line":22},"is_synthetic":false,"start":{"column":8,"line":22}},"value":"4"},"source_location":{"end":{"column":9,"line":22},"is_synthetic":false,"start":{"column":8,"line":22}}},"source_location":{"end":{"column":10,"line":22},"is_synthetic":false,"start":{"column":3,"line":22}},"start":{"constant":{"source_location":{"end":{"column":4,"line":22},"is_synthetic":false,"start":{"column":3,"line":22}},"value":"0"},"source_location":{"end":{"column":4,"line":22},"is_synthetic":false,"start":{"column":3,"line":22}}}},"name":{"name":{"source_location":{"end":{"column":35,"line":22},"is_synthetic":false,"start":{"column":25,"line":22}},"text":"file_state"},"source_location":{"end":{"column":35,"line":22},"is_synthetic":false,"start":{"column":25,"line":22}}},"source_location":{"
end":{"column":35,"line":22},"start":{"column":3,"line":22}},"type":{"atomic_type":{"reference":{"source_location":{"end":{"column":17,"line":22},"is_synthetic":false,"start":{"column":13,"line":22}},"source_name":[{"source_location":{"end":{"column":17,"line":22},"is_synthetic":false,"start":{"column":13,"line":22}},"text":"UInt"}]},"source_location":{"end":{"column":17,"line":22},"is_synthetic":false,"start":{"column":13,"line":22}}},"source_location":{"end":{"column":17,"line":22},"is_synthetic":false,"start":{"column":13,"line":22}}}},{"existence_condition":{"boolean_constant":{"source_location":{"end":{"column":34,"line":23},"is_synthetic":false,"start":{"column":3,"line":23}},"value":true},"source_location":{"end":{"column":34,"line":23},"is_synthetic":false,"start":{"column":3,"line":23}}},"location":{"size":{"constant":{"source_location":{"end":{"column":10,"line":23},"is_synthetic":false,"start":{"column":8,"line":23}},"value":"12"},"source_location":{"end":{"column":10,"line":23},"is_synthetic":false,"start":{"column":8,"line":23}}},"source_location":{"end":{"column":11,"line":23},"is_synthetic":false,"start":{"column":3,"line":23}},"start":{"constant":{"source_location":{"end":{"column":4,"line":23},"is_synthetic":false,"start":{"column":3,"line":23}},"value":"4"},"source_location":{"end":{"column":4,"line":23},"is_synthetic":false,"start":{"column":3,"line":23}}}},"name":{"name":{"source_location":{"end":{"column":34,"line":23},"is_synthetic":false,"start":{"column":25,"line":23}},"text":"file_name"},"source_location":{"end":{"column":34,"line":23},"is_synthetic":false,"start":{"column":25,"line":23}}},"source_location":{"end":{"column":34,"line":23},"start":{"column":3,"line":23}},"type":{"array_type":{"base_type":{"atomic_type":{"reference":{"source_location":{"end":{"column":17,"line":23},"is_synthetic":false,"start":{"column":13,"line":23}},"source_name":[{"source_location":{"end":{"column":17,"line":23},"is_synthetic":false,"start":{"column":13,"lin
e":23}},"text":"UInt"}]},"source_location":{"end":{"column":17,"line":23},"is_synthetic":false,"start":{"column":13,"line":23}}},"size_in_bits":{"constant":{"source_location":{"end":{"column":19,"line":23},"is_synthetic":false,"start":{"column":18,"line":23}},"value":"8"},"source_location":{"end":{"column":19,"line":23},"is_synthetic":false,"start":{"column":17,"line":23}}},"source_location":{"end":{"column":19,"line":23},"is_synthetic":false,"start":{"column":13,"line":23}}},"element_count":{"constant":{"source_location":{"end":{"column":22,"line":23},"is_synthetic":false,"start":{"column":20,"line":23}},"value":"12"},"source_location":{"end":{"column":23,"line":23},"is_synthetic":false,"start":{"column":19,"line":23}}},"source_location":{"end":{"column":23,"line":23},"is_synthetic":false,"start":{"column":13,"line":23}}},"source_location":{"end":{"column":23,"line":23},"is_synthetic":false,"start":{"column":13,"line":23}}}},{"existence_condition":{"boolean_constant":{"source_location":{"end":{"column":37,"line":24},"is_synthetic":false,"start":{"column":3,"line":24}},"value":true},"source_location":{"end":{"column":37,"line":24},"is_synthetic":false,"start":{"column":3,"line":24}}},"location":{"size":{"constant":{"source_location":{"end":{"column":9,"line":24},"is_synthetic":false,"start":{"column":8,"line":24}},"value":"4"},"source_location":{"end":{"column":9,"line":24},"is_synthetic":false,"start":{"column":8,"line":24}}},"source_location":{"end":{"column":10,"line":24},"is_synthetic":false,"start":{"column":3,"line":24}},"start":{"constant":{"source_location":{"end":{"column":5,"line":24},"is_synthetic":false,"start":{"column":3,"line":24}},"value":"16"},"source_location":{"end":{"column":5,"line":24},"is_synthetic":false,"start":{"column":3,"line":24}}}},"name":{"name":{"source_location":{"end":{"column":37,"line":24},"is_synthetic":false,"start":{"column":25,"line":24}},"text":"file_size_kb"},"source_location":{"end":{"column":37,"line":24},"is_synthetic":fa
lse,"start":{"column":25,"line":24}}},"source_location":{"end":{"column":37,"line":24},"start":{"column":3,"line":24}},"type":{"atomic_type":{"reference":{"source_location":{"end":{"column":17,"line":24},"is_synthetic":false,"start":{"column":13,"line":24}},"source_name":[{"source_location":{"end":{"column":17,"line":24},"is_synthetic":false,"start":{"column":13,"line":24}},"text":"UInt"}]},"source_location":{"end":{"column":17,"line":24},"is_synthetic":false,"start":{"column":13,"line":24}}},"source_location":{"end":{"column":17,"line":24},"is_synthetic":false,"start":{"column":13,"line":24}}}},{"existence_condition":{"boolean_constant":{"source_location":{"end":{"column":30,"line":25},"is_synthetic":false,"start":{"column":3,"line":25}},"value":true},"source_location":{"end":{"column":30,"line":25},"is_synthetic":false,"start":{"column":3,"line":25}}},"location":{"size":{"constant":{"source_location":{"end":{"column":9,"line":25},"is_synthetic":false,"start":{"column":8,"line":25}},"value":"4"},"source_location":{"end":{"column":9,"line":25},"is_synthetic":false,"start":{"column":8,"line":25}}},"source_location":{"end":{"column":10,"line":25},"is_synthetic":false,"start":{"column":3,"line":25}},"start":{"constant":{"source_location":{"end":{"column":5,"line":25},"is_synthetic":false,"start":{"column":3,"line":25}},"value":"20"},"source_location":{"end":{"column":5,"line":25},"is_synthetic":false,"start":{"column":3,"line":25}}}},"name":{"name":{"source_location":{"end":{"column":30,"line":25},"is_synthetic":false,"start":{"column":25,"line":25}},"text":"media"},"source_location":{"end":{"column":30,"line":25},"is_synthetic":false,"start":{"column":25,"line":25}}},"source_location":{"end":{"column":30,"line":25},"start":{"column":3,"line":25}},"type":{"atomic_type":{"reference":{"source_location":{"end":{"column":17,"line":25},"is_synthetic":false,"start":{"column":13,"line":25}},"source_name":[{"source_location":{"end":{"column":17,"line":25},"is_synthetic":false,
"start":{"column":13,"line":25}},"text":"UInt"}]},"source_location":{"end":{"column":17,"line":25},"is_synthetic":false,"start":{"column":13,"line":25}}},"source_location":{"end":{"column":17,"line":25},"is_synthetic":false,"start":{"column":13,"line":25}}}}],"source_location":{"end":{"column":1,"line":26},"start":{"column":1,"line":21}}}}]}
diff --git a/testdata/golden/span_se_log_file_status.parse_tree.txt b/testdata/golden/span_se_log_file_status.parse_tree.txt
new file mode 100644
index 0000000..bd15bdd
--- /dev/null
+++ b/testdata/golden/span_se_log_file_status.parse_tree.txt
@@ -0,0 +1,376 @@
+module:
+ comment-line*:
+ comment-line:
+ Comment?:
+ Comment '# Copyright 2019 Google LLC' 1:1-1:28
+ "\n" '\n' 1:1-1:28
+ comment-line*:
+ comment-line:
+ Comment?:
+ Comment '#' 2:1-2:2
+ "\n" '\n' 2:1-2:2
+ comment-line*:
+ comment-line:
+ Comment?:
+ Comment '# Licensed under the Apache License, Version 2.0 (the "License");' 3:1-3:66
+ "\n" '\n' 3:1-3:66
+ comment-line*:
+ comment-line:
+ Comment?:
+ Comment '# you may not use this file except in compliance with the License.' 4:1-4:67
+ "\n" '\n' 4:1-4:67
+ comment-line*:
+ comment-line:
+ Comment?:
+ Comment '# You may obtain a copy of the License at' 5:1-5:42
+ "\n" '\n' 5:1-5:42
+ comment-line*:
+ comment-line:
+ Comment?:
+ Comment '#' 6:1-6:2
+ "\n" '\n' 6:1-6:2
+ comment-line*:
+ comment-line:
+ Comment?:
+ Comment '# https://www.apache.org/licenses/LICENSE-2.0' 7:1-7:50
+ "\n" '\n' 7:1-7:50
+ comment-line*:
+ comment-line:
+ Comment?:
+ Comment '#' 8:1-8:2
+ "\n" '\n' 8:1-8:2
+ comment-line*:
+ comment-line:
+ Comment?:
+ Comment '# Unless required by applicable law or agreed to in writing, software' 9:1-9:70
+ "\n" '\n' 9:1-9:70
+ comment-line*:
+ comment-line:
+ Comment?:
+ Comment '# distributed under the License is distributed on an "AS IS" BASIS,' 10:1-10:68
+ "\n" '\n' 10:1-10:68
+ comment-line*:
+ comment-line:
+ Comment?:
+ Comment '# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.' 11:1-11:75
+ "\n" '\n' 11:1-11:75
+ comment-line*:
+ comment-line:
+ Comment?:
+ Comment '# See the License for the specific language governing permissions and' 12:1-12:70
+ "\n" '\n' 12:1-12:70
+ comment-line*:
+ comment-line:
+ Comment?:
+ Comment '# limitations under the License.' 13:1-13:33
+ "\n" '\n' 13:1-13:33
+ comment-line*:
+ comment-line:
+ Comment?
+ "\n" '\n' 14:1-14:1
+ comment-line*
+ doc-line*:
+ doc-line:
+ doc:
+ Documentation '-- This is a simple, real-world example structure.' 15:1-15:51
+ Comment?
+ eol:
+ "\n" '\n' 15:51-16:1
+ comment-line*:
+ comment-line:
+ Comment?
+ "\n" '\n' 16:1-16:1
+ comment-line*
+ doc-line*
+ import-line*
+ attribute-line*:
+ attribute-line:
+ attribute:
+ "[" '[' 17:1-17:2
+ attribute-context?
+ "$default"?:
+ "$default" '$default' 17:2-17:10
+ snake-word:
+ SnakeWord 'byte_order' 17:11-17:21
+ ":" ':' 17:21-17:22
+ attribute-value:
+ string-constant:
+ String '"LittleEndian"' 17:23-17:37
+ "]" ']' 17:37-17:38
+ Comment?
+ eol:
+ "\n" '\n' 17:38-17:38
+ comment-line*
+ attribute-line*:
+ attribute-line:
+ attribute:
+ "[" '[' 18:1-18:2
+ attribute-context?:
+ attribute-context:
+ "(" '(' 18:2-18:3
+ snake-word:
+ SnakeWord 'cpp' 18:3-18:6
+ ")" ')' 18:6-18:7
+ "$default"?
+ snake-word:
+ SnakeWord 'namespace' 18:8-18:17
+ ":" ':' 18:17-18:18
+ attribute-value:
+ string-constant:
+ String '"emboss::test"' 18:19-18:33
+ "]" ']' 18:33-18:34
+ Comment?
+ eol:
+ "\n" '\n' 18:34-20:1
+ comment-line*:
+ comment-line:
+ Comment?
+ "\n" '\n' 19:1-19:1
+ comment-line*:
+ comment-line:
+ Comment?
+ "\n" '\n' 20:1-20:1
+ comment-line*
+ attribute-line*
+ type-definition*:
+ type-definition:
+ struct:
+ "struct" 'struct' 21:1-21:7
+ type-name:
+ type-word:
+ CamelWord 'LogFileStatus' 21:8-21:21
+ delimited-parameter-definition-list?
+ ":" ':' 21:21-21:22
+ Comment?
+ eol:
+ "\n" '\n' 21:22-21:22
+ comment-line*
+ struct-body:
+ Indent ' ' 22:1-22:3
+ doc-line*
+ attribute-line*
+ type-definition*
+ struct-field-block:
+ unconditional-struct-field:
+ field:
+ field-location:
+ expression:
+ choice-expression:
+ logical-expression:
+ comparison-expression:
+ additive-expression:
+ times-expression:
+ negation-expression:
+ bottom-expression:
+ numeric-constant:
+ Number '0' 22:3-22:4
+ times-expression-right*
+ additive-expression-right*
+ "[" '[' 22:6-22:7
+ "+" '+' 22:7-22:8
+ expression:
+ choice-expression:
+ logical-expression:
+ comparison-expression:
+ additive-expression:
+ times-expression:
+ negation-expression:
+ bottom-expression:
+ numeric-constant:
+ Number '4' 22:8-22:9
+ times-expression-right*
+ additive-expression-right*
+ "]" ']' 22:9-22:10
+ type:
+ type-reference:
+ type-reference-tail:
+ type-word:
+ CamelWord 'UInt' 22:13-22:17
+ delimited-argument-list?
+ type-size-specifier?
+ array-length-specifier*
+ snake-name:
+ snake-word:
+ SnakeWord 'file_state' 22:25-22:35
+ abbreviation?
+ attribute*
+ doc?
+ Comment?
+ eol:
+ "\n" '\n' 22:35-22:35
+ comment-line*
+ field-body?
+ struct-field-block:
+ unconditional-struct-field:
+ field:
+ field-location:
+ expression:
+ choice-expression:
+ logical-expression:
+ comparison-expression:
+ additive-expression:
+ times-expression:
+ negation-expression:
+ bottom-expression:
+ numeric-constant:
+ Number '4' 23:3-23:4
+ times-expression-right*
+ additive-expression-right*
+ "[" '[' 23:6-23:7
+ "+" '+' 23:7-23:8
+ expression:
+ choice-expression:
+ logical-expression:
+ comparison-expression:
+ additive-expression:
+ times-expression:
+ negation-expression:
+ bottom-expression:
+ numeric-constant:
+ Number '12' 23:8-23:10
+ times-expression-right*
+ additive-expression-right*
+ "]" ']' 23:10-23:11
+ type:
+ type-reference:
+ type-reference-tail:
+ type-word:
+ CamelWord 'UInt' 23:13-23:17
+ delimited-argument-list?
+ type-size-specifier?:
+ type-size-specifier:
+ ":" ':' 23:17-23:18
+ numeric-constant:
+ Number '8' 23:18-23:19
+ array-length-specifier*:
+ array-length-specifier:
+ "[" '[' 23:19-23:20
+ expression:
+ choice-expression:
+ logical-expression:
+ comparison-expression:
+ additive-expression:
+ times-expression:
+ negation-expression:
+ bottom-expression:
+ numeric-constant:
+ Number '12' 23:20-23:22
+ times-expression-right*
+ additive-expression-right*
+ "]" ']' 23:22-23:23
+ array-length-specifier*
+ snake-name:
+ snake-word:
+ SnakeWord 'file_name' 23:25-23:34
+ abbreviation?
+ attribute*
+ doc?
+ Comment?
+ eol:
+ "\n" '\n' 23:34-23:34
+ comment-line*
+ field-body?
+ struct-field-block:
+ unconditional-struct-field:
+ field:
+ field-location:
+ expression:
+ choice-expression:
+ logical-expression:
+ comparison-expression:
+ additive-expression:
+ times-expression:
+ negation-expression:
+ bottom-expression:
+ numeric-constant:
+ Number '16' 24:3-24:5
+ times-expression-right*
+ additive-expression-right*
+ "[" '[' 24:6-24:7
+ "+" '+' 24:7-24:8
+ expression:
+ choice-expression:
+ logical-expression:
+ comparison-expression:
+ additive-expression:
+ times-expression:
+ negation-expression:
+ bottom-expression:
+ numeric-constant:
+ Number '4' 24:8-24:9
+ times-expression-right*
+ additive-expression-right*
+ "]" ']' 24:9-24:10
+ type:
+ type-reference:
+ type-reference-tail:
+ type-word:
+ CamelWord 'UInt' 24:13-24:17
+ delimited-argument-list?
+ type-size-specifier?
+ array-length-specifier*
+ snake-name:
+ snake-word:
+ SnakeWord 'file_size_kb' 24:25-24:37
+ abbreviation?
+ attribute*
+ doc?
+ Comment?
+ eol:
+ "\n" '\n' 24:37-24:37
+ comment-line*
+ field-body?
+ struct-field-block:
+ unconditional-struct-field:
+ field:
+ field-location:
+ expression:
+ choice-expression:
+ logical-expression:
+ comparison-expression:
+ additive-expression:
+ times-expression:
+ negation-expression:
+ bottom-expression:
+ numeric-constant:
+ Number '20' 25:3-25:5
+ times-expression-right*
+ additive-expression-right*
+ "[" '[' 25:6-25:7
+ "+" '+' 25:7-25:8
+ expression:
+ choice-expression:
+ logical-expression:
+ comparison-expression:
+ additive-expression:
+ times-expression:
+ negation-expression:
+ bottom-expression:
+ numeric-constant:
+ Number '4' 25:8-25:9
+ times-expression-right*
+ additive-expression-right*
+ "]" ']' 25:9-25:10
+ type:
+ type-reference:
+ type-reference-tail:
+ type-word:
+ CamelWord 'UInt' 25:13-25:17
+ delimited-argument-list?
+ type-size-specifier?
+ array-length-specifier*
+ snake-name:
+ snake-word:
+ SnakeWord 'media' 25:25-25:30
+ abbreviation?
+ attribute*
+ doc?
+ Comment?
+ eol:
+ "\n" '\n' 25:30-25:30
+ comment-line*
+ field-body?
+ struct-field-block
+ Dedent '' 26:1-26:1
+ type-definition*
+
diff --git a/testdata/golden/span_se_log_file_status.tokens.txt b/testdata/golden/span_se_log_file_status.tokens.txt
new file mode 100644
index 0000000..5474c99
--- /dev/null
+++ b/testdata/golden/span_se_log_file_status.tokens.txt
@@ -0,0 +1,91 @@
+Comment '# Copyright 2019 Google LLC' 1:1-1:28
+"\n" '\n' 1:1-1:28
+Comment '#' 2:1-2:2
+"\n" '\n' 2:1-2:2
+Comment '# Licensed under the Apache License, Version 2.0 (the "License");' 3:1-3:66
+"\n" '\n' 3:1-3:66
+Comment '# you may not use this file except in compliance with the License.' 4:1-4:67
+"\n" '\n' 4:1-4:67
+Comment '# You may obtain a copy of the License at' 5:1-5:42
+"\n" '\n' 5:1-5:42
+Comment '#' 6:1-6:2
+"\n" '\n' 6:1-6:2
+Comment '# https://www.apache.org/licenses/LICENSE-2.0' 7:1-7:50
+"\n" '\n' 7:1-7:50
+Comment '#' 8:1-8:2
+"\n" '\n' 8:1-8:2
+Comment '# Unless required by applicable law or agreed to in writing, software' 9:1-9:70
+"\n" '\n' 9:1-9:70
+Comment '# distributed under the License is distributed on an "AS IS" BASIS,' 10:1-10:68
+"\n" '\n' 10:1-10:68
+Comment '# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.' 11:1-11:75
+"\n" '\n' 11:1-11:75
+Comment '# See the License for the specific language governing permissions and' 12:1-12:70
+"\n" '\n' 12:1-12:70
+Comment '# limitations under the License.' 13:1-13:33
+"\n" '\n' 13:1-13:33
+"\n" '\n' 14:1-14:1
+Documentation '-- This is a simple, real-world example structure.' 15:1-15:51
+"\n" '\n' 15:51-16:1
+"\n" '\n' 16:1-16:1
+"[" '[' 17:1-17:2
+"$default" '$default' 17:2-17:10
+SnakeWord 'byte_order' 17:11-17:21
+":" ':' 17:21-17:22
+String '"LittleEndian"' 17:23-17:37
+"]" ']' 17:37-17:38
+"\n" '\n' 17:38-17:38
+"[" '[' 18:1-18:2
+"(" '(' 18:2-18:3
+SnakeWord 'cpp' 18:3-18:6
+")" ')' 18:6-18:7
+SnakeWord 'namespace' 18:8-18:17
+":" ':' 18:17-18:18
+String '"emboss::test"' 18:19-18:33
+"]" ']' 18:33-18:34
+"\n" '\n' 18:34-20:1
+"\n" '\n' 19:1-19:1
+"\n" '\n' 20:1-20:1
+"struct" 'struct' 21:1-21:7
+CamelWord 'LogFileStatus' 21:8-21:21
+":" ':' 21:21-21:22
+"\n" '\n' 21:22-21:22
+Indent ' ' 22:1-22:3
+Number '0' 22:3-22:4
+"[" '[' 22:6-22:7
+"+" '+' 22:7-22:8
+Number '4' 22:8-22:9
+"]" ']' 22:9-22:10
+CamelWord 'UInt' 22:13-22:17
+SnakeWord 'file_state' 22:25-22:35
+"\n" '\n' 22:35-22:35
+Number '4' 23:3-23:4
+"[" '[' 23:6-23:7
+"+" '+' 23:7-23:8
+Number '12' 23:8-23:10
+"]" ']' 23:10-23:11
+CamelWord 'UInt' 23:13-23:17
+":" ':' 23:17-23:18
+Number '8' 23:18-23:19
+"[" '[' 23:19-23:20
+Number '12' 23:20-23:22
+"]" ']' 23:22-23:23
+SnakeWord 'file_name' 23:25-23:34
+"\n" '\n' 23:34-23:34
+Number '16' 24:3-24:5
+"[" '[' 24:6-24:7
+"+" '+' 24:7-24:8
+Number '4' 24:8-24:9
+"]" ']' 24:9-24:10
+CamelWord 'UInt' 24:13-24:17
+SnakeWord 'file_size_kb' 24:25-24:37
+"\n" '\n' 24:37-24:37
+Number '20' 25:3-25:5
+"[" '[' 25:6-25:7
+"+" '+' 25:7-25:8
+Number '4' 25:8-25:9
+"]" ']' 25:9-25:10
+CamelWord 'UInt' 25:13-25:17
+SnakeWord 'media' 25:25-25:30
+"\n" '\n' 25:30-25:30
+Dedent '' 26:1-26:1
diff --git a/testdata/imported.emb b/testdata/imported.emb
new file mode 100644
index 0000000..27e9fb8
--- /dev/null
+++ b/testdata/imported.emb
@@ -0,0 +1,20 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct Inner:
+ 0 [+8] UInt value
diff --git a/testdata/importer.emb b/testdata/importer.emb
new file mode 100644
index 0000000..2d008ac
--- /dev/null
+++ b/testdata/importer.emb
@@ -0,0 +1,32 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Test .emb to ensure that the import system works.
+#
+# The file imported_genfiles.emb is identical to imported.emb except for the
+# [(cpp) namespace] attribute; it is used to ensure that generated .embs can be
+# used by the emboss_cc_library build rule.
+
+# These imports intentionally use names that do not match the file names, as a
+# test that the file names aren't being used.
+
+import "testdata/imported.emb" as imp
+import "testdata/imported_genfiles.emb" as imp_gen
+
+[(cpp) namespace: "emboss::test"]
+
+
+struct Outer:
+ 0 [+8] imp.Inner inner
+ 8 [+8] imp_gen.Inner inner_gen
diff --git a/testdata/inline_type.emb b/testdata/inline_type.emb
new file mode 100644
index 0000000..94e83db
--- /dev/null
+++ b/testdata/inline_type.emb
@@ -0,0 +1,28 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Test definitions for inline types.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct Foo:
+ 0 [+1] enum status:
+ OK = 0
+ FAILURE = 12
+
+ 1 [+1] enum secondary_status:
+ OK = 12
+ FAILURE = 0
diff --git a/testdata/int_sizes.emb b/testdata/int_sizes.emb
new file mode 100644
index 0000000..ec974c2
--- /dev/null
+++ b/testdata/int_sizes.emb
@@ -0,0 +1,29 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Test struct for 8, 16, 32, and 64 bit Ints.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct Sizes:
+ 0 [+1] Int one_byte
+ 1 [+2] Int two_byte
+ 3 [+3] Int three_byte
+ 6 [+4] Int four_byte
+ 10 [+5] Int five_byte
+ 15 [+6] Int six_byte
+ 21 [+7] Int seven_byte
+ 28 [+8] Int eight_byte
diff --git a/testdata/large_array.emb b/testdata/large_array.emb
new file mode 100644
index 0000000..49570e3
--- /dev/null
+++ b/testdata/large_array.emb
@@ -0,0 +1,24 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Structure used for performance testing: Emboss-mediated access vs.
+-- reinterpret_cast<>-style access.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct UIntArray:
+ 0 [+4] UInt element_count (e)
+ 4 [+4*e] UInt:32[e] elements
diff --git a/testdata/nested_structure.emb b/testdata/nested_structure.emb
new file mode 100644
index 0000000..ac68514
--- /dev/null
+++ b/testdata/nested_structure.emb
@@ -0,0 +1,32 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct Container:
+ 0 [+4] UInt weight
+ 4 [+8] Box important_box
+ 12 [+8] Box other_box
+
+
+struct Box:
+ 0 [+4] UInt id
+ 4 [+4] UInt count
+
+
+struct Truck:
+ 0 [+4] UInt id
+ 4 [+40] Container[2] cargo
diff --git a/testdata/no_cpp_namespace.emb b/testdata/no_cpp_namespace.emb
new file mode 100644
index 0000000..67a01f7
--- /dev/null
+++ b/testdata/no_cpp_namespace.emb
@@ -0,0 +1,21 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Test .emb to ensure that the generated type ends up in the
+-- ::emboss_generated_code namespace when the [(cpp) namespace] attribute is not
+-- set.
+
+
+enum Foo:
+ VALUE = 10
diff --git a/testdata/parameters.emb b/testdata/parameters.emb
new file mode 100644
index 0000000..3e891ee
--- /dev/null
+++ b/testdata/parameters.emb
@@ -0,0 +1,92 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss_test"]
+
+enum Product:
+ VERSION_1 = 0
+ VERSION_2 = 10
+ VERSION_X = 23
+
+enum MessageId:
+ AXIS = 0
+ CONFIG = 1
+
+struct Multiversion(product: Product):
+ 0 [+1] MessageId message_id
+ if message_id == MessageId.AXIS:
+ 1 [+12] Axes(product == Product.VERSION_X ? 3 : 2) axes
+ if message_id == MessageId.CONFIG:
+ 1 [+4] Config config
+ if product == Product.VERSION_X && message_id == MessageId.CONFIG:
+ 1 [+8] ConfigVX() config_vx
+
+struct Axes(axes: UInt:4):
+ 0 [+axes * 4] Axis(AxisType.GENERIC)[] values
+ if axes > 0:
+ 0 [+4] Axis(AxisType.X_AXIS) x
+ if axes > 1:
+ 4 [+4] Axis(AxisType.Y_AXIS) y
+ if axes > 2:
+ 8 [+4] Axis(AxisType.Z_AXIS) z
+ let axis_count_plus_one = axes + 1
+
+struct AxesEnvelope:
+ 0 [+1] UInt:8 axis_count
+ 1 [+axis_count*4] Axes(axis_count) axes
+
+enum AxisType:
+ GENERIC = -1
+ X_AXIS = 1
+ Y_AXIS = 2
+ Z_AXIS = 3
+
+struct Axis(axis_type_parameter: AxisType):
+ let axis_type = axis_type_parameter
+ 0 [+4] UInt:32 value
+ if axis_type == AxisType.X_AXIS:
+ 0 [+4] UInt:32 x
+ if axis_type == AxisType.Y_AXIS:
+ 0 [+4] UInt:32 y
+ if axis_type == AxisType.Z_AXIS:
+ 0 [+4] UInt:32 z
+
+bits Config():
+ 31 [+1] Flag power
+
+struct ConfigVX:
+ 0 [+4] bits:
+ 31 [+1] Flag power
+ 4 [+4] UInt gain
+
+
+struct StructWithUnusedParameter(x: UInt:8):
+ 0 [+1] UInt y
+
+# StructContainingStructWithUnusedParameter is used to ensure that a struct is
+# not Ok() if it does not have its parameters, even if it does not directly use
+# those parameters.
+struct StructContainingStructWithUnusedParameter:
+ 0 [+1] StructWithUnusedParameter(x) swup
+ 1 [+1] UInt x
+
+struct BiasedValue(bias: UInt:8):
+ 0 [+1] UInt raw_value
+ let value = raw_value + bias
+
+struct SizedArrayOfBiasedValues:
+ 0 [+1] UInt element_count (ec)
+ 1 [+1] UInt bias
+ 2 [+ec] BiasedValue(bias)[] values
diff --git a/testdata/requires.emb b/testdata/requires.emb
new file mode 100644
index 0000000..b37f61a
--- /dev/null
+++ b/testdata/requires.emb
@@ -0,0 +1,87 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct RequiresIntegers:
+ [requires: zero_through_nine <= ten_through_twenty - 10]
+
+ 0 [+1] UInt zero_through_nine [requires: 0 <= this <= 9]
+
+ 1 [+1] Int ten_through_twenty [requires: 10 <= this <= 20]
+
+ 2 [+1] UInt disjoint
+ [requires: 0 <= this <= 5 || 15 <= this <= 20]
+
+ let ztn_plus_ttt = zero_through_nine + ten_through_twenty
+ [requires: 10 <= this <= 19]
+
+ let alias_of_zero_through_nine = zero_through_nine
+ [requires: 2 <= this <= 7]
+
+ let zero_through_nine_plus_five = zero_through_nine + 5
+ [requires: 5 <= this <= 10]
+
+
+struct RequiresBools:
+ [requires: a || b]
+
+ 0 [+1] bits:
+ 0 [+1] Flag a
+ 1 [+1] Flag b
+ 2 [+1] Flag must_be_true
+ [requires: this]
+ 3 [+1] Flag must_be_false
+ [requires: this == false]
+
+ let b_must_be_false = b == false
+ [requires: this]
+
+ let alias_of_a_must_be_true = a
+ [requires: this]
+
+
+struct RequiresEnums:
+ [requires: a == Enum.EN0 || b == Enum.EN0]
+
+ enum Enum:
+ EN0 = 0
+ EN1 = 1
+ EN2 = 2
+ EN3 = 3
+
+ 0 [+1] Enum a
+ 1 [+1] Enum b
+ 2 [+1] Enum c
+ [requires: this == Enum.EN0 || this == Enum.EN1]
+
+ let filtered_a = a == Enum.EN0 ? Enum.EN1 : a
+ [requires: this == Enum.EN1]
+
+ let alias_of_a = a
+ [requires: this == Enum.EN1]
+
+
+struct RequiresWithOptionalFields:
+ [requires: a || b]
+ 0 [+1] bits:
+ 0 [+1] Flag a
+ 1 [+1] Flag b_exists
+ if b_exists:
+ 2 [+1] Flag b
+ if b_exists:
+ 2 [+1] Flag b_true
+ [requires: this]
diff --git a/testdata/start_size_range.emb b/testdata/start_size_range.emb
new file mode 100644
index 0000000..19aa053
--- /dev/null
+++ b/testdata/start_size_range.emb
@@ -0,0 +1,23 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct StartSize:
+ 0 [+1] UInt size (s)
+ 1 [+2] UInt start_size_constants
+ 3 [+s] UInt:8[s] payload
+ 3+s [+4] UInt counter
diff --git a/testdata/subtypes.emb b/testdata/subtypes.emb
new file mode 100644
index 0000000..7bcafc1
--- /dev/null
+++ b/testdata/subtypes.emb
@@ -0,0 +1,65 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Test cases for types containing types.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct Out:
+ struct In:
+ struct InIn:
+ enum InInIn:
+ NO = 0
+ YES = 1
+
+ let outer_offset = 24
+
+ 0 [+1] InInIn field_enum
+ # In2 should be Out.In2, despite In2 appearing in an enclosing scope and
+ # later in the source file.
+
+ 1 [+1] In2 in_2
+
+ 0 [+2] InIn in_in_1
+
+ 2 [+2] InIn in_in_2
+
+ 4 [+1] InIn.InInIn in_in_in_1
+
+ 5 [+1] In2 in_2
+
+ 6 [+1] UInt name_collision
+ # name_collision should resolve to Out.In.name_collision, not
+ # Out.name_collision, and there should be no error about ambiguous
+ # resolution. (Note that since field references are actually used at
+ # runtime, and there isn't necessarily any enclosing Out object for an
+ # Out.In at runtime, it does not make sense for a field name to resolve to
+ # a field in an outer struct.)
+ # TODO(bolms): Add a warning for this case, since it is somewhat subtle.
+
+ name_collision [+1] UInt name_collision_check
+
+ struct In2:
+ 0 [+1] UInt field_byte
+
+ 0 [+8] In in_1
+ 8 [+8] In in_2
+ 16 [+2] In.InIn in_in_1
+ 18 [+2] In.InIn in_in_2
+ 20 [+1] In.InIn.InInIn in_in_in_1
+ 21 [+1] In.InIn.InInIn in_in_in_2
+ 22 [+2] UInt name_collision
+ In.InIn.outer_offset [+1] UInt nested_constant_check
diff --git a/testdata/text_format.emb b/testdata/text_format.emb
new file mode 100644
index 0000000..7458170
--- /dev/null
+++ b/testdata/text_format.emb
@@ -0,0 +1,36 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Structures used specifically to test text format input and output.
+
+[(cpp) namespace: "emboss::test"]
+
+
+struct Vanilla:
+ 0 [+1] UInt a
+ 1 [+1] UInt b
+
+
+struct StructWithSkippedFields:
+ 0 [+1] UInt a
+ 1 [+1] UInt b
+ [text_output: "Skip"]
+ 2 [+1] UInt c
+
+
+struct StructWithSkippedStructureFields:
+ 0 [+2] Vanilla a
+ 2 [+2] Vanilla b
+ [text_output: "Skip"]
+ 4 [+2] Vanilla c
diff --git a/testdata/uint_sizes.emb b/testdata/uint_sizes.emb
new file mode 100644
index 0000000..953af81
--- /dev/null
+++ b/testdata/uint_sizes.emb
@@ -0,0 +1,86 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Test structs for 8, 16, 32, and 64 bit UInts and enums.
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct Sizes:
+ 0 [+1] UInt one_byte
+ 1 [+2] UInt two_byte
+ 3 [+3] UInt three_byte
+ 6 [+4] UInt four_byte
+ 10 [+5] UInt five_byte
+ 15 [+6] UInt six_byte
+ 21 [+7] UInt seven_byte
+ 28 [+8] UInt eight_byte
+
+
+struct BigEndianSizes:
+ [$default byte_order: "BigEndian"]
+ 0 [+1] UInt one_byte
+ 1 [+2] UInt two_byte
+ 3 [+3] UInt three_byte
+ 6 [+4] UInt four_byte
+ 10 [+5] UInt five_byte
+ 15 [+6] UInt six_byte
+ 21 [+7] UInt seven_byte
+ 28 [+8] UInt eight_byte
+
+
+struct AlternatingEndianSizes:
+ [$default byte_order: "BigEndian"]
+ 0 [+1] UInt one_byte [byte_order: "BigEndian"]
+ 1 [+2] UInt two_byte [byte_order: "LittleEndian"]
+ 3 [+3] UInt three_byte # default to "BigEndian"
+ 6 [+4] UInt four_byte [byte_order: "LittleEndian"]
+ 10 [+5] UInt five_byte # default to "BigEndian"
+ 15 [+6] UInt six_byte [byte_order: "LittleEndian"]
+ 21 [+7] UInt seven_byte [byte_order: "BigEndian"]
+ 28 [+8] UInt eight_byte [byte_order: "LittleEndian"]
+
+
+struct EnumSizes:
+ 0 [+1] Enum one_byte
+ 1 [+2] Enum two_byte
+ 3 [+3] Enum three_byte
+ 6 [+4] Enum four_byte
+ 10 [+5] Enum five_byte
+ 15 [+6] Enum six_byte
+ 21 [+7] Enum seven_byte
+ 28 [+8] Enum eight_byte
+
+
+enum Enum:
+ VALUE1 = 1
+ VALUE10 = 10
+ VALUE100 = 100
+ VALUE1000 = 1000
+ VALUE10000 = 10000
+ VALUE100000 = 100000
+ VALUE1000000 = 1000000
+ VALUE10000000 = 10000000
+
+
+struct ArraySizes:
+ 0 [+2] UInt:8[2] one_byte
+ 2 [+4] UInt:16[2] two_byte
+ 6 [+6] UInt:24[2] three_byte
+ 12 [+8] UInt:32[2] four_byte
+ 20 [+10] UInt:40[2] five_byte
+ 30 [+12] UInt:48[2] six_byte
+ 42 [+14] UInt:56[2] seven_byte
+ 56 [+16] UInt:64[2] eight_byte
diff --git a/testdata/virtual_field.emb b/testdata/virtual_field.emb
new file mode 100644
index 0000000..d9af205
--- /dev/null
+++ b/testdata/virtual_field.emb
@@ -0,0 +1,153 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+-- Tests for virtual fields:
+--
+-- * `let` constructs
+-- * TODO(bolms@): `transform` annotations
+-- * TODO(bolms@): `read` and `write` annotations
+
+[$default byte_order: "LittleEndian"]
+[(cpp) namespace: "emboss::test"]
+
+
+struct StructureWithConstants:
+ let ten = 10
+ let twenty = 20
+ let four_billion = 4_000_000_000
+ let ten_billion = 10_000_000_000
+ let minus_ten_billion = -10_000_000_000
+ 0 [+4] UInt value
+ let alias_of_value = value
+ let alias_of_alias_of_value = alias_of_value
+ let alias_of_ten = ten
+ let alias_of_alias_of_ten = alias_of_ten
+
+
+struct StructureWithComputedValues:
+ 0 [+4] UInt value
+ let doubled = value * 2
+ let plus_ten = value + 10
+ let signed_doubled = value2 * 2
+ let signed_plus_ten = value2 + 10
+ let product = value * value2
+ 4 [+4] Int value2
+
+
+struct StructureWithConditionalValue:
+ 0 [+4] UInt x
+ if x < 0x8000_0000:
+ let two_x = x * 2
+ let x_plus_one = x + 1
+
+
+struct StructureWithValueInCondition:
+ let two_x = x * 2
+ 0 [+4] UInt x
+ if two_x < 100:
+ 4 [+4] UInt if_two_x_lt_100
+
+
+struct StructureWithValuesInLocation:
+ let two_x = x * 2
+ 0 [+4] UInt x
+ two_x [+4] UInt offset_two_x
+ 4 [+two_x] UInt:32 size_two_x
+
+
+struct StructureWithBoolValue:
+ let x_is_ten = x == 10
+ 0 [+4] UInt x
+
+
+struct StructureWithEnumValue:
+ enum Category:
+ SMALL = 1
+ LARGE = 2
+ let x_size = x < 100 ? Category.SMALL : Category.LARGE
+ 0 [+4] UInt x
+
+
+struct StructureWithBitsWithValue:
+ 0 [+4] BitsWithValue b
+ let alias_of_b_sum = b.sum
+ let alias_of_b_a = b.a
+
+
+bits BitsWithValue:
+ 0 [+16] UInt a
+ 16 [+16] UInt b
+ let sum = a + b
+
+
+struct StructureUsingForeignConstants:
+ StructureWithConstants.ten [+4] UInt x
+ let one_hundred = StructureWithConstants.twenty * 5
+
+
+struct SubfieldOfAlias:
+ 0 [+4] struct header:
+ 0 [+2] UInt size
+ 2 [+2] UInt message_id
+ let h = header
+ let size = h.size
+
+
+struct RestrictedAlias:
+ 0 [+4] BitsWithValue a_b
+ 4 [+1] UInt alias_switch
+ if alias_switch > 10:
+ let a_b_alias = a_b
+
+
+struct HasField:
+ 0 [+1] UInt z
+ if $present(x.y):
+ let y = x.y
+ if z > 10:
+ 1 [+2] struct x:
+ 0 [+1] UInt v
+ if v > 10:
+ 1 [+1] UInt y
+ if $present(x):
+ let x_has_y = $present(x.y)
+
+
+struct VirtualUnconditionallyUsesConditional:
+ 0 [+1] UInt x
+ if x == 0:
+ 1 [+1] UInt xc
+
+ let x_nor_xc = x == 0 && xc == 0
+
+
+struct UsesSize:
+ 0 [+1] bits r:
+ 0 [+8] UInt q
+ let q_plus_bit_size = q + $size_in_bits
+ let r_q_plus_byte_size = r.q + $size_in_bytes
+
+
+struct UsesExternalSize:
+ 0 [+4] StructureWithConstants x
+ x.$size_in_bytes [+StructureWithConstants.$size_in_bytes] StructureWithConstants y
+
+
+struct ImplicitWriteBack:
+ 0 [+1] UInt x
+ let x_plus_ten = x + 10
+ let ten_plus_x = 10 + x
+ let x_minus_ten = x - 10
+ let ten_minus_x = 10 - x
+ let ten_minus_x_plus_ten = (10 - x) + 10
diff --git a/util/BUILD b/util/BUILD
new file mode 100644
index 0000000..eec0cd4
--- /dev/null
+++ b/util/BUILD
@@ -0,0 +1,133 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Shared utilities for Emboss back ends.
+
+package(
+ default_visibility = ["//:__subpackages__"],
+)
+
+py_library(
+ name = "expression_parser",
+ srcs = ["expression_parser.py"],
+ deps = [
+ "//front_end:module_ir",
+ "//front_end:parser",
+ "//front_end:tokenizer",
+ ],
+)
+
+py_library(
+ name = "ir_util",
+ srcs = ["ir_util.py"],
+ deps = ["//public:ir_pb2"],
+)
+
+py_test(
+ name = "ir_util_test",
+ srcs = ["ir_util_test.py"],
+ python_version = "PY3",
+ deps = [
+ ":expression_parser",
+ ":ir_util",
+ "//public:ir_pb2",
+ ],
+)
+
+py_library(
+ name = "simple_memoizer",
+ srcs = ["simple_memoizer.py"],
+ deps = [],
+)
+
+py_test(
+ name = "simple_memoizer_test",
+ srcs = ["simple_memoizer_test.py"],
+ python_version = "PY3",
+ deps = [
+ ":simple_memoizer",
+ ],
+)
+
+py_library(
+ name = "traverse_ir",
+ srcs = ["traverse_ir.py"],
+ deps = [
+ ":simple_memoizer",
+ "//public:ir_pb2",
+ ],
+)
+
+py_test(
+ name = "traverse_ir_test",
+ srcs = ["traverse_ir_test.py"],
+ python_version = "PY3",
+ deps = [
+ ":traverse_ir",
+ "//public:ir_pb2",
+ ],
+)
+
+py_library(
+ name = "parser_types",
+ srcs = ["parser_types.py"],
+ deps = [
+ "//public:ir_pb2",
+ ],
+)
+
+py_test(
+ name = "parser_types_test",
+ srcs = ["parser_types_test.py"],
+ python_version = "PY3",
+ deps = [
+ ":parser_types",
+ "//public:ir_pb2",
+ ],
+)
+
+py_library(
+ name = "error",
+ srcs = [
+ "error.py",
+ ],
+ deps = [
+ ":parser_types",
+ ],
+)
+
+py_test(
+ name = "error_test",
+ srcs = ["error_test.py"],
+ python_version = "PY3",
+ deps = [
+ ":error",
+ ":parser_types",
+ ],
+)
+
+py_library(
+ name = "name_conversion",
+ srcs = ["name_conversion.py"],
+ deps = [],
+)
+
+py_test(
+ name = "name_conversion_test",
+ srcs = ["name_conversion_test.py"],
+ python_version = "PY3",
+ deps = [
+ ":name_conversion",
+ ],
+)
diff --git a/util/error.py b/util/error.py
new file mode 100644
index 0000000..5a49e70
--- /dev/null
+++ b/util/error.py
@@ -0,0 +1,244 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Error and warning message support for Emboss.
+
+This module exports the error, warn, and note functions, which return a _Message
+representing the error, warning, or note, respectively. The format method of
+the returned object can be used to render the message with source code snippets.
+
+Throughout Emboss, messages are passed around as lists of lists of _Messages.
+Each inner list represents a group of messages which should either all be
+printed, or not printed; i.e., an error message and associated informational
+messages. For example, to indicate both a duplicate definition error and a
+warning that a field is a reserved word, one might return:
+
+ return [
+ [
+ error.error(file_name, location, "Duplicate definition"),
+ error.note(original_file_name, original_location,
+ "Original definition"),
+ ],
+ [
+ error.warn(file_name, location, "Field name is a C reserved word.")
+ ],
+ ]
+"""
+
+from util import parser_types
+
+# Error levels; represented by the strings that will be included in messages.
+ERROR = "error"
+WARNING = "warning"
+NOTE = "note"
+
+# Colors; represented by the terminal escape sequences used to switch to them.
+# These work out-of-the-box on Unix derivatives (Linux, *BSD, Mac OS X), and
+# work on Windows using colorify.
+BLACK = "\033[0;30m"
+RED = "\033[0;31m"
+GREEN = "\033[0;32m"
+YELLOW = "\033[0;33m"
+BLUE = "\033[0;34m"
+MAGENTA = "\033[0;35m"
+CYAN = "\033[0;36m"
+WHITE = "\033[0;37m"
+BRIGHT_BLACK = "\033[0;1;30m"
+BRIGHT_RED = "\033[0;1;31m"
+BRIGHT_GREEN = "\033[0;1;32m"
+BRIGHT_YELLOW = "\033[0;1;33m"
+BRIGHT_BLUE = "\033[0;1;34m"
+BRIGHT_MAGENTA = "\033[0;1;35m"
+BRIGHT_CYAN = "\033[0;1;36m"
+BRIGHT_WHITE = "\033[0;1;37m"
+BOLD = "\033[0;1m"
+RESET = "\033[0m"
+
+
+def error(source_file, location, message):
+ """Returns an object representing an error message."""
+ return _Message(source_file, location, ERROR, message)
+
+
+def warn(source_file, location, message):
+ """Returns an object representing a warning."""
+ return _Message(source_file, location, WARNING, message)
+
+
+def note(source_file, location, message):
+ """Returns an object representing an informational note."""
+ return _Message(source_file, location, NOTE, message)
+
+
+class _Message(object):
+ """_Message holds a human-readable message."""
+ __slots__ = ("location", "source_file", "severity", "message")
+
+ def __init__(self, source_file, location, severity, message):
+ self.location = location
+ self.source_file = source_file
+ self.severity = severity
+ self.message = message
+
+ def format(self, source_code):
+ """Formats the _Message for display.
+
+ Arguments:
+ source_code: A dict of file names to source texts. This is used to
+ render source snippets.
+
+ Returns:
+ A list of tuples.
+
+ The first element of each tuple is an escape sequence used to put a Unix
+ terminal into a particular color mode. For use in non-Unix-terminal
+ output, the string will match one of the color names exported by this
+ module.
+
+ The second element is a string containing text to show to the user.
+
+ The text will not end with a newline character, nor will it include a
+ RESET color element.
+
+ To show non-colorized output, simply write the second element of each
+ tuple, then a newline at the end.
+
+ To show colorized output, write both the first and second element of each
+ tuple, then a newline at the end. Before exiting to the operating system,
+ a RESET sequence should be emitted.
+ """
+ # TODO(bolms): Figure out how to get Vim, Emacs, etc. to parse Emboss error
+ # messages.
+ severity_colors = {
+ ERROR: (BRIGHT_RED, BOLD),
+ WARNING: (BRIGHT_MAGENTA, BOLD),
+ NOTE: (BRIGHT_BLACK, WHITE)
+ }
+
+ result = []
+ if self.location.is_synthetic:
+ pos = "[compiler bug]"
+ else:
+ pos = parser_types.format_position(self.location.start)
+ source_name = self.source_file or "[prelude]"
+ if not self.location.is_synthetic and self.source_file in source_code:
+ source_lines = source_code[self.source_file].splitlines()
+ source_line = source_lines[self.location.start.line - 1]
+ else:
+ source_line = ""
+ lines = self.message.splitlines()
+ for i in range(len(lines)):
+ line = lines[i]
+ # This is a little awkward, but we want to suppress the final newline in
+ # the message. This newline is final if and only if it is the last line
+ # of the message and there is no source snippet.
+ if i != len(lines) - 1 or source_line:
+ line += "\n"
+ result.append((BOLD, "{}:{}: ".format(source_name, pos)))
+ if i == 0:
+ severity = self.severity
+ else:
+ severity = NOTE
+ result.append((severity_colors[severity][0], "{}: ".format(severity)))
+ result.append((severity_colors[severity][1], line))
+ if source_line:
+ result.append((WHITE, source_line + "\n"))
+ indicator_indent = " " * (self.location.start.column - 1)
+ if self.location.start.line == self.location.end.line:
+ indicator_caret = "^" * max(
+ 1, self.location.end.column - self.location.start.column)
+ else:
+ indicator_caret = "^"
+ result.append((BRIGHT_GREEN, indicator_indent + indicator_caret))
+ return result
+
+ def __repr__(self):
+ return ("Message({source_file!r}, make_location(({start_line!r}, "
+ "{start_column!r}), ({end_line!r}, {end_column!r}), "
+ "{is_synthetic!r}), {severity!r}, {message!r})").format(
+ source_file=self.source_file,
+ start_line=self.location.start.line,
+ start_column=self.location.start.column,
+ end_line=self.location.end.line,
+ end_column=self.location.end.column,
+ is_synthetic=self.location.is_synthetic,
+ severity=self.severity,
+ message=self.message)
+
+ def __eq__(self, other):
+ return (
+ self.__class__ == other.__class__ and self.location == other.location
+ and self.source_file == other.source_file and
+ self.severity == other.severity and self.message == other.message)
+
+ def __ne__(self, other):
+ return not self == other
+
+
+def split_errors(errors):
+ """Splits errors into (user_errors, synthetic_errors).
+
+ Arguments:
+ errors: A list of lists of _Message, which is a list of bundles of
+ associated messages.
+
+ Returns:
+ (user_errors, synthetic_errors), where both user_errors and
+ synthetic_errors are lists of lists of _Message. synthetic_errors will
+ contain all bundles that reference any synthetic source_location, and
+ user_errors will contain the rest.
+
+ The intent is that user_errors can be shown to end users, while
+ synthetic_errors should generally be suppressed.
+ """
+ synthetic_errors = []
+ user_errors = []
+ for error_block in errors:
+ if any(message.location.is_synthetic for message in error_block):
+ synthetic_errors.append(error_block)
+ else:
+ user_errors.append(error_block)
+ return user_errors, synthetic_errors
+
+
+def filter_errors(errors):
+ """Returns the non-synthetic errors from `errors`."""
+ return split_errors(errors)[0]
+
+
+def format_errors(errors, source_codes, use_color=False):
+ """Formats error messages with source code snippets."""
+ result = []
+ for error_group in errors:
+ assert error_group, "Found empty error_group!"
+ for message in error_group:
+ if use_color:
+ result.append("".join(e[0] + e[1] + RESET
+ for e in message.format(source_codes)))
+ else:
+ result.append("".join(e[1] for e in message.format(source_codes)))
+ return "\n".join(result)
+
+
+def make_error_from_parse_error(file_name, parse_error):
+ return [error(file_name,
+ parse_error.token.source_location,
+ "{code}\n"
+ "Found {text!r} ({symbol}), expected {expected}.".format(
+ code=parse_error.code or "Syntax error",
+ text=parse_error.token.text,
+ symbol=parse_error.token.symbol,
+ expected=", ".join(parse_error.expected_tokens)))]
+
+
diff --git a/util/error_test.py b/util/error_test.py
new file mode 100644
index 0000000..ee36419
--- /dev/null
+++ b/util/error_test.py
@@ -0,0 +1,345 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for util.error."""
+
+import unittest
+
+from util import error
+from util import parser_types
+
+
+class MessageTest(unittest.TestCase):
+ """Tests for _Message, as returned by error, warn, and note."""
+
+ def test_error(self):
+ error_message = error.error("foo.emb", parser_types.make_location(
+ (3, 4), (3, 6)), "Bad thing")
+ self.assertEqual("foo.emb", error_message.source_file)
+ self.assertEqual(error.ERROR, error_message.severity)
+ self.assertEqual(parser_types.make_location((3, 4), (3, 6)),
+ error_message.location)
+ self.assertEqual("Bad thing", error_message.message)
+ sourceless_format = error_message.format({})
+ sourced_format = error_message.format({"foo.emb": "\n\nabcdefghijklm"})
+ self.assertEqual("foo.emb:3:4: error: Bad thing",
+ "".join([x[1] for x in sourceless_format]))
+ self.assertEqual([(error.BOLD, "foo.emb:3:4: "), # Location
+ (error.BRIGHT_RED, "error: "), # Severity
+ (error.BOLD, "Bad thing"), # Message
+ ], sourceless_format)
+ self.assertEqual("foo.emb:3:4: error: Bad thing\n"
+ "abcdefghijklm\n"
+ " ^^", "".join([x[1] for x in sourced_format]))
+ self.assertEqual([(error.BOLD, "foo.emb:3:4: "), # Location
+ (error.BRIGHT_RED, "error: "), # Severity
+ (error.BOLD, "Bad thing\n"), # Message
+ (error.WHITE, "abcdefghijklm\n"), # Source snippet
+ (error.BRIGHT_GREEN, " ^^"), # Error column indicator
+ ], sourced_format)
+
+ def test_synthetic_error(self):
+ error_message = error.error("foo.emb", parser_types.make_location(
+ (3, 4), (3, 6), True), "Bad thing")
+ sourceless_format = error_message.format({})
+ sourced_format = error_message.format({"foo.emb": "\n\nabcdefghijklm"})
+ self.assertEqual("foo.emb:[compiler bug]: error: Bad thing",
+ "".join([x[1] for x in sourceless_format]))
+ self.assertEqual([
+ (error.BOLD, "foo.emb:[compiler bug]: "), # Location
+ (error.BRIGHT_RED, "error: "), # Severity
+ (error.BOLD, "Bad thing"), # Message
+ ], sourceless_format)
+ self.assertEqual("foo.emb:[compiler bug]: error: Bad thing",
+ "".join([x[1] for x in sourced_format]))
+ self.assertEqual([
+ (error.BOLD, "foo.emb:[compiler bug]: "), # Location
+ (error.BRIGHT_RED, "error: "), # Severity
+ (error.BOLD, "Bad thing"), # Message
+ ], sourced_format)
+
+ def test_prelude_as_file_name(self):
+ error_message = error.error("", parser_types.make_location(
+ (3, 4), (3, 6)), "Bad thing")
+ self.assertEqual("", error_message.source_file)
+ self.assertEqual(error.ERROR, error_message.severity)
+ self.assertEqual(parser_types.make_location((3, 4), (3, 6)),
+ error_message.location)
+ self.assertEqual("Bad thing", error_message.message)
+ sourceless_format = error_message.format({})
+ sourced_format = error_message.format({"": "\n\nabcdefghijklm"})
+ self.assertEqual("[prelude]:3:4: error: Bad thing",
+ "".join([x[1] for x in sourceless_format]))
+ self.assertEqual([(error.BOLD, "[prelude]:3:4: "), # Location
+ (error.BRIGHT_RED, "error: "), # Severity
+ (error.BOLD, "Bad thing"), # Message
+ ], sourceless_format)
+ self.assertEqual("[prelude]:3:4: error: Bad thing\n"
+ "abcdefghijklm\n"
+ " ^^", "".join([x[1] for x in sourced_format]))
+ self.assertEqual([(error.BOLD, "[prelude]:3:4: "), # Location
+ (error.BRIGHT_RED, "error: "), # Severity
+ (error.BOLD, "Bad thing\n"), # Message
+ (error.WHITE, "abcdefghijklm\n"), # Source snippet
+ (error.BRIGHT_GREEN, " ^^"), # Error column indicator
+ ], sourced_format)
+
+ def test_multiline_error_source(self):
+ error_message = error.error("foo.emb", parser_types.make_location(
+ (3, 4), (4, 6)), "Bad thing")
+ self.assertEqual("foo.emb", error_message.source_file)
+ self.assertEqual(error.ERROR, error_message.severity)
+ self.assertEqual(parser_types.make_location((3, 4), (4, 6)),
+ error_message.location)
+ self.assertEqual("Bad thing", error_message.message)
+ sourceless_format = error_message.format({})
+ sourced_format = error_message.format(
+ {"foo.emb": "\n\nabcdefghijklm\nnopqrstuv"})
+ self.assertEqual("foo.emb:3:4: error: Bad thing",
+ "".join([x[1] for x in sourceless_format]))
+ self.assertEqual([(error.BOLD, "foo.emb:3:4: "), # Location
+ (error.BRIGHT_RED, "error: "), # Severity
+ (error.BOLD, "Bad thing"), # Message
+ ], sourceless_format)
+ self.assertEqual("foo.emb:3:4: error: Bad thing\n"
+ "abcdefghijklm\n"
+ " ^", "".join([x[1] for x in sourced_format]))
+ self.assertEqual([(error.BOLD, "foo.emb:3:4: "), # Location
+ (error.BRIGHT_RED, "error: "), # Severity
+ (error.BOLD, "Bad thing\n"), # Message
+ (error.WHITE, "abcdefghijklm\n"), # Source snippet
+ (error.BRIGHT_GREEN, " ^"), # Error column indicator
+ ], sourced_format)
+
+ def test_multiline_error(self):
+ error_message = error.error("foo.emb", parser_types.make_location(
+ (3, 4), (3, 6)), "Bad thing\nSome explanation\nMore explanation")
+ self.assertEqual("foo.emb", error_message.source_file)
+ self.assertEqual(error.ERROR, error_message.severity)
+ self.assertEqual(parser_types.make_location((3, 4), (3, 6)),
+ error_message.location)
+ self.assertEqual("Bad thing\nSome explanation\nMore explanation",
+ error_message.message)
+ sourceless_format = error_message.format({})
+ sourced_format = error_message.format(
+ {"foo.emb": "\n\nabcdefghijklm\nnopqrstuv"})
+ self.assertEqual("foo.emb:3:4: error: Bad thing\n"
+ "foo.emb:3:4: note: Some explanation\n"
+ "foo.emb:3:4: note: More explanation",
+ "".join([x[1] for x in sourceless_format]))
+ self.assertEqual([(error.BOLD, "foo.emb:3:4: "), # Location
+ (error.BRIGHT_RED, "error: "), # Severity
+ (error.BOLD, "Bad thing\n"), # Message
+ (error.BOLD, "foo.emb:3:4: "), # Location, line 2
+ (error.BRIGHT_BLACK, "note: "), # "Note" severity, line 2
+ (error.WHITE, "Some explanation\n"), # Message, line 2
+ (error.BOLD, "foo.emb:3:4: "), # Location, line 3
+ (error.BRIGHT_BLACK, "note: "), # "Note" severity, line 3
+ (error.WHITE, "More explanation"), # Message, line 3
+ ], sourceless_format)
+ self.assertEqual("foo.emb:3:4: error: Bad thing\n"
+ "foo.emb:3:4: note: Some explanation\n"
+ "foo.emb:3:4: note: More explanation\n"
+ "abcdefghijklm\n"
+ " ^^", "".join([x[1] for x in sourced_format]))
+ self.assertEqual([(error.BOLD, "foo.emb:3:4: "), # Location
+ (error.BRIGHT_RED, "error: "), # Severity
+ (error.BOLD, "Bad thing\n"), # Message
+ (error.BOLD, "foo.emb:3:4: "), # Location, line 2
+ (error.BRIGHT_BLACK, "note: "), # "Note" severity, line 2
+ (error.WHITE, "Some explanation\n"), # Message, line 2
+ (error.BOLD, "foo.emb:3:4: "), # Location, line 3
+ (error.BRIGHT_BLACK, "note: "), # "Note" severity, line 3
+ (error.WHITE, "More explanation\n"), # Message, line 3
+ (error.WHITE, "abcdefghijklm\n"), # Source snippet
+ (error.BRIGHT_GREEN, " ^^"), # Column indicator
+ ], sourced_format)
+
+ def test_warn(self):
+ warning_message = error.warn("foo.emb", parser_types.make_location(
+ (3, 4), (3, 6)), "Not good thing")
+ self.assertEqual("foo.emb", warning_message.source_file)
+ self.assertEqual(error.WARNING, warning_message.severity)
+ self.assertEqual(parser_types.make_location((3, 4), (3, 6)),
+ warning_message.location)
+ self.assertEqual("Not good thing", warning_message.message)
+ sourced_format = warning_message.format({"foo.emb": "\n\nabcdefghijklm"})
+ self.assertEqual("foo.emb:3:4: warning: Not good thing\n"
+ "abcdefghijklm\n"
+ " ^^", "".join([x[1] for x in sourced_format]))
+ self.assertEqual([(error.BOLD, "foo.emb:3:4: "), # Location
+ (error.BRIGHT_MAGENTA, "warning: "), # Severity
+ (error.BOLD, "Not good thing\n"), # Message
+ (error.WHITE, "abcdefghijklm\n"), # Source snippet
+ (error.BRIGHT_GREEN, " ^^"), # Column indicator
+ ], sourced_format)
+
+ def test_note(self):
+ note_message = error.note("foo.emb", parser_types.make_location(
+ (3, 4), (3, 6)), "OK thing")
+ self.assertEqual("foo.emb", note_message.source_file)
+ self.assertEqual(error.NOTE, note_message.severity)
+ self.assertEqual(parser_types.make_location((3, 4), (3, 6)),
+ note_message.location)
+ self.assertEqual("OK thing", note_message.message)
+ sourced_format = note_message.format({"foo.emb": "\n\nabcdefghijklm"})
+ self.assertEqual("foo.emb:3:4: note: OK thing\n"
+ "abcdefghijklm\n"
+ " ^^", "".join([x[1] for x in sourced_format]))
+ self.assertEqual([(error.BOLD, "foo.emb:3:4: "), # Location
+ (error.BRIGHT_BLACK, "note: "), # Severity
+ (error.WHITE, "OK thing\n"), # Message
+ (error.WHITE, "abcdefghijklm\n"), # Source snippet
+ (error.BRIGHT_GREEN, " ^^"), # Column indicator
+ ], sourced_format)
+
+ def test_equality(self):
+ note_message = error.note("foo.emb", parser_types.make_location(
+ (3, 4), (3, 6)), "thing")
+ self.assertEqual(note_message,
+ error.note("foo.emb", parser_types.make_location(
+ (3, 4), (3, 6)), "thing"))
+ self.assertNotEqual(note_message,
+ error.warn("foo.emb", parser_types.make_location(
+ (3, 4), (3, 6)), "thing"))
+ self.assertNotEqual(note_message,
+ error.note("foo2.emb", parser_types.make_location(
+ (3, 4), (3, 6)), "thing"))
+ self.assertNotEqual(note_message,
+ error.note("foo.emb", parser_types.make_location(
+ (2, 4), (3, 6)), "thing"))
+ self.assertNotEqual(note_message,
+ error.note("foo.emb", parser_types.make_location(
+ (3, 4), (3, 6)), "thing2"))
+
+
+class StringTest(unittest.TestCase):
+ """Tests for strings."""
+
+ # These strings are a fixed part of the API.
+
+ def test_color_strings(self):
+ self.assertEqual("\033[0;30m", error.BLACK)
+ self.assertEqual("\033[0;31m", error.RED)
+ self.assertEqual("\033[0;32m", error.GREEN)
+ self.assertEqual("\033[0;33m", error.YELLOW)
+ self.assertEqual("\033[0;34m", error.BLUE)
+ self.assertEqual("\033[0;35m", error.MAGENTA)
+ self.assertEqual("\033[0;36m", error.CYAN)
+ self.assertEqual("\033[0;37m", error.WHITE)
+ self.assertEqual("\033[0;1;30m", error.BRIGHT_BLACK)
+ self.assertEqual("\033[0;1;31m", error.BRIGHT_RED)
+ self.assertEqual("\033[0;1;32m", error.BRIGHT_GREEN)
+ self.assertEqual("\033[0;1;33m", error.BRIGHT_YELLOW)
+ self.assertEqual("\033[0;1;34m", error.BRIGHT_BLUE)
+ self.assertEqual("\033[0;1;35m", error.BRIGHT_MAGENTA)
+ self.assertEqual("\033[0;1;36m", error.BRIGHT_CYAN)
+ self.assertEqual("\033[0;1;37m", error.BRIGHT_WHITE)
+ self.assertEqual("\033[0;1m", error.BOLD)
+ self.assertEqual("\033[0m", error.RESET)
+
+ def test_error_strings(self):
+ self.assertEqual("error", error.ERROR)
+ self.assertEqual("warning", error.WARNING)
+ self.assertEqual("note", error.NOTE)
+
+
+class SplitErrorsTest(unittest.TestCase):
+
+ def test_split_errors(self):
+ user_error = [
+ error.error("foo.emb", parser_types.make_location((1, 2), (3, 4)),
+ "Bad thing"),
+ error.note("foo.emb", parser_types.make_location((3, 4), (5, 6)),
+ "Note: bad thing referent")
+ ]
+ user_error_2 = [
+ error.error("foo.emb", parser_types.make_location((8, 9), (10, 11)),
+ "Bad thing"),
+ error.note("foo.emb", parser_types.make_location((10, 11), (12, 13)),
+ "Note: bad thing referent")
+ ]
+ synthetic_error = [
+ error.error("foo.emb", parser_types.make_location((1, 2), (3, 4)),
+ "Bad thing"),
+ error.note("foo.emb", parser_types.make_location((3, 4), (5, 6), True),
+ "Note: bad thing referent")
+ ]
+ synthetic_error_2 = [
+ error.error("foo.emb",
+ parser_types.make_location((8, 9), (10, 11), True),
+ "Bad thing"),
+ error.note("foo.emb", parser_types.make_location((10, 11), (12, 13)),
+ "Note: bad thing referent")
+ ]
+ user_errors, synthetic_errors = error.split_errors(
+ [user_error, synthetic_error])
+ self.assertEqual([user_error], user_errors)
+ self.assertEqual([synthetic_error], synthetic_errors)
+ user_errors, synthetic_errors = error.split_errors(
+ [synthetic_error, user_error])
+ self.assertEqual([user_error], user_errors)
+ self.assertEqual([synthetic_error], synthetic_errors)
+ user_errors, synthetic_errors = error.split_errors(
+ [synthetic_error, user_error, synthetic_error_2, user_error_2])
+ self.assertEqual([user_error, user_error_2], user_errors)
+ self.assertEqual([synthetic_error, synthetic_error_2], synthetic_errors)
+
+ def test_filter_errors(self):
+ user_error = [
+ error.error("foo.emb", parser_types.make_location((1, 2), (3, 4)),
+ "Bad thing"),
+ error.note("foo.emb", parser_types.make_location((3, 4), (5, 6)),
+ "Note: bad thing referent")
+ ]
+ synthetic_error = [
+ error.error("foo.emb", parser_types.make_location((1, 2), (3, 4)),
+ "Bad thing"),
+ error.note("foo.emb", parser_types.make_location((3, 4), (5, 6), True),
+ "Note: bad thing referent")
+ ]
+ synthetic_error_2 = [
+ error.error("foo.emb",
+ parser_types.make_location((8, 9), (10, 11), True),
+ "Bad thing"),
+ error.note("foo.emb", parser_types.make_location((10, 11), (12, 13)),
+ "Note: bad thing referent")
+ ]
+ self.assertEqual(
+ [user_error],
+ error.filter_errors([synthetic_error, user_error, synthetic_error_2]))
+
+
+class FormatErrorsTest(unittest.TestCase):
+
+ def test_format_errors(self):
+ errors = [[error.note("foo.emb", parser_types.make_location((3, 4), (3, 6)),
+ "note")]]
+ sources = {"foo.emb": "x\ny\nz bcd\nq\n"}
+ self.assertEqual("foo.emb:3:4: note: note\n"
+ "z bcd\n"
+ " ^^", error.format_errors(errors, sources))
+ bold = error.BOLD
+ reset = error.RESET
+ white = error.WHITE
+ bright_black = error.BRIGHT_BLACK
+ bright_green = error.BRIGHT_GREEN
+ self.assertEqual(bold + "foo.emb:3:4: " + reset + bright_black + "note: " +
+ reset + white + "note\n" +
+ reset + white + "z bcd\n" +
+ reset + bright_green + " ^^" + reset,
+ error.format_errors(errors, sources, use_color=True))
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/util/expression_parser.py b/util/expression_parser.py
new file mode 100644
index 0000000..6b7cc0e
--- /dev/null
+++ b/util/expression_parser.py
@@ -0,0 +1,47 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Utility function to parse text into an ir_pb2.Expression."""
+
+from front_end import module_ir
+from front_end import parser
+from front_end import tokenizer
+
+
+def parse(text):
+ """Parses text as an Expression.
+
+ This parses text using the expression subset of the Emboss grammar, and
+ returns an ir_pb2.Expression. The expression only needs to be syntactically
+ valid; it will not go through symbol resolution or type checking. This
+ function is not intended to be called on arbitrary input; it asserts that the
+ text successfully parses, but does not return errors.
+
+ Arguments:
+ text: The text of an Emboss expression, like "4 + 5" or "$max(1, a, b)".
+
+ Returns:
+ An ir_pb2.Expression corresponding to the textual form.
+
+ Raises:
+ AssertionError if text is not a well-formed Emboss expression, and
+ assertions are enabled.
+ """
+ tokens, errors = tokenizer.tokenize(text, "")
+ assert not errors, "{!r}".format(errors)
+ # tokenizer.tokenize always inserts a newline token at the end, which breaks
+ # expression parsing.
+ parse_result = parser.parse_expression(tokens[:-1])
+ assert not parse_result.error, "{!r}".format(parse_result.error)
+ return module_ir.build_ir(parse_result.parse_tree)
diff --git a/util/ir_util.py b/util/ir_util.py
new file mode 100644
index 0000000..aa06229
--- /dev/null
+++ b/util/ir_util.py
@@ -0,0 +1,390 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Utility functions for reading and manipulating Emboss IR."""
+
+import operator
+
+from public import ir_pb2
+
+
+_FIXED_SIZE_ATTRIBUTE = "fixed_size_in_bits"
+
+
+def get_attribute(attribute_list, name):
+ """Finds name in attribute_list and returns an AttributeValue or None."""
+ attribute_value = None
+ for attr in attribute_list:
+ if attr.name.text == name and not attr.is_default:
+ assert attribute_value is None, 'Duplicate attribute "{}".'.format(name)
+ attribute_value = attr.value
+ return attribute_value
+
+
+def get_boolean_attribute(attribute_list, name, default_value=None):
+ """Returns the boolean value of an attribute, if any, or default_value.
+
+ Arguments:
+ attribute_list: A list of attributes to search.
+ name: The name of the desired attribute.
+ default_value: A value to return if name is not found in attribute_list,
+ or the attribute does not have a boolean value.
+
+ Returns:
+ The boolean value of the requested attribute, or default_value if the
+ requested attribute is not found or has a non-boolean value.
+ """
+ attribute_value = get_attribute(attribute_list, name)
+ if (not attribute_value or
+ not attribute_value.expression.HasField("boolean_constant")):
+ return default_value
+ return attribute_value.expression.boolean_constant.value
+
+
+def get_integer_attribute(attribute_list, name, default_value=None):
+ """Returns the integer value of an attribute, if any, or default_value.
+
+ Arguments:
+ attribute_list: A list of attributes to search.
+ name: The name of the desired attribute.
+ default_value: A value to return if name is not found in attribute_list,
+ or the attribute does not have an integer value.
+
+ Returns:
+ The integer value of the requested attribute, or default_value if the
+ requested attribute is not found or has a non-integer value.
+ """
+ attribute_value = get_attribute(attribute_list, name)
+ if (not attribute_value or
+ attribute_value.expression.type.WhichOneof("type") != "integer" or
+ not is_constant(attribute_value.expression)):
+ return default_value
+ return constant_value(attribute_value.expression)
+
+
+def is_constant(expression, bindings=None):
+ return constant_value(expression, bindings) is not None
+
+
+def is_constant_type(expression_type):
+ """Returns True if expression_type is inhabited by a single value."""
+ return (expression_type.integer.modulus == "infinity" or
+ expression_type.boolean.HasField("value") or
+ expression_type.enumeration.HasField("value"))
+
+
+def constant_value(expression, bindings=None):
+ """Evaluates expression with the given bindings."""
+ if expression.WhichOneof("expression") == "constant":
+ return int(expression.constant.value)
+ elif expression.WhichOneof("expression") == "constant_reference":
+ # We can't look up the constant reference without the IR, but by the time
+ # constant_value is called, the actual values should have been propagated to
+ # the type information.
+ if expression.type.WhichOneof("type") == "integer":
+ assert expression.type.integer.modulus == "infinity"
+ return int(expression.type.integer.modular_value)
+ elif expression.type.WhichOneof("type") == "boolean":
+ assert expression.type.boolean.HasField("value")
+ return expression.type.boolean.value
+ elif expression.type.WhichOneof("type") == "enumeration":
+ assert expression.type.enumeration.HasField("value")
+ return int(expression.type.enumeration.value)
+ else:
+ assert False, "Unexpected expression type {}".format(
+ expression.type.WhichOneof("type"))
+ elif expression.WhichOneof("expression") == "function":
+ return _constant_value_of_function(expression.function, bindings)
+ elif expression.WhichOneof("expression") == "field_reference":
+ return None
+ elif expression.WhichOneof("expression") == "boolean_constant":
+ return expression.boolean_constant.value
+ elif expression.WhichOneof("expression") == "builtin_reference":
+ name = expression.builtin_reference.canonical_name.object_path[0]
+ if bindings and name in bindings:
+ return bindings[name]
+ else:
+ return None
+ elif expression.WhichOneof("expression") is None:
+ return None
+ else:
+ assert False, "Unexpected expression kind {}".format(
+ expression.WhichOneof("expression"))
+
+
+def _constant_value_of_function(function, bindings):
+ """Returns the constant value of evaluating `function`, or None."""
+ values = [constant_value(arg, bindings) for arg in function.args]
+ # Expressions like `$is_statically_sized && 1 <= $static_size_in_bits <= 64`
+ # should return False, not None, if `$is_statically_sized` is false, even
+ # though `$static_size_in_bits` is unknown.
+ #
+ # The easiest way to allow this is to use a three-way logic chart for each;
+ # specifically:
+ #
+ # AND: True False Unknown
+ # +--------------------------
+ # True | True False Unknown
+ # False | False False False
+ # Unknown | Unknown False Unknown
+ #
+ # OR: True False Unknown
+ # +--------------------------
+ # True | True True True
+ # False | True False Unknown
+ # Unknown | True Unknown Unknown
+ #
+ # This raises the question of just how many constant-from-nonconstant
+ # expressions Emboss should support. There are, after all, a vast number of
+ # constant expression patterns built from non-constant subexpressions, such as
+ # `0 * X` or `X == X` or `3 * X == X + X + X`. I (bolms@) am not implementing
+ # any further special cases because I do not see any practical use for them.
+ if function.function == ir_pb2.Function.AND:
+ if any(value is False for value in values):
+ return False
+ elif any(value is None for value in values):
+ return None
+ else:
+ return True
+ elif function.function == ir_pb2.Function.OR:
+ if any(value is True for value in values):
+ return True
+ elif any(value is None for value in values):
+ return None
+ else:
+ return False
+ elif function.function == ir_pb2.Function.CHOICE:
+ if values[0] is None:
+ return None
+ else:
+ return values[1] if values[0] else values[2]
+ # Other than the logical operators and choice operator, the result of any
+ # function on an unknown value is, itself, considered unknown.
+ if any(value is None for value in values):
+ return None
+ functions = {
+ ir_pb2.Function.ADDITION: operator.add,
+ ir_pb2.Function.SUBTRACTION: operator.sub,
+ ir_pb2.Function.MULTIPLICATION: operator.mul,
+ ir_pb2.Function.EQUALITY: operator.eq,
+ ir_pb2.Function.INEQUALITY: operator.ne,
+ ir_pb2.Function.LESS: operator.lt,
+ ir_pb2.Function.LESS_OR_EQUAL: operator.le,
+ ir_pb2.Function.GREATER: operator.gt,
+ ir_pb2.Function.GREATER_OR_EQUAL: operator.ge,
+ # Python's max([1, 2]) == 2; max(1, 2) == 2; max([1]) == 1; but max(1)
+ # throws a TypeError ("'int' object is not iterable").
+ ir_pb2.Function.MAXIMUM: lambda *x: max(x),
+ }
+ return functions[function.function](*values)
+
+
+def _hashable_form_of_name(name):
+ return (name.module_file,) + tuple(name.object_path)
+
+
+def hashable_form_of_reference(reference):
+ """Returns a representation of reference that can be used as a dict key.
+
+ Arguments:
+ reference: An ir_pb2.Reference or ir_pb2.NameDefinition.
+
+ Returns:
+ A tuple of the module_file and object_path.
+ """
+ return _hashable_form_of_name(reference.canonical_name)
+
+
+def hashable_form_of_field_reference(field_reference):
+ """Returns a representation of field_reference that can be used as a dict key.
+
+ Arguments:
+ field_reference: An ir_pb2.FieldReference.
+
+ Returns:
+ A tuple of tuples of the module_files and object_paths.
+ """
+ return tuple(_hashable_form_of_name(reference.canonical_name)
+ for reference in field_reference.path)
+
+
+def is_array(type_ir):
+ """Returns true if type_ir is an array type."""
+ return type_ir.HasField("array_type")
+
+
+def _find_path_in_structure_field(path, field):
+ if not path:
+ return field
+ return None
+
+
+def _find_path_in_structure(path, type_definition):
+ for field in type_definition.structure.field:
+ if field.name.name.text == path[0]:
+ return _find_path_in_structure_field(path[1:], field)
+ return None
+
+
+def _find_path_in_enumeration(path, type_definition):
+ if len(path) != 1:
+ return None
+ for value in type_definition.enumeration.value:
+ if value.name.name.text == path[0]:
+ return value
+ return None
+
+
+def _find_path_in_parameters(path, type_definition):
+ if len(path) > 1:
+ return None
+ for parameter in type_definition.runtime_parameter:
+ if parameter.name.name.text == path[0]:
+ return parameter
+ return None
+
+
+def _find_path_in_type_definition(path, type_definition):
+ """Finds the object with the given path in the given type_definition."""
+ if not path:
+ return type_definition
+ obj = _find_path_in_parameters(path, type_definition)
+ if obj:
+ return obj
+ if type_definition.HasField("structure"):
+ obj = _find_path_in_structure(path, type_definition)
+ elif type_definition.HasField("enumeration"):
+ obj = _find_path_in_enumeration(path, type_definition)
+ if obj:
+ return obj
+ else:
+ return _find_path_in_type_list(path, type_definition.subtype)
+
+
+def _find_path_in_type_list(path, type_list):
+ for type_definition in type_list:
+ if type_definition.name.name.text == path[0]:
+ return _find_path_in_type_definition(path[1:], type_definition)
+ return None
+
+
+def _find_path_in_module(path, module_ir):
+ if not path:
+ return module_ir
+ return _find_path_in_type_list(path, module_ir.type)
+
+
+def find_object_or_none(name, ir):
+ """Finds the object with the given canonical name, if it exists."""
+ if (isinstance(name, ir_pb2.Reference) or
+ isinstance(name, ir_pb2.NameDefinition)):
+ path = _hashable_form_of_name(name.canonical_name)
+ elif isinstance(name, ir_pb2.CanonicalName):
+ path = _hashable_form_of_name(name)
+ else:
+ path = name
+
+ for module in ir.module:
+ if module.source_file_name == path[0]:
+ return _find_path_in_module(path[1:], module)
+
+ return None
+
+
+def find_object(name, ir):
+ """Finds the IR of the type, field, or value with the given canonical name."""
+ result = find_object_or_none(name, ir)
+ assert result is not None, "Bad reference {}".format(name)
+ return result
+
+
+def find_parent_object(name, ir):
+ """Finds the parent object of the object with the given canonical name."""
+ if (isinstance(name, ir_pb2.Reference) or
+ isinstance(name, ir_pb2.NameDefinition)):
+ path = _hashable_form_of_name(name.canonical_name)
+ elif isinstance(name, ir_pb2.CanonicalName):
+ path = _hashable_form_of_name(name)
+ else:
+ path = name
+ return find_object(path[:-1], ir)
+
+
+def get_base_type(type_ir):
+ """Returns the base type of the given type.
+
+ Arguments:
+ type_ir: IR of a type reference.
+
+ Returns:
+ If type_ir corresponds to an atomic type (like "UInt"), returns type_ir. If
+ type_ir corresponds to an array type (like "UInt:8[12]" or "Square[8][8]"),
+ returns the type after stripping off the array types ("UInt" or "Square").
+ """
+ while type_ir.HasField("array_type"):
+ type_ir = type_ir.array_type.base_type
+ assert type_ir.HasField("atomic_type"), (
+ "Unknown kind of type {}".format(type_ir))
+ return type_ir
+
+
+def fixed_size_of_type_in_bits(type_ir, ir):
+ """Returns the fixed, known size for the given type, in bits, or None.
+
+ Arguments:
+ type_ir: The IR of a type.
+ ir: A complete IR, used to resolve references to types.
+
+ Returns:
+ The size in bits if it can be determined statically, otherwise None.
+ """
+ array_multiplier = 1
+ while type_ir.HasField("array_type"):
+ if type_ir.array_type.WhichOneof("size") == "automatic":
+ return None
+ else:
+ assert type_ir.array_type.WhichOneof("size") == "element_count", (
+ 'Expected array size to be "automatic" or "element_count".')
+ element_count = type_ir.array_type.element_count
+ if not is_constant(element_count):
+ return None
+ else:
+ array_multiplier *= constant_value(element_count)
+ assert not type_ir.HasField("size_in_bits"), (
+ "TODO(bolms): implement explicitly-sized arrays")
+ type_ir = type_ir.array_type.base_type
+ assert type_ir.HasField("atomic_type"), "Unexpected type!"
+ if type_ir.HasField("size_in_bits"):
+ size = constant_value(type_ir.size_in_bits)
+ else:
+ type_definition = find_object(type_ir.atomic_type.reference, ir)
+ size_attr = get_attribute(type_definition.attribute, _FIXED_SIZE_ATTRIBUTE)
+ if not size_attr:
+ return None
+ size = constant_value(size_attr.expression)
+ return size * array_multiplier
+
+
+def field_is_virtual(field_ir):
+ """Returns true if the field is virtual."""
+ # TODO(bolms): Should there be a more explicit indicator that a field is
+ # virtual?
+ return not field_ir.HasField("location")
+
+
+def field_is_read_only(field_ir):
+ """Returns true if the field is read-only."""
+ # For now, all virtual fields are read-only, and no non-virtual fields are
+ # read-only.
+ return field_ir.write_method.read_only
diff --git a/util/ir_util_test.py b/util/ir_util_test.py
new file mode 100644
index 0000000..7d22dde
--- /dev/null
+++ b/util/ir_util_test.py
@@ -0,0 +1,725 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for util.ir_util."""
+
+import unittest
+
+from public import ir_pb2
+from util import expression_parser
+from util import ir_util
+
+
+def _parse_expression(text):
+ return expression_parser.parse(text)
+
+
+class IrUtilTest(unittest.TestCase):
+ """Tests for the miscellaneous utility functions in ir_util.py."""
+
+ def test_is_constant_integer(self):
+ self.assertTrue(ir_util.is_constant(_parse_expression("6")))
+ expression = _parse_expression("12")
+ # The type information should be ignored for constants like this one.
+ expression.type.integer.CopyFrom(ir_pb2.IntegerType())
+ self.assertTrue(ir_util.is_constant(expression))
+
+ def test_is_constant_boolean(self):
+ self.assertTrue(ir_util.is_constant(_parse_expression("true")))
+ expression = _parse_expression("true")
+ # The type information should be ignored for constants like this one.
+ expression.type.boolean.CopyFrom(ir_pb2.BooleanType())
+ self.assertTrue(ir_util.is_constant(expression))
+
+ def test_is_constant_enum(self):
+ self.assertTrue(ir_util.is_constant(ir_pb2.Expression(
+ constant_reference=ir_pb2.Reference(),
+ type=ir_pb2.ExpressionType(enumeration=ir_pb2.EnumType(value="12")))))
+
+ def test_is_constant_integer_type(self):
+ self.assertFalse(ir_util.is_constant_type(ir_pb2.ExpressionType(
+ integer=ir_pb2.IntegerType(
+ modulus="10",
+ modular_value="5",
+ minimum_value="-5",
+ maximum_value="15"))))
+ self.assertTrue(ir_util.is_constant_type(ir_pb2.ExpressionType(
+ integer=ir_pb2.IntegerType(
+ modulus="infinity",
+ modular_value="5",
+ minimum_value="5",
+ maximum_value="5"))))
+
+ def test_is_constant_boolean_type(self):
+ self.assertFalse(ir_util.is_constant_type(ir_pb2.ExpressionType(
+ boolean=ir_pb2.BooleanType())))
+ self.assertTrue(ir_util.is_constant_type(ir_pb2.ExpressionType(
+ boolean=ir_pb2.BooleanType(value=True))))
+ self.assertTrue(ir_util.is_constant_type(ir_pb2.ExpressionType(
+ boolean=ir_pb2.BooleanType(value=False))))
+
+ def test_is_constant_enumeration_type(self):
+ self.assertFalse(ir_util.is_constant_type(ir_pb2.ExpressionType(
+ enumeration=ir_pb2.EnumType())))
+ self.assertTrue(ir_util.is_constant_type(ir_pb2.ExpressionType(
+ enumeration=ir_pb2.EnumType(value="0"))))
+
+ def test_is_constant_opaque_type(self):
+ self.assertFalse(ir_util.is_constant_type(ir_pb2.ExpressionType(
+ opaque=ir_pb2.OpaqueType())))
+
+ def test_constant_value_of_integer(self):
+ self.assertEqual(6, ir_util.constant_value(_parse_expression("6")))
+
+ def test_constant_value_of_none(self):
+ self.assertIsNone(ir_util.constant_value(ir_pb2.Expression()))
+
+ def test_constant_value_of_addition(self):
+ self.assertEqual(6, ir_util.constant_value(_parse_expression("2+4")))
+
+ def test_constant_value_of_subtraction(self):
+ self.assertEqual(-2, ir_util.constant_value(_parse_expression("2-4")))
+
+ def test_constant_value_of_multiplication(self):
+ self.assertEqual(8, ir_util.constant_value(_parse_expression("2*4")))
+
+ def test_constant_value_of_equality(self):
+ self.assertFalse(ir_util.constant_value(_parse_expression("2 == 4")))
+
+ def test_constant_value_of_inequality(self):
+ self.assertTrue(ir_util.constant_value(_parse_expression("2 != 4")))
+
+ def test_constant_value_of_less(self):
+ self.assertTrue(ir_util.constant_value(_parse_expression("2 < 4")))
+
+ def test_constant_value_of_less_or_equal(self):
+ self.assertTrue(ir_util.constant_value(_parse_expression("2 <= 4")))
+
+ def test_constant_value_of_greater(self):
+ self.assertFalse(ir_util.constant_value(_parse_expression("2 > 4")))
+
+ def test_constant_value_of_greater_or_equal(self):
+ self.assertFalse(ir_util.constant_value(_parse_expression("2 >= 4")))
+
+ def test_constant_value_of_and(self):
+ self.assertFalse(ir_util.constant_value(_parse_expression("true && false")))
+ self.assertTrue(ir_util.constant_value(_parse_expression("true && true")))
+
+ def test_constant_value_of_or(self):
+ self.assertTrue(ir_util.constant_value(_parse_expression("true || false")))
+ self.assertFalse(
+ ir_util.constant_value(_parse_expression("false || false")))
+
+ def test_constant_value_of_choice(self):
+ self.assertEqual(
+ 10, ir_util.constant_value(_parse_expression("false ? 20 : 10")))
+ self.assertEqual(
+ 20, ir_util.constant_value(_parse_expression("true ? 20 : 10")))
+
+ def test_constant_value_of_choice_with_unknown_other_branch(self):
+ self.assertEqual(
+ 10, ir_util.constant_value(_parse_expression("false ? foo : 10")))
+ self.assertEqual(
+ 20, ir_util.constant_value(_parse_expression("true ? 20 : foo")))
+
+ def test_constant_value_of_maximum(self):
+ self.assertEqual(10,
+ ir_util.constant_value(_parse_expression("$max(5, 10)")))
+ self.assertEqual(10,
+ ir_util.constant_value(_parse_expression("$max(10)")))
+ self.assertEqual(
+ 10,
+ ir_util.constant_value(_parse_expression("$max(5, 9, 7, 10, 6, -100)")))
+
+ def test_constant_value_of_boolean(self):
+ self.assertTrue(ir_util.constant_value(_parse_expression("true")))
+ self.assertFalse(ir_util.constant_value(_parse_expression("false")))
+
+ def test_constant_value_of_enum(self):
+ self.assertEqual(12, ir_util.constant_value(ir_pb2.Expression(
+ constant_reference=ir_pb2.Reference(),
+ type=ir_pb2.ExpressionType(enumeration=ir_pb2.EnumType(value="12")))))
+
+ def test_constant_value_of_integer_reference(self):
+ self.assertEqual(12, ir_util.constant_value(ir_pb2.Expression(
+ constant_reference=ir_pb2.Reference(),
+ type=ir_pb2.ExpressionType(
+ integer=ir_pb2.IntegerType(modulus="infinity",
+ modular_value="12")))))
+
+ def test_constant_value_of_boolean_reference(self):
+ self.assertTrue(ir_util.constant_value(ir_pb2.Expression(
+ constant_reference=ir_pb2.Reference(),
+ type=ir_pb2.ExpressionType(boolean=ir_pb2.BooleanType(value=True)))))
+
+ def test_constant_value_of_builtin_reference(self):
+ self.assertEqual(12, ir_util.constant_value(
+ ir_pb2.Expression(
+ builtin_reference=ir_pb2.Reference(
+ canonical_name=ir_pb2.CanonicalName(object_path=["$foo"]))),
+ {"$foo": 12}))
+
+ def test_constant_value_of_field_reference(self):
+ self.assertIsNone(ir_util.constant_value(_parse_expression("foo")))
+
+ def test_constant_value_of_missing_builtin_reference(self):
+ self.assertIsNone(ir_util.constant_value(
+ _parse_expression("$static_size_in_bits"), {"$bar": 12}))
+
+ def test_constant_value_of_present_builtin_reference(self):
+ self.assertEqual(12, ir_util.constant_value(
+ _parse_expression("$static_size_in_bits"),
+ {"$static_size_in_bits": 12}))
+
+ def test_constant_false_value_of_operator_and_with_missing_value(self):
+ self.assertIs(False, ir_util.constant_value(
+ _parse_expression("false && foo"), {"bar": 12}))
+ self.assertIs(False, ir_util.constant_value(
+ _parse_expression("foo && false"), {"bar": 12}))
+
+ def test_constant_true_value_of_operator_and_with_known_value(self):
+ self.assertTrue(ir_util.constant_value(
+ _parse_expression("true && $is_statically_sized"),
+ {"$is_statically_sized": True}))
+
+ def test_constant_none_value_of_operator_and_with_missing_value(self):
+ self.assertIsNone(ir_util.constant_value(
+ _parse_expression("true && foo"), {"bar": 12}))
+ self.assertIsNone(ir_util.constant_value(
+ _parse_expression("foo && true"), {"bar": 12}))
+
+ def test_constant_true_value_of_operator_or_with_missing_value(self):
+ self.assertTrue(ir_util.constant_value(
+ _parse_expression("true || foo"), {"bar": 12}))
+ self.assertTrue(ir_util.constant_value(
+ _parse_expression("foo || true"), {"bar": 12}))
+
+ def test_constant_none_value_of_operator_or_with_missing_value(self):
+ self.assertIsNone(ir_util.constant_value(
+ _parse_expression("foo || false"), {"bar": 12}))
+ self.assertIsNone(ir_util.constant_value(
+ _parse_expression("false || foo"), {"bar": 12}))
+
+ def test_constant_value_of_operator_plus_with_missing_value(self):
+ self.assertIsNone(ir_util.constant_value(
+ _parse_expression("12 + foo"), {"bar": 12}))
+
+ def test_is_array(self):
+ self.assertTrue(
+ ir_util.is_array(ir_pb2.Type(array_type=ir_pb2.ArrayType())))
+ self.assertFalse(
+ ir_util.is_array(ir_pb2.Type(atomic_type=ir_pb2.AtomicType())))
+
+ def test_get_attribute(self):
+ type_def = ir_pb2.TypeDefinition(attribute=[
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=ir_pb2.Expression()),
+ name=ir_pb2.Word(text="phil")),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("false")),
+ name=ir_pb2.Word(text="bob"),
+ is_default=True),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("true")),
+ name=ir_pb2.Word(text="bob")),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("false")),
+ name=ir_pb2.Word(text="bob2")),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("true")),
+ name=ir_pb2.Word(text="bob2"),
+ is_default=True),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("false")),
+ name=ir_pb2.Word(text="bob3"),
+ is_default=True),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("false")),
+ name=ir_pb2.Word()),
+ ])
+ self.assertEqual(
+ ir_pb2.AttributeValue(expression=_parse_expression("true")),
+ ir_util.get_attribute(type_def.attribute, "bob"))
+ self.assertEqual(
+ ir_pb2.AttributeValue(expression=_parse_expression("false")),
+ ir_util.get_attribute(type_def.attribute, "bob2"))
+ self.assertEqual(None, ir_util.get_attribute(type_def.attribute, "Bob"))
+ self.assertEqual(None, ir_util.get_attribute(type_def.attribute, "bob3"))
+
+ def test_get_boolean_attribute(self):
+ type_def = ir_pb2.TypeDefinition(attribute=[
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=ir_pb2.Expression()),
+ name=ir_pb2.Word(text="phil")),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("false")),
+ name=ir_pb2.Word(text="bob"),
+ is_default=True),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("true")),
+ name=ir_pb2.Word(text="bob")),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("false")),
+ name=ir_pb2.Word(text="bob2")),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("true")),
+ name=ir_pb2.Word(text="bob2"),
+ is_default=True),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("false")),
+ name=ir_pb2.Word(text="bob3"),
+ is_default=True),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("false")),
+ name=ir_pb2.Word()),
+ ])
+ self.assertTrue(ir_util.get_boolean_attribute(type_def.attribute, "bob"))
+ self.assertTrue(ir_util.get_boolean_attribute(type_def.attribute,
+ "bob",
+ default_value=False))
+ self.assertFalse(ir_util.get_boolean_attribute(type_def.attribute, "bob2"))
+ self.assertFalse(ir_util.get_boolean_attribute(type_def.attribute,
+ "bob2",
+ default_value=True))
+ self.assertIsNone(ir_util.get_boolean_attribute(type_def.attribute, "Bob"))
+ self.assertTrue(ir_util.get_boolean_attribute(type_def.attribute,
+ "Bob",
+ default_value=True))
+ self.assertIsNone(ir_util.get_boolean_attribute(type_def.attribute, "bob3"))
+
+ def test_get_integer_attribute(self):
+ type_def = ir_pb2.TypeDefinition(attribute=[
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(
+ expression=ir_pb2.Expression(
+ type=ir_pb2.ExpressionType(integer=ir_pb2.IntegerType()))),
+ name=ir_pb2.Word(text="phil")),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(
+ expression=ir_pb2.Expression(
+ constant=ir_pb2.NumericConstant(value="20"),
+ type=ir_pb2.ExpressionType(integer=ir_pb2.IntegerType(
+ modular_value="20",
+ modulus="infinity")))),
+ name=ir_pb2.Word(text="bob"),
+ is_default=True),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(
+ expression=ir_pb2.Expression(
+ constant=ir_pb2.NumericConstant(value="10"),
+ type=ir_pb2.ExpressionType(integer=ir_pb2.IntegerType(
+ modular_value="10",
+ modulus="infinity")))),
+ name=ir_pb2.Word(text="bob")),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(
+ expression=ir_pb2.Expression(
+ constant=ir_pb2.NumericConstant(value="5"),
+ type=ir_pb2.ExpressionType(integer=ir_pb2.IntegerType(
+ modular_value="5",
+ modulus="infinity")))),
+ name=ir_pb2.Word(text="bob2")),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(
+ expression=ir_pb2.Expression(
+ constant=ir_pb2.NumericConstant(value="0"),
+ type=ir_pb2.ExpressionType(integer=ir_pb2.IntegerType(
+ modular_value="0",
+ modulus="infinity")))),
+ name=ir_pb2.Word(text="bob2"),
+ is_default=True),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(
+ expression=ir_pb2.Expression(
+ constant=ir_pb2.NumericConstant(value="30"),
+ type=ir_pb2.ExpressionType(integer=ir_pb2.IntegerType(
+ modular_value="30",
+ modulus="infinity")))),
+ name=ir_pb2.Word(text="bob3"),
+ is_default=True),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(
+ expression=ir_pb2.Expression(
+ function=ir_pb2.Function(
+ function=ir_pb2.Function.ADDITION,
+ args=[
+ ir_pb2.Expression(
+ constant=ir_pb2.NumericConstant(value="100"),
+ type=ir_pb2.ExpressionType(
+ integer=ir_pb2.IntegerType(
+ modular_value="100",
+ modulus="infinity"))),
+ ir_pb2.Expression(
+ constant=ir_pb2.NumericConstant(value="100"),
+ type=ir_pb2.ExpressionType(
+ integer=ir_pb2.IntegerType(
+ modular_value="100",
+ modulus="infinity")))
+ ]),
+ type=ir_pb2.ExpressionType(integer=ir_pb2.IntegerType(
+ modular_value="200",
+ modulus="infinity")))),
+ name=ir_pb2.Word(text="bob4")),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(
+ expression=ir_pb2.Expression(
+ constant=ir_pb2.NumericConstant(value="40"),
+ type=ir_pb2.ExpressionType(integer=ir_pb2.IntegerType(
+ modular_value="40",
+ modulus="infinity")))),
+ name=ir_pb2.Word()),
+ ])
+ self.assertEqual(10,
+ ir_util.get_integer_attribute(type_def.attribute, "bob"))
+ self.assertEqual(5,
+ ir_util.get_integer_attribute(type_def.attribute, "bob2"))
+ self.assertIsNone(ir_util.get_integer_attribute(type_def.attribute, "Bob"))
+ self.assertEqual(10, ir_util.get_integer_attribute(type_def.attribute,
+ "Bob",
+ default_value=10))
+ self.assertIsNone(ir_util.get_integer_attribute(type_def.attribute, "bob3"))
+ self.assertEqual(200, ir_util.get_integer_attribute(type_def.attribute,
+ "bob4"))
+
+ def test_get_duplicate_attribute(self):
+ type_def = ir_pb2.TypeDefinition(attribute=[
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=ir_pb2.Expression()),
+ name=ir_pb2.Word(text="phil")),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("true")),
+ name=ir_pb2.Word(text="bob")),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("false")),
+ name=ir_pb2.Word(text="bob")),
+ ir_pb2.Attribute(
+ value=ir_pb2.AttributeValue(expression=_parse_expression("false")),
+ name=ir_pb2.Word()),
+ ])
+ self.assertRaises(AssertionError, ir_util.get_attribute, type_def.attribute,
+ "bob")
+
+ def test_find_object(self):
+ ir = ir_pb2.EmbossIr(
+ module=[{
+ "type": [{
+ "structure": {
+ "field": [{
+ "name": {
+ "name": { "text": "field" },
+ "canonical_name": {
+ "module_file": "test.emb",
+ "object_path": ["Foo", "field"]
+ }
+ }
+ }]
+ },
+ "name": {
+ "name": { "text": "Foo" },
+ "canonical_name": {
+ "module_file": "test.emb",
+ "object_path": ["Foo"]
+ }
+ },
+ "runtime_parameter": [{
+ "name": {
+ "name": { "text": "parameter" },
+ "canonical_name": {
+ "module_file": "test.emb",
+ "object_path": ["Foo", "parameter"],
+ }
+ }
+ }]
+ },
+ {
+ "enumeration": {
+ "value": [{
+ "name": {
+ "name": { "text": "QUX" },
+ "canonical_name": {
+ "module_file": "test.emb",
+ "object_path": ["Bar", "QUX"]
+ }
+ }
+ }]
+ },
+ "name": {
+ "name": { "text": "Bar" },
+ "canonical_name": {
+ "module_file": "test.emb",
+ "object_path": ["Bar"]
+ }
+ }
+ }],
+ "source_file_name": "test.emb"
+ },
+ {
+ "type": [{
+ "external": { },
+ "name": {
+ "name": { "text": "UInt" },
+ "canonical_name": { "module_file": "", "object_path": ["UInt"] }
+ }
+ }],
+ "source_file_name": ""
+ }])
+
+ # Test that find_object works with any of its four "name" types.
+ canonical_name_of_foo = ir_pb2.CanonicalName(module_file="test.emb",
+ object_path=["Foo"])
+ self.assertEqual(ir.module[0].type[0], ir_util.find_object(
+ ir_pb2.Reference(canonical_name=canonical_name_of_foo), ir))
+ self.assertEqual(ir.module[0].type[0], ir_util.find_object(
+ ir_pb2.NameDefinition(canonical_name=canonical_name_of_foo), ir))
+ self.assertEqual(ir.module[0].type[0],
+ ir_util.find_object(canonical_name_of_foo, ir))
+ self.assertEqual(ir.module[0].type[0],
+ ir_util.find_object(("test.emb", "Foo"), ir))
+
+ # Test that find_object works with objects other than structs.
+ self.assertEqual(ir.module[0].type[1],
+ ir_util.find_object(("test.emb", "Bar"), ir))
+ self.assertEqual(ir.module[1].type[0],
+ ir_util.find_object(("", "UInt"), ir))
+ self.assertEqual(ir.module[0].type[0].structure.field[0],
+ ir_util.find_object(("test.emb", "Foo", "field"), ir))
+ self.assertEqual(ir.module[0].type[0].runtime_parameter[0],
+ ir_util.find_object(("test.emb", "Foo", "parameter"), ir))
+ self.assertEqual(ir.module[0].type[1].enumeration.value[0],
+ ir_util.find_object(("test.emb", "Bar", "QUX"), ir))
+ self.assertEqual(ir.module[0], ir_util.find_object(("test.emb",), ir))
+ self.assertEqual(ir.module[1], ir_util.find_object(("",), ir))
+
+ # Test searching for non-present objects.
+ self.assertIsNone(ir_util.find_object_or_none(("not_test.emb",), ir))
+ self.assertIsNone(ir_util.find_object_or_none(("test.emb", "NotFoo"), ir))
+ self.assertIsNone(
+ ir_util.find_object_or_none(("test.emb", "Foo", "not_field"), ir))
+ self.assertIsNone(
+ ir_util.find_object_or_none(("test.emb", "Foo", "field", "no_subfield"),
+ ir))
+ self.assertIsNone(
+ ir_util.find_object_or_none(("test.emb", "Bar", "NOT_QUX"), ir))
+ self.assertIsNone(
+ ir_util.find_object_or_none(("test.emb", "Bar", "QUX", "no_subenum"),
+ ir))
+
+ # Test that find_parent_object works with any of its four "name" types.
+ self.assertEqual(ir.module[0], ir_util.find_parent_object(
+ ir_pb2.Reference(canonical_name=canonical_name_of_foo), ir))
+ self.assertEqual(ir.module[0], ir_util.find_parent_object(
+ ir_pb2.NameDefinition(canonical_name=canonical_name_of_foo), ir))
+ self.assertEqual(ir.module[0],
+ ir_util.find_parent_object(canonical_name_of_foo, ir))
+ self.assertEqual(ir.module[0],
+ ir_util.find_parent_object(("test.emb", "Foo"), ir))
+
+ # Test that find_parent_object works with objects other than structs.
+ self.assertEqual(ir.module[0],
+ ir_util.find_parent_object(("test.emb", "Bar"), ir))
+ self.assertEqual(ir.module[1], ir_util.find_parent_object(("", "UInt"), ir))
+ self.assertEqual(ir.module[0].type[0],
+ ir_util.find_parent_object(("test.emb", "Foo", "field"),
+ ir))
+ self.assertEqual(ir.module[0].type[1],
+ ir_util.find_parent_object(("test.emb", "Bar", "QUX"), ir))
+
+ def test_hashable_form_of_reference(self):
+ self.assertEqual(
+ ("t.emb", "Foo", "Bar"),
+ ir_util.hashable_form_of_reference(ir_pb2.Reference(
+ canonical_name=ir_pb2.CanonicalName(module_file="t.emb",
+ object_path=["Foo", "Bar"]))))
+ self.assertEqual(
+ ("t.emb", "Foo", "Bar"),
+ ir_util.hashable_form_of_reference(ir_pb2.NameDefinition(
+ canonical_name=ir_pb2.CanonicalName(module_file="t.emb",
+ object_path=["Foo", "Bar"]))))
+
+ def test_get_base_type(self):
+ array_type_ir = ir_pb2.Type(
+ array_type={
+ "element_count": { "constant": { "value": "20" } },
+ "base_type": {
+ "array_type": {
+ "element_count": { "constant": { "value": "10" } },
+ "base_type": {
+ "atomic_type": {
+ "reference": { },
+ "source_location": { "start": { "line": 5 } }
+ }
+ },
+ "source_location": { "start": { "line": 4 } }
+ }
+ },
+ "source_location": { "start": { "line": 3 } }
+ })
+ base_type_ir = array_type_ir.array_type.base_type.array_type.base_type
+ self.assertEqual(base_type_ir, ir_util.get_base_type(array_type_ir))
+ self.assertEqual(base_type_ir, ir_util.get_base_type(
+ array_type_ir.array_type.base_type))
+ self.assertEqual(base_type_ir, ir_util.get_base_type(base_type_ir))
+
+ def test_size_of_type_in_bits(self):
+ ir = ir_pb2.EmbossIr(
+ module=[{
+ "type": [{
+ "name": {
+ "name": { "text": "Baz" },
+ "canonical_name": { "module_file": "s.emb", "object_path": ["Baz"] }
+ }
+ }],
+ "source_file_name": "s.emb"
+ },
+ {
+ "type": [{
+ "name": {
+ "name": { "text": "UInt" },
+ "canonical_name": { "module_file": "", "object_path": ["UInt"] }
+ }
+ },
+ {
+ "name": {
+ "name": { "text": "Byte" },
+ "canonical_name": { "module_file": "", "object_path": ["Byte"] }
+ },
+ "attribute": [{
+ "name": { "text": "fixed_size_in_bits" },
+ "value": {
+ "expression": {
+ "constant": { "value": "8" },
+ "type": { "integer": { "modular_value": "8", "modulus": "infinity" } }
+ }
+ }
+ }]
+ }],
+ "source_file_name": ""
+ }])
+
+ fixed_size_type = ir_pb2.Type(
+ atomic_type={
+ "reference": { "canonical_name": { "module_file": "", "object_path": ["Byte"] } }
+ })
+ self.assertEqual(8, ir_util.fixed_size_of_type_in_bits(fixed_size_type, ir))
+
+ explicit_size_type = ir_pb2.Type(
+ atomic_type={
+ "reference": { "canonical_name": { "module_file": "", "object_path": ["UInt"] } }
+ },
+ size_in_bits={
+ "constant": { "value": "32" },
+ "type": { "integer": { "modular_value": "32", "modulus": "infinity" } }
+ })
+ self.assertEqual(32,
+ ir_util.fixed_size_of_type_in_bits(explicit_size_type, ir))
+
+ fixed_size_array = ir_pb2.Type(
+ array_type={
+ "base_type": {
+ "atomic_type": {
+ "reference": {
+ "canonical_name": { "module_file": "", "object_path": ["Byte"] }
+ }
+ }
+ },
+ "element_count": {
+ "constant": { "value": "5" },
+ "type": { "integer": { "modular_value": "5", "modulus": "infinity" } }
+ }
+ })
+ self.assertEqual(40,
+ ir_util.fixed_size_of_type_in_bits(fixed_size_array, ir))
+
+ fixed_size_2d_array = ir_pb2.Type(
+ array_type={
+ "base_type": {
+ "array_type": {
+ "base_type": {
+ "atomic_type": {
+ "reference": {
+ "canonical_name": { "module_file": "", "object_path": ["Byte"] }
+ }
+ }
+ },
+ "element_count": {
+ "constant": { "value": "5" },
+ "type": { "integer": { "modular_value": "5", "modulus": "infinity" } }
+ }
+ }
+ },
+ "element_count": {
+ "constant": { "value": "2" },
+ "type": { "integer": { "modular_value": "2", "modulus": "infinity" } }
+ }
+ })
+ self.assertEqual(
+ 80, ir_util.fixed_size_of_type_in_bits(fixed_size_2d_array, ir))
+
+ automatic_size_array = ir_pb2.Type(
+ array_type={
+ "base_type": {
+ "array_type": {
+ "base_type": {
+ "atomic_type": {
+ "reference": {
+ "canonical_name": { "module_file": "", "object_path": ["Byte"] }
+ }
+ }
+ },
+ "element_count": {
+ "constant": { "value": "5" },
+ "type": { "integer": { "modular_value": "5", "modulus": "infinity" } }
+ }
+ }
+ },
+ "automatic": { }
+ })
+ self.assertIsNone(
+ ir_util.fixed_size_of_type_in_bits(automatic_size_array, ir))
+
+ variable_size_type = ir_pb2.Type(
+ atomic_type={
+ "reference": { "canonical_name": { "module_file": "", "object_path": ["UInt"] } }
+ })
+ self.assertIsNone(
+ ir_util.fixed_size_of_type_in_bits(variable_size_type, ir))
+
+ no_size_type = ir_pb2.Type(
+ atomic_type={
+ "reference": {
+ "canonical_name": { "module_file": "s.emb", "object_path": ["Baz"] }
+ }
+ })
+ self.assertIsNone(ir_util.fixed_size_of_type_in_bits(no_size_type, ir))
+
+ def test_field_is_virtual(self):
+ self.assertTrue(ir_util.field_is_virtual(ir_pb2.Field()))
+
+ def test_field_is_not_virtual(self):
+ self.assertFalse(ir_util.field_is_virtual(
+ ir_pb2.Field(location=ir_pb2.FieldLocation())))
+
+ def test_field_is_read_only(self):
+ self.assertTrue(ir_util.field_is_read_only(ir_pb2.Field(
+ write_method=ir_pb2.WriteMethod(read_only=True))))
+
+ def test_field_is_not_read_only(self):
+ self.assertFalse(ir_util.field_is_read_only(
+ ir_pb2.Field(location=ir_pb2.FieldLocation())))
+ self.assertFalse(ir_util.field_is_read_only(ir_pb2.Field(
+ write_method=ir_pb2.WriteMethod())))
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/util/name_conversion.py b/util/name_conversion.py
new file mode 100644
index 0000000..71aeff5
--- /dev/null
+++ b/util/name_conversion.py
@@ -0,0 +1,19 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Conversions between snake-, camel-, and shouty-case names."""
+
+
+def snake_to_camel(name):
+ return "".join(word.capitalize() for word in name.split("_"))
diff --git a/util/name_conversion_test.py b/util/name_conversion_test.py
new file mode 100644
index 0000000..5a423e3
--- /dev/null
+++ b/util/name_conversion_test.py
@@ -0,0 +1,34 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for util.name_conversion."""
+
+import unittest
+from util import name_conversion
+
+
+class NameConversionTest(unittest.TestCase):
+
+ def test_snake_to_camel(self):
+ self.assertEqual("", name_conversion.snake_to_camel(""))
+ self.assertEqual("Abc", name_conversion.snake_to_camel("abc"))
+ self.assertEqual("AbcDef", name_conversion.snake_to_camel("abc_def"))
+ self.assertEqual("AbcDef89", name_conversion.snake_to_camel("abc_def89"))
+ self.assertEqual("AbcDef89", name_conversion.snake_to_camel("abc_def_89"))
+ self.assertEqual("Abc89Def", name_conversion.snake_to_camel("abc_89_def"))
+ self.assertEqual("Abc89def", name_conversion.snake_to_camel("abc_89def"))
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/util/parser_types.py b/util/parser_types.py
new file mode 100644
index 0000000..f17a56f
--- /dev/null
+++ b/util/parser_types.py
@@ -0,0 +1,116 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Various types shared through multiple passes of parsing.
+
+This module contains types used as interfaces between parts of the Emboss front
+end. These types do not really "belong" to either the producers or consumers,
+and in a few cases placing them in one or the other creates unnecessary
+dependencies, so they are defined here.
+"""
+
+import collections
+from public import ir_pb2
+
+
+def _make_position(line, column):
+ """Makes an ir_pb2.Position from line, column ints."""
+ if not isinstance(line, int):
+ raise ValueError("Bad line {!r}".format(line))
+ elif not isinstance(column, int):
+ raise ValueError("Bad column {!r}".format(column))
+ return ir_pb2.Position(line=line, column=column)
+
+
+def _parse_position(text):
+ """Parses an ir_pb2.Position from "line:column" (e.g., "1:2")."""
+ line, column = text.split(":")
+ return _make_position(int(line), int(column))
+
+
+def format_position(position):
+ """Formats an ir_pb2.Position to "line:column" form."""
+ return "{}:{}".format(position.line, position.column)
+
+
+def make_location(start, end, is_synthetic=False):
+ """Makes an ir_pb2.Location from (line, column) tuples or ir_pb2.Positions."""
+ if isinstance(start, tuple):
+ start = _make_position(*start)
+ if isinstance(end, tuple):
+ end = _make_position(*end)
+ if not isinstance(start, ir_pb2.Position):
+ raise ValueError("Bad start {!r}".format(start))
+ elif not isinstance(end, ir_pb2.Position):
+ raise ValueError("Bad end {!r}".format(end))
+ elif start.line > end.line or (
+ start.line == end.line and start.column > end.column):
+ raise ValueError("Start {} is after end {}".format(format_position(start),
+ format_position(end)))
+ return ir_pb2.Location(start=start, end=end, is_synthetic=is_synthetic)
+
+
+def format_location(location):
+ """Formats an ir_pb2.Location in format "1:2-3:4" ("start-end")."""
+ return "{}-{}".format(format_position(location.start),
+ format_position(location.end))
+
+
+def parse_location(text):
+ """Parses an ir_pb2.Location from format "1:2-3:4" ("start-end")."""
+ start, end = text.split("-")
+ return make_location(_parse_position(start), _parse_position(end))
+
+
+class Token(
+ collections.namedtuple("Token", ["symbol", "text", "source_location"])):
+ """A Token is a chunk of text from a source file, and a classification.
+
+ Attributes:
+ symbol: The name of this token ("Indent", "SnakeWord", etc.)
+ text: The original text ("1234", "some_name", etc.)
+ source_location: Where this token came from in the original source file.
+ """
+
+ def __str__(self):
+ return "{} {} {}".format(self.symbol, repr(str(self.text)),
+ format_location(self.source_location))
+
+
+class Production(collections.namedtuple("Production", ["lhs", "rhs"])):
+ """A Production is a simple production from a context-free grammar.
+
+ A Production takes the form:
+
+ nonterminal -> symbol*
+
+ where "nonterminal" is an implicitly non-terminal symbol in the language,
+ and "symbol*" is zero or more terminal or non-terminal symbols which form the
+ non-terminal on the left.
+
+ Attributes:
+ lhs: The non-terminal symbol on the left-hand-side of the production.
+ rhs: The sequence of symbols on the right-hand-side of the production.
+ """
+
+ def __str__(self):
+ return str(self.lhs) + " -> " + " ".join([str(r) for r in self.rhs])
+
+ @staticmethod
+ def parse(production_text):
+ """Parses a Production from a "symbol -> symbol symbol symbol" string."""
+ words = production_text.split()
+ if len(words) < 2 or words[1] != "->":
+ raise SyntaxError
+ return Production(words[0], tuple(words[2:]))
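The "start-end" location round trip can be sketched without the generated `ir_pb2` module. In the sketch below, the `Position` and `Location` namedtuples are hypothetical stand-ins for `ir_pb2.Position` and `ir_pb2.Location`, and the functions are simplified re-implementations rather than the code in this file.

```python
import collections

# Hypothetical stand-ins for ir_pb2.Position and ir_pb2.Location.
Position = collections.namedtuple("Position", ["line", "column"])
Location = collections.namedtuple("Location", ["start", "end", "is_synthetic"])


def make_location(start, end, is_synthetic=False):
    start, end = Position(*start), Position(*end)
    # Namedtuples compare field by field, which gives (line, column) ordering.
    if start > end:
        raise ValueError("Start {} is after end {}".format(start, end))
    return Location(start, end, is_synthetic)


def format_location(location):
    return "{}:{}-{}:{}".format(location.start.line, location.start.column,
                                location.end.line, location.end.column)


def parse_location(text):
    start, end = text.split("-")
    # int() ignores surrounding whitespace, so " 1 : 2 " parses as well.
    return make_location([int(n) for n in start.split(":")],
                         [int(n) for n in end.split(":")])


location = parse_location("1:2-3:4")
print(format_location(location))  # 1:2-3:4
```

A location whose start equals its end is accepted; only a start strictly after the end is rejected.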
diff --git a/util/parser_types_test.py b/util/parser_types_test.py
new file mode 100644
index 0000000..03109d5
--- /dev/null
+++ b/util/parser_types_test.py
@@ -0,0 +1,133 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for parser_types."""
+
+import unittest
+from public import ir_pb2
+from util import parser_types
+
+
+class PositionTest(unittest.TestCase):
+ """Tests for Position-related functions in parser_types."""
+
+ def test_format_position(self):
+ self.assertEqual(
+ "1:2", parser_types.format_position(ir_pb2.Position(line=1, column=2)))
+
+
+class LocationTest(unittest.TestCase):
+ """Tests for Location-related functions in parser_types."""
+
+ def test_make_location(self):
+ self.assertEqual(ir_pb2.Location(start=ir_pb2.Position(line=1,
+ column=2),
+ end=ir_pb2.Position(line=3,
+ column=4),
+ is_synthetic=False),
+ parser_types.make_location((1, 2), (3, 4)))
+ self.assertEqual(
+ ir_pb2.Location(start=ir_pb2.Position(line=1,
+ column=2),
+ end=ir_pb2.Position(line=3,
+ column=4),
+ is_synthetic=False),
+ parser_types.make_location(ir_pb2.Position(line=1,
+ column=2),
+ ir_pb2.Position(line=3,
+ column=4)))
+
+ def test_make_synthetic_location(self):
+ self.assertEqual(
+ ir_pb2.Location(start=ir_pb2.Position(line=1, column=2),
+ end=ir_pb2.Position(line=3, column=4),
+ is_synthetic=True),
+ parser_types.make_location((1, 2), (3, 4), True))
+ self.assertEqual(
+ ir_pb2.Location(start=ir_pb2.Position(line=1, column=2),
+ end=ir_pb2.Position(line=3, column=4),
+ is_synthetic=True),
+ parser_types.make_location(ir_pb2.Position(line=1, column=2),
+ ir_pb2.Position(line=3, column=4),
+ True))
+
+ def test_make_location_type_checks(self):
+ self.assertRaises(ValueError, parser_types.make_location, [1, 2], (1, 2))
+ self.assertRaises(ValueError, parser_types.make_location, (1, 2), [1, 2])
+
+ def test_make_location_logic_checks(self):
+ self.assertRaises(ValueError, parser_types.make_location, (3, 4), (1, 2))
+ self.assertRaises(ValueError, parser_types.make_location, (1, 3), (1, 2))
+ self.assertTrue(parser_types.make_location((1, 2), (1, 2)))
+
+ def test_format_location(self):
+ self.assertEqual("1:2-3:4",
+ parser_types.format_location(parser_types.make_location(
+ (1, 2), (3, 4))))
+
+ def test_parse_location(self):
+ self.assertEqual(parser_types.make_location((1, 2), (3, 4)),
+ parser_types.parse_location("1:2-3:4"))
+ self.assertEqual(parser_types.make_location((1, 2), (3, 4)),
+ parser_types.parse_location(" 1 : 2 - 3 : 4 "))
+
+
+class TokenTest(unittest.TestCase):
+ """Tests for parser_types.Token."""
+
+ def test_str(self):
+ self.assertEqual("FOO 'bar' 1:2-3:4", str(parser_types.Token(
+ "FOO", "bar", parser_types.make_location((1, 2), (3, 4)))))
+
+
+class ProductionTest(unittest.TestCase):
+ """Tests for parser_types.Production."""
+
+ def test_parse(self):
+ self.assertEqual(parser_types.Production(lhs="A",
+ rhs=("B", "C")),
+ parser_types.Production.parse("A -> B C"))
+ self.assertEqual(parser_types.Production(lhs="A",
+ rhs=("B",)),
+ parser_types.Production.parse("A -> B"))
+ self.assertEqual(parser_types.Production(lhs="A",
+ rhs=("B", "C")),
+ parser_types.Production.parse(" A -> B C "))
+ self.assertEqual(parser_types.Production(lhs="A",
+ rhs=tuple()),
+ parser_types.Production.parse("A ->"))
+ self.assertEqual(parser_types.Production(lhs="A",
+ rhs=tuple()),
+ parser_types.Production.parse("A -> "))
+ self.assertEqual(parser_types.Production(lhs="FOO",
+ rhs=('"B"', "x*")),
+ parser_types.Production.parse('FOO -> "B" x*'))
+ self.assertRaises(SyntaxError, parser_types.Production.parse, "F-> A B")
+ self.assertRaises(SyntaxError, parser_types.Production.parse, "F B -> A B")
+ self.assertRaises(SyntaxError, parser_types.Production.parse, "-> A B")
+
+ def test_str(self):
+ self.assertEqual(str(parser_types.Production(lhs="A",
+ rhs=("B", "C"))), "A -> B C")
+ self.assertEqual(str(parser_types.Production(lhs="A",
+ rhs=("B",))), "A -> B")
+ self.assertEqual(str(parser_types.Production(lhs="A",
+ rhs=tuple())), "A -> ")
+ self.assertEqual(str(parser_types.Production(lhs="FOO",
+ rhs=('"B"', "x*"))),
+ 'FOO -> "B" x*')
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/util/simple_memoizer.py b/util/simple_memoizer.py
new file mode 100644
index 0000000..15a1719
--- /dev/null
+++ b/util/simple_memoizer.py
@@ -0,0 +1,69 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Provides a simple memoizing decorator."""
+
+
+def memoize(f):
+ """Memoizes f.
+
+ The @memoize decorator returns a function which caches the results of f, and
+ returns directly from the cache instead of calling f when it is called again
+ with the same arguments.
+
+ Memoization has some caveats:
+
+ Most importantly, the decorated function will not be called every time the
+ function is called. If the memoized function `f` performs I/O or relies on
+ or changes global state, it may not work correctly when memoized.
+
+ This memoizer only works for functions taking positional arguments. It does
+ not handle keyword arguments.
+
+ This memoizer only works for hashable arguments -- tuples, ints, etc. It does
+ not work on unhashable arguments such as lists, dicts, or sets.
+
+ This memoizer returns a function whose __name__ and argument list may differ
+ from the memoized function under reflection.
+
+ This memoizer never evicts anything from its cache, so its memory usage can
+ grow indefinitely.
+
+ Depending on the workload and speed of `f`, the memoized `f` can be slower
+ than unadorned `f`; it is important to use profiling before and after
+ memoization.
+
+ Usage:
+ @memoize
+ def function(arg, arg2, arg3):
+ ...
+
+ Arguments:
+ f: The function to memoize.
+
+ Returns:
+ A function which acts like f, but faster when called repeatedly with the
+ same arguments.
+ """
+ cache = {}
+
+ def _memoized(*args):
+ assert all(arg.__hash__ for arg in args), (
+ "Arguments to memoized function {} must be hashable.".format(
+ f.__name__))
+ if args not in cache:
+ cache[args] = f(*args)
+ return cache[args]
+
+ return _memoized
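A minimal sketch of two of the caveats listed in the docstring: the wrapped function runs only once per distinct argument tuple, and the returned function reports a different `__name__`. The `memoize` below is a condensed copy (the hashability assertion is omitted), and `slow_square` is a hypothetical example function.

```python
def memoize(f):
    # Condensed copy of util/simple_memoizer.py so the sketch is standalone.
    cache = {}

    def _memoized(*args):
        if args not in cache:
            cache[args] = f(*args)
        return cache[args]

    return _memoized


calls = []


@memoize
def slow_square(n):  # hypothetical example function
    calls.append(n)
    return n * n


print(slow_square(4), slow_square(4))  # 16 16
print(calls)                           # [4]
print(slow_square.__name__)            # _memoized
```

The second call is served from the cache, so the side effect of appending to `calls` happens only once.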
diff --git a/util/simple_memoizer_test.py b/util/simple_memoizer_test.py
new file mode 100644
index 0000000..50e3113
--- /dev/null
+++ b/util/simple_memoizer_test.py
@@ -0,0 +1,73 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for simple_memoizer."""
+
+import unittest
+from util import simple_memoizer
+
+
+class SimpleMemoizerTest(unittest.TestCase):
+
+ def test_memoized_function_returns_same_values(self):
+ @simple_memoizer.memoize
+ def add_one(n):
+ return n + 1
+
+ for i in range(100):
+ self.assertEqual(i + 1, add_one(i))
+
+ def test_memoized_function_is_only_called_once(self):
+ arguments = []
+
+ @simple_memoizer.memoize
+ def add_one_and_add_argument_to_list(n):
+ arguments.append(n)
+ return n + 1
+
+ self.assertEqual(1, add_one_and_add_argument_to_list(0))
+ self.assertEqual([0], arguments)
+ self.assertEqual(1, add_one_and_add_argument_to_list(0))
+ self.assertEqual([0], arguments)
+
+ def test_memoized_function_with_multiple_arguments(self):
+ arguments = []
+
+ @simple_memoizer.memoize
+ def sum_arguments_and_add_arguments_to_list(n, m, o):
+ arguments.append((n, m, o))
+ return n + m + o
+
+ self.assertEqual(3, sum_arguments_and_add_arguments_to_list(0, 1, 2))
+ self.assertEqual([(0, 1, 2)], arguments)
+ self.assertEqual(3, sum_arguments_and_add_arguments_to_list(0, 1, 2))
+ self.assertEqual([(0, 1, 2)], arguments)
+ self.assertEqual(3, sum_arguments_and_add_arguments_to_list(2, 1, 0))
+ self.assertEqual([(0, 1, 2), (2, 1, 0)], arguments)
+
+ def test_memoized_function_with_no_arguments(self):
+ arguments = []
+
+ @simple_memoizer.memoize
+ def return_one_and_add_empty_tuple_to_list():
+ arguments.append(())
+ return 1
+
+ self.assertEqual(1, return_one_and_add_empty_tuple_to_list())
+ self.assertEqual([()], arguments)
+ self.assertEqual(1, return_one_and_add_empty_tuple_to_list())
+ self.assertEqual([()], arguments)
+
+if __name__ == '__main__':
+ unittest.main()
diff --git a/util/traverse_ir.py b/util/traverse_ir.py
new file mode 100644
index 0000000..9e0ff7b
--- /dev/null
+++ b/util/traverse_ir.py
@@ -0,0 +1,284 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Routines for fully traversing an IR."""
+
+import inspect
+import sys
+
+from public import ir_pb2
+
+
+def _call_with_optional_args(function, positional_arg, keyword_args):
+ """Calls function with whatever keyword_args it will accept."""
+ argspec = inspect.getfullargspec(function)
+ if argspec.varkw:
+ # If the function accepts a kwargs parameter, then it will accept all
+ # arguments.
+ # Note: this isn't technically true if one of the keyword arguments has the
+ # same name as one of the positional arguments.
+ return function(positional_arg, **keyword_args)
+ else:
+ ok_arguments = {}
+ for name in keyword_args:
+ if name in argspec.args[1:]:
+ ok_arguments[name] = keyword_args[name]
+ for name in argspec.args[1:len(argspec.args) - len(argspec.defaults or [])]:
+ assert name in ok_arguments, (
+ "Attempting to call '{}'; missing '{}' (have '{!r}')".format(
+ function.__name__, name, list(keyword_args.keys())))
+ return function(positional_arg, **ok_arguments)
+
+
+def _fast_traverse_proto_top_down(proto, incidental_actions, pattern,
+ skip_descendants_of, action, parameters):
+ """Traverses an IR, calling `action` on some nodes."""
+
+ # Parameters are scoped to the branch of the tree, so make a copy here, before
+ # any action or incidental_action can update them.
+ parameters = parameters.copy()
+
+ # If there is an incidental action for this node type, run it.
+ if type(proto) in incidental_actions: # pylint: disable=unidiomatic-typecheck
+ for incidental_action in incidental_actions[type(proto)]:
+ parameters.update(_call_with_optional_args(
+ incidental_action, proto, parameters) or {})
+
+ # If we are at the end of pattern, check to see if we should call action.
+ if len(pattern) == 1:
+ new_pattern = pattern
+ if pattern[0] == type(proto):
+ parameters.update(
+ _call_with_optional_args(action, proto, parameters) or {})
+ else:
+ # Otherwise, if this node's type matches the head of pattern, recurse with
+ # the tail of the pattern.
+ if pattern[0] == type(proto):
+ new_pattern = pattern[1:]
+ else:
+ new_pattern = pattern
+
+ # If the current node's type is one of the types whose branch should be
+ # skipped, then bail. This has to happen after `action` is called, because
+ # clients rely on being able to, e.g., get a callback for the "root"
+ # Expression without getting callbacks for every sub-Expression.
+ # pylint: disable=unidiomatic-typecheck
+ if type(proto) in skip_descendants_of:
+ return
+
+ # Otherwise, recurse. _FIELDS_TO_SCAN_BY_CURRENT_AND_TARGET tells us, given
+ # the current node's type and the current target type, which fields to check.
+ singular_fields, repeated_fields = _FIELDS_TO_SCAN_BY_CURRENT_AND_TARGET[
+ type(proto), new_pattern[0]]
+ for member_name in singular_fields:
+ if proto.HasField(member_name):
+ _fast_traverse_proto_top_down(getattr(proto, member_name),
+ incidental_actions, new_pattern,
+ skip_descendants_of, action, parameters)
+ for member_name in repeated_fields:
+ for array_element in getattr(proto, member_name):
+ _fast_traverse_proto_top_down(array_element, incidental_actions,
+ new_pattern, skip_descendants_of, action,
+ parameters)
+
+
+def _fields_to_scan_by_current_and_target():
+ """Generates _FIELDS_TO_SCAN_BY_CURRENT_AND_TARGET."""
+ # In order to avoid spending a *lot* of time just walking the IR, this
+ # function sets up a dict that allows `_fast_traverse_proto_top_down()` to
+ # skip traversing large portions of the IR, depending on what node types it is
+ # targeting.
+ #
+ # Without this branch culling scheme, the Emboss front end (at time of
+ # writing) spends roughly 70% (19s out of 31s) of its time just walking the
+ # IR. With branch culling, that goes down to 6% (0.7s out of 12.2s).
+
+ # type_to_fields is a map of types to maps of field names to field types.
+ # That is, type_to_fields[ir_pb2.Module]["type"] == ir_pb2.TypeDefinition.
+ type_to_fields = {}
+
+ # Later, we need to know which fields are singular and which are repeated,
+ # because the access methods are not uniform. This maps (type, field_name)
+ # tuples to descriptor labels: type_fields_to_cardinality[ir_pb2.Module,
+ # "type"] == ir_pb2.Repeated.
+ type_fields_to_cardinality = {}
+
+ # Fill out the above maps by recursively walking the IR type tree, starting
+ # from the root.
+ types_to_check = [ir_pb2.EmbossIr]
+ while types_to_check:
+ type_to_check = types_to_check.pop()
+ if type_to_check in type_to_fields:
+ continue
+ fields = {}
+ for field_name, field_type in type_to_check.field_specs.items():
+ if issubclass(field_type.type, ir_pb2.Message):
+ fields[field_name] = field_type.type
+ types_to_check.append(field_type.type)
+ type_fields_to_cardinality[type_to_check, field_name] = (
+ field_type.__class__)
+ type_to_fields[type_to_check] = fields
+
+ # type_to_descendant_types is a map of all types that can be reached from a
+ # particular type. After the setup, type_to_descendant_types[ir_pb2.EmbossIr]
+ # == set(<all types>) and type_to_descendant_types[ir_pb2.Reference] ==
+ # {ir_pb2.CanonicalName, ir_pb2.Word, ir_pb2.Location} and
+ # type_to_descendant_types[ir_pb2.Word] == set().
+ #
+ # The while loop basically ors in the known descendants of each known
+ # descendant of each type until the dict stops changing, which is a bit
+ # brute-force, but in practice only iterates a few times.
+ type_to_descendant_types = {}
+ for parent_type, field_map in type_to_fields.items():
+ type_to_descendant_types[parent_type] = set(field_map.values())
+ previous_map = {}
+ while type_to_descendant_types != previous_map:
+ # In order to check the previous iteration against the current iteration, it
+ # is necessary to make a two-level copy. Otherwise, the updates to the
+ # values will also update previous_map's values, which causes the loop to
+ # exit prematurely.
+ previous_map = {k: set(v) for k, v in type_to_descendant_types.items()}
+ for ancestor_type, descendents in previous_map.items():
+ for descendent in descendents:
+ type_to_descendant_types[ancestor_type] |= previous_map[descendent]
+
+ # Finally, we have all of the information we need to make the map we really
+ # want: given a current node type and a target node type, which fields should
+ # be checked? (This implicitly skips fields that *can't* contain the target
+ # type.)
+ fields_to_scan_by_current_and_target = {}
+ for current_node_type in type_to_fields:
+ for target_node_type in type_to_fields:
+ singular_fields_to_scan = []
+ repeated_fields_to_scan = []
+ for field_name, field_type in type_to_fields[current_node_type].items():
+ # If the target node type cannot contain another instance of itself, it
+ # is still necessary to scan fields that have the actual target type.
+ if (target_node_type == field_type or
+ target_node_type in type_to_descendant_types[field_type]):
+ # Singular and repeated fields go to different lists, so that they can
+ # be handled separately.
+ if (type_fields_to_cardinality[current_node_type, field_name] ==
+ ir_pb2.Optional):
+ singular_fields_to_scan.append(field_name)
+ else:
+ repeated_fields_to_scan.append(field_name)
+ fields_to_scan_by_current_and_target[
+ current_node_type, target_node_type] = (
+ singular_fields_to_scan, repeated_fields_to_scan)
+ return fields_to_scan_by_current_and_target
+
+
+_FIELDS_TO_SCAN_BY_CURRENT_AND_TARGET = _fields_to_scan_by_current_and_target()
+
+
+def fast_traverse_ir_top_down(ir, pattern, action, incidental_actions=None,
+ skip_descendants_of=(), parameters=None):
+ """Traverses an IR from the top down, executing the given actions.
+
+ `fast_traverse_ir_top_down` walks the given IR in preorder traversal,
+ specifically looking for nodes whose path from the root of the tree matches
+ `pattern`. For every node which matches `pattern`, `action` will be called.
+
+ `pattern` is just a list of node types. For example, to execute `print` on
+ every `ir_pb2.Word` in the IR:
+
+ fast_traverse_ir_top_down(ir, [ir_pb2.Word], print)
+
+ If more than one type is specified, then each one must be found inside the
+ previous. For example, to print only the Words inside of import statements:
+
+ fast_traverse_ir_top_down(ir, [ir_pb2.Import, ir_pb2.Word], print)
+
+ The optional arguments provide additional control.
+
+ `skip_descendants_of` is a list of types that should be treated as if they are
+ leaf nodes when they are encountered. That is, traversal will skip any
+ nodes with any ancestor node whose type is in `skip_descendants_of`. For
+ example, to `do_something` only on outermost `Expression`s:
+
+ fast_traverse_ir_top_down(ir, [ir_pb2.Expression], do_something,
+ skip_descendants_of={ir_pb2.Expression})
+
+ `parameters` specifies a dictionary of initial parameters which can be passed
+ as arguments to `action` and `incidental_actions`. Note that the parameters
+ can be overridden for parts of the tree by `action` and `incidental_actions`.
+ Parameters can be used to set an object which may be updated by `action`, such
+ as a list of errors generated by some check in `action`:
+
+ def check_structure(structure, errors):
+ if structure_is_bad(structure):
+ errors.append(error_for_structure(structure))
+
+ errors = []
+ fast_traverse_ir_top_down(ir, [ir_pb2.Structure], check_structure,
+ parameters={"errors": errors})
+ if errors:
+ print("Errors: {}".format(errors))
+ sys.exit(1)
+
+ `incidental_actions` is a map from node types to functions (or tuples of
+ functions or lists of functions) which should be called on those nodes.
+ Because `fast_traverse_ir_top_down` may skip branches that can't contain
+ `pattern`, functions in `incidental_actions` should generally not have any
+ side effects: instead, they may return a dictionary, which will be used to
+ override `parameters` for any children of the node they were called on. For
+ example:
+
+ def do_something(expression, field_name=None):
+ if field_name:
+ print("Found {} inside {}".format(expression, field_name))
+ else:
+ print("Found {} not in any field".format(expression))
+
+ fast_traverse_ir_top_down(
+ ir, [ir_pb2.Expression], do_something,
+ incidental_actions={ir_pb2.Field: lambda f: {"field_name": f.name}})
+
+ (The `action` may also return a dict in the same way.)
+
+ A few `incidental_actions` are built into `fast_traverse_ir_top_down`, so
+ that certain parameters are contextually available with well-known names:
+
+ ir: The complete IR (the root ir_pb2.EmbossIr node).
+ source_file_name: The file name from which the current node was sourced.
+ type_definition: The most-immediate ancestor type definition.
+ field: The field containing the current node, if any.
+
+ Arguments:
+ ir: An ir_pb2.EmbossIr object to walk.
+ pattern: A list of node types to match.
+ action: A callable, which will be called on nodes matching `pattern`.
+ incidental_actions: A dict of node types to callables, which can be used to
+ set new parameters for `action` for part of the IR tree.
+ skip_descendants_of: A list of types whose children should be skipped when
+ traversing `ir`.
+ parameters: A dict of top-level parameters.
+
+ Returns:
+ None
+ """
+ all_incidental_actions = {
+ ir_pb2.EmbossIr: [lambda ir: {"ir": ir}],
+ ir_pb2.Module: [lambda m: {"source_file_name": m.source_file_name}],
+ ir_pb2.TypeDefinition: [lambda t: {"type_definition": t}],
+ ir_pb2.Field: [lambda f: {"field": f}],
+ }
+ if incidental_actions:
+ for key, incidental_action in incidental_actions.items():
+ if not isinstance(incidental_action, (list, tuple)):
+ incidental_action = [incidental_action]
+ all_incidental_actions.setdefault(key, []).extend(incidental_action)
+ _fast_traverse_proto_top_down(ir, all_incidental_actions, pattern,
+ skip_descendants_of, action, parameters or {})
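The descendant-type computation in `_fields_to_scan_by_current_and_target` is a fixed-point iteration. It can be sketched on a toy graph as follows. The type names in `direct_fields` are hypothetical strings standing in for `ir_pb2` message classes, and the variable names are simplified.

```python
# Hypothetical type graph: each key maps to the types of its direct fields.
direct_fields = {
    "Module": {"TypeDefinition"},
    "TypeDefinition": {"Field", "Word"},
    "Field": {"Word"},
    "Word": set(),
}

descendants = {k: set(v) for k, v in direct_fields.items()}
previous = {}
while descendants != previous:
    # Two-level copy, so that updating descendants does not also change previous.
    previous = {k: set(v) for k, v in descendants.items()}
    for ancestor, known in previous.items():
        for descendant in known:
            # Add everything reachable from each already-known descendant.
            descendants[ancestor] |= previous[descendant]

print(sorted(descendants["Module"]))  # ['Field', 'TypeDefinition', 'Word']
print(sorted(descendants["Field"]))   # ['Word']
```

Once the map stops changing, a field can be skipped during traversal whenever the target type is neither the field's own type nor one of that type's descendants.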
diff --git a/util/traverse_ir_test.py b/util/traverse_ir_test.py
new file mode 100644
index 0000000..1bc921f
--- /dev/null
+++ b/util/traverse_ir_test.py
@@ -0,0 +1,323 @@
+# Copyright 2019 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""Tests for util.traverse_ir."""
+
+import collections
+import unittest
+
+from public import ir_pb2
+from util import traverse_ir
+
+_EXAMPLE_IR = ir_pb2.EmbossIr(
+ module=[
+ {
+ "type": [
+ {
+ "structure": {
+ "field": [
+ {
+ "location": {
+ "start": { "constant": { "value": "0" } },
+ "size": { "constant": { "value": "8" } }
+ },
+ "type": {
+ "atomic_type": {
+ "reference": { "canonical_name": { "module_file": "", "object_path": ["UInt"] } }
+ }
+ },
+ "name": { "name": { "text": "field1" } }
+ },
+ {
+ "location": {
+ "start": { "constant": { "value": "8" } },
+ "size": { "constant": { "value": "16" } }
+ },
+ "type": {
+ "array_type": {
+ "base_type": {
+ "atomic_type": {
+ "reference": {
+ "canonical_name": { "module_file": "", "object_path": ["UInt"] }
+ }
+ }
+ },
+ "element_count": { "constant": { "value": "8" } }
+ }
+ },
+ "name": { "name": { "text": "field2" } }
+ }
+ ]
+ },
+ "name": { "name": { "text": "Foo" } },
+ "subtype": [
+ {
+ "structure": {
+ "field": [
+ {
+ "location": {
+ "start": { "constant": { "value": "24" } },
+ "size": { "constant": { "value": "32" } }
+ },
+ "type": {
+ "atomic_type": {
+ "reference": {
+ "canonical_name": { "module_file": "", "object_path": ["UInt"] }
+ }
+ }
+ },
+ "name": { "name": { "text": "bar_field1" } }
+ },
+ {
+ "location": {
+ "start": { "constant": { "value": "32" } },
+ "size": { "constant": { "value": "320" } }
+ },
+ "type": {
+ "array_type": {
+ "base_type": {
+ "array_type": {
+ "base_type": {
+ "atomic_type": {
+ "reference": {
+ "canonical_name": { "module_file": "", "object_path": ["UInt"] }
+ }
+ }
+ },
+ "element_count": { "constant": { "value": "16" } }
+ }
+ },
+ "automatic": { }
+ }
+ },
+ "name": { "name": { "text": "bar_field2" } }
+ }
+ ]
+ },
+ "name": { "name": { "text": "Bar" } }
+ }
+ ]
+ },
+ {
+ "enumeration": {
+ "value": [
+ {
+ "name": { "name": { "text": "ONE" } },
+ "value": { "constant": { "value": "1" } }
+ },
+ {
+ "name": { "name": { "text": "TWO" } },
+ "value": {
+ "function": {
+ "function": ir_pb2.Function.ADDITION,
+ "args": [
+ { "constant": { "value": "1" } },
+ { "constant": { "value": "1" } }
+ ],
+ "function_name": { "text": "+" }
+ }
+ }
+ }
+ ]
+ },
+ "name": { "name": { "text": "Bar" } }
+ },
+ ],
+ "source_file_name": "t.emb"
+ },
+ {
+ "type": [
+ {
+ "external": { },
+ "name": {
+ "name": { "text": "UInt" },
+ "canonical_name": { "module_file": "", "object_path": ["UInt"] }
+ },
+ "attribute": [
+ {
+ "name": { "text": "statically_sized" },
+ "value": { "expression": { "boolean_constant": { "value": True } } }
+ },
+ {
+ "name": { "text": "size_in_bits" },
+ "value": { "expression": { "constant": { "value": "64" } } }
+ }
+ ]
+ }
+ ],
+ "source_file_name": ""
+ }
+ ]
+)
+
+
+def _count_entries(sequence):
+ counts = collections.Counter()
+ for entry in sequence:
+ counts[entry] += 1
+ return counts
+
+
+def _record_constant(constant, constant_list):
+ constant_list.append(int(constant.value))
+
+
+def _record_field_name_and_constant(constant, constant_list, field):
+ constant_list.append((field.name.name.text, int(constant.value)))
+
+
+def _record_file_name_and_constant(constant, constant_list, source_file_name):
+ constant_list.append((source_file_name, int(constant.value)))
+
+
+def _record_location_parameter_and_constant(constant, constant_list,
+ location=None):
+ constant_list.append((location, int(constant.value)))
+
+
+def _record_kind_and_constant(constant, constant_list, type_definition):
+ if type_definition.HasField("enumeration"):
+ constant_list.append(("enumeration", int(constant.value)))
+ elif type_definition.HasField("structure"):
+ constant_list.append(("structure", int(constant.value)))
+ elif type_definition.HasField("external"):
+ constant_list.append(("external", int(constant.value)))
+ else:
+ assert False, "Shouldn't be here."
+
+
+class TraverseIrTest(unittest.TestCase):
+
+  def test_filter_on_type(self):
+    constants = []
+    traverse_ir.fast_traverse_ir_top_down(
+        _EXAMPLE_IR, [ir_pb2.NumericConstant], _record_constant,
+        parameters={"constant_list": constants})
+    self.assertEqual(
+        _count_entries([0, 8, 8, 8, 16, 24, 32, 16, 32, 320, 1, 1, 1, 64]),
+        _count_entries(constants))
+
+  def test_filter_on_type_in_type(self):
+    constants = []
+    traverse_ir.fast_traverse_ir_top_down(
+        _EXAMPLE_IR,
+        [ir_pb2.Function, ir_pb2.Expression, ir_pb2.NumericConstant],
+        _record_constant,
+        parameters={"constant_list": constants})
+    self.assertEqual([1, 1], constants)
+
+  def test_filter_on_type_star_type(self):
+    struct_constants = []
+    traverse_ir.fast_traverse_ir_top_down(
+        _EXAMPLE_IR, [ir_pb2.Structure, ir_pb2.NumericConstant],
+        _record_constant,
+        parameters={"constant_list": struct_constants})
+    self.assertEqual(_count_entries([0, 8, 8, 8, 16, 24, 32, 16, 32, 320]),
+                     _count_entries(struct_constants))
+    enum_constants = []
+    traverse_ir.fast_traverse_ir_top_down(
+        _EXAMPLE_IR, [ir_pb2.Enum, ir_pb2.NumericConstant], _record_constant,
+        parameters={"constant_list": enum_constants})
+    self.assertEqual(_count_entries([1, 1, 1]), _count_entries(enum_constants))
+
+  def test_filter_on_not_type(self):
+    notstruct_constants = []
+    traverse_ir.fast_traverse_ir_top_down(
+        _EXAMPLE_IR, [ir_pb2.NumericConstant], _record_constant,
+        skip_descendants_of=(ir_pb2.Structure,),
+        parameters={"constant_list": notstruct_constants})
+    self.assertEqual(_count_entries([1, 1, 1, 64]),
+                     _count_entries(notstruct_constants))
+
+  def test_field_is_populated(self):
+    constants = []
+    traverse_ir.fast_traverse_ir_top_down(
+        _EXAMPLE_IR, [ir_pb2.Field, ir_pb2.NumericConstant],
+        _record_field_name_and_constant,
+        parameters={"constant_list": constants})
+    self.assertEqual(_count_entries([
+        ("field1", 0), ("field1", 8), ("field2", 8), ("field2", 8),
+        ("field2", 16), ("bar_field1", 24), ("bar_field1", 32),
+        ("bar_field2", 16), ("bar_field2", 32), ("bar_field2", 320)
+    ]), _count_entries(constants))
+
+  def test_file_name_is_populated(self):
+    constants = []
+    traverse_ir.fast_traverse_ir_top_down(
+        _EXAMPLE_IR, [ir_pb2.NumericConstant], _record_file_name_and_constant,
+        parameters={"constant_list": constants})
+    self.assertEqual(_count_entries([
+        ("t.emb", 0), ("t.emb", 8), ("t.emb", 8), ("t.emb", 8), ("t.emb", 16),
+        ("t.emb", 24), ("t.emb", 32), ("t.emb", 16), ("t.emb", 32),
+        ("t.emb", 320), ("t.emb", 1), ("t.emb", 1), ("t.emb", 1), ("", 64)
+    ]), _count_entries(constants))
+
+  def test_type_definition_is_populated(self):
+    constants = []
+    traverse_ir.fast_traverse_ir_top_down(
+        _EXAMPLE_IR, [ir_pb2.NumericConstant], _record_kind_and_constant,
+        parameters={"constant_list": constants})
+    self.assertEqual(_count_entries([
+        ("structure", 0), ("structure", 8), ("structure", 8), ("structure", 8),
+        ("structure", 16), ("structure", 24), ("structure", 32),
+        ("structure", 16), ("structure", 32), ("structure", 320),
+        ("enumeration", 1), ("enumeration", 1), ("enumeration", 1),
+        ("external", 64)
+    ]), _count_entries(constants))
+
+  def test_keyword_args_dict_in_action(self):
+    call_counts = {"populated": 0, "not": 0}
+
+    def check_field_is_populated(node, **kwargs):
+      del node  # Unused.
+      self.assertTrue(kwargs["field"])
+      call_counts["populated"] += 1
+
+    def check_field_is_not_populated(node, **kwargs):
+      del node  # Unused.
+      self.assertFalse("field" in kwargs)
+      call_counts["not"] += 1
+
+    traverse_ir.fast_traverse_ir_top_down(
+        _EXAMPLE_IR, [ir_pb2.Field, ir_pb2.Type], check_field_is_populated)
+    self.assertEqual(7, call_counts["populated"])
+
+    traverse_ir.fast_traverse_ir_top_down(
+        _EXAMPLE_IR, [ir_pb2.Enum, ir_pb2.EnumValue],
+        check_field_is_not_populated)
+    self.assertEqual(2, call_counts["not"])
+
+  def test_pass_only_to_sub_nodes(self):
+    constants = []
+
+    def pass_location_down(field):
+      return {
+          "location": (int(field.location.start.constant.value),
+                       int(field.location.size.constant.value))
+      }
+
+    traverse_ir.fast_traverse_ir_top_down(
+        _EXAMPLE_IR, [ir_pb2.NumericConstant],
+        _record_location_parameter_and_constant,
+        incidental_actions={ir_pb2.Field: pass_location_down},
+        parameters={"constant_list": constants, "location": None})
+    self.assertEqual(_count_entries([
+        ((0, 8), 0), ((0, 8), 8), ((8, 16), 8), ((8, 16), 8), ((8, 16), 16),
+        ((24, 32), 24), ((24, 32), 32), ((32, 320), 16), ((32, 320), 32),
+        ((32, 320), 320), (None, 1), (None, 1), (None, 1), (None, 64)
+    ]), _count_entries(constants))
+
+
+if __name__ == "__main__":
+  unittest.main()
diff --git a/vim/ft-emboss/ftdetect/emboss.vim b/vim/ft-emboss/ftdetect/emboss.vim
new file mode 100644
index 0000000..fa44af5
--- /dev/null
+++ b/vim/ft-emboss/ftdetect/emboss.vim
@@ -0,0 +1,17 @@
+" Copyright 2019 Google LLC
+"
+" Licensed under the Apache License, Version 2.0 (the "License");
+" you may not use this file except in compliance with the License.
+" You may obtain a copy of the License at
+"
+" https://www.apache.org/licenses/LICENSE-2.0
+"
+" Unless required by applicable law or agreed to in writing, software
+" distributed under the License is distributed on an "AS IS" BASIS,
+" WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+" See the License for the specific language governing permissions and
+" limitations under the License.
+
+" Vim file detection for Emboss.
+
+autocmd BufRead,BufNewFile *.emb setfiletype emboss
diff --git a/vim/ft-emboss/ftplugin/emboss.vim b/vim/ft-emboss/ftplugin/emboss.vim
new file mode 100644
index 0000000..7880796
--- /dev/null
+++ b/vim/ft-emboss/ftplugin/emboss.vim
@@ -0,0 +1,26 @@
+" Copyright 2019 Google LLC
+"
+" Licensed under the Apache License, Version 2.0 (the "License");
+" you may not use this file except in compliance with the License.
+" You may obtain a copy of the License at
+"
+" https://www.apache.org/licenses/LICENSE-2.0
+"
+" Unless required by applicable law or agreed to in writing, software
+" distributed under the License is distributed on an "AS IS" BASIS,
+" WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+" See the License for the specific language governing permissions and
+" limitations under the License.
+
+" Emboss-specific Vim settings.
+
+if exists('b:did_ftplugin')
+  finish
+endif
+let b:did_ftplugin = 1
+
+let b:undo_ftplugin = 'setlocal comments< formatoptions< iskeyword<'
+
+setlocal formatoptions-=t
+setlocal comments=b:--,:#
+setlocal iskeyword+=$
diff --git a/vim/ft-emboss/syntax/emboss.vim b/vim/ft-emboss/syntax/emboss.vim
new file mode 100644
index 0000000..71765ea
--- /dev/null
+++ b/vim/ft-emboss/syntax/emboss.vim
@@ -0,0 +1,99 @@
+" Copyright 2019 Google LLC
+"
+" Licensed under the Apache License, Version 2.0 (the "License");
+" you may not use this file except in compliance with the License.
+" You may obtain a copy of the License at
+"
+" https://www.apache.org/licenses/LICENSE-2.0
+"
+" Unless required by applicable law or agreed to in writing, software
+" distributed under the License is distributed on an "AS IS" BASIS,
+" WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+" See the License for the specific language governing permissions and
+" limitations under the License.
+
+" Vim syntax file for Emboss.
+
+" Quit when a (custom) syntax file was already loaded.
+if exists('b:current_syntax')
+  finish
+endif
+
+" TODO(bolms): Generate the syntax patterns from the patterns in tokenizer.py.
+" Note that Python regex syntax and Vim regexp syntax differ significantly, and
+" the matching logic Vim uses for syntactic elements is significantly different
+" from what a tokenizer uses.
+
+syn clear
+
+" Emboss keywords.
+syn keyword embStructure struct union enum bits external
+syn keyword embKeyword $reserved $default
+syn keyword embKeyword $static_size_in_bits $is_statically_sized this
+syn keyword embKeyword $max $present $upper_bound $lower_bound
+syn keyword embKeyword import as
+syn keyword embKeyword if let
+syn keyword embBoolean true false
+syn keyword embIdentifier $size_in_bits $size_in_bytes
+syn keyword embIdentifier $max_size_in_bits $max_size_in_bytes
+syn keyword embIdentifier $min_size_in_bits $min_size_in_bytes
+
+" Per standard convention, highlight to-do patterns in comments.
+syn keyword embTodo contained TODO FIXME XXX
+
+" When more than one match pattern starts at the same position, Vim uses the
+" one defined last. These 'catch-all' patterns will match any word or number,
+" valid or invalid; valid tokens will be matched again by later patterns,
+" overriding the embBadNumber or embBadWord match.
+syn match embBadNumber display '\v\C<[0-9][0-9a-zA-Z_$]*>'
+syn match embBadWord display '\v\C<[A-Za-z_$][A-Za-z0-9_$]*>'
+
+" Type names are always CamelCase, enum constants are always SHOUTY_CASE, and
+" most other identifiers (field names, attribute names) are snake_case.
+syn match embType display '\v\C<[A-Z][A-Z0-9]*[a-z][A-Za-z0-9]*>'
+syn match embConstant display '\v\C<[A-Z][A-Z0-9_]+>'
+syn match embIdentifier display '\v\C<[a-z][a-z0-9_]*>'
+
+" Decimal integers both with and without thousands separators.
+syn match embNumber display '\v\C<\d+>'
+syn match embNumber display '\v\C<\d{1,3}(_\d{3})*>'
+
+" Hex integers with and without word/doubleword separators.
+syn match embNumber display '\v\C<0[xX]\x+>'
+syn match embNumber display '\v\C<0[xX]\x{1,4}(_\x{4})*>'
+syn match embNumber display '\v\C<0[xX]\x{1,8}(_\x{8})*>'
+
+" Binary integers with and without nibble/byte separators.
+syn match embNumber display '\v\C<0[bB][01]+>'
+syn match embNumber display '\v\C<0[bB][01]{1,4}(_[01]{4})*>'
+syn match embNumber display '\v\C<0[bB][01]{1,8}(_[01]{8})*>'
+
+" Strings
+syn match embString display '\v\C"([^"\n\\]|\\[n\\"])*"'
+
+" Comments and documentation.
+syn match embComment display contains=embTodo '\v\C\#.*$'
+syn match embDocumentation display contains=embTodo '\v\C\-\- .*$'
+syn match embDocumentation display '\v\C\-\-$'
+syn match embBadDocumentation display '\v\C\-\-[^ ].*$'
+
+
+" Most Emboss constructs map neatly onto the standard Vim syntax types.
+hi def link embComment Comment
+hi def link embConstant Constant
+hi def link embIdentifier Identifier
+hi def link embNumber Number
+hi def link embOperator Operator
+hi def link embString String
+hi def link embStructure Structure
+hi def link embTodo Todo
+hi def link embType Type
+
+" SpecialComment seems to be the best match for embDocumentation, as it is used
+" for things like Javadoc.
+hi def link embDocumentation SpecialComment
+hi def link embBadDocumentation Error
+hi def link embBadWord Error
+hi def link embBadNumber Error
+
+let b:current_syntax = 'emboss'