| # Editions: Life of a FeatureSet |
| |
| **Author:** [@mkruskal-google](https://github.com/mkruskal-google) |
| |
| **Approved:** 2023-08-17 |
| |
| ## Background |
| |
| Outside of some minor spelling tweaks, our current implementation of features |
| has very closely followed the original design laid out in |
| [Protobuf Editions Design: Features](protobuf-editions-design-features.md). |
| However, this approach led to the creation of four different feature sets for |
| each descriptor, and it left under-specified who is responsible for generating |
| them (protoc, plugins, runtimes), who has access to them, and where they need |
| to be propagated. |
| |
| *Exposing Editions Feature Sets* (not available externally) was a first attempt |
| to try to define some of these concepts. It locks down feature visibility to |
| protoc, generators, and runtimes. Users will only be exposed to them indirectly, |
| via codegen changes or runtime helper functions, in order to avoid Hyrum's law |
| cementing every decision we make about them. We (incorrectly) assumed that the |
| protoc frontend would be able to calculate all the feature sets and then |
| propagate all four sets to the generators, who would then forward the fully |
| resolved runtime features to the runtime. This had the added benefit that we |
| could treat our C++ feature resolution logic as a source-of-truth and didn't |
| have to reimplement it identically in every language we support. |
| |
| *Editions: Runtime Feature Set Defaults* (not available externally) was a |
| follow-up attempt to specifically handle the default feature sets of an edition. |
| We had realized that we would need proto2/proto3 default features in each |
| language to safely roll out editions, and that languages supporting descriptor |
| pools would have cases that bypass protoc entirely. The solution we arrived at |
| was that we should continue using the protoc frontend as the source-of-truth, |
| and propagate these defaults down to the necessary runtimes. This would fix the |
| proto2/proto3 issue, and at least provide some utilities to make the situation |
| easier for descriptor pool users. |
| |
| [Protobuf Editions Design: Features](protobuf-editions-design-features.md) |
| defines the feature resolution algorithm, which can be summarized by the |
| following diagram: |
| |
|  |
| |
| Feature resolution for a given descriptor starts by using the proto file's |
| edition and the feature schemas to generate the default feature set. It then |
| merges all of the parent features from top to bottom, merging the descriptor's |
| features last. |
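
The merge order described above can be sketched as follows. This is a minimal illustration that uses plain dictionaries as a stand-in for `FeatureSet` messages; the function name `resolve_features` and the feature names and values are hypothetical, and a real implementation would merge protos rather than dictionaries.

```python
from typing import Dict, List

# Hypothetical stand-in for a FeatureSet message: feature name -> value.
FeatureSet = Dict[str, str]


def resolve_features(edition_defaults: FeatureSet,
                     parent_overrides: List[FeatureSet],
                     own_overrides: FeatureSet) -> FeatureSet:
    """Merge edition defaults, then parents from outermost to innermost,
    then the descriptor's own explicitly set features."""
    resolved = dict(edition_defaults)      # start from the edition defaults
    for overrides in parent_overrides:     # file first, then enclosing messages
        resolved.update(overrides)         # later values win, like a proto merge
    resolved.update(own_overrides)         # the descriptor's own features go last
    return resolved


# Illustrative values only; these are not real feature names.
defaults = {"feature_a": "X", "feature_b": "Y"}
file_overrides = {"feature_a": "Z"}
message_overrides: FeatureSet = {}
field_overrides = {"feature_b": "W"}

resolved = resolve_features(
    defaults, [file_overrides, message_overrides], field_overrides)
```

After resolution every feature has an explicit value, even though the field itself only set one of them.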
| |
| ## Glossary |
| |
| We will be discussing features **a lot** in this document, but the meaning |
| behind the word can vary in some subtle ways depending on context. Whenever it's |
| ambiguous, we will stick to qualifying these according to the following |
| definitions: |
| |
| * **Global features** - The features contained directly in `FeatureSet` as |
| fields. These apply to the protobuf language itself, rather than any |
| particular runtime or generator. |
| |
| * **Generator features** - Extensions of `FeatureSet` owned by a specific |
| runtime or generator. |
| |
| * **Feature resolution** - The process of applying the algorithm laid out in |
| [Protobuf Editions Design: Features](protobuf-editions-design-features.md). |
| This means that edition defaults, parent features, and overrides have all |
| been merged together. After resolution, every feature should have an |
| explicit value. |
| |
| * **Unresolved features** - The features a user has explicitly set on |
| their descriptors in the `.proto` file. These have not gone through |
| feature resolution and are a minimal representation that requires more |
| knowledge to be useful. |
| |
| * **Resolved features** - Features that have gone through feature |
| resolution, with defaults and inheritance applied. These are the only |
| feature sets that should be used to make decisions. |
| |
| * **Option Retention** - We support a retention specification on all options, |
| including features (see the |
| [option retention documentation](https://protobuf.dev/programming-guides/proto3#option-retention)). |
| |
| * **Source features** - The features available to protoc and generators, |
| before option retention has been applied. These can be either resolved |
| or unresolved. |
| |
| * **Runtime features** - The features available to runtimes after option |
| retention has been applied. These can be either resolved or unresolved. |
| |
| ## Problem Description |
| |
| The flaw that all of these design documents suffer from is that protoc **can't** |
| be the universal source-of-truth for feature resolution under the original |
| design. For global features, there's of course no issue (protoc has a |
| bootstrapping setup for `descriptor.proto` and always knows the global feature |
| set). For generator features though, we depend on [imports to make them |
| discoverable](protobuf-editions-design-features.md#specification-of-an-edition). |
| |
| If a user is actually overriding one of these features, there will necessarily |
| be an import and therefore protoc will be able to discover generator features |
| and handle resolution. However, if the user is ok with the edition defaults |
| there's no need for an import. Without the import, protoc has **no way of |
| knowing** that those generator features exist in general. We could hardcode the |
| ones we own, but that just pushes the problem off to third-party plugins. We |
| could also force proto owners to include imports for *every* (transitive) |
| language they generate code to, even if they're unused, but that would be very |
| disruptive and isn't practical or idiomatic. |
| |
| Pushing the source-of-truth to the generators makes things a little better, |
| since they each know exactly what feature file needs to be included. There's no |
| longer any knowledge gap, and we don't need to rely on imports to discover the |
| feature extension. Additionally, many of our generators are written in C++ (even |
| non-built-in plugins), so we could at least reuse our existing feature |
| resolution utility for all of those and limit the amount of duplication |
| necessary. However, there's still a code-size issue with this approach. As |
| described in the previous documents, we would need to send four feature sets for |
| **every** descriptor to the runtime (i.e. in the generator request and embedded |
| as a serialized string). We wouldn't be able to use inheritance or references to |
| minimize the cost, and every generator that embeds a `FileDescriptorProto` into |
| its gencode would see a massive code-size increase. |
| |
| There's also still the issue of descriptor pools that need to be able to build |
| descriptors at runtime. These are typically power users (and our own unit-tests) |
| doing very atypical things and bypassing protoc entirely. In previous documents |
| we've attempted to push some of the cost onto them by explicitly not giving them |
| feature resolution. They would have to specify every feature on every |
| descriptor, and would not be able to use edition defaults or inheritance. |
| However, this cost is fairly high and it also makes the `edition` field |
| meaningless. Any missing feature would be a runtime error, and there would be no |
| concept of "edition". This creates an inconsistent experience for developers, |
| where they think in terms of editions in one context and then throw it out in |
| another. Also, it would mean that we have two distinct ways of specifying a |
| `FileDescriptorProto`: with unresolved features meant to only go through |
| protoc, and with fully resolved features meant to always bypass protoc. |
| Round-tripping descriptors would become difficult or impossible. |
| |
| The following image attempts to illustrate the issue: |
| |
|  |
| |
| Here, a proto file is used in both A and B runtimes. The schema itself only |
| overrides features for A though, and doesn't declare an import on B's features. |
| This means that protoc doesn't know about B's features, and Generator B will |
| need to resolve them. Additionally, dynamic messages in both A and B runtimes |
| have issues because they've bypassed protoc and don't have any way to follow the |
| feature resolution spec. |
| |
| ### Requirements |
| |
| The following minimal feature sets are required by protoc: |
| |
| * **Resolved global source features** - to make proto-level decisions |
| * **Unresolved global source features** - for validation |
| |
| For each generator: |
| |
| * **Resolved generator source features** - to make language-specific codegen |
| decisions |
| * **Unresolved generator source features** - for validation |
| * **Resolved global source features** - to make more complex decisions |
| |
| For each runtime: |
| |
| * **All resolved runtime features** - for making runtime decisions |
| * **All unresolved runtime features** - for round-trip behavior and debugging |
| |
| With some additional requirements on an ideal solution: |
| |
| * **Minimal code-size costs** - code size bloat can easily block the rollout |
| of editions, and once those limits are hit we don't have great solutions |
| |
| * **Minimal performance costs** - we want a solution that avoids any |
| unnecessary CPU or RAM regressions |
| |
| * **Minimal code duplication** - obviously we want to minimize this, but where |
| we can't, we need a suitable test strategy to keep the duplication in sync |
| |
| * **Runtime support for dynamic messages** - while dynamic messages are a |
| less-frequently-used feature, they are a critical feature used by a lot of |
| important systems. Our solution should avoid making them harder to use in |
| any runtime that supports them. |
| |
| ## Recommended Solution |
| |
| Our long-term recommendation here is to support and use feature resolution in |
| every stage in the life of a FeatureSet. Every runtime, every generator, and |
| protoc itself will handle feature resolution independently, sharing only |
| unresolved features with each other. This will necessarily mean duplication across |
| nearly every language we support, and the following sections will go into detail |
| about strategies for managing this. |
| |
| The main justification for this duplication is the simple fact that *edition |
| defaults* will be needed almost everywhere. The generators need defaults for |
| *their* features to get fully resolved generator features to make decisions on, |
| and can't get them from protoc in every case. The runtimes need defaults for |
| both global and generator features in order to honor editions in dynamic |
| messages and to keep RAM costs down (e.g. the absence of feature overrides |
| should result in a reference to some shared default object). Since the |
| calculation of edition defaults is by far the most complicated piece of feature |
| resolution, with the remainder just being proto merges, it makes everything |
| simpler to understand if we just duplicate the entire algorithm. |
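
The RAM argument above (no overrides should mean a reference to a shared default object) can be sketched as follows. This is a hedged illustration only: `resolve_child`, `EDITION_DEFAULTS`, and the feature names are hypothetical, and read-only mappings stand in for resolved `FeatureSet` objects.

```python
from types import MappingProxyType
from typing import Mapping


def resolve_child(parent_resolved: Mapping[str, str],
                  overrides: Mapping[str, str]) -> Mapping[str, str]:
    """Return the resolved features of a child descriptor.

    When the child sets no overrides, the parent's resolved object is
    returned as-is, so unmodified descriptors share a single instance."""
    if not overrides:
        return parent_resolved
    merged = dict(parent_resolved)
    merged.update(overrides)
    return MappingProxyType(merged)  # read-only, so it is safe to share


# Hypothetical edition defaults; a runtime would hold one of these per edition.
EDITION_DEFAULTS = MappingProxyType({"feature_a": "X", "feature_b": "Y"})

file_features = resolve_child(EDITION_DEFAULTS, {})
message_features = resolve_child(file_features, {})
field_features = resolve_child(message_features, {"feature_a": "Z"})
```

Here the file and message reuse the shared default object, and only the field that actually overrides a feature pays for a new one.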
| |
| #### Pros |
| |
| * Resolved feature sets will never be publicly exposed |
| |
| * Our APIs will be significantly simpler, cutting the number of different |
| types of feature sets by a factor of 2 |
| |
| * There will be no ambiguity about what a `FeatureSet` object *means*. It |
| will always either be unresolved (outside of protobuf code) or fully |
| resolved on all accessible features (inside protobuf code). |
| |
| * RAM and code-size costs will be minimal, since we'll only be storing and |
| propagating the minimal amount of information (unresolved features) |
| |
| * Combats Hyrum's law by allowing us to provide wrappers around resolved |
| features everywhere, instead of letting people depend on them directly |
| |
| * **Minimal** duplication on top of what's already necessary (edition |
| defaults). |
| |
| * Dynamic messages will be treated on equal footing to proto files |
| |
| * The necessary feature dependencies will always be available in the |
| appropriate context |
| |
| * We can simplify the current implementation since protoc won't need to handle |
| resolution of imported features. |
| |
| #### Cons |
| |
| * Requires duplication of feature resolution in every runtime and every unique |
| generator language |
| |
| * This means building out additional infrastructure to enforce |
| cross-language conformance |
| |
| ### Runtimes Without Reflection |
| |
| There are various runtimes that do not support reflection or dynamic messages at |
| all (e.g. Java lite, ObjC). They typically embed the "feature-like" information |
| they need directly into custom objects in the gencode. In these cases, the |
| problem becomes a lot simpler because they *don't need* the full FeatureSet |
| objects. We **don't** need to duplicate feature resolution in the runtime, and |
| the generator can just directly embed the fully resolved feature values needed |
| by the runtime (of course, the generator might still need duplicate logic to get |
| those). |
| |
| ### Staged Rollout for Dynamic Messages |
| |
| Long-term, we want to be able to handle feature resolution at run-time for any |
| runtime that supports reflection (and therefore needs FeatureSet objects) to |
| reduce code-size/RAM costs and support dynamic messages. However, in any |
| language where these costs are less critical, a staged rollout could be |
| appropriate. Here, the generator would embed the serialized resolved source |
| features into the gencode along with the rest of the options. We would use the |
| `raw_features` field (which should eventually be deleted) to also include the |
| unresolved features for reflection. |
| |
| This would allow us to implement and test editions, and unblock the migration of |
| all non-dynamic cases. A follow-up optimization at a later stage could push this |
| down to the runtime, and only embed unresolved features in the gencode. |
| |
| Under this scenario, dynamic messages could still allow editions, as long as |
| fully-resolved features were provided on every descriptor. When we do implement |
| feature resolution, it will just be a matter of deleting redundant/unnecessary |
| features, but there should always be a valid transformation from fully-resolved |
| features to unresolved ones. |
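
One way to picture the transformation from fully-resolved features back to unresolved ones is sketched below. It assumes that a feature is redundant exactly when it matches the parent's resolved value; the helper names `strip_redundant` and `merge` are hypothetical, and dictionaries stand in for `FeatureSet` messages.

```python
from typing import Dict

# Hypothetical stand-in for a FeatureSet message: feature name -> value.
FeatureSet = Dict[str, str]


def strip_redundant(resolved: FeatureSet, parent_resolved: FeatureSet) -> FeatureSet:
    """Keep only the features whose value differs from the parent's resolved value."""
    return {name: value for name, value in resolved.items()
            if parent_resolved.get(name) != value}


def merge(parent_resolved: FeatureSet, unresolved: FeatureSet) -> FeatureSet:
    """Re-resolve a descriptor by merging its unresolved features onto its parent."""
    merged = dict(parent_resolved)
    merged.update(unresolved)
    return merged


# Illustrative values only.
parent = {"feature_a": "X", "feature_b": "Y"}
child_resolved = {"feature_a": "X", "feature_b": "Z"}

child_unresolved = strip_redundant(child_resolved, parent)
round_tripped = merge(parent, child_unresolved)
```

Under that assumption, re-resolving the stripped features against the same parent reproduces the original fully-resolved set.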
| |
| ### C++ Generators |
| |
| Generators written in C++ are in a better position since they don't require any |
| code duplication. They could be given visibility to our existing feature |
| resolution utility to resolve the features themselves. However, a better |
| alternative is to make improvements to this utility so that some helpers like |
| the ones we proposed in *Exposing Editions Feature Sets* can be used to access |
| the resolved features that *already exist*. |
| |
| Protoc works by first parsing the input protofiles and building them into a |
| descriptor pool. This is the frontend pass, where only the global features are |
| needed. For built-in languages, the resulting descriptors are passed directly to |
| the generator for codegen. For plugins, they're serialized into descriptor |
| protos, rebuilt in a new descriptor pool (in the generator process), and then |
| sent to the generator code for codegen. In both of these cases, a |
| `DescriptorPool` build of the protos is done from a binary that *necessarily* |
| links in the relevant generator features. |
| |
| Today, we discover features in the pool which are imported by the protos being |
| built. This has the hole we mentioned above where non-imported features can't be |
| discovered. Instead, we will pivot to a more explicit strategy for discovering |
| features. By default, `DescriptorPool` will only resolve the global features and |
| the C++ features (since this is the C++ runtime). A new method will be added to |
| `DescriptorPool` that allows new feature sets to replace the C++ features for |
| feature resolution. Generators will register their features via a virtual method |
| in `CodeGenerator` and the generator's pool build will take those into account |
| during feature resolution. |
| |
| There are a few ways to actually define this registration, which we'll leave as |
| implementation details. Some examples that we're considering include: |
| |
| * Have the generator provide its own `DescriptorPool` containing the relevant |
| feature sets |
| * Have the generator provide a mapping of edition -> default `FeatureSet` |
| objects |
| |
| Expanding on previous designs, we will provide the following API to C++ |
| generators via the `CodeGenerator` class: |
| |
| They will have access to all the fully-resolved feature sets of any descriptor |
| for making codegen decisions, and they will have access to their own unresolved |
| generator features for validation. The `FileDescriptor::CopyTo` method will |
| continue to output unresolved runtime features, which will become unresolved |
| source features after option retention stripping (which generators should |
| already be doing), for embedding in the gencode for runtime use. |
| |
| #### Example |
| |
| As an example, let's look at some hypothetical language `lang` and how it would |
| introduce its own features. First, if it needs features at runtime it would |
| create a `lang_features.proto` file in its runtime directory and bootstrap the |
| gencode the same as it does for `descriptor.proto`. It would then *also* |
| bootstrap C++ gencode using a special C++-only build of protoc. This can be |
| illustrated in the following diagram: |
| |
|  |
| |
| This illustrates the bootstrapping setup for a built-in C++ generator. If |
| generator features weren't needed in the runtime, that red box would disappear. |
| If this were a separate plugin, the "plugin" box would simply be moved out of |
| `protoc` and `protoc` could also serve as `protoc_cpp`. |
| |
| If `lang` didn't need runtime features, we would simply put the features proto |
| in the `lang` generator and only generate C++ code (using the same bootstrapping |
| technique as above). |
| |
| After the generator registers `lang_features.proto` with the DescriptorPool, the |
| `FeatureSet` objects returned by `GetFeatures` will always have fully resolved |
| `lang` features. |
| |
| ### Non-C++ Generators |
| |
| As we've shown above, non-C++ generators are already in a situation where they'd |
| need to duplicate *some* of the feature resolution logic. With this solution, |
| they'd need to duplicate much more of it. The `GeneratorRequest` from protoc |
| will provide the full set of *unresolved* features, which they will need to |
| resolve and apply retention stripping to. |
| |
| **Note:** If we're able to implement bidirectional plugin communication, the |
| [Bidirectional Plugins](#bidirectional-plugins) alternative may be a simpler |
| solution for non-C++ generators that *don't* need features at runtime. Ones that |
| need it at runtime will need to reimplement feature resolution anyway, so it may |
| be less useful. |
| |
| One of the trickier pieces of the resolution logic is the calculation of edition |
| defaults, which requires a lot of reflection. One of the ideas mentioned above |
| in [C++ Generators](#c-generators) could actually be repurposed to avoid |
| duplication of this in non-C++ generators as well. The basic idea is that we |
| start by defining a proto: |
| |
| ``` |
| message EditionFeatureDefaults { |
|   message FeatureDefaults { |
|     string edition = 1; |
|     FeatureSet defaults = 2; |
|   } |
|   repeated FeatureDefaults defaults = 1; |
|   string minimum_edition = 2; |
|   string maximum_edition = 3; |
| } |
| ``` |
| |
| This can be filled from any feature set extension to provide a much more usable |
| specification of defaults. We can package a genrule that converts from feature |
| protos to a serialized `EditionFeatureDefaults` string, and embed this anywhere |
| we want. Both C++ and non-C++ generators/runtimes could embed this into their |
| code. Once this is known, feature resolution becomes a lot simpler. The hardest |
| part is creating a comparator for edition strings. After that, it's a simple |
| search for the lower bound in the defaults, followed by some proto merges. |
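
The lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions: edition strings are dot-separated numeric components (for example `2023` or `2023.1`), the defaults are sorted by edition, and "lower bound" is read as the last entry whose edition is not newer than the requested one. The Python names mirror the proto above but are otherwise hypothetical.

```python
import bisect
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical stand-in for a FeatureSet message: feature name -> value.
FeatureSet = Dict[str, str]


@dataclass
class EditionFeatureDefaults:
    defaults: List[Tuple[str, FeatureSet]]  # (edition, defaults), sorted by edition
    minimum_edition: str
    maximum_edition: str


def edition_key(edition: str) -> Tuple[int, ...]:
    """Comparator key. Assumes purely numeric, dot-separated components."""
    return tuple(int(part) for part in edition.split("."))


def find_defaults(table: EditionFeatureDefaults, edition: str) -> FeatureSet:
    """Return the defaults of the latest entry that is not newer than `edition`."""
    key = edition_key(edition)
    if not (edition_key(table.minimum_edition) <= key
            <= edition_key(table.maximum_edition)):
        raise ValueError(f"edition {edition} is outside the supported range")
    keys = [edition_key(entry_edition) for entry_edition, _ in table.defaults]
    index = bisect.bisect_right(keys, key) - 1
    if index < 0:
        raise ValueError(f"no defaults specified for edition {edition}")
    return table.defaults[index][1]


# Illustrative values only.
table = EditionFeatureDefaults(
    defaults=[("2023", {"feature_a": "X"}), ("2024", {"feature_a": "Y"})],
    minimum_edition="2023",
    maximum_edition="2025",
)
defaults_for_2023_1 = find_defaults(table, "2023.1")
```

Once the defaults for an edition are found, the remaining work is merging the parent and descriptor overrides on top of them.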
| |
| ### Bootstrapping |
| |
| One major complication we're likely to hit revolves around our bootstrapping of |
| `descriptor.proto`. In languages that have dynamic messages, one codegen |
| strategy is to embed the `FileDescriptorProto` of the file and then parse and |
| build it at the beginning of runtime. For `descriptor.proto` in particular, |
| handling options can be very challenging. For example, in Python, we |
| intentionally strip all options from this file and then assume that the options |
| descriptors always exist during build (in the presence of serialized options). |
| Since features *are* options, this poses a challenge that's likely to vary |
| language by language. |
| |
| We will likely need to special-case `descriptor.proto` in a number of ways. |
| Notably, this file will **never** have any generator feature overrides, since it |
| can't import those files. In every other case, we can safely assume that |
| generator features exist in a fully resolved feature set. But for |
| `descriptor.proto`, at least at the time it's first being built by the runtime, |
| this extension won't be present. We also can't figure out edition defaults at |
| that point since we don't have the generator features proto to reflect over. |
| |
| One possible solution would be to codegen extra information specifically for |
| this bootstrapped proto, similar to what we suggested in *Editions: Runtime |
| Feature Set Defaults* for edition defaults. That would allow the generator to |
| provide enough information to build `descriptor.proto` during runtime. As long |
| as these special cases are limited to `descriptor.proto` though, it can be left |
| to a more isolated language-specific discussion. |
| |
| ### Conformance Testing |
| |
| Code duplication means that we need a test strategy for making sure everyone |
| stays conformant. We will need to implement a conformance testing framework for |
| validating that all the different implementations of feature resolution agree. |
| Our current conformance tests provide a good model for accomplishing this, even |
| though they don't quite fit the problem (they're designed for |
| parsing/serialization). There's a runner binary that can be hooked up to another |
| binary built in any language. It sends a `ConformanceRequest` proto with a |
| serialized payload and set of instructions, and then receives a |
| `ConformanceResponse` with the result. In the runner, we just loop over a number |
| of fixed test suites to validate that the supplied binary is conformant. |
| |
| We would want a similar setup here for language-agnostic testing. While we could |
| write a highly focused framework just for feature resolution, a more general |
| approach may set us up better in the future (e.g. option retention isn't |
| duplicated now but could have been implemented that way). This will allow us to |
| test any kind of transformation to descriptor protos, such as: proto3_optional, |
| group/DELIMITED, required/LEGACY_REQUIRED. The following request/response protos |
| describe the API: |
| |
| ``` |
| message DescriptorConformanceRequest { |
|   // The file under test, pre-transformation. |
|   FileDescriptorProto file = 1; |
| |
|   // The pool of dependencies and feature files required for build. |
|   FileDescriptorSet dependencies = 2; |
| } |
| |
| message DescriptorConformanceResponse { |
|   // The transformed file. |
|   FileDescriptorProto file = 1; |
| |
|   // Any additional features added during build. |
|   FileDescriptorSet added_features = 2; |
| } |
| ``` |
| |
| Each test point would construct a proto file, its dependencies, and any feature |
| files to include in feature resolution. The conformance binary would use this to |
| fully decorate the proto file with resolved features, and send the result back |
| for comparison against our C++ source-of-truth. Any generator features added by |
| the binary will also need to be sent back to get matching results. |
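
The comparison loop a runner might perform can be sketched as follows. This is a simplified, in-process illustration: the real setup would exchange the request and response protos with a separate binary, whereas here two callables stand in for the reference and the implementation under test. All names (`ConformancePoint`, `run_suite`, and the sample transforms) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for a FileDescriptorProto.
Descriptor = Dict[str, object]
# A transform takes the file under test and its dependencies and returns
# the transformed file.
Transform = Callable[[Descriptor, List[Descriptor]], Descriptor]


@dataclass
class ConformancePoint:
    name: str
    file: Descriptor
    dependencies: List[Descriptor] = field(default_factory=list)


def run_suite(points: List[ConformancePoint],
              reference: Transform,
              under_test: Transform) -> List[str]:
    """Return the names of the test points where the two implementations disagree."""
    failures = []
    for point in points:
        expected = reference(point.file, point.dependencies)
        actual = under_test(point.file, point.dependencies)
        if actual != expected:
            failures.append(point.name)
    return failures


# Illustrative transforms: the reference merges overrides onto defaults,
# while the buggy implementation ignores the overrides.
DEFAULTS = {"feature_a": "X"}


def reference_resolve(file: Descriptor, deps: List[Descriptor]) -> Descriptor:
    return {**file, "features": {**DEFAULTS, **file.get("features", {})}}


def buggy_resolve(file: Descriptor, deps: List[Descriptor]) -> Descriptor:
    return {**file, "features": dict(DEFAULTS)}


points = [
    ConformancePoint("no_overrides", {"name": "a.proto"}),
    ConformancePoint("with_override",
                     {"name": "b.proto", "features": {"feature_a": "Z"}}),
]
failures = run_suite(points, reference_resolve, buggy_resolve)
```

In this sketch only the test point with an override exposes the disagreement, which is why each suite should cover both default and overridden features.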
| |
| ### Documentation |
| |
| Because we're now asking third-party generator owners to handle feature |
| resolution on their own, we will need to document this. Specifically, we need to |
| open-source documentation for: |
| |
| * The algorithm described in |
| [Protobuf Editions Design: Features](protobuf-editions-design-features.md) |
| * The conformance test framework and how to use it (once it's implemented) |
| |
| On the other hand, we will have significantly less documentation to write about |
| which feature sets to use where. Descriptor protos will *always* contain |
| unresolved features, and C++ generators will have a simple API for getting the |
| fully-resolved features. |
| |
| ## Considered Alternatives |
| |
| ### Use Generated Pool for C++ Generators |
| |
| *Note: this was part of the original proposal, but has been refactored (see |
| cons)* |
| |
| Generators written in C++ are in a better position since they don't require any |
| code duplication. They could be given visibility to our existing feature |
| resolution utility to resolve the features themselves. However, a better |
| alternative is to make improvements to this utility so that some helpers like |
| the ones we proposed in *Exposing Editions Feature Sets* can be used to access |
| the resolved features that *already exist*. |
| |
| Protoc works by first parsing the input protofiles and building them into a |
| descriptor pool. This is the frontend pass, where only the global features are |
| needed. For built-in languages, the resulting descriptors are passed directly to |
| the generator for codegen. For plugins, they're serialized into descriptor |
| protos, rebuilt in a new descriptor pool (in the generator process), and then |
| sent to the generator code for codegen. In both of these cases, a |
| `DescriptorPool` build of the protos is done from a binary that *necessarily* |
| links in the relevant generator features. |
| |
| However, the FeatureSets we supply to generators are transformed to the |
| generated pool (i.e. `FeatureSet` objects rather than `Message`) where the |
| generator features will always exist. We've decided that there's no longer any |
| reason to scrape the imports for features, but we *could* scrape the generated |
| pool for them. This essentially means that when you call `MergeFeatures` to get |
| a `FeatureSet`, the returned set is fully resolved *with respect to the current |
| generated pool*. This is a much clearer contract, and has the benefit that the |
| features visible to every C++ generator would automatically be populated with |
| the correct generator features for them to use. |
| |
| Expanding on previous designs, we will provide the following API to C++ |
| generators via the `CodeGenerator` class: |
| |
| They will have access to all the fully-resolved feature sets of any descriptor |
| for making codegen decisions, and they will have access to their own unresolved |
| generator features for validation. The `FileDescriptor::CopyTo` method will |
| continue to output unresolved runtime features, which will become unresolved |
| source features after option retention stripping (which generators should |
| already be doing), for embedding in the gencode for runtime use. |
| |
| #### Pros |
| |
| * Automatic inclusion of any features used in a binary |
| * Features will never be partially resolved |
| |
| #### Cons |
| |
| * Implicit action at a distance could cause unexpected behaviors |
| * Uses globals, making testing awkward |
| * Not friendly to `DescriptorPool` cases that wouldn't necessarily want every |
| linked-in feature to go through feature resolution. |
| |
| ### Default Placeholders |
| |
| Protoc continues to propagate and resolve core features and imported |
| language-level features. For language-level features that protoc does not know |
| about (that is, not imported), it can propagate a core placeholder feature |
| indicating that the default for a given edition should be respected. |
| |
| ``` |
| message FeatureSet { |
|   optional string unknown_feature_edition_default = N; // e.g. 2023 |
| } |
| ``` |
| |
| Instead of duplicating the entire feature resolution algorithm, plugins only |
| need to provide a utility that maps editions to their default FeatureSet, built |
| from the generator feature files and optionally cached. |
| |
| For example: |
| |
| ``` |
| if features.hasUtf8Validation(): |
|   return features.getUtf8Validation() |
| else: |
|   default_features = getDefaultFeatures(features.getUnknownFeatureEditionDefault()) |
|   return default_features.getUtf8Validation() |
| ``` |
| |
| #### Pros |
| |
| * Less duplicate logic for propagating features |
| |
| #### Cons |
| |
| * Descriptor proto bloat that is technically redundant with |
| `FileDescriptorProto` edition. |
| * Confusing that some but not all features are fully resolved |
| * Duplicated logic to resolve edition defaults from the edition number |
| * Code-size and memory costs associated with the original approach still exist |
| * Still doesn't help with the descriptor pool case, which may require |
| duplicate logic. |
| |
| ### Bidirectional Plugins |
| |
| Since the generators know the features they care about, we could have some kind |
| of bidirectional communication between protoc and the plugins. The plugin would |
| start by telling protoc the features it wants added, and then protoc would be |
| able to fully resolve all feature sets before sending them off. This has the |
| added benefit that it would allow us to do more interesting enhancements in the |
| future. For example, the plugin could send its minimum required edition and |
| other requirements *before* actually starting the build. |
| |
| **Note:** Bidirectional plugins could still be implemented for other purposes. |
| This "alternative" is specifically for *using* that communication to pass |
| missing feature specs. |
| |
| #### Pros |
| |
| * Eliminates code duplication problem |
| * Provides infrastructure to enable future enhancements |
| |
| #### Cons |
| |
| * Doesn't address the confusing API we have now where it's unclear what kind |
| of features are contained in the `features` field |
| * Doesn't address the code-size and memory costs during runtime |
| * Doesn't address the descriptor pool case |
| |
| ### Central Feature Registry |
| |
| Instead of relying on generators and imports to supply feature specs, we could |
| pivot to a central registry of all known features. Instead of simply claiming an |
| extension number, generator owners could be required to submit all the feature |
| protos to a central repository of feature protos. This would give protoc access |
| to **all** features. There would be two ways to implement this: |
| |
| * If it were built *into* protoc, we could avoid requiring any import |
| statements. We would probably still want an extension point to avoid adding |
| a dependency to `descriptor.proto`, but instead of `features.(pb.cpp)` they |
| would be something more like `features.(pb).cpp`. |
| |
| * We could keep the current extension and import scheme. Proto files would |
| still need to import the features they override, but protoc would depend on |
| all of them and populate defaults for unspecified ones. |
| |
| #### Pros |
| |
| * Makes all features easily discoverable wherever they're needed |
| * Eliminates the code duplication problem |
| * Gives us an option to remove the import statements, which are likely to |
| cause future headaches (in the edition zero LSC, in maintenance afterward, |
| and also for proto files that need to support a lot of third-party |
| runtimes). |
| |
| #### Cons |
| |
| * Doesn't address the code-size and memory costs |
| * Creates version skew problems |
| * Confusing ownership semantics |
| |
| ### Do Nothing |
| |
| Doing nothing would basically mean abandoning editions. The current design |
| doesn't (and can't) work for third-party generators. They'd be left to duplicate |
| the logic themselves with no guidance or support from us. We would also see |
| code-size and RAM bloat (except in C++) that would be very difficult to resolve. |
| |
| #### Pros |
| |
| * Less work |
| |
| #### Cons |
| |
| * Worse in every other way |