docs/design/editions/life-of-an-edition.md - third_party/github/protocolbuffers/protobuf - Git at Google

 # Life of an Edition

 **Author:** [@mcy](https://github.com/mcy)

 How to use Protobuf Editions to construct a large-scale change that modifies the
 semantics of Protobuf in some way.

 ## Overview

 This document describes how to use the Protobuf Editions mechanism (both
 editions, themselves, and [features](protobuf-editions-design-features.md)) for
 designing migrations and large-scale changes intended to solve a particular kind
 of defect in the language.

 This document describes:

 *   How features are added to the language.
 *   How editions are defined and "proclaimed".
 *   How to build different kinds of large-scale changes.
 *   Tooling in `protoc` to support large-scale changes.
 *   An OSS strategy.

 ## Defining Features

 There are two kinds of features:

 *   Global features, which are the fields of `proto.Features`. In this document,
     we refer to them as `features.<name>`, e.g. `features.enum`.
 *   Language-scoped features, which are defined in a typed extension field for
     that language. In this document, we refer to them as
     `features.(<lang>).name`, e.g. `features.(proto.cpp).legacy_string`.

 Global features require a `descriptor.h` change, and are relatively heavy
 weight, since defining one will also require providing helpers in `Descriptor`
 wrapper classes to avoid the need for users to resolve inheritance. Because they
 are not specific to a language, they need to be carefully, visibility
 documented.

 Language-scoped features require only a change in a backend's feature extension,
 which has a smaller blast radius (except in C++ and Java). Often these are
 relevant only for codegen and do not require reflective introspection.

 Adding a feature is never a breaking change.

 ### Feature Lifetime

 In general, features should have an *original default* and a *desired default*:
 features are intended to gradually flip from one value to another throughout the
 ecosystem as migrations progress. This is not always true, but this means most
 features will be bools or enums.

 Any migration that introduces a feature should plan to eventually deprecate and
 remove that feature from both our internal codebase and open source, generally
 with a multi-year horizon. Features are *transient*.

 Removing a feature is a breaking change, but it does not need to be tied to an
 edition. Feature removal in OSS must thus be batched into a breaking release.
 Deletion of a feature should generally be announced to OSS a year in advance.

 ### Do's and Don'ts

 Here are some things that we could use features for, very broadly:

 *   Changing the generated API of any syntax production (name, behavior,
     signature, whether it is generated at all). E.g.
     `features.(proto.cpp).legacy_string`.
 *   Changing the serialization encoding of a field (so long as it does not break
     readers). E.g., `features.packed`, eventually `features.group_encoding`.
 *   Changing the deserialization semantics of a field. E.g., `features.enum`,
     `features.utf8`.

 Although almost any semantic change can be feature-controlled, some things would
 be a bit tricky to use a feature for:

 *   Changing syntax. If we introduce a new syntax production, gating it doesn't
     do people much good and is just noise. We should avoid changing how things
     are spelled. In Protobuf's history, it has been incredibly rare that we have
     needed to do this.
 *   Shape of a descriptor. Features should generally not cause fields, message,
     or enum descriptors to appear or disappear.
 *   Names and field numbers. Features should not change the names or field
     numbers of syntax entities as seen in a descriptor. This is separate from
     using features to change generated API names.
 *   Changing the wire encoding in an incompatible way. Using features to change
     the wire format has some long horizons and caveats described below.

 ## Proclaiming an Edition

 An *edition* is a set of default values for all features that `protoc`'s
 frontend, and its backends, understand. Edition numbers are announced by
 protobuf-team, but not necessarily defined by us. `protoc` only defines the
 edition defaults for global features, and each backend defines the edition
 defaults for its features.

 ### Total Ordering of Editions

 The `FileDescriptorProto.edition` field is a string, so that we can avoid nasty
 surprises around needing to mint multiple editions per year: even if we mint
 `edition = "2022";`, we can mint `edition = "2022.1";` in a pinch.

 However, protobuf-team does not define editions, it only proclaims them.
 Third-party backends are responsible for changing defaults across editions. To
 minimize the amount of synchronization, we introduce a *total order* on
 editions.

 This means that a backend can pick the default not by looking at the edition,
 but by asking "is this proto older than this edition, where I introduced this
 default?"

 The total order is thus: the edition string is split on `'.'`. Each component is
 then ordered by `a.len < b.len && a < b`. This ensures that `9 < 10`, for
 example.

 By convention, we will make the edition be either the year, like `2022`, or the
 year followed by a revision, like `2022.1`. Thus, we have the following total
 ordering on editions:

 ```
 2022 < 2022.0 < 2022.1 < ... < 2022.9 < 2022.10 < ... < 2023 < ... < 2024 < ...
 ```

 (**Note:** The above edition ordering is updated in
 [Edition Naming](edition-naming.md).)

 Thus, if an imaginary Haskell backend defines a feature
 `feature.(haskell).more_monads`, which becomes true in 2023, the backend can ask
 `file.EditionIsLaterThan("2023")`. If it becomes false in 2023.1, a future
 version would ask `file.EditionIsBetween("2023", "2023.1")`.

 This means that backends only need to change when they make a change to
 defaults. However, backends cannot add things to editions willy-nilly. A backend
 can only start observing an edition after protobuf-team proclaims the next
 edition number, and may not use edition numbers we do not proclaim.

 ### Proclamation

 "Proclamation" is done via a two-step process: first, we announce an upcoming
 edition some months ahead of time to OSS, and give an approximate date on which
 we plan to release a non-breaking version that causes protoc to accept the new
 edition. Around the time of that release, backends should make a release adding
 support for that edition, if they want to change a default. It is a faux-pas,
 but ultimately has no enforcement mechanism, for the meaning of an edition to
 change long (> 1 month) after it has been released.

 We promise to proclaim an edition once per calendar year, even if first-party
 backends will not use it. In the event of an emergency (whatever that means), we
 can proclaim a `Y.1`, `Y.2`, and so on. Because of the total order, only
 backends that desperately need a new edition need to pay attention to the
 announcement. As we gain experience, we should define guidelines for third
 parties to request an unscheduled edition bump, but for the time being we will
 deal with things case-by-case.

 We may want to have a canonical way for finding out what the latest edition is.
 It should be included in large print on our landing page, and `protoc
 --latest-edition` should print the newest edition known to `protoc`. The intent
 is for tooling that wants to generate `.proto` templates externally can choose
 to use the latest edition for new messages.

 ## Large-scale Change Templates

 The following are sketches of large-scale change designs for feature changes we
 would like to execute, presented as example use-cases.

 ### Large-scale Changes with No Functional Changes: Edition Zero

 We need to get the ecosystem into the `"editions"` syntax. This migration is
 probably unique because we are not changing any behavior, just the spelling of a
 bunch of things.

 We also need to track down and upgrade (by hand) any code that is using the
 value of `syntax`. This will likely be a manual large-scale change performed
 either by Busy Beavers or a handful of protobuf-team members furnished with
 appropriate stimulants (coffee, diet mountain dew, etc). Once we have migrated
 95% of callers of `syntax`, we will mark all accessors of that field in various
 languages as deprecated.

 Because the value of `syntax` becomes unreliable at this point, this will be a
 breaking change.

 Next, we will introduce the features defined in
 [Edition Zero Features](edition-zero-features.md). We will then implement
 tooling that can take a `proto2` or `proto3` file and add `edition = "2023";`
 and `option features.* = ...;` as appropriate, so that each file retains its
 original behavior.

 This second large-scale change can be fully automated, and does not require
 breaking changes.

 ### Large-scale Changes with Features Only: Immolation of `required`

 We can use features to move fields off of `features.field_presence =
 LEGACY_REQUIRED` (the edition’s spelling of `required`) and onto
 `features.field_presence = EXPLICIT_PRESENCE`.

 To do this, we introduce a new value for `features.field_presence`,
 `ALWAYS_SERIALIZE`, which behaves like `EXPLICIT_PRESENCE`, but, if the has-bit
 is not set, the default is serialized. (This is sort of like a cross between
 `required` and `proto3` no-label.)

 It is always safe to turn a proto from `LEGACY_REQUIRED` to `ALWAYS_SERIALIZE`,
 because `required` is a constraint on initialization checking, i.e., that the
 value was present. This means the only requirement is that old readers not
 break, which is accomplished by always providing *a* value. Because `required`
 fields don't set the value anyways, this is not a behavioral change, but it now
 permits writers to veer off of actually setting the value.

 After an appropriate build horizon, we can assume that all readers are tolerant
 of a potentially missing value (even though no writer would actually be omitting
 it). At this point we can migrate from `ALWAYS_SERIALIZE` to
 `EXPLICIT_PRESENCE`. If a reader does not see a record for the field, attempting
 to access it will produce the default value; it is not likely that callers are
 actually checking for presence of `required` fields, even though that is
 technically a thing you can do.

 Once all required fields have gone through both steps, `LEGACY_REQUIRED` and
 `ALWAYS_SERIALIZE` can be removed as variants (breaking change).

 ### Large-scale Changes with Editions: absl::string_view Accessors

 In C++, a `string` or `bytes` typed field has accessors that produce `const
 std::string&`s. The missed optimizations of doing this are well-understood, so
 we won't rehash that discussion.

 We would like to migrate all of them to return `absl::string_view`, a-la
 `ctype = STRING_PIECE`.

 To do this, we introduce `features.(proto.cpp).legacy_string`[^1], a boolean
 feature by default true. When false on a field of appropriate type, it does the
 needful and causes accessors to become representationally opaque.

 The feature can be set at file or field scope; tooling (see below) can be used
 to minimize the diff impact of these changes. Changing a field may also require
 changing code that was previously assuming they could write `std::string x =
 proto.string_field();`. This has the usual "unspooling string" migration
 caveats.

 Once we have applied 95% of internal changes, we will upgrade the C++ backend at
 the next edition to default `legacy_string` to false in the new edition. Tooling
 (again, below) can be used to automatically delete explicit settings of the
 feature throughout our internal codebase, as a second large-scale change. This
 can happen in parallel to closing the loop on the last 5% of required internal
 changes.

 Once we have eliminated all the legacy accessors, we will remove the feature
 (breaking change).

 ### Large-scale Changes with Wire Format Break: Group-Encoded Messages

 It turns out that encoding and decoding groups (end-marker-delimited
 submessages) is cheaper than handling length-prefixed messages. There are
 likely CPU and RAM savings in switching messages to use the group encoding.
 Unfortunately, that would be a wire-breaking change, causing old readers to be
 unable to parse new messages.

 We can do what we did for `packed`. First, we modify parsers to accept message
 fields that are encoded as either groups or messages (i.e., `TYPE_MESSAGE` and
 `TYPE_GROUP` become synonyms in the deserializer). We will let this soak for
 three years[^2] and bide our time.

 After those three years, we can begin a large-scale change to add
 `features.group_encoded` to message fields throughout our internal codebase
 (note that groups don't actually exist in editions; they are just messages with
 `features.group_encoded`). Because of our long waiting period, it is (hopefully)
 unlikely that old readers will be caught by surprise.

 Once we are 95% done, we will upgrade protoc to set `features.group_encoded` to
 true by default in new editions. Tooling can be used to clean up features as
 before.

 We will probably never completely eliminate length-prefixed messages, so this
 is a rare case where the feature lives on forever.

 ## Large-scale Change Tooling

 We will need a few different tools for minimizing migration toil, all of which
 will be released in OSS. These are:

 *   The features GC. Running `protoc --gc-features foo.proto` on a file in
     editions mode will compute the minimal (or a heuristically minimal, if this
     proves expensive) set of features to set on things, given the edition
     specified in the file. This will produce a Protochangifier `ProtoChangeSpec`
     that describes how to clean up the file.

 *   The editions "adopter". Running `protoc --upgrade-edition -I... file.proto`
     figure out how to update `file.proto` from `proto2` or `proto3` to the
     latest edition, adding features as necessary. It will emit this information
     as a `ProtoChangeSpec`, implicitly running features GC.

 *   The editions "upgrader". Running `protoc --upgrade-edition` as above on a
     file that is already in editions mode will bump it up to the latest edition
     known to `protoc` and add features as necessary. Again, this emits a
     features GC'd `ProtoChangeSpec`.

 This is by no means all the tooling we need, but it will simplify the work of
 robots and beavers, along with any bespoke, internal-codebase-specific tooling
 we build.

 ## The OSS Story

 We need to export our large-scale changes into open source to have any hope of
 editions not splitting the ecosystem. It is impossible to do this the way we do
 large-scale changes in our internal codebase, where we have global approvals and
 a finite but nonzero supply of bureaucratic sticks to motivate reluctant users.

 In OSS, we have neither of these things. The only stick we have is breaking
 changes, and the only carrots we can offer are new features. There is no "global
 approval" or "TAP" for OSS.

 Our strategy must be a mixture of:

 *   Convincing users this is a good thing that will help us make Protobuf easier
     to use, cheaper to deploy, and faster in production.
 *   Gently steering users to the new edition in new Protobuf definitions,
     through protoc diagnostics (when an old edition is going or has gone out of
     date) and developer tooling (editor integration, new-file-boilerplate
     templates).
 *   Convincing third-party backend vendors (such as Apple, for Swift) that they
     can leverage editions to fix mistakes. We should go out of our way to design
     attractive migrations for them to execute.
 *   Providing Google-class tooling for migrations. This includes the large-scale
     change tooling above, and, where possible, specialized tooling. When it is
     not possible to provide tooling, we should provide detailed migration guides
     that highlight the benefits.
 *   Being clear that we have a breaking changes policy and that we will
     regularly remove old features after a pre-announced horizon, locking new
     improvements behind completing migrations. This is a risky proposition,
     because users may react by digging in their heels. Comms planning is
     critical.

 The common theme is comms and making it clear that these are improvements
 everyone can benefit from, and that there is no "I" in "ecosystem": using
 Protobuf, just like using Abseil, means accepting upgrades as a fact of life,
 not something to be avoided.

 We should lean in on lessons learned by Go (see: their `go fix` tool) and Rust
 (see: their `rustfix` tool); Rust in particular has an editions/epoch mechanism
 like we do; they also have feature gates, but those are not the same concept as
 *our* features. We should also lean on the Carbon team's public messaging about
 upgrading being a fact of life, to provide a unified Google front on the matter
 from the view of observers.

 ### Prior Art: Rust Editions

 The design of [Protobuf Editions](what-are-protobuf-editions.md) is directly
 inspired by Rust's own
 [edition system](https://doc.rust-lang.org/edition-guide/editions/index.html)[^3].

 Rust defines and ships a new edition every three years, and focuses on changes
 to the surface language that do not inhibit interop: crates of different
 editions can always be linked together, and "edition" is a parallel ratchet to
 the language/compiler version.

 For example, keywords (like `async`) have been introduced using editions.
 Editions have also been used to change the semantics of the borrow checker to
 allow new programs, and to change name resolution rules to be more intuitive.
 For Rust, an edition may require changes to existing code to be able to compile
 again, but *only* at the point that the crate opts into the new edition, to
 obtain some benefit from doing so.

 Unlike Protobuf, Rust commits to supporting *all* past editions in perpetuity:
 there is no ratcheting forward of the whole ecosystem. However, Rust does ship
 with `rustfix` (runnable on Cargo projects via `cargo fix`), a tool that can
 upgrade crates to a new edition. Edition changes are *required* to come with a
 migration plan to enable `rustfix`.

 Crates therefore have limited pressure to upgrade to the latest edition. It
 provides better features, but because there is no EOL horizon, crates tend to
 stay on old editions to support old compilers. For users, this is a great story,
 and allows old code to work indefinitely. However, there is a maintenance burden
 on the compiler that old editions and new language features (mostly) work
 correctly together.

 In Rust, macros present a challenge: rich support for interpreted, declarative
 macros and compiled, fully procedural macros, mean that macros written for older
 editions may not work well in crates written on newer editions, or vice versa.
 There are mitigations for this in the compiler, but such fixes cannot be
 perfect, so this is a source of difficulties in getting total conversion.
 Protobuf does not have macros, but it does have rich descriptors that mirror
 input files, and this is a potential source of problems to watch out for.

 Overall, Rust's migration story is poor: they have accepted they need to support
 old editions indefinitely, but only produce an edition every three years.
 Protobuf plans to be much more aggressive, and we should study where Rust's
 leniency to old versions is unavoidable and where it is an explicit design
 choice.

 ## Notes

 [^1]: `ctype` has baggage and I am going to ignore it for the purposes of
     discussion. The feature is spelled `legacy_string` because adding string
     view accessors is not likely the only thing to do, given we probably want
     to change the mutators as well.
 [^2]: The correct size of the horizon is arbitrary, due to the "budget phones in
     India" problem. Realistically we would need to pick one, start the
     migration, and halt it if we encounter problems. It is quite difficult to
     do better than "hope" as our strategy, but `packed` is an existence proof
     that this is not insurmountable, merely very expensive.
 [^3]: Rust also has feature gates, used mostly so that people may start trying
     out experimental unstable features. These are largely orthogonal to
     editions, and tied to compiler versions. Rust's feature gates generally do
     not change the semantics of existing programs, they just cause new
     programs to be valid. When a feature is "stabilized", the feature flag is
     removed. Feature flags do not participate in Rust's stability promises.
	# Life of an Edition

	Author: [@mcy](https://github.com/mcy)

	How to use Protobuf Editions to construct a large-scale change that modifies the
	semantics of Protobuf in some way.

	## Overview

	This document describes how to use the Protobuf Editions mechanism (both
	editions, themselves, and [features](protobuf-editions-design-features.md)) for
	designing migrations and large-scale changes intended to solve a particular kind
	of defect in the language.

	This document describes:

	* How features are added to the language.
	* How editions are defined and "proclaimed".
	* How to build different kinds of large-scale changes.
	* Tooling in `protoc` to support large-scale changes.
	* An OSS strategy.

	## Defining Features

	There are two kinds of features:

	* Global features, which are the fields of `proto.Features`. In this document,
	we refer to them as `features.<name>`, e.g. `features.enum`.
	* Language-scoped features, which are defined in a typed extension field for
	that language. In this document, we refer to them as
	`features.(<lang>).name`, e.g. `features.(proto.cpp).legacy_string`.

	Global features require a `descriptor.h` change, and are relatively heavy
	weight, since defining one will also require providing helpers in `Descriptor`
	wrapper classes to avoid the need for users to resolve inheritance. Because they
	are not specific to a language, they need to be carefully, visibility
	documented.

	Language-scoped features require only a change in a backend's feature extension,
	which has a smaller blast radius (except in C++ and Java). Often these are
	relevant only for codegen and do not require reflective introspection.

	Adding a feature is never a breaking change.

	### Feature Lifetime

	In general, features should have an original default and a desired default:
	features are intended to gradually flip from one value to another throughout the
	ecosystem as migrations progress. This is not always true, but this means most
	features will be bools or enums.

	Any migration that introduces a feature should plan to eventually deprecate and
	remove that feature from both our internal codebase and open source, generally
	with a multi-year horizon. Features are transient.

	Removing a feature is a breaking change, but it does not need to be tied to an
	edition. Feature removal in OSS must thus be batched into a breaking release.
	Deletion of a feature should generally be announced to OSS a year in advance.

	### Do's and Don'ts

	Here are some things that we could use features for, very broadly:

	* Changing the generated API of any syntax production (name, behavior,
	signature, whether it is generated at all). E.g.
	`features.(proto.cpp).legacy_string`.
	* Changing the serialization encoding of a field (so long as it does not break
	readers). E.g., `features.packed`, eventually `features.group_encoding`.
	* Changing the deserialization semantics of a field. E.g., `features.enum`,
	`features.utf8`.

	Although almost any semantic change can be feature-controlled, some things would
	be a bit tricky to use a feature for:

	* Changing syntax. If we introduce a new syntax production, gating it doesn't
	do people much good and is just noise. We should avoid changing how things
	are spelled. In Protobuf's history, it has been incredibly rare that we have
	needed to do this.
	* Shape of a descriptor. Features should generally not cause fields, message,
	or enum descriptors to appear or disappear.
	* Names and field numbers. Features should not change the names or field
	numbers of syntax entities as seen in a descriptor. This is separate from
	using features to change generated API names.
	* Changing the wire encoding in an incompatible way. Using features to change
	the wire format has some long horizons and caveats described below.

	## Proclaiming an Edition

	An edition is a set of default values for all features that `protoc`'s
	frontend, and its backends, understand. Edition numbers are announced by
	protobuf-team, but not necessarily defined by us. `protoc` only defines the
	edition defaults for global features, and each backend defines the edition
	defaults for its features.

	### Total Ordering of Editions

	The `FileDescriptorProto.edition` field is a string, so that we can avoid nasty
	surprises around needing to mint multiple editions per year: even if we mint
	`edition = "2022";`, we can mint `edition = "2022.1";` in a pinch.

	However, protobuf-team does not define editions, it only proclaims them.
	Third-party backends are responsible for changing defaults across editions. To
	minimize the amount of synchronization, we introduce a total order on
	editions.

	This means that a backend can pick the default not by looking at the edition,
	but by asking "is this proto older than this edition, where I introduced this
	default?"

	The total order is thus: the edition string is split on `'.'`. Each component is
	then ordered by `a.len < b.len && a < b`. This ensures that `9 < 10`, for
	example.

	By convention, we will make the edition be either the year, like `2022`, or the
	year followed by a revision, like `2022.1`. Thus, we have the following total
	ordering on editions:

	```
	2022 < 2022.0 < 2022.1 < ... < 2022.9 < 2022.10 < ... < 2023 < ... < 2024 < ...
	```

	(Note: The above edition ordering is updated in
	[Edition Naming](edition-naming.md).)

	Thus, if an imaginary Haskell backend defines a feature
	`feature.(haskell).more_monads`, which becomes true in 2023, the backend can ask
	`file.EditionIsLaterThan("2023")`. If it becomes false in 2023.1, a future
	version would ask `file.EditionIsBetween("2023", "2023.1")`.

	This means that backends only need to change when they make a change to
	defaults. However, backends cannot add things to editions willy-nilly. A backend
	can only start observing an edition after protobuf-team proclaims the next
	edition number, and may not use edition numbers we do not proclaim.

	### Proclamation

	"Proclamation" is done via a two-step process: first, we announce an upcoming
	edition some months ahead of time to OSS, and give an approximate date on which
	we plan to release a non-breaking version that causes protoc to accept the new
	edition. Around the time of that release, backends should make a release adding
	support for that edition, if they want to change a default. It is a faux-pas,
	but ultimately has no enforcement mechanism, for the meaning of an edition to
	change long (> 1 month) after it has been released.

	We promise to proclaim an edition once per calendar year, even if first-party
	backends will not use it. In the event of an emergency (whatever that means), we
	can proclaim a `Y.1`, `Y.2`, and so on. Because of the total order, only
	backends that desperately need a new edition need to pay attention to the
	announcement. As we gain experience, we should define guidelines for third
	parties to request an unscheduled edition bump, but for the time being we will
	deal with things case-by-case.

	We may want to have a canonical way for finding out what the latest edition is.
	It should be included in large print on our landing page, and `protoc
	--latest-edition` should print the newest edition known to `protoc`. The intent
	is for tooling that wants to generate `.proto` templates externally can choose
	to use the latest edition for new messages.

	## Large-scale Change Templates

	The following are sketches of large-scale change designs for feature changes we
	would like to execute, presented as example use-cases.

	### Large-scale Changes with No Functional Changes: Edition Zero

	We need to get the ecosystem into the `"editions"` syntax. This migration is
	probably unique because we are not changing any behavior, just the spelling of a
	bunch of things.

	We also need to track down and upgrade (by hand) any code that is using the
	value of `syntax`. This will likely be a manual large-scale change performed
	either by Busy Beavers or a handful of protobuf-team members furnished with
	appropriate stimulants (coffee, diet mountain dew, etc). Once we have migrated
	95% of callers of `syntax`, we will mark all accessors of that field in various
	languages as deprecated.

	Because the value of `syntax` becomes unreliable at this point, this will be a
	breaking change.

	Next, we will introduce the features defined in
	[Edition Zero Features](edition-zero-features.md). We will then implement
	tooling that can take a `proto2` or `proto3` file and add `edition = "2023";`
	and `option features.* = ...;` as appropriate, so that each file retains its
	original behavior.

	This second large-scale change can be fully automated, and does not require
	breaking changes.

	### Large-scale Changes with Features Only: Immolation of `required`

	We can use features to move fields off of `features.field_presence =
	LEGACY_REQUIRED` (the edition’s spelling of `required`) and onto
	`features.field_presence = EXPLICIT_PRESENCE`.

	To do this, we introduce a new value for `features.field_presence`,
	`ALWAYS_SERIALIZE`, which behaves like `EXPLICIT_PRESENCE`, but, if the has-bit
	is not set, the default is serialized. (This is sort of like a cross between
	`required` and `proto3` no-label.)

	It is always safe to turn a proto from `LEGACY_REQUIRED` to `ALWAYS_SERIALIZE`,
	because `required` is a constraint on initialization checking, i.e., that the
	value was present. This means the only requirement is that old readers not
	break, which is accomplished by always providing a value. Because `required`
	fields don't set the value anyways, this is not a behavioral change, but it now
	permits writers to veer off of actually setting the value.

	After an appropriate build horizon, we can assume that all readers are tolerant
	of a potentially missing value (even though no writer would actually be omitting
	it). At this point we can migrate from `ALWAYS_SERIALIZE` to
	`EXPLICIT_PRESENCE`. If a reader does not see a record for the field, attempting
	to access it will produce the default value; it is not likely that callers are
	actually checking for presence of `required` fields, even though that is
	technically a thing you can do.

	Once all required fields have gone through both steps, `LEGACY_REQUIRED` and
	`ALWAYS_SERIALIZE` can be removed as variants (breaking change).

	### Large-scale Changes with Editions: absl::string_view Accessors

	In C++, a `string` or `bytes` typed field has accessors that produce `const
	std::string&`s. The missed optimizations of doing this are well-understood, so
	we won't rehash that discussion.

	We would like to migrate all of them to return `absl::string_view`, a-la
	`ctype = STRING_PIECE`.

	To do this, we introduce `features.(proto.cpp).legacy_string`[^1], a boolean
	feature by default true. When false on a field of appropriate type, it does the
	needful and causes accessors to become representationally opaque.

	The feature can be set at file or field scope; tooling (see below) can be used
	to minimize the diff impact of these changes. Changing a field may also require
	changing code that was previously assuming they could write `std::string x =
	proto.string_field();`. This has the usual "unspooling string" migration
	caveats.

	Once we have applied 95% of internal changes, we will upgrade the C++ backend at
	the next edition to default `legacy_string` to false in the new edition. Tooling
	(again, below) can be used to automatically delete explicit settings of the
	feature throughout our internal codebase, as a second large-scale change. This
	can happen in parallel to closing the loop on the last 5% of required internal
	changes.

	Once we have eliminated all the legacy accessors, we will remove the feature
	(breaking change).

	### Large-scale Changes with Wire Format Break: Group-Encoded Messages

	It turns out that encoding and decoding groups (end-marker-delimited
	submessages) is cheaper than handling length-prefixed messages. There are
	likely CPU and RAM savings in switching messages to use the group encoding.
	Unfortunately, that would be a wire-breaking change, causing old readers to be
	unable to parse new messages.

	We can do what we did for `packed`. First, we modify parsers to accept message
	fields that are encoded as either groups or messages (i.e., `TYPE_MESSAGE` and
	`TYPE_GROUP` become synonyms in the deserializer). We will let this soak for
	three years[^2] and bide our time.

	After those three years, we can begin a large-scale change to add
	`features.group_encoded` to message fields throughout our internal codebase
	(note that groups don't actually exist in editions; they are just messages with
	`features.group_encoded`). Because of our long waiting period, it is (hopefully)
	unlikely that old readers will be caught by surprise.

	Once we are 95% done, we will upgrade protoc to set `features.group_encoded` to
	true by default in new editions. Tooling can be used to clean up features as
	before.

	We will probably never completely eliminate length-prefixed messages, so this
	is a rare case where the feature lives on forever.

	## Large-scale Change Tooling

	We will need a few different tools for minimizing migration toil, all of which
	will be released in OSS. These are:

	* The features GC. Running `protoc --gc-features foo.proto` on a file in
	editions mode will compute the minimal (or a heuristically minimal, if this
	proves expensive) set of features to set on things, given the edition
	specified in the file. This will produce a Protochangifier `ProtoChangeSpec`
	that describes how to clean up the file.

	* The editions "adopter". Running `protoc --upgrade-edition -I... file.proto`
	figure out how to update `file.proto` from `proto2` or `proto3` to the
	latest edition, adding features as necessary. It will emit this information
	as a `ProtoChangeSpec`, implicitly running features GC.

	* The editions "upgrader". Running `protoc --upgrade-edition` as above on a
	file that is already in editions mode will bump it up to the latest edition
	known to `protoc` and add features as necessary. Again, this emits a
	features GC'd `ProtoChangeSpec`.

	This is by no means all the tooling we need, but it will simplify the work of
	robots and beavers, along with any bespoke, internal-codebase-specific tooling
	we build.

	## The OSS Story

	We need to export our large-scale changes into open source to have any hope of
	editions not splitting the ecosystem. It is impossible to do this the way we do
	large-scale changes in our internal codebase, where we have global approvals and
	a finite but nonzero supply of bureaucratic sticks to motivate reluctant users.

	In OSS, we have neither of these things. The only stick we have is breaking
	changes, and the only carrots we can offer are new features. There is no "global
	approval" or "TAP" for OSS.

	Our strategy must be a mixture of:

	* Convincing users this is a good thing that will help us make Protobuf easier
	to use, cheaper to deploy, and faster in production.
	* Gently steering users to the new edition in new Protobuf definitions,
	through protoc diagnostics (when an old edition is going or has gone out of
	date) and developer tooling (editor integration, new-file-boilerplate
	templates).
	* Convincing third-party backend vendors (such as Apple, for Swift) that they
	can leverage editions to fix mistakes. We should go out of our way to design
	attractive migrations for them to execute.
	* Providing Google-class tooling for migrations. This includes the large-scale
	change tooling above, and, where possible, specialized tooling. When it is
	not possible to provide tooling, we should provide detailed migration guides
	that highlight the benefits.
	* Being clear that we have a breaking changes policy and that we will
	regularly remove old features after a pre-announced horizon, locking new
	improvements behind completing migrations. This is a risky proposition,
	because users may react by digging in their heels. Comms planning is
	critical.

	The common theme is comms and making it clear that these are improvements
	everyone can benefit from, and that there is no "I" in "ecosystem": using
	Protobuf, just like using Abseil, means accepting upgrades as a fact of life,
	not something to be avoided.

	We should lean in on lessons learned by Go (see: their `go fix` tool) and Rust
	(see: their `rustfix` tool); Rust in particular has an editions/epoch mechanism
	like we do; they also have feature gates, but those are not the same concept as
	our features. We should also lean on the Carbon team's public messaging about
	upgrading being a fact of life, to provide a unified Google front on the matter
	from the view of observers.

	### Prior Art: Rust Editions

	The design of [Protobuf Editions](what-are-protobuf-editions.md) is directly
	inspired by Rust's own
	[edition system](https://doc.rust-lang.org/edition-guide/editions/index.html)[^3].

	Rust defines and ships a new edition every three years, and focuses on changes
	to the surface language that do not inhibit interop: crates of different
	editions can always be linked together, and "edition" is a parallel ratchet to
	the language/compiler version.

	For example, keywords (like `async`) have been introduced using editions.
	Editions have also been used to change the semantics of the borrow checker to
	allow new programs, and to change name resolution rules to be more intuitive.
	For Rust, an edition may require changes to existing code to be able to compile
	again, but only at the point that the crate opts into the new edition, to
	obtain some benefit from doing so.

	Unlike Protobuf, Rust commits to supporting all past editions in perpetuity:
	there is no ratcheting forward of the whole ecosystem. However, Rust does ship
	with `rustfix` (runnable on Cargo projects via `cargo fix`), a tool that can
	upgrade crates to a new edition. Edition changes are required to come with a
	migration plan to enable `rustfix`.

	Crates therefore have limited pressure to upgrade to the latest edition. It
	provides better features, but because there is no EOL horizon, crates tend to
	stay on old editions to support old compilers. For users, this is a great story,
	and allows old code to work indefinitely. However, there is a maintenance burden
	on the compiler that old editions and new language features (mostly) work
	correctly together.

	In Rust, macros present a challenge: rich support for interpreted, declarative
	macros and compiled, fully procedural macros, mean that macros written for older
	editions may not work well in crates written on newer editions, or vice versa.
	There are mitigations for this in the compiler, but such fixes cannot be
	perfect, so this is a source of difficulties in getting total conversion.
	Protobuf does not have macros, but it does have rich descriptors that mirror
	input files, and this is a potential source of problems to watch out for.

	Overall, Rust's migration story is poor: they have accepted they need to support
	old editions indefinitely, but only produce an edition every three years.
	Protobuf plans to be much more aggressive, and we should study where Rust's
	leniency to old versions is unavoidable and where it is an explicit design
	choice.

	## Notes

	[^1]: `ctype` has baggage and I am going to ignore it for the purposes of
	discussion. The feature is spelled `legacy_string` because adding string
	view accessors is not likely the only thing to do, given we probably want
	to change the mutators as well.
	[^2]: The correct size of the horizon is arbitrary, due to the "budget phones in
	India" problem. Realistically we would need to pick one, start the
	migration, and halt it if we encounter problems. It is quite difficult to
	do better than "hope" as our strategy, but `packed` is an existence proof
	that this is not insurmountable, merely very expensive.
	[^3]: Rust also has feature gates, used mostly so that people may start trying
	out experimental unstable features. These are largely orthogonal to
	editions, and tied to compiler versions. Rust's feature gates generally do
	not change the semantics of existing programs, they just cause new
	programs to be valid. When a feature is "stabilized", the feature flag is
	removed. Feature flags do not participate in Rust's stability promises.