Fix RepeatedField::MergeFrom self-merge heap pointer disclosure (#27607)
## Summary
`RepeatedField<T>::MergeFrom` has an unsafe self-merge path. When `MergeFrom(self)` is called on a field at SOO capacity (2 elements for `int32_t`), `Reserve()` triggers a SOO→heap transition. The `other_is_soo` flag captured before `Reserve()` becomes stale — `other.elements(stale_true)` returns `soo_data_[]`, which now holds the raw `HeapRep*` pointer bytes. `UninitializedCopyN` then copies those bytes as `int32_t` values, appending two elements containing the low/high 32 bits of the heap address.
`ABSL_DCHECK_NE(&other, this)` caught this in debug builds but was compiled out with `-DNDEBUG`, so release builds proceeded silently and produced corrupted contents / heap-address disclosure.
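To illustrate why the two appended elements amount to an address disclosure, here is a minimal sketch that reinterprets the bytes of a pointer as `int32_t` values, which is roughly what a raw copy out of storage that now holds a `HeapRep*` does. `PointerBytesAsInt32`, `RebuildPointer`, and `PointerWords` are hypothetical names written for this illustration; they are not part of protobuf, and the sketch assumes a pointer size that is a multiple of 4 bytes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Number of int32_t words needed to hold one pointer
// (2 on typical 64-bit targets, 1 on typical 32-bit targets).
constexpr std::size_t kWordsPerPointer = sizeof(void*) / sizeof(int32_t);
using PointerWords = std::array<int32_t, kWordsPerPointer>;

static_assert(sizeof(PointerWords) == sizeof(void*),
              "sketch assumes the words cover the pointer exactly");

// Hypothetical helper: copy the raw bytes of a pointer into int32_t slots,
// mimicking a byte-wise copy from storage that holds a pointer.
inline PointerWords PointerBytesAsInt32(const void* p) {
  PointerWords words{};
  std::memcpy(words.data(), &p, sizeof(p));
  return words;
}

// Hypothetical helper: reassemble the pointer from the copied words.
inline const void* RebuildPointer(const PointerWords& words) {
  const void* p = nullptr;
  std::memcpy(&p, words.data(), sizeof(p));
  return p;
}
```

In this toy model, anyone who can read all of the copied words can reassemble the original address, which is why the appended elements are more than just corrupted contents.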
## Fix
Self-merge is undefined behavior. Rather than silently treating it as a no-op, this change turns it into a well-defined termination: an out-of-line `internal::LogSelfMergeAndAbort()` that fires `ABSL_LOG(FATAL)` in all build modes.
```cpp
template <typename Element>
inline void RepeatedField<Element>::MergeFrom(const RepeatedField& other) {
  if (ABSL_PREDICT_FALSE(&other == this)) {
    PROTOBUF_NO_MERGE internal::LogSelfMergeAndAbort();
  }
  ...
```
The abort helper is declared out-of-line (in `repeated_field.cc`) so the failure path does not pull `ABSL_LOG` streaming support into every inlined `MergeFrom` instantiation. The previously-existing `ABSL_DCHECK_NE(&other, this)` is now dead code (the self-reference branch terminates before reaching it) and has been removed.
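The pattern described above (an always-on self-reference check that calls a separate, non-inlined abort helper) can be sketched without Abseil or protobuf. Everything here is a stand-in under stated assumptions: `ToyRepeated` and `ToyLogSelfMergeAndAbort` are hypothetical names, the message text is invented, and `std::abort()` takes the place of `ABSL_LOG(FATAL)`. In a real project the helper's definition would live in a `.cc` file rather than next to the class.

```cpp
#include <cstdio>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for internal::LogSelfMergeAndAbort().
// Keeping the logging in one function keeps it out of every MergeFrom call site.
[[noreturn]] void ToyLogSelfMergeAndAbort() {
  std::fputs("MergeFrom called with self-reference\n", stderr);
  std::abort();
}

// Hypothetical toy container; not the real RepeatedField.
class ToyRepeated {
 public:
  void Add(int value) { data_.push_back(value); }
  int size() const { return static_cast<int>(data_.size()); }
  int Get(int index) const { return data_[static_cast<std::size_t>(index)]; }

  void MergeFrom(const ToyRepeated& other) {
    // Unlike assert(), this check is a plain `if`, so -DNDEBUG does not
    // remove it.
    if (&other == this) {
      ToyLogSelfMergeAndAbort();
    }
    data_.insert(data_.end(), other.data_.begin(), other.data_.end());
  }

 private:
  std::vector<int> data_;
};
```

Merging from a distinct object appends as usual; passing the object itself terminates the process in both debug and release builds of this sketch.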
## Test
`RepeatedField.MergeFromSelfFailsWithATermination` exercises the SOO-capacity self-merge — the case that previously appended heap-pointer bytes in release builds — and asserts it now terminates:
```cpp
RepeatedField<int32_t> field;
field.Add(1);
field.Add(2);
EXPECT_DEATH(field.MergeFrom(field), "self-reference");
```
Closes #27607
COPYBARA_INTEGRATE_REVIEW=https://github.com/protocolbuffers/protobuf/pull/27607 from adilburaksen:fix/repeated-field-self-merge cbe6a42d7ae2ed8474fd4226f1aca0d349e7c452
PiperOrigin-RevId: 927283856
Copyright 2008 Google LLC
Protocol Buffers (a.k.a., protobuf) are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data. You can learn more about it in protobuf's documentation.
This README file contains protobuf installation instructions. To install protobuf, you need to install the protocol compiler (used to compile .proto files) and the protobuf runtime for your chosen programming language.
Most users will find working from supported releases to be the easiest path.
If you choose to work from the head revision of the main branch, your build will occasionally be broken by source-incompatible changes and insufficiently tested (and therefore broken) behavior.
If you are using C++ or otherwise need to build protobuf from source as a part of your project, you should pin to a release commit on a release branch.
This is because even release branches can experience some instability in between release commits.
Protobuf supports Bzlmod with Bazel 8+. Users should specify a dependency on protobuf in their MODULE.bazel file as follows:
bazel_dep(name = "protobuf", version = <VERSION>)
Users can optionally override the repo name, such as for compatibility with WORKSPACE.
bazel_dep(name = "protobuf", version = <VERSION>, repo_name = "com_google_protobuf")
Users can also add the following to their legacy WORKSPACE file.
Note that with the release of 30.x there are a few more load statements to properly set up rules_java and rules_python.
http_archive(
    name = "com_google_protobuf",
    strip_prefix = "protobuf-VERSION",
    sha256 = ...,
    url = ...,
)
load("@com_google_protobuf//:protobuf_deps.bzl", "protobuf_deps")
protobuf_deps()
load("@rules_java//java:rules_java_deps.bzl", "rules_java_dependencies")
rules_java_dependencies()
load("@rules_java//java:repositories.bzl", "rules_java_toolchains")
rules_java_toolchains()
load("@rules_python//python:repositories.bzl", "py_repositories")
py_repositories()
The protobuf compiler is written in C++. If you are using C++, please follow the C++ Installation Instructions to install protoc along with the C++ runtime.
For non-C++ users, the simplest way to install the protocol compiler is to download a pre-built binary from our GitHub release page.
In the downloads section of each release, you can find pre-built binaries in zip packages: protoc-$VERSION-$PLATFORM.zip. Each package contains the protoc binary as well as a set of standard .proto files distributed along with protobuf.
If you are looking for an old version that is not available in the release page, check out the Maven repository.
These pre-built binaries are only provided for released versions. If you want to use the GitHub main version at HEAD, need to modify protobuf code, or are using C++, we recommend building your own protoc binary from source.
To build the protoc binary from source, see the C++ Installation Instructions.
Protobuf supports several different programming languages. For each programming language, you can find instructions in the corresponding source directory about how to install the protobuf runtime for that specific language:
| Language | Source |
|---|---|
| C++ (includes the C++ runtime and protoc) | src |
| Java | java |
| Python | python |
| Objective-C | objectivec |
| C# | csharp |
| Ruby | ruby |
| Go | protocolbuffers/protobuf-go |
| PHP | php |
| Dart | dart-lang/protobuf |
| JavaScript | protocolbuffers/protobuf-javascript |
The best way to learn how to use protobuf is to follow the tutorials in our developer guide.
If you want to learn from code examples, take a look at the examples in the examples directory.
The complete documentation is available at the Protocol Buffers doc site.
Read about our version support policy to stay current on support timeframes for the language libraries.
To be alerted to upcoming changes in Protocol Buffers and connect with protobuf developers and users, join the Google Group.