Optimize `StringWriter` and `ResizableWriter` to avoid resizing to the whole
capacity if it is much larger than currently needed.

Each resize grows the destination to at most twice its current size.
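
The growth policy described above can be sketched as follows. This is only an
illustration: `NextSize()` and its parameters are hypothetical names, not
Riegeli functions, and the sketch assumes that the size must still reach
whatever the pending write requires even when that exceeds twice the current
size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper, not part of Riegeli: picks the next size of a
// resizable destination.
//   current_size: size before growing
//   min_size:     smallest size which satisfies the pending write
//   capacity:     size reachable without reallocation
inline std::size_t NextSize(std::size_t current_size, std::size_t min_size,
                            std::size_t capacity) {
  assert(min_size > current_size);
  // Twice the current size, saturating instead of overflowing.
  const std::size_t max_size = std::numeric_limits<std::size_t>::max();
  const std::size_t doubled =
      current_size > max_size / 2 ? max_size : current_size * 2;
  // Use the spare capacity, but stop at twice the current size.
  const std::size_t capped = std::min(capacity, doubled);
  // Assumption: the pending write must fit regardless of the cap.
  return std::max(min_size, capped);
}
```

With a size of 100, a needed size of 110, and a capacity of 1000000, this
sketch grows to 200 rather than to 1000000.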

Change the traits protocol of `ResizableWriter`: replace `GrowToCapacity()`
with `GrowUnderCapacity()`. It takes the new size as a parameter, and indicates
whether growing without reallocation was possible.
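
A minimal sketch of what such a traits function might look like for
`std::string`. The struct name `SketchStringTraits` and the exact signature are
assumptions made for illustration; the real traits protocol may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical traits for std::string. The name and signature are
// assumptions, not the actual Riegeli protocol.
struct SketchStringTraits {
  // Grows dest to new_size only if that fits in the existing capacity.
  // Returns true on success. Returns false and leaves dest unchanged if
  // growing would require reallocation.
  static bool GrowUnderCapacity(std::string& dest, std::size_t new_size) {
    assert(new_size >= dest.size());
    if (new_size > dest.capacity()) return false;
    dest.resize(new_size);
    return true;
  }
};
```

A caller would fall back to a reallocating growth path when the function
returns `false`.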

Invariants of `StringWriter` are changed to match `ResizableWriter` more
closely: even if the secondary buffer is used, the string can have uninitialized
space appended.

A difference between the invariants of `StringWriter` and `ResizableWriter`
remains: `StringWriter` has functions which resize the string as a side effect
of appending a `Chain` (including the secondary buffer), `Cord`, `ExternalRef`,
or `ByteFill`. `ResizableWriter` does not have these combined functions, which
makes it simpler but less efficient; providing them would require the traits to
gain several new functions, one for each type being appended.

As before, if the initial string is not empty but `set_append(true)` is not
requested, then the original contents are reused as writable space without
being cleared first, which avoids filling that space with zeros. This means
that it is more efficient to let `StringWriter` clear the string than to clear
it explicitly before creating the `StringWriter`.
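
The effect can be illustrated with plain `std::string` operations. This is a
hypothetical sketch, not Riegeli code: `OverwriteReusingContents()` is an
invented name, and it only shows why keeping the old bytes as scratch space is
cheaper than `clear()` followed by `resize()`, which would write zeros that are
overwritten right away.

```cpp
#include <algorithm>
#include <string>
#include <string_view>

// Hypothetical illustration, not part of Riegeli: replaces the contents of
// dest with data, treating the old bytes as already available space.
inline void OverwriteReusingContents(std::string& dest,
                                     std::string_view data) {
  // Only the part beyond the old size has to be filled when growing. Bytes
  // which already exist are not zeroed before being overwritten.
  if (dest.size() < data.size()) dest.resize(data.size());
  std::copy(data.begin(), data.end(), dest.begin());
  // Drop leftover old bytes past the new contents, if any.
  dest.resize(data.size());
}
```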

Minor changes:

* Override `WriteSlow(absl::string_view)`. In the case of `StringWriter`,
  this can be more efficient than the default implementation in terms of
  `PushSlow()`, by resizing as a side effect of appending. In the case of
  `{String,Resizable}Writer`, this avoids forced resizing if data follow the
  current position: there is no need to make them available for partial
  overwriting, because the whole buffer will be overwritten immediately.

* Call `ResizableTraits::Grow()` only if the existing size is insufficient.
  Move the responsibility of that check from implementations of this function
  to their caller.

* During `ResizableWriter` initialization, set buffer pointers to existing
  contents eagerly, instead of leaving buffer pointers unset and waiting for
  an operation like `ResizableWriterGrowDestAndMakeBuffer()` to set them.
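
The change of responsibility for the `Grow()` size check can be sketched as
follows. `SketchTraits` and `EnsureSize()` are hypothetical names used only for
illustration, and the signature of `Grow()` shown here is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical traits: Grow() no longer checks whether growing is needed.
struct SketchTraits {
  static void Grow(std::string& dest, std::size_t new_size) {
    // Precondition now guaranteed by the caller.
    assert(new_size > dest.size());
    dest.resize(new_size);
  }
};

// Hypothetical caller: performs the check once, for all traits.
template <typename Traits>
void EnsureSize(std::string& dest, std::size_t needed_size) {
  if (dest.size() >= needed_size) return;  // Existing size is sufficient.
  Traits::Grow(dest, needed_size);
}
```

Keeping the check in the caller means each traits implementation can assume
that growing is actually required.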

PiperOrigin-RevId: 895723423
10 files changed
tree: 7e5cd85f85ae07f82d14860729c22833b8c42a33
README.md

Riegeli

Riegeli/records is a file format for storing a sequence of string records, typically serialized protocol buffers. It supports dense compression, fast decoding, seeking, detection and optional skipping of data corruption, filtering of proto message fields for even faster decoding, and parallel encoding.

See documentation.

Status

The Riegeli file format will only change in a backward compatible way (i.e. future readers will understand current files, but current readers might not understand files using future features).

The Riegeli C++ API might change in incompatible ways.