Remove `Reader::ReadOrPullSome()` along with the corresponding virtual functions:
`Reader::ReadOrPullSomeSlow()` and
`PullableReader::ReadOrPullSomeBehindScratch()`.
Instead, override `Reader::ReadSomeSlow(char*)` and
`Reader::CopySomeSlow(Writer&)`, which are now virtual.
Advantages:
* The overriding API is much more intuitive.
`ReadOrPullSome()` was weird in how it allowed the `Reader` to choose whether
to expose its data or to write to a destination buffer, how it allowed the
`Reader` to communicate a tighter `max_length` to the destination, and how it
allowed the destination to effectively override the choice if it needs the
data in a particular place.
The choice influences whether the destination should be asked for a buffer
at all, and how large, hence `ReadOrPullSome()` used a callback for this.
Now, `ReadSome(char*)` always copies to the destination buffer, and for
`CopySome(Writer&)` the `Reader` calls `Writer::Write(absl::string_view)`,
`Writer::Write(ExternalRef)`, or `Writer::Push()` as appropriate.
* In `ReadSome(std::string&)`, string allocation is done together with filling,
which will allow using `absl::StringResizeAndOverwrite()` to avoid prefilling
with zeros.
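A minimal sketch of what the new overriding API looks like from the point of
view of a subclass. It is a toy model, not the actual Riegeli classes:
`ToyReader`, `ChunkedReader`, and `ReadAll()` are hypothetical names, and the
signature `ReadSomeSlow(size_t, char*)` is an assumption based on the
description above. The only point illustrated is that the override always
copies into the destination buffer and returns how much it copied.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Toy stand-in for a reader base class. NOT the real riegeli::Reader.
class ToyReader {
 public:
  virtual ~ToyReader() = default;

  // Non-virtual entry point: reads between 1 and max_length bytes into dest.
  // Returns the number of bytes read, or 0 at the end of data.
  std::size_t ReadSome(std::size_t max_length, char* dest) {
    if (max_length == 0) return 0;
    return ReadSomeSlow(max_length, dest);
  }

 protected:
  // Single customization point for reading into a caller-provided buffer.
  // Precondition (assumed for this sketch): max_length > 0.
  virtual std::size_t ReadSomeSlow(std::size_t max_length, char* dest) = 0;
};

// Hypothetical source which returns data in short chunks, the way a pipe or
// a socket might.
class ChunkedReader : public ToyReader {
 public:
  ChunkedReader(std::string data, std::size_t chunk_size)
      : data_(std::move(data)), chunk_size_(chunk_size) {}

 protected:
  std::size_t ReadSomeSlow(std::size_t max_length, char* dest) override {
    assert(max_length > 0);
    // Copy at most one chunk, limited by the request and the remaining data.
    const std::size_t length =
        std::min({max_length, chunk_size_, data_.size() - pos_});
    std::memcpy(dest, data_.data() + pos_, length);
    pos_ += length;
    return length;
  }

 private:
  std::string data_;
  std::size_t chunk_size_;
  std::size_t pos_ = 0;
};

// Drains a reader by calling ReadSome() until it reports the end of data.
std::string ReadAll(ToyReader& reader) {
  std::string result;
  char buffer[8];
  while (const std::size_t n = reader.ReadSome(sizeof(buffer), buffer)) {
    result.append(buffer, n);
  }
  return result;
}
```

In this model the caller never needs to know whether the source could have
exposed its data in place, which is the simplification described above.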
Disadvantages:
* There are two functions to override instead of one, which leads to some
duplication of logic.
* `ReadSome(std::string&)` preallocates the whole `max_length`, instead of
allowing the `Reader` to communicate if the maximum needed is smaller.
A `Writer` is also hinted for the whole `max_length`. That optimization is
deemed not worth the API complication.
Other changes:
* Strengthen the preconditions of `ReadSomeSlow()` and `CopySomeSlow()` from
`available() < max_length` to `available() == 0`. Other cases are handled
by non-virtual `ReadSome()` and `CopySome()`, by delegating to `Read()` or
`Copy()` with `max_length` limited to `available()`.
This makes overrides simpler, at the cost of potentially losing optimizations
which write data from multiple flat buffers in one call.
* Clean up comments about which `Reader` functions are implemented in terms of
which ones. This includes all non-virtual functions instead of some of them.
This can be important to determine which functions can be called by overrides
to avoid infinite recursion.
* Move length overflow checks from `ReadSlow()` to `ReadAndAppend()`, and
remove private `ReadSlowWithSizeCheck()` functions. Relevant usages have
already been moved from `reader.h` to `reader.cc`, so there is no need to put
them in separate functions.
* Move `move_cursor()` calls above `dest.Write()` if this allows calling
`dest.Write()` in tail position.
* Cosmetic change in `riegeli::tensorflow::File{Reader,Writer}Base::OpenFile()`:
allow NRVO by always returning the same local variable.
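The strengthened precondition from the first item above can be sketched with
a toy model. This is not the actual Riegeli code: `ToyBufferedReader` and its
members are hypothetical, and the real classes manage their buffers
differently. The sketch only shows the division of work: the non-virtual
`ReadSome()` serves whatever is already buffered, limited to `available()`,
and the virtual `ReadSomeSlow()` runs only when `available() == 0`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Toy model of a buffered reader. NOT the real riegeli::Reader.
class ToyBufferedReader {
 public:
  explicit ToyBufferedReader(std::vector<std::string> chunks)
      : chunks_(std::move(chunks)) {}
  virtual ~ToyBufferedReader() = default;

  // Number of buffered bytes which have not been consumed yet.
  std::size_t available() const { return buffer_.size() - cursor_; }

  // Non-virtual: handles every case where some data is already buffered.
  std::size_t ReadSome(std::size_t max_length, char* dest) {
    if (max_length == 0) return 0;
    if (available() > 0) {
      // Fast path: serve buffered data, with the length limited to
      // available().
      return ReadFromBuffer(std::min(max_length, available()), dest);
    }
    return ReadSomeSlow(max_length, dest);
  }

 protected:
  // Virtual: called only with an empty buffer, so an override never has to
  // combine leftover buffered data with freshly read data.
  virtual std::size_t ReadSomeSlow(std::size_t max_length, char* dest) {
    assert(available() == 0);
    while (available() == 0) {
      if (next_chunk_ == chunks_.size()) return 0;  // End of data.
      buffer_ = chunks_[next_chunk_++];
      cursor_ = 0;
    }
    return ReadFromBuffer(std::min(max_length, available()), dest);
  }

 private:
  // Copies `length` buffered bytes to dest. Requires length <= available().
  std::size_t ReadFromBuffer(std::size_t length, char* dest) {
    std::memcpy(dest, buffer_.data() + cursor_, length);
    cursor_ += length;
    return length;
  }

  std::vector<std::string> chunks_;
  std::size_t next_chunk_ = 0;
  std::string buffer_;
  std::size_t cursor_ = 0;
};
```

The cost mentioned above is visible here: a request larger than `available()`
returns only the buffered remainder, and the caller has to call `ReadSome()`
again to reach the next chunk.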
PiperOrigin-RevId: 858222599
Riegeli/records is a file format for storing a sequence of string records, typically serialized protocol buffers. It supports dense compression, fast decoding, seeking, detection and optional skipping of data corruption, filtering of proto message fields for even faster decoding, and parallel encoding.
See documentation.
The Riegeli file format will only change in a backward compatible way (i.e. future readers will understand current files, but current readers might not understand files using future features).
The Riegeli C++ API might change in incompatible ways.