Move `FieldMap` from a private member class of `FieldHandlerMap` to a public
standalone class called `SmallIntMap`.

Its design is not specific to field handlers, and it does not actually depend
on the `Context...` template parameters of `FieldHandlerMap`.

Generalize it a bit by parameterizing it over:
* key type (`int` in `FieldHandlerMap`)
* expected minimum key (1 in `FieldHandlerMap`)
* maximum array capacity (128 in `FieldHandlerMap`)
* source to construct it from (`absl::flat_hash_map<int, Value>`
  in `FieldHandlerMap`)

`SmallIntMap` is a map optimized for small integer keys. It supports only
lookups, not incremental building or iteration.

It stores a part of the map covering some range of keys starting from
`expected_min_key` in an array.

At least `array_capacity` possible keys starting from `expected_min_key` are
suitable for array lookup. If all present keys are suitable, the stored array
can be smaller than `array_capacity`, covering the range to the largest key.
If the map is large, the stored array can be larger than `array_capacity`,
as long as it is at least 25% full.
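
The layout described above can be sketched as follows. This is a simplified
illustration, not the actual `SmallIntMap`: the class name `SketchSmallIntMap`,
the use of `std::unordered_map` as the source and overflow map, the
`std::optional` slots, and the exact sizing loop are all assumptions chosen to
satisfy the stated rules.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: an array for keys near `kExpectedMinKey`, plus a hash
// map for everything else. Lookup only, built once from a source map.
template <typename Value, int kExpectedMinKey = 1, size_t kArrayCapacity = 128>
class SketchSmallIntMap {
 public:
  explicit SketchSmallIntMap(const std::unordered_map<int, Value>& src) {
    // Offsets of keys that are not below the expected minimum, ascending.
    std::vector<uint64_t> offsets;
    for (const auto& entry : src) {
      if (entry.first >= kExpectedMinKey) offsets.push_back(Offset(entry.first));
    }
    std::sort(offsets.begin(), offsets.end());
    // Extend the array to cover a key if the key fits within the capacity,
    // or if the extended array would still be at least 25% full.
    size_t array_size = 0;
    for (size_t i = 0; i < offsets.size(); ++i) {
      const uint64_t needed = offsets[i] + 1;
      if (needed <= kArrayCapacity || (i + 1) * 4 >= needed) {
        array_size = static_cast<size_t>(needed);
      }
    }
    // An array larger than the capacity must be at least 25% full.
    assert(array_size <= kArrayCapacity || offsets.size() * 4 >= array_size);
    small_values_.resize(array_size);
    for (const auto& entry : src) {
      const uint64_t offset = Offset(entry.first);
      if (offset < array_size) {
        small_values_[offset] = entry.second;
      } else {
        large_map_.emplace(entry.first, entry.second);
      }
    }
  }

  // Returns a pointer to the value for `key`, or `nullptr` if absent.
  const Value* Find(int key) const {
    const uint64_t offset = Offset(key);
    if (offset < small_values_.size()) {
      const std::optional<Value>& slot = small_values_[offset];
      return slot.has_value() ? &*slot : nullptr;
    }
    const auto iter = large_map_.find(key);
    return iter == large_map_.end() ? nullptr : &iter->second;
  }

  size_t array_size() const { return small_values_.size(); }

 private:
  // Keys below `kExpectedMinKey` wrap around to huge offsets, so a single
  // unsigned comparison rejects them from the array path.
  static uint64_t Offset(int key) {
    return static_cast<uint64_t>(static_cast<int64_t>(key) - kExpectedMinKey);
  }

  std::vector<std::optional<Value>> small_values_;
  std::unordered_map<int, Value> large_map_;
};
```

With the default parameters, a source holding keys 1, 3, and 1000 would give
an array of 3 slots with key 1000 left in the hash map, while a source holding
every key from 1 to 400 would give an array of 400 slots.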

Optimizations to `SmallIntMap`:
* Move `large_map_` behind a pointer to reduce memory usage in the common case.
* Delay constructing elements of `small_values_` to avoid requiring `Value`
  to be default-constructible, and to make the code initializing them smaller.
* Move the slow path of `Find()` to a separate function to inline less code.
* Add `RIEGELI_ASSUME(_ == nullptr)` to avoid generating deletion code for an
  initial assignment to a `std::unique_ptr`.
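
Two of these optimizations, keeping the hash map behind a pointer and splitting
`Find()` into a fast path and a slow path, could look roughly like the sketch
below. The class name `SketchLookup`, the `FindSlow()` name, and the use of
`int` values with `std::optional` slots are assumptions for illustration; the
delayed element construction and the `RIEGELI_ASSUME` hint are not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch; not the actual `SmallIntMap` implementation.
class SketchLookup {
 public:
  SketchLookup(std::vector<std::optional<int>> small_values,
               std::unordered_map<int, int> overflow)
      : small_values_(std::move(small_values)) {
    // Allocate the hash map only when some key did not fit in the array, so
    // the common case stores just a null pointer.
    if (!overflow.empty()) {
      large_map_ =
          std::make_unique<std::unordered_map<int, int>>(std::move(overflow));
    }
  }

  // Fast path: a bounds check and an array access, cheap to inline.
  const int* Find(int key) const {
    const size_t index = static_cast<size_t>(key);
    if (index < small_values_.size()) {
      const std::optional<int>& slot = small_values_[index];
      return slot.has_value() ? &*slot : nullptr;
    }
    return FindSlow(key);
  }

  bool has_large_map() const { return large_map_ != nullptr; }

 private:
  // Slow path, defined out of line so that callers of `Find()` do not carry
  // the hash lookup code.
  const int* FindSlow(int key) const;

  std::vector<std::optional<int>> small_values_;
  std::unique_ptr<std::unordered_map<int, int>> large_map_;
};

const int* SketchLookup::FindSlow(int key) const {
  // Only reached for keys outside the array range.
  assert(static_cast<size_t>(key) >= small_values_.size());
  if (large_map_ == nullptr) return nullptr;
  const auto iter = large_map_->find(key);
  return iter == large_map_->end() ? nullptr : &iter->second;
}
```

In a single translation unit a compiler may still inline `FindSlow()`; a real
implementation would likely use a compiler-specific attribute or a separate
source file to keep it out of line.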

PiperOrigin-RevId: 879470235
5 files changed
tree: 3b16d0707ee94daa4b7f6ec8a9f819421cd15c0c
  doc/
  python/
  riegeli/
  tf_dependency/
  .bazelrc
  configure
  CONTRIBUTING.md
  LICENSE
  MANIFEST.in
  MODULE.bazel
  README.md
README.md

Riegeli

Riegeli/records is a file format for storing a sequence of string records, typically serialized protocol buffers. It supports dense compression, fast decoding, seeking, detection and optional skipping of data corruption, filtering of proto message fields for even faster decoding, and parallel encoding.

See documentation.

Status

The Riegeli file format will change only in backward-compatible ways (i.e. future readers will understand current files, but current readers might not understand files using future features).

The Riegeli C++ API might change in incompatible ways.