lib/mempool: Fix spurious -ENOMEM due to aggressive latency control

The mempool operations need to be atomic, but because of latency
concerns (the allocator is intended for use in an ISR) the locking was
designed to be as minimal as possible.  And it... mostly got it right.
All the list handling was correctly synchronized.  The merging of four
child blocks into a parent block was atomic.  The splitting of a block
into four children was atomic.

BUT: there was a moment between the allocation of a large block and
the re-addition of its three unused children where the lock was
released.  This meant that another context (e.g. an ISR that just
fired, interrupting the existing call to k_mem_pool_alloc()) could see
some memory "missing" that wasn't actually allocated.  And if the
block being split happened to be the top level block, it's entirely
possible that the whole heap would look empty, even though the other
allocator might have been doing only the smallest allocation!

Fix that by making the "remove a block then add back the three
children we don't use" into an atomic step.  We can still relax the
lock between levels as we split the subblocks further.
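
As a rough illustration of the idea (not the actual patch), here is a
toy model of the locking pattern.  All names in it (free_bits,
pool_lock(), pool_unlock(), split_held(), pool_alloc(), isr_observer())
are hypothetical, the free lists are reduced to bitmaps, and the lock is
a counter whose release hook plays the part of an ISR firing in the
window:

```c
#include <assert.h>
#include <stdint.h>

#define NUM_LEVELS 3    /* level 0 is the whole pool; each level splits by 4 */

/* Bit n set at level l means block n of that level is on the free list. */
static uint32_t free_bits[NUM_LEVELS];

/* Hypothetical stand-ins for the real lock.  The hook simulates an ISR
 * firing the instant the lock is dropped. */
static int lock_depth;
static void (*unlock_hook)(void);

static void pool_lock(void)
{
    lock_depth++;
}

static void pool_unlock(void)
{
    if (--lock_depth == 0 && unlock_hook) {
        unlock_hook();
    }
}

/* Remove any free block from `level`; returns its index or -1. */
static int take_free(int level)
{
    assert(lock_depth > 0);
    for (int n = 0; n < 32; n++) {
        if (free_bits[level] & (1u << n)) {
            free_bits[level] &= ~(1u << n);
            return n;
        }
    }
    return -1;
}

/* Split a block we already hold: publish children 1..3 on the next
 * level's free list and return the index of child 0, which we keep. */
static int split_held(int level, int blk)
{
    assert(lock_depth > 0);
    free_bits[level + 1] |= 0xEu << (4 * blk);
    return 4 * blk;
}

void pool_init(void)
{
    for (int i = 0; i < NUM_LEVELS; i++) {
        free_bits[i] = 0;
    }
    free_bits[0] = 1;   /* one free top level block */
}

/* Allocate one block at level `target`; returns its index or -1. */
int pool_alloc(int target)
{
    int level, blk = -1;

    /* Walk up the size list until some level has a free block. */
    for (level = target; level >= 0; level--) {
        pool_lock();
        blk = take_free(level);
        if (blk >= 0 && level < target) {
            /* The fix: the removal and the re-addition of the three
             * unused children share one critical section. */
            blk = split_held(level, blk);
            level++;
        }
        pool_unlock();
        if (blk >= 0) {
            break;
        }
    }
    if (blk < 0) {
        return -1;
    }

    /* Keep splitting the block we hold, relaxing the lock per level. */
    while (level < target) {
        pool_lock();
        blk = split_held(level, blk);
        pool_unlock();
        level++;
    }
    return blk;
}

/* Simulated ISR: counts how often the pool looks completely empty. */
static int empty_sightings;

void isr_observer(void)
{
    uint32_t any = 0;

    for (int i = 0; i < NUM_LEVELS; i++) {
        any |= free_bits[i];
    }
    if (!any) {
        empty_sightings++;
    }
}
```

With unlock_hook set to isr_observer, a smallest-size allocation from a
fresh pool should never let the observer see an empty heap, because
every lock release happens with the unused siblings already back on a
free list.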

(Finally, note that this trick allows a somewhat cleaner API as we can
do our "retry due to race" step internally by walking back up the
block size list instead of forcing our caller to do it via that weird
-EAGAIN return value.)

Fixes #11022

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
1 file changed