kernel/sched: Fix SMP race on pend
For historical reasons[1] suspending threads would release the
scheduler lock between pend() (which places the current thread onto a
wait queue) and z_swap() (which effects the context swtich). This
process happens with the caller's lock held, so local interrupts are
masked. But on SMP this opens a tiny race where another CPU could
grab the pended thread and switch to it while we were still executing
on its stack!
Fix this by elevating the "lock swap" code that already exists in the
(portable/switch-based) z_swap() code one level so that it happens in
z_pend_curr() also. Now we hold the scheduler lock between pend and
the final context switch.
Note that this technique can't work for the older z_swap_irqlock()
implementation, which exists to vestigially support a few bits of arch
code (mostly direct interrupts) that don't work on SMP anyway.
Address with an assert to prevent future misuse.
[1] z_swap() is a historical API implemented in per-arch assembly for
older architectures (like ARM32!). It was designed to be called
with what at the time was a global IRQ lock, so it doesn't
understand the idea of a separate scheduler lock. When we finally
get all archictures on arch_switch() this design can be cleaned up
quite a bit.
Signed-off-by: Andy Ross <andyross@google.com>
1 file changed