kernel/spinlock: Force inlining

Something is going wrong with code generation here, possibly related
to the inline assembly generated by _arch_irq_lock()/_arch_irq_unlock(),
and gcc is not inlining the spinlock calls.  So what should be a
~3 instruction sequence on most uniprocessor architectures is turning
into 8-20 cycles worth of work to implement the API as written.

Mark the spinlock functions ALWAYS_INLINE, which is sort of ugly
semantically but produces much better code.
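
For illustration only, a minimal sketch of the pattern, not the actual
patch.  Every demo_* name is hypothetical, the IRQ primitives are plain
C stand-ins for the arch inline assembly, and ALWAYS_INLINE is assumed
to expand to the gcc always_inline attribute:

```c
#include <assert.h>

/* Assumed expansion of ALWAYS_INLINE for gcc/clang. */
#define ALWAYS_INLINE inline __attribute__((always_inline))

/* Hypothetical stand-in for the CPU interrupt-enable state
 * (1 = interrupts enabled).  A real port would touch a CPU register
 * with a few instructions of inline assembly instead.
 */
static unsigned int demo_irq_state = 1U;

static ALWAYS_INLINE unsigned int demo_irq_lock(void)
{
	unsigned int key = demo_irq_state;

	demo_irq_state = 0U;	/* "mask" interrupts */
	return key;		/* caller restores this later */
}

static ALWAYS_INLINE void demo_irq_unlock(unsigned int key)
{
	demo_irq_state = key;	/* restore the saved state */
}

struct demo_spinlock {
	int held;	/* debug flag only; uniprocessor needs no atomics */
};

/* Forced inline: the wrapper collapses into its caller, so no call/return
 * overhead is paid around the IRQ mask.
 */
static ALWAYS_INLINE unsigned int demo_spin_lock(struct demo_spinlock *l)
{
	unsigned int key = demo_irq_lock();

	assert(!l->held);
	l->held = 1;
	return key;
}

static ALWAYS_INLINE void demo_spin_unlock(struct demo_spinlock *l,
					   unsigned int key)
{
	assert(l->held);
	l->held = 0;
	demo_irq_unlock(key);
}

/* Example caller: with forced inlining the critical section is emitted
 * directly inside this function.
 */
int demo_increment(struct demo_spinlock *l, int *counter)
{
	unsigned int key = demo_spin_lock(l);
	int value = ++(*counter);

	demo_spin_unlock(l, key);
	return value;
}
```

Comparing the disassembly of a caller like demo_increment() with and
without the attribute is one way to check whether the wrappers were
actually folded in.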

Signed-off-by: Andy Ross <andrew.j.ross@intel.com>
1 file changed