sched: Optimize dummy thread usage on SMP

Nicolas Pitre points out that since these thread structs are just
dummies for the context swtiching, they can be presumed to be "write
only" and thus there's no point in having one per CPU, everyone can
share the same one.

The only gotcha is that we never really documented (nor really have a
place to document) that rule, so it's not theoretically impossible for
an architecture to read back what it might have written underneath
arch_switch().  Leave this in a separate commit for bisection
purposes, but the risk seems very low.

Signed-off-by: Andy Ross <andyross@google.com>
4 files changed