Hmm. I was under the impression that the Pentium serialized writes
by reserving locations through their caches. But knowing Intel, Linus
is probably right.
Sometimes I wish I could just take a gun to the Pentium.
But this isn't a big deal. We should simply be able to do a locked
write into the per-cpu area to synchronize just before we release
the lock. This is still going to be a whole lot more efficient than
trying to lock a write to the shared lock, because we will almost certainly
already own the cache line for that per-cpu location.
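
A rough sketch of that idea, not the actual kernel code. It assumes C11
atomics, and every name in it (percpu_sync, demo_lock_t, demo_lock_acquire,
demo_lock_release) is hypothetical. A thread-local variable stands in for the
per-cpu area, and the release store is tagged memory_order_release so the
sketch stays correct under the C11 model; on x86 that store is still an
ordinary unlocked write.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a word in the per-cpu area. */
static _Thread_local atomic_int percpu_sync;

/* Hypothetical shared lock word: 0 = free, 1 = held. */
typedef struct {
    atomic_int held;
} demo_lock_t;

static void demo_lock_acquire(demo_lock_t *lk)
{
    int expected = 0;

    /* Spin until we swap 0 -> 1; this is the locked op on the shared word. */
    while (!atomic_compare_exchange_weak_explicit(&lk->held, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed)) {
        expected = 0;   /* a failed CAS overwrites expected; reset it */
    }
}

static void demo_lock_release(demo_lock_t *lk)
{
    /*
     * Locked read-modify-write on memory this CPU almost certainly
     * already owns. It serializes our earlier writes without bouncing
     * the shared lock's cache line.
     */
    atomic_fetch_add_explicit(&percpu_sync, 1, memory_order_seq_cst);

    /* Release the shared lock with a store that needs no lock prefix on x86. */
    atomic_store_explicit(&lk->held, 0, memory_order_release);
}
```

The point of the split is that the only locked instruction on the release
path touches private memory, so the shared lock word sees a single plain
store.
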
I'll run some tests and commit a solution. Nobody commit anything. No
matter what, we still get the benefit of the recursion lock optimization,
which is actually the more important one.