From: Matthew Dillon (dil...@apollo.backplane.com)
Date: Feb 1, 2000 6:30:13 pm
I discussed this issue with Linus after someone from the linux kernel group brought it up. Basically we have adopted the same thing that linux uses.
Under Intel, the following is true:
* Writes are buffered but are committed to memory in the same order they were issued. Thus write<->write conflicts are not an issue.
* Speculative reads may occur, but they occur from main memory into the L1 cache and thus still adhere to L1 cache protocols. Thus speculative reads are not an issue.
* The cpu may reorder non-conflicting reads. A non-conflicting read may be reordered from *before* a write to *after* a write, or from *after* a write to *before* a write.
This is an issue. This is the ONLY issue with intel hardware.
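The read-past-write case in the last bullet can be sketched with C11 atomics. This is only an illustration under assumptions: `x`, `y`, `cpu0`, `cpu1` and `check_outcome` are hypothetical names, and `atomic_thread_fence` stands in for whatever serializing instruction the kernel actually uses.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared flags, both initially 0. */
static atomic_int x, y;

/* Meant to run on one CPU: write x, then read y.  The read does not
 * conflict with the write, so without the fence it could be satisfied
 * before the write becomes visible to the other CPU. */
int cpu0(void)
{
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst); /* keep the read after the write */
    return atomic_load_explicit(&y, memory_order_relaxed);
}

/* Meant to run on the other CPU: the mirror image of cpu0(). */
int cpu1(void)
{
    atomic_store_explicit(&y, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&x, memory_order_relaxed);
}

/* With both fences in place, r0 == 0 && r1 == 0 is the one outcome
 * that must never be observed, however the two CPUs interleave. */
void check_outcome(int r0, int r1)
{
    assert(r0 == 1 || r1 == 1);
}
```

If the two fences are removed, both functions returning 0 becomes a permitted result, which is the reordering described above.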
The purpose of the locked instruction is to prevent any possibility of reads being reordered from before to after the point where the MP lock is released.
There has been a huge amount of misinformation on the issue. I looked at both the Linux kernel and the FreeBSD mail archives and 90% of the messages posted entertaining one opinion or another were just plain wrong (and Linus agrees with me). We don't know whether the read reordering issue is real or not. We do know that none of the other issues brought up were real.
We aren't taking any chances with the read re-ordering issue and so we have a locked instruction.
The reason we use 0(%esp) for the locked memory address is because there is nearly a 100% chance that that address is already in our processor's L1 cache *AND* that we have (via the hardware cache protocol) exclusive ownership of the address, thus minimizing the cost of running the instruction.
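As a rough sketch of how the locked instruction could be wrapped for use from C with a GCC-compatible compiler: `mp_barrier` and `store_then_read` are hypothetical names, and the x86-64 and generic branches are additions for illustration only; the message itself discusses only the 32-bit `0(%esp)` form.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical wrapper, not taken from the FreeBSD source. */
static inline void mp_barrier(void)
{
#if defined(__GNUC__) && defined(__i386__)
    /* Adds 0 to the word at the top of the stack: the data is left
     * unchanged, but the locked cycle serializes memory accesses. */
    __asm__ __volatile__("lock; addl $0,0(%%esp)" : : : "memory", "cc");
#elif defined(__GNUC__) && defined(__x86_64__)
    /* Assumed 64-bit equivalent using the 64-bit stack pointer. */
    __asm__ __volatile__("lock; addl $0,0(%%rsp)" : : : "memory", "cc");
#else
    /* Portable stand-in for other targets. */
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* Example use: store to one location, then make sure the following
 * read of another location is not satisfied ahead of that store. */
int store_then_read(volatile int *written, const volatile int *read_from)
{
    assert(written != 0 && read_from != 0);
    *written = 1;
    mp_barrier();
    return *read_from;
}
```

The "memory" clobber also stops the compiler from moving loads and stores across the barrier, which the instruction alone cannot guarantee.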
The reason we do not use a locked instruction to actually release the MP lock is because:
(A) we don't have to, writes are ordered and we do not care if reads are reordered to occur before the write releasing the lock is committed.
(B) because if there is another processor trying for the lock we may not have exclusive ownership of the lock address in our L1 cache (the other cpu might), costing us a huge stall due to the way the hardware cache coherency protocol works.
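Points (A) and (B) can be put together in a small sketch: a locked read-modify-write to take the lock, and a barrier followed by an ordinary store to release it. This is an assumption-laden illustration, not the kernel's code: `mp_lock_word`, `mp_try_lock` and `mp_unlock` are hypothetical names, and the C11 fence stands in for the locked `addl` on the stack.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 = free, 1 = held. */
static atomic_int mp_lock_word;

/* Acquire attempt: an atomic exchange, which on x86 is typically
 * emitted as a locked read-modify-write on the lock word itself. */
bool mp_try_lock(void)
{
    return atomic_exchange_explicit(&mp_lock_word, 1,
                                    memory_order_acquire) == 0;
}

/* Release: the barrier touches memory we already own (the stack in
 * the original code), then a plain store frees the lock.  No locked
 * cycle is issued against the contended lock word. */
void mp_unlock(void)
{
    assert(atomic_load_explicit(&mp_lock_word, memory_order_relaxed) == 1);
    atomic_thread_fence(memory_order_seq_cst); /* stand-in for the locked addl */
    atomic_store_explicit(&mp_lock_word, 0, memory_order_relaxed);
}
```

A caller would spin on `mp_try_lock()` until it returns true, do its work, and then call `mp_unlock()`.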
-Matt
Matthew Dillon <dil...@backplane.com>
:* Chee Wei Ng <scip...@yahoo.com> [000201 17:31] wrote:
:> I would like to know why we need
:> addl $0,0(%esp) /* see note above */
:> for serialization.
:> Could you show me an example for MP case where it may cause trouble if the
:> above lines are not added in it?
:> Because I didn't see how instruction execution out of order come into the
:> picture since before any processors enter the Critical Section, it has to
:> acquire the mplock first, and acquire the mplock, you must 'LOCK' the bus cycle
:> to serialize the mplock flag to be read-modify-write, so I thought here will do
:> all the serialization as required. Unless, it could be something that may needs
:> to serialize for access before this.
:
:It's to ensure that memory ops scheduled _before_ the lock is released
:have been completed before the lock is actually released.
:
:Otherwise out of order memory writes can occur corrupting the state
:of protected variables.
:
:Imagine if a CPU releases a lock then a previously scheduled write on the
:_same_ cpu goes in several cycles after another processor acquires the
:lock.
:
:Since we aren't using a locked cycle to release the lock, we must _at least_
:insert a barrier instruction to force correct ordering.
:
:-Alfred