| From | Sent On | Attachments |
|---|---|---|
| Luigi Rizzo | Aug 17, 2005 12:05 am | |
| Max Laier | Aug 17, 2005 2:35 am | |
| Julian Elischer | Aug 17, 2005 3:20 am | |
| Luigi Rizzo | Aug 18, 2005 12:02 am | |
| Max Laier | Aug 18, 2005 1:32 am | |
| Luigi Rizzo | Aug 18, 2005 7:57 am | |
| Luigi Rizzo | Aug 18, 2005 2:31 pm | |
| John Baldwin | Aug 18, 2005 4:28 pm | |
| Stephan Uphoff | Aug 20, 2005 10:39 pm | |
| Stephan Uphoff | Aug 20, 2005 11:02 pm |
| Subject: | duplicate read/write locks in net/pfil.c and netinet/ip_fw2.c | |
|---|---|---|
| From: | Stephan Uphoff (up...@tree.com) | |
| Date: | Aug 20, 2005 11:02:19 pm | |
| List: | org.freebsd.freebsd-net | |
On Thu, 2005-08-18 at 10:26, John Baldwin wrote:
On Thursday 18 August 2005 10:18 am, Stephan Uphoff wrote:
On Thu, 2005-08-18 at 03:57, Luigi Rizzo wrote:
On Thu, Aug 18, 2005 at 03:32:19AM +0200, Max Laier wrote:
On Thursday 18 August 2005 02:02, Luigi Rizzo wrote:
...
could you guys look at the following code and see if it makes sense, or tell me where i am wrong ?
It should solve the starvation and blocking trylock problems, because the active reader does not hold the mutex in the critical
^^^^^^
i meant 'writer', sorry... as max said even in the current implementation the reader does not hold the lock.
int RLOCK(struct rwlock *rwl, int try) { if (!try) mtx_lock(&rwl->m); else if (!mtx_trylock(&rwl->m)) return EBUSY; if (rwl->writers == 0) /* no writer, pass */ rwl->readers++; else { rwl->br++; cv_wait(&rwl->qr, &rwl->m);
^^^^^^^
That we can't do. That's exactly the thing the existing sx(9) implementation does and where it breaks. The problem is that cv_wait() is an implicit sleep which breaks when we try to RLOCK() with other mutex already acquired.
but that is not a solvable problem given that the *LOCK may be blocking. And the cv_wait is not an unconditioned sleep, it is one where you release the lock right before ans wait for an event to wake you up. In fact i don't understand why you consider spinning and sleeping on a mutex two different things.
The major difference between sleeping (cv_wait,msleep,..) and blocking on a mutex is priority inheritance. If you need to be able to use (non-spin) mutexes while holding a [R|W]LOCK and use a [R|W]LOCK while holding a (non-spin) mutex then you need to implement priority inheritance for [R|W]LOCKs. For the (single) write lock holder tracking priority is easy. (just like a (non-spin) mutex). However priority inheritance to multiple readers is more difficult as one needs to keep track of all holders of the lock. Keeping track of all readers requires pre-allocated memory resources. This memory could come from 1) A limited global pool 2) A limited per [R|W]LOCK pool 3) A limited per thread pool 4) As a parameter for acquiring a RLOCK None of the choices are really pretty. (1),(2) and (3) can lead to limiting reader parallelism when running out of resources. (4) may be practically for some cases since the memory could be allocated from stack. However since the memory must be valid while holding a read lock choice (4) makes some algorithms (than use for example lock crabbing) a bit harder to implement.
Solaris handles the read case by only tracking the first thread to get a read lock (referred to as the "owner of record" IIRC) and only propagating priority to that thread and ignoring other readers. They admit it's not perfect as well. That's mentioned in the Solaris Internals book.
Hi John,
If I recall correctly Solaris user threads get a better "system priority" when running/blocking in the kernel. This prevents threads holding critical kernel resources from being starved by user processes. I think this would also be a good idea for FreeBSD to prevent unbound priority inversions caused by vnode locks and other non-tracked resources. However even with these changes I don't think the Solaris implementation is the right way to go if we want to freely mix mutexes and [R|W]locks.
Stephan





