atom feed10 messages in org.freebsd.freebsd-netduplicate read/write locks in net/pf...
FromSent OnAttachments
Luigi RizzoAug 17, 2005 12:05 am 
Max LaierAug 17, 2005 2:35 am 
Julian ElischerAug 17, 2005 3:20 am 
Luigi RizzoAug 18, 2005 12:02 am 
Max LaierAug 18, 2005 1:32 am 
Luigi RizzoAug 18, 2005 7:57 am 
Luigi RizzoAug 18, 2005 2:31 pm 
John BaldwinAug 18, 2005 4:28 pm 
Stephan UphoffAug 20, 2005 10:39 pm 
Stephan UphoffAug 20, 2005 11:02 pm 
Subject:duplicate read/write locks in net/pfil.c and netinet/ip_fw2.c
From:Luigi Rizzo (riz@icir.org)
Date:Aug 18, 2005 12:02:36 am
List:org.freebsd.freebsd-net

On Wed, Aug 17, 2005 at 04:35:19AM +0200, Max Laier wrote:

On Wednesday 17 August 2005 02:05, Luigi Rizzo wrote:

[apologies for the cross post but it belongs both to arch and net.]

I notice that net/pfil.c and netinet/ip_fw2.c have two copies of aisimilar but slightly different implementation of multiple-reader/single-writer locks, which brings up the question(s):

1. should we rather put this code in the generic kernel code so that other subsystems could make use of it ? E.g. the routing table is certainly a candidate,

I have asked this several time on -arch and IRC, but never found anyone willing to pursue it. However, the problem is ...

and especially

2. should we implement it right ?

Both implementations are subject to starvation for the writers (which is indeed a problem here, because we might want to modify a ruleset and be prevented from doing it because of incoming traffic that keeps readers active). Also the PFIL_TRY_WLOCK will in fact be blocking if a writer is already in - i have no idea how problematic is this in the way it is actually used.

... really this. I didn't find a clean way out of the starvation issue. What I do for pfil is that I set a flag and simply stop serving[2] shared requests once a writer waits for the lock. If a writer can't sleep[1] then we return EBUSY and don't. However, for pfil it's almost ever safe to assume that a write may sleep (as it is for most instances of this kind of sx-lock where you have BIGNUMxreads:1xwrite).

could you guys look at the following code and see if it makes sense, or tell me where i am wrong ?

It should solve the starvation and blocking trylock problems, because the active reader does not hold the mutex in the critical section anymore. The lock could well be a spinlock.

cheers luigi

/* * Implementation of multiple reader-single writer that prevents starvation. * Luigi Rizzo 2005.08.19 * * The mutex m only protects accesses to the struct rwlock. * We can have the following states: * IDLE: readers = 0, writers = 0, br = 0; * any request will be granted immediately. * * READ: readers > 0, writers = 0, br = 0. Read in progress. * Grant read requests immediately, queue write requests and * move to READ1. * When last reader terminates, move to IDLE. * * READ1: readers > 0, writers > 0, br >= 0. * Read in progress, but writers are queued. * Queue read and write requests to qr and wr, respectively. * When the last reader terminates, wakeup the next queued writer * and move to WRITE * * WRITE: readers = 0, writers > 0, br >= 0. * Write in progress, possibly queued readers/writers. * Queue read and write requests to qr and wr, respectively. * When the writer terminates, wake up all readers if any, * otherwise wake up the next writer if any. * Move to READ, READ1, IDLE accordingly. */

struct rwlock { mtx m; /* protects access to the rwlock */ int readers; /* active readers */ int br; /* blocked readers */ int writers; /* active + blocked writers */ cv qr; /* queued readers */ cv qw; /* queued writers */ }

int RLOCK(struct rwlock *rwl, int try) { if (!try) mtx_lock(&rwl->m); else if (!mtx_trylock(&rwl->m)) return EBUSY; if (rwl->writers == 0) /* no writer, pass */ rwl->readers++; else { rwl->br++; cv_wait(&rwl->qr, &rwl->m); } mtx_unlock(&rwl->m); return 0; }

int WLOCK(struct rwlock *rwl, int try) { if (!try) mtx_lock(&rwl->m); else if (!mtx_trylock(&rwl->m)) return EBUSY; rwl->writers++; if (rwl->readers > 0) /* have readers, must wait */ cv_wait(&rwl->qw, &rwl->m); mtx_unlock(&rwl->m); return 0; }

void RUNLOCK(struct rwlock *rwl) { mtx_lock(&rwl->m); rwl->readers--; if (rwl->readers == 0 && rwl->writers > 0) cv_signal(&rwl->qw); mtx_unlock(&rwl->m); }

void WUNLOCK(struct rwlock *rwl) { mtx_lock(&rwl->m); rwl->writers--; if (rwl->br > 0) { /* priority to readers */ rwl->readers = rwl->br; rwl->br = 0; cv_broadcast(&rwl->qr); } else if (rwl->writers > 0) cv_signal(&rwl->qw); mtx_unlock(&rwl->m); }