atom feed22 messages in org.freebsd.freebsd-fsDeadlock between nfsd and snapshots.
FromSent OnAttachments
Konstantin BelousovAug 18, 2006 4:48 pm 
Tor EggeAug 18, 2006 8:19 pm 
Peter HolmAug 20, 2006 5:28 pm 
Tor EggeAug 21, 2006 1:21 pm 
Eric AndersonAug 21, 2006 1:37 pm 
Kostik BelousovAug 21, 2006 1:38 pm 
Kostik BelousovAug 21, 2006 1:44 pm 
Tor EggeAug 21, 2006 2:09 pm 
Bruce EvansAug 22, 2006 9:15 am 
Kostik BelousovAug 22, 2006 1:07 pm 
Kostik BelousovAug 22, 2006 1:16 pm 
Tor EggeAug 22, 2006 9:46 pm 
Kostik BelousovAug 23, 2006 4:40 am 
Kostik BelousovAug 23, 2006 11:07 am 
Bruce EvansAug 23, 2006 2:04 pm 
Bruce EvansAug 23, 2006 3:39 pm 
Tor EggeAug 23, 2006 3:47 pm 
Tor EggeAug 23, 2006 4:31 pm 
Tor EggeAug 23, 2006 4:57 pm 
Bruce EvansAug 23, 2006 6:32 pm 
Bruce EvansAug 23, 2006 7:12 pm 
Tor EggeAug 23, 2006 8:06 pm 
Subject:Deadlock between nfsd and snapshots.
From:Bruce Evans (bd@zeta.org.au)
Date:Aug 23, 2006 3:39:41 pm
List:org.freebsd.freebsd-fs

On Wed, 23 Aug 2006, Kostik Belousov wrote:

On Tue, Aug 22, 2006 at 09:46:38PM +0000, Tor Egge wrote:

2. All places that currently set IN_ACCESS, instead would increment i_accessed using the atomic ops. ufs_itimes shall update i_access under some mutex if i_accessed is greater than zero.

Protecting the existing i_flag and the timestamps with the vnode interlock when the current thread only has a shared vnode lock should be sufficient to protect against the races, removing the need for #3, #4 and #4 below.

You asked about this in a later reply (the one with the patch).

This seems wrong to me. I think MNT_ILOCK() (like you used) is sufficient, but you should just use a nonblocking vn_start_write() to avoid knowing about the internals of vn_start_write(). If the shared (or whatever) vnode lock is insufficient, then there are much larger, much older bugs. Inodes are accessed a lot with just the the vnode lock, and the vnode interlock here won't affect races with most other accesses.

What's left is avoiding setting IN_MODIFIED when it's unsafe, to protect against the deadlock.

So, I will do the following:

1. Protect both setting and reading inode times and i_flag with vnode interlock. This shall be done through all the sys/ufs/*/* code.

The patch doesn't change so much. Most places shouldn't need changing.

2. Modify ufs_itimes:

If neither IN_CHANGE nor IN_UPDATE is set then it might be unsafe to set IN_MODIFIED since the file system might be suspended or in the process of being suspended with the vnode sync loop in ffs_sync() having iterated past the vnode.

In other words, if IN_CHANGE or IN_UPDATE are already set, I can safely convert IN_ACCESS into IN_MOD.

Not quite. IN_ACCESS can also be handled if IN_MODIFIED is already set, even if neither IN_CHANGE nor IN_UPDATE is set. All this is when the file system is suspended; otherwise we can do anything to the inode. In all cases, we depend on the inode not changing underneath us. I think ordinary vnode locking gives that, just like it did before suspension existed.

Otherwise, I shall implemented the algorithm below. Suspending/suspended checks need to take MNT_ILOCK.

... When ufs_itimes() cannot set IN_MODIFIED then it has to either risk losing the access time update or use some mechanism to defer it (e.g. set IN_LAZYMOD or a new flag and let process_deferred_inactive() set IN_MODIFIED after the file system has been resumed).

BTW, shall the test for MNT_RDONLY in the ufs_itimes moved earlier ?

Probably not. It is supposed to be earlier, but is misplaced to work around bugs in the MNT_RDONLY case (this case should never set a flag or a timestamp, but in fact does set flags to work around bugs elsewhere).

3. Add the process_deferred_lazymod procedure, called from ffs_snapshot before proc_deferred_inactive, that shall convert IN_LAZYMOD | IN_ACCESS into IN_MODIFIED. To be safe, the proc_def_lazymod needs vn_start_write braces.

I think you should just ignore IN_ACCESS when it cannot be converted to a timestamp, or use IN_LAZYMOD.

ufs_inactive() needs to do something with IN_LAZYMOD other than blindly turn it into IN_MODIFIED (but I believe the problem case is unreachable in -current -- see other mail).

ffs_update() clears IN_MODIFIED. I now think this and other clearings in ffs_update() are safe. vnode locking should make it safe to change the inode, and it doesn't matter if the file system is suspended (or being suspended) provided IN_MODIFIED is not set when it shouldn't be.

Bruce