|Konstantin Belousov||Aug 18, 2006 4:48 pm|
|Tor Egge||Aug 18, 2006 8:19 pm|
|Peter Holm||Aug 20, 2006 5:28 pm|
|Tor Egge||Aug 21, 2006 1:21 pm|
|Eric Anderson||Aug 21, 2006 1:37 pm|
|Kostik Belousov||Aug 21, 2006 1:38 pm|
|Kostik Belousov||Aug 21, 2006 1:44 pm|
|Tor Egge||Aug 21, 2006 2:09 pm|
|Bruce Evans||Aug 22, 2006 9:15 am|
|Kostik Belousov||Aug 22, 2006 1:07 pm|
|Kostik Belousov||Aug 22, 2006 1:16 pm|
|Tor Egge||Aug 22, 2006 9:46 pm|
|Kostik Belousov||Aug 23, 2006 4:40 am|
|Kostik Belousov||Aug 23, 2006 11:07 am|
|Bruce Evans||Aug 23, 2006 2:04 pm|
|Bruce Evans||Aug 23, 2006 3:39 pm|
|Tor Egge||Aug 23, 2006 3:47 pm|
|Tor Egge||Aug 23, 2006 4:31 pm|
|Tor Egge||Aug 23, 2006 4:57 pm|
|Bruce Evans||Aug 23, 2006 6:32 pm|
|Bruce Evans||Aug 23, 2006 7:12 pm|
|Tor Egge||Aug 23, 2006 8:06 pm|
|Subject:||Deadlock between nfsd and snapshots.|
|From:||Bruce Evans (bd...@zeta.org.au)|
|Date:||Aug 23, 2006 3:39:41 pm|
On Wed, 23 Aug 2006, Kostik Belousov wrote:
On Tue, Aug 22, 2006 at 09:46:38PM +0000, Tor Egge wrote:
2. All places that currently set IN_ACCESS, instead would increment i_accessed using the atomic ops. ufs_itimes shall update i_access under some mutex if i_accessed is greater than zero.
Protecting the existing i_flag and the timestamps with the vnode interlock when the current thread only has a shared vnode lock should be sufficient to protect against the races, removing the need for #3, #4 and #4 below.
You asked about this in a later reply (the one with the patch).
This seems wrong to me. I think MNT_ILOCK() (like you used) is sufficient, but you should just use a nonblocking vn_start_write() to avoid knowing about the internals of vn_start_write(). If the shared (or whatever) vnode lock is insufficient, then there are much larger, much older bugs. Inodes are accessed a lot with just the the vnode lock, and the vnode interlock here won't affect races with most other accesses.
What's left is avoiding setting IN_MODIFIED when it's unsafe, to protect against the deadlock.
So, I will do the following:
1. Protect both setting and reading inode times and i_flag with vnode interlock. This shall be done through all the sys/ufs/*/* code.
The patch doesn't change so much. Most places shouldn't need changing.
2. Modify ufs_itimes:
If neither IN_CHANGE nor IN_UPDATE is set then it might be unsafe to set IN_MODIFIED since the file system might be suspended or in the process of being suspended with the vnode sync loop in ffs_sync() having iterated past the vnode.
In other words, if IN_CHANGE or IN_UPDATE are already set, I can safely convert IN_ACCESS into IN_MOD.
Not quite. IN_ACCESS can also be handled if IN_MODIFIED is already set, even if neither IN_CHANGE nor IN_UPDATE is set. All this is when the file system is suspended; otherwise we can do anything to the inode. In all cases, we depend on the inode not changing underneath us. I think ordinary vnode locking gives that, just like it did before suspension existed.
Otherwise, I shall implemented the algorithm below. Suspending/suspended checks need to take MNT_ILOCK.
... When ufs_itimes() cannot set IN_MODIFIED then it has to either risk losing the access time update or use some mechanism to defer it (e.g. set IN_LAZYMOD or a new flag and let process_deferred_inactive() set IN_MODIFIED after the file system has been resumed).
BTW, shall the test for MNT_RDONLY in the ufs_itimes moved earlier ?
Probably not. It is supposed to be earlier, but is misplaced to work around bugs in the MNT_RDONLY case (this case should never set a flag or a timestamp, but in fact does set flags to work around bugs elsewhere).
3. Add the process_deferred_lazymod procedure, called from ffs_snapshot before proc_deferred_inactive, that shall convert IN_LAZYMOD | IN_ACCESS into IN_MODIFIED. To be safe, the proc_def_lazymod needs vn_start_write braces.
I think you should just ignore IN_ACCESS when it cannot be converted to a timestamp, or use IN_LAZYMOD.
ufs_inactive() needs to do something with IN_LAZYMOD other than blindly turn it into IN_MODIFIED (but I believe the problem case is unreachable in -current -- see other mail).
ffs_update() clears IN_MODIFIED. I now think this and other clearings in ffs_update() are safe. vnode locking should make it safe to change the inode, and it doesn't matter if the file system is suspended (or being suspended) provided IN_MODIFIED is not set when it shouldn't be.