|Subject:||Re: umount -f implementation|
|From:||Kostik Belousov (kost...@gmail.com)|
|Date:||Jun 30, 2009 12:32:25 pm|
On Tue, Jun 30, 2009 at 12:01:21PM -0400, Rick Macklem wrote:
On Mon, 29 Jun 2009, Attilio Rao wrote:
While that should be true in principle (immediate shutdown of the fs operation and unmounting of the partition), it is totally impossible to make it completely non-sleeping, so umount -f can also sleep or be delayed for some time (example: vflush). Currently, umount -f is one of the most complicated things to handle in our VFS because it requires that vnodes can be reclaimed at any moment, adding complexity and the possibility of races.
What's the fix for your problem?
From other responses, it does look like pursuing this is appropriate and that current behaviour is considered a bug.
I should have noted in the previous email that I suspected that my simple patch didn't handle all cases, which I have just determined via testing.
Unfortunately, the thread doing "umount" can also get stuck in an msleep() while waiting for the mnt_lockref to go to 0, which happens before the VFS_UNMOUNT() call. (mnt_lockref gets incremented by various system calls that call vfs_busy().)
I think I can fix this in the experimental nfsv4 client, since it has a kernel thread that can check for MNTK_UNMOUNTF being set and then kill off the RPCs in progress, but that won't help the regular client.
This solution sounds good, but see below.
It's starting to look like too much work for FreeBSD8, but sounds like it is worth pursuing. (Apologies to anyone that thought I would have it all fixed in a day or two.)
It may be argued by some people, me included, that umount -f shall not override any ownership of kernel resources. In particular, you must not ignore the lockref. Instead, the threads that own miscellaneous filesystem resources, like the mount reference counter, locked vnodes etc., shall be weeded out of the syscalls. E.g., finishing stalled rpc calls with some error code that is propagated to the return code from the vops is a good solution.
Quite similar problems happen with SIGSTOP and intr NFS mounts. You saw the proposed solution, which is quite similar: it forces the threads owning the resources to progress to the syscall boundary.
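To illustrate the idea, here is a minimal userspace sketch using pthreads. It is only a model, not kernel code: `struct mount_model` and all function names are made up for this example, `lockref` stands in for mnt_lockref, `unmountf` for MNTK_UNMOUNTF, and EIO is an arbitrary choice of error code. The forced unmount never touches the reference count itself; it only sets the flag, wakes the stalled caller, and waits for the owner to drop its own reference.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the parts of the mount structure discussed here. */
struct mount_model {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    int  lockref;   /* models mnt_lockref */
    bool unmountf;  /* models MNTK_UNMOUNTF */
    bool rpc_done;  /* never set in this demo: the server is unresponsive */
};

/* A "syscall": take a reference, wait for the RPC, drop the reference. */
static int model_vop(struct mount_model *mp)
{
    int error = 0;

    pthread_mutex_lock(&mp->lock);
    mp->lockref++;
    pthread_cond_broadcast(&mp->cv);      /* let observers see the reference */
    while (!mp->rpc_done && !mp->unmountf)
        pthread_cond_wait(&mp->cv, &mp->lock);
    if (!mp->rpc_done)
        error = EIO;                      /* RPC finished with an error */
    mp->lockref--;                        /* the owner releases its own ref */
    pthread_cond_broadcast(&mp->cv);
    pthread_mutex_unlock(&mp->lock);
    return error;                         /* propagated to the caller */
}

/* Forced unmount: flag it, wake stalled callers, wait for refs to drain. */
static void model_forced_unmount(struct mount_model *mp)
{
    pthread_mutex_lock(&mp->lock);
    mp->unmountf = true;
    pthread_cond_broadcast(&mp->cv);
    while (mp->lockref > 0)
        pthread_cond_wait(&mp->cv, &mp->lock);
    pthread_mutex_unlock(&mp->lock);
}

static void *vop_thread(void *arg)
{
    return (void *)(intptr_t)model_vop(arg);
}

/* Runs one stalled caller against a forced unmount; returns its error. */
int run_forced_unmount_demo(void)
{
    struct mount_model m = { .lockref = 0, .unmountf = false, .rpc_done = false };
    pthread_t t;
    void *ret = NULL;

    pthread_mutex_init(&m.lock, NULL);
    pthread_cond_init(&m.cv, NULL);
    if (pthread_create(&t, NULL, vop_thread, &m) != 0)
        return -1;

    /* Wait until the caller really holds a reference and is stalled. */
    pthread_mutex_lock(&m.lock);
    while (m.lockref == 0)
        pthread_cond_wait(&m.cv, &m.lock);
    pthread_mutex_unlock(&m.lock);

    model_forced_unmount(&m);
    pthread_join(t, &ret);
    assert(m.lockref == 0);               /* drained by its owner, not forced */

    pthread_cond_destroy(&m.cv);
    pthread_mutex_destroy(&m.lock);
    return (int)(intptr_t)ret;
}
```

In this model the unmounting thread still sleeps, but only until the stalled caller has been pushed back out with an error, which is the "progress to the syscall boundary" behaviour rather than overriding the reference.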
Another problem with forced unmounts is that VFS does not block new threads from arriving into VOPs. When finishing the in-flight rpcs, you may either leave some new rpcs behind or loop infinitely chasing rpcs that arrive while you are finishing the old ones.
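A toy, single-threaded model of that race, for illustration only. The `struct gate` and its functions are hypothetical and do not correspond to anything VFS provides today, and ENXIO is an arbitrary error code. Each round aborts one in-flight operation while one new caller tries to enter; the drain only terminates if new entries are refused first.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical entry gate in front of the VOPs of one mount. */
struct gate {
    bool closed;    /* set once a forced unmount has started */
    int  inflight;  /* operations currently inside the filesystem */
};

/* A caller tries to enter a VOP; refused once the gate is closed. */
static int gate_enter(struct gate *g)
{
    if (g->closed)
        return ENXIO;
    g->inflight++;
    return 0;
}

/* A caller leaves a VOP (normally, or because its rpc was aborted). */
static void gate_exit(struct gate *g)
{
    assert(g->inflight > 0);
    g->inflight--;
}

/*
 * Abort one in-flight operation per round while a new caller arrives.
 * Returns the number of rounds needed to drain, or -1 if max_rounds
 * was not enough.
 */
int drain(struct gate *g, bool close_first, int max_rounds)
{
    if (close_first)
        g->closed = true;
    for (int round = 1; round <= max_rounds; round++) {
        gate_exit(g);           /* finish one old rpc with an error */
        (void)gate_enter(g);    /* meanwhile a new thread shows up */
        if (g->inflight == 0)
            return round;
    }
    return -1;
}
```

Starting from three in-flight operations, `drain()` with the gate left open never reaches zero, however many rounds it is given, while closing the gate first drains in three rounds.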
A half-measure is filesystem suspension, which keeps operations that modify the filesystem from entering VOPs. UFS uses suspension for unmounts and rw->ro remounts.
Umount -f is needed in two different situations. One is a normally working filesystem that shall be unmounted by administrative request, detaching any resources opened by applications. The second is the last-resort action when the backing storage (the server in the NFS case, the disk for UFS) is misbehaving. I think we must not break the first case for the second.