|Subject:||Re: [zfs-discuss] Does your device honor write barriers?|
|From:||Miles Nordin (car...@Ivy.NET)|
|Date:||Feb 10, 2009 6:10:20 pm|
jb> Not if the disk drive just *ignores* barrier and flush-cache
jb> commands and returns success. Some consumer drives really do
jb> exactly that. That's the issue that people are asking ZFS to
jb> work around.
Some are asking ZFS to work around the issue, which I think is not crazy: ZFS is already designed around failures clustered together in space, so why not failures clustered together in time as well? But I'm not in their camp, not asking for that workaround. It couldn't ever deliver the kind of integrity to which the checksum tree aspires. I'm asking for a solution to the overall problem, mostly outing, avoiding, fixing the broken devices and storage stacks.
jb> If it were possible to detect such disks, I'd add code to ZFS
jb> that would simply refuse to use them. Unfortunately, there is
jb> no reliable way to test the functioning of synchronize-cache
jb> programmatically.
I think the situation's closer to: there's no way to test for it upon adding/attaching/replacing a device, so quickly that the user doesn't realize it's happening, and with few enough false positives that you don't mind supporting it when it goes wrong, and don't mind defending its correctness when it damages vendor relationships.
However I think developing a qualification _procedure_ that sysadmins can actually follow, possibly involving cord-yanking, and one that's decisive enough we can start sharing results instead of saying ``a major vendor'' and covering our asses all the time, is quite within reach. And I think it's all but certain to uncover all sorts of problems which are not in devices, too.
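As a rough illustration of what the software half of such a cord-yanking procedure might look like, here is a minimal Python sketch. It is not an existing tool and not anything proposed in this thread: the names `write_records`, `last_durable` and `honored_flush` are hypothetical, and it assumes that `fsync` on the platform under test is actually translated into a cache flush on the device, which is itself one of the things such a test would be probing. The writer appends checksummed sequence numbers and acknowledges each one only after `fsync` returns; after the power cut, the checker looks for acknowledged records that went missing.

```python
import os
import struct
import zlib

# One record: 64-bit sequence number plus CRC32 of that number (12 bytes).
RECORD = struct.Struct("<QI")


def write_records(path, count, on_ack):
    """Append `count` records, calling fsync after each one.

    on_ack(seq) runs only after fsync has returned. In a real test it would
    report seq to a different machine that stays up when the cord is pulled
    (that reporting channel is an assumption, not shown here).
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for seq in range(count):
            payload = struct.pack("<Q", seq)
            os.write(fd, RECORD.pack(seq, zlib.crc32(payload)))
            os.fsync(fd)  # the stack claims the record is durable once this returns
            on_ack(seq)
    finally:
        os.close(fd)


def last_durable(path):
    """Return the highest sequence number in the intact prefix, or -1 if none."""
    last = -1
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break  # end of file or a torn final record
            seq, crc = RECORD.unpack(chunk)
            if seq != last + 1 or crc != zlib.crc32(struct.pack("<Q", seq)):
                break  # gap or corruption ends the intact prefix
            last = seq
    return last


def honored_flush(path, last_acked):
    """True if every record acknowledged before the power cut survived."""
    return last_durable(path) >= last_acked
```

The idea would be to run `write_records` on the machine under test, pull the cord partway through, reboot, and call `honored_flush` with the last sequence number the other machine saw. A `False` result means something in the stack acknowledged a flush it had not completed, though not necessarily the drive. A `True` result only means no loss was observed in that run, so many trials would be needed before drawing any conclusion.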
tt> This applies equally to virtual disks, of course (can we get
tt> VirtualBox to NOT ignore flushes by default?)
haha but then people would say it performs so much worse than VMware! :)
To be honest I have not absolutely verified this problem. I just hazily remember reading an email here or a bug report about it.