Subject: Re: [zfs-discuss] Does your device honor write barriers?
From: Bryant Eadon (brya@gmail.com)
Date: Feb 10, 2009 10:28:30 pm
List: org.opensolaris.zfs-discuss

Toby Thain wrote:

On 10-Feb-09, at 10:36 PM, Frank Cusack wrote:

On February 10, 2009 4:41:35 PM -0800 Jeff Bonwick <Jeff@sun.com> wrote:

Not if the disk drive just *ignores* barrier and flush-cache commands and returns success. Some consumer drives really do exactly that.

ouch.

If it were possible to detect such disks, I'd add code to ZFS that would simply refuse to use them. Unfortunately, there is no reliable way to test the functioning of synchronize-cache programmatically.

How about a database of known bad drives? Like the format.dat of old.

The intransigence of disk makers is incredible. Name and shame might work, though.

I do like the idea of a 'known bad' DB: a quick reference for people to check, plus an email dropped to $vendor indicating that someone has added $drive to the list based on $test? It's a lot of work to keep updated, though. :-/

JB> because it is *impossible* to know when the data is on stable storage.

Pardon my ignorance of in-depth drive internals for a moment. Would it be possible to time a write of X to the drive, time a second write of X with a sync, power the drive off immediately after the sync returns (physically? programmatically?), and then power it back on to re-read the data that was just written? If the data is there, the sync didn't lie; otherwise the drive failed the test. Many BIOSes support powering off the machine on shutdown. Could the same command be issued to kill power to the drive in this scenario, skipping a 'proper' shutdown procedure? Or would the PSU continue supplying power long enough for the drive to finish writing? I suppose it would have to be a sufficiently large write...
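
To make the write-then-power-cycle idea concrete, here is a rough Python sketch of the two halves: one function that writes numbered records and syncs after each, and one that is run after the power cycle to see how many records survived. The names (write_acknowledged, last_intact) and the record layout are made up for illustration. It also assumes that os.fsync() actually reaches the drive's flush-cache command, which depends on the OS and filesystem. In a real test the last acknowledged sequence number would have to be recorded somewhere that survives the power cut, such as another machine.

```python
import os
import struct
import zlib

# Hypothetical record layout: 64-bit sequence number plus a CRC32 of it.
RECORD = struct.Struct("<QI")


def write_acknowledged(path, count):
    """Write `count` records, syncing after each one.

    Returns the last sequence number for which fsync() returned, i.e. the
    last record the storage stack claimed was on stable storage.
    """
    last_acked = -1
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for seq in range(count):
            crc = zlib.crc32(struct.pack("<Q", seq))
            os.write(fd, RECORD.pack(seq, crc))
            os.fsync(fd)  # assumption: this is passed down to the drive
            last_acked = seq
    finally:
        os.close(fd)
    return last_acked


def last_intact(path):
    """Return the highest sequence number that reads back intact, or -1."""
    last = -1
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break  # end of file or a torn final record
            seq, crc = RECORD.unpack(chunk)
            if seq != last + 1 or crc != zlib.crc32(struct.pack("<Q", seq)):
                break  # gap or corruption: stop at the last good record
            last = seq
    return last
```

If last_intact() after the power cycle comes back lower than the last acknowledged sequence number, something between the application and the platters reported a sync that did not happen. The sketch cannot tell whether that was the drive, a controller, or the OS.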

Alternatively, timing the tested writes across various sectors of the disk would give you a good baseline of how long writes take. Would forcing a sync immediately after writes to the same locations then indicate whether the sync is doing what it is supposed to? If there's a noticeable (*vague) increase in delay, can we assume the sync worked?
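
A minimal Python sketch of that timing comparison, under some loud assumptions: time_writes and sync_ratio are hypothetical names, the block size and round count are arbitrary, and os.fsync() is assumed to reach the drive (on some platforms it does not; macOS, for example, needs the F_FULLFSYNC fcntl for that). The un-synced baseline mostly measures the OS page cache rather than the disk, so the resulting ratio is a hint at best, not proof that the drive honors flush-cache.

```python
import os
import statistics
import time


def time_writes(path, block_size=4096, rounds=50, sync=False):
    """Return the median seconds per write of one block at offset 0.

    With sync=True each write is followed by fsync() and the sync time is
    included in the measurement.
    """
    buf = os.urandom(block_size)
    samples = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        for _ in range(rounds):
            os.lseek(fd, 0, os.SEEK_SET)  # rewrite the same location
            start = time.perf_counter()
            os.write(fd, buf)
            if sync:
                os.fsync(fd)
            samples.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return statistics.median(samples)


def sync_ratio(path, **kwargs):
    """Ratio of synced to un-synced median write time for the same file."""
    plain = time_writes(path, sync=False, **kwargs)
    synced = time_writes(path, sync=True, **kwargs)
    return synced / plain if plain > 0 else float("inf")
```

A ratio close to 1 on a plain rotating disk would be suspicious, since a sync that really waits for the platter should cost far more than a write that only lands in cache. A large ratio shows only that something took longer, not that the data reached stable storage.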