Subject: Re: [zfs-discuss] Does your device honor write barriers?
From: Miles Nordin (car...@Ivy.NET)
Date: Feb 10, 2009 11:23:35 am

ps> A test I did was to write a minimalistic program that simply
ps> appended one block (8k in this case), fsync():ing in between,
ps> timing each fsync().
were you the one that suggested writing backwards to make the difference bigger? I guess you found that trick unnecessary---speeds differed enough when writing forwards?
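For reference, the kind of test being quoted could look roughly like the sketch below. It is not ps's actual program, only a guess at its shape: append one 8k block, fsync(), time the fsync(), repeat. The function name `time_fsyncs`, the file name, and the block count are made up for illustration.

```python
import os
import tempfile
import time

BLOCK_SIZE = 8 * 1024  # 8k blocks, as in the quoted test


def time_fsyncs(path, count, block_size=BLOCK_SIZE):
    """Append `count` blocks to `path`, fsync after each one, and
    return the list of per-fsync latencies in seconds."""
    block = b"\0" * block_size
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for _ in range(count):
            # os.write may write fewer bytes than asked, so loop.
            written = 0
            while written < len(block):
                written += os.write(fd, block[written:])
            start = time.perf_counter()
            os.fsync(fd)  # only the fsync itself is timed
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return latencies


if __name__ == "__main__":
    # Hypothetical target: a scratch file in a temporary directory.
    with tempfile.TemporaryDirectory() as d:
        lat = sorted(time_fsyncs(os.path.join(d, "fsync-test.dat"), 20))
        print("min %.3f ms, median %.3f ms, max %.3f ms"
              % (lat[0] * 1e3, lat[len(lat) // 2] * 1e3, lat[-1] * 1e3))
```

To measure a particular device, the path would have to point at a filesystem on that device rather than at the temporary directory.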
ps> * Write-back caching on the RAID controller (lowest latency).
Did you find a good way to disable this case so you could distinguish between the second two?
like, I thought there was some type of SYNCHRONIZE CACHE with a certain flag-bit set, which demands a flush to disk not to NVRAM, and that years ago ZFS was mistakenly sending this overly aggressive command instead of the normal ``just make it persistent'' sync, so there was that stale best-practice advice to lobotomize the array by ordering it to treat the two commands as equivalent.
Maybe it would be possible to send that old SYNC command on purpose. Then you could start the tool by comparing speeds with to-disk-SYNC and normal-nvram-allowed-SYNC: if they're the same speed and oddly fast, then you know the array controller is lobotomized, and the second half of the test is thus invalid. If they're different speeds, then you can trust the second half is actually testing the disks, so long as you send old-SYNC. If they're the same speed but slow, then you don't have NVRAM.
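The three-way decision above could be written down as something like the following. This is only a sketch of the comparison step; it does not send either SYNC command, it just takes two already-measured latencies. The function name `classify` and both thresholds (`fast_ms`, `same_ratio`) are assumptions picked for illustration and would need tuning against real hardware.

```python
def classify(to_disk_ms, nvram_ok_ms, fast_ms=1.0, same_ratio=1.5):
    """Interpret two measured sync latencies (milliseconds).

    to_disk_ms  -- latency of the flush-to-disk SYNC
    nvram_ok_ms -- latency of the normal, NVRAM-allowed SYNC
    fast_ms     -- assumed cutoff below which a sync is "oddly fast"
    same_ratio  -- assumed ratio within which two speeds count as "the same"
    """
    slower = max(to_disk_ms, nvram_ok_ms)
    faster = min(to_disk_ms, nvram_ok_ms)
    same_speed = slower <= faster * same_ratio

    if not same_speed:
        # The two commands behave differently, so the rest of the
        # test really exercises the disks.
        return "distinct"
    if slower < fast_ms:
        # Same speed and oddly fast: the controller treats both
        # commands alike, so the rest of the test is invalid.
        return "lobotomized"
    # Same speed but slow: nothing is absorbing the syncs.
    return "no-nvram"


print(classify(0.2, 0.2))
print(classify(8.0, 0.2))
print(classify(8.0, 7.5))
```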
ps> you could write an ever increasing sequence of values to
ps> deterministic but pseudo-random pages in some larger file,
ps> such that you can, after a powerfail test, read them back in
ps> and test the sequence of numbers (after sorting it) for the
ps> existence of holes.
yeah, the perl script I linked to requires a ``server'' which is not rebooted and a ``client'' which is rebooted during the test, and the client checks in its behavior with the server. I think the server should be unnecessary---the script should just know itself, know in the check phase what it would have written. I guess the original script author is thinking more of the SYNC command and less of the write barrier, but in terms of losing pools or _corrupting_ databases, it's really only barriers that matter, and SYNC matters only because it's also an implicit barrier; it doesn't matter exactly when it returns.
so....I guess you would need the listening-server to test SYNC is not returning early, like if you want to detect that someone has disabled the ZIL, or if you have an n-tier database system with retries at higher tiers or a system that's distributed or doing replication, then you do care when SYNC returns and need the not-rebooted listening-server. But you should be able to make a serverless tool just to check write barriers and thus corruption-proofness.
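A serverless checker along these lines might be sketched as follows. It is not the perl script, and every name in it (`page_order`, `write_phase`, `check_phase`, the `MAGIC` tag, the page size) is made up for illustration. The assumptions: a seeded shuffle gives the deterministic pseudo-random page order, each page is written exactly once, and fsync() stands in for the barrier between writes since userland has no separate barrier call. The write phase is what gets interrupted by the power cut; the check phase regenerates the same order from the seed, so it knows what should be on disk without asking anyone.

```python
import os
import random
import struct

PAGE_SIZE = 8192
MAGIC = b"WBTEST01"              # hypothetical 8-byte tag marking a written page
RECORD = struct.Struct("<8sQ")   # tag + 64-bit sequence number


def page_order(num_pages, seed):
    """Deterministic pseudo-random permutation of page indexes."""
    order = list(range(num_pages))
    random.Random(seed).shuffle(order)
    return order


def write_phase(path, num_pages, seed):
    """Write sequence numbers 1..num_pages, one per page, with an
    fsync (acting as the barrier) after every write."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, num_pages * PAGE_SIZE)
        os.fsync(fd)
        for seq, page in enumerate(page_order(num_pages, seed), start=1):
            payload = RECORD.pack(MAGIC, seq).ljust(PAGE_SIZE, b"\0")
            os.pwrite(fd, payload, page * PAGE_SIZE)
            os.fsync(fd)
    finally:
        os.close(fd)


def check_phase(path, num_pages, seed):
    """Return (highest sequence number found, list of holes below it).

    A hole means a later write reached the disk while an earlier one
    did not, i.e. writes were reordered across a barrier."""
    found = set()
    with open(path, "rb") as f:
        for seq, page in enumerate(page_order(num_pages, seed), start=1):
            f.seek(page * PAGE_SIZE)
            raw = f.read(RECORD.size)
            if len(raw) == RECORD.size:
                magic, stored = RECORD.unpack(raw)
                if magic == MAGIC and stored == seq:
                    found.add(seq)
    highest = max(found, default=0)
    holes = [s for s in range(1, highest + 1) if s not in found]
    return highest, holes
```

The intended use would be to run `write_phase` on the device under test, cut power partway through, then run `check_phase` with the same `num_pages` and `seed` after reboot; an empty hole list is the passing result. Detecting a SYNC that returns early would still need the not-rebooted listener, as described above.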