Subject: Re: [zfs-discuss] Does your device honor write barriers?
From: Peter Schuller (pete@infidyne.com)
Date: Feb 10, 2009 10:52:16 am
List: org.opensolaris.zfs-discuss

> I use 3 external devices on 2 models of external enclosures (eSATA and USB, consumer grade) -- how can I test this write barrier issue on these two? Is it worthwhile adding to a wiki (table) somewhere what has or has not been tested?

It depends on circumstances. If write barriers are enforced by instructing the device to flush its cache, and assuming there is no battery-backed cache, a good test is to check that the latency of an fsync() is in fact what you would expect it to be.

A test I did was to write a minimalistic program that simply appended one block (8k in this case) at a time, calling fsync() in between and timing each fsync().
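In outline, something like this minimal sketch (not the original program; assuming POSIX, with the file name and iteration count being arbitrary choices):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

int main(void)
{
    char buf[8192];                       /* one 8k block */
    memset(buf, 0xab, sizeof buf);

    int fd = open("fsync-test.dat", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) { perror("open"); return 1; }

    for (int i = 0; i < 100; i++) {
        if (write(fd, buf, sizeof buf) != (ssize_t)sizeof buf) {
            perror("write"); return 1;
        }

        /* time only the fsync(), not the write() */
        struct timeval t0, t1;
        gettimeofday(&t0, NULL);
        if (fsync(fd) != 0) { perror("fsync"); return 1; }
        gettimeofday(&t1, NULL);

        double ms = (t1.tv_sec - t0.tv_sec) * 1000.0 +
                    (t1.tv_usec - t0.tv_usec) / 1000.0;
        printf("fsync %3d: %8.3f ms\n", i, ms);
    }

    close(fd);
    return 0;
}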

In my case I was able to detect three distinct modes:

* Write-back caching on the RAID controller (lowest latency).

* Write-through on the RAID controller but write-back on the drives (medium latency).

* Write-through on the RAID controller and the drives (highest latency, as expected from the rotational and seek delays of the drives; see the ballpark numbers below).
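As a ballpark (round numbers, not measurements from any particular drive): a 7200 RPM disk completes a revolution in 60/7200 s = 8.3 ms, so average rotational latency alone is about 4.2 ms, and a short seek adds a few more. A cache-flushing fsync() to such a drive should therefore take several milliseconds; if fsync() instead returns in well under a millisecond, something in the path is acknowledging writes from cache.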

This was useful for checking that things "seemed" to behave properly. Of course, you only establish that the system is not systematically misbehaving, not that it will actually behave correctly under all circumstances.

However, this test boils down to testing durable persistence. If you want to test write barriers specifically, regardless of durable persistence, you can write a tool that performs I/Os in a way that lets you determine, after the fact, whether they happened in order. For example, you could write an ever-increasing sequence of values to deterministic but pseudo-random pages in some larger file, such that after a powerfail test you can read the values back in and check the sorted sequence of numbers for holes.
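A hypothetical sketch of such a tool (the file name, page size, and the use of fsync() as the ordering point are all arbitrary choices; each page is written exactly once, in a deterministically shuffled order, so an overwrite can never masquerade as a hole):

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PAGE  4096
#define PAGES 1024

int main(int argc, char **argv)
{
    int fd = open("barrier-test.dat", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    if (argc > 1 && strcmp(argv[1], "verify") == 0) {
        /* After power-cycling: collect the sequence numbers that made
         * it to disk, sort them, and look for holes. */
        uint64_t seen[PAGES];
        int n = 0;
        for (int p = 0; p < PAGES; p++) {
            uint64_t v = 0;
            pread(fd, &v, sizeof v, (off_t)p * PAGE);
            if (v != 0) seen[n++] = v;
        }
        for (int i = 1; i < n; i++)          /* insertion sort; n is small */
            for (int j = i; j > 0 && seen[j-1] > seen[j]; j--) {
                uint64_t t = seen[j]; seen[j] = seen[j-1]; seen[j-1] = t;
            }
        if (n > 0 && seen[0] != 1)
            printf("hole before %llu\n", (unsigned long long)seen[0]);
        for (int i = 1; i < n; i++)
            if (seen[i] != seen[i-1] + 1)
                printf("hole between %llu and %llu\n",
                       (unsigned long long)seen[i-1],
                       (unsigned long long)seen[i]);
        printf("%d sequence numbers survived\n", n);
    } else {
        /* Writer: an ever-increasing sequence number per page, pages
         * visited in a deterministic pseudo-random (shuffled) order,
         * with fsync() as the ordering point between writes. */
        int perm[PAGES];
        for (int i = 0; i < PAGES; i++) perm[i] = i;
        srand(1234);                          /* fixed seed: reproducible */
        for (int i = PAGES - 1; i > 0; i--) { /* Fisher-Yates shuffle */
            int j = rand() % (i + 1);
            int t = perm[i]; perm[i] = perm[j]; perm[j] = t;
        }
        if (ftruncate(fd, (off_t)PAGES * PAGE) != 0) {
            perror("ftruncate"); return 1;
        }
        for (uint64_t seq = 1; seq <= PAGES; seq++) {
            off_t off = (off_t)perm[seq - 1] * PAGE;
            pwrite(fd, &seq, sizeof seq, off);
            fsync(fd);
        }
    }
    close(fd);
    return 0;
}

Run the writer, cut power partway through, then remount and run it with "verify": since each fsync() should make writes durable in issue order, the surviving sequence numbers must form a contiguous prefix 1..n, and any hole means a later write became durable while an earlier one was lost.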

> Given that ZFS is planned to be used in Snow Leopard, is it worth setting something up for consumer grade appliance vendors to 'certify' against? ("OK, you play nice with ZFS by doing the right things", etc.) Maybe you can give them a 'Gold Star' == 'Supports ZFS'. That'll give them a selling point to consumers and Sun some free marketing?

It would actually be nice in general, I think, not just for ZFS, to have some standard "run this tool" suite that gives you a checklist of successes/failures specifically targeting storage correctness. Though correctness cannot be proven, you can at least test for common cases of systematically incorrect behavior.