Subject: Re: READ CAPACITY 16
From: James Bottomley (Jame...@HansenPartnership.com)
Date: Dec 17, 2008 11:04:01 am
On Wed, 2008-12-17 at 11:06 -0700, Matthew Wilcox wrote:
On Wed, Dec 17, 2008 at 09:50:52AM -0800, Grant Grundler wrote:
Algorithm A (a perfect world):
  Issue RC16
   -> If it fails, issue RC10
   -> If it times out, reset the device, issue RC10

Algorithm B:
  Issue RC10
  Issue RC16
   -> If it succeeds, use its results in preference to those from RC10
   -> If it fails, carry on with the results from RC10
   -> If it times out, reset the device, carry on with the results from RC10
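For readers who prefer code, the two orderings above might be sketched in Python roughly as follows. This is only an illustration, not kernel code: `rc10`, `rc16` and `reset` are hypothetical callables standing in for the real commands and the device reset, `CommandFailed` is an invented exception for a rejected command, and Python's built-in `TimeoutError` models a command timeout. `algorithm_a` issues RC16 first; `algorithm_b` issues RC10 and then RC16.

```python
class CommandFailed(Exception):
    """Hypothetical: the device rejected the command outright."""


def algorithm_a(rc10, rc16, reset):
    """Try RC16 first; fall back to RC10 on failure or timeout."""
    try:
        return rc16()
    except CommandFailed:
        return rc10()
    except TimeoutError:
        reset()  # the device may be wedged, so reset before retrying
        return rc10()


def algorithm_b(rc10, rc16, reset):
    """Issue RC10 first, then prefer RC16's answer if one arrives."""
    result = rc10()
    try:
        return rc16()
    except CommandFailed:
        return result
    except TimeoutError:
        reset()  # recover the device, keep the RC10 result
        return result
```

In this sketch the only behavioural difference is when RC10 is sent: Algorithm B always has an RC10 result in hand before RC16 can hang the device.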
I fail to see an effective difference between Algo A and B.
The difference is whether to issue an RC10 before issuing an RC16. Issuing RC10 first is closer to what we currently do (we currently issue an RC10 and then issue an RC16 if RC10 reports we have 0xffffffff LBAs).
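The current check can be sketched as below. It assumes the standard READ CAPACITY(10) reply layout (a 32-bit big-endian last LBA followed by a 32-bit block length); the function names are invented for illustration and do not correspond to anything in the kernel.

```python
import struct

RC10_OVERFLOW = 0xFFFFFFFF  # RC10 cannot express a larger value


def parse_rc10(data: bytes):
    """Split an 8-byte RC10 reply into (last_lba, block_len)."""
    last_lba, block_len = struct.unpack(">II", data[:8])
    return last_lba, block_len


def needs_rc16(last_lba: int) -> bool:
    """True when RC10 saturated, so RC16 is needed for the real size."""
    return last_lba == RC10_OVERFLOW


def capacity_bytes(last_lba: int, block_len: int) -> int:
    """The reply holds the LAST block address, hence the +1."""
    return (last_lba + 1) * block_len
```

For example, a reply of `ffffffff00000200` would trigger the RC16 follow-up, while `0000ffff00000200` describes a 32 MiB device that RC10 handles on its own.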
The question really is one you already asked:
...The question is what to do about devices that either hang or take a long time to respond to an RC16 command.
A few ideas: 1) maintain a blacklist
Which is obviously what we're trying to avoid doing.
I don't really see a way of avoiding this ... for USB devices it's probably going to be a requirement.
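A blacklist of this kind could be as simple as a lookup keyed on the INQUIRY vendor and product strings. The sketch below is purely illustrative: the single entry is made up, and `rc16_allowed` is a hypothetical helper, not an existing kernel interface.

```python
# Hypothetical entry; not a real device identifier.
NO_RC16 = {
    ("ExampleVendor", "ExampleModel"),
}


def rc16_allowed(vendor: str, model: str) -> bool:
    """Return False for devices listed as unable to cope with RC16.

    INQUIRY strings are space-padded, so strip them before comparing.
    """
    return (vendor.strip(), model.strip()) not in NO_RC16
```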
2) anything in RC10 or IDENTIFY that would clue us about RC16 functionality? If so, then something like B or C would make sense.
RC10 only returns number of LBAs and how many bytes per LBA. I don't see anything in the INQUIRY data (other than the protection bit, which we already use to know that RC16 is supported). We could maybe key off scsi_level > SCSI_2 like scsi_device_protection() does. This would work for ATA SSDs because libata reports SCSI ANSI revision 05, but it won't work for USB devices because they get mangled down to SCSI_2, no matter what they support.
That latter piece is fixable. We can also go with the INQUIRY version descriptor information which I don't think USB mangles.
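The scsi_level idea might look like the following sketch. The numeric values are assumed to mirror the kernel's scsi_level numbering (SCSI_2 as 3, SPC-3 as 6) and should be checked against the headers; `might_support_rc16` is an invented name.

```python
# Assumed to mirror the kernel's scsi_level numbering; verify before relying on it.
SCSI_2 = 3
SCSI_SPC_3 = 6


def might_support_rc16(scsi_level: int, protect: bool = False) -> bool:
    """Heuristic: try RC16 if the protection bit is set or the
    device claims a SCSI level newer than SCSI-2."""
    return protect or scsi_level > SCSI_2
```

A device reporting SPC-3 passes this check, while a USB device clamped to SCSI_2 fails it regardless of what it really supports, which is exactly the limitation raised above.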
3) How long does READ CAPACITY(16) normally take? E.g. at boot time with a drive that isn't spun up yet, or the equivalent from a RAID device. If it's not that long (e.g. < 1 sec or so) then just use a shorter timeout in general? With parallel scanning, it should be tolerably painful.
I don't know how long it'll take. I was hoping people with experience in this matter would chime in.
Actually, we can't afford to send READ CAPACITY(16) to failing devices; some of them never come back.