Subject:SCSI tape data loss
From:Kern Sibbald (ke@sibbald.com)
Date:Jun 3, 2003 9:40:53 am
List:org.freebsd.freebsd-scsi

Yes, I probably should move the clrerror() and the check/set of errno inside the check for "stat == -1". However, the code, though odd, is correct, since I do not use errno unless the status is -1.

Our most recent tests are even more interesting. We are getting the same data loss any time Bacula switches tapes. This means the data loss does not have anything in particular to do with the LEOM or PEOM status.

By the way, the funny casting is mandatory in C++, because ssize_t as returned by the write is not the same as size_t (what is written).

More after I look at the most recent test results.

Best regards,

Kern

On Tue, 2003-06-03 at 18:25, Justin T. Gibbs wrote:

What is clear from the output is that the write() is returning a -1 status. errno could possibly be 0, in which case I set it to ENOSPC; if it is not 0, then it is ENOSPC, judging by the error message that is printed: "Write error on device ...".

You may want to see more, but here is the basic code that does the write:

    if ((uint32_t)(stat=write(dev->fd, block->buf, (size_t)wlen)) != wlen) {
       /* We should check for errno == ENOSPC, BUT many
        * devices simply report EIO when it is full.
        * with a little more thought we may be able to check
        * capacity and distinguish real errors and EOT
        * conditions. In any case, we probably want to
        * simulate an End of Medium.
        */
       clrerror_dev(dev, -1);

Apart from the funny casting, the only obvious bug is that you are expecting errno to be set on every syscall. Errno is only valid if stat == -1 or you explicitly clear it prior to the syscall (or after the last time it was set). You don't seem to be doing that here.

See the errno man page for details.
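
For illustration, here is a minimal sketch in C of the pattern discussed in this exchange: errno is read only when write() returns -1, and a short write is treated as a separate case in which errno is never consulted. The function name write_block, the write_result codes, and the saved_errno out-parameter are hypothetical and are not taken from the Bacula source. How an end-of-medium condition should be mapped onto these cases is left open, since the thread notes that many devices report EIO rather than ENOSPC.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not from Bacula. */
enum write_result { WRITE_OK = 0, WRITE_SHORT = 1, WRITE_FAILED = -1 };

/*
 * Write one block and classify the outcome.
 * On WRITE_FAILED, *saved_errno holds the errno value set by write();
 * in every other case it is 0.
 */
static enum write_result
write_block(int fd, const void *buf, size_t wlen, int *saved_errno)
{
    ssize_t stat;

    *saved_errno = 0;
    stat = write(fd, buf, wlen);
    if (stat == -1) {
        /* errno is only meaningful here, right after the failed call. */
        *saved_errno = errno;
        return WRITE_FAILED;
    }
    if ((size_t)stat != wlen) {
        /* Short write: no error was reported, so errno is not read. */
        return WRITE_SHORT;
    }
    return WRITE_OK;
}
```

Comparing stat against -1 before converting it to size_t keeps the signed and unsigned cases apart, so a single cast is needed and only on a value already known to be non-negative.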