Subject: NFS client/buffer cache deadlock
From: Brian Fundakowski Feldman (gre...@freebsd.org)
Date: Apr 20, 2005 8:54:29 am
On Wed, Apr 20, 2005 at 05:35:28PM +0200, Marc Olzheim wrote:
On Wed, Apr 20, 2005 at 11:20:38AM -0400, Brian Fundakowski Feldman wrote:
Reads should be totally unaffected...
The server was misbehaving. Fixed. :-)
Btw, I'm not sure write(), writev() and pwrite() are allowed to do short writes on regular files...?
Our manpage is incorrect; POSIX states that they are (see earlier e-mail). There really is no alternative -- we simply can't build an NFS transaction larger than our buffer cache can accommodate. Note that short writes won't happen for normal buffer sizes, only excessively large ones. I really don't believe that writev() is meant to be used so that you can write gigantic data structures in a single transaction...
Ah, I was reading the SUSv2 page:
instead of the POSIX version.
But from neither of those can I deduce that it can return a result < nbyte without that being a permanent condition. What phrase makes you conclude that it can?
This specific issue is not clear-cut; the best thing to do lies somewhere within the range of these scenarios:
"If a write() requests that more bytes be written than there is room for (for example, [XSI] [Option Start] the process' file size limit or [Option End] the physical end of a medium), only as many bytes as there is room for shall be written. For example, suppose there is space for 20 bytes more in a file before reaching a limit. A write of 512 bytes will return 20. The next write of a non-zero number of bytes would give a failure return (except as noted below)."
"When attempting to write to a file descriptor (other than a pipe or FIFO) that supports non-blocking writes and cannot accept the data immediately:
* If the O_NONBLOCK flag is clear, write() shall block the calling thread until the data can be accepted.
* If the O_NONBLOCK flag is set, write() shall not block the thread. If some data can be written without blocking the thread, write() shall write what it can and return the number of bytes written. Otherwise, it shall return -1 and set errno to [EAGAIN]."
"[ENOBUFS] Insufficient resources were available in the system to perform the operation."
I think the first is more useful behavior than the last. Supporting it should be exactly the same as supporting what happens if the actual filesystem fills up. In this case, the filesystem is being requested to write more "than there is room for."
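For illustration, this is roughly what a caller has to do if write() may return fewer bytes than requested on a regular file. It is only a minimal sketch: write_all() is a hypothetical helper name, not an existing libc function, and the choice of EIO for a zero-length return is an arbitrary assumption. Note that retrying this way gives up any expectation that the whole buffer goes out in one transaction.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: call write() repeatedly until all len bytes of
 * buf have been written or a real error occurs.
 * Returns 0 on success, -1 with errno set on failure.
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing anything; retry */
            return -1;          /* e.g. the "next write" failing after a short one */
        }
        if (n == 0) {
            /* No progress at all; bail out rather than spin forever.
             * EIO is an arbitrary choice made for this sketch. */
            errno = EIO;
            return -1;
        }
        /* Short write: advance past what was accepted and try again. */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With the "room for 20 bytes" scenario quoted above, the first write() inside the loop would return 20 and the second would fail, so write_all() returns -1 with errno left as set by that failing call. If the short write is instead only a per-call size limit, the loop simply continues until the buffer is drained.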
--
Brian Fundakowski Feldman
<> gre...@FreeBSD.org
[ FreeBSD ] The Power to Serve!
Opinions expressed are my own.