Subject: [ python-Bugs-725265 ] urlopen object's read() doesn't read to EOF
From: SourceForge.net (nore@sourceforge.net)
Date: Mar 25, 2004 12:04:19 pm
List: org.python.python-bugs-list

Bugs item #725265, was opened at 2003-04-21 16:49
Message generated for change (Comment added) made by fdrake
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=725265&group_id=5470

Category: Documentation
Group: Python 2.2.2
Status: Open
Resolution: Fixed
Priority: 5
Submitted By: Christopher Smith (smichr)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: urlopen object's read() doesn't read to EOF

Initial Comment:
On http://python.org/doc/current/lib/module-urllib.html it says that the object returned by urlopen supports the read() method and that this and other methods "have the same interface as for file objects -- see section 2.2.8". In that section, on page http://python.org/doc/current/lib/bltin-file-objects.html, it says about the read() method that "if the size argument is negative or omitted, [read should] read all data until EOF is reached."

I was a bit surprised when a project that students of mine were working on was failing when they tried to process the data obtained by the read() method on a connection made to a web page. The problem, apparently, is that a single read() may not obtain all of the data requested, and the total response has to be built up something like the following:

import urllib
c = urllib.urlopen("http://www.blakeschool.org")
data = ''
while 1:
    packet = c.read()
    if packet == '':
        break
    data += packet

I'm not sure if this is a feature or a bug. Could a file's read method fail to obtain the whole file in one read(), too? It seems that either the documentation should be changed or the read() method for at least urllib objects should be changed.

/c

Christopher P. Smith
The Blake School
Minneapolis, MN

----------------------------------------------------------------------

Comment By: Fred L. Drake, Jr. (fdrake)

Date: 2004-03-25 12:04

Message:
Logged In: YES
user_id=3066

This is an issue with reading from a socket; there's no way to recognize the end of the stream until the remote end of the socket actually closes the socket.

I've documented this limitation in Doc/lib/liburllib.tex 1.52. Someone should backport the patch to Python 2.3.x and close this report.
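
The behaviour described in this comment, where a single read() may return less than the full response and the end of the data is only signalled by an empty result, can be sketched as follows. This is a minimal illustration in Python 3 syntax, not the code in urllib or the documentation patch: read_until_eof, ShortReadStream, chunk_size and limit are hypothetical names invented for the example, and ShortReadStream is an in-memory stand-in for a socket-backed object rather than a real network connection.

```python
import io


class ShortReadStream:
    """Hypothetical stand-in for a socket-backed stream.

    Each read() returns at most `limit` bytes, even when the caller
    asks for everything, which mimics a short read from a socket.
    """

    def __init__(self, payload, limit):
        self._buffer = io.BytesIO(payload)
        self._limit = limit

    def read(self, size=-1):
        # Cap every request at `limit` bytes, including read() with no size.
        if size is None or size < 0 or size > self._limit:
            size = self._limit
        return self._buffer.read(size)


def read_until_eof(stream, chunk_size=8192):
    """Keep calling read() until it returns an empty bytes object.

    An empty result is the only EOF signal available here; with a real
    socket it appears once the remote end has closed the connection.
    """
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    # Joining once at the end avoids repeated concatenation in the loop.
    return b"".join(chunks)


if __name__ == "__main__":
    payload = b"x" * 10000
    print(len(ShortReadStream(payload, 1500).read()))          # one short read
    print(len(read_until_eof(ShortReadStream(payload, 1500))))  # whole payload
```

The loop has the same shape as the one in the initial comment; collecting the pieces in a list and joining them once is only a stylistic choice for the sketch.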

----------------------------------------------------------------------