

4 messages in org.python.python-bugs-list: [ python-Bugs-914148 ] xml.sax segfau...

| From | Sent On | Attachments |
|---|---|---|
| SourceForge.net | Mar 11, 2004 9:13 am | |
| SourceForge.net | Mar 15, 2004 5:21 am | |
| SourceForge.net | Mar 15, 2004 5:26 am | |
| SourceForge.net | Mar 19, 2004 5:45 pm | |

| Subject: | [ python-Bugs-914148 ] xml.sax segfault on error |
|---|---|
| From: | SourceForge.net (nore...@sourceforge.net) |
| Date: | Mar 15, 2004 5:26:35 am |
| List: | org.python.python-bugs-list |

Bugs item #914148, was opened at 2004-03-11 06:14
Message generated for change (Comment added) made by moraes
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=914148&group_id=5470

Category: XML
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Adam Sampson (adamsampson)
Assigned to: Nobody/Anonymous (nobody)
Summary: xml.sax segfault on error

Initial Comment:
While (mistakenly) using Mark Pilgrim's feedparser module to parse data from <http://www.gothamist.com/archives/news_nyc/index.php>, Python segfaults when it should invoke an error handler for invalid XML.

The attached code demonstrates the problem; it occurs with Python 2.2.3 and 2.3.3 on my system. I've tried to chop the example data down as far as possible, but reducing it any further doesn't exhibit the problem (it's currently just above 64k, which might be a coincidence).

The gdb traceback I get from the example is as follows:

#0 normal_updatePosition (enc=0x404a4fc0, ptr=0x40682000 <Address 0x40682000 out of bounds>, end=0x81e87e0 "a></div>\n\n<div id=\content\>\n\n<div class=\blog\>\n<!--\n<rdf:RDF xmlns:rdf=\http://www.w3.org/1999/02/22-rdf-syntax-ns#\\n xmlns:trackback=\http://madskills.com/public/xml/rss/module/trackback/\\n"..., pos=0x81e7dac) at /120g/gar/python/python23/work/Python-2.3.3/Modules/expat/xmltok_impl.c:1745
#1 0x40484288 in XML_GetCurrentLineNumber (parser=0x81e7c18) at /120g/gar/python/python23/work/Python-2.3.3/Modules/expat/xmlparse.c:1605
#2 0x40481fc5 in set_error (self=0x0, code=XML_ERROR_TAG_MISMATCH) at /120g/gar/python/python23/work/Python-2.3.3/Modules/pyexpat.c:124
#3 0x40480ae7 in xmlparse_Parse (self=0x402fddac, args=0x0) at /120g/gar/python/python23/work/Python-2.3.3/Modules/pyexpat.c:888
#4 0x080fc25a in PyCFunction_Call (func=0x402faa0c, arg=0x402f338c, kw=0xfffffffb) at Objects/methodobject.c:108
#5 0x080aa674 in call_function (pp_stack=0xbffff03c, oparg=0) at Python/ceval.c:3439
#6 0x080a8a2e in eval_frame (f=0x816e45c) at Python/ceval.c:2116
#7 0x080a95bc in PyEval_EvalCodeEx (co=0x40303de0, globals=0xfffffffb, locals=0x0, args=0x816e5a8, argcount=2, kws=0x816a9fc, kwcount=0, defs=0x40321678, defcount=1, closure=0x0) at Python/ceval.c:2663
#8 0x080aa729 in fast_function (func=0xfffffffb, pp_stack=0xbffff1bc, n=2, na=0, nk=135703028) at Python/ceval.c:3529
#9 0x080aa56c in call_function (pp_stack=0xbffff1bc, oparg=0) at Python/ceval.c:3458
#10 0x080a8a2e in eval_frame (f=0x816a894) at Python/ceval.c:2116
#11 0x080a95bc in PyEval_EvalCodeEx (co=0x402fd2a0, globals=0xfffffffb, locals=0x0, args=0x402f3318, argcount=2, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2663
#12 0x080fbda7 in function_call (func=0x4030617c, arg=0x402f330c, kw=0x0) at Objects/funcobject.c:504
#13 0x0805b899 in PyObject_Call (func=0x40682000, arg=0x0, kw=0x0) at Objects/abstract.c:1755
#14 0x08062288 in instancemethod_call (func=0x4030617c, arg=0x402f330c, kw=0x0) at Objects/classobject.c:2433
#15 0x0805b899 in PyObject_Call (func=0x40682000, arg=0x0, kw=0x0) at Objects/abstract.c:1755
#16 0x080aa892 in do_call (func=0x4032025c, pp_stack=0x402f330c, na=0, nk=0) at Python/ceval.c:3644
#17 0x080aa4f9 in call_function (pp_stack=0xbffff5fc, oparg=0) at Python/ceval.c:3460
#18 0x080a8a2e in eval_frame (f=0x818b414) at Python/ceval.c:2116
#19 0x080aa7ad in fast_function (func=0xfffffffb, pp_stack=0xbffff71c, n=2, na=0, nk=1076865996) at Python/ceval.c:3518
#20 0x080aa56c in call_function (pp_stack=0xbffff71c, oparg=0) at Python/ceval.c:3458
#21 0x080a8a2e in eval_frame (f=0x8183814) at Python/ceval.c:2116
#22 0x080a95bc in PyEval_EvalCodeEx (co=0x402ed2a0, globals=0xfffffffb, locals=0x0, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2663
#23 0x080abdb9 in PyEval_EvalCode (co=0x0, globals=0x0, locals=0x0) at Python/ceval.c:537
#24 0x080d7d2b in run_node (n=0x402bb79c, filename=0x0, globals=0x0, locals=0x0, flags=0x0) at Python/pythonrun.c:1265
#25 0x080d74df in PyRun_SimpleFileExFlags (fp=0x8139050, filename=0xbffffa4d "testexpat.py", closeit=-1073743283, flags=0xbffff878) at Python/pythonrun.c:862
#26 0x08054dd5 in Py_Main (argc=1, argv=0xbffff8f4) at Modules/main.c:415
#27 0x0805492b in main (argc=0, argv=0x0) at Modules/python.c:23

----------------------------------------------------------------------

Comment By: Mark Moraes (moraes)
Date: 2004-03-15 02:26

Message:
Logged In: YES
user_id=390363

#! /usr/bin/env python

# The head contains an unterminated character reference ("&raquo" without
# the trailing semicolon); roughly 64k of filler is inserted before dtail.
dhead = """<?xml version="1.0" encoding="ISO-8859-1" ?>
<item><title>&raquo</title></item>
<item><title>
"""
dtail = """</title></item>
"""

import xml.sax
from cStringIO import StringIO as _StringIO

class _StrictFeedParser:
    def _err(self, errtype, exc):
        print errtype, exc.getMessage(), 'line', \
            exc.getLineNumber(), 'column', exc.getColumnNumber()

    def fatalError(self, exc):
        self._err('fatalError', exc)
        # raise exc # avoids the problem

    def error(self, exc):
        self._err('error', exc)

    def warning(self, exc):
        self._err('warning', exc)

def parse(data):
    feedparser = _StrictFeedParser()
    saxparser = xml.sax.make_parser(["drv_libxml2"])
    saxparser.setErrorHandler(feedparser)
    source = xml.sax.xmlreader.InputSource()
    source.setByteStream(_StringIO(data))
    saxparser.parse(source)

if __name__ == '__main__':
    # Sweep the filler length across the 64k boundary.
    for i in xrange(65427,66000,1):
        print i
        parse(dhead + 'x'*i + dtail)

----------------------------------------------------------------------

Comment By: Mark Moraes (moraes)
Date: 2004-03-15 02:22

Message:
Logged In: YES
user_id=390363

I ran into this as well -- turns out that 64k is relevant: I have a simpler script that reproduces this problem -- create an unterminated character ref such as "&laquo" without the trailing semi-colon and add roughly 64k of data after it. The crash occurs if the sax parser has an ErrorHandler set where the fatalError() method returns normally instead of terminating/raising the exception.

As a defensive measure, I suggest that any call to the fatalError method be followed by a raise of the exception if fatalError returns.
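
At the application level, the same defensive idea can be sketched as a wrapper around an existing error handler. This is only an illustration under stated assumptions, not the change made to pyexpat or xml.sax: `RaisingErrorHandler`, `RecordingHandler`, and `parse_strict` are hypothetical names, and the code is written in current Python syntax rather than that of Python 2.3.

```python
import io
import xml.sax
from xml.sax.handler import ErrorHandler
from xml.sax.xmlreader import InputSource


class RaisingErrorHandler(ErrorHandler):
    """Hypothetical wrapper: delegate to another handler, but never let
    fatalError() return normally."""

    def __init__(self, inner):
        self._inner = inner

    def warning(self, exc):
        self._inner.warning(exc)

    def error(self, exc):
        self._inner.error(exc)

    def fatalError(self, exc):
        # Give the wrapped handler a chance to log or record the problem.
        self._inner.fatalError(exc)
        # If it returned instead of raising, stop the parse here.
        raise exc


class RecordingHandler(ErrorHandler):
    """Hypothetical handler that, like the one in the report, only notes
    each problem and returns."""

    def __init__(self):
        self.messages = []

    def warning(self, exc):
        self.messages.append(("warning", exc.getMessage()))

    def error(self, exc):
        self.messages.append(("error", exc.getMessage()))

    def fatalError(self, exc):
        self.messages.append(("fatalError", exc.getMessage()))


def parse_strict(data, inner):
    """Parse bytes with the default SAX driver, wrapping `inner` so that
    a fatal error always ends the parse with an exception."""
    parser = xml.sax.make_parser()
    parser.setErrorHandler(RaisingErrorHandler(inner))
    source = InputSource()
    source.setByteStream(io.BytesIO(data))
    parser.parse(source)
```

With this wrapper, a caller that wants to keep going after a bad document catches `xml.sax.SAXParseException` around `parse_strict` instead of relying on `fatalError()` returning.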
----------------------------------------------------------------------







