Subject: [saxon] problem with lost namespace with Saxon 9.3 and 9.2 and htmlparser 1.3
From: Martin Honnen (Mart@arcor.de)
Date: Feb 25, 2011 6:42:54 am
List: net.sourceforge.lists.saxon-help

I am trying to use Saxon 9.3.0.4 PE with Java 1.6 to parse HTML documents with Henri Sivonen's htmlparser 1.3 (http://about.validator.nu/htmlparser/), simply by passing -x:nu.validator.htmlparser.sax.InfosetCoercingHtmlParser on Saxon's command line. When I use an HTML(5) input document containing both HTML and SVG elements:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>HTML(5) parsing and XSLT transformation test</title>
  </head>
  <body>
    <h1 class=heading>HTML(5) parsing and XSLT transformation test</h1>
    <p>
      This is paragraph 1 with some inline SVG:
      <svg with="100" height="100">
        <circle cx="50" cy="50" r="20" fill="green"></circle>
      </svg>
    <p>This is the next paragraph.
    </p>
  </body>
</html>

and run it through an XSLT stylesheet

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:output method="xml"/>

<xsl:template match="/">
  <xsl:comment>
    <xsl:text>Processed with XSLT version </xsl:text>
    <xsl:value-of select="system-property('xsl:version')"/>
    <xsl:text> by XSLT processor </xsl:text>
    <xsl:value-of select="system-property('xsl:vendor')"/>
    <xsl:text>.</xsl:text>
  </xsl:comment>
  <xsl:copy-of select="node()"/>
</xsl:template>

</xsl:stylesheet>

which simply makes a deep copy of the document's child nodes (and writes some system properties into a comment for debugging), I would expect the SVG elements to be in the SVG namespace. However, the result I get with Saxon 9.3.0.4 is as follows:

<?xml version="1.0" encoding="UTF-8"?><!--Processed with XSLT version 2.0 by XSLT processor SAXON 9.3.0.4 from Saxonica.--><html xmlns="http://www.w3.org/1999/xhtml" lang="en"><head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> <title>HTML(5) parsing and XSLT transformation test</title> </head> <body> <h1 class="heading">HTML(5) parsing and XSLT transformation test</h1> <p> This is paragraph 1 with some inline SVG: <svg with="100" height="100"> <circle cx="50" cy="50" r="20" fill="green"/> </svg> </p><p>This is the next paragraph. </p>

</body></html>

so in that result document the SVG elements "svg" and "circle" end up in the XHTML namespace.

When I use Saxon 9.1.0.8 (Basic) I get the expected result:

<?xml version="1.0" encoding="UTF-8"?><!--Processed with XSLT version 2.0 by XSLT processor SAXON 9.1.0.8 from Saxonica.--><html xmlns="http://www.w3.org/1999/xhtml" lang="en"><head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> <title>HTML(5) parsing and XSLT transformation test</title> </head> <body> <h1 class="heading">HTML(5) parsing and XSLT transformation test</h1> <p> This is paragraph 1 with some inline SVG: <svg xmlns="http://www.w3.org/2000/svg" with="100" height="100"> <circle cx="50" cy="50" r="20" fill="green"/> </svg> </p><p>This is the next paragraph. </p>

</body></html>

Even with Saxon 6.5.5 I get the SVG elements in the SVG namespace and not in the XHTML namespace. With Saxon 9.2.1.5 the SVG elements end up in the XHTML namespace, as with Saxon 9.3.
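
The difference between the two results can also be checked mechanically instead of by eye. The following is only a minimal sketch, assuming Python is at hand; the two strings are abbreviated copies of the results quoted above (trimmed to the relevant elements, namespace declarations unchanged), and the helper name element_namespaces is made up for this illustration:

```python
import xml.etree.ElementTree as ET

# Abbreviated copies of the two serialized results quoted in this message.
# Assumption: only the elements relevant to the namespace question are kept.
RESULT_9_3 = (
    '<html xmlns="http://www.w3.org/1999/xhtml" lang="en"><body><p>'
    '<svg with="100" height="100">'
    '<circle cx="50" cy="50" r="20" fill="green"/>'
    '</svg></p></body></html>'
)
RESULT_9_1 = (
    '<html xmlns="http://www.w3.org/1999/xhtml" lang="en"><body><p>'
    '<svg xmlns="http://www.w3.org/2000/svg" with="100" height="100">'
    '<circle cx="50" cy="50" r="20" fill="green"/>'
    '</svg></p></body></html>'
)


def element_namespaces(xml_text):
    """Return (local name, namespace URI) for every element in document order."""
    pairs = []
    for elem in ET.fromstring(xml_text).iter():
        tag = elem.tag
        if tag.startswith("{"):
            # ElementTree stores qualified names as "{uri}local".
            uri, local = tag[1:].split("}", 1)
        else:
            uri, local = "", tag
        pairs.append((local, uri))
    return pairs


if __name__ == "__main__":
    for label, text in (("Saxon 9.3.0.4", RESULT_9_3), ("Saxon 9.1.0.8", RESULT_9_1)):
        print(label)
        for local, uri in element_namespaces(text):
            print("  %s: %s" % (local, uri))
```

Because the 9.3 result carries no xmlns declaration on "svg", that element and its "circle" child inherit the default XHTML namespace from the "html" root, which is exactly the problem described here.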

Since the SVG elements are in the SVG namespace with Saxon 6.5.5 and 9.1.0.8 but not with 9.2.1.5 and 9.3.0.4, this might be a bug in the recent Saxon versions rather than in htmlparser. What do you think?
