Subject: Re: [xml-dev] Pragmatic namespaces
From: Henri Sivonen (hsiv@iki.fi)
Date: Aug 13, 2009 11:47:16 am
List: org.xml.lists.xml-dev

On Aug 1, 2009, at 02:06, Micah Dubinko wrote:

Literally for years, people have been talking about how great it would be to use something like Java-style namespaces in XML instead of the current xmlns regime. For example
<http://www.xml.com/pub/a/2005/04/13/namespace-uris.html>

I'm glad to see that here over in the XML land, people who've worked with Namespaces show appropriate discontent with them. I wish the RDFa land took note.

Requirement: this solution must not interfere with existing HTML elements or attributes

Point 1: Any element name with no dots in it is treated as HTML (including HTML rules on handling unrecognized elements)

I'd go further and say that for processing purposes, any element with dots needs to be treated per HTML rules (where "HTML rules" means the HTML5 parsing algorithm).

Requirement: this solution must allow for distributed creation of globally-unique namespace names (including those outside of a consensus process)

This works if it is a naming convention but the HTML parser & DOM don't do any novel processing based on this convention.

(It follows that ASCII letters A to Z get folded into a to z by the tokenizer and ASCII letters a to z get folded into A to Z by DOM Level 1 getters when the owner document has its HTMLness bit set, so you can't make com.example.foo and com.example.FOO be distinct.)
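To make the case-folding point concrete, here is a minimal sketch of the two foldings described above. The function names are hypothetical and only model the behavior; they are not parser or DOM APIs.

```javascript
// Models the tokenizer step: ASCII A-Z in a tag name is folded to a-z.
// Hypothetical name, for illustration only.
function foldTagNameLikeTokenizer(name) {
  return name.replace(/[A-Z]/g, (c) => c.toLowerCase());
}

// Models what a DOM Level 1 getter such as tagName reports for an HTML
// element in an HTML document: ASCII a-z is folded to A-Z.
// Hypothetical name, for illustration only.
function reportTagNameLikeHtmlDom(name) {
  return name.replace(/[a-z]/g, (c) => c.toUpperCase());
}

const lower = foldTagNameLikeTokenizer("com.example.foo");
const mixed = foldTagNameLikeTokenizer("com.example.FOO");

console.log(lower === mixed);                 // true: the two names collapse
console.log(reportTagNameLikeHtmlDom(lower)); // COM.EXAMPLE.FOO
```

Either way, the case distinction is gone by the time a script sees the element.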

Point 2: Any element with one or more dots in it is treated as an extension element.

As long as "treated" is a social convention and not something software acts on, so far so good.

I think syntax-wise this is the best "distributed extensibility" proposal I've seen for HTML5. (It's similar to the microdata section in HTML5.) Thank you!

The portion after the last dot is considered the localname, and the portion up to but not including the last dot is parsed as the pragmatic namespace name (or pname for short). Interfaces with existing namespace-aware APIs must treat the pname as the namespace URI. With the exception noted below, to prevent clashes pnames must be based on reversed DNS names.

Example:

  <head>
    <title>Document title</title>
    <com.example.project>
      <com.example.id>123521123</com.example.id>
    </com.example.project>
  </head>

In this example document.getElementsByTagName("id") would return the innermost element. So would document.getElementsByTagNameNS("com.example", "id")

I think here your proposal goes into the weeds.

The #1 flaw with Namespaces & DOM Level 2 is that the identifiers that are fundamental to the operation of software were different from the identifiers in plain XML 1.0 or DOM Level 1. Your proposal repeats this mistake by making the platform behave radically differently if you have a JS program running on a browser that doesn't implement your proposal and if you have the same JS program running on a browser that implements your proposal.

In your example, the local name of the innermost element MUST be "com.example.id" for compatibility with existing behavior. Changing what document.getElementsByTagName() returns here is not something that's open for discussion. (As in, the probability of a browser vendor shipping with the API behavior change is virtually zero.)

The namespace of the innermost element as reported by the DOM isn't really open for discussion, either. In an HTML5-compliant UA it is "http://www.w3.org/1999/xhtml", because this unifies the DOM with the XHTML5 side, where the namespace is constrained by the XHTML legacy to be "http://www.w3.org/1999/xhtml". In legacy UAs, the namespace is null.

It would be OK to use the naming convention you propose in markup, deliver a helper JS library along with your JS application code, and let your own helper library expand "id" to "com.example.id" before passing it to document.getElementsByTagName(). Such a helper library would immediately run on past, present and future browsers without needing any DOM or parser infrastructural work.
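A minimal sketch of such a helper might look like the following. The names createPnameHelper, expand and getElementsByLocalName are hypothetical, invented for illustration; the only real DOM call used is getElementsByTagName().

```javascript
// Hypothetical helper: binds a reversed-DNS prefix (the "pname") and
// expands short local names to the full dotted element name.
function createPnameHelper(pname) {
  // "id" -> "com.example.id" when pname is "com.example"
  const expand = (localName) => `${pname}.${localName}`;

  return {
    expand,
    // Delegates to the ordinary DOM Level 1 lookup with the expanded name,
    // so no parser or DOM changes are needed.
    getElementsByLocalName(doc, localName) {
      return doc.getElementsByTagName(expand(localName));
    },
  };
}

// Usage in a browser (document is the page's Document object):
//   const example = createPnameHelper("com.example");
//   const ids = example.getElementsByLocalName(document, "id");
```

Because the expansion happens in script before the DOM is consulted, the document itself stays an ordinary DOM with "com.example.id" as the element's local name.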

Requirement: it is highly desirable to produce a document that will produce the same element names in HTML or XML

Agreed. This is basically the DOM Consistency Design principle of HTML5: http://www.w3.org/TR/html-design-principles/#dom-consistency

Point 3: Zero or more special attributes of the form using.<pname> may appear on the root element, and ONLY on the root element. The declarations have document-wide scope.

Can't have this, because agents implementing your proposal and legacy agents would get radically different DOMs.

Requirement: widely-known namespaces must parse to an equivalent DOM as xmlns

For practical purposes, the Web platform has four markup languages: HTML, SVG, MathML and ARIA. HTML5 already covers the namespace assignment of HTML, SVG and MathML. ARIA doesn't need special treatment, because it consists entirely of no-namespace attributes.

It's plausible that XBL2 joins the markup language family of the platform. However, it's more problematic from the text/html point of view. More on that below.

What's the use case for embedding Atom in text/html?

Browsers don't support Docbook now. Having syntax for it isn't the major part. Supporting all the elements in ways appropriate for their semantics would be non-trivial. I think this doesn't belong in HTML5.

Already covered by HTML5 without new syntax.

Already covered by HTML5 with syntax that is compatible with copying MathML markup from XML and pasting into text/html.

Already covered by HTML5 with syntax that is compatible with copying SVG markup from XML and pasting into text/html.

This is being replaced with XBL2. As far as I'm aware, other vendors haven't shown interest in implementing the original Mozilla XBL.

XBL2 markup can embed XHTML subtrees in rather arbitrary ways. This kind of nesting wouldn't work in a backwards-compatible way in text/html when the nested HTML elements interfere with the element within which the XBL2 subtree has been embedded. In particular, one would want to put the XBL2 subtree inside <head>, but having e.g. <div> as a descendant of <head> is not a viable option.

XForms hasn't been implemented as a native feature in any of the top 4 browser engines. Having namespace syntax for XForms for text/html would be unlikely to change this. In fact, the whole HTML5 effort got started as an alternative to the XForms vision.

However, I'd welcome JS libraries, such as Ubiquity XForms, to implement XForms behavior using syntax like <xforms.input>, because the dotted syntax results in a consistent DOM in HTML5 and XHTML5 unlike the colonified syntax.
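The consistency argument can be sketched with a deliberately simplified model of how each syntax arrives at a local name. Both functions below are hypothetical stand-ins, not real parser APIs: the first assumes the text/html parser keeps the whole (lowercased) tag name as the local name, and the second assumes a namespace-aware XML parser splits a qualified name at the colon.

```javascript
// Simplified model of text/html: the whole tag name, ASCII-lowercased,
// becomes the local name. A colon is just another name character.
function localNameFromTextHtml(tagName) {
  return tagName.replace(/[A-Z]/g, (c) => c.toLowerCase());
}

// Simplified model of namespace-aware XML: the part after the colon is
// the local name; a name without a colon is used as-is.
function localNameFromXml(qualifiedName) {
  const colon = qualifiedName.indexOf(":");
  return colon === -1 ? qualifiedName : qualifiedName.slice(colon + 1);
}

function sameLocalNameInBothSyntaxes(name) {
  return localNameFromTextHtml(name) === localNameFromXml(name);
}

console.log(sameLocalNameInBothSyntaxes("xforms.input")); // true
console.log(sameLocalNameInBothSyntaxes("xforms:input")); // false
```

Under these assumptions a library that looks up "xforms.input" sees the same name in both serializations, while the colon form gives "xforms:input" in one and "input" in the other.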

The XLink 1.0 names are already covered in HTML5 when they appear on SVG or MathML elements. Generic XLink itself is pretty dead. SVG implementations have to implement SVG-specific semantics for the XLink names instead of being able to use generic XLink code.

HTML5 already assigns xml:lang, xml:space and xml:base to the http://www.w3.org/XML/1998/namespace namespace when used on SVG or MathML elements.

xml:id is not supported, because HTML, MathML and SVG all already have an id attribute that works just fine. xml:id just adds complexity.

As for HTML elements, there's already the lang attribute and <pre> has built-in whitespace significance. xml:base is not supported on HTML elements in the text/html serialization.

Example:

  <html using.math="math">...
    <p> E.g. <math><msqrt><mi>π</mi></msqrt></math> </p>
  ...</html>

This already works in HTML5 without even having to use the using.math stuff. I invite you to try it in a trunk nightly build of Firefox after you've set the preference html5.enable to true in about:config.

See http://hsivonen.iki.fi/test-html5-parsing/

In this example document.getElementsByTagName("mi") would return the innermost element. So would document.getElementsByTagNameNS("http://www.w3.org/1998/Math/MathML/", "mi")

Already works. You can try this with a nightly build of Firefox with html5.enable set to true.

Requirement: must support HTML nested inside an extension vocabulary.

Point 5: Unless overridden, HTML documents are treated as if all localnames explicitly listed in the specification are HTML boundary elements.

Example:

  <html using.svg="svg">
    <body>
      <svg version="1.1" viewBox="0 0 100 100" preserveAspectRatio="xMidYMid slice">
        <rect x="10" y="10" width="100" height="150" fill="gray"/>
        <foreignObject x="10" y="10" width="100" height="150">
          <body>
            <div>Here is a <strong>paragraph</strong>.</div>
          </body>
        </foreignObject>
      </svg>
    </body>
  </html>

Here the inner body element and its children are still treated as HTML.

Already works in HTML5 without having to use "using.svg". You can try this with a nightly build of Firefox with html5.enable set to true.

Another example:

  <html using.xforms="model select1 range secret">
    <head>
      <model>...</model>
    </head>
    <body>
      <xforms.input>...
    </body>
  </html>

In this case, "input" is already used as an HTML element name, so uses of it--even with the using statement at the top--need to be explicitly spelled out. Of course, the author could have overridden this by including "input" in the using statement, but then any regular HTML input controls would need to be spelled <html.input>. Just like in Java.

This would be highly backwards-incompatible. HTML5 extends HTML forms so that the new form features together with JavaScript cover the use case space that XForms covers.

-- Henri Sivonen hsiv@iki.fi http://hsivonen.iki.fi/
