

21 messages in org.w3.www-html: RE: Tag Soup vs Generalized Markup (w...

| From | Sent On | Attachments |
|---|---|---|
| Larry Masinter | Sep 23, 1999 12:15 pm | |
| Jukk...@hut.fi | Sep 24, 1999 2:44 am | |
| Walter Ian Kaye | Sep 24, 1999 10:39 am | |
| Arjun Ray | Sep 27, 1999 6:31 pm | |
| Harald Tveit Alvestrand | Sep 28, 1999 10:51 am | |
| Arjun Ray | Sep 28, 1999 5:01 pm | |
| Harald Tveit Alvestrand | Sep 30, 1999 11:41 am | |
| Arjun Ray | Sep 30, 1999 11:55 pm | |
| Larry Masinter | Oct 4, 1999 1:06 pm | |
| Arjun Ray | Oct 4, 1999 9:56 pm | |
| Larry Masinter | Oct 5, 1999 6:55 am | |
| Arjun Ray | Oct 6, 1999 12:04 am | |
| Arjun Ray | Oct 6, 1999 3:29 am | |
| Rick Jelliffe | Oct 6, 1999 6:37 am | |
| Arjun Ray | Oct 6, 1999 6:39 am | |
| Russell Steven Shawn O'Connor | Oct 6, 1999 6:42 am | |
| Arjun Ray | Oct 6, 1999 10:17 am | |
| Arjun Ray | Oct 7, 1999 10:00 pm | |
| Larry Masinter | Oct 8, 1999 2:55 am | |
| Jukk...@hut.fi | Oct 8, 1999 3:27 am | |
| Arjun Ray | Oct 8, 1999 3:49 pm |

| Subject: | RE: Tag Soup vs Generalized Markup (was: I-D ACTION..) |
|---|---|
| From: | Arjun Ray (ar...@q2.net) |
| Date: | Oct 8, 1999 3:49:42 pm |
| List: | org.w3.www-html |

On Fri, 8 Oct 1999, Larry Masinter wrote:
> In addition to the development of standards, a wide variety of additional extensions, restrictions, and modifications to HTML were popularized by the competitive implementations of Netscape Navigator and Microsoft Internet Explorer and documented in various books and online guides.

Maybe only "historical" names are best? So, replace
popularized by the competitive implementations of Netscape Navigator and Microsoft Internet Explorer
with
: popularized by competitive implementations derived mainly from
: the Mosaic browser of the NCSA [add reference]
> I would be happy to include a reference to a "Tag Soup spec" if I could find one that would be suitable for a references list in an RFC.

Actually, I was suggesting that a Tag Soup spec be written for the I-D to point to.
> I'm uneasy about recommending one popular HTML book over another,

I agree. I doubt a suitable book could be found, because all such books are about "how to use HTML" rather than the dry details that go into a spec.
> and can't find any stable reference to something that would constitute an "official guide to Mozilla and/or MSIE HTML tags".

Jukka seems to have covered that ground reasonably well, so let me elaborate on what I meant by a Tag Soup spec. It's probably best understood in terms of how Mosaic's parser used to work, and the "normal" way in which common word processing software is used: keep on typing and, where needed, smack a function key or toolbar button to insert a "command code". (The "Reveal Codes" feature of WordPerfect is perhaps canonical in this respect.)
Basically, an HTML document is treated as a flat stream of text punctuated by "marks"; each "mark" involves a collection of toggles and/or counters aimed at a "global processing state". By design, these primitives should be orthogonal, but they may interact in ad hoc ways; even so, the idea is to avoid as far as possible ever having to "stack". In any case, each individual HTML tag should be independently treated as a macro expanding to these commands.
For instance, the header tags could all be treated as affecting a global value of "font size", with a default re-established upon "cancellation" via an end-tag. Any such end-tag, in fact, so that something like this should work swimmingly well:
<h2>Hello <h3>World!</h1>
(Read: change to font size h2, print "Hello ", break a line and a half and change to font size h3, print "World!", break two lines and cancel font changes to reset font default)
or something like this if bold and italics can be independently varied:
<b>bold stuff<i>and italicized</b>just italics</i>
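The toggle model described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a reconstruction of Mosaic's parser: the names `HEADER_SIZES`, `DEFAULT_SIZE`, and `render_runs`, and the point sizes, are all hypothetical, and line breaks are ignored. The point it shows is that both examples go through without a stack, because each tag only flips a global value.

```python
import re

# Hypothetical mapping of header tags to a global "font size" value.
HEADER_SIZES = {"h1": 24, "h2": 18, "h3": 14}
DEFAULT_SIZE = 12

# Either a tag (optional "/", a name, anything up to ">") or a run of text.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)")


def render_runs(source):
    """Return (text, bold, italic, size) runs using global toggles, no stack."""
    state = {"bold": False, "italic": False, "size": DEFAULT_SIZE}
    runs = []
    for closing, name, text in TOKEN.findall(source):
        if text:
            runs.append((text, state["bold"], state["italic"], state["size"]))
            continue
        name = name.lower()
        if name == "b":
            state["bold"] = not closing
        elif name == "i":
            state["italic"] = not closing
        elif name in HEADER_SIZES:
            # Any header end-tag cancels the font change, whichever tag set it.
            state["size"] = DEFAULT_SIZE if closing else HEADER_SIZES[name]
    return runs


for run in render_runs("<b>bold stuff<i>and italicized</b>just italics</i>"):
    print(run)
```

Under these assumptions the overlapping `<b>`/`<i>` example yields three runs (bold, bold and italic, italic only), and `</h1>` resets the size set by `<h3>` without complaint.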
The fact that this represents a fundamental misunderstanding of SGML syntax is irrelevant. The outward form of the borrowed syntax is being mapped to a different mental model. As Eric Bina was known to say: "This is not Rocket Science". [I nominate this for an epigrammatic quote should the spec elect to have one.]
The mental model, in turn, is oriented towards a bunch of styling primitives. So, we would need a taxonomy of the various "marks", perhaps in alphabetical order for easy reference, and with notes on potential -ahem- "interactions". For example, UL is really just some geek's idea of obfuscating the plain English word "indent"; most of the time LI (more obfuscation for "smack-bullet") is found after it, so the section on LI should mention that it's advisable to always indent bullets. Another example would be how DD ("wide indent") is customarily cancelled by /DL. And so on.
If all this is making anyone uneasy, let it be noted that the source code for Mosaic was always available (at least the X version), so what was going on was no secret, and yet there were few if any complaints (even on the www-talk mailing list.) There are situations where silence implies approval.
So, the Tag Soup spec consists of three parts, which could be factored into separate documents.
1. A Lexical Specification
This deals directly with tag syntax. Dan Connolly's paper on sgml-lex is an excellent model, were all the SGML references removed.
- no need to explain selected="selected" or ismap="ismap", and no need to mention that <h1 center> doesn't work.
- quoting attribute values can be made "functional": needed only to prevent misparsing of whitespace or '>'.
- "Comment tags" Made Easy.
- <!junk decl> considered legal.
- no need to have stuff "forbidden by this report" (PIs, Marked Sections, etc.)

and so on.
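A lexical layer along these lines might look like the following Python sketch. It is an assumption-laden illustration, not Dan Connolly's sgml-lex: the names `TAG`, `DECL`, `ATTR`, `parse_attrs`, and `tokenize` are hypothetical, quotes are honored only to protect whitespace and '>' inside a value, bare attribute names are accepted as-is, and anything of the form `<!...>` is taken as one opaque mark ending at the first '>'.

```python
import re

# "<", optional "/", a name, then attribute text in which quoted strings
# may contain ">" and whitespace.
TAG = re.compile(r"""<(/?)([A-Za-z][A-Za-z0-9]*)((?:"[^"]*"|'[^']*'|[^>"'])*)>""")
# Any <!...> construct is accepted as a single opaque mark.
DECL = re.compile(r"<![^>]*>")
# name, optionally followed by = and a quoted or unquoted value.
ATTR = re.compile(
    r"""([A-Za-z][-A-Za-z0-9]*)(?:\s*=\s*("[^"]*"|'[^']*'|[^\s"'>]+))?"""
)


def parse_attrs(raw):
    """Turn the raw attribute text of a tag into a dict."""
    attrs = {}
    for name, value in ATTR.findall(raw):
        if value[:1] in ('"', "'"):
            value = value[1:-1]  # quotes are purely "functional"
        attrs[name.lower()] = value  # bare names such as ismap map to ""
    return attrs


def tokenize(source):
    """Split source into ("decl", ...), ("start"/"end", ...) and ("text", ...)."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = DECL.match(source, pos)
        if m:
            tokens.append(("decl", m.group(0)))
            pos = m.end()
            continue
        m = TAG.match(source, pos)
        if m:
            kind = "end" if m.group(1) else "start"
            tokens.append((kind, m.group(2).lower(), parse_attrs(m.group(3))))
            pos = m.end()
            continue
        # Not a recognizable mark: consume text up to the next "<".
        nxt = source.find("<", pos + 1)
        if nxt == -1:
            nxt = len(source)
        tokens.append(("text", source[pos:nxt]))
        pos = nxt
    return tokens


print(tokenize('<img ismap alt="a > b" src=x.gif>'))
print(tokenize("<!junk decl>text"))
```

Nothing here is ever rejected: input that does not look like a mark simply falls through as text, which is the forgiving behavior the list above asks for.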
2. An Interaction Specification
- common combinations of marks, e.g. UL + LI and DD + /DL
- known "no-ops" such as /LI and /DT

This is where on-line guides and the like could prove useful.
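One way to picture such an interaction table is as a mapping from each mark to the primitive commands it expands to, with no-ops expanding to nothing. The Python sketch below is hypothetical throughout: `MACROS`, `apply_marks`, and the primitives `indent`, `bullet`, and `wide` are invented for illustration and are not taken from any browser.

```python
# Hypothetical macro table: each mark expands to primitive commands
# acting on a global processing state.
MACROS = {
    "ul": [("indent", 1)],
    "/ul": [("indent", -1)],
    "li": [("bullet", True)],
    "/li": [],                 # known no-op
    "dt": [("wide", False)],
    "/dt": [],                 # known no-op
    "dd": [("wide", True)],
    "/dl": [("wide", False)],  # DD is customarily cancelled by /DL
}


def apply_marks(tokens):
    """Return (text, indent, wide, bullet) for each text run in tokens."""
    state = {"indent": 0, "wide": False, "bullet": False}
    runs = []
    for kind, value in tokens:
        if kind == "text":
            runs.append((value, state["indent"], state["wide"], state["bullet"]))
            state["bullet"] = False  # a bullet applies to the next run only
            continue
        for command, arg in MACROS.get(value, []):  # unknown marks are ignored
            if command == "indent":
                state["indent"] = max(0, state["indent"] + arg)  # a counter
            else:
                state[command] = arg  # a toggle
    return runs


tokens = [
    ("tag", "ul"), ("tag", "li"), ("text", "one"), ("tag", "/li"),
    ("tag", "li"), ("text", "two"), ("tag", "/ul"), ("text", "after"),
]
for run in apply_marks(tokens):
    print(run)
```

In this sketch an LI outside any UL still gets its bullet, just without the indent, which is the kind of "interaction" a note in the spec would have to spell out.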
3. A Semantic Specification
The 4.01 spec with all SGML removed: just a listing of names and intended meanings.
The new I-D could point to a covering document pointing to these three parts, and thus avoid the need to provide references directly.
Arjun







