|Subject:||New conversion scheme in place.|
|From:||John Fieber (jfie...@indiana.edu)|
|Date:||Sep 8, 1996 4:45:05 pm|
Those watching the cvs lists should be aware that instant(1) has now replaced sgmlsasp(1) as the core of the sgml->xxx conversion process.
Currently, not too many of instant's capabilities have been exercised, but the changeover has solved a number of longstanding bugs. Most importantly, decent postscript output can be achieved without the use of TeX (which is actually broken at the moment). Both text and postscript output support cross references and you will find a table of contents at the end of the document.
The HTML output is basically the same as before--just a few minor glitches have been fixed. I suspect there are still a few cases where invalid HTML is generated but those should be easy to fix now.
There are still a number of glitches in the groff generated output (I've developed a healthy disrespect for troff and all its variants with this project). I have some modifications to instant in mind that should help pacify troff's insanely picky parsing conventions (tabs, spaces, and newlines in particular).
For people picking pieces of current rather than following it wholesale, you need to grab src/share/sgml, src/usr.bin/sgmls, src/usr.bin/sgmlfmt, and the mm macros from the groff that was just imported (you can get the macros elsewhere too, just be sure and get version 1.27--later versions are pretty buggy). This stuff all works on 2.1[.5] systems.
I don't anticipate doing much more than the odd bug fix for the linuxdoc->xxx translations. Instead, I'm going to dust off the docbook translations that I started some time ago. I'll try and bring that in soon so people can start looking at a DTD that was actually designed for computer documentation.
Be sure to let me know if you find any notable glitches with the new system.
For those curious about how/why instant(1) is better than sgmlsasp(1): the latter is a one-pass, event-driven filter. The former reads the entire SGML document into an element tree. Each element in the tree is matched with a rule from the transpec(5) file. The match can be determined by the element name, its relationship to other elements (parent, child, sibling, descendant, ancestor, etc.), a regular expression on the data content, the value of a variable, or the attributes of the element or its parent.
Next, instant(1) begins an in-order traversal of the tree, applying the translation rules just selected as it goes.
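To make the two phases concrete, here is a rough sketch in Python. It is not instant(1)'s code and does not use the transpec(5) syntax; the Element and Rule classes, the rule table, and the element names are all made up for illustration, and I am assuming the traversal visits elements in document order. The first pass picks one rule per element (by name and parent), the second walks the tree and applies the rules picked.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Element:
    """Hypothetical tree node; not instant(1)'s real data structure."""
    name: str
    text: str = ""
    children: list = field(default_factory=list)
    parent: Optional["Element"] = None

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


@dataclass
class Rule:
    """Hypothetical rule: a match test plus text emitted around the element."""
    matches: Callable[[Element], bool]
    start: str
    end: str


# First matching rule wins, so more specific rules come first.
RULES = [
    Rule(lambda e: e.name == "title" and e.parent is not None
         and e.parent.name == "sect", "<h2>", "</h2>\n"),
    Rule(lambda e: e.name == "title", "<h1>", "</h1>\n"),
    Rule(lambda e: e.name == "p", "<p>", "</p>\n"),
    Rule(lambda e: True, "", ""),  # catch-all: emit content only
]


def assign_rules(elem, table=None):
    """Phase 1: choose a rule for every element before any output is made."""
    table = {} if table is None else table
    table[id(elem)] = next(r for r in RULES if r.matches(elem))
    for child in elem.children:
        assign_rules(child, table)
    return table


def translate(elem, table):
    """Phase 2: walk the tree in document order, applying the chosen rules."""
    rule = table[id(elem)]
    body = elem.text + "".join(translate(c, table) for c in elem.children)
    return rule.start + body + rule.end


def build_demo():
    doc = Element("article")
    doc.add(Element("title", "Handbook"))
    sect = doc.add(Element("sect"))
    sect.add(Element("title", "Install"))
    sect.add(Element("p", "Run make."))
    return doc
```

Calling translate(doc, assign_rules(doc)) on the demo tree gives the two titles different markup because the rule choice looked at each title's parent, which a one-pass filter keyed only on the element name cannot do.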
There are many things a translation rule can do, the most powerful being the ability to take detours from the in-order traversal. You can easily hunt down other elements based on a variety of criteria, execute other arbitrary translation rules (chaining) and otherwise provide the illusion of moving data anywhere you please. For example, if you hit a cross reference, you can easily track down the other end of the reference, then search up the tree until you find the smallest sectioning element, grab the title and insert it where the original reference was.
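The cross reference detour can be sketched like this, again in Python and again only as an illustration. The Node class, the helper names, the set of sectioning element names, and the label/ref elements with an id attribute are assumptions for the sketch, not what instant(1) or the transpec file actually uses.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical tree node with attributes and a parent link."""
    name: str
    text: str = ""
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    parent: Optional["Node"] = None

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


# Assumed names of sectioning elements; adjust for the DTD in use.
SECTIONING = {"sect", "sect1", "chapt"}


def find_label(root, ident):
    """Search the whole tree for the label that a reference points at."""
    if root.name == "label" and root.attrs.get("id") == ident:
        return root
    for child in root.children:
        hit = find_label(child, ident)
        if hit is not None:
            return hit
    return None


def enclosing_section_title(node):
    """Climb towards the root; return the title of the nearest section."""
    while node is not None:
        if node.name in SECTIONING:
            for child in node.children:
                if child.name == "title":
                    return child.text
        node = node.parent
    return None


def resolve_ref(root, ref):
    """Text to emit in place of a ref element."""
    ident = ref.attrs["id"]
    target = find_label(root, ident)
    title = enclosing_section_title(target) if target is not None else None
    return title if title is not None else "[unresolved: %s]" % ident


def build_demo():
    doc = Node("article")
    s1 = doc.add(Node("sect"))
    s1.add(Node("title", "Installing"))
    s1.add(Node("p")).add(Node("label", attrs={"id": "inst"}))
    s2 = doc.add(Node("sect"))
    s2.add(Node("title", "Upgrading"))
    ref = s2.add(Node("p")).add(Node("ref", attrs={"id": "inst"}))
    return doc, ref
```

In the demo, the ref in the second section resolves to the title of the first section, since that is the smallest section enclosing the matching label. A reference with no matching label falls back to a visible placeholder rather than failing.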
One last thing. The transpec(5) man page needs to be re-written as the file format has changed dramatically. However, clever people can probably figure out what is going on by comparing what the manual page says with an actual transpec file.