On Tue, Aug 13, 2002 at 02:26:24AM -0700, jonathon wrote:
> I have roughly 10 000 documents of various formats
> [ plain ASCII, TeX, DocBook, HTML 4.01, XHTML 1.0,
> Word, WordPerfect, PDF and a couple of others. ]
> Can anybody point me to something that will easily convert
> these to DocBook, and preserve some/most of their current
> I'm not looking forward to doing the conversion manually.

If I had that problem, I would convert as many of them
as I could to HTML, run 'tidy' to clean up the HTML,
and then run the DocParse tool from www.commandprompt.com to
convert them to DocBook. DocParse is not free, but it
is not expensive either.
For your PDF documents, I'd look for the source document
that generated the PDF. It is tough (impossible?)
to convert PDF.
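
The triage and the tidy step described above can be sketched roughly
as follows. This is only an illustration under stated assumptions: the
extension table, the route names, and the function names are all
hypothetical, and the tidy flags (-asxhtml, -quiet, -output) are those
of current HTML Tidy releases, so check them against the version you
have installed. The DocParse step is left out because its command
line is not described here.

```python
import subprocess
from pathlib import Path

# Hypothetical routing table: file extension -> first step to take.
# Adjust it to the formats actually present in the collection.
ROUTES = {
    ".html": "tidy",
    ".htm": "tidy",
    ".xhtml": "tidy",
    ".txt": "to-html",
    ".tex": "to-html",
    ".doc": "to-html",
    ".wpd": "to-html",
    ".pdf": "find-source",  # look for the document the PDF was generated from
}


def plan_route(path):
    """Return the first step for a file, based only on its extension."""
    return ROUTES.get(Path(path).suffix.lower(), "manual-review")


def tidy_command(src, dst):
    """Build a tidy command line that writes cleaned-up XHTML to dst."""
    return ["tidy", "-asxhtml", "-quiet", "-output", str(dst), str(src)]


def run_tidy(src, dst):
    """Run tidy; it exits 0 when clean, 1 on warnings, 2 on errors."""
    result = subprocess.run(tidy_command(src, dst),
                            capture_output=True, text=True)
    return result.returncode < 2
```

Files routed to "to-html" still need a format-specific converter, and
anything routed to "manual-review" has to be looked at by hand; the
sketch only sorts the collection and cleans up what is already HTML.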
Bob Stayton                    400 Encinal Street
Publications Architect         Santa Cruz, CA 95060
Technical Publications         voice: (831) 427-7796
Caldera International, Inc.    fax:   (831) 429-1887