| From | Sent On | Attachments |
|---|---|---|
| madhu | Jul 28, 2000 2:22 am | |
| Sebastian Rahtz | Jul 28, 2000 2:36 am | |
| Bob McIlvride | Aug 2, 2000 10:39 am | |
| ROSm...@ixl.com | Aug 2, 2000 10:54 am | |
| Bob McIlvride | Aug 3, 2000 5:30 am | |
| bmad...@yahoo.com | Aug 3, 2000 9:37 am | |
| Sebastian Rahtz | Aug 3, 2000 1:07 pm | |
| Gary Lawrence Murphy | Aug 4, 2000 5:51 pm | |

| Subject: | RE: DOCBOOK-APPS: M$word and docbook/xml | |
|---|---|---|
| From: | bmad...@yahoo.com (bmad...@yahoo.com) | |
| Date: | Aug 3, 2000 9:37:23 am | |
| List: | org.oasis-open.lists.docbook-apps | |
Hi,
I have finally saved it as .txt and have finished about 100 pages ;-) But since the thread was taking on some interesting proportions, here's my $0.02 worth:
a) I agree with Bob on one count: it is easier to paste from PDF for tables, lists, etc.
b) I used StarOffice (Windows), which had three options: save as .txt, as MS-DOS text, or as Unix text. I chose the safest route, .txt, since I work in both Windows and Linux, as does my Java editor (wonder of wonders, it works better on Linux), which has a nifty tree view and automatic refresh as I type in my tags.
c) Sebastian, though, had suggested an alternative to Majix, but that damn thing always gave a Java out-of-memory error, and I didn't want to go back and disturb a mucho busy man, our own Sebastian.
d) With the Word document saved as text I have only one problem: all the " ' " characters, i.e. single quotes, or rather apostrophes, get saved in what is probably a binary format, since each one shows up as a tiny square (outline). I used my editor's find-and-replace function to get rid of them.
thank you all very much for this thread
regards
maddy
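One possible reading of point d), offered as an assumption rather than a diagnosis: the tiny squares are Word's typographic ("smart") quotes, which a Windows text export typically writes as Windows-1252 bytes that many editors cannot display. Under that assumption, the find-and-replace step could be sketched in Python as below. The function name and the replacement table are hypothetical, and the default encoding is a guess about the exported file.

```python
# Assumption: the undisplayable characters are Word's typographic punctuation,
# stored as Windows-1252 (cp1252) bytes in the exported text file.
SMART_PUNCTUATION = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark, also used as apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
}

# Build the translation table once; keys are single characters.
_TABLE = str.maketrans(SMART_PUNCTUATION)


def to_plain_punctuation(raw, encoding="cp1252"):
    """Decode exported text and replace typographic punctuation with ASCII."""
    # Bytes that are undefined in the chosen encoding become U+FFFD
    # instead of raising an error.
    text = raw.decode(encoding, errors="replace")
    return text.translate(_TABLE)


if __name__ == "__main__":
    sample = b"the author\x92s \x93quoted\x94 text"
    print(to_plain_punctuation(sample))
```

Replacing the characters keeps the apostrophes in the text, whereas deleting the squares outright would turn "author's" into "authors".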
On Wed, 02 Aug 2000, RoSmith wrote:
-----Original Message-----
From: Bob McIlvride [mailto:rob...@cogent.ca]
Sebastian Rahtz wrote:
A better route is to print the documents out and have them retyped in India. Seriously.
A _slightly_ better route is to convert them to PDF and select and paste the text a paragraph at a time into your text editor, creating and/or adding DocBook markup as you go.
Bob,
I can't imagine that would be any easier than simply saving the file as "Text Only" or "MS-DOS Text" and starting from there.
Another thought would be to export the .doc file as HTML, then use HTML Tidy to clean up the MSHTML and convert it to XHTML. I have never done this, but it may work better than the "Text Only" solution.
-Ross
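Ross describes the HTML route as untested, so the following is only a sketch of how the Tidy step might be scripted, not a verified workflow. It assumes the `tidy` command-line program is installed and on the PATH. The helper names and file names are hypothetical; `-asxhtml`, `--word-2000`, `-quiet` and `-o` are Tidy's own options.

```python
import shutil
import subprocess


def build_tidy_command(html_path, xhtml_path):
    """Return the argument list for converting Word-exported HTML to XHTML."""
    return [
        "tidy",
        "-quiet",               # suppress the informational banner
        "-asxhtml",             # emit well-formed XHTML
        "--word-2000", "yes",   # strip surplus Word-specific markup
        "-o", xhtml_path,       # write the result to this file
        html_path,
    ]


def convert_word_html(html_path, xhtml_path):
    """Run Tidy on html_path; raise if Tidy is missing or reports errors."""
    if shutil.which("tidy") is None:
        raise RuntimeError("HTML Tidy ('tidy') was not found on PATH")
    result = subprocess.run(build_tidy_command(html_path, xhtml_path))
    # Tidy exits with 0 when clean, 1 when there were warnings, 2 on errors.
    if result.returncode >= 2:
        raise RuntimeError("tidy reported errors for " + html_path)
    return result.returncode


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_tidy_command("manual.html", "manual.xhtml")))
```

The resulting XHTML would still need DocBook markup applied by hand or by a stylesheet; Tidy only cleans up and normalizes the HTML.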





