From: Karl-Heinz Zimmer (khz@stardivision.de)
Date: Tue Mar 30 1999 - 09:52:09 CEST
On 29.03.1999, 22:18:36, Michele Andreoli wrote:
> You work at StarDivision, therefore you know much about MS Word
> format.
> In your opinion, is it possible to develop an "awk" script
> which strips the escape codes from a .doc document, converting it to
> plain text format?
Very sorry: it will not be possible to do that.
(This is not just my opinion, I am SURE of it!)
The way they store information in their files is far different from
normal encoding procedures: they use a so-called 'Storage' format
containing several 'Streams' which hold the data in a somewhat random
way. ;-)
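To make the 'Storage' point concrete, here is a minimal sketch in Python.
It assumes that the 'Storage' container is the OLE2 compound file layout,
which begins with a fixed 8-byte signature and keeps a sector-size exponent
at byte offset 30 of its header. The function names are made up for this
illustration; this is only a way to look at the container, not a converter.

```python
import struct

# Signature that begins an OLE2 compound file. Assumption: this is the
# 'Storage' container meant above.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_storage(header: bytes) -> bool:
    """Return True if the bytes start with the compound-file signature."""
    return header[:8] == OLE2_MAGIC


def sector_size(header: bytes) -> int:
    """Read the sector-size exponent at offset 30 and return the size in bytes."""
    if len(header) < 32 or not looks_like_storage(header):
        raise ValueError("not a compound file header")
    (shift,) = struct.unpack_from("<H", header, 30)  # little-endian uint16
    return 1 << shift


def read_header(path: str) -> bytes:
    """Read the first 512 bytes of a file (hypothetical helper for real files)."""
    with open(path, "rb") as f:
        return f.read(512)
```

The stream data is spread over such sectors and has to be chained together
through an allocation table, so the text does not sit in the file as
consecutive lines that a line-oriented tool could filter.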
Indeed: as one can see in the file format documentation on their webpage
(only when going there with Internet Explorer) it will never be
possible for a script to extract content correctly from a WW97 doc.
Maybe you can get parts of the information out of a WW95 or WW6 doc, but
even that is not certain...
Sorry for the bad news,
Karl-Heinz