Subject: Re: [xsl] Re: Any Doc to XML converter ?
From: Peter Flynn <peter@xxxxxxxxxxx>
Date: Wed, 20 Jun 2001 23:51:22 +0100

On Tue, 19 Jun 2001, Dmitri wrote:
> Bob DuCharme wrote:
>
> > In his latest 'XML Deviant' column in XML.com
> > (http://www.xml.com/pub/a/2001/06/13/deviant.html), Leigh Dodds describes
> > and points to a recent thread on the topic.
>
> From a recent MSDN article 'Export a Word Document to XML' by Kevin McDowell
> (http://msdn.microsoft.com/library/techart/odc_expwordtoxml.htm)
>
> 'The XML output by this application is very straightforward and very similar to the
> HTML output by Word itself, but it fully accounts for all styled text, tables, and
> lists.'

Which may very well be true, but the output is largely garbage.

This whole discussion misses the major points:

1) Iff your Word document is formatted 100% exclusively with named styles,
   robust conversion to meaningful XML is easily possible with a number of
   packages, e.g. Enigma's DynaTag.

2) If your Word document uses arbitrary manual styling, no amount of footling
   around with conversions is going to produce anything other than an
   XML-syntax'd representation of all the styles. You still have to undertake
   the hardest part, which is interpreting all the styling cruft into some
   meaningful markup. XSLT could certainly be used at this stage.

This assumes you do want meaningful markup. If all you need is the XML
representation of the manual styling, then there are several solutions
already discussed.

It may be instructive that someone last year wrote a short VB script to turn
any DOC file into XML, extracting all the style info into a CSS stylesheet in
a single pass...and it was written on a laptop in the bus on the way to the
airport after a meeting. I'm sure it has long been superseded, but this is not
rocket science.

///Peter

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
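
As an illustration of the difference between points 1 and 2 above, here is a minimal Python sketch. It assumes the paragraphs have already been extracted from the Word document as (style name, text) pairs; that extraction step, the style names, the element names, and the `paragraphs_to_xml` function are all hypothetical and are not taken from DynaTag or any other package mentioned in the thread. Named styles map directly to meaningful elements, while manually styled text can only be carried through with its raw style description for later interpretation (for example with XSLT).

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Word named styles to semantic element names.
STYLE_TO_ELEMENT = {
    "Title": "title",
    "Heading 1": "heading",
    "Body Text": "para",
    "Quote": "blockquote",
}


def paragraphs_to_xml(paragraphs):
    """Build an XML string from (style_name, text) pairs."""
    root = ET.Element("document")
    for style, text in paragraphs:
        name = STYLE_TO_ELEMENT.get(style)
        if name is None:
            # Not a known named style (e.g. manual formatting): keep the raw
            # style description so a later pass can interpret it.
            element = ET.SubElement(root, "unmapped", {"style": style})
        else:
            element = ET.SubElement(root, name)
        element.text = text
    return ET.tostring(root, encoding="unicode")


# Assumed sample input; a real converter would read this from the DOC file.
sample = [
    ("Title", "Annual Report"),
    ("Heading 1", "Summary"),
    ("Body Text", "Sales rose."),
    ("Normal + Bold 14pt", "Outlook"),
]
result = paragraphs_to_xml(sample)
print(result)
```

The last sample paragraph stands in for manual styling: it ends up as an `unmapped` element whose `style` attribute still has to be interpreted by a person or a further transformation.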