Subject: Re: [xsl] Re: Any Doc to XML converter ?
From: "Michael Beddow" <mbnospam@xxxxxxxxxxx>
Date: Thu, 21 Jun 2001 09:35:22 +0100

On 20 Jun 2001 Peter Flynn wrote:

> Which may very well be true, but the output is largely garbage.
> This whole discussion misses the major points:

You mean, "the output (probably) doesn't contain a meaningful
representation of the document's structure". Correct, but who claimed
that it did? That doesn't make it garbage. The first stage in brewing
Guinness doesn't result in something anyone would want to drink in
their local pub, but it doesn't get pumped into the sea: it gets
processed into something more useful.

> 1) Iff your Word document is formatted 100% exclusively with
> named styles, robust conversion to meaningful XML is easily
> possible with a number of packages, eg Enigma's DynaTag.

Or, if it really is formatted that way, with free extensions to
products you've already licensed, as per the program in question, or,
better still, with completely free ones like OpenOffice. Further
proprietary tools not needed.

> 2) If your Word document uses arbitrary manual styling, no
> amount of footling around with conversions is going to
> produce anything other than an XML-syntax'd representation
> of all the styles.

Again, nobody disputes that, but nobody was claiming anything
different.

> You still have to undertake the hardest
> part, which is interpreting all the styling cruft into some
> meaningful markup.

Not quite. You don't *just* have the "styling cruft". You have, unless
you're very unlucky, various clues in the original document about how
it's articulated. Devise a system that identifies those clues and uses
them to rewrite the "cruft" and you're on your way. And yes, it is
"hard", but it can be automated, and without proprietary tools.
Depends a lot, obviously, on the input document.
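
As a rough illustration of that kind of clue-driven rewriting, here is
a minimal Python sketch. Everything specific in it is an assumption
made up for the example: the flat input format, the attribute names
(bold, italic, size), the size threshold, and the output element names
(document, section, title, note, para). It is not the output of any
particular converter, and real documents would need rules tuned to
their own habits.

```python
import xml.etree.ElementTree as ET

# Hypothetical "styling cruft": a flat run of paragraphs that records
# only manual formatting, with no structural markup at all.
FLAT = """<doc>
  <p bold="true" size="16">Introduction</p>
  <p size="11">Some body text.</p>
  <p bold="true" size="16">Sources</p>
  <p italic="true" size="11">A note on the manuscripts.</p>
  <p size="11">More body text.</p>
</doc>"""


def classify(p):
    """Guess a structural role from one paragraph's formatting clues.

    The rules are illustrative assumptions, not a general solution.
    """
    bold = p.get("bold") == "true"
    size = int(p.get("size", "11"))
    if bold and size >= 14:
        return "title"   # large bold paragraph: treat as a section heading
    if p.get("italic") == "true":
        return "note"    # wholly italic paragraph: treat as a note
    return "para"        # anything else is ordinary body text


def restructure(flat_xml):
    """Rewrite the flat paragraph run as sections with titles and content."""
    src = ET.fromstring(flat_xml)
    out = ET.Element("document")
    current = out  # paragraphs before the first heading attach to the root
    for p in src.iter("p"):
        role = classify(p)
        if role == "title":
            # Each heading opens a new section that collects what follows.
            current = ET.SubElement(out, "section")
        child = ET.SubElement(current, role)
        child.text = (p.text or "").strip()
    return out


result = restructure(FLAT)
print(ET.tostring(result, encoding="unicode"))
```

The point of keeping classify() separate is that the clues differ from
one input document to the next, so the rules are the part you would
expect to rewrite each time, while the restructuring loop stays put.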

To see what this can look like in practice, take a look at
http://xml.lexilog.org.uk/techbrief1.html

Written for an audience of academic medievalists, so contains some
corner-cutting that will make some people here wince, but it
illustrates what I mean in this last remark.

Michael

---------------------------------------------------------
Michael Beddow http://www.mbeddow.net/
XML and the Humanities page: http://xml.lexilog.org.uk/
---------------------------------------------------------

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list