Subject: RE: [xsl] One texdocument in and several xmldocuments out?
From: "Stuart Celarier" <stuart@xxxxxxxxxxx>
Date: Mon, 6 May 2002 09:33:20 -0700
You can convert a Word document to HTML using File / Save As... and selecting HTML or Filtered HTML. The difference between the two is that HTML preserves all of Word's information, such as <span> tags that mark spelling and grammar issues, whereas Filtered HTML drops the Word-specific tags.

Then follow the advice already provided here (e.g., Tidy) to ensure that the HTML is well-formed XML.

Cheers,
Stuart

-----Original Message-----
From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of Robert Koberg
Sent: Monday, May 06, 2002 06:43
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] One texdocument in and several xmldocuments out?

Hi,

Zack Brown wrote:

>On Mon, May 06, 2002 at 01:28:51PM +0200, Tove Nilstun wrote:
>
>>Hi
>>
>>I am a total beginner when it comes to XML, but in order to start working
>>with it, there are two things I need to sort out.
>>
>>I have a user guide (written in MS Word) with both text and pictures. I
>>would like to 1. convert this document to several xml documents, one per
>>headline and 2. create an additional xml file containing an index of the
>>files created in step one.
>>
>>Is this possible?
>>
>
>Absolutely. Just create one XSLT file for each output file you desire.
>Then run the XML through your parser once for each XSLT file you've
>created
>

You do not need an XSLT file for each page.

First you have to get the MSWord doc into XML. There are a few products out there that convert Word to DocBook or some other XML. A neat trick we found when building our MSIE-based editor was that you can paste an MSWord doc into an element that has contentEditable="true". IE converts this to HTML. We use JS to convert it to XML on the client, but you could use Tidy to get well-formed HTML (XML).

Then, hopefully, there are clean separations to indicate where a new page should start. Apply-templates (loop) on each page division and (you can) use extension functions built into Saxon or Xalan to create multiple output documents from one source.

best,
-Rob

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
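
The splitting step described above (one output document per page division, plus an index of the files created) can also be sketched outside XSLT. The following is a minimal Python illustration, not the Saxon or Xalan extension-function approach Rob mentions. It assumes the Word document has already been converted to well-formed XHTML (for example by Tidy) and that each new page starts at an <h1> sitting directly under <body>. The function names, the output element names (section, title, index, entry) and the file name pattern section01.xml are all hypothetical choices made for the example.

```python
import os
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def split_by_heading(xhtml, heading="h1"):
    """Return one <section> element per heading found directly under <body>."""
    root = ET.fromstring(xhtml)
    body = next(el for el in root.iter() if local_name(el.tag) == "body")
    sections = []
    current = None
    for child in body:
        if local_name(child.tag) == heading:
            # A heading opens a new section; its text becomes the title.
            current = ET.Element("section")
            title = ET.SubElement(current, "title")
            title.text = "".join(child.itertext()).strip()
            sections.append(current)
        elif current is not None:
            # Anything after a heading belongs to the section it opened.
            # Content that precedes the first heading is skipped.
            current.append(child)
    return sections


def build_index(sections, pattern="section{:02d}.xml"):
    """Build an <index> element listing the file name and title of each section."""
    index = ET.Element("index")
    for number, section in enumerate(sections, start=1):
        entry = ET.SubElement(index, "entry", href=pattern.format(number))
        entry.text = section.findtext("title")
    return index


def write_all(sections, index, out_dir, pattern="section{:02d}.xml"):
    """Write each section to its own file, then write index.xml next to them."""
    for number, section in enumerate(sections, start=1):
        path = os.path.join(out_dir, pattern.format(number))
        ET.ElementTree(section).write(path, encoding="utf-8", xml_declaration=True)
    index_path = os.path.join(out_dir, "index.xml")
    ET.ElementTree(index).write(index_path, encoding="utf-8", xml_declaration=True)
```

A typical call would be sections = split_by_heading(text), then write_all(sections, build_index(sections), "out"), where "out" is an existing directory. If the converted HTML marks page breaks some other way (for instance <h2>, or a <div> per page), the heading test is the part to change.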