Re: [xsl] XSL for WordML -> specific HTML

Subject: Re: [xsl] XSL for WordML -> specific HTML
From: Stephen <azrael@xxxxxxxxxxxxxxxxx>
Date: Fri, 06 May 2005 15:29:29 +0100
Thanks for getting back to me, Jay. It's daunting to start on something so totally new, and it feels easier knowing that others have been there and might be able to poke you along in the right direction.

The XSL I have written so far is very basic, and it isn't working fully on my Word document, so it is hard to spot exactly what the problem is. Sometimes I make an intuitive change, only to get a freaky result that makes me doubt how intuitive the change was ;)

Basically, what I am looking to do is write an XSL that outputs just the content as HTML, within divs that have a class matching the appropriate style. I'm not looking to pick up font weights, sizes and all the other style data Word stores. I also want to grab Word tables and output them as HTML tables. In my head it all seems simple, but the format of WordML makes it all very convoluted.

Do you have any 'simple' XSL documents that do this sort of thing that I could study?
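
For the table part of the requirement, a minimal sketch might look like the following. It assumes the Word 2003 WordML namespace and the usual w:tbl / w:tr / w:tc nesting, ignores all table formatting, and does not handle tables nested inside cells. It is meant as a fragment to merge into a larger stylesheet, not as a finished solution.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    exclude-result-prefixes="w">

  <!-- Each Word table becomes an HTML table; only rows are processed,
       so w:tblPr and w:tblGrid (formatting) are skipped. -->
  <xsl:template match="w:tbl">
    <table>
      <xsl:apply-templates select="w:tr"/>
    </table>
  </xsl:template>

  <!-- Each row becomes a tr; row properties (w:trPr) are skipped. -->
  <xsl:template match="w:tr">
    <tr>
      <xsl:apply-templates select="w:tc"/>
    </tr>
  </xsl:template>

  <!-- Each cell becomes a td. Cell text lives in w:p children, so the
       paragraphs are handed on to whatever w:p templates exist. -->
  <xsl:template match="w:tc">
    <td>
      <xsl:apply-templates select="w:p"/>
    </td>
  </xsl:template>

</xsl:stylesheet>
```

Selecting only w:tr, w:tc and w:p at each step is a deliberate choice here: it keeps the property elements out of the output without needing extra empty templates.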

JBryant@xxxxxxxxx wrote:
Hi, Stephen,

I've done a little WordML to XHTML and WordML to FO, so maybe I can help a bit.

Your heading templates look fine to me. They'll get you the values of the headings without anything you don't want. I suppose you think they're messy because you don't want to apply multiple templates to the same element. However, your current solution works just as well as matching w:pPr and applying logic within the template to figure out which heading level it is. Personally, I find having one template per type of result node to be just as readable as one template per type of source node.
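
The single-template alternative mentioned above could be sketched roughly as follows. This is only an illustration: it matches the paragraph rather than w:pPr, assumes every heading style is named "Heading" followed by a number (as in the sample), and reuses the section_L class names from the original post.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    exclude-result-prefixes="w">

  <!-- One template for every heading level: the level number is taken
       from the style name, so Heading1 gives class section_L1, etc. -->
  <xsl:template match="w:p[starts-with(w:pPr/w:pStyle/@w:val, 'Heading')]">
    <div class="section_L{substring-after(w:pPr/w:pStyle/@w:val, 'Heading')}">
      <!-- for-each rather than a single value-of, because in XSLT 1.0
           value-of would output only the first run of a multi-run heading. -->
      <xsl:for-each select="w:r/w:t">
        <xsl:value-of select="."/>
      </xsl:for-each>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the first heading paragraph of the sample, this template would produce a div with class section_L1 containing "Hello 1".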

To get the content of the paragraphs, add this to your heading templates:

<xsl:template match="w:p/w:r/w:t">
  <p><xsl:apply-templates/></p>
</xsl:template>

A little uncanny, but I used something like that, and it duplicates the heading-level text for me: the text is picked out by the specific template match and then again by the more generic one. Unless there's an XPath for 'match a w:p/w:r/w:t unless the w:p contains a w:pPr/w:pStyle node'?
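
A match pattern of that kind can be written with a not() predicate on the paragraph step. The sketch below is one possible form, not a tested solution for the full document. It matches the paragraph instead of each w:t, so that a paragraph made of several runs still yields a single p element. It also assumes that body paragraphs carry no w:pStyle at all, which holds for the sample but may not hold for every Word document.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    exclude-result-prefixes="w">

  <!-- Matches only paragraphs that have no w:pPr/w:pStyle child path.
       If body text can carry its own style, a narrower test such as
       not(starts-with(w:pPr/w:pStyle/@w:val, 'Heading')) may fit better. -->
  <xsl:template match="w:p[not(w:pPr/w:pStyle)]">
    <p>
      <!-- Only the text of the runs is processed; w:pPr is skipped. -->
      <xsl:apply-templates select="w:r/w:t"/>
    </p>
  </xsl:template>

</xsl:stylesheet>
```

The literal form of the question would be the pattern w:p[not(w:pPr/w:pStyle)]/w:r/w:t, which uses the same predicate but fires once per w:t.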


I can attach a small sample of my XML doc and XSL, if that'll help explain what I have got and what I am getting? (Or I can dump them on a website somewhere.)

I used apply-templates there because you may have format elements (bold, etc.) within the paragraph. I used the three level match to make sure you get just body content and not the heading content, too (as matching just w:t or w:r/w:t would do).

HTH

Jay Bryant
Bryant Communication Services
(presently consulting at Synergistic Solution Technologies)




Stephen <azrael@xxxxxxxxxxxxxxxxx> wrote on 05/05/2005 09:34 AM:
Please respond to: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: [xsl] XSL for WordML -> specific HTML

I have a WordML based xml document which contains content along the lines of:


<wx:sub-section>
  <w:p>
    <w:pPr>
      <w:pStyle w:val="Heading1"/>
    </w:pPr>
    <w:r>
      <w:t>Hello 1</w:t>
    </w:r>
  </w:p>
  <w:p>
    <w:r>
      <w:t>Some random normal text 1</w:t>
    </w:r>
  </w:p>
  <wx:sub-section>
    <w:p>
      <w:pPr>
        <w:pStyle w:val="Heading2"/>
      </w:pPr>
      <w:r>
        <w:t>Hello 2</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:t>Some random normal text 2</w:t>
      </w:r>
    </w:p>

and I want to output that as:

<div style="level1">Hello 1</div>
<p>Some random normal text 1</p>

<div style="level2">Hello 2</div>
<p>Some random normal text 2</p>

Obviously I may have headings all over the document, so I want something generic that will pick them all out nicely.

Currently I have a rather messy:

<xsl:template match="w:pStyle[@w:val='Heading1']">
  <div class="section_L1">
    <xsl:value-of select="../../w:r/w:t/text()"/>
  </div>
</xsl:template>

<xsl:template match="w:pStyle[@w:val='Heading2']">
  <div class="section_L2">
    <xsl:value-of select="../../w:r/w:t/text()"/>
  </div>
</xsl:template>


that outputs:

<div class="section_L1"></div><div class="section_L2"></div><div class="section_L1">Hello 1</div>Some random normal text 1<div class="section_L2">Hello 2</div>Some random normal text 2

Anyone have any useful ideas?
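
One possible explanation for the empty divs, offered as a guess since the full document is not shown: w:pStyle elements may also occur outside body paragraphs (for example in list-level definitions), where ../../w:r/w:t selects nothing. The text that appears outside any div is most likely the work of XSLT's built-in template rules, which copy text nodes through when no template matches them. A minimal sketch of tightening the two patterns, keeping the rest of the original templates unchanged:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    exclude-result-prefixes="w">

  <!-- The added w:p/w:pPr/ steps mean the template fires only for a
       w:pStyle inside the properties of a paragraph. A w:pStyle found
       anywhere else falls through to the built-in rules, which output
       nothing for it because it has no text content. -->
  <xsl:template match="w:p/w:pPr/w:pStyle[@w:val='Heading1']">
    <div class="section_L1">
      <xsl:value-of select="../../w:r/w:t/text()"/>
    </div>
  </xsl:template>

  <xsl:template match="w:p/w:pPr/w:pStyle[@w:val='Heading2']">
    <div class="section_L2">
      <xsl:value-of select="../../w:r/w:t/text()"/>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

This only addresses the empty divs. Wrapping the normal text in p elements still needs a template of its own for the body paragraphs.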




--
   Azrael

           ("\''/").___..--'''"-._
           `0_ O  )   `-.  (     ).`-.__.`)
           (_Y_.)'  ._   )  `._ `. ``-..-'
         _..`--'_..-_/  /--'_.' .'
        ((i).-''  ((i).'  (((.-'
