Re: [xsl] From WordprocessingML inline styles to nested inline elements

Subject: Re: [xsl] From WordprocessingML inline styles to nested inline elements
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Tue, 27 Mar 2007 14:16:56 -0400
Yves,

At 12:44 PM 3/27/2007, you wrote:
After testing both of your solutions, I discovered that David's does not do the right thing: the inner-most run style from the hierarchy is not wrapped around the text, but ends up as a singleton element just before the run's text. Wendell's solution, however, works perfectly (after minor typo corrections), so I favour this one.

What a thrill. :->


But I'm sure that after correcting another minor typo or small error in David-logic (which is very dependable in the main but sometimes slips in the detail), you'd find David's works just as well (since, as I said, it's the same solution).

Currently, I am thinking about getting even more "static", by replacing the above generic style parsing approach by individual generated "classical" templates, each mapping one style and controlling the subsequently called templates.

So I'd start out with a template for the outermost style:

<xsl:template match="w:r[w:rPr/w:b]">
  <xsl:element name="b">

Inside the template, I would like to pass the same run (w:r) to a template dealing with the next level's style, matched by "w:r[w:rPr/w:i]", and so on. I wonder how I can "materialize" the style hierarchy, read from the configuration file, into a series of properly chained templates.

My ideas: either use one mode per style-specific template, or one single named template that gets called recursively, with the current style and the run's text as parameters. Which way would you recommend?

Well, the nice thing about the approach we suggested is that the style hierarchy you want is already "materialized", in your styles specification.
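
As a minimal sketch of what that could look like (this is not the code posted earlier in this thread): assume the specification boils down to an ordered list of style names, outermost first. The cfg:order list, its urn:example:style-order namespace, the template name "nest" and the w: namespace URI are all assumptions made for illustration; substitute whatever your configuration file and your flavour of WordprocessingML actually use.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    xmlns:cfg="urn:example:style-order"
    exclude-result-prefixes="w cfg">

  <!-- Hypothetical nesting order, outermost style first -->
  <cfg:order>
    <cfg:style name="b"/>
    <cfg:style name="i"/>
    <cfg:style name="u"/>
  </cfg:order>

  <!-- Read the list back out of the stylesheet itself -->
  <xsl:variable name="order"
                select="document('')/*/cfg:order/cfg:style"/>

  <xsl:template match="w:r">
    <xsl:call-template name="nest">
      <xsl:with-param name="run" select="."/>
      <xsl:with-param name="pos" select="1"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Walk the ordered list; wrap only for styles the run carries -->
  <xsl:template name="nest">
    <xsl:param name="run"/>
    <xsl:param name="pos"/>
    <xsl:variable name="style" select="$order[$pos]"/>
    <xsl:choose>
      <xsl:when test="not($style)">
        <!-- list exhausted: emit the text inside whatever we opened -->
        <xsl:value-of select="$run/w:t"/>
      </xsl:when>
      <xsl:when test="$run/w:rPr/*[local-name() = $style/@name]">
        <xsl:element name="{$style/@name}">
          <xsl:call-template name="nest">
            <xsl:with-param name="run" select="$run"/>
            <xsl:with-param name="pos" select="$pos + 1"/>
          </xsl:call-template>
        </xsl:element>
      </xsl:when>
      <xsl:otherwise>
        <!-- run does not carry this style: skip to the next one -->
        <xsl:call-template name="nest">
          <xsl:with-param name="run" select="$run"/>
          <xsl:with-param name="pos" select="$pos + 1"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

With b, i, u as the order, a run whose w:rPr holds <w:i/><w:b/> should come out as <b><i>Bold, italicized text</i></b>, whichever order the properties appear in. Styles that are missing from the list are silently dropped in this sketch.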


If that's not available, then how should your stylesheet know what order to nest things in? That is, what information is available to determine what order of chaining is proper?

Without such a specification as an input, I'd probably go for your second idea, or rather for a modification of it -- I'd nest my output in a tree of nodes generated from traversing the input formats (the w:i, w:b etc.) in series, probably in just the order they're provided in.

So:

<w:r>
  <w:rPr>
    <w:b/>
    <w:i/>
  </w:rPr>
  <w:t>Bold, italicized text</w:t>
</w:r>

<xsl:template match="w:r">
  <!-- start the chain at the first style property only -->
  <xsl:apply-templates select="w:rPr/*[1]" mode="style"/>
  <xsl:if test="not(w:rPr/*)">
    <!-- fail safe if we have no styles -->
    <xsl:value-of select="w:t"/>
  </xsl:if>
</xsl:template>

<xsl:template match="w:rPr/*" mode="style">
  <!-- the braces make this an attribute value template -->
  <xsl:element name="{local-name()}">
    <!-- creates an element 'b' for 'w:b', etc. -->
    <!-- [1]: hand on to the next sibling only, so each style nests once -->
    <xsl:apply-templates select="following-sibling::*[1]" mode="style"/>
    <xsl:if test="not(following-sibling::*)">
       <!-- if we have no following sibling, we're done -->
       <xsl:value-of select="../../w:t"/>
    </xsl:if>
  </xsl:element>
</xsl:template>

Note: untested.

This would chain the templates in the order of styling elements in the input, so if you have <w:b/><w:i/><w:ul/> you'd get <b><i><ul>text</ul></i></b> but <w:i/><w:b/><w:ul/> would get you <i><b><ul>text</ul></b></i>.

C'est la vie: if we have no other way to determine what order they nest in, we have to make it up somehow.

I don't think that answers your questions, but I hope it helps you think about the problem.

Cheers,
Wendell



======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
