[xsl] Tricky XSLT 2.0 grouping problem

Subject: [xsl] Tricky XSLT 2.0 grouping problem
From: "James Sulak" <jsulak@xxxxxxxxxxxxxxxx>
Date: Fri, 10 Oct 2008 09:03:47 -0500
I have a tricky grouping problem that I'm running into a wall with.  I
thought it might be a fun challenge to throw out there.  I'm attempting
to group a flat list of <section> elements into a hierarchy by
matching each section's number against a different regular expression
for each level.  The list is assumed to be in the correct order.  I have
it working (the code is below) with one exception:  roman numerals.

For example, the XML:

<body>
<section><pnum>(a)</pnum><p>First-level section</p></section>
<section><pnum>(1)</pnum><p>Second-level section</p></section>
<section><pnum>(A)</pnum><p>Third-level section</p></section>
<section><pnum>(i)</pnum><p>Fourth-level section</p></section>
<section><pnum>(ii)</pnum><p>Fourth-level section</p></section>
<section><pnum>(B)</pnum><p>Third-level section</p></section>
<section><pnum>(2)</pnum><p>Second-level section</p></section>
<section><pnum>(A)</pnum><p>Third-level section</p></section>
</body>

Should give the result:

<body>
<section>
  <pnum>(a)</pnum><p>First-level section</p>
  <section>
    <pnum>(1)</pnum><p>Second-level section</p>
    <section>
      <pnum>(A)</pnum><p>Third-level section</p>
      <section><pnum>(i)</pnum><p>Fourth-level section</p></section>
      <section><pnum>(ii)</pnum><p>Fourth-level section</p></section>
    </section>
    <section>
      <pnum>(B)</pnum><p>Third-level section</p>
    </section>
  </section>
  <section>
    <pnum>(2)</pnum><p>Second-level section</p>
    <section><pnum>(A)</pnum><p>Third-level section</p></section>
  </section>
</section>
</body>

The problem is that the number "(i)", which is supposed to be a
fourth-level section, is ambiguous with an "(i)" that would be a
first-level section: both the first-level and the fourth-level
regular expressions match it.  My transform ends up treating it as a
first-level section, and so gives the following incorrect output:

<body>
<section>
  <pnum>(a)</pnum><p>First-level section</p>
  <section>
    <pnum>(1)</pnum><p>Second-level section</p>
    <section>
      <pnum>(A)</pnum><p>Third-level section</p>
    </section>
  </section>
</section>
<section>
  <pnum>(i)</pnum><p>Fourth-level section</p>
  <section>
    <pnum>(ii)</pnum><p>Fourth-level section</p>
    <section>
      <pnum>(B)</pnum><p>Third-level section</p>
    </section>
  </section>
  <section>
    <pnum>(2)</pnum><p>Second-level section</p>
    <section>
      <pnum>(A)</pnum><p>Third-level section</p>
    </section>
  </section>
</section>
</body>
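
To make the ambiguity concrete, here is a small diagnostic sketch (not
part of the transform itself) that lists, for every <pnum>, the levels
whose regex matches it.  It reuses the grouping_keys sequence from the
stylesheet below; the text output format is just an assumption for
illustration.

```xslt
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">

    <xsl:output method="text"/>

    <!-- Same regex sequence as in the grouping transform -->
    <xsl:variable name="grouping_keys" as="xs:string+"
        select="('\([a-z]\)', '\([1-9]\)', '\([A-Z]\)', '\([ivx]{1,4}\)')"/>

    <!-- For each pnum, print the number followed by every matching level -->
    <xsl:template match="/">
        <xsl:for-each select="//section/pnum">
            <xsl:variable name="p" select="string(.)"/>
            <xsl:value-of select="$p, ':',
                for $i in 1 to count($grouping_keys)
                return if (matches($p, $grouping_keys[$i])) then $i else ()"/>
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>
```

Run against the sample input, "(i)" should be reported as matching both
level 1 and level 4, while "(ii)" matches only level 4.  The same
collision would occur for "(v)" and "(x)".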

I've included my current transform below.  The grouping_keys variable is
a sequence of regex strings that match each subsequent level of section
nesting.  Does anybody have an alternate way of tackling this?

Thanks,

-James


<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">

    <xsl:variable name="grouping_keys" as="xs:string+"
                  select="('\([a-z]\)', '\([1-9]\)', '\([A-Z]\)', '\([ivx]{1,4}\)')" />

    <!-- Start the grouping here (matches the <body> element of the
         sample input above) -->
    <xsl:template match="body">
        <body>
            <xsl:copy-of select="@*"/>
            <xsl:for-each-group select="*"
                group-starting-with="section[matches(pnum, string($grouping_keys[1]))]">
                <xsl:apply-templates select="." mode="group">
                    <xsl:with-param name="level" select="1" as="xs:integer"/>
                </xsl:apply-templates>
            </xsl:for-each-group>
        </body>
    </xsl:template>

    <!-- This template copies the current section and groups any
         "nested" sections -->
    <xsl:template match="section" mode="group">
        <xsl:param name="level" as="xs:integer"/>
        <section>
            <xsl:copy-of select="@*, *"/>
            <xsl:if test="$level &lt; count($grouping_keys)">
                <xsl:for-each-group select="current-group() except ."
                    group-starting-with="section[matches(pnum, string($grouping_keys[$level + 1]))]">
                    <xsl:apply-templates select="." mode="group">
                        <xsl:with-param name="level" select="$level + 1" as="xs:integer"/>
                    </xsl:apply-templates>
                </xsl:for-each-group>
            </xsl:if>
        </section>
    </xsl:template>

    <xsl:template match="element()" mode="#all">
        <xsl:copy>
            <xsl:apply-templates select="@*,node()" mode="#current"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="attribute()|text()|comment()|processing-instruction()"
                  mode="#all">
        <xsl:copy/>
    </xsl:template>


</xsl:stylesheet>
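
One direction that might resolve the ambiguity is to decide each
section's level from its context rather than from the regex alone.  The
following is only a sketch under assumptions: the f: namespace URI, the
functions f:levels and f:nest, and the temporary level attribute are all
hypothetical names, and the rule "take the deepest matching level that
is at most one level below the previous section" is a heuristic, not a
general solution.

```xslt
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:local"
    exclude-result-prefixes="xs f" version="2.0">

    <xsl:variable name="grouping_keys" as="xs:string+"
        select="('\([a-z]\)', '\([1-9]\)', '\([A-Z]\)', '\([ivx]{1,4}\)')"/>

    <!-- Hypothetical helper: assigns a level to each pnum, in order.
         Heuristic (an assumption): of the levels whose regex matches, take
         the deepest one no more than one below the previous section. -->
    <xsl:function name="f:levels" as="xs:integer*">
        <xsl:param name="pnums" as="xs:string*"/>
        <xsl:param name="prev" as="xs:integer"/>
        <xsl:if test="exists($pnums)">
            <xsl:variable name="candidates" as="xs:integer*"
                select="for $i in 1 to count($grouping_keys)
                        return if (matches($pnums[1], $grouping_keys[$i])) then $i else ()"/>
            <!-- Fall back to the first candidate, then to level 1 -->
            <xsl:variable name="level" as="xs:integer"
                select="(max($candidates[. le $prev + 1]), $candidates[1], 1)[1]"/>
            <xsl:sequence select="$level, f:levels(subsequence($pnums, 2), $level)"/>
        </xsl:if>
    </xsl:function>

    <!-- Hypothetical helper: nests sections using the precomputed @level -->
    <xsl:function name="f:nest" as="element(section)*">
        <xsl:param name="items" as="element(section)*"/>
        <xsl:param name="level" as="xs:integer"/>
        <xsl:for-each-group select="$items"
            group-starting-with="section[@level = $level]">
            <section>
                <xsl:copy-of select="@* except @level, *"/>
                <xsl:sequence select="f:nest(current-group() except ., $level + 1)"/>
            </section>
        </xsl:for-each-group>
    </xsl:function>

    <xsl:template match="body">
        <xsl:variable name="levels" as="xs:integer*"
            select="f:levels(for $s in section return string($s/pnum), 0)"/>
        <!-- Temporary tree: each section copied with its computed @level -->
        <xsl:variable name="annotated">
            <xsl:for-each select="section">
                <xsl:variable name="pos" select="position()"/>
                <section level="{$levels[$pos]}">
                    <xsl:copy-of select="@*, *"/>
                </section>
            </xsl:for-each>
        </xsl:variable>
        <body>
            <xsl:copy-of select="@*"/>
            <xsl:sequence select="f:nest($annotated/section, 1)"/>
        </body>
    </xsl:template>

</xsl:stylesheet>
```

On the sample input this heuristic should assign the levels 1 2 3 4 4 3
2 3, so "(i)" following "(A)" lands at the fourth level.  It would still
guess wrong for a genuine first-level "(i)" that directly follows a
third-level or fourth-level section; handling that case would need a
check on the expected next number at each level.  The recursion in
f:levels is also one call per section, which may matter for very long
lists.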
