[xsl] xsl:for-each-group, group-starting-with, nested

Subject: [xsl] xsl:for-each-group, group-starting-with, nested
From: "Kate Atkins" <katkins@xxxxxxxxxxxxxxx>
Date: Wed, 13 Apr 2005 16:01:06 -0400
Hello all,

I've been lurking and searching a bit to see whether anyone has had a
similar discussion, but I haven't found anything quite like this. I
apologize if I've missed something and am repeating an old thread.

The challenge was to wrap parts of XML files in a tag called "split",
based on a marker that could occur essentially anywhere in the XML file
(but not inside a mixed-content node). At this point in the process, I
could neither restrict where the <splithere/> markers were placed nor
control the structure of the (several sets of) XML. I needed a
data-blind master key.

The content could be as simple as this:
<Chapter>
	<Title>Whales</Title>
	<splithere/>
	<Section>
		<Head>Types</Head>
		<Para>La la la</Para>
		<Para>La la la</Para>
	</Section>
	<splithere/>
	<Section>
		<Head>Colors</Head>
		<Para>La la la</Para>
		<Para>La la la</Para>
	</Section>
</Chapter>

Or as complex as this:

<OtherChapter>
	<Metainformation/>
	<splithere/>
	<Frontmatter>
		<Cover/>
		<People/>
		<TOC/>
	</Frontmatter>
	<Sprockets>
		<Sprocket>
			<Introduction>
				<splithere/>
				<Section>
					<Head/>
					<Para/>
				</Section>
			</Introduction>
		</Sprocket>
	</Sprockets>
	<Widgets>
		<splithere/>
		<Part>
			<PartTitle/>
			<splithere/>
			<Widget>
				<Intro>
					<splithere/>
					<Section>
						<Head/>
						<Section>
							<Head/>
							<Para/>
							<Table/>
						</Section>
					</Section>
					<splithere/>
					<Section>
						<Head/>
						<Section>
							<Head/>
							<Para/>
						</Section>
						<Section>
							<Head/>
						</Section>
						<Section>
							<Head/>
						</Section>
					</Section>
				</Intro>
			</Widget>
		</Part>
		<splithere/>
		<References>
			<Head/>
			<RefList/>
		</References>
	</Widgets>
</OtherChapter>

I thought, hey, I'll try this nifty new grouping method in XSLT 2.0, as
my first official foray into the new stuff. (I've been writing 1.0 for a
couple of years.) I'm using Saxon B, release 8.1.1.

So this is what I came up with:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" encoding="UTF-8" />

<xsl:template match="* | @* | text()">
<xsl:choose>
	<xsl:when test="splithere or descendant::splithere">
		<xsl:call-template name="splitter"/>
	</xsl:when>
	<xsl:otherwise>
		<xsl:copy>
			<xsl:apply-templates select="* | @* | text()"/>
		</xsl:copy>
	</xsl:otherwise>
</xsl:choose>
</xsl:template>

<xsl:template name="splitter">
	<!-- Rebuild the current element and copy its attributes -->
	<xsl:element name="{name()}">
		<xsl:for-each select="@*">
			<xsl:attribute name="{name(.)}"><xsl:value-of select="."/></xsl:attribute>
		</xsl:for-each>
		<xsl:choose>
			<xsl:when test="splithere">
				<!-- Every group of children is wrapped in <split> -->
				<xsl:for-each-group select="*" group-starting-with="splithere">
					<split>
						<xsl:choose>
							<xsl:when test="current-group()/splithere">
								<xsl:for-each-group select="current-group()" group-starting-with="splithere">
									<split>
										<xsl:for-each select="current-group()">
											<xsl:call-template name="splitter"/>
										</xsl:for-each>
									</split>
								</xsl:for-each-group>
							</xsl:when>
							<xsl:otherwise>
								<xsl:for-each select="current-group()">
									<xsl:call-template name="splitter"/>
								</xsl:for-each>
							</xsl:otherwise>
						</xsl:choose>
					</split>
				</xsl:for-each-group>
			</xsl:when>
			<xsl:otherwise>
				<xsl:apply-templates select="* | @* | text()"/>
			</xsl:otherwise>
		</xsl:choose>
	</xsl:element>
</xsl:template>
</xsl:stylesheet>

But it leaves extra <split> wrappers in the output, like this:

<Chapter>
   <split>
      <Title>Whales</Title>
   </split>
   <split>
      <splithere/>
      <Section>
		         <Head>Types</Head>
		         <Para>La la la</Para>
		         <Para>La la la</Para>
	      </Section>
   </split>
   <split>
      <splithere/>
      <Section>
		         <Head>Colors</Head>
		         <Para>La la la</Para>
		         <Para>La la la</Para>
	      </Section>
   </split>
</Chapter>

So I have a little fixer to follow it up, going back to XSLT version 1.1
with an older Saxon, I think 7-6-5a:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.1"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="UTF-8" />


<xsl:template match="* | @* | text()">
	<xsl:copy>
		<xsl:apply-templates select="* | @* | text()"/>
	</xsl:copy>
</xsl:template>

<xsl:template match="split">
	<xsl:choose>
		<xsl:when test="splithere=true()">
			<split>
				<xsl:apply-templates/>
			</split>
		</xsl:when>
		<xsl:otherwise>
		<xsl:apply-templates/></xsl:otherwise>
	</xsl:choose>
</xsl:template>

</xsl:stylesheet>

To give me this (desired output):
<Chapter>
	<Title>Whales</Title>
	<split>
		<splithere/>
		<Section>
			<Head>Types</Head>
			<Para>La la la</Para>
			<Para>La la la</Para>
		</Section>
	</split>
	<split>
		<splithere/>
		<Section>
			<Head>Colors</Head>
			<Para>La la la</Para>
			<Para>La la la</Para>
		</Section>
	</split>
</Chapter>

I'd rather correct my initial code than correct it with another program.
Does anyone see my mistake, the one causing the extra <split> tags? I
think I've just banged my head on this a few too many times...
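
One possible explanation, offered as a sketch rather than a confirmed
fix: group-starting-with puts whatever precedes the first <splithere/>
into a group of its own, and the stylesheet above wraps every group
unconditionally. Inside xsl:for-each-group the context item is the first
item of the current group, so a test such as self::splithere can tell
whether the group really begins with the marker. The match pattern
*[splithere] and the use of xsl:copy below are assumptions made for this
illustration, not taken from the stylesheet above, and the sketch has
only been reasoned through against the simple Chapter sample.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

  <!-- Any element that has a splithere child: group its element
       children, starting a new group at each marker. -->
  <xsl:template match="*[splithere]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:for-each-group select="*" group-starting-with="splithere">
        <xsl:choose>
          <!-- The context item is the first item of the group.
               Wrap only when that item is the marker itself. -->
          <xsl:when test="self::splithere">
            <split>
              <xsl:apply-templates select="current-group()"/>
            </split>
          </xsl:when>
          <!-- Leading items before the first marker stay unwrapped. -->
          <xsl:otherwise>
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

  <!-- Identity template for everything else; nested elements with
       their own splithere children hit the template above again. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With this shape the <Title> in the simple sample would be copied without
a wrapper, and each <splithere/> plus the content that follows it would
land in its own <split>. Because select="*" only picks up element
children, whitespace text directly inside a split parent is dropped,
which should be harmless with indent="yes" but is worth checking.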


Thanks in advance,

Kate Atkins
Silverchair Science + Communications
http://www.silverchair.com
