Re: [xsl] Converting from <dt><dd> pairs to better XML

Subject: Re: [xsl] Converting from <dt><dd> pairs to better XML
From: "Imsieke, Gerrit, le-tex" <gerrit.imsieke@xxxxxxxxx>
Date: Thu, 26 Aug 2010 01:51:36 +0200
So would I, and this is what I initially sketched. But then I recognized the desired ultimate output of the form:

<dt>A</dt>
<dd>1</dd>
<dt>A</dt>
<dd>2</dd>
<dt>B</dt>
<dd>3</dd>
->
1,2|3

So grouping by dt values was the essential part of the task, and as long as each dt is followed by exactly one dd, the pre-grouping may be skipped in favour of following-sibling::*[1]/self::dd.
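
For the one-dd-per-dt case, a minimal sketch of that single-pass variant might look like the stylesheet below. It assumes XSLT 2.0, a dl root element wrapping the dt/dd pairs, and plain text output; the variable name dds is only illustrative.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text" />

  <xsl:template match="dl">
    <!-- One string per distinct dt value, in order of first appearance -->
    <xsl:variable name="dds" as="xs:string*">
      <xsl:for-each-group select="dt" group-by=".">
        <!-- Take the dd that immediately follows each dt of the group -->
        <xsl:sequence
          select="string-join(
                    current-group()/following-sibling::*[1]/self::dd,
                    ','
                  )" />
      </xsl:for-each-group>
    </xsl:variable>
    <xsl:sequence select="string-join($dds, '|')" />
  </xsl:template>

</xsl:stylesheet>
```

With the A/1, A/2, B/3 input above wrapped in a dl element, this should give 1,2|3. Any second or third dd after a dt is silently dropped, which is why the multiple-dd case needs more work.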

Pre-grouping the dts with their corresponding dds, if necessary, may be performed either with group-starting-with="dt" or by grouping the adjacent dds with group-adjacent. I don't have a preference here, but maybe I should have mentioned both ways (which I'll do below).

Here's a modified sample input with multiple dds:

<dl>
 <dt>AAA</dt>
 <dd>111</dd>
 <dt>BBB</dt>
 <dd>222</dd>
 <dt>BBB</dt>
 <dd>333</dd>
 <dt>BBB</dt>
 <dd>444</dd>
 <dd>888</dd>
 <dd>999</dd>
 <dt>CCC</dt>
 <dd>555</dd>
 <dt>CCC</dt>
 <dd>666</dd>
 <dd>777</dd>
</dl>



Here's the template with the group-starting-with pre-pass:

  <xsl:template match="dl">
    <xsl:variable name="group-sw" as="element(dl)">
      <xsl:copy>
        <xsl:for-each-group select="*" group-starting-with="dt">
          <xsl:copy-of select="." /><!-- first group item -->
          <dd>
            <xsl:sequence select="string-join(current-group() except ., ',')" />
          </dd>
        </xsl:for-each-group>
      </xsl:copy>
    </xsl:variable>
    <xsl:message>
      <xsl:copy-of select="$group-sw"/>
    </xsl:message>
    <xsl:variable name="dds" as="xs:string+">
      <xsl:for-each-group select="$group-sw/dt" group-by=".">
        <xsl:sequence
          select="string-join(
                    for $dt in current-group()
                      return $dt/following-sibling::*[1]/self::dd,
                    ','
                  )">
        </xsl:sequence>
      </xsl:for-each-group>
    </xsl:variable>
    <xsl:sequence select="string-join($dds, '|')" />
  </xsl:template>


=>
111|222,333,444,888,999|555,666,777


Here's the (more verbose) group-adjacent approach:


  <xsl:template match="dl">
    <xsl:variable name="group-dds" as="element(dl)">
      <xsl:copy>
        <xsl:for-each-group select="*" group-adjacent="boolean(self::dd)">
          <xsl:choose>
            <xsl:when test="current-grouping-key()">
              <dd>
                <xsl:sequence select="string-join(current-group(), ',')" />
              </dd>
            </xsl:when>
            <xsl:otherwise>
              <xsl:sequence select="current-group()" />
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each-group>
      </xsl:copy>
    </xsl:variable>
    <xsl:message>
      <xsl:copy-of select="$group-dds"/>
    </xsl:message>
    <xsl:variable name="dds" as="xs:string+">
      <xsl:for-each-group select="$group-dds/dt" group-by=".">
        <xsl:sequence
          select="string-join(
                    for $dt in current-group()
                      return $dt/following-sibling::*[1]/self::dd,
                    ','
                  )">
        </xsl:sequence>
      </xsl:for-each-group>
    </xsl:variable>
    <xsl:sequence select="string-join($dds, '|')" />
  </xsl:template>

=>
111|222,333,444,888,999|555,666,777


Maybe there are also approaches with fewer passes?
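
One possible single-pass sketch, offered as an assumption rather than a tested solution: group the dts by value as before, but for each dt collect every following dd whose nearest preceding dt is that very dt. It assumes the xs prefix is bound to http://www.w3.org/2001/XMLSchema in the stylesheet.

```xslt
<xsl:template match="dl">
  <xsl:variable name="dds" as="xs:string*">
    <xsl:for-each-group select="dt" group-by=".">
      <!-- A dd belongs to $dt if $dt is its nearest preceding dt sibling -->
      <xsl:sequence
        select="string-join(
                  for $dt in current-group()
                    return $dt/following-sibling::dd[preceding-sibling::dt[1] is $dt],
                  ','
                )" />
    </xsl:for-each-group>
  </xsl:variable>
  <xsl:sequence select="string-join($dds, '|')" />
</xsl:template>
```

For the sample input with multiple dds this should again yield 111|222,333,444,888,999|555,666,777. The trade-off is that each dt scans all of its following dd siblings, so on very long lists the pre-grouping pass may well be faster.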


------

Evan should consider what happens to column order when a BBB dt comes before the first AAA dt => maybe use xsl:sort if applicable to the actual scenario.
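
As a sketch of the xsl:sort idea, the dds variable of the first template could be written as in the fragment below. This is a fragment, not a full stylesheet: $group-sw is the pre-grouped variable from that template, and sorting the groups alphabetically by dt value is only an assumption about the desired column order.

```xslt
<xsl:variable name="dds" as="xs:string+">
  <xsl:for-each-group select="$group-sw/dt" group-by=".">
    <!-- xsl:sort has to come first inside xsl:for-each-group;
         it orders the groups, here by their dt value -->
    <xsl:sort select="current-grouping-key()" />
    <xsl:sequence
      select="string-join(
                current-group()/following-sibling::*[1]/self::dd,
                ','
              )" />
  </xsl:for-each-group>
</xsl:variable>
```

Without the xsl:sort, the groups come out in the order in which each dt value first appears in the input.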

-Gerrit

On 26.08.2010 01:13, Michael Kay wrote:

This is based on the assumption that each dt is followed by exactly
one dd. If there may be multiple dds after each dt, you should group
adjacent dds in a first pass.

I would normally tackle this using group-starting-with="dt", creating a
group consisting of a dt and its following dd's.

Michael Kay
Saxonica


--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@xxxxxxxxx, http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
Thomas Schmidt, Dr. Reinhard Vöckler
