[xsl] Generating ID key values

From: "Trevor Nicholls" <trevor@xxxxxxxxxxxxxxxxxx>
Date: Thu, 11 Dec 2008 00:49:48 +1300

Hi

I have a stylesheet which is supposed to generate id attributes for any
occurrences of particular elements which do not already have one.
Unfortunately it has a couple of defects.

I am testing in XMLSpy, but the stylesheet has to run under Xalan (because
it's built into a FrameMaker application), so we're restricted to XSLT 1.0
solutions.

The stylesheet is here:
 =======================
 <xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="xml" encoding="UTF-8" />

 <!--
   - NOTE
   - This stylesheet generates ID attributes for certain elements that
   - should have them but do not.
   -->

 <!-- set of existing ID values -->
 <xsl:key name="id" match="*[@id]" use="@id" />

 <!-- set of potential ID values -->
 <xsl:key name="noid" match="title[not(@id)]"
   use="translate(normalize-space(translate(., translate(.,
        ' abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_.-',
        ''), ' ')), ' ', '_')" />
 <xsl:key name="noid" match="question[not(@id)]"
   use="translate(normalize-space(translate(., translate(.,
        ' abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_.-',
        ''), ' ')), ' ', '_')" />
 <xsl:key name="noid" match="dt[not(@id)]"
   use="translate(normalize-space(translate(., translate(.,
        ' abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_.-',
        ''), ' ')), ' ', '_')" />
 <xsl:key name="noid" match="target[not(@id)]"
   use="translate(normalize-space(translate(., translate(.,
        ' abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_.-',
        ''), ' ')), ' ', '_')" />

 <!-- Fill in any missing IDs -->

 <xsl:template match="title | question | target | dt">
  <xsl:choose>
   <xsl:when test="@id">
    <xsl:call-template name="copy-elem" />
   </xsl:when>
   <xsl:otherwise>
    <xsl:call-template name="copy-elem-giving-id" />
   </xsl:otherwise>
  </xsl:choose>
 </xsl:template>

 <xsl:template name="copy-elem">
  <xsl:copy>
   <xsl:apply-templates select="@*" />
   <xsl:apply-templates />
  </xsl:copy>
 </xsl:template>

 <xsl:template name="copy-elem-giving-id">
  <xsl:variable name="id"
    select="translate(normalize-space(translate(., translate(.,
            ' abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_.-',
            ''), ' ')), ' ', '_')" />
  <xsl:variable name="thisnode" select="generate-id()" />
  <xsl:copy>
   <xsl:attribute name="id">
    <xsl:value-of select="$id" />
    <xsl:for-each select="key('noid',$id)">
     <xsl:if test="(key('id',$id)) or
                   (position()!=1 and generate-id(.)=$thisnode)">
      <xsl:text>-</xsl:text>
      <xsl:value-of select="position()" />
     </xsl:if>
    </xsl:for-each>
   </xsl:attribute>
   <xsl:apply-templates select="@*" />
   <xsl:apply-templates />
  </xsl:copy>
 </xsl:template>

 <!-- catchall -->
 <xsl:template match="*">
  <xsl:copy>
   <xsl:apply-templates select="@*" />
   <xsl:apply-templates />
  </xsl:copy>
 </xsl:template>

 <xsl:template match="@*">
  <xsl:copy-of select="." />
 </xsl:template>

 </xsl:stylesheet>
 =======================

Sample XML here:
 =======================
 <?xml version="1.0" encoding="UTF-8"?>
 <document>
  <title id="Intro">Intro</title>
  <title>Me</title>
  <title>Intro</title>
  <title>3 Point Turns</title>
  <title>Intro</title>
 </document>
 =======================

The output I get:
 =======================
 <?xml version="1.0" encoding="UTF-8"?>
 <document>
  <title id="Intro">Intro</title>
  <title id="Me">Me</title>
  <title id="Intro-1-2">Intro</title>
  <title id="3_Point_Turns">3 Point Turns</title>
  <title id="Intro-1-2">Intro</title>
 </document>
 =======================

The first problem is that instead of generating the ID values "Intro-1" and
"Intro-2", both titles are given the same id, "Intro-1-2". Duplicate ids are
illegal, which rather negates the point of using the keys in the first place.
I thought the <for-each><if /></for-each> construction would only match once,
but evidently not.
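
A possible explanation, offered as a sketch rather than a confirmed
diagnosis: key('id',$id) does not depend on which node the for-each is
visiting, so whenever an element with id="Intro" already exists the test is
true on every iteration and every position gets appended in turn. Requiring
the generate-id() comparison to hold for the whole test should leave at most
one suffix. The fragment below is meant as a replacement for the
xsl:attribute element inside copy-elem-giving-id; it assumes the $id and
$thisnode variables and the two keys exactly as they are defined above.
 =======================
 <xsl:attribute name="id">
  <xsl:value-of select="$id" />
  <xsl:for-each select="key('noid',$id)">
   <!-- Only the iteration that lands on the element being copied may add
        a suffix. It is added when an explicit id with this value already
        exists, or when this element is not the first one sharing the
        value. -->
   <xsl:if test="generate-id(.)=$thisnode and
                 (key('id',$id) or position()!=1)">
    <xsl:text>-</xsl:text>
    <xsl:value-of select="position()" />
   </xsl:if>
  </xsl:for-each>
 </xsl:attribute>
 =======================

With the sample XML this should give "Intro-1" and "Intro-2" for the two
"Intro" titles that have no id. It does not guard against a generated value
such as "Intro-1" colliding with an id that is already present elsewhere in
the document.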

The second problem is the title beginning with a digit: "3_Point_Turns" is
not a legal ID value, because an ID cannot start with a digit. While I can
prefix the id with an underscore when I create the attribute, I can't see how
to make the same modification when I construct the key. And if I don't
construct the "noid" key with the same prefix, then I run the risk of failing
to detect a duplicate id when I create the attribute.
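
One way the prefix might be built into the key itself, sketched using XPath
1.0 functions only: substring('_', 1, N) returns "_" when N is 1 and the
empty string when N is 0, and number() turns the boolean result of contains()
into 1 or 0. The key below is the title key with that conditional prefix
wrapped around the original expression. Treating a leading "." or "-" like a
leading digit is an added assumption, made because those characters cannot
start an ID either.
 =======================
 <!-- Sketch: the cleaned-up string as before, with "_" prepended when its
      first character is a digit, "." or "-". An empty string also gets the
      prefix, because contains() is true for an empty second argument. -->
 <xsl:key name="noid" match="title[not(@id)]"
   use="concat(
          substring('_', 1, number(contains('0123456789.-',
            substring(normalize-space(translate(., translate(.,
              ' abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_.-',
              ''), ' ')), 1, 1)))),
          translate(normalize-space(translate(., translate(.,
            ' abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_.-',
            ''), ' ')), ' ', '_'))" />
 =======================

Under this sketch "3 Point Turns" would be keyed as "_3_Point_Turns", while
"Intro" stays "Intro". The other three keys and the $id variable in
copy-elem-giving-id would need the same expression, so that the value written
to the attribute is the same value the key was built from.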

If you can see what I've done wrong I would be most grateful.

Cheers
Trevor
