Subject: Re: [xsl] Performance of link target search, and Normalising or collapsing a pathname value, best method?
From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 22 Sep 2023 12:35:28 -0000
On 22/09/2023 14:09, Trevor Nicholls trevor@xxxxxxxxxxxxxxxxxx wrote:
>
> I am working with sets of XML document files which include "include"
> elements; the include elements are substituted by the content of the
> file found at include/@srcfile, and inclusions may be nested many
> levels deep.
>
> These documents contain link elements which point to elements in other
> documents. For reasons (FrameMaker) all the links have to be verified
> because the FrameMaker cross references often point to the wrong
> target file.
>
> For example a.xml might include a link/cross-reference which points to
> d.xml#an_id, but it will have saved this as <link
> srcfile="b.xml#an_id"> because b.xml includes c.xml which includes
> d.xml. I need to correct the srcfile so that it refers to the correct
> file.
>
> I have a template which does this, and at the moment it looks like this:
>
> <!-- local key for cross references -->
> <xsl:key name="linkidkey" match="*[@id]" use="@id" />
>
> <xsl:template name="verify-link">
>   <xsl:param name="linkurl" />
>   <xsl:variable name="match-file">
>     <xsl:choose>
>       <xsl:when test="contains($linkurl,'#')">
>         <xsl:value-of select="substring-before($linkurl,'#')" />
>       </xsl:when>
>       <xsl:otherwise>
>         <xsl:value-of select="$linkurl" />
>       </xsl:otherwise>
>     </xsl:choose>
>   </xsl:variable>
>   <!-- NB1 -->
>   <xsl:variable name="match-prefix">
>     <xsl:value-of select="string-join(tokenize($match-file, '/')[position() != last()], '/')" />
>   </xsl:variable>
>   <xsl:variable name="match-id" select="substring-after($linkurl,'#')" />
>   <xsl:choose>
>     <!-- FILE -->
>     <xsl:when test="$match-id = ''">
>       <xsl:value-of select="$linkurl" />
>     </xsl:when>
>     <!-- ID -->
>     <xsl:when test="$match-file = ''">
>       <xsl:value-of select="$linkurl" />
>     </xsl:when>
>     <!-- FILE#ID -->
>     <!-- need to verify that ID is in FILE, and correct if it isn't -->
>     <xsl:otherwise>
>       <xsl:variable name="this">
>         <xsl:for-each select="document($match-file,/)">
>           <xsl:value-of select="key('linkidkey',$match-id)" />
>         </xsl:for-each>
>       </xsl:variable>
>       <xsl:choose>
>         <xsl:when test="$this != ''">
>           <xsl:value-of select="concat($match-file,'#',$match-id)" />
>         </xsl:when>
>         <xsl:otherwise>
>           <xsl:for-each select="document($match-file,/)">
>             <!-- NB2 -->
>             <xsl:for-each select="//include">
>               <xsl:call-template name="verify-link">
>                 <!-- NB3 -->
>                 <xsl:with-param name="linkurl" select="concat($match-prefix,'/',@srcfile,'#',$match-id)" />
>               </xsl:call-template>
>             </xsl:for-each>
>           </xsl:for-each>
>         </xsl:otherwise>
>       </xsl:choose>
>     </xsl:otherwise>
>   </xsl:choose>
> </xsl:template>
>
> Is there a simple way of normalising that path? If necessary I can
> probably write my own but there may be a function which already does
> it that I don't know about.
>
> Secondly, where the code has the comment NB2, there must be a
> significant performance penalty because the expansion of "include"
> elements into deeper and deeper levels is always performed even if the
> sought id was found in the first included subfile. Is there a more
> efficient way I could do this search? (Maybe the performance is
> tolerable in this case, the users haven't complained yet, but if there
> is a better technique I could learn that would likely help me avoid
> inefficient code in other projects I'd be grateful)
>

If you are using an XSLT 3.0 processor, you could consider replacing
the `xsl:for-each` over the `include` elements (at NB2) with an
`xsl:iterate` that uses `xsl:break` as soon as it has found a match.
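
A minimal, untested sketch of that idea follows. It is not a drop-in
replacement for the `verify-link` template: the function name
`f:find-target` and its namespace are hypothetical, and it returns the
absolute document URI (via `document-uri()`) rather than the relative
path your template builds from `$match-prefix`, so it would need
adapting.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="xs f">

  <!-- same key as in the original stylesheet -->
  <xsl:key name="linkidkey" match="*[@id]" use="@id"/>

  <!-- Hypothetical helper: returns "uri#id" for the first document in
       the include tree of $doc that contains an element with the given
       id, or the empty sequence if none does. -->
  <xsl:function name="f:find-target" as="xs:string?">
    <xsl:param name="doc" as="document-node()"/>
    <xsl:param name="id" as="xs:string"/>
    <xsl:choose>
      <!-- the id is in this document: stop here -->
      <xsl:when test="exists(key('linkidkey', $id, $doc))">
        <xsl:sequence select="concat(document-uri($doc), '#', $id)"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- visit the includes one at a time; xsl:break ends the loop
             at the first include whose subtree yields a match -->
        <xsl:iterate select="$doc//include[@srcfile]">
          <xsl:variable name="found" as="xs:string?"
              select="f:find-target(document(@srcfile, .), $id)"/>
          <!-- xsl:break has to be in tail position, so this xsl:if
               is the last instruction in the xsl:iterate body -->
          <xsl:if test="exists($found)">
            <xsl:break select="$found"/>
          </xsl:if>
        </xsl:iterate>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

</xsl:stylesheet>
```

In the FILE#ID branch you could then call something like
`f:find-target(document($match-file, /), $match-id)` and fall back to
`$linkurl` when the result is empty.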
