RE: [xsl] Converting embedded URLs into hot links via XSL

Subject: RE: [xsl] Converting embedded URLs into hot links via XSL
From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Sat, 17 Apr 2004 22:50:07 +0100
> Here is what I'm doing

I've added some indentation; without it the code is completely unfathomable.
> 
> <xsl:template name="extract">
> <xsl:param name="strEmbUrl"/>
> <xsl:param name="delim1"/>
> <xsl:param name="delim2"/>
> 		
> <xsl:choose>
>   <xsl:when test="contains($strEmbUrl, $delim1)">
>   <xsl:variable name="strB4"
>                 select="substring-before($strEmbUrl, $delim1)"/>
>   <xsl:variable name="strAfter"
>                 select="substring-after($strEmbUrl, $delim1)"/>
>   <xsl:variable name="lastCh"
>                 select="substring($strAfter, string-length($strAfter))"/>
>   <xsl:choose>
>     <xsl:when test="$lastCh = $delim2">	

I don't have a copy of your original data, but this test looks highly
dubious. $strAfter contains all the text after the first "http://" - why
does it matter what the last character of this text is? What you need to do
is to locate the end of the URI, i.e. the first character that isn't valid
in a URI, of which you can find a list in RFC 2396. You seem to be assuming
that the URI will end with a single or double quote - I'm not sure why.

>     <xsl:choose>
>       <xsl:when test="contains($strAfter, $delim2)">

This is strange! If the last character in $strAfter is $delim2 (which is
what you just tested) then the contains() call will always return true.

>         <xsl:variable name="emURL"
>                       select="substring-before($strAfter, $delim2)"/>
>         <!-- #1 Solution -->
>         <xsl:value-of select="$strB4"/>&lt;a
>                          href="<xsl:value-of
> select="concat($delim1,$emURL)"/>"
>                          target="_blank"&gt;<xsl:value-of
>                          select="concat($delim1, $emURL)"/>&lt;/a&gt;>

#1 is nonsense. You want to create an element node, not angle bracket
characters.

>         <!-- #2 Solution -->
>         <xsl:value-of select="$strB4"/>
>         <xsl:variable name="link" select="concat($delim1,$emURL)"/>
>         <a target="_blank" href="{$link}">
>             <xsl:value-of select="$link"/></a>

#2 is the right approach to creating the <a> element: a literal result
element, with an attribute value template supplying the href.
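
To show the technique from #2 in isolation, here is a minimal sketch of a
complete stylesheet. The <note> input element and the hard-coded example.com
URL are assumptions made purely for illustration; in the real template the
URL would come from the string handling discussed elsewhere in this message.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Hypothetical input: <note>See http://www.example.com/</note> -->
  <xsl:template match="note">
    <!-- Assumed for illustration: the URL has already been isolated -->
    <xsl:variable name="link" select="'http://www.example.com/'"/>
    <p>
      <!-- Literal result element: the processor builds a real <a> node.
           The braces in href are an attribute value template, so the
           XPath expression inside them is evaluated. -->
      <a target="_blank" href="{$link}">
        <xsl:value-of select="$link"/>
      </a>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

Because the <a> is a node in the result tree rather than a run of escaped
characters, the serializer writes it out as markup and the browser renders
it as a link.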

>         <!-- #3 Solution -->
>         <xsl:value-of select="$strB4"/>
>         <xsl:element name="A"><xsl:attribute name="href"><xsl:value-of
> select="concat($delim1,$emURL)"/></xsl:attribute><xsl:attribute
> name="target"><xsl:value-of
> select="'_blank'"/></xsl:attribute><xsl:value-of
> select="concat($delim1, $emURL)"/></xsl:element>

#3 is just verbose and unnecessary.

> <!-- end problematic code -->

You might think so, but you're wrong.
> 
> <xsl:call-template name="extract">
> <xsl:with-param name="delim1" select="$delim1"/>
> <xsl:with-param name="delim2" select="$delim2"/>
> <xsl:with-param name="strEmbUrl"><xsl:value-of
> select="substring-after($strEmbUrl,
> $emURL)"/></xsl:with-param>
> </xsl:call-template>
> </xsl:when>

The identical xsl:otherwise clauses that follow are needed only because you
are testing redundantly in the xsl:when clauses, as mentioned above.

> <xsl:otherwise>
> <xsl:value-of select="$strEmbUrl"/>
> </xsl:otherwise>
> </xsl:choose>
> </xsl:when>
> <xsl:otherwise>
> <xsl:value-of select="$strEmbUrl"/>
> </xsl:otherwise></xsl:choose></xsl:when>
> <xsl:otherwise><xsl:value-of select="$strEmbUrl"/>
> </xsl:otherwise></xsl:choose></xsl:template>
> 
> <xsl:template name="extractEmbeddedURLs">
> <xsl:param name="str"/>
> <xsl:call-template name="extract">
> <xsl:with-param name="delim1" select="'http://'"/>
> <xsl:with-param name="delim2">'</xsl:with-param>
> <xsl:with-param name="strEmbUrl">
>    <xsl:call-template name="extract">
>      <xsl:with-param name="delim1" select="'http://'"/>
>      <xsl:with-param name="delim2">"</xsl:with-param>
>      <xsl:with-param name="strEmbUrl" select="$str"/>
>    </xsl:call-template>
> </xsl:with-param>
> </xsl:call-template>
> </xsl:template>
> 
> 
> Solution 1,2,3 are the problem area. I tried these 3
> but first iteration works fine but when it goes into
> the subsequent iteration string containing html is
> lost somewhere. i.e. If I call extract only once then
> it works fine and browser is able to render the html
> but recursion seems to have some problem.
> 

Your recursive template is treating the strEmbUrl parameter as a string, but
the outer call in extractEmbeddedURLs is passing it a result tree fragment
containing text and <a> elements (the output of the inner call).
Conversion of a result tree fragment to a string flattens the <a> elements,
getting rid of the element and attribute nodes and retaining only the text.
I would suggest that you try to code this in a single pass through the data,
looking for all the characters that can terminate a URL in the same pass.
Use the translate() function to convert all the possible terminators into a
single terminator, and then search for this.
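
A rough sketch of that single-pass idea follows. Several things in it are
assumptions rather than anything taken from the original stylesheet: the
template name "linkify", the <description> input element, the choice of "^"
as the single terminator, and the short terminator list (white space, quotes
and angle brackets), which is not the full set of excluded characters from
RFC 2396.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Assumed terminator set (illustrative only): tab, LF, CR, space,
       double quote, single quote, < and >  (eight characters) -->
  <xsl:variable name="terminators">
    <xsl:text>&#9;&#10;&#13; "'&lt;&gt;</xsl:text>
  </xsl:variable>

  <!-- One '^' per terminator above, so translate() keeps the string
       length unchanged. '^' is itself not allowed in a URI. -->
  <xsl:variable name="marks" select="'^^^^^^^^'"/>

  <xsl:template name="linkify">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, 'http://')">
        <xsl:variable name="after"
            select="substring-after($text, 'http://')"/>
        <!-- Replace every terminator by '^', then take what precedes the
             first '^'. The appended '^' guarantees that one is found. -->
        <xsl:variable name="tail"
            select="substring-before(
                      concat(translate($after, $terminators, $marks), '^'),
                      '^')"/>
        <xsl:variable name="url" select="concat('http://', $tail)"/>

        <xsl:value-of select="substring-before($text, 'http://')"/>
        <xsl:choose>
          <xsl:when test="$tail != ''">
            <a target="_blank" href="{$url}">
              <xsl:value-of select="$url"/>
            </a>
          </xsl:when>
          <!-- A bare "http://" with nothing after it stays plain text -->
          <xsl:otherwise>
            <xsl:value-of select="$url"/>
          </xsl:otherwise>
        </xsl:choose>

        <!-- Recurse on the remaining string only. The <a> nodes already
             written are never passed back in as a parameter. -->
        <xsl:call-template name="linkify">
          <xsl:with-param name="text"
              select="substring($after, string-length($tail) + 1)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Hypothetical usage: linkify the text of every <description> -->
  <xsl:template match="description">
    <p>
      <xsl:call-template name="linkify">
        <xsl:with-param name="text" select="string(.)"/>
      </xsl:call-template>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

Each call writes its text and <a> element straight to the result tree and
hands only the unprocessed remainder of the string to the next call, so no
result tree fragment is ever converted back to a string. One known limitation
of this sketch: punctuation that is legal in a URI, such as a full stop at
the end of a sentence, ends up inside the link.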

Michael Kay 
