RE: [xsl] User-defined function for linenumber

Subject: RE: [xsl] User-defined function for linenumber
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 1 Aug 2007 09:39:38 +0100

This feels horrendously inefficient. Why not instead implement a SAX filter
that adds the line number as an extra attribute to every element?
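
A minimal sketch of such a filter follows, written with Python's
xml.sax rather than Java purely for illustration. The attribute name
"line" and the names LineNumberFilter, Collector and element_lines are
assumptions made for this example, not part of any existing API. Which
position the locator reports (start or end of the start tag) depends
on the underlying parser.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase
from xml.sax.xmlreader import AttributesImpl


class LineNumberFilter(XMLFilterBase):
    """SAX filter that adds a line-number attribute to every element."""

    # Hypothetical attribute name; choose one that cannot clash with real data.
    ATTR = "line"

    def setDocumentLocator(self, locator):
        # Keep the locator so startElement can ask it for the current line.
        self._locator = locator
        super().setDocumentLocator(locator)

    def startElement(self, name, attrs):
        # Copy the existing attributes and append the line number as a string.
        extended = dict(attrs.items())
        extended[self.ATTR] = str(self._locator.getLineNumber())
        super().startElement(name, AttributesImpl(extended))


class Collector(xml.sax.ContentHandler):
    """Downstream handler that records (element name, line attribute) pairs."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def startElement(self, name, attrs):
        self.seen.append((name, attrs.get(LineNumberFilter.ATTR)))


def element_lines(xml_text):
    """Parse xml_text through the filter and return the recorded pairs."""
    collector = Collector()
    reader = LineNumberFilter(xml.sax.make_parser())
    reader.setContentHandler(collector)
    reader.parse(io.BytesIO(xml_text.encode("utf-8")))
    return collector.seen
```

In a real pipeline the filter would sit between the parser and the
XSLT processor, so that the stylesheet could simply read the added
attribute instead of re-reading the source as unparsed text.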

Michael Kay
http://www.saxonica.com/ 

> -----Original Message-----
> From: jesper.tverskov@xxxxxxxxx 
> [mailto:jesper.tverskov@xxxxxxxxx] On Behalf Of Jesper Tverskov
> Sent: 01 August 2007 09:11
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] User-defined function for linenumber
> 
> Hi list
> 
> I am trying to write a user-defined function that returns
> the line number of a node (yes, I know Saxon has an extension
> function that does the same). So far my solution works for
> element nodes, and that is good enough for now.
> 
> But I am using the analyze-string element. I would like to
> find a solution that does not use analyze-string, so that it
> would also work when the expressions are modified and
> transferred to Schematron. I am not sure whether that is
> possible. Perhaps with some clever regex?
> 
> If the element in question does not also appear as text
> inside comments, CDATA sections and PIs, I can do without
> analyze-string. I use analyze-string only to neutralize
> false positives, simply by deleting every "&lt;" found
> inside comments, CDATA sections and PIs.
> 
> Is it possible to do without analyze-string under all circumstances?
> 
> My function works like this:
> 
> I load the document as unparsed text and delete all "&lt;"
> from comments, CDATA sections and PIs to avoid false
> positives. I then use the node name (e.g. "p") to split the
> string, and join the pieces up to the node number (e.g. the
> third "p") into a new string. I then count the characters,
> delete all linefeeds, count again, and subtract to get the
> number of linefeeds before the element node in question.
> 
> My function looks like this:
> 
> <xsl:function name="please:linenumber">
>     <xsl:param name="document-uri"/><!-- similar to document-uri() -->
>     <xsl:param name="node-name"/><!-- e.g.: 'p' -->
>     <xsl:param name="node-number"/><!-- e.g.: '3', that is the third p -->
>     <xsl:variable name="unparsed" select="unparsed-text($document-uri)"/>
>     <xsl:variable name="unparsed2">
>         <xsl:analyze-string select="$unparsed"
>             regex="&lt;!--.*?--&gt;|&lt;!\[CDATA\[.*?\]\]&gt;|&lt;\?.*?\?&gt;"
>             flags="s">
>             <xsl:matching-substring>
>                 <xsl:value-of select="replace(., '&lt;', '')"/>
>             </xsl:matching-substring>
>             <xsl:non-matching-substring>
>                 <xsl:value-of select="."/>
>             </xsl:non-matching-substring>
>         </xsl:analyze-string>
>     </xsl:variable>
>     <xsl:value-of select="
>         string-length(string-join(subsequence(tokenize($unparsed2,
>             concat('&lt;', $node-name)), 1, $node-number), ' '))
>         - string-length(replace(string-join(subsequence(tokenize($unparsed2,
>             concat('&lt;', $node-name)), 1, $node-number), ' '),
>             '&#xA;', ''))"/>
> </xsl:function>
> 
> Cheers
> Jesper Tverskov
> http://www.xmlplease.com
