Re: [xsl] Fwd: text nodes

Subject: Re: [xsl] Fwd: text nodes
From: "Lucas Lain" <lucas.lain@xxxxxxxxx>
Date: Thu, 18 Sep 2008 16:59:14 -0300

Hello Wendell,

First of all, thank you for your response.

Second:

THANK YOU FOR YOUR RESPONSE! =)

You read my mind. Everything is working great right now!

Regards,

Lucas.

On Thu, Sep 18, 2008 at 11:26 AM, Wendell Piez <wapiez@xxxxxxxxxxxxxxxx> wrote:
> Lucas,
>
> The problem you are looking at is actually a variant of a grouping problem.
> Processing all the nodes up to a particular node amounts to grouping the
> nodes into several "before" and "after" groups.
>
> Grouping in general, and this sort of grouping in particular (called
> "positional grouping") are a well-known weak spot in XSLT 1.0. Accordingly,
> if you can use XSLT 2.0, you will have much better and much easier solutions
> available.
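>
> As a rough sketch of the XSLT 2.0 route (untested, and assuming the
> 'some_element' and 'a' names from the sample input quoted below),
> positional grouping can be written with xsl:for-each-group and its
> group-starting-with attribute:
>
> <xsl:stylesheet version="2.0"
>   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output method="text"/>
>
>   <xsl:template match="some_element">
>     <!-- each group starts at an 'a' child and runs up to the next 'a';
>          nodes before the first 'a' form a leading group of their own -->
>     <xsl:for-each-group select="node()" group-starting-with="a">
>       <!-- output the group's text, leaving out the 'a' itself -->
>       <xsl:value-of select="normalize-space(
>         string-join(current-group()[not(self::a)], ' '))"/>
>       <xsl:text>&#10;</xsl:text>
>     </xsl:for-each-group>
>   </xsl:template>
> </xsl:stylesheet>
>
> The normalize-space() call is only there to collapse the line breaks in
> the sample text; drop it if the original whitespace matters.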
>
> If you must use XSLT 1.0, however, there are known methods. The two best
> methods are probably sibling recursion and key-based association. In sibling
> recursion, basically what you do is shift your processor (using template
> modes for this) out of its normal pattern of selecting and processing
> (applying templates) all children, and instead process only the first child,
> which processes the next, which processes the next, etc. This gives you a
> way to introduce stop and restart conditions into the processing.
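>
> A minimal, untested sketch of sibling recursion in XSLT 1.0 follows. The
> mode name 'walk' is made up for illustration, 'some_element' and 'a' come
> from the sample input, and this variant starts each walk at the node
> after an 'a' rather than at the first child:
>
> <xsl:template match="some_element">
>   <!-- start one walk per 'a', beginning at the node right after it -->
>   <xsl:for-each select="a">
>     <xsl:apply-templates select="following-sibling::node()[1]"
>       mode="walk"/>
>     <xsl:text>&#10;</xsl:text>
>   </xsl:for-each>
> </xsl:template>
>
> <!-- ordinary step: emit this node's string value, then hand over to
>      the next sibling -->
> <xsl:template match="node()" mode="walk">
>   <xsl:value-of select="."/>
>   <xsl:apply-templates select="following-sibling::node()[1]"
>     mode="walk"/>
> </xsl:template>
>
> <!-- stop condition: reaching the next 'a' ends the current walk -->
> <xsl:template match="a" mode="walk"/>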
>
> In key-based association, you basically associate the nodes, typically using
> a key (this makes it easier), with the node you want to stop on, and then
> use the key to retrieve them. This is essentially an optimization of the
> method that Sam has suggested (his logic does the same thing without the
> key).
>
> I think this method may be slightly easier for you. It would look something
> like:
>
> (Sam's code, for comparison)
>>
>> <xsl:template match="a">
>>  <xsl:variable name="next_a"
>>    select="generate-id(following-sibling::a[1])"/>
>>
>>  <xsl:for-each select="following-sibling::text()[ generate-id(
>>    following-sibling::a[1] ) = $next_a ]">
>>    <xsl:value-of select="."/>
>>  </xsl:for-each>
>>
>> </xsl:template>
>>
>> <xsl:template match="some_element">
>>  <xsl:apply-templates select="a"/>
>> </xsl:template>
>
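> <xsl:key name="nodes-by-last-stop" match="some_element/node()[not(self::a)]"
>  use="generate-id((preceding-sibling::a|parent::*)[last()])"/>
> <!-- using this key, each child node of 'some_element' (other than the
>     'a' elements themselves) can be retrieved using the system-generated
>     ID of the last preceding sibling 'a' element, or of its parent (for
>     those that have no preceding-sibling::a); keeping each 'a' and its
>     own content out of the key avoids processing anything twice -->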
>
> <xsl:template match="a">
>  <!-- when 'a' is matched, nothing is done with it, but the elements
>       associated by the key with its generated ID are processed -->
>  <xsl:apply-templates select="key('nodes-by-last-stop',generate-id())"/>
> </xsl:template>
>
> <xsl:template match="some_element">
>  <!-- when an element requiring splitting is matched, its own associated
>       elements are processed (these are children that have no 'a' preceding
>       them), then its 'a' children are processed -->
>  <xsl:apply-templates select="key('nodes-by-last-stop',generate-id())"/>
>  <xsl:apply-templates select="a"/>
> </xsl:template>
>
> If you research how keys work, you will find this does exactly the same
> thing as Sam's logic, only more concisely and more comprehensively (since it
> doesn't drop elements before the first 'a').
>
> This can be extended to include 'lb' elements among the "stop" elements as
> follows:
>
> <xsl:key name="nodes-by-last-stop"
>  match="some_element/node()[not(self::a or self::lb)]"
>  use="generate-id((parent::*|preceding-sibling::a|preceding-sibling::lb)[last()])"/>
>
> <xsl:template match="a | lb">
>  <xsl:apply-templates select="key('nodes-by-last-stop',generate-id())"/>
> </xsl:template>
>
> <xsl:template match="some_element">
>  <xsl:apply-templates select="key('nodes-by-last-stop',generate-id())"/>
>  <xsl:apply-templates select="a | lb"/>
> </xsl:template>
>
> Note: this code is untested, although the algorithm isn't.
>
> Good luck (and find a way to use XSLT 2.0!),
> Wendell
>
> At 07:35 AM 9/18/2008, you wrote:
>>
>> Thank you, Sam!
>>
>> Here is what I meant by ' stop on the first occurrence on the "[ ]" '.
>>
>> For this input:
>>
>> <some_element>
>> <a>hello</a> some text 1<br/>
>> some text 2
>> some text 3
>> <a>hello 2</a> some text 4
>> some text 5
>> some text 6
>> </some_element>
>>
>> I want this output:
>>
>> "some text 1 some text 2 some text 3"  and
>> "some text 4 some text 5 some text 6"
>>
>> I am not sure how to deal with <br/> elements. I'm using XSL to
>> extract data from HTML.
>> I can't get the XSL you sent me to work. If this info helps you
>> help me, let me know.
>>
>> Best regards!
>>
>> L.
>>
>> On Wed, Sep 17, 2008 at 10:52 AM, Sam Byland <shbyland@xxxxxxxxxxx> wrote:
>> >> the output is:
>> >>
>> >> "some text 1 some text 2 some text 3"
>> >
>> > Lucas,
>> >
>> > I used:
>> >
>> > <some_element>
>> > <a>hello</a> some text 1<br/>
>> > some text 2
>> > some text 3
>> > <a>hello 2</a>
>> > </some_element>
>> >
>> > for the input.  Assuming you only want to output the text up to the
>> > next <a> element, then something like this (XSLT 1.0) might get you
>> > in the right direction:
>> >
>> > <xsl:template match="a">
>> >
>> >   <xsl:variable name="next_a"
>> >     select="generate-id(following-sibling::a[1])"/>
>> >
>> >   <xsl:for-each select="following-sibling::text()[ generate-id(
>> >     following-sibling::a[1] ) = $next_a ]">
>> >       <xsl:value-of select="."/>
>> >   </xsl:for-each>
>> >
>> > </xsl:template>
>> >
>> > <xsl:template match="some_element">
>> >   <xsl:apply-templates select="a"/>
>> > </xsl:template>
>> >
>> > I'm not totally sure what you meant by ' stop on the first occurrence
>> > on the "[ ]" '
>> >
>> > ...sam
>> >
>
>
> ======================================================================
> Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
> Mulberry Technologies, Inc.                http://www.mulberrytech.com
> 17 West Jefferson Street                    Direct Phone: 301/315-9635
> Suite 207                                          Phone: 301/315-9631
> Rockville, MD  20850                                 Fax: 301/315-8285
> ----------------------------------------------------------------------
>  Mulberry Technologies: A Consultancy Specializing in SGML and XML
> ======================================================================
>
>



-- 
Ing. Lucas Lain
