RE: [xsl] XSL pattern needed for begin/end elements

Subject: RE: [xsl] XSL pattern needed for begin/end elements
From: Pieter Reint Siegers Kort <pieter.siegers@xxxxxxxxxxx>
Date: Wed, 7 Jul 2004 17:00:43 -0500
Hi Tracy,

I haven't tried it with variations in your input XML, but you may want to use
an identity template approach, like this:

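<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:xlink="http://www.w3.org/1999/xlink">

<xsl:output indent="yes"/>

<xsl:template match="/doc">
     <xsl:apply-templates select="@*|node()"/>
</xsl:template>

<!-- identity template: copy anything not matched below -->
<xsl:template match="@*|node()">
     <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
     </xsl:copy>
</xsl:template>

<xsl:template match="text_run"/>
<xsl:template match="hyperlink_end"/>

<xsl:template match="hyperlink_begin">
    <hyperlink>
        <!-- URL assembled the same way as Tracy's linkUrl variable -->
        <xsl:attribute name="xlink:href">
            <xsl:value-of select="concat(locator_url/@protocol,'://',locator_url/@host_name)"/>
        </xsl:attribute>
        <xsl:value-of select="concat(following-sibling::text_run,' ')"/>
        <b><xsl:value-of select="following-sibling::text_run[2]"/></b>
    </hyperlink>
</xsl:template>

</xsl:stylesheet>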


When applied to the input XML:

<doc>
  <hyperlink_begin id="111" end="222">
    <locator_url protocol="http" host_name=""/>
  </hyperlink_begin>
  <text_run>Click</text_run>
  <text_run emphasis="bold">here.</text_run>
  <hyperlink_end id="222" begin="111"/>
</doc>

this produces

<?xml version="1.0" encoding="UTF-16"?>
<hyperlink xlink:href=""
xmlns:xlink="">Click <b>here.</b></hyperlink>

The only problem I still see, other than the input variations, is that the
namespace is still in the output element <hyperlink>; I tried to get rid of
it using exclude-result-prefixes="xlink", but that didn't help. Maybe
someone else could comment on that one?

Anyway, I hope this helps you in some way. If not, I apologize, but it has
been a good exercise for me to try to solve anyway :-)


-----Original Message-----
From: Tracy Atteberry [mailto:Tracy.Atteberry@xxxxxxxxxxxx] 
Sent: Wednesday, July 07, 2004 3:40 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: RE: [xsl] XSL pattern needed for begin/end elements


Thanks for your suggestion.  Your template code is much cleaner than what I
had posted (so I used it as an example to clean up my own!) but
unfortunately the behavior remains the same.  That is, the text_run elements
between the hyperlink_(begin/end) elements are processed twice.
Once for the hyperlink then again.

So the output looks something like this:

 <HyperLink xlink:href="">
   Click <b>here.</b>
 </HyperLink>
 Click <b>here.</b>

How do we stop the intervening elements from being processed twice?
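
One possible answer, sketched under the sibling assumption stated in this thread: the duplicates appear because the parent's apply-templates still visits the text_run siblings after the hyperlink_begin template has already pulled them in. A way around that is to process the pulled-in nodes in a separate mode and add an empty default-mode template for any node lying between a begin marker and its matching end marker. The mode name INLINK is borrowed from Mike's pseudo-code in this thread; the XLink namespace URI and the overall stylesheet layout are assumptions, not taken from the original posts. This is XSLT 1.0 and has not been checked against variations of the input.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xlink="http://www.w3.org/1999/xlink">

  <!-- Identity template: copy everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Build the link, then pull in the following siblings whose next
       hyperlink_end is the one this begin marker points to. -->
  <xsl:template match="hyperlink_begin">
    <HyperLink xlink:href="{locator_url/@protocol}://{locator_url/@host_name}">
      <xsl:apply-templates mode="INLINK"
          select="following-sibling::node()
                  [following-sibling::hyperlink_end[1]/@id = current()/@end]"/>
    </HyperLink>
  </xsl:template>

  <!-- Default mode: drop any node that sits between a begin marker and its
       matching end marker, because it was already emitted inside HyperLink.
       The predicate gives this pattern priority 0.5, so it wins over the
       identity template. -->
  <xsl:template match="node()[preceding-sibling::hyperlink_begin[1]/@end
                              = following-sibling::hyperlink_end[1]/@id]"/>

  <!-- The end marker itself produces no output. -->
  <xsl:template match="hyperlink_end"/>

  <!-- Inside a link: plain runs become text, bold runs become <b>. -->
  <xsl:template match="text_run" mode="INLINK">
    <xsl:value-of select="."/>
  </xsl:template>

  <xsl:template match="text_run[@emphasis='bold']" mode="INLINK">
    <b><xsl:value-of select="."/></b>
  </xsl:template>

</xsl:stylesheet>
```

The suppression template only compares the nearest preceding begin marker with the nearest following end marker, so it relies on links being siblings that do not nest or overlap.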


-----Original Message-----
From: Mike Trotman [mailto:mike.trotman@xxxxxxxxxxxxx]
Sent: Wednesday, July 07, 2004 3:09 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] XSL pattern needed for begin/end elements


I haven't worked through this too carefully - but here is a pseudo-code 
method that might work in the sibling case.
It is based on the idea of selecting all following nodes for processing 
based on their next <hyperlink_end> element having matching attributes 
to the current hyperlink_begin.
(which looks like what you were intending)

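<xsl:template match='hyperlink_begin'>
  <xsl:apply-templates mode='INLINK'
    select='following-sibling::*[following-sibling::hyperlink_end[1]/@begin=current()/@id]'/>
</xsl:template>

<xsl:template match='text_run' mode='INLINK'>
  <xsl:choose>
    <xsl:when test='@emphasis="bold"'><b><xsl:value-of select='.'/></b></xsl:when>
    <xsl:otherwise><xsl:value-of select='.'/></xsl:otherwise>
  </xsl:choose>
</xsl:template>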

There are other ways of doing the 'INLINK' mode processing - depending 
on how complex it gets.
E.g. - you could have separate templates matching 'text_run[@emphasis]'

You may need an additional template

<xsl:template match='*' mode='INLINK'>
    <xsl:apply-templates/> <!-- or whatever else you want to do -->
</xsl:template>

if you need to process non-sibling intervening elements.

I think something close to the above should work.
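
Since Tracy mentions Saxon 8, a different technique from the sibling-axis selection above is XSLT 2.0 positional grouping. The following is only a sketch: it assumes every hyperlink_begin has its matching hyperlink_end as a later sibling before the next hyperlink_begin, and the XLink namespace URI and stylesheet layout are assumptions rather than anything posted in this thread. Each child sequence is split at every hyperlink_begin, and the node-order operators `<<` and `>>` separate what belongs inside the link from what follows it, so nothing is visited twice.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xlink="http://www.w3.org/1999/xlink">

  <!-- Identity template for everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any element with begin markers among its children: regroup them. -->
  <xsl:template match="*[hyperlink_begin]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:for-each-group select="node()" group-starting-with="hyperlink_begin">
        <xsl:choose>
          <xsl:when test="self::hyperlink_begin">
            <!-- The end marker this begin marker refers to. -->
            <xsl:variable name="end"
                select="current-group()[self::hyperlink_end][@id = current()/@end][1]"/>
            <HyperLink xlink:href="{locator_url/@protocol}://{locator_url/@host_name}">
              <!-- Nodes after the begin marker and before the end marker. -->
              <xsl:apply-templates mode="INLINK"
                  select="current-group()[position() gt 1][. &lt;&lt; $end]"/>
            </HyperLink>
            <!-- Nodes after the end marker are processed normally. -->
            <xsl:apply-templates select="current-group()[. &gt;&gt; $end]"/>
          </xsl:when>
          <xsl:otherwise>
            <!-- Leading group: children before the first begin marker. -->
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

  <!-- Inside a link: plain runs become text, bold runs become <b>. -->
  <xsl:template match="text_run" mode="INLINK">
    <xsl:value-of select="."/>
  </xsl:template>

  <xsl:template match="text_run[@emphasis='bold']" mode="INLINK">
    <b><xsl:value-of select="."/></b>
  </xsl:template>

</xsl:stylesheet>
```

If a begin marker has no matching end marker in its group, `$end` is empty and both comparisons select nothing, so that group's content is dropped; a real stylesheet would need to decide how to handle that case.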


Tracy Atteberry wrote:

>The current project is a demo for something that will eventually be 
>written in C/C++.  Then as you say, we can then walk the DOM tree and 
>maintain a separate context stack to help solve the problem.
>For now, we can definitely assume that these elements are siblings.  In
>fact, for most real source documents this will be the case.  Given that
>assumption, I would love to know the not-too-difficult solution, as 
>this is my immediate problem.
>As for the more general case, a hyperlink may in some cases overlap 
>text runs.  For example:
>  <p>
>    <text_run emphasis="bold">Click 
>      <hyperlink_begin id="111" end="222">
>        <locator_url protocol="http" host_name=""/>
>      </hyperlink_begin>
>      here
>    </text_run>
>    <text_run> to download.</text_run>
>    <hyperlink_end id="222" begin="111"/>
>  </p>
>In fact, hyperlinks can overlap paragraphs and other document elements 
>though this is rarely seen in practice.
>-----Original Message-----
>From: Mike Trotman [mailto:mike.trotman@xxxxxxxxxxxxx]
>Sent: Wednesday, July 07, 2004 1:26 PM
>To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
>Subject: Re: [xsl] XSL pattern needed for begin/end elements
>If the begin and end elements are siblings at the same level then the
>problem is tractable and probably not too difficult to solve.
>However if they can occur at different levels then this means that one
>of them is enclosed inside an element that excludes the other (I
>Can you give any example of a case where the begin and end elements are
>not siblings at the same level?
>I ask because:
>a) I can't picture how this would make sense given the information that
>you require them to contain
>b) If one of them does occur inside an element that excludes the other
>    - what would you want to do with the excluded part of this element's
>content / tree?
>    - If you start closing all the parent elements etc (and opening
>again to match the orphaned end tags)
>    then you are destroying the structure and meaning of the XML data
>which XSLT is designed to help preserve.
>I.e. if they are not siblings at the same level then the XML data
>'structure' is totally inappropriate for XSLT
>and the 1st thing you should do is process it using something else.
>I have documents like this - and I process them by walking the DOM tree
>and maintaining a separate STACK of whatever I consider my current 
>context to be.
>(I am doing this to detect overlap between different document layers 
>marked in exactly the way you describe.)
>Tracy Atteberry wrote:
>>Hi all,
>>I'm looking for an XSL pattern to solve the problem of going from XML
>>that has separate begin and end elements to one that does not.
>>Please, please note that I do not control either the source or target
>>XML formats.  If I did, this would be much easier.
>>Source XML snip:
>> <hyperlink_begin id="111" end="222">
>>   <locator_url protocol="http" host_name=""/>
>> </hyperlink_begin>
>> <text_run>Click</text_run>
>> <text_run emphasis="bold">here.</text_run>
>> <hyperlink_end id="222" begin="111"/>
>>Target XML example:
>> <HyperLink xlink:href="">
>>   Click <b>here.</b>
>> </HyperLink>
>>In my case I can assume that associated begin and end hyperlink tags
>>will occur as siblings -- though generally this is not the case and in
>>fact, this is the reason the begin and end tags are unique elements.
>>I have a template that /almost/ works so feel free to let me know why
>>it fails OR suggest a completely different solution.
>>Current XSL template snip:
>><xsl:template match="//hyperlink_begin">
>>   <xsl:variable name="linkUrl">
>>       <xsl:value-of select="locator_url/@protocol"/>
>>       <xsl:text>://</xsl:text>
>>       <xsl:value-of select="locator_url/@host_name"/>
>>   </xsl:variable>
>>   <xsl:variable name="endID" select="@end"/>
>>   <xsl:element name="HyperLink">
>>       <xsl:attribute name="xlink:href"><xsl:value-of
>>       <xsl:apply-templates select="(following-sibling::*) except 
>>   </xsl:element>
>>This produces the correct hyperlink but the template for text_run
>>elements gets called twice this way -- once inside the hyperlink, then
>>again as templates continue to be applied.
>>Any help would be greatly appreciated.  Thanks!
>>Tracy Atteberry
>>PS. I'm using Saxon 8
>>XSL-List info and archive:
>>To unsubscribe, go to:
>>or e-mail: <mailto:xsl-list-unsubscribe@xxxxxxxxxxxxxxxxxxxxxx>

Datalucid Limited
8 Eileen Road
South Norwood
London SE25 5EJ
United Kingdom

tel :0208-239-6810
mob: 0794-725-9760
email: mike.trotman@xxxxxxxxxxxxx

UK Co. Reg:   4383635
VAT Reg.:   798 7531 60

