Re: [xsl] Moving element up hierarchy unless text nodes

Subject: Re: [xsl] Moving element up hierarchy unless text nodes
From: "James Cummings james@xxxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 4 Mar 2015 00:36:38 -0000

Cool Wendell!

I've not had a chance to test this out yet, so I may have to come back to
you with some questions, as I'm really not sure I understand that match
pattern.  I'll have a play with it.

Many thanks!

-James

On Tue, Mar 3, 2015 at 7:48 PM, Wendell Piez wapiez@xxxxxxxxxxxxxxx <
xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> Hi again James,
>
> So in the code I posted yesterday I realized at least one more
> interesting improvement is possible.
>
> Instead of
>
> <xsl:template match="pb">
>   <!-- Only copy the pb if no ancestor considers it 'leading' or
> 'trailing'. -->
>   <xsl:if test="empty(ancestor::*/
>         (key('leading-pb',generate-id()) |
>          key('trailing-pb',generate-id())) intersect . )  ">
>     <xsl:copy-of select="."/>
>   </xsl:if>
> </xsl:template>
>
> We could have more directly and efficiently
>
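> <xsl:template match="pb">
>   <!-- Copy the pb in place only if both keys bind it to itself, i.e.
>        no other element claims it as 'leading' or 'trailing'. -->
>   <xsl:if test="(. is key('leading-pb',generate-id())) and
>           (. is key('trailing-pb',generate-id()))">
>     <xsl:copy-of select="."/>
>   </xsl:if>
> </xsl:template>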
>
>
> Or even (if you are crazy for match patterns, and who isn't)
>
> <xsl:template match="pb[empty(key('leading-pb',generate-id())) or
>       empty(key('trailing-pb',generate-id()))]"/>
>
> These work because each key binds a pb element to itself when it is
> not 'leading' (or, for the other key, not 'trailing'), a 'leading' or
> 'trailing' pb being one that correctly belongs outside, not inside,
> its parent.
>
> Cheers, Wendell
>
> On Mon, Mar 2, 2015 at 2:11 PM, Wendell Piez wapiez@xxxxxxxxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> > Hi James,
> >
> > So, try this. It works by assigning 'pb' elements to ancestors that
> > consider them 'leading' (start the element off) or 'trailing'. They
> > can be retrieved from (for) said ancestor using a key.
> >
> > Lightly tested.
> >
> > <xsl:template match="comment() | processing-instruction() | text() | @*">
> >   <xsl:copy-of select="."/>
> > </xsl:template>
> >
> > <xsl:template match="*">
> >   <xsl:copy-of select="key('leading-pb',generate-id())"/>
> >   <xsl:copy>
> >     <xsl:apply-templates select="@* | node()"/>
> >   </xsl:copy>
> >   <xsl:copy-of select="key('trailing-pb',generate-id())"/>
> > </xsl:template>
> >
> > <xsl:template match="pb">
> >   <!-- Only copy the pb if no ancestor considers it 'leading' or 'trailing'. -->
> >   <xsl:if test="empty(
> >     ancestor::*/(key('leading-pb',generate-id()) |
> > key('trailing-pb',generate-id())) intersect . )  ">
> >     <xsl:copy-of select="."/>
> >   </xsl:if>
> > </xsl:template>
> >
> > <xsl:key name="leading-pb" match="pb">
> >   <xsl:apply-templates select="." mode="leading-pb"/>
> > </xsl:key>
> >
> > <xsl:key name="trailing-pb" match="pb">
> >   <xsl:apply-templates select="." mode="trailing-pb"/>
> > </xsl:key>
> >
> > <xsl:template match="body/*" mode="leading-pb trailing-pb">
> >   <xsl:sequence select="generate-id()"/>
> > </xsl:template>
> >
> > <xsl:template match="*" mode="leading-pb">
> >   <xsl:choose>
> >     <xsl:when test="empty(preceding-sibling::*/(. except self::pb) |
> > preceding-sibling::text()[matches(.,'\S')])">
> >       <xsl:apply-templates select=".." mode="leading-pb"/>
> >     </xsl:when>
> >     <xsl:otherwise>
> >       <xsl:sequence select="generate-id()"/>
> >     </xsl:otherwise>
> >   </xsl:choose>
> > </xsl:template>
> >
> > <xsl:template match="*" mode="trailing-pb">
> >   <xsl:choose>
> >     <xsl:when test="empty(following-sibling::*/(. except self::pb) |
> > following-sibling::text()[matches(.,'\S')])">
> >       <xsl:apply-templates select=".." mode="trailing-pb"/>
> >     </xsl:when>
> >     <xsl:otherwise>
> >       <xsl:sequence select="generate-id()"/>
> >     </xsl:otherwise>
> >   </xsl:choose>
> > </xsl:template>
> >
> > Feel free to ask for any explanation needed. It *seems* to work
> > (although I often do not trust my lying eyes) ... :-)
> >
> > Cheers, Wendell
> >
> > On Fri, Feb 27, 2015 at 6:51 PM, James Cummings
> > james@xxxxxxxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
> > wrote:
> >>
> >> Hi there.
> >>
> >> We've been looking at canonicalising use of <pb/> in a large
> >> collection of TEI P5 XML texts. What we want to do is move this up
> >> the hierarchy unless there is text before or after it, only stopping
> >> when there is a sibling element with textual content or when it hits
> >> the body/back/front elements. For example, someone might have encoded:
> >>
> >>
> >> ====input====
> >> <body>
> >>     <div>
> >>         <lg>
> >>             <l><pb n="1"/> some text here</l>
> >>             <l>some text here <pb n="2"/></l>
> >>         </lg>
> >>         <lg>
> >>             <l>some text <pb n="3"/> some text</l>
> >>             <anchor xml:id="test"/>
> >>             <l><pb n="4"/>some text here</l>
> >>             <l>some text here <pb n="5"/></l>
> >>             <anchor xml:id="test2"/>
> >>         </lg>
> >>     </div>
> >>     <div>
> >>         <head>Some Text</head>
> >>         <lg>
> >>             <!-- A comment here -->
> >>             <l><pb n="6"/>Some text</l>
> >>             <l>Some text<pb n="7"/></l>
> >>         </lg>
> >>     </div>
> >> </body>
> >> =====
> >>
> >> And what we'd want to end up with is:
> >>
> >> =====
> >> <body>
> >>     <pb n="1"/>
> >>     <div>
> >>         <lg>
> >>             <l> some text here</l>
> >>             <l>some text here </l>
> >>         </lg>
> >>         <pb n="2"/>
> >>         <lg>
> >>             <l>some text <pb n="3"/> some text</l>
> >>             <pb n="4"/>
> >>             <anchor xml:id="test"/>
> >>             <l>some text here</l>
> >>             <l>some text here </l>
> >>             <anchor xml:id="test2"/>
> >>         </lg>
> >>     </div>
> >>     <pb n="5"/>
> >>     <div>
> >>         <head>Some Text</head>
> >>         <pb n="6"/>
> >>         <lg>
> >>             <!-- A comment here -->
> >>             <l>Some text</l>
> >>             <l>Some text</l>
> >>         </lg>
> >>     </div>
> >>     <pb n="7"/>
> >> </body>
> >> =====
> >>
> >> So where a <pb/> has text before/after it, it stays where it is.
> >> It should move to the level in the hierarchy where its
> >> preceding-sibling::node()[1] has text, passing over other empty
> >> elements or comments. (Of course, as you might expect, the markup
> >> could be any element names; I just use div/lg/l here because it is
> >> short and nicely hierarchical as an example.) My approach so far has
> >> been, on every element, to test whether there is text() between where
> >> I currently am and the following::pb[1], by selecting everything
> >> between the start and the pb and looking at its normalised
> >> string-length. But so far these tests aren't working right, and I
> >> haven't even got my head round how to do it in reverse for <pb/> at
> >> the end.
> >>
> >> Has anyone done something like this before that I could look at? Any
> >> suggestions?
> >>
> >> Thanks for any help!
> >>
> >> -James Cummings
> >
> >
> >
> > --
> > Wendell Piez | http://www.wendellpiez.com
> > XML | XSLT | electronic publishing
> > Eat Your Vegetables
> > _____oo_________o_o___ooooo____ooooooo_^
> >
>
>
>
> --
> Wendell Piez | http://www.wendellpiez.com
> XML | XSLT | electronic publishing
> Eat Your Vegetables
> _____oo_________o_o___ooooo____ooooooo_^
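
For anyone else unsure about the match-pattern variant, here is a minimal sketch that assembles the pieces quoted above into a single XSLT 2.0 stylesheet. The keys and the mode templates are as posted. The final plain match="pb" template is an assumption added here and does not appear in the thread: without it, a pb that is bound to itself would seem to fall through to the match="*" template, which also copies whatever the two keys return for the current element, so that pb would be emitted more than once. The sketch further assumes that the elements are in no namespace (as in the sample input; real TEI P5 files would need the TEI namespace on the patterns) and that every pb sits somewhere below body. It is untested against a real collection.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Each pb is keyed by the generated id of the node that claims it:
       the pb itself, one of its ancestors, or a child of body. -->
  <xsl:key name="leading-pb" match="pb">
    <xsl:apply-templates select="." mode="leading-pb"/>
  </xsl:key>

  <xsl:key name="trailing-pb" match="pb">
    <xsl:apply-templates select="." mode="trailing-pb"/>
  </xsl:key>

  <!-- Climbing always stops at the children of body. -->
  <xsl:template match="body/*" mode="leading-pb trailing-pb">
    <xsl:sequence select="generate-id()"/>
  </xsl:template>

  <!-- Keep climbing while only pb elements and whitespace precede. -->
  <xsl:template match="*" mode="leading-pb">
    <xsl:choose>
      <xsl:when test="empty(preceding-sibling::*/(. except self::pb) |
                            preceding-sibling::text()[matches(., '\S')])">
        <xsl:apply-templates select=".." mode="leading-pb"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:sequence select="generate-id()"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Keep climbing while only pb elements and whitespace follow. -->
  <xsl:template match="*" mode="trailing-pb">
    <xsl:choose>
      <xsl:when test="empty(following-sibling::*/(. except self::pb) |
                            following-sibling::text()[matches(., '\S')])">
        <xsl:apply-templates select=".." mode="trailing-pb"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:sequence select="generate-id()"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Copy leaf nodes and attributes unchanged. -->
  <xsl:template match="comment() | processing-instruction() | text() | @*">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- Emit claimed pb elements just before and just after the element
       that claims them. -->
  <xsl:template match="*">
    <xsl:copy-of select="key('leading-pb', generate-id())"/>
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
    <xsl:copy-of select="key('trailing-pb', generate-id())"/>
  </xsl:template>

  <!-- Drop a pb from its original position when either key binds it
       to some other element. -->
  <xsl:template match="pb[empty(key('leading-pb', generate-id())) or
                          empty(key('trailing-pb', generate-id()))]"/>

  <!-- Assumed fallback (not in the thread): any other pb is bound to
       itself by both keys, so copy it once, in place. -->
  <xsl:template match="pb">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

The pattern with the predicate has default priority 0.5, the plain match="pb" has 0, and match="*" has -0.5, so no explicit priority attributes should be needed for the three to be tried in that order.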
