Subject: Re: [xsl] Moving element up hierarchy unless text nodes
From: "James Cummings james@xxxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 6 Apr 2015 13:21:54 -0000

I _finally_ had a chance to test, and to make sure I understand (I think), the clever solution Wendell came up with for moving <pb/> elements before or after nodes with no text content and/or whitespace-only nodes. I must apologise to him for delaying so long in doing so. Mea culpa.

I've added some comments to the XSL to ensure I understood what was going on. Although I've never really been good with key()s, the bits that confused me most were:

===
<!-- copy pb in place only if both keys bind it to itself, thus it stays put -->
<xsl:template match="pb">
  <xsl:if test="(. is key('leading-pb',generate-id())) and
                (. is key('trailing-pb',generate-id()))">
    <xsl:copy-of select="."/>
  </xsl:if>
</xsl:template>
===

Where, if I understand it, a <pb/> is only copied in place if looking up its own generate-id() in both the leading-pb and the trailing-pb key returns that same <pb/> (i.e. it is in the middle of some elements with text, or a text node, or similar, so it stays where it is).

The other confusing bit for me was the test in the leading/trailing-pb mode matching any element, but on closer inspection I think I understand it. (Though I never would have thought of it...) In trailing-pb mode it tests whether there are no following-sibling elements other than pb and no following-sibling text nodes that aren't just whitespace. If so, it applies templates to the parent in the same mode. Otherwise it generates an id.

===
<xsl:choose>
  <xsl:when test="empty(following-sibling::*/(. except self::pb) | following-sibling::text()[matches(.,'\S')])">
    <xsl:apply-templates select=".." mode="trailing-pb"/>
  </xsl:when>
  <xsl:otherwise>
    <xsl:sequence select="generate-id()"/>
  </xsl:otherwise>
</xsl:choose>
===

I think I understand all the individual bits of this but still have difficulty thinking through the whole thing. It does seem to work on all the tests I've tried. Thanks Wendell!

-James

=====full xslt===
<!-- comments, processing instructions, text nodes and attributes -->
<xsl:template match="comment() | processing-instruction() | text() | @*">
  <xsl:copy-of select="."/>
</xsl:template>

<!-- copy elements separately so we can move pb elements -->
<xsl:template match="*">
  <!-- before the element, copy any pb keyed to it as leading -->
  <xsl:copy-of select="key('leading-pb',generate-id())"/>
  <!-- copy the element, attributes, and process nodes -->
  <xsl:copy>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
  <!-- after the element, copy any pb keyed to it as trailing -->
  <xsl:copy-of select="key('trailing-pb',generate-id())"/>
</xsl:template>

<!-- copy pb in place only if both keys bind it to itself, thus it stays put -->
<xsl:template match="pb">
  <xsl:if test="(. is key('leading-pb',generate-id())) and
                (. is key('trailing-pb',generate-id()))">
    <xsl:copy-of select="."/>
  </xsl:if>
</xsl:template>

<!-- key for leading pb: the value comes from applying templates in leading-pb mode -->
<xsl:key name="leading-pb" match="pb">
  <xsl:apply-templates select="." mode="leading-pb"/>
</xsl:key>

<!-- key for trailing pb: the value comes from applying templates in trailing-pb mode -->
<xsl:key name="trailing-pb" match="pb">
  <xsl:apply-templates select="." mode="trailing-pb"/>
</xsl:key>

<!-- everything directly under body generates its own id, so the climb stops here -->
<xsl:template match="body/*" mode="leading-pb trailing-pb">
  <xsl:sequence select="generate-id()"/>
</xsl:template>

<!-- when there are no preceding-sibling elements (other than pb) and no non-whitespace preceding-sibling text, apply templates in leading-pb mode to the parent -->
<xsl:template match="*" mode="leading-pb">
  <xsl:choose>
    <xsl:when test="empty(preceding-sibling::*/(. except self::pb) | preceding-sibling::text()[matches(.,'\S')])">
      <xsl:apply-templates select=".."
                           mode="leading-pb"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:sequence select="generate-id()"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<!-- when there are no following-sibling elements (other than pb) and no non-whitespace following-sibling text, apply templates in trailing-pb mode to the parent -->
<xsl:template match="*" mode="trailing-pb">
  <xsl:choose>
    <xsl:when test="empty(following-sibling::*/(. except self::pb) | following-sibling::text()[matches(.,'\S')])">
      <xsl:apply-templates select=".." mode="trailing-pb"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:sequence select="generate-id()"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
=====

On Wed, Mar 4, 2015 at 12:36 AM, James Cummings james@xxxxxxxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Cool Wendell!
>
> I've not had a chance to test this out yet, I may have to come back to you
> with some questions as I'm really not sure I understand that match
> pattern. I'll have a play with it.
>
> Many thanks!
>
> -James
>
> On Tue, Mar 3, 2015 at 7:48 PM, Wendell Piez wapiez@xxxxxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>> Hi again James,
>>
>> So in the code I posted yesterday I realized at least one more
>> interesting improvement is possible.
>>
>> Instead of
>>
>> <xsl:template match="pb">
>>   <!-- Only copy the pb if no ancestor considers it 'leading' or 'trailing'. -->
>>   <xsl:if test="empty(ancestor::*/
>>       (key('leading-pb',generate-id()) |
>>        key('trailing-pb',generate-id())) intersect . ) ">
>>     <xsl:copy-of select="."/>
>>   </xsl:if>
>> </xsl:template>
>>
>> We could have more directly and efficiently
>>
>> <xsl:template match="pb">
>>   <xsl:if test="(. is key('leading-pb',generate-id())) and
>>                 (. is key('trailing-pb',generate-id()))">
>>     <xsl:copy-of select="."/>
>>   </xsl:if>
>> </xsl:template>
>>
>> Or even (if you are crazy for match patterns, and who isn't)
>>
>> <xsl:template match="pb[empty(key('leading-pb',generate-id())) or
>>                         empty(key('trailing-pb',generate-id()))]"/>
>>
>> These work because the keys bind pb elements to themselves when they
>> are not 'leading' or 'trailing' (i.e. correctly outside not inside
>> their parent).
>>
>> Cheers, Wendell
>>
>> On Mon, Mar 2, 2015 at 2:11 PM, Wendell Piez wapiez@xxxxxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>> > Hi James,
>> >
>> > So, try this. It works by assigning 'pb' elements to ancestors that
>> > consider them 'leading' (start the element off) or 'trailing'. They
>> > can be retrieved from (for) said ancestor using a key.
>> >
>> > Lightly tested.
>> >
>> > <xsl:template match="comment() | processing-instruction() | text() | @*">
>> >   <xsl:copy-of select="."/>
>> > </xsl:template>
>> >
>> > <xsl:template match="*">
>> >   <xsl:copy-of select="key('leading-pb',generate-id())"/>
>> >   <xsl:copy>
>> >     <xsl:apply-templates select="@* | node()"/>
>> >   </xsl:copy>
>> >   <xsl:copy-of select="key('trailing-pb',generate-id())"/>
>> > </xsl:template>
>> >
>> > <xsl:template match="pb">
>> >   <!-- Only copy the pb if no ancestor considers it 'leading' or 'trailing'. -->
>> >   <xsl:if test="empty(
>> >       ancestor::*/(key('leading-pb',generate-id()) |
>> >       key('trailing-pb',generate-id())) intersect . ) ">
>> >     <xsl:copy-of select="."/>
>> >   </xsl:if>
>> > </xsl:template>
>> >
>> > <xsl:key name="leading-pb" match="pb">
>> >   <xsl:apply-templates select="." mode="leading-pb"/>
>> > </xsl:key>
>> >
>> > <xsl:key name="trailing-pb" match="pb">
>> >   <xsl:apply-templates select="." mode="trailing-pb"/>
>> > </xsl:key>
>> >
>> > <xsl:template match="body/*" mode="leading-pb trailing-pb">
>> >   <xsl:sequence select="generate-id()"/>
>> > </xsl:template>
>> >
>> > <xsl:template match="*" mode="leading-pb">
>> >   <xsl:choose>
>> >     <xsl:when test="empty(preceding-sibling::*/(. except self::pb) |
>> >         preceding-sibling::text()[matches(.,'\S')])">
>> >       <xsl:apply-templates select=".." mode="leading-pb"/>
>> >     </xsl:when>
>> >     <xsl:otherwise>
>> >       <xsl:sequence select="generate-id()"/>
>> >     </xsl:otherwise>
>> >   </xsl:choose>
>> > </xsl:template>
>> >
>> > <xsl:template match="*" mode="trailing-pb">
>> >   <xsl:choose>
>> >     <xsl:when test="empty(following-sibling::*/(. except self::pb) |
>> >         following-sibling::text()[matches(.,'\S')])">
>> >       <xsl:apply-templates select=".." mode="trailing-pb"/>
>> >     </xsl:when>
>> >     <xsl:otherwise>
>> >       <xsl:sequence select="generate-id()"/>
>> >     </xsl:otherwise>
>> >   </xsl:choose>
>> > </xsl:template>
>> >
>> > Feel free to ask for any explanation needed. It *seems* to work
>> > (although I often do not trust my lying eyes) ... :-)
>> >
>> > Cheers, Wendell
>> >
>> > On Fri, Feb 27, 2015 at 6:51 PM, James Cummings james@xxxxxxxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>> >>
>> >> Hi there.
>> >>
>> >> We've been looking at canonicalising use of <pb/> in a large collection
>> >> of TEI P5 XML texts. What we want to do is move this up the hierarchy
>> >> unless there is text before or after it, only stopping when there is a
>> >> sibling element with textual content or when it hits the
>> >> body/back/front elements. i.e. someone might have encoded:
>> >>
>> >> ====input====
>> >> <body>
>> >>   <div>
>> >>     <lg>
>> >>       <l><pb n="1"/> some text here</l>
>> >>       <l>some text here <pb n="2"/></l>
>> >>     </lg>
>> >>     <lg>
>> >>       <l>some text <pb n="3"/> some text</l>
>> >>       <anchor xml:id="test"/>
>> >>       <l><pb n="4"/>some text here</l>
>> >>       <l>some text here <pb n="5"/></l>
>> >>       <anchor xml:id="test2"/>
>> >>     </lg>
>> >>   </div>
>> >>   <div>
>> >>     <head>Some Text</head>
>> >>     <lg>
>> >>       <!-- A comment here -->
>> >>       <l><pb n="6"/>Some text</l>
>> >>       <l>Some text<pb n="7"/></l>
>> >>     </lg>
>> >>   </div>
>> >> </body>
>> >> =====
>> >>
>> >> And what we'd want to end up with is:
>> >>
>> >> =====
>> >> <body>
>> >>   <pb n="1"/>
>> >>   <div>
>> >>     <lg>
>> >>       <l> some text here</l>
>> >>       <l>some text here </l>
>> >>     </lg>
>> >>     <pb n="2"/>
>> >>     <lg>
>> >>       <l>some text <pb n="3"/> some text</l>
>> >>       <pb n="4"/>
>> >>       <anchor xml:id="test"/>
>> >>       <l>some text here</l>
>> >>       <l>some text here </l>
>> >>       <anchor xml:id="test2"/>
>> >>     </lg>
>> >>   </div>
>> >>   <pb n="5"/>
>> >>   <div>
>> >>     <head>Some Text</head>
>> >>     <pb n="6"/>
>> >>     <lg>
>> >>       <!-- A comment here -->
>> >>       <l>Some text</l>
>> >>       <l>Some text</l>
>> >>     </lg>
>> >>   </div>
>> >>   <pb n="7"/>
>> >> </body>
>> >> =====
>> >>
>> >> So as the <pb/> has text before/after it, it stays where it is. It
>> >> should move to the level in the hierarchy where its
>> >> preceding-sibling::node()[1] has text, passing over other empty
>> >> elements or comments. (Of course, as you might expect) the markup
>> >> could be any element names, I just use div/lg/l here because it is
>> >> short and nicely hierarchical as an example. My approach so far has
>> >> been, on every element, to try to test if there is text() between
>> >> where I currently am and the following::pb[1] by selecting everything
>> >> between the start and the pb and looking at its normalised
>> >> string-length.
>> >> But so far these tests aren't working right, and I haven't even got my
>> >> head round how to do it in reverse for <pb/> at the end.
>> >>
>> >> Has anyone done something like this before that I could look at? Any
>> >> suggestions?
>> >>
>> >> Thanks for any help!
>> >>
>> >> -James Cummings
>> >
>> > --
>> > Wendell Piez | http://www.wendellpiez.com
>> > XML | XSLT | electronic publishing
>> > Eat Your Vegetables
>> > _____oo_________o_o___ooooo____ooooooo_^
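
For anyone who, like James, follows the individual pieces but finds the whole hard to hold in mind, the climb that the two keys perform can be traced outside XSLT. What follows is only a rough Python sketch of that logic, not part of the stylesheet in this thread. The names has_content, anchor and SAMPLE are made up for illustration, it relies on xml.dom.minidom from the standard library, and it only reports which node each pb would be keyed to rather than rewriting the document.

```python
from xml.dom import minidom
from xml.dom.minidom import Node


def has_content(node):
    """Mirror the XPath test: a non-pb element, or text that is not just whitespace."""
    if node.nodeType == Node.ELEMENT_NODE:
        return node.tagName != "pb"
    if node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
        return node.data.strip(" \t\r\n") != ""
    return False  # comments and processing instructions are passed over


def anchor(pb, direction):
    """Climb from a pb and return the node it would be keyed to.

    direction is "leading" (inspect preceding siblings) or "trailing"
    (inspect following siblings). Getting the pb itself back means it stays put.
    """
    step = "previousSibling" if direction == "leading" else "nextSibling"
    node = pb
    while True:
        parent = node.parentNode
        # Counterpart of match="body/*": children of body stop the climb.
        # The None / non-element check is an extra guard, not in the stylesheet.
        if (parent is None or parent.nodeType != Node.ELEMENT_NODE
                or parent.tagName == "body"):
            return node
        sibling = getattr(node, step)
        while sibling is not None:
            if has_content(sibling):
                return node  # counterpart of xsl:otherwise (generate-id() here)
            sibling = getattr(sibling, step)
        node = parent  # counterpart of apply-templates select=".."


# Hypothetical cut-down version of the input from the original question.
SAMPLE = (
    '<body><div>'
    '<lg><l><pb n="1"/> some text here</l><l>some text here <pb n="2"/></l></lg>'
    '<lg><l>some text <pb n="3"/> some text</l></lg>'
    '</div></body>'
)

if __name__ == "__main__":
    doc = minidom.parseString(SAMPLE)
    for pb in doc.getElementsByTagName("pb"):
        lead, trail = anchor(pb, "leading"), anchor(pb, "trailing")
        stays = lead is pb and trail is pb
        print(pb.getAttribute("n"), lead.tagName, trail.tagName,
              "stays" if stays else "moves")
```

On this cut-down sample, pb 1 climbs to the div on the leading side, pb 2 climbs to the first lg on the trailing side, and pb 3 gets itself back in both directions, which corresponds to the `. is key(...)` test succeeding for it.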