Subject: RE: [xsl] [XSL] extracting a verse (LONG)
From: "McNally, David" <David.McNally@xxxxxxxxxx>
Date: Thu, 19 Dec 2002 10:55:23 -0500
At the risk of beating a dead horse, here's an improvement of the key-based method to handle markup that doesn't cleanly nest. Because of the way it assigns keys, it does require id attributes to already exist on the verse and verseEnd elements, so I had to rewrite Wendell's example a bit, but those ids could obviously be generated by a pre-processing step. I couldn't use generate-id because I want to assign multiple keys to each node.

Basically the approach is to have two keys, "verses" and "verseends", on all nodes except for the root element. For any node, the verses key will contain the id attributes of all <verse/> milestones preceding the node, or contained in the node. For any node, the verseends key will contain the id attributes of all <verseEnd/> milestones following the node, or contained in the node.

Then, for each verse, you do a key lookup on the id of the verse and on the id of its verseEnd, giving you two node-sets, and take the intersection of them to find the nodes that are within the verse, or that contain the verse either fully or in a non-well-formed way. Then apply templates to the nodes in the intersection that don't have a parent in the intersection (to avoid repetition), and carry the intersection node-set as a parameter, so that throughout all the apply-templates you do for a given parent, only child nodes within the intersection are processed.
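The intersection step described above relies on a common XPath 1.0 idiom: a node belongs to a node-set $b exactly when taking its union with $b does not make the set any larger. As a minimal standalone sketch of just that idiom (the <list>/<item> input and the variable names a and b are hypothetical, chosen for illustration, and are not part of the verse examples):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed input (hypothetical):
     <list><item n="1"/><item n="2"/><item n="3"/><item n="4"/></list> -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Two overlapping node-sets: items 1 to 3 and items 2 to 4 -->
    <xsl:variable name="a" select="/list/item[@n &lt;= 3]"/>
    <xsl:variable name="b" select="/list/item[@n &gt;= 2]"/>

    <!-- Keep the nodes of $a that are also in $b:
         count(. | $b) equals count($b) only when the node is already in $b -->
    <xsl:for-each select="$a[count(. | $b) = count($b)]">
      <xsl:value-of select="@n"/>
      <xsl:text> </xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Against the assumed input, this should select only the items numbered 2 and 3, the ones present in both sets. The same predicate, negated and applied to parent::*, is what picks out the nodes whose parent lies outside the intersection.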
Anyway, that's what I think is going on, and though not properly tested, it seems to work for the two examples I have:

verses5.xslt:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- Text nodes and all elements except the document element are keyed by the
       id of every verse milestone that precedes them or that they contain -->
  <xsl:key name="verses" match="text() | *[parent::*]" use="preceding::verse/@id | .//verse/@id"/>
  <!-- The same nodes are keyed by the id of every verseEnd milestone that
       follows them or that they contain -->
  <xsl:key name="verseends" match="text() | *[parent::*]" use="following::verseEnd/@id | .//verseEnd/@id"/>

  <xsl:template match="/">
    <quote>
      <xsl:for-each select="//verse">
        <verse>
          <xsl:variable name="starts" select="key('verses',@id)"/>
          <xsl:variable name="ends" select="key('verseends',@to)"/>
          <!-- Intersection of $starts and $ends -->
          <xsl:variable name="text" select="$starts[count(.|$ends) = count($ends)]"/>
          <!-- Start from the nodes whose parent element is not itself in the intersection -->
          <xsl:apply-templates select="$text[not(count(parent::*|$text) = count($text))]">
            <xsl:with-param name="text" select="$text"/>
          </xsl:apply-templates>
        </verse>
      </xsl:for-each>
    </quote>
  </xsl:template>

  <xsl:template match="*">
    <xsl:param name="text"/>
    <xsl:element name="{name(.)}">
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="origElementID">
        <xsl:value-of select="generate-id()"/>
      </xsl:attribute>
      <!-- Process only the child elements and text nodes that belong to the intersection -->
      <xsl:apply-templates select="*[count(.|$text) = count($text)] | text()[count(.|$text) = count($text)]">
        <xsl:with-param name="text" select="$text"/>
      </xsl:apply-templates>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>

verses.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="C:\Work\xsl\verses5.xslt"?>
<text>
  <div>
    <chapter id="BCV-GEN-1" to="BCV-GEN-1-END" value="1"/>
    <head>The Story of#Creation</head>
    <p>
      <verse id="BCV-GEN-1.1" to="BCV-GEN-1.1-END" value="1"/>In the beginning, when God created the universe,
      <verseEnd id="BCV-GEN-1.1-END" from="BCV-GEN-1.1"/>
      <verse id="BCV-GEN-1.2" to="BCV-GEN-1.2-END" value="2"/>the earth was formless and desolate. The raging ocean that covered everything was engulfed in total darkness, and the ......
    </p>
    <p>rest of verse 2
      <verseEnd id="BCV-GEN-1.2-END" from="BCV-GEN-1.2"/> but this is just paragraph
    </p>
    <p>Paragraph Paragraph Paragraph
      <verse id="BCV-GEN-1.3" to="BCV-GEN-1.3-END" value="3"/>This is the third
    </p>
    <p>verse </p>
    <verseEnd id="BCV-GEN-1.3-END" from="BCV-GEN-1.3"/>
    <p> paragraph </p>
  </div>
</text>

Output:

<?xml version="1.0" encoding="UTF-8"?>
<quote>
  <verse>
    <div origElementID="IDABELQB">
      <p origElementID="IDAHELQB">In the beginning, when God created the universe, </p>
    </div>
  </verse>
  <verse>
    <div origElementID="IDABELQB">
      <p origElementID="IDAHELQB">the earth was formless and desolate. The raging ocean that covered everything was engulfed in total darkness, and the ...... </p>
      <p origElementID="IDAVELQB">rest of verse 2 </p>
    </div>
  </verse>
  <verse>
    <div origElementID="IDABELQB">
      <p origElementID="IDA0ELQB">This is the third </p>
      <p origElementID="IDAAFLQB">verse </p>
    </div>
  </verse>
</quote>

Which looks right - the origElementID attributes that I'm adding make it obvious how the elements have been split between verses.

This is a modified version of one of Wendell's files:

verses5.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="C:\Work\xsl\verses5.xslt"?>
<quote>
<verse id="1" to="e1"/>No! penury, inertness and grimace,<verseEnd id="e1"/>
<verse id="2" to="e2"/>In some strange sort, were the land's portion. <q>See<verseEnd id="e2"/>
<verse id="3" to="e3"/>Or shut your eyes,</q> said Nature peevishly,<verseEnd id="e3"/>
<verse id="4" to="e4"/><q>It nothing skills: I cannot help my case:<verseEnd id="e4"/>
<verse id="5" to="e5"/>'Tis the Last Judgment's fire must cure this place,<verseEnd id="e5"/>
<verse id="6" to="e6"/>Calcine its clods and set my prisoners free.</q><verseEnd id="e6"/>
</quote>

Output:

<?xml version="1.0" encoding="UTF-8"?>
<quote>
  <verse>No! penury, inertness and grimace,</verse>
  <verse>In some strange sort, were the land's portion.
    <q origElementID="IDANPKQB">See</q>
  </verse>
  <verse>
    <q origElementID="IDANPKQB">Or shut your eyes,</q> said Nature peevishly,</verse>
  <verse>
    <q origElementID="IDAZPKQB">It nothing skills: I cannot help my case:</q>
  </verse>
  <verse>
    <q origElementID="IDAZPKQB">'Tis the Last Judgment's fire must cure this place,</q>
  </verse>
  <verse>
    <q origElementID="IDAZPKQB">Calcine its clods and set my prisoners free.</q>
  </verse>
</quote>

I'm not sure how processor intensive this is, but it seems to more or less do what's needed.

Thanks,
David.

--
David McNally            Moody's Investors Service
Software Engineer        99 Church St, NY NY 10007
David.McNally@xxxxxxxxxx (212) 553-7475