Subject: [xsl] Splitting text nodes - xsl:iterate?
From: "Tom Cleghorn tcleghorn@xxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 12 Nov 2014 14:10:22 -0000
Hi list,

Given an input document looking something like this:

    <doc>
      <head><foo/><bar/><baz/></head>
      <body>
        <sec>
          <para>Lorem ipsum dolor sit amet, consectetur adipiscing elit.<box outline="maybe"><para quack="y">Proin id <?foo bar?>bibendum urna, <baz>ut ornare</baz> mi.</para></box></para>
          <para>Aenean dui risus, <qux>sodales quis leo sit amet, ornare consequat</qux> metus. Ut vel massa congue, egestas nibh et, rutrum odio.</para>
        </sec>
      </body>
    </doc>

(i.e. document markup consisting of arbitrary text and element nodes nested to some unknown depth) and the requirement for two separate outputs looking like these:

    <doc>
      <head><foo/><bar/><baz/></head>
      <body>
        <sec>
          <para><new:start/>Lorem ipsum dolor sit amet, consectetur adipiscing elit.<box outline="maybe"><para quack="y">Proin id <?foo bar?>bibendum urna, <baz>ut ornare</baz> mi.</para></box></para>
          <para>Aenean dui risus, <qux>sodales quis <new:end/>leo sit amet, ornare consequat</qux> metus. Ut vel massa congue, egestas nibh et, rutrum odio.</para>
        </sec>
      </body>
    </doc>

    <sec>
      <para>Lorem ipsum dolor sit amet, consectetur adipiscing elit.<box outline="maybe"><para quack="y">Proin id <?foo bar?>bibendum urna, <baz>ut ornare</baz> mi.</para></box></para>
      <para>Aenean dui risus, <qux>sodales quis [...]</qux></para>
    </sec>

(i.e. a copy of the input, with new:start and new:end elements marking the first 20 words of the document; and separately a copy of those first 20 words, preserving all markup within them and adding an ellipsis at the end)

...how might I fruitfully approach the transformation in an XSLT idiom? I feel that there should be some neat declarative way of doing it, possibly using xsl:iterate and/or accumulators, that I'm just failing to see. XSLT 3.0 is available (Saxon 9.6), but the source documents are old content and not open to adjustment, sadly.
I've tried using xsl:iterate, but I seem to be falling down in keeping track of whether or not I'm processing the specific text node in which the break needs to occur. Am I making a rod for my own back here? Should I just be breaking out to a custom Java function and crossing my fingers that I manage to avoid ill-formed output? Any advice will be very gratefully received! Thanks!
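
For illustration only, the first of the two outputs (the copy with new:start and new:end markers) could perhaps be approached without xsl:iterate, by letting each text node work out how many words precede it in document order. The following is a minimal sketch under several assumptions, not a tested solution: the namespace URIs `urn:example:new` and `urn:example:local`, the `limit` parameter and the `f:word-count` function are all hypothetical names introduced here; a "word" is taken to be any whitespace-separated token within a single text node; and the second output (the truncated copy ending in `[...]`) is not covered.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:local"
    xmlns:new="urn:example:new"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical parameter: number of words to mark -->
  <xsl:param name="limit" as="xs:integer" select="20"/>

  <!-- Hypothetical helper: count whitespace-separated words in a string -->
  <xsl:function name="f:word-count" as="xs:integer">
    <xsl:param name="t" as="xs:string"/>
    <xsl:sequence select="count(tokenize($t, '\s+')[. ne ''])"/>
  </xsl:function>

  <!-- Identity template: copy everything not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Text nodes that contain at least one word -->
  <xsl:template match="text()[normalize-space()]">
    <!-- Words in all earlier text nodes (quadratic on large inputs) -->
    <xsl:variable name="before" as="xs:integer"
        select="sum(for $t in preceding::text() return f:word-count($t))"/>
    <!-- Index, within this node, of the word that reaches the limit -->
    <xsl:variable name="n" as="xs:integer" select="$limit - $before"/>

    <!-- The first text node with any words opens the marked range -->
    <xsl:if test="$before eq 0">
      <new:start/>
    </xsl:if>

    <!-- Each match is one word plus its surrounding whitespace, so the
         matches tile the whole string and position() is the word index -->
    <xsl:analyze-string select="." regex="\s*\S+\s*">
      <xsl:matching-substring>
        <xsl:value-of select="."/>
        <xsl:if test="position() eq $n">
          <new:end/>
        </xsl:if>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

The design choice here is to avoid carrying state between text nodes at all: each one recomputes its own offset from `preceding::text()`, which is simple but rescans the document for every text node. On large inputs, an XSLT 3.0 accumulator holding the running word count might replace that sum, though that variant is not shown or tested here.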