Re: [xsl] problem with transforming mixed content

Subject: Re: [xsl] problem with transforming mixed content
From: "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 15 Aug 2020 17:08:45 -0000

> In general I think the one-pass solution is often more complicated and
> runs the risk of not being extensible when the problem "evolves".

In general maybe, but not in this specific case...

I wouldn't offer this solution if it weren't obviously much simpler than the
3.0 one that was offered.
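
For readers of the archive, here is a minimal sketch of what a one-pass
stylesheet for this problem could look like. It is an illustration, not
necessarily the stylesheet posted earlier in the thread. It assumes XSLT 2.0
(for lower-case(); in 1.0 translate() could be substituted), that the
separator is the first ': ' found in an immediate child text node of
<title>, and that a title without such a separator should simply be copied
with its text lower-cased:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Assumption: the split point is the first immediate child text
       node of <title> that contains ': '. -->
  <xsl:template match="title[text()[contains(., ': ')]]">
    <xsl:variable name="split" select="text()[contains(., ': ')][1]"/>
    <title>
      <!-- Everything before the splitting text node, in document order -->
      <xsl:apply-templates select="$split/preceding-sibling::node()"/>
      <xsl:value-of select="lower-case(substring-before($split, ': '))"/>
    </title>
    <subtitle>
      <xsl:value-of select="lower-case(substring-after($split, ': '))"/>
      <!-- Everything after the splitting text node, in document order -->
      <xsl:apply-templates select="$split/following-sibling::node()"/>
    </subtitle>
  </xsl:template>

  <!-- Copy every other element (including <i>) and recurse into it -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- The sample string manipulation: lower-case every text node -->
  <xsl:template match="text()">
    <xsl:value-of select="lower-case(.)"/>
  </xsl:template>

</xsl:stylesheet>
```

Any further colon in the subtitle is left alone, and a colon inside <i> is
never treated as a separator, because only immediate child text nodes of
<title> are inspected.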

I would say to everyone: Stick to the KISS principle and believe your eyes
(and timings) :)

  --
Cheers,
Dimitre Novatchev


On Sat, Aug 15, 2020 at 2:16 AM Michael Kay mike@xxxxxxxxxxxx <
xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> This problem comes up from time to time, and it's not easy.
>
> There seem to be three general approaches:
>
> (a) turn the punctuation into markup (e.g. turn ":" into <colon/>), then
> do the manipulation on a tree of nodes
>
> (b) turn the markup into punctuation, then do the manipulation on the
> resulting text.
>
> (c) do it all in one pass
>
> I see that Graydon's solution uses serialize() and parse-xml(), so that's
> a modern approach to doing (b); while Dimitre's solution does (c). In
> general I think the one-pass solution is often more complicated and runs
> the risk of not being extensible when the problem "evolves".
>
> One of the things that can cause the problem to "evolve" is error
> handling: dealing with situations where the input isn't quite as simple as
> in your example. For example, multiple colons, no colons, colons that are
> there for a different purpose, etc. You haven't included any such cases in
> your requirements statement.
>
> If we ignore error handling, this example of the problem is simpler than
> some because the ":" is always going to be in an immediate child text node;
> we've seen other examples (like splitting a table) where we need to look
> for conditions much deeper in the structure. This is probably what makes a
> one-pass solution feasible in this case.
>
> Intuitively, my feeling is that (a) is the most rigorous approach, the one
> that is least likely to fail because of unanticipated input conditions. For
> example, Graydon's solution fails if the input contains tags with
> upper-case names, or if it contains comments with a colon in the text.
>
> Michael Kay
> Saxonica
>
> On 15 Aug 2020, at 03:16, Wolfhart Totschnig
> wolfhart.totschnig@xxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
> wrote:
>
> Dear list,
>
> I would like to ask for your help with the following mixed-content
> problem. I am receiving, from an external source, data in the following
> form:
>
> <title>THE TITLE OF THE BOOK WITH SOME <i>ITALICS</i> AND SOME MORE WORDS:
> THE SUBTITLE OF THE BOOK WITH SOME <i>ITALICS</i></title>
>
> What I would like to do is
> 1) separate the title from the subtitle (i.e., divide the data at the
> colon) and put each in a separate element node;
> 2) all the while maintaining the <i> markup;
> 3) and perform certain string manipulations on all of the text nodes; for
> the purposes of this post, I will use the example of converting upper-case
> to lower-case.
>
> So the desired output is the following:
>
> <title>the title of the book with some <i>italics</i> and some more
> words</title>
> <subtitle>the subtitle of the book with some <i>italics</i></subtitle>
>
> How can this be done?
>
> I know that I can perform string manipulations while maintaining the <i>
> markup with templates, i.e., <xsl:template match="text()"/> and
> <xsl:template match="i"/>. But in this case I do not know how to divide the
> data at the colon. And I know that I can divide the data at the colon with
> <xsl:value-of select="substring-before(.,': ')"/>, but then I lose the <i>
> markup. So I am at a loss.
>
> Thanks in advance for your help!
> Wolfhart
