Subject: Re: [xsl] problem with transforming mixed content
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 15 Aug 2020 11:01:19 -0000
Like Graydon's solution, this solution falls into category (b): convert the markup to text, then process as text. And like Graydon's solution, it makes assumptions about the markup and text content that can be encountered in the mixed content: in this case, the only markup it handles is what appears in the supplied test case, that is, an <i> element with no attributes, and it assumes that the '##' sequence won't appear naturally.

The problem with this kind of solution is that when you process 10,000 input documents it will do the right thing for 9,999 of them, and you need very good testing to catch the failures. In fact, you'll only catch the failure if you put a lot more effort into the testing than you put into the actual code.

(I'm working this morning on a bug I've created in the course of Saxon development that causes just 2 tests out of 30,000 in the QT3 test suite to fail. Or there might be two bugs, of course. Indeed, more worryingly, there might be three, and the tests are only catching two of them. As I'm sure you've found in your work on Xerces, you can have a vast test suite and bugs can still slip through. The general assumption with question-and-answer forums seems to be that one test case is enough, and that's blatantly wrong.)
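To make those assumptions concrete, here is a hedged sketch of a few extra inputs one might add to a test set for the stylesheet quoted below. The titles are invented for illustration and do not come from the thread. Each one is meant to be run as a separate input document, and the comments describe what reading the stylesheet suggests should happen, not the result of a recorded run.

```xml
<!-- Case 1: '##' occurs naturally in the text.
     The marker collides with real content, so "42 of " is expected to end up
     inside <i> and "nature" is expected to lose its italics. -->
<title>ISSUE ##42 OF <i>NATURE</i>: NOTES</title>

<!-- Case 2: <i> carries an attribute.
     The first pass keeps only the string value of <i>, so xml:lang is
     expected to be dropped from the output. -->
<title>A STUDY OF <i xml:lang="la">HOMO SAPIENS</i>: AN OVERVIEW</title>

<!-- Case 3: inline markup other than <i>.
     <sub> falls into the xsl:otherwise branch, so the output is expected to
     read "notes on h2o" with the subscript markup gone. -->
<title>NOTES ON H<sub>2</sub>O: A PRIMER</title>

<!-- Case 4: no colon at all.
     substring-before() and substring-after() both return the empty string
     when ':' is absent, so <title> and <subtitle> are both expected to be
     empty and the original text is lost. -->
<title>A TITLE WITH <i>NO</i> SUBTITLE</title>

<!-- Case 5: italics that span the colon.
     Each half is tokenized separately, so the part after the colon is
     expected to come out without <i>. -->
<title><i>MOBY-DICK: OR, THE WHALE</i></title>
```

None of these would be caught by the single test case in the original post, which is the point about putting more effort into the testing than into the code.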
Mukul wrote:
>
> I've come up with following XSLT transform, which seems to work for this use case,
>
> <xsl:stylesheet version="3.0"
>                 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>                 xmlns:xs="http://www.w3.org/2001/XMLSchema"
>                 exclude-result-prefixes="xs">
>
>   <xsl:output method="xml" indent="yes"/>
>
>   <xsl:template match="title">
>     <result>
>       <xsl:variable name="result_pass1" as="xs:string*">
>         <xsl:apply-templates select="node()" mode="pass1"/>
>       </xsl:variable>
>       <title>
>         <xsl:for-each select="tokenize(normalize-space(substring-before(string-join($result_pass1, ''), ':')), '##')">
>           <xsl:call-template name="process_tokenize_result_item">
>             <xsl:with-param name="inpStr" select="."/>
>           </xsl:call-template>
>         </xsl:for-each>
>       </title>
>       <subtitle>
>         <xsl:for-each select="tokenize(normalize-space(substring-after(string-join($result_pass1, ''), ':')), '##')">
>           <xsl:call-template name="process_tokenize_result_item">
>             <xsl:with-param name="inpStr" select="."/>
>           </xsl:call-template>
>         </xsl:for-each>
>       </subtitle>
>     </result>
>   </xsl:template>
>
>   <xsl:template name="process_tokenize_result_item">
>     <xsl:param name="inpStr" as="xs:string"/>
>
>     <xsl:choose>
>       <xsl:when test="position() mod 2 = 0">
>         <i>
>           <xsl:value-of select="."/>
>         </i>
>       </xsl:when>
>       <xsl:otherwise>
>         <xsl:value-of select="."/>
>       </xsl:otherwise>
>     </xsl:choose>
>   </xsl:template>
>
>   <xsl:template match="node()" mode="pass1">
>     <xsl:choose>
>       <xsl:when test="self::i">
>         <xsl:value-of select="concat('##', lower-case(.), '##')"/>
>       </xsl:when>
>       <xsl:otherwise>
>         <xsl:value-of select="lower-case(.)"/>
>       </xsl:otherwise>
>     </xsl:choose>
>   </xsl:template>
>
> </xsl:stylesheet>
>
> The above XSLT transform, when provided following XML input document,
>
> <title>THE TITLE OF THE BOOK WITH SOME <i>ITALICS</i> AND SOME MORE
> WORDS: THE SUBTITLE OF THE BOOK WITH SOME <i>ITALICS</i></title>
>
> produces following result,
>
> <result>
>    <title>the title of the book with some <i>italics</i> and some more words</title>
>    <subtitle>the subtitle of the book with some <i>italics</i>
>    </subtitle>
> </result>
>
> This solution, follows a two pass approach. In the first pass, the element constructs <i>text</i> are transformed into ##text## (assuming that delimiter ## doesn't interfere with the input text). The result of pass one, is transformed into the final result by second pass.
>
> --
> Regards,
> Mukul Gandhi