RE: [xsl] Problems with mixed content and inline elements when transforming XHTML into another XML format

Subject: RE: [xsl] Problems with mixed content and inline elements when transforming XHTML into another XML format
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Mon, 27 Feb 2006 09:52:37 -0000

> The function is-inline is generating a sequence with several elements.
> I am wondering if the way this is done is most efficient. The list is
> getting a little longer than I expected and I may add a few more
> elements before it is all said and done. Is there a way I can declare
> the set of tags I want to be included rather than building up this big
> conditional?

Very often with a long list like this, you find that the elements are all
members of the same substitution group in the schema. So with a schema-aware
processor, you can use the construct 

. instance of schema-element(inline-element)

to select them all.
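
As a rough, untested sketch of how that might look inside your f:is-inline
function: the schema file name and the substitution-group head
h:inline-element below are placeholders, not names taken from any published
XHTML schema, and the input has to be validated against the imported schema
for the test to succeed.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:h="http://www.w3.org/1999/xhtml"
    xmlns:f="http://localhost">

  <!-- Assumption: a schema at this (hypothetical) location declares an
       element h:inline-element heading the substitution group that
       contains u, b, i, strong, span, em, br, img, font and a. -->
  <xsl:import-schema namespace="http://www.w3.org/1999/xhtml"
                     schema-location="xhtml-inline.xsd"/>

  <xsl:function name="f:is-inline" as="xs:boolean">
    <xsl:param name="node" as="node()"/>
    <!-- Non-blank text nodes count as inline, as in the original function;
         elements count when they belong to the substitution group. -->
    <xsl:sequence select="($node instance of text()
                             and normalize-space($node) ne '')
                          or $node instance of schema-element(h:inline-element)"/>
  </xsl:function>

</xsl:stylesheet>
```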

This may or may not be useful in your situation; but it's just one example
of the ways that making stylesheets schema-aware can improve the robustness
of your code.
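
If a schema-aware processor is not an option, something close to your
"inList" idea can probably be had with a sequence of names and a general
comparison, since "=" is true when any item on the right-hand side matches.
A minimal, untested sketch; the variable name inline-names is made up for
illustration:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://localhost">

  <!-- Hypothetical name; edit this one list to add or remove inline tags. -->
  <xsl:variable name="inline-names" as="xs:string+"
      select="('u', 'b', 'i', 'strong', 'span', 'em',
               'br', 'img', 'font', 'a')"/>

  <xsl:function name="f:is-inline" as="xs:boolean">
    <xsl:param name="node" as="node()"/>
    <!-- Non-blank text, or an XHTML element whose local name is listed. -->
    <xsl:sequence select="($node instance of text()
                             and normalize-space($node) ne '')
                          or ($node instance of element()
                              and string(namespace-uri($node)) eq 'http://www.w3.org/1999/xhtml'
                              and local-name($node) = $inline-names)"/>
  </xsl:function>

</xsl:stylesheet>
```

The explicit namespace check stands in for the xpath-default-namespace
setting in your stylesheet, which only affects name tests and so does not
help when comparing local-name() against strings.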

Michael Kay
http://www.saxonica.com/


> 
> For example: 
> 
> INSTEAD OF:
> <xsl:sequence select="($node instance of text() and
> normalize-space($node)) 
>             or
> $node[self::u|self::b|self::i|self::strong|self::span|self::em
> |self::br|self::img|self::font|self::a]"/>
> 
> COULD I:
> Declare a list of all tags I want considered...
> <xsl:variable name="inlineElements"
> select="u,b,i,strong,span,em,br,img,font,a"/>
> 
> <xsl:sequence select="($node instance of text() and
> normalize-space($node)) 
>             or inList($node, $inlineElements)"/>
> 
> Realizing I just made up this mythical function "inList". I am trying
> to make it easier to add/subtract from the list and maybe improve the
> readability some. Is there anything close to this that will perform
> well?
> 
> Thanks again for help.
> 
> ----
> In case anyone else is looking for the final script I have included it
> here. I am guessing all of this was rather obvious to all of you. 
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> version="2.0"
>     xpath-default-namespace="http://www.w3.org/1999/xhtml"
>     xmlns:f="http://localhost"
>     xmlns:xs="http://www.w3.org/2001/XMLSchema"
>     xmlns:my="http://localhost/markup.xsd">
>     
>     <xsl:output indent="yes" method="xml" encoding="UTF-8"
> standalone="no"/>
>     
>     <xsl:template match="@*|node()">
>         <xsl:copy>
>             <xsl:for-each-group select="@*|node()"
> group-adjacent="f:is-inline(.)">
>                 <xsl:choose>
>                     <xsl:when test="current-grouping-key()">
>                         <xsl:element name="my:textnode">
>                             <xsl:copy>
>                                 <xsl:apply-templates
> select="current-group()"/>
>                             </xsl:copy>
>                         </xsl:element>
>                     </xsl:when>
>                     <xsl:otherwise>                        
>                         <xsl:apply-templates 
> select="current-group()"/>
>                     </xsl:otherwise>
>                 </xsl:choose>
>             </xsl:for-each-group>
>         </xsl:copy>    
>     </xsl:template>
>     
>     <xsl:function name="f:is-inline" as="xs:boolean">
>         <xsl:param name="node" as="node()"/>
>         <xsl:sequence select="($node instance of text() and
> normalize-space($node)) 
>             or
> $node[self::u|self::b|self::i|self::strong|self::span|self::em
> |self::br|self::img|self::font|self::a]"/>
>     </xsl:function>
> </xsl:stylesheet>
> 
> This will wrap all elements that are in the select clause of the
> sequence above in a tag called <my:textnode>.
> 
> --- Tony Kinnis <kinnist@xxxxxxxxx> wrote:
> 
> > Hello all, 
> > 
> > My apologies in advance for reposting. I sent this question a few
> > days ago and didn't receive a response. Maybe it was simply
> > overlooked or even ignored. :) In case it is the former I am
> > sending it again.
> > 
> > Thanks in advance for any help you can give. See below for the
> > posting.
> > 
> > >>-- repost--<<
> > 
> > Sorry to keep asking about this problem but I am still having
> > issues. The change you mention below does remove the error but now
> > it never hits the block of code to wrap the elements in the
> > textnode. It is simply outputting the input verbatim. Stepping
> > through it in a debugger it shows that it steps into the
> > for-each-group statement then right into the otherwise clause,
> > outputs the entire document then it exits processing as complete.
> > It seems as though it is missing a statement in the otherwise
> > clause that causes it to recurse on the elements. I tried a couple
> > of different things with that but they were all wrong and didn't
> > produce the results I wanted.
> > 
> > See the previous message below for the input, stylesheet and desired
> > output because I think something is missing here or I am not asking
> > my question correctly. For convenience here is the entire stylesheet
> > including the change you suggested. Once again thanks for your help.
> > 
> > <?xml version="1.0" encoding="UTF-8"?>
> > <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> > version="2.0"
> >     xpath-default-namespace="http://www.w3.org/1999/xhtml"
> >     xmlns:f="http://whatever"
> >     xmlns:xs="http://www.w3.org/2001/XMLSchema">
> >     
> >     <xsl:template match="/">
> >         <xsl:copy>
> >             <xsl:for-each-group select="node()"
> > group-adjacent="f:is-inline(.)">
> >                 <xsl:choose>
> >                     <xsl:when test="current-grouping-key()">
> >                         <textnode><xsl:copy-of
> > select="current-group()"/></textnode>
> >                     </xsl:when>
> >                     <xsl:otherwise>
> >                         <xsl:copy-of select="current-group()"/>
> >                     </xsl:otherwise>
> >                 </xsl:choose>
> >             </xsl:for-each-group>
> >         </xsl:copy>   
> >     </xsl:template>
> >     
> >     <xsl:function name="f:is-inline" as="xs:boolean">
> >         <xsl:param name="node" as="node()"/>
> >         <xsl:sequence select="$node instance of text() or
> >
> $node[self::u|self::b|self::i|self::strong|self::span|self::em
> |self::br]"/>
> >     </xsl:function>
> > </xsl:stylesheet>
> > 
> > > --- Michael Kay <mike@xxxxxxxxxxxx> wrote:
> > > 
> > > > > 
> > > > > I keep getting this error...
> > > > > 
> > > > > > Description: A sequence of more than one item is not allowed as
> > > > > > the first argument of f:is-inline()
> > > > > URL: http://www.w3.org/TR/xpath20/#ERRXPTY0004
> > > > 
> > > > Sorry, the code should have said group-adjacent="f:is-inline(.)"
> > > > 
> > > > Michael Kay
> > > > http://www.saxonica.com/
> > > > 
> > > > > 
> > > > > > In case this matters I am debugging this using the Oxygen
> > > > > > editor for the Mac. The processor I have selected is Saxon8B.
> > > > > > Once again help is much appreciated.
> > > > > > 
> > > > > > To make this easier here is the full xsl doc, input I am
> > > > > > testing and desired output....
> > > > > 
> > > > > XSL document...
> > > > > <?xml version="1.0" encoding="UTF-8"?>
> > > > > > <xsl:stylesheet
> > > > > > xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> > > > > > version="2.0"
> > > > > >     xpath-default-namespace="http://www.w3.org/1999/xhtml"
> > > > > >     xmlns:f="http://whatever"
> > > > > >     xmlns:xs="http://www.w3.org/2001/XMLSchema">
> > > > >     <xsl:template match="/">
> > > > >         <xsl:copy>
> > > > >             <xsl:for-each-group select="node()"
> > > > >                 group-adjacent="f:is-inline(node())">
> > > > >                 <xsl:choose>
> > > > >                     <xsl:when test="current-grouping-key()">
> > > > >                         <textnode><xsl:copy-of
> > > > > select="current-group()"/></textnode>
> > > > >                     </xsl:when>
> > > > >                     <xsl:otherwise>
> > > > >                         <xsl:copy-of 
> select="current-group()"/>
> > > > >                     </xsl:otherwise>
> > > > >                 </xsl:choose>
> > > > >             </xsl:for-each-group>
> > > > >         </xsl:copy>    
> > > > >     </xsl:template>
> > > > >     
> > > > >     <xsl:function name="f:is-inline" as="xs:boolean">
> > > > >         <xsl:param name="node" as="node()"/>
> > > > >         <xsl:sequence select="$node instance of text() or
> > > > > $node[self::u|self::b|self::i|self::strong|self::span|self::em
> > > > > |self::br]"/>
> > > > >     </xsl:function>
> > > > > </xsl:stylesheet>
> > > > > 
> > > > > XHTML Document...
> > > > > 
> > > > > <?xml version="1.0" encoding="utf-8"?>
> > > > > > <html xmlns="http://www.w3.org/1999/xhtml">
> > > > > >     <head>
> > > > > >         <meta name="generator" content="HTML Tidy, see www.w3.org"/>
> > > > >         <title>The Title Is</title>
> > > > >     </head>
> > > > >     <body>
> > > > >         <ul id="bar">
> > > > >             <li/>
> > > > >             <li>foo<br/> after break <div/> after empty
> > div</li>
> > > > >             <li>bar<strong>baz</strong></li>
> > > > >         </ul>
> > > > >         <ol>
> > > > >             <li>Item 1</li>
> > > > >             <li>Item 2</li>
> > > > >         </ol>
> > > > >         <p><span>foo</span><br/> asdf <b>bold another</b>
> > > > >             and <strong>a strong item</strong>
> > > > >         </p>
> > > > >         <div>
> > > > >             Content of a <b>div tag</b> here.
> > > > >             <ul>
> > > > >                 <li>
> > > > >                     Nested List Item 1
> > > > >                 </li>
> > > > >                 <li>
> > > > >                     Nested List Item 2
> > > > >                 </li>
> > > > >             </ul>
> > > > >             Now list is done
> > > > >         </div>
> > > > >     </body>
> > > > > </html>
> > > > > 
> > > > > Desired output...
> > > > > <?xml version="1.0" encoding="utf-8"?>
> > > > > > <html xmlns="http://www.w3.org/1999/xhtml">
> > > > > >     <head>
> > > > > >         <meta name="generator" content="HTML Tidy, see www.w3.org"/>
> > > > >         <title><textnode>The Title Is</textnode></title>
> > > > >     </head>
> > > > >     <body>
> > > > >         <ul id="bar">
> > > > >             <li/>
> > > > >             <li><textnode>foo<br/> after break
> > > > > </textnode><div/><textnode> after empty div</textnode></li>
> > > > >            
> > <li><textnode>bar<strong>baz</strong></textnode></li>
> > > > >         </ul>
> > > > >         <ol>
> > > > >             <li><textnode>Item 1</textnode></li>
> > > > >             <li><textnode>Item 2</textnode></li>
> > > > >         </ol>
> > > > >         <p><textnode><span>foo</span><br/> asdf <b>bold
> > > another</b>
> > > > >             and <strong>a strong item</strong></textnode>
> > > > >         </p>
> > > > >         <div>
> > > > >             <textnode>Content of a <b>div tag</b>
> > > here.</textnode>
> > > > >             <ul>
> > > > >                 <li>
> > > > >                     <textnode>Nested List Item 1</textnode>
> > > > >                 </li>
> > > > >                 <li>
> > > > >                     <textnode>Nested List Item 2</textnode>
> > > > >                 </li>
> > > > >             </ul>
> > > > >             <textnode>Now list is done</textnode>
> > > > >         </div>
> > > > >     </body>
> > > > > </html>
> > > > > 
> > 
> === message truncated ===
> 
> 
