RE: [xsl] Problems with mixed content and inline elements when transforming XHTML into another XML format

From: Tony Kinnis <kinnist@xxxxxxxxx>
Date: Mon, 27 Feb 2006 00:27:49 -0800 (PST)
Great news. I finally solved this problem. It was a combination of
issues: incorrect match criteria and a missing nested
<xsl:apply-templates/> call.

I do have one follow-up question. If you all think I should make this a
separate post, just let me know.

The function f:is-inline tests the node against a list of several
element names. I am wondering whether this is the most efficient way to
do it. The list is getting a little longer than I expected, and I may
add a few more elements before it is all said and done. Is there a way
to declare the set of tags I want included rather than building up this
big conditional?

For example: 

INSTEAD OF:
<xsl:sequence select="($node instance of text() and normalize-space($node))
    or $node[self::u|self::b|self::i|self::strong|self::span|self::em|self::br|self::img|self::font|self::a]"/>

COULD I:
Declare a list of all tags I want considered...
<xsl:variable name="inlineElements" select="u,b,i,strong,span,em,br,img,font,a"/>

<xsl:sequence select="($node instance of text() and normalize-space($node))
    or inList($node, $inlineElements)"/>

I realize I just made up this mythical function "inList". I am trying
to make it easier to add to or remove from the list, and maybe improve
the readability some. Is there anything close to this that will perform
well?
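
For readers after something close to the made-up inList, here is a minimal sketch, assuming an XSLT 2.0 processor (Saxon 8 is the one used in this thread). It keeps the names in one global variable and relies on the general comparison = being true when any item of the sequence on the right matches. The namespace check and the use of local-name() are assumptions of this sketch, and whether it performs better than the self:: union has not been measured.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0"
    xmlns:f="http://localhost"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <!-- Single place to add or remove inline element names. -->
    <xsl:variable name="inlineElements" as="xs:string*"
        select="('u', 'b', 'i', 'strong', 'span', 'em', 'br', 'img', 'font', 'a')"/>

    <xsl:function name="f:is-inline" as="xs:boolean">
        <xsl:param name="node" as="node()"/>
        <!-- True for non-whitespace text, or for an XHTML element whose
             local name equals any string in $inlineElements. -->
        <xsl:sequence select="($node instance of text() and normalize-space($node) != '')
            or ($node instance of element()
                and namespace-uri($node) = 'http://www.w3.org/1999/xhtml'
                and local-name($node) = $inlineElements)"/>
    </xsl:function>

</xsl:stylesheet>
```

Because the function and the variable are both global, the rest of the stylesheet can keep calling f:is-inline(.) unchanged; only the list itself needs editing when a tag is added or dropped.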

Thanks again for the help.

----
In case anyone else is looking for the final script I have included it
here. I am guessing all of this was rather obvious to all of you. 

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0"
    xpath-default-namespace="http://www.w3.org/1999/xhtml"
    xmlns:f="http://localhost"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://localhost/markup.xsd">
    
    <xsl:output indent="yes" method="xml" encoding="UTF-8" standalone="no"/>
    
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:for-each-group select="@*|node()" group-adjacent="f:is-inline(.)">
                <xsl:choose>
                    <xsl:when test="current-grouping-key()">
                        <xsl:element name="my:textnode">
                            <xsl:copy>
                                <xsl:apply-templates select="current-group()"/>
                            </xsl:copy>
                        </xsl:element>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:apply-templates select="current-group()"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
    
    <xsl:function name="f:is-inline" as="xs:boolean">
        <xsl:param name="node" as="node()"/>
        <xsl:sequence select="($node instance of text() and normalize-space($node))
            or $node[self::u|self::b|self::i|self::strong|self::span|self::em|self::br|self::img|self::font|self::a]"/>
    </xsl:function>
</xsl:stylesheet>

This will wrap each run of adjacent inline nodes (non-whitespace text
plus the elements listed in the select expression of the sequence
above) in a tag called <my:textnode>.
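
A point worth checking when adapting the script above: inside xsl:for-each-group the context item is the first item of the current group, and xsl:copy does not evaluate its content when the context item is a text node, so a run such as foo<br/> after break may come out as only its first text node. The following is a sketch of one possible variant, not the script as posted. It drops the inner xsl:copy and uses xsl:copy-of for inline runs, as the earlier version quoted further down does. The exclude-result-prefixes attribute and the literal <my:textnode> element are additions of this sketch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0"
    xpath-default-namespace="http://www.w3.org/1999/xhtml"
    xmlns:f="http://localhost"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://localhost/markup.xsd"
    exclude-result-prefixes="f xs">

    <xsl:output indent="yes" method="xml" encoding="UTF-8" standalone="no"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:for-each-group select="@*|node()" group-adjacent="f:is-inline(.)">
                <xsl:choose>
                    <xsl:when test="current-grouping-key()">
                        <!-- Inline run: copy every node of the group unchanged. -->
                        <my:textnode>
                            <xsl:copy-of select="current-group()"/>
                        </my:textnode>
                    </xsl:when>
                    <xsl:otherwise>
                        <!-- Attributes and block-level nodes: recurse as before. -->
                        <xsl:apply-templates select="current-group()"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

    <xsl:function name="f:is-inline" as="xs:boolean">
        <xsl:param name="node" as="node()"/>
        <xsl:sequence select="($node instance of text() and normalize-space($node))
            or $node[self::u|self::b|self::i|self::strong|self::span|self::em|self::br|self::img|self::font|self::a]"/>
    </xsl:function>
</xsl:stylesheet>
```

With xsl:copy-of the content of an inline element is copied as-is, so text inside a <strong> or <span> is not wrapped a second time. Using xsl:apply-templates in that branch instead would recurse into the inline elements and wrap their text again.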

--- Tony Kinnis <kinnist@xxxxxxxxx> wrote:

> Hello all, 
> 
> My apologies in advance for reposting. I sent this question a few days
> ago and didn't receive a response. Maybe it was simply overlooked or
> even ignored. :) In case it is the former I am sending it again.
> 
> Thanks in advance for any help you can give. See below for the
> posting.
> 
> >>-- repost--<<
> 
> Sorry to keep asking about this problem but I am still having issues.
> The change you mention below does remove the error but now it never
> hits the block of code to wrap the elements in the textnode. It is
> simply outputting the input verbatim. Stepping through it in a
> debugger shows that it steps into the for-each-group statement, then
> right into the otherwise clause, outputs the entire document, and then
> exits processing as complete. It seems as though it is missing a
> statement in the otherwise clause that causes it to recurse on the
> elements. I tried a couple of different things with that but they were
> all wrong and didn't produce the results I wanted.
> 
> See the previous message below for the input, stylesheet and desired
> output because I think something is missing here or I am not asking my
> question correctly. For convenience here is the entire stylesheet
> including the change you suggested. Once again thanks for your help.
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>     version="2.0"
>     xpath-default-namespace="http://www.w3.org/1999/xhtml"
>     xmlns:f="http://whatever"
>     xmlns:xs="http://www.w3.org/2001/XMLSchema">
>     
>     <xsl:template match="/">
>         <xsl:copy>
>             <xsl:for-each-group select="node()" group-adjacent="f:is-inline(.)">
>                 <xsl:choose>
>                     <xsl:when test="current-grouping-key()">
>                         <textnode><xsl:copy-of select="current-group()"/></textnode>
>                     </xsl:when>
>                     <xsl:otherwise>
>                         <xsl:copy-of select="current-group()"/>
>                     </xsl:otherwise>
>                 </xsl:choose>
>             </xsl:for-each-group>
>         </xsl:copy>
>     </xsl:template>
>     
>     <xsl:function name="f:is-inline" as="xs:boolean">
>         <xsl:param name="node" as="node()"/>
>         <xsl:sequence select="$node instance of text() or
>             $node[self::u|self::b|self::i|self::strong|self::span|self::em|self::br]"/>
>     </xsl:function>
> </xsl:stylesheet>
> 
> > --- Michael Kay <mike@xxxxxxxxxxxx> wrote:
> > 
> > > > 
> > > > I keep getting this error...
> > > > 
> > > > Description: A sequence of more than one item is not allowed as the
> > > > first argument of f:is-inline()
> > > > URL: http://www.w3.org/TR/xpath20/#ERRXPTY0004
> > > 
> > > Sorry, the code should have said group-adjacent="f:is-inline(.)"
> > > 
> > > Michael Kay
> > > http://www.saxonica.com/
> > > 
> > > > 
> > > > In case this matters, I am debugging this using the Oxygen editor
> > > > for the Mac. The processor I have selected is Saxon8B. Once again,
> > > > help is much appreciated.
> > > > 
> > > > To make this easier, here is the full XSL doc, the input I am
> > > > testing, and the desired output...
> > > > 
> > > > XSL document...
> > > > <?xml version="1.0" encoding="UTF-8"?>
> > > > <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> > > >     version="2.0"
> > > >     xpath-default-namespace="http://www.w3.org/1999/xhtml"
> > > >     xmlns:f="http://whatever"
> > > >     xmlns:xs="http://www.w3.org/2001/XMLSchema">
> > > >     <xsl:template match="/">
> > > >         <xsl:copy>
> > > >             <xsl:for-each-group select="node()"
> > > >                 group-adjacent="f:is-inline(node())">
> > > >                 <xsl:choose>
> > > >                     <xsl:when test="current-grouping-key()">
> > > >                         <textnode><xsl:copy-of select="current-group()"/></textnode>
> > > >                     </xsl:when>
> > > >                     <xsl:otherwise>
> > > >                         <xsl:copy-of select="current-group()"/>
> > > >                     </xsl:otherwise>
> > > >                 </xsl:choose>
> > > >             </xsl:for-each-group>
> > > >         </xsl:copy>
> > > >     </xsl:template>
> > > >     
> > > >     <xsl:function name="f:is-inline" as="xs:boolean">
> > > >         <xsl:param name="node" as="node()"/>
> > > >         <xsl:sequence select="$node instance of text() or
> > > >             $node[self::u|self::b|self::i|self::strong|self::span|self::em|self::br]"/>
> > > >     </xsl:function>
> > > > </xsl:stylesheet>
> > > > 
> > > > XHTML Document...
> > > > 
> > > > <?xml version="1.0" encoding="utf-8"?>
> > > > <html xmlns="http://www.w3.org/1999/xhtml">
> > > >     <head>
> > > >         <meta name="generator" content="HTML Tidy, see www.w3.org"/>
> > > >         <title>The Title Is</title>
> > > >     </head>
> > > >     <body>
> > > >         <ul id="bar">
> > > >             <li/>
> > > >             <li>foo<br/> after break <div/> after empty div</li>
> > > >             <li>bar<strong>baz</strong></li>
> > > >         </ul>
> > > >         <ol>
> > > >             <li>Item 1</li>
> > > >             <li>Item 2</li>
> > > >         </ol>
> > > >         <p><span>foo</span><br/> asdf <b>bold another</b>
> > > >             and <strong>a strong item</strong>
> > > >         </p>
> > > >         <div>
> > > >             Content of a <b>div tag</b> here.
> > > >             <ul>
> > > >                 <li>
> > > >                     Nested List Item 1
> > > >                 </li>
> > > >                 <li>
> > > >                     Nested List Item 2
> > > >                 </li>
> > > >             </ul>
> > > >             Now list is done
> > > >         </div>
> > > >     </body>
> > > > </html>
> > > > 
> > > > Desired output...
> > > > <?xml version="1.0" encoding="utf-8"?>
> > > > <html xmlns="http://www.w3.org/1999/xhtml">
> > > >     <head>
> > > >         <meta name="generator" content="HTML Tidy, see www.w3.org"/>
> > > >         <title><textnode>The Title Is</textnode></title>
> > > >     </head>
> > > >     <body>
> > > >         <ul id="bar">
> > > >             <li/>
> > > >             <li><textnode>foo<br/> after break </textnode><div/><textnode> after empty div</textnode></li>
> > > >             <li><textnode>bar<strong>baz</strong></textnode></li>
> > > >         </ul>
> > > >         <ol>
> > > >             <li><textnode>Item 1</textnode></li>
> > > >             <li><textnode>Item 2</textnode></li>
> > > >         </ol>
> > > >         <p><textnode><span>foo</span><br/> asdf <b>bold another</b>
> > > >             and <strong>a strong item</strong></textnode>
> > > >         </p>
> > > >         <div>
> > > >             <textnode>Content of a <b>div tag</b> here.</textnode>
> > > >             <ul>
> > > >                 <li>
> > > >                     <textnode>Nested List Item 1</textnode>
> > > >                 </li>
> > > >                 <li>
> > > >                     <textnode>Nested List Item 2</textnode>
> > > >                 </li>
> > > >             </ul>
> > > >             <textnode>Now list is done</textnode>
> > > >         </div>
> > > >     </body>
> > > > </html>
> > > > 
> 
=== message truncated ===

