Subject: RE: [xsl] Problems with mixed content and inline elements when transforming XHTML into another XML format
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 22 Feb 2006 23:40:55 -0000

You're using XSLT 2.0 so this can be solved using grouping constructs.
Forget the templates that create <textnode> elements.
You want something like this, which causes adjacent "inline" nodes to be
grouped under a new element, with a function to decide whether a node is an
"inline" node:
<xsl:template match="div">
  <xsl:copy>
    <!-- Group adjacent children by whether each one is "inline" -->
    <xsl:for-each-group select="node()"
                        group-adjacent="f:is-inline(.)">
      <xsl:choose>
        <xsl:when test="current-grouping-key()">
          <textnode><xsl:copy-of select="current-group()"/></textnode>
        </xsl:when>
        <xsl:otherwise>
          <xsl:copy-of select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

<xsl:function name="f:is-inline" as="xs:boolean">
  <xsl:param name="node" as="node()"/>
  <xsl:sequence select="$node instance of text() or
                        $node[self::u|self::b|self::i]"/>
</xsl:function>

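
The two fragments above need namespace declarations and a surrounding stylesheet before they will run. What follows is only a sketch of how they might be assembled, not a tested solution. The namespace URI bound to the f prefix (urn:example:functions) is a made-up placeholder. The identity template is the one from the original question. Extending the inline test to em and strong, and using xsl:apply-templates rather than xsl:copy-of for the non-inline groups so that nested div elements are also processed, are assumptions about what is wanted. The sketch also assumes the elements are in no namespace, as in the sample; for input in the XHTML namespace, xpath-default-namespace="http://www.w3.org/1999/xhtml" would be needed on xsl:stylesheet.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:functions"
    exclude-result-prefixes="xs f">

  <!-- Identity template: copy anything not handled by a more specific rule. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="div">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Adjacent children with the same is-inline value form one group. -->
      <xsl:for-each-group select="node()" group-adjacent="f:is-inline(.)">
        <xsl:choose>
          <!-- A run of text and inline elements: wrap it as a whole. -->
          <xsl:when test="current-grouping-key()">
            <textnode><xsl:copy-of select="current-group()"/></textnode>
          </xsl:when>
          <!-- Block-level content: process recursively (assumption). -->
          <xsl:otherwise>
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

  <!-- True for text nodes and for the inline elements listed in the question. -->
  <xsl:function name="f:is-inline" as="xs:boolean">
    <xsl:param name="node" as="node()"/>
    <xsl:sequence select="$node instance of text()
        or exists($node[self::b or self::i or self::em or self::strong or self::u])"/>
  </xsl:function>

</xsl:stylesheet>
```

Note that a whitespace-only text node sitting between two block elements forms an inline group of its own, so it would also be wrapped in a textnode element. A later template matching textnode[normalize-space(.)], as in the original question, would not match such wrappers.
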
Michael Kay
http://www.saxonica.com/
> -----Original Message-----
> From: Tony Kinnis [mailto:kinnist@xxxxxxxxx]
> Sent: 22 February 2006 22:29
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Problems with mixed content and inline
> elements when transforming XHTML into another XML format
>
> Hello all,
>
> I have been trying to solve this problem for a few days now and I have
> had no luck. I am hoping someone here can help me out with this.
>
> I need to parse XHTML and transform it into another XML format. I am
> sure that the XHTML is valid and well formed (I am running it through
> HTMLTidy). The first problem I encountered was the notion of mixed
> content. Something like...
>
> <div>
> My name is <b>bob</b>. What is yours?
> <ul>
> <li>foo</li>
> <li>bar</li>
> </ul>
> </div>
>
> I found a utility script on the web that can turn mixed content into
> element content. I am guessing some of you have seen this script
> before.
>
> <xsl:template match="text()[normalize-space(.)][../*]">
> <xsl:element name="textnode">
> <xsl:value-of select="."/>
> </xsl:element>
> </xsl:template>
>
> <xsl:template match="@*|node()">
> <xsl:copy>
> <xsl:apply-templates select="@*|node()"/>
> </xsl:copy>
> </xsl:template>
>
> This makes the above input look like...
>
> <div>
> <textnode>My name is </textnode><b>bob</b><textnode>. What is
> yours?</textnode>
> <ul>
> <li>foo</li>
> <li>bar</li>
> </ul>
> </div>
>
> However, what I would really like to do is have the bold tags included
> inside of the textnode tag so that it looks like...
>
> <div>
> <textnode>My name is <b>bob</b>. What is yours?</textnode>
> <ul>
> <li>foo</li>
> <li>bar</li>
> </ul>
> </div>
>
> In other words, I would like to treat the <b> element as text and not
> as an element. There is a finite set of tags I would like treated as
> simple text. These are considered in-line elements in HTML:
> <b>, <i>, <em>, <strong>, <u>
>
> An alternative, and better, solution would be to wrap all text
> throughout the document in the textnode element, including the in-line
> elements mentioned above. The XML I will finally output from the
> transformation of the XHTML requires that all text be wrapped in a
> special displaytext tag, including the in-line elements mentioned above.
> By placing every piece of text, including the in-line text tags above,
> in a textnode I could easily pass the document through another template
> that says...
>
> <xsl:template match="textnode[normalize-space(.)]">
> <xsl:element name="displaytext">
> <xsl:apply-templates/>
> </xsl:element>
> </xsl:template>
>
> This would make things much easier.
>
> Below are the XSL processor and XSL version. I am not tied to Saxon if
> another processor could do the job, provided it can be used within Java
> and ports across platforms (Windows, Unix, etc.).
>
> Processor: Saxon8B
> XSL Version: 2.0
>
> Thanks in advance for your help.