Re: [xsl] IDENT

Subject: Re: [xsl] IDENT
From: Jeni Tennison <mail@xxxxxxxxxxxxxxxx>
Date: Mon, 5 Feb 2001 16:13:49 +0000

Hi Maribel,

> I'm transforming a HTML file to a text file with xsl,  and I want to
> obtain a plain text, without lines feeds.  How I can obtain it?

The short answer: wrap xsl:text elements around the actual text that
you want to output.

The long answer: have a look at this template:

>   <xsl:template match="B" mode="special">
>     &lt;B&gt;
>     <xsl:value-of select="."/>
>     &lt;/B&gt;<xsl:text> </xsl:text>
>   </xsl:template>

When a DOM builder goes over that bit of XML, it creates a node tree
that looks like:

+- (element) xsl:template
   +- (text) [NL][SP][SP]<B>[NL][SP][SP]
   +- (element) xsl:value-of
   +- (text) [NL][SP][SP]</B>
   +- (element) xsl:text
   |  +- (text) [SP]
   +- (text) [NL]

All those new lines and spaces are there because of the indenting in
the stylesheet. When the XSLT processor gets hold of this node tree, it
first strips out all whitespace-only text nodes that are in the tree,
unless they appear within an xsl:text element. So it gets rid of the
last text node to give:

+- (element) xsl:template
   +- (text) [NL][SP][SP]<B>[NL][SP][SP]
   +- (element) xsl:value-of
   +- (text) [NL][SP][SP]</B>
   +- (element) xsl:text
      +- (text) [SP]
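
As an aside, the stripping rule can be seen in isolation with a smaller
template. This is only a sketch: the id attribute is an assumption made
up for illustration and isn't something from your stylesheet.

```xml
<xsl:template match="B" mode="special">
  <!-- The line breaks and indentation between these three instructions
       are whitespace-only text nodes, so they are stripped. -->
  <xsl:value-of select="."/>
  <!-- This space sits inside xsl:text, so it is kept. -->
  <xsl:text> </xsl:text>
  <!-- @id is a hypothetical attribute, used only for this example. -->
  <xsl:value-of select="@id"/>
</xsl:template>
```

Applied to <B id="7">x</B>, this should add just "x 7" to the result
tree: the only whitespace that survives from the stylesheet is the
single space inside the xsl:text.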

Now, when the XSLT processor applies this template, any plain text
nodes that are found are copied directly to the result tree.  The
xsl:value-of adds the relevant value (e.g. 'x') and the xsl:text adds
whatever its content is.  You end up with a result tree that looks
like:

+- (text) [NL][SP][SP]<B>[NL][SP][SP]x[NL][SP][SP]</B>[SP]

which is why you get all the whitespace that you do in your output -
it's copying over the whitespace that you've used to indent stuff in
your stylesheet.

The way around this is to use xsl:text elements to separate the
whitespace that you want included in the result tree from the
whitespace that is only there to indent your stylesheet code.

If you use the template:

<xsl:template match="B" mode="special">
  <xsl:text> &lt;B&gt; </xsl:text>
  <xsl:value-of select="." />
  <xsl:text> &lt;/B&gt;</xsl:text>
</xsl:template>

Then the node tree that the DOM builder builds looks like:

+- (element) xsl:template
   +- (text) [NL][SP][SP]
   +- (element) xsl:text
   |  +- (text) [SP]<B>[SP]
   +- (text) [NL][SP][SP]
   +- (element) xsl:value-of
   +- (text) [NL][SP][SP]
   +- (element) xsl:text
   |  +- (text) [SP]</B>
   +- (text) [NL]

This time, a lot of whitespace disappears when the whitespace-only
nodes are cut out by the XSLT processor:

+- (element) xsl:template
   +- (element) xsl:text
   |  +- (text) [SP]<B>[SP]
   +- (element) xsl:value-of
   +- (element) xsl:text
      +- (text) [SP]</B>

And the output is simply:

+- (text) [SP]<B>[SP]x[SP]</B>

which I think is what you're after.
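
For context, here is a minimal sketch of how that template might sit in
a complete stylesheet. It rests on a few assumptions that aren't in
your message: that you're using the text output method, that the HTML
is well-formed enough to be parsed as XML, and that something applies
templates to the B elements in "special" mode. The match="/" template
is invented purely to make the example self-contained.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed: plain text output, so &lt; is written out as < -->
  <xsl:output method="text"/>

  <!-- Hypothetical entry point: process every B element in the
       source in "special" mode. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//B" mode="special"/>
  </xsl:template>

  <!-- All literal text is wrapped in xsl:text, so none of the
       indentation in this stylesheet reaches the output. -->
  <xsl:template match="B" mode="special">
    <xsl:text> &lt;B&gt; </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text> &lt;/B&gt;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With a source document like <P>one <B>x</B> two <B>y</B></P>, this
should produce the single line " <B> x </B> <B> y </B>", with no line
feeds coming from the stylesheet itself.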

I hope that helps,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

