Subject: Re: [xsl] source XSL to output XSL but removing leading spaces in the source XSL
From: "Abel Braaksma (online)" <abel.online@xxxxxxxxx>
Date: Wed, 8 Aug 2007 23:22:50 +0200 (CEST)
> Hi,
>
> I am new to XSL.

welcome :)

> I was able to transform one XSL file (source) to another
> XSL file (output) with the code below.

really? Why do you want to transform an XSLT file? Or do you mean that you
tried to transform an XML file into another XML file using an XSLT file?

> However, during the process, I was
> not able to remove any leading space from the source XSL file.  Is there a
> way to do this.  Thanks, Steven.

yes, there is. But it would be interesting to know what "leading space" means.
Do you mean that:

<node>
   <node>
      <node/>
   </node>
</node>

becomes
<node>
<node>
<node />
</node>
</node>

?
Or do you mean that you have text nodes and want the leading space removed?
What do you want to do with blank lines (i.e., do you want them removed as
well)?

See my comments below (trying to take into account that you are new to XSLT).
As one approach, I show you how to remove leading whitespace in XSLT 2.0 and
1.0, leaving newlines intact. I consider whitespace "leading" when it follows
a newline or when it is at the beginning of a text node, so that <node>
 some text</node> will become <node>some text</node>.

>
> <?xml version='1.0'?>

Not a necessity, but I'd recommend adding the encoding declaration to the XML
declaration:
<?xml version="1.0" encoding="utf-8" ?>

> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>                 version="1.0">

Most people seem to find it easier to learn XSLT 2.0 instead of XSLT 1.0, esp.
when they are new to it. Consider downloading Saxon 8.9 (or AltovaXML or
Gestalt) which are 2.0 capable processors.

>
> 	<xsl:template match="*">

You are only matching elements here. Text nodes, comment nodes, pi nodes and
attribute nodes are not 'captured' by this pattern.

> 		<xsl:copy>
> 			<xsl:for-each select="@*">
> 				<xsl:copy/>
> 			</xsl:for-each>

It seems that you have found your way to the copy idiom. That is the basis for
all XML-to-XML transformations if input and output look the same. Your
template could be optimized a bit though. Instead of xsl:for-each, think of
<xsl:copy-of select="@*" />. And instead of xsl:copy-of consider the whole
copy template as follows:

<xsl:template match="node() | @*">
   <xsl:copy>
       <xsl:apply-templates select="node() | @*" />
   </xsl:copy>
</xsl:template>

that will save you some keystrokes the next time and it will copy *all* input
nodes.

> 			<xsl:apply-templates/>
> 		</xsl:copy>
> 	</xsl:template>
> </xsl:stylesheet>


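Now that we've captured that bit, all you need is an overriding template. When
several templates match a node, XSLT picks the one with the highest priority.
Note that text() and node() have the same default priority (-0.5), so either
place the overriding template after the copy template (on a tie, a processor
that recovers picks the last one in the stylesheet) or give it an explicit
priority attribute. To change the look of text nodes (in your case), add
another matching template for the text nodes, like this: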

<xsl:template match="text()">
   .....
</xsl:template>


Now we are almost done. The content of the text()-matching template is the
only thing that's left. This is fairly easy in XSLT 2.0, where all we need to
do is add a regular expression that takes away the leading spaces:

<xsl:template match="text()">
   <xsl:sequence select="replace(., '^\s+', '')" />
</xsl:template>

now, that one removes *all* whitespace on the beginning of any text node. I
assumed in the intro that you want newlines to stay. That will change the
regex as follows:

replace(., '(&#xa;) +|^ +', '$1')
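Put together with the copy template, a complete XSLT 2.0 stylesheet could look
like the sketch below. It is a minimal example, not the only way to do it. The
priority attribute is my own addition (an assumption, not something from your
code): text() and node() share the same default priority, so stating it
explicitly avoids an ambiguous-match warning.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

   <!-- copy template: copies every node that has no better match -->
   <xsl:template match="node() | @*">
      <xsl:copy>
         <xsl:apply-templates select="node() | @*" />
      </xsl:copy>
   </xsl:template>

   <!-- text nodes: drop spaces that follow a newline or that start
        the text node; the newline itself is kept through $1 -->
   <xsl:template match="text()" priority="1">
      <xsl:sequence select="replace(., '(&#xa;) +|^ +', '$1')" />
   </xsl:template>

</xsl:stylesheet>
```

Note that this only strips space characters. If your source is indented with
tabs, the character class would need to become something like [ \t]+ instead
of a plain space.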


Ok, you've seen how easy it is in XSLT 2.0. In XSLT 1.0 we have to do it a
little differently: we need a recursive template:

<xsl:template match="text()">
   <xsl:call-template name="remove-ws" >
      <!-- the context node is the text node itself, so pass "." -->
      <xsl:with-param name="text" select="." />
   </xsl:call-template>
</xsl:template>

<xsl:template name="remove-ws">
   <xsl:param name="text" />

   <!-- leading space: drop the first character and recurse -->
   <xsl:if test="starts-with($text, ' ')" >
      <xsl:call-template name="remove-ws">
         <xsl:with-param name="text" select="substring($text, 2)" />
      </xsl:call-template>
   </xsl:if>
   <!-- no leading space left: output what remains -->
   <xsl:if test="not(starts-with($text, ' '))">
      <xsl:value-of select="$text" />
   </xsl:if>
</xsl:template>


it shouldn't be too hard to figure out how to change this to a more elaborate
whitespace stripping method. But I hope the above convinced you to move to
XSLT 2.0 ;)
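To give an idea of what "more elaborate" could mean in XSLT 1.0, here is a
sketch of a complete stylesheet that strips leading tabs as well as leading
spaces. The template name strip-leading and the priority attribute are my own
choices for illustration, and it only handles whitespace at the very start of
a text node, not after each newline.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

   <!-- copy template: copies every node that has no better match -->
   <xsl:template match="node() | @*">
      <xsl:copy>
         <xsl:apply-templates select="node() | @*" />
      </xsl:copy>
   </xsl:template>

   <!-- text nodes are passed to the recursive helper below -->
   <xsl:template match="text()" priority="1">
      <xsl:call-template name="strip-leading">
         <xsl:with-param name="text" select="." />
      </xsl:call-template>
   </xsl:template>

   <!-- hypothetical helper: removes leading spaces and tabs from $text -->
   <xsl:template name="strip-leading">
      <xsl:param name="text" />
      <xsl:choose>
         <!-- first character is a space or a tab: drop it and recurse -->
         <xsl:when test="starts-with($text, ' ') or starts-with($text, '&#x9;')">
            <xsl:call-template name="strip-leading">
               <xsl:with-param name="text" select="substring($text, 2)" />
            </xsl:call-template>
         </xsl:when>
         <!-- nothing left to strip: output the remainder -->
         <xsl:otherwise>
            <xsl:value-of select="$text" />
         </xsl:otherwise>
      </xsl:choose>
   </xsl:template>

</xsl:stylesheet>
```

The recursion goes one level deeper for every leading whitespace character,
which is fine for ordinary indentation but worth keeping in mind for very long
runs of whitespace.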


Cheers & happy coding!
-- Abel Braaksma
