Re: [xsl] Effects of white space between xml elements

Subject: Re: [xsl] Effects of white space between xml elements
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Tue, 24 Feb 2009 14:08:30 -0500
Nat,

Hermann has offered you the solution. You also asked whether white space between tags in XML source wasn't (shouldn't be) ignored in the transformation.

The answer is no. In the general case, XML parsers are not supposed to strip any whitespace at all, on the grounds that without a schema a parser can't know which white space matters and which doesn't. For example, in an element like

<p>He sent her roses <i>and</i> <b>chocolate</b>.</p>

you wouldn't want the whitespace between the 'i' and the 'b' elements to be stripped.
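
To make that concrete, here is a minimal sketch of a stylesheet that counts what the 'p' element above contains. It assumes 'p' is the document element of the input and that the processor follows the XSLT 1.0 whitespace rules; nothing here comes from your own stylesheet.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="p">
    <!-- node() selects text nodes as well as elements -->
    <xsl:value-of select="count(node())"/>
    <xsl:text> nodes, </xsl:text>
    <!-- * selects element children only -->
    <xsl:value-of select="count(*)"/>
    <xsl:text> elements</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

A conforming processor should report 5 nodes and 2 elements: the text before 'i', the 'i' element, the single space, the 'b' element, and the final period. A processor that strips whitespace-only text nodes would report 4 nodes, and the rendered sentence would read "andchocolate".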

This general rule is however subject to a number of caveats:

* Some XML processors ignore this rule and strip whitespace-only text nodes anyway: caveat emptor. (For example, MSXML is famous for this.)
* In XSLT 2.0, where schema-aware processing can be supported, a source tree can be pruned of insignificant whitespace nodes, on the basis of schema declarations that mandate treating whitespace as cosmetic.


For example, in your data, if there were a schema or DTD declaration for 'actions' that said it could contain only 'effect', or at any rate not contain text (#PCDATA), then it would be known statically that whitespace didn't count, and a processor could discard it for you.
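
As a sketch of what such a declaration might look like, here is an internal DTD subset that gives 'actions' element-only content. The content model for 'effect' and the sample values are assumptions made for illustration; I haven't seen your real data.

```xml
<!DOCTYPE actions [
  <!-- element-only content: 'actions' may contain 'effect' and nothing else -->
  <!ELEMENT actions (effect*)>
  <!-- assumed for illustration: 'effect' holds plain text -->
  <!ELEMENT effect (#PCDATA)>
]>
<actions>
  <effect>one</effect>
  <effect>two</effect>
</actions>
```

With this in place, a parser that reads the DTD can tell that the line breaks and indentation inside 'actions' are not character data. Whether your XSLT processor then actually drops those text nodes depends on the processor and how it is configured, so check its documentation before relying on it.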

In addition, even in XSLT 1.0 there are top-level elements, xsl:strip-space and xsl:preserve-space, which allow you to control this directly from the stylesheet.
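
A minimal sketch of how that might look for your case, assuming the element names 'actions' and 'effect' from your data (the template body is only a placeholder to show the effect):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- remove whitespace-only text nodes that are children of 'actions' -->
  <xsl:strip-space elements="actions"/>

  <xsl:template match="actions">
    <!-- with the stripping above, node() no longer picks up
         the whitespace between the 'effect' elements -->
    <xsl:for-each select="node()">
      <xsl:value-of select="name()"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Note that xsl:strip-space removes only text nodes consisting entirely of whitespace; it does not trim text that contains other characters. You can also write elements="*" to strip everywhere, and then use xsl:preserve-space to list the elements (such as a mixed-content 'p') where the whitespace between children has to survive.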

I hope this helps,
Wendell

At 06:16 PM 2/23/2009, you wrote:
<xsl:for-each select="child::*"> does what you want.

"child::node()" matches the (white space) text nodes in addition.
...

Does white space between elements of an XML document affect the output
of a transformation of that document?  I noticed that sometimes adding
a new line in between the closing tag of one element and the opening
tag of the next element changes my output, even though I am working
with the same stylesheet.

I thought that white space in an XML document was ignored during the
transformation; that the only things that will affect the output are
the on/off tags and how they are nested in each other.  If tags are
nested properly, shouldn't white space and new lines in between
elements be irrelevant?


======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
