Re: [xsl] Result still indented despite indent="no"

Subject: Re: [xsl] Result still indented despite indent="no"
From: Dimitre Novatchev <dnovatchev@xxxxxxxxx>
Date: Mon, 21 Feb 2005 18:25:24 +1100

Hi Mukul,

> It is quite clear from the MSXML4 documentation (quote
> above), that MS is also interpreting the "text node
> stripping rules" in XSLT 1.0 spec, as you and Mr. Kay.
> But MSXSL is producing output contrary to your
> reasoning, and to the rules of the spec (and also to
> what is stated in their SDK documentation!)
>

No, it is not contrary, and let me quote again two messages (by
Michael Kay and David Carlisle), which explain exactly this:

Mike Kay:

"The behavior of the Microsoft processor is not due to a different
interpretation of the semantics of xsl:strip-space and xsl:preserve-space.
Microsoft's XSLT processor is behaving the same as the other processors. The
difference is that their XML parser (by default) removes the whitespace
before the XSLT processor gets to see it, and before the XSLT rules come
into play.

Since the conformance rules for XSLT talk only about transforming source
trees into result trees, anything that happens to the data before it is
turned into an XSLT source tree is outside the scope of the XSLT
specification, so legalistically, Microsoft's product is not non-conformant.
It's just different from all the others.

Michael Kay
http://www.saxonica.com/"


David Carlisle:

"No.

As I stated before, msxsl's implementation of xsl:strip space is
exactly the same as saxon and xalan (and is the only possible one by any
normal reading of the specification, it really is not at all ambiguous.)

The MS XSLT engine is conformant. It is the MS XML parser which removes space
(in what's described as an optimisation) when building the tree. So it has
nothing to do with XSLT: the space is gone if you inspect the tree with
DOM calls or anything else, it is just unconditionally dropped while
parsing.

If you search the archives of this list you'll see this is a faq.


David"


Yes, this has nothing to do with the MSXML XSLT processor -- what it
gets from the *MSXML parser* comes already with stripped
whitespace-only nodes.

Try to perform an XSLT transformation when the "preserveWhiteSpace"
property of IXMLDOMDocument is set to "true" and you'll verify that
the MSXML XSLT processor preserves the whitespace-only nodes (in all
cases when it gets them :o)  ).
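
To make the point concrete without MSXML at hand, here is a minimal
Python sketch. It does not call MSXML: the parse() function and its
preserve_white_space flag are hypothetical stand-ins for a parser with a
preserveWhiteSpace-style switch, built on the standard xml.dom.minidom
(which by itself keeps whitespace-only text nodes). It only illustrates
that once the parser has dropped those nodes, nothing that runs on the
tree afterwards can see them, let alone preserve them.

```python
from xml.dom import minidom

SOURCE = "<doc>\n  <item>a</item>\n  <item>b</item>\n</doc>"


def whitespace_only_text_nodes(node):
    """Return every text node under `node` that holds only whitespace."""
    found = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            if child.data.strip(" \t\r\n") == "":
                found.append(child)
        else:
            found.extend(whitespace_only_text_nodes(child))
    return found


def parse(xml_text, preserve_white_space):
    """Hypothetical parser wrapper (an assumption, not the MSXML API).

    With preserve_white_space=False the whitespace-only text nodes are
    removed while the tree is built, before any later stage runs.
    """
    doc = minidom.parseString(xml_text)
    if not preserve_white_space:
        for text in whitespace_only_text_nodes(doc):
            text.parentNode.removeChild(text)
    return doc


stripped = parse(SOURCE, preserve_white_space=False)
preserved = parse(SOURCE, preserve_white_space=True)

# What a transformation stage would receive in each case.
print(len(whitespace_only_text_nodes(stripped)))
print(len(whitespace_only_text_nodes(preserved)))
print(stripped.documentElement.toxml())
```

In the first case the later stage never gets a whitespace-only node to
decide about, whatever xsl:strip-space or xsl:preserve-space says.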


Cheers,

Dimitre


On Sun, 20 Feb 2005 22:57:29 -0800 (PST), Mukul Gandhi
<mukul_gandhi@xxxxxxxxx> wrote:
> Hi Dimitre,
>  Thanks for the explanation. I now feel my reasoning
> is wrong. You, Mr. Kay, Mr. Holman and David Carlisle
> are right.
>
> I just read the MSXML 4.0 SDK documentation. Here is
> what is written in it about white space handling..
>
> "
> Any XML document exercises some control over white
> space. The parser handles white space in an XML
> document according to the xml:space attribute and the
> default white space rules, before the document is
> processed by the XSLT processor.
>
> The XML 1.0 specification imposes the following white
> space rules.
>
> The parser normalizes newline characters specific to
> an operating system into true newline characters (hex
> x0A, or decimal 10). This is because different
> operating systems represent line breaks in different
> ways: for example, as true newlines, carriage returns,
> line feed/carriage return character pairs, and so on.
>
> The parser normalizes the values of attributes (other
> than CDATA-type attributes), by replacing multiple
> consecutive occurrences of white space with a single
> space. For example, an attribute value such as
> "text    text" (with four intervening spaces embedded
> in the value) is passed from the parser as "text text".
> The multiple spaces in the original value are replaced
> with a single space.
>
> If an xml:space attribute in the source XML document
> or style sheet conflicts with explicit XSLT white
> space handling, the behavior associated with that
> xml:space attribute always takes precedence.
>
> According to the XSLT specification, the XSLT
> processor merges adjacent text nodes into a single
> text node. If a text node (following any merging that
> occurs) consists of white space only, the containing
> element is compared to the list of elements in any
> <xsl:strip-space> elements in the style sheet. If the
> containing element appears in such a list, the text
> node with white space only is removed from the result
> tree.
>
> This applies only to insignificant white space -- that
> is, white space between, not within, elements. Use the
> XSLT normalize-space() function to normalize white
> space within elements.
>
> White space is handled by the built-in parser of
> MSXML, as well as by its built-in XSLT processor.
> "
>
> It is quite clear from the MSXML4 documentation (quote
> above), that MS is also interpreting the "text node
> stripping rules" in XSLT 1.0 spec, as you and Mr. Kay.
> But MSXSL is producing output contrary to your
> reasoning, and to the rules of the spec (and also to
> what is stated in their SDK documentation!)
>
> Regards,
> Mukul
>
>
> --- Dimitre Novatchev <dnovatchev@xxxxxxxxx> wrote:
>
> > Good morning,
> >
> > Wow... I see that more than 10 hours and 14 new
> > messages later Mukul
> > has still a persistent problem in understanding the
> > XSLT 1.0
> > specification on white-space stripping.
> >
> > Mukul, by now the problem in your thinking should be
> > obvious. Do you
> > see it yourself? If not, here it is:
> >
> > To arrive at the wrong conclusions, you are using
> > the following
> > excerpt from the spec, let's label it
> >
> > "A":
> >
> > "After the tree for a source document or stylesheet
> > document has been constructed, but before it is
> > otherwise processed by XSLT, some text nodes are
> > stripped.
> >
> > A text node is preserved if any of the following
> > apply:
> >
> > 1) The element name of the parent of the text node
> > is
> > in the set of whitespace-preserving element names.
> >
> > 2) The text node contains at least one
> > non-whitespace
> > character. As in XML, a whitespace character is
> > #x20,
> > #x9, #xD or #xA.
> >
> > 3) An ancestor element of the text node has an
> > xml:space attribute with a value of preserve, and no
> > closer ancestor element has xml:space with a value
> > of
> > default.
> >
> > Otherwise, the text node is stripped. "
> >
> >
> > The *obvious* problem is that you have omitted part
> > of the text.
> >
> > You omitted this very important sentence, which
> > precedes the excerpt;
> > let's label it
> >
> > "B":
> >
> > "The stripping process takes as input a set of
> > element names for which
> > whitespace must be preserved."
> >
> > The other big problem you demonstrate is a
> > consistent failure to see
> > that the excerpt "A" you quote *only by itself*
> > cannot be used to
> > determine if a given white-space-only node should be
> > stripped or not.
> >
> > The reason is that this text refers to a "set of
> > whitespace-preserving element names" -- the set of
> > names of elements,
> > for which whitespace should be preserved. This set
> > is undefined in
> > the excerpt quoted by you.
> >
> > Therefore this excerpt is incomplete and it is wrong
> > to make any
> > conclusions (including the one you make) only on
> > this excerpt.
> >
> > This is why it is necessary to resolve the
> > incompleteness. The spec
> > does this by providing exactly the missing part.
> > This is done in the
> > following sentence a few lines below, let's label it
> >
> > "C":
> >
> > "Initially, the set of whitespace-preserving
> > element names contains
> > all element names"
> >
> > Now, having "A" + "B" + "C" there's a complete
> > description of the
> > actions to be taken to determine whether a given
> > whitespace-only node
> > is to be stripped or not.
> >
> > Based on this complete description from the XSLT 1.0
> > specification
> > everyone can conclude that in the example given by
> > you the results
> > obtained by Saxon and Xalan are correct.
> >
> > The only way to arrive at your (opposite) conclusion
> > is to ignore the
> > sentence "C" and to assume that:
> >     initially, the set of whitespace-preserving
> > element names is empty.
> >
> > But this assumption is exactly the opposite of what
> > "C" is saying.
> >
> > Therefore, you arrived at the wrong conclusion by
> > ignoring part of the
> > specification and by making an assumption, which is
> > contrary to the
> > specification.
> >
> >
> > I hope that we can agree to stop further discussing
> > the flaws in your
> > logic as these flaws should now be clear to you.
> >
> > Cheers,
> > Dimitre.
>
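
The "A" + "B" + "C" reading from the quoted message can be sketched in
Python roughly as follows. This is a simplified illustration under
stated assumptions, not a conformant implementation: strip_whitespace()
and its strip_space argument are hypothetical names, the argument stands
in for the names listed in xsl:strip-space, and xsl:preserve-space,
wildcards, namespaces and the merging of adjacent text nodes are all
ignored. The point is only that the set of whitespace-preserving element
names starts out containing all element names ("C"), so nothing is
stripped unless a name is explicitly taken out of it.

```python
from xml.dom import minidom

XSLT_WHITESPACE = " \t\r\n"  # #x20, #x9, #xD, #xA


def text_nodes(node):
    """Collect all text nodes under `node` in document order."""
    found = []
    for child in node.childNodes:
        if child.nodeType == child.TEXT_NODE:
            found.append(child)
        else:
            found.extend(text_nodes(child))
    return found


def is_preserved(text, preserving_names):
    """Apply the three 'preserved if any of the following apply' tests."""
    parent = text.parentNode
    # 1) The parent's name is in the whitespace-preserving set.
    if parent.tagName in preserving_names:
        return True
    # 2) The text node has at least one non-whitespace character.
    if text.data.strip(XSLT_WHITESPACE) != "":
        return True
    # 3) The nearest ancestor that sets xml:space says "preserve".
    ancestor = parent
    while ancestor.nodeType == ancestor.ELEMENT_NODE:
        value = ancestor.getAttribute("xml:space")
        if value == "preserve":
            return True
        if value == "default":
            return False
        ancestor = ancestor.parentNode
    return False


def strip_whitespace(doc, strip_space=()):
    """Strip text nodes in place; return how many were removed."""
    # "C": initially the set contains all element names.
    preserving = {el.tagName for el in doc.getElementsByTagName("*")}
    # Names given in strip_space (assumed input) are removed from the set.
    preserving -= set(strip_space)
    removed = 0
    for text in text_nodes(doc):
        if not is_preserved(text, preserving):
            text.parentNode.removeChild(text)
            removed += 1
    return removed


SOURCE = "<doc>\n  <item>a</item>\n  <item>b</item>\n</doc>"
print(strip_whitespace(minidom.parseString(SOURCE)))
print(strip_whitespace(minidom.parseString(SOURCE), strip_space=["doc"]))
```

With no names passed in, every whitespace-only node survives, which is
the Saxon and Xalan result discussed in the thread; assuming an
initially empty set instead would strip all three of them.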
