RE: MSXML vs. Saxon: different handling of tabs & newlines

Subject: RE: MSXML vs. Saxon: different handling of tabs & newlines
From: Kay Michael <Michael.Kay@xxxxxxx>
Date: Mon, 6 Nov 2000 11:18:54 -0000
> I am observing an interesting difference in the way MSXML and 
> Saxon are treating tabs and newlines in my XML instance when viewing 
> the resulting HTML.

The difference is that for MSXML3, the input you supply to the XSLT
processor is in the form of a DOM, and MSXML3 does extra whitespace
stripping by default when it builds the DOM (i.e. before the tree gets
anywhere near the XSLT processor). I believe it's possible to suppress this
by setting the DOMDocument's preserveWhiteSpace property to true before
loading. There are varying views on whether Microsoft is conformant in this
area, but since you are building the DOM using a proprietary Microsoft API,
it's hard to point to the spec that it fails to conform to. Either way, the
result defeats the intended effect of the XSLT whitespace rules.
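
To make the effect concrete, here is a rough sketch in Python using the
standard xml.dom.minidom parser, which keeps whitespace-only text nodes. The
helper strip_whitespace_only_text is a hypothetical stand-in for what an eager
loader does; it only approximates MSXML3's default and is not its actual
algorithm. The sample document is invented for illustration.

```python
from xml.dom import minidom

XML_WHITESPACE = " \t\r\n"  # the four characters XML treats as whitespace


def text_of(node):
    """Concatenate the data of all text nodes below node, in document order."""
    if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE):
        return node.data
    return "".join(text_of(child) for child in node.childNodes)


def strip_whitespace_only_text(node):
    """Remove whitespace-only text nodes in place (hypothetical loader behaviour)."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and child.data.strip(XML_WHITESPACE) == "":
            node.removeChild(child)
        else:
            strip_whitespace_only_text(child)


SOURCE = "<p><b>tabs</b>\t<i>and</i>\n<u>newlines</u></p>"

preserved = minidom.parseString(SOURCE).documentElement
stripped = minidom.parseString(SOURCE).documentElement
strip_whitespace_only_text(stripped)

print(repr(text_of(preserved)))  # 'tabs\tand\nnewlines'
print(repr(text_of(stripped)))   # 'tabsandnewlines'
```

The second result is what the stylesheet would see if the tab and newline
were discarded before the transformation started, which is why the words run
together in the HTML.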

It's genuinely hard to implement the whitespace-stripping rules when you
take input from a DOM: there's a reasonable expectation that the XSLT
processor shouldn't modify the input tree, and doing whitespace-stripping on
the fly as you navigate the tree is likely to be incredibly expensive. If
you supply a DOM as input to Saxon, I copy the whole thing into a new data
structure (which is also expensive).
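
As an illustration of the copying approach (a minimal sketch, not Saxon's
actual code or data structure), the Python below walks a DOM read-only and
builds a separate tree of plain tuples, dropping whitespace-only text nodes
only under elements named in strip_names. That parameter is a hypothetical
simplification of xsl:strip-space; xml:space and xsl:preserve-space are
ignored here.

```python
from xml.dom import minidom

XML_WHITESPACE = " \t\r\n"


def copy_stripped(element, strip_names):
    """Copy a DOM element into (name, attributes, children) tuples.

    Whitespace-only text nodes are left out when the parent element's name
    is in strip_names. The input DOM is only read, never modified.
    """
    children = []
    for node in element.childNodes:
        if node.nodeType == node.ELEMENT_NODE:
            children.append(copy_stripped(node, strip_names))
        elif node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE):
            is_blank = node.data.strip(XML_WHITESPACE) == ""
            if is_blank and element.tagName in strip_names:
                continue  # stripped in the copy; still present in the DOM
            children.append(node.data)
    attributes = dict(element.attributes.items())
    return (element.tagName, attributes, children)


doc = minidom.parseString(
    "<list>\n  <item>a</item>\n  <para>x <b>y</b> <i>z</i></para>\n</list>"
)
before = doc.toxml()
tree = copy_stripped(doc.documentElement, strip_names={"list"})

print(tree)
print(doc.toxml() == before)  # True: the caller's DOM is untouched
```

The copy visits every node once, so the cost is paid up front rather than on
each navigation step; the space between the b and i elements survives because
para is not in strip_names.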

Mike Kay

