Subject: Re: Handling Entities in MSXML
From: Mike Brown <mike@xxxxxxxx>
Date: Wed, 26 Jul 2000 11:58:54 -0700 (PDT)

ciaran byrne wrote:
> <body>Some text &dllr; </body>
>
> I get...
>
> <body>Some text &dllr</body>
>
> What I want is:
> <body>Some text &dllr;</body>

You mentioned you don't have a DTD. Your question boils down to "I'm referring to entities that haven't been declared. Why aren't they working?"

The only entity references you can have in an XML document of any kind, including XSL documents, without a DTD that declares the entities, are the ones that are built in to XML:

    &amp; &lt; &gt; &apos; &quot;

These are needed so that you can differentiate markup from character data.

I suspect that MSXML is being lenient when it allows you to have a reference to an undeclared entity in your source XML/XHTML, or perhaps you're just not using the method properly. In either case, rather than complain about your undeclared entity, it's pretending that the reference is really just character data, as if you had said

    <body><![CDATA[Some text &dllr; ]]></body>

You must realize that entities are numerous physical storage units which all together comprise the singular logical document. It is this single logical document that you are concerned with in XSL. You feed the XSL processor the document entity (the primary entity for the document), and it hands it off to an XML parser, which abstracts away all of the 'physical' aspects of it -- so things like general parsed entity references go away, replaced by their replacement text, and the bytes<->character encodings for each entity also go away. The parser reports on the single logical hierarchy of elements, attributes, and character data that it finds, and the XSL processor makes use of that information to create an internal representation of the XPath/XSLT node tree. This node tree has no concept of entities and entity references.

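To make the parser's role concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree as a stand-in for a conforming parser (it is not MSXML, and MSXML may behave differently). The internal DTD subset and the replacement text "$" for dllr are assumptions made up for illustration; nothing in the original question says what dllr should expand to. The sketch shows three things: a declared entity reference is replaced by its replacement text, an undeclared one is rejected, and a literal "&" placed in the tree as character data is escaped again on output.

```python
import xml.etree.ElementTree as ET

# Hypothetical declaration: the replacement text "$" is an assumption.
DECLARED = '<!DOCTYPE body [<!ENTITY dllr "$">]><body>Some text &dllr;</body>'
UNDECLARED = '<body>Some text &dllr;</body>'


def parse_text(source):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(source).text
    except ET.ParseError:
        return None


# The reference is gone from the tree; only its replacement text remains.
declared_text = parse_text(DECLARED)

# No declaration for dllr, so this parser refuses the document.
undeclared_text = parse_text(UNDECLARED)

# A literal "&" in a text node is character data, so the serializer escapes it.
body = ET.Element("body")
body.text = "Some text &dllr;"
serialized = ET.tostring(body, encoding="unicode")

print(repr(declared_text))
print(repr(undeclared_text))
print(serialized)
```

The last step mirrors the situation in the question: once "&dllr;" is in the tree as plain characters, the serializer has no way to know it was meant as a reference.
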
If you are thinking "I want &foo; in the output" then you have to either fake it by creating a text node with the characters & f o o ; and the disable-output-escaping attribute set to "yes", or you have to rely on your XSL processor's output method to know that certain characters or node types should be emitted as entity references. This is the kind of thing that the HTML output method does in the XSL processors that support it -- certain markup characters, and most text node characters not in the ASCII range (0x0..0x7E), are emitted as numeric character references like &#1234; or as SGML entity references like &bull;.

Even though you are managing to somehow get references to undeclared entities parsed as just the characters that make up the entity reference, at that point it is just character data and "&" is no longer special, so it *has* to be escaped on output, to preserve its status as character data rather than markup.

Your only workaround, if your XSL processor supports the disable-output-escaping option that was introduced in the later working drafts of XSLT 1.0, is to do something like:

    <xsl:template match="body">
      <body>
        <xsl:value-of select="." disable-output-escaping="yes"/>
      </body>
    </xsl:template>

-Mike

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list