Subject: RE: [xsl] MSXML - Processing non standard characters
From: "Andrew Kimball" <akimball@xxxxxxxxxxxxx>
Date: Wed, 1 Aug 2001 16:13:12 -0700
Warren,

You wrote:
>I am trying to transform an HTTP XML document which contains special
>characters using MSXML. I receive the following error when the
>transformation occurs:
>
>XML Error loading ''
>An invalid character was found in text content.
>
>I have no control over the format of the XML document. The XML document has
><?xml version="1.0"?>in the first line. Microsoft's site says: Re-encode the
>XML data as proper UTF-8.
>
>I added the following to my XSL file but it still doesn't work: <?xml
>version="1.0" encoding="UTF-8" ?>
>
>Since I can't change the original XML file, how can I resolve this problem.
>

I suspect the data includes invalid XML characters, like 0x1 or 0x2. You can check this by looking at a binary (hex) dump of your file. The XML spec allows these characters:

[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */

Notice that many control characters are excluded (including ESC, which is #x1B, decimal 27). I assume the reason is that XML is a text-based format. In practice, however, there is a need to allow these characters to be represented. I don't know why the XML WG didn't allow these special characters to be entitized (e.g. written as numeric character references). Anybody know? Maybe they will fix this hole in a future version of the spec.

Until then, MSXML is correctly rejecting these illegal XML characters (all conformant parsers must). I'd say you should talk to your XML supplier and point out that they're sending you invalid XML data. If this is not possible, then you might have to preprocess the data and remove the invalid characters before sending it to the parser.

~Andy Kimball
MSXSL Dev

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
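
The preprocessing step suggested in the reply could be sketched roughly as follows. This is only an illustration in Python, not MSXML code and not something from the original message: the function names are hypothetical, and it assumes the document has already been decoded to text. The pattern is simply the complement of the Char production quoted in the reply.

```python
import re

# Matches any character NOT allowed by the XML 1.0 Char production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (offset, code point) pairs for characters outside Char.

    Hypothetical helper: useful for confirming the diagnosis before
    changing anything.
    """
    return [(m.start(), ord(m.group())) for m in _INVALID_XML_CHARS.finditer(text)]


def strip_invalid_xml_chars(text):
    """Return text with every character outside Char removed.

    Hypothetical helper: dropping characters loses data, so this is a
    last resort when the supplier cannot fix the source.
    """
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    sample = "<doc>bad\x01data\x1b here</doc>"
    print(find_invalid_xml_chars(sample))
    print(strip_invalid_xml_chars(sample))
```

Running the finder first shows where the offending characters are, which is useful evidence when talking to the XML supplier. Whether to drop the characters or substitute a placeholder is a judgment call that depends on what the data means.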