Re: [xsl] invalid character was found in text content

Subject: Re: [xsl] invalid character was found in text content
From: "G. Ken Holman" <gkholman@xxxxxxxxxxxxxxxxxxxx>
Date: Tue, 11 Sep 2001 20:24:16 -0400
At 01/09/11 20:03 -0400, Melvyn Rosengarden wrote:
The file header I create indicates ISO-8859-1 encoding. When I attempt to parse my XML file with the MS SAX interface I get the following error: "invalid character was found in text content".

This is a message about your XML characters ... not about your encoding.


When it first occurred I discovered that an embedded Hex 1E character was the culprit so my parsing routine "swallowed" that character. A few days later the problem reoccurred and the culprit was a Hex 05 character.
I do NOT want to be surprised again tomorrow. Is there a comprehensive list of invalid characters for the ISO-8859-1 encoding scheme

This is not what you are looking for, though you don't realize it.


that I could use to create the necessary pre-process filter?

You need to filter out non-XML characters ... neither hex 1E nor hex 05 is a valid XML character, but both are in the C0 control set of ISO-2022, the framework within which Latin-1 (ISO-8859-1) can be used in either GL or GR (typically GR).


The list of valid XML characters is in the XML recommendation. According to production [2], only tab, linefeed and carriage return are allowed from the C0 set of control characters. Note these are *not* in Latin-1, but in the control set.
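As a rough illustration of production [2] (Char) from the XML 1.0 Recommendation, here is a small Python sketch that tests a single code point. The function name is hypothetical and Python is only a stand-in for whatever language the pre-process filter is written in; the ranges are the ones listed in the production.

```python
def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches production [2] (Char) of XML 1.0.

    Hypothetical helper for illustration; not part of any XML library.
    """
    return (
        cp in (0x9, 0xA, 0xD)           # tab, linefeed, carriage return
        or 0x20 <= cp <= 0xD7FF         # everything from space up to the surrogate block
        or 0xE000 <= cp <= 0xFFFD       # private use area up to U+FFFD
        or 0x10000 <= cp <= 0x10FFFF    # supplementary planes
    )
```

With this check, hex 1E and hex 05 are both rejected, while tab and every character from hex 20 up through the rest of Latin-1 are accepted.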

Please see the Recommendation to determine which characters are allowed. The allowed set is specified in terms of Unicode, and all characters of Latin-1 are in Unicode. The three control characters I listed above are the complete list allowed from the C0 set, as specified in production [2].

I hope this helps.

........................ Ken


