[xsl] Smart Quote Encoding

Subject: [xsl] Smart Quote Encoding
From: "Roger L. Cauvin" <roger@xxxxxxxxxx>
Date: Wed, 12 Sep 2007 11:56:38 -0500
I am using Saxon 6.3 and trying to transform some XML using a stylesheet.
The XML is a log file that logs incoming text-only e-mail messages.  The
messages sometimes contain special/nonstandard characters, such as smart
quotes.  If I want to log the messages verbatim yet still be able to apply
XSLT, what is my best strategy?

With XML such as:

  <message-received>
    <from><![CDATA[John Spong <jspong@xxxxxxxxx>]]></from>
    <text><![CDATA[Descartes said, “I think, therefore I am.”]]></text>
  </message-received>

(The “ and ” characters are smart quotes.)

I receive the following error when I try to apply transformations:

  Fatal error reported by XML parser: illegal XML character U+18
    URL:    file:/C:/hello/goodbye.log
    Line:   8
    Column: 116
  Error
    org.xml.sax.SAXParseException: illegal XML character U+18: illegal XML character U+18
  Transformation failed

The XML file contains the following encoding declaration:

  <?xml version="1.0" encoding="ISO-8859-1"?>

I have also tried UTF-8 and US-ASCII encodings, with the same results.
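
To illustrate why changing the declaration may make no difference, here is a minimal sketch in Java. It assumes the log really contains a raw byte 0x18, as the parser message suggests. The class and method names (XmlCharScan, isLegalXml10, illegalCodePoints) are hypothetical and not part of Saxon or any XML parser. The range test follows the Char production of XML 1.0, which excludes most control characters whatever encoding is declared.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Saxon or any XML parser: reports code
// points that the XML 1.0 Char production does not allow.
class XmlCharScan {

    // XML 1.0 Char ::= #x9 | #xA | #xD | [#x20-#xD7FF]
    //                | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isLegalXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Decode the raw bytes with the given charset and collect every code
    // point that falls outside the Char production.
    static List<Integer> illegalCodePoints(byte[] raw, Charset cs) {
        String text = new String(raw, cs);
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isLegalXml10(cp)) {
                bad.add(cp);
            }
            i += Character.charCount(cp);
        }
        return bad;
    }

    public static void main(String[] args) {
        // Assumed sample: a byte 0x18 inside message text, matching the
        // U+18 named in the parser error.
        byte[] raw = { 'a', 0x18, 'b' };
        Charset[] charsets = {
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8,
            StandardCharsets.US_ASCII
        };
        for (Charset cs : charsets) {
            System.out.println(cs + " -> " + illegalCodePoints(raw, cs));
        }
    }
}
```

Under that assumption, the byte decodes to the same control character in all three encodings, so the offending character would have to be escaped or replaced before it reaches the log rather than handled by the encoding declaration.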

How can I handle arbitrary text and still be able to apply
transformations?

--
Roger L. Cauvin
Cauvin, Inc.
Product Management/Market Research
