RE: [xsl] Need to remove unusual character in source

Subject: RE: [xsl] Need to remove unusual character in source
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 26 Sep 2006 23:34:38 +0100
XML 1.0 doesn't allow the control character x18 (U+0018, CAN), either as a
literal character or as a character reference. XML 1.1 does allow it,
provided it's represented as a numeric character reference.

XSLT can only process well-formed XML documents. If you have a document that
isn't well-formed XML, you need to clean it up using some non-XML-aware
process before you try putting it through an XML parser.
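
As a rough illustration of such a non-XML-aware cleanup step, here is a
minimal sketch in Python. It treats the input as plain text and drops every
character that the XML 1.0 Char production forbids. The function name
strip_illegal_xml_chars is hypothetical, and the sketch assumes the
offending characters appear literally in the data rather than as references
such as &#x18;, which it does not touch.

```python
import re
import xml.etree.ElementTree as ET

# Characters outside the XML 1.0 "Char" production: C0 controls other than
# tab, LF and CR, lone surrogates, and the noncharacters U+FFFE and U+FFFF.
_ILLEGAL_XML10 = re.compile(
    "[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]"
)


def strip_illegal_xml_chars(text: str) -> str:
    """Return text with every character that XML 1.0 forbids removed.

    This is a plain string operation and knows nothing about XML syntax,
    which is why it can run on input that a parser would reject.
    """
    return _ILLEGAL_XML10.sub("", text)


if __name__ == "__main__":
    dirty = "<doc>abc\x18def</doc>"  # contains a literal U+0018 (CAN)

    # The raw input is not well-formed, so the parser refuses it.
    try:
        ET.fromstring(dirty)
    except ET.ParseError as err:
        print("rejected:", err)

    # After cleanup, the same data parses normally.
    clean = strip_illegal_xml_chars(dirty)
    print(ET.fromstring(clean).text)
```

Removing the character silently is a design choice made for the sketch.
Replacing it with a visible marker, or logging each occurrence, may be
preferable if the client needs to know what was dropped.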

If you've agreed with your client, implicitly or explicitly, that it would
be beneficial to both parties to exchange data in XML, then I would point
out to the client that what they are sending you isn't XML and that this
destroys the point of the exercise.

Michael Kay
http://www.saxonica.com/ 

> -----Original Message-----
> From: Mario Madunic [mailto:hajduk@xxxxxxxx] 
> Sent: 26 September 2006 22:31
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Need to remove unusual character in source
> 
> I've come across a character in the source that I need to remove.
> 
> I'm using Saxon 8, XSLT 2
> 
> the character is a control character:
> 
> 0x18 CAN
> 
> I've tried using a character-map, with the character defined
> as an entity like this: <!ENTITY controlChar18 "&#x18;">
> 
> <xsl:character-map name="testMap">
>    <xsl:output-character char="&controlChar18;" string="REMOVEDCHAR" />
> </xsl:character-map>
> 
> <xsl:result-document use-character-maps="testMap".../>
> 
> the error message I receive is
> SXXP0003: Error reported by XML parser: Illegal XML character: &#x18;.
> 
> I've tried using Ant to clean it out, but with no luck using
> native2ascii or escapeunicode
> 
> Can this be done or do I need to ask the client to remove it 
> from their data, which might not be an option?
> 
> Any help or insight would be greatly appreciated.
> 
> Marijan Madunic
