Subject: Re: [xsl] 0x19 is not a legal XML character
From: "Andrew Welch" <andrew.j.welch@xxxxxxxxx>
Date: Thu, 28 Jun 2007 11:33:32 +0100
> this may work and will remove all offending U+0019 chars.
The "offending" U+0019 characters could well be good content that's being written or read in the wrong encoding.
Simply stripping them out probably isn't the best approach - you need to work out why they're there, what put them there and then fix that. Patching it up afterwards is never a good idea.
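As a rough sketch of "work out why they're there": instead of deleting the characters, report where each one sits and what surrounds it, so you can trace it back to whatever wrote it. The Python below is only an illustration. The function name and the "damaged" sample are made up, and the low-byte truncation of U+2019 (a right single quote) is just one hypothetical way a U+0019 could end up in the data, not a diagnosis of the original poster's file.

```python
import re

# Anything outside the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
ILLEGAL_XML_CHARS = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

def report_illegal_chars(text, context=10):
    """Return (offset, code point, surrounding text) for each illegal character.

    Hypothetical helper: it only reports, it does not modify the text.
    """
    hits = []
    for m in ILLEGAL_XML_CHARS.finditer(text):
        start = max(0, m.start() - context)
        end = m.end() + context
        hits.append((m.start(), "U+%04X" % ord(m.group()), text[start:end]))
    return hits

# Assumed failure mode, for illustration only: each character is cut down
# to its low byte, so U+2019 becomes U+0019.
original = "it\u2019s"
damaged = "".join(chr(ord(c) & 0xFF) for c in original)

for offset, codepoint, snippet in report_illegal_chars(damaged):
    print(offset, codepoint, repr(snippet))
```

If the surrounding text shows the character sitting where an apostrophe or quote ought to be, that points at an encoding step upstream rather than at junk data.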
It reminds me of when I came to my last project and discovered things like this in the XML:
It turns out the quotes were causing problems later in the processing (in a regex I think), so they thought they should "escape" them in the XML before they got there...
Imagine explaining your process to someone else in a year's time - "this step is where we remove the U+0019 characters".