Subject: RE: [xsl] Unparse-text() string contains ascii chars 29, 30 and 31
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 19 Oct 2005 17:43:56 +0100
You might be able to make this work by using an XML 1.1 parser (specifying version="1.1" in the XML declaration). The current Saxon release is a bit patchy in its support for XML 1.1 (I've been doing some improvements so it should be better in 8.6) but the basics are there.

XML 1.1 allows characters in the range x01 to x1F provided they are written as character references. The only character not allowed is 0, which was the result of a coalition between people who wanted to prevent you holding pure binary, and people who want to write their software in C.

substring-before is more likely to work than tokenize, because substring-before allows any string (any string that you can get through the XML parser, that is), whereas regexes have their own rules and another layer of parsing.

If necessary use translate() to translate the C0 control characters into PUA Unicode characters, which are legal in a regex.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: andrew welch [mailto:andrew.j.welch@xxxxxxxxx]
> Sent: 19 October 2005 16:50
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Unparse-text() string contains ascii chars 29,
> 30 and 31
>
> I'm trying to process some data that's one long string delimited using
> ascii characters 29, 30 and 31 (which are apparently group, record and
> unit 'separator characters').
>
> I can get access to the string using unparsed-text(), but when I
> attempt to process the string using any of the functions, e.g.:
>
> tokenize($str, '')
>
> or
>
> substring-before($str, '')
>
> ...the XML parser complains that these aren't legal XML characters
> (when the stylesheet itself is parsed).
>
> Is there any way around this? I can't see how I can process the
> string in XSLT without using the characters themselves.
>
> The two alternatives I can see are to use an XMLFilter to turn it
> into XML using Java, or to go back to the source to get them to export
> their data in a less archaic way...
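
A minimal sketch of the approach suggested in the reply, combining an XML 1.1 stylesheet with translate() into the private use area before tokenizing. Several things here are assumptions rather than anything from the thread: the `input-uri` parameter and its `data.txt` default, the `main` named template used as the entry point, the output element names, and the particular PUA code points (U+E01D to U+E01F), which are arbitrary stand-ins. It also assumes the processor parses the stylesheet with an XML 1.1 parser and that unparsed-text() accepts the control characters, as the original poster reported.

```xml
<?xml version="1.1"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical location of the delimited text file -->
  <xsl:param name="input-uri" select="'data.txt'"/>

  <!-- Group (29), record (30) and unit (31) separators, written as
       character references; this is only well-formed in XML 1.1 -->
  <xsl:variable name="c0" select="'&#x1D;&#x1E;&#x1F;'"/>

  <!-- Arbitrary private-use stand-ins, chosen for illustration -->
  <xsl:variable name="pua" select="'&#xE01D;&#xE01E;&#xE01F;'"/>

  <xsl:template name="main">
    <!-- Swap each C0 separator for its PUA stand-in so that the
         regex passed to tokenize() never sees a control character -->
    <xsl:variable name="str"
        select="translate(unparsed-text($input-uri), $c0, $pua)"/>
    <groups>
      <xsl:for-each select="tokenize($str, '&#xE01D;')">
        <group>
          <xsl:for-each select="tokenize(., '&#xE01E;')">
            <record>
              <xsl:for-each select="tokenize(., '&#xE01F;')">
                <unit><xsl:value-of select="."/></unit>
              </xsl:for-each>
            </record>
          </xsl:for-each>
        </group>
      </xsl:for-each>
    </groups>
  </xsl:template>

</xsl:stylesheet>
```

Because the separators are translated before any output is produced, the result tree contains no C0 characters and can be serialized as ordinary XML 1.0. If the regex layer turns out to accept the control characters directly, the translate() step could be dropped and `$c0` used with substring-before instead.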