Subject: RE: [xsl] possible workarounds to process files with invalid character encoding ...
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Fri, 12 Dec 2008 21:26:38 -0000
If you can write a Java Reader that turns this file into a stream of characters, then you can get Saxon to use that Reader by nominating a custom UnparsedTextURIResolver.

Alternatively, I suspect you can do it at the Java level by registering a name for the encoding and associating it with a decoder for that encoding - but I'm not familiar with the details.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Matthias Einbrodt [mailto:matthias.einbrodt@xxxxxxxxxxxxx]
> Sent: 12 December 2008 21:14
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] possible workarounds to process files with
> invalid character encoding ...
>
> Hello,
>
> I'm trying to transform a text file with XSLT using the
> unparsed-text and tokenize functions. Unfortunately, the text
> file consists of characters that are encoded with a
> non-Unicode-compliant encoding scheme. So, as expected, my Saxon
> processor (version 9.1.0.3 Basic) reports a
> MalformedInputException when I try to parse the file.
>
> Now my question is whether there are any "workarounds" to make
> Saxon process the file anyway. Maybe by:
>
> (1) Writing a sort of plugin that lets Saxon also support
> non-Unicode-compliant encodings;
>
> (2) Adding metadata to the input file in some way, which
> Saxon or another XSLT parser can handle and which specifies a
> mapping from the character encoding used to the appropriate
> code points of a Unicode-compliant encoding.
>
> And if such a workaround exists, is it even worth trying
> to implement it, or would one be better off preprocessing
> the file with a custom Java program, or even trying to
> modify the program that creates such text files so
> that it uses a Unicode-compliant encoding scheme rather than
> its own custom one?
>
> What are your opinions?
>
> Best regards
>
> Matthias Einbrodt
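The first suggestion in the reply (a Java Reader that turns the file into a stream of characters) might look roughly like the sketch below. It assumes, purely for illustration, that the custom encoding is single-byte and can be described by a 256-entry lookup table; the class name `TableDecodingReader` and the helper `asciiBasedTable` are hypothetical, and the real mapping has to come from whatever documentation exists for the producing program. Wiring the Reader into Saxon through an UnparsedTextURIResolver is not shown here, since the exact interface depends on the Saxon version.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;

/**
 * Hypothetical Reader for a single-byte custom encoding.
 * Each input byte is mapped to one UTF-16 char through a lookup table.
 * Assumption: the real encoding is single-byte; multi-byte schemes need more logic.
 */
class TableDecodingReader extends Reader {
    private final InputStream in;
    private final char[] table; // 256 entries: byte value -> character

    public TableDecodingReader(InputStream in, char[] table) {
        if (table.length != 256) {
            throw new IllegalArgumentException("table must have 256 entries");
        }
        this.in = in;
        this.table = table.clone();
    }

    /**
     * Starting point for a mapping table (an assumption, not the real mapping):
     * bytes 0x00-0x7F map to themselves, all others to U+FFFD until overridden.
     */
    public static char[] asciiBasedTable() {
        char[] t = new char[256];
        for (int i = 0; i < 256; i++) {
            t[i] = i < 0x80 ? (char) i : '\uFFFD';
        }
        return t;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        byte[] buf = new byte[len];
        int n = in.read(buf, 0, len);
        if (n < 0) {
            return -1; // end of stream
        }
        for (int i = 0; i < n; i++) {
            cbuf[off + i] = table[buf[i] & 0xFF]; // mask to get an unsigned index
        }
        return n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

A caller would build the table once, override the entries that differ from ASCII (for example `table[0x80] = '\u00E4'`), and hand the Reader to whatever consumes the text. For the second suggestion, the JDK mechanism for making a new encoding name known is `java.nio.charset.spi.CharsetProvider`; that route needs a full `Charset` with its own `CharsetDecoder`, which is more work than the table-based Reader above.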