Re: [xsl] Problem with encoding UTF-8

Subject: Re: [xsl] Problem with encoding UTF-8
From: Geert Josten <Geert.Josten@xxxxxxxxxxx>
Date: Wed, 15 Dec 2004 18:19:05 +0100
Hi there,

Error messages aren't always very accurate. I wouldn't be surprised if Xalan throws this message whenever it encounters something it didn't expect.

One thing I know it doesn't expect is a UTF-8 BOM (byte order mark) at the start of a file. Open the data file in a hex editor and check whether it starts with the bytes EF BB BF. If so, the file has a UTF-8 BOM. You will have to get rid of it, or switch to Saxon (or something else?).
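If you would rather not open a hex editor, the check can be scripted. Below is a minimal sketch in Python, assuming the BOM really is the culprit (which is still a guess). The function names and the file names in the usage note are made up for illustration; only `codecs.BOM_UTF8` comes from the standard library.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_from_file(src_path: str, dst_path: str) -> bool:
    """Copy src_path to dst_path without a leading BOM.

    Returns True if a BOM was found and removed, False otherwise.
    The file is read in binary mode so no other bytes are touched.
    """
    with open(src_path, "rb") as src:
        data = src.read()
    found = has_utf8_bom(data)
    with open(dst_path, "wb") as dst:
        dst.write(strip_utf8_bom(data))
    return found
```

Calling something like `strip_bom_from_file("export.xml", "export-nobom.xml")` (placeholder file names) before handing the file to the transformer would tell you whether a BOM was present, and give you a clean copy to test with.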

But then again, I'm guessing here...

Cheers,
Geert

David at roamware wrote:

Hi,

I have a colleague who is exporting an XML data file from a PDF (yes, you can
do that). He sent me the file, I ran it through my program using Xalan
2.5.1, xml-apis and xercesimpl.jar, both from 31st July last year, and it
comes up with the very unhelpful message "Premature end of file", which I
understand can mean encoding problems, a missing XSD, etc.

If I take the file he sent me and, in UltraEdit 32, use the UNICODE/UTF-8 ->
UTF-8 conversion option, save the file, and then run it through my program,
all works fine. This ambiguous conversion is explained thus: "This function
will convert the complete file from Unicode or UTF-8 (ASCII representation)
to UTF-8 (with the file internally as Unicode)"

So I am at a bit of a loss to explain what the file format has to do with
this; the PDF exports the file with "encoding=UTF-8" in the XML declaration.
Does anyone have experience of this behaviour and how to get around it? I
cannot change what the PDF exports, so it will have to be a "not strict"
switch or something on the parser, I suppose (I couldn't find a reference to
such a thing, mind you).

Thx.

David Wynter




-- Geert.Josten@xxxxxxxxxxx IT-consultant at Daidalos BV, Zoetermeer (NL)

http://www.daidalos.nl/
tel:+31-(0)79-3316961
fax:+31-(0)79-3316464

GPG: 1024D/12DEBB50
