JADE and 'exotic' encodings.

Subject: JADE and 'exotic' encodings.
From: "Bernhard Adam" <adam@xxxxxxxxxxxxxxxxxxxx>
Date: Tue, 17 Nov 1998 11:46:19 CET

We are planning to implement a formatting process using JADE/DSSSL.
The difficulty is that we must handle some (for us) 'exotic' encodings, such as 
those for Japanese, Chinese, Korean, ...

As a first step we parsed some of these documents with NSGMLS, using the 
following setting:
SP_BCTF=SJIS (for Japanese)
(we are also using an appropriate SGML declaration).
With these settings everything works fine with NSGMLS; no errors occurred.

Unfortunately it doesn't work with JADE. For the same documents we get 
a large number of 'encoding errors' like:
jade:c:\projects\test.sgm:47:65:E: non SGML character number 65533
In the output JADE then maps every unknown byte combination to the 
same charref '&#65533;'.
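
As an aside, 65533 is the decimal value of U+FFFD, the Unicode replacement character, which decoders commonly substitute for byte sequences they cannot interpret. The following is a minimal Python sketch of that effect, not part of SP or JADE. The function name `find_replacements` and the sample markup are hypothetical, and the assumption that JADE reaches 65533 by a comparable decoder mismatch is not confirmed by anything above. The reported columns count decoded characters and need not match the columns JADE prints.

```python
REPLACEMENT = "\ufffd"  # U+FFFD; ord(REPLACEMENT) is 65533


def find_replacements(data: bytes, encoding: str) -> list[tuple[int, int]]:
    """Return 1-based (line, column) positions where decoding yields U+FFFD."""
    # errors="replace" substitutes U+FFFD for every undecodable byte sequence
    text = data.decode(encoding, errors="replace")
    positions = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch == REPLACEMENT:
                positions.append((line_no, col_no))
    return positions


# Hypothetical sample document, stored as Shift_JIS bytes
sample = "<p>abc</p>\n<p>日本語</p>".encode("shift_jis")

print(find_replacements(sample, "shift_jis"))  # decoder matches the bytes
print(find_replacements(sample, "ascii"))      # decoder does not match
```

Running such a check on a real document with the encoding the tool is believed to use may help show whether the errors come from the document bytes or from the decoder selection.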

We've learned from the documentation that JADE always sets
SP_CHARSET_FIXED to "YES", i.e. SP_BCTF is not evaluated and
JADE works in the "fixed character set mode". Therefore we've also
set SP_ENCODING to "SHIFT_JIS" (for Japanese).
BTW: if we use these JADE settings with NSGMLS we get the same 
problems as with JADE.

Now I was wondering if there is a way to force JADE to process our 
documents in exactly the same way NSGMLS does?

Should we change the SGML-Declaration (remember: the declaration
works fine with NSGMLS)?
Or will we ultimately have to modify the JADE source? If so, how much
work should we expect, and where in the source code should we start?

Any response is welcome.
Thanks in advance.


Bernhard Adam

Bernhard Adam                  Phone: (0049)711-13 99 96 21
Vector Informatik GmbH       Email: adam@xxxxxxxxxxxxxxxxxxxx
Friolzheimer Strasse 6
70499 Stuttgart

 DSSSList info and archive:  http://www.mulberrytech.com/dsssl/dssslist
