Michael Kay wrote:
Apart from these questions, is anybody aware of a resolution
to my problem? Most likely I need an extension function, am I right?
Yes indeed. It was never a design intention that the core function library
should do everything that anyone might conceivably want.
Not even a general-purpose language can meet that demand,
and you can't always make a car do the tango. Still, XSLT has quite
extensive support for encodings, which is why I wonder why
'related' functions do not get the same level of support. Of course,
'related' is a relative term, and I consider the codepoints-to-string
functions 'related' to encodings ;-)
I can see three ways to get a generalized treatment of what I call
"encoded codepoints":
1. Create translation tables for each encoding we wish to support.
2. Treat the codepoints as Unicode, serialize them to a non-XML
format as ISO-8859-1, and read them back in as ISO-8859-X.
3. Use an extension function that does about the same as (2).
(2) and (3) avoid having to write translation tables, which I am very
reluctant to do, as the host language (here: Java, with Saxon) already
has them available. So (1) seems pretty awkward, though it would likely
be a pretty stable approach.
(2) sounds rather clumsy and risky, because of the extra parsing
step needed to serialize and read back in, which I fear
might introduce unwanted side effects and add considerably to the
complexity of our current XSLT design.
(3) may be my best option, but it sacrifices portability, which until
now I have pretty much been able to preserve.
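
To make (3) concrete, here is a rough sketch of what the Java side might
look like. The class and method names (EncodedCodepoints, decode) are
made up for illustration, it assumes single-byte encodings only, and how
the method gets bound to an XSLT function call depends on the processor
and its extension mechanism, so treat it as a starting point rather than
a finished solution.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of Saxon or any library.
// Declare it public in its own file if the processor has to load it.
final class EncodedCodepoints {
    private EncodedCodepoints() {}

    /**
     * Treats each value as one byte in the named single-byte encoding
     * and returns the decoded Unicode string. This is the in-memory
     * equivalent of writing the values out as ISO-8859-1 and reading
     * them back in as the target encoding.
     */
    static String decode(int[] codes, String encodingName) {
        // Throws if the JVM does not know the encoding name.
        Charset charset = Charset.forName(encodingName);
        byte[] bytes = new byte[codes.length];
        for (int i = 0; i < codes.length; i++) {
            if (codes[i] < 0 || codes[i] > 0xFF) {
                throw new IllegalArgumentException(
                        "Not a single-byte value: " + codes[i]);
            }
            bytes[i] = (byte) codes[i];
        }
        return new String(bytes, charset);
    }
}
```

With this, decode(new int[] {0xE1}, "ISO-8859-7") would give the Greek
small letter alpha (U+03B1), because the host language's own tables do
the mapping and none have to be written by hand.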
Perhaps someone has gone down this path already and would like to share
their experience?
Thanks,
-- Abel