Char node-type

Subject: Char node-type
From: Richard Light <richard@xxxxxxxxxxxxxxxxx>
Date: Thu, 23 Nov 2000 08:05:59 +0000
When transforming e.g. scientific articles to HTML, we often encounter
characters which browsers cannot represent directly.  These have to be
'translated' to <img> elements which call in a suitable GIF.  Precisely
which GIF to use (bold, superscript, etc.) depends on the ancestry of
the element containing the character.  Similar techniques are required to
resolve and render Unicode combining characters.

Since the translate() function only deals with one-to-one character
mappings, we have written a template for text() nodes which processes
the first character, then calls itself recursively to process
the rest of the string.  This works fine so long as you don't have more
than about 600 characters in a single element - with a long enough
string you get stack overflows.
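
For concreteness, here is a minimal sketch of the kind of template
described above.  It is not our production stylesheet: the template name
"process-chars", the parameter "str", the GIF file names, the choice of
alpha (U+03B1) as the character to replace, and the assumption that bold
text sits inside a <b> element are all made up for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hand every text node to the recursive character processor. -->
  <xsl:template match="text()">
    <xsl:call-template name="process-chars">
      <xsl:with-param name="str" select="."/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical named template: handles the first character of $str,
       then calls itself on the remainder of the string. -->
  <xsl:template name="process-chars">
    <xsl:param name="str"/>
    <xsl:if test="string-length($str) &gt; 0">
      <xsl:variable name="c" select="substring($str, 1, 1)"/>
      <xsl:choose>
        <!-- Illustrative mapping only: alpha becomes an image. -->
        <xsl:when test="$c = '&#x3B1;'">
          <xsl:choose>
            <!-- The context node is still the text node, so its
                 ancestry can select the GIF variant. -->
            <xsl:when test="ancestor::b">
              <img src="alpha-bold.gif" alt="alpha"/>
            </xsl:when>
            <xsl:otherwise>
              <img src="alpha.gif" alt="alpha"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:when>
        <!-- Every other character is copied through unchanged. -->
        <xsl:otherwise>
          <xsl:value-of select="$c"/>
        </xsl:otherwise>
      </xsl:choose>
      <!-- One nested call per remaining character. -->
      <xsl:call-template name="process-chars">
        <xsl:with-param name="str" select="substring($str, 2)"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Because each character adds one more nested call-template, the recursion
depth equals the length of the text node, which is consistent with the
overflow we see on long strings.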

It wouldn't be necessary to use this unpleasant and inefficient
technique if XPath defined in its data model, and XSLT supported, a char
node-type.  (Isolating a single character would also allow us to access
its character value via the number function, which would be handy for
processing classes of Unicode characters.)

I know that asking XSLT processors to process single characters is less
efficient than working on strings, but given the need to process single
characters, surely it makes sense to give the XSLT processor the job and
do it in a way that won't blow the stack?

... or is there a clever way of doing this sort of thing in XSLT that I
haven't thought of?  (There usually is!)

Richard Light
SGML/XML and Museum Information Consultancy
richard@xxxxxxxxxxxxxxxxx


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

