Subject: Unicode and XSL (was substring())
From: Richard Light <richard@xxxxxxxxxxxxxxxxx>
Date: Sat, 5 Jun 1999 10:58:29 +0100
In message <93CB64052F94D211BC5D0010A800133170EECF@xxxxxxxxxxxxxxxxxxxxx uk>, Kay Michael <Michael.Kay@xxxxxxx> writes
>
>We had this conversation a few weeks ago (regarding length()). As I learnt
>then, it's all due to the appalling decision to allow non-spacing
>diacriticals in Unicode, which makes it quite hard to define what you mean by
>"the first character" in a string.

It isn't just diacriticals. Unicode has a concept of "combining characters", which is used for a wide range of purposes, most of which I don't begin to understand. It divides them into combining character classes, which group together characters that appear over, under, around (etc.!) the base character. It also has a detailed algorithm for combining multiple combining characters into one base character.

The *semantics* of "the first character" might be a difficult one. However, if you are simply trying to count characters, surely all you have to do is ignore any combining characters that occur within the string. (The first character should be a 'real one': combining characters always follow the base character they qualify.)

Since XML adopts Unicode in an unqualified manner, I assume that XSL back-ends will support the rendering of these combined characters. Just like I assume that all XML editors will support Unicode. ;-(

Richard Light.

Richard Light
SGML/XML and Museum Information Consultancy
richard@xxxxxxxxxxxxxxxxx

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
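
For illustration, the counting idea in the message above (skip combining characters, count everything else) might be sketched as follows. This is a minimal sketch in Python rather than XSL, and it is not from the original message: the function names `count_base_characters` and `first_character` are hypothetical, and it treats "combining character" as "code point with a non-zero canonical combining class", which is an assumption. Some marks have combining class 0, so a fuller treatment would likely need proper grapheme-cluster rules.

```python
import unicodedata


def count_base_characters(text: str) -> int:
    """Count code points whose canonical combining class is 0.

    Assumption: a non-zero combining class marks a combining character
    that should not be counted on its own.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


def first_character(text: str) -> str:
    """Return the first code point plus any combining characters after it.

    Relies on combining characters following the base character
    they qualify.
    """
    if not text:
        return ""
    end = 1
    while end < len(text) and unicodedata.combining(text[end]) != 0:
        end += 1
    return text[:end]


if __name__ == "__main__":
    # "cafe" followed by U+0301 COMBINING ACUTE ACCENT: five code points.
    sample = "cafe\u0301"
    print(len(sample), count_base_characters(sample))
    print(repr(first_character("e\u0301a")))
```

Under these assumptions the combining accent is not counted separately, and the "first character" keeps its accent attached rather than being split from it.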