Subject: RE: [xsl] Flattening characters to plain latin
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Sat, 17 Feb 2007 17:22:31 -0000
> My verdict: If the 'lt' of Michael was on purpose, I still
> want to grant him the "Best Original Software Snippet Based
> On Any XXX* Language" ;-)

I think the original problem wasn't especially well specified, and I was well aware that retaining all the characters below 127 while losing those above was a pretty crude cutoff. In the light of that, the decision whether to keep or lose 127 itself is neither here nor there.

Almost certainly a better solution is to discard only the characters in particular Unicode groups, which should be possible to achieve using replace() with appropriately selected regular expressions.

The basic idea I was trying to propose was using normalize-unicode to translate into decomposed normal form and then discarding modifier characters, and I think that's basically a sound approach. In fact a better solution might be

replace(normalize-unicode($in, 'NFKD'), '\p{Mn}', '')

but I'm sure that could be improved further.

Michael Kay
http://www.saxonica.com/
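
For anyone wanting to try the decompose-then-strip idea, here is a minimal sketch of it wrapped in an XSLT 2.0 stylesheet function. The function name f:flatten, the urn:example:flatten namespace, and the <out> wrapper element are made up for illustration and are not from the post. It uses the lowercase \p{Mn} category escape, which matches nonspacing marks; the uppercase \P{Mn} would match everything except them.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:flatten"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical helper: decompose to NFKD, then delete nonspacing marks (category Mn). -->
  <xsl:function name="f:flatten" as="xs:string">
    <xsl:param name="in" as="xs:string?"/>
    <xsl:sequence select="replace(normalize-unicode($in, 'NFKD'), '\p{Mn}', '')"/>
  </xsl:function>

  <!-- Demo: e-acute (U+00E9) decomposes to "e" plus U+0301, and the mark is then removed. -->
  <xsl:template match="/">
    <out><xsl:value-of select="f:flatten('caf&#xE9;')"/></out>
  </xsl:template>

</xsl:stylesheet>
```

Run against any input document with an XSLT 2.0 processor, the <out> element should contain "cafe". Characters that have no Unicode decomposition (for example U+00F8, o with stroke) pass through unchanged, which is one of the places where the approach could be improved further.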