Subject: RE: [xsl] Flattening characters to plain latin
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Sat, 17 Feb 2007 17:22:31 -0000

> My verdict: If the 'lt' of Michael was on purpose, I still
> want to grant him the "Best Original Software Snippet Based
> On Any XXX* Language" ;-)
I think the original problem wasn't especially well specified, and I was
well aware that retaining all the characters below 127 while losing those
above was a pretty crude cutoff. Given that, the decision whether to keep
or lose 127 itself is neither here nor there. Almost certainly a better
solution is to discard only the characters in particular Unicode groups,
which should be possible using replace() with appropriately selected
regular expressions. The basic idea I was trying to propose was to use
normalize-unicode() to translate into a decomposed normal form and then
discard the modifier (combining) characters, and I think that's basically
a sound approach.
In fact a better solution might be
replace(normalize-unicode($in, 'NFKD'), '\p{Mn}', '')
but I'm sure that could be improved further.
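
As a rough illustration of the decompose-then-strip idea, here is a minimal
XSLT 2.0 sketch. The function name f:flatten, its namespace URI and the
"main" entry template are made up for the example and are not part of any
library; the sample string is written with character references so the
stylesheet's encoding doesn't matter.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:flatten"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <!-- f:flatten is a hypothetical helper name chosen for this sketch -->
  <xsl:function name="f:flatten" as="xs:string">
    <xsl:param name="in" as="xs:string?"/>
    <!-- NFKD splits e.g. U+00E9 into 'e' followed by U+0301;
         \p{Mn} (lower-case p) then matches the non-spacing marks,
         and replace() removes them -->
    <xsl:sequence
        select="replace(normalize-unicode($in, 'NFKD'), '\p{Mn}', '')"/>
  </xsl:function>

  <!-- Entry point, assuming the processor lets you start
       at a named template -->
  <xsl:template name="main">
    <!-- The input is "creme brulee" with grave, circumflex and
         acute accents; the output is plain "creme brulee" -->
    <xsl:value-of select="f:flatten('cr&#xE8;me br&#xFB;l&#xE9;e')"/>
  </xsl:template>

</xsl:stylesheet>
```

Note that characters with no decomposition, such as U+00F8 (o with stroke)
or U+00DF (sharp s), pass through unchanged, which is one of the places
where this could be improved.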
Michael Kay
http://www.saxonica.com/