Subject: Re: Character mapping in DSSSL/Jade From: "Stephen J. Tinney" <stinney@xxxxxxxxxxxxx> Date: Wed, 7 May 1997 13:13:23 -0400 (EDT) |
>> I am generating the separate d and x parts quite happily using modes,
>> and what I would like to do now is use Jade to map the UTF to the
>> ASCII subset used by the indexer.
>
> I'm not knowledgable about character sets and would be curious to know
> what this means. How do you map an arbitrary Kanji character to ASCII?
> Or do all of the Unicode characters already fit into the subset that
> Unicode has in common with ASCII?

Serves me right for trying to keep out what I thought to be unnecessary application-specific detail.

I work with Sumerian, and am indexing graphemes in transliteration. For display, I want to show the graphemes with various diacritics, including subscript numerals. For indexing, it is acceptable to ignore certain of the diacritics and map others onto otherwise unused characters (we use scaron for /sh/, but my indexer uses 'c' for this purpose; the identifiable phonological repertoire of Sumerian can be mapped onto about 20 roman characters). Subscript digits, used to disambiguate homophones, are treated as ordinary digits by the indexer.

So, to rephrase the question: can I turn "<U-0161>a<U-2083>" into "ca3" using some clever/efficient Jade?

It occurred to me, actually, that a better route might be to do the UTF-ASCII translation in the indexer scanner, but I'd still be curious to know the answer.

Steve

DSSSList info and archive: http://www.mulberrytech.com/dsssl/dssslist
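[Editorial note: the mapping described above — scaron to 'c', subscript digits to ordinary digits, so that "<U-0161>a<U-2083>" becomes "ca3" — can be sketched outside of Jade. A minimal Python illustration follows; the mapping table and function names are assumptions for demonstration, not part of the original post or of DSSSL:]

```python
# Sketch of the UTF -> ASCII index mapping described in the post.
# U+0161 (s caron) maps to the indexer's 'c'; subscript digits
# U+2080..U+2089 map to ordinary ASCII digits. Unmapped characters
# pass through unchanged. The table is illustrative, not exhaustive.

SUBSCRIPT_DIGITS = {chr(0x2080 + d): str(d) for d in range(10)}

INDEX_MAP = {
    "\u0161": "c",   # scaron -> 'c' for /sh/, per the indexer's convention
    **SUBSCRIPT_DIGITS,
}

def to_index_form(grapheme: str) -> str:
    """Map a transliterated grapheme to the indexer's ASCII subset."""
    return "".join(INDEX_MAP.get(ch, ch) for ch in grapheme)

print(to_index_form("\u0161a\u2083"))  # prints: ca3
```

[Doing this in the indexer's scanner, as the post suggests, amounts to the same table lookup applied one character at a time during tokenization.]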