RE: [xsl] xsl:number

Subject: RE: [xsl] xsl:number
From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Mon, 17 Mar 2003 09:25:55 -0000
Here be dragons.

I agree with you that the specification of numbering sequences is very
weak. In my view it's a classic case of "benign cultural imperialism" -
the spec authors wanted to make it fully international and localisable,
but since they were a bunch of Americans plus the odd expatriate
European, they didn't really have much idea in detail how to go about
it. This situation hasn't really changed in the 2.0 working group, and
the same problem has also made it difficult to agree a spec for
format-date().

As regards the specific questions, I think the result is that
implementors have a pretty free hand to do whatever they think is right.
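
To make that concrete, here is a minimal sketch of one possible reading of format="A", in which the alphabet is simply a parameter the implementor chooses. It is not taken from the spec or from any processor; the function name alpha_number and the alphabet strings are hypothetical, and the "B" alphabet at the end only illustrates Mike's conjecture rather than settling it.

```python
def alpha_number(n: int, alphabet: str = "ABCDEFGHIJKLMNOPQRSTUVWXYZ") -> str:
    """Format a positive integer as A, B, ..., Z, AA, AB, ... over `alphabet`.

    This is bijective base-k numbering, where k is the alphabet length:
    there is no zero digit, so the first letter stands for 1.
    """
    if n < 1:
        raise ValueError("numbering starts at 1")
    base = len(alphabet)
    letters = []
    while n > 0:
        # Subtract 1 before dividing so that the last letter maps to `base`
        # rather than rolling over to a non-existent zero digit.
        n, rem = divmod(n - 1, base)
        letters.append(alphabet[rem])
    return "".join(reversed(letters))


# Assumed alphabets, for illustration only.
ENGLISH = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
SWEDISH = "ABCDEFGHIJKLMNOPQRSTUVWXYZÅÄÖ"
FROM_B = "BCDEFGHIJKLMNOPQRSTUVWXYZ"  # the conjectured meaning of format="B"

print([alpha_number(i, ENGLISH) for i in (1, 26, 27, 28)])
print([alpha_number(i, SWEDISH) for i in (27, 29, 30)])
print([alpha_number(i, FROM_B) for i in (1, 25, 26)])
```

Under this reading the same routine serves English, Swedish, or a sequence starting at "B"; the only thing the spec would need to pin down is which alphabet a given format token and lang value select.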

On collating sequences the group has adopted a different approach: leave
it all to the implementor. This is probably wiser, since implementors
who want to sell their product in a particular geographical market
probably have access to local information about the requirements of that
market. (Well, perhaps this is being optimistic - for years US vendors
produced collating sequences for German which were approved by the
grammar textbooks but had long since been superseded in popular use;
and contrariwise, Microsoft spell-checkers still tell me that "-ize"
endings are not allowed in the UK, when the OED insists that they
are...)

Michael Kay

> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx 
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of Mike Brown
> Sent: 17 March 2003 05:50
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] xsl:number
> 
> 
> I have questions about xsl:number. This is the most poorly 
> specified instruction I've come across. It's really hard to 
> even know what questions to ask.
> 
> The way I interpret the XSLT 1.0 spec (and the 2.0 draft 
> doesn't help),
> 
>   <xsl:number format="A"/>
> 
> must be supported, and it must produce something from the sequence
> 
>   A, B, C, ..., Z, AA, AB, AC, ...
> 
> where A=1, B=2, etc.
>  
> The way it is specified, it seems to indicate that the 
> alphabet must be the English alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZ.
> 
> Or perhaps it could be any alphabet that starts with ABC and 
> ends with Z, like the Spanish alphabet, which varies 
> depending on who you ask, but for computing purposes I think 
> is generally ABCDEFGHIJKLMNÑOPQRSTUVWXYZ.
> 
> Or perhaps everything after "A" is just an example, meaning 
> that it very well could be the Swedish alphabet: 
> ABCDEFGHIJKLMNOPQRSTUVWXYZÅÄÖ ... or perhaps Vietnamese, 
> which starts with A and has no Z.
> 
> Anyway, the implication is that a processor must support some 
> alphabet that contains "A". Or is "A" just a placeholder for 
> any alphabetic character?
> 
> "When numbering with an alphabetic sequence, the lang 
> attribute specifies which language's alphabet is to be used; 
> it has the same range of values as xml:lang [XML]; if no lang 
> value is specified, the language should be determined from 
> the system environment."
> 
> It seems to me that if format="A", then the value of lang, 
> whether determined by the processor or specified in the 
> stylesheet, must be a language that contains "A".
> 
> What happens if the processor supports both English and 
> Hebrew, and I do something like
> 
>   <xsl:number format="A" lang="he"/>
> 
> ? Or for that matter,
> 
>   <!-- #1488 = Hebrew letter Aleph -->
>   <xsl:number format="&#1488;" lang="en"/>
>   
> ?
> 
> What does
> 
>   <xsl:number format="B"/>
> 
> mean? At the very least, I know "B" must represent 1. If the 
> default language is English, does this mean the sequence must be
> 
>   B, C, D, ..., Z, BB, BC, BD, ...
> 
> ?
> 
> The spec also says format="I" must be supported by using 
> Roman numerals. What does format="I" mean when the language 
> is not English?
> 
> The spec says "In many languages there are two commonly used 
> numbering sequences that use letters. One numbering sequence 
> assigns numeric values to letters in alphabetic sequence, and 
> the other assigns numeric values to each letter in some other 
> manner traditional in that language. In English, these would 
> correspond to the numbering sequences specified by the format 
> tokens a and i."
> 
> This seems to indicate that using "I" for Roman is a 
> "traditional" English convention, and (reading further) that 
> I could use letter-value="alphabetic" to override this 
> interpretation. If my theory about format="B" is correct, 
> then format="I" with letter-value="alphabetic" would result 
> in I, J, K, ... sequences.
> 
> I don't know. I have more questions, but I'll just stop here. 
> I really hope this stuff gets cleared up in 2.0, although 
> that doesn't help me much in trying to properly implement 1.0.
> 
> Mike
> 
> -- 
>   Mike J. Brown   |  http://skew.org/~mike/resume/
>   Denver, CO, USA |  http://skew.org/xml/
> 
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
> 



