Re: [xsl] Problem with xsl:number formatting

Subject: Re: [xsl] Problem with xsl:number formatting
From: "bryan rasmussen" <rasmussen.bryan@xxxxxxxxx>
Date: Tue, 14 Aug 2007 14:58:22 +0200
I think the relevant parts of section 7.7.1, which indicate that this
has to be implementation-defined, are as follows:
"Any other format token indicates a numbering sequence that starts
with that token. If an implementation does not support a numbering
sequence that starts with that token, it must use a format token of
1."

and

"When numbering with an alphabetic sequence, the lang attribute
specifies which language's alphabet is to be used; it has the same
range of values as xml:lang [XML]; if no lang value is specified, the
language should be determined from the system environment.
Implementers should document for which languages they support
numbering."

I think that the spec does not explicitly support formatting such as
E.1 and so forth; that is an assumption formed by our familiarity
with the numbering of lists in various types of documents. In the
example given I don't think one could say that IE (or MSXML) had a
bug.

HOWEVER --


It seems that IE (or MSXML) does not differentiate at all based on
language, and if we want to say there is a bug, it is that it does
not seem to honor letter-value correctly for the format token I.

For example, changing the earlier stylesheet to:

<?xml version='1.0'?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">


<xsl:template match="CSU_Category">
<xsl:number format="I" level="multiple" count="CSU_Category"
letter-value="alphabetic"/> CSU Category: <xsl:value-of
select="@Name"/>
<br/>
<xsl:apply-templates/>
</xsl:template>

<xsl:template match="CSU">
        <br/>
        <xsl:number level="multiple" format="I"   letter-value="alphabetic"
count="CSU_Category|CSU" /> CSU: <xsl:value-of select="@Name"/>
        <br/>
        <xsl:apply-templates/>
</xsl:template>


<xsl:template match="Template">
        <xsl:number level="multiple" format="I" letter-value="alphabetic"
count="CSU_Category|CSU|Template"/> SWInterface: <xsl:value-of
select="SWInterface_Name"/>
        <br/>
        <xsl:apply-templates/>
</xsl:template>


</xsl:stylesheet>

outputs

I CSU Category: Interface<br />
    <br />I.I CSU: Analog Input Interface<br />
      I.I.I SWInterface: <br />


    <br />I.II CSU: Analog Output Interface<br />
      I.II.I SWInterface: <br />

and so on and so forth. The output is the same with
letter-value="traditional".

However, the spec says the following:

"A format token i generates the sequence i ii iii iv v vi vii viii ix x ....

A format token I generates the sequence I II III IV V VI VII VIII IX X ....

Any other format token indicates a numbering sequence that starts with
that token. If an implementation does not support a numbering sequence
that starts with that token, it must use a format token of 1.

The letter-value attribute disambiguates between numbering sequences
that use letters. In many languages there are two commonly used
numbering sequences that use letters. One numbering sequence assigns
numeric values to letters in alphabetic sequence, and the other
assigns numeric values to each letter in some other manner traditional
in that language. In English, these would correspond to the numbering
sequences specified by the format tokens a and i. In some languages,
the first member of each sequence is the same, and so the format token
alone would be ambiguous. A value of alphabetic specifies the
alphabetic sequence; a value of traditional specifies the other
sequence. If the letter-value attribute is not specified, then it is
implementation-dependent how any ambiguity is resolved."

Given this, I suppose that if letter-value is given as alphabetic and
the lang attribute is en, en-GB, or en-US (or is absent), and the
implementation supports an alphabetic numbering sequence starting at I
for that language (or for any Western European language), then the
result should be:

I, J, K and so forth (obviously with periods placed in the appropriate
spots and all that, but the main thing is that it should be
alphabetic).

If the en, en-GB, or en-US lang is not supported, or no numbering
sequence starting at I is defined, then the result should be

1, 2, 3.
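
To check which of these two outcomes a given processor produces, one
could set lang explicitly. This is only a minimal sketch, assuming the
same CSU_Category input as the earlier stylesheet; the lang="en"
attribute is my addition for the test, and what comes out is up to the
implementation:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask explicitly for the alphabetic sequence of English, starting at I.
       A processor that supports it would continue I, J, K; one that does
       not is told by the spec to fall back to a format token of 1. -->
  <xsl:template match="CSU_Category">
    <xsl:number format="I" letter-value="alphabetic" lang="en"
                level="multiple" count="CSU_Category"/>
    <xsl:text> CSU Category: </xsl:text>
    <xsl:value-of select="@Name"/>
    <br/>
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```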



However, IE seems to use the traditional numbering (which I take it is
not the traditional numbering in at least some non-European languages)
whether one sets the language to one of these three or does not set
it at all.

I think this can be described as not following the spec.



By the way, is there a markup format that allows you to define the
sorting and numbering sequences of the language?


Cheers,
Bryan Rasmussen









On 8/9/07, Michael Kay <mike@xxxxxxxxxxxx> wrote:
> >
> > So, in my mind, 'E' is a valid token on which to start.
> >
> You're definitely in implementation-defined territory here. An
> implementation may or may not support a numbering sequence that starts with
> 'E'; if it does so, it's implementation defined whether that sequence goes
> (E, F, G, H, I) or (E, G, B, D, F).
>
> Saxon will allow format="E" and continue E, F, G, H, I, but there's
> certainly nothing in the spec to require it.
>
> I think the only reliable way to do this across implementations is to
> compute the section number (either using xsl:number or an expression such as
> count(preceding-sibling::x), add 4 to rebase it, and then format it using
> <xsl:number value="x" format="A"/>.
>
> Michael Kay
> http://www.saxonica.com/
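
The rebasing approach Michael describes could be sketched roughly as
follows. This assumes the top-level CSU_Category elements are siblings
and that the first one should be numbered E; the "+ 1" turns the
zero-based sibling count into a section number before 4 is added:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="CSU_Category">
    <!-- Section number (1 for the first sibling) plus 4 gives 5,
         which format="A" renders as E; the next sibling gives F. -->
    <xsl:number value="count(preceding-sibling::CSU_Category) + 1 + 4"
                format="A"/>
    <xsl:text> CSU Category: </xsl:text>
    <xsl:value-of select="@Name"/>
    <br/>
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

Because only the format token A is used, this does not depend on
whether the processor supports a sequence starting at E.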
