Subject: Re: [xsl] xsl:number question (XSLT 1.0)
Date: Fri, 15 Apr 2005 10:22:07 -0500
The relevant chunk of the spec (from http://www.w3.org/TR/xslt#number):

-------------------
The following attributes are used to control conversion of a list of numbers into a string. The numbers are integers greater than 0. The attributes are all optional.

The main attribute is format. The default value for the format attribute is 1. The format attribute is split into a sequence of tokens where each token is a maximal sequence of alphanumeric characters or a maximal sequence of non-alphanumeric characters. Alphanumeric means any character that has a Unicode category of Nd, Nl, No, Lu, Ll, Lt, Lm or Lo. The alphanumeric tokens (format tokens) specify the format to be used for each number in the list. If the first token is a non-alphanumeric token, then the constructed string will start with that token; if the last token is non-alphanumeric token, then the constructed string will end with that token. Non-alphanumeric tokens that occur between two format tokens are separator tokens that are used to join numbers in the list. The nth format token will be used to format the nth number in the list. If there are more numbers than format tokens, then the last format token will be used to format remaining numbers. If there are no format tokens, then a format token of 1 is used to format all numbers. The format token specifies the string to be used to represent the number 1. Each number after the first will be separated from the preceding number by the separator token preceding the format token used to format that number, or, if there are no separator tokens, then by . (a period character).
--------------------------------

Reading this, I'd say Xalan has it right.

"If the first token is a non-alphanumeric token, then the constructed string will start with that token; if the last token is non-alphanumeric token, then the constructed string will end with that token." makes it pretty clear that your example should start with "(" and end with ")".

"Each number after the first will be separated from the preceding number by the separator token preceding the format token used to format that number, or, if there are no separator tokens, then by . (a period character)." makes it pretty clear that your numbers should be separated by periods, since you specified no separator.

I can see where Mike Kay got his implementation, though: "separated from the preceding number by the separator token preceding the format token used to format that number". However, the "after the first" part makes me think that the opening "(" should not apply to numbers after the first. That, combined with the last sentence from that paragraph in the spec, makes me think that (1.2.1.1) is the right output.

As I read this, to get the output that Saxon produced, you'd have to specify "(1(1)", and the fully specified string for what Xalan produced would be "(1.1)".

I think a clarifying sentence would help for the case where only one number is present in the format string but multiple numbers are to be formatted. Perhaps something like "When the format string contains only one numeric position but the output will be multiple numeric values, the separator should be . (a period character)."

By the way, I do not wish to imply that I think ill of the spec or its authors because of this problem. It's very hard to write something sufficiently generic and still anticipate every case. It's easy to say that a clarifying sentence would help after the problem has arisen. It's a much harder writing task to anticipate the problem in the first place and write the spec to cover it. It's a wonder these kinds of things don't pop up more often.

My $.02.
Jay Bryant
Bryant Communication Services
(presently consulting at Synergistic Solution Technologies)

Jack Matheson <jack@xxxxxxxxxxxxxx>
04/15/2005 09:31 AM
Please respond to xsl-list@xxxxxxxxxxxxxxxxxxxxxx

To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
cc:
Subject: [xsl] xsl:number question (XSLT 1.0)

According to the spec, when a sequence number contains more values than there are formatting tokens, the last formatting token is used for the excess values. Unfortunately, it is a little vague on which separator token to use with the excess values. It says that a '.' is to be used if no separator token exists, but does this also apply to the case where the final formatting token is re-used with excess sequence values?

Here is a quick test I did to try and see how different processors are handling this:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="a/b/c/d">
    <xsl:number level="multiple" count="*" format="(1)"/>
  </xsl:template>
</xsl:stylesheet>

If my input document is...

<?xml version="1.0"?>
<a><b><c><d/></c></b></a>

...then Saxon produces this:

(1(2(1(1)

...while Xalan produces this:

(1.2.1.1)

Both answers seem perfectly reasonable to me, given the lack of clarity in the 1.0 spec. Can anyone help me figure out which is (more) correct?
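
One possible way to sidestep the ambiguity follows from Jay's observation that the fully specified format strings would be "(1.1)" and "(1(1)": write the separator into the format attribute, so that the repeated last format token always has a separator token in front of it. The stylesheet below is a minimal sketch along those lines and was not posted in the thread. The xsl:output declaration and the line break between the two results are additions for readability, and it assumes the same input document as the original test.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Assumption: plain-text output, so the formatted numbers are easy to compare. -->
  <xsl:output method="text"/>

  <xsl:template match="a/b/c/d">
    <!-- Explicit "." separator: tokens are "(", "1", ".", "1", ")". -->
    <xsl:number level="multiple" count="*" format="(1.1)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Explicit "(" separator: tokens are "(", "1", "(", "1", ")". -->
    <xsl:number level="multiple" count="*" format="(1(1)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With the separator written out, the spec's rule ("the separator token preceding the format token used to format that number") should apply the same way to the excess numbers under either reading, so processors would be expected to agree on each line: periods between the numbers on the first, opening parentheses on the second.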