Subject: Re: [xsl] For-each adds whitespace per iteration: why?
From: Eliot Kimber <ekimber@xxxxxxxxxx>
Date: Fri, 10 Jan 2014 14:55:28 -0600

I think too I was still laboring under the XSLT 1 definition of
<xsl:value-of/>. Taking the time to re-read the definition of value-of in
XSLT 2 I see that it explicitly generates text nodes. I don't think I ever
realized that before.

This changes everything (at least in the way I approach constructing
result text).

Cheers,

Eliot

--
Eliot Kimber
Senior Solutions Architect
"Bringing Strategy, Content, and Technology Together"
Main: 512.554.9368
www.reallysi.com
www.rsuitecms.com

On 1/10/14, 12:59 PM, "Eliot Kimber" <ekimber@rsicms.com> wrote:

>Ah, that explains it. I have gotten in the habit of preferring
><xsl:sequence> over <xsl:value-of> but this is apparently one place where
>I should not have.
>
>A subtle aspect of the spec.
>
>I'm glad my understanding of "concatenation" in this context was
>incorrect.
>
>The key bit from 5.7.1 appears to be step 3 of the sequence processing
>rules:
>
>"3. Any consecutive sequence of strings within the result sequence is
>converted to a single text node, whose string value
><http://www.w3.org/TR/xslt20/#dt-string-value> contains the content of
>each of the strings in turn, with a single space (#x20) used as a
>separator between successive strings."
>
>That makes things clear.
>
>Cheers,
>
>Eliot
>
>On 1/10/14, 11:27 AM, "Michael Kay" <mike@saxonica.com> wrote:
>
>>
>>On 10 Jan 2014, at 17:05, Eliot Kimber <ekimber@rsicms.com> wrote:
>>
>>> In the context of writing an XSLT to generate DTD syntax from RNGs (for
>>> DITA 1.3) I discovered that for-each results in whitespace being
>>> emitted for each iteration. This came as a surprise. Reading the spec
>>> it says, under clause 7, Repetition:
>>>
>>> "For each item in the input sequence, evaluating the sequence
>>> constructor <http://www.w3.org/TR/xslt20/#dt-sequence-constructor>
>>> produces a sequence of items (see 5.7 Sequence Constructors
>>> <http://www.w3.org/TR/xslt20/#sequence-constructors>). These output
>>> sequences are concatenated; ...
>>>
>>> I understand "These output sequences are concatenated" to mean that
>>> string concatenation rules are applied, which explains the white space.
>>
>>No, this is a concatenation of two or more sequences to produce a single
>>sequence. No whitespace is added at this point.
>>
>>>
>>> My question: why is for-each defined in this way?
>>
>>It isn't.
>>
>>>
>>> I tested this with this little XSLT transform:
>>>
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>>>     xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>>     xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
>>>     exclude-result-prefixes="xs xd"
>>>     version="2.0">
>>>
>>>   <xsl:output method="text"/>
>>>
>>>   <xsl:template name="test-for-each">
>>>     <xsl:variable name="strings" select="('one', 'two', 'three', 'four')"/>
>>> value-of $strings=<xsl:value-of select="$strings"/>
>>> for $str in $strings return concat('/', $str, '/')=<xsl:sequence select="for $str in $strings return concat('/', $str, '/')"/>
>>> string-join($strings, '')=<xsl:sequence select="string-join($strings, '')"/>
>>> for-each over strings: "<xsl:for-each select="$strings">
>>>       <xsl:sequence select="concat('/',.,'/')"/>
>>>     </xsl:for-each>"
>>>   </xsl:template>
>>>
>>> </xsl:stylesheet>
>>>
>>> Which produces this output using Saxon 9.5.1.2:
>>>
>>> value-of $strings=one two three four
>>> for $str in $strings return concat('/', $str, '/')=/one/ /two/ /three/ /four/
>>> string-join($strings, '')=onetwothreefour
>>> for-each over strings: "/one/ /two/ /three/ /four/"
>>>
>>
>>The whitespace is being added as part of the process of constructing your
>>final result tree from a sequence of strings. The result tree is
>>constructed as a document node, following the rules of 5.7.1 Constructing
>>Complex Content
>>
>>http://www.w3.org/TR/2009/PER-xslt20-20090421/#constructing-complex-content
>>
>>or equivalently the rules applied by the Serializer
>>
>>http://www.w3.org/TR/xslt-xquery-serialization/#serdm
>>
>>The simplest way to avoid the space separation is to construct text nodes
>>rather than strings, which happens if you replace xsl:sequence by
>>xsl:value-of in
>>
>>> <xsl:sequence select="concat('/',.,'/')"/>
>>
>>Michael Kay
>>Saxonica
>>
>>>
>>> I see that the for-each result is consistent with the FLWOR expression.
>>>
>>> Is my analysis correct that the only way to construct a string with no
>>> extra whitespace using a loop is to use string-join() as in my test
>>> case?
>>>
>>> For my DTD-generation application that would mean using the for-each
>>> loop to construct a sequence of strings and then using string-join on
>>> the sequence to avoid additional whitespace. Of course I can simply
>>> account for the space inserted by the concatenation and get the correct
>>> indentation and keep my code a bit simpler.
>>>
>>> Cheers,
>>>
>>> Eliot
>>>
>>> --~------------------------------------------------------------------
>>> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
>>> To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
>>> or e-mail: <mailto:xsl-list-unsubscribe@lists.mulberrytech.com>
>>> --~--
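
For reference, the difference discussed in this thread can be sketched as a variant of the test stylesheet above. This is only an illustration, not code posted by either author: the output labels (`sequence:`, `value-of:`, `string-join:`) are made up for this example, and it assumes an XSLT 2.0 processor that can start at a named template (with Saxon, the `-it:test-for-each` option).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">

  <xsl:output method="text"/>

  <!-- Entry point: run as the initial named template. -->
  <xsl:template name="test-for-each">
    <xsl:variable name="strings" select="('one', 'two', 'three', 'four')"/>

    <!-- xsl:sequence returns atomic strings. Under 5.7.1 step 3, adjacent
         strings are merged into one text node with a single space between them. -->
    <xsl:text>sequence: </xsl:text>
    <xsl:for-each select="$strings">
      <xsl:sequence select="concat('/', ., '/')"/>
    </xsl:for-each>
    <xsl:text>&#10;</xsl:text>

    <!-- xsl:value-of creates a text node on each iteration. Adjacent text
         nodes are merged without any separator. -->
    <xsl:text>value-of: </xsl:text>
    <xsl:for-each select="$strings">
      <xsl:value-of select="concat('/', ., '/')"/>
    </xsl:for-each>
    <xsl:text>&#10;</xsl:text>

    <!-- string-join() yields a single string, so there are no adjacent
         strings for the space separator to be inserted between. -->
    <xsl:text>string-join: </xsl:text>
    <xsl:sequence
        select="string-join(for $s in $strings return concat('/', $s, '/'), '')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Under the rules quoted above, this should produce:

    sequence: /one/ /two/ /three/ /four/
    value-of: /one//two//three//four/
    string-join: /one//two//three//four/

So string-join() is one way to avoid the separator, but not the only one: emitting text nodes with xsl:value-of inside the loop has the same effect.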