Re: [xsl] For-each adds whitespace per iteration: why?

Subject: Re: [xsl] For-each adds whitespace per iteration: why?
From: Eliot Kimber <ekimber@xxxxxxxxxx>
Date: Fri, 10 Jan 2014 14:55:28 -0600
I think I was also still laboring under the XSLT 1 definition of
<xsl:value-of/>. Taking the time to re-read the definition of value-of in
XSLT 2, I see that it explicitly generates text nodes. I don't think I ever
realized that before.

This changes everything (at least in the way I approach constructing
result text).
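
A minimal sketch of the difference, assuming an XSLT 2.0 processor that can
start from a named template (for example Saxon's -it option). The template
name here is hypothetical and not part of the original test case. The first
loop returns strings via xsl:sequence, so the 5.7.1 rules put a single space
between them; the second creates a text node per iteration via xsl:value-of,
and adjacent text nodes are merged with no separator.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0">

  <xsl:output method="text"/>

  <!-- Hypothetical template name, used only for this illustration. -->
  <xsl:template name="strings-vs-text-nodes">
    <xsl:variable name="strings" select="('one', 'two', 'three', 'four')"/>

    <!-- xsl:sequence returns atomic strings. When the result tree is
         built, consecutive strings become one text node with a single
         space between successive strings. -->
    <xsl:text>sequence: </xsl:text>
    <xsl:for-each select="$strings">
      <xsl:sequence select="concat('/', ., '/')"/>
    </xsl:for-each>
    <xsl:text>&#10;</xsl:text>

    <!-- xsl:value-of creates a text node on each iteration. Adjacent
         text nodes are merged without any separator. -->
    <xsl:text>value-of: </xsl:text>
    <xsl:for-each select="$strings">
      <xsl:value-of select="concat('/', ., '/')"/>
    </xsl:for-each>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With the text output method this should print "sequence: /one/ /two/ /three/
/four/" on the first line and "value-of: /one//two//three//four/" on the
second.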

Cheers,

Eliot
--
Eliot Kimber
Senior Solutions Architect
"Bringing Strategy, Content, and Technology Together"
Main: 512.554.9368
www.reallysi.com
www.rsuitecms.com




On 1/10/14, 12:59 PM, "Eliot Kimber" <ekimber@rsicms.com> wrote:

>Ah, that explains it. I have gotten in the habit of preferring
><xsl:sequence> over <xsl:value-of> but this is apparently one place where
>I should not have.
>
>A subtle aspect of the spec.
>
>I'm glad my understanding of "concatenation" in this context was
>incorrect.
>
>The key bit from 5.7.1 appears to be step 3 of the sequence processing
>rules:
>
>"3. Any consecutive sequence of strings within the result sequence is
>converted to a single text node, whose string value
><http://www.w3.org/TR/xslt20/#dt-string-value> contains the content of
>each of the strings in turn, with a single space (#x20) used as a
>separator between successive strings."
>
>That makes things clear.
>
>Cheers,
>
>Eliot
>
>
>On 1/10/14, 11:27 AM, "Michael Kay" <mike@saxonica.com> wrote:
>
>>
>>On 10 Jan 2014, at 17:05, Eliot Kimber <ekimber@rsicms.com> wrote:
>>
>>> In the context of writing an XSLT to generate DTD syntax from RNGs (for
>>> DITA 1.3) I discovered that for-each results in whitespace being emitted
>>> for each iteration. This came as a surprise. Reading the spec it says,
>>> under clause 7, Repetition:
>>>
>>> "For each item in the input sequence, evaluating the sequence constructor
>>> <http://www.w3.org/TR/xslt20/#dt-sequence-constructor> produces a sequence
>>> of items (see 5.7 Sequence Constructors
>>> <http://www.w3.org/TR/xslt20/#sequence-constructors>). These output
>>> sequences are concatenated; ..."
>>>
>>> I understand "These output sequences are concatenated" to mean that string
>>> concatenation rules are applied, which explains the white space.
>>
>>No, this is a concatenation of two or more sequences to produce a single
>>sequence. No whitespace is added at this point.
>>
>>>
>>> My question: why is for-each defined in this way?
>>
>>It isn't.
>>>
>>>
>>> I tested this with this little XSLT transform:
>>>
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>>>  xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>>  xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
>>>  exclude-result-prefixes="xs xd"
>>>  version="2.0">
>>>
>>>  <xsl:output method="text"/>
>>>
>>>  <xsl:template name="test-for-each">
>>>    <xsl:variable name="strings" select="('one', 'two', 'three', 'four')"/>
>>> value-of $strings=<xsl:value-of select="$strings"/>
>>> for $str in $strings return concat('/', $str, '/')=<xsl:sequence
>>> select="for $str in $strings return concat('/', $str, '/')"/>
>>> string-join($strings, '')=<xsl:sequence select="string-join($strings, '')"/>
>>> for-each over strings: "<xsl:for-each select="$strings">
>>>  <xsl:sequence select="concat('/',.,'/')"/>
>>> </xsl:for-each>"
>>>  </xsl:template>
>>>
>>> </xsl:stylesheet>
>>>
>>>
>>>
>>> Which produces this output using Saxon 9.5.1.2:
>>>
>>> value-of $strings=one two three four
>>> for $str in $strings return concat('/', $str, '/')=/one/ /two/ /three/ /four/
>>> string-join($strings, '')=onetwothreefour
>>> for-each over strings: "/one/ /two/ /three/ /four/"
>>>
>>The whitespace is being added as part of the process of constructing your
>>final result tree from a sequence of strings. The result tree is
>>constructed as a document node, following the rules of 5.7.1 Constructing
>>Complex Content
>>
>>http://www.w3.org/TR/2009/PER-xslt20-20090421/#constructing-complex-content
>>
>>or equivalently the rules applied by the Serializer
>>
>>http://www.w3.org/TR/xslt-xquery-serialization/#serdm
>>
>>The simplest way to avoid the space separation is to construct text nodes
>>rather than strings, which happens if you replace xsl:sequence by
>>xsl:value-of in
>>
>>> <xsl:sequence select="concat('/',.,'/')"/>
>>
>>Michael Kay
>>Saxonica
>>
>>>
>>> I see that the for-each result is consistent with the FLWOR expression.
>>>
>>> Is my analysis correct that the only way to construct a string with no
>>> extra whitespace using a loop is to use string-join() as in my test case?
>>>
>>> For my DTD-generation application that would mean using the for-each loop
>>> to construct a sequence of strings and then using string-join on the
>>> sequence to avoid additional whitespace. Of course I can simply account
>>> for the space inserted by the concatenation and get the correct
>>> indentation and keep my code a bit simpler.
>>>
>>> Cheers,
>>>
>>> Eliot
>>>
>>> --~------------------------------------------------------------------
>>> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
>>> To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
>>> or e-mail: <mailto:xsl-list-unsubscribe@lists.mulberrytech.com>
>>> --~--
>>>
