Re: [xsl] problem with whitespace in mixed content (reverse indentation)

From: "Wolfhart Totschnig wolfhart.totschnig@xxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 27 Feb 2015 16:02:28 -0000
Dear Martin,

Thank you for the prompt reply! It didn't occur to me that I could formulate my conditions using the sibling axes. Following your suggestion, I found the following solution to my problem:

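<xsl:template match="text()">
  <xsl:choose>
    <!-- whitespace-only text with element siblings on both sides: keep a single space -->
    <xsl:when test="matches(.,'^\s+$') and preceding-sibling::* and following-sibling::*">
      <xsl:text> </xsl:text>
    </xsl:when>
    <!-- whitespace-only text with no element sibling before it, or none after it: drop it -->
    <xsl:when test="matches(.,'^\s+$') and (not(preceding-sibling::*) or not(following-sibling::*))"/>
    <!-- all other text is copied unchanged -->
    <xsl:otherwise>
      <xsl:value-of select="."/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>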
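
For anyone who wants to run this, here is a minimal sketch of a complete stylesheet around the template above. The identity template, the indent="no" setting and the restriction of the match pattern to text//text() are assumptions added for illustration; they are not part of the solution as posted. XSLT 2.0 is needed because of matches().

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- indent="no" so that the serializer does not reintroduce the problem -->
  <xsl:output method="xml" indent="no"/>

  <!-- Identity template (assumed, not from the message):
       copies every node that no other template handles. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Same tests as in the message, but restricted (an assumption)
       to text nodes inside <text> elements. -->
  <xsl:template match="text//text()">
    <xsl:choose>
      <xsl:when test="matches(.,'^\s+$') and preceding-sibling::* and following-sibling::*">
        <xsl:text> </xsl:text>
      </xsl:when>
      <xsl:when test="matches(.,'^\s+$') and (not(preceding-sibling::*) or not(following-sibling::*))"/>
      <xsl:otherwise>
        <xsl:value-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The pattern text//text() has a default priority of 0.5, so it wins over the identity template without an explicit priority attribute; a bare text() pattern has the same default priority (-0.5) as node() and would conflict with it. Applied to the indented example from the original question, this should give back

<text><i>Italicized</i> normal <i>italicized</i> <b>bold</b> <i><b>italicized and bold</b></i>.</text>

which matches the original data except that the line break between </b> and <i> comes back as a single space.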


Thanks again,
Wolfhart


On 02/27/2015 03:29 PM, Martin Honnen martin.honnen@xxxxxx wrote:
Wolfhart Totschnig wolfhart.totschnig@xxxxxxxxxxx wrote:
Hello,

I have a problem with whitespace in mixed content to which I cannot find
the solution. I am hoping that one of you can help me.

Due to imprudent use of indent="yes" on <xsl:output>, the whitespace in
the mixed-content elements of my data got messed up. I'll best explain
the problem with an example.

The following original data

<text><i>Italicized</i> normal <i>italicized</i> <b>bold</b>
<i><b>italicized and bold</b></i>.</text>

now looks like this:

<text>
    <i>Italicized</i> normal <i>italicized</i>
    <b>bold</b>
    <i>
       <b>italicized and bold</b>
    </i>.</text>

I have found out how to avoid the indentation in mixed-content elements
in the future, namely by using saxon:suppress-indentation. My question
is how I can return the modified data back to its original form, i.e.,
reverse the indentation. The task can be formulated thus: In <text>
elements, whitespace-only text nodes that are situated between two start
tags should be eliminated, while whitespace-only text nodes that are
situated between an end tag and a start tag should be replaced by a
single space. How can that be done?

I know that I can select whitespace-only text nodes with
test="matches(.,'^\s+$')". But how can I test whether the preceding tag
is a start tag or an end tag?



You are dealing with text nodes in a tree, contained in element nodes; you are not dealing with tags.
So you need to translate the conditions into ones based on the tree model. I think the condition


> In <text>
> elements, whitespace-only text nodes that are situated between two start
> tags should be eliminated,


can only occur if the text node is the first child of an element node and is followed by an element node, e.g.

<xsl:template match="text//text()[not(normalize-space())
and . is ../node()[1]
and following-sibling::node()[1][self::*]]"/>
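
The other condition from the question (whitespace between an end tag and a start tag becomes a single space) can be translated to the tree model in the same way: the text node has an element as its immediately preceding sibling and an element as its immediately following sibling. What follows is only a sketch under that reading, not part of the message above. The identity template, the indent="no" setting and the extra rule for whitespace between two end tags (needed for the indented <i> in the example) are assumptions added for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="no"/>

  <!-- Identity template (assumed): copy whatever the rules below do not match. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- From the message: whitespace-only first child followed by an element
       (between two start tags) is removed. -->
  <xsl:template match="text//text()[not(normalize-space())
                       and . is ../node()[1]
                       and following-sibling::node()[1][self::*]]"/>

  <!-- Assumed companion: whitespace-only text directly between two sibling
       elements (between an end tag and a start tag) becomes one space. -->
  <xsl:template match="text//text()[not(normalize-space())
                       and preceding-sibling::node()[1][self::*]
                       and following-sibling::node()[1][self::*]]">
    <xsl:text> </xsl:text>
  </xsl:template>

  <!-- Assumed companion: whitespace-only last child directly after an element
       (between two end tags) is removed. -->
  <xsl:template match="text//text()[not(normalize-space())
                       and . is ../node()[last()]
                       and preceding-sibling::node()[1][self::*]]"/>

</xsl:stylesheet>
```

The three predicates are mutually exclusive, so no text node matches more than one of these rules, and all of them take precedence over the identity template by default priority. Testing the immediate sibling with node()[1][self::*] is stricter than preceding-sibling::* or following-sibling::*, which are true whenever any element sibling exists on that side, even with other text in between.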
