Re: [xsl] White space strategies for mixed content

From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 5 Nov 2019 11:57:37 -0000
xsl:strip-space only affects the handling of whitespace-only text nodes. It
won't affect handling of whitespace within a text node that also contains
printable characters. Equally, indent=yes won't achieve what's required
(assuming we're talking about using the XML output method).

If you want to drop superfluous whitespace in text nodes with printable
content, then the safest strategy is probably to use replace(., '\s+', ' ') in
the template for text nodes -- that is, replace each run of whitespace
characters by a single space.
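
As a minimal sketch of that strategy (not a tested solution for the real input), the stylesheet below combines an identity template with a text-node template that collapses whitespace runs. It assumes an XSLT 2.0 processor, since replace() is an XPath 2.0 function, and it assumes that "root" is the only element-only container, as in the simple example quoted below. The explicit priority is there because text() and node() otherwise have the same default priority.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="no"/>

  <!-- Assumption: "root" holds only elements, so its
       whitespace-only text nodes can safely be dropped. -->
  <xsl:strip-space elements="root"/>

  <!-- Identity template: copy everything else unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Collapse each run of whitespace (spaces, tabs, newlines)
       to a single space. Leading and trailing runs are kept as
       one space, so word boundaries next to inline elements
       survive. -->
  <xsl:template match="text()" priority="1">
    <xsl:value-of select="replace(., '\s+', ' ')"/>
  </xsl:template>

</xsl:stylesheet>
```

For a real vocabulary, the strip-space list would need to name every element that has element-only content, and elements such as preformatted blocks would need their own text() template that leaves whitespace alone.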

Using normalize-space() isn't safe because "a <b>child</b>" will become
"a<b>child</b>".
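
To make the difference concrete, here is a small sketch that applies both functions to the string "a ", which stands in for the text node preceding <b> in that example. The variable name and the square brackets are only there for illustration; it assumes an XSLT 2.0 processor and can be run against any input document.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Stand-in for the text node before <b>: "a" plus one space. -->
    <xsl:variable name="t" select="'a '"/>

    <!-- normalize-space() also trims the trailing space: [a] -->
    <xsl:value-of select="concat('[', normalize-space($t), ']')"/>

    <!-- replace() only collapses runs, so the space survives: [a ] -->
    <xsl:value-of select="concat('[', replace($t, '\s+', ' '), ']')"/>
  </xsl:template>

</xsl:stylesheet>
```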

Of course, if you replace all sequences of whitespace characters by single
spaces you will end up with some very long lines in the XML. Wrapping those
into shorter lines in a second phase of processing is possible, though not
especially easy.

Michael Kay
Saxonica

> On 5 Nov 2019, at 01:00, Rick Quatro rick@xxxxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi All,
>
> I have inherited some "interesting" xml that has mixed content and I am
trying to figure out some strategies for getting "cleaner" output in my XSLT
workflow without removing any needed whitespace. In the simple example below,
I want to normalize the white space and let the serializer write out the
breaks where it wants to.
>
> Input:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <root>
>     <para>This is the title element with a
>     <emphasis>child</emphasis> element and<strong> another</strong>
>     one.</para>
> </root>
>
> Output:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <root><para>This is the title element with a <emphasis>child</emphasis>
element and<strong> another</strong> one.</para></root>
>
> In this example, I think <xsl:strip-space elements="root"/> would take care
of it with <xsl:output indent="no"/> but of course my actual input is more
complex.
>
> I have struggled a bit in my understanding of whitespace handling in XML and
XSLT so I may be missing something obvious. Thanks in advance for any advice.
>
> Rick
>
> Rick Quatro
> Carmen Publishing Inc.
> rick@xxxxxxxxxxxxxxx <mailto:rick@xxxxxxxxxxxxxxx>
> 585-729-6746
> www.frameexpert.com/store/ <http://www.frameexpert.com/store/>