Re: [xsl] using normalize-space with mixed element content

Subject: Re: [xsl] using normalize-space with mixed element content
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Tue, 08 Jun 2010 23:37:07 +0100

A crude way would be to use normalize-space() only if there are no element children:

    <xsl:template match="title-group/article-title[not(*)]" mode="none">
        <xsl:value-of select="normalize-space(.)"/>
    </xsl:template>

    <xsl:template match="title-group/article-title[*]" mode="none">
        <xsl:copy-of select="."/>
    </xsl:template>
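
Since the target is HTML, a variant of the second template could rebuild the inline markup instead of copying the source elements verbatim. This is only a sketch: the mapping from <italic> to the HTML <i> element is an assumption, not something stated in the thread.

    <!-- Titles with element children: process the children one by one. -->
    <xsl:template match="title-group/article-title[*]" mode="none">
        <xsl:apply-templates mode="none"/>
    </xsl:template>

    <!-- Assumed mapping: source <italic> becomes HTML <i>. -->
    <xsl:template match="italic" mode="none">
        <i><xsl:apply-templates mode="none"/></i>
    </xsl:template>

Text nodes fall through to the built-in template rule, which copies them unchanged, so whitespace inside such titles is left exactly as it was.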

More subtle is to recognize that you can safely remove leading whitespace from the first text node, and trailing whitespace from the last:

    <xsl:template match="title-group/article-title/text()[1]" priority="100">
      <xsl:value-of select="replace(., '^\s+', '')"/>
    </xsl:template>

    <xsl:template match="title-group/article-title/text()[last()]" priority="101">
      <xsl:value-of select="replace(., '\s+$', '')"/>
    </xsl:template>

    <xsl:template match="title-group/article-title/text()[last()=1]" priority="102">
      <xsl:value-of select="normalize-space(.)"/>
    </xsl:template>
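
One caveat: text()[1] and text()[last()] select the first and last text-node children, which are not necessarily the first and last children of <article-title>. If the title ends with an element, the last text node may be the whitespace between two inline elements. A possible variant, sketched here as a complete XSLT 2.0 stylesheet, trims only the text nodes that really sit at the edges. The root template and the use of mode="none" throughout are assumptions made so that the sketch runs on its own.

    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- Hypothetical entry point: process every article title in mode "none". -->
      <xsl:template match="/">
        <xsl:apply-templates select="//title-group/article-title" mode="none"/>
      </xsl:template>

      <!-- Process the children individually so inline elements survive. -->
      <xsl:template match="title-group/article-title" mode="none">
        <xsl:apply-templates mode="none"/>
      </xsl:template>

      <!-- Copy inline elements unchanged; map them to HTML as needed. -->
      <xsl:template match="title-group/article-title/*" mode="none">
        <xsl:copy-of select="."/>
      </xsl:template>

      <!-- Text node that is the first child: drop leading whitespace. -->
      <xsl:template mode="none" priority="100"
          match="title-group/article-title/text()[not(preceding-sibling::node())]">
        <xsl:value-of select="replace(., '^\s+', '')"/>
      </xsl:template>

      <!-- Text node that is the last child: drop trailing whitespace. -->
      <xsl:template mode="none" priority="101"
          match="title-group/article-title/text()[not(following-sibling::node())]">
        <xsl:value-of select="replace(., '\s+$', '')"/>
      </xsl:template>

      <!-- Text node that is the only child: normalize it completely. -->
      <xsl:template mode="none" priority="102"
          match="title-group/article-title/text()[not(preceding-sibling::node())
                                                  and not(following-sibling::node())]">
        <xsl:value-of select="normalize-space(.)"/>
      </xsl:template>

    </xsl:stylesheet>

In the Arabidopsis example from the question, the single space between the two <italic> elements has siblings on both sides, so none of the three text templates match it and the built-in rule copies it through unchanged.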

Michael Kay

On 08/06/2010 23:24, Lynn Murdock wrote:

i want to remove trailing whitespace from the contents of an element (<article-title>), but because that element sometimes contains character-level formatting elements (<italic>) as well as text, normalize-space is creating problems. if i use normalize-space(), i lose the italics in the output. (i'm transforming to html.)

here's an example of where i want to remove whitespace (i want to remove the space before </article-title>):

         <article-title>Effect of a Brief Video Intervention on Incident Infection among Patients Attending Sexually Transmitted Disease Clinics </article-title>

and here's the xsl code i've used:

     <xsl:template match="title-group/article-title" mode="none">
         <xsl:value-of select="normalize-space(.)"/>
     </xsl:template>

this code works most of the time, but in a situation like this:

<article-title>A Global Survey of Gene Regulation during Cold Acclimation in <italic>Arabidopsis</italic> <italic>thaliana</italic></article-title>

it ends up removing the italic formatting.

does anyone know of a way to strip the whitespace that i don't want (in the first example) while keeping the character formatting that i do want (in the second example)?

any pointers would be greatly appreciated.

