Re: [xsl] Removing unwanted space

Subject: Re: [xsl] Removing unwanted space
From: "Dave Pawson dave.pawson@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 4 Jun 2021 13:07:19 -0000
Don't forget
https://www.w3.org/TR/xslt-30/#element-strip-space
Useful with selected elements.
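
A minimal sketch of what I mean, assuming an XSLT 3.0 processor. The
element names "table" and "row" are made up for illustration; substitute
whatever element-only containers you really have.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy everything through unchanged unless a template says otherwise. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Discard whitespace-only text nodes that are children of these
       elements. "table" and "row" are hypothetical names. -->
  <xsl:strip-space elements="table row"/>

</xsl:stylesheet>
```

Note that this only drops text nodes consisting entirely of whitespace; it
does not trim leading or trailing space from a text node that also has
content. I would not list p here: the space between </bold> and <italic>
in your sample is a whitespace-only text node, and it would vanish too.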

HTH

On Fri, 4 Jun 2021 at 13:40, Charles O'Connor coconnor@xxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Thanks Wendell, Joel, and Graydon! I will use your suggestions and see what
> I get and whether I can apply the lessons to other places I need to get rid of
> white space.
>
> I am at least a little gratified that this is not an easy problem causing
> the bumps on my forehead.
>
> Joel, to answer your question (incompletely), given
>
> <p>
>     <anchor> </anchor>
>     The rain in <bold> <underline> Spain </underline> </bold> <italic> is </italic> wet.
> </p>
>
> I'd likely want
>
> <p><anchor> </anchor> The rain in <bold> <underline> Spain </underline> </bold> <italic> is </italic> wet.</p>
>
> That is, remove the leading and trailing spaces caused by indentation, and
> assume every other space weirdness that occurs between the first
> non-whitespace character and the last non-whitespace character in <p> is
> correct. The tricky bit is the <anchor> element--space after or no space
> after?--which luckily is not analogous to a structure I will face in the
> paragraph case, but I may when I get to tables (yay!). In tables I fear that
> some line breaks will be junk and others used to get rendering they want,
> which will be near impossible to tease out.
>
>
>
> From: Wendell Piez wapiez@xxxxxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
> Sent: Friday, June 4, 2021 7:36 AM
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [xsl] Removing unwanted space
>
>
> Hey Charles,
>
> A couple of techniques I use in this situation:
>
> text()[. is ancestor::p/descendant::text()[1]] - matches the first text
> node in a p, no matter how deep.
> text()[. is ancestor::p/descendant::text()[last()]] - same for the end
>
> text()[not(matches(.,'\S'))] - text that has no non-whitespace character
>
> replace($str,'^\s+','') - strip *leading whitespace only* from a string.
> replace($str,'\s+$','') - same for trailing whitespace
>
> Et sim.
>
> I am not sure I would use xsl:analyze-string here since as you observe it
> can be (um) pesky. I might do something as simple as
>
> <xsl:template match=" text()[. is ancestor::p/descendant::text()[1]]">
>   <xsl:value-of select=" replace(.,'^\s+','') "/>
> </xsl:template>
>
> But the match might have to be greedier if the inline markup is also deep,
> and this is only the front end.
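
For what it's worth, a rough sketch of both ends together. Assumptions on
my part: an XSLT 2.0 or later processor, p elements that do not nest, and
an identity template for everything else. The patterns use \s+ rather than
\s* because fn:replace raises FORX0003 for a pattern that can match a
zero-length string.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy anything not matched below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- First text node anywhere inside a p: strip leading whitespace. -->
  <xsl:template
      match="p//text()[. is ancestor::p[1]/descendant::text()[1]]">
    <xsl:value-of select="replace(., '^\s+', '')"/>
  </xsl:template>

  <!-- Last text node anywhere inside a p: strip trailing whitespace. -->
  <xsl:template
      match="p//text()[. is ancestor::p[1]/descendant::text()[last()]]">
    <xsl:value-of select="replace(., '\s+$', '')"/>
  </xsl:template>

  <!-- A text node that is both first and last: strip both ends.
       The explicit priority avoids a conflict with the two rules above. -->
  <xsl:template priority="1"
      match="p//text()[. is ancestor::p[1]/descendant::text()[1]]
                      [. is ancestor::p[1]/descendant::text()[last()]]">
    <xsl:value-of select="replace(., '^\s+|\s+$', '')"/>
  </xsl:template>

</xsl:stylesheet>
```

This does not settle the <anchor> case. In the sample, the first text node
in the p is the whitespace-only one before <anchor>, so that one is emptied
and the indentation in front of "The rain" survives. That is where the
greedier match comes in.
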
>
> This is not an easy problem since the (very smart) computer doesn't know the
> difference between "white space that matters" and "white space that doesn't
> matter". Indeed its whole notion of "white space" is somewhat problematic. So
> I'm not sure who's actually smarter. :-)
>
> Cheers, Wendell
>
>
>


--
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
