At 11:04 AM 8/11/2004, Mike wrote:
I get upset by people spelling "whitespace" as two words, as if it described
that subset of spaces that are white (as distinct from those that are green
or blue). In fact, of course, there are four whitespace characters and only
one space character. There are other characters whose rendition uses no ink,
and which also use the word "space" in their names, that are neither space
nor whitespace characters, for example zero-width-space.
Don't blame me.
The problem comes with the territory. The OP hasn't actually said what he
means by "whitespace", so it is unclear how his requirement aligns with the
XSLT definition of whitespace (the four characters CR, LF, tab and space?
But CR has been normalized to LF by the parser, so aren't we talking three
characters?). And he doesn't know he needs to say what he means if he
doesn't even know XSLT *has* a definition for whitespace, which may (or may
not) be relevant to his problem depending on how it aligns.
Personally, my pet peeve is the common expectation that others' perception
of a problem is necessarily the same as one's own, and that therefore we
can casually say things like "whitespace" and just assume it's clear what
we mean (and if it's not, it's the other party's fault). Sometimes it is,
sometimes not -- and sometimes we have to dig to find out which. The most
constructive posts to this list (conspicuously those from Mike) exhibit
sensitivity to exactly this meta-problem and care in dealing with it.
Well-posed questions demonstrate the same sensitivity. But it isn't always
easy to guess where perceptions or assumptions may not align, which is why
we can't stop taking care, because others won't always be able to.
(The advantage of a list like this, of course, is that it is a venue where
we can take precisely this care. If this weren't useful we could all just
read XSLT: A Programmer's Reference cover to cover and be done with it.)
It is natural, however, to get impatient sometimes and just cut to the
chase, hoping we're all on the same page.
So the answer is, if your definition of whitespace is the same as XSLT's,
then the XPath function normalize-space() will do what you want (as David
shows). If not, then a technique like what M. David demonstrates with
translate() will work as well.
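
To make the difference between the two approaches concrete, here is a
minimal sketch in Python that emulates the two XPath functions. This is
not XSLT and not anyone's actual stylesheet: the names normalize_space and
xpath_translate are hypothetical helpers, written only to illustrate the
behavior XPath 1.0 describes for normalize-space() and translate().

```python
import re

def normalize_space(s):
    """Emulate XPath normalize-space(): collapse runs of the four XML
    whitespace characters to one space and trim both ends."""
    # Only space, tab, CR and LF count; a no-break space is left alone.
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")

def xpath_translate(s, src, dst):
    """Emulate XPath translate(): map each character in src to the
    character at the same position in dst, or drop it if dst is shorter."""
    mapping = {}
    for i, ch in enumerate(src):
        if ch not in mapping:  # the first occurrence in src wins
            mapping[ch] = dst[i] if i < len(dst) else None
    out = []
    for ch in s:
        if ch in mapping:
            if mapping[ch] is not None:
                out.append(mapping[ch])
        else:
            out.append(ch)
    return "".join(out)

# XSLT's whitespace: handled by normalize_space alone.
print(repr(normalize_space("  a \t\n b  ")))

# Someone else's "whitespace" (no-break space, zero-width space):
# normalize_space leaves these untouched, so map them explicitly.
print(repr(xpath_translate("a\u00a0b\u200bc", "\u00a0\u200b", " ")))
```

In the second call the no-break space is mapped to an ordinary space and
the zero-width space is dropped, because the replacement string is shorter
than the source string. Which characters to list is exactly the part only
the OP can decide.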
As it turns out, XSLT and XPath's definition of whitespace kicks it back to
XML [section 2.3], which defines it as any combination of the characters
(#x20 | #x9 | #xD | #xA), that is, space, tab, CR, LF.
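
As a small illustration of that definition, the following Python sketch
tests whether a string consists only of those four characters. The name
is_xml_whitespace is a hypothetical helper, not part of any XML library.

```python
# The four characters the XML definition allows: space, tab, CR, LF.
XML_WHITESPACE = frozenset("\x20\x09\x0d\x0a")

def is_xml_whitespace(text):
    """True if text is a non-empty run of XML whitespace characters."""
    return len(text) > 0 and all(ch in XML_WHITESPACE for ch in text)

print(is_xml_whitespace(" \t\r\n"))   # all four characters
print(is_xml_whitespace("\u200b"))    # zero-width space
print(is_xml_whitespace("\u00a0"))    # no-break space
```

The first call reports True and the other two False: zero-width space and
no-break space use no ink, but neither is whitespace in the XML sense.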
Cheers,
Wendell
======================================================================
Wendell Piez mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================