Re: [xsl] removing final space from node tree

Subject: Re: [xsl] removing final space from node tree
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Mon, 20 Apr 2009 18:25:29 -0400
Lars,

At 05:40 PM 4/20/2009, you wrote:
>> Lars,
>>
>> How about, instead of post-processing the results of processing, you
>> simply don't add the spaces the first time through, and then you add
>> them in as a post-process?
>>
>> This would borrow from both your solutions -- like your first solution,
>> it divides the task into two phases. But like the second, it determines
>> whether to add the space to begin with -- not whether to remove it after
>> it's been added. It would do this, however, not by examining the source
>> to say "do I want a space here", but by examining the result (where this
>> determination will be much easier).
>
> Thanks, Wendell.
> That certainly makes sense.
> However I'm not sure I see a significant win over my existing solution.
> I would still be adding a post-processing phase that recurses over the
> initial output tree and reconstructs it most of the time.
> Please clue me in if I've missed something big...

Well, this approach would be an improvement over your proposal only if adding whitespace where you know you want it were much easier than either (a) adding whitespace everywhere and then removing it where you don't want it (your first suggestion), or (b) examining your source statically to determine up front whether you want it or not (your second suggestion).


If you are looking for a single-pass solution, as people sometimes are (e.g., when running unextended XSLT 1.0 in a browser), examining the source statically is the only option. But as you noted in your first post, this can get hair-raising: such an examination can face combinatorial complexity, since it has to account implicitly for the mapping from source to target.

Doing it in two passes alleviates this problem, and I'd concur that either your first idea (post-process to remove extra whitespace) or my refinement of it (don't add it in the first phase at all, but only in a second) would likely be an improvement -- at least for maintenance, and probably for the processor as well (since the processing logic would be correspondingly more straightforward).

Between those two choices, it really depends on the details, which is why Andrew's question is the right one (can we please see more specifics?). It could well be that you're on the right track.

> Also, I'm pretty sure the initial output tree doesn't always contain
> enough markup to determine where boundaries between output phrases are.
> The initial transformation sometimes just outputs bare strings.

Ah, well, maybe you should output those boundaries explicitly in the first phase and then strip them again in the second phase, when whitespace would be substituted where wanted?
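
To make that concrete, here is a minimal sketch, not a tested solution for your data. It assumes XSLT 2.0 (in 1.0 you would need a node-set extension function to process the variable), a made-up source element named "phrase", and a made-up marker element named "boundary"; none of these names come from your stylesheet. Phase 1 emits a marker after every phrase without asking whether a space is wanted; phase 2 turns a marker into a space only when some text follows it in the intermediate result.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ignore indentation in the source so it does not leak into the result. -->
  <xsl:strip-space elements="*"/>

  <!-- Phase 1: run the main logic into a temporary tree,
       then hand that tree to phase 2 (mode "fixup"). -->
  <xsl:template match="/">
    <xsl:variable name="phase1">
      <xsl:apply-templates/>
    </xsl:variable>
    <xsl:apply-templates select="$phase1" mode="fixup"/>
  </xsl:template>

  <!-- Hypothetical phase 1 rule: emit the phrase as a bare string,
       followed by an explicit boundary marker instead of a space. -->
  <xsl:template match="phrase">
    <xsl:value-of select="."/>
    <boundary/>
  </xsl:template>

  <!-- Phase 2: identity copy of everything that is not a marker. -->
  <xsl:template match="@*|node()" mode="fixup">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="fixup"/>
    </xsl:copy>
  </xsl:template>

  <!-- Phase 2: a marker becomes a space only if text comes after it
       anywhere later in the intermediate tree; otherwise it is dropped. -->
  <xsl:template match="boundary" mode="fixup">
    <xsl:if test="following::text()">
      <xsl:text> </xsl:text>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Since nothing follows the last marker, it produces no output, so there is no final space to remove. Note that this sketch does not collapse two markers that end up next to each other; if your first phase can produce those, the test in the last template would need to account for them.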


> On the other hand, if it's true that Wendell Piez couldn't immediately
> see a much more elegant way to accomplish this task, that would increase
> my confidence that my current solution is pretty decent.  :-)

Bah: I'm not sure what to make of that -- it's both flattering and distressing that you should be willing to rely on my blind intuition and flailings in the dark. :->


But FWIW, if your processing is very complex, I do think that splitting it into (1) logic, then (2) whitespace fixup would make for a cleaner separation than (1) logic plus whitespace, then (2) cleanup.

And in any case -- I think you know your current solution is already better than many alternatives ... some of which I don't have the imagination to concoct.

Cheers,
Wendell



======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
