Re: [xsl] Preserve HTML formatting when apply-templates in variabl

Subject: Re: [xsl] Preserve HTML formatting when apply-templates in variabl
From: Chris Loschen <closchen@xxxxxxxxxxxxxxxxxx>
Date: Fri, 06 Aug 2004 12:33:35 -0400
Hi all,

This raises an issue I've been struggling with as well. I've been reading this
list for several years now and have seen numerous threads about avoiding
d-o-e if at all possible. I've almost always been able to avoid it, and I've
even rooted out as much of it as I can from the code I've corrected for
others. But I've just used it in my current project, and I'm not sure if there
was another approach I could have used.

Here's the situation: I've got a very large input file which would overflow my
memory if I ran the XSLT all at once. So instead I'm breaking up the file into
smaller pieces and running each one separately. More specifically, the file
is set up like

<!ELEMENT root (header, bill+, trailer)>

and I need to process the header, each individual bill, and finally the trailer
in a series of XSLT transforms, usually more than a thousand all told.
However, I do need to retain some of the data from the transforms as I go
because I need to include a grand total of the amount billed and the total
number of records in the trailer data.

I'm using XSLT 1 and Xalan-J.

What I've done is use xalan:write when I process the header like so:

<xalan:write file="bills.xml">
<xsl:text disable-output-escaping="yes">&lt;root&gt;</xsl:text>
</xalan:write>

As I process each bill, I add to that file like so:

<xalan:write file="bills.xml" append="true">
<invoice>
<total-records><xsl:value-of select="$total-records"/></total-records>
<total-amount><xsl:value-of select="$total-amount"/></total-amount>
</invoice>
</xalan:write>


And finally at the end I close the root element while I'm processing the trailer
like so:


      <xalan:write file="bills.xml" append="true">
            <xsl:text disable-output-escaping="yes">&lt;/root&gt;</xsl:text>
      </xalan:write>

I end up with a well-formed XML file which I can then read in with the document()
function and do the sums I need to do etc.
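
For illustration, the summing step could look roughly like the sketch below.
It is not the actual stylesheet: the "trailer" match pattern is taken from the
DTD fragment above, the output element names (trailer-totals and its children)
are made up, and it assumes each <invoice> holds per-bill figures in plain
numeric form. Note that a relative URI passed to document() as a string is
resolved against the stylesheet's location, which may not be where xalan:write
put the file.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Load the per-bill records accumulated by the earlier transforms. -->
  <xsl:variable name="invoices"
      select="document('bills.xml')/root/invoice"/>

  <!-- Assumption: the trailer is matched as "trailer"; the result
       element names below are hypothetical. -->
  <xsl:template match="trailer">
    <trailer-totals>
      <grand-total-records>
        <xsl:value-of select="sum($invoices/total-records)"/>
      </grand-total-records>
      <grand-total-amount>
        <!-- sum() yields NaN if any value is not a plain number
             (for example "1,234.50"). -->
        <xsl:value-of
            select="format-number(sum($invoices/total-amount), '0.00')"/>
      </grand-total-amount>
    </trailer-totals>
  </xsl:template>

</xsl:stylesheet>
```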


This is working just fine with my current setup (though it leaves a temp file behind, which
I haven't dealt with yet), but it strikes me as very ugly, and goes against all I've learned
on this list about always outputting complete nodes rather than a start tag in one place
and an end tag elsewhere. I haven't been able to come up with a better alternative though.
Does anyone have any suggestions or should I just figure it ain't broke so don't fix it?
Thanks!


PS -- I thank the people who replied to my messages a few days ago -- I sent a reply
to the list but I think it never actually made it -- I didn't have access to my normal email
account, and I asked Tommie to post it for me, but I never saw it -- no doubt Tommie needs
a vacation now and again too! I'll try to post it again as soon as I can.


At 11:35 AM 8/6/2004, you wrote:

> Can you think of a sensible use of "disable-output-escaping"?

4 off the top of my head:


a) creating almost-but-not-quite XML like some template languages

<% .... %>

(ASP JSP etc)

As that isn't legal XML, XSLT won't generate it without a bit of help from
d-o-e.

b)
Generating a local subset of a doctype, and entity references, if you
_really_ have to.

c)
In the MathML specification we use CDATA sections for mathml examples.
We don't just "inline" the XML which is the usual advice as we want
tight control over things like indentation and use of ' or " around
attribute values. If you are telling the user that they can do a="2" or
a='2' you don't want the system to write them both out the same way:-)
In the normative html version of the spec it's no problem you just
value-of the example into a <pre> and it all works, but in the
XHTML+MathML version we _also_ want to inline it as XML so you get a
side-by-side view of the literal XML and how your browser renders it,
we use d-o-e to produce this.

d) If you have quoted html inside your xml as in
 <foo><![CDATA[ a <br> c <img src= "x.png"> jjj ]]></foo>
and you need that html in the output. If you can fix your input not to
do that it is good but often you can't (RSS feeds etc) and so d-o-e can
be used as a method of last resort (although xslt2 offers alternatives,
more on that another day perhaps:-)
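
As a rough sketch of cases (a) and (d), assuming the <foo> element from the
example above and a hypothetical <server-code> element whose text should end
up inside <% ... %>, something like the following would do it. Keep in mind
that d-o-e is an optional feature in XSLT 1.0 and only takes effect when the
processor serializes the result itself; it is ignored when the result goes to
a DOM or is captured in a variable.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Case (d): foo contains HTML held as text (e.g. in a CDATA section).
       With d-o-e the serializer writes < and & without escaping them. -->
  <xsl:template match="foo">
    <div>
      <xsl:value-of select="." disable-output-escaping="yes"/>
    </div>
  </xsl:template>

  <!-- Case (a): wrap the text of a (hypothetical) server-code element
       in ASP/JSP-style delimiters, which are not legal XML markup. -->
  <xsl:template match="server-code">
    <xsl:text disable-output-escaping="yes">&lt;% </xsl:text>
    <xsl:value-of select="." disable-output-escaping="yes"/>
    <xsl:text disable-output-escaping="yes"> %&gt;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```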


David





Yours,


Chris Loschen
closchen@xxxxxxxxxxxxxxxxxx
781-718-3017 (cell)
