Subject: Re: [xsl] Large content rendering in XSLT
From: "James A. Robinson" <jim.robinson@xxxxxxxxxxxx>
Date: Mon, 19 May 2008 22:43:19 -0700
> We do not have any complicated logic in the XSLT; it just selects a
> node and copies it.
> And I do not need to evaluate the text content, and we do not do any
> decoding either.
> 
> The actual content of my text document would be something like the
> terms and conditions of a web site.
> We need to format it for the HTML part and use the same data for the
> plain-text part as well.

Ok, this sounds very reasonable for an XSLT application.  The example I
mentioned isn't using xsl:copy-of or xsl:sequence, but is doing the
higher-overhead work of xsl:apply-templates -- so more work than it
sounds like you need to perform in your application.
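
As a rough sketch (the element names here are invented, so adjust them
to your actual document), a straight pass-through of a large node with
xsl:copy-of would look something like this:

    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <!-- Copy the large terms-and-conditions node through unchanged,
           without applying templates to its contents. -->
      <xsl:template match="/message">
        <html-part>
          <xsl:copy-of select="terms"/>
        </html-part>
      </xsl:template>

    </xsl:stylesheet>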

> If I go with this kind of approach, all of the content needs to be
> well-formed and all of the XSLT rules apply. Is there any way to
> apply XSLT to the existing HTML content?

If you mean apply XSLT to existing HTML markup, the answer is that
it depends.  If your HTML happens to be well-formed XML, then you
ought to be able to process it directly in XSLT.  If your HTML is
like 90% of the markup out there, which uses the SGML form (open
tags w/o closing tags in some cases), then you'd need to run the HTML
through a fix-up process before you could process it in XSLT.  We do
something similar for another application here, and have found that
TagSoup (http://home.ccil.org/~cowan/XML/tagsoup/) works pretty well,
though there is a penalty paid in overhead.

It's possible to hook TagSoup directly into a processing stream,
feeding its output into an XSLT transformation engine.  But instead of
performing the fixup from SGML to XML on the fly, it might be prudent
to consider using something like TagSoup or SX
(http://www.jclark.com/sp/sx.htm) to pre-process all of your HTML into
well-formed XML ahead of time, thereby reducing the CPU cycles needed
when serving requests.
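
As one sketch of that one-time conversion -- assuming Saxon's
command-line -x switch (its exact form varies between Saxon versions)
and TagSoup on the classpath as org.ccil.cowan.tagsoup.Parser -- you
can let TagSoup act as the SAX parser and run a plain identity
transform over each HTML file, writing the result out as well-formed
XML:

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:output method="xml"/>

      <!-- Identity transform: TagSoup supplies well-formed SAX events
           for the sloppy HTML, and this simply writes them back out
           as XML. -->
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

    </xsl:stylesheet>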

Note also that XSLT 2.0 can load unparsed (non-markup) text directly,
via the unparsed-text() function, which may be handy for the plain-text
part of your message.
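
For example (the URI and template name here are made up):

    <!-- Pull a plain-text terms file straight into the text/plain
         part; the href is hypothetical. -->
    <xsl:template name="text-part">
      <xsl:value-of
          select="unparsed-text('http://example.org/terms.txt', 'UTF-8')"/>
    </xsl:template>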

> The reason I have mentioned it is that the upstream system should not
> have to send the large text with every call it makes for
> transformation and further processing. The actual static data should
> be pulled from the external system and used as part of the MIME
> message construction.

Well, many XSLT engines can certainly pull data from remote locations
without needing a lot of infrastructure.  Saxon, for example, can fetch
data over HTTP (e.g., sending an HTTP GET request to a web service).
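
For instance (the URL and element names below are invented), the
document() function will happily dereference an http: URI at
transformation time:

    <!-- Hypothetical service URL and document structure: fetch the
         static terms document from the external system. -->
    <xsl:variable name="terms"
        select="document('http://terms.example.org/service/terms')"/>

    <xsl:template match="/">
      <html-part>
        <xsl:copy-of select="$terms/terms/body"/>
      </html-part>
    </xsl:template>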

Jim

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
James A. Robinson                       jim.robinson@xxxxxxxxxxxx
Stanford University HighWire Press      http://highwire.stanford.edu/
+1 650 7237294 (Work)                   +1 650 7259335 (Fax)
