Server-side. Performance Re: Netscape Support for XSL

Subject: Server-side. Performance Re: Netscape Support for XSL
From: Paul Tchistopolskii <paul@xxxxxxx>
Date: Fri, 19 May 2000 20:24:59 -0700
From: Matt Sergeant

>> Matt,
>>        If it's not too much trouble could you elaborate on this speed
>> penalty you speak of?  How much of a speed penalty is it?

> Most XSLT servlets out there are fairly dumb, requiring a parse of every
> stylesheet file and the XML file on every hit, along with performing the
> actual transformation.

> Some may eliminate the need to re-parse the XML + stylesheets on each hit
> by using an internal memory cache of the parsed DOM tree, but still
> perform the XSLT transformation on each hit.

This view looks a bit messy to me.

 An XSLT transformation consists of 3 basic steps.

 1. Parse the XSL stylesheet ( compile it into an internal structure ).
 2. Parse the XML source file ( compile it into an internal structure ).
 3. Perform the actual transformation.

 (1). Most ( if not all ) XSLT servlets compile the stylesheet only once.

 (2). Storing the precompiled XML file in an internal memory cache does not give much
 benefit.

 a. It is a plain waste of memory.
 b. It could be efficient enough to serialize / deserialize the internal structure (2)
 to/from disk ( avoiding XML parsing ).
 c. See (3).

 (3). If you don't have to perform the actual transformation with every request, that
 means you don't actually need on-the-fly transformation. Why not store a static
 pre-generated HTML page then?
 I understand there could be exceptions, but please see below.

> On the other hand, both AxKit and Cocoon only transform when either the
> stylesheet(s) or the xml file changes, storing the results in a
> cache.

 Well... That means there is still no need to store the XML document in an internal cache.
 The caching strategy could be pretty trivial:

 Hashtable:

 key: xml.name!xsl.name
 value1: xml.mtime
 value2: pointer to the result of the transformation ( could be file on disk )

 Processing:

 When receiving a request to transform xml.name with xsl.name, check the mtime of
 xml.name ( look for 'dependencies' if you like ) and if something has changed,
 rerun the transformation ( remember, the stylesheet is already cached, i.e.
 precompiled, see (1) ).

 If nothing has changed, just dump value2 ( the stored result ) to the client.
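
 A minimal sketch of the hashtable scheme above, written in Python rather than as a
 servlet. All names here ( ResultCache, transform ) are hypothetical, and `transform`
 stands in for whatever XSLT engine is actually used. As one example of the
 'dependencies' mentioned above, the sketch also checks the stylesheet's mtime, which
 goes slightly beyond the single xml.mtime value in the table.

```python
import os
from typing import Callable, Dict, Tuple

# Hypothetical stand-in for a real XSLT engine: takes the XML path and the
# stylesheet path, returns the transformed output as a string.
Transform = Callable[[str, str], str]


class ResultCache:
    """Cache transformation results, keyed by "xml.name!xsl.name"."""

    def __init__(self, transform: Transform) -> None:
        self._transform = transform
        # key -> ((xml mtime, xsl mtime), cached result)
        self._entries: Dict[str, Tuple[Tuple[float, float], str]] = {}

    def get(self, xml_path: str, xsl_path: str) -> str:
        key = f"{xml_path}!{xsl_path}"
        # Assumption: tracking the stylesheet mtime as an extra dependency.
        stamp = (os.path.getmtime(xml_path), os.path.getmtime(xsl_path))
        entry = self._entries.get(key)
        if entry is not None and entry[0] == stamp:
            # Nothing changed: hand back the stored result.
            return entry[1]
        # First request, or a source file changed: rerun the transformation.
        result = self._transform(xml_path, xsl_path)
        self._entries[key] = (stamp, result)
        return result
```

 Here the result is kept in memory for brevity; value2 could just as well be the name
 of a file on disk, as noted above.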

 A pretty trivial ( standalone ) caching layer. No more than 2 days of work.

 I have not implemented this particular caching in PXSLServlet only because I don't see
 too much purpose in supporting plain XML files. XML *sources* are another story, but
 that requires some virtualization of mtime, etc. Also, when parameters come into the
 picture, it is no longer that trivial.
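
 One way to see why parameters complicate things: the cache key has to cover them as
 well, otherwise two requests with different parameters would share one cached page.
 A small sketch, assuming stylesheet parameters arrive as a plain name/value mapping
 ( cache_key is a hypothetical helper, not part of PXSLServlet or any other tool ):

```python
from typing import Mapping, Tuple


def cache_key(xml_name: str, xsl_name: str,
              params: Mapping[str, str]) -> Tuple[str, str, Tuple[Tuple[str, str], ...]]:
    """Build a hashable cache key that includes the stylesheet parameters."""
    # Sort the parameters so the same set in a different order maps to one key.
    return (xml_name, xsl_name, tuple(sorted(params.items())))
```

 This only covers the key. The number of cached results now grows with every distinct
 parameter combination, which is part of what makes the layer less trivial.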

> This allows AxKit to achieve delivery rates of 100m page views a
> day (nobody is running a site using AxKit that does that, but the point is
> still valid), on the right hardware. I don't have any performance figures
> for Cocoon, although I believe it's slightly slower than AxKit (but I'm
> biased!).

 And I'm a bit suspicious about the 'right hardware'. ( Please see below. )

> AxKit is built in Perl using the Apache API, Cocoon is built as a
> servlet. Which seems kindof backwards when Cocoon is part of the Apache
> project ;-)

 So AxKit is mod_perl. Please correct me if I'm wrong, but the biggest
 design problem of mod_perl ( which has a significant impact on hps ) is that
 its performance depends heavily on:

 - the load
 - how much RAM you have
 - how complex the perl application is

 Briefly: mod_perl is not scalable ( and not easy to load-balance ).

 Please forgive my mistakes ( if any ): I have not been following mod_perl's
 evolution for a long time. If perl ( and its extensions ) finally got thread safety
 and mod_perl is using threads, or something else has changed, or my view is
 wrong in some other way, I'll be glad to hear from a perl expert ( you ) what
 is going on there.

 As far as I remember, we have 2 basic engines for running persistent server-side perl
 applications: mod_perl and FastCGI. FastCGI provides persistence for
 perl code and perl data. mod_perl provides persistence only for precompiled perl
 bytecode. Right?

 When we have 2 concurrent requests, mod_perl forks a copy of the perl interpreter
 together with a copy of the bytecode ( all the data are 'lost', so if you care
 about persistent shareable 'global' data, you have to do some tricks to get them ).
 Right?

 If you have 5 concurrent requests, you get five copies of the perl script.
 In any case, a 'complex' perl script ( especially if you are using CGI.pm and
 other libraries, and you *are* using them, right? You *should* use them, right? )
 has a significant memory footprint. Right?

 Servlets have threads, data persistence, synchronized ( this ), etc.
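
 To illustrate the threaded model ( not the servlet API itself ), here is a rough
 Python analogue: several worker threads in one process share the same in-memory data,
 and a lock plays the role of synchronized ( this ). SharedPages and serve are
 hypothetical names made up for this sketch.

```python
import threading
from typing import Dict, List


class SharedPages:
    """One object shared by all worker threads in a single process."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.pages: Dict[str, str] = {}  # persistent data, visible to every thread
        self.hits = 0

    def handle(self, name: str) -> str:
        # The lock guards the shared state, like synchronized ( this ) in Java.
        with self._lock:
            self.hits += 1
            if name not in self.pages:
                self.pages[name] = f"<html>{name}</html>"
            return self.pages[name]


def serve(shared: SharedPages, names: List[str], workers: int = 5) -> None:
    """Run `workers` threads, each requesting every page in `names`."""
    def worker() -> None:
        for name in names:
            shared.handle(name)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

 With five workers there is still only one copy of `pages` in memory, in contrast to
 the one-copy-per-request picture of forked interpreters described above.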

 My claim is that:

 1. The servlet architecture is more scalable and could be even
 more efficient in the case of complex applications.

 2. Caching has nothing to do with servlet design and/or
 mod_perl design.

 Your original point was about the 'performance penalty of java servlets'.

 What in particular is the performance penalty? The speed of the JVM?

 Rgds.Paul.



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

