Subject: replacing key() with pipe.
From: Paul Tchistopolskii <paul@xxxxxxx>
Date: Sat, 05 Aug 2000 05:30:19 -0700

Dear Sebastian.

In this letter I'm providing a simple variant of your test6.xsl. But first some long (sorry) explanation.

On my box my variant runs twice as slow (compared to key()) on the 'special' file, which is:

<?xml version="1.0"?>
<!DOCTYPE cemetery SYSTEM "cem.dtd" [
  <!ENTITY data1 SYSTEM "data1.xml">
  <!ENTITY data2 SYSTEM "data2.xml">
]>
<cemetery>
  &data1;
  &data2;
  &data2;
</cemetery>

I had to produce such a strange file because with your 'smallest' file the difference in speed was not that easy to find, but on your 'biggest' file I got constant swapping (Windows, 128 Mb). So I produced 'something' 'relatively big, but without swapping'.

saxon + test6 = 1 minute.
saxon + my test6 = 2 minutes.

<realitycheck>
Honestly, I don't care about spending 1 minute or 2 minutes (or even 3 or 4 minutes) on this *exotic* activity. It should all be powered by the repository. A text file is not a good storage for this kind of information if you want to query the file every five minutes, and if you want to make that query once per week / day it will not hurt to wait for 2 minutes instead of one.
</realitycheck>

I can 'improve' the pipe using a Java extension with side effects (the biggest weakness is the 'flat -> hierarchical' shift, which is based on the weak, but standard for XSLT, 'count-based recursion'). It looks like a Java extension emulating 1 (one) updateable variable could make it significantly faster.

<realitycheck>
But is it worth trying? Do you really care whether it is 1 minute or 3 minutes? Anyway, it seems that it does not scale, first of all because of the memory. And of course it is ages behind the scalability provided by any SQL server (including MySQL).
</realitycheck>

Now what I did. I'm sorry for explaining many details, but I think it could be interesting to see what happened with this task.

1.
First (and most important), I started thinking about the task itself, about the functionality I have to provide (not thinking about key() or other XSLT stuff at all).

What test6.xsl actually does:

2.A Query.
 - It takes all the /cemetery/person.
 - It pulls out: person/died/date/yr, second name, first name.
 - Persons should be sorted by: Year, Second name, First name.

2.B Rendering.
 - It then renders the list of persons, but the Year is displayed only once per group.

So here we go.

3. Query part.

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
                version="1.0">

<!--JOB: process cemetery file to make a year catalogue, NOT using keys (1) -->

<xsl:template match="/cemetery">
  <doc>
    <xsl:for-each select="stone/person">
      <xsl:sort select="died/date/yr"/>
      <xsl:sort select="name/snm"/>
      <xsl:sort select="name/fnm"/>
      <person>
        <year><xsl:value-of select="died/date/yr"/></year>
        <snm><xsl:value-of select="name/snm"/></snm>
        <fnm><xsl:value-of select="name/fnm"/></fnm>
      </person>
    </xsl:for-each>
  </doc>
</xsl:template>

</xsl:stylesheet>

I think it is easy to understand what happens here. We are just blindly translating the requirements for the Query part into XSLT. So this component has produced the stream:

<doc>
  <person><year>123</year><snm>NAME</snm><fnm>NAME</fnm></person>
  <person><year>123</year><snm>NAME</snm><fnm>NAME</fnm></person>
  ....
</doc>

Now all we need is to render this 'flat' structure into 'groups' (because we want the year to be displayed only once per 'group', as the requirement for the Rendering part says).

I could write this in XSLScript ;-) But for the sake of conformance here comes the ugly XSLT call-template.

<side-effect>
In the next version of XSLScript there will be yet another loop compiler 'meta-construction', not only 'else' (because I finally got tired of this loop --> recursion conversion).
</side-effect>

Whatever. This is, again, *typical* recursive XSLT:
 - take the first elements from the list by some criterion
 - draw them
 - recursively call yourself with the rest of the list.

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
                version="1.0">

<!--JOB: process cemetery file to make a year catalogue, NOT using keys (2) -->

<xsl:template match="/doc">
  <html>
    <head>
      <title>Protestant Cemetery Catalogue </title>
    </head>
    <body>
      <xsl:call-template name="draw_year">
        <xsl:with-param name="list" select="/doc/*"/>
      </xsl:call-template>
    </body>
  </html>
</xsl:template>

<xsl:template name="draw_year">
  <xsl:param name="list"/>
  <xsl:if test="$list">
    <!-- year of the first person in the (already sorted) list -->
    <xsl:variable name="year" select="$list[1]/year"/>
    <!-- how many persons share that year; they are the first $n_souls items -->
    <xsl:variable name="n_souls" select="count( $list[year = $year ])"/>
    <!-- everything after the first group -->
    <xsl:variable name="rest" select="$list[ (position() > $n_souls) ]"/>

    <h2><xsl:value-of select="$year"/></h2>
    <ol>
      <xsl:for-each select="$list[ not (position() > $n_souls) ]">
        <li><b><xsl:value-of select="snm"/></b>, <xsl:value-of select="fnm"/></li>
      </xsl:for-each>
    </ol>

    <!-- recurse on the remaining groups -->
    <xsl:call-template name="draw_year">
      <xsl:with-param name="list" select="$rest"/>
    </xsl:call-template>
  </xsl:if>
</xsl:template>

</xsl:stylesheet>

Design patterns used.
------------------------------

Ux is about pipes of simple XSLT components. Have you noticed that there are no HTML tags in the Query part at all?

Another Ux 'design pattern' is that the Query produces some kind of 'formatting objects' for the 'renderer'. The renderer just blindly produces the HTML.

I hope this explains why I'm not using key(). Those 'select from .. dual' constructions could hardly be derived from the functional specification (the code written above is just a simple reflection of the functional specification into simple and general XSLT constructions).

Yes, I have to admit: unless it is polluted with some 'other' ugly constructions, it runs twice as slow (maybe three times as slow) as the key()-based solution.
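
For readers who want to see the two-stage pipe outside XSLT, here is a minimal sketch in Python, not part of the original stylesheets. The names Person, query, draw_year and draw_year_one_pass are hypothetical, the sample data in the usage note is made up, and plain string comparison is assumed for sorting (an XSLT processor may order text differently). The first renderer mirrors the count-based recursion of the draw_year template; the second shows what "one updateable variable" might buy: a single pass instead of a count over the remaining list for every group.

```python
from typing import List, NamedTuple, Optional, Tuple

class Person(NamedTuple):
    # Hypothetical record mirroring <person><year/><snm/><fnm/></person>
    year: str
    snm: str
    fnm: str

Group = Tuple[str, List[Tuple[str, str]]]  # (year, [(snm, fnm), ...])

def query(people: List[Person]) -> List[Person]:
    """Stage 1 (Query): produce the flat list sorted by year, surname, first name."""
    return sorted(people, key=lambda p: (p.year, p.snm, p.fnm))

def draw_year(rows: List[Person]) -> List[Group]:
    """Stage 2 (Rendering), count-based recursion as in the draw_year template."""
    if not rows:
        return []
    year = rows[0].year
    # Like count($list[year = $year]): scans the whole remaining list each call.
    n_souls = sum(1 for p in rows if p.year == year)
    group = [(p.snm, p.fnm) for p in rows[:n_souls]]
    # Recursion depth equals the number of distinct years, which matters
    # in Python for inputs with very many groups.
    return [(year, group)] + draw_year(rows[n_souls:])

def draw_year_one_pass(rows: List[Person]) -> List[Group]:
    """Same grouping with one updatable variable: each row is visited once."""
    groups: List[Group] = []
    current: Optional[str] = None  # the single piece of mutable state
    for p in rows:
        if p.year != current:
            current = p.year
            groups.append((current, []))
        groups[-1][1].append((p.snm, p.fnm))
    return groups
```

Usage, with made-up data: rows = query([Person("1850", "Smith", "John"), Person("1849", "Keats", "Mary")]) followed by draw_year(rows) or draw_year_one_pass(rows). Both return the same list of (year, names) groups; only the amount of rescanning differs.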

Should I start polluting this 'plain XSLT' thing with ugly Java hacks, or can we wait 2 minutes instead of 1 minute (but keep the code supportable by anybody)?

Rgds.Paul.

PS. I encountered *crazy* jumps in speed on different boxes and different versions of the VM. On some boxes SAXON is (significantly) faster than XT (on some 'other stylesheets') because it seems that Instant SAXON was compiled with some tool which works nicely with the MS VM. What is the tool? Ah, there are at least 3 Java boosters out there and some are specifically Windows oriented. Benchmarking XSLT is hard, I think.

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list