Subject: RE: Re[2]: [xsl] correct use of keys?
From: "Perry Molendijk" <perry@xxxxxxxxxxxxxx>
Date: Thu, 13 Sep 2001 21:22:34 +0800

I made a similar performance discovery recently. The XML is about 1.5M and
looks something like this:

  <page>
    <document category="catOne" url="someUrl" ... more attributes />
    <document category="catOne" url="someUrl" .../>
    <document category="catTwo" url="someUrl" .../>
    <document category="catTwo" url="someUrl" .../>
    <document category="catThree" url="someUrl" .../>
    <document category="catThree" url="someUrl" .../>
    ..........
  </page>

There are approx. 3400 document nodes in the file. The applied XSL writes
out an HTML table. Then marketing wanted a separator between the different
categories, so I added this test to the template:

  <xsl:if test="not(@category = preceding::document/@category)">
    <xsl:value-of select="@category"/>
  </xsl:if>

Up until then performance differences between Xalan 2, MSXML 3 and Saxon 6.4
were minimal, but this rule brought both the MS and Xalan processors to a
halt: just over 2 minutes each to do the job, while Saxon had the job done
in 12 seconds.

To make sure I wasn't using an underpowered machine I tried it again at home
on a dual processor with 512M RAM, resulting in faster times but the same
differences.

Perry

Inflexions (WA) Pty Ltd
PO Box 57
Inglewood WA 6052
Australia
t: +61 08 9371 2140
m: 0401 677 453
e: perry@xxxxxxxxxxxxxx
____________________________________________________________________________

Microsoft XML Parser 4.0 July 2001 Technology Preview
http://msdn.microsoft.com/downloads/default.asp?url=/downloads/sample.asp?url=/msdn-files/027/001/677/msdncompositedoc.xml

Bullet point 4: 'Substantially faster XSLT engine. Our tests show about x4,
and for some scenarios x8, acceleration (except the known serious
performance bug for xsl:keys).'

> -----Original Message-----
> From: Kevin Burges [mailto:xmldude@xxxxxxxxxxxxxxxx]
> Sent: 13 September 2001 12:34
> To: Michael Kay
> Subject: Re[2]: [xsl] correct use of keys?
>
> Michael + Thomas:
>
> >> I have a stylesheet which, when run on a 10MB doc turns it into a 30MB
> >> doc in ~600 seconds.
> >> Even for such a large doc, this seems like a long time given my machine
> >> is a 1.33GHz Athlon, 256MB.
>
> MK> It seems a long time to me, too. Which processor are you using? Are you
> MK> getting thrashing due to shortage of memory?
>
> I'm using the latest MSXML 4 (July?). Toward the end of the
> transformation there is a small amount of swapping going on, but
> certainly not what I'd call thrashing. For the majority of the time
> there is virtually no drive access at all.
>
> MK> better off doing a preprocess of the document in which elements whose name
> MK> contains 'field' are given an extra attribute, field="yes", and then use
> MK> this attribute in the second phase. In any case, I suspect that you are not
> MK> interested in all nodes whose name contains 'field', but only in elements
> MK> whose name contains 'field'. Replacing "node()" by "*" will speed things up
> MK> a bit.
>
> I tried this in a couple of stages:
> Changing "node()" to "*" made no difference
> Using "field = 'yes'" instead of "contains(....)" made no difference
>
> I also tried using specifically
> "*[(name() = 'field') or (name() = 'datefield') or (name() = 'computedfield')]"
> This also made no difference.
>
> In fact, when I used the "field = 'yes'" method, the Win2k task
> manager said my program was using up to 175MB memory, where
> previously I had not seen it above 105MB.
>
> TP> Then I suggest that you temporarily change the stylesheet so it
> TP> outputs only one node where it has to use one of the keys, and see
> TP> how long it takes. This will check whether compiling the
> TP> stylesheet and building the key indices is taking an inordinately
> TP> long time.
>
> I tried this, and the index was generated instantly. Presumably
> because the document that is being indexed is fairly small.
>
> TP> Another thing is whether you are testing out the transformation in
> TP> an environment where the result is displayed in a browser (like
> TP> XML Cooktop or XML Spy).
>
> No, I'm transforming programmatically in VB so that's not an issue.
>
> One thing I did notice is that if the keys are empty (I had made a
> mistake), the transform only takes 60 seconds as opposed to 600.
>
> This suggests surely that there must either be something wrong with
> the way I am using the keys (so they are being ineffectual), or MSXML
> has a very poor implementation of keys. Any other suggestions???
>
> --
> groovy baby,
> Kevin mailto:xmldude@xxxxxxxxxxxxxxxx
>
> ++++++++++++ Cool music - http://burieddreams.com/marshan
> ++++++ Attitude Webzine - http://burieddreams.com/attitude
>
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
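
The category-separator test in the reply at the top of this message, not(@category = preceding::document/@category), can force a processor to rescan all preceding document elements for each of the roughly 3400 nodes. A minimal sketch of one alternative, assuming XSLT 1.0 and the page/document structure from that sample, uses xsl:key to identify the first document of each category instead. The key name doc-by-category and the table markup are hypothetical, not taken from the original stylesheet. Whether this is actually faster on a given processor would need measuring, particularly given the xsl:key performance bug noted for the MSXML 4 preview.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Hypothetical key: index every document element by its category. -->
  <xsl:key name="doc-by-category" match="document" use="@category"/>

  <xsl:template match="page">
    <table>
      <xsl:apply-templates select="document"/>
    </table>
  </xsl:template>

  <xsl:template match="document">
    <!-- The union of the current node and the first key hit for its
         category has exactly one member only when they are the same
         node, i.e. this is the first document of that category. -->
    <xsl:if test="count(. | key('doc-by-category', @category)[1]) = 1">
      <tr><th><xsl:value-of select="@category"/></th></tr>
    </xsl:if>
    <!-- Assumed row layout; the original table markup is not shown. -->
    <tr><td><xsl:value-of select="@url"/></td></tr>
  </xsl:template>
</xsl:stylesheet>
```

If the documents are already grouped by category, as they are in the sample, comparing against preceding-sibling::document[1]/@category is another option that avoids both the long scan and the use of keys.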