[xsl] Re: performance advice sought

Subject: [xsl] Re: performance advice sought
From: "Dimitre Novatchev" <dnovatchev@xxxxxxxxx>
Date: Sat, 13 Dec 2003 15:55:17 +0100
It seems to me that the solutions found in the thread "Unbounded element
grouping/concatenation" fully apply to your case.
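
Independently of that thread, the two-step idea raised in the quoted message below (a "structured grep" followed by the report) can be sketched in plain XSLT 1.0. This is only a sketch under assumptions, not the solution from the thread referred to above: it assumes the element names of the quoted sample (LogRecord, @severity) and tests the string value "." of each record, where the original stylesheet uses text(). The first pass copies the document but drops every LogRecord that is neither a failure nor a "Beginning ... scripts..." marker:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Discard the whitespace-only text nodes left between elements. -->
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy everything that is not overridden below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop each LogRecord that is neither a failure record
       (FATAL, MFATAL or "Exception") nor a test marker record.
       The predicate gives this template a higher default priority
       than the identity template, so it wins for those records. -->
  <xsl:template match="LogRecord[not(@severity='MFATAL'
                         or @severity='FATAL'
                         or contains(., 'Exception')
                         or contains(substring-after(., 'Beginning'),
                                     'scripts...'))]"/>

</xsl:stylesheet>
```

Because every marker and every failure record survives in document order, the existing report stylesheet should give the same counts when run unchanged on the filtered output, and its preceding-sibling scan then walks a few hundred records instead of roughly 250k. In ant this would presumably be two <xslt/> tasks run one after the other.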

=====
Cheers,

Dimitre Novatchev.
http://fxsl.sourceforge.net/ -- the home of FXSL



"Mike Castle" <dalgoda@xxxxxxxxxxxxx> wrote in message
news:20031211233648.GJ630@xxxxxxxxxxxxxxxxxxxxx
>
> XSL newbie.
>
> It's quite possible that this particular task is better suited to something
> besides XSL, but since I'm trying to plug this into ant as part of our
> build & test system, and ant comes with a handy dandy <xslt/> task, I
> figured it was a good place to start.
>
> The solution I have come up with works.  It's logically correct.  However,
> it's damned slow.
>
> Anyway, the task:
>
> Our test system generates four files for each test run: test log
> proper, test engine log, server log, and the server wrapper log (see
> http://wrapper.sf.net).  Now, the first 3 files all use the same route
> to write out information using an XML-like <LogRecord/>, while the latter
> is a plain text file.
>
> What I need to do is analyze an aggregation of those files for each run.
>
> * If a test engine log or server log has a <LogRecord/> with a severity
> level of FATAL or MFATAL, or if the string "Exception" shows up in the
> text of any <LogRecord/> or the server wrapper file, the whole run is
> a bust, and we count all of the associated tests as failures.
>
> *  For all of the other runs, we look at the <LogRecord/>s in the test
> log proper, and if any of those lines has a severity level of FATAL or
> MFATAL or the string "Exception" in the text, then a particular test is
> determined to have failed. (There may be multiple FATAL|MFATAL|Exceptions
> for a particular test.)
>
> * Sum up all of the tests run, tests failed due to system errors, and
> individual test failures.  For extra credit, I've been adding number
> of known tests to pass just to make sure all of the numbers add up.
> I'm paranoid like that.
>
> * For each test failing with a system error, print out what test suite
> failed.
>
> * For each individual test that failed, print out what specific test
> failed.
>
> So really it's less of a transformation and more of a summary report.
> Which is why I'm not certain that XSLT is the right tool for this.
>
> What I do is create a big XML wrapper using lots of entities to wrap all
> of the pseudo XML fragments (and the plain text file comes in as a big
> CDATA section as a <LogRecord/> element).
>
> When it's all said and done, the XML looks like this:
>
> <?xml version="1.0"?>
> <Log>
>  <Test name="testsuite1">
>   <TestLog>
>    <LogRecord severity="STATUS">Beginning test1 package Test Suite 1 scripts...</LogRecord>
>    ...
>    <LogRecord severity="STATUS">good stuff is happening here</LogRecord>
>   </TestLog>
>   <TEngine>
>    <LogRecord severity="STATUS">Test Engine started...</LogRecord>
>    ...
>    <LogRecord severity="STATUS">Test Engine stopped.</LogRecord>
>   </TEngine>
>   <Server>
>    <LogRecord severity="STATUS">Server started...</LogRecord>
>    ...
>    <LogRecord severity="STATUS">Server shut down.</LogRecord>
>   </Server>
>   <Wrapper>
>    <LogRecord><![CDATA[several lines of text here]]></LogRecord>
>   </Wrapper>
>  </Test>
>  ...
> </Log>
>
> Now, since each of these logs is really essentially a flat file, I have
> to mechanically determine logical break points in the logs.  The break
> points are determined by a <LogRecord/> where the text starts with the
> string "Beginning" and has the string "scripts..." in it (Yeah, I know,
> it's ugly).
>
> So the style sheet I'm currently using is:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE xsl:stylesheet [
> <!ENTITY recordpredicate "@severity='MFATAL' or @severity='FATAL' or
>   contains(text(),'Exception')">
> <!ENTITY recordfailure "LogRecord[&recordpredicate;]">
> <!ENTITY systemcheck "child::*[self::TEngine or self::Server or
>   self::Wrapper]">
> <!ENTITY systemfailure "&systemcheck;/&recordfailure;">
> <!ENTITY idtestrecord
>   "[contains(substring-after(text(),'Beginning'),'scripts...')]">
> <!ENTITY testscriptsid "TestLog/LogRecord&idtestrecord;">
> <!ENTITY testname
>   "normalize-space(substring-before(substring-after(.,'Beginning'),'package'))">
> ]>
> <xsl:stylesheet version="1.0"
>   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output method="html" encoding="UTF-8" indent="yes"/>
>   <xsl:template match="/">
>     <xsl:text>Test results:</xsl:text>
>         <xsl:variable name="all-tests"
>           select="/Log/Test/&testscriptsid;"/>
>         <xsl:variable name="system-failed-tests"
>           select="/Log/Test/&systemfailure;/../../&testscriptsid;"/>
>         <xsl:variable name="not-system-failed-tests"
>           select="$all-tests[count(.|$system-failed-tests)
>                   != count($system-failed-tests)]"/>
>         <xsl:variable name="individual-failed-tests"
>           select="$not-system-failed-tests/../&recordfailure;
>                   /preceding-sibling::*&idtestrecord;[position()=1]"/>
>         <xsl:variable name="not-individual-failed-tests"
>           select="$not-system-failed-tests[count(.|$individual-failed-tests)
>                   != count($individual-failed-tests)]"/>
>
>         There were <xsl:number value="count($all-tests)"/> tests ran.
>         There were <xsl:number value="count($system-failed-tests)"/> tests that failed at a system level failure.
>         There were <xsl:number value="count($individual-failed-tests)"/> individual tests that failed.
>         There were <xsl:number value="count($not-individual-failed-tests)"/> tests known to pass.
>         <xsl:text>
> System level failures:
> </xsl:text>
>         <xsl:for-each select="$system-failed-tests/../../@name">
>           <xsl:value-of select="."/>
>           <xsl:text>
> </xsl:text>
>         </xsl:for-each>
>         <xsl:text>
>  Which caused the following tests to fail:
> </xsl:text>
>         <xsl:for-each select="$system-failed-tests">
>           <xsl:value-of select="../../@name"/> : <xsl:value-of select="&testname;"/>
>           <xsl:text>
> </xsl:text>
>         </xsl:for-each>
>         <xsl:text>
> Individual failures:
> </xsl:text>
>         <xsl:for-each select="$individual-failed-tests">
>           <xsl:value-of select="../../@name"/> : <xsl:value-of select="&testname;"/>
>           <xsl:text>
> </xsl:text>
>         </xsl:for-each>
>         <xsl:text>
> Passed tests:
> </xsl:text>
>         <xsl:for-each select="$not-individual-failed-tests">
>           <xsl:value-of select="../../@name"/> : <xsl:value-of select="&testname;"/>
>           <xsl:text>
> </xsl:text>
>         </xsl:for-each>
>   </xsl:template>
> </xsl:stylesheet>
>
> Ok, actually for each test with a system failure, I'm listing the
> individual tests as well.  The whole "Beginning ... package ... scripts..."
> thing is quite fragile, I know, and we wanted to make sure we had
> everything typed in correctly in the test scripts.
>
> On the smaller test runs, this style sheet performs quite well.  Takes
> about 30 seconds to generate the report.  Everyone is happy.
>
> However, on a larger set of runs, it takes over 1 hour to generate the
> report.  Everyone is unhappy.
>
> Well, tracking down the culprit seems to be this particular query:
>
> <xsl:variable name="individual-failed-tests"
>   select="$not-system-failed-tests/../&recordfailure;
>           /preceding-sibling::*&idtestrecord;[position()=1]"/>
>
> Which, in hindsight, makes sense.  It's pretty much O(MxN), as for
> every FATAL|MFATAL|Exception it finds, it then scans backwards looking for
> the magic strings.  And in the one particular file that's giving me issues,
> there are 251789 LogRecords, 83 failing records and 102 tests.  So out of
> 250k records, I'm really only interested in about 200!
>
> So, what can I do to speed up that process?
>
> One approach I considered was a two-step process:  the first step would pull
> out the records I'm interested in (essentially acting like a structured
> grep) and the second would generate the summary against that.
>
> I wish I knew how to make preceding-sibling work against a smaller subset
> rather than the whole tree.
>
> Any advice on how to make this a bit more efficient?
>
> Thanks!
> mrc
> -- 
>      Mike Castle      dalgoda@xxxxxxxxxxxxx      www.netcom.com/~dalgoda/
>     We are all of us living in the shadow of Manhattan.  -- Watchmen
> fatal ("You are in a maze of twisty compiler features, all different"); -- gcc
>
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
>
>





