Subject: RE: [xsl] Processing Efficiently
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Fri, 10 Jun 2005 14:25:02 +0100

I haven't looked at this in detail, but I think you can almost certainly
solve your performance problems using keys. Look for constructs like
//thing[property=value] and replace them with calls on the key() function.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Karl Stubsjoen [mailto:kstubs@xxxxxxxxx]
> Sent: 08 June 2005 20:34
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [xsl] Processing Efficiently
>
> I already had to reduce the size of the XML quite a bit by sheer
> element renaming and elimination of unused elements. $s used to be 25MB,
> but by eliminating unused elements (really needed 2) and by renaming
> "xlsRow" to "R" and "xlsColumn" to "C" and by renaming the attribute
> "column" to "c" I was able to reduce the size by 1/3.
>
> The thing is this: $s is my master doc and contains the lookup records.
> I have many individual docs that will be compared against $s, and these
> files range in size from 20KB to 5MB (approx.). I don't mind a
> different approach (for example reducing $s source). I'm just curious
> how others would approach something like this. How would you arrange
> such documentation for this sort of processing?
>
> The scenario is:
> Large data file for lookups / validation (10 to 20MB)
> Individual data files (up to 5MB)
> As individual data files refresh, identify those items that exist in
> the master list. Again, this is a topic of "Performance" and "Best
> Practice" for performing frequent validations of documents this size.
>
>
> On 6/8/05, tomas.vanek@xxxxxxxxxxxxx <tomas.vanek@xxxxxxxxxxxxx> wrote:
> > using keys could help to speed up the transformation (here is just the
> > idea):
> >
> > ...
> > <xsl:key name="summaryInvoice"
> >   use="document('summary.xml')//xls/R" match="C[@c='I']"/>
> >
> > ...
> > <xsl:template match="xlsRow">
> >   <xsl:variable name="current_invoice"
> >     select="xlsColumn[@column='Invoice_#']"/>
> >   <xsl:variable name="current_balance"
> >     select="key('summaryInvoice', $current_invoice)/C[@c='B']"/>
> >   <xsl:variable name="diff_balance"
> >     select="$current_balance - xlsColumn[@column='Balance']"/>
> > ...
> >
> > tomi
> >
> >
> > -----Original Message-----
> > From: Karl Stubsjoen [mailto:kstubs@xxxxxxxxx]
> > Sent: Wednesday, June 08, 2005 10:08 AM
> > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > Subject: [xsl] Processing Efficiently
> >
> > Hello,
> > I would like to optimize the following:
> >
> > Where $s is a 5MB document and the source document is approx. 2-5MB.
> > The goal: copy everything in the source that exists in $s.
> > Catch: need to know the value of the balance in $s.
> >
> > $s looks like:
> > <xls>
> >   <R row="2">
> >     <C c="I">2AA9379</C><!-- match value "invoice" -->
> >     <C c="B">-127.5</C><!-- this is the balance -->
> >   </R>
> >   ...
> > </xls>
> >
> > <xsl:stylesheet version="1.0"
> >   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> >   <xsl:output method="xml" indent="yes" encoding="utf-8"/>
> >
> >   <xsl:variable name="s"
> >     select="document('summarydata/summaryreduced.xml')//xls/R"/>
> >
> >   <xsl:template match="/">
> >     <result>
> >       <xsl:apply-templates
> >         select="xls/xlsRow[xlsColumn[@column='Invoice_#']=$s/C[@c='I'] |
> >           xlsColumn[@column='Balance'][not(.= $s/C[@c='B'])]]"/>
> >     </result>
> >   </xsl:template>
> >
> >   <xsl:template match="xlsRow">
> >     <xsl:variable name="current_invoice"
> >       select="xlsColumn[@column='Invoice_#']"/>
> >     <xsl:variable name="current_balance"
> >       select="$s[C[@c='I']=$current_invoice]/C[@c='B']"/>
> >     <xsl:variable name="diff_balance" select="$current_balance -
> >       xlsColumn[@column='Balance']"/>
> >     <xsl:copy>
> >       <xsl:apply-templates select="@*"/>
> >       <xsl:attribute name="current_balance"><xsl:value-of
> >         select="$current_balance"/></xsl:attribute>
> >       <xsl:attribute name="diff_balance"><xsl:value-of
> >         select="$diff_balance"/></xsl:attribute>
> >       <xsl:apply-templates select="xlsColumn"/>
> >     </xsl:copy>
> >   </xsl:template>
> >
> >   <xsl:template match="@*">
> >     <xsl:copy>
> >       <xsl:apply-templates select="@*"/>
> >     </xsl:copy>
> >   </xsl:template>
> >
> >   <xsl:template match="xlsColumn">
> >     <xsl:copy-of select="."/>
> >   </xsl:template>
> >
> > </xsl:stylesheet>
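
For readers following the thread, here is a minimal sketch of the key-based lookup suggested at the top of this message. It assumes XSLT 1.0, the R/C summary layout, and the summary file path shown in the original post. The key name summaryInvoice follows tomi's suggestion; the variable names summary and invoice are illustrative only. It also simplifies the original filter: it copies every source row whose invoice number is found in the summary with a non-empty balance, and leaves the balance comparison from the original select expression out.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>

  <!-- Index every summary row R by its invoice number (the C with c='I'). -->
  <xsl:key name="summaryInvoice" match="R" use="C[@c='I']"/>

  <!-- Master lookup document; path taken from the original post. -->
  <xsl:variable name="summary"
      select="document('summarydata/summaryreduced.xml')"/>

  <xsl:template match="/">
    <result>
      <xsl:apply-templates select="xls/xlsRow"/>
    </result>
  </xsl:template>

  <xsl:template match="xlsRow">
    <xsl:variable name="invoice"
        select="string(xlsColumn[@column='Invoice_#'])"/>

    <!-- In XSLT 1.0, key() only searches the document that contains the
         context node, so switch context to the summary document first. -->
    <xsl:variable name="current_balance">
      <xsl:for-each select="$summary">
        <xsl:value-of
            select="key('summaryInvoice', $invoice)[1]/C[@c='B']"/>
      </xsl:for-each>
    </xsl:variable>

    <!-- Emit the row only if a balance was found in the summary. -->
    <xsl:if test="string($current_balance)">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:attribute name="current_balance">
          <xsl:value-of select="$current_balance"/>
        </xsl:attribute>
        <xsl:attribute name="diff_balance">
          <xsl:value-of
              select="$current_balance - xlsColumn[@column='Balance']"/>
        </xsl:attribute>
        <xsl:copy-of select="xlsColumn"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The point of the rewrite is that each source row triggers one key lookup instead of a scan over every R in $s. A processor will typically build the index the first time key() is called on the summary document, though how it does so is implementation-dependent.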