RE: [xsl] Processing Efficiently

Subject: RE: [xsl] Processing Efficiently
From: <tomas.vanek@xxxxxxxxxxxxx>
Date: Thu, 9 Jun 2005 09:51:15 +0200
For aggressive size reduction you can also set indent to "no". The
result is only slightly smaller, but this optimization comes "for free":
<xsl:output method="xml" indent="no" encoding="utf-8"/>
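
Combined with the key idea quoted further down, the whole stylesheet might look roughly like the sketch below. It is untested against the real data and rests on a few assumptions: the master file is the reduced one at 'summarydata/summaryreduced.xml' (the path from Karl's stylesheet), a row is copied whenever its invoice number is found in the master, and the variable names `summary`, `row`, `invoice` and `hit` are made up for the illustration.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no" encoding="utf-8"/>

  <!-- Index every R element by the value of its invoice column -->
  <xsl:key name="summaryInvoice" match="R" use="C[@c='I']"/>

  <!-- Assumed location of the reduced master document -->
  <xsl:variable name="summary"
    select="document('summarydata/summaryreduced.xml')"/>

  <xsl:template match="/">
    <result>
      <xsl:apply-templates select="xls/xlsRow"/>
    </result>
  </xsl:template>

  <xsl:template match="xlsRow">
    <xsl:variable name="row" select="."/>
    <xsl:variable name="invoice" select="xlsColumn[@column='Invoice_#']"/>
    <!-- In XSLT 1.0 key() searches the document of the context node,
         so make the master document the context before the lookup -->
    <xsl:for-each select="$summary">
      <xsl:variable name="hit" select="key('summaryInvoice', $invoice)"/>
      <!-- Rows whose invoice is not in the master produce no output -->
      <xsl:if test="$hit">
        <xsl:variable name="current_balance" select="$hit[1]/C[@c='B']"/>
        <!-- Switch back to the source row to copy it -->
        <xsl:for-each select="$row">
          <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:attribute name="current_balance">
              <xsl:value-of select="$current_balance"/>
            </xsl:attribute>
            <xsl:attribute name="diff_balance">
              <xsl:value-of
                select="$current_balance - xlsColumn[@column='Balance']"/>
            </xsl:attribute>
            <xsl:copy-of select="xlsColumn"/>
          </xsl:copy>
        </xsl:for-each>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The point of the key is that the processor can build the invoice index once per document instead of scanning all of $s for every row, which is where the original predicate spends its time.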

tomi


-----Original Message-----
From: Karl Stubsjoen [mailto:kstubs@xxxxxxxxx]
Sent: Wednesday, June 08, 2005 9:34 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] Processing Efficiently

I have already had to reduce the size of the XML quite a bit by sheer
element renaming and elimination of unused elements.  $s used to be 25MB,
but by eliminating unused elements (really needed 2) and by renaming
"xlsRow" to "R" and "xlsColumn" to "C" and by renaming the attribute
"column" to "c" I was able to reduce the size by 1/3.
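
A reduction like this could itself be done with a small stylesheet. The following is only a sketch: it assumes the unreduced master has the same xls/xlsRow/xlsColumn shape as the individual files, and that the two columns worth keeping are named 'Invoice_#' and 'Balance' there (the real column names in the master may differ).

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no" encoding="utf-8"/>

  <!-- Assumption: the unreduced master is shaped xls/xlsRow/xlsColumn -->
  <xsl:template match="/xls">
    <xls>
      <xsl:apply-templates select="xlsRow"/>
    </xls>
  </xsl:template>

  <!-- xlsRow becomes R; the row's own attributes are carried over -->
  <xsl:template match="xlsRow">
    <R>
      <xsl:copy-of select="@*"/>
      <!-- Keep only the two columns the lookup needs, drop the rest -->
      <xsl:apply-templates
        select="xlsColumn[@column='Invoice_#' or @column='Balance']"/>
    </R>
  </xsl:template>

  <!-- xlsColumn becomes C, and the column attribute becomes a short c code -->
  <xsl:template match="xlsColumn[@column='Invoice_#']">
    <C c="I"><xsl:value-of select="."/></C>
  </xsl:template>

  <xsl:template match="xlsColumn[@column='Balance']">
    <C c="B"><xsl:value-of select="."/></C>
  </xsl:template>
</xsl:stylesheet>
```

Running this once whenever the master changes keeps the renaming out of the per-file validation step.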

The thing is this:  $s is my master doc, contains the lookup records.
I have many individual docs that will be compared against $s, and these
files range in size from 20KB to 5MB (approx.).  I don't mind a different
approach (for example reducing $s source).  I'm just curious how others
would approach something like this.  How would you arrange such
documentation for this sort of processing?

The scenario is:
Large data file for lookups / validation (10 to 20MB)
Individual data files (up to 5MB)
As individual data files refresh, identify those items that exist in the
master list.

Again, this is a topic of "Performance" and "Best Practice" for
performing frequent validations of documents this size.



On 6/8/05, tomas.vanek@xxxxxxxxxxxxx <tomas.vanek@xxxxxxxxxxxxx> wrote:
> using keys could help to speed up the transformation (here is just the
> idea):
>
> ...
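>        <xsl:key name="summaryInvoice"
> match="R" use="C[@c='I']"/>
>        <!-- XSLT 1.0: key() only searches the document that holds the
>             context node, so change context to document('summary.xml')
>             (e.g. with xsl:for-each) before calling key() -->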
>
> ...
>        <xsl:template match="xlsRow">
>                <xsl:variable name="current_invoice"
> select="xlsColumn[@column='Invoice_#']"/>
>                <xsl:variable name="current_balance"
> select="key('summaryInvoice', $current_invoice)/C[@c='B']"/>
>                <xsl:variable name="diff_balance"
> select="$current_balance - xlsColumn[@column='Balance']"/>
> ...
>
> tomi
>
>
> -----Original Message-----
> From: Karl Stubsjoen [mailto:kstubs@xxxxxxxxx]
> Sent: Wednesday, June 08, 2005 10:08 AM
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Processing Efficiently
>
> Hello,
> I would like to optimize the following:
>
> Where $s is a 5MB document and the source document is approx. 2-5MB.
> The goal:  copy everything in the source that exists in $s.
> Catch:  need to know the value of the balance in $s.
>
> $s looks like:
> <xls>
> <R row="2">
>  <C c="I">2AA9379</C><!-- match value "invoice" -->
>  <C c="B">-127.5</C><!-- this is the balance -->
> </R>
> ...
> </xls>
>
> <xsl:stylesheet version="1.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> <xsl:output method="xml" indent="yes" encoding="utf-8"/>
>
> <xsl:variable name="s"
> select="document('summarydata/summaryreduced.xml')//xls/R"/>
>
> <xsl:template match="/">
> <result>
> <xsl:apply-templates
> select="xls/xlsRow[xlsColumn[@column='Invoice_#']=$s/C[@c='I'] |
> xlsColumn[@column='Balance'][not(.= $s/C[@c='B'])]]"/>
> </result>
> </xsl:template>
>
> <xsl:template match="xlsRow">
> <xsl:variable name="current_invoice"
>   select="xlsColumn[@column='Invoice_#']"/>
> <xsl:variable name="current_balance"
>   select="$s[C[@c='I']=$current_invoice]/C[@c='B']"/>
> <xsl:variable name="diff_balance"
>   select="$current_balance - xlsColumn[@column='Balance']"/>
> <xsl:copy>
>  <xsl:apply-templates select="@*"/>
>  <xsl:attribute name="current_balance"><xsl:value-of
>    select="$current_balance"/></xsl:attribute>
>  <xsl:attribute name="diff_balance"><xsl:value-of
>    select="$diff_balance"/></xsl:attribute>
>  <xsl:apply-templates select="xlsColumn"/>
> </xsl:copy>
> </xsl:template>
>
> <xsl:template match="@*">
> <xsl:copy>
>  <xsl:apply-templates select="@*"/>
> </xsl:copy>
> </xsl:template>
>
> <xsl:template match="xlsColumn">
> <xsl:copy-of select="."/>
> </xsl:template>
>
> </xsl:stylesheet>
>
>
>
> This message is for the designated recipient only and may contain
privileged, proprietary, or otherwise private information.  If you have
received it in error, please notify the sender immediately and delete
the original.  Any other use of the email by you is prohibited.


