RE: [xsl] RE: String conversion problem when string is large

Subject: RE: [xsl] RE: String conversion problem when string is large
From: "Bulgrien, Kevin" <Kevin.Bulgrien@xxxxxxxxxxxx>
Date: Thu, 22 Mar 2012 09:35:43 -0500
> -----Original Message-----
> From: Scott Trenda [mailto:Scott.Trenda@xxxxxxxx]
> Sent: Wednesday, March 21, 2012 6:48 PM
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] RE: String conversion problem when string is large
>
> Kevin,
>
> The problem with your chunked template is that it isn't
> really bisection. You don't get true O(log n) performance -
> you only get O(n / x) performance, x being dependent upon
> your chunk size. Your template would need to be recursive to
> get the real benefit of bisection. Here it is:
>
> <xsl:template name="HexToDec">
>   <xsl:param name="HexData" />
>   <xsl:param name="Hex" select="'0123456789ABCDEF'" />
>   <xsl:choose>
>     <xsl:when test="contains($HexData, ',')">
>       <xsl:variable name="midpoint"
> select="floor(string-length($HexData) div 2) - 1" />
>       <xsl:variable name="half1" select="substring($HexData,
> 1, $midpoint)" />
>       <xsl:variable name="half2" select="substring($HexData,
> $midpoint + 1)" />
>       <xsl:call-template name="HexToDec">
>         <xsl:with-param name="HexData" select="concat($half1,
> substring-before($half2, ','))" />
>         <xsl:with-param name="Hex" select="$Hex" />
>       </xsl:call-template>
>       <xsl:text>,</xsl:text>
>       <xsl:call-template name="HexToDec">
>         <xsl:with-param name="HexData"
> select="substring-after($half2, ',')" />
>         <xsl:with-param name="Hex" select="$Hex" />
>       </xsl:call-template>
>     </xsl:when>
>     <xsl:when test="starts-with($HexData, '0x')">
>       <xsl:value-of
> select="string-length(substring-before($Hex,
> substring($HexData, 3, 1))) * 16 +
> string-length(substring-before($Hex, substring($HexData, 4, 1)))" />
>     </xsl:when>
>     <xsl:otherwise>
>       <xsl:value-of select="$HexData" />
>     </xsl:otherwise>
>   </xsl:choose>
> </xsl:template>
>
> Try it out, please - I'm actually very curious to see how the
> performance and memory usage stand up against the other
> templates in the various processors. You should be able to
> use it with any processor now, too.
>
> ~ Scott

I liked your idea and did look at your code.  I confess that when I looked at
your binary splitter, I decided to take the low road and get a working solution
quickly rather than an elegant one.  When I started to rewrite it and model the
results on paper with a small data set, I had a hunch that the edge cases could
take time.  I'd been held up by the crashes on the Windows systems for too long
and needed to move on (read: management breathing down my neck), and I didn't
have a Java engine on those systems (won't go into why, but maybe that will
change eventually now).  The chunker I wrote in minutes, and it worked out of
the box.  That said, I'm tempted to follow up on this idea since it is tighter,
and since I've just realized that with a very minor change, either your version
or mine could turn "Hex" into "AnyBase" without much more effort (not that I
need such a beastie).  This conversion isn't done frequently.  I appreciate
your follow-up, so...

With Saxonica:

Execution time: 1.152s (1152ms)
Memory used: 19613968

I'm not sure what's going on with Saxonica's memory reporting.  Repeat runs
vary between 15895712 and 19613968.  Of course, it strikes me as interesting
that I'm using 15-20MB (or multiple GB in the beginning) to convert a 1MB
file.

Xsltproc: 2.23s
Sablotron: 7.21s

Of course, it also is a bit broken.

awk 'BEGIN { FS=","; } /AppCompatCache/ { print $9 " vs " NF-9 }' idiffout.saxonhe
53392239 vs 53391
53392239 vs 53391
-----^
It drops commas.  It should have built a file of 260537, but instead it built
one of 260512.  It has trouble at some midpoints.  For example, see how it
dropped a comma:

-11,32768,2147483650,0,SYSTEM\ControlSet002\Services\lanmanserver\parameters,
0,Guid,3,16,26,35,206,210,171,128,91,74,159,229,42,166,232,8,248,218
+11,32768,2147483650,0,SYSTEM\ControlSet002\Services\lanmanserver\parameters,
0,Guid,3,1626,35,206,210,171,128,91,74,159,229,42,166,232,8,248,218

I'll certainly hang on to the idea here, though I probably will not iron out
the bugs immediately.

This just goes to show what I said before... what an adventure.  I love
learning.

Kevin Bulgrien
