Subject: Re: [xsl] A super-efficient way to compute the sum of A[i] * B[i] for i=1 to n?
From: "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 13 May 2020 03:27:45 -0000
I played a little bit using *saxon:threads*
(https://www.saxonica.com/html/documentation/extensions/attributes/threads.html),
with the hope that this could lead to faster execution. The results are
below, and they can be useful as a reference.

With a source XML document of 1M elements, each of which has a single text
node with a numeric value, on my PC (6 cores, Intel i7-8700 CPU @ 3.20GHz,
32.0GB RAM, x64-based processor, 64-bit Windows 10) I got these results:

    Saxon Ver./Edition    saxon:threads    Time in seconds    Remarks
    Saxon-EE 9.8.0.12           1               1.8
    Saxon-EE 9.8.0.12           2              15.7            ???
    Saxon-EE 9.8.0.12           4               5.7
    Saxon-EE 9.8.0.12           6               4.3
    Saxon-EE 9.8.0.12           8               3.7
    Saxon-EE 9.8.0.12          12               3.0
    Saxon-EE 9.8.0.12          16               2.9
    Saxon-EE 9.8.0.12          20               2.7
    Saxon-EE 9.8.0.12          24               2.7
    Saxon-EE 9.8.0.12          28               2.6
    Saxon-EE 9.8.0.12          32               2.6
    Saxon-PE 9.8.0.12          N/A              1.0            ???

Notice that the best result overall was reached when running Saxon-PE, in
which the *saxon:threads* attribute is ignored and no threading takes place.
Also note that the best result with threading enabled is when the number of
threads is just 1.

Here is a fragment of the source XML document, just to give you an idea of
the input:

<vectors>
  <seq1>
    <n>0.1</n>
    <n>0.2</n>
    <n>0.3</n>
    <n>0.4</n>
    <n>0.5</n>
    <n>0.6</n>
    <n>0.7</n>
    <n>0.8</n>
    <n>0.9</n>
    <n>1</n>
    <n>0.1</n>
    <n>0.2</n>
    <n>0.3</n>
    <n>0.4</n>
    <n>0.5</n>
    <n>0.6</n>
    <n>0.7</n>
    <n>0.8</n>
    <n>0.9</n>
    <n>1</n>
    <n>0.1</n>
    . . . . . . . .

The <seq1> element has 1 000 000 (1M) child elements named "n". Then it has
an immediate sibling element <seq2> whose contents were copied from <seq1>.
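Not part of the original post, but for anyone wanting to reproduce the
measurements: a test document of this shape can be generated with a small
stylesheet sketch like the one below. The element names are taken from the
fragment above; the variable name $nums is mine, and the named entry
template is invoked with Saxon's -it option (no source document needed).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch (not from the thread): generate the 2 x 1M test document -->
<xsl:stylesheet version="3.0" expand-text="yes"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- 1M values cycling 0.1, 0.2, ..., 1, as in the fragment above;
       decimal division keeps the serialized values exact (0.1, not 0.1000...4) -->
  <xsl:variable name="nums" as="element(n)*">
    <xsl:for-each select="1 to 1000000">
      <n>{((. - 1) mod 10 + 1) div 10}</n>
    </xsl:for-each>
  </xsl:variable>

  <xsl:template name="xsl:initial-template">
    <vectors>
      <seq1><xsl:sequence select="$nums"/></seq1>
      <!-- seq2 is an exact copy of seq1, as described above -->
      <seq2><xsl:sequence select="$nums"/></seq2>
    </vectors>
  </xsl:template>
</xsl:stylesheet>
```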
Here is the transformation that uses *saxon:threads*:

<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:saxon="http://saxon.sf.net/">
  <xsl:output method="text"/>

  <xsl:variable name="v1" select="/*/seq1/*/text()/number()"/>
  <xsl:variable name="v2" select="/*/seq2/*/text()/number()"/>
  <xsl:variable name="vLength" select="1000000"/>

  <xsl:template match="/">
    <xsl:variable name="vProds" as="xs:double*">
      <xsl:for-each select="1 to $vLength" saxon:threads="6">
        <xsl:sequence select="$v1[current()] * $v2[current()]"/>
      </xsl:for-each>
    </xsl:variable>
    <xsl:value-of select="sum($vProds)"/>
  </xsl:template>
</xsl:stylesheet>

Cheers,
Dimitre

On Sun, May 10, 2020 at 11:53 AM Dimitre Novatchev <dnovatchev@xxxxxxxxx> wrote:

> For reference on fast sequential algorithms for computing the dot product,
> see this:
>
> https://lemire.me/blog/2018/07/05/how-quickly-can-you-compute-the-dot-product-between-two-large-vectors/
>
> On Sat, May 9, 2020 at 10:52 AM Dimitre Novatchev <dnovatchev@xxxxxxxxx>
> wrote:
>
>> Hi Roger,
>>
>> > I need a super-efficient way to compute the sum of A[i] * B[i] for
>> > i=1 to n.
>>
>> In case you have n cores/processors, and the compiler knows how to
>> optimize functions like map() and zipWith() (for-each-pair() in XPath
>> 3.0), then the processors can each do a corresponding multiplication in
>> parallel in a single moment (computing cycle).
>>
>> Then what remains is the summing of the results of these multiplications.
>> This essentially is the map-reduce technique.
>>
>> Interestingly enough, while not all reduce operations can be
>> parallelized, in the case of sum() one can add together n numbers in
>> Log2(n) summing steps -- very similar to the DVC (DiVide and Conquer)
>> technique, but doing it in parallel -- not sequentially.
>> So, first there would be n/2 additions, then n/4 additions (of the
>> results of the first step), then n/8 additions, and so on -- for a total
>> of ceiling(Log2(n)) steps. If for each step we have a sufficient number
>> of processors (n/2 for the first step, fewer for the following steps),
>> then the summing could be performed in Log2(n) computing cycles.
>>
>> So, it seems that the whole computation could be performed in something
>> like ~ceiling(Log2(n)) + 1 computing cycles.
>>
>> Or maybe I am wrong here? :) Please, correct me.
>>
>> Cheers,
>> Dimitre
>>
>> On Sat, May 9, 2020 at 4:59 AM Costello, Roger L. costello@xxxxxxxxx
>> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>
>>> Hi Folks,
>>>
>>> I need a super-efficient way to compute the sum of A[i] * B[i] for i=1
>>> to n.
>>>
>>> For example, suppose A is this:
>>>
>>> <row>
>>>   <col>0.9</col>
>>>   <col>0.3</col>
>>> </row>
>>>
>>> and B is this:
>>>
>>> <row>
>>>   <col>0.2</col>
>>>   <col>0.8</col>
>>> </row>
>>>
>>> I want to compute:
>>>
>>> (0.9 * 0.2) + (0.3 * 0.8)
>>>
>>> Here's one way to do it:
>>>
>>> sum(for $i in 1 to count($A/col) return number($A/col[$i]) * number($B/col[$i]))
>>>
>>> I suspect that is not the most efficient approach.
>>>
>>> What is the most efficient approach? I will be doing hundreds of
>>> thousands of these computations, so I want to use the most efficient
>>> approach.
>>>
>>> /Roger
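The logarithmic-depth summation described in the reply above can be sketched
in XPath 3.0 itself. The function below adds adjacent pairs and recurses on
the half-length result; the name f:pairwise-sum is mine, not from the
thread, and it assumes the xmlns:f (user) and xmlns:xs namespaces are
declared on the enclosing stylesheet. Run sequentially it still performs
n-1 additions, but its recursion has exactly the ceiling(Log2(n)) depth
that a parallel scheduler could exploit:

```xml
<!-- Sketch: pairwise (divide-and-conquer) summation -->
<xsl:function name="f:pairwise-sum" as="xs:double">
  <xsl:param name="v" as="xs:double*"/>
  <xsl:sequence select="
    if (empty($v)) then 0
    else if (count($v) eq 1) then $v[1]
    else f:pairwise-sum(
      (: add adjacent pairs; a lone trailing element is paired with 0 :)
      for $i in 1 to (count($v) + 1) idiv 2
      return $v[2 * $i - 1] + ($v[2 * $i], 0)[1])"/>
</xsl:function>
```

Combined with the parallel-multiply step mentioned in the reply, the whole
dot product could be written without explicit indexing as
f:pairwise-sum(for-each-pair($v1, $v2, function($a, $b) { $a * $b })).
Whether either formulation beats a plain sum() in a given processor is an
empirical question, so treat this as something to benchmark, not a
recommendation.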