[xsl] Re: RE: Re: lookup-table thoughts (was Re: matching multiple times, outputting once?

Subject: [xsl] Re: RE: Re: lookup-table thoughts (was Re: matching multiple times, outputting once?
From: Dimitre Novatchev <dnovatchev@xxxxxxxxx>
Date: Thu, 8 Nov 2001 12:58:58 -0800 (PST)
Tom Myers wrote:
> As a distinctly stranger thought, to which you will probably not feel
> like responding but Dimitre might:
> 
> You note that Jeni's divconq version doesn't generate the same output.
> A divide-and-conquer equivalent is possible in languages with higher-order
> functions; if we think of consX(a) = cons("X",a) as a function, then
> tailRec(N) = divConq(N) = consX(consX(consX(... ([]) ...))) and we get
>    divConq(N) = dC(N)([])
> where
>      dC(0)(a) = a 
>      dC(1)(a) = consX(a) =  cons("X",a)  and so on,
> in other words
>      dC(0) = IdentityFunction
>      dC(2*N) = compose(dC(N),dC(N))
>      dC(2*N + 1) = compose(dC(N),compose(dC(N),consX));
> and divConq works just like foldL and foldR (because function composition
> is associative). I admit I'm thinking back to Backus' FP systems, wherein
> function composition was one of the primitives (but application was not.)
> I think that some of the massively-parallel proposals for implementations
> of FP (or FFP) might in fact construct the N-deep tree in logarithmic time,
> using linearly many processors, just about as readily as they'd construct
> an N-long list the same way. But I'm not sure, not sure at all.

Hi Tom,

The above is almost correct, with the exception that instead of

>      dC(2*N) = compose(dC(N),dC(N))

the definition has to be:

      dC(2*N) = combine(dC(N),dC(N))

where combine() is problem-specific and sometimes may be extremely complex (e.g. try
to dvc-replace in a given string every occurence of a substring with a given
replacement).

In this concrete case combine() is actually mergeTree, which requires one of the
trees to be copied to the leaf text node, at which place the second tree is to be
copied (inserted).

In case of merge-sort() the combine() function is again merge(), but this time the
merge operation is upon two sorted lists.

Robert Sedgewick discusses DVC in his book "Algorithms in C++ part 1-4". To quote
him, there are three places where additional processing might be necessary -- before
processing the two parts (halves), after processing the first part, but before
processing the second, and after processing both of them. There are also variants of
the DVC pattern, in which the input structure is divided into more than two parts,
which are then independently processed.

What I completely agree with is the extremely straightforward way, in which a DVC
algorithm can be executed on a multi-processor platform.

Sometimes ago I asked the group whether it would not be appropriate to recognise
this extreme case of capability to be processed in parallel by introducing a new
XSLT instruction (e.g. xsl:parallel) that would give the XSLT processor a hint to
try to multi-process the children of xsl:parallel. Compare this to the current
version of the language, where any two content-producing siblings can in theory be
parallelised, but which is never done, partly because there's no clear indication
which of many possible alternatives is worth parallelizing.

I still think an explicit hint is a necessary and a very useful feature.

Cheers,
Dimitre Novatchev.




__________________________________________________
Do You Yahoo!?
Find a job, post your resume.
http://careers.yahoo.com

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


Current Thread