[xsl] Re: [saxon] Questions about the `saxon:threads` extension attribute

Subject: [xsl] Re: [saxon] Questions about the `saxon:threads` extension attribute
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 28 Dec 2019 20:58:24 -0000
> Is it too much to ask (for the benefit of the whole XSLT developers' community
> and current + future Saxon users) for implementing the desired behaviour in a
> future release of Saxon?  :)
>
> 1. Honour fn:unordered() -- this will make it possible to do WaitAny(). The
> current behavior is equivalent to WaitAll() even when there is an exception in
> one or more threads.
>
> 2. If one of the threads crashes, kill all remaining threads without waiting
> -- this will make it possible to implement cancellation on an external event.
> There are obvious benefits in freeing one or more cores that are busy with
> threads whose results, regardless of how long they take to produce, will be
> ignored.


Sadly, I suspect you could count the number of users who would take advantage
of such options on the fingers of one hand. However, Saxon's power-users are
an important part of the community and we do value them.

(1) fn:unordered doesn't relate very well to xsl:for-each because functions
can't be applied directly to instructions; it would make more sense, I think,
to add a saxon:unordered attribute to xsl:for-each.  I don't think it's a
particularly easy change to make, given the existing code, though I'm sure it
could be done. One question is whether the result corresponds to some
permutation of the input sequence, or to some permutation of the output
sequence (that is, are the multiple result items corresponding to one input
item delivered in the "correct" order?). Getting that right depends on
understanding the use cases, and I'm not sure I do...
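
To make the first of those two readings concrete, here is a minimal Java
sketch, not Saxon's code. It assumes a hypothetical processItem() that yields
two result items per input item, and uses the standard
ExecutorCompletionService to collect results in completion order. Each input
item's results stay together and in their own order, so the output is a
permutation of the input sequence rather than of the individual output items.
All class and method names here are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class UnorderedForEachSketch {

    // Hypothetical stand-in for the for-each body: two result items per input item.
    static List<String> processItem(int item) {
        return Arrays.asList(item + "a", item + "b");
    }

    // Runs one task per input item and concatenates the per-item results in
    // whatever order the tasks happen to complete.
    static List<String> runUnordered(List<Integer> input, int threads) {
        ExecutorService service = Executors.newFixedThreadPool(threads);
        try {
            ExecutorCompletionService<List<String>> completed =
                    new ExecutorCompletionService<>(service);
            for (int item : input) {
                completed.submit(() -> processItem(item));
            }
            List<String> result = new ArrayList<>();
            for (int i = 0; i < input.size(); i++) {
                // take() blocks until the next task (any task) has finished.
                result.addAll(completed.take().get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            service.shutdown();
        }
    }

    public static void main(String[] args) {
        // The order of the pairs may differ from run to run; within a pair it does not.
        System.out.println(runUnordered(Arrays.asList(1, 2, 3), 2));
    }
}
```

Under the other reading, individual result items from different input items
could interleave freely, which would need a different collection strategy.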

(2) In principle we could simply issue service.shutdownNow() rather than
service.shutdown(), which would cause active threads to be interrupted.
However, this has no effect unless those threads make themselves
interruptible. As I understand it, unless the thread is doing I/O or waiting
for other threads, that would mean actively issuing an occasional call on
isInterrupted(). There's no obvious place to put this call; if we were to put
it in a frequently-called routine like XPathContext.getContextItem() then it
would have disproportionate impact on workloads that aren't using the feature.
It's not clear that this scenario is high on the list of workloads that need
to be optimised (remember, after all, that dynamic errors already impose a
high cost).
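
The point about interruptibility can be illustrated with a small Java sketch
(again, not Saxon's code; the class and method names are invented). A
CPU-bound task stops after shutdownNow() only because its loop polls
Thread.currentThread().isInterrupted(); without that check the loop would
keep running, since shutdownNow() does no more than attempt to interrupt the
worker threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class InterruptPollingSketch {

    // Returns true if the busy task noticed the interrupt and the pool terminated.
    static boolean cancelBusyTask() {
        ExecutorService service = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean sawInterrupt = new AtomicBoolean(false);

        service.submit(() -> {
            started.countDown();
            long counter = 0;
            // CPU-bound loop with no I/O and no waiting: it ends only because
            // it polls the interrupt flag on every iteration.
            while (!Thread.currentThread().isInterrupted()) {
                counter++;
            }
            sawInterrupt.set(true);
        });

        try {
            started.await();          // make sure the task is actually running
            service.shutdownNow();    // interrupts the active worker thread
            boolean terminated = service.awaitTermination(5, TimeUnit.SECONDS);
            return terminated && sawInterrupt.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("cancelled: " + cancelBusyTask());
    }
}
```

The per-iteration check is the cost referred to above: in a real processor
the equivalent call would have to sit somewhere that runs often enough to
react promptly.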

I think that before taking steps like those proposed, we would want to look at
the bigger picture as regards asynchronous processing. Adam Retter and Debbie
Lockett sketched out some interesting ideas in their XML Prague 2019 paper:
http://www.saxonica.com/papers/xmlprague-2019dcl.pdf Above all, I think this
needs to be use-case driven.

Michael Kay
Saxonica

> On 28 Dec 2019, at 01:20, Michael Kay <mike@xxxxxxxxxxxx> wrote:
>
> Quite a difficult question to answer accurately because I haven't looked at
> the code for a while and the logic isn't all in one place; it also depends on
> understanding the behaviour of the underlying Java services that we rely on,
> notably the ExecutorService.
>
> Q1. Firstly, calling fn:error() is exactly the same as any other dynamic
> error. We throw an XPathException; this is caught by the
> MultithreadedContextMappingIterator, this invokes ExecutorService.shutdown()
> which prevents new tasks being accepted but allows existing tasks to finish. A
> "task" here is the processing of a single item in the for-each selection.
>
> Q2. There are two cases to consider with try/catch: where the try/catch is
> local to one thread (within the multi-threading for-each), and where it is
> outside the multithreading for-each. In the first case there should be no
> effect on other threads. In the second case the xsl:for-each fails as a whole,
> and the error is caught as a whole.
>
> Q3. We don't take any account of fn:unordered(); we always respect the
> ordering.
>
> This is complicated by the fact that xsl:for-each may be executed in either
> pull or push mode. Push mode is generally used when writing to a tree, pull
> mode when evaluating (non-document) variables and functions. In push mode you
> can never see any results until they are complete. In pull mode, items are
> delivered as soon as they are available; so a reference to $var[1] may return
> a result even though $var[2] has not yet been computed, and this applies to
> multithreaded execution just as much as to single-threaded execution.
> Multithreading complicates it, because $var[2] may be computed before $var[1],
> but we won't allow you to see $var[2] in that situation; it will be sitting in
> a queue somewhere waiting to be added to the result.
>
> Michael Kay
>
>> On 27 Dec 2019, at 05:18, Dimitre Novatchev <dnovatchev@xxxxxxxxx> wrote:
>>
>> The questions below are essentially for Dr. Kay, though anyone interested
>> and able to shed light on these is welcome.
>>
>> I need this information, because it seems not to be available in the Saxon
>> documentation.
>>
>>
>> General setup: We have multi-threaded processing specified by
>> saxon:threads="2"
>>
>>
>> Q1. What will happen if thread1 calls the fn:error() function? Will the
>> execution of thread2 be terminated immediately/promptly, or will it continue
>> executing until the end of its processing?
>>
>>
>> Q2. Same as Q1 above, but the multi-threaded processing is enclosed in
>> `<xsl:try>` and the error thrown by the fn:error() function is caught in the
>> `<xsl:catch>` child of `<xsl:try>`. Then `<xsl:catch>` produces a normal value
>> (the error is not re-thrown) -- will the 2nd thread be suppressed or will it
>> continue executing until the end of its processing?
>>
>>
>> Q3. This time there is no error thrown. Can we access the result of just
>> one of the threads (whichever finishes first) even before the other thread has
>> finished? For example, if the results of the two threads are in the sequence
>> constructor of an `<xsl:variable select="vmultithreadingResults">`, is it
>> possible to access the result of the first finished thread in an expression
>> like:
>>
>> fn:unordered($vmultithreadingResults)[1] ?
>>
>>
>> --
>> Cheers,
>> Dimitre Novatchev
>>
>> _______________________________________________
>> saxon-help mailing list archived at http://saxon.markmail.org/
>> saxon-help@xxxxxxxxxxxxxxxxxxxxx
>> https://lists.sourceforge.net/lists/listinfo/saxon-help
