Re: [xsl] Re: [saxon] Questions about the `saxon:threads` extension attribute

From: "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 2 Jan 2020 22:28:01 -0000

> > (1) fn:unordered doesn't relate very well to xsl:for-each because
> > functions can't be applied directly to instructions; it would make more
> > sense, I think, to add a saxon:unordered attribute to xsl:for-each. I
> > don't think it's a particularly easy change to make, given the existing
> > code, though I'm sure it could be done. One question is whether the
> > result corresponds to some permutation of the input sequence, or to some
> > permutation of the output sequence (that is, are the multiple result
> > items corresponding to one input item delivered in the "correct" order?).
> > Getting that right depends on understanding the use cases, and I'm not
> > sure I do...
>
> In this case we need a permutation of the input sequence.

More precisely, we need fn:unordered($vmultithreadingResults) to deliver
the items of $vmultithreadingResults in the order in which they are
produced, not in the order of the input sequence (the one specified in
the `select` attribute of `xsl:for-each`). So, if processing of the 2nd
item completes first, its output should be the first item of the final
sequence; otherwise its output should be the 2nd item. That is: do the
processing and skip the ordering. This is why I used fn:unordered(): I
didn't mean that the function should do any un-ordering, just that it
should suppress the standard ordering.
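
To make the requested behaviour concrete, here is a minimal Java sketch of
"deliver results in completion order". It is only an illustration built on
the standard java.util.concurrent API; the class and method names are
hypothetical and it says nothing about how Saxon is or would be implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical illustration; none of these names come from Saxon. */
final class CompletionOrderDemo {

    /** Runs one task per delay and returns the task indexes in the order the tasks finished. */
    static List<Integer> inCompletionOrder(long[] delaysMillis) {
        // Assumes at least one delay; one thread per task keeps the sketch simple.
        ExecutorService pool = Executors.newFixedThreadPool(delaysMillis.length);
        CompletionService<Integer> finished = new ExecutorCompletionService<>(pool);
        try {
            for (int i = 0; i < delaysMillis.length; i++) {
                final int index = i;
                finished.submit(() -> {
                    Thread.sleep(delaysMillis[index]); // stands in for processing one item
                    return index;
                });
            }
            List<Integer> results = new ArrayList<>();
            for (int i = 0; i < delaysMillis.length; i++) {
                // take() hands back the next task that has completed,
                // regardless of the order in which the tasks were submitted.
                results.add(finished.take().get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Item 0 is slow and item 1 is fast, so item 1 is expected to come out first.
        System.out.println(inCompletionOrder(new long[] {400, 20}));
    }
}
```

Taking only the first element of such a list corresponds to the WaitAny()
case: the caller sees whichever result arrives first.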

Hope this is clearer now :)

Happy New Year,

Dimitre

On Sat, Dec 28, 2019 at 6:16 PM Dimitre Novatchev <dnovatchev@xxxxxxxxx>
wrote:

> > Sadly, I suspect you could count the number of users who would take
> > advantage of such options on the fingers of one hand.
> > However, Saxon's power-users are an important part of the community and
> > we do value them.
>
> A line from an American film became a proverb: "If you build it, they
> will come."
>
> > (1) fn:unordered doesn't relate very well to xsl:for-each because
> > functions can't be applied directly to instructions; it would make more
> > sense, I think, to add a saxon:unordered attribute to xsl:for-each. I
> > don't think it's a particularly easy change to make, given the existing
> > code, though I'm sure it could be done. One question is whether the
> > result corresponds to some permutation of the input sequence, or to some
> > permutation of the output sequence (that is, are the multiple result
> > items corresponding to one input item delivered in the "correct" order?).
> > Getting that right depends on understanding the use cases, and I'm not
> > sure I do...
>
> In this case we need a permutation of the input sequence.
>
> > (2) In principle we could simply issue service.shutdownNow() rather than
> > service.shutdown(), which would cause active threads to be interrupted.
> > However, this has no effect unless those threads make themselves
> > interruptible. As I understand it, unless the thread is doing I/O or
> > waiting for other threads, that would mean actively issuing an occasional
> > call on isInterrupted(). There's no obvious place to put this call; if we
> > were to put it in a frequently-called routine like
> > XPathContext.getContextItem() then it would have disproportionate impact
> > on workloads that aren't using the feature. It's not clear that this
> > scenario is high on the list of workloads that need to be optimised
> > (remember, after all, that dynamic errors already impose a high cost).
>
> Completely agreed. The developer should be in control and could specify
> that the threads are interruptible, perhaps via a
> `saxon:interruptible-threads="yes|[no]"` attribute.
>
> > I think that before taking steps like those proposed, we would want to
> > look at the bigger picture as regards asynchronous processing.
> > Adam Retter and Debbie Lockett sketched out some interesting ideas in
> > their XML Prague 2019 paper:
> > http://www.saxonica.com/papers/xmlprague-2019dcl.pdf Above all, I think
> > this needs to be use-case driven.
>
> I read that -- a good approach, but it goes only halfway. I personally
> would prefer to have not just a `promise` but an implementation of
> Reactive Extensions (Rx) (http://reactivex.io/intro.html), similar to
> what we have in RxJS (https://github.com/ReactiveX/rxjs).
>
>
> Cheers,
> Dimitre
>
>
> On Sat, Dec 28, 2019 at 12:58 PM Michael Kay mike@xxxxxxxxxxxx <
> xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>> Is it too much to ask (for the benefit of the whole XSLT developers'
>> community and current + future Saxon users) for implementing the desired
>> behaviour in a future release of Saxon?  :)
>>
>> 1. Honour fn:unordered()  -- this will make it possible to do WaitAny().
>> The current behavior is equivalent to WaitAll() even when there is an
>> exception in one or more threads.
>>
>> 2. If one of the threads crashes, kill all remaining threads without
>> waiting -- this will make it possible to implement cancellation on an
>> external event. There are obvious benefits to freeing one or more cores
>> that are busy with threads whose results, however long they take to
>> produce, will be ignored.
>>
>>
>> Sadly, I suspect you could count the number of users who would take
>> advantage of such options on the fingers of one hand. However, Saxon's
>> power-users are an important part of the community and we do value them.
>>
>> (1) fn:unordered doesn't relate very well to xsl:for-each because
>> functions can't be applied directly to instructions; it would make more
>> sense, I think, to add a saxon:unordered attribute to xsl:for-each. I
>> don't think it's a particularly easy change to make, given the existing
>> code, though I'm sure it could be done. One question is whether the result
>> corresponds to some permutation of the input sequence, or to some
>> permutation of the output sequence (that is, are the multiple result items
>> corresponding to one input item delivered in the "correct" order?).
>> Getting that right depends on understanding the use cases, and I'm not
>> sure I do...
>>
>> (2) In principle we could simply issue service.shutdownNow() rather than
>> service.shutdown(), which would cause active threads to be interrupted.
>> However, this has no effect unless those threads make themselves
>> interruptible. As I understand it, unless the thread is doing I/O or
>> waiting for other threads, that would mean actively issuing an occasional
>> call on isInterrupted(). There's no obvious place to put this call; if we
>> were to put it in a frequently-called routine like
>> XPathContext.getContextItem() then it would have disproportionate impact
>> on workloads that aren't using the feature. It's not clear that this
>> scenario is high on the list of workloads that need to be optimised
>> (remember, after all, that dynamic errors already impose a high cost).
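
The point about interruptibility can be seen in a small Java sketch. It
assumes a CPU-bound task with no I/O and no waiting, and shows that
shutdownNow() only stops it if the task itself polls isInterrupted(). The
class and method names are hypothetical and the code is not taken from Saxon.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical demo class; nothing here is taken from Saxon's code. */
final class InterruptPollingDemo {

    /**
     * Starts a busy-looping task, calls shutdownNow(), and reports whether
     * the pool terminated within waitMillis.
     */
    static boolean stopsAfterShutdownNow(boolean pollInterruptFlag, long waitMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        AtomicBoolean giveUp = new AtomicBoolean(false); // safety valve so the demo always ends
        CountDownLatch started = new CountDownLatch(1);
        pool.execute(() -> {
            started.countDown();
            while (!giveUp.get()) { // busy loop: no I/O, no waiting on other threads
                if (pollInterruptFlag && Thread.currentThread().isInterrupted()) {
                    return; // cooperative cancellation
                }
            }
        });
        try {
            started.await();    // make sure the task is really running
            pool.shutdownNow(); // sets the interrupt flag of the worker thread
            return pool.awaitTermination(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            giveUp.set(true);   // release a task that ignored the interrupt
            pool.shutdownNow(); // harmless if the pool is already shut down
        }
    }

    public static void main(String[] args) {
        System.out.println("polling task stopped:     " + stopsAfterShutdownNow(true, 2000));
        System.out.println("non-polling task stopped: " + stopsAfterShutdownNow(false, 200));
    }
}
```

The cost mentioned above is visible here as well: the check sits inside the
hot loop, so every iteration pays for it whether or not cancellation is ever
requested.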
>>
>> I think that before taking steps like those proposed, we would want to
>> look at the bigger picture as regards asynchronous processing. Adam Retter
>> and Debbie Lockett sketched out some interesting ideas in their XML Prague
>> 2019 paper: http://www.saxonica.com/papers/xmlprague-2019dcl.pdf Above
>> all, I think this needs to be use-case driven.
>>
>> Michael Kay
>> Saxonica
>>
>> On 28 Dec 2019, at 01:20, Michael Kay <mike@xxxxxxxxxxxx> wrote:
>>
>> Quite a difficult question to answer accurately because I haven't looked
>> at the code for a while and the logic isn't all in one place; it also
>> depends on understanding the behaviour of the underlying Java services
>> that we rely on, notably the ExecutorService.
>>
>> Q1. Firstly, calling fn:error() is exactly the same as any other dynamic
>> error. We throw an XPathException; this is caught by the
>> MultithreadedContextMappingIterator, which invokes
>> ExecutorService.shutdown(); that prevents new tasks being accepted but
>> allows existing tasks to finish. A "task" here is the processing of a
>> single item in the for-each selection.
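
The shutdown() behaviour described in Q1 can be reproduced with the plain
Java API. The following is a minimal sketch under the assumption of a
two-thread pool: tasks accepted before shutdown() still run to completion,
while a task offered afterwards is rejected. The class and method names are
hypothetical and this is not Saxon's code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demo class; it only exercises java.util.concurrent. */
final class OrderlyShutdownDemo {

    /** Returns how many of the tasks submitted before shutdown() ran to completion. */
    static int completedAfterShutdown(int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(50); // stands in for processing one item
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    // Not expected: shutdown() does not interrupt running tasks.
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown(); // as if one task had just reported a dynamic error
        try {
            pool.execute(() -> completed.addAndGet(1000));
            throw new IllegalStateException("expected the new task to be rejected");
        } catch (RejectedExecutionException expected) {
            // New work is refused once shutdown() has been called.
        }
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS); // wait for the accepted tasks
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("tasks completed after shutdown(): " + completedAfterShutdown(5));
    }
}
```

This is why the remaining threads keep their cores busy after an error:
nothing in an orderly shutdown asks them to stop.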
>>
>> Q2. There are two cases to consider with try/catch: where the try/catch
>> is local to one thread (within the multi-threading for-each), and where it
>> is outside the multithreading for-each. In the first case there should be
>> no effect on other threads. In the second case the xsl:for-each fails as a
>> whole, and the error is caught as a whole.
>>
>> Q3. We don't take any account of fn:unordered(); we always respect the
>> ordering.
>>
>> This is complicated by the fact that xsl:for-each may be executed in
>> either pull or push mode. Push mode is generally used when writing to a
>> tree, pull mode when evaluating (non-document) variables and functions. In
>> push mode you can never see any results until they are complete. In pull
>> mode, items are delivered as soon as they are available; so a reference to
>> $var[1] may return a result even though $var[2] has not yet been computed,
>> and this applies to multithreaded execution just as much as to
>> single-threaded execution. Multithreading complicates it, because $var[2]
>> may be computed before $var[1], but we won't allow you to see $var[2] in
>> that situation; it will be sitting in a queue somewhere waiting to be
>> added to the result.
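
One common way to get this "finished early but delivered in order" effect in
Java is to keep the pending results as a list of futures in input order and
always block on the head of the list. The sketch below shows only that
general technique. It is an assumption for illustration, not a description
of Saxon's internals, and the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical demo class; it only exercises java.util.concurrent. */
final class InputOrderDemo {

    /** Runs one task per delay and delivers the results strictly in input order. */
    static List<String> inInputOrder(long[] delaysMillis) {
        // Assumes at least one delay; one thread per task keeps the sketch simple.
        ExecutorService pool = Executors.newFixedThreadPool(delaysMillis.length);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int i = 0; i < delaysMillis.length; i++) {
                final int index = i;
                pending.add(pool.submit(() -> {
                    Thread.sleep(delaysMillis[index]); // stands in for processing one item
                    return "item" + (index + 1);
                }));
            }
            List<String> delivered = new ArrayList<>();
            for (Future<String> next : pending) {
                // get() blocks on the earliest undelivered item, so a later item
                // that is already finished simply waits in the list.
                delivered.add(next.get());
            }
            return delivered;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // The second item finishes first, yet the first item is still delivered first.
        System.out.println(inInputOrder(new long[] {200, 10}));
    }
}
```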
>>
>> Michael Kay
>>
>> On 27 Dec 2019, at 05:18, Dimitre Novatchev <dnovatchev@xxxxxxxxx> wrote:
>>
>> The questions below are essentially for Dr. Kay, though anyone interested
>> and able to shed light on these is welcome.
>>
>> I need this information, because it seems not to be available in the
>> Saxon documentation.
>>
>>
>> *General setup*:  We have multi-threaded processing specified by
>> saxon:threads="2"
>>
>>
>> *Q1*. What will happen if *thread1* calls the *fn:error*() function?
>> Will the execution of *thread2* be terminated immediately/promptly, or
>> will it continue executing until the end of its processing?
>>
>>
>> *Q2*. Same as Q1 above, but the multi-threaded processing is enclosed in
>> `<xsl:try>` and the error thrown by the *fn:error()* function is caught in
>> the `<xsl:catch>` child of `<xsl:try>`. Then `<xsl:catch>` produces a
>> normal value (the error is not re-thrown) -- will the 2nd thread be
>> suppressed or will it continue executing until the end of its processing?
>>
>>
>> *Q3*.  This time there is no error thrown. Can we access the result of
>> just one of the threads (whichever finishes first) even before the other
>> thread has finished? For example, if the results of the two threads are in
>> the sequence constructor of an `<xsl:variable
>> name="vmultithreadingResults">`, is it possible to access the result of
>> the first finished thread in an expression like:
>>
>> fn:unordered($vmultithreadingResults)[1] ?
>>
>> --
>> Cheers,
>> Dimitre Novatchev
>> ---------------------------------------
>> Truly great madness cannot be achieved without significant intelligence.
>> ---------------------------------------
>> To invent, you need a good imagination and a pile of junk
>> -------------------------------------
>> Never fight an inanimate object
>> -------------------------------------
>> To avoid situations in which you might make mistakes may be the
>> biggest mistake of all
>> ------------------------------------
>> Quality means doing it right when no one is looking.
>> -------------------------------------
>> You've achieved success in your field when you don't know whether what
>> you're doing is work or play
>> -------------------------------------
>> To achieve the impossible dream, try going to sleep.
>> -------------------------------------
>> Facts do not cease to exist because they are ignored.
>> -------------------------------------
>> Typing monkeys will write all Shakespeare's works in 200yrs.Will they
>> write all patents, too? :)
>> -------------------------------------
>> Sanity is madness put to good use.
>> -------------------------------------
>> I finally figured out the only reason to be alive is to enjoy it.
>>
>> _______________________________________________
>> saxon-help mailing list archived at http://saxon.markmail.org/
>> saxon-help@xxxxxxxxxxxxxxxxxxxxx
>> https://lists.sourceforge.net/lists/listinfo/saxon-help
>>
>
>
> --
> Cheers,
> Dimitre Novatchev
>
>


--
Cheers,
Dimitre Novatchev