Subject: Re: [xquery-talk] [xsl] Re: Random number generation : requirements
From: "Wolfgang Laun wolfgang.laun@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 7 May 2014 06:31:26 -0000

I think that a random() in XSLT should be provided in a way that lets
you call several random number generators (of the same kind) in
parallel. Generators may exhibit a big difference between a sequence
where all elements are due to successive calls of the same generator
and one where a sufficient number of generators is called one by one.

For instance, in Dimitre's example, the values returned alternate
between even and odd, and using this to generate random points (x,y)
in 2D omits 50% of the possible points. And this is typical for an
entire class of random generators.

-W

On 07/05/2014, Michael Sokolov msokolov@xxxxxxxxxxxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> My 2c:
>
> I used an XQuery function based on Dimitre's version before; it works
> fine although it's a little inconvenient to have to keep passing in the
> prior value.
>
> I would say the most convenient (or at least the most familiar)
> signature for a random function is random($n) returning a random number
> between 0 inclusive and $n exclusive; ideally it would return integers
> if $n is an integer, floating point numbers if $n is a floating point
> number, empty if $n is empty, and an error otherwise. And I would like
> a seed function. Ideally this should be callable many times: I'm not
> sure how that could be done non-deterministically though.
>
> I suppose a sequence would be useful, but it isn't the first thing that
> leaps to mind. What if I'm not sure how many I'll need?
>
> For example, one use case for me was to load a huge amount of data, and
> only include 1% of it, in order to generate a predictable test data
> sub-set. I want to write an XSLT template that returns nothing 99% of
> the time, and for the other 1% of the time it processes the content
> normally. I want this to be based on an identifier in the content so
> that for a given seed, the same "random" 1% are selected each time: it
> should *not* be order-dependent, rather I would like to seed the random
> number generator with a hash of a given seed that is a configuration
> parameter, and a node-identifier, and then evaluate the next random
> number to see if it is > 0.01 (say). Maybe there are other ways to do
> that, but that is what I did using Java.
>
> -Mike
>
>
> On 5/6/2014 6:58 PM, Michael Kay wrote:
>> The big problem with a nondeterministic random() function is not defining
>> the order of execution, but preventing it being optimised out of a loop.
>> For example, how do we ensure that
>>
>> $xxx[random() gt 0.5]
>>
>> doesn't select either all the values or none?
>>
>> Anyway, we're not planning to do non-determinism. This exercise is about
>> designing a deterministic way to meet the requirement.
>>
>> Michael Kay
>> Saxonica
>>
>> On 6 May 2014, at 23:48, Michael Sokolov <msokolov@xxxxxxxxxxxxxxxxxxxxx>
>> wrote:
>>
>>> On 5/6/2014 6:41 PM, Michael Kay mike@xxxxxxxxxxxx wrote:
>>>>> My policy on side effects is: all expressions containing side effects
>>>>> are going to be evaluated in order
>>>>>
>>>> I do something like that in Saxon as well. But I don't attempt to define
>>>> what "in order" means; for example, the order in which different global
>>>> variables are evaluated. Doing this in the spec would be much more
>>>> problematic.
>>>>
>>> You don't think it would be reasonable to say something to the effect
>>> that the order in which non-deterministic expressions are evaluated is
>>> non-deterministic (ie implementation-defined)? Certainly it would be
>>> reasonable enough in the case of a random number generator. Although I
>>> suppose if you are going to seed it, you would like the seed to affect
>>> the random numbers that are generated.
>>>
>>> -Mike
>>> _______________________________________________
>>> talk@xxxxxxxxxxx
>>> http://x-query.com/mailman/listinfo/talk
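
The even/odd remark at the top of this message can be checked with a small sketch. Dimitre's generator is not reproduced in this message, so the Python below is only a stand-in: a linear congruential generator with an odd multiplier, an odd increment and a power-of-two modulus (the constants and the names lcg and parity_classes are assumptions chosen for illustration). With such parameters the lowest bit flips on every call, so pairing successive values from one generator never yields a point whose two coordinates have the same parity, which excludes at least the 50% of points mentioned above.

```python
from collections import Counter

# Illustrative constants only (assumption): an odd multiplier, an odd
# increment and a power-of-two modulus. This is NOT Dimitre's generator.
A = 1664525
C = 1013904223
M = 2 ** 32


def lcg(seed):
    """Yield an endless stream of values from x -> (A*x + C) mod M."""
    x = seed % M
    while True:
        x = (A * x + C) % M
        yield x


def parity_classes(n_points, seed=12345):
    """Build n_points (x, y) pairs from successive calls of ONE generator
    and count how often each (x parity, y parity) combination occurs."""
    gen = lcg(seed)
    counts = Counter()
    for _ in range(n_points):
        x = next(gen)
        y = next(gen)
        counts[(x % 2, y % 2)] += 1
    return counts


if __name__ == "__main__":
    # The (0, 0) and (1, 1) classes are absent from the printed counts.
    print(dict(parity_classes(1000)))
```

The sketch only shows the problem. Whether drawing x and y from separate generators removes it depends on the generators; two generators of this particular kind would still flip their low bits in lockstep.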
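
Mike's 1% sampling use case quoted above (seed a generator with a hash of a configured seed plus a node identifier, then look at the next random number) might look roughly like this. It is a Python sketch, not the Java code he mentions: keep_node, config_seed and node_id are hypothetical names, and SHA-256 is just one possible choice of hash. A node is kept when the draw falls below the fraction.

```python
import hashlib
import random


def keep_node(config_seed, node_id, fraction=0.01):
    """Return True for roughly `fraction` of node ids, deterministically.

    config_seed and node_id are hypothetical names for the configuration
    parameter and the content identifier described in the message.
    """
    # Hash the seed together with the identifier so that the decision
    # depends only on this pair, never on processing order.
    digest = hashlib.sha256(f"{config_seed}:{node_id}".encode("utf-8")).digest()
    # Seed a private generator from the hash and draw its next number.
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return rng.random() < fraction


if __name__ == "__main__":
    ids = [f"node-{i}" for i in range(100_000)]
    kept = [i for i in ids if keep_node("test-run", i)]
    print(len(kept), "of", len(ids), "nodes kept")
```

Because each decision uses its own freshly seeded generator, the same nodes are selected whatever order they are visited in, and changing the configured seed selects a different subset.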