Re: [xsl] help with random number generation

Subject: Re: [xsl] help with random number generation
From: "C. M. Sperberg-McQueen cmsmcq@xxxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 17 Nov 2022 16:02:12 -0000
"Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
writes:

> On 11/16/2022 8:31 PM, C. M. Sperberg-McQueen cmsmcq@xxxxxxxxxxxxxxxxx
> wrote:
>> Some readers of this list may know enough about pseudo-random number
>> generators and their use to advise me.  I hope so!
>>
>> ...

>
> In what setting do you work?

At the moment, I am running the XSLT of the simulator from bash, using
Saxon HE.  I like that environment, but if getting the simulation to
work correctly requires moving into a different environment (running
from within Oxygen, rewriting in XQuery, ...), then I can change to that
environment (within reason -- rewriting in a language other than XSLT or
XQuery is not a path I would willingly take -- well, maybe Prolog or
Lisp).

> With Saxon PE/EE (and I think SaxonJS) there is saxon:timestamp() so e.g.
>
>   <xsl:accumulator name="rnd" as="map(*)"
>       initial-value="random-number-generator(saxon:timestamp())">
>     <xsl:accumulator-rule match="*" select="$value?next()"/>
>   </xsl:accumulator>
>
>
> might work better than
>
>
>   <xsl:accumulator name="rnd" as="map(*)"
>       initial-value="random-number-generator(current-dateTime())">
>     <xsl:accumulator-rule match="*" select="$value?next()"/>
>   </xsl:accumulator>

Thank you; that's worth knowing.

I don't think I can usefully use accumulators in this application,
though, since each time I process the tree describing the state of the
simulation, the root node would get the same two random numbers.  So at
each time slice in a particular execution of the simulation, it would
behave in exactly the same way.  That won't work for my purposes,
because it will not produce a random sample of the space I want to
explore.

In order for the simulation to produce a random sample of the logical
space inhabited by the process being modeled, the random numbers
generated for any given node in one time slice need to be independent of
those generated in the preceding or next time slice.

> The rest of your requirements were too complicated for me to follow
> through.

In a given run, any given node will be processed many times (for as many
time slices as the node is 'alive'); for each 'live' node, I need two
random numbers during each time slice.  The requirements, as I
understand them, are:

  - The numbers generated for a given node in one time slice need to be
    independent of those generated for that node in other time slices.

    (This is why an accumulator doesn't seem to work for me -- no way to
    make the value of the seed differ sufficiently from time slice to
    time slice.)

  - The numbers generated for a given node in one time slice need to be
    independent of those generated for other nodes (e.g. siblings,
    children) in that time slice.

    If siblings (for example) tend to get similar random numbers, then
    siblings will have a marked and disproportionate tendency all to
    behave in the same way in the simulation.

  - In different executions of the simulator, it is desirable that
    different numbers be generated -- otherwise running the simulator
    100 times will produce 100 copies of the same result, rather than
    100 random samples from the logical space of the process.
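
One way these three requirements might be approached, offered as an
untested sketch and not as a description of what the simulator does:
create a single generator per run and thread it through every draw with
?next(), so that no two draws ever start from their own seed.  The
function name my:draw, the 'my' namespace URI, the 'seed' parameter and
the result element names are all hypothetical; random-number-generator(),
?number and ?next() are the standard XPath 3.1 names.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="#all">

  <!-- Hypothetical run-level seed; override it to reproduce a run.
       Whether date-time seeds differ enough between runs is an
       assumption that depends on the processor. -->
  <xsl:param name="seed" as="xs:anyAtomicType" select="current-dateTime()"/>

  <!-- Hypothetical helper: draw $n numbers by following the ?next()
       chain of $rng.  Returns a map with the numbers drawn ('numbers')
       and the generator to use for the following draw ('rng'). -->
  <xsl:function name="my:draw" as="map(*)">
    <xsl:param name="rng" as="map(*)"/>
    <xsl:param name="n" as="xs:integer"/>
    <xsl:iterate select="1 to $n">
      <xsl:param name="g" as="map(*)" select="$rng"/>
      <xsl:param name="acc" as="xs:double*" select="()"/>
      <xsl:on-completion>
        <xsl:sequence select="map{'numbers': $acc, 'rng': $g}"/>
      </xsl:on-completion>
      <xsl:next-iteration>
        <!-- Record the current number, then move on to the next generator. -->
        <xsl:with-param name="g" select="$g?next()"/>
        <xsl:with-param name="acc" select="($acc, $g?number)"/>
      </xsl:next-iteration>
    </xsl:iterate>
  </xsl:function>

  <!-- Demo: two consecutive draws of two numbers each, e.g. for two
       nodes in the same time slice. -->
  <xsl:template name="xsl:initial-template">
    <xsl:variable name="first"
        select="my:draw(random-number-generator($seed), 2)"/>
    <xsl:variable name="second" select="my:draw($first?rng, 2)"/>
    <draws>
      <node-a><xsl:value-of select="$first?numbers"/></node-a>
      <node-b><xsl:value-of select="$second?numbers"/></node-b>
    </draws>
  </xsl:template>

</xsl:stylesheet>
```

The cost of this design is that the generator returned in ?rng has to be
passed along explicitly (for instance as a tunnel parameter or inside
the state tree) to whatever processes the next node or the next time
slice.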

My current methods of generating random numbers for different nodes seem
to have trouble with the second and third of these requirements.  I
think this is related to the phenomenon noted by Mike Kay in his
response to my query: when seeds are 'too similar' (e.g. if they are all
small integers), calling random-number-generator() with those seeds will
produce results which all have the same first few digits and are not
randomly distributed in the half-open interval [0, 1).
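
A quick way to check whether a given processor shows this effect is to
print the first number produced for a run of small integer seeds.  The
sketch below is untested; it relies only on the standard
random-number-generator() function, and the element and attribute names
are made up for the illustration.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="xsl:initial-template">
    <seeds>
      <!-- One generator per seed; report only its first number. -->
      <xsl:for-each select="1 to 10">
        <seed value="{.}" first="{random-number-generator(.)?number}"/>
      </xsl:for-each>
    </seeds>
  </xsl:template>

</xsl:stylesheet>
```

If the 'first' values share their leading digits, the seeds are too
similar for that processor to be used one per node.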

Michael


--
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
