Re: [xsl] help with random number generation

Subject: Re: [xsl] help with random number generation
From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 16 Nov 2022 20:09:19 -0000
On 11/16/2022 8:31 PM, C. M. Sperberg-McQueen cmsmcq@xxxxxxxxxxxxxxxxx
wrote:
Some readers of this list may know enough about pseudo-random number
generators and their use to advise me.  I hope so!

I am writing an XSLT program to simulate a process, with the aim of
using it to make Monte Carlo estimates of the probability that the
process will produce different kinds of results.  The state of the
simulation is represented by an XML document whose size and shape vary
over time; at each timeslice, we apply templates to the tree produced by
the preceding timeslice, and generate a new tree.  The simulation is
intended to implement a simple birth-and-death model of a population:
for each 'live' node in the tree, we choose randomly whether it gets a
new child in this time slice or not, and also choose randomly whether
the node dies in this time slice or not. We start with a single live
individual and end up with a family tree.  (My interest in the
simulation, if it matters, is in observing the effects of changes to the
assumed birth and death rates on the resulting population of trees, and
the probability that family trees of a given shape will develop.)

In a Monte Carlo simulation, the quality of the random numbers used is
likely to matter a good deal.  In particular, if there is too tight a
correlation among the random numbers, the results are going to be biased
in ways that are going to be very hard to understand and explain (and in
some cases, hard to detect).

XPath 3.0 has a random-number-generator() function which I would like to
use if I can, but the setup I have chosen seems to impose some barriers.

  - I can't just call fn:random-number-generator() each time I need a
    random number, because the function is specified as deterministic,
    and while the implementation-dependent seed may vary from run to run
    of my stylesheet, it should not vary within the run.

    If you want more than one random number, the idea is to use something
    like

      <xsl:variable name="rmap" as="map(*)"
                    select="random-number-generator()"/>

    for the first call, get your number using $rmap('number'), and make
    the second call with something like

      <xsl:variable name="rmap2" as="map(*)"
                    select="$rmap('next')()"/>

    and so on.  The seed to be used by the next call is embedded in the
    function returned as the 'next' member of the map.

  - I can do this as I descend the tree, so that the parent element
    generates the random numbers it needs and then passes the appropriate
    function to its child.

    In that case, I'll get (what I hope is) a nice sequence of random
    numbers as I descend the tree.  But each child is going to get the
    same random-number generation function, which will mean that each
    child is going to get the same random numbers and the simulation will
    show siblings always behaving the same way.  Not a good idea.  I need
    to thread a single sequence of random numbers (and generators)
    through the tree, so that each node uses a different seed and will
    get different random numbers.

  - So maybe I could use an accumulator?

    That will allow each node to get and use a different hidden seed, but
    as far as I can tell, each time slice is going to start its traversal
    of the tree with the same initial function, which means the root of
    the tree is always going to get the same random numbers, and its
    behavior will be the same in every timeslice.
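
For reference, the chaining described in the first item above might be sketched with xsl:iterate, which carries the generator map from one step to the next.  This is only a minimal sketch: the element names numbers and n and the count of five are made up for illustration, and it assumes an XSLT 3.0 processor invoked with the initial template xsl:initial-template.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed entry point: invoke with the initial template. -->
  <xsl:template name="xsl:initial-template">
    <numbers>
      <xsl:iterate select="1 to 5">
        <!-- The generator map is threaded through the iterations. -->
        <xsl:param name="rmap" as="map(*)"
                   select="random-number-generator()"/>
        <!-- 'number' holds an xs:double between 0 and 1. -->
        <n><xsl:value-of select="$rmap('number')"/></n>
        <xsl:next-iteration>
          <!-- 'next' is a zero-argument function that returns
               the following generator map in the sequence. -->
          <xsl:with-param name="rmap" select="$rmap('next')()"/>
        </xsl:next-iteration>
      </xsl:iterate>
    </numbers>
  </xsl:template>

</xsl:stylesheet>
```

Each n element should receive a different number, because every iteration reads from the map produced by the previous map's 'next' function rather than calling random-number-generator() again.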


In what setting do you work?

With Saxon PE/EE (and I think SaxonJS) there is saxon:timestamp() so e.g.


  <xsl:accumulator name="rnd" as="map(*)"
      initial-value="random-number-generator(saxon:timestamp())">
    <xsl:accumulator-rule match="*" select="$value?next()"/>
  </xsl:accumulator>


might work better than



  <xsl:accumulator name="rnd" as="map(*)"
      initial-value="random-number-generator(current-dateTime())">
    <xsl:accumulator-rule match="*" select="$value?next()"/>
  </xsl:accumulator>
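
To show how such an accumulator could be read inside a template, here is a minimal sketch of a complete stylesheet.  It is an assumption about usage, not a tested solution to the simulation: the attribute name r is hypothetical, the seed expression is only a placeholder (it uses current-dateTime() so that it does not depend on Saxon extensions), and the accumulator is declared on the mode so that it is applicable to the input document.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- One generator map per element, advanced in document order.
       The seed expression is a placeholder assumption. -->
  <xsl:accumulator name="rnd" as="map(*)"
      initial-value="random-number-generator(current-dateTime())">
    <xsl:accumulator-rule match="*" select="$value?next()"/>
  </xsl:accumulator>

  <!-- Make the accumulator applicable to the source tree and
       copy everything that has no more specific rule. -->
  <xsl:mode use-accumulators="rnd" on-no-match="shallow-copy"/>

  <xsl:template match="*">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Hypothetical attribute 'r': the number this element drew. -->
      <xsl:attribute name="r"
          select="accumulator-before('rnd')?number"/>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because the rule fires once per element in document order, siblings see different generator maps, so each element in the output should carry its own value of r.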


The rest of your requirements were too complicated for me to follow through.

