Subject: Re: [xsl] help with random number generation From: "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> Date: Wed, 16 Nov 2022 20:45:30 -0000 |
I would implement/use (even from the 20-years old FXSL) a random-sequence() function that produces a sequence of $N random numbers. Thus, if you know the total number of nodes, for which random numbers are needed, then you could use this as the value for $N. As for using time for a seed, new ()different and actual) current times can be produced even in pure XPath 3.1. But I would prefer to generate a sequence of randoms, not to start from a different seed every time. On Wed, Nov 16, 2022 at 11:30 AM C. M. Sperberg-McQueen cmsmcq@xxxxxxxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote: > Some readers of this list may know enough about pseudo-random number > generators and their use to advise me. I hope so! > > I am writing an XSLT program to simulate a process, with the aim of > using it to make Monte Carlo estimates of the probability that the > process will produce different kinds of results. The state of the > simulation is represented by an XML document whose size and shape vary > over time; at each timeslice, we apply templates to the tree produced by > the preceding timeslice, and generate a new tree. The simulation is > intended to implement a simple birth-and-death model of a population: > for each 'live' node in the tree, we choose randomly whether it gets a > new child in this time slice or not, and also choose randomly whether > the node dies in this time slice or not. We start with a single live > individual and end up with a family tree. (My interest in the > simulation, if it matters, is in observing the effects of changes to the > assumed birth and death rates on the resulting population of trees, and > the probability that family trees of a given shape will develop.) > > In a Monte Carlo simulation, the quality of the random numbers used is > likely to matter a good deal. In particular, if there is too tight a > correlation among the random numbers, the results are going to be biased > in ways that are going to be very hard to understand and explain (and in > some cases, hard to detect). > > XPath 3.0 has a random-number-generator() function which I would like to > use if I can, but the setup I have chosen seems to impose some barriers. > > - I can't just call fn:random-number-generator() each time I need a > random number, because the function is specified as deterministic, > and while the implementation-dependent seed may vary from run to run > of my stylesheet, it should not vary within the run. > > If you want more than one random number, the idea is to use something > like > > <xsl;variable name="rmap" as="map(*)" > select="random-number-generator()"/> > > for the first call, get your number using $map('number'), and make > the second call with something like > > <xsl;variable name="rmap2" as="map(*)" > select="$map('next')()"/> > > and so on. The seed to be used by the next call is embedded in the > function returns as the 'next' member of the map. > > - I can do this as I descend the tree, so that the parent element > generates the random numbers it needs and then passes the appropriate > function to its child. > > In that case, I'll get (what I hope is) a nice sequence of random > numbers as I descend the tree. But each child is going to get the > same random-number generation function, which will mean that each > child is going to get the same random numbers and the simulation will > show siblings always behaving the same way. Not a good idea. I need > to thread a single sequence of random numbers (and generators) > through the tree, so that each node uses a different seed and will > get different random numbers. > > - So maybe I could use an accumulator? > > That will allow each node to get and use a different hidden seed, but > as far as I can tell, each time slice is going to start its traversal > of the tree with the same initial function, which means the root of > the tree is always going to get the same random numbers, and its > behavior will be the same in every timeslice. > > - So maybe I can produce a different seed for each time I need to get a > random number, based on which time slice we are in, and which node we > are processing, and -- since I don't want every run of the simulator > to produce the same results -- a seed passed in as a parameter (so > that every run can be passed a different value, and they will produce > different results). > > Then for each node that needs one or more random numbers, I calculate > a seed (ideally different for each pass over each node) and call > fn:random-number-generator() with that seed. > > The results are ... disappointing, and I am hoping someone here can > help me do better. > > If I understand the state of play on conventional random number > generators, for a good one, over time each value between 0 and 1 should > have an approximately equal likelihood of turning up at any given time. > > For a given initial seed, however, the result is always going to be the > same (which is good, for reproducing errors and for debugging). > > And I notice that if I call the random number generator a hundred times, > using the integers 0 to 99 as seeds, the results I get don't vary much: > the first few digits of the number are often the same. > > I have experimented with various ways of combining the number of the > timeslice, the position of the node in the tree, and the seed passed to > the stylesheet as a parameter, and ... the results seem to indicate that > there is not enough variation in the random numbers I am getting. In > particular, when I calculate the probability of certain results arising, > and compare that to the number of times the simulation produces those > results, they are too far apart for my comfort. (Choosing a shape for > which the arithmetic is relatively easy, I calculate a 32% chance of the > simulation generating that shape, but consistently get 80% or more > results of that shape. It's one thing for that to happen now and then, > but it has now happened too often. My prior belief in the correctness > of my approach is not strong enough to withstand the evidence that the > coin I am flipping is biased in ways I don't know how to control.) > > I have tried to pose this question in a general form, to give it broader > interest and applicability; if I have inadvertently elided important > details, feel free to ask -- the details are not secret, just (I hope) > not relevant to the general question. > > If anyone can think of a relatively simple way to improve the random > numbers I am generating, I would be glad of any hints. > > -- > C. M. Sperberg-McQueen > Black Mesa Technologies LLC > http://blackmesatech.com > > > -- Cheers, Dimitre Novatchev --------------------------------------- Truly great madness cannot be achieved without significant intelligence. --------------------------------------- To invent, you need a good imagination and a pile of junk ------------------------------------- Never fight an inanimate object ------------------------------------- To avoid situations in which you might make mistakes may be the biggest mistake of all ------------------------------------ Quality means doing it right when no one is looking. ------------------------------------- You've achieved success in your field when you don't know whether what you're doing is work or play ------------------------------------- To achieve the impossible dream, try going to sleep. ------------------------------------- Facts do not cease to exist because they are ignored. ------------------------------------- Typing monkeys will write all Shakespeare's works in 200yrs.Will they write all patents, too? :) ------------------------------------- Sanity is madness put to good use. ------------------------------------- I finally figured out the only reason to be alive is to enjoy it.
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
Re: [xsl] help with random number g, C. M. Sperberg-McQue | Thread | Re: [xsl] help with random numbers, Michael Kay michaelk |
Re: [xsl] help with random number g, Martin Honnen martin | Date | Re: [xsl] help with random numbers, Michael Kay michaelk |
Month |