Subject: Re: [xsl] help with random number generation From: "C. M. Sperberg-McQueen cmsmcq@xxxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> Date: Fri, 18 Nov 2022 18:48:07 -0000 |
Dimitre, thank you for your note. You write: > Or maybe I don't understand the requirements well? I think you probably understand the requirements at least as well as I do -- although as I have struggled with this problem my understanding has certainly changed. Ultimately, my goal is to have a simulation that allows me to estimate reliably the probability of certain classes of result; I think that means I need good random sequences of numbers to guide the simulations. The list of requirements I managed to formulate in response to Martin Honnen's useful remark the other day identifies several ways in which I believe a sequence can fail to be good. But who knows? I may find more ways for sequences to be bad as I continue my work. "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> writes: > Michael, > > Reading the requirements you listed, it seems to me that you need: > > Num-TimeSlices * NumNodesPerTimeSlice > > different random numbers. I think that is correct. One practical snag is that I do not know in advance how many nodes there will be for a given time slice, until I get to that time slice. The maximum number for any timeslice would be two to n, where n is 1 for the first timeslce, 2 for the second, and so on. > Why not generate a sequence of random numbers with this length? > > This will require a single seed, that you could use based on the > current time (the start of the simulation). > > Thus, for every simulation there will be a different seed. I expect to take this advice. It wasn't my first approach, because generating the sequence in advance and then indexing into it seemed more complicated than calling a random-number generator at the point where the number was needed. My current conjecture is just that if I need n random numbers, generating that sequence from one seed is likely to produce a good sequence, and generating n/2 sequences from the keys I can generate, and taking the first and second (or the third and fourth) item of each sequence, is likely to produce a bad sequence. At the moment, I have plans for several further steps. First, try to gain better understanding of the behavior of my current code (and diagnosis of its problems, if in fact I am correct in believing that the current behavior has problems): - make the simulation dump the sequence of numbers actually used - apply some basic statistical tests to the sequences being generated; if I understand things correctly it is not possible to prove that a given sequence is random, but it is possible to show that a sequence appears non-random, and that a set of sequences is unlikely to result from a random process - calculate a few more expected probabilities so that I can check the plausibility of a given set of simulation runs. Your method of using Monte Carlo results to estimate a known value is very useful here. I have estimated, and then re-estimated, the probability of my a particular result given particular parameters, and my main evidence that my simulation is not currently behaving well is that it is not converging on the expected number. It would be nice to have a few more checkable numbers, which requires that I either think hard about the probabilities of particular sequences of events or give up and get an introductory textbook on probability and statistics out of the library. (If one were writing a simulation of coin tossing, one would expect trials of 100 tosses to end up with particular numbers of heads with a certain frequency - how closely does the simulation match the expected results? If half of a set of trials were 100 heads in a row, and the other half were 100 tails in a row, I would be inclined to think that a bug in the simulation might be more likely than that a random process had produced that result. (Particularly if it was me who wrote the simulation.) Second, implement different behavior, using the idea that both you and Mike Kay have described: use a single long-enough sequence of random numbers for each run of the simulation. - That sequence may be generated and saved externally using a pseudo-random-number generator (e.g. the one in FXSL, or fn:random-number-generator(), or any of the random number generators available for me to use. - Or it could be acquired from any of the several sources of random bits available on the net: random.org generates random numbers from observations of atmospheric noise; NIST publishes a block of 512 random bits every 60 seconds (a service apparently called the NIST Randomness Beacon), although I did not see what their source of bits is; the web says there used to be one service that used lava lamps as an easily observable chaotic system. For use in simulations like mine, I think that pseudo-random numbers should serve the purpose as well as true random numbers acquired from quantum systems or from chaotic systems like weather, but it will be interesting to try both and see if they produce similar results. > In case you need the seeds to be not too-close, you could generate a > separate sequence of random numbers each of which to be used as the > seed for a separate simulation. This is also a good idea. I have learned that some people select their keys from a list of numbers generated by a source of 'true' random numbers like random.org. This appeals to me, though it probably doesn't matter for this task. Thanks to all who have responded for your help; I will report back if and when I have a simulation that passes the plausibility tests I can formulate for it. Michael -- C. M. Sperberg-McQueen Black Mesa Technologies LLC http://blackmesatech.com
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
Re: [xsl] help with random number g, Dimitre Novatchev dn | Thread | Re: [xsl] help with random number g, Dimitre Novatchev dn |
Re: [xsl] help with random number g, Dimitre Novatchev dn | Date | [xsl] Jaxen 2.0.0 Beta 1, Elliotte Rusty Harol |
Month |