Random Number Basics
A typical Mu2e simulation job uses multiple independent sequences of pseudo-random numbers. The Mu2e Offline software provides tools to create these sequences, to seed them, to save their state, to restore their state and to ensure that each job in a long chain of jobs produces unique events. It also provides a way to ensure that the sequences are exactly repeatable when that is appropriate.
It is your responsibility to know when your job must use the same sequences of pseudo-random numbers as a previous job and when it must use different sequences. You need to understand which behaviour you require and to use the tools provided to implement that behaviour.
This page presents the minimum information needed to manage the repeatability or uniqueness of pseudo-random number sequences when running Mu2e jobs. More advanced readers may also wish to read the complete documentation for management of pseudo-random numbers in the Mu2e simulation software. If you are writing modules that use random numbers, you must consult the complete documentation.
The information on this page applies only to Mu2e Offline jobs, which use the art framework. It does not apply to G4beamline or MARS. At last report, standard Mu2e G4beamline jobs are configured so that a single pseudo-random number engine is seeded by the event number; to produce unique events you need only ensure that each separate run of G4beamline produces a unique range of event numbers. For additional information, consult a G4beamline or MARS expert.
Most of the Mu2e example .fcl files, in particular the Mu2eG4/test/g4test*.fcl files, are configured to do this by default.
It is also critically important that each event in the two files have a unique event ID; an event ID consists of a run number, a subrun number and an event number. This is discussed below.
A more general example is this: you wish to run 10,000 unique grid processes by submitting 10 grid jobs of 1000 processes each; in grid-speak, one job of 1000 processes is called a "cluster". You must ensure that events created by each grid process are independent of those created by all other grid processes. To understand how to do this you need to know two things: how to write an .fcl file to do this and how the mu2egrid scripts will automatically do this for you.
When you run a large ensemble of grid jobs, you must ensure that each event has a unique event ID. The mu2egrid scripts also look after this requirement automatically.
As of March 2012, all Mu2e Offline code that consumes pseudo-random numbers uses an engine of the type HepJamesRandom. This includes our event generators, our use of Geant4 and all of our hit-making code.
One of the properties of the HepJamesRandom engine is that it can be seeded by supplying a single integer in the range [0, 900,000,000], where [] denotes that the edge values are included in the range. The HepJamesRandom algorithm guarantees that the sequence of random variates produced by any seed in the legal range has a periodicity of 2^144. It also guarantees that the sequences produced by any two seeds do not have long identical subsequences; the strength of these guarantees is sufficient for Mu2e.
What happens if you supply a seed outside of the legal range? According to the CLHEP source code, if you supply a negative seed, there will be serious flaws in the randomness. If you supply a seed greater than 900,000,000, it will produce the same pattern as one of the seeds in the legal range.
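Since an out-of-range seed fails silently rather than loudly, a workflow script may want to check seeds before writing them into an fcl file. The following is a minimal Python sketch of such a check; the function names are hypothetical and are not part of the Mu2e Offline software or of CLHEP.

```python
# Inclusive bounds of the legal HepJamesRandom seed range quoted above.
MIN_SEED = 0
MAX_SEED = 900_000_000


def is_legal_seed(seed: int) -> bool:
    """Return True if seed lies in the closed range [0, 900,000,000]."""
    return MIN_SEED <= seed <= MAX_SEED


def require_legal_seed(seed: int) -> int:
    """Return seed unchanged, or raise ValueError if it is out of range.

    Hypothetical helper: it only enforces the range stated on this page.
    """
    if not is_legal_seed(seed):
        raise ValueError(
            f"seed {seed} is outside the legal range [{MIN_SEED}, {MAX_SEED}]"
        )
    return seed
```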
Because the degree of randomness provided by the seeding mechanism is sufficient for Mu2e, the problem of managing randomness is reduced to managing the uniqueness or repeatability of seeds.
Mu2e uses two art services to manage pseudo-random numbers: RandomNumberGenerator and SeedService.
The RandomNumberGenerator service must be present in the art configuration but it has no parameters; see the RandomNumberGenerator line in the following fcl fragment. The SeedService service must also be present in the art configuration; its normal configuration is illustrated by the SeedService lines in the same fragment.
    #include "standardServices.fcl"

    services: {
      // ...
      RandomNumberGenerator : { }
      user : {
        // ...
        SeedService : @local::automaticSeeds
      }
    }
    // ...
    services.user.SeedService.baseSeed : 0
    services.user.SeedService.maxUniqueEngines : 20

The line beginning "SeedService :" says to configure the SeedService to one of its known standard configurations, named automaticSeeds. That configuration is found in the file Offline/fcl/standardServices.fcl:
    automaticSeeds : {
      policy : "autoIncrement"
      baseSeed : nil
      maxUniqueEngines : nil
      # verbosity : 1
      # endOfJobSummary : true
    }

This tells the seed service that two important parameters are left undefined, baseSeed and maxUniqueEngines; if the end-user does not give values to these parameters, FHiCL will issue an error when any code tries to read them. The last two lines of the first fcl fragment above, the ones that set services.user.SeedService.baseSeed and services.user.SeedService.maxUniqueEngines, give values to these parameters.
Taken all together, this configuration tells SeedService that it may supply seeds for up to 20 different engines in this job; the seeds will be the integers 0 through 19 (baseSeed through baseSeed+maxUniqueEngines-1) and they will be given out in the order in which the code asks for them. If the job tries to seed more than 20 engines, the SeedService will print an error message and tell art to perform a graceful shutdown. If this happens, you should increase the value of maxUniqueEngines and rerun the job; please also send email to the Mu2e software team describing the situation in which this happened. Current Mu2e simulation jobs use, at most, about 10 unique engines, so a value of 20 should be safe for a while.
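The behaviour described above can be modelled in a few lines of Python. This is a toy model for illustration only, not the real SeedService: the class name, method name and engine labels are all hypothetical, and the real service asks art for a graceful shutdown instead of raising an exception.

```python
class AutoIncrementSeeds:
    """Toy model of the autoIncrement policy (not the real SeedService)."""

    def __init__(self, base_seed, max_unique_engines):
        self.base_seed = base_seed
        self.max_unique_engines = max_unique_engines
        # (engine label, seed) pairs, in the order the seeds were requested.
        self.assigned = []

    def next_seed(self, engine_label):
        """Hand out baseSeed, baseSeed+1, ... in request order."""
        if len(self.assigned) >= self.max_unique_engines:
            # The real service prints an error and shuts art down gracefully;
            # raising an exception is an assumption made for this sketch.
            raise RuntimeError(
                f"more than {self.max_unique_engines} engines requested a seed; "
                "increase maxUniqueEngines and rerun"
            )
        seed = self.base_seed + len(self.assigned)
        self.assigned.append((engine_label, seed))
        return seed


if __name__ == "__main__":
    seeds = AutoIncrementSeeds(base_seed=0, max_unique_engines=20)
    # Hypothetical engine labels, used only to show the ordering.
    for label in ("generator", "geant4", "hitmaker"):
        print(label, seeds.next_seed(label))
```

Because the seeds depend only on baseSeed and on the order of the requests, rerunning the same job with the same configuration reproduces the same assignment.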
If you uncomment the two lines verbosity and endOfJobSummary you will get some informational printout, including which seed was assigned to which engine. The SeedService has many other features; for further details, see the complete instructions.
If you run the g4test_03.fcl job once and then rerun it, the output will be identical because the SeedService will compute the same seeds each time.
If you wish to generate additional events, change the value of baseSeed to baseSeed+maxUniqueEngines, in this case, 20. You can repeat this pattern until baseSeed+maxUniqueEngines exceeds 900,000,000. Also remember to change the range of event IDs that are created.
This pattern guarantees that every engine in an ensemble of art jobs will have a unique seed. This is actually a stronger requirement than we usually need. Normally it is sufficient that baseSeed be unique in each of an ensemble of art jobs; for example, it is OK if the straw hit-making code in one job has the same seed as the event generator in another job. The point of maxUniqueEngines is that we can enforce the stricter definition of uniqueness should it be important to do so.
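The baseSeed arithmetic for a series of jobs can be sketched in Python as follows. The function names and the zero-based job index are assumptions made for this illustration; they are not part of the Mu2e tools.

```python
MAX_SEED = 900_000_000  # inclusive upper bound of the legal seed range


def base_seed_for_job(job_index, max_unique_engines=20):
    """Return the baseSeed for the job_index-th job (counting from 0).

    Each job owns the block of seeds
    [baseSeed, baseSeed + max_unique_engines - 1], so blocks never overlap.
    """
    base = job_index * max_unique_engines
    if job_index < 0 or base + max_unique_engines - 1 > MAX_SEED:
        raise ValueError(f"job {job_index} does not fit in the legal seed range")
    return base


def seeds_for_job(job_index, max_unique_engines=20):
    """Return the block of seeds available to one job."""
    base = base_seed_for_job(job_index, max_unique_engines)
    return range(base, base + max_unique_engines)


def max_jobs(max_unique_engines=20):
    """Number of non-overlapping seed blocks that fit in the legal range."""
    return (MAX_SEED + 1) // max_unique_engines
```

With maxUniqueEngines set to 20, this scheme allows 45,000,000 jobs with fully disjoint seed blocks.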
If you write your own grid workflow scripts you must ensure that
At this writing (February 2015), the mu2egrid scripts presume that the grid cluster number and process number are guaranteed to be unique. These scripts take a base fcl file supplied by the user and append six additional lines:
    services.user.SeedService.policy : "autoIncrement"
    services.user.SeedService.maxUniqueEngines : 20
    services.user.SeedService.baseSeed : ${RANDOMSEED}
    source.firstRun : ${CLUSTER}
    source.firstSubRun : ${PROCESS}
    source.firstEvent : 1

where ${CLUSTER} and ${PROCESS} are the grid cluster and process numbers and where ${RANDOMSEED} is a random number chosen from the domain allowed for HepJamesRandom. The value of RANDOMSEED is chosen by the workflow script using its own random number generator.
This guarantees unique event IDs. While it does not guarantee a unique baseSeed, a repeated baseSeed will be extremely rare. The script checkAndMove, which is part of the mu2egrid package, will scan the output of a grid cluster to ensure that the seeds chosen for each process are unique. If you need to ensure uniqueness across multiple clusters, you will need to script that yourself. In the future we may provide such a script.
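For readers writing their own workflow scripts, the following Python sketch illustrates the three ideas above: generating the six appended lines, scanning a set of seeds for repeats, and estimating how often a repeat occurs. It is not the mu2egrid code. All function names are hypothetical, and the estimate assumes that each baseSeed is drawn independently and uniformly from the legal range, which this page does not state.

```python
import math
import random

MAX_SEED = 900_000_000  # inclusive upper bound of the legal seed range


def override_lines(cluster, process, seed, max_unique_engines=20):
    """Return the six fcl lines to append for one grid process."""
    return [
        'services.user.SeedService.policy : "autoIncrement"',
        f"services.user.SeedService.maxUniqueEngines : {max_unique_engines}",
        f"services.user.SeedService.baseSeed : {seed}",
        f"source.firstRun : {cluster}",
        f"source.firstSubRun : {process}",
        "source.firstEvent : 1",
    ]


def duplicated_seeds(seeds):
    """Return the sorted list of seed values that occur more than once."""
    seen, duplicates = set(), set()
    for seed in seeds:
        if seed in seen:
            duplicates.add(seed)
        seen.add(seed)
    return sorted(duplicates)


def repeat_probability(n_processes, n_seeds=MAX_SEED + 1):
    """Birthday-problem estimate of at least one repeated baseSeed.

    Assumes independent, uniform draws from n_seeds possible values.
    """
    return 1.0 - math.exp(-n_processes * (n_processes - 1) / (2.0 * n_seeds))


if __name__ == "__main__":
    rng = random.Random()  # stand-in for the workflow script's own generator
    cluster = 1234567      # hypothetical cluster number
    seeds = [rng.randint(0, MAX_SEED) for _ in range(1000)]
    print("\n".join(override_lines(cluster, 0, seeds[0])))
    print("repeated seeds:", duplicated_seeds(seeds))
    print("estimated repeat probability:", repeat_probability(len(seeds)))
```

Under the uniform assumption, the estimate is roughly 0.06% for a single 1000-process cluster and roughly 5% for 10,000 processes taken together, which is why a check across multiple clusters may be worth scripting.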