Random Number Basics

Introduction
- Exclusions
Use Cases
- Code Development
- Grid Jobs
HepJamesRandom and Seeding
Basic Instructions
- Instructions for Code Development
- Instructions for Grid Jobs
- Mu2e Grid Scripts

Introduction

A typical Mu2e simulation job uses multiple independent sequences of pseudo-random numbers. The Mu2e Offline software provides tools to create these sequences, to seed them, to save their state, to restore their state and to ensure that each job in a long chain of jobs produces unique events. It also provides a way to ensure that the sequences are exactly repeatable when that is appropriate.

It is your responsibility to know when your job must use the same sequences of pseudo-random numbers as a previous job and when it must use different sequences. You need to understand which behaviour you require and to use the tools provided to implement that behaviour.

This page presents the minimum information needed to manage the repeatability or uniqueness of pseudo-random number sequences when running Mu2e jobs. More advanced readers may also wish to read the complete documentation for the management of pseudo-random numbers in the Mu2e simulation software. If you are writing modules that use random numbers, you must consult the complete documentation.

Exclusions

The information on this page applies only to Mu2e Offline jobs, which use the art framework. It does not apply to G4beamline or MARS. At last report, standard Mu2e G4beamline jobs are configured so that a single pseudo-random number engine is seeded by the event number; to produce unique events you need only ensure that each separate run of G4beamline produces a unique range of event numbers. For additional information, consult a G4beamline or MARS expert.

Use Cases

There are two classes of use cases that are important:

1: Code Development

In a typical code development use case you will run your code, look at its output, modify your code, rerun it and compare the new output to the previous output. You will likely repeat this many times. If your job uses pseudo-random numbers, you would like it to use exactly the same pseudo-random numbers every time; if the pseudo-random numbers change each time you run your code, debugging your code will be very difficult because the symptoms will change on every run.

Most of the Mu2e example .fcl files, in particular the Mu2eG4/test/g4test*.fcl files, are configured to do this by default.

2: Grid Jobs

Suppose that you have run one art job to generate some simulated events. After looking at these events, you decide that you want to generate more events in order to reduce the statistical errors on your results. If you reuse exactly the same .fcl file, you will generate identical events, which is simply a waste of time. To get statistically independent events you need to change the seeds used by the pseudo-random number engines.

It is also critically important that each event in the two files has a unique event ID; an event ID consists of a run number, a subrun number and an event number. This is discussed below.

A more general example is this: you wish to run 10,000 unique grid processes by submitting 10 grid jobs of 1000 processes each; in grid-speak, one job of 1000 processes is called a "cluster". You must ensure that events created by each grid process are independent of those created by all other grid processes. To understand how to do this you need to know two things: how to write an .fcl file to do this and how the mu2egrid scripts will automatically do this for you.

When you run a large ensemble of grid jobs, you must ensure that each event has a unique event ID. The mu2egrid scripts also look after this requirement automatically.

HepJamesRandom and Seeding

As of March 2012, all Mu2e Offline code that consumes pseudo-random numbers uses an engine of the type HepJamesRandom. This includes our event generators, our use of Geant4 and all of our hit making codes.

One of the properties of the HepJamesRandom engine is that it can be seeded by supplying a single integer in the range [0, 900 000 000], where [] denotes that the edge values are included in the range. The HepJamesRandom algorithm guarantees that the sequence of random variates produced by any seed in the legal range has a periodicity of 2^144. It also guarantees that the sequences produced by any two seeds do not have long identical subsequences; the strength of these guarantees is sufficient for Mu2e.

What happens if you supply a seed outside of the legal range? According to the CLHEP source code, if you supply a negative seed, there will be serious flaws in the randomness. If you supply a seed greater than 900,000,000, it will produce the same pattern as one of the seeds in the legal range.
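Because an out-of-range seed fails silently rather than with an error, a workflow script may want to check seeds before writing them into a .fcl file. The following is a minimal sketch in Python, assuming only the range quoted above; the function names are hypothetical and are not part of CLHEP, art or Mu2e Offline.

```python
# Legal seed range for HepJamesRandom as quoted in the text: [0, 900 000 000].
MAX_SEED = 900_000_000


def is_legal_seed(seed: int) -> bool:
    """Return True if seed lies in the inclusive range [0, MAX_SEED]."""
    return 0 <= seed <= MAX_SEED


def require_legal_seed(seed: int) -> int:
    """Return seed unchanged, or raise ValueError if it is out of range."""
    if not is_legal_seed(seed):
        raise ValueError(f"seed {seed} is outside [0, {MAX_SEED}]")
    return seed
```

Both edge values, 0 and 900,000,000, pass the check; any negative value or any value above 900,000,000 is rejected.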

Because the degree of randomness provided by the seeding mechanism is sufficient for Mu2e, the problem of managing randomness is reduced to managing the uniqueness or repeatability of seeds.

Basic Instructions

Instructions for Code Development

Mu2e uses two art services to manage pseudo-random numbers:

- RandomNumberGenerator, a service supplied by art
- SeedService, a service now supplied by Mu2e but soon to be supplied by art

When a module wishes to use a random engine it must take two steps:

- ask the SeedService for a seed that is guaranteed unique within this art job
- pass that seed to the RandomNumberGenerator service and ask the service to instantiate a new pseudo-random engine on behalf of the module.

The module can then use the engine.

The RandomNumberGenerator service must be present in the art configuration but it has no parameters; see the line "RandomNumberGenerator : { }" in the following fcl fragment. The SeedService service must also be present in the art configuration; its normal configuration is illustrated by the SeedService line inside the user block and by the two services.user.SeedService lines at the end of the fragment.

#include "standardServices.fcl"

services: {
 // ...
 
 RandomNumberGenerator : { } 
 user : {
   // ...
   SeedService : @local::automaticSeeds
 }
}

// ...
services.user.SeedService.baseSeed         :  0
services.user.SeedService.maxUniqueEngines :  20

The line beginning "SeedService :" says to configure the SeedService to one of its known standard configurations, named automaticSeeds. That configuration is found in the file Offline/fcl/standardServices.fcl:

 automaticSeeds : {
    policy            : "autoIncrement"
    baseSeed          : nil
    maxUniqueEngines  : nil

    # verbosity         : 1
    # endOfJobSummary   : true
  }

This tells the seed service that two important parameters are left undefined, baseSeed and maxUniqueEngines; if the end-user does not give values to these parameters, FHiCL will issue an error when any code tries to read them. The last two lines of the first .fcl fragment above give values to these parameters.

Taken all together, this configuration tells SeedService that it may supply seeds for up to 20 different engines in this job; the seeds will be the integers 0 through 19 (baseSeed through baseSeed+maxUniqueEngines-1) and they will be given out in the order in which the code asks for them. If the job tries to seed more than 20 engines, the SeedService will print an error message and tell art to perform a graceful shutdown. If this happens, you should increase the value of maxUniqueEngines and rerun the job; please also send email to the Mu2e software team describing the situation in which this happened. Current Mu2e simulation jobs use, at most, about 10 unique engines, so a value of 20 should be safe for a while.
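The behaviour described in the preceding paragraph can be pictured with a toy model. This is a sketch in Python, not the real SeedService: the class name is hypothetical, and the sketch raises an exception where the real service prints a message and asks art for a graceful shutdown.

```python
class AutoIncrementSeeds:
    """Toy model of the autoIncrement policy (not the real SeedService)."""

    def __init__(self, base_seed: int, max_unique_engines: int):
        self.base_seed = base_seed
        self.max_unique_engines = max_unique_engines
        self.issued = 0  # number of seeds handed out so far

    def next_seed(self) -> int:
        """Hand out baseSeed, baseSeed+1, ... in the order requested."""
        if self.issued >= self.max_unique_engines:
            # Stand-in for the error message and graceful shutdown.
            raise RuntimeError(
                "more engines requested than maxUniqueEngines; "
                "increase maxUniqueEngines and rerun"
            )
        seed = self.base_seed + self.issued
        self.issued += 1
        return seed


# With baseSeed 0 and maxUniqueEngines 20, the first three engines
# receive the seeds 0, 1 and 2.
seeds = AutoIncrementSeeds(base_seed=0, max_unique_engines=20)
first_three = [seeds.next_seed() for _ in range(3)]
```

Because the seeds depend only on baseSeed and on the order of the requests, two runs of the same job with the same configuration receive the same seeds.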

If you uncomment the two lines verbosity and endOfJobSummary you will get some informational printout, including which seed was assigned to which engine. The SeedService has many other features; for further details, see the complete instructions.

If you run the g4test_03.fcl job once and then rerun it, the output will be identical because the SeedService will compute the same seeds each time.

If you wish to generate additional events, change the value of baseSeed to baseSeed+maxUniqueEngines, in this case, 20. You can repeat this pattern until baseSeed+maxUniqueEngines exceeds 900,000,000. Also remember to change the range of event IDs that are created.
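The stepping pattern for baseSeed amounts to simple arithmetic. The sketch below, in Python, computes the baseSeed for the n-th job in such a series, counting from 0; the function name is hypothetical and the stopping condition is the one stated above.

```python
MAX_SEED = 900_000_000  # upper edge of the legal HepJamesRandom seed range


def base_seed_for_job(job_index: int, max_unique_engines: int = 20) -> int:
    """Return the baseSeed for job number job_index (0, 1, 2, ...).

    Each job reserves a block of max_unique_engines consecutive seeds,
    so successive jobs start at 0, 20, 40, ... for the default block size.
    """
    if job_index < 0:
        raise ValueError("job_index must be non-negative")
    base = job_index * max_unique_engines
    # Stop once baseSeed + maxUniqueEngines would exceed the legal range.
    if base + max_unique_engines > MAX_SEED:
        raise ValueError("no legal seeds left for this job index")
    return base
```

With a block size of 20 the last usable job index is 44,999,999, whose baseSeed is 899,999,980.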

This pattern guarantees that every engine in an ensemble of art jobs will have a unique seed. This is actually a stronger requirement than we usually need. Normally it is sufficient that baseSeed be unique in each of an ensemble of art jobs; for example, it is OK if the Straw hit making code in one job has the same seed as the event generator in another job. The point of maxUniqueEngines is that we can enforce the stricter definition of uniqueness should it be important to do so.

Instructions for Grid Jobs

If you write your own grid workflow scripts you must ensure that

- Every event that you generate has a unique event ID.
- Every grid process that you run has a unique baseSeed.

If you use the mu2egrid scripts, as described in the next section, this is taken care of for you.

Mu2e Grid Scripts

At this writing (February 2015), the mu2egrid scripts presume that the grid cluster number and process number are guaranteed to be unique. These scripts take a base fcl file supplied by the user and append six additional lines:

services.user.SeedService.policy           :  "autoIncrement"
services.user.SeedService.maxUniqueEngines :  20
services.user.SeedService.baseSeed         :  ${RANDOMSEED}
source.firstRun     : ${CLUSTER}
source.firstSubRun  : ${PROCESS}
source.firstEvent   : 1

where ${CLUSTER} and ${PROCESS} are the grid cluster and process numbers and where ${RANDOMSEED} is a random number chosen from the domain allowed for HepJamesRandom. The value of RANDOMSEED is chosen by the workflow script using its own random number generator.
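If you write your own workflow script, the substitution described above might be sketched as follows. This is Python written for illustration only; it is not the mu2egrid code, the function name is hypothetical, and drawing the seed with Python's random module is an assumption about how a script could choose RANDOMSEED.

```python
import random
from typing import List

MAX_SEED = 900_000_000  # upper edge of the legal HepJamesRandom seed range


def appended_fcl_lines(cluster: int, process: int, rng: random.Random) -> List[str]:
    """Build the six lines to append to the user's base fcl file."""
    # randint is inclusive at both ends, matching the range [0, 900 000 000].
    seed = rng.randint(0, MAX_SEED)
    return [
        'services.user.SeedService.policy           :  "autoIncrement"',
        "services.user.SeedService.maxUniqueEngines :  20",
        f"services.user.SeedService.baseSeed         :  {seed}",
        f"source.firstRun     : {cluster}",     # run number    = cluster number
        f"source.firstSubRun  : {process}",     # subrun number = process number
        "source.firstEvent   : 1",
    ]


# Example: the lines for process 7 of cluster 123456.
for line in appended_fcl_lines(123456, 7, random.Random()):
    print(line)
```

A cautious script might instead draw the seed from a range that stops maxUniqueEngines-1 short of 900,000,000, so that every seed handed out from baseSeed upward stays legal; this page does not say whether mu2egrid does so.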

This guarantees unique event IDs. While it does not guarantee a unique baseSeed, a repeated baseSeed will be extremely rare. The script checkAndMove, which is part of the mu2egrid package, will scan the output of a grid cluster to ensure that the seeds chosen for each process are unique. If you need to ensure uniqueness across multiple clusters, you will need to script that yourself. In the future we may provide such a script.
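To put a number on how rare a repeated baseSeed is, and to show what a check across clusters might look like, here is a sketch in Python. It assumes that each baseSeed is drawn uniformly and independently from [0, 900 000 000]; the function names are hypothetical and this is not the checkAndMove script.

```python
import math
from collections import Counter
from typing import Iterable, List

N_SEEDS = 900_000_001  # number of integers in [0, 900 000 000]


def collision_probability(n_processes: int) -> float:
    """Birthday-problem approximation for at least one repeated baseSeed.

    P ~ 1 - exp(-n (n - 1) / (2 N)) for n draws from N equally likely values.
    """
    exponent = n_processes * (n_processes - 1) / (2.0 * N_SEEDS)
    return -math.expm1(-exponent)


def duplicated_seeds(seeds: Iterable[int]) -> List[int]:
    """Return, in ascending order, every seed that occurs more than once."""
    counts = Counter(seeds)
    return sorted(seed for seed, count in counts.items() if count > 1)
```

Under these assumptions the chance of any repeat within one cluster of 1000 processes is about 0.06%, while for 10,000 processes taken together it rises to roughly 5%. A check across clusters would collect the baseSeed from every process of every cluster and pass the combined list to a function like duplicated_seeds.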


For web related questions: Mu2eWebMaster@fnal.gov.
For content related questions: kutschke@fnal.gov

This file last modified Sunday, 22-Feb-2015 13:04:19 CST

