How-to Guides
How to save and resume long computation
RandomState objects can be pickled. Pickling saves the internal state of the
pseudo-random number generator so that it can be restored later and the computation resumed from that point.
import numpy as np
import mkl_random
import pickle
rs = mkl_random.RandomState(seed=777, brng="r250")
draw = rs.standard_normal(size=1357913)
# pickle random state
saved = pickle.dumps(rs)
# draw some numbers as if computation were to continue
post_draw = rs.gamma(5, 1, size=100)
# restore random state, and continue from the saved point
restored_rs = pickle.loads(saved)
resumed_draw = restored_rs.gamma(5, 1, size=100)
# the sample from the restored state is the same as the sample
# drawn from the original one after pickling
assert np.array_equal(post_draw, resumed_draw)
Stochastic computations in parallel with multiprocessing
When performing stochastic computations in parallel, care must be taken to ensure statistical independence of the samples drawn by different workers.
Basic pseudo-random number generators provide different means of accomplishing
this. Some support the skipahead() method or the leapfrog() method, while
others provide a fixed-size family of generators with the property that generators
from the family, initialized equally, produce streams of randomness statistically
indistinguishable from independent ones.
- skipahead(nskips)
Advance the state of the generator using the skip-ahead method, or raise a ValueError exception if the method is not supported.
The argument nskips must be a positive Python integer.
The method is supported for the “philox4x32x10”, “mrg32k3a”, “mcg31m1”, “mcg59”, “wh”, “mt19937”, “sfmt19937”, and “ars5” basic random number generators.
Note
When using skipahead(), it is important to ensure that a parallel task does not consume more than
nskips states; otherwise streams of randomness begin to overlap and the assumption of statistical
independence breaks down.
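As a rough illustration of how skipahead() might be combined with multiprocessing, the sketch below seeds every worker identically and then advances worker i by i * NSKIPS states, so that each worker draws from its own block of the stream. The names worker_sample, SEED, NSKIPS, N_WORKERS and DRAWS_PER_WORKER are hypothetical. How many generator states one variate consumes depends on the distribution and the sampling method, so the block size is chosen with a deliberately generous margin.

```python
import multiprocessing

import mkl_random

SEED = 777                # hypothetical seed shared by all workers
NSKIPS = 10**6            # hypothetical block of states reserved per worker
N_WORKERS = 4
DRAWS_PER_WORKER = 1000   # far fewer variates than NSKIPS states


def worker_sample(worker_id):
    """Return the mean of the draws taken from worker_id's block."""
    # every worker starts from the same seed and the same generator
    rs = mkl_random.RandomState(seed=SEED, brng="mt19937")
    # then jumps to the start of its own block; skipahead requires a
    # positive argument, so worker 0 simply stays at the origin
    if worker_id > 0:
        rs.skipahead(worker_id * NSKIPS)
    draws = rs.standard_normal(size=DRAWS_PER_WORKER)
    return float(draws.mean())


if __name__ == "__main__":
    with multiprocessing.Pool(N_WORKERS) as pool:
        means = pool.map(worker_sample, range(N_WORKERS))
    print(means)
```

Constructing the generator inside the worker keeps the task argument down to a single integer; pickling a prepared RandomState and sending it to the worker would be an alternative.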
- leapfrog(k, nstreams)
Initialize the state of the generator using the leap-frog method, or raise a ValueError exception if the method is not supported.
The leap-frog method partitions the state trajectory into nstreams interleaved non-overlapping subsequences, and the argument k identifies the subsequence.
The method is supported for the “mcg31m1”, “mcg59”, and “wh” basic pseudo-random number generators.
Note
When using the leapfrog() or skipahead() methods, remember that parallel tasks partition
the generator's period. Choose a generator with a sufficiently long period to avoid cycling over the period
more than once, as doing so also breaks the assumption of statistical independence and may compromise
the correctness of the simulation.
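The difference between the two partitioning schemes can be illustrated with a toy generator. The sketch below is plain Python and is not how mkl_random or MKL implement these methods; base_stream, leapfrog_view and skipahead_view are hypothetical helpers. It only shows which elements of a single base sequence each worker would see: leap-frog hands worker k the elements at positions k, k + nstreams, k + 2 * nstreams, and so on, while skip-ahead hands each worker a contiguous block of nskips elements.

```python
def base_stream(n, seed=1):
    """First n outputs of a toy linear congruential generator."""
    # Park-Miller "minimal standard" constants, used only for illustration
    a, m = 16807, 2**31 - 1
    out, x = [], seed
    for _ in range(n):
        x = (a * x) % m
        out.append(x)
    return out


def leapfrog_view(stream, k, nstreams):
    """Elements worker k sees under leap-frog partitioning."""
    return stream[k::nstreams]


def skipahead_view(stream, worker_id, nskips):
    """Elements a worker sees after skipping worker_id * nskips states."""
    start = worker_id * nskips
    return stream[start:start + nskips]


stream = base_stream(12)
leap = [leapfrog_view(stream, k, 3) for k in range(3)]
skip = [skipahead_view(stream, w, 4) for w in range(3)]

# both schemes split the same 12 states into disjoint pieces
assert sorted(sum(leap, [])) == sorted(stream)
assert sorted(sum(skip, [])) == sorted(stream)
# leap-frog interleaves: worker 1 gets positions 1, 4, 7, 10
assert leap[1] == [stream[i] for i in (1, 4, 7, 10)]
# skip-ahead gives contiguous blocks: worker 1 gets positions 4..7
assert skip[1] == stream[4:8]
```

In both cases the workers together consume one shared trajectory, which is why the period of the underlying generator has to cover the combined demand of all tasks.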
mkl_random also provides two families of basic pseudo-random number generators, “mt2203” and
“wh”, with the property that members of a particular family, initialized equally, produce streams of
randomness statistically indistinguishable from independent ones. To use such a family in a parallel computation, assign
a different generator from the family to each parallel worker and have each worker sample only from its assigned generator.
Please refer to the “examples/” folder in the GitHub repo for more details.
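One way this assignment might look is sketched below. It assumes that a family member is selected by passing a (name, index) tuple as brng, for example ("mt2203", 3); that form is not shown in this guide, so check it against the RandomState reference before relying on it. The helper name make_family_states is hypothetical.

```python
import mkl_random


def make_family_states(seed, n_workers, family="mt2203"):
    """Create one generator per worker, each a different family member."""
    # assumption: brng=(family_name, member_index) selects a family member
    return [
        mkl_random.RandomState(seed=seed, brng=(family, idx))
        for idx in range(n_workers)
    ]


# all members share the same seed; only the member index differs
states = make_family_states(seed=777, n_workers=4)
samples = [rs.uniform(0.0, 1.0, size=5) for rs in states]
for worker_id, sample in enumerate(samples):
    print(worker_id, sample)
```

Each worker could equally construct its own member from its worker index, which avoids sending generator objects between processes.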