Recipe 4.21. Randomly Picking Items with Given Probabilities
Credit: Kevin Parks, Peter
Cogolo
Problem
You want to pick an item at random from a list, just about as
random.choice does, but you need to pick the
various items with different probabilities given in another list,
rather than picking any item with equal probability as
random.choice does.
Solution
Module random in the
standard Python library offers a wealth of possibilities for
generating and using pseudo-random numbers, but it does not offer
this specific functionality, so we must code it as a function of our
own:
import random
def random_pick(some_list, probabilities):
x = random.uniform(0, 1)
cumulative_probability = 0.0
for item, item_probability in zip(some_list, probabilities):
cumulative_probability += item_probability
if x < cumulative_probability: break
return item
Discussion
Module random in the standard Python library does
not have the weighted choice functionality that
is sometimes needed in games, simulations, and random tests, so I
wrote this recipe to supply this functionality. The recipe uses
module random's function
uniform to get a uniformly distributed
pseudo-random number between 0.0 and 1.0, then loops in parallel on
items and their probabilities, computing the increasing cumulative
probability, until the latter becomes greater than the pseudo-random
number.The recipe assumes, but does not check, that
probabilities is a sequence with just as many items
as some_list, which are probabilitiesthat is,
numbers between 0.0 and 1.0, summing up to 1.0; if these assumptions
are violated, you may still get some random picks, but they will not
follow the (inconsistent) specifications encoded in the
function's arguments. You may want to add some
assert statements at the start of the function to
check that the arguments make sense, such as:
assert len(some_list) == len(probabilities)However, these checks can be quite time consuming, so I
assert 0 <= min(probabilities) and max(probabilities) <= 1
assert abs(sum(probabilities)-1.0) < 1.0e-5
don't normally use them and have not included them
in the official Solution.As I already mentioned, the problem solved in this recipe requires
items to be associated with
probabilitiesnumbers between 0 and 1,
summing up to 1. A related but slightly different task is to get
random picks with weighted relative probabilities given by small
non-negative integersodds, rather than
probabilities. For this related problem, the best solution is a
generator, with an internal structure that is rather different from
the function random_pick given in this
recipe's Solution:
import randomThis generator works by first preparing a table whose total number of
def random_picks(sequence, relative_odds):
table = [ z for x, y in zip(sequence, relative_odds) for z in [x]*y ]
while True:
yield random.choice(table)
items is sum(relative_odds), each item of
seq appearing in the table as many times as the
small non-negative integer that is its corresponding item in
relative_odds. Once the table is prepared, the
generator's body is tiny and fast, as it simply
delegates to random.choice the picking of each
random item it yields. Typical uses of this
random_picks generator might be:
>>> x = random_picks('ciao', [1, 1, 3, 2])
>>> for two_chars in zip('boo', x): print ''.join(two_chars),
bc oa oa
>>> import itertools
>>> print ''.join(itertools.islice(x, 8))
icacaoco
See Also
Module random in the Library
Reference and Python in a
Nutshell.