Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] نسخه متنی

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

Recipe 18.4. Generating Random Samples Without Replacement

Credit: Raymond Hettinger

Problem

You need to generate random samples without replacement out of a
"population" (the integers between
included and some n excluded), and you
want better memory consumption characteristics than
random.sample provides.

Solution

A generator for this purpose requires only constant memory and makes
a small number of calls to random.random:

import random
def sample(n, r):
" Generate r randomly chosen, sorted integers from [0,n) "
rand = random.random
pop = n
for samp in xrange(r, 0, -1):
cumprob = 1.0
x = rand( )
while x < cumprob:
cumprob -= cumprob * samp / pop
pop -= 1
yield n-pop-1

Discussion

random.sample(xrange(10), 3) produces output
statistically equal to list(sample(10,
3)) using this recipe's
sample. Differently from
random.sample(xrange(n),
r), this
recipe's
sample(n,
r) requires a
bounded amount of memory (which does not grow with either
r or n) and is
guaranteed to make only r calls to
random.random. Moreover, this
recipe's sample yields the
r numbers of the sample in sorted order,
while random.sample returns them in random
orderwhich may be insignificant or a crucial advantage one way
or the other, depending on your application's needs.
A definite advantage of random.sample is that its
running time is
O(r),
while this recipe's sample
function's running time is
O(n).

This recipe was inspired by Knuth's
Algorithm S in Donald E. Knuth, The
Art of Computer Programming, Volume 3,
Seminumerical Algorithms, in
section 3.4.2. However, this recipe has one improvement over
Knuth's algorithm: by tracking a cumulative
probability for each selection, this recipe eliminates the need to
make n calls to
random.random.

A potential major improvement would be to find a direct formula
giving the same result as the inner loop: given
x, samp, and
pop, compute the index of the first
sample. Finding this formula would reduce the running time to
O(r).

Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] نسخه متنی

فارسی

کردی

العربیه

اردو

Türkçe

Русский

English

Français

کانال فیلم من

تبیان من

فایلهای من

کتابخانه من

پنل پیامکی

وبلاگ من

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی