Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] نسخه متنی

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] - نسخه متنی

David Ascher, Alex Martelli, Anna Ravenscroft

| نمايش فراداده ، افزودن یک نقد و بررسی
افزودن به کتابخانه شخصی
ارسال به دوستان
جستجو در متن کتاب
بیشتر
تنظیمات قلم

فونت

اندازه قلم

+ - پیش فرض

حالت نمایش

روز نیمروز شب
جستجو در لغت نامه
بیشتر
لیست موضوعات
توضیحات
افزودن یادداشت جدید







Recipe 18.4. Generating Random Samples Without Replacement


Credit: Raymond Hettinger


Problem


You need to generate random samples without replacement out of a
"population" (the integers between
included and some n excluded), and you
want better memory consumption characteristics than
random.sample provides.


Solution


A generator for this purpose requires only constant memory and makes
a small number of calls to random.random:

import random
def sample(n, r):
" Generate r randomly chosen, sorted integers from [0,n) "
rand = random.random
pop = n
for samp in xrange(r, 0, -1):
cumprob = 1.0
x = rand( )
while x < cumprob:
cumprob -= cumprob * samp / pop
pop -= 1
yield n-pop-1


Discussion


random.sample(xrange(10), 3) produces output
statistically equal to list(sample(10,
3)) using this recipe's
sample. Differently from
random.sample(xrange(n),
r), this
recipe's
sample(n,
r) requires a
bounded amount of memory (which does not grow with either
r or n) and is
guaranteed to make only r calls to
random.random. Moreover, this
recipe's sample yields the
r numbers of the sample in sorted order,
while random.sample returns them in random
orderwhich may be insignificant or a crucial advantage one way
or the other, depending on your application's needs.
A definite advantage of random.sample is that its
running time is
O(r),
while this recipe's sample
function's running time is
O(n).

This recipe was inspired by Knuth's
Algorithm S in Donald E. Knuth, The
Art of Computer Programming
, Volume 3,
Seminumerical Algorithms, in
section 3.4.2. However, this recipe has one improvement over
Knuth's algorithm: by tracking a cumulative
probability for each selection, this recipe eliminates the need to
make n calls to
random.random.

A potential major improvement would be to find a direct formula
giving the same result as the inner loop: given
x, samp, and
pop, compute the index of the first
sample. Finding this formula would reduce the running time to
O(r).


See Also


Library Reference and Python in a
Nutshell
docs about random.


/ 394