Python Cookbook, 2nd Edition (Jun 10, 2005) [Electronic resources]

David Ascher, Alex Martelli, Anna Ravenscroft


Introduction


Credit: Tim Peters, PythonLabs

Algorithm
research is what drew me to Python, and I fell in love. It
wasn't love at first sight, but it was an attraction
that grew into infatuation, which grew steadily into love. And that
love shows no signs of fading. Why? I've worked in
fields pushing the state of the art, and, in a paradoxical nutshell,
Python code is easy to throw away!

When you're trying to solve a problem that may not
have been solved before, you may have some intuitions about how to
proceed, but you rarely know in advance exactly what needs to be
done. The only way to proceed is to try things, many things,
everything you can think of, just to see what happens. Python makes
such exploration easier by minimizing the time and pain from
conception to code: if your colleagues are using, for example, C or
Java, it's not unusual for you to try (and discard)
six different approaches in Python while they're
still getting the bugs out of their first attempt.

In addition, you will have naturally grown classes and modules that
capture key parts of the problem domain, simply because you find the
need to keep reinventing them when starting over from scratch.
I've used many languages in my computer career, and
I know of none more productive than Python for prototyping. Best of
all, while being an expert is often helpful, moderate skill in Python
is much easier to obtain than for many other languages, yet much more
productive for research and prototyping than merely moderate skill in
any other language I've used. You
don't have to be an expert to start!

So if you're in the research business (and every programmer who
doesn't know everything occasionally is), you've got a nearly perfect
language in Python. How then do you develop the intuitions that can
generate a myriad of plausible approaches to try? Experience is the
final answer, as we all get better at what we do often, but studying
the myriad approaches other people have tried develops a firm base
from which to explore. Toward that end, here are the most inspiring
algorithm books I've read. They'll
teach you possibilities you may never have discovered on your own:

Jon Bentley, Programming Pearls and More Programming Pearls (Addison-Wesley)


Every programmer should read these books from cover to cover for
sheer joy. The chapters are extended versions of a popular column
Bentley wrote for the Communications of the Association for Computing
Machinery (CACM). Each chapter is generally self-contained, covering
one or two lovely (and often surprising, in the
"Aha! why didn't I think of
that?!" sense) techniques of real practical value.


Robert Sedgewick, Algorithms in C++ or Algorithms in C (Addison-Wesley)


These books cover the most important general algorithms, organized by
problem domain, and provide brief but cogent explanations, along with
working code. The books cover the same material; the difference is in
which computer language is used for the code. I recommend the C++
book for Python programmers, because idiomatic Python is closer to
C++ than to C. Sedgewick's use of C++ is generally
simple and easily translated to equivalent Python. This is the first
book to reach for when you need to tackle a new area quickly.


Donald Knuth, The Art of Computer Programming series (Addison-Wesley)


For experts (and those who aspire to expertise), this massive series
in progress is the finest in-depth exposition of the state of the
art. Nothing compares to its unique combination of breadth and depth,
rigor, and historical perspective. Note that these books
aren't meant to be read; they have to be actively
studied, and many valuable insights are scattered in answers to the
extensive exercises. While the books include detailed analysis,
there's virtually no working code, except for
programs written in assembly language for a hypothetical machine of
archaic design (yes, it can be maddeningly obscure). It can be hard
going at times, but few books so richly reward time invested.



To hone your skills, you can practice on an endless source of
problems from the wonderful On-Line Encyclopedia of Integer
Sequences, at http://www.research.att.com/~njas/sequences/Seis.html.
When stress-testing upcoming Python releases, I sometimes pick a
sequence at random from its list of sequences needing more terms and
write a program to attempt an extension of the sequence. Sometimes
I'm able to extend a sequence that
hasn't been augmented in years, in large part
because Python has so many powerful features for rapid construction
of new algorithms. Then the new terms are contributed to the
database, where they benefit others. Give it a try! You may love it,
but even if you hate it, you'll certainly find it
challenging.


Timing and timeit.py


The first edition of this book contained a lengthy discussion of the
difficulties in timing alternative approaches. Such difficulties
include the fact that the resolution of time.time
varies across platforms, and time.clock measures
different things on different platforms (e.g., process CPU time on
Linux systems, wall-clock time on Windows).

It may still be important for some to learn all those details, but
Python 2.3 introduced a new timeit module, which
captures best practice and is perfectly suited to timing small
programs with a minimum of fuss and pitfalls. Everyone should learn
how to use timeit, and basic usage is very easy to
learn.
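
For instance, the module exposes the clock it settles on as
timeit.default_timer, so you need not choose between time.time and
time.clock yourself. A small illustrative check (not from the book):

# timeit picks the platform-appropriate clock and exposes it; in the
# Pythons of this era that is time.clock on Windows and time.time elsewhere.
import timeit
print timeit.default_timer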

The simplest use of timeit is to pass one or more
Python statements on the command line. Of course, shell syntax varies
across platforms, so you may need to adjust these statements to the
shell you use:

% python timeit.py "100 + 200"
10000000 loops, best of 3: 0.0932 usec per loop
% python timeit.py "100 - 200"
10000000 loops, best of 3: 0.0931 usec per loop

As expected, integer addition and subtraction are just about equally
expensive. (Don't fall into the trap of attributing
any significance to differences as tiny as this one!)
timeit picks the best way of measuring time on
your platform and runs your code in a loop. The module tries a few
times first to determine how many iterations to use in the loop,
aiming at a total loop time between 0.2 and 2 seconds. When it
determines a suitable number of iterations for the loop, it then runs
the loop three times, reports the shortest time, and computes the
time per loop iteration. The iterations per loop, and number of loops
to run, can be forced to specific values with command-line options.
See the Python Library Reference for details.
(It's part of Python's online
documentation and probably also comes in some handy form with your
version of Python.)
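
If you prefer to drive the same machinery from a script, a rough
equivalent of the command-line behavior looks like the following
sketch, using the Timer class from the timeit module (here the
iteration count is simply fixed by hand rather than chosen
automatically):

# Rough script-level equivalent of "python timeit.py '100 + 200'":
# run the statement in a loop three times, keep the best (shortest)
# total, and report the per-iteration cost.
import timeit

number = 1000000                          # iterations per loop, fixed by hand
timer = timeit.Timer("100 + 200")         # statement to time
best = min(timer.repeat(repeat=3, number=number))
print "best of 3: %.4f usec per loop" % (best * 1e6 / number)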

As always, you should keep your machine as quiet as possible when
running timing tests. The primary reason timeit
runs three repetitions of the loop and reports the minimum time is to
guard against skewed results due to other machine activity. This is
especially important when running snippets that do very little work,
such as the preceding examples. In such cases, even just one
unfortunate interruption can grossly increase the reported time. Even
so, on my quiet machine, snippets that run this fast can still yield
confusing results:

% python timeit.py "100 + 200; 100 - 200"
10000000 loops, best of 3: 0.151 usec per loop
% python timeit.py "100 + 200" "100 - 200"
10000000 loops, best of 3: 0.151 usec per loop

One correct conclusion is that modern Python no longer has a time
penalty for writing two statements on two lines, instead of squashing
them together on one line separated by a semicolon. Older Pythons
generated a SET_LINENO opcode at the start of each
logical line of code, and those opcodes consumed time to execute!
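
You can check this claim with the standard dis module (an illustrative
snippet, not from the book): disassembling two statements written on
two separate lines shows no SET_LINENO opcodes in Python 2.3 and later.

# Compile two statements on separate lines and inspect the bytecode;
# no SET_LINENO opcodes appear in the output on Python 2.3 and later.
import dis
code = compile("a = 100 + 200\nb = 100 - 200", "<snippet>", "exec")
dis.dis(code)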

A more puzzling result is that adding and subtracting in one shot
took 0.151 usec, but adding alone and subtracting alone took 0.0932
usec each. Why didn't we get 2*0.093 = 0.186 usec in
the second experiment? The explanation is quite simple:
timeit uses a fast iteration technique and
doesn't try to subtract the iteration overhead from
the reported results. When timing very fast snippets, this can be
mildly disconcerting. Let's try to measure the
overhead by timing a do-nothing statement:

% python timeit.py "pass"
10000000 loops, best of 3: 0.0203 usec per loop

While 0.02 usec is tiny, it's significant compared
to the 0.093 usec reported for an integer add! Of course this effect
diminishes to insignificance when timing more expensive operations:

% python timeit.py "100**100"
100000 loops, best of 3: 4.04 usec per loop
% python timeit.py "200**200"
100000 loops, best of 3: 9.03 usec per loop
% python timeit.py "100**100" "200**200"
100000 loops, best of 3: 13.1 usec per loop

Large integer exponentiation is much more expensive than adding small
integers, and here the sum of the times for doing 100**100 and
200**200 in isolation is very close to the time for doing both at
once.

The timeit module supports several other
command-line options, and a programmatic interface too, but
I'll defer to the Python Library
Reference
for that information. To start making
productive use of timeit, the only other option
you need to know about is the ability to pass
"setup" statements on the command
line. These statements execute once, outside the loop containing the
code you're timing. For example, import statements
are often used, as well as code that populates data structures. For
instance (assuming a backslash \ is your
shell's way to indicate that a long logical line
continues in the next physical line):

% python timeit.py -s "import random" \
-s "x=range(100000); random.shuffle(x)" "sorted(x)"
10 loops, best of 3: 152 msec per loop

For each of the three loops, timeit constructed
the randomly ordered array just once, then ran
sorted(x) repeatedly inside the loop. This was so
expensive that timeit ran only 10 iterations per
loop and changed its reporting units from microseconds to
milliseconds. (In Python 2.3, timeit always
reported in microseconds, but in version 2.4, it tries to be more
helpful by picking the appropriate reporting units.) This is very
different from:

% python timeit.py "import random" 
"x=range(100000); random.shuffle(x)" "sorted(x)"
10 loops, best of 3: 309 msec per loop

This snippet timed all the operations: importing
random, building the list, randomly permuting the
list, and sorting the list. This preparation code takes longer than
sorting does! You may be surprised to see from the reported times that
building and shuffling the list took at least as long as sorting it.
The first two operations take O(n) time, but sorting random data takes
O(n log n) time; given this, how can this strange measurement be
explained? Why didn't sorting take longer?

I won't explain that mystery here but will point out
a more significant lesson: timing code always
uncovers mysteries, and a timing tool as easy to use as
timeit can be addictive. So be careful what you
measure! Measuring itself will consume more of your time than you
expect. As noted innumerable times by innumerable authors, the speed
of most of your code doesn't matter at all. Find the
10% that consumes most of the time before worrying about any of it.
When you find the true bottlenecks, timeit can
help you measure the speed of alternatives objectively, and you
may be surprised by what you find.
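
For instance, once you have found a true bottleneck, a comparison of
alternatives through the programmatic interface might look roughly like
this sketch (the two candidate snippets are purely illustrative, and
the setup string plays the same role as the -s option shown earlier):

# Compare two ways of producing a sorted copy of a shuffled list; the
# setup runs once per timing loop, outside the timed statement.
import timeit

setup = "import random; x = range(10000); random.shuffle(x)"
candidates = {
    "sorted(x)":      "y = sorted(x)",
    "copy then sort": "y = x[:]; y.sort()",
}
for name, stmt in candidates.items():
    best = min(timeit.Timer(stmt, setup).repeat(repeat=3, number=100))
    print "%-16s %.2f msec per loop" % (name, best * 1000 / 100)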
