Introduction
Credit: Tim Peters, PythonLabs
Computer manufacturers of the 1960s estimated that more than 25
percent of the running time on their computers was spent on sorting,
when all their customers were taken into account. In fact, there were
many installations in which the task of sorting was responsible for
more than half of the computing time. From these statistics we may
conclude that either (i) there are many important applications of
sorting, or (ii) many people sort when they
shouldn't, or (iii) inefficient sorting algorithms
have been in common use.
Donald KnuthThe Art of Computer Programming,vol. 3,
Sorting and Searching, page 3
Professor
Knuth's masterful work on the topics of sorting and
searching spans nearly 800 pages of sophisticated technical text. In
Python practice, we reduce it to two imperatives (we read Knuth so
you don't have to):
When you need to sort, find a way to use
the built-in sort method of Python lists.- When you need to search, find a way to use built-in dictionaries.
Many recipes in this chapter illustrate
these principles. The most common theme is using the
decorate-sort-undecorate (DSU) pattern, a
general approach to transforming a sorting problem by creating an
auxiliary list that we can then sort with the default, speedy
sort method. This technique is the single most
useful one to take from this chapter. In fact, DSU is so useful that
Python 2.4 introduced new features to make it easier to apply. Many
recipes can be made simpler in 2.4 as a result, and the discussion of
older recipes have been updated to show how.DSU relies on an unusual feature of
Python's built-in comparisons: sequences are
compared lexicographically. Lexicographical order is a generalization
to tuples and lists of the everyday rules used to compare strings
(e.g., alphabetical order). The built-in cmp(s1,
s2), when s1 and
s2 are sequences, is equivalent to this
Python code:
def lexcmp(s1, s2):This code looks for the first unequal corresponding elements. If such
# Find leftmost nonequal pair.
i = 0
while i < len(s1) and i < len(s2):
outcome = cmp(s1[i], s2[i])
if outcome:
return outcome
i += 1
# All equal, until at least one sequence was exhausted.
return cmp(len(s1), len(s2))
an unequal pair is found, that pair determines the outcome.
Otherwise, if one sequence is a proper prefix of the other, the
prefix is considered to be the smaller sequence. Finally, if these
cases don't apply, the sequences are identical and
are considered equal. Here are some examples:
>>> cmp((1, 2, 3), (1, 2, 3)) # identicalAn immediate consequence of lexicographical comparison is that if you
0
>>> cmp((1, 2, 3), (1, 2)) # first larger because second is a prefix
1
>>> cmp((1, 100), (2, 1)) # first smaller because 1<2
-1
>>> cmp((1, 2), (1, 3)) # first smaller because 1==1, then 2<3
-1
want to sort a list of objects by a primary key, breaking ties by
comparing a secondary key, you can simply build a list of tuples, in
which each tuple contains the primary key, secondary key, and
original object, in that order. Because tuples are compared
lexicographically, this automatically does the right thing. When
comparing tuples, the primary keys are compared first, and if (and
only if) the primary keys are equal, the secondary keys are compared.The examples of the DSU pattern in this chapter show many
applications of this idea. The DSU technique applies to any number of
keys. You can add to the tuples as many keys as you like, in the
order in which you want the keys compared. In Python 2.4, you can get
the same effect with the new key= optional
argument to sort, as several recipes point out.
Using the sort method's
key= argument is easier, more memory-efficient,
and runs faster than building an auxiliary list of tuples by hand.The other 2.4-introduced innovation in sorting is a convenient
shortcut: a sorted built-in function that sorts
any iterable, not in-place, but by first copying it into a new list.
In Python 2.3 (apart from the new optional keyword arguments, which
apply to the sorted built-in function as well as
to list.sort), you can code the same functionality
quite easily:
def sorted_2_3(iterable):Because copying a list and sorting it are both nontrivial operations,
alist = list(iterable)
alist.sort( )
return alist
and the built-in sorted needs to perform those
operations too, no speed advantage is gained in making
sorted a built-in. Its advantage is just the
convenience. Having something always around and available, rather
than having to code even just four simple lines over and over,
does make a difference in practice. On the other
hand, few tiny functions are used commonly enough to justify
expanding the set of built-ins. Python 2.4 added
sorted and reversed because
those two functions were requested very frequently over the years.The biggest change in Python sorting since
the first edition of this book is that Python 2.3 moved to a new
implementation of sorting. The primary visible consequences are
increased speed in many common cases, and the fact that the new sort
is stable (meaning that when two elements compare equal in the
original list, they retain their relative order in the sorted list).
The new implementation was so successful, and the chances of
improving on it appeared so slim, that Guido was persuaded to
proclaim that Python's list.sort
method will always be stable. This guarantee started with Python 2.4
but was actually realized in Python 2.3. Still, the history of
sorting cautions us that better methods may yet be discovered. A
brief account of Python's sorting history may be
instructive in this regard.
A Short History of Python Sorting
In early releases of Python,
list.sort used the qsort
routine from the underlying platform's C library.
This didn't work out for several reasons, primarily
because the quality of qsort varied widely across
machines. Some versions were extremely slow when given a list with
many equal values or in reverse-sorted order. Some versions even
dumped core because they weren't reentrant. A
user-defined _ _cmp_ _ function can also invoke
list.sort, so that one
list.sort can invoke others as a side effect of
comparing. Some platform qsort routines
couldn't handle that. A user-defined _
_cmp_ _ function can also (if it's insane
or malicious) mutate the list while it's being
sorted, and many platform qsort routines dumped
core when that happened.Python then grew its own implementation of the quicksort algorithm.
This was rewritten with every release, as real-life cases of
unacceptable slowness were discovered. Quicksort is a delicate
algorithm indeed!In
Python 1.5.2 the quicksort algorithm was replaced by a hybrid of
samplesort and binary insertion sort, and that implementation
remained unchanged for more than four years, until Python 2.3.
Samplesort can be viewed as a variant of quicksort that uses a very
large sample size to pick the partitioning element, also known as the
pivot (it recursively samplesorts a large
random subset of the elements and picks the median of those). This
variant makes quadratic-time behavior almost impossible and brings
the number of comparisons in the average case much closer to the
theoretical minimum.However, because samplesort is a complicated algorithm, it has too
much administrative overhead for small lists. Therefore, small lists
(and small slices resulting from samplesort partitioning) were
handled by a separate binary insertion sort, which is an ordinary
insertion sort, except that it uses binary search to determine where
each new element belongs. Most sorting texts say this
isn't worth the bother, but that's
because most texts assume that comparing two elements is as cheap as
or cheaper than swapping them in memory, which isn't
true for Python's sort! Moving an object is very
cheap, since what is copied is just a reference to the object.
Comparing two objects is expensive, though, because all of the
object-oriented machinery for finding the appropriate code to compare
two objects and for coercion gets reinvoked each time. This made
binary search a major win for Python's sort.On top of this hybrid approach, a few common special cases were
exploited for speed. First, already-sorted or reverse-sorted lists
were detected and handled in linear time. For some applications,
these kinds of lists are very common. Second, if an array was mostly
sorted, with just a few out-of-place elements at the end, the binary
insertion sort handled the whole job. This was much faster than
letting samplesort have at it and occurred often in applications that
repeatedly sort a list, append a few new elements, then sort it
again. Finally, special code in the samplesort looked for stretches
of equal elements, so that the slice they occupy could be marked as
done early.In the end, all of this yielded an in-place sort with excellent
performance in all known real cases and supernaturally good
performance in some common special cases. It spanned about 500 lines
of complicated C code, which gives special poignancy to recipe
Recipe 5.11.Over the years samplesort was in use, I made a standing offer to buy
dinner for anyone who could code a faster Python sort. Alas, I ate
alone. Still, I kept my eye on the literature because several aspects
of the samplesort hybrid were irritating:
- While no case of quadratic-time behavior appeared in real life, I
knew such cases could be contrived, and it was easy to contrive cases
two or three times slower than average ones. - The special cases to speed sorting in the presence of extreme partial
order were valuable in practice, but my real data often had many
other kinds of partial order that should be exploitable. In fact, I
came to believe that random ordering in input lists almost never
exists in real life (i.e., not outside of timing harnesses for
testing sorting algorithms!). - There is no practical way to make samplesort stable without grossly
increasing memory use. - The code was very complex and complicated in ugly ways by the special
cases.
Current Sorting
It was always clear that a mergesort would
be better on several counts, including guaranteed worst-case
n log n time, and that mergesort is easy to make
stable. The problem was that half a dozen attempts to code a
mergesort for Python yielded a sort that ran slower (mergesort does
much more data movement than samplesort) and consumed more memory.A large and growing literature
concerns adaptive sorting algorithms, which
attempt to detect order of various kinds in the input. I coded a
dozen of them, but they were all much slower than
Python's samplesort except on the cases they were
designed to exploit. The theoretical bases for these algorithms were
simply too complex to yield effective practical algorithms. Then I
read an article pointing out that list merging
naturally reveals many kinds of partial order,
simply by paying attention to how often each input list
"wins" in a row. This information
was simple and general. When I realized how it could be applied to a
natural mergesort, which would obviously exploit all the kinds of
partial order I knew and cared about, I got obsessed enough to solve
the speed problem for random data and to minimize the memory burden.The
resulting "adaptive, natural,
stable" mergesort implemented for Python 2.3 was a
major success, but also a major engineering effortthe devil is
in the details. There are about 1,200 lines of C code, but unlike the
code in the samplesort hybrid, none of these lines are coding for
special cases, and about half implement a technical trick allowing
the worst-case memory burden to be cut in half. I'm
quite proud of it, but the margins of this introduction lack the
space for me to explain the details. If you're
curious, I wrote a long technical description that you can find in a
Python source distribution: Objects/listsort.txt
under the main directory (say, Python-2.3.5 or
Python-2.4) where you unpacked
Python's source distribution archive. In the
following list, I provide examples of the partial order Python
2.3's mergesort naturally exploits, where
"sorted" means in either
forward-sorted or reverse-sorted order:
- The input is already sorted.
- The input is mostly sorted but has random elements appended at either
end, or both, or inserted in the middle. - The input is the concatenation of two or more sorted lists. In fact,
the fastest way to merge multiple sorted lists in Python now is to
join them into one long list and run list.sort on
that. - The input is mostly sorted but has some scattered elements that are
out of order. This is common, for example, when people manually add
new records to a database sorted by name: people
aren't good at maintaining strict alphabetic order
but are good at getting close. - The input has many keys with the same value. For example, when
sorting a database of American companies by the stock exchange
they're listed on, most will be associated with the
NYSE or NASDAQ exchanges. This is exploitable for a curious reason:
records with equal keys are already in sorted order, by the
definition of "stable"! The
algorithm detects that naturally, without code especially looking for
equal keys. - The input was in sorted order but got dropped on the floor in chunks;
the chunks were reassembled in random order, and to fight boredom,
some of the chunks were riffle-shuffled together. While
that's a silly example, it still results in
exploitable partial order and suggests how general the method is.
In short, Python 2.3's
timsort (well, it has to have
some brief name) is stable, robust, and
preternaturally fast in many real-life cases: use it any time you
can!