Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] نسخه متنی

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

Recipe 5.4. Sorting Keys or Indices Basedon the Corresponding Values

Credit: John Jensen, Fred Bremmer, Nick Coghlan

Problem

You need to count the
occurrences of various items and present the items in order of their
number of occurrencesfor example, to produce a histogram.

Solution

A histogram, apart
from graphical issues, is based on counting the occurrences of items
(easy to do with a Python list or dictionary) and then sorting the
keys or indices in an order based on corresponding values. Here is a
subclass of dict that adds two methods for the
purpose:

class hist(dict):
def add(self, item, increment=1):
 add 'increment' to the entry for 'item' 
self[item] = increment + self.get(item, 0)
def counts(self, reverse=False):
 return list of keys sorted by corresponding values 
aux = [ (self[k], k) for k in self ]
aux.sort( )
if reverse: aux.reverse( )
return [k for v, k in aux]

If the items you're counting are best modeled by
small integers in a compact range, so that you want to keep item
counts in a list, the solution is quite similar:

class hist1(list):
def _ _init_ _(self, n):
 initialize this list to count occurrences of n distinct items 
list._ _init_ _(self, n*[0])
def add(self, item, increment=1):
 add 'increment' to the entry for 'item' 
self[item] += increment
def counts(self, reverse=False):
 return list of indices sorted by corresponding values 
aux = [ (v, k) for k, v in enumerate(self) ]
aux.sort( )
if reverse: aux.reverse( )
return [k for v, k in aux]

Discussion

The
add method of hist embodies the
normal Python idiom for counting occurrences of arbitrary (but
hashable) items, using a dict to hold the counts.
In class hist1, based on a list,
we take a different approach, initializing all counts to 0 in
_ _init_ _, so the add method is
even simpler.

The counts methods produce the lists of keys, or
indices, sorted in the order given by the corresponding values. The
problem is very similar in both classes, hist and
hist1; therefore, the solutions are also almost
identical, using in each case the DSU approach already shown in
Recipe 5.2 and Recipe 5.3. If we need both
classes in our program, the similarity is so close that we should
surely factor out the commonalities into a single auxiliary function
_sorted_keys:

def _sorted_keys(container, keys, reverse):
 return list of 'keys' sorted by corresponding values in 'container' 
aux = [ (container[k], k) for k in keys ]
aux.sort( )
if reverse: aux.reverse( )
return [k for v, k in aux]

and then implement the counts methods of each class
as thin wrappers over this _sorted_keys function:

class hist(dict):
...
def counts(self, reverse=False):
return _sorted_keys(self, self, reverse)
class hist1(list):
...
def counts(self, reverse=False):
return _sorted_keys(self, xrange(len(self)), reverse)

DSU is so important that in Python 2.4, as shown previously in Recipe 5.2 and Recipe 5.3, the sort
method of lists and the new built-in function
sorted offer a fast, intrinsic implementation of
it. Therefore, in Python 2.4, function _sorted_keys
can become much simpler and faster:

def _sorted_keys(container, keys, reverse):
return sorted(keys, key=container._ _getitem_ _, reverse=reverse)`

The bound-method container._ _getitem_ _ performs exactly the same operation as the
indexing container[k] in the Python 2.3
implementation, but it's a callable to call on each
k of the sequence that
we're sorting, namely
keysexactly the right kind of value to pass
as the key keyword argument to the
sorted built-in function. Python 2.4 also affords
an easy, direct way to get a list of a dictionary's
items sorted by value:

from operator import itemgetter
def dict_items_sorted_by_value(d, reverse=False):
return sorted(d.iteritems( ), key=itemgetter(1), reverse=reverse)

The
operator.itemgetter higher-order function, also
new in Python 2.4, is a handy way to supply the
key argument when you want to sort a container
whose items are subcontainers, keying on a certain item of each
subcontainer. This is exactly the case here, since a
dictionary's items are a sequence of pairs (two-item
tuples), and we want to sort the sequence keying on the second item
of each tuple.

Getting back to this recipe's main theme, here is a
usage example for the class hist shown in this
recipe's Solution:

sentence = ' Hello there this is a test.  Hello there this was a test,
but now it is not. '
words = sentence.split( )
c = hist( )
for word in words: c.add(word)
print "Ascending count:"
print c.counts( )
print "Descending count:"
print c.counts(reverse=True)

This code snippet produces the following output:

Ascending count:
[(1, 'but'), (1, 'it'), (1, 'not.'), (1, 'now'), 
(1, 'test,'), (1, 'test.'),
(1, 'was'), (2, 'Hello'), (2, 'a'), (2, 'is'), 
(2, 'there'), (2, 'this')]
Descending count:
[(2, 'this'), (2, 'there'), (2, 'is'), (2, 'a'),
 (2, 'Hello'), (1, 'was'),
(1, 'test.'), (1, 'test,'), (1, 'now'), (1, 'not.'),
 (1, 'it'), (1, 'but')]

Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] نسخه متنی

فارسی

کردی

العربیه

اردو

Türkçe

Русский

English

Français

کانال فیلم من

تبیان من

فایلهای من

کتابخانه من

پنل پیامکی

وبلاگ من

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی