Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] نسخه متنی

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

Recipe 5.14. Enriching the Dictionary Type with Ratings Functionality

Credit: Dmitry Vasiliev, Alex Martelli

Problem

You want to
use a dictionary to store the mapping between some keys and a current
score value for each key. You frequently need to access the keys and
scores in natural order (meaning, in order of ascending scores) and
to check on a
"key"'s current
ranking in that order, so that using just a dict
isn't quite enough.

Solution

We can subclass dict and add or override methods
as needed. By using multiple inheritance, placing base
UserDict.DictMixin before
base dict and carefully arranging our various
delegations and "over"rides, we can
achieve a good balance between getting good performance and avoiding
the need to write "boilerplate"
code.

By enriching
our class with many examples in its docstring, we can use the
standard library's module doctest
to give us unit-testing functionality, as well as ensuring the
accuracy of all the examples we write in the docstring:

#!/usr/bin/env python
''' An enriched dictionary that holds a mapping from keys to scores '''
from bisect import bisect_left, insort_left
import UserDict
class Ratings(UserDict.DictMixin, dict):
"" class Ratings is mostly like a dictionary, with extra features: the
value corresponding to each key is the 'score' for that key, and all
keys are ranked in terms their scores.  Values must be comparable; keys,
as well as being hashable, must be comparable if any two keys may ever
have the same corresponding value (i.e., may be "tied" on score).
All mapping-like behavior is just as you would expect, such as:
>>> r = Ratings({"bob": 30, "john": 30})
>>> len(r)
2
>>> r.has_key("paul"), "paul" in r
       (False, False)
>>> r["john"] = 20
>>> r.update({"paul": 20, "tom": 10})
>>> len(r)
4
>>> r.has_key("paul"), "paul" in r
       (True, True)
>>> [r[key] for key in ["bob", "paul", "john", "tom"]]
       [30, 20, 20, 10]
>>> r.get("nobody"), r.get("nobody", 0)
       (None, 0)
In addition to the mapping interface, we offer rating-specific
methods.  r.rating(key) returns the ranking of a "key" in the
ratings, with a ranking of 0 meaning the lowest score (when two
keys have equal scores, the keys themselves are compared, to
"break the tie", and the lesser key gets a lower ranking):
>>> [r.rating(key) for key in ["bob", "paul", "john", "tom"]]
        [3, 2, 1, 0]
getValueByRating(ranking) and getKeyByRating(ranking) return the
score and key, respectively, for a given ranking index:
>>> [r.getValueByRating(rating) for rating in range(4)]
        [10, 20, 20, 30]
>>> [r.getKeyByRating(rating) for rating in range(4)]
        ['tom', 'john', 'paul', 'bob']
An important feature is that the keys( ) method returns keys in
ascending order of ranking, and all other related methods return
lists or iterators fully consistent with this ordering:
>>> r.keys( )
      ['tom', 'john', 'paul', 'bob']
>>> [key for key in r]
        ['tom', 'john', 'paul', 'bob']
>>> [key for key in r.iterkeys( )]
        ['tom', 'john', 'paul', 'bob']
>>> r.values( )
        [10, 20, 20, 30]
>>> [value for value in r.itervalues( )]
        [10, 20, 20, 30]
>>> r.items( )
        [('tom', 10), ('john', 20), ('paul', 20), ('bob', 30)]
>>> [item for item in r.iteritems( )]
        [('tom', 10), ('john', 20), ('paul', 20), ('bob', 30)]
An instance can be modified (adding, changing and deleting
key-score correspondences), and every method of that instance
reflects the instance's current state at all times:
>>> r["tom"] = 100
>>> r.items( )
        [('john', 20), ('paul', 20), ('bob', 30), ('tom', 100)]
>>> del r["paul"]
>>> r.items( )
        [('john', 20), ('bob', 30), ('tom', 100)]
>>> r["paul"] = 25
>>> r.items( )
        [('john', 20), ('paul', 25), ('bob', 30), ('tom', 100)]
>>> r.clear( )
>>> r.items( )
        [  ]
""
''' the implementation carefully mixes inheritance and delegation
to achieve reasonable performance while minimizing boilerplate,
and, of course, to ensure semantic correctness as above.  All
mappings' methods not implemented below get inherited, mostly
from DictMixin, but, crucially!, _ _getitem_ _ from dict. '''
def _ _init_ _(self, *args, **kwds):
''' This class gets instantiated just like 'dict' '''
dict._ _init_ _(self, *args, **kwds)
# self._rating is the crucial auxiliary data structure: a list
# of all (value, key) pairs, kept in "natural"ly-sorted order
self._rating = [ (v, k) for k, v in dict.iteritems(self) ]
self._rating.sort( )
def copy(self):
''' Provide an identical but independent copy '''
return Ratings(self)
def _ _setitem_ _(self, k, v):
''' besides delegating to dict, we maintain self._rating '''
if k in self:
del self._rating[self.rating(k)]
dict._ _setitem_ _(self, k, v)
insort_left(self._rating, (v, k))
def _ _delitem_ _(self, k):
''' besides delegating to dict, we maintain self._rating '''
del self._rating[self.rating(k)]
dict._ _delitem_ _(self, k)
''' delegate some methods to dict explicitly to avoid getting
DictMixin's slower (though correct) implementations instead '''
_ _len_ _ = dict._ _len_ _
_ _contains_ _ = dict._ _contains_ _
has_key = _ _contains_ _
''' the key semantic connection between self._rating and the order
of self.keys( ) -- DictMixin gives us all other methods 'for
free', although we could implement them directly for slightly
better performance. '''
def _ _iter_ _(self):
for v, k in self._rating:
yield k
iterkeys = _ _iter_ _
def keys(self):
return list(self)
''' the three ratings-related methods '''
def rating(self, key):
item = self[key], key
i = bisect_left(self._rating, item)
if item == self._rating[i]:
return i
raise LookupError, "item not found in rating"
def getValueByRating(self, rating):
return self._rating[rating][0]
def getKeyByRating(self, rating):
return self._rating[rating][1]
def _test( ):
''' we use doctest to test this module, which must be named
rating.py, by validating all the examples in docstrings. '''
import doctest, rating
doctest.testmod(rating)
if _ _name_ _ == "_ _main_ _":
_test( )

Discussion

In many ways, a dictionary is the natural data structure for storing
a correspondence between keys (e.g., names of contestants in a
competition) and the current
"score" of each key (e.g., the
number of points a contestant has scored so far, or the highest bid
made by each contestant at an auction, etc.). If we use a dictionary
for such purposes, we will probably want to access it often in
natural orderthe order in which the
keys' scores are ascendingand
we'll also want fast access to the rankings
(ratings) implied by the current
"score"s (e.g., the contestant
currently in third place, the score of the contestant who is in
second place, etc.).

To achieve these
purposes, this recipe subclasses dict to add the
needed functionality that is completely missing from
dict (methods rating,
getValueByRating, getKeyByRating),
and, more subtly and crucially, to modify method
keys and all other related methods so that they
return lists or iterators with the required order (i.e., the order in
which scores are ascending; if we have to break ties when two keys
have the same score, we implicitly compare the keys themselves). Most
of the detailed documentation is in the docstring of the class
itselfa crucial issue because by keeping the documentation and
examples there, we can use module doctest from the
Python Standard Library to provide unit-testing functionality, as
well as ensuring that our examples are correct.

The most interesting aspect of the implementation is that it takes
good care to minimize boilerplate (meaning repetitious and boring
code, and therefore code where bugs are most likely to hide) without
seriously impairing performance. class Ratings
multiply inherits from dict
and DictMixin, with the
latter placed first in the list of bases, so
that all methods come from the mixin, if it provides them, unless
explicitly overridden in the class.

Raymond Hettinger's DictMixin
class was originally posted as a recipe to the online version of the
Python Cookbook and later
became part of Python 2.3's standard library.
DictMixin provides all the methods of a mapping
except _ _init_ _, copy, and
the four fundamental methods: _ _getitem_ _,
_ _setitem_ _, _ _delitem_ _,
and, last but not least, keys. If you are coding a
mapping class and want to ensure that your class supports all of the
many methods that a full mapping provides to application code, you
should subclass DictMixin and supply at least the
fundamental methods (depending on your class'
semanticse.g., if your class has immutable instances, you need
not supply the mutator methods _ _setitem_ _ and
_ _delitem_ _). You may optionally implement other
methods for performance purposes, overriding the implementation that
DictMixin provides. The whole
DictMixin architecture can be seen as an excellent
example of the classic Template Method Design Pattern, applied
pervasively in a useful mix-in variant.

In this recipe's class, we inherit _ _getitem_ _ from our other base (namely, the built-in type
dict), and we also delegate explicitly to
dict everything we can for performance reasons. We
have to code the elementary mutator methods (_ _setitem_ _ and _ _delitem_ _) because, in
addition to delegating to our base class dict, we
need to maintain our auxiliary data structure
self._ratinga list of (score, key) pairs that we keep in sorted order with the help of
standard library module bisect. We implement
keys ourselves (and while we're
at it, we implement _ _iter_ _ i.e.,
iterkeys as well, since clearly
keys is easiest to implement by using _ _iter_ _) to exploit self._rating and
return the keys in the order we need. Finally, we add the obvious
implementations for _ _init_ _ and
copy, in addition to the three, ratings-specific
methods that we supply.

The result is quite an interesting example of balancing concision,
clarity, and well-advised reuse of the enormous amount of
functionality that the standard Python library places at our
disposal. If you use this module in your applications, profiling may
reveal that a method that this recipe's class
inherits from DictMixin has somewhat
unsatisfactory performanceafter all, the implementations in
DictMixin are, of necessity, somewhat generic. If
this is the case, by all means add a direct implementation of
whatever further methods you need to achieve maximum performance! For
example, if your application performs a lot of looping on the result
of calling r.iteritems( ) for some instance
r of class Ratings, you
may get slightly better performance by adding to the body of the
class the direct implementation of the method:

    def iteritems(self):
for v, k in self._rating:
yield k, v

Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] نسخه متنی

فارسی

کردی

العربیه

اردو

Türkçe

Русский

English

Français

کانال فیلم من

تبیان من

فایلهای من

کتابخانه من

پنل پیامکی

وبلاگ من

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی