Python Cookbook 2nd Edition Jun 1002005 [Electronic resources]

David Ascher, Alex Martelli, Anna Ravenscroft

Recipe 5.15. Sorting Names and Separating Them by Initials

Credit: Brett Cannon, Amos Newcombe

Problem

You want to write a directory for a group of people, and you want that directory to be grouped by the initials of their last names and sorted alphabetically.

Solution

Python 2.4's new itertools.groupby function makes this task easy:

import itertools

def groupnames(name_iterable):
    # sort by last-first-middle, then group consecutive names by initial
    sorted_names = sorted(name_iterable, key=_sortkeyfunc)
    name_dict = {}
    for key, group in itertools.groupby(sorted_names, _groupkeyfunc):
        name_dict[key] = tuple(group)
    return name_dict

# maps the number of name parts to the order in which to rejoin them
pieces_order = { 2: (-1, 0), 3: (-1, 0, 1) }

def _sortkeyfunc(name):
    ''' name is a string with first and last names, and an optional middle
        name or initial, separated by spaces; returns a string in order
        last-first-middle, as wanted for sorting purposes. '''
    name_parts = name.split()
    return ' '.join([name_parts[n] for n in pieces_order[len(name_parts)]])

def _groupkeyfunc(name):
    ''' returns the key for grouping, i.e. the last name's initial. '''
    return name.split()[-1][0]

Discussion

In this recipe, name_iterable must be an iterable whose items are strings containing names in the form first - middle - last, with middle being optional and the parts separated by whitespace. The result of calling groupnames on such an iterable is a dictionary whose keys are the last names' initials, and the corresponding values are the tuples of all names with that last name's initial.
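
To make the shape of the result concrete, here is a small usage sketch. It repeats the Solution's functions so that it runs on its own; the list of names is hypothetical sample data, not part of the original recipe.

```python
import itertools

# Same helpers as in the Solution, repeated so the example is self-contained.
pieces_order = {2: (-1, 0), 3: (-1, 0, 1)}

def _sortkeyfunc(name):
    name_parts = name.split()
    return ' '.join([name_parts[n] for n in pieces_order[len(name_parts)]])

def _groupkeyfunc(name):
    return name.split()[-1][0]

def groupnames(name_iterable):
    sorted_names = sorted(name_iterable, key=_sortkeyfunc)
    name_dict = {}
    for key, group in itertools.groupby(sorted_names, _groupkeyfunc):
        name_dict[key] = tuple(group)
    return name_dict

# Hypothetical sample data: "first last" or "first middle last".
names = ["Brett Cannon", "Amos Newcombe", "David Ascher", "Alex Martelli",
         "Anna Ravenscroft", "Alan M Turing", "Mary Adams"]

directory = groupnames(names)

# Keys are last-name initials; each value is a tuple sorted last-first-middle.
# directory["A"] is ('Mary Adams', 'David Ascher')
for initial in sorted(directory):
    print(initial, directory[initial])
```

Note that each name keeps its original "first last" form in the result; only the ordering and grouping follow the last name.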

Auxiliary function _sortkeyfunc splits a name that's a single string, either "first last" or "first middle last," and reorders the parts into a list that starts with the last name, followed by the first name, plus the middle name or initial, if any, at the end. Then, the function returns this list rejoined into a string. The resulting string is the key we want to use for sorting, according to the problem statement. Python 2.4's built-in function sorted takes just this kind of function (to call on each item to get the sort key) as the value of its optional parameter named key.
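
The following sketch shows the sort key in isolation. The function name sort_key and the sample names are illustrative assumptions; the logic mirrors _sortkeyfunc. It also shows a consequence of the pieces_order lookup: a name with anything other than two or three parts raises KeyError.

```python
# Maps the number of name parts to the order in which they are rejoined.
pieces_order = {2: (-1, 0), 3: (-1, 0, 1)}

def sort_key(name):
    """Return 'last first [middle]' for a 'first [middle] last' name."""
    parts = name.split()
    return ' '.join(parts[n] for n in pieces_order[len(parts)])

key_two = sort_key("Brett Cannon")       # 'Cannon Brett'
key_three = sort_key("Alan M Turing")    # 'Turing Alan M'

# sorted calls sort_key once per item and orders the items by the results.
ordered = sorted(["Alan M Turing", "Brett Cannon", "Mary Adams"], key=sort_key)
# ordered is ['Mary Adams', 'Brett Cannon', 'Alan M Turing']

# A one-part name has no entry in pieces_order, so the lookup fails.
try:
    sort_key("Cher")
    unsupported = False
except KeyError:
    unsupported = True
```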

Auxiliary function _groupkeyfunc takes a name in the same form and returns the last name's initial: the key on which, again according to the problem statement, we want to group.

This recipe's primary function, groupnames, uses the two auxiliary functions and Python 2.4's sorted and itertools.groupby to solve our problem, building and returning the required dictionary.
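
One detail worth illustrating is why groupnames sorts before it groups: itertools.groupby only collects consecutive items that share a key. The sketch below uses a hypothetical helper, last_initial, and made-up names to show what happens with unsorted input. In the recipe, a repeated key would simply overwrite the earlier entry in name_dict.

```python
import itertools

def last_initial(name):
    """Grouping key: the first letter of the last name."""
    return name.split()[-1][0]

unsorted_names = ["Mary Adams", "Brett Cannon", "David Ascher"]

# Without sorting, the two 'A' names are not adjacent, so 'A' appears twice.
runs = [(key, tuple(group))
        for key, group in itertools.groupby(unsorted_names, last_initial)]
# runs is [('A', ('Mary Adams',)), ('C', ('Brett Cannon',)), ('A', ('David Ascher',))]

# After sorting by the same key, each initial forms a single run.
grouped = [(key, tuple(group))
           for key, group in itertools.groupby(
               sorted(unsorted_names, key=last_initial), last_initial)]
# grouped is [('A', ('Mary Adams', 'David Ascher')), ('C', ('Brett Cannon',))]
```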

If you need to code this task in Python 2.3, you can use the same two support functions and recode function groupnames itself. In 2.3, it is more convenient to do the grouping first and the sorting separately on each group, since no groupby function is available in Python 2.3's standard library:

def groupnames(name_iterable):
    name_dict = {}
    # first, collect the names into one list per initial
    for name in name_iterable:
        key = _groupkeyfunc(name)
        name_dict.setdefault(key, []).append(name)
    # then sort each group by decorating every name with its sort key
    for k, v in name_dict.iteritems():
        aux = [(_sortkeyfunc(name), name) for name in v]
        aux.sort()
        name_dict[k] = tuple([ n for _, n in aux ])
    return name_dict
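
The same group-first, sort-second idea can also be sketched for current Python versions, where iteritems no longer exists. This is an illustrative adaptation rather than code from the recipe: the names sort_key, group_key, and groupnames_grouping_first are assumptions, and the sample data is hypothetical.

```python
pieces_order = {2: (-1, 0), 3: (-1, 0, 1)}

def sort_key(name):
    parts = name.split()
    return ' '.join(parts[n] for n in pieces_order[len(parts)])

def group_key(name):
    return name.split()[-1][0]

def groupnames_grouping_first(name_iterable):
    groups = {}
    # Step 1: collect the names into one list per last-name initial.
    for name in name_iterable:
        groups.setdefault(group_key(name), []).append(name)
    # Step 2: sort each group separately, using the key function directly
    # instead of building and sorting (key, name) pairs by hand.
    return {initial: tuple(sorted(members, key=sort_key))
            for initial, members in groups.items()}

result = groupnames_grouping_first(
    ["David Ascher", "Brett Cannon", "Mary Adams", "Alan M Turing"])
# result["A"] is ('Mary Adams', 'David Ascher')
```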

See Also

Recipe 19.21; Library Reference (Python 2.4) docs on module itertools.