Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] نسخه متنی

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] - نسخه متنی

David Ascher, Alex Martelli, Anna Ravenscroft

| نمايش فراداده ، افزودن یک نقد و بررسی
افزودن به کتابخانه شخصی
ارسال به دوستان
جستجو در متن کتاب
بیشتر
تنظیمات قلم

فونت

اندازه قلم

+ - پیش فرض

حالت نمایش

روز نیمروز شب
جستجو در لغت نامه
بیشتر
لیست موضوعات
توضیحات
افزودن یادداشت جدید


Recipe 5.5. Sorting Strings with Embedded Numbers


Credit: Sébastien Keim, Chui Tey, Alex
Martelli


Problem




You need to sort a list of strings that
contain substrings of digits (e.g., a list of postal addresses) in an
order that looks good. For example, ''foo2.txt''
should come before ''foo10.txt''. However,
Python''s default string comparison is alphabetical,
so, by default, ''foo10.txt'' instead comes before
''foo2.txt''.


Solution


You need to split each string into sequences of digits and nondigits,
and transform each sequence of digits into a number. This gives you a
list that is just the right comparison key for the sort you want, and
you can then use DSU for the sort itselfthat is, code two
functions, shorter than this description:

import re
re_digits = re.compile(r''(\d+)'')
def embedded_numbers(s):
pieces = re_digits.split(s) # split into digits/nondigits
pieces[1::2] = map(int, pieces[1::2]) # turn digits into numbers
return pieces
def sort_strings_with_embedded_numbers(alist):
aux = [ (embedded_numbers(s), s) for s in alist ]
aux.sort( )
return [ s for _ _, s in aux ]# convention: _ _ means "ignore"

In Python 2.4, use the native support for DSU, with the same function
embedded_numbers to get the sort key:

def sort_strings_with_embedded_numbers(alist):
return sorted(alist, key=embedded_numbers)


Discussion


Say you have an unsorted list of filenames, such as:

files = ''file3.txt file11.txt file7.txt file4.txt file15.txt''.split( )

If you just sort and print this list, for example in Python 2.4 with
print '' ''.join(sorted(files)),
your output looks like file11.txt file15.txt
file3.txt
file4.txt file7.txt, since, by
default, strings are sorted alphabetically (to use a fancier word,
the sort order is described as
lexicographical). Python cannot just guess
that you mean to treat in a different way those substrings that
happen to be made of digits; you have to tell Python precisely what
you want, and this recipe shows how.

Using this recipe, you can get a nicer-looking result:

print '' ''.join(sort_strings_with_embedded_numbers(files))

The output is now file3.txt file4.txt file7.txt file11.txt
file15.txt
, which is probably just what you want in this
case.

The implementation relies on
the DSU idiom. We need to code DSU explicitly if we want to support
Python 2.3, while if our code is Python 2.4-only, we just rely on the
native implementation of DSU. We do so by passing an argument named
key (a function to be called on each item
to get the right comparison key for the sort) to the new built-in
function sorted.

Function embedded_numbers in the recipe is how we
get the right comparison key for each item: a list alternating
substrings of nondigits, and the int obtained from
each substring of digits. re_digits.split(s) gives
us a list of alternating substrings of nondigits and digits (with the
substrings of digits at odd-numbered indices); then, we use built-in
functions map and int (and
extended-form slices that get and set all items at odd-numbered
indices) to turn sequences of digits into integers. Lexicographical
comparison on this list of mixed types now produces just the right
result.


See Also


Library Reference and Python in a
Nutshell
docs about extended slicing and about module
re; Python 2.4 Library
Reference
about the sorted built-in
function and the key argument to
sort and sorted; Recipe 5.3; Recipe 5.2.

/ 394