Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] نسخه متنی

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

Python Cookbook 2Nd Edition Jun 1002005 [Electronic resources] - نسخه متنی

David Ascher, Alex Martelli, Anna Ravenscroft

| نمايش فراداده ، افزودن یک نقد و بررسی
افزودن به کتابخانه شخصی
ارسال به دوستان
جستجو در متن کتاب
بیشتر
تنظیمات قلم

فونت

اندازه قلم

+ - پیش فرض

حالت نمایش

روز نیمروز شب
جستجو در لغت نامه
بیشتر
لیست موضوعات
توضیحات
افزودن یادداشت جدید







Recipe 4.6. Flattening a Nested Sequence


Credit: Luther Blissett, Holger Krekel, Hemanth Sethuram,
ParzAspen
Aspen


Problem


Some of the items in a sequence may
in turn be sub-sequences, and so on, to arbitrary depth of
"nesting". You need to loop over a
"flattened" sequence,
"expanding" each sub-sequence into
a single, flat sequence of scalar items. (A
scalar, or atom, is
anything that is not a sequencei.e., a
leaf, if you think of the nested sequence as a
tree.)


Solution


We need to be able to tell which of the
elements we're handling are
"subsequences" to be
"expanded" and which are
"scalars" to be yielded as is. For
generality, we can take an argument that's a
predicate to tell us what
items we are to expand. (A predicate is a
function that we can call on any element and that returns a truth
value: in this case, true if the element is a
subsequence we are to expand, False otherwise.) By
default, we can arbitrarily say that every list or
tuple is to be
"expanded", and nothing else. Then,
a recursive generator offers the simplest solution:

def list_or_tuple(x):
return isinstance(x, (list, tuple))
def flatten(sequence, to_expand=list_or_tuple):
for item in sequence:
if to_expand(item):
for subitem in flatten(item, to_expand):
yield subitem
else:
yield item


Discussion


Flattening a nested sequence, or, equivalently,
"walking" sequentially over all the
leaves of a "tree", is a common
task in many kinds of applications. You start with a nested
structure, with items grouped into sequences and subsequences, and,
for some purposes, you don't care about the
structure at all. You just want to deal with the items, one after the
other. For example,

for x in flatten([1, 2, [3, [  ], 4, [5, 6], 7, [8,], ], 9]):
print x,

emits 1 2 3 4 5 6 7 8 9.

The only problem with this common task is that, in the general case,
determining what is to be
"expanded", and what is to be
yielded as a scalar, is not as obvious as it might seem. So, I ducked
that decision, delegating it to a callable predicate argument that
the caller can pass to flatten, unless the caller
accepts flatten's somewhat
simplistic default behavior of expanding just tuples and lists.

In the same module as flatten, we should also supply
another predicate that a caller might well want to usea
predicate that will expand just about any iterable
except strings (plain and Unicode). Strings are
iterable, but almost invariably applications want to treat them as
scalars, not as subsequences.

To identify whether an object is
iterable, we just need to try calling the built-in
iter on that object: the call raises
TypeError if the object is not iterable. To
identify whether an object is string-like, we simply check whether
the object is an instance of basestring, since
isinstance(obj, basestring) is
true when obj is an
instance of any subclass of
basestringthat is, any string-like type.
So, the alternative predicate is not hard to code:

def nonstring_iterable(obj):
try: iter(obj)
except TypeError: return False
else: return not isinstance(obj, basestring)

Now the caller may choose to call flatten(seq,
nonstring_iterable)
when the need is to expand any iterable
that is not a string. It is surely better not to
make the nonstring_iterable predicate the default
for flatten, though: in a simple case, such as the
example snippet we showed previously, flatten can be
up to three times slower when the predicate is
nonstring_iterable rather than
list_or_tuple.

We can also write a nonrecursive version of generator
flatten. Such a version lets you flatten nested
sequences with nesting levels higher than Python's
recursion limit, which normally allows no more than a few thousand
levels of recursion depth. The main technique for recursion removal
is to keep an explicit last-in, first-out (LIFO) stack, which, in
this case, we can implement with a list of iterators:

def flatten(sequence, to_expand=list_or_tuple):
iterators = [ iter(sequence) ]
while iterators:
# loop on the currently most-nested (last) iterator
for item in iterators[-1]:
if to_expand(item):
# subsequence found, go loop on iterator on subsequence
iterators.append(iter(item))
break
else:
yield item
else:
# most-nested iterator exhausted, go back, loop on its parent
iterators.pop( )

The if clause of the if
statement executes for any item we are to expandthat is, any
subsequence on which we must loop; so in that clause, we push an
iterator for the subsequence to the end of the stack, then execute a
break to terminate the for, and
go back to the outer while, which will in turn
execute a new for statement on the iterator we
just appended to the stack. The else clause of the
if statement executes for any item we
don't expand, and it just yields
the item.

The else clause of the for
statement executes if no break statement
interrupts the for loopin other words, when
the for loop runs to completion, exhausting the
currently most-nested iterator. So, in that else
clause, we remove the now-exhausted most-nested (last) iterator, and
the outer while loop proceeds, either terminating
if no iterators are left on the stack, or executing a new
for statement that continues the loop on the
iterator that's back at the top of the
stackfrom wherever that iterator had last left off,
intrinsically, because an iterator's job is exactly
to remember iteration state.

The results of this nonrecursive implementation of
flatten are identical to those of the simpler
recursive version given in this recipe's Solution.
If you think non-recursive implementations are faster than recursive
ones, though, you may be disappointed: according to my measurements,
the nonrecursive version is about 10% slower
than the recursive one, across a range of cases.


See Also


Library Reference and Python in a
Nutshell
sections on sequence types and built-ins
iter, isinstance, and
basestring.


/ 394