Recipe 19.6. Dividing an Iterable into n Slices of Stride n
Credit: Gyro Funch, Alex Martelli
Problem
You have an iterable p and need to get the
n non-overlapping extended slices of
stride n, which, if the iterable were a
sequence supporting extended slicing, would be
p[0::n],
p[1::n],
and so on up to
p[n-1::n].
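To make the target concrete, here is a small sketch of those slices taken from an ordinary list. The sample data is purely illustrative, and the snippet is written for Python 3 (where range takes the place of xrange).

```python
# Illustrative data only: a list supports extended slicing directly.
p = list(range(7))   # [0, 1, 2, 3, 4, 5, 6]
n = 3

# The n non-overlapping slices of stride n: p[0::n], p[1::n], ..., p[n-1::n]
slices = [p[i::n] for i in range(n)]
print(slices)  # [[0, 3, 6], [1, 4], [2, 5]]
```

Every item of p lands in exactly one slice, and the slices differ in length by at most one item.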
Solution
While extended slicing would return sequences of the same type we
start with, it's much more sensible to specify a
strider function that, instead, solves this
problem by returning a list of lists:
def strider(p, n):
    """ Split an iterable p into a list of n sublists, repeatedly taking
        the next element of p and adding it to the next sublist.  Example:
        >>> strider('abcde', 3)
        [['a', 'd'], ['b', 'e'], ['c']]
        In other words, strider's result is equal to:
            [list(p[i::n]) for i in xrange(n)]
        if iterable p is a sequence supporting extended-slicing syntax.
    """
    # First, prepare the result, a list of n separate lists
    result = [ [ ] for x in xrange(n) ]
    # Loop over the input, appending each item to one of
    # result's lists, in "round robin" fashion
    for i, item in enumerate(p):
        result[i % n].append(item)
    return result
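As a rough check of the docstring's claim, the sketch below ports strider to Python 3 (the only assumed change is range in place of xrange) and feeds it a generator, which offers no slicing support at all. The helper name letters is made up for this example.

```python
def strider(p, n):
    """Python 3 port of the recipe's strider (range instead of xrange)."""
    result = [[] for _ in range(n)]
    for i, item in enumerate(p):
        result[i % n].append(item)
    return result

def letters():
    # Hypothetical one-shot iterable: a generator cannot be sliced.
    for ch in 'abcde':
        yield ch

from_generator = strider(letters(), 3)
from_slices = [list('abcde'[i::3]) for i in range(3)]
print(from_generator)                 # [['a', 'd'], ['b', 'e'], ['c']]
print(from_generator == from_slices)  # True

# The slicing formulation needs a real sequence; a generator raises TypeError.
try:
    letters()[0::3]
except TypeError:
    print("a generator does not support slicing")
```

The round-robin loop only ever asks its input for the next item, which is why it works on any iterable that terminates.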
Discussion
The function in this recipe takes an iterable
p and pulls it apart into a user-defined
number n of pieces (specifically, function
strider returns a list of sublists), distributing
p's items into what would
be the n extended slices of stride
n if p were a
sequence.

If we were willing to sacrifice generality, forcing argument
p to be a sequence supporting extended
slicing, rather than a generic iterable, we could use a very
different approach, as the docstring of strider
indicates:

def strider1(p, n):
    return [list(p[i::n]) for i in xrange(n)]

Depending on our exact needs, with such a strong constraint on
p, we might omit the
list call that turns each subsequence into a list,
and/or code a generator to avoid consuming extra memory to
materialize the whole list of results at once:

def strider2(p, n):
    for i in xrange(n):
        yield p[i::n]

or, equivalently:

import itertools
def strider3(p, n):
    return itertools.imap(lambda i: p[i::n], xrange(n))

or, in Python 2.4, with a generator expression:

def strider4(p, n):
    return (p[i::n] for i in xrange(n))

However, none of these alternatives accepts a generic iterable as
p; each demands a full-fledged
sequence.

Back to this recipe's exact specs, the best way to
enhance the recipe is to recode it to avoid low-level fiddling with
indices. While doing arithmetic on indices is conceptually quite
simple, it can get messy and indeed is notoriously error prone. We
can do better by a generous application of module
itertools from the Python Standard Library:
import itertools
def strider5(p, n):
    result = [ [ ] for x in itertools.repeat(0, n) ]
    resiter = itertools.cycle(result)
    for item, sublist in itertools.izip(p, resiter):
        sublist.append(item)
    return result

This strider5 version uses three functions from
module itertools. All of the functions in
module itertools return iterable objects, and, as
we see in this case, their results are therefore typically used in
for loops. Function repeat
yields an object, repeatedly, a given number of times, and here we
use it instead of the built-in function xrange to
control the list comprehension that builds the initial value for
result. Function cycle takes an
iterable object and returns an iterator that walks over that iterable
object repeatedly and cyclically; in other words,
cycle performs exactly the round-robin effect that
we need in this recipe. Function izip is
essentially like the built-in function zip, except
that it returns an iterator and thus avoids the memory-consumption
overhead that zip incurs by building its whole
result list in memory at once.

This version achieves deep elegance and conceptual simplicity
(although you may need to gain some familiarity with
itertools before you agree that this version is
simple!) by forgoing all index arithmetic and leaving all of the
handling of the round-robin issues to
itertools.cycle. resiter, per se,
is a nonterminating iterator, but the function deals effortlessly
with that. Specifically, since we use resiter
together with p as arguments to
izip, termination is assured (assuming, of course,
that p does terminate!) by the semantics
of izip, which, just like built-in function
zip, stops iterating as soon as any one of its
arguments is exhausted.
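For readers on Python 3, the following is a minimal sketch of the same idea rather than the recipe's own code. It assumes that the built-in zip is already lazy there, so it stands in for itertools.izip, and the name strider5_py3 is invented for this example. It also exercises the termination behavior just described, using a one-shot iterator as input.

```python
import itertools

def strider5_py3(p, n):
    """Round-robin split of iterable p into n sublists (Python 3 sketch)."""
    result = [[] for _ in itertools.repeat(0, n)]
    # cycle() never terminates by itself ...
    resiter = itertools.cycle(result)
    # ... but zip() stops as soon as p is exhausted.
    for item, sublist in zip(p, resiter):
        sublist.append(item)
    return result

print(strider5_py3(iter('abcde'), 3))  # [['a', 'd'], ['b', 'e'], ['c']]
print(strider5_py3(range(7), 3))       # [[0, 3, 6], [1, 4], [2, 5]]
```

Because cycle hands back the very same list objects held in result, appending through sublist fills result in place; no index ever appears in the loop.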
See Also
The itertools module is part of the Python
Standard Library and is documented in the Library
Reference portion of Python's online
documentation; the Library Reference and
Python in a Nutshell docs about the built-ins
zip and xrange, and
extended-form slicing of sequences.