Recipe 4.8. Transposing Two-Dimensional Arrays
Credit: Steve Holden, Raymond Hettinger, Attila
Vàsàrhelyi, Chris Perkins
Problem
You need to transpose a list
of lists, turning rows into columns and vice versa.
Solution
You must start with a list whose items are lists all of the same
length, such as:
arr = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]A list comprehension offers a simple, handy way to transpose such a
two-dimensional array:
print [[r[col] for r in arr] for col in range(len(arr[0]))]A faster though more obscure alternative (with exactly the same
[[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]
output) can be obtained by exploiting built-in function
zip in a slightly strange way:
print map(list, zip(*arr))
Discussion
This recipe shows a concise yet clear way to turn rows into columns,
and also a faster though more obscure way. List comprehensions work
well when you want to be clear yet concise, while the alternative
solution exploits the built-in function zip in a
way that is definitely not obvious.
Sometimes data just comes at you the wrong
way. For instance, if you use Microsoft's ActiveX
Data Ojbects (ADO) database interface, due to array element-ordering
differences between Python and Microsoft's preferred
implementation language (Visual Basic), the
Getrows method actually appears to return database
columns in Python, despite the
method's name. This recipe's two
solutions to this common kind of problem let you choose between
clarity and speed.In the list comprehension solution, the inner comprehension varies
what is selected from (the row), while the outer comprehension varies
the selector (the column). This process achieves the required
transposition.In the zip-based solution, we use the
*a syntax to pass each item (row) of
arr to zip, in order, as a
separate positional argument. zip returns a list
of tuples, which directly achieves the required transposition; we
then apply list to each tuple, via the single call
to map, to obtain a list of lists, as required.
Since we don't use
zip's result as a list directly,
we could get a further slight improvement in performance by using
itertools.izip instead (because
izip does not materialize its result as a list in
memory, but rather yields it one item at a time):
import itertoolsbut, in this specific case, the slight speed increase is probably not
print map(list, itertools.izip(*arr))
worth the added complexity.
The *args and **kwds Syntax
*args (actually, * followed by
any identifiermost usually, you'll see
args or a as the identifier
that's used) is Python syntax for accepting or
passing arbitrary positional arguments. When you
receive arguments with this syntax (i.e., when you place the star
syntax within a function's signature, in the
def statement for that function), Python binds the
identifier to a tuple that holds all positional arguments not
"explicitly" received. When you
pass arguments with this syntax, the identifier can be bound to any
iterable (in fact, it could be any expression, not necessarily an
identifier, as long as the expression's result is an
iterable).
**kwds (again, the identifier is arbitrary, most
often kwds or k) is Python syntax
for accepting or passing arbitrary named
arguments. (Python sometimes calls named arguments keyword
arguments, which they most definitely are
notjust try to use
as argument name a keyword, such as pass,
for, or yield, and
you'll see. Unfortunately, this confusing
terminology is, by now, ingrained in the language and its culture.)
When you receive arguments with this syntax (i.e., when you place the
starstar syntax within a function's signature, in
the def statement for that function), Python binds
the identifier to a dict, which holds all named
arguments not "explicitly"
received. When you pass arguments with this syntax, the identifier
must be bound to a dict (in fact, it could be any
expression, not necessarily an identifier, as long as the
expression's result is a dict).Whether in defining a function or in calling it, make sure that both
*a and **k come
after any other parameters or arguments. If both
forms appear, then place the **k after the
*a.If you're transposing large arrays of numbers,
consider Numeric Python and other third-party packages. Numeric
Python defines transposition and other axis-swinging routines that
will make your head spin.
See Also
The Reference Manual and Python in
a Nutshell sections on list displays (the other name for
list comprehensions) and on the *a and
*k notation for positional and named argument
passing; built-in functions zip and
map; Numeric Python (http://www.pfdubois.com/numpy/).