Recipe 4.7. Removing or Reordering Columns in a List of Rows
Credit: Jason Whitlark
Problem
You have a list of
lists (rows) and need to get another list of the same rows but with
some columns removed and/or reordered.
Solution
A list comprehension works well for this task. Say you
have:
listOfRows = [ [1,2,3,4], [5,6,7,8], [9,10,11,12] ]
You want a list with the same rows but with the second of the four
columns removed and the third and fourth ones interchanged. A simple
list comprehension that performs this job is:
newList = [ [row[0], row[3], row[2]] for row in listOfRows ]
An alternative way of coding, that is at least as practical and
arguably a bit more elegant, is to use an auxiliary sequence (meaning
a list or tuple) that has the column indices you desire in their
proper order. Then, you can nest an inner list comprehension that
loops on the auxiliary sequence inside the outer list comprehension
that loops on listOfRows:
newList = [ [row[ci] for ci in (0, 3, 2)] for row in listOfRows ]
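As a quick check, the following sketch runs both forms from this Solution on the sample data and confirms that they give the same result without touching the original list. The names by_position, by_index_sequence, and column_order are illustrative only and do not come from the recipe.

```python
# Sample data from the Solution: three rows of four columns each.
listOfRows = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

# Form 1: spell out the wanted columns explicitly in each new row.
by_position = [[row[0], row[3], row[2]] for row in listOfRows]

# Form 2: drive an inner comprehension from a sequence of column indices.
column_order = (0, 3, 2)  # illustrative name, not part of the recipe
by_index_sequence = [[row[ci] for ci in column_order] for row in listOfRows]

print(by_position)                        # [[1, 4, 3], [5, 8, 7], [9, 12, 11]]
print(by_position == by_index_sequence)   # True
print(listOfRows[0])                      # [1, 2, 3, 4], the original rows are untouched
```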
Discussion
I often use lists of lists to represent two-dimensional arrays. I
think of such lists as having the
"rows" of a
"two-dimensional array" as their
items. I often perform manipulation on the
"columns" of such a
"two-dimensional array", typically
reordering some columns, sometimes omitting some of the original
columns. It is not obvious (at least, it was not immediately obvious
to me) that list comprehensions are just as useful for this purpose
as they are for other kinds of sequence-manipulation tasks.
A list comprehension builds a new list, rather than altering an
existing one. But even when you do need to alter the existing list in
place, the best approach is to write a list comprehension and assign
it to the existing list's contents. For example, if
you needed to alter listOfRows in place,
for the example given in this recipe's Solution, you
would code:
listOfRows[:] = [ [row[0], row[3], row[2]] for row in listOfRows ]
Do consider, as suggested in the second example in this
recipe's Solution, the possibility of using an
auxiliary sequence to hold the column indices you desire, in the
order in which you desire them, rather than explicitly hard-coding
the list display as we did in the first example. You might feel a
little queasy about nesting two list comprehensions into each other
in this fashion, but it's simpler and safer than you
might fear. If you adopt this approach, you gain some potential
generality, because you can choose to give a name to the auxiliary
sequence of indices, use it to reorder several lists of rows in the
same fashion, pass it as an argument to a function, whatever:
def pick_and_reorder_columns(listofRows, column_indexes):
    return [ [row[ci] for ci in column_indexes] for row in listofRows ]
columns = 0, 3, 2
newListOfPandas = pick_and_reorder_columns(oldListOfPandas, columns)
newListOfCats = pick_and_reorder_columns(oldListOfCats, columns)
This example performs just the same column reordering and selection
as all the other snippets in this recipe, but it performs the
operation on two separate "old"
lists, obtaining from each the corresponding
"new" list. Reaching for excessive
generalization is a pernicious temptation, but here, with this
pick_and_reorder_columns function, it seems that we
are probably getting just the right amount of generality.
One last note: some people prefer a fancier way to express the kinds
of list comprehensions that are used as
"inner" ones in some of the
functions used previously. Instead of coding them straightforwardly,
as in:
[row[ci] for ci in column_indexes]
they prefer to use the built-in function map, and
the special method __getitem__ of
row used as a bound method, to perform the
indexing subtask, so they code instead:
map(row.__getitem__, column_indexes)
Depending on the exact version of Python,
perhaps this fancy and somewhat obscure way may be slightly faster.
Nevertheless, I think the greater simplicity of the list
comprehension form means the list comprehension is still the best
way.
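The in-place assignment and the map-based variant discussed above can be sketched together as follows. This is only an illustration under one assumption: it targets Python 3, where map returns an iterator rather than a list, so the result is wrapped in list() to make it comparable with the list comprehension. The names alias, via_comprehension, and via_map are made up for the example.

```python
listOfRows = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
alias = listOfRows      # a second reference to the same list object
columns = (0, 3, 2)

# Slice assignment replaces the contents of the existing list object,
# so every reference to that object sees the reordered rows.
listOfRows[:] = [[row[ci] for ci in columns] for row in listOfRows]
print(alias is listOfRows)   # True
print(alias)                 # [[1, 4, 3], [5, 8, 7], [9, 12, 11]]

# The two spellings of the "inner" step, applied to a single row.
row = [5, 6, 7, 8]
via_comprehension = [row[ci] for ci in columns]
via_map = list(map(row.__getitem__, columns))  # list() assumes Python 3
print(via_comprehension == via_map)            # True
```

Plain rebinding (listOfRows = [...]) would instead leave alias pointing at the old, unmodified rows, which is why the slice form matters when other code holds a reference to the list.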
See Also
List comprehension docs in Language Reference
and Python in a Nutshell.