Recipe 13.7. Unpacking a Multipart MIME Message
Credit: Matthew Cowles
Problem
You want to unpack a multipart MIME
message.
Solution
The walk method of message objects generated by
the email package makes this task really easy.
Here is a script that uses email to solve the task
posed in the "Problem":
import email.Parser
import os, sys

def main():
    if len(sys.argv) != 2:
        print "Usage: %s filename" % os.path.basename(sys.argv[0])
        sys.exit(1)
    mailFile = open(sys.argv[1], "rb")
    p = email.Parser.Parser()
    msg = p.parse(mailFile)
    mailFile.close()
    partCounter = 1
    for part in msg.walk():
        # Skip multipart containers: they only hold the real parts together
        if part.get_main_type() == "multipart":
            continue
        name = part.get_param("name")
        if name is None:
            name = "part-%i" % partCounter
        partCounter += 1
        # In real life, make sure that name is a reasonable filename
        # for your OS; otherwise, mangle that name until it is!
        f = open(name, "wb")
        f.write(part.get_payload(decode=1))
        f.close()
        print name

if __name__ == "__main__":
    main()
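The script above targets the Python 2 releases that were current when this recipe was written. As a rough sketch only, the same approach might look like this on Python 3. It assumes the modern names email.message_from_binary_file and get_content_maintype. The safe_filename helper, the unpack function and its directory argument are hypothetical names introduced here for illustration, and the character whitelist is just one possible way to "mangle" a name.

```python
import email
import email.utils
import os
import re


def safe_filename(name, fallback):
    """Hypothetical helper: reduce an untrusted name to a tame basename."""
    # Drop any directory components, whether written with / or \
    base = name.replace("\\", "/").split("/")[-1]
    # Keep only a conservative set of characters (an assumption, not a standard)
    base = re.sub(r"[^A-Za-z0-9._-]", "_", base)
    # Refuse empty names and names made only of dots ("." or "..")
    if not base.strip("."):
        return fallback
    return base


def unpack(msg, directory="."):
    """Write each non-container part of msg into directory; return the names used."""
    written = []
    counter = 1
    for part in msg.walk():
        # Multipart containers are only glue, as in the recipe
        if part.get_content_maintype() == "multipart":
            continue
        payload = part.get_payload(decode=True)
        # Other containers (e.g., message/rfc822) have no payload of their own
        if payload is None:
            continue
        fallback = "part-%i" % counter
        counter += 1
        raw_name = part.get_param("name")
        if raw_name is None:
            name = fallback
        else:
            # get_param may return an RFC 2231 tuple; collapse it to a string
            name = safe_filename(email.utils.collapse_rfc2231_value(raw_name),
                                 fallback)
        with open(os.path.join(directory, name), "wb") as f:
            f.write(payload)
        written.append(name)
    return written


def main(argv):
    if len(argv) != 2:
        print("Usage: %s filename" % os.path.basename(argv[0]))
        return 1
    with open(argv[1], "rb") as mail_file:
        msg = email.message_from_binary_file(mail_file)
    for name in unpack(msg):
        print(name)
    return 0
```

To run this sketch as a script, you could end the file with a call such as sys.exit(main(sys.argv)) after importing sys. Keeping the unpacking logic in its own function makes it easy to try on messages built in memory.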
Discussion
The email package makes parsing MIME messages
reasonably easy. This recipe shows how to unbundle a
MIME message with the email
package by using the walk method of message
objects.

You can create a message object in several ways. For example, you can
instantiate the email.Message.Message class and
build the message object's contents with calls to
its methods. In this recipe, however, I need to read and analyze an
existing message, so I work the other way around, calling the
parse method of an
email.Parser.Parser instance. The
parse method takes as its only argument a
file-like object (in the recipe, I pass it a real file object that I
just opened for binary reading with the built-in
open function) and returns a message object, on
which you can call message object methods.

The walk method is a generator (i.e., it returns
an iterator object on which you can loop with a
for statement). You usually will use this method
exactly as I use it in this recipe:
    for part in msg.walk():

The iterator sequentially returns (depth-first, in case of nesting)
the parts that make up the message. If the message is not a container
of parts (i.e., it has no attachments or
alternates, so message.is_multipart returns
false), no problem: the walk method then
returns an iterator with a single element, the message itself. In
any case, each element of the iterator is also a message object (an
instance of email.Message.Message), so you can
call on it any of the methods that a message object supplies.

In a multipart message, parts with a type of
'multipart/something' (i.e., a main type of
'multipart') may be present. In this recipe, I
skip them explicitly since they're just glue holding
the true parts together. I use the get_main_type
method (named get_content_maintype in later releases of the email
package) to obtain the main type and check it for equality with
'multipart'; if equality holds, I skip this part
and move to the next one with a continue
statement. When I know I have a real part in hand, I locate its name
(or synthesize one if it has no name), open that name as a file, and
write the part's contents (also known as its payload), which I
get by calling the get_payload method, into the
file. I use the decode=1 argument to ensure that
the payload is decoded back to its binary content (e.g., an image, a
sound file, a movie) if it was encoded for transport (e.g., as base64
or quoted-printable), rather than remaining in encoded text form.
If the payload is not encoded, decode=1 is
innocuous, so I don't have to check before I pass
it.
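
To make the traversal order and the effect of decoding concrete, here is a small sketch for Python 3, so it uses get_content_type and decode=True rather than the older spellings in the recipe. It builds a nested message in memory with the email.mime classes instead of reading a file. The content_types helper and the sample message contents are made up for illustration.

```python
import email
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def content_types(msg):
    """Return the content type of every part, in the order walk yields them."""
    return [part.get_content_type() for part in msg.walk()]


# A nested message: mixed(alternative(plain, html), octet-stream)
alternative = MIMEMultipart("alternative")
alternative.attach(MIMEText("plain body", "plain"))
alternative.attach(MIMEText("<p>html body</p>", "html"))
mixed = MIMEMultipart("mixed")
mixed.attach(alternative)
mixed.attach(MIMEApplication(b"\x89PNG\r\n", name="tiny.bin"))

# Round-trip through bytes so that we walk a parsed message, as in the recipe
nested = email.message_from_bytes(mixed.as_bytes())

# A message that is not a container of parts
simple = email.message_from_string("Subject: hi\n\njust text\n")

# Containers come first, then their children, depth-first:
# ['multipart/mixed', 'multipart/alternative', 'text/plain',
#  'text/html', 'application/octet-stream']
print(content_types(nested))

# A non-multipart message yields only itself: ['text/plain'] False
print(content_types(simple), simple.is_multipart())

# The attachment travels as base64 text; decode=True restores the raw bytes
attachment = list(nested.walk())[-1]
print(attachment["Content-Transfer-Encoding"])
print(attachment.get_payload(decode=True))

# With no transfer encoding, decode=True just hands the body back as bytes
print(simple.get_payload(decode=True))
```

The two multipart entries at the head of the first list are the glue parts that the recipe skips with its continue statement.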
See Also
Recipe 13.6; documentation
for the standard library package email in the
Library Reference.