Credit: Dinu Gherman, Dan Wolfe
You're running on a reasonably recent version of Mac OS X (version 10.3 "Panther" or later), and you need to know the number of pages in a PDF document.
The PDF format and Python are both natively integrated with Mac OS X (10.3 or later), and this allows a rather simple solution:
    #!/usr/bin/python
    # Use the system Python: the CoreGraphics module ships with it.
    import CoreGraphics

    def pageCount(pdfPath):
        "Return the number of pages for the PDF document at the given path."
        pdf = CoreGraphics.CGPDFDocumentCreateWithProvider(
            CoreGraphics.CGDataProviderCreateWithFilename(pdfPath)
        )
        return pdf.getNumberOfPages()

    if __name__ == '__main__':
        import sys
        # Print one page count per PDF path given on the command line.
        for path in sys.argv[1:]:
            print pageCount(path)
A reasonable alternative to this recipe might be to use the PyObjC Python extension, which (among other wonders) lets Python code reuse all the power in the Foundation and AppKit frameworks that come with Mac OS X. Such a choice would let you write a Python script that is also able to run on older versions of Mac OS X, such as 10.2 Jaguar. However, relying on Mac OS X 10.3 or later ensures we can use the Python installation that is integrated as a part of the operating system, as well as such goodies as the CoreGraphics Python extension module (also part of Mac OS X "Panther") that lets your Python code reuse Apple's excellent Quartz graphics engine directly.
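As a rough illustration of the PyObjC route mentioned above, here is a minimal sketch. It assumes that PyObjC is installed and that AppKit's NSPDFImageRep class is an acceptable way to read the page count. The helper names page_count_appkit and report are hypothetical and are not part of the original recipe. The PyObjC imports are done inside the function so that the module still loads on machines without PyObjC.

```python
from __future__ import print_function

import sys


def page_count_appkit(pdf_path):
    """Return the page count of the PDF at pdf_path, or None if unreadable."""
    # Imported lazily (an assumption of this sketch) so that the module
    # can still be loaded where PyObjC is not available.
    from AppKit import NSPDFImageRep
    from Foundation import NSData

    data = NSData.dataWithContentsOfFile_(pdf_path)
    if data is None:
        return None  # file is missing or could not be read
    rep = NSPDFImageRep.imageRepWithData_(data)
    if rep is None:
        return None  # data could not be interpreted as a PDF
    return int(rep.pageCount())


def report(paths, counter=page_count_appkit):
    """Build one 'path: count' line per path, using the given counter."""
    lines = []
    for path in paths:
        count = counter(path)
        shown = "unreadable" if count is None else count
        lines.append("%s: %s" % (path, shown))
    return lines


if __name__ == "__main__":
    for line in report(sys.argv[1:]):
        print(line)
```

Saved as a script, this would be run with one or more PDF paths as arguments, just like the CoreGraphics version. Passing the counting function to report as a parameter is only a convenience of this sketch: it lets the reporting logic be exercised with a stand-in counter on systems where AppKit is unavailable.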
PyObjC is at http://pyobjc.sourceforge.net/; information on the CoreGraphics module is at http://www.macdevcenter.com/pub/a/mac/2004/03/19/core_graphicsl.