Word Hacks [Electronic resources]

Andrew Savikas


Hack 85 Hack Word from Python

Use Word from Python to create attractive printouts of HTML documents on the fly.

Python is a powerful scripting language, and its use on Windows systems as both a development and an administration tool has increased. This hack shows you how to use Python to import an HTML document into Word, tweak the formatting, save the document in native Word format, and print it to the default printer. This hack assumes you have a file named C:\resumel on your system. It also assumes you have Python installed on your system and that you can run Python scripts from the DOS command line.

To download Python (for free), go to http://www.python.org. For detailed information on using Python on Windows systems, check out Python Programming on Win32 (O'Reilly).

Because Python supports COM automation [Hack #84], you can access Word from within a Python script. First, you'll need the pywin32 module, which you can download from the SourceForge web site (http://sourceforge.net/project/showfiles.php?group_id=78018).

9.7.1 Hello, Word

Once you've installed the pywin32 module, you can use Word objects from within a Python script. The following script creates a new document, inserts some text, and applies the Heading 1 style to the text:

from win32com.client import Dispatch

def main():
    # Start Word through COM automation
    wrd = Dispatch('Word.Application')
    wrd.Visible = 1  # Word launched via COM is hidden by default
    doc = wrd.Documents.Add()
    rng = doc.Range(0, 0)
    rng.InsertAfter('Hello, Word!')
    rng.Style = 'Heading 1'

if __name__ == '__main__':
    main()

Save this script as C:\HelloFromPython.py and run it from the DOS command line as follows:

> python HelloFromPython.py

As discussed in [Hack #84], Word objects created as COM servers aren't visible by default. You must explicitly set the Visible property to 1 if you want Word to appear onscreen.

9.7.2 Controlling Word Interactively

Python also includes an interactive command-line interpreter, which you launch by typing python at a DOS prompt:

> python

After some informational text is displayed, the prompt changes to indicate that you're in the Python interpreter.

Python 2.3.4 (#53, May 25 2004, 21:17:02) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>

You can now execute Python commands interactively, which is a useful way to experiment with controlling Word because you can see the results in real time. Enter the following sequence of commands after you launch the Python interpreter:

>>> from win32com.client import Dispatch
>>> w = Dispatch('Word.Application')
>>> w.Visible = 1

At this point, a new Word window opens, although Word doesn't create a new, blank document (which makes sense, given that an instance of Word started through COM runs invisibly by default). Though no blank document is created, all the global templates in the Startup folder [Hack #50] are loaded.

With the Python interpreter running and a Word window open, you can actually go back and forth between the two as you fiddle with Word. However, if you modify or remove objects currently referenced from Python within Word (for example, delete a paragraph or close a document), the Python objects may generate errors or become unstable and behave unexpectedly.

Now, create a new, blank Word document and insert a few lines of text with the following code:

>>> doc = w.Documents.Add()
>>> rng = doc.Range()
>>> rng.InsertAfter('To be or not to be - Shakespeare\n')
>>> rng.InsertAfter('Do be do be do be do - Sinatra')

By using the interactive interpreter, you can position the DOS window next to or on top of the Word window and watch your Python commands control Word, as shown in Figure 9-10.

Figure 9-10. Controlling Word from the Python command-line interpreter

To close the document and quit Word, enter the following:

>>> doc.Close()
>>> w.Quit()

Word won't close the document until you choose whether or not to save it. If you run Word invisibly and try the same thing, Word will stay hidden, but its Save As dialog will appear. If you write scripts that run Word invisibly, take care to avoid situations that might launch an unexpected dialog (and probably cause an error in your script). To avoid this particular one, you must either save the document or make Word think you've saved it. The following code shows both scenarios:

>>> doc.SaveAs(r'C:\Documents\Quotes.doc')  # Either save the file...
>>> doc.Saved = 1  # ...or fool Word into thinking it's been saved
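
Windows paths written as Python string literals need some care, because Python treats certain backslash sequences (such as \n and \t) as escape codes. The following minimal sketch compares three spellings of one path. The path is a made-up example chosen to trigger the problem, not a file this hack uses:

```python
# Hypothetical path, used only to show how Python reads backslashes.
plain = 'C:\new\table.doc'      # \n and \t turn into a newline and a tab
raw = r'C:\new\table.doc'       # raw string: backslashes are kept literally
doubled = 'C:\\new\\table.doc'  # doubled backslashes: same text as raw

print(repr(plain))     # shows the newline and tab that crept in
print(raw == doubled)  # the two safe spellings produce identical strings
```

Either the raw-string form or the doubled-backslash form is a safe choice for a path you pass to Word.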

9.7.3 Running the Hack

Word does an excellent job of importing HTML files, especially ones that use simple, standard HTML tags mapped to Word's built-in styles. You can easily translate existing HTML files into a useful printed format by importing them into Word. This process can be automated with Python and COM.

As an example, this hack demonstrates the process with an HTML file you might already have, and which is probably more up-to-date than any print version: your resume.

Again, this hack assumes you have a file named C:\resumel on your system. The code presented below opens Word, opens the file, changes the appearance of the Heading 2 and Hyperlink styles, saves the document, and prints it out to your default printer:

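from win32com.client import Dispatch

MYDIR = 'c:\\'

def main():
    wrd = Dispatch('Word.Application')
    # Remember the user's setting, then turn off the conversion prompt
    confirm = wrd.Options.ConfirmConversions
    wrd.Options.ConfirmConversions = 0
    wrd.Visible = 0
    doc = wrd.Documents.Open(MYDIR + 'resumel')

    # Restyle the headings
    sty = doc.Styles('Heading 2')
    sty.Font.Size = 18
    sty.Font.Italic = 0

    # Restyle hyperlinks: no underline, automatic color, italic
    sty = doc.Styles('Hyperlink')
    sty.Font.Underline = 0      # value of wdUnderlineNone
    sty.Font.Color = -16777216  # value of wdColorAutomatic
    sty.Font.Italic = 1

    doc.SaveAs(FileName=MYDIR + 'resume.doc', FileFormat=0)
    doc.PrintOut()  # the parentheses are needed to call the method
    doc.Close()

    # Restore the user's setting before quitting
    wrd.Options.ConfirmConversions = confirm
    wrd.Quit()

if __name__ == '__main__':
    main()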

Save this script as resumeprinter.py and run it from a DOS command line:

> python resumeprinter.py

A few parts of this script deserve closer attention.

9.7.3.1 Confirming conversions

Select Tools → Options, click the General tab, and check the "Confirm conversion at open" box. With this option checked, Word will prompt you before opening a file not in the .doc format. If this setting is enabled when the script opens the file, one of those unexpected dialogs will appear, even though the script runs Word invisibly. To make sure the resumel file opens without confirming the conversion, this script explicitly sets the ConfirmConversions option to 0 (False). Before doing so, the script stores the current state in a variable named confirm; it then restores that value before it exits.
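
The save-and-restore step can also be wrapped so that the original setting comes back even if something fails partway through. The following is a minimal sketch of that idea, not part of the original script. It assumes a Python version that provides contextlib; the name conversions_unconfirmed and the fake_word stand-in object are hypothetical, introduced only so the example runs without Word:

```python
from contextlib import contextmanager
from types import SimpleNamespace

@contextmanager
def conversions_unconfirmed(app):
    """Turn off ConfirmConversions, restoring the old value afterward."""
    saved = app.Options.ConfirmConversions
    app.Options.ConfirmConversions = 0
    try:
        yield app
    finally:
        # Runs even if the body raises, so the user's setting survives
        app.Options.ConfirmConversions = saved

# Stand-in for the Word Application object (an assumption for this demo)
fake_word = SimpleNamespace(Options=SimpleNamespace(ConfirmConversions=1))

with conversions_unconfirmed(fake_word):
    inside = fake_word.Options.ConfirmConversions  # 0 while the block runs
after = fake_word.Options.ConfirmConversions       # restored to 1
```

In a real script, you would pass the object returned by Dispatch('Word.Application') in place of fake_word and do the opening, styling, and printing inside the with block.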

9.7.3.2 Word constants

This Python script doesn't have access to Word's constants (such as wdUnderlineNone and wdColorAutomatic) via COM. You must use their actual values, as this script does for the Underline and Color properties of the Hyperlink style. To get the value of a constant, query its value in the Immediate window [Hack #2] in the Visual Basic Editor, as shown in Figure 9-11.

Figure 9-11. Getting a constant's value by using the Immediate window in the Visual Basic Editor
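
Once you have looked up the values, one way to keep a script readable is to assign them to Python variables that reuse Word's names. The sketch below does this for the three values used in resumeprinter.py. These are ordinary Python variables defined by hand, not something supplied by Word or pywin32:

```python
# Values of the Word constants used in resumeprinter.py,
# defined by hand as plain Python variables
wdUnderlineNone = 0
wdColorAutomatic = -16777216
wdFormatDocument = 0

# The odd-looking color value is the signed 32-bit reading of 0xFF000000
print(hex(wdColorAutomatic & 0xFFFFFFFF))
```

With these in place, a line such as sty.Font.Color = wdColorAutomatic explains itself without a comment.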

9.7.3.3 Named arguments

When using Word from Python, as with VBA, you can use named arguments, which means you can specify the values for a function or method by keyword. When you don't use named arguments, each value passed as an argument must be in a particular order. For example, the syntax for the MsgBox function in VBA is:

MsgBox(prompt[, buttons] [, title] [, helpfile, context])

If you call this function in VBA without using named arguments, the function expects and interprets the values in the order specified by its syntax. To tell the function to display the prompt "Hello, World" with "Message in a Box" as the dialog's title, but without specifying a button type, insert the following:

MsgBox "Hello, World", , "Message in a Box"

Notice the empty value in between the two commas. It tells Word to use its default value for the buttons argument. If you left out that empty value, Word would try to use "Message in a Box" as the buttons value, which would cause an error. When you use the named-argument syntax in VBA, you can do the same thing in a more readable way, and in any order you choose:

MsgBox Title:="Message in a Box", Prompt:="Hello, World"

Word uses its default settings for any of the arguments not specified. When using Word objects and methods from Python, you can use a similar syntax, as shown in the following line taken from the resumeprinter.py script shown above:

doc.SaveAs(FileName=MYDIR + 'resume.doc', FileFormat=0)

Note that in Python, you don't place a colon before the =, as you would in VBA.
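
To see the same contrast in pure Python, consider the following minimal sketch. The msg_box function is a hypothetical stand-in written for this illustration (it is not VBA's MsgBox and does not display anything), and its default values are placeholders:

```python
# Hypothetical stand-in for VBA's MsgBox, for illustration only
def msg_box(prompt, buttons=0, title='Microsoft Word'):
    return 'prompt=%r buttons=%r title=%r' % (prompt, buttons, title)

# Positional: values are matched to parameters in order, so a value for
# buttons has to be supplied just to reach title
by_position = msg_box('Hello, World', 0, 'Message in a Box')

# Keyword: name the parameters, in any order, and skip buttons entirely
by_keyword = msg_box(title='Message in a Box', prompt='Hello, World')

print(by_position == by_keyword)
```

Unlike VBA, Python has no empty-slot syntax between commas, so keyword arguments are the natural way to skip an optional argument.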

For information on creating Python objects that you can run from within Word using VBA, check out [Hack #88].