Word Hacks [Electronic resources] (text version)


Andrew Savikas

Hack 56 Reduce Document Bloat by Deleting Old List Templates

Long documents and documents that have been
heavily edited can become needlessly bloated by the remnants of lists
long since deleted from the text. This hack shows how to clean out
this cruft.

Every list you create in Word is based on an
internally defined list template. These
templates function like paragraph styles, allowing the properties of
a list to be defined once, then referenced many times later on.

But once it's been created, you
can't remove a list template. Over time, a large
document may accumulate hundreds, or even thousands, of these list
templates. As you might imagine, that can have a negative impact on
both the file's size and its stability.

With Word 2003, the situation is greatly improved: Word caps the
number of inactive list templates in a document at 50, automatically
removing any old, unused templates once that threshold is met.
However, many individuals and offices still use older versions of
Word, which makes their documents susceptible to serious bloating
issues from extraneous list templates.

To see how quickly these list templates can accumulate, try the
following:

Open a new, blank Word document.

With your cursor in the document, alternately click the Bullets
button and the Numbering button a dozen or so times.

Select Tools→Macro→Visual Basic Editor (or press
Alt-F11), type the following in the Immediate window [Hack #2],
and press Enter:

?ActiveDocument.ListTemplates.Count

VBA will report the number of list templates you created (see Figure 6-10). Notice that the number matches the number of
times you clicked the Bullets and Numbering buttons.
That's a lot of list templates for a blank document!


Figure 6-10. Counting the number of list templates in a document

Remember, you can't delete list templates, and only
Word 2003 removes old lists when the number gets above 50. If you use
an older version of Word, however, you can create a hack to help you
clean out your list templates.

As with [Hack #41], you can
convert your document into a format such as RTF and delete anything
you please. The RTF files put all the list templates in one place,
and then use numbers to reference them in the document text. You can
remove any list templates not referenced in the document without
affecting the existing text.
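
To make the numbering scheme concrete, here is a minimal Python sketch of the same idea. It is an illustration only, not part of the hack itself: the function name and the tiny sample string are hypothetical, and the regular expressions assume each override entry lists its \listid before its \ls index, as the entries Word writes typically do. An \ls index that appears only once (in the override table itself) is never referenced by the text, so its list can be removed.

```python
import re

def unused_list_ids(rtf):
    """Return the \\listid values whose \\ls index is never used in the text."""
    # Each override entry pairs a \listidN with an \lsM index.
    overrides = re.findall(
        r"\\listoverride\b[^{}]*?\\listid(-?\d+)[^{}]*?\\ls(\d+)", rtf)
    unused = set()
    for list_id, ls in overrides:
        # Match \lsM not followed by a digit, so \ls1 does not match \ls12.
        uses = len(re.findall(r"\\ls" + ls + r"(?!\d)", rtf))
        # One occurrence is the override table entry itself.
        if uses <= 1:
            unused.add(int(list_id))
    return unused

# Hypothetical, heavily simplified RTF: two lists defined, only one used.
sample = (
    r"{\rtf1{\*\listtable{\list\listid101}{\list\listid202}}"
    r"{\*\listoverridetable"
    r"{\listoverride\listid101\listoverridecount0\ls1}"
    r"{\listoverride\listid202\listoverridecount0\ls2}}"
    r"\pard\ls1 First item\par}"
)
print(unused_list_ids(sample))  # {202}
```

Real RTF written by Word is far messier than this sample, which is why the script in the next section uses a proper tokenizer rather than regular expressions alone.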


The gory details of RTF are beyond the scope of this book. For an
excellent introduction and reference to RTF, check out
O'Reilly's RTF Pocket
Guide.


6.7.1 The Code


The following Perl script cleans out unused list templates from
an RTF file. It uses the RTF::Parser module. If
you're running the ActivePerl distribution for
Windows, you can install RTF::Parser from the Perl
Package Manager. You can also download
RTF::Parser from http://www.cpan.org.

#!/usr/bin/perl
use strict;
use RTF::Parser;

my $file = shift;
die "Please provide an rtf file to parse.\n" unless $file;
open(RTFIN, "< $file") or die "Failed to open $file for reading: $!\n";
my $tokenizer = RTF::Tokenizer->new( file => \*RTFIN );

# Pass 1: collect the \listid and \ls values from the list override table.
my @listoverride;
while( my ( $type, $arg, $param ) = $tokenizer->get_token() ) {
    last if $type eq 'eof';
    if( $type eq 'control' and $arg eq 'listoverridetable' ) {
        my $brace = 1;
        while( $brace > 0 ) {
            my @attr = $tokenizer->get_token();
            $brace++ if $attr[0] eq 'group' and $attr[1] == 1;
            $brace-- if $attr[0] eq 'group' and $attr[1] == 0;
            if( $attr[0] eq 'control'
                and ($attr[1] eq 'listid' or $attr[1] eq 'ls') ) {
                push( @listoverride, $attr[2] );
            }
        }
    }
}
seek(RTFIN, 0, 0);

# Pass 2: map each listid to its ls number, then drop every list that the
# document actually uses. What remains in %list_map is safe to delete.
my %list_map = @listoverride;
for my $key (keys %list_map) {
    my $matches = 0;
    while(<RTFIN>) {
        my @ls = $_ =~ m/\\(ls$list_map{$key})(?:\s|\\|\n|\})/g;
        $matches += scalar(@ls);
    }
    seek(RTFIN, 0, 0);
    # One match is the override table entry itself; more than one
    # means the list is referenced elsewhere in the document.
    if ($matches > 1) {
        delete $list_map{$key};
    }
}
seek(RTFIN, 0, 0);

# Pass 3: copy the file to standard output, skipping the entries in the
# list table and list override table that belong to unused lists.
$tokenizer->read_file( \*RTFIN );
while( my ( $type, $arg, $param ) = $tokenizer->get_token() ) {
    last if $type eq 'eof';
    if( $type eq 'control'
        and ($arg eq 'listoverridetable' or $arg eq 'listtable') ) {
        put( $type, $arg, $param );
        my $brace = 1;
        my @listkeep;
        while( $brace > 0 ) {
            my @attr = $tokenizer->get_token();
            $brace++ if $attr[0] eq 'group' and $attr[1] == 1;
            $brace-- if $attr[0] eq 'group' and $attr[1] == 0;
            my @listitem;
            my $delete = 0;
            push( @listitem, \@attr );
            # Gather every token of one nested entry.
            while( $brace > 1 ) {
                my @attr = $tokenizer->get_token();
                $brace++ if $attr[0] eq 'group' and $attr[1] == 1;
                $brace-- if $attr[0] eq 'group' and $attr[1] == 0;
                if( $attr[0] eq 'control' and $attr[1] eq 'listid' ) {
                    $delete = 1 if( exists $list_map{$attr[2]} );
                }
                push( @listitem, \@attr );
            }
            unless($delete) {
                push( @listkeep, \@listitem );
            }
        }
        # Write out only the entries that are still in use.
        for (@listkeep) {
            for (@$_) {
                put(@$_);
            }
        }
    } else {
        put( $type, $arg, $param );
    }
}
close(RTFIN);

# Print a single token back out as RTF.
sub put {
    my ($type, $arg, $param) = @_;
    if( $type eq 'group' ) {
        print $arg == 1 ? '{' : '}';
    } elsif( $type eq 'control' ) {
        print "\\$arg$param";
    } elsif( $type eq 'text' ) {
        print "\n$arg";
    }
}

Save the script as cleanlists.pl.


6.7.2 Running the Hack


As described earlier, create a new, blank document and alternately
click the Bullets and Numbering buttons a few dozen times. Use VBA to
confirm that the file now contains extra list templates, as
shown in Figure 6-10. Now save the file as RTF and
name it DirtyFile.rtf.

With the script in the same directory as the
DirtyFile.rtf file, enter the following at a DOS
command prompt:

> perl cleanlists.pl DirtyFile.rtf > CleanFile.rtf

Open the new file, CleanFile.rtf, from Word.
Once you're satisfied the script
hasn't altered any existing formatting, you can save
it in .doc format.
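
For a rough check that does not involve opening Word, you could count the list entries in each file and compare. The following Python sketch is one possible way to do that. The function name and the inline sample strings are hypothetical, and it simply counts \list and \listoverride control words rather than fully parsing the RTF.

```python
import re

def count_list_entries(rtf):
    """Return (list definitions, list overrides) found in RTF text."""
    # \list not followed by another letter, so \listid, \listtable,
    # and \listlevel are not counted.
    lists = len(re.findall(r"\\list(?![a-z])", rtf))
    # Likewise, \listoverridetable and \listoverridecount are skipped.
    overrides = len(re.findall(r"\\listoverride(?![a-z])", rtf))
    return lists, overrides

# Hypothetical, simplified stand-ins for the two files' list tables.
dirty = (
    r"{\*\listtable{\list\listid101}{\list\listid202}}"
    r"{\*\listoverridetable"
    r"{\listoverride\listid101\ls1}{\listoverride\listid202\ls2}}"
)
clean = (
    r"{\*\listtable{\list\listid101}}"
    r"{\*\listoverridetable{\listoverride\listid101\ls1}}"
)
print(count_list_entries(dirty))  # (2, 2)
print(count_list_entries(clean))  # (1, 1)
```

To try it on the real files, read each one into a string first (for example with open("DirtyFile.rtf", encoding="latin-1").read()) and pass the result to the function. The counts for CleanFile.rtf should be no higher than those for DirtyFile.rtf.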


Parsing RTF is a complicated task, and RTF files (particularly those
with embedded graphics) can be quite large, so this script may take a
few minutes to run on a lengthy file.

Andy Bruno and Andrew Savikas

