Mastering Perl for Bioinformatics [Electronic resources] (text version)

3.9 How AUTOLOAD Works



The AUTOLOAD mechanism, built into the definition
of Perl packages, is simple to use. If a subroutine named
AUTOLOAD is declared within a package, it is
called whenever an undefined subroutine in that package is called.
AUTOLOAD is a special name that Perl recognizes
only when it is capitalized as shown.
Don't use the subroutine name
AUTOLOAD (or DESTROY) for any
other purpose, or you'll suffer unintended
consequences.


Without an AUTOLOAD subroutine defined in a
package, an attempt to call some undefined subroutine simply produces
an error when the program runs. But if an AUTOLOAD
subroutine is defined, it is called instead and is passed the
arguments of the undefined subroutine. At the same time, the
$AUTOLOAD variable is set to the name of the
undefined subroutine.


Here's an example of a short Perl program that tries
to call an undefined function:


#!/usr/bin/perl
use strict;
use warnings;
print "I started the program\n";
report_protein_function("one", "two");
print "I got to the end of the program\n";


It gives the following output:


I started the program
Undefined subroutine &main::report_protein_function called at jk.pl line 8.


Here's what happens when an
AUTOLOAD subroutine is defined in the package:


#!/usr/bin/perl
use strict;
use warnings;
use vars '$AUTOLOAD';
print "I started the program\n";
report_protein_function("one", "two");
print "I got to the end of the program\n";
sub AUTOLOAD {
    print "AUTOLOAD is set to $AUTOLOAD\n";
    print "with arguments ", "@_\n";
}


It gives the following output:


I started the program
AUTOLOAD is set to main::report_protein_function
with arguments one two
I got to the end of the program


3.9.1 Defining Global Variables



Recall that when you start programs with such statements as:


use strict;


you have to declare all variables as lexically scoped using
my. However, there are times when your program
needs to use global variables that aren't lexically
scoped. To use AUTOLOAD, you need access to the
predefined $AUTOLOAD global variable.


To enable access to the package global $AUTOLOAD,
you must specifically exempt it from the use strict
injunction. This can be accomplished with the
use vars statement:


use vars '$AUTOLOAD';


Other globals can be declared in this way as well, but globals should
be used sparingly, and preferably not at all.


Newer versions of Perl (Version 5.6.0 and later) have a cleaner way to
declare global variables even when use strict
is in effect:


our $AUTOLOAD;


This makes the variable $AUTOLOAD a legal global
within the scope in which it is declared; in
Gene3.pm, the scope is the entire class.


Without our $AUTOLOAD or use vars '$AUTOLOAD',
the program won't run;
instead, it complains vociferously that:


Global symbol "$AUTOLOAD" requires explicit package name
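
As a minimal sketch of the our declaration described above, the following program runs under use strict without use vars. The subroutine name annotate_gene is a hypothetical name chosen for illustration; it is deliberately never defined, so the call falls through to AUTOLOAD.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Declare the package global so that 'use strict' accepts the short name.
our $AUTOLOAD;

# Catch-all for undefined subroutines in package main.
sub AUTOLOAD {
    # $AUTOLOAD holds the fully qualified name of the missing subroutine.
    return "caught $AUTOLOAD";
}

# annotate_gene is hypothetical and intentionally undefined.
my $message = annotate_gene('BRCA1');
print "$message\n";
```

Removing the our $AUTOLOAD line from this sketch should reproduce the "Global symbol" complaint shown above.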


3.9.2 AUTOLOAD Simplifies Writing Methods



Having the
AUTOLOAD mechanism available can greatly
simplify the writing of class methods. Many classes require methods
to examine and to change the values of attributes, as have the two
previous versions Gene1.pm and
Gene2.pm.


If an object has many attributes, you have to write an accessor
method and a mutator method for each attribute. This is repetitive;
it requires defining more methods every time the list of attributes
changes, and, in general, it's hard to maintain such
code.


The new version Gene3.pm uses
AUTOLOAD to automate the handling of methods for
accessors and mutators. All you need do is write the one
AUTOLOAD subroutine, and all these similar, basic
methods are handled in the same fashion by the one bit of code.



3.9.2.1 Bypassing use strict



AUTOLOAD starts by fiddling with the
use strict statement. Just as
it requires the $AUTOLOAD global variable to be
exempted from the use strict
directive, so does the magic AUTOLOAD speedup
(described in the next section) require an exemption from the
use strict directive at a
specific place within the AUTOLOAD subroutine.
Thus, the statement:


no strict "refs";


turns off the use strict where
required. This enables the lines (to be explained later) such as:


*{$AUTOLOAD} = sub { return $_[0]->{$attribute} };


to bypass the otherwise desirable use
strict instruction.
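
The effect of the exemption can be seen in isolation with a short sketch. The name main::get_label is a hypothetical subroutine name, assumed only for this illustration; the first attempt is wrapped in eval so the fatal error raised under strict refs can be observed without stopping the program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $name = 'main::get_label';    # hypothetical subroutine name

# With strict refs in force, using a string as a symbol name is a fatal
# runtime error, so the eval block returns false.
my $installed_under_strict = eval { *{$name} = sub { 'label' }; 1 };
print "under strict refs: ",
    ($installed_under_strict ? "installed" : "refused"), "\n";

# Inside a small block that relaxes strict refs, the same assignment works.
{
    no strict 'refs';
    *{$name} = sub { 'label' };
}
print "after no strict 'refs': ", get_label(), "\n";
```

Keeping the no strict "refs" statement inside the smallest block that needs it leaves the rest of the program under full strict checking.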



3.9.2.2 AUTOLOAD arguments



Recall that AUTOLOAD is
automatically used when 1) it has been defined, and 2) an undefined
subroutine is called. When this happens, AUTOLOAD
is simply passed the arguments that would have gone to the undefined
subroutine.


For example, say you call an undefined method fold
on an object $peptide:


$peptide->fold(-style => 'prion')


If you define an AUTOLOAD method in the class,
it's called and passed the calling object or class
name, as usual, plus the arguments -style
=> 'prion' you were trying
to pass to the nonexistent fold method. The global
scalar variable $AUTOLOAD is also set to the name
of the nonexistent fold method.
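
A small sketch makes the argument passing concrete. The Peptide class below is hypothetical and exists only for this illustration; its AUTOLOAD simply reports what it received. The check for DESTROY is there because Perl also routes the implicit DESTROY call to AUTOLOAD when a class defines no DESTROY method of its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Peptide is a hypothetical class used only for this illustration.
package Peptide;

our $AUTOLOAD;

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Called for any method that the class does not define.
sub AUTOLOAD {
    my ($self, @args) = @_;
    # The implicit DESTROY call also lands here; ignore it.
    return if $AUTOLOAD =~ /::DESTROY$/;
    return "method=$AUTOLOAD invocant=" . ref($self) . " args=@args";
}

package main;

my $peptide = Peptide->new;
my $report  = $peptide->fold(-style => 'prion');
print "$report\n";
```

This prints method=Peptide::fold invocant=Peptide args=-style prion, showing the object first, then the written arguments, with the method name available in $AUTOLOAD.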


The version of AUTOLOAD in
Gene3.pm expects at most one written argument. So, of
course, this AUTOLOAD actually receives up to two
arguments: the object, automatically passed into the subroutine
by arrow notation, which appears first, and the new value, if
any. This line in the AUTOLOAD subroutine:


my ($self, $newvalue) = @_;


assigns the reference to the object to the new variable
$self and the value to be set, if any, to the new
variable $newvalue.



3.9.2.3 Using naming conventions to write code: get_ and set_



The various
versions
of the Gene module have named attributes with
beginning underscores, for example, _name for the
gene name. The accessors and mutators for attributes have been
assigned names that prepend get and
set to the beginning of the attribute name, for
example, get_name and set_name.


In Gene3.pm, the AUTOLOAD
subroutine elevates this convention to an enforced discipline, by
recognizing only method names and attribute names that conform to
this convention. It first examines the name of the called subroutine
as stored in the $AUTOLOAD global variable, checks
if the subroutine name is in the expected form, and if so, extracts
the attribute name from the subroutine name with a regular
expression. The AUTOLOAD subroutine then checks
that the requested attribute exists, and fetches or sets the value of
that attribute.


The first part of the AUTOLOAD subroutine does
some checking to see if the subroutine name is in the expected form,
and if so, it extracts the attribute name, and the requested
operation (get or set). This
first test:


my ($operation, $attribute) = ($AUTOLOAD =~ /(get|set)(_\w+)$/);
# Is this a legal method name?
unless($operation && $attribute) {
    croak "Method name $AUTOLOAD is not in the recognized form (get|set)_attribute\n";
}
unless(exists $self->{$attribute}) {
    croak "No such attribute '$attribute' exists in the class ", ref($self);
}


uses a regular expression to see if the $AUTOLOAD
variable is storing a method name that ends with get or
set (the desired operation) followed by an attribute name
(complete with leading underscore). Whether that attribute is actually
defined for objects of this class is checked separately. The regular
expression:


(get|set)(_\w+)$


looks for a name that, after get or
set, is composed of an underscore followed by one
or more legal word characters (as described in the
perlre manpage on regular expressions):


_\w+


Here, the underscore matches an underscore, and the
\w matches any legal word
character, and the + matches one or more such word
characters. These are remembered and captured in the
$operation and $attribute
variables by surrounding with parentheses the parts of the regular
expression that match the operation and the attribute name:


(get|set)(_\w+)


This attribute name is assigned to the variable
$attribute (for obvious mnemonic reasons) to use
in the rest of the subroutine. Similarly, the operation
get or set is assigned to the
$operation variable.
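
The capture behavior can be tried on a few names outside of any class. In this sketch, parse_method_name is a hypothetical helper (not part of Gene3.pm), and the three method names are illustrative inputs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a method name into (operation, attribute) using the pattern
# from the text. parse_method_name is a hypothetical helper.
sub parse_method_name {
    my ($name) = @_;
    # In list context a successful match returns the captured groups;
    # a failed match returns the empty list.
    return $name =~ /(get|set)(_\w+)$/;
}

for my $name ('Gene3::get_name', 'Gene3::set_organism', 'Gene3::fold') {
    my ($operation, $attribute) = parse_method_name($name);
    if ($operation && $attribute) {
        print "$name: operation=$operation attribute=$attribute\n";
    }
    else {
        print "$name: not in the form (get|set)_attribute\n";
    }
}
```

The first two names yield an operation and an attribute with its leading underscore; the third leaves both variables undefined, which is the case the croak in AUTOLOAD guards against.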


The second part of the test checks to see if such an attribute name
exists in the hash that represents the class object:


unless(exists $self->{$attribute}) {
    croak "No such attribute '$attribute' exists in the class ", ref($self);
}


The exists Perl command checks to see if a hash
key exists; the value for the key may not have been set, but the key
must exist. $self is the reference to the class
object, so the following:


exists $self->{$attribute}


checks to see if any such attribute actually exists in the object.
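
The difference between a key that exists without a value and a key that is missing altogether can be sketched as follows. The hash contents and the helper describe_key are assumptions made for this illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A plain hash reference standing in for an object; the keys are illustrative.
my $self = { _name => 'Aging', _organism => undef };

# describe_key is a hypothetical helper that separates the three cases.
sub describe_key {
    my ($hashref, $key) = @_;
    return 'missing'          unless exists $hashref->{$key};
    return 'exists but unset' unless defined $hashref->{$key};
    return 'set';
}

for my $key ('_name', '_organism', '_citation') {
    print "$key: ", describe_key($self, $key), "\n";
}
```

Only the missing key would make the exists test in AUTOLOAD fail; a key whose value is undef still passes it.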


If the method name passed to AUTOLOAD begins with
get or set, ends with a name
including a leading underscore, and if that name is an existing key
in the hash that is the class object, the tests will succeed. If they
fail, the program will croak at this point.



3.9.2.4 AUTOLOAD accessors



The
next bit of AUTOLOAD code handles the calls to
class accessors:


# AUTOLOAD accessors
if($operation eq 'get') {
    # define subroutine
    *{$AUTOLOAD} = sub { shift->{$attribute} };
}


The code first determines that a get accessor was
wanted. Then the undefined accessor method (whose name has been saved
in the variable $AUTOLOAD) is defined. The
subroutine definition is placed in the program's
symbol table with *{$AUTOLOAD}. The new subroutine
gets the object from the arguments by the call to
shift. The object is a hash, and the value in the
hash for the attribute is returned from the subroutine. So this
method is a simple accessor, that, given an attribute name, returns
the value. This accessor isn't actually used here;
it's just defined in the symbol table.



3.9.2.5 AUTOLOAD mutators



The
next bit of AUTOLOAD code handles the calls to
class mutators:


# AUTOLOAD mutators
}elsif($operation eq 'set') {
    # define subroutine
    *{$AUTOLOAD} = sub { shift->{$attribute} = shift; };
    # set the new attribute value
    $self->{$attribute} = $newvalue;
}


Here, after determining that a set mutator method
was called, the undefined mutator method (whose name has been saved
in the variable $AUTOLOAD) is defined. The new
subroutine gets the object from the arguments by the first call to
shift and sets the attribute of the object to the
new value, which it gets from the arguments by the second call to
shift. After defining the new mutator method, the
code actually sets the attribute key to the
$newvalue that was passed in as an argument.


Finally, the AUTOLOAD subroutine, after defining the
new accessor or mutator method, as the case may be, and setting the
new value of the attribute if a mutator method has been defined,
returns the value of the attribute:


# return the attribute value
return $self->{$attribute};


So the AUTOLOAD method both defines the accessor
or mutator methods and behaves just like the defined accessor or
mutator method by returning the attribute value (if
it's a mutator, it first resets the attribute).
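
The fragments discussed in this section can be assembled into one runnable sketch. MiniGene is a hypothetical stand-in class, since Gene3.pm is not reproduced in full here; its constructor, its two attributes, and the empty DESTROY method are assumptions made so the example is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# MiniGene is a hypothetical stand-in for Gene3.pm.
package MiniGene;

use Carp;
our $AUTOLOAD;

sub new {
    my ($class, %arg) = @_;
    # Every attribute key is created up front so the 'exists' check can pass.
    return bless { _name => $arg{name}, _organism => $arg{organism} }, $class;
}

sub AUTOLOAD {
    my ($self, $newvalue) = @_;

    my ($operation, $attribute) = ($AUTOLOAD =~ /(get|set)(_\w+)$/);

    # Is this a legal method name?
    unless ($operation && $attribute) {
        croak "Method name $AUTOLOAD is not in the recognized form (get|set)_attribute\n";
    }
    unless (exists $self->{$attribute}) {
        croak "No such attribute '$attribute' exists in the class ", ref($self);
    }

    # Assigning to a glob named by a string needs this exemption.
    no strict 'refs';

    if ($operation eq 'get') {
        # define the accessor
        *{$AUTOLOAD} = sub { shift->{$attribute} };
    }
    elsif ($operation eq 'set') {
        # define the mutator, then set the new attribute value
        *{$AUTOLOAD} = sub { shift->{$attribute} = shift; };
        $self->{$attribute} = $newvalue;
    }

    # return the attribute value
    return $self->{$attribute};
}

# Defined explicitly so that object destruction never reaches AUTOLOAD.
sub DESTROY { }

package main;

my $gene = MiniGene->new(name => 'Aging', organism => 'Homo sapiens');
$gene->set_name('Longevity');
print $gene->get_name, "\n";
print $gene->get_organism, "\n";
```

Calling a method for an attribute that was never created, such as a hypothetical get_citation, croaks with the "No such attribute" message instead of silently returning nothing.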



3.9.2.6 AUTOLOAD speedup



The so-called
"magic" lines in the accessor and
mutator code that I've referred to:


*{$AUTOLOAD} = sub { shift->{$attribute} };


and:


*{$AUTOLOAD} = sub { shift->{$attribute} = shift; };


are there purely in order to speed up the code.


AUTOLOAD performs its tasks a bit on the slow
side. For a large program that does a lot of getting and setting of
attributes, the slowdown is noticeable. What is saved in programming
time by having AUTOLOAD handle all these accessors
and mutators is lost in runtime. The slowdown comes from the program
having to figure out what is wanted by the undefined methods, the use
of regular expressions to parse the names of the methods, and so on.


The magic lines actually define the new methods in the symbol table,
on the fly, when they don't already exist. (The *
gives access to the symbol table, but I'll omit the
details of how the symbol table is defined and manipulated and stick
to practicalities here.) After they are called once, and the
AUTOLOAD overhead is incurred, the methods are
thenceforth defined in the symbol table of the running program. So,
for instance, the second time that the accessor method
get_name is called, the program finds the
definition in the symbol table, and AUTOLOAD
isn't called. This results in a considerable speedup
for the program overall.


I'll not delve too deeply into how this works.
Briefly, the $AUTOLOAD variable contains the fully
qualified name of the desired method, say, Gene3::get_name. The
star * in *{$AUTOLOAD} refers to the
symbol table entry for that name. The (anonymous) subroutine
definition to the right of the assignment sign (=) is assigned to
that symbol table entry, which makes it the definition of the
method.


The symbol table is thus manipulated directly from your program, and
the missing accessor and mutator definitions are installed in the
symbol table the first time AUTOLOAD is called to
handle them. After this first call that invokes
AUTOLOAD, the program can find the method
definitions in the symbol table and uses those definitions, bypassing
AUTOLOAD. For more details, see
O'Reilly's Programming
Perl.
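
The bypass can be observed directly by counting how often AUTOLOAD runs. In this sketch, the Sample class and the $autoload_calls counter are hypothetical and exist only to make the effect visible; only get accessors are handled, to keep the example short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample is a hypothetical class; $autoload_calls is an illustrative counter.
package Sample;

our $AUTOLOAD;
our $autoload_calls = 0;

sub new {
    my ($class) = @_;
    return bless { _name => 'Aging' }, $class;
}

sub AUTOLOAD {
    my ($self) = @_;
    # The implicit DESTROY call also lands here; ignore it.
    return if $AUTOLOAD =~ /::DESTROY$/;
    $autoload_calls++;

    my ($attribute) = $AUTOLOAD =~ /get(_\w+)$/
        or die "Unsupported method $AUTOLOAD\n";

    # Install the accessor so later calls find it in the symbol table.
    no strict 'refs';
    *{$AUTOLOAD} = sub { $_[0]->{$attribute} };

    return $self->{$attribute};
}

package main;

my $sample = Sample->new;
$sample->get_name for 1 .. 3;

print "AUTOLOAD ran $Sample::autoload_calls time(s)\n";
print "get_name is now defined\n" if Sample->can('get_name');
```

Although get_name is called three times, the counter ends at 1: the second and third calls find the installed subroutine and never reach AUTOLOAD.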


