 3.7 Gene2.pm: A Second Example of a Perl Class

 Gene1 demonstrated the fundamentals of a Perl class. Now,
 I'll build a more realistic example, which also
 includes a few additional standard Perl techniques.

 My goal is to present an example that you can imitate in order to
 begin to develop your own OO software. I'm going to
 build the example in three more stages, expanding upon the
 Gene1.pm module. First, I'll add
 mutators,
 which are methods that alter the data in an object.
 I'll also add a method that gives information about
 the class as a whole, returning the count of how many objects in the
 class exist in the running program. This depends on the use of
 closures,
 methods that use variables declared outside the methods. This is the
 new material in the Gene2.pm module.

 After that step, I introduce the AUTOLOAD
 mechanism, which gives a single class method called
 AUTOLOAD that can define large numbers of other
 methods and significantly reduce the amount of coding you need to
 write to develop a more complex object (among other benefits to be
 described later). That will be the Gene3.pm
 module.

 We'll end up with a Gene.pm
 module you can use as a basis for your own Perl module development.
 It will add a mechanism to specify what properties each attribute has
 (which can prevent improper data manipulation, for instance). It will
 show how to initialize an object with class defaults and how to clone
 an existing object. Finally, Gene.pm will show you
 how to incorporate the documentation for a class right in the Perl
 code for the class.

 Here is the code for the intermediate Gene2.pm
 module. Following the Gene2.pm module is an
 example of the code and output of a small test program that drives
 the module. Take a minute to look at these two code examples,
 especially at the comments. The module Gene2.pm
 contains several new details that will be discussed following the
 code. The test program should be fairly easy to read and understand.

 package Gene2;

 #
 # A second version of the Gene.pm module
 #
 use strict;
 use warnings;
 use Carp;

 # Class data and methods, that refer to the collection of all objects
 # in the class, not just one specific object
 {
     my $_count = 0;
     sub get_count {
         $_count;
     }
     sub _incr_count {
         ++$_count;
     }
     sub _decr_count {
         --$_count;
     }
 }

 # The constructor for the class
 sub new {
     my ($class, %arg) = @_;
     my $self = bless {
         _name        => $arg{name}      || croak("Error: no name"),
         _organism    => $arg{organism}  || croak("Error: no organism"),
         _chromosome  => $arg{chromosome}|| "????",
         _pdbref      => $arg{pdbref}    || "????",
     }, $class;
     $class->_incr_count(  );
     return $self;
 }

 # Accessors, for reading the values of data in an object
 sub get_name        { $_[0] -> {_name}       }
 sub get_organism    { $_[0] -> {_organism}   }
 sub get_chromosome  { $_[0] -> {_chromosome} }
 sub get_pdbref      { $_[0] -> {_pdbref}     }

 # Mutators, for writing the values of object data
 sub set_name {
     my ($self, $name) = @_;
     $self -> {_name} = $name if $name;
 }
 sub set_organism {
     my ($self, $organism) = @_;
     $self -> {_organism} = $organism if $organism;
 }
 sub set_chromosome {
     my ($self, $chromosome) = @_;
     $self -> {_chromosome} = $chromosome if $chromosome;
 }
 sub set_pdbref {
     my ($self, $pdbref) = @_;
     $self -> {_pdbref} = $pdbref if $pdbref;
 }

 1;

 Here is the small test program testGene2 that
 demonstrates how to use the objects and methods in this version,
 Gene2, of our OO class:
 #!/usr/bin/perl
 #
 # Test the second version of the Gene module
 #
 use strict;
 use warnings;
 # Change this line to show the folder where you store Gene2.pm
 use lib "/home/tisdall/MasteringPerlBio/development/lib";
 use Gene2;
 #
 # Create object, print values
 #
 print "Object 1:\n\n";
 my $obj1 = Gene2->new(
 name          => "Aging",
 organism      => "Homo sapiens",
 chromosome    => "23",
 pdbref        => "pdb9999.ent"
 );
 print $obj1->get_name, "\n";
 print $obj1->get_organism, "\n";
 print $obj1->get_chromosome, "\n";
 print $obj1->get_pdbref, "\n";
 #
 # Create another object, print values ... some will be unset
 #
 print "\n\nObject 2:\n\n";
 my $obj2 = Gene2->new(
 organism    => "Homo sapiens",
 name        => "Aging",
 );
 print $obj2->get_name, "\n";
 print $obj2->get_organism, "\n";
 print $obj2->get_chromosome, "\n";
 print $obj2->get_pdbref, "\n";
 #
 # Reset some of the values, print them
 #
 $obj2->set_name("RapidAging");
 $obj2->set_chromosome("22q");
 $obj2->set_pdbref("pdf9876.ref");
 print "\n\n";
 print $obj2->get_name, "\n";
 print $obj2->get_organism, "\n";
 print $obj2->get_chromosome, "\n";
 print $obj2->get_pdbref, "\n";
 print "\nCount is ", Gene2->get_count, "\n\n";
 #
 # Create another object, print values: but this fails
 # because the "name" value is required (see the "new"
 # constructor in Gene2.pm)
 #
 print "\n\nObject 3:\n\n";
 my $obj3 = Gene2->new(
 organism      => "Homo sapiens",
 chromosome    => "23",
 pdbref        => "pdb9999.ent"
 );
 print "\nCount is ", Gene2->get_count, "\n\n";
 Finally, here's the output from the test program
 testGene2:
 Object 1:
 Aging
 Homo sapiens
 23
 pdb9999.ent
 Object 2:
 Aging
 Homo sapiens
 ????
 ????
 RapidAging
 Homo sapiens
 22q
 pdf9876.ref
 Count is 2
 Object 3:
 Error: no name at testGene2 line 68
 It's a good idea to take a moment to read through
 this Gene2.pm module, the test program
 testGene2, and the output. Compare this new
 Gene2 module with the earlier
 Gene1 module. In particular, notice where the
 methods are defined in the module, and then how they are actually
 used in the test program. Don't get hung up on the
 details in this first reading; just look at the overall picture.
 Notice that the definitions are all in the module
 Gene2.pm, which is then loaded at the beginning of
 the test program testGene2; it is
 testGene2 that actually creates the
 module's objects and uses the
 module's methods on those objects. In other words,
 testGene2 is a program;
 Gene2.pm is a definition of a class that is used
 in testGene2.

 Let's begin examining the module code.
 
 3.7.1 Closures

 A closure keeps
 track of class data. Class data
 refers not to a particular object, but to several, possibly all,
 objects of a class that have been created during the running of your
 program. This is frequently important to do. For instance, say you
 have a DNA sequencing pipeline that can handle only 20 sequences at
 any one time. You'd want your controlling program to
 block any attempt to create more than 20 sequence objects until the
 pipeline is ready to receive more. To do this, you would keep a count
 of how many sequence objects your controlling program has created.
 Closures are a way to program such class data.

 A closure is a subroutine that uses a variable
 defined outside the subroutine. By surrounding such a variable and
 some closures that use that variable within a block, you can use the
 closures to access the variable from anywhere in the program, and the
 variable will never go out of scope and lose its value. This section
 will explain how this works and how to use it in your code.

 The following code is new in Gene2.pm:

 # Class data and methods, that refer to the collection of all objects
 # in the class, not just one specific object
 {
     my $_count = 0;
     sub get_count {
         $_count;
     }
     sub _incr_count {
         ++$_count;
     }
     sub _decr_count {
         --$_count;
     }
 }

 This code creates a variable $_count.
 $_count is a lexical my
 variable in a block of curly braces, and therefore is hidden from all
 parts of the code except within the block. The three methods that are
 also defined in the same block use the variable
 $_count.

 This variable persists throughout the life
 of the program because the subroutines defined with it are closures.
 For example, in the code for the class module
 Gene2.pm, I use $_count to keep
 a count of how many objects are in existence at any given time.
 Notice that the method names _incr_count and
 _decr_count begin with a leading underscore, as
 does the variable name $_count. They
 aren't meant to be called by the user of the class
 but are internal to the module. On the other hand, the remaining
 method get_count doesn't begin
 with a leading underscore and is meant to be called whenever the user
 of the class wants to know what the count is.

 The previous section of code implements a closure. It is surrounded
 by curly braces creating a Perl block.
 You've seen many blocks associated with loops and
 conditionals as you learned the fundamentals of Perl. The block here
 stands on its own without being a part of another programming
 construct.

 Any block, this one included, creates a new
 scope for the variables that occur within it.
 my variables (also called
 lexical variables) within a
 block exist only while the program is executing the statements within
 that block. When a program leaves a block by passing beyond its
 closing curly brace, the my variables within it go
 out of scope. In other words, they cease to exist, and disappear from
 the program until the program reenters the block, and they are
 created anew.

 The preceding paragraph is correct; however, there is one important
 "but."

 Subroutine definitions don't go out of scope in the
 way that lexically scoped (my) variables do. It is
 also possible for a subroutine definition to affect the behavior of a
 lexically scoped variable. Aha. Read on.

 To repeat: subroutine definitions aren't subject to
 the same constraints as variables in regard to my
 and blocks. In fact, a subroutine definition is global to the entire
 package in which it's declared. Perl looks for
 subroutine definitions at compile-time, before actually running the
 program, and makes a subroutine definition available to an entire
 package no matter where the subroutine is declared (even if
 it's declared in a conditional block
 that's never reached during runtime, when the
 program code is actually executed).

 As an example, here is a small program with a subroutine definition:

 #
 # A program to demonstrate the global nature of subroutine definitions
 #
 my $dna = 'ACGT';
 if ($dna eq 'ACGT') {
     print "This statement gets executed\n";
     print "Here's the subroutine call:\n";
     isdna($dna);
 } else {
     print "This statement does not get executed\n";
     #
     # The following subroutine definition is in a block which is
     # never executed at runtime.
     #
     sub isdna {
         # Print the argument if it is DNA
         if($_[0] =~ /^[ACGT]+$/i) {
             print $_[0], "\n";
         } else {
             return 0;
         }
     }
 }

 This produces the following output:
 This statement gets executed
 Here's the subroutine call:
 ACGT
 As you see, even though the subroutine definition is buried in a
 block that's never entered, not even once, it is
 still available to the program. Perl scans the program at
 compile-time, reads in any subroutine definition no matter where it
 is, and the subroutine definition is then available to be called from
 anywhere in the program at runtime.

 Continuing on, in the code from Gene2.pm under
 consideration, there's the variable definition:

 my $_count = 0;

 which occurs outside the following subroutine definitions such as:

 sub _incr_count {
     ++$_count;
 }

 The variable $_count is declared outside the
 subroutine _incr_count, but the subroutine uses
 the variable. Therefore, by definition, the subroutine
 _incr_count is a closure.

 There's just one more piece to the puzzle. Consider
 again the code fragment from Gene2.pm, which I
 repeat here:

 # Class data and methods, that refer to the collection of all objects
 # in the class, not just one specific object
 {
     my $_count = 0;
     sub get_count {
         $_count;
     }
     sub _incr_count {
         ++$_count;
     }
     sub _decr_count {
         --$_count;
     }
 }

 It seems that when the program leaves the block that encloses this
 code, the variable $_count should go out of scope
 and no longer be available to the program. However, in
 Gene2.pm the $_count variable
 doesn't cease to exist.

 Because the subroutine definitions in this block are global, and
 because they also reference the variable $_count,
 Perl knows that at any point in the program you can put in a call to,
 say, get_count, which in turn needs the variable
 $_count to execute. Perl doesn't
 cause the variable $_count to cease to exist
 because it sees the closures and avoids destroying the variable they
 reference at runtime. At any point in the program, the value of
 $_count can be obtained by calling the subroutine.
 However, the value of $_count
 can't be accessed in any other way than by
 get_count or another closure defined within the same
 block.

 To summarize, by defining a variable and a closure that uses that
 variable within a block, a program can limit access to that variable
 to calls by the closures. This is exactly what I want to do in
 setting up class methods that refer to the collection of all objects
 that are in use.

 In Gene2.pm, I want to initialize the count of
 objects to 0 when the program starts and then increment it by one
 each time a new object is created. By defining
 _incr_count as a closure, I can call it from
 within the new object constructor, ensuring that
 the variable $_count will keep an accurate count
 of the number of objects that are created.
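
 To see the mechanism in isolation, here is a minimal sketch that lifts the counter block out of the class and runs it as a plain script. The subroutine names are the ones from Gene2.pm; calling them as ordinary functions rather than as class methods is a simplification made only for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A bare block: $_count can be named only between these braces.
{
    my $_count = 0;

    sub get_count   { return $_count }
    sub _incr_count { return ++$_count }
    sub _decr_count { return --$_count }
}

# The block has been left, yet the three closures still share
# the same $_count, and it keeps its value between calls.
_incr_count() for 1 .. 3;
_decr_count();
print "Count is ", get_count(), "\n";    # Count is 2

# The variable cannot be reached by name from out here. Under
# "use strict" the next line would not even compile, so it is
# left commented out:
# print $_count;
```

 Note that the assignment of 0 happens when the program reaches the block at runtime, so in a script like this one the block should appear before the first call to any of the closures.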
 
 3.7.2 Tracking Class Data from the Constructor Method

 In this
 second version of the class, I just have to make a small change to
 the constructor method, the subroutine new.

 Here is the modified constructor method new:

 # The constructor for the class
 sub new {
     my ($class, %arg) = @_;
     my $self = bless {
         _name        => $arg{name}      || croak("Error: no name"),
         _organism    => $arg{organism}  || croak("Error: no organism"),
         _chromosome  => $arg{chromosome}|| "????",
         _pdbref      => $arg{pdbref}    || "????",
     }, $class;
     $class->_incr_count(  );
     return $self;
 }

 First, I create the object by blessing (and
 initializing) an anonymous hash, as before. This time, however,
 I'll save the object as the local variable
 $self. This allows me to add a call to the class
 method _incr_count in order to keep track of the
 total number of objects created. I'll then return
 the object $self from the subroutine.
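
 The same pattern can support the sequencing-pipeline scenario mentioned at the start of the closures discussion, where only a fixed number of objects may exist at once. The following is a sketch under stated assumptions, not part of Gene2.pm: the class name SeqJob, the helper _has_room, and the limit of 2 (kept small so the demonstration is short) are all hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package SeqJob;    # hypothetical class, not from Gene2.pm
use Carp;

# Class data: a counter and an assumed limit, both private to this block
{
    my $_count = 0;
    my $_max   = 2;    # assumed limit, small for demonstration

    sub get_count   { $_count }
    sub _incr_count { ++$_count }
    sub _has_room   { $_count < $_max }
}

# The constructor consults the class data before creating an object
sub new {
    my ($class, %arg) = @_;
    croak("Error: pipeline is full") unless $class->_has_room;
    my $self = bless {
        _name => $arg{name} || croak("Error: no name"),
    }, $class;
    $class->_incr_count();
    return $self;
}

package main;

my @jobs = map { SeqJob->new(name => "seq$_") } 1 .. 2;
print "Count is ", SeqJob->get_count, "\n";

# A third object exceeds the assumed limit, so new() croaks;
# eval traps the error so the script can continue.
my $third = eval { SeqJob->new(name => "seq3") };
print "Third object refused: $@" unless $third;
print "Count is still ", SeqJob->get_count, "\n";
```

 Because the check and the increment both go through closures in the same block, no code outside the class can change the count or the limit directly.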
 
 3.7.3 Accessor and Mutator Methods

 In the
 first version of Gene1.pm, I printed the values
 stored in an object by accessing simple methods such as
 get_name.

 In this new version of Gene2.pm, I have the same
 specific methods for each attribute for which I may want to see the
 value. I also include
 mutators,
 which are subroutines that enable the user of the class to alter the
 values of attributes of an object.

 Here are the accessor and mutator methods for
 Gene2.pm:

 # Accessors, for reading the values of data in an object
 sub get_name        { $_[0] -> {_name}       }
 sub get_organism    { $_[0] -> {_organism}   }
 sub get_chromosome  { $_[0] -> {_chromosome} }
 sub get_pdbref      { $_[0] -> {_pdbref}     }

 # Mutators, for writing the values of object data
 sub set_name {
     my ($self, $name) = @_;
     $self -> {_name} = $name if $name;
 }
 sub set_organism {
     my ($self, $organism) = @_;
     $self -> {_organism} = $organism if $organism;
 }
 sub set_chromosome {
     my ($self, $chromosome) = @_;
     $self -> {_chromosome} = $chromosome if $chromosome;
 }
 sub set_pdbref {
     my ($self, $pdbref) = @_;
     $self -> {_pdbref} = $pdbref if $pdbref;
 }

 The mutators collect two arguments. The first is the reference to the
 object, which as before, is passed automatically to the method when
 it is invoked (using the method set_name as an
 example):

 $obj->set_name('hairy');

 The second argument collected is then the first argument given to the
 call, in this case, setting the gene name to
 hairy.

 The work of the subroutine is accomplished by the line:

 $self -> {_name} = $name if $name;

 It simply sets the internal _name attribute to the
 supplied name (hairy in this example) if the
 argument $name is supplied and has a true value. If
 it's not supplied, or if it is a false value such as the empty
 string or 0, the subroutine does nothing.

 Again, you see that the internal representation of the attributes of
 the object is hidden from the class's user.
 Altering an object's attributes is done with
 methods; the class author is then free to alter the way in which the
 attributes are stored, without changing the Application Programming
 Interface (API), the interface of the class to the outside world. If
 you use this class, you don't have to change your
 code when a new version of the class is written.

 The test program testGene2 is similar to
 testGene1, with the addition of examples of the
 class mutators.
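
 One consequence of the "if $name" style of guard is worth seeing in action. The sketch below uses a hypothetical one-attribute class, MiniGene, as a stand-in for Gene2; the method set_chromosome_defined is an assumed variant that is not in Gene2.pm, shown only to contrast a truth test with a defined test.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package MiniGene;    # hypothetical stand-in for Gene2, one attribute only

sub new {
    my ($class, %arg) = @_;
    return bless { _chromosome => $arg{chromosome} || "????" }, $class;
}

# Accessor
sub get_chromosome { $_[0]->{_chromosome} }

# Mutator with the same guard as in Gene2.pm:
# assign only if the argument has a true value
sub set_chromosome {
    my ($self, $chromosome) = @_;
    $self->{_chromosome} = $chromosome if $chromosome;
}

# Assumed variant (not in Gene2.pm): assign whenever a value is defined
sub set_chromosome_defined {
    my ($self, $chromosome) = @_;
    $self->{_chromosome} = $chromosome if defined $chromosome;
}

package main;

my $gene = MiniGene->new(chromosome => "22q");

$gene->set_chromosome();       # no argument: the value is left alone
$gene->set_chromosome("0");    # false value: also left alone
print $gene->get_chromosome, "\n";    # 22q

$gene->set_chromosome_defined("0");   # defined, so it is stored
print $gene->get_chromosome, "\n";    # 0
```

 For gene names and organisms a false value is unlikely to be legitimate data, so the simpler guard is a reasonable choice there; the variant matters only for attributes where 0 or an empty string could be meaningful.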