Mastering Perl for Bioinformatics [Electronic resources]

2.3 Matrices


Perl matrices are built from simpler data structures using
references. Recall that a matrix is a set of values that can be
uniquely referenced by indexes. If only one index is required, the
matrix is one-dimensional (this is exactly how an array works in
Perl). If n indexes are required, the matrix is n-dimensional.


2.3.1 Two-Dimensional Matrices


A two-dimensional matrix is one of the simplest complex data
structures. It can be conceptualized as a table of rows and columns,
in which each element of the table is uniquely identified by its
particular row and column.

There are several ways to build matrices in Perl.
We'll look at some of the most useful.

Because there is no built-in matrix data structure, you have to build
a matrix from other data structures. The most straightforward way to
do this is with an array of
arrays:

@probes = (
[1, 3, 2, 9],
[2, 0, 8, 1],
[5, 4, 6, 7],
[1, 9, 2, 8]
);
print "The probe at row 1, column 2 has value ", $probes[1][2], "\n";

This prints out:

The probe at row 1, column 2 has value 8


Recall that in Perl the first element of an array is indexed 0; so
row 1 in this program is actually the second row, and column 2 is
actually the third column. Sometimes you may want to refer to the 0th
row as row 1; you have to adjust your code and your interactions with
the user accordingly.
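
One way to make that adjustment is to hide the subtraction in a small
helper. The following is only a sketch: the subroutine name probe_at
is made up for this illustration and is not part of Perl or of the
program above.

```perl
use strict;
use warnings;

my @probes = (
    [1, 3, 2, 9],
    [2, 0, 8, 1],
    [5, 4, 6, 7],
    [1, 9, 2, 8],
);

# Hypothetical helper: look up a value using row and column numbers
# that count from 1, as a user might expect.
sub probe_at {
    my ($matrix, $row, $col) = @_;
    # Reject 0 and negative numbers; otherwise row 0 would become
    # index -1, which Perl treats as "the last row".
    die "row and column must be 1 or greater\n" if $row < 1 or $col < 1;
    return $matrix->[$row - 1][$col - 1];
}

print "The probe at row 2, column 3 (counting from 1) has value ",
    probe_at(\@probes, 2, 3), "\n";
```

This reports the same element as $probes[1][2] in the original
program, so it prints the value 8.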

This matrix is implemented as an array (in parentheses), each element
of which is a reference to an anonymous array [in square brackets],
which itself is a list of integers.
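
Because each row is a reference, you can inspect and dereference it
like any other array reference. The short sketch below reuses the
@probes data; the variable names @row1, @col2, $rows, and $cols are
chosen just for this illustration.

```perl
use strict;
use warnings;

my @probes = (
    [1, 3, 2, 9],
    [2, 0, 8, 1],
    [5, 4, 6, 7],
    [1, 9, 2, 8],
);

# Each element of @probes is a reference to an anonymous array.
print "Element 1 is a reference of type ", ref($probes[1]), "\n";

# Dereference one row to get its list of values.
my @row1 = @{ $probes[1] };
print "Row 1: @row1\n";

# Collect one column by taking the same position from every row.
my @col2 = map { $_->[2] } @probes;
print "Column 2: @col2\n";

# Number of rows, and number of columns in row 0.
my $rows = scalar @probes;
my $cols = scalar @{ $probes[0] };
print "$rows rows, $cols columns\n";
```

Note that nothing forces every row to have the same length; the
column count here describes row 0 only.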

Another good way to build a matrix is to declare a
reference to an anonymous array. In the following example,
I declare an empty anonymous array and then populate it as desired.
This is, in effect, an anonymous array of anonymous arrays:

# Declare reference to (empty) anonymous array
$array = [ ];
# Initialize the array
for($i=0; $i < 4 ; ++$i) {
    for($j=0; $j < 4 ; ++$j) {
        $array->[$i][$j] = $i * $j;
    }
}
# Reset one of the elements of the array
$array->[3][2] = 99;
# Print the array
for($i=0; $i < 4 ; ++$i) {
    for($j=0; $j < 4 ; ++$j) {
        printf("%3d ", $array->[$i][$j]);
    }
    print "\n";
}


Note the use of printf to format the output
nicely. For a refresher on this Perl function, consult the Perl
documentation, by typing:

perldoc -f printf

and

perldoc -f sprintf

at a shell prompt or check out http://www.perldoc.com.

This program produces the following output:

  0   0   0   0 
  0   1   2   3 
  0   2   4   6 
  0   3  99   9 

Alternatively, if the values are known, I can declare this as an
anonymous array of anonymous arrays by saying:

$array = [
[0, 0, 0, 0],
[0, 1, 2, 3],
[0, 2, 4, 6],
[0, 3, 99, 9]
];

I can also declare an array of anonymous arrays, by saying:

@array = (
[0, 0, 0, 0],
[0, 1, 2, 3],
[0, 2, 4, 6],
[0, 3, 99, 9]
);

Notice the slight syntactical difference between an array of
anonymous arrays:

@array = ( [  ], [  ], ... );

and an anonymous array of anonymous arrays:

$array = [ [  ], [  ], ... ];

Note that Perl also allows you to say:

$$array[$i][$j]

as a synonym for:

$array->[$i][$j]

But beware confusing:

$array->[$i][$j]

with:

$array[$i][$j]

They are not the same thing and won't refer to the
same array if you intermix them!
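
The difference is easiest to see when both variables exist at once. In
this sketch the scalar $array and the array @array are deliberately
given different contents (the values are invented for the
illustration):

```perl
use strict;
use warnings;

# Two unrelated variables that happen to share the name "array".
my $array = [ [0, 1], [2, 3] ];        # scalar holding a reference
my @array = ( [10, 11], [12, 13] );    # a real array

print '$array->[1][0] is ', $array->[1][0], "\n";  # uses the scalar $array
print '$$array[1][0]  is ', $$array[1][0],  "\n";  # same element, other syntax
print '$array[1][0]   is ', $array[1][0],   "\n";  # uses the array @array
```

The first two lines print 2 and the third prints 12. If only $array
had been declared, use strict would reject $array[1][0] at compile
time, which is a good reason to enable it.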

Very often you read data in from a file that has the elements of a
matrix displayed one row per line, and you have to store the data
from that file in an array in your Perl program. Say you have the
following data:

  0   0   0   0 
  0   1   2   3 
  0   2   4   6 
  0   3  99   9 

You can read the data into a Perl array with the following loop:

while (<>) {
    @row = split;
    push(@array, [ @row ]);
}

This assumes that you've named the file on the
command line as an argument to the program. Note that each incoming
line is assigned to the special variable $_ on
each iteration through the while loop. The
split function uses this line stored in
$_ by default. Each incoming line is split into an
array of its whitespace-separated elements, and then an anonymous
array [ @row ] containing those elements is pushed
onto the @array array.
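
The copy made by [ @row ] matters when the same @row variable is
reused for every line. The sketch below contrasts it with pushing
\@row. It reads the sample data through an in-memory filehandle
instead of a file named on the command line, and the names @copied
and @shared are invented for the comparison.

```perl
use strict;
use warnings;

# Sample data, read through an in-memory filehandle instead of a real file.
my $data = "0 0 0 0\n0 1 2 3\n0 2 4 6\n0 3 99 9\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

my (@copied, @shared);
my @row;                       # one variable reused for every line
while (my $line = <$fh>) {
    @row = split ' ', $line;
    push @copied, [ @row ];    # new anonymous array holding a copy
    push @shared, \@row;       # reference to the same @row every time
}
close $fh;

print "copied row 1: @{ $copied[1] }\n";
print "shared row 1: @{ $shared[1] }\n";
```

The copied matrix keeps 0 1 2 3 in row 1, while every row of the
shared matrix shows the last line read, 0 3 99 9, because all four
entries point at the one @row array.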

For more details on arrays of arrays, see the
perllol manpage; type perldoc
perllol at your command prompt or visit the Perl
documentation web site at http://www.perldoc.com.


2.3.2 Higher-Dimensional Matrices


To use a higher-dimensional matrix, simply add another dimension:

# Populate a 3-dimensional array
$array = [ ];
# Initialize the array
for($i=0; $i < 4 ; ++$i) {
    for($j=0; $j < 4 ; ++$j) {
        for($k=0; $k < 4 ; ++$k) {
            $array->[$i][$j][$k] = $i * $j * $k;
        }
    }
}

The sharp-witted reader
may have noticed that we seem to be omitting arrow operators between
array subscripts. (After all, these are anonymous arrays of anonymous
arrays of anonymous arrays, etc., so shouldn't they
be written $array->[$i]->[$j]->[$k]?)
Perl allows this; only the arrow operator between the variable name
and the first array subscript is required. It makes things easier on
the eyes and helps avoid carpal tunnel syndrome. On the other hand,
you may prefer to keep the dereferencing arrows in place, to make it
clear you are dealing with references. Your choice.
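
As a quick check that these spellings really are interchangeable, the
following sketch stores one value and reads it back three ways. The
variable names $short, $long, and $block are illustrative only.

```perl
use strict;
use warnings;

my $array = [ ];
$array->[1][2][3] = 'x';

# All three expressions name the same element.
my $short = $array->[1][2][3];                # arrows between subscripts omitted
my $long  = $array->[1]->[2]->[3];            # every arrow written out
my $block = ${ ${ ${$array}[1] }[2] }[3];     # explicit dereferencing blocks

print "same element\n" if $short eq $long and $long eq $block;
```

The third form shows what the arrows save you from typing.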

There's no need to stop at three-dimensional arrays.
If higher-dimensional arrays are hard to imagine, just
don't think of
"dimension" as tied to space. For
instance, four-dimensional arrays have points that are uniquely
identified by four indices; five-dimensional arrays have points that
are uniquely identified by five indices, etc. In fact, some physical
theories (M-theory, for one) propose that space-time has eleven
dimensions.


2.3.3 Sparse Arrays


Some programs need arrays, but only a small number of the array
elements are ever used. Such arrays are called sparse
arrays.

It would be inefficient to declare, for instance, a 1,000-by-1,000
element array, 1 million elements in all, if only 100 elements are
ever actually used. For such sparse two-dimensional arrays,
it's best to implement the array as a hash of
hashes:

$array = {  };
$array->{4}{83} = 'set';
$array->{34}{9} = 'set';
print $array->{4}{83}, "\n";
print $array->{34}{9}, "\n";

This prints out:

set
set

Perl creates only the table elements referenced, which makes an
efficient implementation for a sparse matrix. However, because merely
looking at a location (to see if there's anything
there) can create an entry in the hash (Perl's autovivification), you
have to use the Perl exists function to keep your hashes sparse when
looking at them. exists reports on whether a
particular key (or array element) has been created, without actually
creating it.[4]

[4] The function defined
is related but different; when used on a hash element, it checks
whether the value is defined (that is, not undef), not whether the
key exists.

So to explore the sparse matrix just shown, you can say:


$array = {  };
$array->{4}{83} = 'set';
$array->{34}{9} = 'set';
for(my $i=0 ; $i < 100 ; ++$i) {
    for(my $j=0 ; $j < 100 ; ++$j) {
        if( exists($array->{$i}) and exists($array->{$i}{$j}) ) {
            print "Array element row $i column $j is $array->{$i}{$j}\n";
        }
    }
}

This reports, without increasing the size of the array, that:

Array element row 4 column 83 is set
Array element row 34 column 9 is set

Question: why did you need two exists tests?
(Hint: it's a two-dimensional array.) Another
question: is $array a hash or a reference to an
anonymous hash? Can you implement it the other way? See the exercises
for this chapter.
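
Scanning every possible row and column works, but it does 10,000
tests to find two entries. When you don't need to probe particular
locations, an alternative is to walk only the keys that are actually
stored. This is a sketch rather than the book's method; the @found
array is introduced only to collect the report lines.

```perl
use strict;
use warnings;

my $array = { };
$array->{4}{83} = 'set';
$array->{34}{9} = 'set';

# Visit only the stored entries, in numeric row and column order.
# (keys returns hash keys in no particular order, hence the sorts.)
my @found;
for my $i (sort { $a <=> $b } keys %$array) {
    for my $j (sort { $a <=> $b } keys %{ $array->{$i} }) {
        push @found, "row $i column $j is $array->{$i}{$j}";
    }
}
print "Array element $_\n" for @found;
```

This prints the same two lines as the nested loops above, and because
it only reads keys that already exist, it adds nothing to the hash.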

