Mastering Perl for Bioinformatics [Electronic resources] (text version)

7.4 Rebase: Building Dynamic Web Pages

The simple examples in the previous
sections showed how to load and use the CGI.pm
module to display a very simple web page and how to examine the error
logs of the web server to help debug a CGI program that
doesn't display properly.

The real power of CGI comes from its ability to provide dynamic
content: web pages that may display different information
depending on factors such as when they're called, like the date
and time in the previous example. Dynamic content also responds to
requests that users enter by typing in text fields, clicking
"radio" buttons, selecting from lists, or using other input
methods.

In this section, I'll show you how to use some of
the modules from previous chapters, combined with the use of the
CGI.pm module, to make an interactive, dynamic web
page for displaying restriction maps. In this web page, the user will
select which restriction enzyme or enzymes to search for and specify
the sequence to search either by entering the sequence data into a
text window or by browsing for the file that contains the sequence.

Here is the short CGI program, webrebase1, that accomplishes this.
It's short mainly because I've already developed modules for
reading sequence files, for accessing the Rebase database, for
calculating restriction maps, and for displaying the maps with simple
text graphics. I can just reuse those modules here to accomplish my
task:

#!/usr/bin/perl
# webrebase1 - a web interface to the Rebase modules
# To install in web, make a directory to hold your Perl modules in web space
use lib "/var/www/html/re";
use Restrictionmap;
use Rebase;
use SeqFileIO;
use CGI qw/:standard/;
use strict;
use warnings;
print header,
    start_html('Restriction Maps on the Web'),
    h1('<font color=orange>Restriction Maps on the Web</font>'),
    hr,
    start_multipart_form,
    '<font color=blue>',
    h3("1) Restriction enzyme(s)?  "),
    textfield('enzyme'), p,
    h3("2) Sequence filename (fasta or raw format):  "),
    filefield(-name=>'fileseq',
              -default=>'starting value',
              -size=>50,
              -maxlength=>200,
    ), p,
    strong(em("or")),
    h3("Type sequence:  "),
    textarea(
        -name=>'typedseq',
        -rows=>10,
        -columns=>60,
        -maxlength=>1000,
    ), p,
    h3("3) Make restriction map:"),
    submit, p,
    '</font>',
    hr,
    end_form;
if (param(  )) {
    my $sequence = '';
    # must have exactly one of the two sequence input methods specified
    if(param('typedseq') and param('fileseq')) {
        print "<font color=red>You have given a file AND typed in sequence: do only one!</font>", hr;
        exit;
    }elsif(not param('typedseq') and not param('fileseq')) {
        print "<font color=red>You must give a sequence file OR type in sequence!</font>", hr;
        exit;
    }elsif(param('typedseq')) {
        $sequence = param('typedseq');
    }elsif(param('fileseq')) {
        my $fh = upload('fileseq');
        while (<$fh>) {
            /^\s*>/ and next; # handles fasta file headers
            $sequence .= $_;
        }
    }
    # strip out non-sequence characters
    $sequence =~ s/\s//g;
    $sequence = uc $sequence;
    my $rebase = Rebase->new(
        #omit "bionetfile" attribute to avoid recalculating the DBM file
        dbmfile => 'BIONET',
        mode => '0444',
    );
    my $restrict = Restrictionmap->new(
        enzyme => param('enzyme'),
        rebase => $rebase,
        sequence => $sequence,
        graphictype => 'text',
    );
    print "Your requested enzyme(s): ",em(param('enzyme')),p,
        "<code><pre>\n";
    (my $paramenzyme = param('enzyme')) =~ s/,/ /g;
    foreach my $enzyme (split(" ", $paramenzyme)) {
        print "Locations for $enzyme: ",
            join(' ', $restrict->get_enzyme_map($enzyme)), "\n";
    }
    print "\n\n\n";
    print $restrict->get_graphic,
        "</pre></code>\n",
        hr;
}
print end_html;

7.4.1 Installing webrebase1

Installing
webrebase1
is almost exactly the same as installing the scripts seen earlier in
this chapter, such as cgiex1.cgi. However, because
this program depends on several modules, it is necessary to copy them
into a directory that can be found by the web server.
It's possible to configure a web server to look into
any directory; you may prefer to leave your modules in one place and
give the necessary permissions to your web server to look there. If
your code lives in one place, there won't be any
problem with out-of-sync duplicate copies.

However, there are security problems
associated with letting the world execute programs from your own
directories. Also, if you try out a change that
doesn't work while you're tinkering
with your code, any users on the web site will find the programs
broken as well. So it often makes sense to have a development
area and a separate production area, where you try to ensure that
only working, tested, and secure programs are placed for public
consumption.

On my Red Hat Linux system, I created a directory
/var/www/html/re and copied the modules
Restrictionmap.pm, Rebase.pm,
and SeqFileIO.pm there. On your system and web
server, you may need to check that the existence, ownership, and
permissions on that directory and those module files are suitable for
your web server's configuration.
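A quick way to check existence and readability is a small script run as the same user the web server runs as. The following is only a sketch: check_module_dir is a hypothetical helper name of my own, and the directory and module names are simply the ones used on my system.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of any module in this book): returns a
# list of problem descriptions, or an empty list if every module file
# exists and is readable by the current user.
sub check_module_dir {
    my ($dir, @modules) = @_;
    return ("$dir is not a directory") unless -d $dir;
    my @problems;
    for my $module (@modules) {
        my $path = "$dir/$module";
        if (not -e $path) {
            push @problems, "$path does not exist";
        } elsif (not -r $path) {
            push @problems, "$path is not readable by this user";
        }
    }
    return @problems;
}

# Directory and filenames as used on my system; adjust for yours.
my @problems = check_module_dir('/var/www/html/re',
    qw(Restrictionmap.pm Rebase.pm SeqFileIO.pm));
if (@problems) {
    print "$_\n" for @problems;
} else {
    print "All modules found and readable\n";
}
```

The check only means something when run under the web server's account, because readability is tested for whoever runs the script.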

I then copied my CGI program webrebase1 into my
CGI directory (on my system, /var/www/cgi-bin),
and dealt with the same questions of ownership and file permissions
as detailed in earlier sections of this chapter. (As they say down at
the dealership, your mileage may vary depending on the operating
system and web server that you are using.)

It's possible that the first line that invokes the
Perl application, #!/usr/bin/perl, may have to be
changed to run from your web space. Sometimes a web server is
configured to restrict calling any programs from outside the web
space, and a Perl application must be installed into the web space
(usually with such extra security precautions as taint checks
compiled into the application). If you plan to offer programs to the
world from a computer that has sensitive information or is connected
to other computers that have sensitive information, such precautions
are often desirable. But just to get started, try using the same Perl
application you've been using, and
it's likely to work.
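To give a flavor of what taint checking asks of a program: when Perl runs with the -T switch, data that comes from outside (such as form parameters) is marked as tainted, and text captured by a regular expression match is considered clean. The sketch below shows that validation step for the enzyme field. The helper name untaint_enzyme_list and the set of allowed characters are my own assumptions, not something the Rebase modules require.

```perl
use strict;
use warnings;

# Hypothetical helper: accept an enzyme list only if it looks harmless.
# Assumption: enzyme names contain only letters, digits, and dots, and
# are separated by spaces or commas.
sub untaint_enzyme_list {
    my ($raw) = @_;
    return unless defined $raw;
    if ($raw =~ /\A([A-Za-z0-9., ]{1,200})\z/) {
        return $1;    # text captured by a match is untainted under -T
    }
    return;           # anything else is rejected
}

my $clean = untaint_enzyme_list('EcoRI, HindIII');
print(defined($clean) ? "accepted: $clean\n" : "rejected\n");
```

Without -T the function still works as an ordinary input check; the switch only adds Perl's enforcement that unchecked data is not used in risky operations.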

One consequence of running the Rebase.pm module
is that the DBM file called BIONET is created in
your web server's CGI directory
(/var/www/cgi-bin on my Linux system running an
Apache web server). So, another thing to check is whether you have
enough space for any files your programs may create in your web
space.

7.4.2 Inside webrebase1

webrebase1
first loads the required modules: the ones you've
written and the standard CGI.pm module.

The program has two parts. The first part is always executed, and
displays the form that asks the user to enter the required
information to run the program. The second part executes only when
the program is called with parameters set, which happens after the
user has filled out the form and hit the Submit Query button.
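The two-part shape can be seen in isolation in the following sketch, which is a stripped-down stand-in and not webrebase1 itself. It assumes CGI.pm is installed (recent Perl distributions no longer bundle it), and it uses CGI.pm's object-oriented interface instead of the function interface used in webrebase1, so the page can be built from a query string without a web server. The name render_page is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI;    # assumed to be installed, for example from CPAN

# Hypothetical helper: build the page as one string. The form is always
# included; the results paragraph appears only when parameters are set.
sub render_page {
    my ($q) = @_;
    my $page = $q->header
             . $q->start_html('Two-part CGI sketch')
             . $q->start_form
             . $q->textfield(-name => 'enzyme')
             . $q->submit
             . $q->end_form;
    if ($q->param) {                       # true only after submission
        my $enzyme = $q->param('enzyme');
        $enzyme = '' unless defined $enzyme;
        $page .= $q->p('You asked for: ' . $q->escapeHTML($enzyme));
    }
    return $page . $q->end_html;
}

# Under a web server, CGI->new picks up the submitted form values.
print render_page(CGI->new);
```

Calling render_page(CGI->new('enzyme=EcoRI')) from a test script simulates a submitted form, and render_page(CGI->new('')) simulates the first visit with no parameters.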

Let's look at the first part of the code that
creates the form. Everything in this part of the program is one long
print statement. The list of things to print is
composed mostly of calls to various
CGI.pm
functions. For details on these functions, take a look at the
CGI.pm documentation on www.perldoc.org or by typing
perldoc CGI at a command
prompt.

Here are the CGI functions called but not seen in the earlier
programs:

h1('<font color=orange>Restriction Maps on the Web</font>')

This is a header, as seen previously; however, it includes a color
directive for the font.

start_multipart_form

Makes the part of the form that handles file uploading work correctly.

hr

Draws a horizontal line across the screen.

'<font color=blue>'

Makes everything in the form blue, up to the closing directive
'</font>'.

h3("1) Restriction enzyme(s)? ")

This header, like similar headers in the form, labels the following
textfield so the user knows what information is
requested.

textfield('enzyme')

textfield creates a place for the user to type in
a line of text. The string the user types in is accessible by means
of the parameter named enzyme when the form is
submitted. The user can type in the names of enzymes such as EcoRI
and HindIII to find more information.

p

This starts a new paragraph in the form.

filefield(-name=>'fileseq', -default=>'starting value', -size=>50, -maxlength=>200,)

filefield provides a way to give the name of a
file that contains a sequence. When the form is submitted, that file
is uploaded from the user's computer onto your
computer where it can be used to find the restriction map. As you can
see in Figure 7-5, the user can type in the
pathname of a sequence file or use a mouse to interactively browse
until the desired sequence file is found.

The option name gives the name of the parameter
that has the contents of the file; size and
maxlength are the size of the field displayed
and the maximum length of the filename.

Figure 7-5. Rebase1 in a browser window

strong(em("or"))

This prints the word "or" in some strong fashion
(usually in a bold font, but at the discretion of the
user's web browser) and with some emphasis (usually
in italics).

textarea( -name=>'typedseq', -rows=>10, -columns=>60, -maxlength=>1000,)

textarea provides a box in which the user can type
or use the mouse to cut and paste the sequence directly, as opposed
to giving the name of a file that contains the sequence.

submit

This button collects the values the user has given on the form into
the named parameters and restarts the program by submitting the form
to the web server, this time with the parameters set.
webrebase1 has a section that uses the parameters
to perform a computation, as you will see shortly.

end_form

This closes the start_multipart_form given earlier.

end_html

This CGI function is called at the end of the
webrebase1 program, and it prints the final
required HTML tags for the page before sending it back to the
user's web browser.

When the user hits the Submit button, the parameters are assigned the
values the user has indicated, and the program is called again. This
time, after printing the form, the program gets to the conditional
block beginning:

if (param(  )) {

The param( ) CGI function returns a true value if
parameters have been sent to the program, so at this point the block
is entered. The block does some error checking, extracts the
information from the parameters, computes the restriction map, and
displays the results.

The error checking ensures that exactly one of the two sequence
inputs, the uploaded file or the typed-in sequence, has been
supplied. If both or neither are present, the program prints a
message in red and exits.
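The same decision can be written as a small function that is easy to test away from the web server. This is a sketch only: choose_sequence_source is a hypothetical name, and unlike webrebase1, which tests the parameters for truth, it tests for non-empty strings.

```perl
use strict;
use warnings;

# Hypothetical helper: decide which sequence input to use.
# Returns ($source, undef) on success, where $source is 'typed' or
# 'file', or (undef, $error_message) when the inputs are unusable.
sub choose_sequence_source {
    my ($typed, $file) = @_;
    my $has_typed = defined $typed && length $typed;
    my $has_file  = defined $file  && length $file;
    if ($has_typed && $has_file) {
        return (undef, 'Give a file OR type a sequence, not both');
    }
    if (!$has_typed && !$has_file) {
        return (undef, 'Give a sequence file or type a sequence');
    }
    return ($has_typed ? 'typed' : 'file', undef);
}

my ($source, $error) = choose_sequence_source('ACGT', '');
print defined $error ? "Error: $error\n" : "Using the $source sequence\n";
```

In a CGI program the two arguments would come from param('typedseq') and param('fileseq').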

Assuming the parameters have been set correctly, the program gets the
sequence to be mapped from either the typed-in
textarea field:

}elsif(param('typedseq')) {
    $sequence = param('typedseq');

or from the uploaded file:

}elsif(param('fileseq')) {
    my $fh = upload('fileseq');
    while (<$fh>) {
        /^\s*>/ and next; # handles fasta file headers
        $sequence .= $_;
    }
}

As you can see, the uploaded file is provided as an opened filehandle
to your webrebase1 program. The
while loop assumes that the file is in FASTA
format (see the exercises for this chapter), skips the header, and
collects the sequence.
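Because the loop needs only a filehandle, you can try it out without a browser or an upload. The sketch below wraps the same loop, plus the whitespace stripping and uppercasing done in webrebase1, in a function and feeds it an in-memory filehandle. The function name read_sequence and the sample data are my own inventions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read sequence data from any open filehandle,
# skipping FASTA header lines, then remove whitespace and uppercase it.
sub read_sequence {
    my ($fh) = @_;
    my $sequence = '';
    while (my $line = <$fh>) {
        next if $line =~ /^\s*>/;    # skip FASTA header lines
        $sequence .= $line;
    }
    $sequence =~ s/\s//g;            # removes newlines and other whitespace
    return uc $sequence;
}

# An in-memory filehandle stands in for the uploaded file.
my $fasta = ">sample record\nacgtac\nGGAATTCC\n";
open my $fh, '<', \$fasta or die "Cannot open in-memory file: $!";
print read_sequence($fh), "\n";      # prints ACGTACGGAATTCC
close $fh;
```

A file with no header line (raw format) passes through the same function unchanged apart from the cleanup.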

After cleaning up the sequence by stripping out all whitespace
(including newlines) and making it uppercase, the program then calls
the Rebase and Restrictionmap modules to calculate the
restriction map for the requested enzymes, which are available by
means of the param('enzyme') CGI function call.

Finally, the program is ready to display the results. As you can see
in Figure 7-6, the results appear after the form.
First, webrebase1 prints out the names of the
requested enzymes:

print "Your requested enzyme(s): ",em(param('enzyme')),p,

Figure 7-6. Results of the Rebase1 query

Recall that the trailing ,p, is a CGI directive to
start a new paragraph; the em( )
CGI function asks the user's web browser to emphasize
the text, probably by italics.
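One caution about echoing what the user typed: the h1 call at the top of the program shows that these CGI functions pass embedded markup such as font tags straight through to the browser, so text typed into the enzyme field would be passed through as well. A careful program converts the HTML special characters first. The following hand-written helper is a minimal sketch of that idea; escape_html is a hypothetical name, and a production program would more likely rely on a tested library routine.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the characters that are special in HTML
# with their entity equivalents so the browser shows them as text.
sub escape_html {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
    );
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

print escape_html('<b>EcoRI</b> & "HindIII"'), "\n";
# prints &lt;b&gt;EcoRI&lt;/b&gt; &amp; &quot;HindIII&quot;
```

With such a helper, the call would become em(escape_html(scalar param('enzyme'))); the scalar keeps param from returning a list.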

The next line is not part of CGI but is an HTML directive that
ensures the following lines are printed in a fixed-width font. Every
character will thus take the same amount of horizontal space, and all
lines and spaces will line up just as they do when
they're printed to your screen. Without this
directive, the web browser could use a non-fixed font, and the map
would display improperly:

"<code><pre>\n";

Of course, as you now know about HTML tags, they are almost always
required to appear in pairs, so this directive has a closing tag
after the map is displayed:

"</pre></code>\n",

The restriction map for each enzyme, by which I mean the simple list
of locations for each enzyme in the sequence, is displayed by the
following code:

(my $paramenzyme = param('enzyme')) =~ s/,/ /g;
foreach my $enzyme (split(" ", $paramenzyme)) {
    print "Locations for $enzyme: ",
        join(' ', $restrict->get_enzyme_map($enzyme)), "\n";
}
print "\n\n\n";

First the space- or comma-separated list of enzymes is collected from
the parameter enzyme, and any commas are replaced with spaces.
The enzyme names are then split (on whitespace) into a list,
and for each such enzyme, the Restrictionmap
method get_enzyme_map is called to display the
list of locations.
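The comma-and-whitespace handling is easy to try on its own. Here is a minimal sketch that isolates it in a function; parse_enzyme_list is a hypothetical name, and the enzyme names are just sample input.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a space- or comma-separated string of
# enzyme names into a list of names.
sub parse_enzyme_list {
    my ($raw) = @_;
    return () unless defined $raw;
    (my $spaced = $raw) =~ s/,/ /g;    # commas become spaces
    return split ' ', $spaced;         # split ' ' also skips leading whitespace
}

print join('|', parse_enzyme_list(' EcoRI,HindIII  BamHI ')), "\n";
# prints EcoRI|HindIII|BamHI
```

Splitting on the single-space string ' ' (as webrebase1 does with " ") is Perl's special case that splits on any run of whitespace and ignores leading whitespace, so stray spaces around the names do no harm.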

Finally, the graphic map is displayed. This is accomplished, as
before, by a simple call to the Restrictionmap
method get_graphic, which returns a simple text
version of the graphic (because graphictype
=> 'text' was specified when the
Restrictionmap object $restrict
was created):

print $restrict->get_graphic,
