Mastering Perl for Bioinformatics [Electronic resources] (text version)


7.3 The Common Gateway Interface



The preceding sections
of this chapter have presented a brief overview of the design of the
Web, the principal components of the programming environment such as
HTTP, HTML, and URLs, and of the essential request-response nature of
web interactions between web browsers and web servers. Now,
it's time to look at CGI and a specific Perl module,
CGI.pm, that is widely used to create interactive
web pages on servers.


CGI, the Common Gateway Interface, is an interface between a web
server and some other program that generates web content such as HTML
documents or images.


A web browser may request from the web server the output from a CGI
program. In this case, the web server finds the program or script,
runs the program, and sends the output of the program back to the web
browser. The output of the program may be HTML, just as it may be
found in a static file, but it is created by the CGI script
dynamically, so it may be different each time; for instance, it may
include the time of day in its display.


In other words, a CGI script is just a program that produces web
content that can be displayed in a web browser. It also can read
information passed to it by the web server, usually parameters filled
out by the user in a form displayed on a web browser that asks for
the specifics of a query. For example, the parameters might be the
name of a sequence file and the name of a restriction enzyme to map
in the sequence. The CGI program takes the parameters, runs, and
outputs a dynamically created web page to be returned to the
user's web browser.


A CGI program can be written in just about any language, but the most
common for CGI programs on the Web is Perl. Perl has a very nice
module called CGI.pm that
eases the task of writing CGI scripts; it's a
popular way to create dynamic web sites with a minimum of bother.
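
As a rough illustration of the parameter passing described above, here
is a minimal sketch of a CGI script that reads two parameters and echoes
them back in a web page. The field names seqfile and enzyme, the
subroutine name build_reply, and the page text are assumptions made up
for this example; the sketch also assumes that CGI.pm is installed on
your system.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI;

# Build the HTML body from the parameters the web server passed in.
# 'seqfile' and 'enzyme' are hypothetical field names for this sketch.
sub build_reply {
    my ($query) = @_;

    my $seqfile = $query->param('seqfile');
    my $enzyme  = $query->param('enzyme');
    $seqfile = 'none given' unless defined $seqfile;
    $enzyme  = 'none given' unless defined $enzyme;

    # Escape user-supplied text before placing it inside HTML.
    my $safe_seqfile = $query->escapeHTML($seqfile);
    my $safe_enzyme  = $query->escapeHTML($enzyme);

    return "<html>\n<head><title>Query received</title></head>\n<body>\n"
         . "<p>Sequence file: $safe_seqfile</p>\n"
         . "<p>Restriction enzyme: $safe_enzyme</p>\n"
         . "</body>\n</html>\n";
}

# Read the request, then print the header followed by the page.
my $query = CGI->new;
print $query->header, build_reply($query);
```

A real program would go on to open the sequence file and map the enzyme;
here the parameters are only displayed, which is enough to show how they
reach the script.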



7.3.1 Writing a CGI Program



So,
how do you write a CGI Perl program? Basically, you write a Perl
program that includes the line:


use CGI;


You then use the CGI.pm methods so that your Perl
program outputs the code for the web page you want to return. After
that, it's simply a matter of placing your new CGI
script in the proper place in the web server's
directory structure, namely in a directory that the web server
knows is supposed to contain CGI scripts. Finally, you enter the
URL of the CGI script in a web browser. The web browser
sends the request to the web server, which executes the CGI script,
collects its output, and returns it to your web browser to be
displayed as a web page (or image, sound, or whatever).


You can actually write a Perl program that just prints out HTML code
(without ever using the CGI.pm module) and install
that as a CGI program. For instance, you can take the HTML page shown
earlier and create a CGI Perl script that dynamically outputs that
page. To prove it's dynamic, I'll
add a little code that includes the time of day:


#!/usr/bin/perl
use strict;
use warnings;
my $time = localtime;
print "Content-type: text/html\n\n";
print <<EndOfHTML;
<html>
<head>
<title>Double stranded RNA can regulate genes</title>
</head>
<body>
<h2>Double stranded RNA can regulate genes</h2>
<p>A recent article in <b>Nature</b> describes the important
discovery of <i>RNA interference</i>, the action of snippets
of double-stranded RNA in suppressing gene expression.
</p>
<p>
The discovery has provided a powerful new tool in investigating
gene function, and has raised many questions about the
nature of gene regulation in a wide variety of organisms.
</p>
<p>
This page was created $time.
</p>
</body>
</html>
EndOfHTML


Notice that the program just prints out the HTML code. It also prints
a header line with the statement print "Content-type: text/html\n\n";
before printing the HTML code as the body of the response. Notice the
two \n's in that header line; these print a blank line between the
header and the body of the response, as described earlier in this
chapter.
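
To make the role of that blank line concrete, the following sketch
builds a response from a header and a body and then splits it apart
again at the first blank line, roughly as a web server must do. The
subroutine names cgi_response and split_response are invented for this
illustration and are not part of any module.

```perl
use strict;
use warnings;

# Join a header line and a body into one CGI response.
# The first \n ends the header line; the second produces the blank
# line that separates the header from the body.
sub cgi_response {
    my ($content_type, $body) = @_;
    return "Content-type: $content_type\n\n" . $body;
}

# Split a response into its header block and its body at the
# first blank line.
sub split_response {
    my ($response) = @_;
    my ($headers, $body) = split /\n\n/, $response, 2;
    return ($headers, $body);
}

my $response = cgi_response('text/html',
    "<html><body><p>Hello</p></body></html>\n");
my ($headers, $body) = split_response($response);
print "Headers: $headers\n";
print "Body:    $body";
```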


Also, notice a new last paragraph that reports the time.



7.3.2 Installing a CGI Program



After
writing the program, it is necessary to install it in the
cgi script directory of your web server. Because
of the multiplicity of web servers and operating systems, it is not
possible for me to be comprehensive on this point. On my Linux
system, using the Apache web server, I simply became superuser
(root), copied the script (called cgiex1) into the
directory /var/www/cgi-bin, and then typed:


chmod 755 /var/www/cgi-bin/cgiex1


If you're working on Mac OS X, the procedure is
similar. If you're on a Microsoft Windows machine,
the details are a little different; consult the documentation for
your web server to see how to install a CGI script in the appropriate
place.
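
If you prefer to do the copy and the chmod from Perl, the two steps can
be sketched as follows. This is only an illustration under the same
assumptions as my setup above (a Unix-like system and a CGI directory
you are allowed to write to); the subroutine name install_cgi is
invented for this example.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Copy qw(copy);
use File::Spec;

# Copy a script into a CGI directory and make it executable,
# the equivalent of a copy followed by "chmod 755".
sub install_cgi {
    my ($script, $cgi_dir) = @_;
    my $target = File::Spec->catfile($cgi_dir, basename($script));

    copy($script, $target)
        or die "Cannot copy $script to $target: $!\n";
    chmod(0755, $target)
        or die "Cannot change permissions on $target: $!\n";

    return $target;
}

# Example: perl install_cgi.pl cgiex1 /var/www/cgi-bin
if (@ARGV == 2) {
    my $installed = install_cgi(@ARGV);
    print "Installed $installed\n";
}
```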


Once I'd installed the script, I simply entered the
following URL into my web browser. Notice that the URL gives the
hostname as localhost, which means the web server
is on the same computer on which I'm using the web
browser.


http://localhost/cgi-bin/cgiex1


I hit the Enter or Return key, and the web server returned the web
page that's displayed in my web browser (see Figure 7-2).



Figure 7-2. Web page for cgiex1




Notice that it's exactly the same as the previous
version that just read a file, but this time, the current date on the
web server is also being reported. So each time you run this program,
you'll get a different output (as regards the date,
that is). This qualifies the program as dynamic.


I'll go into more detail about CGI installation in
the next section.



7.3.3 Using the CGI.pm Module



The
following program was written using CGI.pm; it has
the same output as the example in the previous section. Notice how
almost the entire contents of this CGI script are a Perl
print function with a list of arguments, ending
with the argument end_html. The various arguments
to print are either CGI.pm
functions or text strings:


#!/usr/bin/perl
use strict;
use warnings;
use CGI qw/:standard/;
my $time = localtime;
print
header,
start_html('Double stranded RNA can regulate genes'),
h2('Double stranded RNA can regulate genes'),
p,
"A recent article in <b>Nature</b> describes the important
discovery of <i>RNA interference</i>, the action of snippets
of double-stranded RNA in suppressing gene expression.",
p,
"The discovery has provided a powerful new tool in investigating
gene function, and has raised many questions about the
nature of gene regulation in a wide variety of organisms.",
p,
"This page was created $time.",
p,
end_html;


This program uses the most common routines defined in the
CGI.pm module, as imported into the
program's namespace by the directive
use CGI qw/:standard/;.


The function header returns the header information
discussed earlier in this chapter; it takes the document type as an
optional argument and assumes the type is text/html by default. The
function start_html starts the HTML and gives the
title of the document (which most web browsers display in their
title bar above the document). The functions h1,
h2, and so forth produce the different levels of HTML
headings in the document structure. The function p
starts a new paragraph of text. Finally, the function
end_html closes the HTML document.


Here is the body of the document the web browser receives for display
from the CGI program cgiex1.cgi:


<html>
<head>
<title>Double stranded RNA can regulate genes</title>
</head>
<body>
<h2>Double stranded RNA can regulate genes</h2>
<p>A recent article in <b>Nature</b> describes the important
discovery of <i>RNA interference</i>, the action of snippets
of double-stranded RNA in suppressing gene expression.
</p>
<p>
The discovery has provided a powerful new tool in investigating
gene function, and has raised many questions about the
nature of gene regulation in a wide variety of organisms.
</p>
<p>
This page was created Tue Apr 15 09:42:49 2003.
</p>
</body>
</html>


This simple script is not much less complicated than the previous
version, cgiex1, which didn't use
CGI.pm but simply output the HTML code. However,
as you write more complicated web pages, with forms to be filled in
by the user, choices to click on, and a button to push to submit a
request, you'll see that using
CGI.pm can significantly ease your programming
work.
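
To give a feel for what such a page might look like, here is a minimal
sketch of a query form written with CGI.pm functions. The field names
seqfile and enzyme, the list of enzymes, the page title, and the
subroutine name query_form are all assumptions chosen for this
illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI qw/:standard/;

# Return the HTML for a small query form: a text field, a menu of
# choices, and a submit button. All names here are hypothetical.
sub query_form {
    return join "\n",
        start_form(),
        p('Sequence file:',
          textfield(-name => 'seqfile', -size => 40)),
        p('Restriction enzyme:',
          popup_menu(-name   => 'enzyme',
                     -values => ['EcoRI', 'HindIII', 'BamHI'])),
        p(submit(-name => 'submit', -value => 'Map')),
        end_form();
}

print
    header,
    start_html('Restriction map query'),
    h2('Restriction map query'),
    query_form(),
    end_html;
```

Writing the same form by hand means typing every input and select tag
yourself, which is where CGI.pm starts to save effort.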



7.3.4 Testing a CGI Program



Let's
assume the CGI scripts are where they should be, ownership and
permissions have been assigned, and now your web server can find and
attempt to execute them. So how do you test a CGI web program?


First, check the basic syntax by
running:


perl -c cgiex1.cgi


and, hopefully, getting the message:


cgiex1.cgi syntax OK


If not, you can save a bit of trouble by at least fixing the syntax
of your program before installing it in the web space, where you have
to test it with the web browser and web logs and so forth, as
I'll demonstrate in a moment.


Copy your CGI program cgiex1.cgi into your CGI
directory (on my system, it's
/var/www/cgi-bin). I did this as the user root.
Then, still as root, I made the program executable by typing:


chmod 755 /var/www/cgi-bin/cgiex1.cgi


Let's try the program out. Start up a web browser
and type in the URL http://localhost/cgi-bin/cgiex1.cgi and hit
the Enter key. Figure 7-3 shows what it looks like.



Figure 7-3. The results of a successful CGI program




It worked! But what do you do if it doesn't? Even
if you check the syntax first, the program may have some other
problem that causes it to fail. For this demonstration, I made
another version of the program with a missing semicolon near the
beginning of the program, called cgiex1ouch.cgi.
When I ran it, I saw something like what's in Figure 7-4.



Figure 7-4. When things go wrong




The best way to proceed is to check the
error logs for the web server to see if they give any useful hints as
to why the program failed.


Again, this will be different on different systems. On Linux, Unix,
and Mac OS X, some variation of the following will work. I determined
where the error logs for the web server are kept on my system, which
is in the directory /etc/httpd/logs, and the
most recent error log there is called error_log. I
opened a new command window and typed the following command:


tail -n +1 -f /etc/httpd/logs/error_log


This prints out the error file, and then waits at the end. Whenever
anything new is printed to the error file, it prints it out too. So,
by hitting the Return key a few times to make space after what went
before, and then trying to run the program again from the web
browser, I can see clearly what new error messages resulted.


Here's what I saw in this example:


syntax error at /var/www/cgi-bin/cgiex1ouch.cgi line 6, near "use warnings
use CGI "
Execution of /var/www/cgi-bin/cgiex1ouch.cgi aborted due to compilation errors.
[Tue Apr 15 21:23:10 2003] [error] [client 127.0.0.1] Premature end of script headers:
/var/www/cgi-bin/cgiex1ouch.cgi


Sure enough, I'd removed the semicolon at the end of
the use warnings statement.
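

The "Premature end of script headers" message means the script ended
before the web server received a complete header block. One way to
catch this before using the browser is to run the script from the
command line and inspect what it prints. The following is a rough
sketch of such a check; the subroutine name has_cgi_header is invented
for this example, and the test it applies is deliberately simple.

```perl
use strict;
use warnings;

# Return 1 if the text starts with a header block (one or more
# "Name: value" lines including Content-type) followed by a blank
# line; return 0 otherwise.
sub has_cgi_header {
    my ($output) = @_;
    return 0 unless defined $output;

    # The header block is everything before the first blank line.
    return 0 unless $output =~ /\A(.+?)\r?\n\r?\n/s;
    my @lines = split /\r?\n/, $1;

    for my $line (@lines) {
        return 0 unless $line =~ /^[\w-]+:\s*\S/;
    }
    my $content_type_lines = grep { /^Content-type:/i } @lines;
    return $content_type_lines ? 1 : 0;
}

# Example: perl checkheader.pl /var/www/cgi-bin/cgiex1.cgi
if (@ARGV) {
    my $output = `$^X $ARGV[0] 2>&1`;
    print has_cgi_header($output)
        ? "Header looks complete\n"
        : "No complete header found\n";
}
```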


For most web programming jobs, this is all you'll
need, because the error log will show you the error output of the
program. If it's a difficult problem, you can even
put extra print statements in your CGI
script, in the standard way they are used to debug a misbehaving
program. For instance, you might see if a program ever gets to a
certain line by placing directly after that line the statement:


print STDERR "Got to here!\n";


This message will appear in the error logs (if your program gets to
that point before it dies).
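

If you find yourself adding many such statements, a small helper can
label each message with the file and line that produced it, which makes
the messages easier to pick out of a busy error log. This is only a
sketch; the subroutine name debug, the message format, and the variable
being inspected are my own choices for illustration.

```perl
use strict;
use warnings;

# Print a debugging message to STDERR, tagged with the file and line
# of the caller so it is easy to locate in the error log.
sub debug {
    my ($message) = @_;
    my (undef, $file, $line) = caller;
    print STDERR "[debug] $file line $line: $message\n";
    return;
}

my $enzyme = 'EcoRI';    # hypothetical value being inspected
debug("Got to here! enzyme=$enzyme");
```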


