Mastering Perl for Bioinformatics [Electronic resources] نسخه متنی

6.4 Structured Query Language

The Structured Query Language (SQL,
pronounced "s q l" or
"see quel") can be thought of as
the working definition of a relational database. It provides the
bioinformatics programmer with the wherewithal to create, populate,
interrelate, query, and update a relational database on a computer
system. Your DBMS
comes with its own implementation of SQL, which will have all the
basic commands plus some, or maybe all, the less-used commands and
features in the standard definition of the language, perhaps even
some special extensions to the standard.

SQL dates back to the 1970s when it was developed at IBM. The most
widely used versions of SQL are based on the standard published in
1992 and commonly called
SQL2. A newer standard
called SQL3 is
available and supports emerging database functionality such as
object-oriented and object-relational data models.
MySQL is based on a
subset of the most commonly used parts of SQL2, with the goal of
providing a very fast implementation of the key components of SQL.
Some features of SQL3 are also being added.

SQL is actually a fairly simple language to learn. Most people find
that getting an account established on their computer, reading
through a quick tutorial, and then having example code to copy and
modify with the SQL documentation close at hand, is enough to get
started writing useful SQL code.

I'm not going to present an extensive SQL tutorial
here, for three reasons. First, such tutorials are easily and widely
available. Second, each DBMS has its own version of SQL, so the DBMS
documentation (such as that which comes with MySQL, for example) is
necessary and available to you anyway. Third, SQL is such a basically
simple language that it's quite useful to learn the
basics of it by simply seeing a few examples. That's
the approach I'll take.

If you are new to SQL, the best way to get familiar with it is by
using the interactive
command-line interface to try out
different commands. The following section demonstrates my Linux
system running MySQL.

6.4.1 SQL Commands

First,
I enter the interactive mysql program, providing
my MySQL username ("tisdall") and
interactively entering my MySQL account password:

[tisdall@coltrane tisdall]$ mysql -u tisdall -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2 to server version: 3.23.41
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

Next, I ask for a list of all the databases that are defined in my
MySQL DBMS:

mysql> show databases;
+----------+
| Database |
+----------+
| caudyfly |
| dicty    |
| gadfly   |
| master   |
| mysql    |
| poetry   |
| yeast    |
+----------+
7 rows in set (0.15 sec)

6.4.1.1 Creating a database

I want to create a database
called "homologs". First, I
create it, then I check that it's there, and finally
I make it the active database with use homologs;:

mysql> create database homologs;  
Query OK, 1 row affected (0.00 sec)
mysql> show databases;
+----------+
| Database |
+----------+
| caudyfly |
| dicty    |
| gadfly   |
| homologs |
| master   |
| mysql    |
| poetry   |
| yeast    |
+----------+
8 rows in set (0.01 sec)
mysql> use homologs;
Database changed

6.4.1.2 Creating tables

The next commands create the two
tables for the homologs database. Initially they
are empty. I ask to see the fields that have been defined with
show fields
(show full columns also works):

mysql> create table genename ( name char(20), id int, date date );
Query OK, 0 rows affected (0.00 sec)
mysql> create table organism ( organism char(20), gene char(20) );
Query OK, 0 rows affected (0.00 sec)
mysql> show tables;
+--------------------+
| Tables_in_homologs |
+--------------------+
| genename           |
| organism           |
+--------------------+
2 rows in set (0.00 sec)
mysql> show fields from genename;
+-------+----------+------+-----+---------+-------+
| Field | Type     | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| name  | char(20) | YES  |     | NULL    |       |
| id    | int(11)  | YES  |     | NULL    |       |
| date  | date     | YES  |     | NULL    |       |
+-------+----------+------+-----+---------+-------+
3 rows in set (0.00 sec)
mysql> show fields from organism;
+----------+----------+------+-----+---------+-------+
| Field    | Type     | Null | Key | Default | Extra |
+----------+----------+------+-----+---------+-------+
| organism | char(20) | YES  |     | NULL    |       |
| gene     | char(20) | YES  |     | NULL    |       |
+----------+----------+------+-----+---------+-------+
2 rows in set (0.00 sec)
mysql>

6.4.1.3 Populating the tables

Now,
I've got a new database with two tables defined, and
I'm ready to populate the tables. First, I verify
that the genename table is empty by making a
select command; then I issue three insert commands, one for each row
that I want to insert in the table. After inserting the rows, I
verify that the genename table now has the desired
three rows by means of a select command:

mysql> select * from genename;
Empty set (0.00 sec)
mysql> insert into genename (name,id,date) values ('aging',118,'1984-07-13');
Query OK, 1 row affected (0.00 sec)
mysql> insert into genename (name,id,date) values ('wrinkle',9223,'1987-08-15');
Query OK, 1 row affected (0.00 sec)
mysql> insert into genename (name,id,date) values ('hairy',273,'1990-09-30');
Query OK, 1 row affected (0.01 sec)
mysql> select * from genename;
+---------+------+------------+
| name    | id   | date       |
+---------+------+------------+
| aging   |  118 | 1984-07-13 |
| wrinkle | 9223 | 1987-08-15 |
| hairy   |  273 | 1990-09-30 |
+---------+------+------------+
3 rows in set (0.00 sec)

Now, I repeat the same process to populate the other
organism table:

mysql> show fields from organism;
+----------+----------+------+-----+---------+-------+
| Field    | Type     | Null | Key | Default | Extra |
+----------+----------+------+-----+---------+-------+
| organism | char(20) | YES  |     | NULL    |       |
| gene     | char(20) | YES  |     | NULL    |       |
+----------+----------+------+-----+---------+-------+
2 rows in set (0.00 sec)
mysql> insert into organism ( organism, gene ) values ( 'human', 118 );
Query OK, 1 row affected (0.00 sec)
mysql> insert into organism ( organism, gene ) values ( 'human', 9223 );
Query OK, 1 row affected (0.00 sec)
mysql> insert into organism ( organism, gene ) values ( 'mouse', 9223 );
Query OK, 1 row affected (0.01 sec)
mysql> insert into organism ( organism, gene ) values ( 'mouse', 273 );
Query OK, 1 row affected (0.00 sec)
mysql> insert into organism ( organism, gene ) values ( 'worm', 118 );
Query OK, 1 row affected (0.00 sec)
mysql> select * from organism;
+----------+------+
| organism | gene |
+----------+------+
| human    | 118  |
| human    | 9223 |
| mouse    | 9223 |
| mouse    | 273  |
| worm     | 118  |
+----------+------+
5 rows in set (0.00 sec)

Let's find out which organisms have a homolog of the
wrinkle gene. My query has two stages. First, I get the ID of the
gene and search for it in the ORGANISM table. Then I write it as a
single SQL statement:

mysql> select id from genename where name = 'wrinkle';
+------+
| id   |
+------+
| 9223 |
+------+
1 row in set (0.00 sec)
mysql> select organism from organism where gene = 9223;
+----------+
| organism |
+----------+
| human    |
| mouse    |
+----------+
2 rows in set (0.00 sec)
mysql> select organism from organism, genename
-> where genename.name = 'wrinkle' and genename.id = organism.gene;
+----------+
| organism |
+----------+
| human    |
| mouse    |
+----------+
2 rows in set (0.00 sec)
mysql> 
mysql>

Notice how the last statement asks the same question as the two
preceding statements combined.

Mastering Perl for Bioinformatics [Electronic resources] نسخه متنی

فارسی

کردی

العربیه

اردو

Türkçe

Русский

English

Français

کانال فیلم من

تبیان من

فایلهای من

کتابخانه من

پنل پیامکی

وبلاگ من

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی