Build Your Own Database-Driven Website Using PHP MySQL [Electronic resources]

Kevin Yank

نسخه متنی -صفحه : 190/ 47

Rule of Thumb: Keep Things Separate

As your knowledge of database-driven Websites continues to grow, you may decide that a personal joke list isn't enough. In fact, you might begin to receive more submitted jokes than you have original jokes of your own. Let's say you decide to launch a Website where people from all over the world can share jokes with each other. You've heard of the Internet Movie Database (IMDB)? You decide to open the Internet Joke Database (IJDB)! To add the author's name and email address to each joke certainly makes a lot of sense, but the way we did it above leads to several potential problems:

What if a frequent contributor to your site named Joan Smith changed her email address? She might begin to submit new jokes using the new address, but all the old jokes would still have the old address attached to them. Looking at your database, you might simply think there were two different people named Joan Smith who submit jokes. If she were especially thoughtful, she might inform you of the change of address, and you might try to update all the old jokes with the new address, but if you missed just one joke, your database would still have incorrect information stored in it. Database design experts refer to this sort of problem as an update anomaly.

It would be natural for you to rely on your database to provide a list of all the people who've ever submitted jokes to your site. In fact, you could easily obtain a mailing list by using the following query:

mysql>SELECT DISTINCT AuthorName, AuthorEMail
    ->FROM Jokes;

The word DISTINCT in the above query tells MySQL not to output duplicate result rows. For example, if Joan Smith submitted 20 jokes to your site, her name and email address would appear 20 times in the list, instead of just once, if you failed to use the DISTINCT option.

If for some reason you decided to remove all the jokes that a particular author had submitted to your site, you'd remove any record of this person from the database in the process, and you'd no longer be able to email him or her with information about your site! As your mailing list might be a major source of income for your site, you wouldn't want to go throwing away an author's email address just because you didn't like the jokes that person had submitted to your site. Database design experts call this a delete anomaly.

You have no guarantee that Joan Smith would not enter her name as "Joan Smith" one day, as "J. Smith" the next, and as "Smith, Joan" on yet another occasion. This would make keeping track of a particular author exceedingly difficult, especially if Joan Smith had several email addresses she liked to use, too.

These problems—and more—can be dealt with very quickly. Instead of storing the information for the authors in the Jokes table, let's create an entirely new table for our list of authors. Since we used a column called ID in the Jokes table to identify each of our jokes with a unique number, we'll use an identically-named column in our new table to identify our authors. We can then use those "author ID's" in our Jokes table to associate authors with their jokes. The complete database layout is shown in "The AID field associates each row in Jokes with a row in Authors".

The AID field associates each row in Jokes with a row in Authors

What the above two tables show are three jokes and two authors. The AID column (short for "Author ID") of the Jokes table provides a relationship between the two tables, indicating that Kevin Yank submitted jokes 1 and 2 and Joan Smith submitted joke 3. Notice also that, since each author now only appears once in the database, and appears independently of the jokes he or she has submitted, we've avoided all the problems outlined above.

The most important characteristic of this database design, however, is that, since we're storing information about two types of "things" (jokes and authors), it's most appropriate to have two tables. This is a rule of thumb that you should always keep in mind when designing a database: each type of entity (or "thing") that you want to be able to store information about should be given its own table.

To set up the above database from scratch is fairly simple (involving just two CREATE TABLE queries), but since we'd like to make these changes in a non-destructive manner (i.e. without losing any of our precious knock-knock jokes), we'll use the ALTER command again. First, we get rid of the author-related columns in the Jokes table:

mysql>ALTER TABLE Jokes DROP COLUMN AuthorName;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql>ALTER TABLE Jokes DROP COLUMN AuthorEMail;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

Now we create our new table:

mysql>CREATE TABLE Authors (
    ->  ID INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ->  Name VARCHAR(255),
    ->  EMail VARCHAR(255)
    ->);

Finally, we add the AID column to our Jokes table:

mysql>ALTER TABLE Jokes ADD COLUMN AID INT;

If you prefer, the CREATE TABLE commands that will create the two tables from scratch are provided in 2tables.sql in the code archive. All that's left is to add some authors to the new table, and assign authors to all the existing jokes in the database by filling in the AID column^[1]. Go ahead and do this now if you like. This should give you some practice with INSERT and UPDATE queries.

^[1]For now you'll have to do this manually. But don't worry, in "A Content Management System"we'll see how PHP can insert entries with the correct IDs automatically to reflect the relationships between them.