"http://www.w3.org/TR/l11/DTD/l11.dtd">
A table is an unordered set of records, consisting of rows and columns. Each column has a defined data type, and each row represents an entry in the table. Figure 7.9 shows an example of a table with
n rows and
m columns. The
sales_person column with a VARCHAR data type is the first column in the table, followed by the
region column with a CHAR data type. The
year column is the
m th column in the table and has an INTEGER data type.
Tables in DB2 can be classified as illustrated in Figure 7.10. You will learn more about each of these tables in the next sections.
DB2 automatically creates system catalog tables when a database is created. They always reside in the SYSCATSPACE table space. System catalog tables contain information about all the database objects in the database. For example, when you create a table space, its information will be loaded into one or more system catalog tables. When this table space is referenced during a later operation, DB2 checks the corresponding system catalog tables to see whether the table space exists and whether the operation is allowed. Without the system catalog tables, DB2 will not be able to function.
Some of the information contained in system catalog tables includes the following:
Definitions of all database objects
Column data types of tables and views
Defined constraints
Object privileges
Object dependencies
System catalog tables or views use the SYSIBM, SYSCAT, or SYSSTAT schemas.
The
SYSIBM schema is used for the base system catalog tables.
The
SYSCAT schema is used for views defined on the system catalog tables. DB2 users should normally query the SYSCAT views rather than the SYSIBM tables for information.
The
SYSSTAT schema is used for views containing information about database statistics and is also based on the system catalog tables.
Although you cannot update the tables and views residing under the SYSIBM and SYSCAT schemas, you can update the views under the SYSSTAT schema. Updating these views can sometimes influence the DB2 optimizer to choose a specific access path.
Refer to Appendix D, Using the DB2 System Catalog Tables, for details about the system catalog tables.
User tables are used to store a user's data. A user can create, alter, drop, and manipulate user tables.
To create a user table, use the
CREATE TABLE statement. You can specify the following:
The name of the table
The columns of the table and their data types
The table spaces where you want the table, index, and long objects to be stored within the database
The constraints you want DB2 to build and maintain on the table, such as referential constraints and unique constraints
The following example illustrates the creation of the table
myemployees with four columns.
CREATE TABLE myemployees (
empID INT NOT NULL PRIMARY KEY,
empname VARCHAR(30) NOT NULL,
mngrID INT NOT NULL,
history CLOB)
In which table space would the table
myemployees be created? In cases where a table space is not specified, as in this example, follow the flow chart shown in Figure 7.11 to determine what table space would be used.
This next example uses the same
CREATE TABLE situation, but it indicates the table spaces to be used for the table data, index, and long objects.
CREATE TABLE myemployees (
empID INT NOT NULL PRIMARY KEY,
empname VARCHAR(30) NOT NULL,
mngrID INT NOT NULL,
history CLOB)
IN datadms
INDEX IN indexdms
LONG IN largedms
Use the
IN clause to specify the table space where the table data will reside. Use the
INDEX IN clause to specify where all indexes for the table will reside. Use the
LONG IN clause to indicate where the LOB, LONG VARCHAR, or LONG VARGRAPHIC objects will reside.
large table space.
large table spaces were known as long table spaces. Though the syntax of the
CREATE TABLESPACE statement uses the
LARGE clause, the syntax of the
CREATE TABLE statement still uses
LONG .
Figure 7.12 shows the command used to create the table
myemployees and also the corresponding table space commands to create the required table spaces. Note that the third statement creates a large table space.
You can also create a table based on the definition of another table, for example:
CREATE TABLE clone LIKE myemployees
The table
clone will have the same definition as the table
myemployees , however, other objects like constraints, indexes, or triggers associated to the table are not copied. Table data is not copied either.
Another alternative is to create the table structure based on the result of a query, as shown next:
CREATE TABLE clone2 AS (SELECT * FROM myemployees) DEFINITION ONLY
The
DEFINITION ONLY clause is required so that only the structure of the table is copied; otherwise, you would be creating a materialized query table (MQT), which is described in section 7.8.10, Materialized Query Tables and Summary Tables.
Once you have created a table, you cannot change the column names or data types; however, you are allowed to increase the length of VARCHAR columns or add new columns to the end of the table. You can do this with the
ALTER TABLE statement. For example, to add the column
address to the table
myemployees , use this statement:
ALTER TABLE myemployees ADD COLUMN address CHAR(45)
You cannot remove a column from a table using the
ALTER TABLE statement. If you want to remove a column from a table, you have two choices:
Use a view to hide the column you want removed. (Views are discussed in section 7.11.)
Drop the table and recreate it.
To drop a table and all its contents, use the
DROP TABLE statement, for example:
DROP TABLE myemployees
In the
CREATE TABLE statement, you can use the
DEFAULT clause for a given column to provide a default value for the column. This means that when you use an
INSERT statement to insert a row that does not provide a value for the column, the default value specified in the
DEFAULT clause will be used. For example, let's say you create the table
company with this statement:
CREATE TABLE company (
companyID INTEGER,
companyName VARCHAR(30),
city VARCHAR(20) DEFAULT 'TORONTO'
)
Inserting a record with either of the following two statements provides the same result.
(1)
INSERT INTO company (companyID, companyName, city)
VALUES ( 111 , 'cityOne' , DEFAULT)
(2)
INSERT INTO company (companyID, companyName)
VALUES ( 111, 'cityOne' )
The following row would be inserted.
COMPANYID COMPANYNAME CITY ----------- ------------------------------ -------------------- 111 cityOne TORONTO
In the first
INSERT statement, the
DEFAULT keyword is used. In the second
INSERT statement, the third column (
city ) is not included in the statement. In both cases, this means that the default value as defined in the table is inserted for that column.
What about the table columns that do not have a
DEFAULT clause? What is inserted when test columns are omitted from the
INSERT statement? In such scenarios, DB2 will insert a NULL, assuming the column accepts NULL values. If the column does not accept NULL values, you will receive an error. (We describe NULLs in the next section.) For example, the result of this statement:
INSERT INTO company (city)
VALUES ('ATLANTA')
is:
COMPANYID COMPANYNAME CITY ----------- ------------------------------ -------------------- - - ATLANTA
The dash (
- ) represents a NULL value.
The columns of a table can also be defined with the
DEFAULT keyword just by itself. In such a scenario, DB2 will use default values depending on the data type of the column. Typically, DB2 chooses a zero for numeric data types and a blank for character strings. For example, let's recreate the table
company as follows:
CREATE TABLE company (
companyID INTEGER DEFAULT,
companyName VARCHAR(30) DEFAULT,
city VARCHAR(20) DEFAULT 'TORONTO'
)
Issuing the following statement:
INSERT INTO company (city)
VALUES (DEFAULT)
returns:
COMPANYID COMPANYNAME CITY ----------- ------------------------------ ------------ 0 TORONTO
This example shows that because the columns
companyID and
companyName are both defined with the
DEFAULT clause just by itself, DB2 chose a default value of zero for column
companyID , which is an INTEGER, and a blank for column
companyName , which is a VARCHAR.
NULL values represent an unknown state. For example, let's review the contents of the table
student , which contains NULL values.
NAME MARK -------------------- ----------- Peter 100 Mary 60 John - Raul 80 Tom -
John and Tom were sick the day of the exam, therefore the teacher put NULL values for their marks. This is different than giving them a mark of zero. If you issue this statement:
SELECT avg(mark) as average FROM student
The result is:
AVERAGE ----------- 80
Note that the average was calculated as follows: (100 + 60 + 80) / 3. The total number of students considered in the calculation was three, not five, because NULL values were not taken into consideration in the calculation.
Your business requirements dictate when NULL values are allowed in your columns. Let's review another example to illustrate when using
NOT NULL is appropriate. The following statement creates a table that stores a company phone directory.
CREATE TABLE telephoneDirectory (
empID CHAR(3) NOT NULL PRIMARY KEY,
phone_no VARCHAR(15) NOT NULL,
deptname VARCHAR(20) NOT NULL DEFAULT 'Marketing',
position VARCHAR(30) DEFAULT 'Clerk' )
In the example, let's assume the business requirements indicate that the column
empID must uniquely identify a row. Thus,
empID should be created as NOT NULL so that NULL values are not accepted; otherwise, several rows may have NULLs, which would not make the rows unique.
Next, the column
phone_no is also defined as NOT NULL per the business requirements. If the purpose of this table is to store telephone numbers, it's understandable that this column does not accept NULLs.
The third column,
deptname , is defined as NOT NULL with a
DEFAULT value of Marketing. This means that a NULL value is not accepted, and when the column is omitted in an
INSERT statement, the default value of Marketing is used. For example, if you issue this statement:
INSERT INTO telephoneDirectory (empID, phone_no)
VALUES ('111', '905-123-4567')
The result is:
EMPID PHONE_NO DEPTNAME POSITION ----- --------------- -------------------- ------------------------ 111 905-123-4567 Marketing Clerk
The fourth column,
position , allows NULL values and has a default value of Clerk. This case was explained in value clause works the same as the DEFAULT value clause only that NULL values are not allowed.
An identity column is a numeric column in a table that automatically generates a unique numeric value in sequence for each row inserted. A unique identifier is often used in applications to identify a specific row. Unlike sequence objects, which we discuss in section 7.16, Sequences, identity columns are bound to the table they are defined on. There can be only one identity column per table. DB2 can generate the identity column values in two ways.
Generated always : The values are always generated by DB2. Applications are not allowed to provide an explicit value.
Generated by default : The values can be explicitly provided by an application; if no value is given, DB2 generates one. In this case, however, DB2 cannot guarantee the uniqueness of the value generated.
To create an identity column, use the
CREATE TABLE statement with the
GENERATED clause and make sure it contains the
IDENTITY keyword because
GENERATED can also be used to generate other values automatically that are not identity columns. Here is an example.
CREATE TABLE product (
productno INTEGER GENERATED ALWAYS AS
IDENTITY (START WITH 200 INCREMENT BY 1),
description VARCHAR(50) )
The column
productno is an INTEGER defined as an identity column that is always generated. The value generated will start from 200, and it will be incremented by 1. Let's perform a few
INSERT statements and see the results obtained.
INSERT INTO product VALUES (DEFAULT,'banana'); --->inserts 200,banana
INSERT INTO product (description) VALUES ('apple'); --->inserts 201,apple
INSERT INTO product VALUES (300,'pear'); --->error SQL0798N
COMMIT;
INSERT INTO product (description) VALUES ('orange'); --->inserts 202,orange
ROLLBACK;
INSERT INTO product (description) VALUES ('plum'); --->inserts 203,plum
COMMIT ;
The following query shows the final result.
SELECT * FROM product; PRODUCTNO DESCRIPTION ----------- ------------ 200 banana 201 apple 203 plum
The first two
INSERT statements show that two identity column values were generated: 200 and 201. The third
INSERT statement returns an error because you cannot explicitly insert a value for an identity column generated as
ALWAYS . After the third
INSERT statement, we issue a
COMMIT to guarantee these rows are stored in the database. The fourth
INSERT statement causes another identity column value, 202, to be generated; however, we issue a
ROLLBACK statement right after, so this row is not stored in the database. Note that the final
INSERT statement, which inserts the product plum, generates a value of 203, not 202. (
COMMIT and
ROLLBACK statements are explained in more detail in Chapter 13, Developing Database Backup and Recovery Solutions.)
NOTE
An identity column value is generated only once. Once the value has been generated, even if a
ROLLBACK statement is performed, it will not be generated again.
Now let's review another example, this time creating the same table
product with the
GENERATED BY DEFAULT clause.
CREATE TABLE product (
productno INTEGER GENERATED BY DEFAULT AS
IDENTITY (START WITH 200 INCREMENT BY 1),
description VARCHAR(50) )
Next, we insert a few rows.
INSERT INTO product VALUES (DEFAULT,'banana'); --->inserts 200,banana
INSERT INTO product (description) VALUES ('apple'); --->inserts 201,apple
INSERT INTO product VALUES (300,'pear'); --->inserts 300,pear
INSERT INTO product VALUES (201,'orange'); --->inserts 201,orange
COMMIT;
INSERT INTO product (description) VALUES ('papaya'); --->inserts 202,papaya
ROLLBACK;
INSERT INTO product (description) VALUES ('plum'); --->inserts 203,plum
COMMIT;
The following query shows the final result.
SELECT * FROM product PRODUCTNO DESCRIPTION ----------- --------------------- 200 banana 201 apple 300 pear 201 orange 203 plum
The first two
INSERT statements show that two identity column values were generated: 200 and 201. For the third and fourth
INSERT statements, we explicitly provided the values 300 and 201, respectively, for the identity column. Note that DB2 did not return an error as in the previous example because we defined the identity column as
GENERATED BY DEFAULT . After the fourth
INSERT statement, we issue a
COMMIT to guarantee these rows are stored in the database. The fifth
INSERT statement causes another identity column value, 202, to be generated; however, we issue a
ROLLBACK statement right after, so this row is not stored in the database. Note that the final
INSERT statement, which inserts the product plum, generates a value of 203, not 202.
The following final example illustrates a
GENERATED value, which is not an identity column. The example uses
GENERATED ALWAYS , but you can also use
GENERATED BY DEFAULT .
CREATE TABLE income (
empno INTEGER,
salary INTEGER,
taxRate DECIMAL(5,2),
netSalary DECIMAL(7,2) GENERATED ALWAYS AS (salary * (1 - taxRate))
)
If you insert the following row:
INSERT INTO income (empno, salary, taxRate) VALUES (111, 50000, 0.3)
EMPNO SALARY TAXRATE NETSALARY ----------- ----------- ------- --------- 111 50000 0.30 35000.00
DB2 generates the value of the last column
NETSALARY based on the
SALARY and
TAXRATE columns.
Constraints allow you to create rules for the data in your tables. You can define four types of constraints on a table.
A unique constraint ensures that no duplicate key values can be entered in the table.
A referential constraint ensures that a value in one table must have a corresponding entry in a related table.
A check constraint ensures that the values you enter into the column are within the rules specified when the table was defined.
An
informational constraint allows you to enforce or not enforce a constraint.
These constraints are discussed further in the following sections.
A unique constraint indicates that the values for a given column must all be unique. A unique constraint is defined in the
CREATE TABLE or
ALTER TABLE statements using the
UNIQUE clause or the
PRIMARY KEY clause. A primary key, as you will see in the next section, is also a unique constraint.
All the columns that make up a unique constraint must be defined as NOT NULL. For the following example, the column
empID must be defined as NOT NULL because it is the primary key. The column
deptID must also be defined as NOT NULL because it is a unique constraint.
CREATE TABLE employ (
empID INT NOT NULL PRIMARY KEY,
name CHAR(30) ,
deptID INT NOT NULL UNIQUE
)
Now, let's perform a few
INSERT statements in sequence.
INSERT INTO employ VALUES (111, 'Peter', 999) ---> inserts 111, Peter, 999
INSERT INTO employ VALUES (111, 'Peter', 123) ---> SQL0803N error, duplicate primary key 111
INSERT INTO employ VALUES (789, 'Peter', 999) ---> SQL0803N error, duplicate unique key 999
This example illustrates that an error (SQL0803N) occurs if the value you attempt to insert for a unique or primary key column is not unique (it already exists in the table).
Unique constraints are implemented using unique indexes. When a
CREATE TABLE statement has the
UNIQUE or
PRIMARY KEY keywords, DB2 automatically creates a corresponding unique index. The name of this system-generated index starts with "SQL" followed by a timestamp. For the example just shown, two unique indexes were generated with these names:
SQL040422135806320
SQL040422135806460
Both indexes were created on April 22, 2004, at 1:58 p.m.
Though you would normally not refer to an index name directly in an application, a good index name may be helpful when analyzing an explain output. An explain output, as you will see in Chapter 16, Database Performance Considerations, displays the access path DB2 chooses to access your data for a given query. Therefore, rather than letting DB2 generate system names for your indexes, we recommend using the
ALTER TABLE statement in the case of primary key columns and the
CONSTRAINT clause to explicitly give names to the indexes. For example, let's rewrite the
CREATE TABLE statement used in the previous example as follows:
CREATE TABLE employ (
empID INT NOT NULL,
name CHAR(30) ,
deptID INT NOT NULL CONSTRAINT unique_dept_const UNIQUE
)
ALTER TABLE employ ADD CONSTRAINT employ_pk PRIMARY KEY (empID)
In this example, we removed the
PRIMARY KEY clause of the
CREATE TABLE statement and added an
ALTER TABLE statement. The
ALTER TABLE statement allowed us to put in a name for the constraint (
employ_pk ), which also becomes the name of the corresponding unique index.
Instead of the
ALTER TABLE statement, you can also use the following two statements with the same result:
CREATE UNIQUE INDEX employ_pk ON employ (empID)
ALTER TABLE employ ADD PRIMARY KEY (empID)
In this case, the
CREATE UNIQUE statement explicitly creates the unique index and specifies the desired name for the index. Next, the
ALTER TABLE statement indicates that the same column used for the unique index is also used as the primary key. After executing the
ALTER TABLE statement, you will receive this warning message:
SQL0598W Existing index "EMPLOY_PK" is used as the index for the primary key or a unique key. SQLSTATE=01550
This warning is acceptable because this is in fact what is desired.
In the previous
CREATE TABLE statement, we also added a unique constraint using the clause
CONSTRAINT unique_dept_const UNIQUE . With this clause, DB2 generates a corresponding unique index with the name
unique_dept_const.
You can also use the
ALTER TABLE statement to add a unique constraint, as shown in this example:
ALTER TABLE employ ADD CONSTRAINT unique_dept_const UNIQUE (deptID)
Referential constraints are used to support referential integrity. Referential integrity allows your database to manage relationships between tables.
Referential integrity can be better explained with examples. Assume you have two tables, as illustrated in Figure 7.13.
The figure shows the tables
country and
city, where
country is the parent table containing information about all the countries in the world, and
city is the dependent table containing information about a particular city for a given country. Note that the column
country_ID and the column
country_no are used to establish a relationship between the two tables.
The
country_ID column is a primary key column. A primary key consists of one or more columns; it is a special case of a unique constraint. While there can be many unique constraints in a table, there can be only one primary key. A primary key is used to establish a referential integrity relationship with another table.
The
country_no column, known as the foreign key column, will reference the primary key column of the parent table. Because of this relationship, the
country_no column cannot have a value that does not exist in the
country_ID column. The data type for this column must be compatible with the primary key column of the parent table. For the example illustrated in Figure 7.13, if the parent key column is defined as type INTEGER, the foreign key column can be defined as type DECIMAL because it is a numeric data type for which conversion is allowed; however, it cannot be defined as type CHAR. Other than this restriction, the foreign key can be treated like any other column. It can use the
NOT NULL, UNIQUE , and even
PRIMARY KEY clauses.
To establish the referential integrity relationship between the two tables, let's look at the corresponding
CREATE TABLE statements for both tables.
CREATE TABLE country (
country_ID INT NOT NULL PRIMARY KEY,
country_Name VARCHAR(30) NOT NULL,
continent_Name CHAR(15)
)
CREATE TABLE city (
city_ID INT NOT NULL PRIMARY KEY,
city_name VARCHAR(30) NOT NULL,
country_no INT REFERENCES country,
population INT
)
Note that the
CREATE TABLE statement for the
city table includes the
REFERENCES clause and that it does not need to specify any column of the parent table
country . DB2 will automatically look for the primary key column of the parent table to establish the relationship.
What if there is no primary key column for the parent table but a unique constraint instead? What if the parent table contains more than one unique constraint? In such cases, use the
REFERENCES clause followed by the correct column name(s). For example, let's say we actually created the
country table in Figure 7.13 with no primary key but two unique constraints, as follows.
CREATE TABLE country (
country_ID INT NOT NULL UNIQUE,
country_Name VARCHAR(30) NOT NULL,
continent_Name CHAR(15) NOT NULL UNIQUE
)
To establish referential integrity using the column
country_ID , this column must be specified in the
CREATE TABLE statement for the
city table, as shown below.
CREATE TABLE city (
city_ID INT NOT NULL PRIMARY KEY,
city_name VARCHAR(30) NOT NULL,
country_no INT REFERENCES country(country_ID),
population INT
)
NOTE
A unique constraint on a column that has been defined as NOT NULL can also be referenced by a foreign key clause because a primary key is basically the same as a unique constraint.
You can also use the
ALTER TABLE statement to add a foreign key, for example:
ALTER TABLE city
ADD FOREIGN KEY (country_no) REFERENCES country (country_ID)
The enforcement of referential integrity has implications on
INSERT, UPDATE , and
DELETE operations, which must follow certain rules. To explain these rules, let's look at the following example using Figure 7.14.
country and
city , in previous examples. In this particular example, we have inserted a few records in each of these tables. A new table,
district , which is dependent on table
city , is also illustrated. Here is the
CREATE TABLE statement for the table
district .
CREATE TABLE district (
district_ID INT NOT NULL PRIMARY KEY,
district_name VARCHAR(30) NOT NULL,
city_no INT REFERENCES city,
registrations INT
)
The following cases are examined.
What would happen if the following record were inserted in table
country ?
INSERT INTO country VALUES (2,'Spain',4)
Because
country is the parent table at the top of Figure 7.14, any value can be inserted into this table without a need to worry about the dependent tables.
What would happen if the following record were inserted in table
city ?
INSERT INTO city VALUES (44,'Vancouver',3,4000000)
Table
city is dependent on table
country based on column
country_no (the third column in the
city table). This
INSERT statement is trying to insert a record with a value of 3 for the
country_no column. From Figure 7.14 you can see this value is not present in table
country ; therefore, this record cannot be inserted and an error would be returned.
What would happen if the following record were deleted from table
country ?
DELETE FROM country WHERE country_name = 'Canada'
This
DELETE statement would fail with an error SQL0532N because there are related dependent rows. This is the default behavior, also called the
NO ACTION delete rule.
You can specify DELETE rules in the
CREATE TABLE statement of the dependent table. In addition to
NO ACTION , you can use the following rules.
RESTRICT : The behavior is exactly the same as the
NO ACTION rule. The difference is when this constraint is enforced. For more details, refer to the
DB2 UDB SQL Reference manual.
CASCADE : All the dependent rows will be deleted when the parent table row is deleted.
SET NULL : All the dependent rows will have the value of the foreign key column set to NULL, if NULLs are allowed; otherwise, an error is returned. All the other columns remain unchanged.
For example, let's say we actually created the table
city as follows.
CREATE TABLE city (
city_ID INT NOT NULL PRIMARY KEY,
city_name VARCHAR(30) NOT NULL,
country_no INT REFERENCES country(country_ID)
ON DELETE CASCADE,
population INT
)
Note that we added the clause
ON DELETE CASCADE to the foreign key column
country_no .
If we execute the following statement again, will it work this time?
DELETE FROM country WHERE country_name = 'Canada'
The answer is no. Though we defined the
CASCADE rule correctly in the
city table, we did not define it in the
district table. All dependent tables need to be defined using
CASCADE if you want all the dependent rows to be deleted. In this example, if we had defined the
district table correctly, all the rows of all the tables would have been deleted.
You can delete a row from a dependent table with no implications unless the dependent table is the parent table of another table.
You cannot update the primary key of the parent table. To ensure you don't duplicate an existing value, DB2 does not allow this operation.
You can update the foreign key of a dependent table only if the new value already exists in the parent table and the foreign key is defined as
NOT NULL . This is the default behavior, which corresponds to the
NO ACTION update rule.
For example, issuing this statement:
UPDATE city SET country_no = 7 WHERE city_name = 'Montreal'
would return error SQL0530N, which indicates the value of 7 does not exist in the parent table.
The other UPDATE rule possible is
RESTRICT , which behaves similarly to the
NO ACTION rule. The difference is when the rule enforcement takes place. For details about this rule, please review the
DB2 UDB SQL Reference manual.
You can specify UPDATE rules on the
CREATE TABLE statement of a dependent table. For example, we could have created the
city table as follows (on top of the DELETE rules of the previous example).
CREATE TABLE city (
city_ID INT NOT NULL PRIMARY KEY,
city_name VARCHAR(30) NOT NULL,
country_no INT REFERENCES country(country_ID)
ON DELETE CASCADE
ON UPDATE RESTRICT,
population INT
)
Check constraints are used to enforce data integrity at the table level. Once the check constraint is defined, every
INSERT or
UPDATE operation must satisfy the constraint; otherwise, you will receive an error. For example, let's create the table
student .
CREATE TABLE student (
student_ID INT NOT NULL PRIMARY KEY,
name VARCHAR(30) NOT NULL,
sex CHAR(1) NOT NULL
CONSTRAINT sex_check_const CHECK (sex in ('M ', 'F '))
)
This table has the check constraint
sex_check_const defined which verifies that the column
sex has the values of M or F. Now let's attempt the following statement.
INSERT INTO student VALUES (1, 'Tom', 'Z')
We will receive an error SQL0545N because the value Z does not satisfy the check constraint.
You can also add a check constraint with the
ALTER TABLE statement, as shown here.
ALTER TABLE student
ADD CONSTRAINT sex_check_const CHECK (sex in ('M ', 'F '))
If you are adding a check constraint with the
ALTER TABLE statement to a table that already has data, DB2 will check the entire table to make sure the existing data satisfies the check constraint. If it doesn't, the
ALTER TABLE statement will fail with error SQL0544N.
If you do not want DB2 to check the table when a check constraint is added, you can use the
SET INTEGRITY statement. This statement turns off check constraint and referential constraint checking. For example, let's say we create the
student table without a check constraint and insert some rows that will later be invalid for the check constraint.
CREATE TABLE student (
student_ID INT NOT NULL PRIMARY KEY,
name VARCHAR(30) NOT NULL,
sex CHAR(1) NOT NULL
)
INSERT INTO student VALUES (1, 'Tom', 'Z')
INSERT INTO student VALUES (2, 'Mary', 'A')
Now we attempt to add the following check constraint.
ALTER TABLE student
ADD CONSTRAINT sex_check_const CHECK (sex in ('M ', 'F '))
You receive error SQL0544N, as indicated earlier. Thus, use the
SET INTEGRITY command to turn off constraint checking so that you can add the constraint.
SET INTEGRITY FOR student OFF
student table is put in CHECK PENDING state, a state that allows only a few operations on the table, like
ALTER TABLE . Other operations such as
SELECT, INSERT, UPDATE , and
DELETE are disallowed.
After turning off constraint checking, you can repeat the
ALTER TABLE statement, which this time should be successful. Use the
SET INTEGRITY statement again to turn constraint checking on as follows:
SET INTEGRITY FOR student CHECK IMMEDIATE UNCHECKED
The
IMMEDIATE UNCHECKED option turns on check constraints again but does not check the existing table data. Alternatively, you can also issue:
SET INTEGRITY FOR student IMMEDIATE CHECKED
In this case, the
IMMEDIATE CHECKED option turns on check constraints again and also checks the existing table data. If a violation is encountered, the table will remain in CHECK PENDING state. The
SET INTEGRITY statement has an option to move the violating records to an exception table.
SET INTEGRITY FOR student IMMEDIATE CHECKED
FOR EXCEPTION IN student USE my_exception_table
The name of the exception table in this example is
my_exception_table . This table must exist with at least the same columns as the original source table, in this case, the
student table. After this
SET INTEGRITY statement is executed, the violating rows would be moved to the exception table, and the CHECK PENDING status would be removed. For more details about the
SET INTEGRITY statement, refer to the
DB2 UDB SQL Reference manual.
Prior to Version 8.1, DB2 always enforced constraints once you defined them. Though you can turn constraint checking off with the
SET INTEGRITY statement, this is mainly used to perform table alterations to add new constraints to existing tables, as you saw in the previous section. Using the
SET INTEGRITY statement puts your table in CHECK PENDING status, which prevents you from performing many operations on your table.
What if your application already performs constraint checking, and thus there is no need for DB2 to check the data again? For example, large applications such as SAP, PeopleSoft, and Siebel are written to check the constraints before they insert the data into DB2. In this case, defining the constraint in DB2 would cause extra overhead if DB2 is also enforcing the rule and revalidating the constraint. However, if you do not define these constraints, the DB2 optimizer cannot use them to its advantage in choosing the most optimal access plans. (Chapter 16, Database Performance Considerations, explains the DB2 optimizer in more detail.)
With Version 8.1, informational constraints were introduced. Informational constraints allow you to specify whether or not DB2 should enforce the constraint and whether or not it can be used by the optimizer to choose the best access plan for the application statements.
The default operation when you create a constraint is that it is always enforced and can be used by the optimizer. You can change this default behavior by using informational constraints, which are implemented by using the following clauses of the
CREATE TABLE statement.
ENFORCED : This is the default option. Use this clause if you want DB2 to check the constraints for every operation on the table.
NOT ENFORCED : Use this clause if you do not want DB2 to check the constraints for every operation on the table.
ENABLE QUERY OPTIMIZATION : Use this clause so that DB2 can use the knowledge of the constraint when building the plan for accessing the table or referenced tables.
DISABLE QUERY OPTIMIZATION : Use this clause if you want the DB2 optimizer to ignore the constraints defined on your table.
The following example illustrates how informational constraints work.
CREATE TABLE student (
student_ID INT NOT NULL PRIMARY KEY,
name VARCHAR(30) NOT NULL,
sex CHAR(1) NOT NULL
CONSTRAINT sex_check_const CHECK (sex in ('M ', 'F '))
NOT ENFORCED
ENABLE QUERY OPTIMIZATION
)
Note that the constraint for table
student will not be enforced, but the constraint is used for query optimization. Now let's perform the following statements.
(1)
INSERT INTO student VALUES (5, 'John', 'T')
(2)
SELECT * FROM student WHERE sex = 'T'
The first statement executes successfullya T can be inserted for the
sex column because the constraint
sex_check_const is not enforced.
The second statement returns zero records because query optimization is enabled. Therefore, the optimizer does not scan the table but checks the constraints defined for the
sex column in the DB2 catalog tables and assumes it has only values of M or F, quickly returning a result of zero records. Of course, this result is incorrect. If you want to obtain the correct result, disable query optimization. You can do this with the
ALTER TABLE statement:
ALTER TABLE student
ALTER CHECK sex_check_const DISABLE QUERY OPTIMIZATION
If you perform the second statement again, this time you should get one record.
SELECT * FROM student WHERE sex = 'T' STUDENT_ID NAME SEX ----------- ------------------------------ --- 5 John T
NOTE
After issuing the
ALTER TABLE statement to enable or disable query optimization, make sure to issue a
terminate command if working from the CLP so the change will take effect.
NOTE
Use informational constraints only if you are certain the data to be inserted or updated has been correctly checked by your application. Normally you want to use the options
NOT ENFORCED and
ENABLE QUERY OPTIMIZATION together because you want DB2 to reduce overhead by not performing constraint checking, but having the DB2 optimizer take into account the constraint definition.
The
NOT LOGGED INITIALLY clause of the
CREATE TABLE statement allows you to create a table that will not be logged when an
INSERT, UPDATE, DELETE, CREATE INDEX, ALTER TABLE , or
DROP INDEX operation is performed in the same unit of work in which the
CREATE TABLE statement was issued. For example, let's say you execute the following statements in a script.
CREATE TABLE products (
productID INT,
product_Name VARCHAR(30)
)
NOT LOGGED INITIALLY;
INSERT INTO products VALUES (1,'door');
INSERT INTO products VALUES (2,'window');
...
INSERT INTO products VALUES (999999,'telephone');
COMMIT;
INSERT INTO products VALUES (1000000,'television');
UPDATE products SET product_name = 'radio' where productID = 3456;
ALTER TABLE products ACTIVATE NOT LOGGED INITIALLY
INSERT INTO products VALUES (1000001,'desk');
INSERT INTO products VALUES (1000002,'table');
...
INSERT INTO products VALUES (1999999,'chair');
COMMIT;
Any operation from the
CREATE TABLE statement until the first
COMMIT is not logged. Once the
COMMIT is issued, any subsequent operation is logged. For this example, the
INSERT and
UPDATE statements after the first
COMMIT are logged.
NOT LOGGED INITIALLY , if you would like to turn off logging temporarily again, you can use the
ALTER TABLE statement with the
ACTIVATE NOT LOGGED INITIALLY clause, as shown in the example. Any operations between the
ALTER TABLE and the second
COMMIT are not logged.
NOTE
You can use the statement
ALTER TABLE
table_name
ACTIVATE NOT LOGGED INITIALLY only for tables that were originally created with the
NOT LOGGED INITALLY clause.
You can also use the
WITH EMPTY TABLE clause as part of the
ALTER TABLE
table_name
ACTIVATE NOT LOGGED INITIALLY statement to remove all the data of the table. This method is faster than using a
DELETE FROM
table_name statement. For example, to remove all the rows of the table
products , issue:
ALTER TABLE products ACTIVATE NOT LOGGED INITIALLY WITH EMPTY TABLE
You can compress tables to a certain extent by using the
VALUE COMPRESSION clause of the
CREATE TABLE statement. This clause tells DB2 that it can use a different internal format for the table rows so they occupy less space. In a sense, this clause turns on compression for the table; however, you need to specify another clause,
COMPRESS SYSTEM DEFAULT , for each column that you want to compress. Only the columns whose values are normally NULL or the system default value of 0 can be compressed. Also, the data type must not be DATE, TIME, or TIMESTAMP. If the data type is a varying-length string, this clause is ignored. Here's an example:
CREATE TABLE company (
company_ID INTEGER NOT NULL PRIMARY KEY,
name CHAR(10),
address VARCHAR(30) COMPRESS SYSTEM DEFAULT,
no_employees INTEGER NOT NULL COMPRESS SYSTEM DEFAULT
)
VALUE COMPRESSION
The column
address would be ignored since it's a VARCHAR column, and the column
no_employees would be compressed. Table compression saves space especially for tables used in data warehousing applications where many rows contain NULLs or the system default value of 0. However,
UPDATE operations may be impacted when changing to a different value than the default of 0 because the compressed value would first have to be expanded and then updated.
For an existing table containing data, you can enable table compression using the
ALTER TABLE statement, as shown in this example.
ALTER TABLE city ACTIVATE VALUE COMPRESSION
ALTER TABLE city
ALTER COLUMN population COMPRESS SYSTEM DEFAULT
Chapter 12, Maintaining Data, discusses the
REORG utility in more detail.
Materialized query tables (MQTs) allow users to create tables with data based on the results of a query. The DB2 optimizer can later use these tables to determine whether a query can best be served by accessing an MQT instead of the base tables. Here is an example of an MQT:
CREATE SUMMARY TABLE my_summary
AS (SELECT city_name, population
FROM country A, city B
WHERE A.country_id = B.country_no)
DATA INITIALLY DEFERRED
REFRESH DEFERRED
The
SUMMARY keyword is optional. The
DATA INITIALLY DEFERRED clause indicates that DB2 will not immediately populate the
my_summary MQT table after creation, but following the
REFRESH TABLE statement:
REFRESH TABLE my_summary
The
REFRESH DEFERRED clause in the
CREATE SUMMARY TABLE statement indicates that the data in the table is refreshed only when you explicitly issue a
REFRESH TABLE statement. Alternatively, you can create the MQT with the
REFRESH IMMEDIATE clause, which means DB2 immediately refreshes the data when the base tables are changed.
DB2 checks the registry variable
CURRENT REFRESH AGE to determine whether or not the MQT contains up-to-date information. This registry can have a value from 0 up to 99999999999999 (9,999 years, 99 months, 99 days, 99 hours, 99 minutes, and 99 seconds), which indicates the maximum duration the DB2 optimizer can wait since the last
REFRESH TABLE statement was issued on an MQT to consider MQT tables in its calculations. For example, if an MQT were refreshed today, and the
CURRENT REFRESH AGE has a value of 5 days, the DB2 optimizer can consider the MQT for its calculations for the next 5 days. If the value of this register is 0, only the tables created with the
REFRESH IMMEDIATE clause can be used for optimization.
Prior to Version 8, MQTs were known as automatic summary tables (ASTs). With Version 8, ASTs are considered a special case of MQTs whose fullselect contains a
GROUP BY clause summarizing data from the tables referenced in the fullselect.
Temporary tables can be classified as system or user tables. DB2 manages system temporary tables in the system temporary table space. DB2 creates and drops these tables automatically. Since users don't have control over system temporary tables, we don't discuss them any further in this section.
You create user temporary tables inside a user temporary table space. For example, the following statement creates a user temporary table space called
usrtmp4k .
CREATE USER TEMPORARY TABLESPACE usrtmp4k
MANAGED BY SYSTEM USING ('C:\usrtmp')
User temporary tables, referred to as temporary tables from here on, store temporary data, that is, data that will be destroyed after a session or when a connection ends. Temporary tables are typically used in situations where you need to compute a large result set from an operation, and you need to store the result set temporarily to continue with further processing.
Though transaction logging is allowed with temporary tables, most users don't need to log temporary data. In fact, not having transaction logging for this type of table improves performance.
Temporary tables exist only for one connection; therefore, there are no concurrency or locking issues.
To create a temporary table, use the
DECLARE statement. Here's an example.
DECLARE GLOBAL TEMPORARY TABLE temp_table1 (col1 int, col2 int)
ON COMMIT PRESERVE ROWS
NOT LOGGED
IN usrtmp4k
Table
temp_table1 is created in
usrtmp4k , the user temporary table space we created earlier.
DB2 uses the schema
session for all temporary tables regardless of the user ID connected to the database. After you create a temporary table, you can access it just like any regular table. The following statement inserts a row into table
temp_table1 .
INSERT INTO session.temp_table1 (1,2)
The following statement selects all the rows in table
temp_table1 :
SELECT * FROM session.temp_table1
You can drop and alter temporary tables, but you cannot create views or triggers against them. Indexes are allowed.
NOTE
When working with temporary tables, make sure to explicitly specify the schema
session . If you work with objects without specifying the schema, DB2 defaults to the authorization ID or connection ID.