Oracle SQLPlus [Electronic resources] : The Definitive Guide, 2nd Edition نسخه متنی

4.1 Data Manipulation Statements

As manifested in Oracle, the SQL language encompasses several
categories of statements:
statements that work with data, other statements that define and
modify the structures (such as tables and indexes) that hold data,
and still other statements that control the operation of the database
itself.

This chapter focuses on statements used to manipulate data:

INSERT
Places new records, or rows, into a database table.

SELECT
Retrieves previously inserted rows from a database table.

UPDATE
Modifies data in a table.

DELETE
Deletes data from a table.

MERGE
Brings a table up to date by modifying or inserting rows, as
appropriate

The reason I focus on data manipulation in this chapter is that if
you're just starting out with SQL, data manipulation
is likely to be your first problem. You'll likely be
working with tables and other database objects that have been
created. Furthermore, data manipulation statements are the ones
you'll need to leverage the SQL*Plus reporting
capabilities I describe in the next few chapters. Other topics, such
as creating database structures (e.g., tables and indexes) and
controlling the operation of the database, are database
administration topics that rightfully belong in other books.

4.1.1
Inserting Data into a Table

Use the INSERT statement to add a new row of data to a database
table. The following form of INSERT should suffice for almost all
your needs:

INSERT INTO table_name (column_list) VALUES (value_list);

Replace table_name with the name of the
target table, the table to which you wish to add a new row of data.
Replace column_list with a comma-delimited
list of column names. These are the columns for which you will supply
values in your value_list. For example,
type the code from Example 4-1 into SQL*Plus to
insert a new employee.

Example 4-1. Inserting a new employee

INSERT INTO employee (employee_id, employee_name)
VALUES ('114','Marusia Bohuslavka');

This example supplies values for only two of the five
employee table columns. When you insert a row and
omit a value for a column, that column takes on a default value
specified by your DBA when creating the table. Often, the default
value for a column is no value at all, or null. I'll
talk about nulls later in this chapter. They're
quite important to understand.

When you insert new rows, you must supply values for the following
types of columns:

Primary key columns. These are mandatory. In the
employee table, the employee_id
column forms the primary key.

Columns defined as NOT NULL that have no default value defined.

If you omit a value for one of these types of column,
you'll receive an error, as in Example 4-2.

Example 4-2. Omitting a required column

INSERT INTO employee (employee_name) VALUES ('Hryhory Kytasty');
*
ERROR at line 1:
ORA-01400: cannot insert NULL into ("GENNICK"."EMPLOYEE"."EMPLOYEE_ID") The error message here is informing you that
employee_id is a mandatory column. When you
receive an error such as this, supply a value for the column
mentioned in the error message and retry the insert.

4.1.2 Retrieving Data from a Table

What good would a database be if you couldn't get
information back from it? Indeed, you'll find that
you spend more time using SQL to get data out of the database than
you spend on any of the other data manipulation operations. The
SELECT statement, often referred to as a query ,
is what you need. Example 4-3 shows a simple query
that retrieves all data from a table.

Example 4-3. Retrieving all rows, all columns

SELECT * FROM employee;

The asterisk (*) in this example is shorthand for
"all columns." You can enumerate
the columns, as in Example 4-4.

Example 4-4. Enumerating columns in the SELECT list

SELECT employee_id, employee_name, employee_hire_date,
employee_termination_date, employee_billing_rate
FROM employee;

In a database of any size, you aren't likely to want
to retrieve all rows from a table. Rather, you'll
want to define some subset of rows that you are interested in. For
that, use the WHERE clause. Example 4-5 uses a WHERE
clause to restrict the query's results to the row
inserted in the previous section.

Example 4-5. Using a WHERE clause to constrain results

SELECT employee_id, employee_name
FROM employee
WHERE employee_id = 114;

The WHERE clause supports the same comparison operators that
you'll find in just about any programming or
scripting language:

=
Are two values the same?

!=, <>, ^=
Do two values differ?

<
Is one value less than another?

<=
Is one value less than or equal to another?

>
Is one value greater than another?

>=
Is one value greater than or equal to another?

These are the most basic and common operators.
You'll see others in this book that
I'll explain as I continue. For a rigorous trip
through the WHERE clause, I heartily recommend Sanjay Mishra and Alan
Beaulieu's book, Mastering Oracle
SQL , Second Edition (O'Reilly).

Beware the Asterisk!

It's OK to use the asterisk when writing ad hoc
queries, but think twice before using it in any query that gets
embedded into a program or a script. That's because
the results from SELECT * will
change in the event that you or your DBA ever add a new column to the
target table, or you re-create the table using a new column order.
For a SQL*Plus reporting script, such a change will likely result in
nothing more catastrophic than some mangled formatting as a result of
an extra and unexpected column in the report. However, operations
other than simple SELECTs may fail when you suddenly add a column to
a result set. When in doubt, enumerate your columns.

4.1.2.1 Multiple conditions

You can write many conditions in a WHERE clause, which you can link
together using AND and OR. You can use parentheses to clarify the
order of evaluation. Example 4-6 uses OR to retrieve
the following two groups of employees:

Employee number 114 Employees with names beginning with 'Myk', but
only when those employees were hired during or after 2004

Example 4-6. Using AND and OR

SELECT *
FROM employee
WHERE (employee_id = 114)
OR (employee_hire_date >= TO_DATE('1-Jan-2004','dd-mon-yyyy')
AND REGEXP_LIKE(employee_name, '^Myk.*'));

Parentheses ensure that the two AND conditions are collectively
treated as one condition with respect to the OR operation.

The TO_DATE function converts the string
'1-Jan-2004' into a true DATE value, for
comparison to the hire dates. The call to REGEXP_LIKE identifies
those rows having names that begin with 'Myk'. The
string '^Myk.*' is a regular expression.

Regular expression support is an exciting, new feature of Oracle
Database 10 g . Learn more about this new feature
from the Oracle Regular Expression Pocket
Reference (O'Reilly), which I coauthored
with Peter Linsley.

4.1.2.2 Negating conditions

There is also the NOT operator, which you can use to negate a
condition. For example, the two queries in Example 4-7 are equivalent.

Example 4-7. Using the NOT operator

SELECT *
FROM employee
WHERE employee_id <> 114;
SELECT *
FROM employee
WHERE NOT employee_id = 114;

NOT can be handy when writing complex queries because sometimes
it's easier to think in terms of those rows that you
do not want in your result. You can write conditions to identify
those rows you do not want, and you can use NOT to negate those
conditions. Example 4-8 shows a slightly modified
(NOTified?) version of the query from Example 4-6.
This time, I want the statement to return all the rows not returned
in Example 4-6. I could have modified each of the
three comparisons individually, but that would require a fair bit of
thought, and I might make a mistake. It's easier to
wrap the entire original logic in parentheses and apply the NOT
operator to the result.

Example 4-8. Negating complex logic

SELECT *
FROM employee
WHERE NOT ((employee_id = 114)
OR (employee_hire_date >= TO_DATE('1-Jan-2004','dd-mon-yyyy')
AND REGEXP_LIKE(employee_name, '^Myk.*')));

Using NOT as I've just done leaves me open to
problems involving null values. This is an important topic
I'll discuss later in "The Concept
of Null."

4.1.2.3 Table aliases

When you're writing a query, it's
often necessary to qualify column names by also specifying their
table names. You do this using dot notation , as
shown in Example 4-9. The
employee. in front of each column name specifies
that each column belongs to the employee table.

Example 4-9. Qualifying column names with their table names

SELECT employee.employee_id, employee.employee_name
FROM employee;

It's cumbersome to retype long table names many
times over. For this reason, SQL allows you to specify alternate
names, known as table aliases , for the tables in
your query. Specify an alias by placing it immediately after its
table name, thereby separating the two names by whitespace, as shown
in Example 4-10.

Example 4-10. Specifying and using a table alias

SELECT e.employee_id, e.employee_name
FROM employee e;

Keep your aliases short, which is the whole idea.
It's much easier to type e. than
employee. in front of each column name.

You really don't need to worry about qualifying
column names when selecting from only one table. Later in this
chapter, you'll learn how to select from many tables
at once. That's when qualifying column names becomes
important. If you don't qualify column names when
selecting from multiple tables, the database engine must expend extra
resources sorting through which name goes with which table.
That's usually a trivial performance hit, but it can
add up if given enough users and statements. And woe be to you if two
tables share a common column name because the database engine
won't be able to determine your intent, and
you'll get an error.

4.1.2.4 Column aliases

Just as you can specify aliases for tables, you can also specify
aliases for columns. Many people specify column aliases just as they
do table aliases, by placing the alias name immediately after the
column name. Others are a bit more proper in their approach and use
the AS keyword. Example 4-11 demonstrates both
approaches.

Example 4-11. Specifying column aliases

SELECT e.employee_id emp_num, e.employee_name "Employee Name",
e.employee_billing_rate * 0.50 discounted_rate
FROM employee e;
SELECT e.employee_id AS emp_num, e.employee_name AS "Employee Name",
e.employee_billing_rate * 0.50 AS discounted_rate
FROM employee e;

Column aliases are useful for reporting, because SQL*Plus picks them
up and uses them as default column titles. Having "Employee Name" at the top of a column looks much better to a user
than "employee_name". Column aliases are also
useful in naming expressions in your SELECT list such as the one in
Example 4-11 that discounts the billing rate.
Otherwise, the expression itself will be the column name. In Chapter 5, you'll see that
it's much easier to define formatting for an
expression column when the name is one that you
have specified rather than one that the database has generated
for you.

INSERT . . . SELECT FROM

An interesting use of SELECT is as a data source for an INSERT
statement. Suppose that you want to create a project named
"X Overhead" for each existing
project, replacing X with that project's name. For
example, for "Corporate Web Site,"
you'd create a new project named
"Corporate Web Site Overhead." You
can do that using the following INSERT statement:

INSERT INTO PROJECT (project_id, project_name)
SELECT project_id+8000,
SUBSTR(project_name,1,31) || ' Overhead'
FROM project;

The nested SELECT in this statement returns a set of rows, each row
consisting of a project ID and a project name. Those rows feed into
the INSERT statement, which inserts those rows back into the
project table. The SUBSTR (for substring) function
call clips the old project names at 31 characters, to ensure enough
room for adding ' Overhead'. The newly inserted
projects are all numbered above 8000.

In this example, the source and target tables are the same, but
that's not necessary. The SELECT is independent of
the INSERT. All that's necessary is that the SELECT
returns the correct number of columns corresponding in type to those
listed in the INSERT statement.

4.1.3 Updating Data with New Values

Things change. That seems to be the rule in our world today. Your
data will change, too, and for that reason SQL provides the UPDATE
statement. Use it to set new values for some or all columns in a row.
The UPDATE in Example 4-12 changes the name and the
budget for project #1005.

Example 4-12. Changing values in a single row

UPDATE project
SET project_name = 'Virtual Private Network',
project_budget = 199999.95
WHERE project_id=1005;

The update in Example 4-12 updates only a single row.
You can be certain of that, because project_id is
the primary key of the project table, so there can
be at most one row with ID 1005. You can use UPDATE to change values
in many rows, by writing conditions in your WHERE clause to target
those rows. Example 4-13 shows budgets being cut by
30% for all projects but the corporate web site.

Example 4-13. Changing values in many rows

UPDATE project
SET project_budget = project_budget * 0.70
WHERE NOT project_name = 'Corporate Web Site';

You'll notice that the new value for
project_budget in Example 4-13 is
the result of an expression; in Example 4-12, the new
values are constants. When updating many rows, it rarely makes sense
to apply the same value to all, so such updates should use
expressions or subqueries to generate new values appropriate to each
row. (See the Section 4.7 for more information on that
approach.) When you write an update, be sure you know whether it potentially
affects more than one row. If the WHERE clause does not specify a
single value for the table's primary key, or for a
unique key column, then the update could potentially affect many
rows. When an update affects many rows, give careful thought to how
you will compute new values for those rows.

If you know for certain that you can uniquely identify a single row
via a non-key field, go for it. However, you may want to issue a
SELECT first just to be sure. Example 4-13 identified
a project by name. Potentially, two projects can have the same name,
but I knew that wasn't the case when I wrote the
example. In the real world, sometimes you take advantage of what you
know about your own data.

It's possible to issue an UPDATE that changes the
value in a primary key, but such changes aren't
often made, and you're better off avoiding them if
you can. Database designers design databases in ways that minimize or
eliminate the need to change primary key values. When you change the
primary key of a row, you affect the referential integrity of your
database. The change must be rippled through to any other rows that
refer to the row you are changing. Alternately, the database must
prevent you from changing a primary key value that is referenced by
another row. Much depends here on how your database schema was
designed. In a worst-case scenario, you might have to sort out the
proper order for a whole series of updates to foreign-key columns
that reference the primary-key value you wish to change.

4.1.4 Deleting Data from a Table

All things must come to an end, and
that's true of data, too. Actually, with
today's increasing hard-drive sizes,
I'm not so sure that some don't
plan to hold their data forever, but let's ignore
that complication for now. Use the DELETE statement to get rid of
rows you no longer need. Did you insert all those new projects by
executing the code in the "INSERT . . . SELECT FROM" sidebar? Too bad. Now your
boss has changed his mind. Isn't that annoying? Now
it's your job to delete what you just created. Do
that using the DELETE statement shown in Example 4-14.

Example 4-14. Deleting rows from a table

DELETE FROM project
WHERE project_id > 8000;

DELETE is a fairly simple statement, but be careful to craft your
WHERE clause so you only delete rows that you want to delete. I often
write a SELECT statement using the same WHERE conditions and check
the results from that before unleashing a DELETE.

Deleting All Rows from a Table

You wouldn't think it, but deleting data can
actually be an expensive proposition. There's quite
a bit of database overhead involved in a large delete. If you happen
to be deleting all table rows, and you own the
table or are a DBA having the TRUNCATE ANY TABLE privilege, you can
use the
TRUNCATE TABLE statement to good effect:

TRUNCATE TABLE table_name;

If you plan to reload close to the same amount of data, use the
following variant, which maintains the disk space currently allocated
to the table:

TRUNCATE TABLE table_name REUSE STORAGE;

For a table of any size, it's far faster to erase
all the rows by truncating the table than by issuing a DELETE. This
is because Oracle can accomplish the entire truncation by resetting
an internal pointer known as the highwater mark .
The trade-offs are that TRUNCATE commits any transaction you might
have pending, and that you can't undo a TRUNCATE
TABLE statement if you discover you've made a
blunder after it executes. You can, however, undo a DELETE statement
if you discover your mistake before committing the transaction. Read
the section on "Transactions" for
more on this.

If you are at all uncertain about whether to use TRUNCATE TABLE or
DELETE, then use DELETE. You generally cannot issue TRUNCATE TABLE
statements from within programs, at least not as easily as you can
issue DELETE statements. In SQL*Plus scripts, either is equally easy
to use.

4.1.5 Merging Data into a Table

New in Oracle9 i Database, and in the 2003
ANSI/ISO SQL standard, the MERGE statement solves the age-old problem
of needing to update a table or insert a new row depending on whether
a corresponding row already exists. Suppose that
you've created those overhead projects mentioned in
the "INSERT . . . SELECT FROM"
sidebar. You'll have used the statement shown in
Example 4-15.

Example 4-15. Creating overhead projects

INSERT INTO PROJECT (project_id, project_name)
SELECT project_id+8000,
SUBSTR(project_name,1,31) || ' Overhead'
FROM project;

Example 4-15 creates a new set of projects from an
existing set of rows. Now, let's say
you're faced with the task of periodically
refreshing the list of overhead projects. At the time you do a
refresh, you need to account for two possibilities:

For any new projects that have been added, you need to create new
overhead projects.

For any existing projects, you need to migrate any name changes to
their respective overhead project names.

There are different ways that you can go about solving this problem.
One way is to periodically issue the MERGE statement shown in Example 4-16.

Example 4-16. Bringing the list of overhead projects up to date

MERGE INTO project pdest
USING (SELECT project_id+8000 project_id,
SUBSTR(project_name,1,31) || ' Overhead' project_name
FROM project
WHERE project_id <= 8000) psource
ON (pdest.project_id = psource.project_id)
WHEN MATCHED THEN UPDATE
SET pdest.project_name = psource.project_name
WHEN NOT MATCHED THEN INSERT
(project_id, project_name)
VALUES (psource.project_id, psource.project_name);

Here's a step-by-step walkthrough of this MERGE
statement:

The INTO clause identifies the target table, in this case,
project. An alias of pdest is
specified to make it easier to distinguish between the two
occurrences of the table in the one statement.

The USING clause specifies a SELECT statement to use in generating
rows for the merge operation. This SELECT
statement's WHERE clause excludes any overhead rows,
which are rows having project_id values greater
than 8000. The results of this SELECT statement are identified
through the alias psource.

Each row from the SELECT is matched to its corresponding row in
pdest via the condition given in the ON clause.

When a match occurs, which is to say that a row in
pdest has a project_id matching
that of a given row in psource, an update of the
project_name is triggered.

When no match occurs (for a given row from
psource, no corresponding row in
pdest can be found) a new row is inserted into
pdest.

MERGE is commonly used in data warehousing environments to
periodically update reporting tables from operational data. Often,
the source for rows feeding into a MERGE operation will be an
external table , a type of table corresponding to
an operating system file. There's more in Chapter 9 about using external tables to load data
from
files.