4.3 ANSI SQL Window Functions
SQL2003
allows for a
window_clause in
aggregate function calls, the addition of which makes those functions
into window functions. Both Oracle and DB2 support this window
function syntax. This section describes how to use the
window_clause within
Oracle and DB2.
|
Window, or analytic, functions are
similar to standard aggregate functions in that they operate on
multiple rows, or groups of rows, within the result set returned from
a query. However, the groups of rows that a window function operates
on are defined not by a GROUP BY clause, but by partitioning and
windowing clauses. Furthermore, the order within these groups is
defined by an ordering clause, but that order only affects function
evaluation, and has no effect on the order in which rows are returned
by the query.
|
4.3.1 SQL2003's Window Syntax
SQL2003 specifies the following syntax
for window functions:
FUNCTION_NAME(expr) OVER {window_name|(window_specification)}
window_specification ::= [window_name][partitioning][ordering][framing]
partitioning ::= PARTITION BY value [, value...] [COLLATE collation_name]
ordering ::= ORDER [SIBLINGS] BY rule [, rule...]
rule ::= {value|position|alias} [ASC|DESC] [NULLS {FIRST|LAST}]
framing ::= {ROWS|RANGE} {start|between} [exclusion]
start ::= {UNBOUNDED PRECEDING|unsigned-integer PRECEDING|CURRENT ROW}
between ::= BETWEEN bound AND bound
bound ::= {start|UNBOUNDED FOLLOWING|unsigned-integer FOLLOWING}
exclusion ::= {EXCLUDE CURRENT ROW|EXCLUDE GROUP
|EXCLUDE TIES|EXCLUDE NO OTHERS}
4.3.2 Oracle's Window Syntax
Oracle's window
function syntax is as follows:
FUNCTION_NAME(expr) OVER (window_clause)
window_clause ::= [partitioning] [ordering [framing]]
partitioning ::= PARTITION BY value [, value...]
ordering ::= ORDER [SIBLINGS] BY rule [, rule...]
rule ::= {value|position|alias} [ASC|DESC]
[NULLS {FIRST|LAST}]
framing ::= {ROWS|RANGE} {not_range|begin AND end}
not_range ::= {UNBOUNDED PRECEDING
|CURRENT ROW|
|value PRECEDING}
begin ::= {UNBOUNDED PRECEDING
|CURRENT ROW|
|value {PRECEDING|FOLLOWING}}
end ::= {UNBOUNDED FOLLOWING
|CURRENT ROW|
|value {PRECEDING|FOLLOWING}}
4.3.3 DB2's Window Syntax
DB2's syntax is
similar to Oracle's. For OLAP, ranking, and
numbering functions, DB2 allows the following syntax:
FUNCTION_NAME(expr) OVER (window_clause)
window_clause ::= [partitioning] [ordering]
partitioning ::= PARTITION BY (value [, value...])
ordering ::= {ORDER BY rule [, rule...] | ORDER OF table_name}
rule ::= {value|position|alias} [ASC|DESC [NULLS {FIRST|LAST}]]
When aggregate functions (e.g. AVG) are used as
window functions, DB2 allows the addition of a
framing clause:
FUNCTION_NAME(expr) OVER (window_clause)
window_clause ::= [partitioning] [ordering [framing]] [all|framing]
all ::= RANGE UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
partitioning ::= PARTITION BY (value [, value...])
ordering ::= {ORDER BY rule [, rule...] | ORDER OF table_name}
rule ::= {value|position|alias} [ASC|DESC [NULLS {FIRST|LAST}]]
framing ::= {ROWS|RANGE} {group_start|group_between|group_end}
group_start ::= {UNBOUNDED PRECEDING|unsigned-integer PRECEDING
|CURRENT ROW}
group_between ::= BETWEEN {UNBOUNDED PRECEDING|unsigned_integer PRECEDING
|unsigned_integer FOLLOWING|CURRENT ROW}
AND {UNBOUNDED FOLLOWING|unsigned_integer PRECEDING
|unsigned_integer FOLLOWING|CURRENT ROW}
group_end ::= UNBOUNDED FOLLOWING|unsigned-integer FOLLOWING}
4.3.4 Partitioning
Partitioning the rows operated on by the partition clause is similar
to using the GROUP BY expression on a standard
SELECT statement. The partitioning clause takes a list of
expressions that will be used to divide the result set into groups.
We'll use the following table as the basis for some
examples:
SELECT * FROM odd_nums;
NUM ODD
---------- ----------
0 0
1 1
2 0
3 1
The following results illustrate the effects of partitioning by
ODD. The sum of the even numbers is
2 (0+2), and the sum of the odd numbers is 4 (1+3). The second column
of the result set reports the sum of all values in the partition to
which that row belongs. Yet all the detail rows
are returned. The query provides summary results in the context of
detail rows:
SELECT NUM, SUM(NUM) OVER (PARTITION BY ODD) S
FROM ODD_NUMS;
NUM S
--------- ----------
0 2
2 2
1 4
3 4
Not using a partitioning clause at all will sum all of the numbers in
the NUM column for each row returned
by the query. In effect, the entire result set is treated as a
single, large partition:
SELECT NUM, SUM(NUM) OVER ( ) S FROM ODD_NUMS;
NUM S
--------- ----------
0 6
1 6
2 6
3 6
4.3.5 Ordering
You specify the order of the rows on which an analytic function
operates using the ordering
clause. However, this analytic ordering
clause does not define the result set ordering. To define the overall
result set ordering, you must use the query's
ORDER BY clause. The following use of
Oracle's FIRST_VALUE function
illustrates the effect of different orderings of the partitions:
SELECT NUM,
SUM(NUM) OVER (PARTITION BY ODD) S,
FIRST_VALUE(NUM) OVER (PARTITION BY ODD ORDER BY NUM ASC) first_asc,
FIRST_VALUE(NUM) OVER (PARTITION BY ODD ORDER BY NUM DESC) first_desc
FROM ODD_NUMS;
NUM S FIRST_ASC FIRST_DESC
---------- ---------- ---------- ----------
0 2 0 2
2 2 0 2
1 4 1 3
3 4 1 3
As you can see, the ORDER BY clauses in the
window function invocations affect the ordering of the rows in the
respective partitions when those functions are evaluated. Thus,
ORDER BY NUM ASC orders partitions in ascending
order, resulting in 0 for the first value in the
even-number partition and 1 for the first value in
the odd-number partition. ORDER BY NUM DESC has
the opposite effect.
|
4.3.6 Grouping or Windowing
Many analytic functions also allow you to specify a virtual, moving
window surrounding a row within a partition. You do this using the
framing
clause. Such moving windows are useful for running
calculations such as a running total.
The following, Oracle-based example uses the
framing clause on the analytic variant of
SUM to calculate a running sum of the values in
the first column. No partitioning clause is used, so each invocation
of SUM operates over the entire result set.
However, the ORDER BY clause sorts the rows for
SUM in ascending order of NUM's value, and the
BETWEEN clause (which is the windowing clause)
causes each invocation of SUM to include values
for NUM only up through the current
row. Each successive invocation of SUM includes
yet another value for NUM, in order,
from the lowest value of NUM to the
greatest:
SELECT NUM, SUM(NUM) OVER (ORDER BY NUM ROWS
BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) S FROM ODD_NUMS;
NUM S
--------- ----------
0 0
1 1
2 3
3 6
This example's a bit too easy, as the order of the
final result set happens to match the order of the running total.
That doesn't need to be the case. The following
example generates the same results, but in a different order. You can
see that the running total values are appropriate for each value of
NUM, but the rows are presented in a
different order than before. The result set ordering is completely
independent of the ordering used for window function calculations:
SELECT NUM, SUM(NUM) OVER (ORDER BY NUM ROWS
BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) S FROM ODD_NUMS
ORDER BY NUM DESC;
NUM S
---------- ----------
3 6
2 3
1 1
0 0
4.3.7 List of Window Functions
SQL2003 specifies that any aggregate function may also be used as a
window function. Both Oracle and DB2 largely follow the standard in
that respect, so you'll find that you can take just
about any aggregate function (certainly the standard ones) and apply
to it the window function syntax described in the preceding sections.
In addition to the aggregate functions, SQL2003 defines the window
functions described in the following sections. Only Oracle and DB2
currently implement these functions. All examples use the following
table and data, which is a variation on the ODD_NUMS table used earlier to illustrate the
concepts of partitioning, ordering, and grouping:
SELECT * FROM test4;
NUM ODD
---------- ----------
0 0
1 1
2 0
3 1
3 1
4 0
5 1
Platform-specific window functions for Oracle (there are none for
DB2) are included in the lists found under Section 4.5 later in this chapter.
CUME_DIST( ) |
Calculates the cumulative distribution, or relative rank, of the
current row to other rows in the same partition. The calculation for
a given row is as follows:
number of peer or preceding rows / number of rows in partition
Because the result for a given row depends on the number of rows
preceding that row in the same partition, it's
important to always specify an ORDER BY clause
when invoking this function.
SQL2003 Syntax
CUME_DIST( ) OVER {window_name|(window_specification)}
DB2
DB2 does not support the CUME_DIST( ) window
function.
Oracle
Oracle does not allow the framing portion of the windowing syntax.
Oracle requires the ordering clause:
CUME_DIST( ) OVER ([partitioning] ordering)
Example
The following Oracle-based example uses CUME_DIST(
) to generate a relative rank for each row, ordering by
NUM, after partitioning the data by
ODD:
SELECT NUM, ODD, CUME_DIST( ) OVER
(PARTITION BY ODD ORDER BY NUM) cumedist
FROM test4;
NUM ODD CUMEDIST
---------- ---------- ----------
0 0 .333333333
2 0 .666666667
4 0 1
1 1 .25
3 1 .75
3 1 .75
5 1 1
Following is an explanation of the calculation behind the rank for
the row in which NUM=0:
Because of the ORDER BY clause, the rows in the
partition are ordered as follows:
NUM=0NUM=2NUM=4
There are no rows preceding NUM=0.
There is one row that is a peer of NUM=0, and that is the NUM=0 row itself. Thus, the divisor is 1.
There are three rows in the partition as a whole, making the dividend
3.
The result of 1/3 is.33 repeating, as shown in the example output.
DENSE_RANK( ) |
Assigns a rank to each row in a partition, which should be ordered in
some manner. The rank for a given row is computed by counting the
number of rows preceding the row in question, and then adding 1 to
the result. Rows with duplicate ORDER BY values
will rank the same. Unlike the case with RANK(
), gaps in rank numbers will not result from two rows
sharing the same rank.
SQL2003 Syntax
DENSE_RANK( ) OVER {window_name|(window_specification)}
DB2
DB2 requires the ordering clause and does
not allow the framing clause:
DENSE_RANK( ) OVER ([partitioning] ordering)
Oracle
Oracle also requires the ordering clause
and does not allow the framing clause:
DENSE_RANK( ) OVER ([partitioning] ordering)
Example
Compare the results from the following Oracle-based example to those
shown in the section on the RANK( ) function:
SELECT NUM, DENSE_RANK( ) OVER (ORDER BY NUM) rank
FROM test4;
NUM RANK
---------- ----------
0 1
1 2
2 3
3 4
3 4
4 5
5 6
The two rows where NUM=3 are both ranked at #3. The next higher row
is ranked at #4. Rank numbers are not skipped, hence the term
"dense."
RANK( ) |
Assigns a rank to each row in a partition, which should be ordered in
some manner. The rank for a given row is computed by counting the
number of rows preceding the row in question, and then adding 1 to
the result. Rows with duplicate ORDER BY values
will rank the same, and will lead to subsequent gaps in rank numbers.
SQL2003 Syntax
RANK( ) OVER {window_name|(window_specification)}
DB2
DB2 requires the ordering clause and does
not allow the framing clause:
RANK( ) OVER ([partitioning] ordering)
Oracle
Oracle also requires the ordering clause
and does not allow the framing clause:
RANK( ) OVER ([partitioning] ordering)
Example
The following Oracle-based example uses the NUM column to rank the rows in the test4 table:
SELECT NUM, RANK( ) OVER (ORDER BY NUM) rank
FROM test4;
NUM RANK
---------- ----------
0 1
1 2
2 3
3 4
3 4
4 6
5 7
Because both rows where NUM=3 rank
the same at #4, the next higher row will be ranked at #6. The #5 rank
is skipped.
PERCENT_RANK |
Computes the relative rank of a row by dividing that
row's rank less 1 by the number of rows in the
partition, also less 1:
(rank - 1) / (rows - 1)
Compare this calculation to that used for
CUME_DIST.
SQL2003 Syntax
PERCENT_RANK( ) OVER ({window_name|(window_specification)}
DB2
DB2 does not support the PERCENT_RANK( ) window
function.
Oracle
Oracle requires the ordering clause and
does not allow the framing clause:
PERCENT_RANK( ) OVER ([partitioning] ordering)
Example
The following, Oracle-based example assigns a relative rank to values
of NUM, partitioning the data on the
ODD column:
SELECT NUM, ODD, PERCENT_RANK( ) OVER
(PARTITION BY ODD ORDER BY NUM) cumedist
FROM test4;
NUM ODD CUMEDIST
---------- ---------- ----------
0 0 0
2 0 .5
4 0 1
1 1 0
3 1 .333333333
3 1 .333333333
5 1 1
Following is an explanation of the calculation behind the rank for
the row in which NUM=2:
Row NUM=2 is the second row in its
partition; thus, it ranks #2.
Subtract 1 from 2 to get a divisor of 1.
The dividend is the total number of rows in the partition, or 3.
Subtract 1 from 3 to get a dividend of 2.
The result of 1/3 is.33 repeating, as shown in the example.
ROW_NUMBER |
Assigns a unique number to each row in a partition.
SQL2003 Syntax
ROW_NUMBER( ) OVER ({window_name|(window_specification)}
DB2
DB2 does not allow the framing clause, and
it makes the ordering clause optional:
ROW_NUMBER( ) OVER ([partitioning] [ordering])
Oracle
Oracle requires the ordering clause and
does not allow the framing clause:
ROW_NUMBER( ) OVER ([partitioning] ordering)
Example
SELECT NUM, ODD, ROW_NUMBER( ) OVER
(PARTITION BY ODD ORDER BY NUM) cumedist
FROM test4;
NUM ODD CUMEDIST
---------- ---------- ----------
0 0 1
2 0 2
4 0 3
1 1 1
3 1 2
3 1 3
5 1 4