 |
|
Advanced Oracle
SQL: Ranking Functions
Oracle Tips by Laurent Schneider
|
Laurent
Schneider is considered one of the top Oracle SQL experts, and
he is the author of the book "Advanced
Oracle SQL Programming" by Rampant TechPress. The following
is an excerpt from the book.
Ranking functions
The ranking family of functions uses ORDER BY in
the analytic clause to enumerate the rows or to retrieve previous or
next rows.
SELECT
ENAME,
HIREDATE,
ROW_NUMBER() OVER (ORDER BY HIREDATE) ROW_NUMBER,
LAG(ENAME) OVER (ORDER BY HIREDATE) LAG,
LEAD(ENAME) OVER (ORDER BY HIREDATE) LEAD
FROM
EMP
ORDER BY
HIREDATE;
ENAME HIREDATE
ROW_NUMBER LAG LEAD
---------- --------- ---------- ---------- ----------
SMITH 17-DEC-80 1 ALLEN
ALLEN 20-FEB-81 2 SMITH WARD
WARD 22-FEB-81 3 ALLEN JONES
JONES 02-APR-81 4 WARD BLAKE
BLAKE 01-MAY-81 5 JONES CLARK
CLARK 09-JUN-81 6 BLAKE TURNER
TURNER 08-SEP-81 7 CLARK MARTIN
MARTIN 28-SEP-81 8 TURNER KING
KING 17-NOV-81 9 MARTIN JAMES
JAMES 03-DEC-81 10 KING FORD
FORD 03-DEC-81 11 JAMES MILLER
MILLER 23-JAN-82 12 FORD SCOTT
SCOTT 19-APR-87 13 MILLER ADAMS
ADAMS 23-MAY-87 14 SCOTT
ROW_NUMBER returns a row number in the specified
order. LAG returns the previous row and LEAD returns the next row.
The LAG and LEAD functions are not deterministic; for James and
Ford, the hire date is the same but the function returns different
results which may differ from one execution to another.
RANK and DENSE_RANK are deterministic.
SELECT
ENAME,
HIREDATE,
ROW_NUMBER() OVER (ORDER BY HIREDATE) ROW_NUMBER,
RANK() OVER (ORDER BY HIREDATE) RANK,
DENSE_RANK() OVER (ORDER BY HIREDATE) DENSE_RANK
FROM
EMP
ORDER BY
HIREDATE;
ENAME
HIREDATE ROW_NUMBER RANK DENSE_RANK
---------- --------- ---------- ---------- ----------
SMITH 17-DEC-80 1 1 1
ALLEN 20-FEB-81 2 2 2
WARD 22-FEB-81 3 3 3
JONES 02-APR-81 4 4 4
BLAKE 01-MAY-81 5 5 5
CLARK 09-JUN-81 6 6 6
TURNER 08-SEP-81 7 7 7
MARTIN 28-SEP-81 8 8 8
KING 17-NOV-81 9 9 9
JAMES 03-DEC-81 10 10 10
FORD 03-DEC-81 11 10 10
MILLER 23-JAN-82 12 12 11
SCOTT 19-APR-87 13 13 12
ADAMS 23-MAY-87 14 14 13
ROW_NUMBER always returns distinct numbers for
duplicates. Both RANK and DENSE_RANK return duplicate numbers for
employees with the same hire date. The difference between them is
DENSE_RANK does not skip numbers.
ROW_NUMBER, DENSE_RANK and RANK have different
effects. Because the differences between them are quite subtle, the
specification of the query must be very precise so the right one can
be used.
SELECT
DEPTNO,
ENAME,
SAL
FROM
(
SELECT
DEPTNO,
ENAME,
SAL,
ROW_NUMBER() OVER
(PARTITION BY DEPTNO ORDER BY SAL DESC) R
FROM
EMP
)
WHERE
R<=3
ORDER BY
DEPTNO,
R;
DEPTNO
ENAME SAL
---------- ---------- ----------
10 KING 5000
10 CLARK 2450
10 MILLER 1300
20 SCOTT 3000
20 FORD 3000
20 JONES 2975
30 BLAKE 2850
30 ALLEN 1600
30 TURNER 1500
Note: An analytic function cannot
be used in the WHERE clause, but only in the SELECT or in the ORDER
BY clause.
By default, LEAD and LAG return the previous and
next rows. The second argument allows retrieving the nth
previous and nth next rows. The third argument defines a
default value.
SELECT
ENAME,
SAL,
LAG(SAL,1,0) OVER (ORDER BY SAL) LAG1,
LAG(SAL,2,0) OVER (ORDER BY SAL) LAG2,
LAG(SAL,3,0) OVER (ORDER BY SAL) LAG3
FROM
EMP
ORDER BY
SAL;
ENAME
SAL LAG1 LAG2 LAG3
---------- ---------- ---------- ---------- ----------
SMITH 800 0 0 0
JAMES 950 800 0 0
ADAMS 1100 950 800 0
WARD 1250 1100 950 800
MARTIN 1250 1250 1100 950
MILLER 1300 1250 1250 1100
TURNER 1500 1300 1250 1250
ALLEN 1600 1500 1300 1250
CLARK 2450 1600 1500 1300
BLAKE 2850 2450 1600 1500
JONES 2975 2850 2450 1600
SCOTT 3000 2975 2850 2450
FORD 3000 3000 2975 2850
KING 5000 3000 3000 2975
The values of the last three rows are returned.
Non-existing values default to 0.