 |
|
Advanced Oracle SQL Aggregation, Hierarchies and Analytics
Oracle Tips by Laurent Schneider
|
Laurent
Schneider is considered one of the top Oracle SQL experts, and
he is the author of the book "Advanced
Oracle SQL Programming" by Rampant TechPress. The following
is an excerpt from the book.
Aggregation
It is possible to use aggregation, hierarchies
and analytics within the same query:
SELECT
DEPTNO,
DECODE
(
CONNECT_BY_IS_LEAF,
0, 'LEADER',
1, 'EMPLOYEE',
'TOTAL'
) LEADFUNC,
COUNT(*),
SUM
(
COUNT(*)
)
OVER
(
PARTITION BY DEPTNO
) TOTAL_DEPT
FROM
EMP
CONNECT BY
PRIOR EMPNO=MGR
START WITH
MGR IS NULL
GROUP BY
DEPTNO,
CONNECT_BY_IS_LEAF
ORDER BY
DEPTNO,
CONNECT_BY_IS_LEAF;
DEPTNO
LEADFUNC COUNT(*) TOTAL_DEPT
---------- -------- ---------- ----------
10 LEADER 2 3
10 EMPLOYEE 1 3
20 LEADER 3 5
20 EMPLOYEE 2 5
30 LEADER 1 6
30 EMPLOYEE 5 6
For each department, the leaves, or employees
with no subordinate, and the leaders are aggregated. A global count
over the whole department is returned.
Aggregation within a hierarchy is done using the
first ancestor operator CONNECT_BY_ROOT
(10g).
SELECT
R_ENAME MANAGER,
MAX(ENAME) KEEP
(
DENSE_RANK FIRST
ORDER BY SAL DESC
) BEST_PAID_EMPLOYEE,
MAX(SAL) SALARY,
ROUND(AVG(SAL)) AVERAGE_SAL_EMPLOYEES
FROM
(
SELECT
CONNECT_BY_ROOT ENAME R_ENAME,
ENAME,
SAL
FROM
EMP
WHERE
CONNECT_BY_IS_LEAF=1
AND
LEVEL>1
CONNECT BY
PRIOR EMPNO=MGR
)
GROUP BY
R_ENAME;
MANAGER BEST_PAID_ SALARY
AVERAGE_SAL_EMPLOYEES
------- ---------- ---------- ---------------------
BLAKE ALLEN 1600 1310
CLARK MILLER 1300 1300
FORD SMITH 800 800
JONES ADAMS 1100 950
KING ALLEN 1600 1219
SCOTT ADAMS 1100 1100
The subquery selects bosses with their
employees. CONNECT_BY_ROOT(10g)
saves the ancestor when traversing the hierarchy. CONNECT_BY_IS_LEAFreturns the subordinates with no leading function. LEVEL>1 prevents returning
the employee as his own ancestor.
The main query then groups by the root name and
returns the best paid employee with his name and his salary.
Additionally, the average salary is returned.
A common application of hierarchy is finding the
shortest path between two nodes.
--
*************************************************
-- Copyright © 2008 by Rampant TechPress
-- This script is free for non-commercial purposes
-- with no warranties. Use at your own risk.
--
-- To license this script for a commercial purpose,
-- contact rtp@rampant.cc
-- *************************************************
-- Id : $Id: tablepath.sql,v 1.1 2008/05/23 13:45:23 Laurent Exp
$
-- Author : $Author: Laurent $
-- Date : $Date: 2008/05/23 13:45:23 $
--
-- Create PATH Table in current schema
--
WHENEVER SQLERROR EXIT
EXEC EXECUTE IMMEDIATE 'DROP TABLE PATH'; EXCEPTION WHEN OTHERS THEN
NULL
-- Create Table
CREATE TABLE
PATH
(
SRC VARCHAR2(3),
DST VARCHAR2(3),
DISTANCE NUMBER
);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'A', 'A', 0);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'A', 'B', 8);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'A', 'C', null);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'A', 'D', 40);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'B', 'A', 8);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'B', 'B', 0);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'B', 'C', 15);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'B', 'D', 50);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'C', 'A', null);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'C', 'B', 15);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'C', 'C', 0);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'C', 'D', 1);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'D', 'A', 40);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'D', 'B', 50);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'D', 'C', 1);
INSERT INTO
PATH(SRC,DST,DISTANCE)
VALUES
( 'D', 'D', 0);
COMMIT;
Figure 6.2:
Distances Between Nodes
CREATE
FUNCTION EVAL(EXPR VARCHAR2) RETURN NUMBER IS
RC NUMBER;
BEGIN
EXECUTE IMMEDIATE 'BEGIN :1 := '||EXPR||'; END;' USING OUT RC;
RETURN RC;
END;
/
SELECT
ROOTSRC,
DST,
MIN(N)
FROM
(
SELECT
CONNECT_BY_ROOT SRC ROOTSRC,
DST,
EVAL(SYS_CONNECT_BY_PATH(NVL(TO_CHAR(DISTANCE),'NULL'),'+')) N
FROM
PATH
CONNECT BY
NOCYCLE
PRIOR DST=SRC
)
GROUP BY
ROOTSRC,
DST
ORDER BY
ROOTSRC,
DST;
R D
MIN(N)
- - ----------
A A 0
A B 8
A C 23
A D 24
B A 8
B B 0
B C 15
B D 16
C A 23
C B 15
C C 0
C D 1
D A 24
D B 16
D C 1
D D 0
There are multiple ways to go from B to D, but
the shortest is via C for a cost of 15+1=16. To evaluate the cost
of the path, the SYS_CONNECT_BY_PATH returns a plus-separated
string. This string is evaluated in the function EVAL that uses
dynamic SQL.
Conclusion
Hierarchical queries, which are where a
hierarchy is built upon a parent-child relationship within the same
table or view, have been a part of Oracle databases for more than 20
years. The main keywords in building a hierarchy are STAND BY,
PRIOR and START WITH. From there, several different functions can be
used to acquire different results such as SYS_CONNECT_BY_PATH,
CONNECT_BY_ROOT, and CONNECT_BY_IS_LEAF. These keywords and
functions are just some of the elements of hierarchies that covered
in this chapter.