 |
|
EnterpriseDB: XML in SPL
Oracle Tips by Burleson Consulting
|
Oracle
provides an XML data type called XMLType. This data type is an
object extension of the CLOB data type. It provides methods for
initialization and manipulation.
EnterpriseDB Advanced Server does not have an actual XML data type.
Instead, XML is stored as text. EnterpriseDB does have an XML
extension that helps with XML manipulation. This section will show
how to utilize that extended functionality.
The
eXtensible Markup Language (XML) topic is much too vast to cover in
this book. This will be a brief overview of using the XML
extensions provided to the PostgreSQL community (BSD License) by
Torchbox Ltd. This extension does not replace the Oracle XMLType
data type but it does provide much of the same functionality. I
will describe some ways to mimic the XMLType functionality in the
chapter on Oracle Compatibility.
All of
the supplied functionality is based on XPATH expressions and
requires a decent level of XML knowledge. XPATH is a method for
addressing parts of an XML document via a hierarchical format. This
section is not large enough to teach XPATH but it will show how to
use the supplied functions.
First,
I will define the XML documents that I will be using. This is a
fairly simple document. I want to demonstrate the supplied
functionality, not teach XML.
<Employee dname="RESEARCH" >
<Ename>SMITH</Ename><Empno>7369</Empno><Job>CLERK</Job></Employee>
<Employee dname="SALES" >
<Ename>ALLEN</Ename><Empno>7499</Empno><Job>SALESMAN</Job></Employee>
<Employee dname="SALES" >
<Ename>WARD</Ename><Empno>7521</Empno><Job>SALESMAN</Job></Employee>
<Employee dname="RESEARCH" >
<Ename>JONES</Ename><Empno>7566</Empno><Job>MANAGER</Job></Employee>
The
above XML contains 4 employee records (each as a separate
document). Each employee record contains a dname attribute that
lists the employee's department name. Each record also contains the
employee name, number and job.
I will
also create a table so that I don't have to continually repeat the
above XML document in all of my queries. There is no XML data type,
so I will create the column as a TEXT data type. I will also create
a sequence driven primary key using the serial data type.
CREATE TABLE xml_test (
xml_test_pk serial NOT NULL PRIMARY KEY,
xml_data VARCHAR2(4000) );
I will
then insert the above set of XML into the new table:
INSERT INTO xml_test (xml_data) VALUES (
'<Employee dname="RESEARCH"
><Ename>SMITH</Ename><Empno>7369</Empno><Job>CLERK</Job></Employee>'
);
INSERT INTO xml_test (xml_data) VALUES (
'<Employee dname="SALES"
><Ename>ALLEN</Ename><Empno>7499</Empno><Job>SALESMAN</Job></Employee>'
);
INSERT INTO xml_test (xml_data) VALUES (
'<Employee dname="SALES"
><Ename>WARD</Ename><Empno>7521</Empno><Job>SALESMAN</Job></Employee>'
);
INSERT INTO xml_test (xml_data) VALUES (
'<Employee dname="RESEARCH"
><Ename>JONES</Ename><Empno>7566</Empno><Job>MANAGER</Job></Employee>'
);
COMMIT;
As a
reminder, you should remember that the serial data type is tied to a
sequence. There is no reason to supply a primary key value as it
will automatically increment.
The
guts of the XML support in EnterpriseDB Advanced Server is tied to
extracting values from XML rather than creating XML. As I said
above, I will show you several ways to mimic the functionality of
XMLType and SQL/XML in Chapter 8, Oracle Compatibility.
EnterpriseDB provides four functions for manipulating XML:
*
xpath_string and xpath_number – Extract a value based on an XPATH
expression
*
xpath_nodeset – Extra a node from an XML document based on an XPATH
expression
*
xpath_table – Turn an XML document (or parts of a document) into a
virtual table suitable for selecting or joining to other tables
*
xslt_process – Transform an XML document using XSLT
I will
start with xpath_string and xpath_number. Both functions take as
parameters: the document (in our case the column) and an XPATH
expression. The syntax is:
FUNCTION xpath_string (
xml_document IN TEXT,
xpath_expression IN TEXT )
RETURN TEXT;
FUNCTION xpath_number (
xml_document IN TEXT,
xpath_expression IN TEXT )
RETURN NUMBER;
Here
is a simple query against the table above:
SELECT xpath_string(xml_data, '/Employee/@dname')
dept,
xpath_string(xml_data, '/Employee/Ename') name,
xpath_number(xml_data, '/Employee/Empno') emp_no
FROM xml_test
WHERE xpath_string(xml_data, '/Employee/Job') = 'SALESMAN';
dept | name | emp_no
-------+-------+--------
SALES | ALLEN | 7499
SALES | WARD | 7521
(2 rows)
Notice
that I used xpath_string and xpath_number and even used xpath_string
in the where clause.
If you
refer back to the documents above, we had two salesmen, WARD and
ALLEN.
The
next function is the xpath_nodeset. This function works just like
the above two functions but returns a node from the document instead
of scalar values. The node is returned as TEXT. The syntax is:
FUNCTION xpath_nodeset (
xml_document IN TEXT,
xpath_expression IN TEXT )
RETURN TEXT;
A
simple query using xpath_nodeset:
SELECT xpath_nodeset(xml_data, '/Employee/Empno')
FROM xml_test;
xpath_nodeset
---------------------
<Empno>7369</Empno>
<Empno>7499</Empno>
<Empno>7521</Empno>
<Empno>7566</Empno>
(4 rows)
This
function may not look very useful at this point but when you are
writing XML based applications, this function can be used as input
to xpath_string or xpath_number to ease extracting scalar values.
Our example is very basic but in XML documents with many levels, it
is sometimes easier to work with individual chunks rather than the
entire document.
The
next function is the xpath_table function. This is an exciting
function because it allows you to use an XML document as a virtual
table. The tags within the document become your virtual columns.
The
syntax is a bit more convoluted than the functions we have already
looked at.
FUNCTION xpath_table(
xml_table_pk TEXT,
xml_column TEXT,
xml_table TEXT
xpath_expressions TEXT,
limit_criteria TEXT )
RETURN t_virtual_table( column and data type list );
The
first parameter, xml_table_pk is the primary key of the table
containing the XML document. The next two parameters, xml_column
and xml_table are the column containing the XML document and the
table containing that column.
The
fourth parameter is a pipe (|) delimited list of XPATH expressions.
Each expression will create a column in the virtual table.
The
final parameter is a dynamic where clause that is appended to the
underlying XML document to restrict the values returned. This
parameter is required. If you want all virtual rows in the virtual
table, you can use '1=1' as I did in the example below.
SELECT
xml_test_pk, employee, department
FROM xpath_table(
'xml_test_pk',
'xml_data',
'xml_test',
'/Employee/Ename|/Employee/@dname',
'1=1' )
AS t_virtual_table( xml_test_pk integer, employee text, department
text );
xml_test_pk | employee | department
-------------+----------+------------
1 | SMITH | RESEARCH
2 | ALLEN | SALES
3 | WARD | SALES
4 | JONES | RESEARCH
(4 rows)
If I
wanted to limit the results to only those employees who work in the
SALES department, I could re-write the call like this:
SELECT xml_test_pk, employee, department
FROM xpath_table(
'xml_test_pk',
'xml_data',
'xml_test',
'/Employee/Ename|/Employee/@dname',
'xpath_string(xml_data, ''/Employee/@dname'') = ''SALES'''
)
AS t_virtual_table( xml_test_pk integer, employee text, department
text );
xml_test_pk | employee | department
-------------+----------+------------
2 | ALLEN | SALES
3 | WARD | SALES
(2 rows)
Because the limiting criteria is a string, embedded quotes need to
be doubled; two quotes ('') inside a string a translated to a single
quote in the final string.
In
addition, you can see that the limiting criteria is applied against
the base table so you can refer to a limit within the XML document
as I did, or you can refer directly to a column. Below is an
example limiting the XML to a specific row:
SELECT xml_test_pk, employee, department
FROM xpath_table(
'xml_test_pk',
'xml_data',
'xml_test',
'/Employee/Ename|/Employee/@dname',
'xml_test_pk = 4' )
AS t_virtual_table( xml_test_pk integer, employee text, department
text );
xml_test_pk | employee | department
-------------+----------+------------
4 | JONES | RESEARCH
(1 row)
The
final XML built-in function that I will cover here is xslt_process.
XSLT is the eXtensible Stylesheet Language for Transformations.
Like XPATH and XML itself, this chapter is not large enough to
explain all of the intricacies of XSLT. XSLT is used to transform
an XML document from one format to another. In the example below, I
will transform one of the rows from the xml_test table above into an
html output document.
The
syntax for the xslt_process function is:
FUNCTION xslt_process (
xml_document IN TEXT,
xsl_document IN TEXT )
RETURN TEXT;
The
first step is to create a table to hold the XSLT:
CREATE TABLE xsl_test (
xsl_data VARCHAR2(4000) );
I
create an XSLT document to transform the XML and insert it into the
table:
INSERT INTO xsl_test (xsl_data)
VALUES ('<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output indent="yes" cdata-section-elements="DESCRIPTION"/>
<xsl:template match="/Employee">
<html>
<body>
<table>
<tr>
<td><b>Employee Name: </b></td>
<td><xsl:value-of select="Ename"/></td>
</tr>
<tr>
<td><b>Employee Number: </b></td>
<td><xsl:value-of select="Empno"/></td>
</tr>
<tr>
<td><b>Employee Department: </b></td>
<td><xsl:value-of select="@dname"/></td>
</tr>
<tr>
<td><b>Employee Job: </b></td>
<td><xsl:value-of select="Job"/></td>
</tr>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>');
A very
quick description of the XSLT is:
* The
first couple of lines are required for an XSLT document
* I
set the template which tells the XSLT to start with the /Employee
node
* The
output will contain, in order, HTML, BODY and TABLE tags
* For
each Employee, the XSLT will create a TR (table row) tag
* For
each column in the Employee node, the XSLT will create a TD (table
data) tag
* The
XSLT ends with closing tags for the table, body, html, template and
the document itself
I will
combine the XML and the XSLT via a SELECT statement. All of these
XML functions can be called via SPL also.
SELECT xslt_process(xml.xml_data, xsl.xsl_data)
FROM xml_test xml, xsl_test xsl
LIMIT 1;
Notice
that the query is a Cartesian join (no join criteria) and I am
limiting my result set to a single row. The Cartesian is to combine
every row in the xml_test table with the single row in the xsl_test
table. The limit is to save space.
The
results from the query (which I have modified for readability):
xslt_process
-------------------------------------------------
<html>
<body>
<table>
<tr>
<td><b>Employee Name: </b></td>
<td>SMITH</td>
</tr>
<tr>
<td><b>Employee Number: </b></td>
<td>7369</td>
</tr>
<tr>
<td><b>Employee Department: </b></td>
<td>RESEARCH</td>
</tr>
<tr>
<td><b>Employee Job: </b></td>
<td>CLERK</td>
</tr>
</table>
</body>
</html>
(1 row)
If you
save this HTML document to a file and run it, it looks like this:
Download the complete XML example from the code depot at:
edb_chap_3_xml_example.sql
*
edb_chap_3_xml_example.sql
CREATE TABLE xml_test (
xml_test_pk serial NOT NULL PRIMARY KEY,
xml_data VARCHAR2(4000) );
INSERT INTO xml_test (xml_data) VALUES (
'<Employee dname="RESEARCH"
><Ename>SMITH</Ename><Empno>7369</Empno><Job>CLERK</Job></Employee>'
);
INSERT INTO xml_test (xml_data) VALUES (
'<Employee dname="SALES"
><Ename>ALLEN</Ename><Empno>7499</Empno><Job>SALESMAN</Job></Employee>'
);
INSERT INTO xml_test (xml_data) VALUES (
'<Employee dname="SALES"
><Ename>WARD</Ename><Empno>7521</Empno><Job>SALESMAN</Job></Employee>'
);
INSERT INTO xml_test (xml_data) VALUES (
'<Employee dname="RESEARCH"
><Ename>JONES</Ename><Empno>7566</Empno><Job>MANAGER</Job></Employee>'
);
COMMIT;
SELECT xpath_string(xml_data, '/Employee/@dname')
dept,
xpath_string(xml_data, '/Employee/Ename') name,
xpath_number(xml_data, '/Employee/Empno') emp_no
FROM xml_test
WHERE xpath_string(xml_data, '/Employee/Job') = 'SALESMAN';
SELECT xpath_nodeset(xml_data, '/Employee/Empno')
FROM xml_test;
SELECT
xml_test_pk, employee, department
FROM xpath_table(
'xml_test_pk',
'xml_data',
'xml_test',
'/Employee/Ename|/Employee/@dname',
'1=1' )
AS t_virtual_table( xml_test_pk integer, employee text, department
text );
SELECT xml_test_pk, employee, department
FROM xpath_table(
'xml_test_pk',
'xml_data',
'xml_test',
'/Employee/Ename|/Employee/@dname',
'xpath_string(xml_data, ''/Employee/@dname'') = ''SALES'''
)
AS t_virtual_table( xml_test_pk integer, employee text, department
text );
SELECT xml_test_pk, employee, department
FROM xpath_table(
'xml_test_pk',
'xml_data',
'xml_test',
'/Employee/Ename|/Employee/@dname',
'xml_test_pk = 4' )
AS t_virtual_table( xml_test_pk integer, employee text, department
text );
CREATE TABLE xsl_test (
xsl_data VARCHAR2(4000) );
INSERT INTO xsl_test (xsl_data)
VALUES ('<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output indent="yes" cdata-section-elements="DESCRIPTION"/>
<xsl:template match="/Employee">
<html>
<body>
<table>
<tr>
<td><b>Employee Name: </b></td>
<td><xsl:value-of select="Ename"/></td>
</tr>
<tr>
<td><b>Employee Number: </b></td>
<td><xsl:value-of select="Empno"/></td>
</tr>
<tr>
<td><b>Employee Department: </b></td>
<td><xsl:value-of select="@dname"/></td>
</tr>
<tr>
<td><b>Employee Job: </b></td>
<td><xsl:value-of select="Job"/></td>
</tr>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>');
SELECT xslt_process(xml.xml_data, xsl.xsl_data)
FROM xml_test xml, xsl_test xsl
LIMIT 1;
That
covers the built-in XML functionality in EnterpriseDB. In the
chapter on Oracle compatibility, I will discuss what I think is
missing and how you can generate XML as needed.
In
addition to XML support, EnterpriseDB supports many other languages
in addition to SQL and SPL. The next section will outline a few of
those languages.
This
is an excerpt from the book "EnterpriseDB:
The Definitive Reference" by Rampant TechPress.