Farrago SQL/MED Support

This document describes Farrago support for SQL/MED (Management of External Data), the portion of the SQL:1999 standard which governs access to foreign data (both relational and non-relational) hosted in other servers.

Overview

SQL/MED defines several very independent interfaces: Farrago supports the standard DDL interface but not (currently) the standard foreign data wrapper service provider interface. Instead, Farrago defines a somewhat simpler non-standard service provider interface together with some prepackaged implementations: Eventually, if standard SQL/MED foreign data wrapper drivers become popular, Farrago will either provide a bridge for them as another implementation of its non-standard interface, or will make its own non-standard interface obsolete. The standard DDL will remain unchanged in any event.

Foreign Data Wrapper Definition

The standard SQL/MED DDL statement to install a driver is CREATE FOREIGN DATA WRAPPER:

CREATE FOREIGN DATA WRAPPER foreign-data-wrapper-name
[ LIBRARY 'libraryName' ]
LANGUAGE language-name
[ OPTIONS ( opt-name1 'opt-value1' [, opt-name2 'opt-value2' ... ] ) ]
Initially, Farrago requires the LANGUAGE to be JAVA. The LIBRARY name specifies the path to a JAR containing the wrapper implementation. Here's a fictitious example for a foreign data wrapper capable of fetching XML files via TCP/IP:

CREATE FOREIGN DATA WRAPPER xml_tcpip
LIBRARY '/home/jvs/wrappers/FarragoXmlSourceReader.jar'
LANGUAGE JAVA
OPTIONS ( protocol 'TCP/IP' )
According to the standard, foreign data wrappers are catalog objects. In Farrago, they are implicitly created under the SYS_BOOT catalog (and so must have top-level unique names).

As an extension, Farrago also allows a foreign data wrapper to be specified as the fully-qualified name of a class available on the server's classpath (instead of in a particular JAR file):


CREATE FOREIGN DATA WRAPPER xml_
LIBRARY 'class com.yoyodyne.farrago.plugin.xml.FarragoXmlDataWrapper'
LANGUAGE JAVA
OPTIONS ( protocol 'TCP/IP' )

Foreign Server Definition

Once a foreign data wrapper has been installed, it can be bound to data provided by a particular server with the CREATE SERVER statement:

CREATE SERVER foreign-server-name
[ TYPE 'server-type' ]
[ VERSION 'server-version' ]
FOREIGN DATA WRAPPER foreign-data-wrapper-name
[ OPTIONS ( ... ) ]
Example using the previously defined wrapper:

CREATE SERVER movie_catalog
TYPE 'ANY'
VERSION '1.1'
FOREIGN DATA WRAPPER xml_tcpip
OPTIONS ( url 'http://www.movie-fun-facts.org' )
Any type of web server would be allowed, but it would have to support HTTP version 1.1 (and the wrapper would expect HTTP features to work accordingly). As with foreign data wrappers, foreign servers are created under the SYS_BOOT catalog, and their names must not conflict with any other server (or catalog) in the system.

According to the SQL/MED standard, defining a foreign server does not make its data accessible (the CREATE FOREIGN TABLE or IMPORT FOREIGN SCHEMA statements described later are required first). However, Farrago introduces a non-standard mechanism for direct access. Each foreign server defined implies a corresponding top-level virtual catalog. These catalogs are virtual in the sense that no local metadata is stored for them (and no INFORMATION_SCHEMA is defined). Instead, metadata is acquired on-the-fly during query processing. For example, after the above CREATE SERVER example, the following query might be legal:


SELECT DISTINCT r.name
FROM movie_catalog.review_facts.reviewers r,
     movie_catalog.movie_facts.actors a
WHERE r.name=a.name
This queries the names of all actors who have the same names as movie reviewers. The FarragoXmlSourceReader wrapper implementation might map each subdirectory under the root URL to a schema, and xml files in those subdirectories to tables. However, a metadata query like

SELECT * 
FROM movie_catalog.information_schema.table
would fail unless the FarragoXmlSourceReader implementation happened to support the INFORMATION_SCHEMA (unlikely in this case, but more likely for a JDBC foreign data wrapper).

Virtual catalogs are read-only (it is not possible to create new schemas, tables, or other objects under them).

Foreign Table Definition

In order for metadata about a foreign table to be stored in the LOCALDB catalog, a CREATE FOREIGN TABLE statement is required:

CREATE FOREIGN TABLE table-name
SERVER foreign-server-name
[ ( basic-column-definition1 [, basic-column-definition2 ... ] ) ]
[ OPTIONS ( ... ) ]

A basic-column-definition consists of a name, a datatype, and (optionally) a column-level OPTIONS clause.

For example:


CREATE FOREIGN TABLE movie_schema.reviewers
SERVER movie_catalog
OPTIONS ( directory 'review_facts', file 'reviewers.xml' )
Here, the foreign XML file previously referenced implicitly (via SQL identifiers in the virtual catalog) is now referenced explicitly (via wrapper-specific options). The local schema MOVIE_SCHEMA must already exist under the LOCALDB catalog. Since column definitions were omitted, the wrapper must be capable of deriving these automatically from the XML file contents. Now, we can issue a query like

SELECT * 
FROM movie_schema.reviewers
or a metadata query like

SELECT column_name
FROM localhost.information_schema
WHERE table_schema = 'MOVIE_SCHEMA' 
AND table_name = 'REVIEWERS'

Foreign Schema Import

In order to import metadata for a number of tables at once, SQL/MED provides the IMPORT FOREIGN SCHEMA statement (available in version 0.6):

IMPORT FOREIGN SCHEMA foreign-schema-name
[ 
 { LIMIT TO | EXCEPT }
 { ( table1 [, table2 ... ] ) | TABLE_NAME LIKE 'pattern' }
]
FROM SERVER foreign-server-name
INTO local-schema-name
The LIMIT TO clause restricts the import to an explicit list of table names or those matching a LIKE pattern. If a list is provided, all of the named tables must exist on the foreign server or the import will fail. Conversely, the EXCEPT clause imports everything except those named in the list or matching the LIKE pattern. Note that the LIKE pattern is a non-standard extension.

For the running example (this imports only table names starting with R):


IMPORT FOREIGN SCHEMA movie_facts
LIMIT TO TABLE_NAME LIKE 'R%'
FROM SERVER movie_catalog
INTO localhost.movie_schema
Note that as with CREATE FOREIGN TABLE, local schema MOVIE_SCHEMA must already exist before the IMPORT statement is executed.

Additional DDL

SQL/MED defines statements such as DROP/ALTER for foreign data wrappers, servers, and tables. It also defines additional DDL for controlling mapping of users and routines. For the details, consult the standard. Currently, Farrago only supports CREATE/DROP.

Java Interfaces

The interfaces which must be implemented in order to write a new foreign data wrapper are defined in package net.sf.farrago.namespace. Abstract bases in package net.sf.farrago.namespace.impl ease the construction of new wrappers. It is currently a fairly difficult task to develop a new wrapper; eventually a stripped down service provider interface may be defined to make it easier to wrap simple data. Another option for some types of data is to acquire a JDBC driver and use Farrago's JDBC wrapper.

Repository Model

Since CWM does not cover any SQL/MED metadata, the repository model for foreign data is defined in the Farrago extension model (FEM). Each foreign table has an association with a corresponding foreign server, together with options specified when it was created. Catalog initialization also creates a representation for the local server, to which tables stored locally are attached. Catalog initialization creates virtual catalogs for the local MDR repository metadata (using MedMdrForeignDataWrapper). Below is a UML structure diagram for the SQL/MED model:
(Apologies for that interloper in the background!)
The two associations (WrapperAccessesServer and ServerStoresTable) also imply dependencies for DROP RESTRICT/CASCADE.