Introduction
eXist (http://exist.sourceforge.net) is a native open-source XML database written entirely in Java. Its main features are:
- Automatic node indexing
- XQuery queries (compliant with the W3C XQuery 1.0 working draft)
- Text search mode
- User management (access rights, ...)
eXist can be deployed in several ways:
- As a standalone Java application (eXist server in standalone mode)
- Embedded in an existing Java application (no network exchanges are involved)
- As a Web Service (using a servlet engine such as Tomcat)
How to communicate with eXist?
There are three ways to access eXist:
- REST (Representational State Transfer)
- XML-RPC (XML-Remote Procedure Call)
- SOAP (Simple Object Access Protocol)
Note that each of these protocols uses HTTP as its transport protocol. REST has one small limitation: user management is not supported through it because of a lack of security.
We will now see how these three protocols can be used in the standalone and Web Service deployments.
Standalone mode
In this mode the eXist server is running and listens on two ports: 8081 for XML-RPC requests and 8088 for REST requests.
A simple Web browser is sufficient to communicate with the REST interface. For example, the URL
http://localhost:8088/db?_query=//Resources queries the eXist server through its REST interface and returns all the XML elements named "Resources" in the database.
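The same kind of URL can also be built from a program. The following is a minimal sketch, assuming the host, port, collection path and `_query` parameter shown in the example above; the function name `build_rest_query_url` is hypothetical and not part of eXist.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_rest_query_url(xpath, host="localhost", port=8088, collection="/db"):
    """Return a REST URL that submits `xpath` through the _query parameter."""
    # urlencode percent-escapes characters that are not URL-safe
    # (spaces, quotes, and by default also '/').
    query_string = urlencode({"_query": xpath})
    return f"http://{host}:{port}{collection}?{query_string}"

url = build_rest_query_url("//Resources")
print(url)  # http://localhost:8088/db?_query=%2F%2FResources

# Decoding the query string gives back the original expression.
decoded = parse_qs(urlsplit(url).query)
```

A browser accepts the unescaped form typed by hand; escaping matters mostly for expressions that contain quotes or spaces, such as the benchmark queries further down. With a server running, the resulting URL could be fetched with `urllib.request.urlopen(url)`.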
To communicate with the XML-RPC interface you can use the Java client provided with the eXist distribution. It lets you perform every possible action on the database (user management, XQuery queries, collection management, adding or deleting XML files, ...). The eXist Java client uses the XML:DB API to communicate with the eXist server. Using this API, you can also write your own Java application that interacts with the eXist server.
The SOAP interface is not available in this mode.
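Since XML-RPC is a plain XML payload carried over HTTP, it may help to see what such a call looks like on the wire. The sketch below only serialises a request with Python's standard `xmlrpc.client` module and sends nothing to a server. The method name `exampleQuery` and its single argument are hypothetical placeholders, not eXist's actual XML-RPC method names, which should be taken from the eXist documentation.

```python
import xmlrpc.client

# Hypothetical method name and argument, used only for illustration;
# they are NOT taken from the eXist XML-RPC API.
METHOD_NAME = "exampleQuery"
xpath = "//Resources"

# dumps() serialises the call into the XML document that an XML-RPC
# client sends as the body of an HTTP POST request.
payload = xmlrpc.client.dumps((xpath,), methodname=METHOD_NAME)
print(payload)

# loads() is the inverse: it recovers the parameters and the method name.
params, method = xmlrpc.client.loads(payload)
```

A real client would post this payload to the XML-RPC port (8081 in standalone mode), typically through `xmlrpc.client.ServerProxy` rather than by hand.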
Web Service
In this mode eXist runs inside a servlet engine such as Tomcat, and the eXist servlets provide the Web Service. All three interfaces (REST, XML-RPC and SOAP) are available in this mode.
Data organization
From the user's point of view eXist is a black box: it points to a directory holding several files (dom.dbx, elements.dbx, ...) in which the XML files are stored, and we cannot see how their content is organized internally.
What we do know is that the logical organization of the XML files is hierarchical: each file belongs to a collection, and a collection can contain XML files and other collections.
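The logical hierarchy described above can be pictured as a simple tree. The sketch below is only an illustration of that logical model and says nothing about how eXist stores data on disk; the class `Collection` and the collection and file names (other than the root `db`, taken from the REST example) are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Collection:
    name: str
    documents: list = field(default_factory=list)       # XML file names
    subcollections: list = field(default_factory=list)  # nested Collection objects

    def paths(self, prefix=""):
        """Yield the logical path of every document in or below this collection."""
        here = f"{prefix}/{self.name}"
        for doc in self.documents:
            yield f"{here}/{doc}"
        for sub in self.subcollections:
            yield from sub.paths(here)

# Hypothetical content: one root collection with a nested collection.
root = Collection(
    "db",
    documents=["index.xml"],
    subcollections=[Collection("registry", documents=["a.xml", "b.xml"])],
)
all_paths = list(root.paths())
print(all_paths)  # ['/db/index.xml', '/db/registry/a.xml', '/db/registry/b.xml']
```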
Benchmarks
We wanted to test whether an XML database like eXist could process advanced XQuery queries in a reasonable time. We ran the benchmarks under two conditions: with the data of the Carnivore registry, and with the same data gathered into a single XML file and using UCD1+.
With Carnivore data
Conditions
- local eXist server in standalone mode pointing to the Carnivore data
- Java client in command-line mode to run the XQuery queries
Results
Test 1: retrieval of the resource with a particular identifier
//vr:Resource[vr:identifier/text()='ivo://CDS/VizieR/J/A+AS/123/575/levels3']
- duration: 2 s
- memory: 25 MB
- CPU: 100%
Test 2: retrieval of the resources with at least one column having a particular UCD
//vr:Resource[vs:table/vs:column/vs:ucd/text()='PHOT_DIFF_MAG']
- duration: 2 s
- memory: 65 MB
- CPU: 100%
Note: the memory needed increases because the query searches deeper into the XML tree.
Test 3: retrieval of the resources with at least one column having one of two particular UCDs
//vr:Resource[vs:table/vs:column/vs:ucd/text()="PHOT_DIFF_MAG" or vs:table/vs:column/vs:ucd/text()="PHOT_JHN_B-V"]
- duration: 3 s
- memory: 70 MB
- CPU: 100%
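To make the selection logic of Tests 2 and 3 concrete, the sketch below reproduces it in plain Python on a tiny in-memory document. It does not use eXist and says nothing about performance. The sample document, the identifiers `ivo://example/...` and the namespace URIs `urn:example:vr` and `urn:example:vs` are invented placeholders; only the element names and the UCD values come from the queries above.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs; the registry's real URIs are not given in this page.
NS = {"vr": "urn:example:vr", "vs": "urn:example:vs"}

# Invented three-resource sample, for illustration only.
SAMPLE = """
<registry xmlns:vr="urn:example:vr" xmlns:vs="urn:example:vs">
  <vr:Resource>
    <vr:identifier>ivo://example/a</vr:identifier>
    <vs:table><vs:column><vs:ucd>PHOT_DIFF_MAG</vs:ucd></vs:column></vs:table>
  </vr:Resource>
  <vr:Resource>
    <vr:identifier>ivo://example/b</vr:identifier>
    <vs:table>
      <vs:column><vs:ucd>POS_EQ_RA_MAIN</vs:ucd></vs:column>
      <vs:column><vs:ucd>PHOT_JHN_B-V</vs:ucd></vs:column>
    </vs:table>
  </vr:Resource>
  <vr:Resource>
    <vr:identifier>ivo://example/c</vr:identifier>
    <vs:table><vs:column><vs:ucd>POS_EQ_RA_MAIN</vs:ucd></vs:column></vs:table>
  </vr:Resource>
</registry>
"""

def resources_with_any_ucd(root, wanted):
    """Return identifiers of resources having at least one column whose UCD is in `wanted`."""
    wanted = set(wanted)
    matches = []
    for resource in root.iterfind(".//vr:Resource", NS):
        # Collect every UCD found under vs:table/vs:column of this resource.
        ucds = {u.text for u in resource.iterfind("vs:table/vs:column/vs:ucd", NS)}
        if ucds & wanted:  # at least one column matches
            matches.append(resource.findtext("vr:identifier", namespaces=NS))
    return matches

root = ET.fromstring(SAMPLE)
test2 = resources_with_any_ucd(root, ["PHOT_DIFF_MAG"])
test3 = resources_with_any_ucd(root, ["PHOT_DIFF_MAG", "PHOT_JHN_B-V"])
print(test2)  # ['ivo://example/a']
print(test3)  # ['ivo://example/a', 'ivo://example/b']
```

The `or` in Test 3 applies per resource: a resource is selected as soon as any of its columns carries either UCD, so the two columns need not be the same one.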
With UCD1+ data
Conditions
- eXist server in standalone mode pointing to the UCD1+ resource registry, with all the data gathered in one XML file
- Java client in command-line mode
Results
Test 1: retrieval of the resource with a particular identifier
//vr:Resource[vr:identifier/text()='ivo://CDS/VizieR/J/A+AS/123/575/levels3']
- duration: 2 s
- memory: 15 MB
- CPU: 100%
Test 2: retrieval of the resources with at least one column having a particular UCD
//vr:Resource[vs:table/vs:column/vs:ucd/text()='phot.mag;arith.diff']
- duration: 2 s
- memory: 15 MB
- CPU: 100%
Test 3: retrieval of the resources with at least one column having one of two particular UCDs
//vr:Resource[vs:table/vs:column/vs:ucd/text()="phot.mag;arith.diff" or vs:table/vs:column/vs:ucd/text()="phot.color;em.opt.B;em.opt.V"]
- duration: 3 s
- memory: 120 MB
- CPU: 100%
--
BriceGassmann - 18 Aug 2005