Resource Discovery Prototype 1
Introduction
This page details VOTECH deliverable DS5-02, the first release of prototype software developed under Design Study 5 (Intelligent Resource Discovery).
We present four packages, all developed entirely within the VOTECH project:
- Registry Query tool
- Data Extraction tool
- Object Names Recognition tool
- FITS Keyword Mapping
Tool descriptions
Registry Query and Data Extraction tools
Registry Query tool
The Registry Query tool finds resources in a VO registry by querying on UCDs. This makes it possible to find the IVOA identifiers of resources containing tabular data, based on the contents of the tables.
Queries are sent in XQuery using a SOAP interface to the registry.
The list of resources can be saved and exported to the Data Extraction Tool.
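As an illustration of a UCD-based registry query, the following minimal sketch builds an XQuery string that selects the identifiers of resources having a table column with a given UCD. The function name, the namespace URI, and the element paths (`column/ucd`, `identifier`) are assumptions made for this example; they are not taken from the tool itself, and the SOAP call to the registry is deliberately left out.

```python
import re

# Characters allowed in a UCD for this sketch (words separated by '.' and ';').
_UCD_PATTERN = re.compile(r"^[A-Za-z0-9_.;\-]+$")


def build_ucd_xquery(ucd):
    """Return an XQuery string selecting identifiers of resources that
    expose a table column tagged with the given UCD.

    The namespace and element paths are assumptions about the registry
    schema, not the query actually sent by the Registry Query tool.
    """
    if not _UCD_PATTERN.match(ucd):
        # Refuse anything that could break out of the string literal.
        raise ValueError("not a plausible UCD: %r" % ucd)
    return (
        'declare namespace ri = "http://www.ivoa.net/xml/RegistryInterface/v1.0";\n'
        "for $r in //ri:Resource\n"
        'where $r//column/ucd = "%s"\n'
        "return $r/identifier" % ucd
    )


if __name__ == "__main__":
    # Radio flux density, the kind of quantity relevant to radio catalogues.
    print(build_ucd_xquery("phot.flux.density;em.radio"))
```

The resulting string would then be wrapped in the SOAP request expected by the registry in use.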
Data Extraction tool
The Data Extraction tool takes a list of IVOA resources as input and extracts homogeneous data from them, according to a specified output format and a set of transformation rules. Both ASCII and VOTable outputs are supported.
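The idea of transformation rules can be sketched as follows. This is a minimal example under stated assumptions: the rule format, the catalogue labels (`cat_A`, `cat_B`), and the column names are all hypothetical and do not reflect the tool's actual configuration. Each rule maps a source column to a common output column and applies a scale factor for unit conversion; the result is written as tab-separated ASCII.

```python
import csv
import io

# Hypothetical rules: output column -> (source column, scale factor).
RULES = {
    "cat_A": {"ra": ("RAJ2000", 1.0), "dec": ("DEJ2000", 1.0),
              "flux_mjy": ("S1400", 1.0)},        # flux already in mJy
    "cat_B": {"ra": ("RA", 1.0), "dec": ("DEC", 1.0),
              "flux_mjy": ("Flux_Jy", 1000.0)},   # convert Jy to mJy
}
OUTPUT_COLUMNS = ["catalogue", "ra", "dec", "flux_mjy"]


def homogenize(catalogue, rows):
    """Apply the catalogue's rules to each row and return uniform records."""
    rule = RULES[catalogue]
    records = []
    for row in rows:
        record = {"catalogue": catalogue}
        for target, (source, factor) in rule.items():
            record[target] = float(row[source]) * factor
        records.append(record)
    return records


def to_ascii(records):
    """Serialize homogeneous records as tab-separated text with a header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=OUTPUT_COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    rows_b = [{"RA": "10.5", "DEC": "-3.2", "Flux_Jy": "0.25"}]
    print(to_ascii(homogenize("cat_B", rows_b)))
```

Keeping the rules as data rather than code means a new catalogue only needs a new entry in the rule table.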
PLASTIC
Both tools can exchange messages using the PLASTIC protocol (developed during the VOTECH project). These PLASTIC messages contain one or more IVOA identifiers to be exchanged between the applications. This makes it possible to connect the Registry Query and Data Extraction tools to applications such as AstroGrid's VOExplorer and CDS's Aladin.
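The shape of such a message can be sketched as below. This is only an illustration: the two message names are assumed to be the PLASTIC resource-loading messages and should be checked against the PLASTIC message list, the dictionary layout is hypothetical, and no hub connection is made.

```python
# Assumed PLASTIC message names for passing registry resources around;
# verify against the PLASTIC documentation before relying on them.
LOAD_ONE = "ivo://votech.org/voresource/load"
LOAD_LIST = "ivo://votech.org/voresource/loadList"


def build_resource_message(sender_id, identifiers):
    """Build a message description carrying one or more IVOA identifiers.

    The returned dict (sender, message, args) is a hypothetical layout
    that a client would hand to its hub connection.
    """
    identifiers = list(identifiers)
    if not identifiers:
        raise ValueError("at least one IVOA identifier is required")
    for identifier in identifiers:
        if not identifier.startswith("ivo://"):
            raise ValueError("not an IVOA identifier: %r" % identifier)
    if len(identifiers) == 1:
        # A single resource is sent as a plain string argument.
        return {"sender": sender_id, "message": LOAD_ONE,
                "args": [identifiers[0]]}
    # Several resources are sent as one list argument.
    return {"sender": sender_id, "message": LOAD_LIST,
            "args": [identifiers]}


if __name__ == "__main__":
    # The identifiers below are placeholders, not real registry entries.
    print(build_resource_message("registry-query-tool",
                                 ["ivo://example/cat_A", "ivo://example/cat_B"]))
```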
Application
The tools have been used successfully by Bernd Vollmer to extend his work on SPECFIND (Vollmer, 2005, A&A, 431, 1177). SPECFIND is a tool that builds radio spectra from a set of heterogeneous catalogues. The original paper presents results based on 22 radio surveys. With the Registry Query tool, a much larger number of relevant catalogues has been scanned, and the Data Extraction tool has been used to extract data from 100 catalogues to date.
The semi-automated information retrieval capabilities were first presented at the 26th IAU General Assembly (Determination of Radio Spectra from Catalogues and Identification of Gigahertz Peaked Sources Using the Virtual Observatory, Vollmer et al., The Virtual Observatory in Action: New Science, New Technology, and Next Generation Facilities, 26th meeting of the IAU, Special Session 3, 17-18, 21-22 August, 2006 in Prague).
Object Names Recognition
This tool is designed to automatically identify astronomical object names in published papers. A direct application is to help librarians update the SIMBAD database with links between bibliographic references and objects.
The tool uses the formats in the Dictionary of Nomenclature of Celestial Objects and other resources to generate a list of regular expressions, which are matched against each article. The PDF papers are converted to plain text, special characters are translated or optically recognized, and new annotated PDF documents are generated with anchors to all strings identified as potential object names.
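The step from name formats to regular expressions can be sketched as follows. This is a simplified illustration, not the tool's implementation: the (acronym, format) pairs are a small hand-picked sample, and the placeholder conventions (`N` for up to that many digits, `H`, `M`, `D`, `S` for fixed-width digit fields, `+` for a sign) are assumptions made for this example.

```python
import re
from itertools import groupby

# Sample (acronym, format) pairs; the placeholder letters are interpreted
# according to the assumptions stated above.
FORMATS = [
    ("NGC", "NNNN"),
    ("HD", "NNNNNN"),
    ("PSR", "JHHMM+DDMM"),
]


def format_to_regex(acronym, fmt):
    """Translate one (acronym, format) pair into a compiled regex."""
    parts = [re.escape(acronym), r"\s?"]
    for char, run in groupby(fmt):
        count = len(list(run))
        if char == "N":
            parts.append(r"\d{1,%d}" % count)    # running number
        elif char in "HMDS":
            parts.append(r"\d{%d}" % count)      # fixed-width coordinate field
        elif char == "+":
            parts.append("[+-]" * count)         # declination sign
        else:
            parts.append(re.escape(char) * count)  # literal character
    return re.compile(r"\b" + "".join(parts) + r"\b")


def find_object_names(text):
    """Return (start, end, name) for every candidate name, in text order."""
    hits = []
    for acronym, fmt in FORMATS:
        for match in format_to_regex(acronym, fmt).finditer(text):
            hits.append((match.start(), match.end(), match.group(0)))
    return sorted(hits)


if __name__ == "__main__":
    sample = "We observed NGC 224 and HD 209458, then timed PSR J0437-4715."
    for start, end, name in find_object_names(sample):
        print(start, end, name)
```

The character offsets returned with each match are what an annotation step would need in order to place anchors in the converted text.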
The tool can also generate script commands for SIMBAD updates, which saves considerable manual effort when updating objects.
The tool was presented in a poster at ADASS XVII, London, September 2007.
FITS Keyword Mapping
To search for and find data of interest, the data must be described and catalogued in a homogeneous way. The MEx utility supports this task for astronomical data products, such as images and spectra, that are stored in FITS format.
MEx extracts and transforms keywords and thereby removes the instrument and observatory signature. This is achieved by converting values to physical units and mapping them to standard vocabularies (UCD) and concepts (utype).
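The mapping idea can be sketched as below. This is a minimal illustration and not MEx itself: the keyword names, their source units, and the `example:` utype strings are placeholders chosen for this example, and a plain dictionary stands in for a FITS header.

```python
# Illustrative mapping rules only; none of these entries come from MEx.
# "scale" converts the instrument-specific value to the stated physical unit.
RULES = {
    "EXPTIME": {"ucd": "time.duration;obs.exposure",
                "utype": "example:TimeAxis.exposure",         # placeholder utype
                "scale": 1.0, "unit": "s"},
    "WAVELEN": {"ucd": "em.wl",
                "utype": "example:SpectralAxis.wavelength",   # placeholder utype
                "scale": 1e-10, "unit": "m"},                 # Angstrom to metre
}


def map_header(header):
    """Return instrument-independent descriptions for the known keywords.

    Keywords without a rule are skipped rather than guessed.
    """
    mapped = []
    for keyword, value in header.items():
        rule = RULES.get(keyword)
        if rule is None:
            continue
        mapped.append({
            "keyword": keyword,
            "ucd": rule["ucd"],
            "utype": rule["utype"],
            "value": float(value) * rule["scale"],
            "unit": rule["unit"],
        })
    return mapped


if __name__ == "__main__":
    header = {"EXPTIME": 120.0, "WAVELEN": 5000.0, "OBSERVER": "unknown"}
    for entry in map_header(header):
        print(entry)
```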
Documentation and download
-- SebastienDerriere