r6 - 18 May 2008 - 19:38:35 - ChristianBonnin

Resource Discovery Prototype 1

Introduction

This page describes VOTECH deliverable DS5-02, the first release of prototype software developed under Design Study 5 (Intelligent Resource Discovery).

We present four packages, all developed entirely within the VOTECH project:

  • Registry Query tool
  • Data Extraction tool
  • Object Names Recognition tool
  • FITS Keyword Mapping

Tools description

Registry Query and Data Extraction tools

Registry Query tool

The Registry Query tool finds resources in a VO registry by querying on UCDs (Unified Content Descriptors). It returns the IVOA identifiers of resources containing tabular data, selected according to the contents of their tables.

Queries are written in XQuery and sent to the registry through a SOAP interface.

The list of resources can be saved and exported to the Data Extraction tool.
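As an illustration of a UCD-based query, the sketch below assembles an XQuery string that selects the identifiers of resources having at least one table column for each requested UCD. It is not the tool's actual query: the function name build_ucd_xquery and the element names Resource, column, ucd and identifier are assumptions, namespaces are omitted, and a real registry schema may differ. The SOAP transport is left out.

```python
import re

# UCD words use letters, digits, '.', '-' and '_', and are joined by ';'.
_UCD_PATTERN = re.compile(r"[A-Za-z0-9._;-]+")


def build_ucd_xquery(ucds):
    """Return an XQuery string listing identifiers of resources that have
    at least one table column matching each UCD in `ucds`.

    Element names (Resource, column, ucd, identifier) are assumed here
    and are not taken from an actual registry schema.
    """
    if not ucds:
        raise ValueError("at least one UCD is required")
    conditions = []
    for ucd in ucds:
        # Reject anything that could break out of the string literal.
        if not _UCD_PATTERN.fullmatch(ucd):
            raise ValueError(f"not a plausible UCD: {ucd!r}")
        # A non-empty node sequence counts as true in a 'where' clause.
        conditions.append(f'$r//column[contains(ucd, "{ucd}")]')
    return (
        "for $r in //Resource where "
        + " and ".join(conditions)
        + " return $r/identifier"
    )


# Example: resources holding radio flux densities.
print(build_ucd_xquery(["phot.flux.density", "em.radio"]))
```

Validating the UCDs before interpolation keeps the generated query well-formed without needing any string escaping.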

Data Extraction tool

The Data Extraction tool takes a list of IVOA resources as input and extracts homogeneous data from them, according to a user-specified output format and set of transformation rules. Both ASCII and VOTable outputs are supported.
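The idea of transformation rules can be sketched as follows. This is a minimal illustration, not the tool's rule syntax: the catalogue names, column names, scale factors and the functions extract and to_ascii are all hypothetical. Each rule maps a homogeneous output column to a source column and a unit conversion factor, and the result is written as tab-separated ASCII.

```python
import csv
import io

# Hypothetical rules: output column -> (source column, scale factor).
# Catalogue and column names are invented for this example.
RULES = {
    "cat_a": {"ra_deg": ("RAJ2000", 1.0), "flux_mjy": ("S1400", 1.0)},
    "cat_b": {"ra_deg": ("ra", 1.0), "flux_mjy": ("flux_jy", 1000.0)},
}

OUTPUT_COLUMNS = ("ra_deg", "flux_mjy")


def extract(catalogue, rows, columns=OUTPUT_COLUMNS):
    """Apply the rules of `catalogue` to each row (a dict of strings)."""
    rules = RULES[catalogue]
    records = []
    for row in rows:
        record = {}
        for col in columns:
            source, factor = rules[col]
            record[col] = float(row[source]) * factor
        records.append(record)
    return records


def to_ascii(records, columns=OUTPUT_COLUMNS):
    """Serialise records as tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for record in records:
        writer.writerow([record[c] for c in columns])
    return buf.getvalue()


# Two catalogues with different column names and units, one output layout.
merged = extract("cat_a", [{"RAJ2000": "187.7", "S1400": "120"}])
merged += extract("cat_b", [{"ra": "10.5", "flux_jy": "0.25"}])
print(to_ascii(merged))
```

Keeping the rules as data rather than code means a new catalogue only needs a new entry in the table.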

PLASTIC

Both tools can exchange messages using the PLASTIC protocol (developed during the VOTECH project). Each PLASTIC message carries one or more IVOA identifiers to be passed between applications. This makes it possible to connect the Registry Query and Data Extraction tools to applications such as AstroGrid's VOExplorer and CDS's Aladin.
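The payload of such a message can be pictured with the sketch below. It only assembles the message structure and does not talk to a PLASTIC hub. The message identifier is a placeholder invented for this example, as are the function name and the dictionary layout; the actual message identifiers should be taken from the PLASTIC message list.

```python
# Placeholder message identifier, invented for this sketch.
HYPOTHETICAL_MESSAGE_ID = "ivo://example.org/plastic/loadResourceList"


def build_resource_message(identifiers, message_id=HYPOTHETICAL_MESSAGE_ID):
    """Package one or more IVOA identifiers into a message payload.

    Returns a dict with the message identifier and a single argument:
    the list of identifiers, de-duplicated with order preserved.
    """
    cleaned = []
    for ident in identifiers:
        ident = ident.strip()
        if not ident.startswith("ivo://"):
            raise ValueError(f"not an IVOA identifier: {ident!r}")
        if ident not in cleaned:
            cleaned.append(ident)
    if not cleaned:
        raise ValueError("at least one identifier is required")
    return {"message": message_id, "args": [cleaned]}


# Example with made-up identifiers.
print(build_resource_message(["ivo://example.org/cat1", "ivo://example.org/cat2"]))
```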

Application

The tools have been successfully used by Bernd Vollmer to extend his work on SPECFIND (Vollmer, 2005, A&A, 431, 1177). SPECFIND is a tool that builds radio spectra from a set of heterogeneous catalogues. The original paper presents results based on 22 radio surveys. With the Registry Query tool, a much larger number of relevant catalogues has been scanned, and the Data Extraction tool has been used to extract data from 100 catalogues to date.

The semi-automated information retrieval capabilities were first presented at the 26th IAU General Assembly (Vollmer et al., "Determination of Radio Spectra from Catalogues and Identification of Gigahertz Peaked Sources Using the Virtual Observatory", in The Virtual Observatory in Action: New Science, New Technology, and Next Generation Facilities, 26th meeting of the IAU, Special Session 3, 17-18 and 21-22 August 2006, Prague).

Object Names Recognition

This tool is designed to automatically identify astronomical object names in published papers. A direct application is to help librarians update the SIMBAD database with links between bibliographic references and objects.

The tool uses the formats listed in the Dictionary of Nomenclature of Celestial Objects, together with other resources, to generate a list of regular expressions and match them against each article. The PDF papers are converted to plain text, special characters are translated or optically recognized, and new annotated PDF documents are generated with anchors to all strings identified as potential object names.
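The step from name formats to regular expressions can be sketched as follows. This is a simplified illustration, not the tool's implementation: the three formats are a hand-picked sample written as an acronym followed by a run of N placeholders, and the function names are hypothetical. The real dictionary contains many more, and more complex, formats.

```python
import re

# Hand-picked sample formats: acronym -> template of digit placeholders.
# A template of four N's is read here as "an integer of up to 4 digits".
FORMATS = {"NGC": "NNNN", "HD": "NNNNNN", "M": "NNN"}


def format_to_regex(acronym, template):
    """Turn one 'ACRONYM NNNN' style format into a compiled pattern."""
    max_digits = len(template)
    return re.compile(
        rf"\b{re.escape(acronym)}\s?(\d{{1,{max_digits}}})\b"
    )


PATTERNS = [format_to_regex(a, t) for a, t in FORMATS.items()]


def find_object_names(text):
    """Return (start, end, matched_string) tuples sorted by position."""
    hits = []
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), match.group(0)))
    return sorted(hits)


sample = "We observed NGC 4258 and HD 209458 near M 31."
for start, end, name in find_object_names(sample):
    print(start, end, name)
```

The character offsets returned with each match are what an annotation step would need in order to place anchors on the candidate names.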

The tool can also generate script commands for SIMBAD updates, which saves considerable effort when updating objects.

The tool was presented in a poster at ADASS XVII, London, September 2007.

FITS Keyword Mapping

To search for and find data of interest, the data must be described and catalogued in a homogeneous way. The MEx utility supports this task for astronomical data products, such as images and spectra, that are stored in FITS format.

MEx extracts and transforms keywords, thereby removing the instrument and observatory signature. This is achieved by converting values to physical units and mapping them to standard vocabularies (UCD) and concepts (utype).
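The kind of mapping described above can be sketched as follows. This is not MEx's actual rule set or interface: the instrument names, the keyword choices, the conversion factors and the function map_header are assumptions made for the example. Two instruments record the exposure time under different keywords and units, and both end up under the same UCD in seconds.

```python
# Hypothetical per-instrument rules:
#   FITS keyword -> (UCD, factor to the target unit, target unit)
RULES = {
    "INSTR_A": {
        "EXPTIME": ("time.duration;obs.exposure", 1.0, "s"),
        "RA": ("pos.eq.ra", 1.0, "deg"),
    },
    "INSTR_B": {
        # Assumed to store the exposure time in milliseconds.
        "EXPOSURE": ("time.duration;obs.exposure", 1e-3, "s"),
        # Assumed to store right ascension in hours.
        "RA_HRS": ("pos.eq.ra", 15.0, "deg"),
    },
}


def map_header(instrument, header):
    """Map a FITS header (dict of keyword -> value) to UCD-keyed metadata.

    Keywords without a rule are ignored. Returns a dict of
    UCD -> (value in the target unit, unit).
    """
    rules = RULES[instrument]
    mapped = {}
    for keyword, value in header.items():
        rule = rules.get(keyword.upper())
        if rule is None:
            continue
        ucd, factor, unit = rule
        mapped[ucd] = (float(value) * factor, unit)
    return mapped


# Same physical quantity from two differently labelled headers.
print(map_header("INSTR_A", {"EXPTIME": 300, "OBSERVER": "n/a"}))
print(map_header("INSTR_B", {"EXPOSURE": 300000}))
```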

Documentation and download

-- SebastienDerriere

Topic attachments

Attachment           Size       Date                 Who                Comment
ObjectNames.tar.gz   10490.7 K  03 Oct 2007 - 22:51  SebastienDerriere  Object Names recognition package
DJINProcess.jpg      89.3 K     18 May 2008 - 19:28  ChristianBonnin    DJIN general overview
docEn.htm            19.8 K     18 May 2008 - 19:38  ChristianBonnin    Object name recognition documentation