DS6 Stage01 Plan (pre-TAP draft)

Activities

Science requirements and validation

This covers the collection of VO data exploration scenarios from sources such as the AVO Science Reference Mission, the VOTECH Science Team and the EuroVO Science Advisory Group, supplemented by the particular interests of the DS6 team. It also covers the extraction of requirements from these scenarios and the development of a framework for validating software against those requirements. It is intended that one scenario will be adopted as a DS6 demonstrator: it should combine data mining and visualization and should include some iterative/interactive steps.

Assessment of existing third-party tools

This will include identifying which existing third-party tools meet (fully or partially) the scientific requirements of VO data exploration, determining which can be made VO-compliant within reasonable resource limits, and detailing how that should be done. The first step will be to define what VO-compliance means in this context.

Development of existing tools

This primarily refers to VisIVO, Astroneural and Aladin. Work is ongoing to integrate VisIVO and Astroneural, and discussions are under way to plan the integration of VisIVO with Aladin and VizieR. In parallel, VisIVO and Astroneural will be made VO-compliant.

Column-ordered storage for data exploration

This activity is studying the feasibility of using column-ordered storage methods to support VO data exploration, both as a general concept and as a modification of the existing STIL/TOPCAT tool. Initial work will make use of HDF5, an existing data format that can hold tables in column-ordered form.
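
To illustrate why column ordering matters for data exploration, the following is a minimal sketch that contrasts row-ordered and column-ordered layouts of the same small table. It uses only the Python standard library rather than HDF5 itself; the column names (ra, dec, mag), the sample values and all function names are assumptions made for illustration, not part of the DS6 work.

```python
from array import array

# Hypothetical three-column catalogue; names and values are illustrative only.
names = ("ra", "dec", "mag")
rows = [
    (10.1, -5.2, 18.3),
    (10.4, -5.0, 19.1),
    (11.0, -4.8, 17.6),
]


def to_row_ordered(rows):
    """Pack the values of each row contiguously: row 0, then row 1, and so on."""
    flat = array("d")
    for row in rows:
        flat.extend(row)
    return flat


def to_column_ordered(rows, names):
    """Store each column as its own contiguous array of doubles."""
    return {name: array("d", (row[i] for row in rows))
            for i, name in enumerate(names)}


def column_from_rows(flat, n_cols, col):
    """Extract one column from row-ordered data by striding over every row."""
    return flat[col::n_cols]


row_store = to_row_ordered(rows)
col_store = to_column_ordered(rows, names)

# Row-ordered: the scan must step across the whole table to pick out one column.
mags_from_rows = column_from_rows(row_store, len(names), names.index("mag"))
# Column-ordered: the column is already one contiguous block.
mags_from_cols = col_store["mag"]
```

In the column-ordered layout an operation that needs only a few columns of a wide catalogue reads only those columns, which is the property this activity sets out to evaluate.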

Griddification of a KDE algorithm

This prototypes the use of k-d trees for VO data exploration, by implementing a Kernel Density Estimation (KDE) code which makes use of them and can itself be used as the basis for a number of data mining applications. The KDE software is currently available as a command-line C program, which will need wrapping as a CEA application. One of the code's steps is an embarrassingly parallel parameter value selection, and this will be performed through computational Grid jobs executed by a broker which may later be generalised for other CEA apps.
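
The parallel parameter selection step can be sketched as follows. This is not the DS6 C code: it is a naive one-dimensional Gaussian KDE without k-d trees, and it assumes that the parameter being selected is the kernel bandwidth, scored by leave-one-out log-likelihood. The thread pool is only a local stand-in for the Grid broker, and all names and sample values are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def gaussian_kde(x, data, h):
    """Gaussian kernel density estimate at x, using bandwidth h."""
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)


def loo_log_likelihood(data, h):
    """Leave-one-out log-likelihood of the sample for bandwidth h."""
    score = 0.0
    for i, x in enumerate(data):
        rest = data[:i] + data[i + 1:]
        # Floor the density so that an underflow to zero cannot break log().
        score += math.log(max(gaussian_kde(x, rest, h), 1e-300))
    return score


def select_bandwidth(data, candidates, max_workers=4):
    """Score every candidate independently and return the best one.

    No evaluation depends on any other, so each could run as a separate job.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(lambda h: loo_log_likelihood(data, h), candidates))
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores


# Illustrative sample and candidate bandwidths (assumed values).
sample = [0.0, 0.1, 0.2, 0.9, 1.0, 1.1]
candidates = [0.05, 0.2, 0.5, 2.0]
best_h, scores = select_bandwidth(sample, candidates)
```

Because each candidate is scored without reference to the others, the work divides cleanly into one job per parameter value, which is what makes the step suitable for submission through a broker.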

Participation in IVOA DM/DAL standards definition work

To allow more sophisticated access to, and visualisation of, metadata relating to "data sets", the IVOA need to complete the definition of the Characterization data model, and to provide SIA/SSA/DAL extension mechanisms enabling the handling of complex and heterogeneous data. The immediate goal of the proposed Stage01 work is to develop current ideas to the status of a Working Draft or Proposed Recommendation, and to work on example implementations, based on illustrative science cases.

Design new multi-parameter indexes for large catalogues using k-d trees

The goal here is to assess the use of k-d trees and similar technologies for providing efficient indexing of large catalogues, with VizieR to provide the testbed for this work.
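
As a rough illustration of multi-parameter indexing, the sketch below builds a k-d tree over a small catalogue and answers a box query across all parameters at once. The catalogue values, the (ra, dec, mag) column choice and the function names are assumptions for illustration; this is not the VizieR implementation, and a production index would need to handle on-disk storage and much larger volumes.

```python
from collections import namedtuple

Node = namedtuple("Node", "point axis left right")


def build_kdtree(points, depth=0):
    """Recursively split the points at the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build_kdtree(pts[:mid], depth + 1),
                build_kdtree(pts[mid + 1:], depth + 1))


def range_query(node, lo, hi, out=None):
    """Collect every point p with lo[i] <= p[i] <= hi[i] on all axes."""
    if out is None:
        out = []
    if node is None:
        return out
    if all(l <= c <= h for l, c, h in zip(lo, node.point, hi)):
        out.append(node.point)
    a = node.axis
    # Descend only into the subtrees that can overlap the query box.
    if lo[a] <= node.point[a]:
        range_query(node.left, lo, hi, out)
    if node.point[a] <= hi[a]:
        range_query(node.right, lo, hi, out)
    return out


# Hypothetical catalogue rows: (ra, dec, mag).
catalogue = [
    (10.1, -5.2, 18.3),
    (10.4, -5.0, 19.1),
    (11.0, -4.8, 17.6),
    (12.3, -3.9, 18.0),
    (9.7, -5.5, 18.9),
    (10.8, -4.1, 17.2),
]
tree = build_kdtree(catalogue)
box_lo, box_hi = (10.0, -6.0, 17.0), (11.0, -4.0, 19.0)
matches = range_query(tree, box_lo, box_hi)
```

The query prunes whole subtrees that fall outside the box on the splitting axis, so a constraint on several parameters at once does not require a full scan of the catalogue.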

VO infrastructure advice

Some of the above activities will require apprising DS6 team members new to EuroVO of details of the AstroGrid/AVO infrastructure, and advising them on how to integrate their software with it.

Stage01 Report

The final part of the DS6 Stage01 Plan is the preparation of the DS6 Stage01 Report, which will both summarise progress made during Stage01 and outline the general plans that the DS6 team have for their future work.

Deliverables

The deliverables from these activities are tabulated below; due dates will be added later, by agreement with the people involved.

| Description | Person |
| Collection of VO data exploration scenarios | BobMann, GiuseppeLongo, Bob Nichol |
| List of third-party tools to assess | BobMann and DS6 team |
| Note on what VO-compliance entails for third-party data exploration tools | MartinHill and JohnTaylor |
| Report on integration plans for VisIVO and Astroneural | UgoBecciani and GiuseppeLongo |
| Prototype CEA-wrapped broker that will allow the submission and execution of embarrassingly parallel algorithms | GarrySmith |
| Demonstration of the KDE algorithm executed via the broker | GarrySmith and Bob Nichol |
| Sample catalogues in HDF5 format | ClivePage |
| Prototype tool implementing some data manipulation operations on the HDF5 catalogues | ClivePage |
| Note on inclusion of indexing in HDF5-based tool | ClivePage |
| Report on possibilities for column-ordering in STIL/TOPCAT | ClivePage |
| Draft IVOA standards extending SSA/SIA/DAL for heterogeneous data | FrancoisBonnarel |
| Draft of Characterization data model | FrancoisBonnarel |
| Report on use of k-d trees for indexing large catalogues | FrancoisBonnarel |
| DS6 Stage01 Report | BobMann and DS6 team |

-- BobMann - 11 Mar 2005

Topic revision: r7 - 2005-03-11 - 16:22:26 - MarcoLeoni
 