John C Good edited this page Dec 5, 2018 · 7 revisions

NAVO Spatial Indexing Study

We are going to start updating these pages by copying over content, then rationalizing it all. Please bear with us.

This page is meant for internal use by the participants in the NAVO Spatial Indexing study (it won't make much sense to anyone else). At this point in the study we have identified the various ways the astronomical community is using spatial indexing to help retrieve data (mostly from relational databases). We have also developed code and datasets to help us perform timing trade-off studies on parameters such as the type of indexing (HTM vs. HEALPix vs. Q3C, etc.), the optimum index bin size (arcsecond, arcminute, etc.), the effect of search region size on speed, and the relative times associated with database querying and data I/O.

This set of pages/documents captures the current state of the study:

  • report.02.2016.txt   The report we submitted in February on the status of the project.
  • studyOutline.txt   A write-up from 28 July 2016 meant to be a strawman outline for the final study.
  • navo.tar.gz   All the code in one place: HTM and HPX libraries; utility for adding HTM/HPX indices to a table of locations; utility for turning a region specification into SQL fragments for inclusion in a query; and utilities for running a single cone search and large-scale timing tests against PostgreSQL. C code, with Makefile and README. Mostly self-contained, though the PostgreSQL search utilities require the PostgreSQL libraries.
  • tmass.csv   All 470 million 2MASS sources, but just 2MASS name, RA, and Dec.
  • spatialIndexing.tar.gz   [OLD] Code for adding spatial index values (HTM and HEALPix based) to a table of locations. C code, self-contained, with Makefile.
  • results.html   A worked-through example of adding index values, loading into PostgreSQL, indexing using the DBMS B-Tree, and running a spatial query.
  • timingTests.html   Initial results of running a couple of million queries against the 470-million-source 2MASS table in PostgreSQL with two different scale indices (both HTM and HEALPix).
  • searchDB.tar.gz   [OLD] Code for running a simple spatial search against a database table (2MASS).
  • randLoc.c   A code fragment for generating uniformly distributed random locations on the sky, to be included in a test program.

The next step will be to fold the spatialIndexing code into the search program and wrap it in a loop over coordinates generated by the random location algorithm. The resultant timing and source count statistics will give us pretty much all we need for this part of the study.
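
The loop described above might look roughly like the following sketch. Everything here is an assumption made for illustration: the `ConeSearchFn` callback type, the `TimingStats` structure, and `stub_search` are hypothetical names, and the stub merely stands in for the real PostgreSQL cone search so that the loop can be exercised without a database. Wall-clock time is read with the C11 `timespec_get` call.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical search callback: returns the number of sources found
 * within radius_deg of (ra_deg, dec_deg). ctx carries caller state,
 * e.g. a database connection in a real run. */
typedef long (*ConeSearchFn)(double ra_deg, double dec_deg,
                             double radius_deg, void *ctx);

/* Accumulated timing and source count statistics for one run. */
typedef struct {
    long   n_queries;
    long   total_sources;
    double total_sec;
    double min_sec;
    double max_sec;
} TimingStats;

/* Wall-clock time in seconds. TIME_UTC is not guaranteed monotonic,
 * so a clock adjustment during a run would distort the numbers. */
static double wall_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + 1.0e-9 * (double)ts.tv_nsec;
}

/* Run one search per coordinate pair and accumulate statistics. */
TimingStats run_timing_loop(ConeSearchFn search, void *ctx,
                            const double *ra_deg, const double *dec_deg,
                            size_t n, double radius_deg)
{
    TimingStats s = {0, 0, 0.0, 0.0, 0.0};
    assert(search != NULL);
    for (size_t i = 0; i < n; i++) {
        double t0 = wall_seconds();
        long count = search(ra_deg[i], dec_deg[i], radius_deg, ctx);
        double dt = wall_seconds() - t0;
        if (s.n_queries == 0 || dt < s.min_sec) s.min_sec = dt;
        if (s.n_queries == 0 || dt > s.max_sec) s.max_sec = dt;
        s.total_sec += dt;
        s.total_sources += count;
        s.n_queries++;
    }
    return s;
}

/* Stand-in for the real search: reports a fixed source count taken
 * from ctx (a pointer to long), or 0 if ctx is NULL. */
long stub_search(double ra_deg, double dec_deg, double radius_deg, void *ctx)
{
    (void)ra_deg; (void)dec_deg; (void)radius_deg;
    return ctx ? *(const long *)ctx : 0;
}
```

Keeping the search behind a callback means the same loop can time HTM-indexed and HEALPix-indexed queries by swapping the function, with the mean time per query available as `total_sec / n_queries`.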