9.4 : Specifications/Requirements

In the Overview (9.1) we described three types of DEIMOS information to be managed, and several applications for that information. This section lists specifications for a software and hardware information management system which would meet the goals described in 9.1.


There is considerable overlap among the requirements outlined in 9.1. The DEIMOS software should include a single, central, flexible, shareable, machine-readable, network-accessible facility for collecting, storing, and exporting information, capable of meeting all of the types of information demand described there.

To meet the above criteria, and to manage large sets of data when rapid access and complex ad hoc analytical queries are required, a relational database management system is the best choice of software tool. The DEIMOS software suite should include a standard RDBMS running on a standard Unix platform. (See Existing Resources (9.6) and Additional Resources (9.7) for more detail.)
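
For illustration only, a typical ad hoc analytical query against the meta-data might look like the sketch below. The table and column names (exposure, slitmask, etc.) are hypothetical, and run_sql stands in for whichever RDBMS interface (for example a Tcl extension supplied with the chosen product) is finally adopted.

    # Illustrative sketch; run_sql and the exposure/slitmask tables are
    # placeholders, not part of any adopted schema.
    set query {
        SELECT e.exposure_id, e.ut_start, e.exptime, m.mask_name
        FROM   exposure e, slitmask m
        WHERE  e.mask_id  = m.mask_id
          AND  e.grating  = '1200G'
          AND  e.ut_start BETWEEN '01-Mar-1996' AND '02-Mar-1996'
        ORDER BY e.ut_start
    }
    set rows [run_sql $query]
    foreach row $rows { puts $row }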

As little proprietary and commercial software as possible should be used in the information management component of the DEIMOS software suite. There should be minimal dependence on vendor response, licensing mechanisms, etc.; as much source code as possible should be visible to users and maintainers of the DEIMOS instrument. A standard set of non-commercial languages and tools (such as gcc, g++, and Tcl) should be used to develop all modules of the information management component.

Because no reliable and robust free RDBMS is available, the RDBMS is the only software element in the information management component which should be acquired from commercial sources. Either Sybase or Oracle would be adequate, and there are some reasons for preferring Sybase. See Section 9.7 for additional detail.

As little data as possible should be acquired by manual key entry on the part of observers or observatory staff. As much data as possible should therefore be acquired automatically by interprocess communications, telemetry, etc. (See Appendix A (9.10.A) for a developed, but non-exhaustive, list of useful information and likely sources.)
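
One possible shape for the automatic capture path is sketched below, assuming keyword=value telemetry delivered over a TCP socket; the port number, the record format, and the run_sql placeholder are assumptions rather than design decisions.

    #!/usr/bin/env tclsh
    # Sketch only: assumes instrument/telescope software can deliver
    # keyword=value records over a TCP connection.  Port, record format,
    # and run_sql are placeholders.
    proc accept {chan addr port} {
        fconfigure $chan -blocking 0 -buffering line
        fileevent $chan readable [list capture $chan]
    }
    proc capture {chan} {
        if {[gets $chan line] < 0} {
            if {[eof $chan]} { close $chan }
            return
        }
        if {[regexp {^(\w+)=(.*)$} $line -> keyword value]} {
            run_sql "INSERT INTO telemetry (keyword, value)
                     VALUES ('$keyword', '$value')"
        }
    }
    socket -server accept 5100
    vwait forever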

However, the observer may wish to annotate data or maintain an observing log during the night, so manual annotation and some manual data entry should be permitted, and friendly tools should be provided for these functions.
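
A minimal sketch of such a tool is given below, assuming Tcl/Tk (consistent with the language set above); in the real system log_comment would write the annotation into the night's database record rather than to a flat file.

    #!/usr/bin/env wish
    # Minimal annotation sketch; log_comment, the file name, and the
    # widget layout are illustrative only.
    proc log_comment {text} {
        set stamp [clock format [clock seconds] \
                       -format "%Y-%m-%d %H:%M:%S" -gmt 1]
        set f [open observing.log a]
        puts $f "$stamp UT  $text"
        close $f
    }
    label  .l -text "Observer comment:"
    entry  .e -width 60
    button .b -text "Log it" -command {
        log_comment [.e get]
        .e delete 0 end
    }
    pack .l .e .b -side left -padx 4 -pady 4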

The hardware, optical, and software engineers should have their own online logbook, whose interface is integrated with the suite of instrument control and test software which they will use. This logbook should become part of the permanent record and must be available online for intelligent searching and cross reference. This facility should be online at the time of lab testing.

Whether during collection or retrieval, the database facility should be integrated seamlessly with the data-taking and quick-look software, rather than appearing as a separate user environment or application.

Some types of data define and document individual mechanical, optical, and electronic components of the instrument, such as mirrors, filters, lamps, gratings, slit masks and CCD detectors. Physical instrument components should be fully documented by key entry at their point of manufacture or verification, and this information should be copied to the archive on the Mountain for ready reference during any observing run; maintenance and adjustment of optical/mechanical components should be logged to the database by key entry.
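
Purely as an illustration of the kind of record intended, a component table might resemble the sketch below; the column names and types are assumptions, not a proposed schema.

    # Illustrative DDL only; run_sql is the same placeholder RDBMS
    # interface used in the earlier sketches.
    set ddl {
        CREATE TABLE component (
            component_id   INT          NOT NULL,
            component_type VARCHAR(16)  NOT NULL,  -- mirror, filter, grating, ...
            serial_no      VARCHAR(32)  NOT NULL,
            manufacturer   VARCHAR(64)  NULL,
            date_verified  DATETIME     NULL,
            notes          VARCHAR(255) NULL
        )
    }
    run_sql $ddl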

As much data as observers find useful should be exportable quickly and easily to the standard FITS-table format, so that the observer can take these tables home with the acquired data.

The observer should also be able to extract and save the results of any arbitrary query against the public portion of the database and his/her own data. Certain standard queries should be offered in the form of plots and charts. A friendly tool for making ad hoc analytical forays into the data during the observing run should be provided.

All non-confidential archived data should be made publicly accessible via a simple WWW interface, using a backend database server distinct from the one used on the Mountain to collect and initially store the meta-data. The WWW interface should offer basic analytical query support, including field selection, record selection by the standard set of Boolean operators, and statistical functions. These non-confidential data would include libraries of standard instrument calibration data.
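
A CGI handler on the public server might take roughly the following form; QUERY_STRING parsing is abbreviated, the exposure table is the same hypothetical one used above, and a production version would validate every form value before building SQL from it.

    #!/usr/bin/env tclsh
    # Sketch of a WWW query handler (CGI).  No URL-decoding or input
    # validation is shown; a real handler must check form values before
    # using them in SQL.
    puts "Content-type: text/html\n"
    foreach pair [split $env(QUERY_STRING) &] {
        foreach {name value} [split $pair =] break
        set form($name) $value
    }
    set query "SELECT exposure_id, ut_start, exptime
               FROM exposure WHERE grating = '$form(grating)'"
    puts "<html><body><pre>"
    foreach row [run_sql $query] { puts $row }
    puts "</pre></body></html>"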

An alternate query interface using a lowest-common-denominator (SMTP) e-mail protocol should be supported as a fallback and as an access path for those without WWW clients. A query mailed to a "mail robot" would be answered by an e-mail reply containing the requested data.
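
The mail robot could be as simple as the sketch below, assuming incoming messages are piped to it (for example through an aliases entry) and that the body contains a single line of the form QUERY: followed by the selection; a real robot would restrict mailed requests to a safe, predefined query vocabulary.

    #!/usr/bin/env tclsh
    # Mail-robot sketch.  The QUERY: convention, the sendmail path, and
    # run_sql are assumptions; in practice mailed requests would be
    # limited to a predefined set of selections.
    set message [read stdin]
    regexp -line {^From: (.*)$}  $message -> sender
    regexp -line {^QUERY: (.*)$} $message -> request
    set rows [run_sql $request]
    set mail [open "|/usr/lib/sendmail -t" w]
    puts $mail "To: $sender"
    puts $mail "Subject: DEIMOS archive query results"
    puts $mail ""
    foreach row $rows { puts $mail $row }
    close $mail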

The private and public servers should both offer a data dictionary defining every FITS keyword used in tables produced by DEIMOS software.
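
A data-dictionary entry need contain no more than the keyword, its unit, and a one-line description; a few illustrative (not authoritative) entries are shown below.

    # Illustrative data-dictionary entries only; the real keyword list
    # would be generated from the DEIMOS software itself.
    set data_dictionary {
        {EXPTIME   seconds  "Requested exposure time"}
        {AIRMASS   none     "Airmass at start of exposure"}
        {GRATENAM  none     "Name of grating in use"}
    }
    foreach entry $data_dictionary {
        foreach {kw unit desc} $entry break
        puts [format "%-9s %-8s %s" $kw $unit $desc]
    }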

The RDBMS approach (see above), in combination with large-capacity jukebox technology, can work well for offering an archive of acquired data. The DEIMOS software should provide the option of permanently archiving acquired data, or selected images from the night, to a public or semi-public library. The observer should have control over which images are offered publicly, and on what date they become public. It may be easiest to accomplish this if the standard medium for taking data away from the Mountain is a high-density CDROM (in this case the archive version is simply another CDROM of the same format). The archiving process should be integrated with the routine Observatory backup strategy for acquired data, if possible.

The database information should be usable by an automated data reduction pipeline to produce 'first-cut reduced' images to be archived along with the raw data. First-cut extracted spectra should also be archived with the raw data.

The database server used to capture and offer meta-data during the night should reside on the Mountain, on the same local network with the computers used to control the telescope and instrument. Failure of the network connection to this server could result in loss of historical record and could also have negative impacts on the observing process, making it less automated and therefore slower.

The secondary server used to offer public copies of data from the Mountain should be located at a remote site and backed up frequently to stable media.

Data transfer between the two servers should be fully automated, with notification to responsible parties whenever transfer fails.
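
The transfer job might be a nightly cron entry along the lines of the sketch below; the export command, host names, paths, and the notification address are all placeholders.

    #!/usr/bin/env tclsh
    # Nightly transfer sketch (run from cron).  dump_metadata, the remote
    # destination, and the notify address are placeholders, not adopted
    # choices.
    set dumpfile /data/export/deimos_meta.dump
    set remote   "archive-host:/import/deimos/"
    set notify   "deimos-db-admin"
    if {[catch {
        exec dump_metadata $dumpfile      ;# export step (placeholder)
        exec rcp $dumpfile $remote        ;# or ftp, as adopted
    } err]} {
        # Transfer failed: notify the responsible parties by mail.
        set mail [open "|/usr/lib/sendmail $notify" w]
        puts $mail "Subject: DEIMOS meta-data transfer FAILED"
        puts $mail ""
        puts $mail $err
        close $mail
    }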

No random outside connections should be permitted to the primary database server. The computer room should be firewalled to protect the integrity of the telescope and instrument control systems, as well as the archive of operational and acquired data. With proper firewall configuration and management, remote observing can and should be supported without weakening network security.

