11: Pipeline Data Reduction

11.1: Introduction

The result of a reduction pipeline is a data product suitable both for immediate scientific use and for archiving. This product is to be produced automatically, without human interaction. The requirements for the pipeline drive many of the preceding aspects of the design of DEIMOS and its information systems.

11.2: Figures

11.3: Glossary

LRIS: Low Resolution Imaging Spectrometer. Caltech's imaging, multislit spectrograph at the Cassegrain focus of Keck I.
FC: The Flexure Compensation system planned for DEIMOS.

11.4: Functional Requirements

Chapter 7 outlines the nature of the data products for each CCD readout. The pipeline will require additional information about the available ensemble of science and calibration frames and their relationships. The database described in Chapter 9 should contain all the information needed to describe these relationships. This set of relationships should be the electronic equivalent of an indexed observing logbook. At the end of an observing run the relationships between the images should be placed into a FITS table and transmitted along with the rest of the image data.
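The idea of an indexed electronic logbook can be sketched as a small relational table. This is only an illustration: the source does not specify the database engine or schema, so SQLite, the table name `relation`, its columns, and the frame names are all assumptions made for the example.

```python
import sqlite3

# In-memory database standing in for the Chapter 9 database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE relation (frame TEXT, role TEXT, related TEXT)")

# Each row says: <frame> uses <related> in the given calibration role.
# Frame names and roles are hypothetical.
conn.executemany(
    "INSERT INTO relation VALUES (?, ?, ?)",
    [
        ("d0010", "bias", "d0001"),
        ("d0010", "flat", "d0011"),
        ("d0010", "arc", "d0003"),
        ("d0020", "bias", "d0001"),
    ],
)


def related_frames(connection, frame):
    """Return (role, related frame) pairs for one CCD readout, sorted by role."""
    rows = connection.execute(
        "SELECT role, related FROM relation WHERE frame = ? ORDER BY role, related",
        (frame,),
    )
    return rows.fetchall()


print(related_frames(conn, "d0010"))
```

A table of this shape maps naturally onto a FITS table, with one row per relationship, which is the form in which the logbook would travel with the image data.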

During and after the observing run this logbook should be used by the pipeline reduction process. It can identify the best currently available calibration frames of each type, and use that information to control the pipeline.
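A minimal sketch of how the logbook might be used to pick the best currently available calibration frame of each type. The selection rule here (same configuration, nearest in time) is an assumption chosen for illustration; the source does not define what "best" means. The `Frame` fields and all frame names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One logbook entry (hypothetical fields)."""
    name: str
    kind: str      # e.g. "bias", "flat", "arc", "science"
    config: str    # label for the instrument configuration
    time: float    # time of readout, in hours since the start of the run


def best_calibrations(science, logbook):
    """For each calibration type, pick the frame taken in the same
    configuration as the science frame and closest to it in time."""
    best = {}
    for frame in logbook:
        if frame.kind == "science" or frame.config != science.config:
            continue
        gap = abs(frame.time - science.time)
        current = best.get(frame.kind)
        if current is None or gap < abs(current.time - science.time):
            best[frame.kind] = frame
    return best


logbook = [
    Frame("d0001", "bias", "maskA", 0.5),
    Frame("d0002", "flat", "maskA", 1.0),
    Frame("d0003", "arc", "maskA", 1.2),
    Frame("d0010", "science", "maskA", 4.0),
    Frame("d0011", "flat", "maskA", 4.5),
    Frame("d0012", "flat", "maskB", 4.1),
]
chosen = best_calibrations(logbook[3], logbook)
for kind, frame in sorted(chosen.items()):
    print(kind, frame.name)
```

Because the function only sees the frames currently in the logbook, the same call gives a provisional answer during the night and a final answer once all calibrations have been taken.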

11.4.1: Data Products

For direct image frames the pipeline should produce an image free of pixel sensitivity variations, other CCD artifacts, electronic artifacts, and vignetting. The list of operations includes: In short, the frames should be processed by all of the relevant operations which are typically done by the IRAF noao.imred.ccdred ccdproc task.
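As an illustration of the kind of processing meant here, the sketch below applies two operations that are typical of basic CCD reduction: bias subtraction and division by a normalized flat field, which removes pixel sensitivity variations. The choice of these two steps is an assumption, since the source does not enumerate the operations, and the sketch is not the ccdproc implementation. Images are plain nested lists to keep the example self-contained.

```python
def calibrate(raw, bias, flat):
    """Subtract the bias, then divide by the bias-subtracted flat field
    normalized to unit mean. All inputs are 2-D lists of equal shape."""
    flat_signal = [[f - b for f, b in zip(frow, brow)]
                   for frow, brow in zip(flat, bias)]
    n_pixels = sum(len(row) for row in flat_signal)
    mean_flat = sum(sum(row) for row in flat_signal) / n_pixels
    return [[(r - b) / (f / mean_flat) for r, b, f in zip(rrow, brow, frow)]
            for rrow, brow, frow in zip(raw, bias, flat_signal)]


# Hypothetical 2x2 frames: a uniform sky seen through unequal pixel sensitivities.
bias = [[100.0, 100.0], [100.0, 100.0]]
flat = [[1300.0, 900.0], [1100.0, 1100.0]]
raw = [[340.0, 260.0], [300.0, 300.0]]

result = calibrate(raw, bias, flat)
print([[round(value, 6) for value in row] for row in result])
```

In this toy case the calibrated pixels all come out equal to within rounding, because the only structure in the raw frame was the sensitivity pattern recorded by the flat.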

The DEIMOS slitmask database, catalog, and the associated WCS should permit complete automation of the spectral identification and extraction. Reasonably accurate positions of the individual spectra will be known in advance, and they can be converted into the database format used within IRAF multispectral extraction. Reasonably accurate wavelength and dispersion calibrations will also be known in advance, and these can be used for automatic identification of calibration lamp or night sky lines.
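The automatic line identification described above can be sketched as a nearest-match search around an a priori wavelength solution. The linear form `wavelength = w0 + dispersion * pixel`, the tolerance, and the numbers in the example are assumptions for illustration; a real solution would likely need higher-order terms.

```python
def identify_lines(pixels, line_list, w0, dispersion, tolerance):
    """Match measured line centers (in pixels) to catalog wavelengths.

    The a priori solution is assumed linear: wavelength = w0 + dispersion * pixel.
    A line is accepted only if the nearest catalog wavelength lies within
    `tolerance` of the predicted wavelength.
    """
    matches = []
    for pixel in pixels:
        predicted = w0 + dispersion * pixel
        nearest = min(line_list, key=lambda w: abs(w - predicted))
        if abs(nearest - predicted) <= tolerance:
            matches.append((pixel, nearest))
    return matches


# Illustrative line list (Angstroms) and measured line centers (pixels).
line_list = [5852.49, 6402.25, 7032.41]
measured = [1312.0, 2158.0, 2500.0]

matches = identify_lines(measured, line_list, w0=5000.0, dispersion=0.65, tolerance=2.0)
print(matches)
```

The third measured feature has no catalog line within the tolerance and is dropped. The accepted pairs could then be used to refine the wavelength solution without interactive identification.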

For spectral frames the pipeline should first produce a calibrated image with the characteristics described above. It should then make use of the slitmask and grating information to predict the location of each slitlet spectrum. With DEIMOS spectra it should not be necessary to run IRAF tasks from the twodspec.apextract package to find and trace spectra. Similarly, it should not be necessary to run the noao.onedspec identify task to establish the wavelength scale.

The pipeline should produce extracted spectra for each slitlet. If the object catalog contains sufficient information to determine the extent of the object, the pipeline may use the cataloged position and extent to produce sky-subtracted spectra.
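A minimal sketch of a sky-subtracted extraction driven by catalog information. It assumes the slitlet has already been rectified so that rows run along the slit and columns along the dispersion direction, and it estimates the sky in each column as the median of the rows outside the object. The function name, argument names, and the median estimator are assumptions, not a specified algorithm.

```python
from statistics import median


def extract_sky_subtracted(slitlet, center, half_width):
    """Return a 1-D spectrum from a rectified slitlet image.

    slitlet[row][col]: rows along the slit, columns along dispersion.
    Rows within `half_width` of `center` are treated as object; all
    other rows are treated as sky.
    """
    n_rows = len(slitlet)
    object_rows = [r for r in range(n_rows) if abs(r - center) <= half_width]
    sky_rows = [r for r in range(n_rows) if abs(r - center) > half_width]
    if not object_rows or not sky_rows:
        raise ValueError("need at least one object row and one sky row")
    spectrum = []
    for col in range(len(slitlet[0])):
        sky = median(slitlet[r][col] for r in sky_rows)
        spectrum.append(sum(slitlet[r][col] - sky for r in object_rows))
    return spectrum


# Hypothetical 5-row slitlet: object centered on row 2, extending one row either side.
slitlet = [
    [10, 12, 11],
    [15, 17, 16],
    [30, 32, 31],
    [15, 17, 16],
    [10, 12, 11],
]
spectrum = extract_sky_subtracted(slitlet, center=2, half_width=1)
print(spectrum)
```

When the catalog does not give a reliable extent, no sky rows can be identified safely, which is why the text makes sky subtraction conditional on the catalog information.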

11.4.2: Night Mode

If the FC system works as expected, then many, if not all, of the calibration data will be available prior to the acquisition of science data. Using these a priori calibrations, the pipeline should operate in parallel with the quick look tools. It should produce a preliminary result within a few minutes after readout.

The observer should be permitted to modify or replace the default operation of the Night mode to satisfy personal tastes. In addition to the calibration operations listed above, observers may often wish to apply a sky subtraction algorithm.

11.4.3: Archive Mode

At the end of each observing night, when all relevant calibration frames have been acquired, the pipeline should run to produce data for the archive.

Consistency of the archival data set is important. Re-reduction of older data to a new archival standard may not be feasible. To ensure consistency, the procedures in the archival pipeline should be defined as well as possible before the first scientific data are acquired. Changes to the archival pipeline should be infrequent.

11.4.4: Portability

The pipeline should be operable at an observer's home institution.

11.5: Preliminary Design

The design of the data reduction pipeline depends strongly on the results of the FC system and the feasibility of constructing a library of long-term calibration frames. Until the characteristics of the instrument are known, it is difficult to specify the exact nature of the pipeline.

Whether or not DEIMOS has long-term stability of its calibrations, the Information Management system described in Chapter 9 is an essential component of the pipeline. The classification and selection of calibration frames should be done at the time of observation by DEIMOS rather than at the end of the observing run by ccdproc. Similarly, the enumeration and detection of the multislit spectra will be done at the time of observation rather than by apextract.

The logbook tables from DEIMOS should be designed with the expectation that they can be converted into IRAF parameter files and IRAF twodspec databases. It should be possible to construct IRAF parameter files for each of the tasks within ccdproc and the spectral reduction packages. This should drive the pipeline through the basic CCD reduction operations.
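One way to picture this conversion is a function that turns the calibration frames selected from the logbook into a ccdproc invocation. The sketch below emits a command line rather than a parameter file. The parameter names `zerocor`, `zero`, `darkcor`, `dark`, `flatcor`, and `flat` are taken to be ccdproc parameters, but the generated string has not been checked against a running IRAF and should be treated as an assumption, as should the frame names.

```python
def ccdproc_command(science_frames, calibrations):
    """Build a ccdproc command line from logbook selections.

    calibrations maps a calibration type ("zero", "dark", "flat") to the
    name of the chosen calibration image; missing types are switched off.
    """
    settings = []
    for cal_type, switch in (("zero", "zerocor"), ("dark", "darkcor"), ("flat", "flatcor")):
        image = calibrations.get(cal_type)
        if image is None:
            settings.append(f"{switch}=no")
        else:
            settings.append(f"{switch}=yes {cal_type}={image}")
    return "ccdproc " + ",".join(science_frames) + " " + " ".join(settings)


# Hypothetical selections taken from the logbook.
command = ccdproc_command(["d0010", "d0020"], {"zero": "Zero", "flat": "FlatA"})
print(command)
```

Keeping the translation in one small function means that the logbook remains the single source of truth, and the IRAF-specific details stay isolated from the rest of the pipeline.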

11.6: Inheritance

The data reduction pipeline will be constructed using the tools and environment of IRAF. The best local experience with multislit spectra is derived from use of the Keck LRIS. The data from DEIMOS will share many characteristics with LRIS data. The FITS headers for LRIS data provide relatively little documentary information; thus the current reduction schemes require significant amounts of manual intervention. The DEIMOS procedures will closely parallel those for LRIS, but they will require far less manual interaction.

11.7: Additional Resources Needed

These are difficult to predict in advance of data from the instrument.

11.8: Interdependencies

The data storage format (Chapter 7) must provide all information relevant to any one CCD readout within a single FITS file (or group of FITS files).

The Information Management system (Chapter 9) is absolutely essential to the success of the pipeline. Its database must be able to construct tables that describe the relationships between a particular CCD readout and other CCD readouts.

The results of the pipeline will be saved using the archiving system (Chapter 9).

11.9: Outstanding Issues and Concerns

The pipeline data reduction scheme may be deemed of less urgency than other operational systems. This could result in a pipeline that does not perform optimally.

The characterization of the instrumental stability will require significant effort. Thousands of images taken repeatedly under conditions that span the range of possible configurations will be needed. In order to keep the labor requirements manageable, this sequence of calibrations should be obtained by an automated procedure.

The Information Management system will be indispensable. It will be needed for answering questions about the differences between calibration frames that should be similar, and the similarities of calibration frames that should differ. It will also be needed for keeping track of the configurations in which calibration frames have been obtained and for determining if there are any configurations for which there are no calibration frames.
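The last of these questions, finding configurations with no calibration frames, amounts to a set difference between the configurations that are possible and those recorded in the logbook. A minimal sketch follows, in which a configuration is described only by grating and filter. Those two keys, the calibration types, and all names are assumptions for illustration; the real configuration space would have more dimensions.

```python
from itertools import product


def missing_calibrations(gratings, filters, cal_types, frames):
    """Return every (grating, filter, type) combination that has no
    calibration frame recorded in `frames`."""
    covered = {(f["grating"], f["filter"], f["type"]) for f in frames}
    return [combo for combo in product(gratings, filters, cal_types)
            if combo not in covered]


# Hypothetical logbook entries.
frames = [
    {"grating": "g600", "filter": "A", "type": "flat"},
    {"grating": "g600", "filter": "A", "type": "arc"},
    {"grating": "g1200", "filter": "A", "type": "flat"},
]
gaps = missing_calibrations(["g600", "g1200"], ["A"], ["flat", "arc"], frames)
print(gaps)
```

Run at the end of a night, a report of this kind would tell the observer which calibrations are still needed before the archive pipeline can run.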

In the absence of a calibration library that is stable over the long term, it will be necessary for observers to acquire their own calibration frames. Operation of a calibration pipeline will be difficult if the expected inputs are not obtained. This will require cooperation from the observers and the observing assistants, and support from CARA.

Steve Allen <sla@ucolick.org>
$Date: 1996/03/20 08:36:02 $