Mask Making for Keck Spectrographs: Master Database Update


If masks are milled and scanned at Keck, how does the master database in Santa Cruz get updated? Normally the Mill PC (slitmaskpc) would update tables when a mask is scanned. But the Keck Mill PC can only update its local database (KSUMMIT) on waiaha, which is not the master. When the daily database "Stomp" takes place, it overwrites the mask-related tables on waiaha with data from oyabun (YAKUZA).

It's therefore necessary for the Keck Mill PC to get the news back to Santa Cruz somehow. Originally we thought of using SMTP for this, but the asynchronous nature of mail delivery and receipt means that it's hard to control the timing of such updates. Backoff times after delivery failures or network outages can be unpredictable. We also tried a rather complicated method based on exchanging lots of little data files via scp; but synchronisation was a nuisance, partly due to the shifting time differential between the two sites.

The final, improved version of the update mechanism relies on CVS. It is not quite "simple", but it is less Byzantine than the scp version and less demanding of bandwidth.

First, the master database must keep a cvs copy (flat files extracted via sybdump) of the metabase tables. This is something we have been doing for years -- a nightly cvs backup of the database's volatile/valuable content. We switched cvsroots so as to restrict access to the data and associated scripts: there is now a special cvsroot just for database flatfile checkin.

There are two tables in particular which require special attention: either server should know ASAP if the other server has altered them. One is MaskBlu, where the 'status' field tells us whether a mask is new, written to floppy, successfully milled, or forgotten. Another is Mask, where a record is stored for each manufactured and QA-checked mask when its barcode is successfully scanned. The floppy writing, milling, and scanning could take place at either mill, and for obvious reasons it would be good for the mill operator to know whether Mask M has already been made and scanned at the other site.

The Web pages look only at the authoritative database in California, but the mill operator GUI looks at its local database. So it's important

  1. to update the Keck database (waiaha/KSUMMIT) when new masks are submitted via the web server, or when masks are made at Lick
  2. to update the Lick database (oyabun/YAKUZA) when masks are milled/scanned at Keck.

The only data changes that can happen at Keck are a change to the 'status' field in MaskBlu or a new record in Mask. Lick, however, can change any aspect of any table. The two are not "peers", yet the Keck end does need to send 'news' to Lick about its local activities.

If waiaha is allowed to do cvs updates on the flat files, we end up with a distressing ping-pong effect: it extracts its tables to local files which have a later date than the last cvs commit from oyabun. It attempts a commit, which rolls back the changes oyabun committed. If oyabun then extracts and commits later, the changes are reinstated, only to be rolled back again by waiaha on the next cycle. They can't both own the same files.
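The ping-pong effect can be illustrated with a toy model. This is only a sketch: it assumes each site simply commits its whole local extract over the repository head, and the mask name and status values are made up for illustration.

```python
# Toy model of two servers that both commit whole-file extracts of the
# same table. The mask name and status values are hypothetical.

def commit(repo, local_extract):
    """Whole-file commit: the committer's extract replaces the head revision."""
    repo.clear()
    repo.update(local_extract)

def simulate_ping_pong(cycles=3):
    repo = {}                        # head revision: mask name -> status
    oyabun = {"mask42": "milled"}    # master knows the newer status
    waiaha = {"mask42": "new"}       # local copy is stale
    history = []
    for _ in range(cycles):
        commit(repo, oyabun)         # oyabun extracts and commits
        history.append(repo["mask42"])
        commit(repo, waiaha)         # waiaha's commit rolls the change back
        history.append(repo["mask42"])
    return history

print(simulate_ping_pong())
```

The head revision alternates between the two values on every cycle and never settles, which is why only one server can own the data files.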

The solution was to give waiaha ownership not of the flat files but of a file that documents the diffs between its local data and the current cvs version.

Code on waiaha is cronned to wake up about once/hour during the working day. It extracts the two critical tables to flat files and uses the cvs diff -r HEAD command to diff them against the current cvs version (checked in by oyabun). It parses the diff output, looking only for altered or new lines found in its own copy, not those which originate from oyabun. It converts these diffs into SQL and embeds the SQL in a thin wrapper of Tcl, saving it in the files Mask.diff.tcl and MaskBlu.diff.tcl, and then commits these tcl files, not the data files. Finally, it accepts oyabun's version of the data files, updating the files in its cvs dir.
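The diff-to-SQL step might look roughly like the sketch below. It is not the production code (which is wrapped in Tcl); it assumes cvs emits normal diff format, in which lines beginning with "> " come from the local working file, and it assumes a pipe-delimited flat file. The key column BluId, the field order, and the sample records are hypothetical.

```python
# Sketch: keep only the locally originated lines of a normal-format diff
# and turn them into SQL. Column names and field layout are assumptions.

def local_lines(diff_text):
    """Return lines found only in the local extract ('> ' lines)."""
    return [line[2:] for line in diff_text.splitlines()
            if line.startswith("> ")]

def sql_quote(value):
    """Quote a value as an SQL string literal, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"

def maskblu_update_sql(record, sep="|"):
    """Assume the first field is the key and the last field is 'status'."""
    fields = record.split(sep)
    blu_id, status = fields[0], fields[-1]
    return ("UPDATE MaskBlu SET status = " + sql_quote(status)
            + " WHERE BluId = " + sql_quote(blu_id))

def mask_insert_sql(record, sep="|"):
    """A new line in the Mask extract becomes a new record."""
    values = ", ".join(sql_quote(f) for f in record.split(sep))
    return "INSERT INTO Mask VALUES (" + values + ")"

SAMPLE_DIFF = """2c2
< 1001|maskA|0
---
> 1001|maskA|2
5d4
< 1005|maskE|0
"""

for rec in local_lines(SAMPLE_DIFF):
    print(maskblu_update_sql(rec))
```

Lines marked "< " exist only in oyabun's checked-in version, so they are ignored: waiaha reports its own news and never tries to undo the master's changes.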

Code on oyabun is cronned to wake up on the half hour, about once/hour throughout the Keck working day. It uses cvs update to get the .tcl files waiaha created and committed, and processes them into executed SQL to update itself. As a result, the 'status' field in MaskBlu may be changed for one or more masks and new records may be introduced into Mask. No other change is permitted. If a new record in Mask conflicts with a previous unique barcode, the previous record is deleted and the new record accepted.
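The rules oyabun applies can be sketched as follows. The example uses an in-memory SQLite database purely as a stand-in for the real server, and the table layouts (BluId, MaskId, barcode) are simplified, hypothetical versions of the real ones.

```python
import sqlite3

def apply_keck_news(db, status_changes, new_masks):
    """Apply only the two kinds of change Keck is allowed to send."""
    with db:  # one transaction: all changes are applied, or none
        for blu_id, status in status_changes:
            db.execute("UPDATE MaskBlu SET status = ? WHERE BluId = ?",
                       (status, blu_id))
        for mask_id, barcode in new_masks:
            # A barcode must stay unique: drop any earlier record, keep the new one.
            db.execute("DELETE FROM Mask WHERE barcode = ?", (barcode,))
            db.execute("INSERT INTO Mask (MaskId, barcode) VALUES (?, ?)",
                       (mask_id, barcode))

def demo():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE MaskBlu (BluId INTEGER PRIMARY KEY, status INTEGER)")
    db.execute("CREATE TABLE Mask (MaskId INTEGER PRIMARY KEY, barcode TEXT UNIQUE)")
    db.execute("INSERT INTO MaskBlu VALUES (1001, 0)")
    db.execute("INSERT INTO Mask VALUES (7, 'BC-0001')")
    # News from Keck: one status change, one scan that reuses a barcode.
    apply_keck_news(db, [(1001, 2)], [(8, "BC-0001")])
    status = db.execute("SELECT status FROM MaskBlu WHERE BluId = 1001").fetchone()[0]
    masks = db.execute("SELECT MaskId, barcode FROM Mask").fetchall()
    return status, masks

print(demo())
```

Deleting before inserting implements the "previous record is deleted and the new record accepted" rule without relying on any server-specific upsert syntax.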

Having absorbed these changes sent from Keck, oyabun then extracts the two critical tables to flat files and commits the flat files. The next time waiaha checks its database content against the latest cvs revision of the flatfiles, these changes will no longer be diffs.

Now we know how data from Keck feed back into the Lick server. But how do new data from Lick feed into the Keck server? At some point waiaha has to absorb new records from oyabun.

Once per day (currently at 11 am HI time), waiaha sends off its latest diff/sql information (if any), and then does a complete cvs update of all the flat data files, sql scripts, etc. which are produced on oyabun by 'sybdump' and checked in at 0600 CA time. Having updated all the flat files it copies them to a working directory where scripts are kept whose purpose is to destroy all the tables in the local 'metabase' database and recreate them from the (latest) flat files.

It is possible that waiaha will lose one iteration's worth of new info when this destruction and re-creation takes place (known informally as The Stomp). However, it has committed the sql/diff info for the most recent changes. Within a half hour, oyabun will have them; and on the next Stomp cycle they will make their way back to waiaha. The worst effect of a "race condition" is that some updates will disappear for 24 hours -- but they will not be completely lost.
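The worst case can be traced with a small timeline model. This is a simplified sketch: each database is a dictionary, the mask name and status values are hypothetical, and the cvs transport is reduced to passing a dictionary of pending changes.

```python
def stomp_timeline():
    """Trace one Keck-side change through two Stomp cycles."""
    oyabun = {"mask42": "new"}       # master database
    waiaha = {"mask42": "new"}       # Keck's local database
    nightly_dump = dict(oyabun)      # flat files checked in at 0600 CA time

    waiaha["mask42"] = "milled"      # mask is milled at Keck after the dump

    # Day 1 Stomp: waiaha first commits its diffs, then rebuilds its tables
    # from the dump, which predates the change.
    pending = {k: v for k, v in waiaha.items() if nightly_dump.get(k) != v}
    waiaha = dict(nightly_dump)
    after_first_stomp = waiaha["mask42"]

    # Within the half hour, oyabun applies the committed diffs.
    oyabun.update(pending)

    # Day 2: the next dump and Stomp carry the change back to waiaha.
    nightly_dump = dict(oyabun)
    waiaha = dict(nightly_dump)
    return after_first_stomp, oyabun["mask42"], waiaha["mask42"]

print(stomp_timeline())
```

In this model the change vanishes from waiaha after the first Stomp, is already held by oyabun, and reappears on waiaha after the second Stomp.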

Here is a picture of how things work:


A few words of explanation are in order. 'Osiris', off to the right, is currently the name of the Lick web server, and the output arrow there represents users hitting the Mask pages to look at the data. MaskKeeper is an "auditor" cron job which checks the mask data twice a day for various inconsistencies, errors, etc. and notifies the right people.

The black/grayish part of the diagram shows the bare core of the mask milling operation in HI. The red part of the diagram shows the mechanisms that update oyabun with mask events from HI. The blue bits show the daily whole-database commit and the overwrite of waiaha.


de@ucolick.org
webmaster@ucolick.org
De Clarke
UCO/Lick Observatory
University of California
Santa Cruz, CA 95064
Tel +1 408 459 2630
Fax +1 408 454 9863