Shane AO  




Problem: covert crash

Manifestation:

  • ShARCS exposure or script fails.
  • sharcs_fe GUI reports Lost server (and attempts to connect and/or init are unsuccessful).
  • Electronic mail(s) with subject:

    ShaneAO: No heartbeat from dispatcher for Detector server.

or,
  • sharcs_fe fails to complete writing of FITS file.
  • ShARCS ion pump control loses communication to dispatcher.
  • Electronic mail(s) with subject:

    ShaneAO: No heartbeat from dispatcher for Keyheader service.

    ShaneAO: No heartbeat from dispatcher for Ion pump server.

    ShaneAO: No heartbeat from dispatcher for Detector server.

    ShaneAO: No heartbeat from dispatcher for Support service.

  • covert is unresponsive to ping while other ShaneAO subsystems respond to ping.
  • After circa 20 minutes the following GUIs disappear (time-out?): sharcs_fe, sharcs display, scriptProc, saowheels.
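
The ping symptom above can be checked with a short sketch like the following. It is illustrative only: the helper names is_reachable and classify_outage are hypothetical, and REFERENCE_HOST stands in for any other ShaneAO machine (no such host name is given on this page).

```shell
#!/bin/sh
# Hypothetical helpers; not part of the ShaneAO software.

# Return 0 if the host answers a single ping, non-zero otherwise.
is_reachable() {
    ping -c 1 "$1" >/dev/null 2>&1
}

# Interpret two ping exit codes (0 = reachable, anything else = not).
#   $1: result for covert
#   $2: result for a reference ShaneAO host
classify_outage() {
    if [ "$1" -eq 0 ]; then
        echo "covert reachable"
    elif [ "$2" -eq 0 ]; then
        echo "covert down, reference host up"
    else
        echo "both unreachable"
    fi
}

# Example use (REFERENCE_HOST is a placeholder to be set by the operator):
#   is_reachable covert.ucolick.org; c=$?
#   is_reachable "$REFERENCE_HOST";  r=$?
#   classify_outage "$c" "$r"
```

A result of "covert down, reference host up" matches the manifestation described here; "both unreachable" suggests a wider network problem rather than a covert crash.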

Solution/Recovery:

Hard power-cycle covert.
  • ping covert.ucolick.org: verify that the machine is responsive.
  • Connect to covert as user and determine whether the service(s) are running:

    user@covert > ao status

  • As user on covert, attempt to re-start any services that do not report as running, e.g.:

    user@covert > ao restart

    Confirm all services are running using:

    user@covert > ao status

    Note: two or more attempts to restart the service(s) may be necessary.

  • Watch for automated e-mails indicating restoration of services.
  • Re-start the following GUIs: sharcs_fe, sharcs display, scriptProc, saowheels, laserOffload, LSM.tcl.
  • Cycle Uplink: On-Off-On
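
Because the restart may need more than one attempt, the wait-and-retry parts of the procedure above could be sketched with a small helper such as the one below. The retry function and the restart_and_check wrapper are hypothetical. The sketch also assumes that "ao status" returns a non-zero exit code when a service is not running, which this page does not confirm; if it does not, the check would need to inspect the command's output instead.

```shell
#!/bin/sh
# Hypothetical helper; not part of the ShaneAO software.

# retry MAX DELAY CMD...
# Run CMD up to MAX times, sleeping DELAY seconds between failed attempts.
# Returns 0 as soon as CMD succeeds, 1 if every attempt fails.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -lt "$max" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example use after the hard power-cycle (attempt counts and delays are arbitrary):
#   retry 30 10 ping -c 1 covert.ucolick.org      # wait for covert to answer ping
#   restart_and_check() { ao restart && ao status; }  # assumes meaningful exit codes
#   retry 3 5 restart_and_check                    # run on covert, as user
```

The helper only automates the repetition; the GUI re-starts and the Uplink cycle remain manual steps.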

Cause:

  • covert.ucolick.org crash.
  • Associated Issues:

Change Log:

    2019-11-12: Updated.
    2019-08-21: First version.

This document last updated (UTC): Tuesday 04 March 2025