Shane AO  




Problem: Sluggish GUI (e.g. LSM.tcl) / Runaway / Competing Python Processes (e.g. sharcstherm; thorwheel)

Manifestation:

  • Sluggish behaviour of time-critical services (e.g. Laser Shutdown Monitor).
  • Running the top command reveals multiple processes competing for about 100 per cent of CPU (sometimes the processes are owned by a different user, e.g. turk).
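
A non-interactive alternative to top can be handy for logging or for a quick check over ssh. The following is a minimal sketch, not an established procedure for this system: the function name top_cpu and the default of 10 rows are illustrative assumptions. It uses only the POSIX ps -e -o options together with sort and head.

```shell
#!/bin/sh
# Sketch: list the heaviest CPU consumers, highest first.
# top_cpu is a hypothetical helper name, not part of the ao tooling.
# Note: with Linux procps, pcpu is CPU time divided by process lifetime,
# so the figures can differ from the instantaneous values shown by top.
top_cpu() {
    count="${1:-10}"    # number of rows to show (assumed default)
    # The trailing "=" on each column name suppresses the header line,
    # so every output row is a process and can be sorted numerically.
    ps -e -o pid= -o user= -o pcpu= -o args= \
        | sort -k3,3nr \
        | head -n "$count"
}

top_cpu 10
```

The user column makes it easy to spot runaway processes that belong to another account.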

Solution/Recovery:

  • Ultimately, migrate Laser Shutdown Monitor (LSM) and other critical safety systems to a dedicated machine.

    Much of the load on shred comes from the Xvnc processes that handle the busy AO displays. In the longer term, consider adopting the scheme used at Keck, where the VNC host is distinct from the instrument host. This isolates user activity from machine activity.

  • Temporary workaround (2017-03-13):
    • Execute the time-critical safety system software LSM.tcl on shard.ucolick.org.
    • Eventually, execute critical safety systems and/or time-critical software on dedicated hardware (e.g. shredly.ucolick.org).
  • Attempt to stop and re-start the affected process (e.g. sharcstherm):

    ao stop sharcstherm

    ao status sharcstherm

    ao start sharcstherm

  • If the stop and re-start are unsuccessful, identify the offending processes (e.g. sharcstherm):

    ps -ef | grep sharcstherm

    Then kill each process by its PID:

    kill -9 <pid>
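
Where time allows, it may be worth giving a process the chance to exit cleanly before resorting to kill -9. The sketch below is an assumption about good practice, not a documented procedure for these services: stop_pid is a hypothetical helper, and the 5 second grace period is an arbitrary choice. It sends SIGTERM, polls with kill -0, and escalates to SIGKILL only if the process is still present.

```shell
#!/bin/sh
# Sketch: terminate a process politely, then forcibly if it survives.
# stop_pid is a hypothetical helper name; the grace period is an assumed value.
stop_pid() {
    pid="$1"
    grace="${2:-5}"    # seconds to wait after SIGTERM

    # A failure here means no such process, or one we may not signal
    # (e.g. it is owned by another user).
    kill -TERM "$pid" 2>/dev/null || return 1

    while [ "$grace" -gt 0 ]; do
        # kill -0 sends no signal; it only tests whether the PID still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done

    # Still present after the grace period: equivalent to kill -9.
    kill -KILL "$pid" 2>/dev/null
    return 0
}

# Example usage (commented out so that nothing is killed by accident):
# for pid in $(pgrep -f sharcstherm); do stop_pid "$pid"; done
```

Using pgrep -f in place of ps -ef | grep also avoids matching the grep command itself.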

Cause:

  • Possibly sharcstherm becomes confused when the Agilent temperature controller is power cycled (unconfirmed).
  • The sharcstherm executable and the Thorlabs filter wheel controller executable ($LROOT/sbin/thorwheel) both use kpython, and they share communications libraries and control thread design. It is speculated that a common issue afflicts both.

Associated Issues:

  • Issue 10812.
  • Issue 10811.
  • The runaway processes may conflict with other services, or prevent them from running or responding in a timely fashion.

Change Log:

2017-03-14: Updated.
2017-03-12: Updated.
2017-01-20: First version.

This document last updated (UTC): Tuesday 04 March 2025