Monitoring Controls System meeting on May 14, 2002


Attendance:
Bill Badgett, Steve Hahn, Andy Hocker, Konstantin Kotelnikov, Jonathan Lewis, Oleg Poukhov, Yeongdae Shon, Margherita Vittone, John Yoh, JC Yun

NEWS:
Steve said the June shutdown schedule has been confirmed to run from June 3 to June 14. The number of crew members on each shift will be reduced during this time.

TRIGGER INHIBIT:
A week ago, Jonathan sent out an e-mail saying that he could not toggle the trigger inhibit switch. It turned out that Andy had made the change because he was nervous about the shift crew disabling the inhibit and then forgetting about it. Jonathan suggested that subsystem users should have permission to modify these kinds of pictures.

Please contact Steve or JC before you modify any code that does not belong to your system!

Jonathan argued that the trigger inhibit picture is confusing, since some of the systems are not implemented and asking for them to be turned off is not appropriate. JC volunteered to change the color to gray for the systems that are not implemented yet. Margherita also agreed to add trigger inhibit information to Icicle.

ICICLE:
There was an incident in which Icicle messages to run control caused an alert for the shift crews. The message was something like 'Error... the CDF server4 was not responding'. Since the error is not a severe one, Margherita was asked to change the word from 'error' to 'warning'. This raised another long-pending issue: how should we monitor the CDF servers adequately? We agreed to monitor them on the CDF Global alarms page and to use the scheduler to sound alarms.
Larry completed his graphic browsers for the following 8 systems: CMU, CMX, CMP, MUON, COT, SVX, Solenoid and PlugTemp.

VIRUS SCAN:
Steve said that the Real Time Virus scan (RTscan) may be causing some PCs to hang. He is testing this possibility by turning off RTscan on the nodes Vnode1, Vnode2, Solenoid2 and TOF. We should keep an eye on these nodes.

CCU:
Oleg said he made pictures to control the CCUs. Sometimes the voltages of certain channels go to zero. The shift crews can restore the voltages with these pictures. New CCU pictures:

  • Oleg's transparency #1
  • Oleg's transparency #2
  • Oleg's transparency #3

PISABOX:
Konstantin made some pictures to monitor all the channels. New Pisabox pictures:

  • Konstantin's transparency #1
  • Konstantin's transparency #2
  • Konstantin's transparency #3

OTHERS:
Yeongdae said that congestion on the network request stack could possibly induce PC crashes. He said this could be cured by raising the priority of TCPtask.exe.
Andy said we still have fake heartbeat problems. JC needs to fix this at some point.