Minutes of DAQ/Online Calibration Meeting 05/12/00
Agenda:
=======
Friday, May 12, 9:00 am, B0 Theater
Minutes of last meeting Jack Cranshaw/Rick St. Denis 5
Follow up from last meeting
===========================================================
Rick brought up the usual question about taking a series of calibration
points. He reported that Alexei Safonov said that this is being done
now, but that the run number differs for each point. Jim said this
could be the case and is probably a temporary solution. After
discussion, it was clear that this is an undesirable feature. The
problem is that calibrations can be run on multiple partitions, and each
partition gets its own run number. Hence, the run number may not
increase by one for each point, which makes interpreting the data
difficult. Multipartition running allows faster calibration if
there is no interference. This was used in the last run.
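As a rough illustration of why the run numbers in a calibration series
need not be consecutive, the sketch below assumes a single global run
counter shared by all partitions. That is an assumption; the minutes do
not say how run control hands out numbers. The names RunControl,
start_run and runs_for, and the starting run number, are hypothetical.

```python
import itertools


class RunControl:
    """Hypothetical run control: one global run counter for all partitions."""

    def __init__(self, first_run=1000):
        self._counter = itertools.count(first_run)
        self.log = []  # list of (run_number, partition, label)

    def start_run(self, partition, label):
        # Every run, on any partition, takes the next global number.
        run = next(self._counter)
        self.log.append((run, partition, label))
        return run


def runs_for(rc, partition):
    """Run numbers that were taken on one partition, in order."""
    return [run for run, part, _ in rc.log if part == partition]


rc = RunControl()
# Two calibration series running at the same time on two partitions.
for point in range(3):
    rc.start_run(partition=0, label=f"series A, point {point}")
    rc.start_run(partition=1, label=f"series B, point {point}")

series_a = runs_for(rc, 0)
gaps = [b - a for a, b in zip(series_a, series_a[1:])]
print(series_a, gaps)
```

In this toy model the points of series A are two run numbers apart, so
anyone analyzing the series cannot assume "previous run + 1".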
News
====
B0DAU30 crashed due to a kernel software bug. Oracle was down. Support
DB, in cooperation with Oracle service, got the system running in about
5 hours. The database was down longer than that, but it was decided
that this was not urgent enough to bring people in, since it is not a
24/7 machine.
Round Table:
-----------
SVX: Jean Slaughter/Steve Nahn
====================================================
Took data and got a consumer to run on ladder testing. Testbeam software
writes to the software event builder but is not using full run control.
Data were written from Feynmann using D-Mode and writing Trybos. Could
not hook to the CSL directly; they want to try this and a consumer next. In
this mode it is not possible to download pedestals to the FIB. The
teststands are oversubscribed so testing is limited. The Broker uses
the hardware database and the test stand software does not.
COT: Ashutosh Kotwal/Bill Orejudos
========================================================
Ashutosh reported on the hardware situation. Certainly in a few weeks they will
be pulsing the Front end and shipping this out. They need a crate. A
test of the system can be done if they could power a corner of the CTC
(this is not how it is wired, so they would have to power more, but for
a full test, only 1/4 is really needed). They are missing a processor
but this should be in by Monday. There is potential that the hardware
is ready for pulsing calibrations sooner than a few weeks. Stay tuned.
Bill reported that he has set up maps of crate id to geometric id using
Jack's maps classes. This can be examined by making a view on the
Hardware database -- trivial in SQLPlus. He needs to get it coded and
this is a matter of a bit of time. This map is used in a lot of places
and YMON is an obvious customer as well. It will be tested soon.
DatabaseHardware: Rick Vidal/Umesh Joshi
=================================================
Umesh reported on the CDF Online Production Database Server, B0DAU35.
The system is a Sun E4500 server with
o four 400 MHz UltraSPARC CPUs
o 1 GB memory
o two 9 GB 7200 RPM SCSI disks (system)
o 1 A5200 Fibre Channel disk array
---> ten 18 GB 10000 RPM FC disks
o 10 Ethernet ports (10/100 Mb/sec)
It runs the Solaris 2.7 operating system. The file system
for the database is VxFS, version 3.3.2. The volume manager is VxVM,
version 3.0.2. It uses Solstice Backup, version 5.5.1 (Legato), and is
running Oracle, version 8.0.5.
It has an uninterruptible power supply, a Matrix 500, which can keep the
system up for more than 30 minutes. This allows the system to be
brought down gracefully.
The general design was to have redundancy throughout the system, except
for the backplane. There are 2 CPU boards with 2 CPUs per board; the
memory and CPUs are on the same board.
All disks are mirrored. There are four pairs of mirrored disks (4 x 18
GB = 72 GB of data storage). There is one disk kept as a spare for use
with hot relocation. This means that if one of the disks in a pair
fails, the spare can be paired with the disk that is still good and the
bad disk can be replaced. One further disk is kept in reserve to do
backups.
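The disk accounting and the hot relocation step can be written out as a
small sketch. The disk counts and sizes are taken from the minutes; the
disk labels d0..d9 and the function hot_relocate are hypothetical, and
this is only a toy model, not how the Veritas Volume Manager is driven.

```python
DISK_GB = 18       # size of each FC disk (from the minutes)
TOTAL_DISKS = 10   # disks in the A5200 array (from the minutes)

# Hypothetical labels for the ten disks.
disks = [f"d{i}" for i in range(TOTAL_DISKS)]
pairs = [(disks[0], disks[1]), (disks[2], disks[3]),
         (disks[4], disks[5]), (disks[6], disks[7])]
spare = disks[8]           # used for hot relocation
backup_reserve = disks[9]  # kept in reserve for backups

# Each mirrored pair stores one disk's worth of data.
usable_gb = len(pairs) * DISK_GB
raw_gb = TOTAL_DISKS * DISK_GB


def hot_relocate(pairs, spare, failed):
    """Swap the spare in for a failed disk; return (new_pairs, remaining_spare)."""
    if spare is None:
        raise RuntimeError("no spare disk available")
    if not any(failed in pair for pair in pairs):
        raise ValueError(f"{failed} is not part of any mirrored pair")
    new_pairs = [tuple(spare if d == failed else d for d in pair)
                 for pair in pairs]
    return new_pairs, None  # the spare is now in use


new_pairs, remaining_spare = hot_relocate(pairs, spare, failed="d3")
print(usable_gb, raw_gb, new_pairs[1])
```

Of the 180 GB of raw disk, 72 GB is usable data storage; after a
failure the affected pair is mirrored again, and no spare remains until
the bad disk is replaced.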
There are 10 Ethernet ports on 4 different cards, with 10 different IP
addresses. The question was raised as to how this interacts with Oracle
addressing, and Umesh will study this.
The Veritas Volume Manager allows hot relocation using the spare disk
and also allows Dirty Region Logging, which keeps a log file for
mirrored volumes.
Oracle redundancy has been implemented with 3 control volumes (hence 6
copies) and 1 redo log volume (hence 2 copies).
There have been problems. First, there was a bad CPU, which was
replaced. Then there was a bug in the volume manager, which needs an
upgrade. Finally, there was a license problem with VxFS.
In the short term, Umesh wants to configure the system with 10
independent network ports. He will perform Oracle and system backups and
restores using Solstice Backup. He will conduct studies with disks
configured for direct I/O. The last item addresses the problem that
led to the database on B0DAU30 not coming up after a crash: the
filesystem was saving the data file and the control files to disk at
different times, because the amount of data going to the data file
forced a write from the buffer, whereas the control file buffer did not
fill as fast.
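The failure mode described above can be mimicked with a toy model of
buffered writes. This is only a sketch: the BufferedFile class, the
buffer size and the record sizes are all made up for illustration, and
it does not claim to reproduce how VxFS or Oracle actually buffer data.

```python
class BufferedFile:
    """Toy file: writes reach 'disk' only when the buffer fills."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size  # 0 means every write goes straight to disk
        self.buffer = []                # (record, nbytes) not yet on disk
        self.on_disk = []               # records that would survive a crash

    def write(self, record, nbytes):
        self.buffer.append((record, nbytes))
        if sum(n for _, n in self.buffer) >= self.buffer_size:
            self.flush()

    def flush(self):
        self.on_disk.extend(record for record, _ in self.buffer)
        self.buffer.clear()


data = BufferedFile(buffer_size=8192)
control = BufferedFile(buffer_size=8192)
control_direct = BufferedFile(buffer_size=0)  # unbuffered variant

for txn in (1, 2, 3):
    data.write(txn, nbytes=8192)          # large blocks fill the buffer at once
    control.write(txn, nbytes=64)         # small records sit in the buffer
    control_direct.write(txn, nbytes=64)  # written through immediately

# A crash at this point loses whatever is still in the buffers.
print(data.on_disk, control.on_disk, control_direct.on_disk)
```

After the simulated crash the data file on disk is three transactions
ahead of the buffered control file, which is the kind of mismatch that
can stop a database from coming up. The unbuffered variant stays in
step, loosely analogous to the direct I/O configuration to be studied.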
In the future, a Sun 450 server will be purchased for development and
integration. This should have been purchased first. It has 4 CPUs, 4
GB of memory, VxFS, VxVM, Solaris 2.7 and five 18 GB SCSI disks -- no
need for mirroring, since backups can be done and it is a development machine.
There will be a 1 GB memory upgrade of B0DAU35, and four more disks will
be added. Also, a backup device will be purchased -- maybe with an
autochanger.
Muon: Kevin Lannon
=======================================
no report
Calorimeter/DBana: Vaia Papadimitriou
=============================================
no report, but see Monday's CDF WEEK talk
LED/Source: Phillipe Gris
========================================
Banks are defined; the LED source runs were done on the NW arch in D
mode. He is working on the consumer. They are staying on the NW. The
voltages are set to give the same gain as in Run I. This is
only for the EM calorimeter, although sources are EM + Had.
CLC: Alexei Safonov
=========================================
Absent, but reported to Rick about the multiple runs discussed above.
Thus, the LED calibration is going, and the D bank calibs are basically ready.
Consumer: Kaori Maeshima
=========================================
No report
ShowerMax: Steve Kuhlmann
=========================================
Email:
Jack hit a roadblock in trying to write a pedestal file to
the database. There is a routine that I need to modify
but haven't had a chance yet
Presentations
-------------
DatabaseSoftware: Jack Cranshaw
Maps Classes
============
This maps the key in the events coming out of the detector during
calibration runs to the key on the database. It gives a level of
indirection to show how things are numbered. It also provides
standardization for applications to use one source of this mapping.
Beyond using this for D and X mode calibs, candidates to use the mapping
are Event display, Banks and DBANA.
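To make the idea of a single shared mapping concrete, here is a minimal
sketch of such a map. It is not Jack's maps classes: the names
HardwareKey, GeometricKey and DetectorMap, the fields they carry, and
the sample entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareKey:
    """Hypothetical key as it appears in the event data."""
    crate: int
    slot: int
    channel: int


@dataclass(frozen=True)
class GeometricKey:
    """Hypothetical key as stored in the database."""
    detector: str
    index: int


class DetectorMap:
    """One source of truth for the hardware <-> geometric numbering."""

    def __init__(self, entries):
        self._to_geo = dict(entries)
        self._to_hw = {geo: hw for hw, geo in self._to_geo.items()}
        if len(self._to_hw) != len(self._to_geo):
            raise ValueError("mapping is not one-to-one")

    def to_geometric(self, hw_key):
        return self._to_geo[hw_key]

    def to_hardware(self, geo_key):
        return self._to_hw[geo_key]


# Made-up entries; a real map would be filled from the hardware database.
example_map = DetectorMap([
    (HardwareKey(crate=1, slot=3, channel=0), GeometricKey("CAL", 0)),
    (HardwareKey(crate=1, slot=3, channel=1), GeometricKey("CAL", 1)),
])

geo = example_map.to_geometric(HardwareKey(1, 3, 1))
print(geo, example_map.to_hardware(geo))
```

Because every application goes through the same lookup, a change in the
numbering only has to be made in one place.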
CalorMaps is the calorimeter map class. It is used in X and D modes but
not in DBANA. The transfer of data from the front end is hand coded, but
the start address comes from the hardware database. So there is a
potential for things to get out of sync if changes are made.
CotMaps is under development.
Some problems in the software design exist. The functions are defined
in the class but should be in a separate place.
Templated D mode
===================
Now being used by Phillipe, Andrew (sources), and Alexei. With so many
users, Jack restructured the directories, one per detector.