Minutes of DAQ/Online Calibration Meeting 06/23/00


News
----
o SPARC machine and disk for the offline:                  Nelly
  Nelly was absent but has been working flat out to get a production
  instance, as well as an integration instance, installed on the offline
  machine designated for this until the final machine is in place.
  The machine is a sparc5 and there is concern about whether it can
  handle the load.  Rick pointed out that there is not much of a load
  anyway, and that this is the more serious problem: the OFFLINE database
  is underutilized.  The plan is to call the sparc5 machine cdfora and
  rename the present SGI Indy that is called cdfora to be cdfdev.
  The database instance on the sparc5 for production will be called
  cdfofprd and an alias on the machine to cdfoff1 will also be made.
  Thus, when the machines are operational on Wednesday, the software
  will maintain the same pointers, cdfoff1 on cdfora, but will in fact
  point at the production instance.  
 
  The integration instance will be formed from a copy of the present
  development instance.  The production instance will be populated by
  data that are considered interesting.  This is easily implemented: the
  hard part is deciding what is interesting.  Mark Lancaster estimates 
  the job is of order 10 minutes for the Offline Calibration Database.
  The Data File Catalog will need to be updated too and Rick has been 
  talking to Eric and Terry about what is needed there.

  If things fail, the fallback is to use the integration instance and,
  in the worst case, to go back to the SGI.

o Cut of Integration and Production Calibration Databases: Jack
  Jack needs to consult a bit more with Julie, but is planning to do
  this next week.  With the hardware database scheduled to go to
  production on Saturday (note: it did) Jim Patrick was interested in
  seeing the calibration database go to production soon.
  
o Move to "broken up" version of CalibDBTables:            Jack
  Art Kreymer was thankful that this is underway and has tried to get
  the new packages that were formerly only CalibDBTables and are now
  xxCalibDBTables where xx=subdetector to build in a test release.
  Unfortunately he had problems.  Jack thinks it should work and Art
  should give it another try.  

  Art pointed out that we can choose a machine on which to try out the
  change in the development packages now that he has that possibility.
  Rick suggested using nglas05.fnal.gov since it is a fast machine.
  Rick and Jack pointed out that references to the calibration packages
  need to be sought out so that we can also check those packages.  This
  makes the problem more difficult but Rick and Jack believe that this
  is a reasonably small number of packages and will get a list using the
  code browser.  Art has the ability to change any package so this will
  make it possible to create a test release that works and commit the 
  change all at once.
  

o Schema Consistency and final approval:                   Julie/RickStD
  A final set of comments on the trigger schema was made at the trigger
  database meeting and are being implemented.  The GUI by Donatella
  looks great and Mel is very happy so it is in the last stages of
  getting into place. Rick is trying to get all the cdf schema collected
  into one place to see how it all fits together (or not).  
 
o Replication:                                             RickStD/John
  Rick reported that John is planning to replicate a few tables from sgi
  (b0dau30)  to sgi (cdfdev) when the offline migration is done. 

o Validation on Linux and Linux server:                    Dennis
  Dennis has things running but Jack has reported a bug in otl so they
  are going to look into this.  The Linux server on ncdf16 allows
  validation jobs to run every night and to wipe, create and put entries
  into the database.  This will also allow us to check that the Linux
  instance is working.
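
  The nightly wipe/create/fill cycle described above can be sketched as
  follows.  This is only an illustration: it uses Python's built-in
  sqlite3 module as a stand-in for the Oracle instance, and the table
  name validation_test, its columns and the function run_validation are
  hypothetical, not taken from the actual validation jobs.

```python
import sqlite3


def run_validation(conn):
    """Wipe, recreate and fill a scratch table, then read it back.

    Returns True when the rows read back match the rows written.
    """
    cur = conn.cursor()
    # Wipe: remove whatever the previous night's job left behind.
    cur.execute("DROP TABLE IF EXISTS validation_test")
    # Create: rebuild the (hypothetical) scratch table from scratch.
    cur.execute(
        "CREATE TABLE validation_test ("
        "channel INTEGER PRIMARY KEY, pedestal REAL)"
    )
    # Put entries: write a small, known set of rows.
    rows = [(ch, 100.0 + 0.5 * ch) for ch in range(10)]
    cur.executemany("INSERT INTO validation_test VALUES (?, ?)", rows)
    conn.commit()
    # Verify: the instance is "working" if the same rows come back.
    readback = cur.execute(
        "SELECT channel, pedestal FROM validation_test ORDER BY channel"
    ).fetchall()
    return readback == rows


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    ok = run_validation(connection)
    print("validation passed" if ok else "validation FAILED")
```

  Because the job starts by dropping the table, it can be rerun every
  night against the same instance without manual cleanup.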

o New Schema for Calibration Database:                    Dennis/Jack/JimK
  The new schema needs the production/integration/development instances
  of the databases to be in place both online and offline, and also needs
  the package split to be complete.  The main outcome of this is the ability
  to handle the used and valid sets in a better way.  This work will
  proceed;  however, the valid and used sets need to be used NOW so the
  present implementation will be used based on scripts that Mark
  Lancaster has (he has also circulated instructions). Rick St. Denis
  asked Jim Patrick to think about/find someone who will be responsible
  for making the entries so that we use this mechanism. 
 
o Trigger Implementation:                                  Alexei/Kirsten
  See comments on schema and GUI above. Also, from the trigger database
  meeting Rick points out that they are moving to get the database-based
  trigger download to work over the next week for level 1 calorimetry.

o TrigSim Implementation:                                  Simona
  Rick reported from the Trigger meeting: the ascii files generated for 
  the trigger simulation are currently put into the offline CVS
  repository. This will be moved to a table in the trigger database
  as a CLOB (Character Large Object).  Rick explained that Oracle allows
  large objects (LOBs) to be stored as CLOBs or BLOBs (Binary Large
  Objects).  The column of the table in which they are stored can be 
  pointed physically to a specific disk so that all the trigger tables
  can be on one disk or set of disks.  The copying of this can be easily
  managed with Oracle Replication (and, for that matter, an msql-based
  version) since the copying is done column by column.  

  Placing these LOBs in the database makes the data easy to find.
  This, coupled with the generic as well as specific capability
  of the web-based browsers written by Randy Herber and by Donatella,
  allows one to locate and view all the data in one place.

  The use of the CVS repository, while OK for the time being, has
  problems once we are in an operational mode.  The offline repository
  will only be updated by the nightly update so that trigger clobs would
  only be available the next day.  Furthermore, they would only be
  conveniently available in development.  For the frozen release where
  most people want to work, this would be awkward to reach.

  However, for the present time, the few experts can learn the awkward
  rules and survive until the LOBs are implemented.
  
  As for implementation, Julie Trumbo has already got the scripts for
  the storage and retrieval of LOBs and they will be stored on the
  database.  Dennis says that the Java retrieval is trivial, but he has
  to give some thought to the C++ interface.  He also believes
  that thinking about this and getting it implemented will solve the
  "array of floats" problem (see silicon below).


o Online                                        JPatrick/RickV/BillB
  See comments on production release and need to use Used/Valid sets
  above.



Followup on what has been done concerning the theme of
the last meeting: DEAD CHANNELS.

Discussion of this occurred in the calorimetry report.

Round Table:
-----------
SVX:                            Jean Slaughter/Steve Nahn
---------------------------------------------------------
Jean pointed out that there is still nobody to replace RickStD on the
pedestal code and specifically to produce pedestals on Barrel 4 data.
Rick suggested that Stan Thompson be asked to do this since he comes on 
Monday.  Jean was not sure if Stan had finished the work on access to Cell and
bunch crossing number, but thought it a good idea and will talk to him.

Rick pointed out that in the discussion with Oracle this week, the
Oracle people suggested that any table exceeding 50M rows should be
"partitioned."  This means that a table can have its information split
amongst various disks AND that rows satisfying some condition, like
"run number between x and y", can be put into one partition to allow
faster access to the data.  There are also conveniences gained with
respect to the backups that need to be done.  Oracle also recommends
that no more than 1000 partitions be used.  This leads one to infer
that Oracle is not suggesting that tables exceed 50E9 rows (50M rows
times 1000 partitions).  The current Silicon implementation would put
about 120M rows on the database once per week if the cell-dependent
pedestals were stored for each mode of readout.  Clearly, arrays of
floats are needed to cut this down to 5000 rows (as is done for the
downloading).  Dennis has pointed out that this will probably be
implemented as part of the BLOB/CLOB solution.
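
The row-count arithmetic above, and the idea of storing an array of
floats in a single row, can be sketched as follows.  The numbers come
from the discussion; the packing format (little-endian 32-bit floats via
Python's struct module) and the function names pack_floats and
unpack_floats are assumptions made for illustration, not the format that
will actually be chosen for the BLOB/CLOB solution.

```python
import struct

# Figures quoted in the discussion.
ROWS_PER_PARTITION = 50_000_000     # table size at which Oracle suggests partitioning
MAX_PARTITIONS = 1000               # Oracle's recommended upper limit
ROWS_PER_WEEK_SCALAR = 120_000_000  # one row per cell-dependent pedestal
ROWS_PER_WEEK_ARRAY = 5000          # one row per array of floats

# 50M rows x 1000 partitions = 50E9 rows.
implied_max_rows = ROWS_PER_PARTITION * MAX_PARTITIONS
# Number of floats each array row must hold to replace the scalar rows.
values_per_row = ROWS_PER_WEEK_SCALAR // ROWS_PER_WEEK_ARRAY


def pack_floats(values):
    """Pack a sequence of floats into one binary string (4 bytes each)."""
    return struct.pack("<%df" % len(values), *values)


def unpack_floats(blob):
    """Recover the list of floats from a string made by pack_floats."""
    count = len(blob) // 4
    return list(struct.unpack("<%df" % count, blob))


if __name__ == "__main__":
    print("implied maximum rows:", implied_max_rows)
    print("floats per array row:", values_per_row)
    pedestals = [0.5, 1.25, -3.0]
    blob = pack_floats(pedestals)
    print("bytes stored:", len(blob))
    print("round trip ok:", unpack_floats(blob) == pedestals)
```

Note that 32-bit packing keeps only single precision; values that are
not exactly representable in 32 bits will come back slightly rounded.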

COT:                             Ashutosh Kotwal
------------------------------------------------
Ashutosh reported that the calibration system hardware for the COT is in
place but he needs customers.  There are no front end COT electronics
available. Perhaps the XTC can use it?  All the strobe timing is
measured.  The bottom cables of the COT are still to be done.  Ashutosh
asked about the 1250 numbers.  Rick suggested that the original text
file be put on the database as a CLOB so that it can be easily
referenced and checked. (Later discussion pointed out that this does NOT
mean that the numbers should be accessed by applications this way: they
still should go onto tables.  The reason is that once they are on
tables, then no parsing is needed to obtain the numbers and the
interface to the text file need only be maintained in the method that
puts them into the database tables.)  Being able to view the table and
the original file will be useful in getting them checked and providing a
traceback record of where the numbers are coming from and going to.
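
The split suggested above (original text file kept whole for reference,
numbers also loaded into a table for applications) can be sketched as
follows.  This is an illustration only: sqlite3 stands in for Oracle, a
TEXT column stands in for a CLOB, and the file format, file name, table
names and the function load_constants are all hypothetical.

```python
import sqlite3

# Hypothetical text file: one "channel  t0" pair per line, '#' starts a comment.
RAW_TEXT = """# channel  t0_ns
0  1.50
1  1.75
2  2.00
"""


def load_constants(conn, name, text):
    """Store the raw file as one large text object AND as parsed rows.

    The parsing of the text format lives only here; applications read
    the t0_constants table and never touch the text.
    """
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS source_files "
        "(name TEXT PRIMARY KEY, body TEXT)"
    )
    cur.execute(
        "CREATE TABLE IF NOT EXISTS t0_constants "
        "(source TEXT, channel INTEGER, t0_ns REAL)"
    )
    # Keep the original file verbatim so the numbers can be traced back.
    cur.execute("INSERT INTO source_files VALUES (?, ?)", (name, text))
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        channel, t0 = line.split()
        cur.execute(
            "INSERT INTO t0_constants VALUES (?, ?, ?)",
            (name, int(channel), float(t0)),
        )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_constants(conn, "cot_t0_example.txt", RAW_TEXT)
    for row in conn.execute(
        "SELECT channel, t0_ns FROM t0_constants ORDER BY channel"
    ):
        print(row)
```

Each table row carries the name of the file it came from, which gives
the traceback record mentioned above.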

Ashutosh pointed out that there are an additional 24 numbers to be
tracked in the cables going to each crate and that these can 
contribute to a t0 offset.  These numbers are assumed to be the same for
now but future versions may differ.  There are also 8 numbers
corresponding to the card to detector offsets.  Rick recommended that
these should be put on the calibration database. Bill Orejudos is
familiar with the technical details and can help define these numbers.
The choice of what values to use and what were used works nicely in the
calibration database's valid/used set scheme.  

Bill Orejudos sent the following update by email:

Hi,

  I will be unable to attend the 9 am calibration meeting. Here is
a quick summary of what's new with COT calibrations:


   1) COT map classes will go in COTGeometry. All parties agree on this.
      The view to the hardware database has been modified to include
      the channel id info in the COTD banks. So there are now 3 ways
      of describing a channel:

       a) crate, slot, chip index (0-95)

       b) superlayer, cell, wire

       c) 9 bit word in headers of COTD bank (essentially superlayer
          bits 6:8, mod ID 0:5), 7 bit word in the data words
         (local cell number 4:6, wire number 0:3).

      The view was modified to deal with case c). Any consumer
      (YMON) getting COTD banks will be able to get description a)
      or b). Of course, given a), one can still get b), and vice versa.
      CotCalibConsumer is set up to get a) from the front end, convert
      to b), and then put both a) and b) in the DB.

   3) STILL waiting for flat cables to be connected to repeater cards
     so that length calibration can ensue...



                                              Bill
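
The bit layout in item c) of Bill's message can be sketched as follows.
This assumes that "6:8" means bits 6 through 8 inclusive with bit 0 the
least significant bit, which the message does not state explicitly; the
function names are hypothetical and this is not the COTGeometry code.

```python
def decode_header_word(word):
    """Split the 9-bit COTD header word.

    Assumed layout: superlayer in bits 6..8, module ID in bits 0..5.
    """
    if not 0 <= word < (1 << 9):
        raise ValueError("header word must fit in 9 bits")
    return {"superlayer": (word >> 6) & 0x7, "mod_id": word & 0x3F}


def decode_data_word(word):
    """Split the 7-bit channel field of a COTD data word.

    Assumed layout: local cell number in bits 4..6, wire number in bits 0..3.
    """
    if not 0 <= word < (1 << 7):
        raise ValueError("data word field must fit in 7 bits")
    return {"local_cell": (word >> 4) & 0x7, "wire": word & 0xF}


if __name__ == "__main__":
    header = (5 << 6) | 17  # superlayer 5, module ID 17
    data = (3 << 4) | 9     # local cell 3, wire 9
    print(decode_header_word(header))
    print(decode_data_word(data))
```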

DatabaseHardware:    Rick Vidal/Umesh Joshi
--------------------------------------------
No report

Muon:                           Kevin Lannon
--------------------------------------------
No report

Calorimeter/DBana:     Vaia Papadimitriou
-----------------------------------------
Vaia pointed out that she was unable to write to the database for the
last two weeks.  This was understood to have nothing to do with the
Oracle access or the database itself, but rather with two problems:
1. The shareable libraries change and she was using development.
   Since Art is releasing 3.6.0, it is reasonable to base the
   consumers on this release.  Jack points out that the consumer
   is presently a tbin which means one has to check it out anyway and
   build it.  The plan then is to check the head release of the
   calibration consumer out against 3.6.0 and build there.  This should
   be much more stable.  (Note: the consumers can be made as part of the
   binaries and will be in the release ONLY if there is a validation
   procedure, i.e. gmake test.  This should be implemented for each
   consumer.)

   Art pointed out that the release is made based on the criterion that
   the offline executable works. Rick asked Art about subreleases like 
   3.6.0.1 etc where small changes also allow the release to work for
   calibration/online/consumers.  Art said that making such a release
   for a couple of packages is administratively easy but that it takes
   8 hours for the build to be done because everything is made from
   scratch (i.e. gmake clean).  Rick mentioned that the next major
   release, 3.7.0, will be done for physics analysis of MDC2 data.
 


2. There was an overwrite bug that was hard to trace and was eventually
   understood to be due to the readout sending non-existent channels
   that led the code to overwrite. Jack has protected against this but
   Vaia will have to follow this up with the Online group.


Vaia reported that reasonable error reporting is happening and it is
going to screen and to a file. 

Vaia said there are plans to create an error bank with bad channels in
the front end and that this will be sent to the consumer.  She has error
codes defined.  Rick pointed out that this is all very similar to what
is needed for slow control. Rick asked Vaia to read the plans for
slow control he had sent her and to comment on how the same schema can
be used for error logging and control.

Vaia pointed out that the current x-mode consumers will be templated
like the d-mode consumers. This will allow for easy implementation of
dead channels on the database from the x-mode consumers.

Vaia said that no specific plan on merging information had been formed
and that while there are bad calibration channels that she finds as well
as hot channels that are found by Carla, these have not been put
together.  Rick pointed out that the first step is to get the
information on the database according to a precise definition such as
"channels that have high occupancy" and "channels that are not
calibrating" and to have another piece of code that uses this
information for specific applications such as the trigger masks to be
downloaded to level1.  This factors out the producer of the information
from the consumer/user. 
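
The separation Rick describes (producers record channels under precisely
defined flags, and a separate piece of code combines them for a given
application) can be sketched as follows.  This is an illustration under
stated assumptions: sqlite3 stands in for the calibration database, and
the table channel_flags, the flag names, the producer names and the
functions shown are hypothetical.

```python
import sqlite3

# Precisely defined conditions, one flag name each.
HIGH_OCCUPANCY = "high_occupancy"    # "channels that have high occupancy"
NOT_CALIBRATING = "not_calibrating"  # "channels that are not calibrating"


def open_store():
    """Create an in-memory stand-in for the channel-flag table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE channel_flags ("
        "channel INTEGER, flag TEXT, producer TEXT, "
        "UNIQUE (channel, flag))"
    )
    return conn


def record_flags(conn, producer, flag, channels):
    """Producer side: record which channels satisfy one definition."""
    conn.executemany(
        "INSERT OR IGNORE INTO channel_flags VALUES (?, ?, ?)",
        [(ch, flag, producer) for ch in channels],
    )
    conn.commit()


def channels_to_mask(conn, flags):
    """User side: list the channels carrying any of the given flags."""
    placeholders = ",".join("?" for _ in flags)
    rows = conn.execute(
        "SELECT DISTINCT channel FROM channel_flags "
        "WHERE flag IN (%s) ORDER BY channel" % placeholders,
        list(flags),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = open_store()
    record_flags(conn, "calibration_consumer", NOT_CALIBRATING, [12, 40])
    record_flags(conn, "hot_channel_finder", HIGH_OCCUPANCY, [40, 77])
    print(channels_to_mask(conn, [HIGH_OCCUPANCY, NOT_CALIBRATING]))
```

The producers never need to know about the trigger mask, and the mask
builder never needs to know how a channel was found to be bad; a
different application could select a different set of flags.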

Rick took the opportunity to emphasize that everyone seems to be
converging on July 15 as the day that they will put lots of things on
the database.  He is concerned that while the source data are being
collected, they are not being put on the database and when this is
ready, there may be other items that cause a further delay in these
numbers.  It is clear that the priority is to have the  initial
calibration so that the PMT voltages can be set; however, daily LED runs
are needed to carry the calibration and monthly source runs are needed
as well, so the window to get things on the database once the voltages
are set is not big.  Furthermore, the COT is coming in with constants at
the same time, Silicon SHOULD be coming (but it is delayed by the
cooling), the offline database machines may require attention that would
otherwise be used to ensure that the constants are getting on the
database and the alignment must also be trying to get a first crack at
the database.  Rick emphasized again that since the offline calibration
database is underutilized, there will be demands coming late there as
well. Therefore, waiting in line with a number is anticipated starting
in mid-July...


LED/Source:                Phillipe Gris
----------------------------------------
Phillipe sent mail that there was nothing new to report.

CLC:                             Alexei Safonov
-----------------------------------------------
No report

ShowerMax:                 Steve Kuhlmann
-----------------------------------------
Mary came to listen in and try to catch up!



AOB