Level 3 Online Completion List
As we near completion of the development phase of Level 3, we thought a
list of tasks would help us keep it all straight. This will be updated
and annotated as we see fit.
Level 3 Internals
Infrastructure
- Physical setup and cabling
-
There remain some temporary cable installations which need to be
cleaned up. The relay racks in the Level 3 area also need to be
removed and replaced with more shelves.
- Linux/ATM driver update
-
Once Fermilab settles on a 2.2 kernel and has all the kinks worked out,
we'll go with it and update our Linux version as well. This will also
require redoing the ATM driver. Performance tests should be repeated
for this update.
- Mechanism to check file versions
-
This would perform basic checks, like file size and checksum, and would
be useful in confirming not only versions of binaries and databases, but
also whether they are really copies of the same physical file. There
will be two implementations of the mechanism: expect scripts, and a
relay implementation.
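As a rough illustration of the basic checks described above, the sketch below compares two files by size and checksum. It is not the expect-script or relay implementation; the function names and the choice of SHA-256 are our assumptions for the example.

```python
import hashlib
import os


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex_digest) for the file at path."""
    digest = hashlib.sha256()  # checksum algorithm is an assumption
    with open(path, "rb") as handle:
        # Read in chunks so large binaries and databases do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def same_contents(path_a, path_b):
    """True if both files have the same size and the same checksum."""
    return file_fingerprint(path_a) == file_fingerprint(path_b)
```

A node could report its fingerprint for a given binary or database file, and a central check would then compare it against the fingerprint of the reference copy.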
Data Flow
This is mostly done. We have tested the flow of data through the system
from the event builder out to the consumer server, but always in test
configurations.
- Partition handling
-
We need to verify that partition handling is consistent across node
types. Converter and processor nodes need to check the partition number;
output nodes can take any partition number.
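The rule above can be sketched as a single check. The role names and the function signature are hypothetical, chosen only to make the rule concrete; the real nodes may represent partitions differently.

```python
# Roles that must match the partition number (hypothetical role names).
CHECKING_ROLES = {"converter", "processor"}


def accepts_partition(role, node_partition, event_partition):
    """Return True if a node of this role should handle the event."""
    if role == "output":
        # Output nodes can take any partition number.
        return True
    if role in CHECKING_ROLES:
        # Converter and processor nodes only handle their own partition.
        return node_partition == event_partition
    raise ValueError("unknown node role: %r" % (role,))
```

For example, `accepts_partition("converter", 2, 3)` is false, while `accepts_partition("output", 2, 3)` is true.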
Control
- Relay control of l3_converter program
-
This has been accomplished.
- ROOT objects for farm divisions and state transitions
-
In progress. ROOT and ORBacus have been compiled together.
- Relay/object server startup
-
This can take two forms: expect scripts, and self-propagating relay
startup. Alternatively, we could start ORBacus as an inetd service and
then load dynamic libraries (not a function of ORBacus, but perhaps of
ROOT...)
- Object ping operation
-
This operation is to confirm that an object or node is still available,
thus hopefully protecting the object services from crashing because a
remote node stopped servicing requests.
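To show the idea of a ping with a bounded wait, here is a minimal sketch that uses a plain TCP connection attempt as a stand-in. The actual operation would presumably be a call on the remote object itself; the function name and default timeout are assumptions.

```python
import socket


def ping(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds in time."""
    try:
        # The timeout keeps the caller from hanging on an unresponsive node.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections, unreachable hosts and timeouts all count
        # as "not available".
        return False
```

The important point is the timeout: the caller gets a yes-or-no answer within a fixed time and can skip the node rather than block on it.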
- Means to connect to an existing relay
-
This will be essential in order to debug running systems. This probably
means we'll define another base key, RELAY_BASE_KEY, from which port
numbers and so forth will be defined, and keep L3_BASE_KEY to indicate
what keys are to be used for gathering information or starting processes.
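One way to picture deriving port numbers from a base key is a fixed table of offsets. This is only a sketch of the idea: the service names, the offsets, and the assumption that the key is used directly as a port base are all hypothetical.

```python
# Hypothetical offsets from the relay base key for each relay service.
RELAY_OFFSETS = {"control": 0, "monitor": 1, "log": 2}


def relay_port(relay_base_key, service):
    """Derive the port for a relay service from the relay base key."""
    port = relay_base_key + RELAY_OFFSETS[service]
    if not 1024 <= port <= 65535:
        raise ValueError("derived port %d is out of range" % port)
    return port
```

A debugging client that knows only the relay base key could then compute where to connect, for instance `relay_port(5000, "monitor")`.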
Monitoring
- Basic monitoring primitive methods
-
We already have some basic operations and the scheme by which they are
accessed by the monitoring system. Another useful primitive would be
retrieving segments of the log file for a given process.
- Periodic monitoring structure and channels
-
Also done. We will use CORBA event channels and objects which gather,
collect, and forward monitoring strings.
- Receiving PushClient
-
Implemented in C++.
- Interface to l3_converter
-
Implemented but not tested very much yet.
- Startup procedures for relay use
-
This is the recipe by which the relay mechanism can be used to start the
monitoring.
External Connections
Event Builder
This is pretty much done with the l3recv package.
Consumer Server
- Event flow
-
This has been implemented and tested.
- Query routines (library detail)
-
Some additions to the l3_csl_socket library need to be made for
consumer server use.
- Multiple consumer servers?
-
Possible development path. Not decided.
Run Control
- SmartSockets interface to state machine handlers
-
Since the relay control objects will follow the state machine, this
will hopefully consist of a C++ SmartSockets interface invoking the
right methods.
- Hardware database
-
The hardware database should hold data on what nodes are connected to
what switches by what port. This can then be used to figure out the
optimal configuration for a given set of nodes in subfarms. (Perhaps
queries should be designed to answer requests like "Give me one (or two,
or all) node(s) for converter b0l3c05".)
We have basic Java methods for database access which can be used to
create a relay-readable file (this part hasn't been done).
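The kind of query mentioned above might look like the following sketch. It uses an in-memory list in place of the real database, and it assumes that the nodes "for" a converter are those attached to the same switch; apart from b0l3c05, the node and switch names are made up.

```python
from collections import namedtuple

# One record per connection: which node is on which switch, by what port.
Link = namedtuple("Link", "node switch port")

# Hypothetical sample data; only b0l3c05 is a name from our notes.
LINKS = [
    Link("b0l3c05", "switch_a", 1),
    Link("node_01", "switch_a", 2),
    Link("node_02", "switch_a", 3),
    Link("node_03", "switch_b", 1),
]


def nodes_for_converter(links, converter, count=None):
    """Return up to count nodes sharing a switch with the converter.

    With count=None, all such nodes are returned.
    """
    switches = {link.switch for link in links if link.node == converter}
    candidates = sorted(
        link.node
        for link in links
        if link.switch in switches and link.node != converter
    )
    return candidates if count is None else candidates[:count]
```

Calling `nodes_for_converter(LINKS, "b0l3c05", 1)` asks for one node, and omitting the count asks for all of them.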
Online Monitoring (Data Pool?)
- PushClient with SmartSockets output
-
As noted above, a PushClient already exists, but it doesn't yet output
to SmartSockets.
- SmartSockets-reading GUI
-
Later.
- Message handling from individual nodes
-
With hundreds of nodes, even the simplest events would generate a large
number of messages. It may be better to keep these messages in log files.
Filter Interface
- Data reformatting
-
Current version reformats from minibanks to TRYBOS banks. A future
version might accomplish an EDM2-like transformation?
- Event data exchange with filter executable
-
We are currently using the global buffer system, but to replace all the
different processes in the old version, we need to revise its use
somewhat.
We also need to interpret the trigger results to figure out whether
the event should be sent to the output node.
- Access and update of filter databases
-
This is for things like the trigger table and calibrations.
Methods exist which read the database into a flat-file format. The
resulting files need to be distributed to the individual nodes, and
somehow the filter executable needs to know which one is to be used.
Updated 21 November 1999
Jeff Tseng / MIT /
jtseng@fnal.gov