Level 3 Online Status Reports
Issue 11
Monday, July 5, 1999
Event Flow
Reports
- The ATM adapter in b0l3c02 has been reported to be bad: no lights
flashing, and it doesn't do anything when told. (July 6 update: seems
to be working again. ???)
- The VME SCRAMNet module in b0eb21 (lower VME64 crate) can no longer
drive the optical bypass switch. This is the second SCRAMNet module
to show a similar problem in that crate, with the same bypass switch.
Steve notes that since driving the switch is a high-current
operation, one might expect it to be the first function to fail as
the module dies of old age. However, it is proposed to move this
module to another crate (and another bypass switch) to see whether
the problem lies with the crate.
- A version of the ATM driver has been tested which checks the adapter's
tail pointer register (pointing to the end of the received packet queue)
to see if all packets have been received. It hangs the computer in
cases where the queue entries are filled late: the CPU has the highest
access priority, so while it keeps polling to see whether the last
descriptor has become valid, it locks out the adapter's attempts to
push its descriptors out to main memory. Sleeping for a while is not
an option, since this code runs in an interrupt handler.
- Other possibilities for the ATM driver:
  - Periodic pushes, attached to the master clock ticks. The handler,
    responding to clock ticks (currently 1ms, but could be set for
    once every n ticks, or "jiffies"), would flush the currently valid
    descriptors. This appears to be fairly straightforward to
    accomplish.
  - Lengthen the delay between interrupt arrival and service from the
    current 314us to something longer, up to 1ms. However, the
    time it takes for all 10 fragments of the event to arrive is
    about 2ms (in which time there are generally 5 to 7 interrupts),
    so there is still always a chance of leaving something around
    for the next interrupt (which would be too late in our case).
- Another issue with the ATM driver: the traces suggest that reception
is being handled in two steps. The smoking gun: 20 buffers are
expected to be filled for a single event with 10 builders, but at one
point 11 buffers were reported as filled while all 10 receiving
threads had been activated. We had assumed that a single read()
completes only after the entire packet has been received, which may
therefore not be the case. (Moreover, the consistency checks on the
converter node only check the first few words, so they would not
detect incompletely received packets of this kind.) If this is a
problem, a quick fix would be to use only one large SKB (currently,
the driver uses one small and one large SKB per fragment), but a
longer-term solution would be for read() to return only when the
complete packet has been received.
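
Until the driver guarantees complete packets, the receiving side can
guard against the two-step reception described in the last item by
looping until the expected byte count has arrived. The following is a
minimal user-level sketch in Python, not the driver or builder code.
The names read_exact and two_step_reader are hypothetical, and the
fake reader only stands in for a device that delivers one packet in
two pieces.

```python
def read_exact(read, nbytes):
    """Return exactly nbytes, calling read(n) repeatedly over short reads.

    `read` is any callable with the semantics of a blocking read: it
    returns between 1 and n bytes, or b"" once the stream has ended.
    """
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = read(remaining)
        if not chunk:
            raise EOFError(f"stream ended with {remaining} bytes missing")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def two_step_reader(first, second):
    """Hypothetical stand-in for a device that delivers a packet in two steps."""
    pending = [first, second]

    def read(n):
        if not pending:
            return b""
        piece = pending.pop(0)
        if len(piece) > n:
            # Keep whatever the caller did not ask for, for the next call.
            pending.insert(0, piece[n:])
            piece = piece[:n]
        return piece

    return read


# A single read() would have returned only b"head" here.
packet = read_exact(two_step_reader(b"head", b"er+payload"), 14)
```

With a real file descriptor fd, the callable could be
lambda n: os.read(fd, n). This only works if the expected packet
length is known before the read starts.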
Actions
- Move SCRAMNet module from b0eb21 crate to another crate (1st floor?)
with another bypass switch.
- Move SCPU's to 1st floor, along with appropriate cabling.
- Steve will implement and test the "periodic push" ATM driver.
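
The timing argument behind the two driver options in the reports above
can be checked with a little arithmetic. The sketch below uses only
the figures quoted there (2ms per event, 314us service delay, 1ms
clock tick). It assumes, as a rough model that the report does not
confirm, that interrupts are serviced back to back at the stated
delay; the function names are hypothetical.

```python
EVENT_ARRIVAL_US = 2000  # ~2ms for all 10 fragments to arrive (from the report)
CURRENT_DELAY_US = 314   # current delay between interrupt arrival and service
TICK_US = 1000           # master clock tick, currently 1ms


def services_per_event(delay_us, window_us=EVENT_ARRIVAL_US):
    """Rough number of interrupt services that fit in one event's arrival window."""
    return window_us / delay_us


def worst_case_flush_wait_us(n_ticks, tick_us=TICK_US):
    """Longest time a valid descriptor can wait for a push done every n ticks."""
    return n_ticks * tick_us


current = services_per_event(CURRENT_DELAY_US)  # about 6.4, within the observed 5 to 7
lengthened = services_per_event(1000)           # 2.0 with the delay raised to 1ms
periodic_wait = worst_case_flush_wait_us(1)     # 1000us with a push on every tick
```

Even at a 1ms delay, two services still fall inside the arrival
window, so a late fragment can still be left for the next interrupt. A
periodic push instead bounds the wait by the tick period, whether or
not another interrupt arrives.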
Executable Interface
Reports
- Ilya and Kevin have gone through two iterations on the executable.
On the last attempt, it appears that the filter executable is
receiving L3_CS_RUN_EVENT (or some such message) but expecting
L3_RUN_EVENT, or L3_RUN_STOP instead of L3_RUN_END.
- Boris has rewritten his database user interface using Swing, but it
isn't tested yet. We also don't have a flat-file implementation of
the database access interface.
- Christoph's new reformatter handles I2, I4, and R4 single-type
banks properly; we are counting on there being no mixed-type banks
(as is the current indication). Some ambiguities still exist in
handling internal bank pointers, and the SVX structure remains
unclear (no minibank structure, so no byte-swapping information is
included in the received data structures).
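
Byte swapping needs the bank type because the swap unit is the element
width. The sketch below is a minimal illustration in Python, not
Christoph's reformatter. It assumes that I2, I4, and R4 denote 2-byte
integers, 4-byte integers, and 4-byte reals; swap_bank and
ELEMENT_WIDTH are hypothetical names.

```python
import struct

# Bytes per element for each single-type bank (assumed meanings, see above).
ELEMENT_WIDTH = {"I2": 2, "I4": 4, "R4": 4}


def swap_bank(payload, bank_type):
    """Reverse the byte order of every element in a single-type bank payload."""
    width = ELEMENT_WIDTH[bank_type]  # KeyError for unknown or mixed types
    if len(payload) % width:
        raise ValueError("payload length is not a multiple of the element width")
    out = bytearray(len(payload))
    for i in range(0, len(payload), width):
        out[i:i + width] = payload[i:i + width][::-1]
    return bytes(out)


# The same four bytes swap differently as two I2 words or as one I4 word.
as_i2 = swap_bank(b"\x01\x02\x03\x04", "I2")  # b"\x02\x01\x04\x03"
as_i4 = swap_bank(b"\x01\x02\x03\x04", "I4")  # b"\x04\x03\x02\x01"

# A big-endian R4 value reads back correctly as little-endian after the swap.
value = struct.unpack("<f", swap_bank(struct.pack(">f", 1.5), "R4"))[0]
```

A mixed-type bank could not be handled this way without a per-element
type map. The same applies to the SVX data, which arrives without
byte-swapping information.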
Actions
- Jeff will make the handshaking messages consistent.
- Boris will provide a flat-file implementation of the database access
interface.
- Christoph will clarify the pointer format and SVX structure.
Test Control
Reports
- Some implementation errors were corrected in the event builder
control/monitoring software. A first attempt at controlling the
scanner manager with this new software may be made within a few days.
Actions
- Ilya and Sasha will test new event builder control.
- Boris's database access interface will be used in event builder control.
- Need to define how all this additional software enters the
repository.
Monitoring
Reports
- Mike and Ivan have made an integrated test of singleton requests and
circular buffers and histograms, using two intermediate nodes.
- An N->1 concentrator has been implemented for the periodic monitoring
requests, using the CORBA event channel service. Some problems with
timeouts (waiting for connections) were resolved.
- SmartSockets is now available for VxWorks. There is a week-long FNAL
class offered the week of July 19.
- Steve points out that Zephyr messages, as they're used in the event
builder, will eventually have to be spit out to run control as
SmartSockets messages.
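
The N->1 concentrator reported above can be pictured as a fan-in step
that gathers one reply per source and gives up on slow sources after a
timeout. The sketch below is a plain in-process Python stand-in using
queues. It is not the CORBA event channel implementation, and the
function name, node names, and message text are hypothetical.

```python
import queue


def concentrate(sources, timeout_s):
    """Collect one reply from each source queue into a single list.

    Returns (replies, timed_out): replies holds (name, message) pairs in
    source order, timed_out holds the names that did not answer in time.
    """
    replies, timed_out = [], []
    for name, source in sources.items():
        try:
            replies.append((name, source.get(timeout=timeout_s)))
        except queue.Empty:
            timed_out.append(name)
    return replies, timed_out


# Two hypothetical intermediate nodes; only the first has answered.
node_a, node_b = queue.Queue(), queue.Queue()
node_a.put("histogram update")
replies, timed_out = concentrate({"node_a": node_a, "node_b": node_b}, 0.01)
```

Because the sources are polled one after another, the worst-case wait
is the timeout multiplied by the number of silent sources. A
concentrator that waits on all sources at once would avoid that.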
Actions
- Ivan will work on starting programs from the singleton request service.
- Mike will continue developing monitoring primitives.
- Ivan and Andreas will integrate periodic request service with singletons.
Physical
Reports
- JJ will order eight 100' optical fiber duplexes to run between the
1st and 3rd floors. This is a temporary solution. JJ will then
continue ordering the final optical infrastructure.
- There are currently only four crates on the 1st floor with the proper
power.
- A 6U-9U adapter is needed for the SCRAMNet module for b0eb25. b0eb25
has nevertheless been booted, though it is not yet used in the event
builder test system.
- Ilya has run two optical duplex cables between the 2nd and 3rd floor,
as well as installed a bypass switch for Steve Vejcik to use.
- The minimum configuration we need during the move is one that lets
the ATM driver tests continue: the VME64 crates left in place, along
with at least one converter node.
- We should try to move the SCPU's and the PC's around the same time
in order to minimize overall downtime. However, aside from the ATM
driver tests there are no pending tests or measurements with the
current test system.
Actions
- Christoph will continue monitoring status of 1st floor infrastructure
so we can move the SCPU's down there.
Jeff Tseng / MIT / jtseng@fnal.gov