MDC-II Goals and Components
------------------------------

Goals:
-------

MDC-II consists of two related but separate parts, MDC-IIa and
MDC-IIb, which have different goals.

The primary goal of MDC-IIa is to provide a rate test for the full data
handling and analysis chain, starting from Level 3 and going through
the reading of data sets by end users. The target rate is 20 MB/sec for
the complete system. However, we would also like to measure the maximum
rate possible for each component and to understand where potential
bottlenecks and rate limitations in the system might exist.

Additional goals for MDC-IIa are:

1) Ensure that all components of the data processing and handling
   system are functional and can work together without interference.
2) Identify critical operations issues for Level 3, the production
   farms and the data handling system so that these issues can be
   resolved well before the commissioning run.

To meet these goals, we probably need to run MDC-IIa continuously
(24 hours a day) for 3 days once all component testing is complete.

The primary goal of MDC-IIb is to provide large samples of
reconstructed simulation data to the collaboration. These samples will
be used to:

1) Test the quality of the simulation and the reconstruction software
2) Exercise the DIM and associated data catalog and the DHInput and
   DHOutput modules
3) Identify critical operations issues associated with the use of the
   data handling system on fcdfsgi2.

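As a rough check of what the MDC-IIa rate test implies, the sketch
below multiplies the 20 MB/sec target by the proposed 3-day continuous
run. It assumes 1 MB = 10^6 bytes and that the target rate is held for
the whole run; the function name is illustrative only.

```python
# Back-of-envelope data volume for the MDC-IIa rate test.
# Assumptions (not from the plan): 1 MB = 1e6 bytes, 1 TB = 1e6 MB,
# and the target rate is sustained for the entire run.

SECONDS_PER_DAY = 24 * 60 * 60


def total_volume_tb(rate_mb_per_sec: float, days: float) -> float:
    """Return the data volume in TB moved at a constant rate."""
    total_mb = rate_mb_per_sec * days * SECONDS_PER_DAY
    return total_mb / 1e6


if __name__ == "__main__":
    # 20 MB/sec sustained for 3 days
    print(f"{total_volume_tb(20, 3):.2f} TB")  # prints "5.18 TB"
```

Under these assumptions, a 3-day run at the target rate moves roughly
5.2 TB through the chain.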
Dataflow for MDC-IIa:
----------------------

The chain through which data will flow is as follows:

- Simulated data sent to Level 3 nodes in TryBos format (EDM-II
  converted back to TryBos)
- Analysis in Level 3 farms based on simulated trigger stream (using
  TrgSim++ where possible)
- Output of Level 3 sent to consumer server-logger (multiple streams)
- Consumer server-logger writes data to dual-ported disk in Root format
  and records file information in File Catalog
- FCC data logger moves data to tape using DH software and fills
  fileset and tape information into File Catalog
- Data moved from tape to Farms I/O node using farms software and Data
  Handling utilities
- Farms control software ships files to worker nodes and controls flow
  of production jobs
- Production jobs run current reconstruction executable and produce
  split output files that then get shipped back to I/O node using farms
  control software
- Farms/DH software do concatenation and writing of output files onto
  tape (plus filling of appropriate databases)
- AC++ analysis job reads these files from tape using the DHInputModule

Required components and Milestones
------------------------------------

For each part of MDC-II a list of required components is given below.
The list is marked using the following symbols:

+++ -> Required for MDC-II
++  -> Highly desirable for MDC-II
+   -> Desirable for MDC-II
-   -> Probably not present for MDC-II

Level 3:
----------

Aside from the DAQ chain used by the front-end scanners, there is no
way to load data into the front end of Level 3 fast enough to do the
MDC-IIa rate test. The proposed solution to this problem is to locally
mount four 30 GB IDE disks on the Level 3 I/O node PCs. This means that
for MDC-IIa the data sample will be limited to ~120 GB, and therefore
the same data will have to be processed multiple times.

The following components are needed:

+++ Hardware:
    All exists and in place.

+++ Level 3 triggers defined in TriggerDatabase
    The trigger database defines the L2 prerequisite and full analysis
    path for each trigger. The database tables are already in place,
    but the triggers themselves need to be put into the database.
    Scripting-level code (Perl) exists to do this, but it is not very
    user friendly. The long-term plan is to have a GUI.

-   GUI to add L3 triggers to the database
    Won't be ready for MDC-II.

+++ Access TriggerDatabase to create L3 tcl file
    Kirsten has this code working (it is written using PerlDBI).

+++ Prerequisite module to select on L2 trigger bits
    This code still needs to be finished.

+   TrgSim++
    We will use whatever is ready in TrgSim++ but we won't hold up
    MDC-II for it.

+++ Calibration DB access
    Because all nodes need access to the DB information at the same
    time, this information is read once by a separate process and
    downloaded to all of the nodes in the form of a flat file. The
    Database API is able to read these flat files (using code
    automatically generated from the Java table definitions). Reading
    and writing the flat files now works.
    ??? Is the mechanism to do this automatically at run start working?
    Is it required ???
    We need to make sure that the offline code using the CalibDb API is
    used in L3.

+++ TL3D and associated banks
    Level 3 needs to record which trigger paths passed for each event.
    This is done in TL3D. In addition, it must be possible to associate
    this bit information with a trigger name. This part of the system
    was flawed in MDC-I and Kevin is working on a redesign.

+++ Network Message to DataLogger to specify trigger bits and offset
    from start of blob to where run-section info is written
    This exists. (The second item might need modification for the Root
    changes?)

+++ Dropping of Level 3 reconstruction banks
    This was done using a kludge for MDC-I. Liz Sexton now has the
    basic functionality working properly in AC++, but L3 has its own
    output module and I don't know if that module has incorporated
    Liz's changes.

+++ AC++ modules for reconstruction
    What we have is OK.

+++ EDM related items
    See EDM list for details.

+++ Need to discuss how we control which files each L3 node reads

B0 Data Logger:
----------------

The consumer server logger will output events in Root format. The
logger is NOT an AC++ application, but rather a standalone C program.
The data logger is responsible for putting the run section information
into the event (it sees the event as a blob but receives a message from
L3 telling it where to write the number). Data will be written to the
dual ported disks and separated into two output streams:

    Stream A: High pt lepton events (top, W/Z, Higgs)
    Stream B: QCD and B triggers

Data will be written in ~1 GB files, and each file will contain events
from a contiguous range of run sections. Files from different data
streams will be segregated into different directories on disk. The B0
data logger is responsible for writing file-based information (e.g.
number of events in the file) into the CDF File Catalog.

The data logger exists and is working. The issues remaining are:

+++ Switch from XXX format to true Root format
    See EDM list for details. This might require additional work for
    multibranch Root.

++  Writing file information to DataFile Catalog
    For MDC-I this was done by writing to a file and then uploading it
    to the catalog later. This is clearly not a long-term solution.
    There were concerns that using the Oracle database directly would
    make it impossible to write data if the Oracle server were down.
    This of course can be addressed with an appropriate design. It is
    REQUIRED that a solution be found by the commissioning run.

+++ Dual ported disk
    Makoto and Stephan have bought a product from SGI that allows the
    dual mounting of the filesystem from two SGI machines. This
    solution should be much more robust than the home-grown option, but
    it still needs to be installed and tested.

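The catalog-writing issue above (logging must not stop if the Oracle
server is down) could be addressed in several ways. The sketch below
shows one possible design, not a chosen one: try the catalog first,
spool the record to a local file on failure, and replay the spool
later. Every name in it (CatalogUnavailable, record_file, replay_spool,
the JSON spool format) is hypothetical and does not come from the CDF
data handling software.

```python
# Hypothetical sketch: spool catalog records locally when the catalog
# database is unreachable and replay them later. All names are
# illustrative assumptions, not part of any existing CDF package.
import json
import os
from typing import Callable, Dict


class CatalogUnavailable(Exception):
    """Raised by the (hypothetical) catalog writer when the DB is down."""


def record_file(record: Dict, write_to_catalog: Callable[[Dict], None],
                spool_path: str) -> bool:
    """Try the catalog first; on failure append the record to a spool file.

    Returns True if the record went straight to the catalog.
    """
    try:
        write_to_catalog(record)
        return True
    except CatalogUnavailable:
        with open(spool_path, "a") as spool:
            spool.write(json.dumps(record) + "\n")
        return False


def replay_spool(write_to_catalog: Callable[[Dict], None],
                 spool_path: str) -> int:
    """Replay spooled records in order; keep any that still fail."""
    if not os.path.exists(spool_path):
        return 0
    with open(spool_path) as spool:
        pending = [json.loads(line) for line in spool if line.strip()]
    written = 0
    for record in pending:
        try:
            write_to_catalog(record)
            written += 1
        except CatalogUnavailable:
            break  # database still down; stop and keep the rest
    with open(spool_path, "w") as spool:
        for record in pending[written:]:
            spool.write(json.dumps(record) + "\n")
    return written
```

In this sketch the data logger never blocks on the database: writing
event data to disk proceeds regardless, and only the bookkeeping is
deferred.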
FCC Data Logger
-----------------

The FCC data logger will take the files from the dual ported disks and
write them into filesets on tape. This data logger will run as a daemon
that waits until a tape's worth of data for a given stream is available
and bundles it for writing. The FCC data logger is responsible for
filling the fileset and tape information into the CDF File Catalog.

The following components are necessary:

+++ Dual ported disk
    See above
+++ Tape drives (Ait-II)
    Ordered
+++ Tape Media
    Ordered
+++ Disk Inventory Manager (DIM)
+++ Daemons and Cron Jobs
+++ FileCatalog

Production Farms:
------------------

Data Input:
The farms production system will look in the CDF catalog to see what
data are available for processing and will spool the relevant data to
disk using the Disk Inventory Manager. The farms control system will
then move these files to the worker nodes for processing.

Processing in the worker nodes:
The binary used to reconstruct data will be linked against a frozen
release of the offline package. We will access calibration data using
the standard C++ API and Oracle database access for each node. The
output from each worker node will be split according to trigger bit.
The generator information and raw data will be kept on the output
files. There will be one output file per output stream per input file.

Data Output:
Output files from the worker nodes will be shipped back to the I/O node,
where they will be collected until enough files exist for the
concatenation (back to the target file size of 1 GB) to be done. The
concatenation will be done using AC++ and the Data Handling output
module. This module is responsible for filling the file information in
the CDF Catalog. A special input module that can talk to the farm
control system using the farms interprocess communication software
(FIPC) was used in MDC-I and is likely to be needed for MDC-II as well.
The output files will be put on tape using the same software as the FCC
data logger.

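The "collect until enough files exist" step for concatenation can be
sketched as a per-stream queue that releases a batch once the
accumulated size reaches the target. This is a minimal illustration
under stated assumptions: the class and method names are hypothetical,
1 GB is taken as 10^9 bytes, and the release policy (batch closes on
the file that reaches or passes the target) is a guess, not the farms
software's actual behavior.

```python
# Hypothetical sketch of batching worker-node output files for
# concatenation. Only the ~1 GB target file size comes from the plan;
# names and policy are assumptions.
from collections import defaultdict
from typing import Dict, List, Tuple

TARGET_BYTES = 1_000_000_000  # ~1 GB, assuming 1 GB = 1e9 bytes


class ConcatenationQueue:
    """Collect files per output stream; release batches at the target size."""

    def __init__(self, target_bytes: int = TARGET_BYTES) -> None:
        self.target_bytes = target_bytes
        self._pending: Dict[str, List[Tuple[str, int]]] = defaultdict(list)

    def add(self, stream: str, filename: str, size: int) -> List[str]:
        """Register a file for a stream.

        Returns the list of filenames to concatenate once the stream has
        accumulated at least target_bytes, otherwise an empty list.
        """
        files = self._pending[stream]
        files.append((filename, size))
        if sum(s for _, s in files) < self.target_bytes:
            return []
        batch = [name for name, _ in files]
        self._pending[stream] = []  # start a fresh batch for this stream
        return batch
```

With this policy a concatenated file can overshoot the target by up to
one input file; a job that must stay under 1 GB would instead close the
batch before adding the file that crosses the limit.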
The components required for the Production farms are:

+++ Hardware
    All installed and working.

+++ Farm Control Software
    v2.0 milestone to release: May 1

+   Level 4 database
    It would be really nice to be able to generate the splitting part
    of the tcl file from the trigger database, even if the
    reconstruction path is hand-coded, since the splitting will depend
    on L3 triggers. The splitting part is REQUIRED for the
    commissioning run.

+++ Concatenation Job
    Some changes needed from MDC-I, including:
    +++ Move to multibranch Root and new DHOutputModule
    +++ Filter module to select on new TL3D

+   Autogeneration of tcl files from TriggerDatabase
    See comments above (Level 4 database).

EDM:
-----

There are a number of EDM items for MDC-II:

+++ Replace XXX format with Root format for DataLogger (use
    Write(TBuffer))
    Since this is REQUIRED for the commissioning run, it is very
    important to have it tested during MDC-II.

+++ Multibranch Root
    As we are going to do this for the commissioning run, we are
    probably better off doing it now instead of later.

+   Support for compression

Reconstruction:
---------------

We will use whatever we can get into the release tagged in late May
(3.6.0).

+++ Need quick turnaround to make sure all objects written in
    production can be read and interpreted, and that they make sense.

+++ Need high quality physics validation plots

++  Would like to get calibration database access into the code so that
    we can test our operational model of having all the production
    nodes independently contact Oracle (since different nodes may be
    working on different runs and since nodes don't start
    synchronously, the Level 3 solution doesn't make sense for the
    Production farms)

++  Pad output banks
    It would be highly desirable but may not be possible.

Analysis Jobs:
----------------

Once the data are in the CDF catalog and on tape, they will be
accessible for users on fcdfsgi2. These users will access the data
through AC++ and the DataHandling input module.

Requirements:

+++ Tape drives and MTTools on fcdfsgi2
+++ Disk Inventory Manager on fcdfsgi2
+++ DHInput Module and FileCatalog
    Done
+++ LSF batch queues properly configured
++  Strategy for Secondary Datasets