This file is $CDFSOFT2_DIR/FarmTools/doc/running.html

Running the farms

(for Production Coordinators)

Step 1. Special User Accounts

Please ask Yen-Chu Chen (chenyc@hycppc05.fnal.gov) or the outgoing Farms Production Coordinators to add your username to the file ~cdfprod0/.k5login on node cdffarm1.fnal.gov.

Step 2. Responsibility for the FarmTools package in cvs

Please ask Art Kreymer (kreymer@patnt2.fnal.gov) or the CDF Code Management Team (cdf_code_management@fnal.gov) to give you responsibility for the FarmTools package, so that you have the access needed to update this document.

Step 3. Logging in

You should be logged in as user cdfprod0 on cdffarm1.fnal.gov.

If you are not logged on as cdfprod0:

  1. You may need to run kinit on the desktop computer you are working from.
  2. > telnet -F -l cdfprod0 cdfpca.fnal.gov
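
The ticket check in item 1 can be sketched in sh as follows. This is only a sketch: it assumes the MIT Kerberos command-line tools, where klist -s prints nothing and exits non-zero when no valid ticket is cached. The function name ensure_ticket is hypothetical and is not part of FarmTools. The telnet login itself stays as shown in item 2.

```shell
# Hypothetical helper (not part of FarmTools): make sure a Kerberos
# ticket exists before attempting the login from item 2.
ensure_ticket() {
    # Assumption: MIT Kerberos klist, where -s is silent and the exit
    # status is non-zero if the credentials cache is missing or expired.
    if klist -s; then
        echo "ticket ok"
    else
        # kinit prompts for the password; give up if it fails.
        kinit || return 1
        echo "ticket obtained"
    fi
}
```

After ensure_ticket succeeds, run the telnet command from item 2 as usual.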

Step 4. Running the Farm

Step 5. Making Plots

> cd /cdf/scratch/cdfprod0/4.1.0_b/Stream_B
> mkdir plots
> source ~cdfsoft/cdf2.cshrc
> setup cdfsoft2 development
> $CDFSOFT2_DIR/FarmTools/hist/make_all_plots
Enter subdirectory 4.1.0_b
Enter Streamlet Stream_B
Do you want to have color plots? y

Contact People:
ElectronValidation  Bob Wagner            rgwcdf@pcl4.hep.anl.gov
MuonValidation      Ken Bloom             bloom@umaxp1.physics.lsa.umich.edu
TrackingMon         Eiko Yu               eiko@penn01.fnal.gov
COTHitMon           Eiko Yu               eiko@penn01.fnal.gov
JetMetValidation    Jean-Francois Arguin  jarguin@physics.utoronto.ca
TowerValidation     Simone Dell'Agnello   simon@cdfsga.fnal.gov
FitBeamModule       Hartmut Stadie        stadie@ekp.physik.uni-karlsruhe.de
RawMonitor          Tara Shears           tara@cdfsga.fnal.gov
ClusterMonitor      Tara Shears           tara@cdfsga.fnal.gov
TrackMonitor        Tara Shears           tara@cdfsga.fnal.gov

Step 6. Monitoring the progress of the farms

A good monitoring display can be found here

Step 7. Finding the Output

The /cdf/scratch/cdfprod0/ directory contains:

  1. ProductionExe tar files;
  2. output directories, one for each ProductionExe version and farmlet. The directory name is built as "version" + "_" + "farmlet".
  3. Each output directory contains badhist, core, crashed, hist, logs, and plots.

    To debug a core dump, use ~sexton/bin/gdbkcc. If running the debugger on the core file does not give you enough information to find the problem, it is better to run ProductionExe directly on the "bad" event. You can work out the dataset name, run number, and event number from the core file and the corresponding log file, and then strip the "bad" event to a file.

  4. A script exists that submits a batch job and strips the event to a specified directory:

    pickEvent -o /cdf/scratch/stdenis/stripped/ -r "run number" -e "event number" -d "dataset"

    Use

    pickEvent -h

    for help.

  5. A daemon copies the file from fcdfsgi2 to fcdflnx1.

    The output file has a name like

    ar01d0e0.0001phys.119008.20843

    All of this is normally done by the Farms Production Coordinator (FPC). The instructions are provided here so that new coordinators can learn the procedure, and so that anyone else can follow it if the FPC is unavailable.

    The tcl file should be the one used for the production executable.

    ProductionExe $CDFSOFT2_DIR/Production/ProductionExe.tcl \
        -i /cdf/scratch/cdfprod0/4.1.0a_b/test1/stripped/ar01d0e0.0001phys.119008.20843 \
        -o /cdf/scratch/cdfprod0/4.1.0a_b/test1/stripped/ar01d0e0.0001phys.119008.20843.root
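
The directory naming rule from item 2 ("version" + "_" + "farmlet") can be sketched in sh as follows. The function name farm_output_dir and its optional third argument are hypothetical and only illustrate the rule. The default base path is the scratch area named at the top of this step.

```shell
# Hypothetical helper: build the output directory path for a given
# ProductionExe version and farmlet, following "version" + "_" + "farmlet".
farm_output_dir() {
    version=$1
    farmlet=$2
    # Optional third argument overrides the base scratch area.
    base=${3:-/cdf/scratch/cdfprod0}
    printf '%s/%s_%s\n' "$base" "$version" "$farmlet"
}
```

For example, farm_output_dir 4.1.0 b gives the 4.1.0_b directory used in Step 5.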

Step 8. Examining log files

The log files are available on fcdflnx1 in a directory with a name similar to
/cdf/scratch/cdfprod0/4.1.0_b/Stream_B/logs

Report all "ERLOG-s" errors from the log files
Report all "ERLOG-e" errors from the log files
Report all "Hanged" errors from the log files
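
A quick way to see how many lines need reporting is to count the matches for each of the three strings. The sh sketch below assumes that the strings appear literally in the log text and that all log files sit directly in the logs directory. The function name scan_farm_logs is hypothetical.

```shell
# Hypothetical helper: count the log lines that match each reportable string.
scan_farm_logs() {
    logdir=$1
    for pattern in "ERLOG-s" "ERLOG-e" "Hanged"; do
        # grep -F treats the pattern as a fixed string; -c prints the number
        # of matching lines. grep exits non-zero on zero matches, so "|| true"
        # keeps the function usable under "set -e".
        count=$(cat "$logdir"/* 2>/dev/null | grep -F -c -e "$pattern" || true)
        echo "$pattern $count"
    done
}
```

Run it as scan_farm_logs /cdf/scratch/cdfprod0/4.1.0_b/Stream_B/logs, then use grep with the same strings on the individual files to get the lines to report.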

Step 9. Analyzing Core Dumps

Rick and Liz said that the core file is not as useful as the event, since they have always been able to trace the problem using the event. The core files are available, but they are a bit hard to figure out. If you really need one, the Farms Production Coordinator can track it down.


Last updated November 1st, 2001
