CDF Offline Desktop Systems MOU
Draft Version 0.4
November 9, 2000

Preamble
--------
This agreement covers a period of two years.

Components
----------
The MOU represents an agreement among the affected parties for support of CDF Desktops at Fermilab. The CDF Offline Desktop Systems are primarily those UNIX desktops located on-site at Fermilab that are intended for CDF offline analysis or CDF support purposes. These desktops are mainly in the CDF Trailers, but they could in principle be anywhere on the Fermilab site (currently there are CDF Desktop Systems in Wilson Hall, New Muon, and SiDet). Overall, about 150 PCs running Linux and about 50 SGI workstations running IRIX are currently supported.

The Desktop itself consists of the following supported hardware: a basic CPU unit (motherboard, CPU, memory), input devices (keyboard and mouse), output devices (video card, monitor), and storage devices (local IDE/ATA or SCSI disk with disk controllers internal to the CPU unit, CD-R or CD-RW, and the Run II supported serial media choice, currently Exabyte Mammoth-2 drives). If desktop video conferencing becomes a supported application, then sound cards and speakers would be added to the list of supported devices.

A 1 Gbit/sec link exists between the trailers and FCC, allowing users to occasionally copy highly compressed secondary and tertiary datasets, or ntuples, from FCC to the trailers for repeated analysis on the desktop. The connection between the desktop and the Trailers network is 100 Mbit/sec per node, although a single 100 Mbit/sec link may be shared among several machines in the same office if necessary. Common file servers allow sharing of spool disk and backed-up home directories between the large SMPs in FCC and the trailer desktop machines. The aforementioned networking and file servers are not a part of the Desktop Systems project, but are necessary for its operation and hence are discussed in this document.

The code/product server in the CDF trailers is a part of the Desktop Systems project, and provides access to software for the majority of desktops. Local code/product distributions are available but discouraged, to minimize maintenance effort. Some desktops are clustered into workgroups corresponding to CDF collaborating institutions, and share common resources such as account information and NFS-mounted disks.

Computing Division Responsibilities
-----------------------------------
CDF Task Force (TF) in the CDF Department:

Will install, upgrade, and provide system administration and support for the CDF desktops at Fermilab, ensuring that users have functional desktop environments ``out-of-the-box''. This includes 8x5 desktop system administration and support for hardware components of the desktop. The code/product server will be supported 24x7, and the TF will provide backups for this system. The TF does not support backups of PCs, although it does support CD-RWs and the tape drives that conform to the Fermilab serial media choice (currently Mammoth-2).

The TF will support hardware that is compatible with the Desktop System Guidelines. The TF will advise the collaboration as to what hardware is supported and/or order the hardware on the institution's budget code on request.

The TF will attempt to support workgroup clusters, which must also adhere to the Desktop System Guidelines. The support of workgroup clusters is a lower priority than the support of un-clustered desktops, and in the event of a shortage of manpower the support of workgroup clusters may need to be decreased. Please see the Estimated Effort section for further details.

The TF will not support large disk pools or SMP CPUs in the CDF trailers that exceed the levels necessary for desktop analysis of small compressed datasets; for example, the TF will not support 4-way or greater Linux systems connected to Terabytes of disk in the CDF trailers.
Please see the appendix on the CDF trailers data model for further details.

The TF will install and provide system administration support for operating systems on which the CDF Offline Software is supported. For PCs this is currently only Fermi Red Hat Linux (FRHL). The TF will upgrade the OS from unsupported to supported versions of FRHL by the time support for older versions is removed. The TF will not support multiple-boot systems, for example those that allow Windows as well as FRHL to be booted, except for initially configuring home systems as multiple boot before transport off the lab site. The TF does not support laptops, although DHCP ports are available in the trailers to plug in laptops. The TF will install Kerberos software for computer security on the Fermilab desktops.

Data Communications Department (DCD):

Will provide design, installation and operations support for the networks required for the CDF Desktop Systems. This includes support for the switches and cabling, and the tuning and debugging of the network. User help in the CDF trailers for networking problems is provided by the CDF Department of the Particle Physics Division, which coordinates with DCD on the resolution of serious problems. DCD will provide support for the Fermilab Kerberos key server, allowing users to log in to kerberized desktops, and will provide user consulting on Kerberos in case the TF needs assistance.

Operating Systems Support (OSS) Department:

Will provide support for Fermi Red Hat Linux and the desktop products it contains, at levels of support specified in the Desktop Products MOU. OSS will provide support for Kerberos computer security software, including user consulting in cases where the CDF Task Force needs assistance.

The Computing Division provides the product support for agreed-upon desktop products, via various departments which accept particular support responsibilities, as itemized in the Products Support MOU.

Collaboration Responsibilities
------------------------------
Non-Fermilab institutions are responsible for financing the purchase of desktop systems and subsequent hardware additions and replacements, but are not responsible for the support of these systems as long as they adhere to the Desktop System Guidelines. If institutions or individuals decide not to follow the guidelines for their desktop system, they have the responsibility to support the system themselves. For example, if a group purchases a backup tape stacker that does not conform to the Fermilab serial media choice, that group takes sole responsibility for the maintenance and support of that hardware.

It is the collaboration's responsibility to back up critical files on their PCs.

In the event of a shortage of manpower for desktop support, the support of workgroup clusters may shift to the collaborating institution.

Joint Responsibilities
----------------------
This section lists the responsibilities that are shared between the Collaboration and CD and, where necessary, specifies how they are executed.

It is understood that the TF will occasionally, at intervals of approximately six months to one year, wish to install major upgrades of the operating system. Because these changes are potentially disruptive to physics work taking place in the trailers, the upgrade will happen during a transition period, and the decision to begin the upgrade will be the responsibility of the heads of the CDF Offline Operations Organization (COOO). The heads may delegate this decision to a committee.

Specification and updating of required products and software on the products/code server is also a joint responsibility of the Collaboration and CD. Here again, the heads of the COOO are responsible for coordinating this activity, and may delegate this responsibility to a committee.

The publication of the Desktop System Guidelines is also a joint responsibility; however, the CDF Department has the authority to amend the content of the guidelines at any time in order to expedite requisitions and the setup of systems.

The heads of the COOO are also responsible for making changes in the support model or Desktop guidelines as necessary. For example, if workgroup clusters cannot be supported by the TF with the planned personnel allotment to this project, then the heads of the COOO will be responsible for making the decision to drop this form of support and issue new guidelines for support to the collaboration. Again, the heads of the COOO may delegate this task or seek advice from a committee including collaboration members.

Estimated Effort
----------------
Please see the Appendix for estimates of the current effort used to support 200 CDF desktop systems. The estimated effort to support CDF Desktop systems in the future depends on the number of desktops, the number of workgroup clusters, the level of customization, and the level of support.

It is likely that the existing IRIX desktops will be converted into Linux desktops within a year or two. In addition, there are 34 VMS workstations in the CDF trailers, currently supported by PPD, that will likely also be converted into Linux desktops in the next year or two and will then have to be supported by the TF. There are also about 80 x-terminals, some of which may be converted to Linux boxes. A new trailer and a new building adjacent to the CDF trailers, planned to be complete in summer 2001, will have space for roughly 100 Linux desktops. Thus, it is likely the TF will have to support a total of roughly 350 desktops, which is a 75% increase in the number of desktops they now support. With improved efficiency through better account management systems it is hoped that we could do this without an increase in the support or consulting personnel.
However, it is probably more realistic to assume that the TF will need to increase the level of desktop support by around 50%, which would require another FTE.

Each workgroup cluster currently costs an extra 1 day per month (0.05 FTE per cluster) above the normal support workload for a non-clustered desktop. This is due largely to the instability of NFS mounts of disks for the cluster. The TF is developing guidelines for workgroup clusters to reduce this effort, but until these are in place we assume that each added workgroup cluster will cost roughly this amount. It could therefore cost another 2 FTEs to support an anticipated 40 workgroup clusters in the trailers. This amount of additional effort will likely not be available. Since support of workgroup clusters is lower in priority than support of individual desktops, support of workgroup clusters by the TF cannot be assured. The probability that workgroup clusters are supported by the TF can be enhanced by the adoption of task force guidelines, and workgroup clusters will be required to adhere to the guidelines in order to receive support.

With the exception of the code/products server, the CDF desktop systems are currently supported 5 days a week for 8 hours per day. We do not foresee the need for an increased level of support that would require additional personnel.

CDF Trailers Data Analysis Model
--------------------------------
The desktop systems in the CDF trailers are intended for personal productivity, communications with central and remote systems, code development and testing, generation of small Monte Carlo samples, and rapid user analysis of highly compressed datasets, including ntuples. The desktop systems are not a substitute for the CDF central systems in reducing and analyzing large data samples. Sufficient bandwidth will not be available between the CDF Trailers and FCC to copy "large" datasets (>1 TB); therefore hardware to support TB RAID arrays, for example, will not be supported.
Similarly, the trailer computing environment is not intended to provide large SMPs to be shared by many users, and therefore only single and dual processor architectures are currently supported.

Desktop System Guidelines
-------------------------
Guidelines for the specification of desktop systems were roughly outlined in the ``Report and Recommendations of the Run II Trailer Computing Committee'', CDF5241, January 26, 2000. CDF collaborators may obtain the most recent set of detailed guidelines by consulting with the CDF Task Force in the CDF Computing and Analysis Department.

-------------------------------------------
Appendix: Current effort on Desktop Systems
-------------------------------------------
As of October 2000 we employ two FTEs in the TF whose sole responsibility is support and consulting for 150 Linux desktops, 50 IRIX desktops, and four workgroup clusters. There is roughly an additional 1.25 FTE spread over 5 personnel in the department involved in hardware and software consulting, hardware transport, purchasing, security, TF management and other issues. A rough breakdown of this current effort, which does not necessarily reflect future responsibilities, is presented in the table below.

Personnel         Role                                           Effort
----------------  ---------------------------------------------  --------
Mark Schmitz      Desktop Support & Consulting                   1 FTE
Jason Harrington  Desktop Support & Consulting                   1 FTE
Rick Colombo      TF Leader: Requisitions, Support & Consulting  0.6 FTE
Randy Herber      Desktop Consulting                             0.4 FTE
Richard Jetton    Desktop Consulting                             0.1 FTE
Glenn Cooper      Desktop Consulting                             0.05 FTE
Robert Harris     Department Head                                0.1 FTE
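
The arithmetic behind the figures quoted in the Estimated Effort section and in this appendix can be cross-checked with a short script. The following is only a minimal sketch in Python: the numeric inputs are the figures quoted in this document, while the function and variable names, the assumption of about 20 working days per month, and the x-terminal count derived from the "roughly 350" total are illustrative and are not part of the agreement.

```python
# Figures quoted in the document (Estimated Effort section and Appendix).
CURRENT_LINUX = 150          # Linux PCs supported today
CURRENT_IRIX = 50            # IRIX workstations, expected to become Linux
VMS_WORKSTATIONS = 34        # VMS workstations expected to become Linux
NEW_SPACE_DESKTOPS = 100     # space in the new trailer and building
PROJECTED_TOTAL = 350        # "roughly 350" desktops
ANTICIPATED_CLUSTERS = 40    # anticipated workgroup clusters
CLUSTER_DAYS_PER_MONTH = 1   # extra effort per cluster

# Assumption (not from the document): about 20 working days per month.
WORKING_DAYS_PER_MONTH = 20

# Effort table from the Appendix, in FTE.
APPENDIX_EFFORT_FTE = {
    "Mark Schmitz": 1.0,
    "Jason Harrington": 1.0,
    "Rick Colombo": 0.6,
    "Randy Herber": 0.4,
    "Richard Jetton": 0.1,
    "Glenn Cooper": 0.05,
    "Robert Harris": 0.1,
}


def current_desktops():
    """Desktops supported today (Linux plus IRIX)."""
    return CURRENT_LINUX + CURRENT_IRIX


def fractional_increase(current, projected):
    """Relative growth from the current to the projected desktop count."""
    return (projected - current) / current


def implied_xterminal_conversions():
    """X-terminal conversions implied by the 'roughly 350' total (derived)."""
    known = current_desktops() + VMS_WORKSTATIONS + NEW_SPACE_DESKTOPS
    return PROJECTED_TOTAL - known


def fte_per_cluster():
    """Extra effort per cluster, converted from days per month to FTE."""
    return CLUSTER_DAYS_PER_MONTH / WORKING_DAYS_PER_MONTH


def cluster_overhead_fte(clusters):
    """Total extra effort, in FTE, for a given number of workgroup clusters."""
    return clusters * fte_per_cluster()


if __name__ == "__main__":
    growth = fractional_increase(current_desktops(), PROJECTED_TOTAL)
    print(f"Current desktops:         {current_desktops()}")
    print(f"Projected increase:       {growth:.0%}")
    print(f"Implied x-terminals:      {implied_xterminal_conversions()}")
    print(f"Effort per cluster (FTE): {fte_per_cluster():.2f}")
    print(f"Cluster overhead (FTE):   {cluster_overhead_fte(ANTICIPATED_CLUSTERS):.2f}")
    print(f"Appendix total (FTE):     {sum(APPENDIX_EFFORT_FTE.values()):.2f}")
```

Under these assumptions the script reproduces the 75% increase, the 0.05 FTE per cluster, the 2 FTEs for 40 clusters, and a table total of 3.25 FTE (two dedicated FTEs plus the additional 1.25 FTE).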