A New Linux Support Model For DESY
==================================

Knut Woller
-IT- systems group
Revised 09 Feb 2000

The -IT- systems group recognises that the current Linux support policy issued in November 1998 is insufficient and requires modifications to suit the growing demand for personal workstations located on the user's desktop. Starting from the status quo, this paper describes a roadmap towards an open Linux support model which allows user contributions and modifications.

The scope of this paper is system installation and maintenance as provided by the -IT- Systems Group. It is assumed that application support and end user consulting will be kept at least at the current level.

Introduction
------------

At the end of 1999, the existing Linux support model draws a line between the 'work group server', which is mostly identical to previous work group servers based on commercial unix platforms, and the 'Linux desktop', which inherited much of the work group server's setup, but has some serious configuration limitations, not all of which can be justified with technical boundary conditions.

Hardly any criticism has been brought up against the Linux work group server. However, compared to other platforms, the Linux implementations of NIS (YP), AFS, and NFS proved less robust and less performant and often caused downtimes and extra work [see appendix A].
Acceptance of the Linux 'desktop' suffers, in addition to the above, from features it inherits from the 'work group server' and the limitations imposed on it, e.g.:

- the end user has no root password and cannot perform even trivial administration tasks
- applications installed in AFS cannot be overridden locally
- NFS export and import of disks is not allowed
- X11 comes with an outdated default setup and depends heavily on the central font service, which causes temporary screen freezes
- desktop disk controller is IDE only, no SCSI support in kernel
- peripheral device support is minimal (ZIP, CD-RW, printer, ...)
- installation requires synchronous action at -IT- and can be delayed if specific persons are absent

Still, about 280 installations have been done to date, about half of which are 'desktops'. The 'desktop' model has been successfully adopted by some groups, while others turned it down and continued installing PCs themselves. These systems are sometimes set up insecurely and maintained poorly, and a few have led to security breaches. Recent incidents involve hacker intrusions, password sniffing, accidental loss of complete file systems, and accidental 'denial of service' for a considerable number of other users.

Today, PC based 'work group servers' and 'desktops' have almost identical hardware. In many cases, a new 'desktop' PC has a more powerful CPU than 'work group servers' available elsewhere in the user's group. This leads to load distributions and data movements which are not foreseen in the general DESY computing model. On the other hand, treating all 'desktops' as 'work group servers' and allowing access to all services for anyone cannot work in the current model since vital network centric services (backup, file service, OSM, ...) and support manpower would fail to scale with hundreds of PCs.
The ongoing upgrade of the DESY networking infrastructure increases the effective bandwidth available at the user's desk by one to two orders of magnitude. The switched network also allows for better control, remote fault isolation, and protection of individual devices. Hence, it is time to rethink the support policy.

The considerations made above imply that the current discrimination between Linux 'work group servers' and 'desktops' is largely artificial and not based on usage profiles. In fact, the whole idea of a work group server is losing contour because services are increasingly focused on dedicated machines, and the growing number of computers reduces the number of users per PC.

The Vision
----------

Imagine: You are a group administrator and you are planning to get a new Linux PC for one of your co-workers. If the DESY Standard seems unsuited for some reason, you consult the systems group's list of supported hardware (or ask them in person) and order the PC. As usual, you ask the network segment administrator to create a DNS entry for the new PC.

When the PC is delivered, you enter its hardware and configuration data into a web form. Starting from a base configuration (which can be a clone of an existing system), you use a menu to add services and packages as required and submit it. Bootp record, netgroup entry, and a system profile for the installation server are generated automatically and instantly, and you receive a notification when this has been done successfully.

With the same form, you can subscribe to an automatic update service which e.g. fixes security problems on your system as soon as they become known by installing an appropriate patch, or updates packages installed locally. If you don't want this, you can choose to be notified when patches or updates become available.

With your standard boot floppy, you start setting up the PC.
Since you created the profile already, everything is now automatic and after half an hour, the system is ready to use, including AFS, user registry, mail, printing, and X11 configuration. An online troubleshooting guide will help you if the automated procedure fails at any point.

Setting up farms or clusters is done analogously. 10 PCs can be installed in less than one hour and without even attaching a console monitor and keyboard. Adding more identical PCs later is done by simply cloning the setup description from existing ones.

To install additional packages from the installation server later, you revisit your web form and mark the required packages. This also allows you to install every package which comes with the Linux distribution and also to install software locally which otherwise resides in the AFS repositories. After submitting, the update is run from the installation server.

If your system disk crashes for whatever reason, the preconfigured setup can be reinstalled on a new disk by simply booting from a boot floppy. Unless local modifications were made which are unknown to the server, you will receive an identical setup.

To upgrade your system with a new release, you revisit the web form and clone the old setup. You are told which packages changed or became unavailable and you make your adaptations if required. Since you tell the server that this is an upgrade, system configurations like partition tables, ssh host keys, automounter maps, X11 setup, and the like will be restored after installation.

Similar to the above, it will be possible to install Linux without AFS, just like you would from CD-ROM, or to start with an AFS-based installation on non-standard hardware. This will typically require local modifications and tuning afterwards, but the base installation will be perfectly reproducible using the same mechanisms and configuration tools.

End of vision, start of work: What are the required ingredients for this model, and how do we get there?
The rest of this paper will give an outline of implementation details, develop a roadmap, and identify technical boundary conditions which require extra work or involve extra cost to scale with our intentions.

Supported Base Configurations
-----------------------------

The terms 'work group server' and 'desktop' have their own history at DESY and imply a biased functionality. PCs with large disks on the user's desk as well as Linux farms don't fit into this scheme. I want to avoid this bias and, for the rest of this paper, will distinguish between Linux based 'servers', which provide any kind of service to other machines or users, and 'personal workstations'. I will try to define the obvious differences of these base configurations, while keeping an eye on the sliding scale between both.

-IT- wants to address all existing points of criticism. We want to continue 'server' support at the known level which also is available for other Unix platforms. The 'personal workstation' includes the current 'desktop' model, but can be extended far beyond that if the user is willing to share responsibility. Technical limitations will be made transparent where they influence the support strategy to avoid misunderstandings. Available manpower should not influence the support concept itself, but timelines will depend on it. We hope that it will be possible to allocate the required manpower for the final model.

We suggest distinguishing two base classes of Linux computers by accessibility, both physically and in terms of network and control connections. This implies the following services:

-----------------+-----------------------------+---------------------------
                 | Server                      | Personal Workstation
-----------------+-----------------------------+---------------------------
location         | central, CC preferred       | user's desktop
hardware         | dedicated (incl. SMP)       | DESY standard PC preferred
network          | as fast as required         | standard network
passwords        | central registry            | central registry
homedirs         | in AFS                      | in AFS
applications     | in AFS                      | in AFS
updates          | by IT, can be automatic     | automatic or manual by user
monitoring       | system, resources, services | (optional)
operating        | 'hands on' by -IT- staff    | by user
alarms           | optional 7x24 service       | ./.
batch            | optional                    | ./.
ADSM backup      | optional                    | ./.
OSM access       | optional                    | ./.
NFS data mounts  | to central servers          | ./.
console          | virtual console (inside CC) | local monitor and keyboard
UPS power        | yes (inside CC)             | ./.
air condition    | yes (inside CC)             | ./.
theft protection | yes (inside CC)             | ./.
-----------------+-----------------------------+---------------------------

These models are the outer edges of a scale of centrally supported Linux computers at DESY. Since we expect large numbers of both configurations in PC farms and on the desktop, we would like to see the majority of installations fit into these models with as few modifications as possible. However, individual adaptations will be possible after starting with one of these. It must be clear that central support may be limited then. This may even be desired.

The 'server' class includes the existing 'work group servers', PC farms, and dedicated machines (e.g. web or file servers) located in the computer center. Other locations are possible, but the level of service will be reduced according to what the location permits in terms of network and accessibility. If additional infrastructure (high speed network, cooling, UPS, ...) at external locations is required, -IT- can only provide this at the user group's expense. Such installations need prior negotiation to make sure that the required central services can be scaled appropriately [see appendix B].

The 'personal workstation' class includes the present 'desktop' systems, but allows for a wider range of configurations and delegated responsibility for the user.
In addition to the above, its features are:

- The end user can have root access and can act as his own computer's administrator. [1]
- There should not be local user accounts. [2]
- Installation of additional software is possible. [3]
- System modifications are possible after installation, but no support can be granted in case of conflicts with central services. [4]
- Dependence on central services can be reduced if requested. [5]
- Root access for -IT- personnel is possible for monitoring and updates. [6]
- There is no explicit support for multi-boot configurations. [7]

[1] Technically, we suggest doing this with 'sudo' so that the group administrator still has control over the root password.

[2] Conflicts between local user IDs and AFS users must be avoided and controlled by technical means. A 'reserved' ID range will be provided. Local users cannot receive mail since all AFS computers are IMAP clients to mail.desy.de, which only serves centrally registered accounts.

[3] The preferred mechanism of distributing software across several computers is still an AFS volume. In order to install software to /usr/local, this path must be taken out of AFS. This has major impact on the application support at DESY and the unix computing model.

[4] In particular, system monitoring and update services may break and would then be discontinued, leaving the user on his own. Reproducibility of the installation will only be possible for the part which is configured on the server.

[5] This targets specifically the font service, which can be replaced by local fonts to speed up application startup and improve performance. However, certain applications may not run when their fonts are unavailable.

[6] Any kind of central monitoring and updating requires root access, and remote administration tools depend on it. Automated administration is easier if the PC runs around the clock, and the current scheme requires it. Asynchronous mechanisms are desirable, but still have to be implemented.
[7] The standard installation will repartition the system disk completely; a second disk will not be touched. Hence, multi-boot systems are possible if there are several disks or if the user is willing to partition manually. Since the number of possible multiboot configurations is large and the problems can be subtle, the user is left on his own (and the HowTo documents) for everything that is beyond the pure OS. -IT- discourages multiboot systems with network centric installations. We suggest using VMware instead, which will be offered along with the standard installation.

Caveats
-------

-IT- reserves the right to disconnect individual PCs from WAN traffic if security requirements are not met due to local modifications of the OS.

-IT- reserves the right to disconnect PCs from the local network if they fail to behave in accordance with the 'DESY netiquette', i.e. in particular if individual PCs hinder other users beyond the tolerable level.

Installation
------------

-IT- provides and maintains an installation server with a high degree of automation for both 'servers' and 'personal workstations'. There will be a standard configuration available for both classes which can be extended by additional packages available from the installation server. A suitable interface shall be provided by IT. [*]

-IT- currently provides ready-to-use kernels for three configurations:

- Single CPU box with IDE boot disk (e.g. the DESY Standard PC)
- Single CPU servers with SCSI only disks
- SMP servers with SCSI only disks

All PCs must be registered prior to installation so that required netgroup and equipment database entries are available. -IT- aims at providing an end user interface which eliminates the synchronous activities on -IT- side so that installations can proceed faster and do not require presence of specific individuals.
[*]

Linux releases shall be provided in a timely manner to keep up with recent software developments, but the number of releases per year should not exceed two. Releases will have defined properties and life cycles, i.e. support will only be granted for a limited amount of time after new releases become available.

For each PC, the individual installation shall be reproducible at any time, provided that software installations done by the user or group administrator use appropriate mechanisms which allow the package information to be stored on the installation server. This includes group specific configurations like automounter maps or host specific data like ssh keys. [*]

End user documentation and release notes will be made available on the -IT- web pages. This includes package version information and a list of supported hardware. [*]

Updates of installed systems can be provided automatically. We intend to define update classes (e.g. 'kernel', 'security', 'application') to which the user can subscribe at different levels (e.g. 'automatic', 'notification', 'never'). This way, the user can decide on his own whether a specific update will be done. It must be clear that there may be updates which must be installed in order to keep a fully supported system. [*]

NOTE: All items marked with [*] require additional work or development effort compared to the current situation. Their implementation therefore depends on available manpower and the priorities defined by DESY.

User Support
------------

Groups which operate Linux AFS PCs as 'personal workstations' must nominate a Linux Technical Administrator (group administrator). He is the first contact for -IT- personnel in case of problems with one of his PCs. He is the first contact for his users. He should be able to solve their everyday problems and relay larger ones to the responsible contact at IT.
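
The update classes and subscription levels proposed in the Installation section could be modelled roughly as follows. This is only a sketch: the class and level names are the examples given in the text, while the `Subscription` type, the `mandatory` flag, the `action_for` function, and the default level for unset classes are hypothetical choices introduced here for illustration.

```python
from dataclasses import dataclass, field

# Example update classes and subscription levels named in the text.
UPDATE_CLASSES = ("kernel", "security", "application")
LEVELS = ("automatic", "notification", "never")


@dataclass
class Subscription:
    """Per-PC choice of a level for each update class (hypothetical type)."""
    levels: dict = field(default_factory=dict)

    def set_level(self, update_class, level):
        if update_class not in UPDATE_CLASSES:
            raise ValueError(f"unknown update class: {update_class}")
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        self.levels[update_class] = level


def action_for(sub, update_class, mandatory=False):
    """Decide what happens when an update of update_class is released.

    Returns 'install', 'notify', or 'skip'. An update flagged as mandatory
    is never skipped silently; the user is at least notified, since some
    updates are required to keep a fully supported system.
    ASSUMPTION: classes without an explicit choice default to 'notification'.
    """
    level = sub.levels.get(update_class, "notification")
    if level == "automatic":
        return "install"
    if level == "never" and not mandatory:
        return "skip"
    return "notify"


if __name__ == "__main__":
    sub = Subscription()
    sub.set_level("security", "automatic")
    sub.set_level("kernel", "never")
    for cls in UPDATE_CLASSES:
        print(cls, "->", action_for(sub, cls))
```

Keeping the decision in one small function makes the rule for mandatory updates explicit: a 'never' subscription suppresses optional updates only.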
All group administrators and users who administer their 'personal workstations' will be added to the mailing list linux-admins@desy.de so that -IT- can keep them informed about updates and patches. This list can also serve as discussion forum. The newsgroup desy.linux can be used for site wide information and is open to all users at DESY.

We will hold a periodic Linux user meeting so that -IT- can receive direct feedback from the users and discuss further developments.

End user consulting will be done by the user consulting office (UCO) at the same level as it is done for current 'work group server' users.

No consulting other than based on good will is provided for any Linux installations outside this support model. The abundance of Linux distributions and hardware configurations exceeds what can be met with reasonable efforts.

Application Support
-------------------

We assume that application support for Linux can be kept at a coherent level with other unix flavours at DESY. Experience shows that this can lead to significant delays in the rollout of a fully supported and centrally maintained system. While Linux is maturing and changes between releases become less dramatic, we expect the rollout process to become faster for future versions. As in the past, software which is not yet available in AFS can be installed locally as long as it is contained in the Linux distribution.

Roadmap
-------

1. Policy changes (effective immediately)

The support policy paper from November 1998 states a number of 'must' and 'must not' statements and binds access to the installation server to fulfillment of these conditions. This can be given up immediately since exceptions have been made in the past already.

2. Open installation server (January 2000)

In principle, the installation server can already be used by every PC at DESY. In addition, ftp installations were possible from the anonymous ftp repository on x4u2.
The first method was not used due to lack of documentation, and the latter suffered from disk space limitations on x4u2. The plan is to write a HowTo-document for NFS based standalone installations at DESY which enables the Linux-literate user to install a custom system via NFS without the DESY extensions. This should reduce the need to buy SuSE CDs for many groups. To allow this, the installation server has been upgraded with enough disk space to hold about five releases in parallel. Packages which improve security on such standalone installations will be made available and documented.

3. Reproducible installations (summer 2000?)

Ideally, every aspect of an installation should be reproducible so that in case of reinstallations (after a disk crash, hack, theft, or for other reasons) an identical setup can be provided in minimum time. This is currently the case for the base installation as stored on the server, and even there past setups cannot be exactly reproduced due to package updates on the server.

The next step would be to take regular snapshots of the installation information on the computer. This includes the list of packages installed, ssh keys, adsm keys, rc.config, automounter maps, fstab, passwd, shadow and the like. Storage and restoration of these is difficult to automate because care must be taken that detected changes were intended and not accidental or even the result of a hack. Obviously, strict security mechanisms (e.g. PGP encryption) and a version history are required on the repository side. Mechanisms must still be developed, but some building blocks are available already.

4. Add-on building blocks (stepwise within 2000)

For registered PCs, it will be possible to install additional DESY specific extensions even if the automatic installation mechanism is not used and the PC is set up without AFS. -IT- will provide documentation on the individual products and their dependencies.
E.g., it should be possible to install the DESY printing environment on non-AFS PCs.

5. Self Service Model (end 2000?)

The next step towards reproducible setups would be that the user chooses the profile for his PC prior to installation using a suitable (e.g. WWW) interface. This profile would then be stored on the installation server in the form of an info file and a package selection list. Using this interface consistently would result in a perfectly reproducible and well documented system setup. The required interface would resemble a 'remote YaST' and must be written from scratch. Development effort is estimated to be several man months.

If this interface provided strong authentication, authorised group admins would be able to create the required netgroup and equipment database entries on the fly and would no longer have to wait for -IT- personnel to do the registration.

Appendix
--------

A - technical Linux issues
--------------------------

Being the first HEP institute that provided official Linux support from a central installation server, DESY decided to use the SuSE distribution in 1997. In cooperation with the company, a highly automated installation service had been set up already when the rest of the HEP community went towards RedHat Linux. Moving to RedHat has not been an issue for DESY so far because both distributions use identical package installation mechanisms (RPM) and are binary compatible as long as comparable releases are used. Up to now, automatic installation and internationalisation of the OS seem superior in SuSE Linux. Changing the base distribution would be possible if HEP-wide collaboration required it. With the Linux Standard Base on the horizon, this is becoming less and less likely.
-IT- is not using Linux computers for central servers yet due to technical deficiencies compared to commercial Unices in areas which are critical in the DESY environment:

AFS: The AFS client running on systems with 2.0.x Kernels (SuSE 5.x) is AFS 3.3a, a port done by Derek Atkins at MIT. There is no support from Transarc, multiprocessor (SMP) operation was not guaranteed, and there is no statement concerning Y2K compliance. Official Transarc support is available with AFS 3.5 for 2.2.x Kernels since April 1999. DESY has a working and bug-free module with SMP-support since end of August 1999. This is now available in the SuSE 6.x based DESY systems.

NFS: The Linux implementation is NFS V2; there is little activity towards NFS V3 for Linux, and up to now it is only available as a set of Kernel patches which collide with Transarc's AFS binaries. The client suffers from performance problems, the Linux NFS servers running at DESY drop exports on a timescale of days to weeks, and implementation bugs have caused file corruptions even on read only file systems at DESY after client crashes. Clusters with heavy cross mounting (theol, hasyl) are therefore perceived as unstable by the users.

NIS (YP): The DESY user and host registry is based on NIS. A hanging NIS client results in an unusable system (login impossible). The standard NIS client ypbind-3.3 has known bugs (see the NIS-Howto) and can hang itself when its NIS server is restarted or on network interruptions, requiring manual restart afterwards. On some DESY systems, this happened several times per week. Ypbind depends crucially on the Kernel implementation of RPC services, which has known weaknesses in libc5. Improvements are being made in libc6, and an alternate ypbind-mt has been made available by SuSE with enhanced features (server probing, reconnecting), which has now become the default for DESY Linux systems and seems to solve most YP issues.

File systems: Linux still lacks a journaling file system.
This means that after a crash all file systems must be recovered with fsck, which takes about one minute per gigabyte of disk space in sequential operation. Large disks lead to significant startup delays which are usually intolerable for central servers.

PC farms vs. big SMP computers: Replacing an SMP computer like an SGI Challenge XL by a Linux PC farm or even distributed PCs is attractive if costs per CPU cycle are considered as main criterion. However, system performance highly depends on I/O throughput between CPU and disk, CPU and RAM, or even between CPUs. While a high end multiprocessor computer has a backplane bandwidth of many GB/s and disk access with 20..80 MB/s, a fast Ethernet interconnect between farm PCs will saturate below 10 MB/s. Physical distribution of PCs leads to inaccessibility and makes a central 'hands on' service impossible. This should be considered during acquisition because it may limit the usability of the PC for the intended purpose. While CPU power is cheap, data access can be expensive, and there are valid reasons to stick with big server machines in some areas until Linux has caught up with their technology.

B - DESY boundary conditions
----------------------------

File service: The DESY Unix model is tightly wrapped around AFS as enterprise wide file system for user home directories and binary service. The current setup of AFS servers allows for regular volume sizes of 500 MB. Larger home or group directories can be implemented as multiple volumes. Exceptions of higher volume have been made, for the price of reduced manageability of the AFS servers. DESY's AFS cell can be scaled to larger volume sizes by major hardware investments into server computers and raid systems. Corresponding investigations have been started already.

Network: Major parts of DESY have been equipped with structured cabling and switched networks in 1999 (Bldg. 1, 2, TTF), with more to follow.
This network infrastructure provides a guaranteed bandwidth of 10 Mbit/s at each end device, redundant connections between switch and computer center, improved fault tolerance and fault isolation. Where this infrastructure is not available, PCs have to share a bandwidth of 10 Mbit/s with other PCs (Linux and NT) and X-terminals. This forbids any high bandwidth services like NFS data mounts in those locations. PCs should be set up and used with great care on those subnets, otherwise X-terminals on the same segment may become unusable.

Even inside the computer center, data access for PCs can be limited due to network type incompatibilities. If, e.g., a PC with Fast Ethernet accesses an SGI through its HIPPI interface, the SGI is forced to send 1.5 kB packets instead of the native 64 kB of HIPPI. This can lead to HIPPI queue overflows, packet losses, and overall network lag. We have seen fast ethernet NICs with as little as 100 kB/s throughput. Up to now, several man weeks have been spent on analysing and improving PC farm throughput. The lesson learned is that ideally, PCs should live in a homogeneous (fast and gigabit) ethernet world.

Backup: Linux 'servers' with high bandwidth network connection can have part of their disks backed up by ADSM following the established work group computing model. Backup of 'personal workstations' is limited to AFS filespace. With the current server capacity, local disks cannot be included in the ADSM backup. Major investments in ADSM servers, tape robot capacity, network, and support would be required for backups of any kind of desktop systems at DESY.

OSM: The nature of the OSM system limits the number of concurrent I/O streams since it performs direct sequential tape I/O. Each read or write request physically blocks one drive, and only one process can access a specific tape at any time. The current system does not scale to a large number of clients.
Access must be limited to 'servers' only, and even these may require careful manual tuning of the network setup to allow adequate throughput on the network interface. Development work which may improve this situation in the long run has been started in collaboration with Fermilab.

Operating: 'Hands on' operating directly depends on available manpower. The current situation at DESY only allows for remote service for any computer outside the computer center.

C - cost considerations
-----------------------

While PC hardware prices are still decreasing, central setup and services require considerable additional investments:

- rack with ventilation (ca DM 3000)        ca DM   300 / PC
- virtual console (serial line)             ca DM   300 / PC
- network, typically fast ethernet          ca DM   250 / PC
- computer center floor space               ca DM 10000 / m^2

optional services include:

- console commander interface               ca DM   900 / PC
- batch system license, typically LSF       ca DM   600 / PC
- ADSM backup space (depending on data volume)
- UPS power capacity
- Management overhead for installation, alarm, monitoring, service, ...

The total cost of installing a PC inside the computer center can easily exceed the PC's hardware costs if low level boxes are used. Multiprocessor PCs should therefore be preferred for servers now that AFS support for SMP is officially available and the Linux SMP kernel has become sufficiently stable.

As a low end 'desktop', PC hardware can be cheaper than most dedicated desktop devices. However, while the latter (e.g. the new SUNray) are mostly plug and play and can be supported in almost arbitrary numbers, PCs (regardless of OS) require a great deal of work for installation, maintenance, and monitoring. PCs have higher bandwidth requirements in the network and can give rise to security breaches which can become very expensive in terms of time and money invested. Being widely usable, they are more prone to theft than any other desktop device. Cheap commercial components favour hardware failures.
Replacing a personalised system is much harder than exchanging a 'stupid' terminal. Restoring the previous state after disk failure can be time consuming or even impossible. The total cost of ownership for a desktop PC is considerable. This must be taken into account when a decision for a 'desktop' device is made.
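
The per-PC figures from appendix C can be added up in a rough sketch like the one below. Only the DM amounts marked as coming from appendix C are taken from the text. The floor space attributed to one PC and the PC hardware price are hypothetical inputs chosen for illustration, and the function name `central_cost_per_pc` is likewise an assumption.

```python
# Rough per-PC cost of housing a Linux PC in the computer center,
# using the approximate DM figures listed in appendix C.

# Central setup costs per PC (from appendix C, all approximate).
BASE_COSTS_DM = {
    "rack with ventilation": 300,
    "virtual console (serial line)": 300,
    "network (fast ethernet)": 250,
}

# Optional services per PC that have a stated price (from appendix C).
OPTIONAL_COSTS_DM = {
    "console commander interface": 900,
    "batch system license (LSF)": 600,
}

FLOOR_SPACE_DM_PER_M2 = 10000  # from appendix C


def central_cost_per_pc(floor_m2_per_pc, options=()):
    """Return the central setup cost in DM for one PC.

    floor_m2_per_pc is an ASSUMPTION supplied by the caller: the paper
    gives only the price per square metre, not the area one PC occupies.
    options is an iterable of keys from OPTIONAL_COSTS_DM.
    """
    cost = sum(BASE_COSTS_DM.values())
    cost += FLOOR_SPACE_DM_PER_M2 * floor_m2_per_pc
    cost += sum(OPTIONAL_COSTS_DM[name] for name in options)
    return cost


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the paper.
    pc_price_dm = 3000
    floor_share_m2 = 0.1

    total = central_cost_per_pc(floor_share_m2, options=OPTIONAL_COSTS_DM)
    print(f"central cost per PC: DM {total:.0f}")
    print(f"exceeds hardware price: {total > pc_price_dm}")
```

ADSM backup space, UPS capacity, and management overhead are left out because the text gives no figures for them, so the result is a lower bound under the stated assumptions.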