6.5 NPACI Rocks
NPACI Rocks clustering software leverages RedHat's Kickstart utility to manage the software and configuration of all nodes. It is built around the observation that real clusters contain many node types (hereafter referred to as "appliance types" or "appliances"). Rocks decomposes the configuration of each appliance into several small, single-purpose package and configuration modules. Further, all site- and machine-specific information is managed in an SQL (MySQL) database that serves as the single "oracle" of cluster-wide information.
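As a rough illustration of the kind of information kept in this database, a lookup of all compute appliances might resemble the following query; the table and column names here are simplified placeholders rather than the actual Rocks schema.

-- Illustrative only: simplified names, not the real Rocks schema.
-- Fetch identity and network information for every compute appliance.
SELECT n.name, n.mac, n.ip
FROM   nodes n
JOIN   appliances a ON n.appliance = a.id
WHERE  a.name = 'compute';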
The Rocks configuration modules can be easily shared between cluster nodes and, more importantly, between cluster sites. For example, a single module describes the software components and configuration of the ssh service. Cluster appliance types that require ssh are built with this module. The configuration is completely transferable, as is, to all Rocks clusters.
In Rocks, a single object-oriented framework is used to build the configuration/installation module hierarchy, so that multiple cluster appliances are constructed from the same core software and configuration description. This framework is composed of XML files and a Python engine that converts the component descriptions of an appliance into a RedHat-compliant Kickstart file.
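The following fragment is a toy sketch of this idea, not the actual Rocks engine; it ignores the graph traversal, the DTD-defined entities, and the database variable substitution, and simply collects the package and post elements from a list of module files into the corresponding Kickstart sections.

# Toy sketch only: emit %packages and %post from module XML files.
import sys
import xml.etree.ElementTree as ET

def expand(module_files):
    packages, post_scripts = [], []
    for path in module_files:
        root = ET.parse(path).getroot()       # the <kickstart> element
        packages += [p.text.strip() for p in root.iter("package")]
        post_scripts += ["".join(p.itertext()) for p in root.iter("post")]
    return "\n".join(["%packages"] + packages + ["", "%post"] + post_scripts)

if __name__ == "__main__":
    # Example usage: python expand.py base.xml ssh.xml > ks.cfg
    sys.stdout.write(expand(sys.argv[1:]) + "\n")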
Anaconda is RedHat's installer and interprets Kickstart files. A Kickstart file describes everything that must be done, from disk partitioning, through package installation, to the final post- or site-configuration needed to create a completely functional node.
Figure 6.3 presents a sample Kickstart file. It has three sections: command, packages, and post. The command section contains the answers to almost all of the questions posed by an interactive installation (e.g., location of the distribution, disk partitioning parameters, and language support). The packages section lists the names of the RedHat packages (RPMs) to be installed on the machine. Finally, the post section contains scripts that are run during the installation to further configure the installed packages. The post section is the most complicated, because this is where site-specific customization is done. Rocks, for example, does not repackage available software—it simply provides a mechanism to easily apply the needed post-configuration.
url --url http://10.1.1.1/install/i386
zerombr yes
clearpart --all
part / --size 4096
lang en_US
keyboard us
mouse genericps/2
timezone --utc GMT
skipx
install
reboot
%packages
@Base
pdksh
%post
cat > /etc/motd << EOF
Kickstarted on `date`
EOF
Figure 6.3: Basic RedHat Kickstart file. The RedHat installer, Anaconda, interprets the contents of the Kickstart file to build a node.
While a Kickstart file is a text-based description of all the software packages and software configuration to be deployed on a node, it is both static and monolithic. At best, this requires separate files for each appliance type; at worst, it requires a separate file for each host. The overwhelming advantage of Kickstart is that it provides a de facto standard for installing software, performing the system probing required to install and configure the correct device drivers, and automating the selection of these drivers on a per-machine basis. A Kickstart file itself is therefore quite generic with respect to hardware. (See Chapter 20 and the Jazz cluster for a real-world experience of the necessity of having test hardware.)
Description mechanisms for other distributions and operating systems exist and include SuSE's YaST (and YaST2), Debian's FAI (Fully Automatic Installation), and Sun Solaris JumpStart. The structure of each of these text descriptions is actually quite similar, because the same problems of hardware probing, software installation, and software post-configuration must be solved. The specifics of package naming, partitioning commands, and other details, however, differ considerably among these methods.
6.5.1 Component-based configuration
The key functionality missing from Kickstart that would make it the only installation tool needed for clusters is a macro language and a framework for code reuse. A macro language would improve the programmability of Kickstart, and code reuse significantly ameliorates the problem of software skew across appliances by making configuration that is shared among appliance types truly shared (instead of being copies that require vigilance to keep in sync).
Rocks uses the concept of package and configuration modules as building blocks for creating entire appliances. Rocks modules are small XML files that encapsulate package names and post-configuration into logical "chunks" of functionality. Rocks uses XML because de facto standard software exists for parsing it.
Once the functionality of a system is broken into small, single-purpose modules, a framework describing the inheritance model is used to derive the full functionality of complete systems, each of which shares a common base configuration. Figure 6.4 is a representation of such a framework, which describes the configuration of all appliances in a Rocks cluster. The framework is a directed graph: each vertex represents the configuration for a specific service (software package(s), service configuration, local machine configuration, etc.), and relationships between services are represented with edges. At the top of the graph there are four vertices that indicate the configuration of a "laptop", "desktop", "frontend", and "compute" cluster appliance.

Figure 6.4: Description (Kickstart) Graph. This graph completely describes all of the appliances of a Rocks Cluster.
When a node is built using Rocks, the Kickstart file for that particular node is generated and customized on the fly by starting at an appliance entry vertex and traversing the graph. The modules (XML files) are parsed, and customization data is read from the Rocks SQL database. Figure 6.5 shows some detail of the configuration graph. Two appliance types are illustrated here: standalone and node. Both share everything that is contained in the base module and hence will be identically installed and configured for everything in base and the modules below it. In this example, a module called c-development is attached only to standalone. With this type of construction it is quite easy to see (and therefore focus on) the differences between appliances.

Figure 6.5: Description Graph Detail. This illustrates how two modules, 'standalone.xml' and 'base.xml', share base configuration and also differ in other specifics.
It is interesting to note that the interconnection graph is kept in a file separate from the modules themselves. This means that if a user wants the c-development module to be part of the base installation, one simply makes that change in the graph file by attaching c-development to the base module. Also, as shown in Figure 6.5, edges can be annotated with an architecture type (i386 and ia64 in this example). This allows the same generic structure to describe appliances across significant architectural boundaries. Real differences, such as the grub (for ia32) and elilo (for ia64) boot loaders, can be teased out without completely replicating all of the configuration.
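A hedged sketch of what such a graph file could look like follows; the edge notation mirrors the Rocks convention of connecting modules with from/to attributes, but the exact element names and file layout vary between Rocks versions.

<?xml version="1.0" standalone="no"?>
<graph>
  <!-- both appliance types inherit everything under base -->
  <edge from="standalone" to="base"/>
  <edge from="node" to="base"/>
  <!-- compiler tools are attached only to standalone appliances -->
  <edge from="standalone" to="c-development"/>
  <!-- architecture-specific boot loaders hang off of base -->
  <edge from="base" to="grub" arch="i386"/>
  <edge from="base" to="elilo" arch="ia64"/>
</graph>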
6.5.2 Graph Components
In an earlier section, it was stated that image-based systems put the bulk of their configuration into creating an image, while description-based methods put the bulk of their configuration into the description (e.g., Kickstart) file. In Rocks, the modules are small XML files with a simple structure, as illustrated in Figures 6.6 and 6.7.
<?xml version="1.0" standalone="no"?>
<!DOCTYPE kickstart SYSTEM "dtds/node.dtd"
[<!ENTITY ssh "openssh">]>
<kickstart>
<package>&ssh;</package>
<package>&ssh;-clients</package>
<package>&ssh;-server</package>
<package>&ssh;-askpass</package>
<!-- Required for X11 Forwarding -->
<package>XFree86</package>
<package>XFree86-libs</package>
<post>
<!-- default client setup -->
cat > /etc/ssh/ssh_config << 'EOF'
Host *
CheckHostIP no
ForwardX11 yes
ForwardAgent yes
StrictHostKeyChecking no
UsePrivilegedPort no
FallBackToRsh no
Protocol 1,2
EOF
</post>
</kickstart>
Figure 6.6: The ssh.xml module includes the ssh packages and configures the service in the Kickstart post section.
<?xml version="1.0" standalone="no"?>
<!DOCTYPE kickstart SYSTEM "dtds/node.dtd">
<kickstart>
<main>
<lang><var name="Kickstart_Lang"/></lang>
<keyboard><var name="Kickstart_Keyboard"/></keyboard>
<mouse><var name="Kickstart_Mouse"/></mouse>
<timezone><var name="Kickstart_Timezone"/></timezone>
<rootpw>--iscrypted <var name="RootPassword"/></rootpw>
<install/>
<reboot/>
</main>
</kickstart>
Figure 6.7: The 'base.xml' module configures the main section of the Kickstart file.
Figure 6.6 shows the "ssh" module in the graph. The single purpose of this module is to describe the packages and configuration associated with the installation of the ssh service and client on a machine. The package and post XML tags map directly to Kickstart keywords. Figure 6.7 shows how global settings such as the root password and mouse selection can be described in a similar way; the var elements are filled in with values read from the Rocks database when the Kickstart file is generated. Rocks also provides options for partitioning hard drives that range from a fully automated scheme (which works on IDE, SCSI, and RAID arrays) to completely manual (administrator-controlled) partitioning. The real advantage here is that the ssh configuration policy is specified once instead of being replicated across all appliance types.
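To make the end result concrete, the fragment below is an illustration (not actual Rocks output) of roughly what the base.xml and ssh.xml modules would contribute to a generated Kickstart file once the var references have been filled in from the database; the encrypted root password is shown only as a placeholder, and the post section is abbreviated.

lang en_US
keyboard us
mouse genericps/2
timezone --utc GMT
rootpw --iscrypted <encrypted value from the database>
install
reboot

%packages
openssh
openssh-clients
openssh-server
openssh-askpass
XFree86
XFree86-libs

%post
cat > /etc/ssh/ssh_config << 'EOF'
Host *
CheckHostIP no
ForwardX11 yes
...
EOF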
6.5.3 Putting it all together
Rocks uses the graph structure to create description files for appliances. In the background is a MySQL database that holds cluster-wide configuration information. When a node requests an IP address, a DHCP server on the head node replies with a filename tag that contains a URL for the node's Kickstart file (a sketch of such a dhcpd.conf entry appears after the list below). The node contacts the web server, and a CGI script is run that (1) looks up the node and appliance type in the database and (2) traverses and expands the graph for that appliance and node type to dynamically create the Kickstart file. Once the description is downloaded, the native installer takes over: it downloads packages from the location specified in the Kickstart file, installs them, performs the specified post-installation tasks, and then reboots the node. Rocks uses the same structure to bootstrap a head node, except that the Kickstart generation framework and Linux distribution are held on the local boot CD and interactive screens gather the local information. In summary, we annotate the generic installation steps with the steps that Rocks takes:
1. Install Head Node—boot the Rocks-augmented CD.
2. Configure Cluster Services on Head Node—done automatically in step 1.
3. Define Configuration of a Compute Node—a basic setup is installed; the graph or node modules can be edited to customize further.
4. For each compute node, repeat:
   (a) Detect Ethernet Hardware Address of New Node—use the insert-ethers tool.
   (b) Install complete OS onto new node—Kickstart.
   (c) Complete Configuration of new node—already described in the Kickstart file.
   (d) Restart Services on head node that are cluster-aware (e.g., PBS, Sun Grid Engine)—part of insert-ethers.
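For reference, the DHCP reply mentioned above might correspond to a dhcpd.conf host entry along these lines; the host name, hardware address, fixed address, and CGI path are illustrative placeholders rather than the exact entry Rocks generates.

host compute-0-0 {
    hardware ethernet 00:11:22:33:44:55;
    fixed-address 10.1.255.254;
    option host-name "compute-0-0";
    filename "http://10.1.1.1/install/sbin/kickstart.cgi";
}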
The key features of Rocks are that it is RedHat-specific, uses descriptions to build appliances, leverages the RedHat installer to perform hardware detection, and takes hardware with no installed OS to an operating cluster in a short period of time. The description files are almost completely hardware independent, which allows Beowulfs built from heterogeneous physical nodes to be handled as easily as those built from homogeneous nodes.