High Performance Linux Clusters with OSCAR, Rocks, OpenMosix, and MPI

Joseph D. Sloan

4.2 Configuring Services


Once you have the basic
installation completed, you'll need to configure the
system. Many of the tasks are no different for machines in a cluster
than for any other system. For other tasks, being part of a cluster
impacts what needs to be done. The following subsections describe the
issues associated with several services that require special
considerations. These subsections briefly recap how to configure and
use these services. Remember, most of this will be done for you if
you are using a package like OSCAR or Rocks. Still, it helps to
understand the issues and some of the basics.


4.2.1 DHCP


Dynamic
Host Configuration Protocol (DHCP) is used to supply network
configuration parameters, including IP addresses, host names, and
other information to clients as they boot. With clusters, the head
node is often configured as a DHCP server and the compute nodes as
DHCP clients. There are two reasons to do this. First, it simplifies
the installation of compute nodes since the information DHCP can
supply is often the only thing that is different among the nodes.
Since a DHCP server can handle these differences, the node
installation can be standardized and automated. A second advantage of
DHCP is that it is much easier to change the configuration of the
network. You simply change the configuration file on the DHCP server,
restart the server, and reboot each of the compute nodes.
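
For example, on a Red Hat-style system with the DHCP server installed as a service, restarting it after a configuration change might look like the following (a sketch; the service name dhcpd is typical, but check your distribution):

[root@fanny root]# /sbin/service dhcpd restart
[root@fanny root]# /sbin/chkconfig dhcpd on

The second command simply ensures that the server is started again whenever the head node reboots.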

The basic installation is rarely a problem. The DHCP system can be
installed as a part of the initial Linux installation or after Linux
has been installed. The DHCP server configuration file, typically
/etc/dhcpd.conf, controls the information
distributed to the clients. If you are going to have problems, the
configuration file is the most likely source.

The DHCP configuration file may be created or changed automatically
when some cluster software is installed. Occasionally, the changes
may not be done optimally or even correctly so you should have at
least a reading knowledge of DHCP configuration files. Here is a
heavily commented sample configuration file that illustrates the
basics. (Lines starting with "#"
are comments.)

# A sample DHCP configuration file.
# The first commands in this file are global,
# i.e., they apply to all clients.
# Only answer requests from known machines,
# i.e., machines whose hardware addresses are given.
deny unknown-clients;
# Set the subnet mask, broadcast address, and router address.
option subnet-mask 255.255.255.0;
option broadcast-address 172.16.1.255;
option routers 172.16.1.254;
# This section defines individual cluster nodes.
# Each subnet in the network has its own section.
subnet 172.16.1.0 netmask 255.255.255.0 {
    group {
        # The first host, identified by the given MAC address,
        # will be named node1.cluster.int, will be given the
        # IP address 172.16.1.1, and will use the default router
        # 172.16.1.254 (the head node in this case).
        host node1 {
            hardware ethernet 00:08:c7:07:68:48;
            fixed-address 172.16.1.1;
            option routers 172.16.1.254;
            option domain-name "cluster.int";
        }
        host node2 {
            hardware ethernet 00:08:c7:07:c1:73;
            fixed-address 172.16.1.2;
            option routers 172.16.1.254;
            option domain-name "cluster.int";
        }
        # Additional node definitions go here.
    }
}
# For servers with multiple interfaces, this entry says to ignore requests
# on specified subnets.
subnet 10.0.32.0 netmask 255.255.248.0 { not authoritative; }

As shown in this example, you should include a subnet section for
each subnet on your network. If the head node has an interface for
the cluster and a second interface connected to the Internet or your
organization's network, the configuration file will
have a group for each interface or subnet. Since the head node should
answer DHCP requests for the cluster but not for the organization,
DHCP should be configured so that it will respond only to DHCP
requests from the compute nodes.


4.2.2 NFS


A network
filesystem is a filesystem that physically resides on one computer
(the file server), which in turn shares its files over the network
with other computers on the network (the clients). The best-known and
most common network filesystem is the Network File System (NFS). In setting up a
cluster, designate one computer as your NFS server. This is often the
head node for the cluster, but there is no reason it has to be. In
fact, under some circumstances, you may get slightly better
performance if you use different machines for the NFS server and head
node. Since the server is where your user files will reside, make
sure you have enough storage. This machine is a likely candidate for
a second disk drive or RAID array and a fast I/O subsystem. You may
even want to consider mirroring the filesystem using a small
high-availability cluster.

Why use NFS? It should come as no surprise that for parallel
programming you'll need a copy of the compiled code
or executable on each machine on which it will run. You could, of
course, copy the executable over to the individual machines, but this
quickly becomes tiresome. A shared filesystem solves this problem.
Another advantage of NFS is that all the files you will be working
on will be on the same system. This greatly simplifies backups. (You
do backups, don't you?) A shared filesystem also
simplifies setting up SSH, as it eliminates the need to distribute
keys. (SSH is described later in this chapter.) For this reason, you
may want to set up NFS before setting up SSH. NFS can also play an
essential role in some installation strategies.

If you have never used NFS before, setting up the client and the
server are slightly different, but neither is particularly difficult.
Most Linux distributions come with most of the work already done for
you.


4.2.2.1 Running NFS

Begin with the server; you won't get anywhere with
the client if the server isn't already running. Two
things need to be done to get the server running. The file
/etc/exports must be edited to specify which
machines can mount which directories, and then the server software
must be started. Here is a single line from the file
/etc/exports on the server amy:

/home    basil(rw) clara(rw) desmond(rw) ernest(rw) george(rw)

This line gives the clients basil,
clara, desmond,
ernest, and george
read/write access to the directory /home on the
server. Read access is the default. A number of other options are
available and could be included. For example, the
no_root_squash option could be added if you want
to edit root permission files from the nodes.
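
For instance, a variant of the line shown above with this option added might look like the following (just a sketch; adjust the host list and options to match your own cluster):

/home    basil(rw,no_root_squash) clara(rw,no_root_squash) desmond(rw,no_root_squash)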


Pay particular attention to the use of spaces in this file.

Had a space been inadvertently included between
basil and (rw), read access
would have been granted to basil and read/write
access would have been granted to all other systems. (Once you have
the systems set up, it is a good idea to use the command
showmount -a to see who is mounting what.)

Once /etc/exports has been edited,
you'll need to start NFS. For testing, you can use
the service command as shown here:

[root@fanny init.d]# /sbin/service nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS mountd: [ OK ]
Starting NFS daemon: [ OK ]
[root@fanny init.d]# /sbin/service nfs status
rpc.mountd (pid 1652) is running...
nfsd (pid 1666 1665 1664 1663 1662 1661 1660 1657) is running...
rpc.rquotad (pid 1647) is running...

(With some Linux distributions, when restarting NFS, you may find it
necessary to explicitly stop and restart both
nfslock and portmap as
well.) You'll want to change the system
configuration so that this starts automatically when the system is
rebooted. For example, with Red Hat, you could use the
serviceconf or chkconfig
commands.
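
For example, with chkconfig you might enable NFS and its supporting services at boot like this (a sketch assuming Red Hat-style init scripts; the exact service names can vary with your distribution):

[root@fanny root]# /sbin/chkconfig portmap on
[root@fanny root]# /sbin/chkconfig nfslock on
[root@fanny root]# /sbin/chkconfig nfs on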

For the client, the software is probably already running on your
system. You just need to tell the client to mount the remote
filesystem. You can do this several ways, but in the long run, the
easiest approach is to edit the file /etc/fstab,
adding an entry for the server. Basically,
you'll add a line to the file that looks something
like this:

amy:/home    /home    nfs    rw,soft    0 0

In this example, the local system mounts the
/home filesystem located on
amy as the /home directory
on the local machine. The filesystems may have different names. You
can now manually mount the filesystem with the mount command:

[root@ida /]# mount /home

When the system reboots, this will be done automatically.

When using NFS, you should keep a couple of things in mind. The mount
point, /home, must exist on the client prior to
mounting. While the remote directory is mounted, any files that were
stored on the local system in the /home
directory will be inaccessible. They are still there; you just
can't get to them while the remote directory is
mounted. Next, if you are running a firewall, it will probably block
NFS traffic. If you are having problems with NFS, this is one of the
first things you should check.
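
For example, on a system using iptables you might list the current rules and, if you decide the cluster network is trusted, allow its traffic through. (This is only a sketch; 172.16.1.0/24 stands in for whatever subnet your cluster actually uses, and opening an entire subnet may not be appropriate at your site.)

[root@fanny root]# /sbin/iptables -L -n
[root@fanny root]# /sbin/iptables -I INPUT -s 172.16.1.0/24 -j ACCEPT
[root@fanny root]# /sbin/service iptables save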

File ownership can also create some surprises. User and group IDs
should be consistent among systems using NFS, i.e., each user will
have identical IDs on all systems. Finally, be aware that root
privileges don't extend across NFS shared systems
(if you have configured your systems correctly). So if, as root, you
change the directory (cd) to a remotely mounted
filesystem, don't expect to be able to look at every
file. (Of course, as root you can always use su
to become the owner and do all the snooping you want.) Details for
the syntax and options can be found in the
nfs(5), exports(5),
fstab(5), and mount(8)
manpages. Additional references can be found in Appendix A.


4.2.2.2 Automount

The preceding discussion of NFS describes editing the
/etc/fstab file to mount filesystems.
There's another alternative: using an automount
program such as autofs or
amd. An automount daemon mounts a remote
filesystem when an attempt is made to access the filesystem and
unmounts the filesystem when it is no longer needed. This is all
transparent to the user.

While the most common use of automounting is to automatically mount
floppy disks and CD-ROMs on local machines, there are several
advantages to automounting across a network in a cluster. You can
avoid the problem of maintaining consistent
/etc/fstab files on dozens of machines.
Automounting can also lessen the impact of a server crash. It is even
possible to replicate a filesystem on different servers for
redundancy. And since a filesystem is mounted only when needed,
automounting can reduce network traffic. We'll look
at a very simple example here. There are at least two different
HOWTOs (http://www.tldp.org/) for
automounting should you need more information.

Automounting originated at Sun Microsystems, Inc. The Linux
automounter autofs, which mimics
Sun's automounter, is readily available on most
Linux systems. While other automount programs are available, most
notably amd, this discussion will be limited to
using autofs.

Support for autofs must be compiled into the
kernel before it can be used. With most Linux releases, this has
already been done. If in doubt, use the following to see if it is
installed:

[root@fanny root]# cat /proc/filesystems
...

Somewhere in the output, you should see the line

nodev   autofs

If you do, you are in business. Otherwise, you'll
need a new kernel.

Next, you need to configure your systems. autofs
uses the file /etc/auto.master to determine
mount points. Each line in the file specifies a mount point and a map
file that defines which filesystems will be mounted to the mount
point. For example, in Rocks the auto.master
file contains the single line:

/home auto.home --timeout 600

In this example, /home is the mount point, i.e.,
where the remote filesystem will be mounted. The file
auto.home specifies what will be mounted.

In Rocks, the file /etc/auto.home will have
multiple entries such as:

sloanjd  frontend.local:/export/home/sloanjd

The first field is the name of the subdirectory that will be created
under the original mount point. In this example, the directory
sloanjd will be mounted as a subdirectory of
/home on the client system. The subdirectories
are created dynamically by automount and should not exist on the
client. The second field is the hostname (or server) and directory
that is exported. (Although not shown in this example, it is possible
to specify mount parameters for each directory in
/etc/auto.home.) NFS should be running and you
may need to update your /etc/exports file.

Once you have the configuration files copied to each system, you need
to start autofs on each system.
autofs is usually located in
/etc/init.d and accepts the commands
start, restart,
status, and reload. With Red
Hat, it is available through the /sbin/service
command. After reading the file, autofs starts
an automount process with appropriate parameters for each mount point
and mounts filesystems as needed. For more information see the
autofs(8) and
auto.master(5) manpages.
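
For example, starting autofs by hand and arranging for it to start at boot might look like this (a sketch assuming Red Hat-style service scripts):

[root@fanny root]# /sbin/service autofs start
[root@fanny root]# /sbin/chkconfig autofs on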


4.2.3 Other Cluster Filesystems


NFS has its limitations. First, there are potential security issues.
Since the idea behind NFS is sharing, it should come as no surprise
that over the years crackers have found ways to exploit NFS. If you
are going to use NFS, it is important that you use a current version,
apply any needed patches, and configure it correctly.

Also, NFS does not scale well, although there seems to be some
disagreement about its limitations. For clusters with fewer than 100
nodes, NFS is probably a reasonable choice. For clusters with more
than 1,000 nodes, NFS is generally thought to be inadequate. Between
100 and 1,000 nodes, opinions seem to vary. This will depend in part
on your hardware. It will also depend on how your applications use
NFS. For a bioinformatics cluster, many of the applications will be
read intensive. For a graphics processing cluster, rendering
applications will be write intensive. You may find that NFS works
better with the former than the latter. Other applications will have
different characteristics, each stressing the filesystem in a
different way. Ultimately, it comes down to what works best for you
and your applications, so you'll probably want to do
some experimenting.

Keep in mind that NFS is not meant to be a high-performance, parallel
filesystem. Parallel filesystems are designed for a different
purpose. There are other filesystems you could consider, each with
its own set of characteristics. Some of these are described briefly
in Chapter 12.
Additionally, there are other storage technologies such as storage
area network (SAN) technology. SANs offer greatly improved filesystem
failover capabilities and are ideal for use with high-availability
clusters. Unfortunately, SANs are both expensive and difficult to set
up. iSCSI (SCSI over IP) is an emerging technology to watch.

If you need a high-performance, parallel filesystem, PVFS is a
reasonable place to start, as it is readily available for both Rocks
and OSCAR. PVFS is discussed in Chapter 12.


4.2.4 SSH


To run
software across a cluster, you'll need some
mechanism to start processes on each machine. In practice, a
prerequisite is the ability to log onto each machine within the
cluster. If you need to enter a password for each machine each time
you run a program, you won't get very much done.
What is needed is a mechanism that allows logins without passwords.

This boils down to two choices: you can use remote shell (RSH) or
secure shell (SSH). If you are a trusting soul,
you may want to use RSH. It is simpler to set up and has less overhead.
On the other hand, SSH network traffic is encrypted, so it is safe
from snooping. Since SSH provides greater security, it is generally
the preferred approach.

SSH provides mechanisms to log onto remote machines, run programs on
remote machines, and copy files among machines. SSH is a replacement
for ftp, telnet,
rlogin, rsh, and
rcp. A commercial version of SSH is available
from SSH Communications Security (http://www.ssh.com), a company founded by
Tatu Ylönen, an original developer of SSH. Or you can go
with OpenSSH, an open source version from http://www.openssh.org.

OpenSSH is the easiest since it is already included with most Linux
distributions. It has other advantages as well. By default, OpenSSH
automatically forwards the DISPLAY variable. This
greatly simplifies using the X Window System across the cluster. If
you are running an SSH connection under X on your local machine and
execute an X program on the remote machine, the X window will
automatically open on the local machine. This can be disabled on the
server side, so if it isn't working, that is the
first place to look.

There are two sets of SSH protocols, SSH-1 and SSH-2. Unfortunately,
SSH-1 has a serious security vulnerability. SSH-2 is now the protocol
of choice. This discussion will focus on using OpenSSH with SSH-2.

Before setting up SSH, check to see if it is already installed and
running on your system. With Red Hat, you can check to see what
packages are installed using the package manager.

[root@fanny root]# rpm -q -a | grep ssh
openssh-3.5p1-6
openssh-server-3.5p1-6
openssh-clients-3.5p1-6
openssh-askpass-gnome-3.5p1-6
openssh-askpass-3.5p1-6

This particular system has the SSH core package, both server and
client software as well as additional utilities. The SSH daemon is
usually started as a service. As you can see, it is already running
on this machine.

[root@fanny root]# /sbin/service sshd status
sshd (pid 28190 1658) is running...

Of course, it is possible that it wasn't started as
a service but is still installed and running. You can use
ps to double check.

[root@fanny root]# ps -aux | grep ssh
root 29133 0.0 0.2 3520 328 ? S Dec09 0:02 /usr/sbin/sshd
...

Again, this shows the server is running.

With some older Red Hat installations, e.g., the 7.3 workstation,
only the client software is installed by default.
You'll need to manually install the server software.
If using Red Hat 7.3, go to the second install disk and copy over the
file
RedHat/RPMS/openssh-server-3.1p1-3.i386.rpm.
(Better yet, download the latest version of this software.) Install
it with the package manager and then start the service.

[root@james root]# rpm -vih openssh-server-3.1p1-3.i386.rpm
Preparing... ########################################### [100%]
1:openssh-server ########################################### [100%]
[root@james root]# /sbin/service sshd start
Generating SSH1 RSA host key: [ OK ]
Generating SSH2 RSA host key: [ OK ]
Generating SSH2 DSA host key: [ OK ]
Starting sshd: [ OK ]

When SSH is started for the first time, encryption keys for the
system are generated. Be sure to set this up so that it is done
automatically when the system reboots.

Configuration files for both the server,
sshd_config, and client,
ssh_config, can be found in
/etc/ssh, but the default settings are usually
quite reasonable. You shouldn't need to change these
files.


4.2.4.1 Using SSH

To log onto a remote machine, use the command ssh
with the name or IP address of the remote machine as an argument. The
first time you connect to a remote machine, you will receive a
message with the remote machine's
fingerprint, a string that identifies the
machine. You'll be asked whether to proceed or not.
This is normal.

[root@fanny root]# ssh amy
The authenticity of host 'amy (10.0.32.139)' can't be established.
RSA key fingerprint is 98:42:51:3e:90:43:1c:32:e6:c4:cc:8f:4a:ee:cd:86.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'amy,10.0.32.139' (RSA) to the list of known hosts.
root@amy's password:
Last login: Tue Dec 9 11:24:09 2003
[root@amy root]#

The fingerprint will be recorded in a list of known hosts on the
local machine. SSH will compare fingerprints on subsequent logins to
ensure that nothing has changed. You won't see
anything else about the fingerprint unless it changes. Then SSH will
warn you and query whether you should continue. If the remote system
has changed, e.g., if it has been rebuilt or if SSH has been
reinstalled, it's OK to proceed. But if you think
the remote system hasn't changed, you should
investigate further before logging in.

Notice in the last example that SSH automatically uses the same
identity when logging into a remote machine. If you want to log on as
a different user, use the -l option with the
appropriate account name.

You can also use SSH to execute commands on remote systems. Here is
an example of using date remotely.

[root@fanny root]# ssh -l sloanjd hector date
sloanjd@hector's password:
Mon Dec 22 09:28:46 EST 2003

Notice that a different account, sloanjd, was
used in this example.

To copy files, you use the scp command. For
example,

[root@fanny root]# scp /etc/motd george:/root/
root@george's password:
motd 100% |*****************************| 0 00:00

Here file /etc/motd was copied from fanny to the /root
directory on george.

In the examples thus far, the system has asked for a password each
time a command was run. If you want to avoid this,
you'll need to do some extra work.
You'll need to generate a pair of authorization keys
that will be used to control access and then store these in the
directory ~/.ssh. The
ssh-keygen command is used to generate keys.

[sloanjd@fanny sloanjd]$ ssh-keygen -b1024 -trsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/sloanjd/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/sloanjd/.ssh/id_rsa.
Your public key has been saved in /home/sloanjd/.ssh/id_rsa.pub.
The key fingerprint is:
2d:c8:d1:e1:bc:90:b2:f6:6d:2e:a5:7f:db:26:60:3f sloanjd@fanny
[sloanjd@fanny sloanjd]$ cd .ssh
[sloanjd@fanny .ssh]$ ls -a
. .. id_rsa id_rsa.pub known_hosts

The options in this example are used to specify a 1,024-bit key and
the RSA algorithm. (You can use DSA instead of RSA if you prefer.)
Notice that SSH will prompt you for a passphrase, basically a
multi-word password.

Two keys are generated, a public and a private key. The private key
should never be shared and resides only on the client machine. The
public key is distributed to remote machines. Copy the public key to
each system you'll want to log onto, renaming it
authorized_keys2.

[sloanjd@fanny .ssh]$ cp id_rsa.pub authorized_keys2
[sloanjd@fanny .ssh]$ chmod go-rwx authorized_keys2
[sloanjd@fanny .ssh]$ chmod 755 ~/.ssh

If you are using NFS, as shown here, all you need to do is copy and
rename the file in the current directory. Since that directory is
mounted on each system in the cluster, it is automatically available.


If you used the NFS setup described earlier, root's
home directory, /root, is not shared. If you want
to log in as root without a password, manually copy the public keys
to the target machines. You'll need to decide
whether you feel secure setting up the root account like this.
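
One way to copy a public key by hand might look like the following sketch, run as root on the head node and assuming you have already generated a key pair as root. The node name basil and the temporary file name are only examples, and you will be prompted for root's password on the node one last time:

[root@fanny root]# scp /root/.ssh/id_rsa.pub basil:/tmp/fanny_root.pub
[root@fanny root]# ssh basil "mkdir -p /root/.ssh; cat /tmp/fanny_root.pub >> /root/.ssh/authorized_keys2; chmod 700 /root/.ssh; chmod 600 /root/.ssh/authorized_keys2; rm /tmp/fanny_root.pub"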

You will use two utilities supplied with SSH to manage the login
process. The first is an SSH agent program that caches private keys,
ssh-agent. This program stores the keys locally
and uses them to respond to authentication queries from SSH clients.
The second utility, ssh-add, is used to manage
the local key cache. Among other things, it can be used to add, list,
or remove keys.

[sloanjd@fanny .ssh]$ ssh-agent $SHELL
[sloanjd@fanny .ssh]$ ssh-add
Enter passphrase for /home/sloanjd/.ssh/id_rsa:
Identity added: /home/sloanjd/.ssh/id_rsa (/home/sloanjd/.ssh/id_rsa)

(While this example uses the $SHELL variable, you
can substitute the actual name of the shell you want to run if you
wish.) Once this is done, you can log in to remote machines without a
password.

This process can be automated to varying degrees. For example, you
can add the call to ssh-agent as the last line
of your login script so that it will be run before you make any
changes to your shell's environment. Once you have
done this, you'll need to run
ssh-add only when you log in. But you should be
aware that Red Hat console logins don't like this
change.
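
For example, you might put something like the following at the very end of ~/.bash_profile (a sketch; adjust it for your shell, and keep the console-login caveat just mentioned in mind):

# Last line of the login script: start a subshell under ssh-agent.
# Run ssh-add once in that subshell to cache the private key.
ssh-agent $SHELL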

You can find more information by looking at the
ssh(1), ssh-agent(1), and
ssh-add(1) manpages. If you want more details on
how to set up ssh-agent, you might look at SSH, The Secure Shell
by Barrett and Silverman (O'Reilly, 2001). You can also find scripts on the
Internet that will set up a persistent agent so that you
won't need to rerun ssh-add
each time.


One last word of warning: If you are using
ssh-agent, it becomes very important that you
log off whenever you leave your machine. Otherwise,
you'll be leaving not just one system wide open, but
all of your systems.


4.2.5 Other Services and Configuration Tasks


Thus far, we have taken a minimalist approach. To make life easier,
there are several other services that you'll want to
install and configure. There really isn't anything
special that you'll need to do; just
don't overlook these.


4.2.5.1 Apache

While
an HTTP server may seem unnecessary on a cluster, several cluster
management tools such as Clumon and Ganglia use HTTP to display
results. If you will monitor your cluster only from the head node,
you may be able to get by without installing a server. But if you
want to do remote monitoring, you'll need to install
an HTTP server. Since most management packages like these assume
Apache will be installed, it is easiest if you just go ahead and set
it up when you install your cluster.
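
For example, you can check whether Apache is present and arrange for it to start at boot with something like the following (a sketch assuming Red Hat-style service scripts, where the service is named httpd):

[root@fanny root]# /sbin/service httpd status
[root@fanny root]# /sbin/chkconfig httpd on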


4.2.5.2 Network Time Protocol (NTP)

It is
important to have synchronized clocks on your cluster, particularly
if you want to do performance monitoring or profiling. Of course, you
don't have to synchronize your system to the rest of
the world; you just need to be internally consistent. Typically,
you'll want to set up the head node as an NTP server
and the compute nodes as NTP clients. If you can, you should sync the
head node to an external timeserver. The easiest way to handle this
is to select the appropriate option when you install Linux. Then make
sure that the NTP daemon is running:

[root@fanny root]# /sbin/service ntpd status
ntpd (pid 1689) is running...

Start the daemon if necessary.
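
On a compute node, the relevant part of /etc/ntp.conf can be as simple as pointing at the head node. The following is only a sketch; substitute your head node's cluster name or address, and note that the driftfile path shown is the Red Hat default:

# Minimal /etc/ntp.conf for a compute node (a sketch):
# sync only to the cluster's head node.
server fanny                  # head node; use its cluster name or IP address
driftfile /etc/ntp/drift

Restart ntpd after editing the file (/sbin/service ntpd restart).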


4.2.5.3 Virtual Network Computing (VNC)

This is a
very nice package that allows remote graphical logins to your system.
It is available as a Red Hat package or from http://www.realvnc.com/. VNC can be tunneled
using SSH for greater security.
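
For example, to reach a VNC server already running as display :1 on amy, you might tunnel the connection through SSH like this (a sketch; TCP port 5901 corresponds to display :1):

[sloanjd@fanny sloanjd]$ ssh -L 5901:localhost:5901 amy
[sloanjd@fanny sloanjd]$ vncviewer localhost:1

Run the viewer in a second window while the SSH session stays open; the VNC traffic then travels over the encrypted tunnel.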


4.2.5.4 Multicasting

Several
clustering utilities use multicasting to distribute data among nodes
within a cluster, either for cloning systems or when monitoring
systems. In some instances, multicasting can greatly increase
performance. If you are using a utility that relies on multicasting,
you'll need to ensure that multicasting is
supported. With Linux, multicasting must be enabled when the kernel
is built. With most distributions, this is not a problem.
Additionally, you will need to ensure that an appropriate multicast
entry is included in your route tables. You will also need to ensure
that your networking equipment supports multicast. This
won't be a problem with hubs; this may be a problem
with switches; and, should your cluster span multiple networks, this
will definitely be an issue with routers. Since networking equipment
varies significantly from device to device, you need to consult the
documentation for your specific hardware. For more general
information on multicasting, you should consult the multicasting
HOWTOs.
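
For example, to add a route covering the multicast address range and confirm that it is present, you might do something like the following (a sketch; eth0 stands for whichever interface faces the cluster):

[root@fanny root]# /sbin/route add -net 224.0.0.0 netmask 240.0.0.0 dev eth0
[root@fanny root]# /sbin/route -n | grep 224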


4.2.5.5 Hosts file and name services

Life will be much simpler in the long
run if you provide appropriate name services. NIS is certainly one
possibility. At a minimum, don't forget to edit
/etc/hosts for your cluster. At the very least,
this will reduce network traffic and speed up some software. And some
packages assume it is correctly installed. Here are a few lines from
the host file for amy:

127.0.0.1               localhost.localdomain localhost
10.0.32.139 amy.wofford.int amy
10.0.32.140 basil.wofford.int basil
...

Notice that amy is not included on the line with
localhost. Specifying the host name as an alias
for localhost can break some software.

