High Performance Linux Clusters with OSCAR, Rocks, OpenMosix, and MPI

Joseph D. Sloan

10.1 C3


Cluster Command and Control (C3) is a set of about a dozen command-line utilities used to execute common management tasks. These commands were designed to provide a look and feel similar to that of issuing commands on a single machine.[1] The commands are secure and scale reliably. Each command is actually a Python script. C3 was developed at Oak Ridge National Laboratory and is freely available.

[1] A Python/Tk GUI known as C2G has also been developed.



10.1.1 Installing C3


There
are two ways C3 can be installed. With the basic install,
you'll do a full C3 installation on a single
machine, typically the head node, and issue commands on that machine.
With large clusters, this can be inefficient because that single
machine must communicate with each of the other machines in the
cluster. The alternate approach is referred to as a scalable
installation. With this method, C3 is installed on all the machines
and the configuration is changed so that a tree structure is used to
distribute commands. That is, commands fan out through intermediate
machines and are relayed across the cluster more efficiently. Both
installations begin the same way; you'll just need
to repeat the installation with the scalable install to alter the
configuration file. This description will stick to the simple
install. The distribution includes a file, README.scale, that describes the scalable installation.

Since the C3 tools are scripts, there is very little to do to install
them. However, since they rely on several other common packages and
services, you will need to be sure that all the prerequisites are
met. On most systems this won't be a problem;
everything you'll need will already be in place.

Before
you can install C3, make sure that rsync, Perl, SSH, and Python are
installed on your system and available. Name resolution, either
through DNS or a host file, must be available as well. Additionally,
if you want to use the C3 command cpushimage, SystemImager must be installed.
Installing SystemImager is discussed in Chapter 8.

Once you have met the prerequisites, you can download, unpack, and
install C3. To download it, go to http://www.csm.ornl.gov/torc/C3/ and follow
the link to the download page. You can download sources or an RPM
package. In this example, sources are used. If you install from RPMs,
install the full install RPM and profile RPM on servers and the
client RPM on clients. Note that with the simple installation you
only need to install C3 on the head node of your cluster. However,
you will need SSH and the like on every node.

Once you have unpacked the software and read the README files, you
can run the install script Install-c3.

[root@fanny src]# gunzip c3-4.0.1.tar.gz
[root@fanny src]# tar -xvf c3-4.0.1.tar
[root@fanny src]# cd c3-4.0.1
[root@fanny c3-4.0.1]# ./Install-c3

The install script will copy the scripts to
/opt/c3-4 (for Version 4 at least), set paths,
and install man pages. There is nothing to compile.

The next step is creating a configuration file. The default
file is /etc/c3.conf. However, you can use other
configuration files if you wish by explicitly referencing them in C3
commands using the -f option with the file name.
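For example, assuming a hypothetical alternate file /etc/c3_test.conf, a C3 command such as cexec might reference it along these lines:

[root@fanny root]# cexec -f /etc/c3_test.conf hostname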

Here is a very simple configuration file:

cluster local {
fanny.wofford.int
george.wofford.int
hector.wofford.int
ida.wofford.int
james.wofford.int
}

This example shows a configuration for a single cluster. In fact, the
configuration file can contain information on multiple clusters. Each
cluster will have its own cluster description block, which begins
with the identifier cluster followed by a name for the cluster. The name can be used in C3 commands to identify a specific cluster if you have multiple cluster description blocks.
Next, the machines within the cluster are listed within curly braces.
The first machine listed is the head node. To remove ambiguity, the
head node entry can consist of two parts separated by a colon: the head node's external interface to the left of the colon and the head node's internal interface to the right of the colon. (Since
fanny has a single interface, that format was
not appropriate for this example.) The head node is followed by the
compute nodes. In this example, the compute nodes are listed one per
line. It is possible to specify a range. For example, node[01-64] would specify 64 machines with the names node01, node02, etc.
The cluster definition block is closed with another curly brace. Of
course, all machine names must resolve to IP addresses, typically via
the /etc/hosts file. (The commands cname and cnum, described later in this section, can help you sort out node indices.)
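To illustrate the head node syntax and ranges together, a configuration block along the following lines (all host names here are purely illustrative) would describe a head node with an external and an internal interface plus 64 compute nodes:

cluster big {
external.example.com:head.internal
node[01-64]
}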

Within the compute node list, you can also use the qualifiers exclude and dead. exclude applies to ranges and immediately follows a range specification. dead applies to individual machines and precedes the machine name. For example,

node[1-64]
exclude 60
alice
dead bob
carol

In this list node60 and bob are designated as being unavailable. Starting with Version 3 of C3, it is possible to use ranges in C3 commands to restrict actions to just those machines within the range. The order of the machines in the configuration file determines their numerical position within the range. In the example, the 67 machines defined have list positions 0 through 66. If you deleted bob from the file instead of marking it as dead, carol's position would change from 66 to 65, which could cause confusion. By using exclude and dead, you effectively remove a machine from a cluster without renumbering the remaining machines.
dead can also be used with a dummy machine to
switch from 0-indexing to 1-indexing. For example, just add the
following line to the beginning of the machine list:

dead place_holder

Once this is done, the index of every machine in the list increases by one, so numbering effectively begins at 1. For more details on the configuration file, see the c3.conf(5) and c3-scale(5) manpages.
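As a sketch, adding the placeholder to the earlier example would look like the following (the placement shown here, just ahead of the compute nodes, assumes the head node is not counted in the numbering); george then becomes node 1 rather than node 0:

cluster local {
fanny.wofford.int
dead place_holder
george.wofford.int
hector.wofford.int
ida.wofford.int
james.wofford.int
}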

Once you have created your configuration file, there is one last
thing you need to do before C3 is ready to go. For the command
ckill to work properly, the Perl script
ckillnode must be installed on each individual
machine. Fortunately, the rest of C3 is installed and functional, so
you can use it to complete the installation. Just issue these
commands:

[root@fanny root]# cexec mkdir /opt/c3-4
************************* local *************************
--------- george.wofford.int---------
...
[root@fanny root]# cpush /opt/c3-4/ckillnode
building file list ... building file list ... building file list ... building
file list ... done
...

The first command makes the directory /opt/c3-4
on each machine in your cluster and the second copies the file
ckillnode to each machine. You should see a fair
amount of output with each command. If you are starting SSH manually,
you'll need to start it before you try
this.


10.1.2 Using C3 Commands


Here is
a brief description of C3's more useful utilities.


10.1.2.1 cexec

This
command executes a command string on each node in a cluster. For
example,

[root@fanny root]# cexec mkdir tmp
************************* local *************************
--------- george.wofford.int---------
--------- hector.wofford.int---------
--------- ida.wofford.int---------
--------- james.wofford.int---------

The directory tmp has been created on each
machine in the local cluster. cexec has a serial
version cexecs that can be used for testing.
With the serial version, the command is executed to completion on
each machine before it is executed on the next machine. If there is
any ambiguity about the order of execution for the parts of a
command, you should use double quotes within the command. Consider:

[root@fanny root]# cexec "ps | grep a.out"
...

The quotes are needed here so grep will be run
on each individual machine rather than have the full output from
ps shipped to the head node.
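The serial version accepts the same arguments. For example, to run the same check one machine at a time, waiting for each to finish (output omitted):

[root@fanny root]# cexecs "ps | grep a.out"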


10.1.2.2 cget

This
command is used to retrieve a file from each machine in the cluster.
Since each file will initially have the same name, when the file is
copied over, the cluster and host names are appended. Here is an
example.

[root@fanny root]# cget /etc/motd
[root@fanny root]# ls
motd_local_george.wofford.int
motd_local_hector.wofford.int
motd_local_ida.wofford.int
motd_local_james.wofford.int

cget ignores links and subdirectories.
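Ranges can be used here as well. For example, the same cluster:range form shown with other C3 commands should retrieve the file from just the first two compute nodes:

[root@fanny root]# cget local:0-1 /etc/motd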


10.1.2.3 ckill

This
script allows you to kill a process running on each node in your
cluster. To use it, specify the process by name, not by number,
because it is unlikely that the processes will have the same process
ID on each node.

[root@fanny root]# ckill -u sloanjd a.out
uid selected is 500
uid selected is 500
uid selected is 500
uid selected is 500

You may also specify an owner as shown in the example. By default,
the local user name will be used.
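For example, without the -u option, an a.out process would be killed under the invoking user's name:

[root@fanny root]# ckill a.out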


10.1.2.4 cpush

This
command is used to move a file to each node on the cluster.

[root@fanny root]# cpush /etc/motd /root/motd.bak
building file list ... done
building file list ... done
motd
motd
building file list ... done
motd
wrote 119 bytes read 36 bytes 62.00 bytes/sec
total size is 39 speedup is 0.25
wrote 119 bytes read 36 bytes 62.00 bytes/sec
total size is 39 speedup is 0.25
wrote 119 bytes read 36 bytes 62.00 bytes/sec
total size is 39 speedup is 0.25
building file list ... done
motd
wrote 119 bytes read 36 bytes 62.00 bytes/sec
total size is 39 speedup is 0.25

As you can see, statistics for each move are printed. If you only
specify one file, it will use the same name and directory for the
source and the destination.
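For example, the following copies /etc/motd to /etc/motd on every node in the cluster:

[root@fanny root]# cpush /etc/motd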


10.1.2.5 crm

This
routine deletes or removes files across the cluster.

[root@fanny root]# crm /root/motd.bak

As with its serial counterpart rm, you can use the -i, -r, and -v options for interactive, recursive, and verbose deletes, respectively. Note that the -i option prompts only once, not once per node. Without options, crm silently deletes files.
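For example, to remove a directory such as /root/tmp recursively, with a single confirmation prompt:

[root@fanny root]# crm -i -r /root/tmp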


10.1.2.6 cshutdown

This
utility allows you to shut down the nodes in your cluster.

[root@fanny root]# cshutdown -r t 0

In this example, the time specified was 0 for an immediate reboot.
(Note the absence of the hyphen for the t option.)
Additional options are supported, e.g., to include a shutdown
message.
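Since cshutdown is built on the standard shutdown command, a broadcast message can presumably be supplied as well; the exact form shown here is an assumption, so check the cshutdown manpage for the supported syntax:

[root@fanny root]# cshutdown -r t +5 "Cluster going down for maintenance"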


10.1.2.7 clist, cname, and cnum

These three commands are used to query the configuration file to
assist in determining the appropriate numerical ranges to use with C3
commands. clist
lists the different clusters in the configuration file.

[root@amy root]# clist
cluster oscar_cluster is a direct local cluster
cluster pvfs_clients is a direct local cluster
cluster pvfs_iod is a direct local cluster

cname
lists the names of machines for a specified range.

[root@fanny root]# cname local:0-1
nodes from cluster: local
cluster: local ; node name: george.wofford.int
cluster: local ; node name: hector.wofford.int

Note the use of 0 indexing.

cnum
determines the index of a machine given its name.

[root@fanny root]# cnum ida.wofford.int
nodes from cluster: local
ida.wofford.int is at index 2 in cluster local

These can be very helpful because it is easy to lose track of which
machine has which index.


10.1.2.8 Further examples and comments

Here is an example using a range:

[root@fanny root]# cpush local:2-3 data
...

local designates which cluster within your configuration file to use. Because compute nodes are numbered from 0, this will push the file data to the third and fourth compute nodes in the cluster. (That is, it will send the file from fanny to ida and james, skipping over george and hector.) Is that what you expected? For more information on ranges, see the manpage c3-range(5).

Note that the name used in C3 commands must match the name used in
the configuration file. For C3, ida and
ida.wofford.int are not equal even if there is
an alias ida that resolves to
ida.wofford.int. For example,

[root@fanny root]# cnum ida.wofford.int
nodes from cluster: local
ida.wofford.int is at index 2 in cluster local
[root@fanny root]# cnum ida
nodes from cluster: local

When in doubt about what form to use, just refer back to
/etc/c3.conf.

In addition to the commands just described, the C3 command
cpushimage
can be used with SystemImager to push an image from server to nodes.
There are also several user-contributed utilities. While they are not installed by default, they can be found in the contrib subdirectory of the C3 source tree. These user-contributed scripts can serve as examples for writing your own scripts based on C3 commands.

C3
commands take a number of different options not discussed here. For a
brief description of other options, use the --help
option with individual commands. For greater detail, consult the
manpage for the individual command.
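For example, the following prints a summary of cexec's options:

[root@fanny root]# cexec --help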

