Beowulf Cluster Computing with Linux, Second Edition


William Gropp; Ewing Lusk; Thomas Sterling

See Chapter 6 for a discussion of post-purchase cluster setup. A variety of paths to this goal can be taken, each with pros and cons.



2.10.1 Cluster Vendors


A common approach to building clusters is to find a vendor that provides integrated solutions. Many large system vendors now have products in the cluster space. They are experienced with the problems that customers encounter in the initial stages of cluster setup and know which questions to ask up front. In many cases, the cluster can be powered on when delivered and be running applications within hours. Experienced cluster vendors may also offer on-site hardware and software support. This approach is certainly the simplest, but it can be more expensive than the following options, because the extra services come at a cost. In many cases, however, the extra cost is well worth it.


2.10.2 White Boxes


Another common approach to building clusters is to find a vendor that builds custom computers but has no cluster expertise. The vendor builds machines to the customer's specifications, which allows the customer to specify the exact parts from which the cluster is assembled. While on-site hardware maintenance may be available, software maintenance is not. Experienced cluster builders may choose this route, since the difference between white box vendors and cluster vendors consists largely of help with cluster-specific issues.


2.10.3 DIY


The final approach to building clusters is to do everything yourself. Every detail of the system configuration is controllable, from the type of power supply to the cables and the fans used for cooling. Hundreds of boxes will be delivered, containing the parts required for each cluster node. Nodes must be assembled before software can be installed. This approach provides the most flexibility but also has the highest potential for pitfalls.


2.10.4 Pitfalls


Many problems can manifest themselves during the construction and operation of a cluster. Some can be avoided by making proper decisions during the specification process. These problems can make clusters virtually unusable, so they should be taken seriously. The problems mentioned here can be treated as a checklist of issues to review before a cluster is set up.


Verify that enough power and cooling exist to operate the cluster properly. Underpowered or overheating clusters rarely perform well, and in many cases they exhibit strange problems that can consume days, weeks, or months of administrator time to debug.
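The power and cooling check above can be roughed out with simple arithmetic before any hardware is ordered. The following is a minimal sketch, not a sizing tool: the function names, the per-node wattage, the overhead figure, the 20% headroom, and the 80% continuous-load derating of a circuit are all assumptions chosen for illustration and should be replaced with measured values and local electrical requirements. The only fixed fact used is that one watt of dissipated power corresponds to roughly 3.412 BTU per hour of heat.

```python
import math

# Conversion factor: 1 W of dissipated power is about 3.412 BTU/hr of heat.
WATTS_TO_BTU_PER_HOUR = 3.412


def cluster_power_budget(node_count, watts_per_node, overhead_watts=0.0, headroom=0.2):
    """Return (required_watts, cooling_btu_per_hour) for a cluster.

    watts_per_node, overhead_watts (switches, storage, and so on) and
    headroom are hypothetical inputs; use measured values where possible.
    """
    load_watts = node_count * watts_per_node + overhead_watts
    # Provision power with some headroom above the expected load.
    required_watts = load_watts * (1.0 + headroom)
    # Essentially all power drawn ends up as heat that must be removed.
    cooling_btu_per_hour = load_watts * WATTS_TO_BTU_PER_HOUR
    return required_watts, cooling_btu_per_hour


def circuits_needed(required_watts, volts, amps, derating=0.8):
    """Number of circuits needed, assuming each is loaded to `derating`
    of its rating (the 0.8 default is an assumption, not a rule stated
    in the text)."""
    usable_watts_per_circuit = volts * amps * derating
    return math.ceil(required_watts / usable_watts_per_circuit)


if __name__ == "__main__":
    # Hypothetical example: 64 nodes at 250 W each plus 1000 W of network gear.
    watts, btu = cluster_power_budget(64, 250, overhead_watts=1000)
    print(f"Provision about {watts:.0f} W of power")
    print(f"Remove about {btu:.0f} BTU/hr of heat")
    print(f"Circuits (208 V, 30 A): {circuits_needed(watts, 208, 30)}")
```

Even a rough estimate like this makes it obvious early on whether the machine room needs additional circuits or air conditioning.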

Employ some sort of console solution. Many hardware errors are displayed during the BIOS boot sequence. Whether or not the BIOS supports a serial console, the hardware needed to see these errors should be available. The simplest solution is a crash cart: a single keyboard, monitor, and mouse on a cart that can be connected to a machine in case of problems. More elaborate solutions can be constructed using KVM switches or serial concentrators to provide usable consoles on each machine.

Profile the real target applications. Performance on artificial benchmarks is better than no information at all; however, these results are not important unless the primary application run on the cluster will be benchmarks.
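As one illustration of profiling an application rather than a benchmark, the sketch below uses Python's standard cProfile and pstats modules to show where a program spends its time. The functions stencil_step and run_application are hypothetical stand-ins for a real target application, and profile_report is a helper name introduced here; a production code written in C or Fortran would use a profiler suited to that language instead.

```python
import cProfile
import io
import pstats


def stencil_step(grid):
    """Hypothetical application kernel: periodic 1-D three-point average."""
    n = len(grid)
    return [(grid[i - 1] + grid[i] + grid[(i + 1) % n]) / 3.0 for i in range(n)]


def run_application(steps=50, size=2000):
    """Hypothetical stand-in for the real target application."""
    grid = [float(i % 7) for i in range(size)]
    for _ in range(steps):
        grid = stencil_step(grid)
    return grid


def profile_report(func, *args, top=5, **kwargs):
    """Run func under cProfile and return a text report of the `top`
    entries sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report(run_application))
```

The report indicates which routines dominate the run time, which in turn suggests whether processor speed, memory, or the network deserves the larger share of the budget.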

Finally, remember that everything is harder when it needs to be done multiple times. While assembling a single new machine is easy, assembling 32, 64, 96, or 128 machines is much harder. Remember that time has value. Cutting corners to save small amounts of money almost always causes problems.

