Beowulf Cluster Computing with Linux, Second Edition [Electronic resources] (text version)


William Gropp; Ewing Lusk; Thomas Sterling


16.5 Troubleshooting

Maui's diagnostic commands provide a good start for troubleshooting any scheduling issues. The diagnose command together with checknode and checkjob provides detailed state information about the scheduler, including its various facilities, nodes, and jobs. In addition to state information, these commands can also trigger extensive internal sanity checks for the scheduling realm of interest. For example, if the job priorities do not appear to properly reflect site objectives, the diagnose -p command can be used to display the priorities of all jobs and the contributions of the various priority components and subcomponents. This command will also look for invalid priority values and summarize overall priority contributions of each component. At a glance, it will help administrators determine whether parameters need to be adjusted and, if so, by how much. Other diagnostic commands assist in both problem resolution and system tuning in areas such as throttling policies, reservations, fairshare, Grid scheduling, and job management. If any diagnostic command uncovers a potential problem, the issue is reported in the form of WARNING messages appended to the normal command output. Use of these commands typically identifies or resolves the vast majority of all scheduling issues.
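As a rough illustration of checking diagnostic output for reported problems, the following Python sketch scans the text of a command's output for lines carrying a WARNING marker. It assumes that each warning occupies its own line and contains the literal text WARNING; the exact layout of Maui's output is not specified here, and the helper names run_diagnostic and extract_warnings as well as the sample text are hypothetical.

```python
import subprocess


def run_diagnostic(args):
    """Run a Maui command (for example ["diagnose", "-p"]) and return its text output.

    Assumes the Maui client commands are installed and on PATH.
    """
    result = subprocess.run(args, capture_output=True, text=True, check=False)
    return result.stdout


def extract_warnings(output):
    """Return the lines of `output` that contain a WARNING marker.

    Assumption: each warning is a single line containing the literal text "WARNING".
    """
    return [line.strip() for line in output.splitlines() if "WARNING" in line]


# Hypothetical sample text, not real Maui output.
sample = """\
(normal command output would appear here)
WARNING: example problem description
"""

warnings = extract_warnings(sample)
print(len(warnings), "warning(s) found")
```

On a system where Maui is installed, extract_warnings(run_diagnostic(["diagnose", "-p"])) would apply the same check to live output, subject to the same assumption about how warnings are laid out.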


If additional information is required, Maui writes detailed logging information to a logfile specified by the LOGFILE parameter (usually 'log/maui.log'). The LOGLEVEL and LOGFACILITY parameters control the verbosity and focus of these logs. Maui's higher log levels produce a great deal of output, however, so keeping LOGLEVEL below 4 or so unless actually tracking a problem can help prevent excessive file activity.

These logs contain a number of entries, including the following:

  • INFO: provides status information about normal scheduler operations.
  • WARNING: indicates that an unexpected condition was detected and handled.
  • ALERT: indicates that an unexpected condition occurred that could not be fully handled.
  • ERROR: indicates that a problem was detected that prevents Maui from fully operating. This may be a problem with the cluster that is outside of Maui's control, or it may indicate corrupt internal state information.
  • Function header: indicates when a function is called and what parameters are passed.

A simple grep through the log file will usually indicate whether any serious issues have been detected and is of significant value when obtaining support or locally diagnosing problems. If neither commands nor logs point to the source of the problem, the Maui users list (<support@supercluster.org>) may be consulted for additional assistance.
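The grep step can also be scripted. The Python sketch below tallies log lines by entry type and collects the ALERT and ERROR lines for closer inspection. It assumes that each entry's type keyword appears on the line followed by a colon (for example ERROR:); that layout, the function name summarize_log, and the sample lines are assumptions for illustration rather than a description of Maui's actual log format.

```python
import re
from collections import Counter

# Entry types named in the text; function-header lines carry no keyword and are skipped.
ENTRY_RE = re.compile(r"\b(INFO|WARNING|ALERT|ERROR):")
SERIOUS = {"ALERT", "ERROR"}


def summarize_log(lines):
    """Count entries per type and collect the ALERT and ERROR lines."""
    counts = Counter()
    serious = []
    for line in lines:
        match = ENTRY_RE.search(line)
        if match is None:
            continue
        kind = match.group(1)
        counts[kind] += 1
        if kind in SERIOUS:
            serious.append(line.rstrip("\n"))
    return counts, serious


# Hypothetical sample lines, not taken from a real maui.log.
sample_lines = [
    "INFO: example status message\n",
    "ExampleFunction(arg1,arg2)\n",
    "WARNING: example handled condition\n",
    "ERROR: example failure\n",
]

counts, serious = summarize_log(sample_lines)
print(dict(counts))
print(serious)
```

For a real log, an open file object can be passed in place of sample_lines, since the function only iterates over lines. The collected ALERT and ERROR lines are the ones most worth including when asking for support.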
