HP OpenView System Administration Handbook [Electronic resources] : Network Node Manager, Customer Views, Service Information Portal, HP OpenView Operations (text version)

Tammy Zitello

3.1 MANAGEMENT REQUIREMENTS


Gather the management requirements first. Ask the following types of questions; they should lead to many more: What will be managed? How is it to be managed? (Those two questions drive hardware and software requirements.) What problems are to be detected? How can the problems be detected? Can a problem be detected by an SNMP trap or by monitoring thresholds? Can it be detected by a shell script? When a problem is detected, what steps can be taken to correct it? Can those steps be automated?

The term "managed" needs to be defined, because what is managed on one node may differ from what is managed on another. What is going to be managed? Will only network devices be managed, or will servers and workstations be managed as well? How much is to be managed? Will only the Up/Down status of systems and interfaces be managed? Will the status of processes need to be monitored on any given system? Will the resources of a system, or the resources a process is consuming, need to be managed? These types of questions drive the size of the system and the license requirements. And don't forget to gather the reporting requirements.

The limiting factor on what will be managed is the budget. All the management requirements can be gathered and put into a multi-year plan that takes the given budget into account.

3.1.1 Problem Detection


What problems require detection, and thus an alert that the problem has been detected? Most likely, some event, catastrophic or otherwise, has led the company to decide that network and systems management needs to be deployed. Loss of revenue due to the unavailability of a network, system, or service is at the top of the list of reasons to deploy network and systems management product suites. But what granularity of management is needed in order to be alerted to a particular problem? One can think of NNM as the manager of the "block diagram" of networks and systems. At a high level, NNM monitors the connectivity status of the network interfaces in the network "block diagram" using ping and SNMP, and problem detection becomes more granular from there.

OVO manages services and processes and, to some extent, operating system and process resource utilization. The processes and resources are those that run on a "block" within the network block diagram. For more detailed process and resource utilization, there is the coda sub-agent of the Operations Agent. And for even more detailed process and resource utilization, the OpenView Performance Agent (OVPA) can be purchased and configured. The OVPA will send "events" to NNM or OVO, using SNMP or Operations Center Messaging respectively. OVPA also collects, summarizes, and maintains a history of the resource data. How much detail is needed?

Specific service response time can be measured through OpenView Internet Services (OVIS).

OpenView Performance Insight (OVPI) collects management data from NNM, OVIS, OVPA, and SNMP. It includes threshold monitoring and forwards detected threshold violations via SNMP.

The question again is, at what "point" must the problem be detected? Define the problem, and then determine the method of instrumentation to detect and report the problem to the management server. After the problem is detected and reported, decide what action must be taken, whether it is an automatic action taken by the product or a manual process requiring human intervention.

3.1.2 Software Requirements


Another reason for deploying management systems is to enable the company to become more proactive in managing networks and systems. Becoming proactive means that the organization managing the networks and systems knows about a problem before the customer calls to report it. The ultimate goal is to know of the problem and fix it so that the customer never knows there was a problem. This ultimate goal goes beyond the scope of this book because it entails the design of highly available networks, systems, and services that are managed by the Network Management System (NMS). The goal of a network management team is to instrument the network in such a way that the problem is detected, an alert is raised, and any automatic or manual actions are taken to correct the problem. All these events are recorded within the NMS browser. What management software will help reach these goals? NNM and OVO problem detection and notification will be discussed here.

3.1.2.1 Network Node Manager

NNM's Alarm browser contains SNMP traps, or events. The way to get an alarm into NNM's Alarm browser is through an SNMP trap. There are many ways that an SNMP trap can be generated. One is that a trap is generated by the SNMP agent on the managed node and sent to the trap destination. In this case, it is sent to NNM, but it could just as easily be sent to a system that is designed for event correlation. SNMP traps are built into the SNMP agent by the vendor and are sent upon detection of an anomaly on the system. The associated traps are vendor, operating system, and product specific and are described in either the agent's documentation or the trap definition file. The traps are generally not configurable, though some agents allow customization of the data sent within the trap. HP OpenView's Extensible SNMP agent does allow for the creation of trap definitions.

HP's Extensible SNMP agent runs on HP-UX and Solaris and can be "enhanced" to provide additional data through customized Object Identifiers (OIDs) that operate through an snmpget or can perform system operations through an snmpset. The Extensible agent runs as a sub-agent to HP's SNMP agent. The creation of customized OIDs and traps allows for detection of specific problems not available with the standard SNMP agent.

To use NNM to detect and notify that a specific problem has occurred on a node, check to see if there is a vendor-specific SNMP trap that will provide the problem detection. First, obtain the trap configuration file or documentation on the vendor's agent. The trap configuration file is an ASCII file and can be read with any editor. Using an editor makes it easier to search through the file for information rather than loading the trap definitions into the event database and looking down the event tree. Traps that meet the requirements can be loaded into the event database and configured through xnmevents. It is within xnmevents that actions can be configured when the trap is received.
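The idea of tying an action to a received trap can be sketched as a lookup keyed on the fields that identify a trap. This is a minimal illustration in Python, not how NNM stores its event configuration: the `Trap` class, the `ACTIONS` table, the action names, and the enterprise OID are all hypothetical. Only the generic trap numbers are taken from the SNMPv1 standard.

```python
from dataclasses import dataclass, field

# Generic trap numbers defined for SNMPv1 (RFC 1157).
GENERIC_TRAPS = {
    0: "coldStart",
    1: "warmStart",
    2: "linkDown",
    3: "linkUp",
    4: "authenticationFailure",
    5: "egpNeighborLoss",
    6: "enterpriseSpecific",
}


@dataclass
class Trap:
    enterprise: str          # enterprise OID of the sending agent
    agent_addr: str          # address of the node that sent the trap
    generic: int             # generic trap number (0-6)
    specific: int = 0        # meaningful only when generic == 6
    varbinds: dict = field(default_factory=dict)


# Hypothetical action table: (enterprise OID, generic, specific) -> action name.
# The enterprise OID below is a made-up placeholder, not a real vendor.
ACTIONS = {
    ("1.3.6.1.4.1.99999", 6, 12): "page_oncall",
    (None, 2, 0): "log_link_down",   # None matches any enterprise
}


def select_action(trap):
    """Return the configured action name for a trap, or None if unconfigured."""
    exact = ACTIONS.get((trap.enterprise, trap.generic, trap.specific))
    if exact is not None:
        return exact
    # Fall back to an entry that applies regardless of enterprise.
    return ACTIONS.get((None, trap.generic, trap.specific))
```

A trap with no matching entry returns `None`, which mirrors the recommendation to leave traps unconfigured until they are known to be needed.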

A second way is to poll an SNMP agent for specific OIDs and create an internal trap within NNM based on a threshold. The OIDs can be found within a vendor's MIB definitions for the vendor's SNMP agent. These MIB definitions are text files that can be read with an editor (brush up on reading ASN.1 syntax) and searched for an OID that provides the needed information. The MIB definitions can also be loaded into NNM and read in the MIB tree using NNM's MIB browser (xnmbrowser). They must be loaded in order to configure NNM data collection. When the MIB is loaded into NNM, configure NNM's Data Collection, Threshold, and Monitoring to retrieve the SNMP data and internally create a threshold and rearm trap (event) by checking the retrieved value against the threshold configured for the OID. When the threshold is exceeded, NNM sends a specific SNMP trap. When the value for the OID returns to normal, the rearm trap (event) occurs. These traps can be viewed in the Alarm browser. If the information cannot be retrieved from the SNMP agent, another way will have to be devised, possibly through an OVO action, command, or monitor, or through some other method or add-on product.

Tip

When using NNM Threshold and Monitoring, only numeric variables can be retrieved for status checking. Polling for changes in text cannot be done.
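The threshold and rearm behavior described above can be sketched as a small state machine. This is an illustrative Python model, not NNM's implementation: the class name, the simple "greater than" and "less than or equal" comparisons, and the sample values are assumptions, and any options the product offers beyond a single threshold and rearm value are left out.

```python
class ThresholdMonitor:
    """Emit one "threshold" event per crossing and one "rearm" on recovery."""

    def __init__(self, threshold, rearm):
        if rearm > threshold:
            raise ValueError("rearm value must not exceed the threshold")
        self.threshold = threshold
        self.rearm = rearm
        self.armed = True  # True while waiting for a threshold crossing

    def check(self, value):
        """Return "threshold", "rearm", or None for one polled numeric value."""
        if self.armed and value > self.threshold:
            self.armed = False
            return "threshold"
        if not self.armed and value <= self.rearm:
            self.armed = True
            return "rearm"
        return None


monitor = ThresholdMonitor(threshold=90, rearm=75)
events = [monitor.check(v) for v in [50, 95, 97, 80, 70, 92]]
# events == [None, "threshold", None, None, "rearm", "threshold"]
```

Keeping the rearm value below the threshold prevents a value hovering near the threshold from generating a new alarm on every poll.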

3.1.2.2 OpenView Operations

OVO is agent based, meaning that an agent is deployed to the node and alarms (messages) are sent to the OVO management station based upon the templates assigned to the node. There are different types of templates that read log files, run monitor scripts, or intercept and format SNMP traps. Whatever the type of template, the template itself is used to retrieve the specific data and format a message based on a template string. The agents can only be deployed to specific operating systems and not to network devices such as routers, bridges, and switches. There are also Smart Plug-ins (SPIs) available, both free and at cost, that are designed to be deployed with the OVO UNIX agents to manage specific databases, operating systems, mail servers, and more. The templates and the actions, commands, and monitors they run are fully customizable. New templates, actions, commands, and monitor scripts can be created to do just about anything. If a script can be written to determine whether there is a problem, it can be monitored through OVO. If a script can be written to properly correct a problem, it can be an automatic action. A trap template can be assigned to the management server or a collection station to send SNMP traps that are received by NNM's ovtrapd process to the OVO UNIX message browser via the OVO messaging system. The OVO message can be sent using TCP to ensure that the message gets to the OVO management server.
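As an illustration of "a script that determines whether there is a problem", the following Python sketch measures how full a file system is and returns a single numeric value that a monitor could compare against a threshold. The function names are hypothetical, and the step of handing the value to the OVO agent is deliberately left out because it depends on the agent's own interface.

```python
import shutil


def percent_used(total, used):
    """Return disk usage as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * used / total, 1)


def check_filesystem(path="/"):
    """Measure how full the file system containing `path` is."""
    usage = shutil.disk_usage(path)
    return percent_used(usage.total, usage.used)
```

Calling `check_filesystem("/var")` on a schedule and reporting the result is the general shape of a monitor script; the threshold itself belongs in the template rather than in the script, so that it can be tuned without redeploying code.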

Whether using NNM alone or in conjunction with OVO, it is recommended that Node Up, Node Down, Interface Up, and Interface Down events be enabled first, and that everything else be turned off. If the OVO product is loaded and configured with the basic events and templates deployed, the system will be inundated with information about the monitored systems. Most of this information will be unwanted, unnecessary, and duplicated. Literally tens of thousands of events and messages can easily be generated on a daily basis if the default templates and traps are deployed out-of-the-box. It is virtually impossible to peruse fifty thousand messages a day. The same concept applies to NNM. NNM will be flooded with traps from the nodes it discovers, especially if the discovered SNMP agent allows the management server to configure itself as a trap destination. Add alarms and traps as necessary within the product that is best suited to detect, report, and act on the problem. If the operators monitoring the Alarm or Message browser are told to "ignore" what is presented in the browser, they will begin to lose faith in the system and its ability to provide alarms that are substantive.
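When a browser is already flooded, one way to decide what to turn off is to count which sources and event types produce the most entries. The Python sketch below assumes the events have already been exported into a list of dictionaries with hypothetical `source` and `type` keys; how the events are exported from NNM or OVO is not shown.

```python
from collections import Counter


def noisiest(events, top=3):
    """Count events by (source, type) and return the most frequent pairs."""
    counts = Counter((e["source"], e["type"]) for e in events)
    return counts.most_common(top)


# Made-up sample records for illustration only.
sample = [
    {"source": "router1", "type": "linkDown"},
    {"source": "router1", "type": "linkDown"},
    {"source": "router1", "type": "linkDown"},
    {"source": "server7", "type": "authenticationFailure"},
    {"source": "server7", "type": "authenticationFailure"},
    {"source": "switch2", "type": "coldStart"},
]
top_pairs = noisiest(sample, top=2)
```

The pairs at the top of the list are the first candidates for suppression or reconfiguration.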

3.1.2.3 Additional Management Products

Adding management products increases the complexity of the overall solution. Depending on the management requirements, there may be no way to avoid it. Don't add management capabilities if the existing products are not working as expected. If the NNM Alarm browser or OVO Message browser contains unnecessary alarms, those alarms need to be tuned so that the browser becomes state based. The more products that are added, the greater the memory, CPU, disk space, backup, and training requirements. More alarms will arrive if the products are not configured properly and introduced gradually. Keep in mind that some products have different methods of backing up their data, and some may not function in an MC/ServiceGuard or other High Availability configuration. In addition, not all products can be managed remotely; some require access to the system console for configuration. These items must be considered within the plan.

3.1.2.4 Functionality Overlap

Functionality may overlap between products. Determine which product has the better functionality and use it. Both NNM and OVO have browsers for viewing alarms: NNM has the Alarm browser and OVO has the Message browser. When alarms are deleted in the NNM browser, they are just that: Deleted! Acknowledging an OVO message in the active Message browser moves the message to the history table, where it remains until it is downloaded. Until the message is downloaded from the history table, it can be retrieved for viewing. Assigning the trap template to an NNM management console or collection station forwards the SNMP events into the OVO Message browser, enabling the OVO Message browser to be the primary viewer for alarms.

Note

By deploying an OVO Agent and trap template to an NNM management/collection station, SNMP traps can be sent from NNM to OVO via TCP, which guarantees delivery. After the trap has been converted to an OVO message, the message can be owned by an operator and escalated within OVO, text can be added for testing or as instructions on what to do with the message when it is received, and so on.

