HP OpenView System Administration Handbook: Network Node Manager, Customer Views, Service Information Portal, HP OpenView Operations

Tammy Zitello

22.2 FUNCTIONAL CHECKS


The functional checks are helpful when you need to determine the overall operability of the OVO environment. When a functional check does not produce the desired outcome, the next step is to plan time for the necessary troubleshooting, configuration changes, or perhaps a patch fix. To isolate the problem, collect as much system information as possible, and perform the initial data gathering and basic troubleshooting before placing a support call, if one becomes necessary. Table 22-3 provides a listing of OVO functional checks; the sketch below shows one way to capture an initial snapshot before working through them.
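As a minimal sketch (assuming an HP-UX management server and the commands used elsewhere in this chapter; the output file name is hypothetical), an initial data-gathering pass might look like this:

    # Capture a first snapshot of the OVO environment before troubleshooting
    (
      opcsv                        # are all management server processes running?
      opcsv -version               # server software and patch version
      ovstatus -c                  # OpenView platform background processes (NNM)
      ps -ef | grep -E 'opc|ora'   # OVO and Oracle processes at the OS level
    ) > /tmp/ovo_snapshot.txt 2>&1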

Table 22-3. OpenView Operations Functional Test Checklist

Check the following areas

Procedure

Notes

1. Are all of the management server processes running?

opcsv

Restart the processes and investigate why they were not running.
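A hedged sketch of a check-and-restart pass, using the stop/start commands shown in Section 22.2.1.1:

    opcsv          # list the management server processes and their state
    ovstop opc     # stop the OVO server processes
    ovstart opc    # start them again
    opcsv          # confirm that everything now reports running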

2. What is the version of the management server software?

opcsv -version

Verify the current patch level for the management server. Upgrade if necessary.

3. Are all of the managed node processes running?

ovc

Check the error log file for possible reasons the process is down. Restart the processes.
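A minimal sketch for an HTTPS managed node (DCE nodes use opcagt instead, as in check 4):

    ovc            # list the agent components and their state
    ovc -restart   # restart all agent components
    ovc            # confirm the components are running again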

4. What is the patch version of the managed node software?

opcagt -version

Try to maintain the current patch release.

5. Check the installed version of the managed node software and the sub-agent components from the management server.

opcragt -agent_version <node>

This is also a quick way to check the communications between the server and the node. If this fails, verify that the node is managed by this server. It may be necessary to restart the processes on the node.

6. Check for OVO errors on the DCE managed node.

tail /var/opt/OV/log/opcerror

If there are error messages in the file, check for resolutions via the eCare problem resolution web site. It may be necessary to log a support call via itrc.hp.com.

7. Check the OVO errors on the HTTPS managed node.

tail /var/opt/OV/log/System.txt

8. Check the remote procedure call (RPC) service; verify that the DCE/NCS or RPC-based agents are registered.

rpccp show mapping

RPC processes on the server:

opcmsgrd

opcdistm

opcdispm

RPC processes on the node:

opcctla

rpcd (HP)

dced (RPC)

llbserver (NCS)
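As a sketch on the management server, you might filter the registration listing for the OVO endpoints; this assumes the annotations in the rpccp output include the process names:

    rpccp show mapping | grep -i opc   # look for opcmsgrd, opcdistm, opcdispm entries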

9. Check the status of the Oracle processes.

ps -ef | grep ora

If the Oracle processes are not running, the OVO processes will not start.

10. Verify the Oracle listener.

lsnrctl status

If the listener is not running, restart it with lsnrctl start. The database must be open in order for the listener to start properly. You may need help from a DBA to restart the database.
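If the database itself is down, a DBA-style restart on Oracle 9.x might look like the following sketch (connect / as sysdba assumes OS authentication for the oracle account):

    su - oracle
    sqlplus /nolog
    SQL> connect / as sysdba
    SQL> startup        -- opens the database
    SQL> exit
    lsnrctl start       # the listener can now register the open database
    lsnrctl status      # verify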

11. Check the OpenView platform background processes and services (NNM).

ovstatus -c

If the OpenView background processes are not running, OVO processes will not start.

12. Verify hostname resolution from the server.

nslookup <node>

nslookup <local_server_name>

The OpenView platform utilizes name resolution lookup services; this information must be correct within the DNS, NIS, or local hosts file.

Note:
The command nsquery is becoming more useful for hostname or IP lookups. It provides more detailed information.

13. Verify hostname resolution from the managed node.

nslookup <server>

nslookup <local_node_name>

14. Check network connectivity from the server.

ping <node>

Ping tests may reveal network delays that OpenView can interpret as a node-down condition. Intermittent ping failures should be investigated and resolved.

15. Check network connectivity from the managed node

ping <server>

If the node is unable to ping the server, messages are buffered on the node until the server is available.

16. Check the name resolution service configuration file.

more /etc/nsswitch.conf

Verify the search order. If using multi-homed hosts, see #17.

17. Is the node multi-homed?

Add an entry with the second IP_ADDRESS HOSTNAME into a file that you create on the server called /etc/opt/OV/share/conf/OpC/mgmt_sv/opc.host.

The file has the same format as the /etc/hosts file. After the file is created, restart the server processes. Any changes to the file require a process restart.
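A minimal sketch of check 17; the IP address and hostname are hypothetical:

    # /etc/opt/OV/share/conf/OpC/mgmt_sv/opc.host -- same format as /etc/hosts
    192.0.2.25   nodeb.example.com

    # restart the server processes so the new entry is picked up
    ovstop opc
    ovstart opc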

18. Check operator and administrator login sessions.

ps -ef|grep ui

ps -ef|grep opcuiwww

Each operator (including the administrator) should have a unique login account for security purposes.

19. Log in to OVO using the operator account to ensure that the environment displays the correct information based upon the operator's responsibilities.

20. Check the Oracle environment variables.

cat /etc/opt/OV/share/conf/ovdbconf

21. Check the Oracle environment variables within the Oracle user's shell environment.

su oracle

env|grep ORA

Note:
Compare this output to the listing found in Chapter 23. It may be necessary to consult with a DBA to verify the health of the OVO Oracle database. The oracle user account requires the correct environment for installation and configuration of the database and during runtime operations.

22. Check all of the environment variables within the oracle user's shell environment.

su oracle

env > /tmp/ora_env.out

more /tmp/ora_env.out

System environment variables are also important, such as the PATH, MANPATH, LANG, and so on.

23. Check connectivity to the Oracle database

  1. su oracle

  2. sqlplus opc_op/

  3. Exit

Oracle 8.x and 9.x

Note:
It may be necessary to check with a DBA to gain access to the database using the privileged user accounts.
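As a sketch of check 23, a trivial query confirms that the connection is live; the password placeholder is hypothetical:

    su - oracle
    sqlplus opc_op/<opc_op_password>
    SQL> select sysdate from dual;   -- any row returned proves connectivity
    SQL> exit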

24. Check remote connections to the database.

  1. su oracle

  2. sqlplus opc_op/opc_op@ov_net

25. Check the error logs.

Refer to Section 22.1.1.

26. Clean up pending template distributions.

ll /var/opt/OV/share/tmp/OpC/distrib

rm /var/opt/OV/share/tmp/OpC/distrib/*

If the distribution failed, a message will periodically appear in the message browser; remove the files from the distribution directory.

27. Check the integrity of the queue files

/opt/OV/contrib/OpC/opcqchk <queue_file>

If this program presents a menu and basic operations can be performed, the queue file is OK.

28. Test message data flow in general

opcmsg a=a o=o msg_t=test

If the message appears in the browser, message flow is OK.
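A slightly fuller variant of check 28 with the long parameter names spelled out (the message text is arbitrary):

    opcmsg severity=normal application=test object=test msg_text="functional check"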

29. Messages should appear in the browser of the assigned operator. If not, check the following:

  1. Check the operator's responsibilities

  2. Use local logging for the policy to check the following:

    1. Whether the agent actually receives the input event.

    2. Whether the event is suppressed.

    3. Rule out duplicate message suppression and message/event correlation.

    4. Check the history browser; the message could have been configured to go directly to the history tables in the database.

  3. Review the policy inventory on the managed node using the command ovpolicy or opctemplate, as shown in the sketch after this list.

  4. Use the opcpat command to verify pattern syntax is correct. See the man pages for use of the opcpat command.
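A sketch of the inventory review in step 3 above:

    ovpolicy -list    # policy inventory on an HTTPS managed node
    opctemplate -l    # template inventory on a DCE managed node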

30. Test action execution.

Use the broadcast command feature and run a command (for example, hostname).

If this succeeds, actions work in general.

31. Verify agent status.

opcragt <node_name>

This displays agent status and tests basic connectivity.

32. Test basic HTTPS connectivity

bbcutil -ping <node>

Execute the command on the managed node

33. Test HTTPS agent security certificates

ovcert -list

34. Test connectivity of DCE agent

rpccp show mapping

35. Utilize the trace facility

If further investigation is necessary, configure XPL tracing.

Refer to the Tracing Concepts and Users Guide for more information about XPL tracing and logging.

22.2.1 When the Processes Will Not Start


If you perform the functional checks and the processes still will not start, the cause could be corrupt queue or pipe files. Corrupt queue files occur on rare occasions when the OS or physical disk buffers have not been flushed to disk successfully. These files can be removed from the server or node while the OVO processes are not running, but be aware that they contain data that will be lost. If you have tried everything else and are still experiencing problems with agent or server processes, move the queue files aside temporarily and restart the agent; if the agent starts, return the original queue files to minimize data loss. If necessary, remove the queue files and restart the processes; the queue files will be recreated.
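A minimal sketch of the move-aside approach on a managed node (the backup directory is hypothetical):

    opcagt -kill                                  # stop all managed node processes
    mkdir /tmp/opc_queue_bak                      # hypothetical holding area
    mv /var/opt/OV/tmp/OpC/* /tmp/opc_queue_bak   # move, rather than delete, the queue files
    opcagt -start                                 # the agent recreates clean queue files
    # if the agent now starts, the old queue files were the problem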

22.2.1.1 Procedure to Remove the Queue Files on the Server


  1. Exit all OVO GUIs.

  2. Stop the management server processes:

    ovstop ovoacomm

  3. Erase the OVO temporary files:

    rm /var/opt/OV/share/tmp/OpC/mgmt_sv/*

  4. Restart the management server processes:

    ovstart opc

  5. Restart the GUI:

    opc

22.2.1.2 Procedure to Remove the Queue Files on the Node


  1. Stop all the managed node processes:

    opcagt -kill

  2. Erase the OVO temporary files:

    rm /var/opt/OV/tmp/OpC/*

  3. Verify that all of the processes have stopped:

    ps -eaf | grep opc

  4. Restart the managed node processes:

    opcagt -start

