HP OpenView System Administration Handbook [Electronic resources] : Network Node Manager, Customer Views, Service Information Portal, HP OpenView Operations (text version)


Tammy Zitello


16.2 THE PERFORMANCE AGENT


The OVPA (formerly called the MeasureWare Agent) captures performance, resource, and transaction data from managed servers or workstations. It uses minimal system resources to collect, log, summarize, and timestamp data, detect alarm conditions, and send notifications to the appropriate applications, such as OVO or NNM. Other programs, such as OV Performance Manager, OV Reporter, and Glance, can extract and use the collected data. OVPA uses data source integration (DSI) technology to receive, alarm on, and log data from external data sources such as applications, databases, networks, and other operating systems.

16.2.1 OVPA Installation


OVPA is distributed with OVO and requires a separate license. To install it from the management server, select Actions > Agent > Install Subagent from the menu, then select MWA as the subagent to be installed. The agent can also be installed from a distribution CD or software depot using swinstall. After installation, the files and programs are located in the directory /opt/perf. Figures 16-6 and 16-7 show the template source for OVPA. Assign and distribute the templates to the managed nodes where you need to monitor up to 300 metrics and take advantage of the other features offered by OVPA. OVPA is supported on HP-UX, Solaris, Windows, Tru64, Linux, and AIX platforms.

Figure 16-6. OVPA Message Source Template Group is shown in the Message Source Template window.


Figure 16-7. OVPA Message Source Template Group contains the default templates listed in the Message Source Template window.


16.2.2 OVPA (3.x) Process Environment


  • RPCD
    HP-UX remote procedure call daemon, provides the endpoint map service for a system. The rpcd program listens on udp and tcp port 111. The endpoint map service is a system-wide database where local RPC servers register binding information associated with their interface identifiers. The endpoint map is maintained by the endpoint map service of the RPC daemon. The endpoint map services are responsible for handling RPC lookups from requesting clients of compatible locally mapped servers. This technology is being phased out of the OpenView platform and replaced by new technology, HTTPS communications programs. Refer to Chapter 14, "Agents, Policies, and Distribution," for more information about the HTTPS-based agent.

  • DCED (Distributed Computing Environment Daemon)
    Solaris (remote procedure process on Sun platforms).

  • rpcss
    Windows (remote procedure process on Microsoft platforms).

  • ovbbccb
    HTTPS-based data communications process.

  • Perflbd
    Reads the perflbd.rc file to obtain data source names and locations. perflbd starts a rep_server process for each data source configured in the perflbd.rc file. perflbd also gives the client products (such as OV Performance Manager) the data communication information they need to reach the agent. Communication with perflbd is through a TCP socket.

  • Rep_server
    Repository server process that provides access to the data stored in the logfiles. Communication with the rep_server is through RPCs.

  • Agdbserver
    Process that provides access to the alarm generator system database. The database contains information concerning all systems that will be receiving alarms from the agent. Communication with the agdbserver is through RPCs.

  • Alarmgen
    Process that analyzes the data and generates and sends alarm notifications to the alarm daemon in OVPM or the message interceptor in OVO or ovtrapd in NNM.

  • Scopeux
    Collects performance data from the operating system where OVPA is installed. After collecting the data, scopeux summarizes the data and logs it in raw log files based on the specification for data collection defined in the collection parameter (parm) file.

  • Midaemon
    Collects and counts trace data coming from the kernel and translates it for use by OVPA and other performance programs, such as Glance, via a shared memory segment. OVPA's scopeux daemon program attaches to the shared memory interface.

  • DSI
    Data Source Integration logging daemon.

  • Utility
    Manages scopeux log files and analyzes or checks the log files via the repository servers and alarmdef file.

  • Extract
    mwa program for obtaining specific summary or detail data from the repositories.


Note

OVPA 4.x replaces the DCE-RPC-based processes and functionality with HTTPS-based communications processes. Refer to Section 16.2.8, "OVPA 4.x," for more information about OVPA 4.x.

16.2.3 OVPA Startup


The perflbd.rc file is read by the perflbd program during OVPA startup and allows the selected data to be made available for alarm processing and analysis. The default perflbd.rc file contains one entry for a data source named SCOPE that starts a repository server for the scopeux log file set.

The startup sequence for OVPA is as follows:

  • Start scopeux (which starts midaemon if it is not already running).

  • Start transaction tracker (if it is not already running).

  • Check for rpcd.

  • Start perflbd; this starts the rep_server processes (one at a time) as requested in the perflbd.rc configuration file. Note: This can take some time if the logfiles defined for the data sources are large.

  • After the rep_server processes are running, perflbd starts agdbserver.

  • agdbserver starts alarmgen.


After alarmgen is running, connections will be accepted from external programs (such as the HP OpenView Performance Manager).
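
The startup order above also suggests a simple troubleshooting aid: walk the chain and report the first process that is not running. The following is a minimal Python sketch under stated assumptions. The process names are taken from this section (ttd is assumed to be the transaction tracker daemon), and the function name first_missing is hypothetical; gathering the list of running processes (for example, from ps output) is left to the caller.

```python
from typing import Iterable, Optional

# Startup order as described in the text for OVPA 3.x.
# Assumption: "ttd" is the process name of the transaction tracker.
STARTUP_ORDER = (
    "scopeux",
    "midaemon",
    "ttd",
    "rpcd",
    "perflbd",
    "rep_server",
    "agdbserver",
    "alarmgen",
)


def first_missing(running: Iterable[str]) -> Optional[str]:
    """Return the first process in the startup chain that is not running.

    Returns None when every process in the chain is present.
    """
    running_set = set(running)
    for name in STARTUP_ORDER:
        if name not in running_set:
            return name
    return None


if __name__ == "__main__":
    # Example: perflbd is up but no repository server has started yet.
    observed = ["scopeux", "midaemon", "ttd", "rpcd", "perflbd"]
    print(first_missing(observed))
```

Because each process depends on the ones before it, the first missing name is usually the best place to start reading the corresponding status file.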

16.2.4 OVPA Configuration


OVPA has a set of repository servers (called rep_servers) that provide log file data to the alarm generator and other products, such as OV Performance Manager, OVO, NNM, and OV Reporter. There is one rep_server for each data source consisting of a scopeux or DSI log file set.

Configure data sources in the /var/opt/perf/perflbd.rc file. A data source is identified with the following syntax within the perflbd.rc file.



# cat perflbd.rc
DATASOURCE=SCOPE LOGFILE=/var/opt/perf/datafiles/logglob

The DATASOURCE line informs the alarm generator where to find the datafile; the scopeux daemon collects and summarizes performance measurements.
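
To make the DATASOURCE syntax concrete, here is a minimal Python sketch that reads perflbd.rc-style text into a mapping of data source name to log file path. It is an illustration only, not part of OVPA: the function name parse_perflbd_rc is hypothetical, the sample paths are placeholders, and treating lines that start with # as comments is an assumption.

```python
def parse_perflbd_rc(text):
    """Map each DATASOURCE name to its LOGFILE path.

    Assumptions: one data source per line, KEY=VALUE pairs separated
    by whitespace, and '#' starting a comment line.
    """
    sources = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split "DATASOURCE=SCOPE LOGFILE=/path" into {"DATASOURCE": ..., "LOGFILE": ...}
        fields = dict(part.split("=", 1) for part in line.split() if "=" in part)
        if "DATASOURCE" in fields and "LOGFILE" in fields:
            sources[fields["DATASOURCE"]] = fields["LOGFILE"]
    return sources


if __name__ == "__main__":
    # Sample content; the paths are illustrative placeholders.
    sample = (
        "DATASOURCE=SCOPE LOGFILE=/var/opt/perf/datafiles/logglob\n"
        "DATASOURCE=DSI_VMSTAT LOGFILE=/tmp/vmstat_log\n"
    )
    for name, logfile in parse_perflbd_rc(sample).items():
        print(name, "->", logfile)
```

A listing like this is a quick way to confirm that every data source you expect a rep_server for is actually configured.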

16.2.4.1 Data Source Log File Types

There are several data source LOGFILE types supported. The contents of the data source files are defined here for reference:

  • logglob
    Measurements of global system resource utilization metrics. Global records are logged every 5 minutes.

  • logappl
    Measurements of processes grouped into user-defined applications.

  • logproc
    Measurements of selected "interesting" processes. Interesting processes are tracked when they first start up, end, or exceed a user-defined threshold for CPU use. Process records are written every 60 seconds; every 5 minutes, the records in logproc are summarized according to the definitions in the parameter file and logged into the logappl file.

  • logdev
    Measurements of individual device performance for disks and volume data, summarized every 5 minutes.

  • logtran
    Measurements of transaction data, summarized every 5 minutes. The transaction-tracking concept is covered in Section 16.3.3 of this chapter.

  • logindx
    Instructions on how to access data in other log files.

  • Data Source Integration
    User-defined log file (definition and configuration are covered in Section 16.2.5.2 of this chapter).


16.2.4.2 OVPA Alarm Configuration Example

Contributed by Emil Velez

The example in this section shows the configuration to add to the alarmdef file so that performance messages are sent to the OVO message browser when a metric threshold is violated. A brief explanation is provided with each step.


# MeasureWare format alarmdef file. DO NOT REMOVE THIS LINE!
#
# @(#) sample alarm definitions
#
# Sample alarmdef file
#
# edit any lines in this file as desired..
# First come a few sample alarms that illustrate some of the aspects of
# performance alarming.
# The following alarm, if uncommented, will go off every ten minutes:
#
#alarm GBL_CPU_TOTAL_UTIL > 0 for 10 minutes
#type = "test"
#start
# red alert "Test Alarm starting"
#repeat every 10 minutes
# yellow alert "Test Alarm continuing"
#end
# reset alert "Test Alarm ending"
#
# The following application alarm shows the use of the EXEC statement to
# execute the local action of mailing a message. Normally, if the "Other"
# application is using too much cpu, you should determine which processes
# are causing this activity and then tune your parm file so that this
# workload is bucketed into one of the application groups appropriate for
# your environment.
#alarm OTHER:APP_CPU_TOTAL_UTIL > 10 for 10 minutes
#start {
# yellow alert "Other application using more than 10 percent of the cpu"
# exec "echo 'other application using > 10% cpu' | mail root"
# }
#end
# reset alert "Other application cpu warning over"
#
# End of sample alarm section.
# Below are the primary CPU, Disk, Memory, and Network Bottleneck alarms.
# For each area, a bottleneck symptom is calculated, and the resulting
# bottleneck probability is used to define yellow or red alerts.
symptom CPU_Bottleneck type=CPU
  rule GBL_CPU_TOTAL_UTIL > 75 prob 25
  rule GBL_CPU_TOTAL_UTIL > 85 prob 25
  rule GBL_CPU_TOTAL_UTIL > 90 prob 25
  rule GBL_PRI_QUEUE > 3 prob 25
alarm CPU_Bottleneck > 50 for 5 minutes
  type = "CPU"
  start
    if CPU_Bottleneck > 90 then
      red alert "CPU Bottleneck probability= ", CPU_Bottleneck, "%"
    else
      yellow alert "CPU Bottleneck probability= ", CPU_Bottleneck, "%"
  repeat every 10 minutes
    if CPU_Bottleneck > 90 then
      red alert "CPU Bottleneck probability= ", CPU_Bottleneck, "%"
    else
      yellow alert "CPU Bottleneck probability= ", CPU_Bottleneck, "%"
  end
    reset alert "End of CPU Bottleneck Alert"

symptom Disk_Bottleneck type=DISK
  rule GBL_DISK_UTIL_PEAK > 50 prob GBL_DISK_UTIL_PEAK
  rule GBL_DISK_SUBSYSTEM_QUEUE > 3 prob 25
alarm Disk_Bottleneck > 50 for 5 minutes
  type = "Disk"
  start
    if Disk_Bottleneck > 90 then
      red alert "Disk Bottleneck probability= ", Disk_Bottleneck, "%"
    else
      yellow alert "Disk Bottleneck probability= ", Disk_Bottleneck, "%"
  repeat every 10 minutes
    if Disk_Bottleneck > 90 then
      red alert "Disk Bottleneck probability= ", Disk_Bottleneck, "%"
    else
      yellow alert "Disk Bottleneck probability= ", Disk_Bottleneck, "%"
  end
    reset alert "End of Disk Bottleneck Alert"

symptom Memory_Bottleneck type=MEMORY
  rule GBL_MEM_QUEUE > 1 prob 20
  rule GBL_MEM_PAGE_REQUEST_RATE > 10 prob 20
  rule GBL_MEM_PAGE_REQUEST_RATE > 40 prob 20
  rule GBL_MEM_PAGEOUT_RATE > 1 prob 20
  rule GBL_MEM_PAGEOUT_RATE > 10 prob 35
  rule GBL_MEM_SWAPOUT_RATE > 1 prob 35
  rule GBL_MEM_SWAPOUT_RATE > 4 prob 50
alarm Memory_Bottleneck > 50 for 5 minutes
  type = "Memory"
  start
    if Memory_Bottleneck > 90 then
      red alert "Memory Bottleneck probability= ", Memory_Bottleneck, "%"
    else
      yellow alert "Memory Bottleneck probability= ", Memory_Bottleneck, "%"
  repeat every 10 minutes
    if Memory_Bottleneck > 90 then
      red alert "Memory Bottleneck probability= ", Memory_Bottleneck, "%"
    else
      yellow alert "Memory Bottleneck probability= ", Memory_Bottleneck, "%"
  end
    reset alert "End of Memory Bottleneck Alert"

symptom Network_Bottleneck type=NETWORK
  rule GBL_NFS_CALL_RATE > 100 prob 25
  rule GBL_NET_COLLISION_1_MIN_RATE > 60 prob 25 # 1 per second
  rule GBL_NET_COLLISION_1_MIN_RATE > 600 prob 25 # 10 per second
  rule GBL_NET_COLLISION_1_MIN_RATE > 3000 prob 25 # 50 per second
  rule GBL_NET_PACKET_RATE > 150 prob 10
  rule GBL_NET_PACKET_RATE > 300 prob 15
  rule GBL_NET_PACKET_RATE > 500 prob 25
  rule GBL_NET_PACKET_RATE > 1000 prob 25
alarm Network_Bottleneck > 50 for 5 minutes
  type = "Network"
  start
    if Network_Bottleneck > 90 then
      red alert "Network Bottleneck probability= ", Network_Bottleneck, "%"
    else
      yellow alert "Network Bottleneck probability= ", Network_Bottleneck, "%"
  repeat every 10 minutes
    if Network_Bottleneck > 90 then
      red alert "Network Bottleneck probability= ", Network_Bottleneck, "%"
    else
      yellow alert "Network Bottleneck probability= ", Network_Bottleneck, "%"
  end
    reset alert "End of Network Bottleneck Alert"

# The following alarm assumes that on a good network, errors are rare:
alarm GBL_NET_ERROR_1_MIN_RATE > 10
  type = "Network"
  start
    red alert "Network error rate is greater than ten per minute"
  end
    reset alert "End of network error rate condition"

# Global swap space utilization alarm:
alarm GBL_SWAP_SPACE_UTIL > 95
  start
    red alert "Global swap space is nearly full"
  end
    reset alert "End of global swap space full condition"

LVOLUME loop
{
  # Outer guard uses the lowest per-volume threshold (70, for /home)
  # so that every inner check can be reached.
  if ( lv_space_util > 70 ) then
  {
    if ( lv_dirname == "/var" ) then
      if ( lv_space_util > 80 ) then
        YELLOW ALERT "/var is greater than 80%, currently at: ", lv_space_util
    if ( lv_dirname == "/opt" ) then
      if ( lv_space_util > 90 ) then
        YELLOW ALERT "/opt is greater than 90%, currently at: ", lv_space_util
    if ( lv_dirname == "/usr" ) then
      if ( lv_space_util > 90 ) then
        YELLOW ALERT "/usr is greater than 90%, currently at: ", lv_space_util
    if ( lv_dirname == "/" ) then
      if ( lv_space_util > 90 ) then
        YELLOW ALERT "/ is greater than 90%, currently at: ", lv_space_util
    if ( lv_dirname == "/home" ) then
      if ( lv_space_util > 70 ) then
        YELLOW ALERT "/home is greater than 70%, currently at: ", lv_space_util
    if ( lv_dirname == "/opt/maestro" ) then
      if ( lv_space_util > 80 ) then
        YELLOW ALERT "/opt/maestro is greater than 80%, currently at: ", lv_space_util
    if ( lv_dirname == "/var/opt/perf/datafiles" ) then
      if ( lv_space_util > 95 ) then
        YELLOW ALERT "/var/opt/perf/datafiles is greater than 95%, currently at: ", lv_space_util
  }
}
INCLUDE "/var/opt/perf/nos/nsmdnt2/alarmdef"
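
The symptom and alarm blocks in the listing combine several rules into one bottleneck probability. The Python sketch below illustrates that arithmetic for the CPU symptom. It assumes that the probabilities of all rules that evaluate true are added together and that the result is capped at 100; both points should be checked against the OVPA User's Guide. The function names are hypothetical, and the "for 5 minutes" duration condition is not modeled.

```python
# (metric name, threshold, probability) for the CPU_Bottleneck symptom above.
CPU_RULES = [
    ("GBL_CPU_TOTAL_UTIL", 75, 25),
    ("GBL_CPU_TOTAL_UTIL", 85, 25),
    ("GBL_CPU_TOTAL_UTIL", 90, 25),
    ("GBL_PRI_QUEUE", 3, 25),
]


def symptom_probability(rules, metrics):
    """Add the probabilities of every rule whose metric exceeds its threshold.

    Assumption: true rules are summed and the total is capped at 100.
    """
    total = sum(prob for name, threshold, prob in rules if metrics[name] > threshold)
    return min(total, 100)


def alert_color(probability):
    """Mirror the alarm block: above 90 is red, above 50 is yellow, else no alert."""
    if probability > 90:
        return "red"
    if probability > 50:
        return "yellow"
    return None


if __name__ == "__main__":
    sample = {"GBL_CPU_TOTAL_UTIL": 88, "GBL_PRI_QUEUE": 4}
    prob = symptom_probability(CPU_RULES, sample)
    print(prob, alert_color(prob))
```

With CPU utilization at 88 percent and a priority queue of 4, three of the four rules are true, so the sketch yields a probability of 75 and a yellow alert. Stacking several thresholds on the same metric is what makes the probability rise in steps as the load grows.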

16.2.4.3 Examples of MeasureWare Extractions



vi report1
REPORT "report 1"
FORMAT ASCII
HEADINGS ON
DATA TYPE GLOBAL
DATE
TIME
GBL_ACTIVE_PROC
GBL_ALIVE_PROC
GBL_COMPLETED_PROC
GBL_CPU_CSWITCH_TIME
GBL_CPU_CSWITCH_UTIL
GBL_DISK_FS_IO
GBL_DISK_FS_IO_RATE
GBL_DISK_FS_READ
GBL_DISK_FS_READ_RATE
GBL_DISK_FS_WRITE
GBL_MEM_PAGEOUT
GBL_MEM_PAGEOUT_RATE
GBL_MEM_PAGE_REQUEST
GBL_MEM_PAGE_REQUEST_RATE
GBL_MEM_QUEUE
GBL_MEM_SWAP
vi report2
REPORT "report 2"
FORMAT ASCII
HEADINGS ON
DATA TYPE PROCESS
DATE
TIME
YEAR
PROC_CPU_CSWITCH_TIME
PROC_CPU_CSWITCH_UTIL
PROC_CPU_INTERRUPT_TIME
PROC_CPU_INTERRUPT_UTIL
PROC_CPU_NICE_TIME
PROC_CPU_NICE_UTIL
PROC_CPU_NORMAL_TIME
PROC_CPU_NORMAL_UTIL
PROC_CPU_REALTIME_TIME
PROC_CPU_REALTIME_UTIL
PROC_CPU_SYSCALL_TIME
PROC_CPU_SYSCALL_UTIL
PROC_CPU_SYS_MODE_TIME
PROC_DISK_FS_IO
PROC_DISK_FS_IO_RATE
PROC_DISK_FS_READ
PROC_PROC_NAME
PROC_RUN_TIME
PROC_SEM_WAIT_PCT
PROC_TTY
PROC_USER_NAME
# extract -xp -fd -Gg -b today-1 -e today -r report1 -f rxlog.txt
# extract -xp -fd -p -b today-1 -e today -r report2 -f rxlog_proc.txt
REPORT "report 3"
FORMAT ASCII
HEADINGS ON
DATA TYPE GLOBAL
DATE
DATE_SECONDS
DAY
TIME
YEAR
GBL_ACTIVE_PROC
GBL_ALIVE_PROC
GBL_COMPLETED_PROC
GBL_CPU_HISTOGRAM
GBL_CPU_IDLE_TIME
GBL_CPU_IDLE_UTIL
GBL_CPU_INTERRUPT_TIME
GBL_CPU_INTERRUPT_UTIL
GBL_CPU_SYS_MODE_TIME
GBL_CPU_SYS_MODE_UTIL
GBL_CPU_TOTAL_TIME
GBL_CPU_TOTAL_UTIL
GBL_CPU_USER_MODE_TIME
GBL_CPU_USER_MODE_UTIL
GBL_DISK_CACHE_READ
GBL_DISK_CACHE_READ_RATE
GBL_DISK_HISTOGRAM
GBL_DISK_LOGL_READ
GBL_DISK_LOGL_READ_RATE
GBL_DISK_PHYS_BYTE
GBL_DISK_PHYS_BYTE_RATE
GBL_DISK_PHYS_IO
GBL_DISK_PHYS_IO_RATE
GBL_DISK_PHYS_READ
GBL_DISK_PHYS_READ_BYTE_RATE
GBL_DISK_PHYS_READ_RATE
GBL_DISK_PHYS_WRITE
GBL_DISK_PHYS_WRITE_BYTE_RATE
GBL_DISK_PHYS_WRITE_RATE
GBL_DISK_TIME_PEAK
GBL_DISK_UTIL_PEAK
GBL_FS_SPACE_UTIL_PEAK
GBL_MEM_CACHE_HIT_PCT
GBL_MEM_FREE_UTIL
GBL_MEM_PAGEOUT_RATE
GBL_MEM_PAGE_REQUEST
GBL_MEM_PAGE_REQUEST_RATE
GBL_MEM_SYS_AND_CACHE_UTIL
GBL_MEM_USER_UTIL
GBL_MEM_UTIL
GBL_NET_IN_PACKET
GBL_NET_IN_PACKET_RATE
GBL_NET_OUT_PACKET
GBL_NET_OUT_PACKET_RATE
GBL_NET_PACKET_RATE
GBL_NUM_NETWORK
GBL_PROC_RUN_TIME
GBL_PROC_SAMPLE
GBL_RUN_QUEUE
GBL_STARTED_PROC
GBL_SWAP_SPACE_UTIL
GBL_SYSCALL_RATE
GBL_WEB_CACHE_HIT_PCT
GBL_WEB_CGI_REQUEST_RATE
GBL_WEB_CONNECTION_RATE
GBL_WEB_FILES_RECEIVED_RATE
GBL_WEB_FILES_SENT_RATE
GBL_WEB_FTP_READ_BYTE_RATE
GBL_WEB_FTP_WRITE_BYTE_RATE
GBL_WEB_GET_REQUEST_RATE
GBL_WEB_GOPHER_READ_BYTE_RATE
GBL_WEB_GOPHER_WRITE_BYTE_RATE
GBL_WEB_HEAD_REQUEST_RATE
GBL_WEB_HTTP_READ_BYTE_RATE
GBL_WEB_HTTP_WRITE_BYTE_RATE
GBL_WEB_ISAPI_REQUEST_RATE
GBL_WEB_LOGON_FAILURES
GBL_WEB_NOT_FOUND_ERRORS
GBL_WEB_OTHER_REQUEST_RATE
GBL_WEB_POST_REQUEST_RATE
REPORT "report 4"
FORMAT ASCII
HEADINGS ON
DATA TYPE PROCESS
DATE
TIME
YEAR
PROC_PROC_NAME
PROC_APP_ID
PROC_CPU_SYS_MODE_TIME
PROC_CPU_SYS_MODE_UTIL
PROC_CPU_TOTAL_TIME
PROC_CPU_TOTAL_TIME_CUM
PROC_CPU_TOTAL_UTIL
PROC_CPU_TOTAL_UTIL_CUM
PROC_CPU_USER_MODE_TIME
PROC_CPU_USER_MODE_UTIL
PROC_INTEREST
PROC_INTERVAL_ALIVE
PROC_MEM_RES
PROC_MEM_VIRT
PROC_MINOR_FAULT
PROC_PRI
PROC_PROC_ID
PROC_RUN_TIME
# extract -xp -fd -Gg -b today-1 -e today -r report3.txt -f rxlog.txt
# extract -xp -fd -p -b today-1 -e today -r report4.txt -f rxlog_proc.txt
Examples of running OVPM from the command line:
"c:\Program Files\HP Openview\HPOV_IOPS\cgi-bin\analyzer.exe"
-GRAPHTEMPLATE: CODA "CPU Summary" -SYSTEMNAME: r204c30 -GRAPHTYPE: TSV
"c:\Program Files\HP Openview\HPOV_IOPS\cgi-bin\analyzer.exe"
-GRAPHTEMPLATE: CODA "CPU Summary" -SYSTEMNAME: r204c30
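
The report templates in this section all share the same four header lines followed by one metric name per line, so they are easy to generate rather than type by hand. The following Python sketch shows one way to do that. The function name build_report_template is hypothetical, and the sketch only reproduces the layout shown in the listings; it does not validate metric names against the agent.

```python
def build_report_template(title, data_type, metrics):
    """Return the text of an extract report template.

    Layout follows the listings above: REPORT, FORMAT, HEADINGS and
    DATA TYPE lines, then one metric name per line.
    """
    lines = [
        'REPORT "{}"'.format(title),
        "FORMAT ASCII",
        "HEADINGS ON",
        "DATA TYPE {}".format(data_type),
    ]
    lines.extend(metrics)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = build_report_template(
        "report 1",
        "GLOBAL",
        ["DATE", "TIME", "GBL_ACTIVE_PROC", "GBL_MEM_QUEUE"],
    )
    # Write the template where the extract command's -r option can find it.
    with open("report1", "w") as handle:
        handle.write(text)
    print(text, end="")
```

The generated file can then be passed to extract with the -r option, as in the commands shown earlier in the listing.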

16.2.4.4 Check the OVPA Message Interface to OVO

If OVPA is installed on a managed node where OVO agents are installed, OVPA automatically sends alarms to OVO. If there is no OVO agent on the system, disable the OVO message setup. OVPA can also send SNMP traps to NNM (agsysdb -add hostname). This is configured in the alarmgen target system database. Check the configuration with the following command:

/opt/perf/bin/agsysdb -l (on HP-UX), /usr/lpp/perf/bin/agsysdb -l (on AIX), and c:\rpmtools\bin\agsysdb -l (on Windows).

The output from the command will look similar to the following:



# /opt/perf/bin/agsysdb -l
MeasureWare alarming status:
SystemDB Version :
ITO messages : on Last Error : none
Exec Actions : on

There is more detailed information on the use of this command in the man pages or in the OVPA User's Guide.

16.2.5 Data Source Integration (DSI)


Use the DSI component to implement user-defined data sources. For example, you may want to extract the vmstat data every 20 seconds for the User, System, and Idle statistics. The OVPA installation includes the components to check, analyze, and extract the DSI data. SPIs use DSI as a method of collecting application data.

The example in Section 16.2.5.1 demonstrates the steps required to configure a new data source that will send a message to the OVO message browser if a metric threshold is violated. A brief explanation is provided with each step.

16.2.5.1 Data Source Integration Example

The process to implement a DSI log includes the following steps:

Create the Class Specification file



# vi /tmp/vmstat.spec
CLASS VMSTATS = 10001;
METRICS
USER_CPU = 101
LABEL "USER_CPU";
SYSTEM_CPU = 102
LABEL "SYSTEM_CPU";
IDLE_CPU = 103
LABEL "IDLE_CPU";

Compile the Class Specification file, and create the logfile set (three new files in the current directory).



# sdlcomp /tmp/vmstat.spec /tmp/vmstat.log
sdlcomp
Check class specification syntax.
CLASS VMSTATS = 10001;
METRICS
USER_CPU = 101
LABEL "USER_CPU";
SYSTEM_CPU = 102
LABEL "SYSTEM_CPU";
IDLE_CPU = 103
LABEL "IDLE_CPU";
NOTE: Time stamp inserted as first metric by default.
Syntax check successful.
Update SDL vmstat_log.
Shared memory id used by vmstat_log : 9
Class VMSTATS successfully added to logfile set.
# ls vmstat.log*
vmstat.log vmstat.log.VMSTATS
vmstat.log.desc

Create a format file.



# vi /tmp/vmstat.fmt
$numeric $numeric $numeric $numeric $numeric
$numeric $numeric $numeric $numeric $numeric
$numeric $numeric $numeric $numeric $numeric
USER_CPU SYSTEM_CPU IDLE_CPU

Note

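Each $numeric entry skips one field, so this format file discards the first 15 fields of the vmstat output and logs the last three (us, sy, and id) as USER_CPU, SYSTEM_CPU, and IDLE_CPU.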
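
The count of 15 skipped fields follows from the vmstat column layout: the default output has 18 columns and the three CPU columns come last. The Python sketch below derives the format file from that layout. It is an illustration under stated assumptions: the column list is the default (no -S) HP-UX vmstat header shown in this section, and the function name build_format_file is hypothetical.

```python
# Default vmstat columns (without -S), as shown in this section.
VMSTAT_COLUMNS = "r b w avm free re at pi po fr de sr in sy cs us sy id".split()


def build_format_file(skip_fields, metric_names, per_line=5):
    """Build dsilog format-file text.

    Emits one $numeric placeholder per skipped field (wrapped at
    per_line entries for readability), then the metric names to log.
    """
    skips = ["$numeric"] * skip_fields
    lines = [
        " ".join(skips[i:i + per_line]) for i in range(0, len(skips), per_line)
    ]
    lines.append(" ".join(metric_names))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    metrics = ["USER_CPU", "SYSTEM_CPU", "IDLE_CPU"]
    # The metrics map to the last three columns, so skip everything before them.
    skip = len(VMSTAT_COLUMNS) - len(metrics)
    print(build_format_file(skip, metrics), end="")
```

If vmstat is run with options that change the number of columns, the skip count has to change with it, which is the main reason to compute it instead of hard-coding 15.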

Table 16-1 shows the vmstat output field descriptions.

Table 16-1. vmstat Command Field Descriptions

Primary Field   Secondary Field   Description

procs                             Information about numbers of processes in various states.
                r                 In run queue
                b                 Blocked for resources (I/O, paging, and so on)
                w                 Runnable or short sleeper (< 20 secs) but swapped

memory                            Information about the usage of virtual and real memory. Virtual pages are considered active if they belong to processes that are running or have run in the last 20 seconds.
                avm               Active virtual pages
                free              Size of the free list

page                              Information about page faults and paging activity. These are averaged each five seconds, and given in units per second.
                re                Page reclaims (without -S)
                at                Address translation faults (without -S)
                si                Processes swapped in (with -S)
                so                Processes swapped out (with -S)
                pi                Pages paged in
                po                Pages paged out
                fr                Pages freed per second
                de                Anticipated short-term memory shortfall
                sr                Pages scanned by clock algorithm, per second

faults                            Trap/interrupt rate averages per second over last 5 seconds.
                in                Device interrupts per second (nonclock)
                sy                System calls per second
                cs                CPU context switch rate (switches/sec)

cpu                               Breakdown of percentage usage of CPU time for the active processors
                us                User time for normal and low priority processes
                sy                System time
                id                CPU idle

The vmstat command Column Descriptions (Alternate format)

The column headings and the meaning of each column are:


  1. procs: Information about numbers of processes in various states.

    r     In run queue
    b     Blocked for resources (I/O, paging, etc.)
    w     Runnable or short sleeper (< 20 secs) but swapped

    memory: Information about the usage of virtual and real memory. Virtual pages are considered active if they belong to processes that are running or have run in the last 20 seconds.

    avm   Active virtual pages
    free  Size of the free list

    page: Information about page faults and paging activity. These are averaged each five seconds, and given in units per second.

    re    Page reclaims (without -S)
    at    Address translation faults (without -S)
    si    Processes swapped in (with -S)
    so    Processes swapped out (with -S)
    pi    Pages paged in
    po    Pages paged out
    fr    Pages freed per second
    de    Anticipated short term memory shortfall
    sr    Pages scanned by clock algorithm, per second

    faults: Trap/interrupt rate averages per second over last 5 seconds.

    in    Device interrupts per second (nonclock)
    sy    System calls per second
    cs    CPU context switch rate (switches/sec)

    cpu: Breakdown of percentage usage of CPU time for the active processors

    us    User time for normal and low priority processes
    sy    System time
    id    CPU idle

    # vmstat
    procs        memory                page               faults       cpu
    r b w     avm   free   re at pi po fr de sr    in    sy  cs   us sy id
    1 0 0  230390  20390    8  4  0  0  0  0  2   407  1111 158    1  0 99

  2. Test the dsilog process:



    # vmstat 20|dsilog /tmp/vmstat.log VMSTATS -f /tmp/vmstat.fmt -vo
    I: 1003415064 0 0 0 10594 1913 0 0 0 0 0 0 0 110 211 37 4.0000 2.0000 95.0000
    I: 1003415064 0 0 0 8415 1579 0 0 0 0 0 0 0 108 144 32 2.0000 1.0000 96.0000
    I: 1003415084 0 0 0 10212 1593 0 0 0 0 0 0 0 107 157 37 0.0000 1.0000 99.0000
    interval marker
    L: 1003414800 2.0000 1.3330 96.6660
    Notes:
    I: shows incoming data
    L: shows the actual data to be logged

  3. Start the dsilog logging process:



    # vmstat 20|dsilog /tmp/vmstat.log VMSTATS -f /tmp/vmstat.fmt &

  4. View the collected DSI data:



    extract -xp -l /var/opt/perf/vmstat_log -C VMSTATS detail -H -fd -b first

    Make the DSI a permanent data source by adding a DATASOURCE entry for it to the /var/opt/perf/perflbd.rc file:



    DATASOURCE=SCOPE LOGFILE=/var/opt/perf/datafiles/logglob
    DATASOURCE=DSI_VMSTAT LOGFILE=/tmp/vmstat_log

  5. Define alarms on DSI data in the /var/opt/perf/alarmdef file:



    # vi /var/opt/perf/alarmdef (partial listing)
    #######DSILOG
    alarm DSI_VMSTAT:VMSTATS:USER_CPU>30 for 10 minutes
    start
    critical alert "User CPU exceeded threshold"
    repeat every 15 minutes
    critical alert "User CPU exceeded threshold after 15 minutes"
    end
    reset alert "The User CPU Alert is over"

  6. Customize graphs in OV Performance Manager:

Refer to the OpenView Performance Manager Documentation for specific implementation and customization details.
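
The dsilog test run in step 2 shows three incoming records (I:) being reduced to one logged record (L:). The logged values are consistent with a simple average of the incoming samples, which the Python sketch below reproduces. Treat this as an observation about that one sample run rather than a statement of how dsilog summarizes in every configuration; the function name summarize is hypothetical.

```python
# USER_CPU, SYSTEM_CPU, IDLE_CPU from the three "I:" records in the test run.
SAMPLES = [
    (4.0, 2.0, 95.0),
    (2.0, 1.0, 96.0),
    (0.0, 1.0, 99.0),
]


def summarize(samples):
    """Average each metric column across the samples in one interval."""
    count = len(samples)
    return tuple(sum(column) / count for column in zip(*samples))


if __name__ == "__main__":
    user, system, idle = summarize(SAMPLES)
    print("L: {:.4f} {:.4f} {:.4f}".format(user, system, idle))
```

The averages come out at roughly 2.0, 1.333, and 96.667, which matches the logged record in the test output to within the last displayed digit.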

16.2.5.2 Definition of Commands and Terms

  • DSI
    Provides the ability to collect, log, correlate, and summarize data from a variety of sources. Common DSI terms and definitions are provided here for reference.

  • sdlcomp
    Tool that creates the DSI log file set (vmstat.log, VMSTAT_log) by reading a specification file.

  • Class Specification File (ASCII)
    Describes the data that is collected using DSI.

  • Class Specification File CLASS
    Defines a group of metrics (USER_CPU, SYSTEM_CPU, and IDLE_CPU) and how they are collected. For example, the CLASS name VMSTATS is followed by a class ID that is used internally by DSI, and each entry under METRICS is assigned a unique name and number. Each metric description is terminated with a semicolon.

  • Class Specification File LABEL
    Identifies the set of metrics defined by the class.

  • Format File
    Determines what data fields will appear in the final data record and excludes unnecessary information (column headings and data fields). The example format file vmstat.fmt is located in the /tmp directory along with the specification file.

  • Data Feed Process (dsilog)
    Runs continuously in background mode, sending application output to the DSI log file (/tmp/vmstat.log). The vmstat example shows vmstat (with its command-line parameters) sending data through a UNIX pipe to the dsilog command. The dsilog command-line parameters include the name of the logfile set, the CLASS name, and the format file. To check the setup before logging, run dsilog with the -vo command-line option, which sends the data only to standard output and not to the actual DSI log file.

  • Preview the data (extract)
    Views the data written to the DSI log file via the extract command and writes it to an ASCII output file with the name xfrdCLASS.asc.


16.2.6 OVPA Interface with Other Programs


The Database Smart Plug-In (DB SPI) is one example of a SPI that incorporates data collection capabilities and integrates with OVPM (for graphing and analysis) using the DSI features of OVPA. Installing the DB SPI inserts new entries in the parm file to define the instances of the database as a new application class.

16.2.7 OVPA Commands and Files


  • /opt/perf/bin/mwa status
    Checks the OVPA status.

  • /opt/perf/bin/mwa stop
    Stops OVPA.

  • /opt/perf/bin/midaemon -T
    Stops midaemon (also stops active Glance sessions; Glance is described later in this chapter).

  • /opt/perf/bin/mwa start
    Starts OVPA processes, including midaemon and scopeux.

  • /opt/perf/bin/perfstat -v
    Checks the version and status of the OVPA environment.

  • /opt/perf/bin/ttd -k
    Stops the transaction tracker daemon (refer to the previous section for process description).

  • Parm
    Contains parameters that are used to define applications and processes.

  • Alarmdef
    Defines the conditions that generate alarms.

  • /var/opt/perf/perflbd.rc
    Contains the startup and shutdown commands for the repository servers for each data source that has been configured.

  • /var/opt/perf/status.scope
    Status and error log for scopeux.


The following status files contain diagnostic information from the process environment. The default file size is 1MB, and if the file grows past the limit it is renamed status.filename.old. Use these files to troubleshoot problems that may arise with the processes that generate the files:

  • /var/opt/perf/status.alarmgen

  • /var/opt/perf/status.perflbd

  • /var/opt/perf/status.rep_server

  • /var/opt/perf/status.ttd

  • /var/opt/perf/status.mi
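
Because each status file is renamed once it passes the size limit, it can help to know which files are close to rolling over before you start troubleshooting. The Python sketch below flags status files near the limit. It is a minimal illustration: the 1 MB limit is taken as 1024 * 1024 bytes (an assumption), the 90 percent warning fraction is arbitrary, and the function name near_limit is hypothetical.

```python
import os

# Status files named in this section.
STATUS_FILES = [
    "/var/opt/perf/status.scope",
    "/var/opt/perf/status.alarmgen",
    "/var/opt/perf/status.perflbd",
    "/var/opt/perf/status.rep_server",
    "/var/opt/perf/status.ttd",
    "/var/opt/perf/status.mi",
]

# Assumption: the documented 1MB default means 1024 * 1024 bytes.
SIZE_LIMIT = 1024 * 1024


def near_limit(paths, limit=SIZE_LIMIT, fraction=0.9, getsize=os.path.getsize):
    """Return (path, size) pairs for files at or above fraction * limit.

    Files that do not exist or cannot be read are skipped.
    """
    flagged = []
    for path in paths:
        try:
            size = getsize(path)
        except OSError:
            continue
        if size >= fraction * limit:
            flagged.append((path, size))
    return flagged


if __name__ == "__main__":
    for path, size in near_limit(STATUS_FILES):
        print("{} is {} bytes and will be renamed soon".format(path, size))
```

The getsize parameter exists only so the check can be exercised without real files; in normal use the default is sufficient.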


16.2.8 OVPA 4.x


OVPA 4.x is functionally the same as OVPA 3.x. Originally developed for the Linux platform, OVPA 4.x replaces the DCE-RPC-based components and uses OVOA (coda) and the HTTPS-based daemon (ovbbccb) for data collection and communications. The OVOA replaces the functionality of the perflbd and rep_server daemons. The perflbd.rc file is replaced by a datasources file, and the alarmgen process is replaced by the perfalarm daemon. Use the ovpa command (instead of mwa) to check the OVPA status. The major components of OVPA 4.x are shown in Figure 16-9.

Figure 16-9. The OVPA 4.x core component for data gathering is coda (OVOA).


The following status files contain diagnostic information from the process environment. The default file size is 1MB, and if the file grows past the limit it is renamed status.filename.old. Use these files to troubleshoot problems that may arise with the processes that generate the files:

/var/opt/perf/status.scope

/var/opt/perf/status.perfalarm

/var/opt/perf/status.mi

/var/opt/perf/status.ttd

/var/opt/OV/log/coda.log


Metric definitions for OVPA 4.x are available at: http://ovweb.external.hp.com/ovnsmdps/pdf/metlinux. Installation guides, release notes, user guides, and other documentation are available at the OpenView documentation web site: http://ovweb.external.hp.com/lpe/doc_serv/.

Note

OVPA 4.x may be changed, upgraded or released by HP for other platforms in the future. Check the OpenView web site for the most up to date product information.

16.2.9 Examples Directory


Example configuration files are located in the directory /opt/perf/examples. The directory includes sample configuration and alarm definition and README files.

16.2.10 Available Metrics


There are over 1000 metrics available for collection on any given system. You can see all the metrics available system-wide with a tool like Glance. OVPA collects a subset of about 500 metrics on the HP-UX platforms. The OVPA metrics are defined in the text document /opt/perf/paperdocs/mwa/C/methp.txt.
