Starting Your Monitoring Routine
Setting up a monitoring routine consists of several steps: creating a basic monitoring configuration (sometimes called "overview" settings), testing the limits of acceptable performance under various conditions, and establishing a baseline. The following sections describe these steps.
Your Minimum Monitoring Configuration
The minimum performance objects to monitor are those corresponding to the main hardware resources of your system: memory, processors, disks, and network components. Table 27.7 lists the appropriate counters and the categories of information they provide.
Table 27.7 Monitoring the Minimum Objects
Component | Performance Aspect Being Monitored | Counters to Monitor |
---|---|---|
Disk | Usage | LogicalDisk\% Free Space; LogicalDisk\% Disk Time; PhysicalDisk\Disk Reads/sec; PhysicalDisk\Disk Writes/sec. Use diskperf -y to enable disk counters and diskperf -n to disable them. To specify the type of counters you want to activate, include d for physical disk drives and v for logical disk drives or storage volumes. When the operating system starts up, it automatically sets the diskperf command with the -yd switch to activate physical disk counters. Type diskperf -yv to activate logical disk counters. For more information about using the diskperf command, type diskperf -? at the command prompt. The % Disk Time counter must be interpreted carefully. Because the _Total instance of this counter might not accurately reflect utilization on multiple-disk systems, it is important to use the % Idle Time counter as well. Note that these counters cannot display a value exceeding 100 percent. For more information about disk performance counters, see "Examining and Tuning Disk Performance" in this book. |
Disk | Bottlenecks | LogicalDisk\Avg. Disk Queue Length; PhysicalDisk\Avg. Disk Queue Length (all instances) |
Memory | Usage | Memory\Available Bytes; Memory\Cache Bytes. You can also use Memory\Committed Bytes and Memory\Commit Limit to detect problems with virtual memory. |
Memory | Bottlenecks or leaks | Memory\Pages/sec; Memory\Page Faults/sec; Memory\Pages Input/sec; Memory\Page Reads/sec; Memory\Transition Faults/sec; Memory\Pool Paged Bytes; Memory\Pool Nonpaged Bytes. Although not specifically Memory object counters, the following are also useful for memory analysis: Paging File\% Usage (all instances); Cache\Data Map Hits %; Server\Pool Paged Bytes and Server\Pool Nonpaged Bytes. |
Network | Usage | Network Segment\% Net Utilization. Note that you need to install the Network Packet Protocol driver for Network Monitor in order to use this counter. |
Network | Throughput | Protocol transmission counters (varies with networking protocol); for TCP/IP: Network Interface\Bytes total/sec; Network Interface\Packets/sec; Server\Bytes Total/sec, or Server\Bytes Sent/sec and Server\Bytes Received/sec. You might want to monitor other objects for network and server throughput, as described in "Monitoring Network Performance" in the Server Operations Guide. |
Processor | Usage | Processor\% Processor Time (all instances) |
Processor | Bottlenecks | System\Processor Queue Length (all instances); Processor\Interrupts/sec; System\Context switches/sec |
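Counters such as those in Table 27.7 are commonly written as counter paths of the form \Object(instance)\Counter. The following Python sketch builds such paths for a subset of the table. The MINIMUM_COUNTERS mapping, the counter_paths helper, and the choice of the * wildcard for "all instances" are illustrative assumptions, not part of the original text.

```python
# Illustrative subset of Table 27.7: object -> (instance or None, counters).
# Objects without instances (Memory, System) use None.
MINIMUM_COUNTERS = {
    "LogicalDisk": ("*", ["% Free Space", "% Disk Time", "Avg. Disk Queue Length"]),
    "PhysicalDisk": ("*", ["Disk Reads/sec", "Disk Writes/sec", "% Idle Time"]),
    "Memory": (None, ["Available Bytes", "Pages/sec"]),
    "Processor": ("*", ["% Processor Time", "Interrupts/sec"]),
    "System": (None, ["Processor Queue Length", "Context switches/sec"]),
}


def counter_paths(counters):
    """Return counter paths in \\Object(instance)\\Counter form."""
    paths = []
    for obj, (instance, names) in counters.items():
        # Add the (instance) part only for objects that have instances.
        prefix = f"\\{obj}({instance})" if instance else f"\\{obj}"
        paths.extend(f"{prefix}\\{name}" for name in names)
    return paths


if __name__ == "__main__":
    for path in counter_paths(MINIMUM_COUNTERS):
        print(path)
```

Written one per line to a text file, a list like this could serve as input to a logging tool that accepts a counter file, for example the -cf option of typeperf on Windows versions that include that command.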
If you want to test the limits of your system as part of establishing a baseline, monitor the recommended counters during the following activities:
Adding base services
Adding connections
Running network applications
Opening a file
Printing a file
Copying or writing to a file
Accessing a database
Sending a message
After becoming familiar with System Monitor and the process of configuring graphs and logs, you are ready to incorporate monitoring into your daily routine of system administration. Routine monitoring over periods ranging from days to weeks to months allows you to establish a baseline for system performance.
A baseline is a measurement that is derived from the collection of data over an extended period during varying but typical types of workloads and user connections. The baseline is an indicator of how individual system resources or a group of resources are used during periods of normal activity.
When determining your baseline, it is important to know the types of work being done and the days and times when the work is being done. That knowledge helps you associate work with resource usage and judge whether performance during those intervals is reasonable.
For example, if you find that performance diminishes somewhat for a brief period at a given time of day, and you find that at that time many users are logging on or off, it might be an acceptable slowdown. Similarly, if you find that performance is poor every evening at a certain time and you can tell that time coincides with nightly backups when no users are logged on to the system, again that performance loss might be acceptable. But you can make that determination only when you know the degree of performance loss and its cause.
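One way to associate slowdowns with known work is to group logged samples by hour of day and compare the busy hours against a schedule of expected activity. The sketch below assumes the samples are (timestamp, % Processor Time) pairs exported from a log. The KNOWN_ACTIVITY schedule, the 80 percent threshold, and the function names are hypothetical values chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical schedule: hour of day -> activity expected to load the system.
KNOWN_ACTIVITY = {8: "morning logons", 23: "nightly backup"}


def hourly_averages(samples):
    """Average the sampled values for each hour of the day."""
    by_hour = defaultdict(list)
    for timestamp, value in samples:
        by_hour[timestamp.hour].append(value)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


def explain_busy_hours(samples, threshold=80.0):
    """Map each hour whose average exceeds the threshold to a known cause, if any."""
    return {
        hour: KNOWN_ACTIVITY.get(hour, "unexplained")
        for hour, avg in hourly_averages(samples).items()
        if avg > threshold
    }


if __name__ == "__main__":
    samples = [
        (datetime(2000, 3, 6, 8, 5), 92.0),
        (datetime(2000, 3, 6, 8, 35), 88.0),
        (datetime(2000, 3, 6, 14, 0), 35.0),
        (datetime(2000, 3, 6, 15, 10), 95.0),
        (datetime(2000, 3, 6, 23, 15), 97.0),
    ]
    print(explain_busy_hours(samples))
```

Hours reported as "unexplained" are the ones worth investigating. Hours matched to a known activity still require a judgment about whether the degree of performance loss is acceptable.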
When you have built up data on performance over a period, with data reflecting periods of low, average, and peak usage, you can make a subjective determination of what constitutes acceptable performance for your system. That determination is your baseline. Use your baseline to detect when bottlenecks are developing or to watch for long-term changes in usage patterns that require you to increase capacity.
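Although the baseline is ultimately a subjective determination, summary statistics can support it. The following sketch summarizes historical samples of one counter as low, average, and peak levels, then checks whether recent readings stay above the peak level. The 5th and 95th percentiles, the min_fraction parameter, and the function names are assumptions made for this example, not values prescribed by the text.

```python
from statistics import mean, quantiles


def build_baseline(values):
    """Summarize historical samples as low, average, and peak levels."""
    # n=20 yields 19 cut points: the 5th, 10th, ... 95th percentiles.
    cuts = quantiles(values, n=20)
    return {"low": cuts[0], "average": mean(values), "peak": cuts[-1]}


def exceeds_baseline(baseline, recent, min_fraction=0.5):
    """Return True if at least min_fraction of recent samples exceed the peak level."""
    if not recent:
        return False
    above = sum(1 for value in recent if value > baseline["peak"])
    return above / len(recent) >= min_fraction


if __name__ == "__main__":
    history = list(range(1, 101))  # stand-in for weeks of logged counter values
    baseline = build_baseline(history)
    print(baseline)
    print(exceeds_baseline(baseline, [97, 98, 50, 99]))
```

Requiring a sustained fraction of readings above the peak level, rather than reacting to a single spike, reflects the idea that brief, explainable slowdowns can be acceptable while a persistent shift suggests a developing bottleneck or a need for more capacity.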