Working with Disk Counters

Windows 2000 includes counters that monitor the activity of physical disks (including removable media drives) and logical volumes. The PhysicalDisk object provides counters that report physical-disk activity; the LogicalDisk object provides counters that report statistics for logical disks and storage volumes. These counters measure disk throughput, queue length, usage, and other data. The interrelationships between different aspects of disk performance make it useful to monitor them simultaneously. The operating system enables a driver called Diskperf.sys to activate the disk monitoring counters. By default, the operating system activates only the PhysicalDisk performance counters. Users must activate the LogicalDisk counters manually using the diskperf command. See the following procedure for activating disk counters with the diskperf command.

To use the diskperf command to enable LogicalDisk object counters

At the command prompt, type diskperf -yv

The diskperf command takes the following syntax:

diskperf [-y[d|v] | -n[d|v]] [computer_name]

Use -y to enable counters and -n to disable counters. To specify the type of counters, include d for physical disk drives or v for logical disk drives or storage volumes. When the operating system starts up, it automatically sets the diskperf command with the -yd switch to activate physical disk counters. For more information about using the diskperf command, type diskperf -? at the command prompt.
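The switch combinations described above can also be assembled in a script. The following Python sketch is only an illustration: the helper name diskperf_args is hypothetical, and it builds the argument list from the syntax shown in this section without running the command.

```python
def diskperf_args(enable, kind="", computer_name=None):
    """Build a diskperf argument list from the switches described above.

    enable: True for -y (enable counters), False for -n (disable counters).
    kind: "d" for physical disk drives, "v" for logical disks or storage
          volumes, or "" to omit the type.
    computer_name: optional target computer, passed through unchanged.
    """
    if kind not in ("", "d", "v"):
        raise ValueError("kind must be '', 'd', or 'v'")
    args = ["diskperf", ("-y" if enable else "-n") + kind]
    if computer_name:
        args.append(computer_name)
    return args


# Enable the LogicalDisk counters, as in the procedure above.
print(" ".join(diskperf_args(True, "v")))
```

The resulting list could be handed to a process launcher on the target system; that step is left out here because its behavior depends on the environment.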

The PhysicalDisk object counters provide data on activity for each of the physical disks in your system; the LogicalDisk object counters provide data on logical volumes in your system. The System Monitor user interface identifies physical disks by number starting with 0. If you are monitoring logical disks, it identifies these by drive letter. For logical disks consisting of multiple physical disks, the disk instances might appear as disk 0 C and disk 1 C, where logical drive C: consists of physical drives 0 and 1.

When monitoring logical volumes, remember that they might share a physical disk and your data might reflect contention between them. If you have a spanned volume or striped volume with disk controllers that support hardware-enabled redundant array of independent disks (RAID) volumes, the counters report physical disk data for all disks in the stripe as if they are a single disk. Software-enabled RAID-5 volumes are available only on computers running Microsoft® Windows® 2000 Server.

Use the counters described in Table 30.1 to measure disk space, disk throughput, and disk utilization.

Table 30.1 Performance Objects and Counters for Disk Monitoring

Counter	Description
LogicalDisk % Free Space	Reports the ratio of unallocated disk space to the total usable space on the logical volume, as a percentage. When calculating the _Total instance, the % Free Space counters recalculate the sum as a percentage for each disk. There is no % Free Space counter for the PhysicalDisk object.
LogicalDisk|PhysicalDisk Avg. Disk Bytes/Transfer	Measures the size of input/output (I/O) operations. The disk is efficient if it transfers large amounts of data relatively quickly. Watch this counter when measuring maximum throughput. To analyze transfer data further, use Avg. Disk Bytes/Read and Avg. Disk Bytes/Write.
LogicalDisk|PhysicalDisk Avg. Disk sec/Transfer	Indicates how fast data is being moved (in seconds). Measures the average time of each data transfer, regardless of the number of bytes read or written. Shows the total time of the read or write, from the moment it leaves the Diskperf.sys driver to the moment it is complete. A high value for this counter might mean that the system is retrying requests due to lengthy queuing or, less commonly, disk failures. To analyze transfer data further, use Avg. Disk sec/Read and Avg. Disk sec/Write.
LogicalDisk|PhysicalDisk Avg. Disk Queue Length	Tracks the number of requests that are queued and waiting for a disk during the sample interval, as well as requests in service. As a result, this might overstate activity. If more than two requests are continuously waiting on a single-disk system, the disk might be a bottleneck. To analyze queue length data further, use Avg. Disk Read Queue Length and Avg. Disk Write Queue Length.
LogicalDisk|PhysicalDisk Current Disk Queue Length	Indicates the number of disk requests that are currently waiting as well as requests currently being serviced. Subject to wide variations unless the workload has achieved a steady state and you have collected a sufficient number of samples to establish a pattern. This is an instantaneous value or snapshot of the current queue length, unlike Avg. Disk Queue Length, Avg. Disk Read Queue Length, and Avg. Disk Write Queue Length, which report averages.
LogicalDisk|PhysicalDisk Disk Bytes/sec	Indicates the rate at which bytes are transferred and is the primary measure of disk throughput. To analyze transfer data based on reads and writes, use Disk Read Bytes/sec and Disk Write Bytes/sec, respectively.
LogicalDisk|PhysicalDisk Disk Transfers/sec	Indicates the number of reads and writes completed per second, regardless of how much data they involve. Measures disk utilization. If the value exceeds 50 (per physical disk in the case of a striped volume), a bottleneck might be developing. To analyze transfer data based on reads and writes, use Disk Reads/sec and Disk Writes/sec, respectively.
LogicalDisk Free Megabytes	Reports the amount of unallocated space on the disk, in megabytes. There is no Free Megabytes counter for the PhysicalDisk object.
LogicalDisk|PhysicalDisk Split IO/sec	Reports the rate at which the operating system divides I/O requests to the disk into multiple requests. A split I/O request might occur if the program requests data in a size that is too large to fit into a single request or if the disk is fragmented. Factors that influence the size of an I/O request can include application design, the file system, or drivers. A high rate of split I/O might not, in itself, represent a problem. However, on single-disk systems, a high rate for this counter tends to indicate disk fragmentation.
LogicalDisk|PhysicalDisk % Disk Time	Reports the percentage of time that the selected disk drive is busy servicing read or write requests. Because this counter's data can span more than one sample, and consequently overstate disk utilization, compare this value against % Idle Time for a more accurate picture. By default this counter cannot exceed 100 percent; however, you can reset the registry to allow System Monitor to display percentages exceeding 100 percent if appropriate. For information about this adjustment and other aspects of performance data collection and reporting, see "Performance Objects" in "Overview of Performance Monitoring" in this book.
LogicalDisk|PhysicalDisk % Disk Write Time	Reports the percentage of time that the selected disk drive is busy servicing write requests.
LogicalDisk|PhysicalDisk % Disk Read Time	Reports the percentage of time that the selected disk drive is busy servicing read requests.
LogicalDisk|PhysicalDisk % Idle Time	Reports the percentage of time that the disk system was not processing requests and no work was queued. Notice that this counter, when added to % Disk Time, might not equal 100 percent, because % Disk Time can exaggerate disk utilization.
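
The rules of thumb in Table 30.1 can be applied to counter samples after they have been logged. The following Python sketch assumes the samples have already been exported as plain numbers; the function name and parameters are hypothetical, and the thresholds (more than two waiting requests, more than 50 transfers per second per physical disk) are the ones quoted in the table, not universal limits.

```python
def disk_warnings(queue_samples, transfers_per_sec, disks_in_stripe=1):
    """Return warnings based on the rules of thumb quoted in Table 30.1."""
    warnings = []
    # The table flags a queue that stays above two requests continuously
    # (stated for a single-disk system), so every sample must exceed 2.
    if queue_samples and all(q > 2 for q in queue_samples):
        warnings.append("queue length continuously above 2")
    # The 50 transfers/sec guideline applies per physical disk in a stripe.
    if transfers_per_sec / disks_in_stripe > 50:
        warnings.append("transfers/sec above 50 per disk")
    return warnings


# Hypothetical samples from a two-disk striped volume.
print(disk_warnings([1, 3, 2], 120, disks_in_stripe=2))
```

A warning from this check is only a prompt to look more closely; as the table notes, these values might indicate a bottleneck rather than prove one.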

When working with the disk-time or disk-queue length counters, be aware of the following limitations that might yield unlikely counter values.

The % Disk Read Time and % Disk Write Time counters can exaggerate disk time. This is because they report busy time based on the duration of the I/O request, which includes time spent in activities other than reading from or writing to the disk. The counters then sum the busy time for all requests and divide it by the elapsed time of the sample interval. If multiple requests are in process at a time, the total request time is greater than the time of the sample interval; as a result, reported disk utilization can exceed actual utilization.
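
The arithmetic behind this exaggeration can be illustrated with a small Python sketch. It is a simplified model, not the counters' actual implementation: requests are hypothetical (start, end) pairs in milliseconds, the reported figure sums every request's duration, and the "actual" figure counts only the time during which at least one request was in process.

```python
def reported_busy_percent(requests, interval):
    """Sum each request's duration and divide by the sample interval."""
    total = sum(end - start for start, end in requests)
    return 100.0 * total / interval


def actual_busy_percent(requests, interval):
    """Measure the time during which at least one request was in process."""
    busy = 0
    current_start = current_end = None
    for start, end in sorted(requests):
        if current_end is None or start > current_end:
            # A gap: close the previous busy period and start a new one.
            if current_end is not None:
                busy += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping request: extend the current busy period.
            current_end = max(current_end, end)
    if current_end is not None:
        busy += current_end - current_start
    return 100.0 * busy / interval


# Two overlapping 800 ms requests within a 1000 ms sample interval.
requests = [(0, 800), (200, 1000)]
print(reported_busy_percent(requests, 1000))  # 160.0
print(actual_busy_percent(requests, 1000))    # 100.0
```

In this model the disk is never more than 100 percent busy, yet the summed request time is 160 percent of the interval because the two requests overlap.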

Counter values that report sums can be misleading for multidisk systems. When you look at the _Total instance for the % Disk Time or disk-queue counters on a multidisk system, the counters report values totaled for all disks and do not divide these totals over the number of disks in use. Therefore, in a system with one idle disk and one disk that is 100 percent busy, it can appear as if all disks are 100 percent busy.
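
A short Python sketch shows why a summed value can mislead. It follows the description above (values totaled and not divided over the number of disks); the function names are hypothetical and the per-disk figures are invented for illustration.

```python
def total_instance(per_disk_values):
    """Sum the per-disk values, as described above for the _Total instance."""
    return sum(per_disk_values)


def per_disk_average(per_disk_values):
    """Divide the total over the number of disks for a fairer picture."""
    return sum(per_disk_values) / len(per_disk_values)


busy = [0.0, 100.0]  # one idle disk and one disk that is 100 percent busy
print(total_instance(busy))    # 100.0
print(per_disk_average(busy))  # 50.0
```

Charting each disk instance separately avoids the problem altogether, because no aggregation is involved.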

The following sections describe how you can use disk-monitoring counters to observe available space on the disk and to observe the efficiency of disk operations as you become acquainted with your system's disk performance.

Monitoring Disk Space

It is important to monitor the amount of available storage space on your disk because programs might fail due to an inability to allocate space. In addition, low disk space might make it impossible for your paging file to grow to support virtual memory. Fragmentation also has this effect. For information about setting the paging file size for optimal performance, see "Evaluating Memory and Cache Usage" in this book.

Use the % Free Space and Free Megabytes counters to monitor disk space. If the available space is becoming low, you might want to run Disk Cleanup in the Disk Properties dialog box, compress the disk, or move some files to other disks. Notice that disk compression incurs some performance loss.

Another option is Remote Storage, which enables you to create virtual disk storage out of tape or optical drives. When you use this service, infrequently accessed files are moved to tape or to other media storage. Remote Storage volumes are well suited for data that you need to access only at certain intervals, such as quarterly reports. Remote Storage service is available on computers running Windows 2000 Server. For more information about remote storage options, see "Removable Storage and Backup" in this book.

If you are using NTFS and you want to restrict the amount of space allocated by individual users, use the Quota tab in Disk Properties. Notice that using quotas results in a small performance loss. If you are not using NTFS, you can set an alert on the % Free Space counter to track dwindling disk space.
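
The idea of alerting on dwindling free space can be sketched outside System Monitor as well. The following Python example is a stand-in for illustration only, not the alert feature itself: it uses the standard library's shutil.disk_usage to read total and free space, and the 10 percent threshold and function names are assumptions.

```python
import shutil


def percent_free(total_bytes, free_bytes):
    """Return free space as a percentage of total usable space."""
    return 100.0 * free_bytes / total_bytes


def low_space(path, threshold_percent=10.0):
    """Return True if the volume holding `path` is below the threshold."""
    usage = shutil.disk_usage(path)
    return percent_free(usage.total, usage.free) < threshold_percent


# Check the volume that holds the current directory.
if low_space("."):
    print("Free space is below the threshold.")
```

Choose a threshold that leaves room for the paging file and for the applications you run.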

Even if you are not currently short on disk space, you need to be aware of the storage requirements for applications you are running. Complete the following procedure to determine whether your disk has adequate space for your needs.

To evaluate the adequacy of your system's disk capacity

For best results, start with 1 GB (although the minimum disk size required to install the operating system might be lower).

Add the total size of all applications.

Add the size of the paging file (this depends on the amount of memory; this size needs to be at least twice that of system memory).

Add the amount of disk space budgeted per user (if a multiuser system), multiplied by the number of users.

Multiply by 1.3 (or take 130 percent) to allow room for expansion (this percentage can vary based on your expected growth).

The result is the size of disk you need.
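
The procedure above amounts to a short calculation. The following Python sketch follows the steps as written; the function and parameter names are hypothetical, all sizes are in gigabytes, and the paging file is taken as exactly twice system memory (the minimum the procedure mentions).

```python
def required_disk_gb(applications_gb, memory_gb, users=0, per_user_gb=0.0,
                     base_gb=1.0, growth_factor=1.3):
    """Estimate the disk size needed, following the steps above."""
    paging_file_gb = 2 * memory_gb            # at least twice system memory
    user_space_gb = users * per_user_gb       # budget per user times users
    subtotal = base_gb + applications_gb + paging_file_gb + user_space_gb
    return subtotal * growth_factor           # allow room for expansion


# Example: 2 GB of applications, 0.5 GB of memory, no per-user budget.
# (1 + 2 + 1) * 1.3 is roughly 5.2 GB.
print(round(required_disk_gb(2, 0.5), 2))
```

Adjust growth_factor if you expect growth above or below 30 percent.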

NOTE
Although not exactly a disk-storage issue, disk fragmentation slows the transfer rate and seek times of your disk system and you need to monitor for increasing disk fragmentation. On single-disk systems, you can use the Split IO/sec counter to determine the degree of fragmentation of your disks. Defragment the disk if this counter rate is consistently high and run Disk Defragmenter periodically to keep stored data organized for best performance.

Figure 30.2 shows a graph of disk counters including % Free Space. Notice that the % Free Space counter begins to rise approximately halfway through the graph. This illustration shows changes that result from deleting files on the disk.

Figure 30.2 Increase in % Free Space Counter

Monitoring Disk Efficiency

Along with disk capacity, you need to consider disk throughput when evaluating your starting configuration. Use the bus, controller, cabling, and disk technologies that produce the best throughput that is practical and affordable. Most workstations perform adequately with the most moderately priced disk components. However, if you want to obtain the best performance, you might want to evaluate the latest disk components available.

If your configuration contains different types of disks, controllers, and buses, the differences in their designs can have an influence on throughput rates. You might want to test throughput using these different disk systems to determine if some components produce less favorable results overall or for certain types of activity, and replace those components as needed. In addition, the use of certain kinds of volume-set configurations can offer performance benefits. For example, using striped volumes can provide better performance because they increase throughput by enabling multiple disks to service sequential or clustered I/O requests. (Striped volumes are not fault tolerant.) System Monitor supports monitoring volume sets with the same performance objects and counters provided for individual disks. Notice that hardware-based RAID devices report all activity to a single physical disk and do not show distribution of disk operations among the individual disks in the array. For more information about using striped volumes, see Windows 2000 Help.

Be aware of the seek time, rotational speed, access time, and the data transfer rate of your disks by consulting manufacturer documentation. Also consider the bandwidth of cabling and controllers. The slowest component determines the maximum possible throughput, so be sure to monitor each component.
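
The point that the slowest component sets the ceiling can be expressed in a few lines. In the following Python sketch the component names and rates are invented for illustration; substitute the figures from your manufacturers' documentation, using the same unit for every component.

```python
def limiting_component(component_rates):
    """Return the slowest component and its rate."""
    name = min(component_rates, key=component_rates.get)
    return name, component_rates[name]


# Hypothetical transfer rates in megabytes per second.
rates = {"bus": 133, "controller": 40, "disk": 20}
print(limiting_component(rates))  # ('disk', 20)
```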

To compare the performance of different disks, monitor the same counters and activity on the disks. If you find differences in performance, you might want to distribute workload to the better performing disk or replace slower performing components.

Preparing for Comparison Testing

If you want to know more about the volume and rate of activity through the disk system, monitor the reading and writing activity as described in the following sections. Before you begin to test disk efficiency, complete the following steps to ensure valid results:

When testing disk performance, log performance data to another physical disk or computer so that it does not interfere with the disk you are testing. If you cannot do this, log to another logical volume on the drive, or measure monitoring overhead during an idle period and subtract that overhead from your data to ensure your results include only disk-specific data and not overhead from other activity.

Monitor individual instances whenever practical. Summed values and the _Total instance can provide overstated values. For more information about interpreting counter data, see "Working with Disk Counters" earlier in this chapter.

Remember to defragment your disk before testing. If your disk is nearly full, the remaining free space is likely to be fragmented, which adds to the seek time of I/O write operations as the disk looks for each sector of free space.

Ensure that disks being monitored are not compressed or encrypted, to avoid having these features add overhead during monitoring. However, if you plan to use these features, testing performance with the features deployed can yield results that are more representative of your production environment.

The following are suggested tests to perform to learn about your disk system's performance.

Testing Maximum Throughput

A maximum throughput test tells you about one of the limits of your system. To conduct a maximum throughput test, you can use one of the request-generation programs that are publicly available on the World Wide Web. Use the following counters on the PhysicalDisk and LogicalDisk objects for this test:

Avg. Disk Read Queue Length

Avg. Disk Bytes/Read

Avg. Disk sec/Read

Disk Read Bytes/sec

Disk Reads/sec

Figure 30.3 illustrates a test of maximum throughput. Notice that the values for Disk Read Bytes/sec and Avg. Disk Read Queue Length become extremely high in this graph.

Figure 30.3 A Disk Reaching Maximum Throughput

After you determine the maximum throughput for your disk, you can adjust the load on your disk so it does not become a bottleneck.

Testing Reading vs. Writing

Some disks and disk configurations perform better when reading than when writing. You can compare the reading and writing capabilities of your disks by reading from a physical disk and then writing to the same physical disk. To measure reading from and writing to disk, log the LogicalDisk and PhysicalDisk objects in System Monitor, then chart the counters shown in Table 30.2.

Table 30.2 Counters for Measuring Reading and Writing

For Information About	Counters for Reads	Counters for Writes
Average size of the request	Avg. Disk Bytes/Read	Avg. Disk Bytes/Write
Average duration of the request	Avg. Disk sec/Read	Avg. Disk sec/Write
Rate of transfer for the type of request	Disk Read Bytes/sec	Disk Write Bytes/sec
Rate at which requests are processed	Disk Reads/sec	Disk Writes/sec
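
The counters in Table 30.2 are related to one another: over the same sample interval, the transfer rate in bytes should be approximately the request rate multiplied by the average request size. The following Python sketch uses that relationship as a sanity check on logged values; the function names and sample figures are hypothetical.

```python
def expected_bytes_per_sec(requests_per_sec, avg_bytes_per_request):
    """Estimate the byte rate from the request rate and average size."""
    return requests_per_sec * avg_bytes_per_request


def read_write_ratio(read_bytes_per_sec, write_bytes_per_sec):
    """Compare read throughput with write throughput on the same disk."""
    return read_bytes_per_sec / write_bytes_per_sec


# Hypothetical samples: 100 reads/sec of 4,096 bytes each.
print(expected_bytes_per_sec(100, 4096))  # 409600
```

If the logged Disk Read Bytes/sec value differs widely from this estimate, check that all three counters were sampled over the same interval.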

You might see some variations in the time it takes to read from or write to disk on standard disk configurations. For example, disks with fast write caches can complete write operations very quickly if there is sufficient idle time between random writes. Also, if reads are sequential, read operations might also occur very quickly, provided the disk has had time to prefetch data. Prefetching data is the process whereby data that is expected to be requested is read ahead into the onboard cache.

On striped volumes with parity (RAID-5 volumes), reading is faster than writing. When you read, you read only the data; when you write, you read, modify, and write the parity, as well as the data. The exception to this rule is full-stripe writes. If entire stripes are being written, there is no need to read the old data or parity.
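
The difference between a partial write and a full-stripe write can be illustrated with XOR parity, the scheme commonly used for RAID-5. The following Python sketch is a toy model under that assumption; it is not the operating system's implementation, and the function names and block contents are hypothetical.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def update_parity(old_parity, old_data, new_data):
    """Partial write: the old data and old parity must be read first."""
    return xor_blocks(old_parity, old_data, new_data)


def full_stripe_parity(data_blocks):
    """Full-stripe write: parity comes from the new data alone."""
    return xor_blocks(*data_blocks)


stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = full_stripe_parity(stripe)

# Overwrite the middle block and fold the change into the parity.
new_block = b"\xff\x00"
updated = update_parity(parity, stripe[1], new_block)

# The result matches parity recomputed from the whole new stripe.
print(updated == full_stripe_parity([stripe[0], new_block, stripe[2]]))
```

The partial write needs two reads (old data and old parity) before its two writes, which is the extra work that makes writing slower than reading in this model.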

When you start writing to the disk during a read operation that you are monitoring, you will notice some dips in the curves of graphed data for read activity. This is because the application doing the reads must stop briefly to allow the write operation to proceed; when the write is finished, the read operation resumes. You can observe this as the Performance Logs and Alerts service logs data.

Figure 30.4 shows the effect of writing on the efficiency of the reads. Notice how the increase of reading activity is accompanied by a slight decrease in writing.

Figure 30.4 How I/O Operations Are Affected by Competing Activity