Windows IT Pro
Windows IT Library
Windows NT Performance Monitor in Depth
Author: James Stewart
Published: April 1999
Copyright: 1999
Publisher: 29th Street Press
 


Troubleshooting a Physical RAM Shortage
Detecting or isolating a physical RAM shortage involves several steps. First, measure or evaluate the paging activity of the system. This is done by monitoring the values of the following counters:
  • Memory: Page Faults/sec displays the average number of page faults per second. This counter includes both soft and hard page faults.
  • Memory: Pages Input/sec displays the average number of pages per second read from disk. In other words, this counter measures the level of hard fault paging. This counter shows the number of pages read, not the number of disk reads.
  • Memory: Page Reads/sec displays the average number of times per second the disk is read to resolve hard fault paging. This counter shows only the number of disk reads, not the number of pages read.
By comparing Page Faults with Pages Input, you can determine the proportion of page faults that resulted in disk access. In Figure 2.13, you can see the differences between the Page Faults and Pages Input counters.

The Page Faults plot line often exceeds 100, which indicates a high level of page faults. The Pages Input line rarely peaks over 100. The two plot lines intersect when the faults are resolved by reading pages from disk, that is, when they are hard page faults. The distance between the plot lines represents faults satisfied by soft paging.

By comparing Pages Input with Page Reads, you can determine how many pages are read per disk access. Each time these plot lines intersect, one page is read per disk access. The distance between these plot lines represents multiple page reads per disk access.

The average number of Page Faults per measurement interval was 80; Pages Input, 19; and Page Reads, 6. These averages reveal several important items. Approximately 24 percent of the page faults caused hard paging (Pages Input/Page Faults = 19/80). Each disk access transferred an average of about 3 pages (Pages Input/Page Reads = 19/6). A consistent level of hard paging in excess of 20 percent should be considered abnormal. A consistent Page Reads/sec value of 5 or greater indicates significant disk access.
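The arithmetic behind these two ratios can be sketched in a few lines of Python. This is only an illustrative helper, not part of Performance Monitor; the function name and dictionary keys are assumptions, and the inputs are the averages quoted above.

```python
def paging_ratios(page_faults, pages_input, page_reads):
    """Derive the two ratios discussed in the text from averaged counter values."""
    if page_faults <= 0 or page_reads <= 0:
        raise ValueError("averages must be positive")
    return {
        # Share of page faults that had to be resolved from disk.
        "hard_fault_pct": 100.0 * pages_input / page_faults,
        # Average number of pages transferred per disk read.
        "pages_per_read": pages_input / page_reads,
    }


# Averages taken from the example in the text.
ratios = paging_ratios(page_faults=80, pages_input=19, page_reads=6)
print(f"hard faults: {ratios['hard_fault_pct']:.2f}%")
print(f"pages per disk read: {ratios['pages_per_read']:.2f}")
```

A hard fault percentage above 20, sustained over time, is the condition the text treats as abnormal.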

Second, compare paging with disk activity. When paging consumes more than 10 percent of disk activity, too much paging is occurring. To compare paging with disk activity, add the following counters to your Chart view:
  • LogicalDisk: % Disk Read Time: _Total — This counter displays the percentage of time the disk was busy processing read requests. Levels greater than 75 percent should come under suspicion.
  • LogicalDisk: Avg. Disk Read Queue Length: _Total — This counter displays the average number of requests waiting to be processed by the disk. A value greater than 2 can indicate a performance problem.
  • LogicalDisk: Disk Reads/sec: _Total — This counter displays the number of disk accesses per second. Most high performance disks can process 40 I/Os per second.
When too little memory is present on a system, the disk system is taxed to process all of the paging requests, so high values for % Disk Read Time and Avg. Disk Read Queue Length should be explored. Comparing Memory: Page Reads/sec with LogicalDisk: Disk Reads/sec indicates whether the problem lies with RAM or with the physical disk: if Memory: Page Reads/sec is greater than 10 percent of LogicalDisk: Disk Reads/sec, RAM is the problem.
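The 10 percent comparison can be expressed as a simple check. The sketch below is a hypothetical helper, assuming you have already read averaged values for the two counters; the function names and the sample numbers are illustrative and do not come from the chapter's figures.

```python
def paging_share_of_disk_reads(page_reads_per_sec, disk_reads_per_sec):
    """Fraction of all disk reads that were issued to resolve hard page faults."""
    if disk_reads_per_sec <= 0:
        return 0.0  # no disk reads at all, so paging cannot dominate them
    return page_reads_per_sec / disk_reads_per_sec


def ram_is_suspect(page_reads_per_sec, disk_reads_per_sec, threshold=0.10):
    """Apply the rule of thumb from the text: paging above 10% of disk reads."""
    share = paging_share_of_disk_reads(page_reads_per_sec, disk_reads_per_sec)
    return share > threshold


# Hypothetical samples: Memory: Page Reads/sec, LogicalDisk: Disk Reads/sec
print(ram_is_suspect(6, 30))   # paging is 20% of disk reads
print(ram_is_suspect(2, 40))   # paging is 5% of disk reads
```

The same check applies to the write-side counters if Page Writes/sec and Disk Writes/sec are substituted.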

This same comparison can be made with paging writes as well. The relevant counters to evaluate memory-caused disk writes are
  • Memory: Page Writes/sec
  • Memory: Pages Output/sec
  • LogicalDisk: Disk Writes/sec
  • LogicalDisk: Disk Write Bytes/sec
  • LogicalDisk: Avg. Disk Write Queue Length
Third, monitor the use of the paging file. On systems with too little RAM, NT expands the paging file to create additional virtual memory. NT can expand the paging file either to the maximum size defined on the System applet’s Performance tab or to the point where no disk space remains. Monitor the size of the paging file with the Process: Page File Bytes counter. If the paging file expands to its maximum size, this can indicate too little physical RAM, too small a paging file, or a leaky application that fails to release resources. If the paging file consumes all available and allocated space, you need to eliminate the possibility of a leaky application. Do this by monitoring the Process: Page Faults/sec counter for all active processes. A histogram view can quickly reveal whether any one application is causing the paging.
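What the histogram shows at a glance can also be approximated numerically. The following is a minimal sketch, assuming per-process Page Faults/sec values have been collected into a dictionary; the 50 percent cutoff, the function name, and the process names are assumptions made for illustration, not values from the text.

```python
def dominant_pager(faults_by_process, share=0.5):
    """Return the process responsible for at least `share` of all page faults,
    or None if no single process dominates."""
    total = sum(faults_by_process.values())
    if total <= 0:
        return None
    name, faults = max(faults_by_process.items(), key=lambda item: item[1])
    return name if faults / total >= share else None


# Hypothetical Process: Page Faults/sec readings.
sample = {"leaky.exe": 90, "explorer.exe": 5, "services.exe": 5}
print(dominant_pager(sample))
```

A process that dominates the paging is only a candidate for a leak; confirming it still requires watching its memory use grow over time.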

Once you determine that your system doesn’t have enough physical RAM, there are several actions you can take to eliminate or reduce the effect of the shortage on system performance. The following list is arranged in a suggested order of approach, within the parameters of reasonable cost, ease of implementation, and manageable effects on the overall system (i.e., when you implement the solution, how much is the whole system changed?):
  • Increase the maximum size of your paging file.
  • Split the paging file across multiple fast disks.
  • Reduce memory usage by limiting applications and services.
  • Remove or correct leaky applications.
  • Increase the speed of the RAM (i.e., replace the current RAM with new faster RAM). Remember that RAM operates at the speed of the slowest RAM chip.
  • Add more physical RAM.
  • Add a fast disk system—faster drives and/or faster drive controllers.
If your system has sufficient physical RAM, you can observe the symptoms of a low-memory system by altering your startup parameters. (This is purely for experimental purposes; if you do modify your startup parameters, be sure to undo your changes to return to normal operations.) By adding the MAXMEM parameter to the appropriate line in the BOOT.INI file, you can limit how much physical RAM NT can “see” and use. This setting simulates a system with too little physical RAM without altering the physical configuration of your system. The parameter is /MAXMEM=n, where n is the number of megabytes of memory that NT is allowed to “see.” Add this parameter after the label on a line under the [operating systems] heading. For example,

[boot loader]
timeout=5
default=multi(0)disk(0)rdisk(0)partition(2)\WINNTW
[operating systems]
multi(0)disk(0)rdisk(0)partition(2)\WINNTW="WINDOWS NT WORKSTATION VERSION 4.00"
multi(0)disk(0)rdisk(0)partition(2)\WINNTW="WINDOWS NT WORKSTATION VERSION 4.00 [VGA MODE]" /basevideo /sos
C:\ = "MS-DOS 6.22"

should be edited to

[boot loader]
timeout=5
default=multi(0)disk(0)rdisk(0)partition(2)\WINNTW
[operating systems]
multi(0)disk(0)rdisk(0)partition(2)\WINNTW="WINDOWS NT WORKSTATION VERSION 4.00, 16 MB" /MAXMEM=16
multi(0)disk(0)rdisk(0)partition(2)\WINNTW="WINDOWS NT WORKSTATION VERSION 4.00 [VGA MODE]" /basevideo /sos
C:\ = "MS-DOS 6.22"

Warning: Make sure that you do not set the value of MAXMEM to less than 8, because NT will fail to boot. We suggest using a value of 32, 16, or 12 to perform the simulation. Be sure to remove this parameter once you have completed the simulation.
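The BOOT.INI edit shown above can also be sketched as a small Python helper that appends the switch to one [operating systems] entry and enforces the 8 MB floor from the warning. The function name is hypothetical, and the helper only builds the new line; it does not read or write BOOT.INI itself.

```python
def add_maxmem(entry, megabytes):
    """Return a BOOT.INI [operating systems] entry with /MAXMEM=n appended."""
    if megabytes < 8:
        # Per the warning in the text, NT fails to boot below 8 MB.
        raise ValueError("MAXMEM below 8 MB prevents NT from booting")
    if "/maxmem" in entry.lower():
        raise ValueError("entry already has a /MAXMEM switch")
    return f"{entry.rstrip()} /MAXMEM={megabytes}"


entry = r'multi(0)disk(0)rdisk(0)partition(2)\WINNTW="WINDOWS NT WORKSTATION VERSION 4.00"'
print(add_maxmem(entry, 16))
```

Keeping the original entry alongside the modified one, with a distinct label, makes it easy to boot back into the unrestricted configuration.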

Processor Bottlenecks
The CPU is the central component of a computer system. Almost every transmission of data occurs through the CPU. Thus, it is essential to identify and solve processor-related bottlenecks quickly. Before you can determine whether you have a processor bottleneck, you need to eliminate the possibilities of memory, application, and disk problems.

In most cases, a processor bottleneck is identified by one of the following counters:
  • Processor: % Processor Time indicates the amount of time the CPU spends on non-idle work. It’s common for this counter to reach 100 percent during application launches or kernel-intensive operations, such as Security Accounts Manager (SAM) synchronization. But if this counter remains above 90 percent for an extended period, you should suspect a CPU bottleneck. This level of activity indicates that the system is performing work 90 percent of the time and does not have much capacity for additional work.
  • System: % Total Processor Time applies to multiprocessor systems only. This counter should be used the same way as the single-CPU counter: if the value remains consistently higher than 90 percent, at least one of your CPUs is a bottleneck.
  • System: Processor Queue Length indicates the number of threads waiting for processor time. A sustained value of 2 or higher for this counter indicates processor congestion. Note that this counter is a snapshot at the time of measurement, not an average value over time.
A processor bottleneck occurs when the CPU is so busy that it can’t respond to new requests for computing cycles. While high CPU utilization can indicate a bottleneck, more revealing indicators are the queue length and poor interface response times.
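The thresholds above can be combined into a rough check over a series of samples. This is a sketch under stated assumptions: the text does not define how long an "extended period" is, so the `min_run` parameter (a number of consecutive samples) is an assumption, as are the function names and the sample data.

```python
def sustained(samples, predicate, min_run):
    """True if at least `min_run` consecutive samples satisfy `predicate`."""
    run = 0
    for value in samples:
        run = run + 1 if predicate(value) else 0
        if run >= min_run:
            return True
    return False


def cpu_bottleneck_suspected(cpu_pct, queue_len, min_run=5):
    """Flag sustained % Processor Time above 90 or Processor Queue Length of 2+."""
    busy = sustained(cpu_pct, lambda v: v > 90, min_run)
    queued = sustained(queue_len, lambda v: v >= 2, min_run)
    return busy or queued


# Hypothetical samples taken at a fixed interval.
cpu = [95, 96, 97, 98, 99, 93]
queue = [0, 1, 0, 1, 0, 1]
print(cpu_bottleneck_suspected(cpu, queue))
```

Requiring a run of consecutive samples keeps short spikes, such as an application launch, from being reported as a bottleneck.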

Once a CPU bottleneck has been identified, you can eliminate it or reduce its effects on system performance with the following actions. The list is arranged in a suggested order of approach, within the parameters of reasonable cost, ease of implementation, and manageable effects on the overall system (i.e., when you implement the solution, how much is the whole system changed?):
  • Remove all graphic-intensive screen savers.
  • Transfer CPU-intensive applications to other servers.
  • Alter the execution priorities for nonkernel processes.
  • Add more L2 or secondary cache.
  • Upgrade older motherboards.
  • Replace the current CPU with a faster CPU. Remember that you may have to replace your motherboard and memory to upgrade your CPU.
  • Upgrade network and disk controller cards to 32- or 64-bit PCI models with bus mastering instead of programmed I/O. Avoid ISA cards if at all possible, especially for heavily taxed disk and network controllers.
  • Add a second CPU. Keep in mind that on most systems additional CPUs offer a diminishing return on performance improvement.
Tip: Before rushing to correct a CPU bottleneck, try to identify any processes that could be causing CPU constriction. In some cases, a faulty application can cause performance problems that have symptoms like those of a true CPU bottleneck. Correcting an application problem is often easier and less expensive than replacing a CPU. Process/application inspection requires monitoring the performance of individual processes and comparing them to your system’s baseline.



Windows IT Pro is a Division of Penton Media Inc.
Copyright © 2008 Penton Media, Inc., All rights reserved.