Windows IT Pro
Windows IT Library
  - Advertise        
Windows IT Pro Logo

  Home  |   Books  |   Chapters  |   Topics  |   Authors  |   Book Reviews  |   Whitepapers  |   About Us  |   Contact Us  |   ITTV  |   IT Jobs

search for  on    power search   help
 






SQL Server Performance Monitoring Guidelines
View the book table of contents
Author: Kevin Cox
William Jones
Published: July 1997
Copyright: 1997
Publisher: 29th Street Press
 


Memory
The Memory: Pages/sec counter is the number of pages read or written to the disk when the system can’t find the page in memory. This page management process is referred to as paging. If the average value for this counter is five, you need to tune the system. If this value is 10 or more, put tuning the server high on your priority list. Before SQL Server 6.0, the value for this counter was an important flag to tell you whether memory was the bottleneck. Now, with SQL Server’s parallel read-ahead feature, this counter will give you only an indication of how busy the read-ahead manager is. However, we will discuss other counters that are better at tracking the read-ahead manager. In other words, this counter may have been one of the most significant counters to track in the past, and it still is on machines without SQL Server, but better ones are available to track memory.

The Memory: Available Bytes counter displays the amount of free physical memory. If the value for this counter is consistently less than 10 percent of your total memory, paging is probably occurring. You have too much memory allocated to SQL Server and not enough to Windows NT.

Processor
Before we start talking about the counters in the processor category, it is important to know that Windows NT assigns certain responsibilities to certain processors if you have four or more CPUs. Processor 0 is the default CPU for the I/O subsystem. Network Interface Cards (NIC) are assigned to the remaining CPUs, starting from the highest-numbered CPU. If you have four processors and one NIC, that card is assigned Processor 3. The next NIC gets Processor 2. Windows NT does a good job of spreading out processor use. You can also set which processors SQL Server uses. See Chapter 16, “Performance Tuning,” particularly the notes on the Affinity Mask, for more information about allocating processors.

You can monitor each processor individually or all the processors together. For monitoring individual processors, use the Processor: % Process Time counter. This counter lets you see which processors are the busiest.

A better counter to monitor over the long term is the System: % Total Processor Time counter, which groups all the processors to tell you the average percentage of time that all processors were busy executing non-idle threads.

Who (or what) is consuming the CPU time? Is it the users, system interrupts, or other system processes? The Processor: Interrupts/sec counter will tell you if it is the system interrupts. A value of more than 1,000 indicates that you should get better network cards, disk controllers, or both. If the Processor: % Privileged Time is greater than 20 percent (of the total processor time) and Processor: % User Time is consistently less than 80 percent, then SQL Server is probably generating excessive I/O requests to the system. If your machine is not a dedicated SQL Server machine, make it so. If none of these situations is occurring, user processes are consuming the CPU. We will look at how to monitor user processes when we consider SQL Server-specific counters in the next section.

Disk I/O
As discussed in Chapter 16, “Performance Tuning,” having many smaller drives is better than having one large drive for SQL Server machines. Let’s say that you need 4 GB of disk space to support your application with SQL Server. Buy four 1-GB drives instead of one 4-GB drive. Even though the seek time is faster on the larger drive, you will still get a tremendous performance improvement by spreading files, tables, and logs among more than one drive.

Special Note: The single best performance increase on a SQL Server box comes from spreading I/O among multiple drives (adding memory is a close second).

Monitor the disk counters to see whether the I/O subsystem is the bottleneck, and if it is, to determine which disk is the culprit. The problem may be the disk controller board. The first thing to know about monitoring disk I/O is that to get accurate readings from the Physical Disk counters, you must go to a command prompt window and type DISKPERF -y, then reboot. This procedure turns on the operating system hooks into the disk subsystem. However, this setup also causes a small performance decrease of 3 to 5 percent, so you want to turn this on only periodically and only for a short period. Use the Diskperf -n command to turn it off, then restart your system.

Track Physical Disk: % Disk Time to see how much time each disk is busy servicing I/O, including time spent waiting in the disk driver queue. If this counter is near 100 percent on a consistent basis, then the physical disk is the bottleneck. Do you rush out and buy another disk? Perhaps that is the best strategy if the other drives are also busy, but you have other options. You may get more benefit from buying another controller and splitting the I/O load between the different controllers. Find out what files or SQL Server tables reside on that disk, and move the busy ones to another drive. If the bottleneck is the system drive, split the virtual memory swap file to another drive, or move the whole file to a less busy drive. You should already have split the swap file, unless you only have one drive (which is very silly on a SQL Server machine).

LogicalDisk: Disk Queue Length and PhysicalDisk: Disk Queue Length can reveal whether particular drives are too busy. These counters track how many requests are waiting in line for the disk to become available. Values of less than 2 are good; if the value is any higher, it’s too high.

Network
Redirector: Read Bytes Network/Sec gives the actual rate at which bytes are being read from the network. Dividing this value by the value for the Redirector: Bytes Received/Sec counter gives the efficiency with which the bytes are being processed.

If this ratio is 1:1, your system is processing network packets as fast as it gets them. If this ratio is below 0.8, then the network packets are coming in faster than your system can process them. To correct this problem on a multiprocessor system, use the Affinity Mask and SMP Concurrency options in the SQL Configuration dialog box to allocate the last processor to the network card, and don’t let SQL Server use that processor. For example, if you have four CPUs, set the Affinity Mask to 7 (binary 0111) and SMP Concurrency to 3. This setup gives three CPUs to SQL Server and the fourth processor to the network card, which Windows NT assigns to that processor by default. If I/O is also a problem, set the Affinity Mask to 6 (binary 0110) and SMP Concurrency to 2, because Windows NT assigns the I/O subsystem to the first processor by default.

Monitoring SQL Server
The questions to ask yourself when monitoring the SQL Server categories are “Do I have the optimal configuration values for SQL Server?” and “Who is consistently using the most resources?”

If any of the counters considered in this section indicate a problem, the problem is somewhere related to SQL Server. If the problem is I/O, memory, CPU, or locks, you can dig deeper and find out who the culprits are. However, if you are using a long-term logging strategy for monitoring, you must monitor every session to be sure you have the necessary historical data when you want to see what was happening at a particular time.

If you are watching the monitor when a problem occurs, go to the SQL Server-Users object and turn on the counter for all instances. The instances in this case are the sessions currently logged on. You can see the login ID and the session number. If you see one or more sessions causing the problem, you can spy on them to find the last command sent. Go to the Enterprise Manager, click the Current Activity button on the toolbar, and double-click the line in the display corresponding to the session number. You will see the last command received from the session. To trace commands in more depth, use the SQLTrace utility that is new with version 6.5. (See Chapter 3, “Administrative and Programming Tools,” for details.)

The five main categories of SQL Server counters to monitor are cache, disk I/O, log, locks, and users. We will consider each of these categories separately as well as a mix of other important predefined counters. The final part of this section discusses the new user-defined counters.

Cache
To monitor your cache, watch SQL Server — Cache Hit Ratio. It monitors the rate at which the system finds pages in memory without having to go to disk. The cache hit ratio is the number of logical reads divided by the total of logical plus physical reads. If the value for this counter is consistently less than 80 percent, you should allocate more memory to SQL Server, buy more system memory, or both. However, before you buy more memory, you can try changing the read-ahead configuration options. Also look at the discussion of free buffers in the next chapter to determine whether the number of free buffers is approaching zero. Changing the free buffers configuration parameter may increase the cache hit ratio.

To find out if you have configured SQL Server properly, you should monitor SQL Server-Procedure Cache: Max Procedure Cache Used (%). If this counter approaches or exceeds 90 percent during normal usage, increase the procedure cache in the SQL Server configuration options. If the maximum cache used is less than 50 percent, you can decrease the configuration value and give more memory to the data cache. Rumor has it that SQL Server 7.0 will have a floating-point number for the procedure cache configuration parameter so that you can give the procedure cache less than 1 percent of your SQL Server memory. For a super server with gigabytes of memory, even 1 percent is too much for procedure cache.

If a 2K data page has been swapped to the Windows NT virtual memory file and read in again later, SQL Server still counts the page as already in memory for the purposes of the Cache Hit Ratio counter. Therefore, a system bogged down by heavy swapping to virtual memory could still show a good cache hit ratio. To find out if your system is in this category, monitor the Memory: Page Faults/Sec counter.

The Memory: Page Faults/Sec counter watches the number of times a page was fetched from virtual memory, meaning that the page had been swapped to the Windows NT swap file. It also adds to the counter the number of pages shared by other processes. This value can be high while system services, including SQL Server, are starting up. If it is consistently high, you may have given too much memory to SQL Server. The network and operating system may not have enough memory to operate efficiently.

Warning: This counter is a strange one to figure out. Running this counter on four different types of machines gave widely different results. To try to get a baseline value, we turned off all services, including SQL Server, unplugged the boxes from the network, and ran Performance Monitor with only the Memory: Page Faults/Sec counter turned on. The lowest measurement of page faults per second was from the system we least expected — a 50 MHz 486 with 16 MB of memory and one disk drive. It settled in at about five to seven page faults per second. The DEC Alpha with 4 processors, 10 GB RAID 5 striping on 5 drives, and 256 MB of memory was up in the 35 to 40 page faults per second range. So was a similarly configured Compaq ProLiant. The laptop performed in the middle, at about 15 page faults per second. It is a 90 MHz Pentium with 1 disk drive and 40 MB of memory. All were running Microsoft Windows NT version 3.51 service pack 4. All services except Server and Workstation were turned off. Running the same experiment with Windows NT 4.0 service pack 1 showed approximately the same results, except that the page faults per second numbers ran consistently 10 percent less than in Windows NT 3.51.

The result of this experiment is that we can’t recommend a range to gauge the performance of your machine. The best you can do is turn off all services for a brief period to get a baseline measurement on your machine, then use this value as a guide for your regular usage.

Disk I/O
Several counters measure how busy your disk drives are and which disk drives are the busiest. Remember that for any I/O measurements to be effective, you must run the Windows NT Diskperf -y command and reboot the system.

Even though the SQL Server: I/O Transactions Per Second counter is a bit misleading, it is still good, especially for capacity planning. This counter measures the number of Transact-SQL batches processed since the last refresh period. You should not use these results against any standard TPC benchmark tests that give results in transactions per second — it is not referring to a Begin/Commit transaction, just to batches of commands. Watch this number over a span of several months, because an increase in this counter can indicate that the use of SQL Server is growing.

The SQL Server: I/O — Lazy Writes/Sec counter monitors the number of pages per second that the lazy writer is flushing to disk. The lazy writer is the background Windows NT process that takes the data from cached memory and writes it to disk, although sometimes a lazy writer is hardware that reads the cached memory on the disk drive and is managed by the disk controller. A sustained high rate of lazy writes per second could indicate any of three possible problems:

  • the Recovery Interval configuration parameter is too short, causing many checkpoints
  • too little memory is available for page caching
  • the Free Buffers parameter is set too low
Normally this rate is zero until the least-recently used (LRU) threshold is reached. LRU is the indicator by which memory is released for use by other processes. Buying more memory may be the best solution if the configuration parameters seem to be in line for your server size.

The SQL Server: I/O Outstanding Reads counter and the I/O Outstanding Writes counter measure the number of physical reads and writes pending. These counters are similar to the PhysicalDisk: Disk Queue Length counter. A high value for this counter for a sustained period may point to the disk drives as a bottleneck. Adding memory to the data cache and tuning the read-ahead parameters can decrease the physical reads.

The SQL Server: I/O Page Reads per Second counter is the number of pages not found in SQL Server data cache, which indicates physical reads of data pages from disk. This value does not count pages that are read from the Windows NT virtual memory disk file. There is no way to watch only the logical page reads per second. According to sources in the SQL development team, counters for logical pages reads are hidden in a structure that is not available in this version of SQL Server. However, you can figure out the logical page reads per second by taking the total page reads per second and subtracting the physical page reads per second.



Page: 1, 2, 3

next page



ADS BY GOOGLE SPONSORED LINKS FEATURED LINKS

WinConnections Conference Fall 2008
Don’t miss the premier event for Microsoft IT Professionals in Las Vegas, November 10-13. Register and book your room by August 25 and receive a FREE room night (based on a three night minimum stay).

Maximize your SharePoint Investment – 8 Cities
Discover best practices and tips for both architecting and administering SharePoint. Early Bird Price of $99 through Sept 15th.

Find a new job now on the all new IT Job Hound!
Search jobs, post your resume, and set up job e-mail alerts!

Master SharePoint with 3 eLearning Seminars
Learn how to build a better SharePoint infrastructure and enable powerful collaboration with MVPs Dan Holme and Michael Noel. Register today!

Top Tools for Virtualization Disaster Recovery & Replication
View this web seminar on August 14th to learn about two tools that will result in faster backup and restore with P2V disaster recovery.

SharePointConnections Conference Fall 2008
Don’t miss the premier event for Microsoft IT Professionals in Las Vegas, November 10-13. Register and book your room by August 25 and receive a FREE room night (based on a three night minimum stay).

VMworld 2008 - Sign Up Today!
Join your peers on September 15-18 at The Venetian Hotel in Las Vegas as VMware hosts VMworld 2008, the leading Virtualization event.



When managing just VMware isn’t enough
Plan/Manage/Secure – NetIQ VMware management. Download whitepaper.

What’s up with your network? Find out with ipMonitor
Availability monitoring for servers, applications and networks – FREE trial

Microsoft® Tech•Ed EMEA 2008 IT Professionals
Advance your thinking with new ideas and practical real-world solutions at Microsoft’s FIVE day technical infrastructure conference 3-7 Nov., 2008. Register before 26 September 2008 to save €300.

Order Your Fundamentals CD Today!
Gain an introduction to Exchange, learn server security requirements, and understand how unified communications can play a role in your messaging strategies with this free Exchange CD.

Are You Really Compliant with Software Regulations?
View this web seminar that will help you with compliance best practices and check out a management solution to assure that you won’t be in jeopardy of an audit.

Virtualization Congress Oct. 14-16 in London
Don't miss Virtualization Congress, the premiere EMEA conference dedicated to hardware, OS and application virtualization. Oct. 14-16 in London.
Windows IT Pro Home Register FAQ for Windows WinInfo News
Europe Edition About Us Contact Us/Customer Service Media Kit Affiliates / Licensing  
SQL Server Magazine Office & SharePoint Pro Windows Dev Pro IT Job Hound ITTV
IT Library Technical Resources Directory Connected Home Windows Excavator Windows SuperSite 
 
 Windows IT Pro is a Division of Penton Media Inc.
 Copyright © 2008 Penton Media, Inc., All rights reserved. Terms and Use | Privacy Statement | Reprints and Licensing