1. Which counters on I/O performance does SQL Server provide?
From release SQL Server 2000 the system function fn_virtualfilestats records I/O values for each database file. This includes the number of reads and writes, the corresponding number of bytes read or written, and the time spent waiting on I/O in ms (IOSTALLS). All values are per file and refer to SQL Server start.
All versions of SQL Server provide the command dbcc sqlperf(WAITSTATS)Some of the wait events are related to I/O performance. These are in particular
- PAGEIOLATCH_SH: indicates I/O performance on data files; this is one server wide value since SQL Server start (or last counter reset).
- WRITELOG: indicates I/O performance on transaction log file(s); this is one server wide value since SQL Server start (or last counter reset).
Warning: The columns 'Requests', 'Wait Time', and 'Signal Wait Time' have an overflow at 2^32 (= 4,192,895,744). This is not true for the row 'Total': only the column 'Requests' has an overflow. The overflow is correctly handled in the DB Collector (see below).
2. Where do I find 'IO per File' counters in the SAP database monitor?
- 6.20 and later: You find the display at ST04 -> 'Detail Analysis menu' -> 'IO per File'. You can choose between tempdb and the SAP database with a button.
- 6.10: You find the display at DB02 -> 'DB analysis' -> 'IO per File'. You can choose between tempdb and the SAP database with a button.
Warning: The values displayed are wrong when the job 'SAP_<sid>__MSSQL_COLLECTOR' is not running. This applies to 6.10 and later. See section 'The monitor ST04 -> 'IO per File' is empty.' for a solution.
- 4.6C, 4.6D: You find the display at ST04 -> 'Detail Analysis menu' -> Menu 'DB analysis' -> 'IO per File'. You can choose to display the values for the database of the SAP system or for tempdb, which is used by SQL Server for temporary objects. This functionality requires the Basis Support Packages listed.
- 6.10 and later: Additionally the full output of fn_virtualfilestats is displayed at ST04 -> 'Detail Analysis menu' -> 'System tables' -> 'System functions'. (This does not include the calculated value 'IOSTALLS/READS').
- Older releases: In older SAP releases the function is not implemented. You can use the SQL Server Query Analyzer to display the values of function fn_virtualfilestats using the command: select * from ::fn_virtualfilestats(-1,-1)(see Books Online for details). In SE37 you can use function module MSQ_VTLFILESTATS_DOWNLOAD or /SDF/MSS_VIRTFILESTATS (if either of both exists) to display the result of this command.
The following script provides a more convenient formatted output for the database you are connected to: declare @db smallint set @db=db_id()
select cast(IoStallMS/cast(F.NumberReads as decimal)
as numeric(9,2)) as [ms/IO], S.filename, F.DbId, F.FileId
from ::fn_virtualfilestats(@db,-1) F inner join sysfiles S
on S.fileid=F.FileId order by S.filename
3. Where do I find 'dbcc sqlperf(WAITSTATS)' counters in the SAP database monitor?
ST04 -> Detail analysis menu -> DB utilities -> dbcc sqlperf(WAITSTATS).
4. What are the enhancements in the monitor for 6.40?
- On the entrance screen of ST04 on tab 'Current Activity' average values for the last 20 min are displayed as:
- 'Latch Wait Time per Request (ms)' derived from 'PAGEIOLATCH_SH';
- 'Wait Time per Log Write (ms)' derived from 'WRITELOG';
- 'IOStall per Request (ms)' as in 'IO per File'.
To examine the overall I/O performance switch to the values 'Since Startup' in menu 'DB Analysis'.
- In the entrance screen of ST04 choose 'DB Collector'. On the tab 'Snapshot' choose the 'I/O' tab. I/O values are shown in three different granularities. 'Server I/O' displays the same counters as described above. 'Database I/O' and 'Datafile I/O' display values of fn_virtualfilestats on the according summarization level. On the tab 'Time series' a history of the data is shown. The time window can be chosen. To display a chart, select one or more rows and choose 'Graphics'. Various data categories related to I/O can be selected. On the tab 'Snapshot Details' the current values of additional counters are available. Various data categories can be selected here as well.
- The display of 'dbcc sqlperf(WAITSTATS)' shows also the calculated values 'Waittime/Request' and provides a RESET/SINCE RESET functionality, which facilitates monitoring performance for a short, specific time frame.
5. What are the reference values for the 'IO per File' counters?
The monitor 'IO per File' includes a column 'ms/IO'. On data files 'ms/IO' is typically arround 3 to 4ms for well performing I/O subsystems (values down to 1ms possible). Values up to 10ms for data files can still be consistent with satisfying system performance. For values above 10ms a negative impact on system response times is more then likely.
For the transaction log files these thresholds are not valid. fn_virtualfilestats can not be used to evaluate performance on the transaction log files.
Column 'ms/IO' is calculated from the fn_virtualfilestats output by dividing 'IOSTALLS' by 'Number of reads' . On files with write load higher then read load the 'ms/IO' value calculated this way is misleading. It should be calculated dividing by 'Number of reads' + 'Number of writes'. (Only in SAP release 6.10 division by 'Number of reads' + 'Number of writes' is done). But typically in SAP systems the number of reads is much higher then the number of writes and the difference including writes in the calculation or not does not matter. Furthermore with I/O subsystems using caches for write operations, the writes play only a minor role for I/O wait situations on the data files.
fn_virtualfilestats counts the statistics since SQL Server start. From 6.40 on values are sampled every 20 minutes to allow an analysis of certain time windows. In the entrance screen of ST04 values for the last 20 minutes are shown. In former releases, to analyze a specific time window of database activity, export data to a spreadsheet and calculate differences to a later snapshot.
6. What are the reference values for the 'dbcc sqlperf(WAITSTATS)' counters?
From the output of sqlperf(WAITSTATS) you have to divide WAITTIME by REQUESTS. For these quotients the following reference values hold for the average value since SQL Server start:
- PAGEIOLATCH_SH: < 20 ms/request
Typical values are in the range of 4 to 10 ms/request.
- WRITELOG: < 10 ms/request
Typical values are in the range of 1 to 5 ms/request.
From 6.40 values are sampled every 20min to allow an analyis of certain time windows. In the entrance screen of ST04 values for the last 20min are shown.
7. How can I monitor the impact of I/O on the current database activity?
- SQL Processes
The current database activity can be observed in ST04 ->'Detail analysis menu' -> 'SQL processes'.To find processes waiting on I/O order by column 'Waittime' descending. Those processes with 'Waittime' <> 0 are currently waiting (ignore system processes). Column 'SQL wait resource' displays what the processes are waiting on. PAGEIOLATCH_SH or PAGEIOLATCH_EX indicates waiting on I/O on data files. For an indepth analysis the wait resource can be used to identify the file and the database object where the wait occurs. WRITELOG indicates waiting on I/O on transaction log files.
- Last minutes values in 6.40
From 6.40 values are sampled every 20 minutes to allow an analysis of certain time windows. See 'the enhancements in the monitor for 6.40'.
- Last minutes values in 6.20 (IO per File)
From 6.20 onwards every 20 minutes a snaphot of the IO per File counters is sampled. From the last snapshot values for the period of the last 20 minutes are calculated for 'ms/IO', 'Number of reads', 'Bytes per read', 'Number of writes', and 'Bytes per write'. In 6.20 these columns are not displayed by default. From the icon 'Layout setting' choose 'Change layout' and move the missing columns to the list of displayed columns.
8. What else can I use to monitor SQL Server I/O?
Most dedicated storage hardware comes with its own monitoring software. For some products this software is only available with the vendors support.
On all installations the System Monitor (Performance Snap-In of Microsoft Management Console or Performance Monitor) of Windows can be used for a detailed monitoring of the I/O. Sampling intervals from 10 seconds to 5 minutes may be used.
Note: Most counters are available separately for read and write operations and summed up for all operations. Depending on the level of detail of the analysis only the sum (e.g. "Avg disk sec / transfer") or the separate counters (e.g. "Avg disk sec / read" and "Avg disk sec / writes") can be monitored. Below the counters for read and write separately are listed.
- To examine the response time of the I/O system use the counters:
PhysicalDisk: Avg disk sec / read
PhysicalDisk: Avg disk sec / writes
Based on these response times you can decide whether I/O contributes considerably to the database response times and therefore the system performance is I/O bound.
- To identify an overload situation of the I/O system use additionally the counters:
PhysicalDisk: Avg. Disk Read Queue Length
PhysicalDisk: Avg. Disk Write Queue Length
In case of an overload the response times deteriorate while the queue lengths build up. The 'Current Disk Queue Length' can be used also, but has to be interpreted with care as it is a snapshot of a short-living value.
These counters can also be used for a long term monitoring to identify hardware bottlenecks in advance.
- To examine the throughput of the I/O system use the counters:
PhysicalDisk: Disk Read Bytes/sec
PhysicalDisk: Disk Reads/sec
PhysicalDisk: Disk Write Bytes/sec
PhysicalDisk: Disk Writes/sec
Every I/O hardware comes with a technical specification for the maximum throughput. If the measured throughput saturates far below this specification, the hardware itself, or the configuration, or software is erroneous.
Note that the "% Disk Time" counters are useless as they are identical to the corresponding "Avg. Disk Queue Length" counters.
9. What can I do when I have identified an I/O bottleneck?
I/O analysis can be complicated due to the many layers involved. This includes the hardware (e.g. disks, host adapters, caches, lines, and controllers), software (firmware, drivers, Windows operating system, and the database SQL Server), and their configuration. For most components the hardware partner is the best contact. General database tuning may reduce the I/O load. SAP offers the service 'SAP Storage Subsystem Optimization' (quick link 'stso' on the SAP Support Portal).
SQL Server 2000 doesn't have configuration parameters relevant for I/O. Therefore the administrative task reduces to a sensible configuration of the database files. When SAP brought its product on SQL Server hardware and operating systems were already advanced enough, to support a very simple implementation: all tables are put into one large pool, which is distributed over all database files. This is fully managed by SQL Serveralone. Therefore SQL Server installations do not suffer from hot spots caused by misconfiguration of table storage. This eliminates a layer typically critical for the I/O performance of RDBMS.
The database administrator has to follow only few simple rules:
- No data file must be full.
- The free space within the data files is spread evenly over all files (more precisely it is sufficient to spread it evenly over all available I/O channels - where the meaning of 'I/O channel' depends on the hardware).
10. The monitor ST04 -> 'IO per File' is empty. What is the reason?
(This occurs only in 6.10 or later.) There are two possible reasons:
- Make sure that the job SAP_<sid>_<DB>_MSSQL_COLLECTORis running.
(To check: In the 'SQL Server Jobs' overview of DB12 choose 'All Database Jobs' to display job SAP_<sid>_<DB>_MSSQL_COLLECTOR. Column 'Run status' should read 'Successful' and 'Next run date' should show the current date.)
If this is not the case: running report MSSPROCS - > 'Start all' recreates the job. Or the job properties can be changed with the SQL Server Enterprise Manager. There might possibly be a problem (database error 229) with changing this job - see note 598767.
- If the table 'IO per file' is empty for the 'Current database' but shows values for 'tempdb': In case that the system was copied note 151603 provides the solution in section 'Initializing the Database Collector' (truncating 3-4 monitoring tables).
Program errors
Columns '.. per hour' are corrected in Support Packages