SQL Centre of Excellence

The Microsoft Assessment and Planning Toolkit is a fantastic tool for gathering resource usage data for a SQL consolidation / virtualisation project.

http://technet.microsoft.com/en-us/library/bb977556.aspx

One thing that is quite basic though is its presentation of disk IOPS (see below). This article shows how to analyse the data in a bit more detail using the underlying SQL database and Excel.

[Screenshot: the MAP Toolkit's disk IOPS summary]

1) IOPS distribution graph

This is useful for seeing the distribution of IOPS so we can determine what percentage of IO falls within percentile boundaries. For example, looking at the summary above, would we want to purchase a device only capable of performing 15 IOPS? Probably not, as that would mean only 50% of IO is within performance targets. We may want 80% or 95% of IO coming in within performance targets. So how do we determine this?

a) Locate the database on the “MAPS” instance and query it for IO broken down into 15-minute periods:

-- Bucket samples into 15-minute periods (datetime as float days * 24 * 4 = quarter-hours),
-- average the per-sample IOPS, then scale by the number of machines
SELECT   CAST(FLOOR(CAST(pd.collection_datetime AS float(53)) * 24 * 4) / (24 * 4) AS smalldatetime) AS [time],
         CEILING(SUM(pd.disk_transfers_per_sec) / COUNT(*) * COUNT(DISTINCT pd.device_number)) AS IOPS
FROM     dbo.performance_disk AS pd
         INNER JOIN dbo.devices AS d ON pd.device_number = d.device_number
WHERE    pd.instance = N'_Total'
GROUP BY CAST(FLOOR(CAST(pd.collection_datetime AS float(53)) * 24 * 4) / (24 * 4) AS smalldatetime)

OK, I know the query is horrible, but it's a once-off!

b) Copy and paste the results into Excel

[Screenshot: query results pasted into Excel]

c) Click Insert – PivotTable and build a PivotTable with IOPS on the rows and Count of time twice in the values area. Name the first count of time “Freq” and the second “Quartile” (really a running percentile, but the name will do).

[Screenshot: PivotTable layout with the Freq and Quartile value fields]

d) Format the second value (the one we named “Quartile”) as “% Running Total In” with IOPS as the base field

[Screenshot: the “% Running Total In” setting]

e) Click Insert – 2-D Line Chart, right-click the second series (Quartile), select Format Data Series – Plot Series on Secondary Axis, and tidy up the chart with nice formatting.

[Screenshot: finished distribution chart with the Freq and Quartile series]

 

We can now see some interesting data: while the average IOPS is about 19.5, the 80th percentile is about 25 and the 90th percentile is about 34.

This might help specify SAN requirements, as we could now say that our workload requires:

  • 25 IOPS at a latency of 8ms (80th percentile)
  • 34 IOPS at 20ms (90th percentile)
  • a max of 37 IOPS (100th percentile)

OK, these are just small play numbers, but the concept is there. Some wider points to consider:

a) This assumes that we don’t want to massively increase performance

b) We may need to add more IOPS for future growth and other factors
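If you'd rather read the percentiles straight out of SQL instead of off the chart, a minimal sketch is below. It reuses the 15-minute query from step (a) as a CTE; note that PERCENTILE_CONT needs SQL Server 2012 or later, so treat this as an alternative rather than part of the original recipe.

-- Percentile IOPS straight from the MAP database (SQL Server 2012+)
WITH iops_by_period AS
(
    SELECT   CAST(FLOOR(CAST(pd.collection_datetime AS float(53)) * 24 * 4) / (24 * 4) AS smalldatetime) AS [time],
             CEILING(SUM(pd.disk_transfers_per_sec) / COUNT(*) * COUNT(DISTINCT pd.device_number)) AS IOPS
    FROM     dbo.performance_disk AS pd
    WHERE    pd.instance = N'_Total'
    GROUP BY CAST(FLOOR(CAST(pd.collection_datetime AS float(53)) * 24 * 4) / (24 * 4) AS smalldatetime)
)
SELECT DISTINCT
       PERCENTILE_CONT(0.8) WITHIN GROUP (ORDER BY IOPS) OVER () AS pct80,
       PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY IOPS) OVER () AS pct90,
       MAX(IOPS) OVER () AS pct100
FROM   iops_by_period;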

A sample Excel sheet with results is attached. I'd love to hear if anyone finds this useful!

 

Download File - IO_Distribution.zip

As mentioned previously (http://blogs.prodata.ie/post/Using-MAP-Tool-and-Excel-to-Analyse-IO-for-SQL-Consolidation-Part-I-e28093-basic-distribution.aspx), the MAP tool has a nice database containing all your disk IOPS data, nicely contained in a single table called “performance_disk”.

We can use PowerPivot to analyse the IOPS from different perspectives and start to apply some logic, such as excluding non-business hours, seeing demand by date, time and other factors, and examining read/write ratios.
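On the read/write ratio point, a rough sketch of the kind of query involved is below. I'm assuming here that performance_disk also carries the perfmon read and write counters as disk_reads_per_sec and disk_writes_per_sec columns, and that dbo.devices has a computer_name column; check the schema of your MAP version before relying on these names.

-- Rough sketch: overall read/write ratio per server
-- (disk_reads_per_sec, disk_writes_per_sec and computer_name are assumed column names)
SELECT   d.computer_name,
         SUM(pd.disk_reads_per_sec) / NULLIF(SUM(pd.disk_writes_per_sec), 0) AS read_write_ratio
FROM     dbo.performance_disk AS pd
         INNER JOIN dbo.devices AS d ON pd.device_number = d.device_number
WHERE    pd.instance = N'_Total'
GROUP BY d.computer_name;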

I’ve attached a sample PowerPivot mashup that shows some more detail on one sheet (sample below).

[Screenshot: PowerPivot IO analysis sheet]

As it’s PowerPivot, we can do some funky stuff like dragging “ServerName” onto a chart to see the breakdown by actual servers and locate the IO gobblers (see below).

[Screenshot: IOPS broken down by ServerName]

Non-BI folk can stop reading now. Go away and play with the PowerPivot sheet, refresh it against your MAPS database, and hopefully it’s useful.

----------------------------------------------------------------------------------------------------

For those BI folk, a few interesting challenges for PowerPivot:

  • The IOPS figure is semi-additive: we need to average the IOPS across time, but sum it across servers to get the correct figure. We do this by simply averaging the IOPS and then multiplying by the number of unique servers, which is determined by a DAX expression (see the sketch after this list):
    [Screenshot: DAX expression for the distinct server count]
  • For a better distribution graph we want to “band” IOPS into buckets, for example 20-25, 25-30 and so forth. To do this we can use the DAX “CEILING” function, which is more powerful than the T-SQL equivalent as it lets you round up to a significance such as 5, 10, 50 or 100. The example below rounds IOPS up to the nearest two (again, see the sketch after this list):
    [Screenshot: DAX CEILING expression banding IOPS]
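For reference, the two expressions look roughly like the sketches below. The table and column names (performance_disk, ServerName, and [IOPS] standing in for whatever column holds the per-sample figure) are from my workbook, so adjust them to match yours:

-- Measure: average the per-sample IOPS, then scale by the number of distinct servers
=AVERAGE(performance_disk[disk_transfers_per_sec]) * COUNTROWS(DISTINCT(performance_disk[ServerName]))

-- Calculated column: band IOPS up to the nearest two for the distribution graph
=CEILING([IOPS], 2)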

 

Download File - MAP_IO_Analysis.xlsx

Storage area networks are getting larger and more sophisticated, and one of the problems I face as a SQL consultant is running sqlio against them.

One common stumbling block is that the cache on some storage area networks can be huge (128GB plus), especially with solid state devices used as a tier “zero” cache accelerator. If our test file is too small we simply get the “from cache” speed of the SAN, and while this is interesting, it’s not indicative of performance when we throw a couple of TB of SQL Server data onto it. The fact is that read cache on a SAN is often a waste of money for SQL Server anyway, as the cache hit ratio on the SAN will be incredibly low: SQL Server already has its own cache in the buffer pool, so it’s not going to be reading data from the SAN’s cache anytime soon unless the SAN cache is larger than the SQL Server buffer pool (possible, but unlikely).

So what do we do when the SAN has a 64GB cache? The guideline we use is that the test file should be about four times the size of the cache, so you’re going to need a 256GB test file.

What if you don’t want to wait for a 256GB test file to be created because you only have limited test time?

Here’s a neat trick that Joe Sack from the SQL MCM program showed me: use SQL Server’s “Instant File Initialisation” feature to massively speed up creation of test files.

1. Ensure that the SQL Server Database Engine is installed and that Instant File Initialisation is enabled (grant the SQL Server service account the “Perform volume maintenance tasks” right in the local Security Policy editor).

2. Create a database with an MDF file of the correct size (say 256GB) and detach the database (see the T-SQL sketch after these steps)

3. Modify the parameter file that sqlio uses to specify that you have pre-sized the file (see below)

4. Away you go.
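For step 2, a minimal sketch is below; the database name, paths and sizes are just placeholders. Note that only the data file benefits from Instant File Initialisation; log files are always zeroed out, so keep the log small.

-- Create a pre-sized data file; with Instant File Initialisation on, this is near-instant
CREATE DATABASE SqlioTest
ON PRIMARY (NAME = SqlioTest_data, FILENAME = 'D:\sqlio\SqlioTest.mdf', SIZE = 256GB)
LOG ON     (NAME = SqlioTest_log,  FILENAME = 'D:\sqlio\SqlioTest.ldf', SIZE = 1GB);
GO

-- Detach so sqlio can open the MDF directly
EXEC master.dbo.sp_detach_db @dbname = N'SqlioTest';
GO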

Modifying the sqlio parameter file:

Usually the parameter file for sqlio is called something like “param.txt” and looks like this:

[Screenshot: a typical sqlio param.txt]

You can simply change the filename to be the MDF file and then delete the last parameter so that the size is not specified.

[Screenshot: param.txt pointing at the pre-sized MDF, with the size parameter removed]
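By way of illustration (the thread count, mask and paths here are made-up values; the fields are filename, number of threads, affinity mask and file size in MB), a param.txt line like:

c:\testfile.dat 8 0x0 262144

would become:

D:\sqlio\SqlioTest.mdf 8 0x0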

Using this technique I recently had a 1TB test file generated in under three seconds. Nice ;)
