SQL Centre of Excellence

For those of you following my session at SqlBits X on load testing Analysis Services, you might want to tinker with some of the sample reports or tests.

Attached is the VS 2010 source code for reports and unit tests. You will need SQL 2012 RTM installed to edit the code, but you can deploy and test against 2008R2. You can also deploy the cube to SSAS 2005 if needed.

Happy to answer any questions. I’m probably too busy to make many enhancements, but feel free to do anything you like with the source code.

My actual scaled-up AdventureWorksDW database is too big to post (14 MB). I have an older post with tips for scaling up AdventureWorks here: http://blogs.prodata.ie/post/Scaling-AdventureWorksDW-for-BI-Testing-(1-of-2).aspx For my actual demo I used a much smarter technique of just using “views” to scale the data, so the actual size on disk is still quite small.

Sample Files Below

source code

SSAS Load testing presentation PPTX

If you are continually creating AlwaysOn availability groups, you may want to speed up enabling the AlwaysOn service within SQL Server Configuration Manager (see below). In my demo of creating a four node cluster from scratch, you “really” don’t want to manually log onto each node to reconfigure and restart the service!


Here is the PowerShell script I use in a demo to enable multiple nodes at once:

# ConfigureAlwaysOn.ps1
#       Enables AlwaysOn on multiple instances. See
# http://msdn.microsoft.com/en-us/library/ff878259(v=sql.110).aspx
Import-Module SQLPS -DisableNameChecking
Write-Host -ForegroundColor Green "Enabling AlwaysOn on NODE2"
Enable-SqlAlwaysOn -ServerInstance "NODE2" -Force > $null
Write-Host -ForegroundColor Green "Enabling AlwaysOn on NODE3"
Enable-SqlAlwaysOn -ServerInstance "NODE3" -Force > $null
Write-Host -ForegroundColor Green "Enabling AlwaysOn on NODE4"
Enable-SqlAlwaysOn -ServerInstance "NODE4" -Force > $null


Notice that between RC0 and RTM of SQL 2012 the name of the cmdlet changed from “-HaDrService” to “-SqlAlwaysOn”. The techies managed to catch up with the marketing people!

I am just doing some tests on Analysis Services 2012, ready for my talk on load testing at SqlBits X (http://sqlbits.com/Sessions/Event10/Load_Testing_Analysis_Services)

To quickly test cube processing, I took AdventureWorksDW, scaled up to 180 million rows and did a full cube process. 2008R2 was 4% faster than 2005, but SQL 2012 was some 56% faster than 2005!


I’m not yet sure what is causing the better performance; maybe the default thread settings for processing are better suited to modern hardware.

The test configuration was:

  • AdventureWorksDW scaled to 180 million rows for reseller sales
  • Partitioned by Month
  • Dell R410 server with 64 GB ram and 24 logical processors across 2 NUMA nodes

With the release of SQL Server 2012 (http://www.microsoft.com/sqlserver/en/us/default.aspx) and BIDS Helper 1.6 (http://bidshelper.codeplex.com/releases/view/68109), we can now finally develop for Analysis Services in Visual Studio 2010.

Even better, from a single project we can deploy to SQL 2005, SQL 2008R2 and SQL 2012. This means that the upgrade from 2008 to 2012 does not even have to involve an upgrade of Visual Studio.

How to Target Multiple versions of Analysis Services

I simply set up a separate configuration for each environment, as in the screenshot below. Each configuration just has a different server name as the deployment server.

Here’s a screenshot of a test server with 2005, 2008R2 and 2012 running side by side, all deployed from the exact same source code in VS 2010. Not a sign of BIDS 2005 or 2008.


So What's the Catch – Is this dancing babies?

Well, unfortunately you can only get the BI components in Visual Studio 2010 if you have also installed some of the SQL 2012 client tools. This doesn’t affect me, as I work with most editions of SQL Server, but it may affect someone who does not have access to the SQL 2012 client.

When evaluating load test tools for Analysis Services, there are a number of tools available on CodePlex, or you can roll your own using a load testing suite such as Visual Studio Test Edition.

What do you need from a load test tool? Here is a list of qualities you may want to consider when selecting tool(s):

  • Easy to get started: The toolset should have enough to get you up and running quickly without writing buckets of code or, even worse, custom XML files.
  • Capture trace: The toolset should support automatically capturing a trace file suitable for replay.
  • Works with parameters: The tool should deal with parameterised queries (Reporting Services), from both a capture and a replay perspective.
  • Folder playback: Does the tool work with queries saved into a folder? (Pretty basic and messy.)
  • Table playback: Does the tool support loading queries directly from a trace SQL table? (Nicer.)
  • Query replay: Does the tool allow replay of queries?
  • Trace replay: Can the tool play back a trace file or trace table?
  • Multi-threaded: Can the tool play back on multiple threads?
  • Distributed load: Can the tool use multiple agents for playback?
  • Distributed replay: Can the tool replay a trace file using a distributed method?
  • NLB simulation: Can the tool simulate multiple Analysis Services servers in a farm?
  • Capture performance counters: Can the tool automatically capture relevant performance counters from Analysis Services and Windows?
  • Chart performance counters: Can the tool collate and chart counters to help identify bottlenecks?
  • Performance thresholding: Can the tool automatically identify known issues from performance counters?
  • Store results: Can the tool label and store results so they can be retrieved at a later date?
  • Reporting tools: Can the tool produce reports and charts on performance?
  • Regression testing: Can the tool compare a baseline to other runs and identify regression issues?
  • Functional/upgrade testing: Can the tool verify that two environments produce the exact same output?
  • Reporting Services: Can the tool also load test Reporting Services or web sites for end-to-end load testing?


I quickly looked at a few of the toolsets available:

  • ASCMD for stress testing: Great for automating capture, but replay is fairly basic and it does not scale to distributed load or offer performance counter capture/analysis by itself.
  • AS LoadSim: Written by Microsoft Consulting Services, this toolset uses Visual Studio to load test from a custom XML format, which can be generated using the “AS Query Generator”.
  • AS Performance Workbench: Assuming you have sample queries in a folder, this toolset replays queries, captures and shows performance graphs, and produces reports. Very easy to set up and use.
  • SQL Profiler: Great for capturing and replaying workloads, although it does not support distributed replay for MDX – even in SQL 2012.
  • Custom VS 2010 load test: Roll your own code; not as hard as you think, and more flexible.

We’ll be publishing our own solution soon, after SqlBits X.


So how do these tools stack up on features? A rough comparison is below:



The conclusion: if you want a basic test, then AS Performance Workbench combined with Profiler or ASCMD to help capture a workload will get you started fastest.

If you want something a lot more scalable, or more sophisticated with regression or functional testing, then a Visual Studio 2010 load test is going to suit you better. I’ll publish some sample code to help get started with the feature matrix above.

I see a lot of customer MDX queries where comments are used to help explain complex logic inside an MDX query, or to remove a block that is not needed.

Please avoid using the single line style of comment as below:

SELECT -- Reseller Sales Amount by Quarter
[Measures].[Reseller Sales Amount] ON 0
,[Date].[Calendar].[Calendar Quarter].Members ON 1
FROM [Adventure Works];

This causes havoc for diagnostics and load testing: when we capture a trace, the entire query is concatenated onto one line as below, so everything after the “--” becomes part of the comment and the query is no longer valid:

SELECT -- Reseller Sales Amount by Quarter [Measures].[Reseller Sales Amount] ON 0 ,[Date].[Calendar].[Calendar Quarter].Members ON 1 FROM [Adventure Works];
A better approach is to use the start and end comment block approach, so that when the query is reduced to one line it is still valid syntax. 

SELECT /*Reseller Sales Amount by Quarter*/ [Measures].[Reseller Sales Amount] ON 0 ,[Date].[Calendar].[Calendar Quarter].Members ON 1 FROM [Adventure Works];

We do have techniques to clean the comments out of the MDX, but the best way I have found so far is to insert CRLFs before known tokens that comments tend to precede (very messy).
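By contrast, block comments can be stripped cleanly from a captured trace line with a simple regex. A rough sketch (my own helper, and it deliberately ignores edge cases such as comment markers inside string literals):

```python
import re

def strip_block_comments(mdx):
    """Remove /* ... */ block comments from captured MDX.

    Block comments are safe to strip even after a trace has flattened
    the query onto one line. Single-line -- comments are not: once the
    line breaks are gone, there is no end marker to stop at.
    """
    return re.sub(r"/\*.*?\*/", " ", mdx, flags=re.DOTALL)

# A query flattened onto one line by the trace capture
flattened = ("SELECT /*Reseller Sales Amount by Quarter*/ "
             "[Measures].[Reseller Sales Amount] ON 0 "
             ",[Date].[Calendar].[Calendar Quarter].Members ON 1 "
             "FROM [Adventure Works];")
print(strip_block_comments(flattened))
```

The stripped query remains valid MDX, which is exactly why the block-comment style survives a trace capture and the single-line style doesn’t.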

As an aside, if you also code in T-SQL on the DB engine platform, commenting queries is an even bigger no-no. This is because the hash of the entire query text, “including comments”, is used to identify similar query plans. If you have the same queries with different comments, the query processor will have to compile a separate plan for each of them!
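To see why, here is a toy version of that plan cache matching (MD5 stands in for SQL Server’s internal hash, and the table and column names are just for illustration):

```python
import hashlib
import re

def query_fingerprint(sql):
    # The plan cache matches on a hash of the full query text,
    # comments included (MD5 stands in for the real hash here).
    return hashlib.md5(sql.encode("utf-8")).hexdigest()

q1 = "SELECT SUM(SalesAmount) FROM FactResellerSales -- monthly report"
q2 = "SELECT SUM(SalesAmount) FROM FactResellerSales -- ad hoc check"

# Same query, different comments => different fingerprints => two plans
print(query_fingerprint(q1) == query_fingerprint(q2))  # False

def strip_comments(sql):
    return re.sub(r"--[^\n]*", "", sql).strip()

# Strip the comments and the two queries collapse to a single plan
print(query_fingerprint(strip_comments(q1)) ==
      query_fingerprint(strip_comments(q2)))  # True
```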

So let’s recap:

  • Comments in compiled code = good.
  • Comments in the middle of query statements = handle with care, or avoid completely.

My first glimpse of a lightning talk was at SqlBits. Kasper de Jonge delivered a superb session on what’s new in PowerPivot 2.0. In five minutes he managed to go through and demo 80% of the key new features – I felt like Neo in the Matrix getting information uploaded super fast via a jack. No BS – just a straight, informative session with lots of value. It saved me days of reading blogs and playing around with the beta.

I’ve accepted the challenge now ;) At #sqlsat105 there is an open session for lightning talks. Each session is strictly 5 minutes and anything goes. My advice is don’t try to cover too much content, and don’t give death by PowerPoint. For someone who is used to delivering 60 to 90 minute sessions, getting a message down to five minutes is quite a challenge!

If you are new to public speaking, this is a great chance to try your hand and get some experience. If you are a seasoned speaker, this is a chance to hone your five minute, no-BS skills.

Not sure what topic I am going to do. I was thinking of a session entitled “Microsoft Access for SQL Server folk”, or maybe a five minute demo of what’s new in Power View (err, everything).

You can either just turn up on the day and register interest at the reception desk, or email the SQL Saturday 105 crew at sqlsaturday105@sqlsaturday.com

SWEET! You can now access Windows Azure Storage directly from SQL Server Management Studio. This means you can now simply right click on a backup taken in SQL Azure and select “Import Data-tier Application” to import the “bacpac” file onto your on-premise SQL Server. 2-3 clicks of the mouse and the cloud database is zapped to a local one.


You can also now right click on any on-premise database and select “Export Data-Tier Application” to create a bacpac and upload it direct into SQL Azure. Check out the new export screen shot below.


I’m still a big fan of the Migration Wizard on CodePlex (http://sqlazuremw.codeplex.com/), as it automatically fixes up minor issues like missing clustered indexes. However, the bacpac route is going to be a lot more efficient for larger data sets as it is compressed. You can even rename it to a “zip” file to poke around inside.

Holy crap, there is also a “Deploy to SQL Azure” option in SQL 2012 RTM, where database creation and provisioning are all automated. Nice..


So it’s pretty good for a v1.0, but what is it missing?

Well, life would be just complete if you could right click on a SQL Azure database and have the “Export” task do the export on the SQL Azure side, rather than dragging everything locally and creating the bacpac from your machine. If anyone else would like this, please vote on the Connect item I just added.


If you don’t give some love and attention to how you design your Analysis Services cube and how you optimise processing, things are often fine at very small cube sizes (under 2GB), but as you approach 10GB+ the first warning sign is usually horrendously long processing times.

The Analysis Services Performance Guide, Section 4.0 (http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=17303), can seem a little daunting, so I have summarised the three “quick wins” below.

The following tests use the 120 million row AdventureWorksDW that you can get from this blog article (http://blogs.prodata.ie/post/Scaling-AdventureworksDW-Cube-for-BI-Testing-(2-of-2).aspx).

Note that there is a lot more “tuning” you can do to improve processing performance, but these few tips are the “basics” to cover before you even look at tuning. I don’t want anyone asking me to look at RSS, their network card settings and jumbo frames when they are only using one out of 20 cores!

1. Partition Your Analysis Services Measure Groups to shorten processing time.

While partitioning can speed up queries in SSAS, it is even more essential for allowing processing to happen in parallel, especially with the trend towards larger servers with 40, 64 or even 80 logical cores (pity that SSAS 2008R2 can only address 64 cores!). To prove this I ran two tests:

  • The 120 million Internet Sales measure group with the default single partition
  • The 120 million Internet Sales measure group partitioned by month

With only a single partition, both the data warehouse and the Analysis Services server are “idling”; we can see just a single core working hard. To make matters worse, most modern servers with power management or Turbo Boost CPU features and lots of cores will take this as an indication that the workload doesn’t really need any juice and will lower the clock speed accordingly (ouch).

The performance counters show us shifting about 260k rows per second and only using 2 threads (one for reading and one for writing).


Now we try the same processing with partitions by month, and we get starkly different results. All the cores are humming along:


The performance counters are looking much healthier now at 758k rows per second. Note that we have 24 threads in use now, so we can assume that 12 partitions are being processed in parallel.


The improvement in processing time is pretty good: from 15 minutes down to 5 minutes.
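The effect is easy to see in miniature. Here is a hypothetical sketch (not SSAS itself), where each partition “process” is simulated with a short wait standing in for the read-and-aggregate work, and a pool of workers stands in for the processing threads:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def process_partition(name):
    time.sleep(0.05)  # stand-in for reading and aggregating one month's rows
    return name

# Twelve monthly partitions (names are illustrative only)
partitions = ["Internet_Sales_2008_%02d" % m for m in range(1, 13)]

start = time.perf_counter()
for p in partitions:              # one big partition = one stream at a time
    process_partition(p)
serial_secs = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=12) as pool:  # monthly partitions
    done = list(pool.map(process_partition, partitions))
parallel_secs = time.perf_counter() - start

print("serial %.2fs vs parallel %.2fs" % (serial_secs, parallel_secs))
```

The same total work finishes in a fraction of the wall-clock time once the partitions can be streamed side by side, which is exactly what the perfmon counters above are showing.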



2. Don’t use wide Fact tables
If you are only selecting 10 columns from a fact table that has 100 columns, then the number of rows packed per 8KB page in the database engine is going to be very low, and performance will really suffer as the data warehouse just can’t supply the rows fast enough.
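Back-of-an-envelope arithmetic shows the scale of the problem (8,096 bytes of usable space per page is an approximation, and per-row overhead is ignored):

```python
USABLE_PAGE_BYTES = 8096  # approximate usable space on an 8KB data page

def rows_per_page(row_bytes):
    return USABLE_PAGE_BYTES // row_bytes

narrow = rows_per_page(100)    # roughly the 10 columns the cube needs
wide = rows_per_page(2100)     # same row plus a 2,000 character column

print(narrow, wide)            # 80 vs 3: the scan reads ~27x more pages
```

Every page read now carries mostly data the cube never asked for, which is why throughput collapses in the test below.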

If you need to bring in lots of additional data that is not needed for the cube, and processing time of measure groups is important, then consider three options:

a) Creating a table with just the important attributes for the dimensional model, and another “junk” table for attributes that are not needed by the multidimensional Analysis Services model.

b) Maybe you need a large, sprawling data warehouse, and your Analysis Services dimensional model should come from a smaller, leaner data mart. See http://thomasivarssonmalmo.wordpress.com/2012/02/18/the-data-warehouse-value-proposition/

c) Create a narrower version of the fact table for the cube to use; this could be a physical table (preferable) or an indexed view. I like the indexed view if the situation is politically sensitive, data modelling wise – you can create a physical data mart in secret as, ahem, “tuning” and leave people to believe they have a single data warehouse.

Obviously some common sense is needed here. A “few” extra attributes aren’t going to hurt, but in-row CLOBs and very wide tables will.

To prove this I added a 2,000 character column to the fact table and measured the increase in processing time from the baseline of 15 minutes.


The table grew from 17GB to a whopping 220GB. More importantly, our processing throughput dropped from 260k rows per second to about 1k (see the perfmon screenshot below).


The chart below shows the difference in processing times for a single large partition: it went from 15 minutes to about 55 hours when we added the unwanted wide column to the fact table. Obviously the workload shifted from being memory bound to being storage bound.



3. Watch your Threads

Analysis Services defaults to allowing only 64 threads for storage operations such as processing. On your shiny new 40 core server this may not be enough, and you will see queuing on the “processing pool job queue length” counter.

During processing, watch to see if you fail to achieve >80% CPU while the idle thread count comes down to zero. This is a sign that you need to increase the Threadpool\Process\MaxThreads setting.
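That rule of thumb can be written down as a simple check. This is my own framing of the guidance above, not an official formula; the thresholds are the ones mentioned in the text:

```python
def should_raise_max_threads(avg_cpu_pct, idle_threads, job_queue_length):
    """Raise Threadpool\\Process\\MaxThreads only when processing looks
    thread-starved: CPU below ~80%, no idle threads left, and jobs queuing
    on the processing pool job queue."""
    return avg_cpu_pct < 80 and idle_threads == 0 and job_queue_length > 0

print(should_raise_max_threads(60, 0, 15))   # starved: raise the limit
print(should_raise_max_threads(94, 40, 0))   # CPU bound: leave it alone
```

The second case is the healthy one described below: high CPU with idle threads to spare means more threads would just cause context switching.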

We can see from the high idle thread count and lack of queuing that we haven’t hit the limit yet (see below). We are not too bothered, as the CPU is averaging 94%, so more threads would just cause context switching.


If you want to process more than ten partitions in parallel, you will also need to change the maximum number of database connections allowed by Analysis Services (see below).


4. Things that may not help as much as you think they would – Table Partitioning

On data warehouses with fact tables under, say, 30GB, the need to partition the Analysis Services cube comes much sooner than the need to partition the data warehouse. Sure, it’s a best practice to partition both, but bear in mind that partitioning the data warehouse will often not translate into “massive” performance gains for Analysis Services processing.

SQL Server Enterprise, for example, supports “merry go round” table scans, where multiple queries can share the results of the same table scan (http://sqlserverpedia.com/blog/sql-server-bloggers/merry-go-round-scans/), which mitigates this scenario a lot.

In our sample 120 million row AdventureWorksDW, partitioning the data warehouse led to about a 5% improvement in throughput. Not bad, but not as good as the 300% improvement from partitioning the Analysis Services measure group!

The first thing you are going to figure out is whether you even need a dedicated Analysis Services server.

Some of our smaller customers just go for a mini “appliance” server with Analysis Services, SharePoint, PowerPivot and sometimes even a small data warehouse installed on a single virtual machine with 16+ GB of RAM. For under 50 users, with maybe only 10 concurrent, a cube size under 10GB or so and light usage, this may work well, and it is certainly a good starting block for an SME, proof of concept or small department.

Larger customers may want to “co-habit” the data warehouse and the Analysis Services installation. This is an ideal option if the data warehouse is populated at night and Analysis Services is queried mainly during the day. Memory allocation is the problem here, as you will need to adjust memory settings in both SQL Server and Analysis Services. We have worked with 4-5 financial institutions who do this.

As the user count creeps into the hundreds, cube sizes start to go over 100GB, or queries become more intense, the need for a dedicated server emerges.

While there are some great reference architectures for data warehouses like Fast Track, there is no such equivalent for Analysis Services. Here are my thoughts based on what we see in our customer sites, our lab and load testing:

1. How much memory do I need?

Well, many of our customers with cubes under 50GB in size just make sure the memory is larger than the cube, “to be sure”, but as you scale to 100GB+ this becomes impractical. So you need to plan for all the dimensions and a good sizable chunk of the measure groups being cacheable. If you already have the cube and a load test harness, the “best” way to know how much memory is needed is simply to measure it! The [MSAS2008:Memory\Memory Usage Kb] performance counter will show the working set, and memory should be sized say 40% higher than this to allow for the OS and growth. Once you know how much memory you need, you’ll know whether you can buy a single socket server or whether you need to buy a dual socket server and only fit one socket (see below).
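As a worked example of that sizing rule (my own framing of the 40% headroom guideline above, not an official formula):

```python
def recommended_memory_gb(working_set_kb, headroom_pct=40):
    """Size RAM from the MSAS2008:Memory\\Memory Usage KB counter measured
    under load, plus ~40% headroom for the OS and growth (the rule of
    thumb above)."""
    working_set_gb = working_set_kb / (1024 * 1024)
    return working_set_gb * (100 + headroom_pct) / 100

# e.g. a 60GB working set measured during a load test -> plan ~84GB of RAM
print(recommended_memory_gb(60 * 1024 * 1024))
```

In this hypothetical case, 84GB rules out a typical 32GB single socket box, which is exactly the socket-versus-memory trade-off discussed below.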

2. How many sockets or CPUs do I need?

Two things we really need to be aware of are:

  • Analysis Services 2008R2 does NOT support more than 64 logical processors, so your four socket, 80 core server is a no-no until SQL 2012.
  • For many workloads, crossing NUMA nodes yields poor scaling and, in cases where the workload is memory bound, potentially slower performance.

So the largest server that we recommend by default for Analysis Services is a SINGLE numa node. If you need more throughput than a single node, you need to scale out using load balancing.

Now, we are not saying that you can’t scale Analysis Services beyond two sockets, but for most customers we have worked with in the under 500GB or so size range, performance has not scaled as well as you would hope with NUMA. Below is a load test we ran on a complex report, where a single NUMA node using 10 cores actually achieved more throughput, in terms of queries completed over a fixed time, than the same server with 20 cores. Let me repeat that – you may find that you can make your Analysis Services server serve more users by simply removing some CPUs!


The challenge here is that you may not be able to buy a single socket server, as these are often limited to only 32GB of RAM, so you may end up buying a dual socket server and only fitting one socket just to fit enough memory. Some of our customers even buy dual socket servers and “throw” one of the CPUs in a cupboard; it depends how flexible your supplier is!

3. What type of CPU and memory clock speed do I need?

This depends on your workload and how much you care about performance.

The formula engine in Analysis Services is single threaded, so in many cases, 1 good CPU will be faster than 64 slower ones.

If you have queries returning large cell sets, then memory clock speed will be critical.

Here is the selection we would make when buying a Dell R410. Note that with the licensing changes in SQL 2012, a four core configuration may become practical.


We have customers who use IBM or HP servers with the 10 core Intel Xeon E7-2870 (20 threads with hyper-threading). These have scaled to 20 concurrent connections running complex queries; with a 10-1 concurrency ratio of actual users to physical connections, this could be a user base of say 200 users.

As I mentioned before, be aware of the licensing changes, where one license will be required per four cores. This makes denser environments less appealing than fewer cores with a higher specification.

4. How many nodes do I need for Scale Out ?

The answer to this is quite simple: how many users can a single node support, and how many users do I want to plan for, accounting for the fact that we may need to survive on N-1 nodes if we are implementing high availability?

As a rough rule of thumb, allow maybe 1 node per 50 physical connections for smaller environments. If you have heavier queries, this may come down to say one node per 20 physical connections, or lower. Bear in mind that 50 actual connections may represent a user base of hundreds of concurrent users due to think time, etc.
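Those rules of thumb reduce to a one-line calculation. A hypothetical sketch (the function and defaults are my own encoding of the guidance above):

```python
import math

def nodes_needed(physical_connections, per_node=50, ha=True):
    """Rough scale-out sizing: ~50 physical connections per node for
    lighter queries (~20 for heavy ones), plus one spare node when high
    availability means surviving the loss of a node (N-1)."""
    nodes = math.ceil(physical_connections / per_node)
    return nodes + 1 if ha else nodes

# 10-1 think-time ratio: ~2,000 named users -> ~200 physical connections
print(nodes_needed(200))               # lighter queries, with HA spare
print(nodes_needed(200, per_node=20))  # heavy queries, with HA spare
```

For 200 physical connections that works out at 5 nodes for lighter queries and 11 for heavy ones, spare included.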

5. Can I virtualise Analysis Services ?

Absolutely, certainly more so than with the SQL Server DB Engine. Most hypervisors do not like to cross NUMA boundaries and Analysis Services does not like to cross NUMA boundaries so from this perspective they are good partners.

However, there is one huge conflict: for Analysis Services we want fast CPUs and fast memory. With virtualisation, people typically use much denser environments with slower CPUs and lots of cores to increase efficiency. You will pay your performance costs not because of the virtualisation itself, but because of the generic nature of the hardware used.

6. How can I prove I have the right hardware and it is configured correctly?

A load test is actually much easier than you think. Come and see my session at SqlBits X, which may be recorded ;)


Alternatively, I love nothing better than helping customers load testing Analysis Services!
