SQL Centre of Excellence

Few people are naturally “wide” enough to cover the whole of the SQL MCM with ease, and no I don't mean in terms of eating burgers, ice cream, beer and carbs.

When I went to Redmond, I knew that I was weak in clustering, replication and partitioning. I may have ahem “architected” some clusters while working for Microsoft Consultancy Services and even reviewed some design documents, best practices, etc but I had not actually got my hands dirty. It was getting embarrassing as sales guys were even saying “You have a SQL cluster issue, you need to speak to our guy Bob. Bob – show them what's wrong will you”. A friend who runs the dell training centre took me through a series of labs and we made clusters, broke them, player with quorum models, looked into logs, etc. After that one evening some weight was lifted off my shoulders. I could now start to merge the theory and practice a bit better with some affirmation from someone who has set up maybe hundreds of environments. I also stopped avoiding clusters like the plague.

Some advice. If you don’t have and know your weak areas, either your already already an MCM, MCA and BI Maestro (and your name is Dirk Gubbels). OR maybe you haven’t looked hard enough.

Find those weak areas and embrace them. If you are going for MCA don’t be afraid to make it clear on your skills, where they blur and where they end. In my MCA review board I was asked a truck load of questions on ITIL, operations and data centre manageability, I recommended that we bring in a team including an Ops consultant (for which I had some contact), and explained that while I could help them navigate through some of the (obvious) high level issues to engage deeper we needed a specialist rather than just myself. I was more a solution/technical design and infrastructure dude than hard core data centre operations. Expecting a ding, I actually got a comment from the review board of “solid – knows what he knows and what he doesn’t know”. So don’t expect to be an expert in everything. In the MCM you may have to guess some questions as its a Microsoft Learning exam but in MCA – don’t bullshit so much. In the real world you don’t need to personally be an expert in everything to get to the correct result.

If you have read the "http://blogs.prodata.ie/post/So-you-think-your-a-SQL-MCM-Test-Yourself-Part-I.aspx” here some more questions to test where your at. If you read these questions and say “that was fun, must check out 1-2 things” you are ready. If you read them and go “holy crap”, maybe you need to watch the videos some more and attend some of the level 400-500 training events that are going around.

Health Warning: These are not actual MCM questions. Just some questions to demonstrate the type depth I personally would expect to see. If you pass these questions and fail the MCM – sorry, I’ll buy you a beer…

Locking and Blocking

1. How can you see the locks for a specific SPID (two common methods)

2. Someone has a shared lock on a resource. Can an update lock also be serviced?

3. Name two types of locks that can be made if an IX lock is in place ?

4. What can cause a row lock to escalate to a page lock?

5. How can you implement partition level locking?

6. When would you consider the “UPDLOCK” instead of the default shared lock??

7. How can you control lock timeout

8. Name two ways to deal with lock timeout in procedural code?

9. How can you avoid blocking (up to five methods)?

10. What is the relevance of the “'blocked process threshold”

11. How can you avoid deadlocks (six techniques)

Snapshot Isolation

1. What is ACID?

2. Someone issues a begin tran and then two identical select statement. Under what isolation levels will the same results return if updates are ongoing?

3. When RCSI is enabled what is increase in row length and why?

4. What is the difference between RCSI and SNAPSHOT?

Partitioning

1. What are the pros and cons of PV’s versus PT’s?

2. Name two improvements in query plans on partitioned tables in 2008+ over 2005

3. Are there any queries that will run slower on partitioned tables v non partitioned. Name some examples?

4. How do you enable partition level lock escalation?

5. How and why would you disable lock escalation completely on a partitioned table?

6. A customer with a complex OLTP is having performance problems. The developers have read about partitioning and want to apply this. Is this a good idea?

Index Fragmentation

1. What is the largest IO size with read ahead in enterprise and std edition ?

2. What is the largest amount of read ahead that Enterprise and standard edition will do ?

3. Do you want to change your answer to question (1) after doing question 2 ;-)

4 Explain the difference between logical, extent and physical fragmentation. Which ones prevent SQL Server read ahead from using larger IO size?

5. Why is the perfmon counter “Page/Splits per second” not very useful for detecting harmful page splits?

6. How can you detect harmful page splits versus natural ones?

7. List three ways to fix fragmentation

8. What are the pros and cons of using MAXDOP of 1 on an index rebuild for a data warehouse ?

9. Does a clustered index rebuild rebuild all the Non clustered Indexes ?

10. What can prevent online index rebuild operations (name three things)

11. How can you monitor the progress of an index rebuild?

HA - Backup

1. A log backup is started while a full backup is being taken. Will the log record by cleared ?

2. A crash happens during a backup. Will transactions started before the backup be written to the data file on disk? why ?

3. Does a backup also include any log details ? why ?

4. What does BACKUP LOG WITH TRUNCATE ONLY do?

5. How do you backup the “tail” of the log following disk failure ?

6. Will taking a full backup break the log chain ?

7. Name two common ways the log chain is broken on log backups ?

8. What is the largest IO size you can get with backups and how can you get it ?

HA - Restore

1. How do you set instant file initialization?

2. How much faster does instant file initialization make the log file creation?

3. Apart from a faster disk (and instance file initialization) how can you make restore form a full back up faster?

4. You have a customer that needs to create a 100GB log file on restore. How can they make this faster without additional hardware?

5. How can you determine how long the “undo” phase of restore will take ?

6. If you restore with “NORECOVERY” is the redo phase applied

7. What affects how long the “redo” phase takes ?

8. Why should you not use SSMS when recovering from a disaster ?

9. When and why would you use RESTORE..WITH STANDBY ?

10. How does this DMV help with a restore strategy ? “dm_db_persisted_sku_features”

HA – Consistency and Corruption

1. What is the number one cause of corruption ?

2. What is the difference between a hard and soft IO issues ?

3. What is a read-retry error ? does it result in data loss ?

4. Name two places 823, 824 and 825 I/O errors are logged ?

5. What is the recommended tool to check a disk subsystem for IO errors for an OLTP SQL workload?

6. Can you whiteboard the difference between torn page detection and checksum ?

7. What does checksum protect from that torn pages does not ?

8. You switch a database from torn-page detection to checksum. You run a query which will read a page from disk that is corrupted. At what point will the error be detected.

9. What types of errors does automatic page repair work with ?

10. Name some improvements in DBCC CHECKDB introduced in 2005 ?

11. Name some improvements in DBCC CHECKDB introduced in 2008 ?

12. Does DBCC CHECKDB check constraints?

13. How do you estimate how much tempdb may be required for a DBCC CHECKDB ?

14. DBCC checkDB always takes 20 minutes to run, Today it is 60 minutes and still going.

a. What is the most likely cause?

b. How can we help answer how long it will take?

Post any comments on questions that bug you. Would love to get an existing MCM to help provide answers but I’m too lazy ;-)

Whops I accidently deleted the original list of questions. Like a true data professional I don’t backup my  blog. Here’s of the original questions.

Internals

1. What’s an example of a legacy LOB type v new LOB Type ?

2. Are Legacy LOB Types stored on/Off row by default

3. Are New LOB Types stored on/off row by default ?

4. How can you change this behaviour for legacy LOB Types?

5. How can you change this behaviour for New LOB Types?

6. If you change row storage for a LOB from “on row“ to “off row” on a 100GB database how long will it take ?

7. How big in bytes is the “version tag” for versioned rows and what does it store ?

8. How often to PFS pages appear in a data file?

9. How often to IAM pages appear in data file?

Index Strategies

1. A table has 1,000 pages with 100 rows per page. A query on an (select * from blah where MyColumn=1) column returns 500 rows and there is an index on MyColumn. Will it use the index if the index is non-clustered?

2. Can an index include LOB data types?

3. How big can the clustered Index Keys on a Materialised View be?

4. Recommend TWO Ways to get rid of unused indexes if you have a representative trace file

Log Internals

1. Is there such a thing as a non-logged operation ? If so, examples?

2. How Can you get minimally logged inserts on a clustered table with rows in it ? (specify two things to do)

3. A Developer issues a begin tran followed by an update and then a checkpoint. Are the changes written to disk in the checkpoint ?

4. How do you resolve a “checkpoint IO storm”

5. Someone creates a 1GB log file. How many VLF’s are there ?

6. Is a log backup smaller or larger with BULK_LOGGED than with FULL logging mode ?

7. Is rebuilding of an index in simple mode minimally logged?

8. Is re-org of an index in simple mode minimally logged?

9. You need to update a 100GB BLOB when in bulk logged mode. Can this be done minimally logged? If so, how?

10. If you switch from full to bulk and then MDF corrupts. Can you backup the tail log ?

11. How can you make the log file initialisation faster ?

Statistics

1. How many steps are in a stats histogram

2. Are histogram values left or right based?

3. What can cause a recompile (there are 14 events). List as many as you can.

Page List

Page List