Time for a Change

This past year has been a year of changes for me.

Throughout this year I’ve been very blessed to develop some close friendships within the SQL Community, and I don’t take those for granted. So thanks to everyone who invited me along, or at least put up with me.

However, the changes aren’t done yet. I began my career at Magenic in March of 2007, and after almost six years of employment I have decided it’s time for a change. I came to Magenic looking for new technological as well as business challenges, and Magenic did not disappoint. I cannot state enough how much I have enjoyed my time at Magenic, both professionally and personally. I have no problem recommending it as a place of employment, and I hope to maintain many of the relationships I made while employed there. In fact, this was a very difficult decision to make, primarily because of the people and the Magenic “family”. However, effective Friday, January 4, I resigned.

A New Chapter

On Monday, January 7, I began work at Huron Consulting Group, specifically as a Database Solutions Architect within the Legal Practice, working on Electronic Discovery Solutions. What is E-Discovery? Imagine party A has a legal matter against party B, such as a lawsuit. Once the framework for that legal matter is decided upon, there is a process of discovery. This is basically a phase where both parties share information about the legal matter. Now, the process of “discovering” the information to share can be fairly complex, time-consuming and expensive. Imagine all your company’s data – email, server shares, server logs, laptop hard drives, desktop hard drives, etc. Something, and usually someone, must go through that data to see what needs to be shared.

There are certainly some very difficult challenges, such as using technology to reduce the number of documents that need to be reviewed by a lawyer – through identifying and removing duplicates, along with keyword and semantic analysis. Once a document is reviewed, the lawyer will mark it up in a number of different ways to indicate that it should be shared, redacted or handled in some other way. So we are talking about large amounts of data, data movements, reports and analytics – both to fulfill client needs and to support the process behind the scenes, where Huron needs to determine things like capacity, velocity, cost and other metrics to drive their business. I’m excited about this move, as I’m expecting it will provide many challenges and opportunities to learn.

Here’s to a new year, a new chapter and a new challenge.

Temp Table vs. Table Variable

This is a comparison summary that I’ve shared to describe the differences between temp tables and table variables. A few people have asked that I post it, so here you go:

Comparison Summary

This is a summary of the functionality and behavior of the different object types. Note that global temporary tables aren’t included, because typically you use those for different purposes than the other types of temporary objects. You might find this table handy as a reference when you need to choose the appropriate temporary object type for a given task.

| Functionality | Local Temp Table | Table Variable | Table Expression |
|---|---|---|---|
| Scope/Visibility | Current and inner levels | Local batch | Derived table/CTE: current statement; View/Inline UDF: global |
| Physical representation in tempdb | Yes | Yes | No |
| Part of outer transaction/Affected by outer transaction rollback | Yes | No | Yes |
| Logging | To support transaction rollback | To support statement rollback | Yes |
| Locking | Yes | No | Yes |
| Statistics/recompilations/efficient plans | Yes | No | Yes |
| Table size | Any | Typically recommended for small tables | Any |
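The scope difference in the first row is easy to demonstrate. Here’s a minimal sketch (the table and column names are just made up for illustration): a temp table is visible to inner levels such as dynamic SQL or called procedures, while a table variable is local to its batch.

```sql
-- Temp tables are visible to inner execution levels; table variables are not.
CREATE TABLE #Scoped (Id int);
DECLARE @Scoped table (Id int);

EXEC ('SELECT COUNT(*) FROM #Scoped;');    -- works: inner level sees the temp table
-- EXEC ('SELECT COUNT(*) FROM @Scoped;'); -- fails: @Scoped isn't declared in that batch

DROP TABLE #Scoped;
```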

Temp tables can be altered with DDL statements, but table variables can’t (so you cannot create a nonclustered index on a table variable, for example). That makes every table variable a heap, or at best a table with a single clustered index, and every table variable access a table scan (or a clustered index scan, which is essentially the same thing).
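To make that concrete, here’s a quick sketch (table names are made up): post-creation DDL works against the temp table, while the table variable only gets whatever indexes you can declare inline as constraints.

```sql
-- Temp tables accept DDL after creation; table variables do not.
CREATE TABLE #Orders (OrderId int, CustomerId int);
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId ON #Orders (CustomerId); -- allowed

DECLARE @Orders table
(
    OrderId int PRIMARY KEY, -- an inline constraint is the only way to get an index
    CustomerId int
);
-- CREATE INDEX IX ON @Orders (CustomerId); -- not valid T-SQL

DROP TABLE #Orders;
```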

SQL Server creates statistics for temp tables, so the query optimizer can choose different plans for data involving temp tables. However, SQL Server does not create statistics on columns in table variables, so the only way to access data in a table variable is through a table scan.
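You can see the effect in the estimated row counts of the actual execution plans. A hedged sketch (object names are made up): the temp table query gets estimates backed by statistics, while the table variable is typically estimated at a very low row count unless you force a recompile.

```sql
-- Load a temp table with a non-trivial number of rows.
CREATE TABLE #Big (Id int);
INSERT INTO #Big (Id)
SELECT TOP (100000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns AS a CROSS JOIN sys.all_columns AS b;

-- Statistics exist here, so the optimizer can estimate cardinality well.
SELECT COUNT(*) FROM #Big WHERE Id < 1000;

DECLARE @Big table (Id int);
INSERT INTO @Big (Id) SELECT Id FROM #Big;

-- No column statistics on @Big: compare the estimated vs. actual rows in the plan.
-- OPTION (RECOMPILE) at least lets the optimizer see the current row count.
SELECT COUNT(*) FROM @Big WHERE Id < 1000 OPTION (RECOMPILE);

DROP TABLE #Big;
```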

Now go read Paul White’s (b | t) post on Temporary Tables in Stored Procedures.

Table Partitioning vs. Partitioned Views

Recently I was doing some work trying to figure out whether to use table partitioning or partitioned views in a data warehouse to solve a problem. In this particular data warehouse the data cannot be partitioned by date; it’s actually partitioned by a value in a table. I was leaning towards table partitioning, and talking with Brent (b | t), we put this quick comparison up on a whiteboard:

| | Table Partitioning | Partitioned Views |
|---|---|---|
| Parallel loads | Y | Y |
| Query partition elimination | Y | Y |
| Divide filegroups | Y | Y |
| Index tuning | Easy (single table means a single set of indexes) | Powerful (different indexes on different tables) |
| Schema changes doable | N (single table means adding a column is time-consuming) | Y (different tables can have different schemas, managed through views) |
| Enforced relational integrity | Y | N |
| Freeze data 10% R/O, perfect stats/indexes | Y | Y |

As I said, I was leaning towards table partitioning while Brent was leaning towards partitioned views. My argument was for referential integrity, while Brent’s was the ability to make schema changes. Well, it turns out that Brent won (big surprise, huh?), though not for any of the reasons listed. The reason we went with partitioned views is that once they are in place, we can then implement table partitioning on the individual member tables. Combining these two techniques can be very powerful (if done correctly, that is). If we went straight to table partitioning, we would have limited our future options in this regard.
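A hedged sketch of what that combined approach might look like (all names here – ClientID, Matters, etc. – are made up for illustration, not the actual warehouse schema): member tables carry CHECK constraints on the partitioning value so the optimizer can eliminate tables at query time, and any individual member table can later be partitioned on its own without touching the view.

```sql
-- One member table per partitioning value, each with a CHECK constraint
-- so the optimizer can eliminate member tables during query processing.
CREATE TABLE dbo.Matters_Client1
(
    ClientID int NOT NULL CHECK (ClientID = 1),
    MatterID int NOT NULL,
    CONSTRAINT PK_Matters_Client1 PRIMARY KEY (ClientID, MatterID)
);
CREATE TABLE dbo.Matters_Client2
(
    ClientID int NOT NULL CHECK (ClientID = 2),
    MatterID int NOT NULL,
    CONSTRAINT PK_Matters_Client2 PRIMARY KEY (ClientID, MatterID)
);
GO
-- The partitioned view unions the member tables back together.
CREATE VIEW dbo.Matters
AS
SELECT ClientID, MatterID FROM dbo.Matters_Client1
UNION ALL
SELECT ClientID, MatterID FROM dbo.Matters_Client2;
GO
-- Later, any one member table can itself be rebuilt on a partition scheme
-- (CREATE PARTITION FUNCTION / SCHEME) without changing the view.
```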

So the lesson of the day is: when planning architecture, think, whiteboard, and then think some more. Do not just open up SSMS and start typing code.