When I began working at Microsoft, I was very much a novice at performance troubleshooting. There was a lot to learn, and hash match joins were pointed out to me multiple times as the potential cause for a given issue. So, for a while I had it in my head, “hash match == bad”. But this really isn’t the case.

Hash matches aren’t inefficient; they are the best way to join large result sets together. The caveat is that you have a large result set, and that itself may not be optimal. Should it be returning this many rows? Have you included all the filters you can? Are you returning columns you don’t need?

If SQL Server is using a hash match operator, it could be a sign that the optimizer is estimating a large result set incorrectly. If the estimates are far off from the actual number of rows, you likely need to update statistics.

Let’s look at how the join operates so we can understand how this differs from nested loops

How Hash Match Joins Operate

Build Input

A hash match join between two tables or result sets starts by creating a hash table. The first input is the build input. As the process reads from the build input, it calculates a hash value for each row in the input and stores them in the correct bucket in the hash table.

Creating the hash table is resource intensive. This is efficient in the long run, but is too much overhead when a small number of rows are involved. In that case, we’re better off with another join, likely nested loops.

If the hash table created is larger than the memory allocation allows, it will “spill” the rest of the table into tempdb. This allows the operation to continue, but isn’t great for performance. We’d rather be reading this out of memory than from tempdb.

The building of the hash table is a blocking operator. This means the normal row mode operation we expect isn’t happening here. We won’t read anything from the second input until we have read all matching rows from the build input and created the hash table. In the query above, our build input is the result of all the operators highlighted in yellow.

Probe Input

Once that is complete, we move on to the second input in the probe phase. Here’s the query I used for the plan above:

USE WideWorldImporters
GO

SELECT *
FROM Sales.Invoices inv
INNER JOIN Sales.InvoiceLines invl
	ON invl.InvoiceID = inv.InvoiceID
WHERE
	inv.AccountsPersonID = 3002
GO

The build input performed an index seek and key lookup against Sales.Invoices. That’s what the hash table is built on. You can see from the image above that this plan performs a scan against Sales.InvoiceLines. Not great, but let’s look at the details.

There is no predicate or seek predicate, and we are doing a scan. This seems odd if you understand nested loops, because we are joining based on InvoiceID, and there is an index on InvoiceID for this table. But the hash match join operated differently, and doesn’t iterate the rows based on the provided join criteria. The seek\scan against the second table has to happen independently, then we probe the hash table with the data it returns.

If the read against Sales.InvoiceLines table can’t seek based on the join criteria, then we have no filter. We scan the table, reading 490,238 rows. Also unlike a nested loop join, we perform that operation once.

There is a filter operator before the hash match operator. For each row we read of Sales.InvoiceLines, we create a hash value, and check against the hash table for a match. The filter operator reduces our results from 490,238 rows to 751, but doesn’t change the fact that we had to read 490,238 rows to start with.

In the case of this query, I’d want to see if there’s a filter I can apply to the second table. Even if it doesn’t change our join type away from a hash match, if we performed a seek to get the data initially from the second table, it would make a huge difference.

Remember Blocking Operators?

I mentioned the build input turns that branch of our execution plan into a blocking operator. This is something try to call out the normal flow of row mode execution.

With a nested loops join, we would be getting an individual row from the first source, and doing the matching lookup on the second source, and joining those rows before the join operator asked the first source for another row.

Here, our hash match join has to gather all rows from the first source (which here includes the index seek, key lookup, and nested loops join) before we build our hash table. This could significantly affect a query with a TOP clause.

The TOP clause stops the query requesting new rows from the operators underneath it once it has met it’s requirement. This should result in reading less data, but a blocking operator forces us to read all applicable rows first, before we return anything to upstream operators.

So if your TOP query is trying to read a small number of rows but the plan has a hash match in it, you will be likely reading more data that you would with nested loops.

Summary

Actual numbers comparing join types would depend a ton on the examples. Nested loops are better for smaller result sets, but if you are expecting several thousand (maybe ten or more) rows read from a table, hash match may be more efficient. Hash matches are more efficient in CPU usage and logical reads as the data size increases.

I’ll be speaking at some user groups and other events over the next few months, but more posts are coming.

As always, I’m open to suggestions on topics for a blog, given that I blog mainly on performance topics. You can follow me on twitter (@sqljared) and contact me if you have questions. You can also subscribe on the right side of this page to get notified when I post.

Have a good night.

There are a lot of things to know to understand execution plans and how they operate. One of the essentials is understanding joins, so I wanted to post about each of them in turn.

The three basic types of join operators are hatch match, merge, and nested loops. In this post, I’m going to focus on the last, and I will post on the other two shortly.

How Nested Loops Operate

Nested loops joins are the join operator you are likely to see the most often. It tends to operate best on smaller data sets, especially when the first of the two tables being joined has a small data set.

In row mode, the first table returns rows one at a time to the join operator. The join operator then performs a seek\scan against the second table for each row passed in from the first table. It searches that table based on the data provided by the first table, and the columns defined in our ON or WHERE clauses.

  • You can’t search the second table independently; you have to have the data from the first table. This is very different for a merge join.
  • A merge join will independently seek or scan two tables, then joins those results. It is very efficient if the results are in the same order so they can be compared more easily.
  • A hash join will seek or scan the second table however it can, then probes the hash table with values from the second table based on the ON clause.

So, if the first table returns 1,000 rows, we won’t perform an index seek (or scan) against the second; we will perform 1,000 seeks (or scans) against that index, looking for any rows that relate to the row from the first table.

The query above joins several table, using nested loops between each. we can see that the row counts for the first several tables are all 1. We read one SalesPerson record, the related SalesTerritory, an Employee record, and a Person record. When we join that to the SalesOrderHeader table, we find 234 related rows. That comes from 1 index seek, as internal result set only had 1 row thus far. If we join to LineItem next, we would perform that seek 234 times.

Key Lookups

The optimizer always uses nested loops for key lookups, so you’ll see them frequently for this purpose. I was unsure if this was the only way key lookups are implemented, but this post from Erik Darling confirms it.

In the plan above, we return 149 rows from OrderLines table. We use a nested loops join operator so we can include the UnitPrice and Description in our output, since they aren’t in the nonclustered index.

Which means we execute the key lookup 149 times. The cost for this operator is 99% of the total query, but the optimizer overestimated how many rows we would return.

I frequently look at key lookups as an opportunity for tuning. If you look at the output for that operator, you can see which columns are causing the key lookup. You can either add those columns to the index you expect this query to use (as included columns), or you can ask whether those columns really need to be included in the query. Either approach will remove the key lookup and the nested loop.

LOOP JOIN hints

You can direct SQL Server to use nested loops specifically in your query by writing the query with (INNER\LEFT\RIGHT\FULL) LOOP JOIN. This has two effects:

  • Forces the optimizer to use the specified join type. This will only force the join type for that specific join.
  • Sets the join order. This forces the tables to be joined in the order they are written (subqueries from WHERE EXISTS clauses are excluded from this). So this point may affect how the others joins operate.

I’ve blogged about using hints previously, so I won’t go on for long on this subject. I like the phrase “With great power comes great responsibility” when thinking about using hints to get a specific behavior. It can be an effective way to get consistent behavior, but you can make things worse if you don’t test and follow up to confirm it works as intended.

Summary

I’ll discuss the other two join types in another post. In short, hash matches are more efficient for large data sets, but the presence of a large data set should make us ask other questions. Merge joins are very efficient when dealing with data from two sources that are already in the same order, which is unlikely in general.

Thanks to everyone who voted for my session in GroupBy. I enjoyed speaking at the event, and had some interesting discussion and questions. I expect recordings for the May event will be available on GroupBy’s Youtube page, so keep an eye out for that if you missed the event.

As always, I’m open to suggestions on topics for a blog, given that I blog mainly on performance topics. You can follow me on twitter (@sqljared) and contact me if you have questions. You can also subscribe on the right side of this page to get notified when I post.

Hope you are all enjoying a long weekend.

In my most recent blog post, I looked at the Query Variation report, which compares the recent performance of queries versus their historical performance to either highlight improvements or regressions in performance. The Waits Variation component does the same, but comparing the recent waits for a query to its historical waits.

One thing to keep in mind, is that if a given query is changed in any way (to change the filter, return additional columns, or include a hint), the changed query will have a different query_id in Query Store. In which case, both the Waits Variation and Query Variation procedures will not compare the historical performance of the old query to the recent performance of the new one.

That being said, let’s look at the dbo.WaitsVariation procedure’s options.

Waits Variation Options

  • @ServerIdentifier: Defaults to the current instance. Set this to gather data from another instance of SQL Server.
  • @DatabaseName: Defaults to the current database. I’m querying data from WideWorldImporters for my examples.
  • @ReportIndex: Default NULL. When used, stores information about the parameters used for the report.
  • @ReportTable: Default NULL. Allows you to store the report data in a table like dbo.WaitsVariationStore, created by the installer.
  • @WaitType: Which wait category are we concerned with? Total by default. [Total, CPU, Lock, Latch, BufferLatch, BufferIO, WorkerThread, NetworkIO, Parallelism…] There are 36 options here, so check the header of dbo.WaitsVariation for the full list.
  • @Metric: Do we want to compare the total wait time for the query, or the average? Avg by default. [Avg, Total]
  • @VariationType: Are we looking for queries that have improved [I] or regressed [R]. ‘R’ by default.
  • @ResultsRowCount: Number of rows to include in the report. Only the largest regressions or improvements are included. Default is 25, and if this value is NULL or negative, the procedure uses the default.
  • @RecentStartTime and @RecentEndTime: Defines what the ‘recent’ period is, which will be compared against the historical. The defaults are 1 hour ago for @RecentStartTime, and now for @RecentEndTime.
  • @HistoryStartTime and @HistoryEndTime: Defines what the ‘history’ period is, which will be compared against the recent. The defaults are 30 days ago for @HistoryStartTime, and 1 hour ago for @HistoryEndTime.
  • @IncludeQueryText: Includes the text for any identified queries in the output. Default is 0.
  • @ExcludeAdhoc: Ignores ad hoc queries, anything that isn’t part of a procedure or other defined object. Default is 0.
  • @ExcludeInternal: Excludes internal queries\operations run by SQL Server itself. Default is 0.
  • @VerboseMode: Default 0. Provides the queries being used in the messages tab.
  • @TestMode: Default 0. When enabled, does everything except actually run the generated queries to create the report.
  • OUTPUT @ReportID: When you @ReportIndex and @ReportTable, a saved report is generated. You can then use the @ReportID to see that report separately.

So, these parameters are very similar to dbo.QueryVariation. You’ll likely want to alter your time parameters, and you could generate multiple reports to check different wait types. Or you could do a regular report for total waits, and review to see which type of wait had the biggest change.

Best Total Wait Improvement

Given that the default is to look for regressions instead of improvements, if I wanted to compare the last two day’s waits to the previous month, I could use the following:

DECLARE @RecentStartTime	DATETIME2 = DATEADD(HOUR, -48, GETUTCDATE());
DECLARE @RecentEndTime	DATETIME2 = GETUTCDATE();
DECLARE @HistoryStartTime	DATETIME2 = DATEADD(DAY, -30, @RecentEndTime);
DECLARE @HistoryEndTime	DATETIME2 = @RecentStartTime;

	EXECUTE [dbo].[WaitsVariation]
		@DatabaseName	= 'WideWorldImporters',
		@WaitType	= 'Total',
		@Metric		= 'Avg',
		@VariationType = 'I',
		@ExcludeAdhoc = 1,
		@HistoryStartTime = @HistoryStartTime,
		@HistoryEndTime	= @HistoryEndTime,
		@RecentStartTime = @RecentStartTime,
		@RecentEndTime = @RecentEndTime;

Choosing ‘Total’ as your @Metric may work for some cases, but would tend to give results with large variation. If a query executed more times in one period than the other, I would expect that period to have more waits.

I ran a workload to generate some activity, and only one query in it regressed, so I’m returning improvements so there’s a little more to look at.

Wait Variation – Improvements

So, the four procs listed had some rather dramatic improvements. We can see the amount of the total wait reduction, and columns for unknown and CPU waits. This is only a portion of the incredibly wide result set. There are history, recent, and variation % columns for each wait type.

We ran this script looking for ‘Total’ as our @WaitType, which means we are sorting and returning the top rows with the most improvement in that area. But it did record the improvement for all wait types. It only takes a little scrolling to find our causes:

Wait Variation – Causes

So, the first procedure had relatively large parallelism waits in its history, compared with none in the recent period. I would argue the variation % should be 100% not 0%, but the logic here may be special to prevent us from diving by 0.

All four procedures had NetworkIO waits. It’s likely the tool I was using to run these queries in a loop was not consuming the results fast enough, causing delays within SQL Server. These queries were taking so long that I used a smaller date range on my more recent calls. Fewer records, less time for the app to consume them, and smaller NetworkIO waits.

One thing I have noticed is that if you use @IncludeQueryText, the text returned is compressed and unreadable. Unless you create a saved report by using the @ReportIndex and @ReportTable options, which leads us to the second procedure.

Waits Variation from Store

If I save the results, there are records in the tables indicated, I can run dbo.WaitsVariationFromStore to read out the header of the report and its details in two result sets.

Waits Variation from Store output

The result set is much more limited, but focuses on the metric we chose. We can see the number of executions, the related object, the change in our totals waits, and the decompressed text for the query itself.

There are only three parameters for this proc, but here’s an example:

USE QDSToolBox
GO

	EXECUTE dbo.WaitsVariationFromStore
		@ReportID = 1,
		@VerboseMode = 0,
		@TestMode = 0;

Nearly done

There are two more components to the QDS Dashboard I want to detail, but I may be spreading them out between some other topics I want to blog on.

Time to go update slides for a user group meeting later this week.

If you have any topics related to performance in SQL Server you would like to hear more about, please feel free to make a suggestion. You can follow me on twitter (@sqljared) and contact me if you have questions.

In a recent talk with the Sacramento SQL Server User Group (@SACSQLUG), a question was asked I wanted to address, and the answer seemed more like a short blog than a tweet.

I was presenting on how to use the TOP clause to break down large operations into short, fast, bite-sized operations. The mechanics are things I learned from writing processes that do garbage collection, backfill new columns, and anonymizing PII data on existing tables. I’ve just posted the slides and example scripts here if you are interested.

Are they the same?

The question was whether the SET ROWCOUNT command would work just the same, and the answer is sometimes yes but largely no.

SET ROWCOUNT will limit the number of rows returned by the following statements, and this functions like a connection level setting. It will remain in effect until you change it or clear it by setting the rowcount to 0 (unlimited).

I would be concerned about setting the option and not clearing it, causing it to affect additional statements.

So let’s adapt some of my examples and see how it behaves.

The simple case with SET ROWCOUNT

Here’s a script to return 100 rows from a table in WideWorldImporters.

USE WideWorldImporters
GO
SELECT TOP 100 
	sol.OrderLineID, 
	sol.UnitPrice 
FROM Sales.OrderLines sol 
WHERE sol.OrderLineID < 1000 
	AND sol.UnitPrice > 50;
GO

SET ROWCOUNT 100;

SELECT 
	sol.OrderLineID, 
	sol.UnitPrice 
FROM Sales.OrderLines sol 
WHERE sol.OrderLineID < 1000 
	AND sol.UnitPrice > 50;
GO

Fairly simple, and the plans are much the same.

100 rows, as promised

The only difference is that the TOP operator is missing. But the query is behaving much like it is present. I surmise that the SELECT operator itself is limiting the number of rows by not requesting additional data after the first 100.

Quick aside on Row Mode

SQL Server queries typically operate in row mode. Batch mode is used in some cases with columnstore indexes and can be used on rowstore indexes in SQL Server 2019 and above. But I want to focus on how a query processes in row mode (with thanks to Grant Fritchey for his description of this).

The activity is driven by the root node, in this latest plan the Select operator. It asks the operator under it (the Top operator) to go get a row. The Select operator doesn’t really care how that operator does what it does, it just wants a row. And for the Top operator to do it’s job, it asks the operator below it for the same.

Until we get down to the last Nested Loops operator. It asks the Index Scan operator for a row, and that operator can finally get us some data. It returns a row and, being a join operator, Nested Loops asks the Key Lookup for data related to its row. It joins the results and pass the joined row up to the operator above it.

When data flows up to an operator, it will typically reply, “Now give me another row.”

The Top operator changes this be simply not asking for more rows once it has received the number of rows it needs. So all the activity for the underlying operators stops because the Top hasn’t told it to continue.

In the simple example of my two queries, we can see the Clustered Index Seek only returned 100 rows, then the query ended. It seems with SET ROWCOUNT in effect, the Select operator itself stopped the operation by not requesting more data.

But then it falls apart

Here’s a stored procedure I wrote to delete old records from the Sales.Orders table and all related tables.

USE WideWorldImporters
GO
CREATE OR ALTER PROCEDURE Sales.Orders_GarbageCollection(
@BatchSize INT = 100,
@Duration INT = 30
)AS
SET NOCOUNT ON;

	DECLARE 
		@NestedTransaction BIT = 0,
		@EndTime DATETIME;

	CREATE TABLE #OrdersGC(
		OrderID INT,
		INDEX IX_OrdersGC(OrderID)
	);

	CREATE TABLE #InvoicesGC(
		InvoiceID INT,
		INDEX IX_InvoicesGC(InvoiceID)
	);

	SET @EndTime = DATEADD(SECOND, @Duration, GETUTCDATE());

	WHILE (GETUTCDATE() < @EndTime)
	BEGIN
		TRUNCATE TABLE #OrdersGC;
		TRUNCATE TABLE #InvoicesGC;

		IF @@NESTLEVEL > 0
		BEGIN
			SET @NestedTransaction = 1;
		END
		ELSE
		BEGIN
			BEGIN TRANSACTION;
		END;

		INSERT INTO #OrdersGC
		SELECT TOP (@BatchSize)
			so.OrderID
		FROM Sales.Orders so
		WHERE
			so.OrderDate < 
				DATEADD(month, -50, GETUTCDATE());

		DELETE sol
		FROM #OrdersGC gc
		JOIN Sales.OrderLines sol
			ON sol.OrderID = gc.OrderID;
	
		INSERT INTO #InvoicesGC
		SELECT --TOP (@BatchSize)
			inv.InvoiceID
		FROM #OrdersGC gc
		JOIN Sales.Invoices inv
			ON inv.OrderID = gc.OrderID;


			DELETE ctr
			FROM #InvoicesGC inv
			JOIN Sales.CustomerTransactions ctr
				ON ctr.InvoiceID = inv.InvoiceID;

			DELETE sti
			FROM #InvoicesGC inv
			JOIN Warehouse.StockItemTransactions sti
				ON sti.InvoiceID = inv.InvoiceID;

			DELETE invl
			FROM #InvoicesGC inv
			JOIN sales.InvoiceLines invl
				ON invl.InvoiceID = inv.InvoiceID;

		DELETE inv
		FROM #OrdersGC gc
		JOIN Sales.Invoices inv
			ON inv.OrderID = gc.OrderID;

		DELETE so
		FROM #OrdersGC gc
		JOIN Sales.Orders so
			ON so.OrderID = gc.OrderID;
		
		IF @NestedTransaction = 0
		BEGIN
			COMMIT TRANSACTION;
		END;

	END;

GO

BEGIN TRANSACTION

EXEC Sales.Orders_GarbageCollection
	@BatchSize = 100,
	@Duration = 5;

ROLLBACK TRANSACTION
GO


--CREATE INDEX IX_Sales_Orders_OrderDate ON Sales.Orders (OrderDate);

This proc identifies 100 rows in Sales.Orders that we want to delete, and puts those OrderIDs in a temp table. We then delete records from related tables based on those OrderIDs, and we similarly delete records related to rows we want to delete from Sales.Invoices. And this procedure was written to loop and keep deleting records for a specified number of seconds.

One key thing to remember is that we only use a TOP clause when we populate the first temp table. We want to delete those 100 rows and everything related to them, not those 100 rows and only 100 related rows from the related tables. If we only delete some of the related rows, our process will fail on a foreign key violation when we try to delete some rows from the Sales.Orders table.

If you were to change this to use a SET ROWCOUNT, you would need to reset that to 0 after the statement inserting into the temp table. If not, the next statement looks like this:

With SET ROWCOUNT 100

Note the Top operator in this DELETE statement. All is flowing as expected, but then the Top operator shuts this down prematurely. Top only passes on 100 rows, when then get deleted. So, later when we try to DELETE from the Sales.Orders table, we fail because we haven’t deleted all the related rows.

And you could just SET ROWCOUNT 0 once you don’t need it, but I’d be concerned I’d forget to do so and cause issues.

Also, it’s deprecated

If you take a look at the remarks section of the documentation for this statement, there’s something interesting.

Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in a future release of SQL Server. Avoid using SET ROWCOUNT with DELETE, INSERT, and UPDATE statements in new development work, and plan to modify applications that currently use it. For a similar behavior, use the TOP syntax. For more information, see TOP (Transact-SQL).

Microsoft

So they aren’t using the word deprecated here, but it amounts to that. This will stop affecting DELETE, INSERT, and UPDATE statements in future, so you wouldn’t want to write a process using SET ROWCOUNT only to have it fail at some undefined point in the future.

It doesn’t state whether this will also affect MERGE statements, but I’d rather not have to use different syntax on some statements anyway, so I’ll just stick with my TOP clauses.

Happy Holidays

Should have another post or two coming in the next few weeks thanks to some extra time off and not traveling much for the holidays. Hope yours are great.

If you have any topics related to performance in SQL Server you would like to hear more about, please feel free to make a suggestion. You can follow me on twitter (@sqljared) and contact me if you have questions.

Also, subscribe to get emails when I post again.

Since Query Store stores statistics long term, we can use it to look for trends or major changes in performance. The Query Variation component of the QDS Toolbox helps us review changes and has a lot of options to allow us to select the kinds of changes we are interested in.

So, let’s review our options and go through some examples.

Options

Let’s discuss the options and how they interact.

  • @ServerIdentifier: Defaults to the current instance. Set this to gather data from another instance of SQL Server.
  • @DatabaseName: Defaults to the current database. I’m querying data from WideWorldImporters for my examples.
  • @ReportIndex: Default NULL. When used, stores information about the parameters used for the report.
  • @ReportTable: Default NULL. Allows you to store the report data in a table like dbo.QueryVariationStore, created by the installer.
  • @Measurement: What statistic are we basing our report on? CPU by default. [CLR, CPU, DOP, Duration, Log, LogicalIOReads, LogicalIOWrites, MaxMemory, PhysicalIOReads, Rowcount, TempDB]
  • @Metric: How are we comparing our measurements? Avg by default. [Avg, Max, Min, StdDev, Total]
  • @VariationType: Are we looking for queries that have improved [I] or regressed [R]. ‘R’ by default.
  • @ResultsRowCount: Number of rows to include in the report. Default is 25.
  • @RecentStartTime and @RecentEndTime: Defines what the ‘recent’ period is, which will be compared against the historical. The defaults are 1 hour ago for @RecentStartTime, and now for @RecentEndTime.
  • @HistoryStartTime and @HistoryEndTime: Defines what the ‘history’ period is, which will be compared against the recent. The defaults are 30 days ago for @HistoryStartTime, and 1 hour ago for @HistoryEndTime.
  • @MinExecCount: Only considers queries with at least this many recent executions. Default is 1.
  • @MinPlanCount and @MaxPlanCount: Only includes queries with a number of plans in this range. Defaults are 1 and 99999.
  • @IncludeQueryText: Includes the text for any identified queries in the output. Default is 0.
  • @ExcludeAdhoc: Ignores ad hoc queries, anything that isn’t part of a procedure or other defined object. Default is 0.
  • @ExcludeInternal: Excludes internal queries\operations run by SQL Server itself. Default is 0.
  • @VerboseMode: Default 0. Provides the queries being used in the messages tab.
  • @TestMode: Default 0. When enabled, does everything except actually run the generated queries to create the report.

So let’s look at a few examples Pablo Lozano put in the header for the procedure.

Worst Regressions

The first example using the Query Variation proc creates a list of the 25 queries with the largest regression in performance for the last hour, compared to the history over the last 30 days.

	EXECUTE [dbo].[QueryVariation]
		@DatabaseName	= 'WideWorldImporters',
		@Measurement	= 'Duration',
		@Metric			= 'Avg';

We’re relying a lot on default values here, but it shows how simple it can be to run this procedure and get usable results.

Our results include queries that are in procedures and those that aren’t. We have the QueryID values for each if we want to run another query to look them up, but we didn’t include the option to get the text for the statements.

Since the default is to look for regressed queries, our list has 25 queries in descending order based on how large the regression was. We can see the average duration recently and historically, the number of executions is each window, and the overall number of plans.

The first two queries on this list from Integration.GetCustomerUpdates both have regressed by almost 300%. Both also have two execution plans. That could be the explanation for their regression; we could have a new plan that is worse for a variety of reasons. We also are comparing 8 recent executions to 100 in our history, so it could just be the small sample size (or cold cache) is making a big difference.

We have the information we need here to find the query and follow up from there. I’m fond of using the “Tracked Queries” report in SSMS under Query Store, and that would show us the query and it’s recent plans.

Biggest CPU Improvements

Here’s the second example from the procedure’s comments.

		EXECUTE [dbo].[QueryVariation]
			@DatabaseName	= 'WideWorldImporters',
			--@ReportIndex		= '[dbo].[QueryVariationIndex]',
			--@ReportTable		= '[dbo].[QueryVariationReport]',
			@Measurement		= 'CPU',
			@Metric				= 'Total',
			@VariationType		= 'I',
			@ResultsRowCount	= 10,
			@MaxPlanCount		= 1,
			@IncludeQueryText	= 1;

I’ve added my database name and left the report tables in comments, and I did change the option to include query text.

This report should show us queries that have improved the most, based on total CPU usage. It will only give us 10 results, and only includes query with a single plan.

Results here are showing us procedure with the biggest reduction in overall CPU usage, and several are around 90%. But, that’s not surprising since we measured off the total numbers instead of average, and we’ve run the queries far less in the recent window. If we run the query 92% less in the recent window, this will show up as a reduction.

If our executions were more even, it could draw attention to statements that have improved despite the plan not changing, since we set our @MaxPlanCount to 1.

The QuerySQLText has information on variables at the beginning, but we see which statement is being referred to by each line of the report.

Query Variation Usage

The Query Variation report is good for a general comparison of performance. Run this daily, check your average duration\CPU\logical IO reads against the historical to look for regressions, and you get a good list of queries and procs you may want to look into.

Do keep in mind that you can’t use this check the results of a change to a query. If you are changing the text of a statement in any way, the modified statement is a different query_id in Query Store. So this report wouldn’t compare it to the original statement.

This can identify queries that have a change in performance because of a new plan, depending on how you set the @MaxPlanCount. So if you are changing an index to improve a query, a query variation report could show the statement’s improvement (or regression).

Speaking at SQL Friday this week

If you are not aware of SQL Friday, there’s a talk done online live every week at noon CET (GMT +6). You can also view previous sessions through links here.

I’ll be speaking this week with a case study on optimizing a very frequently run procedure. It is the same subject I presented at Virtual PASS 2021.

I have a few more blogs coming on the QDS Dashboard, and a few other topics I’m likely to write on soon. If you have any topics related to performance in SQL Server you would like to hear more about, please feel free to make a suggestion. You can follow me on twitter (@sqljared) and contact me if you have questions.

In contrast to the previous QDS Toolbox components, this one is relatively simple. Instead of a tool that’s ready to be scheduled to give you a report, this just adds the dbo.query_store_wait_stats_pivoted view.

So, what’s the value here?

The typical case

The way I use Query Store most often, is to investigate the performance of a specific procedure or statement. I tend to do this when I know there is an issue, or when I want to evaluate the improvement of a new change.

Here’s a query I’ll often run (or some variation thereof):

--Statement numbers for the last two hours, no aggregation
SELECT 
	qrs.avg_duration * qrs.count_executions AS total_duration,
	qrs.count_executions, 
	qrs.avg_duration, 
	qsq.query_id,
	qsi.end_time,
	qt.query_sql_text, 
	qsq.query_hash
FROM sys.query_store_query qsq 
JOIN sys.query_store_query_text qt 
	ON qt.query_text_id = qsq.query_text_id
JOIN sys.query_store_plan qp 
	ON qp.query_id = qsq.query_id
JOIN sys.query_store_runtime_stats qrs 
	ON qp.plan_id = qrs.plan_id
JOIN sys.query_store_runtime_stats_interval qsi
	ON qsi.runtime_stats_interval_id = qrs.runtime_stats_interval_id
WHERE 
	qsq.object_id = OBJECT_ID('Integration.GetStockItemUpdates')
	AND qsi.end_time > DATEADD(hour, -2, GETUTCDATE())
	--AND qt.query_sql_text LIKE '%something%'
ORDER BY 
	qt.query_sql_text, 
	qsq.query_id,
	qsi.end_time

Now, Query Store’s performance statistics are separated into intervals. We can set the size for an interval when we turn on Query Store (60 minutes by default, I have it at 15 on my test environment). So, this query will show me the performance of each statement for each interval over the last two hours.

Sometimes I’ll aggregate the data so I can look at the averages over several days, but sometimes I like seeing each interval to see if there is a lot of variation.

So, here’s that result set:

Runtime Stats from Query Store

sys.query_store_wait_stats

But let’s say we look at this data and are concerned one of these statements is taking longer than expected, and we want to see if it is waiting on anything. So, we take that query and add a join to sys.query_store_wait_stats. Seems simple enough.

--Statement numbers for the last two hours, no aggregation, with wait stats for one statement
SELECT 
	qrs.avg_duration * qrs.count_executions AS total_duration,
	qrs.count_executions, 
	qrs.avg_duration, 
	qsq.query_id,
	qsi.end_time,
	qt.query_sql_text, 
	ws.wait_category_desc,
	ws.execution_type_desc,
	ws.total_query_wait_time_ms,
	ws.avg_query_wait_time_ms
FROM sys.query_store_query qsq 
JOIN sys.query_store_query_text qt 
	ON qt.query_text_id = qsq.query_text_id
JOIN sys.query_store_plan qp 
	ON qp.query_id = qsq.query_id
JOIN sys.query_store_runtime_stats qrs 
	ON qp.plan_id = qrs.plan_id
JOIN sys.query_store_runtime_stats_interval qsi
	ON qsi.runtime_stats_interval_id = qrs.runtime_stats_interval_id
LEFT JOIN sys.query_store_wait_stats ws
	ON ws.plan_id = qrs.plan_id
	AND ws.runtime_stats_interval_id = qsi.runtime_stats_interval_id
WHERE 
	qsq.object_id = OBJECT_ID('Integration.GetStockItemUpdates')
	AND qsi.end_time > DATEADD(day, -2, GETUTCDATE())
ORDER BY 
	qt.query_sql_text, 
	qsq.query_id,
	qsi.end_time,
	ws.wait_category_desc

And let’s see our results:

Runtime Stats with Wait Stats

Now we have 9 rows instead of 5. And the number under avg_duration look redundant.

What’s happened, is that sys.query_store_wait_stats has multiplied our result set. Since it stores a record for each wait type observed in each interval, we have multiple combined rows for several of these intervals. And one of our rows would have been removed from this set if we didn’t LEFT JOIN to sys.query_store_wait_stats, since it had no waits.

This is probably not how you would image it to work on first glance. And you might want to pivot those results instead, but the Pivoted Wait Stats view has done that for you.

Using Pivoted Wait Stats

If I take my original query and join to dbo.query_store_wait_stats_pivoted, I will get one row of output for each plan_id and runtime_stats_interval_id.

And the result set will be rather wide. There’s a total and average column for each wait type in sys.query_store_wait_stats. You may or may not want to include all in your query.

So, here’s an example with the average for several interesting wait types:

--Statement numbers for the last two hours, no aggregation
SELECT 
	qrs.avg_duration * qrs.count_executions AS total_duration,
	qrs.count_executions, 
	qrs.avg_duration, 
	qsq.query_id,
	qsi.end_time,
	qt.query_sql_text, 
	wsp.Average_BufferIO,
	wsp.Average_BufferLatch,
	wsp.Average_CPU,
	wsp.Average_Lock,
	wsp.Average_Memory,
	wsp.Average_NetworkIO,
	wsp.Average_Parallelism,
	wsp.Average_Tracing
FROM sys.query_store_query qsq 
JOIN sys.query_store_query_text qt 
	ON qt.query_text_id = qsq.query_text_id
JOIN sys.query_store_plan qp 
	ON qp.query_id = qsq.query_id
JOIN sys.query_store_runtime_stats qrs 
	ON qp.plan_id = qrs.plan_id
JOIN sys.query_store_runtime_stats_interval qsi
	ON qsi.runtime_stats_interval_id = qrs.runtime_stats_interval_id
LEFT JOIN dbo.query_store_wait_stats_pivoted wsp 
	ON wsp.plan_id = qp.plan_id
	AND wsp.runtime_stats_interval_id = qrs.runtime_stats_interval_id
WHERE 
	qsq.object_id = OBJECT_ID('Integration.GetSaleUpdates')
	AND qsi.end_time > DATEADD(day, -5, GETUTCDATE())
ORDER BY 
	qt.query_sql_text, 
	qsq.query_id,
	qsi.end_time

Runtime Stats with Pivoted Wait Stats

We have one record for this query for each of three intervals. In each row, I can see the average wait time for CPU, disk (BufferIO), buffer latch, and other waits. In this case, the largest waits were with NetworkIO. Likely, SSMS was having trouble keeping up with displaying multiple result sets, because I ran this batch 50 times in a row.

One last thing, keep in mind the times from the main tables in Query Store like sys.query_store_runtime_stats are reported in microseconds, while the wait tables are in milliseconds whether you use sys.query_store_wait_stats or dbo.query_store_wait_stats_pivoted. A query that takes 7,000,000 microseconds to run but waited for CPU time for 6,900 milliseconds spent nearly its entire duration waiting.

Don’t forget about PASS Data Community Summit 2021!

PASS Data Community Summit 2021 is next week, and the event is virtual and free, so please go and register if you haven’t already. I’m happy to see PASS and the summit are Red Gate’s hands, and expect it to be a great event.

Many of the sessions, including my session, are prerecorded and can be watched at any time. This should make it easier to get around conflicts between sessions you want to hear. There are also Q&A sessions for the prerecorded sessions, and those are scheduled and live. So get registered and make sure you don’t miss out.

If you have any topics related to performance in SQL Server you would like to hear more about, please feel free to make a suggestion. You can follow me on twitter (@sqljared) and contact me if you have questions.

When configuring Query Store, we have a few options for deciding how it retains data but little control over how it cleans up that data. We can set the max size of our query store, the max number of plans to keep per query, and how long to keep query statistics.

The QDS Cache Cleanup component of the QDS Toolbox gives us a number of other options for what data to remove:

  • Internal queries
  • Adhoc queries
  • Orphaned queries (from deleted stored procedure and other dropped objects)
  • Queries that have not run recently
  • Target queries with fewer than X executions
  • Remove only stats associated with targeted queries

Performance

There was a performance issue where I work that related to the QDS size-based cleanup that Mark Wilkinson discusses here. There were a number of symptoms and issues seen at the time and hats off to our DBAs for determining the root cause. This is something I wanted to highlight.

These issues didn’t start until QDS started hitting max size, so that was over a month in some cases, and it didn’t hit all instances and databases at the same time. This means the issue seemed “random” when it was happening.

Mark Wilkinson

Using the QDS Cache Cleanup, you can prevent the typical size-based cleanup from running, while having more control on what gets cleaned up. The procedure provided will identify which queries or stats in Query Store to remove, then calls system procedures to remove them:

  • sp_query_store_unforce_plan
  • sp_query_store_remove_query
  • sp_query_store_reset_exec_stats

Options

Let’s discuss the options and how they interact.

  • @InstanceIdentifier: You can use this to gather data from another instance of SQL Server.
  • @DatabaseName: Defaults to the current database. I keep my QDS data in a separate database, so I will use it in my examples.
  • @CleanAdhocStale: Binary option; default is 0. Setting this to 1 will clean up data related to any stale queries that are ad-hoc queries, i.e. not associated with an object like a stored procedure. Only stale ad-hoc queries are affected, as defined by the @Retention and @MinExecutionCount.
  • @CleanStale: Binary option; default is 1. Cleans up all stale queries. Also depends on the settings for @Retention and @MinExecutionCount.
  • @Retention: This setting helps define which queries are considered “stale”. Takes in a number of hours since the query was last run. The default is 168 hours (a week), meaning any query that had run in the last week would not count as stale. If you set this to 0, it will include all queries that match the @CleanStale/@CleanAdhocStale options.
  • @MinExecutionCount: Queries with fewer than this many executions are stale. Default is 2. So, if we take both default options, this will clean up queries with less than 2 executions and anything that has not run in the last week. Setting this to 0, again, would include all queries or all ad-hoc queries.
  • @CleanInternal: Binary option; default is 1. Cleans up any internal queries. This includes queries being run by SQL Server itself to do things like update statistics, and is based on a filter on sys.query_store_query.is_internal_query. This isn’t a “stale” option, so @Retention and @MinExecutionCount don’t affect this.
  • @CleanOrphan: Binary option; default is 1. Cleans up any queries that are associated with an object that no longer exists.
  • @CleanStatsOnly: Binary option; default is 0. When enabled, this option doesn’t remove the query and plan, but uses sp_query_store_reset_exec_stats to remove the statistics for any targeted queries. When not enabled, the default behavior unforces any forced plans with sp_query_store_unforce_plan, and then removes the query, its plans, and its execution statistics with sp_query_store_remove_query. Applies to any query included by the above options.
  • @ReportAsText and @ReportAsTable: Both default to 0. You can enable either or both to get details on the amount of space recovered from the cleanup, in whichever format you prefer.
  • @ReportIndexOutputTable: Default NULL. Setting this allows you to store the report data in a table like dbo.QDSCacheCleanupIndex, created during setup.
  • @ReportDetailsAsTable: Default 0. When enabled, returns details about each query being deleted from Query Store.
  • @ReportDetailsOutputTable: Default NULL. allows you to store the report details in a table like dbo.QDSCleanSummary, created during setup.
  • @TestMode: Default 0. Doesn’t actually delete data, but provides output as though it does. You could use this to see what the effect of a given set or parameters would be before actually taking a destructive action.
  • @VerboseMode: Default 0. Provides the queries being used in the messages tab.

One point for clarity, if the @Retention or @MinExecutionCount is 0,

QDS Cache Cleanup examples:

EXECUTE [dbo].[QDSCacheCleanup]
	@DatabaseName = 'WideWorldImporters'
	,@CleanAdhocStale = 0
	,@CleanStale = 1
	,@Retention = 24
	,@MinExecutionCount = 2
	,@CleanOrphan = 1
	,@CleanInternal = 1
	,@ReportAsTable = 1
	,@ReportDetailsAsTable = 1
	,@TestMode = 1;
GO

This is a example execution given in the comments of the procedure. Since this is run in test mode, nothing is actually deleted; the reports provided give information on what would be deleted if we ran this process normally.

CleanAdhocStale is not used, but CleanStale is a superset of it; so stale ad-hoc queries are included with all stale queries. Anything that has been executed at least twice in the last day is not considered stale.

This process does include any queries from dropped objects and any internal queries, regardless of when they were executed.

Report Table

The report from the QDS Cache Cleanup gives one line per type of query affected. The output includes when this was generated and where, along with the count of queries and plans cleaned up and the space that would be recovered.

Report Details Table

The report details shows the object name (where possible), Query ID, LastExecutionTime, ExecutionCount and QueryText for all queries included in the cleanup.

In my case, it found queries from a procedure I was testing recently (though I had to up the execution count so there would be some stale queries). Most of what the QDS Cache Cleanup flagged for deletion were internal queries involved in updating stats. But nothing was removed, since this was still in TestMode.

There were also some stale ad-hoc queries that look like statistics activity (you can see StatMan in the QueryText). These are internal queries but had executed few times and not recently, so they were also flagged as stale.

This raises an important point. Based on the @Retention and @MinExecutionCount options, orphaned or internal queries can count as stale and be included in your cleanup. If either setting is 0, all queries will be included by the QDS Cache Cleanup; essentially a full wipe of Query Store.

Here’s a few more examples with comments:

USE QDSToolBox
GO
	-- Test Mode; no deletion
	-- Includes Stale queries (24 hours or < 20 executions)
	-- Also includes orphaned and internal queries
	-- Provides report, details, and verbose output (Messages tab)
EXECUTE [dbo].[QDSCacheCleanup]
	@DatabaseName = 'WideWorldImporters'
	,@Retention = 24
	,@MinExecutionCount = 20
	,@CleanStale = 1
	,@CleanAdhocStale = 0
	,@CleanOrphan = 1
	,@CleanInternal = 1
	,@ReportAsTable = 1
	,@ReportDetailsAsTable = 1
	,@TestMode = 1
	,@VerboseMode = 1;
GO

	-- Test Mode; no deletion
	-- Only includes orphaned and internal queries
	-- Provides report, details, and verbose output (Messages tab)
EXECUTE [dbo].[QDSCacheCleanup]
	@DatabaseName = 'WideWorldImporters'
	,@CleanStale = 0
	,@CleanOrphan = 1
	,@CleanInternal = 1
	,@ReportAsTable = 1
	,@ReportDetailsAsTable = 1
	,@TestMode = 1
	,@VerboseMode = 1;
GO

	-- Test Mode; no deletion
	-- Disables most default options
	-- Only includes ad-hoc queries only executed once
	--		and not run in the last 24 hours.
	-- Provides report, details, and verbose output (Messages tab)
EXECUTE [dbo].[QDSCacheCleanup]
	@DatabaseName = 'WideWorldImporters'
	,@Retention = 24
	,@MinExecutionCount = 2
	,@CleanStale = 0
	,@CleanAdhocStale = 1
	,@CleanOrphan = 0
	,@CleanInternal = 0
	,@ReportAsTable = 1
	,@ReportDetailsAsTable = 1
	,@TestMode = 1
	,@VerboseMode = 1;
GO

	-- Test Mode; no deletion
	-- @CleanStale = 1, other clean uptions disabled
	-- Setting @Retention or @MinExecutionCount to 0 means all queries are stale
	-- Output table options included but commented.
	-- Provides report, details, and verbose output (Messages tab)
DECLARE
	@ReportID BIGINT;

	EXECUTE [dbo].[QDSCacheCleanup]
		@DatabaseName = 'WideWorldImporters',
		@CleanAdhocStale = 0,
		@CleanStale = 1,
		@Retention = 0,			--	All Queries Stale
		@MinExecutionCount = 0,	--	All Queries Stale
		@CleanOrphan = 0,
		@CleanInternal = 0,
		@CleanStatsOnly	= 0,
		@ReportAsText = 1,
		@ReportAsTable = 1,
		@ReportDetailsAsTable = 1,
		--@ReportIndexOutputTable	= '[dbo].[QDSCacheCleanupIndex]',
		--@ReportDetailsOutputTable	= '[dbo].[QDSCacheCleanupDetails]',
		@TestMode = 1,
		@VerboseMode = 1,
		@ReportID = @ReportID OUTPUT;

SELECT @ReportID;
GO

More to come

Some of the options in the QDS Cache Cleanup didn’t function like I expected at first, so I think this post should be helpful. I’ll continue to post on the QDS Dashboard, but I’ll likely include a few on other topics in the weeks to come.

I will be speaking at PASS Summit, which is free and virtual this year, so please sign up if you haven’t already.

If you have any topics related to performance in SQL Server you would like to hear more about, please feel free to make a suggestion. You can follow me on twitter (@sqljared) and contact me if you have questions.

Query Waits

Carrying on from my previous post on the QDS Toolbox, let’s review the Query Waits component.

This component provides details about the wait types associated with a given object, query, or plan. This doesn’t pull data from the reports generated by the Server Top Queries component; this gets the data directly from the Query Store in the database you specify. So you can use this to review the waits on a given procedure, even if you haven’t generated any reports with Server Top Queries .

That being said, reports from Server Top Queries could identify problem queries and give you the PlanID, QueryID, or ObjectName you need to run the procedure. A query that spends a lot of time waiting would tend to have a higher duration and lower active metrics like CPU usage or logical reads. So those are the queries I would want to run the Query Waits report against.

Syntax

Assuming you installed the QDS Dashboard without issue, let’s look at the options for the procedure. @sqlozano documented these in the header of the procedure, for reference. Here’s an example script:

USE QDSToolBox
GO
DECLARE 
	@StartTime DATETIME2,
	@EndTime DATETIME2;

SELECT
	@StartTime = DATEADD(MONTH,-1,GETUTCDATE()),
	@EndTime = GETUTCDATE();

EXECUTE [dbo].[QueryWaits]
	--@ServerIdentifier = 'ServerName\Instance',
	@DatabaseName	= 'WideWorldImporters',
	--@ReportIndex		= '[dbo].[QueryWaitsIndex]'
	--@ReportTable		= '[dbo].[QueryWaitsStore]'
	@StartTime = @StartTime,
	@EndTime = @EndTime,
	@ObjectName	= 'OnDisk.InsertVehicleLocation'
	--@QueryID	= 172
	--@PlanID	= 178
	--@VerboseMode = 0
	--@TestMode = 0
GO

  • ServerIdentifier: Commented out. I’m gathering data from the instance I’m logged into, which is the default.
  • DatabaseName: I have a separate database for the QDS Toolbox, so I’m using this parameter to get data from the WideWorldImporters database.
  • ReportIndex and ReportTable: You can store the data generated by this procedure, and refer to it later. I’m referencing the two tables created by the scripts included in the toolbox. In the header, Pablo included an example that pushes the data to a centralized server across a linked server. You just need to fully qualify the name, like ‘[LinkedSrv].[LinkedDB].[dbo].[CentralizedQueryWaitsIndex]’. If you skip these options, you’ll just get a result set that won’t be saved.
  • StartTime and EndTime: In my example, I set the EndTime to be current, and the StartTime to be a month ago. So you can specify any reasonable window you want.
  • ObjectName, QueryID, PlanID: Choose 1 only. The procedure will fail with an error if you provide two. Again, the Server Top Queries output will give you options you can use here.
  • Verbose Mode: Allows you to see the TSQL generated to produce the report. You could then tweak that query if you want something different in your result set.
  • TestMode: Enabling this prevents the produced script from actually running.

Output

I used the OnDisk.InsertVehicleLocation procedure originally to populate some data in my database (which is part of the World Wide Importers sample on github). Running the Query Waits report to check it’s behavior, this is what the output looks like.

The PlanID, QueryID, and QueryTextID may or may not be returned depending on the input. If you query for a specific QueryID, but it has many associated PlanIDs, this report will only give you one row per interval because it aggregates all the plans together. If they weren’t aggregated, you could have dozens or hundreds of rows per interval.

The results show the numbers from each time interval within the window specified. The duration and CPU time are included, along with the number of plans and queries executed in that interval.

Then we have a lot of columns indicating waits in various categories (I didn’t even include them all in this image). The names here correspond to the types in the page for sys.query_store_wait_stats, so you will need to refer to the table I’ve linked. The WWI workload driver inserts 120,000 rows by default (across 120 threads), and the first interval has data from running that tool.

The data in sys.query_store_wait_stats is in milliseconds, but other tables with performance data like sys.query_store_runtime_stats are in microseconds. This report multiplies the time from sys.query_store_wait_stats by 1000, so if you compare the waits to the duration they have the same units.

There are 17 milliseconds of waits were in the Idle category; this includes a number of things, and it could also be that the tool itself has some delays between batches.

There are also 21 milliseconds of memory waits. This is another category with a number of specific wait types, but both of these seems inconsequential given the procedure took 10,226 milliseconds in this interval. The vast majority of that overall duration is accounted for by the 10,114 milliseconds of CPU time.

So in this case, waits don’t appear to be the issue.

Wrapping up

If I see a query’s CPU usage is much less than half its duration, I’ll usually consider its waits. This is an important aspect of SQL Server performance, and the Query Waits report here makes this data very accessible.

Two down, six more components in the QDS Toolbox. I’ll keep working on these posts, with a few other topics in between.

I hope you find this post helpful. If you have any topics related to performance in SQL Server you would like to hear more about, please feel free to make a suggestion. You can follow me on twitter (@sqljared) and contact me if you have questions.

In having a talk reviewed recently, it was suggested I spend more time defining some of the subject I touched on. It occurred if I should go over (or at least introduce) these ideas during a talk for a SQL Saturday audience, some might find a post on the subject useful. Hence my recent post on key lookups.

Another such topic is table variables. I use table variables frequently at my current job, but they came up very infrequently when I worked at CSS in Microsoft. I remember the conversations about them being very simple at the time, as in, “you should just use temp tables instead.” But there is a lot of utility with table variables, and they could be a useful arrow in your quiver.

Statistics

Table variables are much maligned for one issue; they don’t have statistics.

In most ways, table variables function a lot like temp tables. You can define their columns and insert rows into them. Scoping is different. If you declare a table variable inside a procedure, its scope ends when you return from that procedure. If you create a temp table inside a procedure, it remains for the connection, so the procedure that called the procedure creating the temp table can see the temp table and its contents.

Both can have indexes created on them, but the table variable will have no statistics on its indexes. The result is that the SQL Server optimizer doesn’t know how many rows are in the table variable and will assume 1 row when trying to compare possible execution plans. This may lead to worse plan than you would have with a temp table containing the same data.

This is mitigated in a feature of SQL Server 2019’s Intelligent Query Processor, table variable deferred compilation. This provides cardinality estimates based on the actual number of rows in the table variable, and statements are not compiled until their first execution.

“This deferred compilation behavior is identical to the behavior of temporary tables. This change results in the use of actual cardinality instead of the original one-row guess.”

Microsoft

This feature does require you to be using compatability level 150, but this is a significant upgrade.

Alternatively, if you want to change the behavior around a table variable and aren’t using SQL Server 2019 (or have another compatability mode), you could use a query hint. I’ve used an INNER LOOP JOIN hint many times when using a table variable I expect to have a small number of rows as a filter to find specific rows in a permanent table. This gives me the join order and join type I expect to see, but I’m less shy about hints than many as I have commented previously.

Tempdb contention

There’s an ocean of posts about how to address tempdb contention, mostly focusing on creating enough data files in your tempdb or enabling trace flags that change some behaviors. The point I think that gets understated, is that creating fewer objects in tempdb would also address the issue.

One of the ways table variables function like temp tables, is that they store data in tempdb and can drive contention there as well. The cause of that contention is different, but creating table variables across many different sessions can still cause contention.

Table Valued Parameters

TVPs allow us to pass a table variable to a procedure as an input parameter. Within the procedure, they function as read-only table variables. If the read-only part is an issue, you can always define a separate table variable within the procedure, insert the data from the TVP, then manipulate it as you see fit.

The caveats here are the same; the TVP is stored in tempdb (and can cause contention there), you may have issues with cardinality due to the lack of statistics.

But the advantage is that we can make procedure calls that operate on sets of data rather than in a RBAR fashion. Imagine your application needs to insert 50 rows into a given table. You could call a procedure with parameters for each field you are inserting, and then do that 49 more times. If you do it serially, that will take some time, in part because of the round trips from your application to the SQL Server. If you do this in parallel, it will take longer than it should because of contention on the table. I haven’t blogged about the hot page issue, but that might make a good foundation topic.

Or you could make one call to a procedure with a TVP so it can insert the 50 rows without the extra contention or delay. Your call.

Memory Optimized Table Variables

This changes this by addressing one of the above concerns. Memory optimized table variables (let’s say motv for short), store data in memory and don’t contribute to an tempdb contention. The removal of tempdb contention can be a huge matter by itself, but storing data in memory means accessing a motv is very fast. Go here if you’d like to read the official documentation about memory optimized table variables.

A motv has to have at least one index, and you can choose a hashed or nonclustered index. There’s an entire page on the indexes for motv’s, and more details for both index types on this page on index architecture.

I tend to always use nonclustered indexes in large part for the simplicity. Optimal use of hash indexes requires creating the right number of buckets, depending on the number of rows expected to be inserted into the motv. Too many or too few buckets both present performance issues. It seems unlikely to configure this correctly and then not have the code change that later. It’s too micromanage-y to me, so I’d rather use nonclustered indexes here.

We do need to keep in mind that we are trading memory usage for some of these benefits, as Eric Darling points out amusingly and repeatedly here. This shouldn’t matter the majority of the time, but we should keep in mind that it isn’t free and ask questions like:

  • Does that field really need to be nvarchar(max)?
  • I expect this INSERT SELECT statement to insert 100 rows, but could it be substantially more?
  • What if there are 100 simultaneous procedures running, each with an instance of this motv? Each with 100 rows?
  • How much total memory could this nvarchar(200) require if we’re really busy?

I certainly would want to keep my columns limited, and you can make fields nullable.

Memory Optimized Table Valued Parameters

So, we can memory optimize our TVP and get the advantages of both. I use these frequently, and one of the big reasons (and this applies to TVPs in general) is concurrency.

So, let’s compare inserting 50 rows of data into a physical table using a normal procedure versus a procedure with a memory optimized table variable.

Singleton Inserts

To test this, I used a tool provided in the World Wide Importers sample on github. Specifically, I was using the workload driver. This was intended to compare the performance of in-memory, permanent tables with on-disk tables, but the setup will work just fine for my purposes.

The VehicleLocation script enables In-Memory OLTP, creates a few tables, procedures to insert into those tables, and adds half a million rows to them. I’ll be using the OnDisk.VehicleLocations table and the OnDisk.InsertVehicleLocation to insert into it.

I wrote the following wrapper procedure to provide random inputs for that procedure:

USE WideWorldImporters
GO

CREATE OR ALTER PROCEDURE OnDisk.InsertVehicleLocationWrapper
AS
BEGIN
	-- Inserting 1 record into VehicleLocation
	DECLARE @RegistrationNumber nvarchar(20);
	DECLARE @TrackedWhen datetime2(2);
	DECLARE @Longitude decimal(18,4);
	DECLARE @Latitude decimal(18,4);

	SET NOCOUNT ON;

	SET @RegistrationNumber = N'EA' + RIGHT(N'00' + CAST(CAST(RAND() AS INT) % 100 AS nvarchar(10)), 3) + N'-GL';
	SET @TrackedWhen = SYSDATETIME();
	SET @Longitude = RAND() * 100;
	SET @Latitude = RAND() * 100;

	EXEC OnDisk.InsertVehicleLocation @RegistrationNumber, @TrackedWhen, @Longitude, @Latitude;
END;
GO

With this, I basically wrote a loop to call this to individually insert 500 rows. Query Store showed the procedure ran its one statement in 23 µs. I love measuring things in microseconds.

Parallel Inserts

Except that didn’t really test what I wanted. Running it in this fashion in SSMS, it would have run the procedure 500 times serially. I mentioned concurrency as an advantage for memory optimized TVPs, but to see that advantage we need to do singleton inserts into our table in parallel.

A little Powershell later, and I ran this again. I did only 50 inserts, but I did them using 50 parallel threads. Each thread will call the wrapper proc to generate random inserts, then call OnDisk.InsertVehicleLocation.

Since this table has clustered index based on an identity column, we will always be inserting into the last page on the table. But that will require an X lock to write the page, which means our parallel calls will be blocking each other. And the more there are, the worst the waiting will get. This is the hot page issue in a nutshell.

On average each execution took 59 µs. Nearly three times as long. Likely, the first thread completed just as fast as previous, but each subsequent thread has to get in line to insert into the OnDisk.VehicleLocation table. And the average duration increases steadily throughout.

Inserting with a Memory Optimized TVP

Here’s the script I wrote to test using a memory optimized TVP to insert 500 rows.

USE WideWorldImporters
GO

IF NOT EXISTS(
	SELECT 1 
	FROM sys.types st
	WHERE 
		st.name = 'VehicleLocationTVP'
)
BEGIN
	CREATE TYPE OnDisk.VehicleLocationTVP
	AS TABLE(
		RegistrationNumber nvarchar(20),
		TrackedWhen datetime2(2),
		Longitude decimal(18,4),
		Latitude decimal(18,4)
		INDEX IX_OrderQtyTVP NONCLUSTERED (RegistrationNumber)
	)WITH(MEMORY_OPTIMIZED = ON); 
END;
GO

CREATE OR ALTER PROCEDURE OnDisk.InsertVehicleLocationBatched
	@VehicleLocationTVP OnDisk.VehicleLocationTVP READONLY
AS
BEGIN
	SET NOCOUNT ON;
	SET XACT_ABORT ON;

	INSERT OnDisk.VehicleLocations
		(RegistrationNumber, TrackedWhen, Longitude, Latitude)
	SELECT
		tvp.RegistrationNumber, 
		tvp.TrackedWhen, 
		tvp.Longitude, 
		tvp.Latitude
	FROM @VehicleLocationTVP tvp;

	RETURN @@RowCount;
END;
GO

-- So let's insert 500 rows into this on-disk table using a TVP
DECLARE 
	@Target INT = 500,
	@VehicleLocationTVP OnDisk.VehicleLocationTVP;

DECLARE @Counter int = 0;
SET NOCOUNT ON;

WHILE @Counter < @Target
BEGIN
	-- Generating one random row at a time
	INSERT @VehicleLocationTVP
		(RegistrationNumber, TrackedWhen, Longitude, Latitude)
	SELECT
		N'EA' + RIGHT(N'00' + CAST(@Counter % 100 AS nvarchar(10)), 3) + N'-GL',
		SYSDATETIME(),
		RAND() * 100,
		RAND() * 100;

	SET @Counter += 1;
END;
	
EXEC OnDisk.InsertVehicleLocationBatched @VehicleLocationTVP;
GO

This creates the type for our memory optimized table variable and defines the procedure. The last batch generates 500 rows of random data, inserting them into the same type, and calls the new proc with the TVP as input. Executing this once to insert 500 rows took 1981 µs, which is just under 4 µs per row.

Full results for all three tests

Wrapping up

This might have been my wordiest blog, but I hope you learned something from it. I’ve rewritten a number of procedures in recent years to operate on batches, and the results have largely been great.

Again, I will frequently use a join or index hint when joining the TVP to a base table, and that’s a minor change that can prevent an expensive mistake from the lack of statistics.

If you have any topics related to performance in SQL Server you would like to hear more about, please feel free to @ me and make a suggestion.

Please follow me on twitter (@sqljared) or contact me if you have questions.

USE WideWorldImporters
GO
SELECT TOP 100 
	sol.OrderID, 
	sol.UnitPrice,
	sol.Quantity,
	sol.PickedQuantity,
	sol.LastEditedWhen
FROM Sales.OrderLines sol 
WHERE 
	sol.StockItemID = 20
GO

Key Lookups

In working on my presentation for Data Saturday #8 – Southwest US, I hadn’t realized how many topics come up at least briefly in the talk. I wanted to make a few posts about to go into details on each of these topics and why they are important.

My thanks again to Deborah Melkin for her review and feedback of the presentation.

A key lookup is an operation that occurs when a query has used a nonclustered index on a given table, but needs to access more columns to complete the query. It may need to check columns not in that index for additional filters, or it may just need to return that column as part of its result set.

In the simple query above, we’re retrieving 100 rows from the seek against a nonclustered index, then performing a key lookup against the clustered index. There is a nested loops operator between the two and understanding how that operates is important; for each row we receive from the first table, we perform the second operation once. So, in this query we are seeking 100 rows from the nonclustered index, then performing the key lookup 100 times. We go through the index once for each row we return, and you can see the cost of the key lookup operator is 99% of the query.

Operator Details

Details for the Key Lookup operator

Mouseover

If we mouseover the key lookup, we can see the details of this operation. We actually read 100 rows . The “Estimated Operator Cost” (0.324977) is nearly 100 times that of the index seek (0.0035899).

The “Number of Executions” is 100, so for each row received from the index seek, we traverse the clustered index (its index and leaf pages) once to get that row. And we do 100 separate seeks of that index to get 100 rows. This is a lot more work than we did to get 100 rows with 1 index seek from the nonclustered index.

The estimates match our actuals, but the TOP clause is a very good hint for how many rows we should receive.

If you have a table scan somewhere in your plan is table scanning millions of rows, you should probably address that first. But removing the key lookup by returning fewer columns drops this query from 12.5 milliseconds to 73 microseconds. That’s a 94.16% duration reduction (thank you Query Store).

Resolution

There’s two ways to handle a query like this with a key lookup.

  1. Do we need these columns in our query?
  2. Create a covering index.

Addition by Subtraction

We are doing the key lookup because we want to return columns, or filter\otherwise use columns, that are not in the nonclustered index. Let’s first ask this: do we need these columns in our query?

If we check the code or application that’s retrieving the results, does it actually consume those columns from the result set and use them? If we are filtering on that column, does that filter still make sense? If not, let’s just take it out of the query to simplify matters.

And it is very clear which columns are the issue. If you look at the details of the key lookup in the image above, the Output List for that operator shows which columns we are using the clustered index to retrieve. If you don’t need any of them, you can remove them from the query. Your new execution plan will be missing a key lookup.

CREATE COVERING INDEX

The heading is a joke; there’s no such command, of course. A covering index is a nonclustered index that supplies all the information you need from a given table to complete a given query. So far, we’re doing key lookups for this query because no such an index exists. We could get all these columns from the clustered index, but we would have to scan the whole index because our WHERE clause doesn’t match the sorting of the clustered index.

Normally when we create an index, we want our index to include any columns we are filtering on. So it would include columns in our WHERE clause, or the columns in our JOIN clause if we are joining from another table. In some cases, you might want the index to match an ORDER BY. Here just the section in red.

For a covering index on this query, we need to include the SELECT list (in the green section) in our index. In general, every column for this table referenced in the query needs to be in our index.

The INCLUDE column is a great way to add in the columns in our SELECT list.

We could add those 5 columns to our index normally as key values, but that would unnecessarily bloat all the pages of the index. We aren’t filtering on any of those columns, so we don’t need the columns in the index pages for us to filter properly. If we use the INCLUDE clause, these columns will be present only in the leaf page of our index. This is similar to how the columns from the clustered index are added to all nonclustered indexes.

So a script for the new index would look like this:

CREATE NONCLUSTERED INDEX [IX_Sales_OrderLines_AllocatedStockItems] ON [Sales].[OrderLines](	
	[StockItemID] ASC
) INCLUDE(
	[PickedQuantity],
	[OrderID],
	[UnitPrice],
	[Quantity],
	[LastEditedWhen]	
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = ON, ONLINE = ON, 
	ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [USERDATA]
GO

With the index in place, our original query took 95 microseconds. Slightly longer than the query with the reduced result set, but we did increase the size of the index some.

Conclusion

A key lookup might be an operation you don’t notice often, but I’ve been impressed with the result of removing them when I can.

I’ll be posting other blogs with foundational topics in the near future and more posts in general than I’ve had recently. Maybe this isn’t foundational; it might be on the first floor.

I hope you’ve learned something from this post. Please follow me on twitter (@sqljared) or contact me if you have questions.