Update and Correction:

This blog was originally posted on February 20. Since then I read other articles that suggested different behavior with the Halloween Problem. I contacted Paul White, who informed me that the WideWorldImporters database uses compatibility level 130 (SQL Server 2016) by default. So, I tested on a SQL Server 2019 instance but was probably seeing an issue addressed in later updates.

I tested again at compatibility level 150 and saw a different execution plan which led to different conclusions.

I’ve left the majority of the post unchanged, but I’m adding an addendum section, and updating the summary and its conclusions. So, make sure you read those sections for the corrections.

Original Post:

I find myself talking about the Halloween Problem a lot and wanted to fill in some more details on the subject. In short, the Halloween Problem is a case where an INSERT\UPDATE\DELETE\MERGE operates on a row more than once, or tries to and fails. In the first recorded case, an UPDATE changed multiple rows in the table more than once.

So let’s take a look at an example using a publicly available database, WideWorldImporters.

A Halloween Problem example

Here’s a simple update procedure. We’re going to update the quantity for an item in the Sales.OrderLines table:

CREATE OR ALTER PROCEDURE Sales.OrderLines_UpdateQuantity
	@OrderID INT,
	@StockItemID INT,
	@Quantity INT
	WITH EXECUTE AS OWNER
AS
BEGIN
	SET NOCOUNT ON;
	SET XACT_ABORT ON;

	UPDATE sol
	SET
		sol.Quantity = @Quantity,
		sol.PickedQuantity = @Quantity
	FROM Sales.OrderLines sol
	WHERE
		sol.OrderID = @OrderID
		AND sol.StockItemID = @StockItemID;
		-- AND sol.Quantity <> @Quantity;
END;
GO

You may notice the commented line. In one description of the Halloween Problem I heard\read, it was suggested that if we try to SET something that is in our WHERE clause the problem is likely to occur. Or rather, SQL Server will see the possibility of the problem and add protections to our execution plan to prevent it.

First, let’s test without that line, and see what our execution plan tells us.

EXEC Sales.OrderLines_UpdateQuantity
	@OrderID = 5,
	@StockItemID = 155,
	@Quantity = 21;
GO
Note the eager spool

The eager spool between our index reads and clustered index update shows that SQL Server added Halloween protections to prevent the problem. The problem is prevented by separating the read phase of the query from the write phase.

This usually involves a blocking operator. Most often this is an eager spool, but if there is another blocking operator in the plan like a sort or hash match, that blocking operator may remove the need for a separate spool.

The Halloween Problem would occur if a query is running in row mode and as rows are still being read, rows are being updated and moved in an index. This allows the read operation to potentially read the updated row again and operate on it again. The index movement is key in this scenario.

But with a blocking operator between the read operation and the write, we force all the reads to complete first. This gives us a complete, distinct list of rows to be updated (in this example) before we get to the clustered index update, so it isn’t possible to update the same row twice.

So, how does index movement come into play here? We are updating the Quantity and PickedQuantity columns in our UPDATE statement. Both fields are key columns in the only columnstore index on the table, NCCX_Sales_OrderLines.

CREATE NONCLUSTERED COLUMNSTORE INDEX [NCCX_Sales_OrderLines] ON [Sales].[OrderLines]
(
[OrderID],
[StockItemID],
[Description],
[Quantity],
[UnitPrice],
[PickedQuantity]
)WITH (DROP_EXISTING = OFF, COMPRESSION_DELAY = 0) ON [USERDATA];
GO

So when we update these columns, the affected rows will move in that index. If the row moves, that means a read operation could continue reading and find the same row again, returning it as a part of its result set a second time.

Interestingly, we aren’t reading from the columnstore index in the plan provided. Since that’s the only index containing these columns as key values, it’s the only index where the rows should move. In this case, our read operators shouldn’t encounter updated rows a second time, since they use the FK_Sales_OrderLines_OrderID (with a key lookup against PK_Sales_OrderLines).

I wonder if SQL Server decided the Halloween protections were needed before it decided which index it would use for the read.

Removing the index

Either way, if we dropped the NCCX_Sales_OrderLines index, we should see a plan without an eager spool between the read operators and the update operator.

IF EXISTS(
	SELECT 1
	FROM sys.indexes si
	WHERE
		si.name = 'NCCX_Sales_OrderLines'
)
BEGIN
	DROP INDEX [NCCX_Sales_OrderLines]
		ON Sales.OrderLines;
END;
GO

With the index removed, let’s look at the new plan.

Unspooled

We’ve lost the extra steps to the left of the clustered index update operator to update the columnstore index, and we have also lost the eager spool between the read operators and the update operator. This shows without the index movement, Halloween protections are no longer needed.

Performance impact of protections

Let’s look at the data from Query Store to see how big the difference is between the two execution plans.

I ran a simple query against the same OrderID in Sales.OrderLines before running the procedure before and after the index change to get the data into the cache (because cold cache issues were making a large difference). I also ran the procedure 10 times to try to average out our results in case any odd wait types were seen.

80 microseconds versus 46 microseconds. Blazing fast in both cases with the data already cached, but the plan with Halloween protections took 74% longer. Unsure if the update to a columnstore index is significantly more expensive than that of a rowstore index. Perhaps we should test this again without columnstore complicating the issue.

Speaking in general, I would expect a bigger difference in a query affecting more rows. For a query that only returns 3 rows from the first index seek, the delay caused by the spool would be very small. But imagine if we have a query that reads tens or hundreds of thousands of rows before performing its write operation.

Normally such a query would be passing rows it has read up to the join and update operators while it is continuing to read. Those operations would be happening on different threads in parallel.1

If we are being protected from the Halloween Problem, the eager spool will not return any rows to the operations above it (like the clustered index update) until all rows have been read. So the writes cannot start until much later, and the more rows being read the more considerable the delay.

Nonclustered indexes?

If you noticed the “+3 non-clustered indexes” banner in one of the plans above, that’s indicating the nonclustered indexes updated when we updated the clustered index. This is more obvious in Plan Explorer than in the plans as shown in SQL Server Management Studio. So, I wanted to point that out in case the visual was confusing to anyone.

But this raises another question. If we are updating those indexes, why don’t they cause the Halloween protections to be used?

That is because the quantity columns are present in those indexes only as included columns. Changes to those columns won’t affect where the row sorts, but the values still need to be updated.

Rowstore testing

So, let’s see how this looks with a rowstore index. Here’s a second procedure, similar to the first but also updating PickingCompletedWhen.

CREATE OR ALTER PROCEDURE Sales.OrderLines_UpdateQuantityWhen
	@OrderID INT,
	@StockItemID INT,
	@Quantity INT
	WITH EXECUTE AS OWNER
AS
BEGIN
	SET NOCOUNT ON;
	SET XACT_ABORT ON;

	UPDATE sol
	SET
		sol.Quantity = @Quantity,
		sol.PickedQuantity = @Quantity,
		sol.PickingCompletedWhen = GETUTCDATE()
	FROM Sales.OrderLines sol
	WHERE
		sol.OrderID = @OrderID
		AND sol.StockItemID = @StockItemID
		AND sol.PickingCompletedWhen < GETUTCDATE();
END;
GO

Initially, no index uses PickingCompletedWhen. So if we execute the procedure as is, we shouldn’t see the tell-tale eager spool.

EXEC Sales.OrderLines_UpdateQuantityWhen
	@OrderID = 5,
	@StockItemID = 155,
	@Quantity = 21;
GO

This plans is what we’d expect. If we add an index, how does this change the plan and how does this change the performance?

IF NOT EXISTS(
	SELECT 1
	FROM sys.indexes si
	WHERE
		si.name = 'IX_OrderLines_OrderID_StockItemID_PickingCompletedWhen'
)
BEGIN
	CREATE INDEX IX_OrderLines_OrderID_StockItemID_PickingCompletedWhen
		ON Sales.OrderLines (OrderID, StockItemID, PickingCompletedWhen);
END;
GO

Here, we see the eager spool implementing the Halloween protections again, but between the index seek and the key lookup. Note that the new index is the one we are using for the index seek. The clustered index update now indicates it is updating 4 nonclustered indexes, including the new index.

So, is the performance difference as stark as it was with the columnstore index?

So, 52 µs vs 118 µs. The query took about ~126% longer when the Halloween protections were present. More than we saw with the columnstore index, which is surprising. Perhaps it is relevant that we are updating a third field. It almost feels like the observer effect at this scale.

Addendum

So, to correct things here, let’s go back to the first procedure.

CREATE OR ALTER PROCEDURE Sales.OrderLines_UpdateQuantity
	@OrderID INT,
	@StockItemID INT,
	@Quantity INT
	WITH EXECUTE AS OWNER
AS
BEGIN
	SET NOCOUNT ON;
	SET XACT_ABORT ON;

	UPDATE sol
	SET
		sol.Quantity = @Quantity,
		sol.PickedQuantity = @Quantity
	FROM Sales.OrderLines sol
	WHERE
		sol.OrderID = @OrderID
		AND sol.StockItemID = @StockItemID;
		-- AND sol.Quantity <> @Quantity;
END;
GO

If I run this procedure again with the restored database and no other changes besides updating the compatibility level to 150, I see the following execution plan:

So, we have no eager spool, which means the Halloween Problem isn’t a problem now.

Previously, there was a spool between the index seek and lookup and the clustered index update. The only index using any of the updated fields as a key value was the columnstore index. This suggested that the optimizer will use Halloween protections if any index uses the updated fields as a key value because the rows would be moved in that index.

This new plan disproves that because the optimizer no longer uses the protections with the later compatibility level. And the columnstore index (NCCX_Sales_OrderLines) is still present (as you can see if you hover over the clustered index update operator).

As for the second procedure, I see the Halloween protections even without the index I added in my example. Without that index, the query originally used the FK_Sales_OrderLines_OrderID index to seek the rows in question. At the higher compatibility level, the IX_Sales_OrderLines_Perf_20160301_02 index is used, which is keyed on (StockItemID, PickingCompletedWhen).

So, the Halloween protections are used because we read from an index keyed on one of the updated fields, and rows being updated will potentially move in that index.

We’ve seen the Halloween protections when using nonclustered indexes so far, but what if we are using the clustered index for the read?

I wrote a quick procedure to change the OrderLineID, which is the only column in the clustered primary key for this table. And this matches expectations; we see the eager spool between the clustered index seek and the update operator.

Summary

Hopefully, the addendum corrects the matter while keeping things clear. I’m updating one of the bullet points below, as well.

It seems there are only two criteria for the protections against the Halloween Problem to be used for an UPDATE query:

  1. The object being updated must also be in the query.
  2. A column being updated must be a key column in at least one index on the table. One of the updated columns must be a key value in the index used for the read portion of the query, so that the rows may move in that index.

For other statements, the setup is more complex. I find the UPDATE statement is the most straightforward example of the Halloween Problem. But you can see the protections in place if you query from a table as part of an INSERT or DELETE (or MERGE) where you change that same table.

And if we see Halloween protections in the plan for a query, we could change the offending index or the query to change the behavior.

Or we could use the manual Halloween technique, which I will discuss next time.

Thanks again to Paul White for pointing out the compatibility level; I doubt that would ever have occurred to me.

Please contact me if you have any questions or comments. I’ve updated my social media links above to include Counter.Social and Mastodon. We’ll see if there is more #sqlfamily activity on those platforms going forward.

Footnotes

1: Not the type of parallelism we typically think of with SQL Server. Parallelism is typically when a given operation, like an index scan, is expected to process many rows, and SQL Server dedicates multiple threads to that operator or group of operators. In this case, I say parallel because different operators (the index seek, nested loops join, and clustered index update) are all processing rows at the same time, one row at a time.

When I began working at Microsoft, I was very much a novice at performance troubleshooting. There was a lot to learn, and hash match joins were pointed out to me multiple times as the potential cause for a given issue. So, for a while I had it in my head, “hash match == bad”. But this really isn’t the case.

Hash matches aren’t inefficient; they are the best way to join large result sets together. The caveat is that you have a large result set, and that itself may not be optimal. Should it be returning this many rows? Have you included all the filters you can? Are you returning columns you don’t need?

If SQL Server is using a hash match operator, it could be a sign that the optimizer is estimating a large result set incorrectly. If the estimates are far off from the actual number of rows, you likely need to update statistics.

Let’s look at how the join operates so we can understand how this differs from nested loops

How Hash Match Joins Operate

Build Input

A hash match join between two tables or result sets starts by creating a hash table. The first input is the build input. As the process reads from the build input, it calculates a hash value for each row in the input and stores them in the correct bucket in the hash table.

Creating the hash table is resource intensive. This is efficient in the long run, but is too much overhead when a small number of rows are involved. In that case, we’re better off with another join, likely nested loops.

If the hash table created is larger than the memory allocation allows, it will “spill” the rest of the table into tempdb. This allows the operation to continue, but isn’t great for performance. We’d rather be reading this out of memory than from tempdb.

The building of the hash table is a blocking operator. This means the normal row mode operation we expect isn’t happening here. We won’t read anything from the second input until we have read all matching rows from the build input and created the hash table. In the query above, our build input is the result of all the operators highlighted in yellow.

Probe Input

Once that is complete, we move on to the second input in the probe phase. Here’s the query I used for the plan above:

USE WideWorldImporters
GO

SELECT *
FROM Sales.Invoices inv
INNER JOIN Sales.InvoiceLines invl
	ON invl.InvoiceID = inv.InvoiceID
WHERE
	inv.AccountsPersonID = 3002
GO

The build input performed an index seek and key lookup against Sales.Invoices. That’s what the hash table is built on. You can see from the image above that this plan performs a scan against Sales.InvoiceLines. Not great, but let’s look at the details.

There is no predicate or seek predicate, and we are doing a scan. This seems odd if you understand nested loops, because we are joining based on InvoiceID, and there is an index on InvoiceID for this table. But the hash match join operated differently, and doesn’t iterate the rows based on the provided join criteria. The seek\scan against the second table has to happen independently, then we probe the hash table with the data it returns.

If the read against Sales.InvoiceLines table can’t seek based on the join criteria, then we have no filter. We scan the table, reading 490,238 rows. Also unlike a nested loop join, we perform that operation once.

There is a filter operator before the hash match operator. For each row we read of Sales.InvoiceLines, we create a hash value, and check against the hash table for a match. The filter operator reduces our results from 490,238 rows to 751, but doesn’t change the fact that we had to read 490,238 rows to start with.

In the case of this query, I’d want to see if there’s a filter I can apply to the second table. Even if it doesn’t change our join type away from a hash match, if we performed a seek to get the data initially from the second table, it would make a huge difference.

Remember Blocking Operators?

I mentioned the build input turns that branch of our execution plan into a blocking operator. This is something try to call out the normal flow of row mode execution.

With a nested loops join, we would be getting an individual row from the first source, and doing the matching lookup on the second source, and joining those rows before the join operator asked the first source for another row.

Here, our hash match join has to gather all rows from the first source (which here includes the index seek, key lookup, and nested loops join) before we build our hash table. This could significantly affect a query with a TOP clause.

The TOP clause stops the query requesting new rows from the operators underneath it once it has met it’s requirement. This should result in reading less data, but a blocking operator forces us to read all applicable rows first, before we return anything to upstream operators.

So if your TOP query is trying to read a small number of rows but the plan has a hash match in it, you will be likely reading more data that you would with nested loops.

Summary

Actual numbers comparing join types would depend a ton on the examples. Nested loops are better for smaller result sets, but if you are expecting several thousand (maybe ten or more) rows read from a table, hash match may be more efficient. Hash matches are more efficient in CPU usage and logical reads as the data size increases.

I’ll be speaking at some user groups and other events over the next few months, but more posts are coming.

As always, I’m open to suggestions on topics for a blog, given that I blog mainly on performance topics. You can follow me on twitter (@sqljared) and contact me if you have questions. You can also subscribe on the right side of this page to get notified when I post.

Have a good night.

Another Example: 

I was reviewing the performance of a procedure recently and stumbled over another pumpkin. 

My last blog post was about the Halloween Problem, and we saw its effects on an UPDATE statement. In this case, it was the same issue but with an INSERT statement. Here’s the code being executed:

INSERT INTO Schema1.Object1 (
		Column1,
		Column2,
		Column3,
		Column4 )
	SELECT
		Object2.Column1,
		Object2.Column2,
		Object2.Column3,
		Object2.Column4 
	FROM Object3 Object2
	LEFT LOOP JOIN Schema1.Object1 Object4 WITH(INDEX(Column5))
		ON Object4.Column3 = Object2.Column3
		AND Object4.Column1 = Object2.Column1
		AND Object4.Column2 = Object2.Column2
		AND Object4.Column4 = Object2.Column4
	WHERE
		Object4.Column6 IS NULL

The gist is, we’re trying to insert a record into Object1, assuming said record doesn’t already exist. We’re querying the data we want to insert from a temp table, but joining to the base table to make sure a record doesn’t already exist with the same values.

In the case of an UPDATE statement, if we update fields that are in our query’s search criteria, we could update the same row multiple times. SQL Server’s Halloween protections prevent this, but result in extra work that affect our performance.

The INSERT statement here is similar, trying to insert a record while querying to see if the same record exists. So, again SQL Server adds Halloween protections to our plan:

Plan Analysis

I would have expected us to scan the temp table, then have a LEFT JOIN to the base table. The Table Spool is the red flag that we have an issue with the plan, and is frequently seen with Halloween protections.

The index scan on the base table seems to be overkill since we’re joining on the primary key columns (the key lookup isn’t much of a concern). But we’re likely doing the scan because of the spool; it’s SQL Server’s way of getting all relevant records in one place at one time, breaking the normal flow of row mode operation, to make sure we don’t look up the same record multiple times.

Easy Fix

The data we are trying to insert is being passed into the procedure using a memory-optimized table valued parameter. We’ve queried that into the temp table as another step before our final INSERT SELECT query, because SQL Server will sometimes make poor optimizations when TVP’s are involved (because they have no statistics).

The solution then is an easy one. We move our LEFT JOIN out of the final INSERT, and we make that check as we query the TVP’s data into the temp table. We separate the SELECT against that table from the INSERT; they are now in separate operations, and the Halloween protections are no longer necessary.

If you liked this post, please follow me on twitter or contact me if you have questions.

I encountered something recently I’d never encountered, so I had to share. I was making another change to a procedure I’ve been tuning recently. The idea was to alter the UPDATE statements to skip rows unless they are making real changes. The activity here is being driven by customer activity, and that sometimes leads to them setting the same value repeatedly. Difficult to know how often we update a row to the same value, but we think it could be significant. So, we added a clause to the UPDATE so we’ll only update if ‘old_value <> new_value’. The actual update operator is the most expensive part of the statement, but Simple enough so far. The scan is against a memory optimized table variable, and the filter to the left our our seeks and scans check for a change to our value. Nothing left but to update the index and…

Curve Ball

Wait, what’s all this? We have a Split operator after our Clustered Index Update. SQL Server does sometime turn an UPDATE statement into effectively a DELETE and INSERT if the row needs to move, but this seems a bit much. We have a total of 4 index update/delete operators now, and they aren’t cheap. My very simple addition to the WHERE clause actually caused a small increase in duration, and a big jump in CPU. So what’s going on?

UPDATE Object1
SET
	Column1 = CASE 
		WHEN Variable1 = ? THEN Object2.Column1 
		WHEN Variable1 = ? THEN Object1.Column1 + Object2.Column1 
		END,
	Column4 = GETUTCDATE()
FULL OUTER JOIN Variable5 dcqt
	ON Object1.Column9 = Variable2
	AND Object1.Column10 = Variable3
	AND Object1.Column6 = CASE WHEN Variable6 = ? AND Object2.Column2 >= ? THEN ? ELSE Object2.Column2 END
LEFT JOIN Variable7 oq
	ON Object1.Column6 = Object3.Column6
WHERE
	Object1.Column10 = Variable3
	AND Object1.Column9 = Variable2
	AND Object1.Column2 >= ?
	AND Variable8 < ?
	AND Object1.Column1 <> CASE 
		WHEN Variable1 = ? THEN Object2.Column1 
		WHEN Variable1 = ? THEN Object1.Column1 + Object2.Column1 
		END

So, we’re updating Column1 to the result of a CASE statement, and the last part of our WHERE clause compares Column1 to the same CASE. And the CPU for this statement just doubled? I happened to jog this by the superlative Kevin Feasel, who suggested this was the Halloween Problem.

The Halloween Problem

The Halloween Problem is a well documented issue, and it affects other database systems, not just SQL Server. The issue was originally seen by IBM engineers using an UPDATE to set a value that was also in their WHERE clause. So, database systems include protections for the Halloween Problem where necessary in DML statements, and SQL Server decided it needed to protect this query. And our query matches the pattern for this issue; we’re filtering on a field while we are updating it. All DML statements can run afoul of this issue, and there are examples for all in this really excellent series of posts by Paul White.

An Unlikely Ally

The protection SQL Server employs ultimately comes down to interrupting the normal flow of rows from operators up the plan. We actually need a blocking operator! A blocking operator would cause all the rows coming from the query against our primary table to pool in that operator. We’ll have a list of all relevant rows in one place, and they can be passed on to operators above without continuing the index seek against our table; possibly seeing the same row a second time. Eager spools\table spools are frequently used for this purpose, and SQL Server used an eager spool to provide Halloween protection in my case. A sort would also do, and we could design this query to sort the results of querying the table. If our query already employed a blocking operation to interrupt the flow, SQL Server would not need to introduce more operations to protect against the Halloween Problem. In my case, I definitely don’t want it spooling and creating three more expensive index operations. My idea for rewriting this goes a step farther.

A Two Table Solution

If we queried the data from our base table into a temp table, we could then update the base table without querying that same table on the same column, tripping over a pumpkin in the process. My procedure is already using memory optimized table variables, because this proc runs so frequently that temp tables cause contention in tempdb (described here). So in this instance, I’ll actually query this data into another motv. I can also use the data I stash in my motv to decide if any rows I intended to update don’t exist yet. I’ll INSERT those later, only if needed.

Bonus

Now, the whole point of the original change was to reduce work when we update a field to be equal to its current contents. In my motv, I’m going to have the old and new value for the field. It would be simplicity itself to put my UPDATE in a conditional, so that we only run the UPDATE if at least one row in my motv is making an actual change to the field. So instead of just reducing the cost of our update operators, we’ll skip the entire UPDATE statement frequently, for only the cost of one SELECT and a write to our motv.

Results

Stunning. The new logic causes this proc to skip the UPDATE statements entirely over 96% of the time. Even given that we are running a query to populate the memory optimized table variable (which is taking <100 microseconds) and running an IF EXISTS query against that motv (which takes 10-20 microseconds), we’re spending 98% less time doing the new logic than the original UPDATE statement. When I started reviewing the procedure several months ago, it took 3.1 milliseconds on average. I’ve tried several other changes in isolation, some effective, some not. The procedure is now down to 320 microseconds; an almost 90% reduction overall after a 71% drop from this change. I have some other ideas for tweaks to this proc, but honestly, there’s very little left to gain with this process. Time to find a new target. If you liked this post, please follow me on twitter or contact me if you have questions.

How Relevant is the TOP operator?

I’ve explained what a blocking operator is and provided a few examples, but maybe this doesn’t seem important. It’s affecting the TOP operator, sure, but don’t people just use this to look at the TOP 1000 rows of their tables in SSMS?

The TOP operator is useful for many operations, especially in a large environment. Large operation can timeout or fail for a variety of reasons, consuming resources without providing the results you need. A small, batch-sized operation is more likely to succeed and tends to perform more consistently. Many maintenance operations make sense to run with a TOP operator, so we should make sure those operations aren’t stymied by blocking operators. Some examples:

  • Garbage collection on a table with many millions rows. You want this to perform quickly, but you really can’t afford for this to time out (whatever that timeout period may be). We can limit how long this runs by GC’ing a small batch at a time, but this can be hampered badly by an extra sort or a hash join.
  • Archiving data applies for the same reasoning as GC. If you archive data to another table\database\server, you’ll want to keep your operation small enough to manage.
  • Backfilling a new column. If the existing table is large, you can’t just UPDATE the whole table; you’ll lock the table and block all your users. A batched UPDATE in a loop or in a scheduled process can resolve this without causing an outage.
  • GDPR is here, and CCPA is coming. You may need to anonymize data across many related tables. We need this to perform well to cleanse our existing data, and we’ll continue running this process going forward.
  • Queries producing potentially large results to your application may need to be batched as well. If this times out, you’re still wasting a lot of time and reads, even if no data is changed.

Out the Window

I examined one process recently that was similar to this query, causing a GC operation to time out.

SELECT TOP 100
	inv.InvoiceID,
	ili.InvoiceLineID,
	ROW_NUMBER() OVER(Partition By inv.CustomerID ORDER BY inv.InvoiceDate) AS SortID
FROM Sales.Invoices inv
JOIN Sales.InvoiceLines ili
	ON ili.InvoiceID = inv.InvoiceID
WHERE
	inv.InvoiceDate < DATEADD(day, -60, GETUTCDATE()) 
	AND ili.LastEditedWhen < DATEADD(day, -60, GETUTCDATE()) 
ORDER BY SortID 

It was a SELECT statement, and it inserted the important fields into a temp table, and ran DELETEs against multiple tables based on the contents of the temp table. But it was the initial SELECT that had a poor plan and caused the timeout.

I quickly saw a SORT in the execution plan and wondered why. The actual query didn’t have an ORDER BY clause. But it did have a ROW_NUMBER OVER in the select list; took me a minute to notice that.

But was the sorting necessary? “We need to delete really old records in this table, but it’s vitally important that we delete them in the order they were originally inserted!”

It seemed a poor reason to sort a table with tens of millions of rows. Coupled with the very small batch size, we were doing an extraordinary amount of work to get rid of a few rows. So what if we commented the ROW_NUMBER and the ORDER BY out?

Even though the new plan has a scan operator, we only read 110 rows from it because we are using the TOP operator properly. Note the row counts from the first plan again. We have 479 thousand rows going through multiple operators in the first, but only 100 per operation in the second.

Avoid one Blocking Operator, find another

Here’s an example from recent work. I was looking at a query in a GC that was populating a temp table to use for later DELETEs. I was anticipating the optimizer to try a hash match join, so I used an INNER LOOP JOIN hint to avoid that blocking operator. The results were quite unpleasant, as you can see in this anonymized plan.

So, I avoided the hash match join, but the SQL Server optimizer didn’t see it the way I did. The first table is a temp table, Object6, which we are joining to a normal table, Object11. But this plan includes a table spool that not only forces us to read all 587741 rows in the seek against our table, it seems to create a cross join in memory between the results of the clustered index scan on the temp table and the clustered index seek against the base table (538 * 587741 = 316204658).

That hint obviously wouldn’t work. I reversed the order of the tables, then removed the hint altogether, giving the following:

 

2000 rows returned across the board. There is a SORT operation, but it’s after the TOP so there’s no harm.. Our results are being sorted before inserting them into a second temp table. And a much more performant query, taking ~8 ms instead of 75000 ms.

Sort out blocking operators

Hopefully this has been informative. I honestly wasn’t aware of blocking operators until a month or two ago. That’s the frustrating and interesting thing of SQL Server sometimes; you can work on this as long as I have and, even putting new development aside, there’s always more to learn.

Hope this helps you optimize some of your own operations.

It seems so simple

The TOP operator seems pretty straightforward. “Hey SQL Server, give me the first 100 rows that match this criteria, then stop.” But when certain operations get involved it can go sideways.

Let’s start with a simple example in the WorldWideImporters database.

SELECT TOP 100 
	sol.OrderLineID, 
	sol.UnitPrice 
FROM Sales.OrderLines sol 
WHERE sol.OrderLineID < 1000 
	AND sol.UnitPrice > 50

The query here is simple, and we can see the TOP operator only returns 100 rows, but so does the index seek underneath. The way this works is that the TOP asks the operation connected to it for 1 row, and keeps asking until it has 100 or the operation below can’t find anymore rows that match.

Because of the WHERE clause I chose, we actually had to read 943 rows in the table to get 100 that matched. At that point the TOP stopped asking for more rows.

If the TOP operator kept asking for more rows and only displayed the first 100, that would have been a waste of effort. Still, we’re only querying 1 index here.

Let’s add a wrinkle

SELECT TOP 100 
	sol.OrderID, 
	sol.UnitPrice,
	sol.Description
FROM Sales.OrderLines sol 
WHERE 
	sol.OrderID < 1000 
	AND sol.Quantity > 5
ORDER BY sol.OrderID

Now we have a key lookup, and we can follow the flow of what happened. We read and returned 270 rows from the index seek. There’s a nested loops operator connecting us to our key lookup, and that returned only 100 rows. So, we had to read 270 rows from the index seek to find 100 rows that met all our filters. 

Ultimately the TOP operator was asking for 1 row repeatedly until it met its quota, and then it stopped asking for more. As expected. 

What’s a blocking operator?

So, with another change we’ll see one of those operators I mentioned earlier.

SELECT TOP 100 
	sol.OrderID, 
	sol.UnitPrice,
	sol.Quantity,
	sol.PickedQuantity,
	sol.LastEditedWhen
FROM Sales.OrderLines sol 
WHERE 
	sol.OrderID < 5000 
	AND sol.Quantity > 5
ORDER BY sol.LastEditedWhen

Here, we’re seeking based on a range of OrderIDs, but getting results sorted in a different order. We’re reading many more records from the index and the key lookup (where the estimate is wayyy off). We don’t need to read this much to get our results, because we returned 2160 rows from the key lookup and the nested loops join, but then we reduce when we hit the sort.

It makes sense that you can’t return the top 100 rows that meet this criteria, until you’ve seen them all. Well, unless the data is in the order you want, and you can just seek it that way from the index. Our previous query had an ORDER BY clause but no sort operation because our sort matched our range seek.

Sort is a blocking operator. Don’t feel bad if you haven’t heard of the term; I’ve been working with SQL Server for 15 years, and I’m sure I never heard the term until the incomparable Grant Fritchey mentioned it while he was lecturing at my place of employment.

So sorts and several other types of operators (eager spools, remote query\scan\etc, hash match joins, and more) will block the normal flow and gather all their results before passing any rows on. The hash match join only blocks while building its hash table from the first input, before probing the second.

Let’s hash it out

SELECT TOP 100 
	so.OrderID, 
	so.CustomerID,
	sol.Quantity,
	sol.PickedQuantity,
	sol.LastEditedWhen
FROM Sales.OrderLines sol 
JOIN Sales.Orders so
	ON so.OrderID = sol.OrderID
WHERE 
	sol.OrderID < 5000 
	AND sol.Quantity > 5
--ORDER BY sol.LastEditedWhen

We filtered on OrderID < 5000, and we read 4999 rows from the build table. So we read everything that fit that criteria; we didn’t stop early because we had our 100 rows. So, definitely blocking behavior.

Then we probe the second table and return 900 rows, and they pass through the hash match. The results get reduced to 100 by the TOP operator. 

Why 900 rows from the OrderLines table, not 100? That’s less than clear. As I vary the result set or the TOP size, I get a number of behaviors using different indexes and returning batches of various sizes. It appears the probe may be trying to do a batch of rows at a time, or it may be related to the memory allocated. (If I can get a clearer answer to this aspect, I’ll update the post).

Update: I consulted with my colleague, the superb Kevin Feasel, who suggested this was operating in batch mode. I was testing this using SQL Server 2019, and batch mode on rowstore is a new feature that everyone should be aware of.

I thought I had ruled this out however, and indeed the probe operation was in row mode:

However, the hash match operation was not!

So the hash match requested a batch of 900 rows, which is why we saw the unexpected number of rows.

If you try this in SQL Server 2019, you may see different behaviors as you vary the result set (when I removed LastEditedWhen from the result set, it changed to row mode and only returned 100 rows) or the TOP size (TOP 1000 dropped it back to row mode). I also saw some variation with the index used against OrderLines, including the columnstore index.

Summing up

These have been relatively simple examples of the TOP operator in action, and how it interacts with other operators. In my next post, I’ll provide some more complicated plans, and discuss how we can keep our TOPs in top shape.