Slow Garbage Collection

I encountered a curious issue recently, and immediately knew I needed to blog about it. Having already blogged about implicit conversions and how the TOP operator interacts with blocking operators, I found a problem that looked like the combination of the two.

I reviewed a garbage collection process that’s been in place for some time. The procedure populates a temp table with the key values for the table that is central to the GC. We use the temp table to delete from the related tables, then delete from the primary table. However, the query populating our temp table was taking far too long, 84 seconds when I tested it. We’re scanning and returning 1.4 million rows from the first table, doing a key lookup on all of them. We scan another table to look up the retention period for the related account, as this database has information from multiple accounts, and return 8.4 million rows there. We join two massive record sets, then have a filter operator that gets us down to a more reasonable size.

In general, when running a complex query you want your most effective filter, that results in the fewest rows, to happen first. We want to get the result set small early, because every operation after becomes less expensive. Here’s part of our anonymized query:

 WHERE
   Object5.Column10 = ?
   AND Object5.Column6 < Variable4 - COALESCE(Object12.Column1,Variable1,?)

So, Object5 is our main table. Column10 is a status column, and we compare that to a scalar value to make sure this record is ready for GC. That’s fine. The second is checking if the record is old enough to GC, and this is where things get interesting. The code here is odd; we’re subtracting a COALESCE of three values from a DATETIME variable, relying on behavior that subtracting an INT from a DATETIME subtracts that number of days from the DATETIME. There’s an implicit conversion, but it is on the variable side of the inequality. And the plan above appears to be our result.

So, Bad Date Math Code?

Seems like a good root cause, and I don’t like this lazy coding pattern, but let’s test it out with tables we can all look at. Would this be any better if we had written this to use the DATEADD function to do the math properly?


USE AdventureWorks2014
GO
--CREATE INDEX IX_SalesOrderHeader_ModifiedDate ON Sales.SalesOrderHeader (ModifiedDate);

SELECT TOP 100
	soh.ModifiedDate,
	soh.SalesOrderID
FROM Sales.SalesOrderHeader soh
WHERE
	soh.ModifiedDate < DATEADD(day, -1, GETUTCDATE());
	
SELECT TOP 100
	soh.ModifiedDate,
	soh.SalesOrderID
FROM Sales.SalesOrderHeader soh
WHERE
	soh.ModifiedDate < GETUTCDATE() - 1;

So, if we had an index on the OrderDate column, would it perform differently?

Apparently not. Same plan, same cost. But when you think about it, this tests the date math by subtracting an integer from the DATETIME provided by GETUTCDATE(). The original was subtracting a COALESCE of three values. One of those values was a float. Could the COALESCE or the resulting data type have made this more complicated?

Testing COALESCE


USE AdventureWorks2014
GO
DECLARE
	@AccoutSetting FLOAT = 8.0,
	@Default INT = 365;
	
SELECT TOP 100
	soh.ModifiedDate,
	soh.SalesOrderID
FROM Sales.SalesOrderHeader soh WITH(INDEX(IX_SalesOrderHeader_ModifiedDate))
WHERE
	soh.ModifiedDate < DATEADD(DAY, -COALESCE(@AccoutSetting, @Default), GETUTCDATE());

Running this, again we see a nice seek that reads and returns 100 rows. So the different data type and the COALESCE makes no difference.

Looking at the original query again, the first value isn’t a variable, it’s a column from a different table. We can’t filter by this column until we’ve read the other table, which affects our join order. But we have no criteria to seek the second table with.

Joined Filtering

One more test. Let’s see what the behavior looks like if we join to the Customers table to look for the RetentionPeriod. First, I’ll create and populate some data in that column:


USE AdventureWorks2014
GO
IF NOT EXISTS(
	SELECT 1
	FROM INFORMATION_SCHEMA.COLUMNS isc
	WHERE
		isc.TABLE_NAME = 'Customer'
		AND isc.TABLE_SCHEMA = 'Sales'
		AND isc.COLUMN_NAME = 'RetentionPeriod'
)
BEGIN
	ALTER TABLE Sales.Customer
		ADD RetentionPeriod INT NULL;
END;
GO
UPDATE TOP (100) sc 
	SET sc.RetentionPeriod = (sc.CustomerID%4+1)*90
FROM Sales.Customer sc;
GO
CREATE INDEX IX_Customer_RetentionPeriod ON Sales.Customer (RetentionPeriod);
GO

I only populated a few of the records to better match the production issue; only some accounts have the value I’m looking for, hence the COALESCE.


USE AdventureWorks2014
GO
DECLARE
	@Default INT = 365;

SELECT TOP 100
	soh.ModifiedDate,
	soh.SalesOrderID,
	sc.RetentionPeriod
FROM Sales.SalesOrderHeader soh
LEFT JOIN Sales.Customer sc
	ON sc.CustomerID = soh.CustomerID
	AND sc.RetentionPeriod IS NOT NULL
WHERE
	soh.ModifiedDate < DATEADD(DAY, -COALESCE(sc.RetentionPeriod, @Default), GETUTCDATE());

Now we’re trying to filter records in SalesOrderHeader, based on the Customer’s RetentionPeriod. How does this perform?

 

Well, the row counts aren’t terrible, but we are scanning both tables. The optimizer opted to start with Customer table, which is much smaller. We’re not filtering on the date until the filter operator, after the merge join.

I’d be worried that with a larger batch size or tables that don’t line up for a merge join, we’d just end up doing a hash match. That would force us to scan the first table, and without any filter criteria that would be a lot of reads.

Solution

The solution I applied to my production query was to create a temp table that had the account ID and the related retention period. I joined this to my primary table, and the query that was taking 84 seconds was replaced with one that took around 20 milliseconds.

 


USE AdventureWorks2014
GO
IF OBJECT_ID('tempdb..#AccountDates') IS NULL
BEGIN
	CREATE TABLE #AccountDates(
		CustomerID INT,
		GCDate DATETIME
	);
END
GO
DECLARE
	@Default INT = 365;

INSERT INTO #AccountDates
SELECT 
	sc.CustomerID,
	DATEADD(DAY, -COALESCE(sc.RetentionPeriod, @Default), GETUTCDATE()) AS GCDate
FROM Sales.Customer sc
WHERE
	sc.RetentionPeriod IS NOT NULL;

SELECT 
	soh.SalesOrderID
FROM #AccountDates ad
JOIN Sales.SalesOrderHeader soh
	ON soh.CustomerID = ad.CustomerID
	AND soh.ModifiedDate < ad.GCDate;

TRUNCATE TABLE #AccountDates;
GO

Here’s how I’d apply that thought to our example. I’ve created a temp table with CustomerID and its associated RetentionDate, so we could use that to search Sales.SalesOrderHeader.

You may have noticed my filter on the Sales.Customer table. In the live issue, the temp table had dozens of rows for dozens of accounts, not the tens of thousands I’d get from using all rows Sales.Customer in my example. So, I filtered it to get a similar size in my temp table.

Infinitely better. Literally in this case, since the ElapsedTime and ActualElapsedms indicators in the plan XML are all 0 milliseconds. I’m declaring victory.

If you liked this post, please follow me on twitter or contact me if you have questions.

How Relevant is the TOP operator?

I’ve explained what a blocking operator is and provided a few examples, but maybe this doesn’t seem important. It’s affecting the TOP operator, sure, but don’t people just use this to look at the TOP 1000 rows of their tables in SSMS?

The TOP operator is useful for many operations, especially in a large environment. Large operation can timeout or fail for a variety of reasons, consuming resources without providing the results you need. A small, batch-sized operation is more likely to succeed and tends to perform more consistently. Many maintenance operations make sense to run with a TOP operator, so we should make sure those operations aren’t stymied by blocking operators. Some examples:

  • Garbage collection on a table with many millions rows. You want this to perform quickly, but you really can’t afford for this to time out (whatever that timeout period may be). We can limit how long this runs by GC’ing a small batch at a time, but this can be hampered badly by an extra sort or a hash join.
  • Archiving data applies for the same reasoning as GC. If you archive data to another table\database\server, you’ll want to keep your operation small enough to manage.
  • Backfilling a new column. If the existing table is large, you can’t just UPDATE the whole table; you’ll lock the table and block all your users. A batched UPDATE in a loop or in a scheduled process can resolve this without causing an outage.
  • GDPR is here, and CCPA is coming. You may need to anonymize data across many related tables. We need this to perform well to cleanse our existing data, and we’ll continue running this process going forward.
  • Queries producing potentially large results to your application may need to be batched as well. If this times out, you’re still wasting a lot of time and reads, even if no data is changed.

Out the Window

I examined one process recently that was similar to this query, causing a GC operation to time out.

SELECT TOP 100
	inv.InvoiceID,
	ili.InvoiceLineID,
	ROW_NUMBER() OVER(Partition By inv.CustomerID ORDER BY inv.InvoiceDate) AS SortID
FROM Sales.Invoices inv
JOIN Sales.InvoiceLines ili
	ON ili.InvoiceID = inv.InvoiceID
WHERE
	inv.InvoiceDate < DATEADD(day, -60, GETUTCDATE()) 
	AND ili.LastEditedWhen < DATEADD(day, -60, GETUTCDATE()) 
ORDER BY SortID 

It was a SELECT statement, and it inserted the important fields into a temp table, and ran DELETEs against multiple tables based on the contents of the temp table. But it was the initial SELECT that had a poor plan and caused the timeout.

I quickly saw a SORT in the execution plan and wondered why. The actual query didn’t have an ORDER BY clause. But it did have a ROW_NUMBER OVER in the select list; took me a minute to notice that.

But was the sorting necessary? “We need to delete really old records in this table, but it’s vitally important that we delete them in the order they were originally inserted!”

It seemed a poor reason to sort a table with tens of millions of rows. Coupled with the very small batch size, we were doing an extraordinary amount of work to get rid of a few rows. So what if we commented the ROW_NUMBER and the ORDER BY out?

Even though the new plan has a scan operator, we only read 110 rows from it because we are using the TOP operator properly. Note the row counts from the first plan again. We have 479 thousand rows going through multiple operators in the first, but only 100 per operation in the second.

Avoid one Blocking Operator, find another

Here’s an example from recent work. I was looking at a query in a GC that was populating a temp table to use for later DELETEs. I was anticipating the optimizer to try a hash match join, so I used an INNER LOOP JOIN hint to avoid that blocking operator. The results were quite unpleasant, as you can see in this anonymized plan.

So, I avoided the hash match join, but the SQL Server optimizer didn’t see it the way I did. The first table is a temp table, Object6, which we are joining to a normal table, Object11. But this plan includes a table spool that not only forces us to read all 587741 rows in the seek against our table, it seems to create a cross join in memory between the results of the clustered index scan on the temp table and the clustered index seek against the base table (538 * 587741 = 316204658).

That hint obviously wouldn’t work. I reversed the order of the tables, then removed the hint altogether, giving the following:

 

2000 rows returned across the board. There is a SORT operation, but it’s after the TOP so there’s no harm.. Our results are being sorted before inserting them into a second temp table. And a much more performant query, taking ~8 ms instead of 75000 ms.

Sort out blocking operators

Hopefully this has been informative. I honestly wasn’t aware of blocking operators until a month or two ago. That’s the frustrating and interesting thing of SQL Server sometimes; you can work on this as long as I have and, even putting new development aside, there’s always more to learn.

Hope this helps you optimize some of your own operations.

It seems so simple

The TOP operator seems pretty straightforward. “Hey SQL Server, give me the first 100 rows that match this criteria, then stop.” But when certain operations get involved it can go sideways.

Let’s start with a simple example in the WorldWideImporters database.

SELECT TOP 100 
	sol.OrderLineID, 
	sol.UnitPrice 
FROM Sales.OrderLines sol 
WHERE sol.OrderLineID < 1000 
	AND sol.UnitPrice > 50

The query here is simple, and we can see the TOP operator only returns 100 rows, but so does the index seek underneath. The way this works is that the TOP asks the operation connected to it for 1 row, and keeps asking until it has 100 or the operation below can’t find anymore rows that match.

 

Because of the WHERE clause I chose, we actually had to read 943 rows in the table to get 100 that matched. At that point the TOP stopped asking for more rows.

If the TOP operator kept asking for more rows and only displayed the first 100, that would have been a waste of effort. Still, we’re only querying 1 index here.

Let’s add a wrinkle

SELECT TOP 100 
	sol.OrderID, 
	sol.UnitPrice,
	sol.Description
FROM Sales.OrderLines sol 
WHERE 
	sol.OrderID < 1000 
	AND sol.Quantity > 5
ORDER BY sol.OrderID

Now we have a key lookup, and we can follow the flow of what happened. We read and returned 270 rows from the index seek. There’s a nested loops operator connecting us to our key lookup, and that returned only 100 rows. So, we had to read 270 rows from the index seek to find 100 rows that met all our filters. 

Ultimately the TOP operator was asking for 1 row repeatedly until it met its quota, and then it stopped asking for more. As expected. 

What’s a blocking operator?

So, with another change we’ll see one of those operators I mentioned earlier.

SELECT TOP 100 
	sol.OrderID, 
	sol.UnitPrice,
	sol.Quantity,
	sol.PickedQuantity,
	sol.LastEditedWhen
FROM Sales.OrderLines sol 
WHERE 
	sol.OrderID < 5000 
	AND sol.Quantity > 5
ORDER BY sol.LastEditedWhen

Here, we’re seeking based on a range of OrderIDs, but getting results sorted in a different order. We’re reading many more records from the index and the key lookup (where the estimate is wayyy off). We don’t need to read this much to get our results, because we returned 2160 rows from the key lookup and the nested loops join, but then we reduce when we hit the sort.

It makes sense that you can’t return the top 100 rows that meet this criteria, until you’ve seen them all. Well, unless the data is in the order you want, and you can just seek it that way from the index. Our previous query had an ORDER BY clause but no sort operation because our sort matched our range seek.

Sort is a blocking operator. Don’t feel bad if you haven’t heard of the term; I’ve been working with SQL Server for 15 years, and I’m sure I never heard the term until the incomparable Grant Fritchey mentioned it while he was lecturing at my place of employment.

So sorts and several other types of operators (eager spools, remote query\scan\etc, hash match joins, and more) will block the normal flow and gather all their results before passing any rows on. The hash match join only blocks while building its hash table from the first input, before probing the second.

Let’s hash it out

SELECT TOP 100 
	so.OrderID, 
	so.CustomerID,
	sol.Quantity,
	sol.PickedQuantity,
	sol.LastEditedWhen
FROM Sales.OrderLines sol 
JOIN Sales.Orders so
	ON so.OrderID = sol.OrderID
WHERE 
	sol.OrderID < 5000 
	AND sol.Quantity > 5
--ORDER BY sol.LastEditedWhen

We filtered on OrderID < 5000, and we read 4999 rows from the build table. So we read everything that fit that criteria; we didn’t stop early because we had our 100 rows. So, definitely blocking behavior.

Then we probe the second table and return 900 rows, and they pass through the hash match. The results get reduced to 100 by the TOP operator. 

Why 900 rows from the OrderLines table, not 100? That’s less than clear. As I vary the result set or the TOP size, I get a number of behaviors using different indexes and returning batches of various sizes. It appears the probe may be trying to do a batch of rows at a time, or it may be related to the memory allocated. (If I can get a clearer answer to this aspect, I’ll update the post).

Update: I consulted with my colleague, the superb Kevin Feasel, who suggested this was operating in batch mode. I was testing this using SQL Server 2019, and batch mode on rowstore is a new feature that everyone should be aware of.

I thought I had ruled this out however, and indeed the probe operation was in row mode:

However, the hash match operation was not!

So the hash match requested a batch of 900 rows, which is why we saw the unexpected number of rows.

If you try this in SQL Server 2019, you may see different behaviors as you vary the result set (when I removed LastEditedWhen from the result set, it changed to row mode and only returned 100 rows) or the TOP size (TOP 1000 dropped it back to row mode). I also saw some variation with the index used against OrderLines, including the columnstore index.

Summing up

These have been relatively simple examples of the TOP operator in action, and how it interacts with other operators. In my next post, I’ll provide some more complicated plans, and discuss how we can keep our TOPs in top shape.