I encountered something recently I’d never encountered, so I had to share.

I was making another change to a procedure I’ve been tuning recently. The idea was to alter the UPDATE statements to skip rows unless they are making real changes. The activity here is being driven by customer activity, and that sometimes leads to them setting the same value repeatedly. Difficult to know how often we update a row to the same value, but we think it could be significant.

So, we added a clause to the UPDATE so we’ll only update if ‘old_value <> new_value’. The actual update operator is the most expensive part of the statement, but

Simple enough so far. The scan is against a memory optimized table variable, and the filter to the left our our seeks and scans check for a change to our value. Nothing left but to update the index and…

Curve Ball

Wait, what’s all this? We have a Split operator after our Clustered Index Update. SQL Server does sometime turn an UPDATE statement into effectively a DELETE and INSERT if the row needs to move, but this seems a bit much. We have a total of 4 index update/delete operators now, and they aren’t cheap.

My very simple addition to the WHERE clause actually caused a small increase in duration, and a big jump in CPU. So what’s going on?

So, we’re updating Column1 to the result of a CASE statement, and the last part of our WHERE clause compares Column1 to the same CASE. And the CPU for this statement just doubled?

I happened to jog this by the superlative Kevin Feasel, who suggested this was the Halloween Problem.

The Halloween Problem

The Halloween Problem is a well documented issue, and it affects other database systems, not just SQL Server. The issue was originally seen by IBM engineers using an UPDATE to set a value that was also in their WHERE clause.

So, database systems include protections for the Halloween Problem where necessary in DML statements, and SQL Server decided it needed to protect this query. And our query matches the pattern for this issue; we’re filtering on a field while we are updating it. All DML statements can run afoul of this issue, and there are examples for all in this really excellent series of posts by Paul White.

An Unlikely Ally

The protection SQL Server employs ultimately comes down to interrupting the normal flow of rows from operators up the plan. We actually need a blocking operator!

A blocking operator would cause all the rows coming from the query against our primary table to pool in that operator. We’ll have a list of all relevant rows in one place, and they can be passed on to operators above without continuing the index seek against our table; possibly seeing the same row a second time. Eager spools\table spools are frequently used for this purpose, and SQL Server used an eager spool to provide Halloween protection in my case.

A sort would also do, and we could design this query to sort the results of querying the table. If our query already employed a blocking operation to interrupt the flow, SQL Server would not need to introduce more operations to protect against the Halloween Problem. In my case, I definitely don’t want it spooling and creating three more expensive index operations.

My idea for rewriting this goes a step farther.

A Two Table Solution

If we queried the data from our base table into a temp table, we could then update the base table without querying that same table on the same column, tripping over a pumpkin in the process.

My procedure is already using memory optimized table variables, because this proc runs so frequently that temp tables cause contention in tempdb (described here). So in this instance, I’ll actually query this data into another motv. I can also use the data I stash in my motv to decide if any rows I intended to update don’t exist yet. I’ll INSERT those later, only if needed.

Bonus

Now, the whole point of the original change was to reduce work when we update a field to be equal to its current contents.

In my motv, I’m going to have the old and new value for the field. It would be simplicity itself to put my UPDATE in a conditional, so that we only run the UPDATE if at least one row in my motv is making an actual change to the field. So instead of just reducing the cost of our update operators, we’ll skip the entire UPDATE statement frequently, for only the cost of one SELECT and a write to our motv.

Results

Stunning.
The new logic causes this proc to skip the UPDATE statements entirely over 96% of the time. Even given that we are running a query to populate the memory optimized table variable (which is taking <100 microseconds) and running an IF EXISTS query against that motv (which takes 10-20 microseconds), we’re spending 98% less time doing the new logic than the original UPDATE statement.

When I started reviewing the procedure several months ago, it took 3.1 milliseconds on average. I’ve tried several other changes in isolation, some effective, some not. The procedure is now down to 320 microseconds; an almost 90% reduction overall after a 71% drop from this change.

I have some other ideas for tweaks to this proc, but honestly, there’s very little left to gain with this process. Time to find a new target.

If you liked this post, please follow me on twitter or contact me if you have questions.

In my last post, I spoke about optimizing a procedure that was being executed hundreds of millions of times per day, and yes, that is expected behavior.

The difficult thing about trying to optimize this procedure is that it only takes 2.5ms on average to run. Tuning this isn’t a matter of changing a scan to a seek; we’ll have to look hard to find the opportunities here. A one millisecond Improvement on a procedure running 100 million times a day would save 100,000 seconds every day.

Well, I’ve found a few more options since my last post, and wanted to share my findings.

Setup

The procedure has some complex logic but only runs a few queries.

  • There are a few simple SELECT statements to populate some variables. These take a small percentage of the overall runtime.
  • There are two UPDATE statements, and we will run one or the other. Both join a table to a table variable; one has a second CTE doing some aggregation the other lacks. The majority of our time is spent running these UPDATEs.
  • An INSERT statement that takes place every time. This is to ensure that if we didn’t update a record because it didn’t exist, we make sure we insert the row. It’s very likely on a given run we will INSERT 0 rows.

Brainstorming

Since we 80% of our time in the UPDATE (I love Query Store), that’s the place with the most potential for gain.

But, on first look (first couple) it seems difficult to see room for improvement here. We’re doing index seeks with small row counts. The index scans are against memory optimized table variables, and you may notice they are cheaper than than the index seeks.

But, looking at plan one thing did draw my attention:

There’s a table spool, and following that I see the plan is updating an indexed view. Which we would do every time there’s an UPDATE. Hundreds of millions of times a day…huh. So, removing the index on that view would eliminate this entire middle branch from the plan.

The view is based on two columns that are the first two columns of the clustered PK of the underlying table. The view does some aggregation, but the difference between querying the view or the table is reading 1 row versus maybe 2 or 3 rows, most of the time. Dropping that index seems like a good thing to try.

And I did mention this in my last blog post, but we perform the INSERT statement every execution of the procedure, and we run the trigger on this table even if we inserted 0 rows. So, if we can detect whether the INSERT is needed, we can potentially skip the majority of the executions of the statement and the trigger.

The logic for the procedure uses a TVP and a couple of table variables, which isn’t optimal. SQL Server doesn’t have statistics on table variables, so it’s not able to make good estimates of how many rows are going to be returned (unless you are using table variable deferred compilation in SQL Server 2019). We could change these to temp tables, and see if we have better results. Hopefully, we’ll have a more stable plan across the many databases running it.

Both of the UPDATE statements have a bookmark lookup. We’ll be looking up only a few rows, but this could be a significant improvement for a query that takes so little time. Also, one of the UPDATE statements references the main table an additional time in its CTE. So we have two index seeks plus the key lookup. How much of our time is spent in the second access and the bookmark lookup?

Results

first change

I’ve been working on releasing these changes individually, and the first one is complete. Removing the index on the view, resulted in a 17% reduction in the duration of the procedure (from 2.34ms to 1.94ms), and a 20% reduction in CPU. Come to think of it, not having to update that index would have helped with our INSERT statement as well.

I’ll update this post once I have details on the other changes.

If you liked this post, please follow me on twitter and contact me if you have questions.

Updates

Second change

So, the second idea I’ve tried on this issue, is replacing the table variables with temp tables. The idea was that temp tables have statistics and table variables don’t, so we should tend to have better execution plans using temp tables. However, it didn’t work out that was in this case because it also caused pagelatch contention in tempdb.

Tempdb pagelatch contention is a very well documented issue. There are things you can do to mitigate this issue, but at a certain point you just need to create fewer temp tables. I’m not sure where the line is in the case of my environment, but it was clear that creating 3 temp tables in a proc running this often crosses that line.

This change has been reverted, but I’ll update this post again shortly. I should be making the change to skip unnecessary INSERT statements this week.

Skipping Inserts

This change has been made, and the results are fairly minor. The first measure I took showed the average duration of the proc dropping from 1.95 to 1.91 ms, but this varies enough from day to day that I could cherry pick numbers and get a different, even negative improvement.

This confirms something interesting. The cost of the procedure overall was skewed very heavily to the UPDATE statements, with the INSERT being significantly less. The most expensive operator in a DML operation is typically the Insert\Update\Delete operator itself; the step where we actually change data. My expectation all along has been that there were very few rows actually being inserted by our INSERT statement. It seems this is true and since the statement doesn’t actually insert data, the Insert operator doesn’t do work or take any significant amount of time. So we gained very little by this change, at least from the procedure.

This change also prevented us from calling the trigger with empty inserts, so we saw the trigger drop from executing ~350 million times per day to 6 million. This saves us about 14,000 seconds per day. I don’t think about context switching in the context of SQL Server very often, but in addition to the time benefit, there’s a small benefit in just not having to set up and switch to the context for the trigger itself.

Overall a small victory, but I’ll take it.

Pending

I should have my final update on this topic in the next week. This will change the logic of one of the UPDATE statements to remove an unnecessary CTE which references the main table, and adds an index hint to use the primary key, removing a bookmark\key lookup from both statements. The first update hits the table twice (once for the nonclustered, another for the key lookup) and the second hits it thrice (because of the CTE), but this change will drop both to 1 access of the PK, and should only query a few rows.

I’m very interested to see how this affects performance, as this should affect the largest part of work done outside of the actual Update operator in those statements.

Finale

So my change to the UPDATE statement is out, and the results are pretty good. First, The runtime of the procedure has dropped to 1.399 ms. If you recall, it originally took about 2.5 ms, so we’ve dropped this overall by more than 1 millisecond, which would be 40% of its original runtime.

Second, I love the simplicity of the new plan. The previous anonymized plan is at the top of this post; it took 2 screenshots to cover most of it. In particular, I noticed we were hitting the central table 3 times. Once in a CTE that was doing aggregation that was unnecessary, I removed that entirely. The second reference was in the main query, and it caused a key lookup which was the third access. I hinted our query to use the clustered primary key, which leads with the same two columns that are in the nonclustered index the optimizer seems to prefer.

The index scans are both in memory optimized table variables, and you can see the estimates are lower than accessing the main table in the clustered index seek. I also love seeing that the actual update operator on our left is such a large amount of the effort; that’s what I expect to see on a DML operation that’s tuned well.

In summary

I had 4 ideas originally to tune this query. I wanted to see if we could drop it by 1 millisecond, and we gained 1.1.

  1. Removing the index from an indexed view referencing this table had a significant effect. This required a significant amount of research, as I had to find all the places where we referenced the view and determine if any would experience significantly worse performance without the aggregation from that view. I didn’t see any red flags, and this changed dropped the procedure’s duration from 2.5ms to 1.94ms (22% reduction).
  2. The procedure uses a few in memory table variables with temp tables. The idea is that temp tables have statistics. That could lead to SQL Server making better plans, because it can estimate how many rows are in a given operation. This would work in other cases, but not for a proc that runs this often. Performance was actually slowed because of significant PAGELATCH waits. Every execution we were creating multiple temp tables, and at this pace our threads were constantly waiting on key pages in tempdb. This change was reverted.
  3. Reducing the INSERTs was a gain, but a minimal one. The INSERT statement itself took very little of our time in the procedure. We also got to skip running the INSERT trigger, but that also did not take long to run. It’s possible we ended up with less waiting in the main table, or the table our trigger was updating, but if so the gains were too small to quantify.
  4. Simplifying the logic of the two UPDATE statements that were taking most of the time in the procedure was a success. We went from 3 operations on the only permanent table in our query to 1, and removed an aggregation step we didn’t need. This dropped our runtime from 1.94ms to 1.399ms (27.8%). Every operation counts.

Hopefully you’ve learned something from this post and its updates. If so, please follow me on twitter or contact me if you have questions.


Reducing Trigger Executions

I’ve never been a fan of triggers. I don’t like the idea of them adding an additional tax on every operation. It’s easy to forget they are even there, consuming your cycles. I’ve even seen a few nasty death-by-a-thousand-cuts scenarios with them. But I found something out this week that makes me like them even less.

I was tuning a procedure that runs 284 million times a day.

Over a number of servers and databases, but yes, that number is correct. It takes 2.5ms to run on average, with 1.0ms of CPU time. I’ll spare you the math, but that means over 3 cores of SQL Server are doing nothing but running this procedure 24/7/365. Anything we can do to improve this will be significant, even if we just shave off half a millisecond.

Small Improvements

The procedure is attempting to update or insert a few records based on a TVP input. Sometimes we run an UPDATE statement, but update the column to its current value based on one input parameter. Based on another parameter, we may filter out all rows and update nothing. But in both cases, even though we haven’t changed any data we still took the time to run the UPDATE statement. I’m unsure how often those two parameters match one of those two checks, but if we check them first we can entirely skip the UPDATE.

Like I said, every little bit helps.

After the UPDATE, there is an INSERT to make sure if we didn’t UPDATE the row because it doesn’t exist, we INSERT it. There are conditionals around the UPDATE, but not the INSERT. So, we run it every time we run the proc. 284 million times a day. Even if there’s nothing to insert.

Trigger Happy

The revelation is this:

“These triggers fire when any valid event fires, whether table rows are affected or not.”

Ouch. And I really wasn’t aware of this. So, we pay for the INSERT execution every time, even if we don’t need to insert anything. And the INSERT trigger on that table fires as well, trying to update another table even though the INSERTED and DELETED tables are empty. Charming.

But if our TVP had x items to update\insert and our UPDATE statement handled y of them, we only need to try the INSERT if y<x.

We won’t even need to query the underlying table or the TVP; just check the row counts.

We’re likely to update these records many times, and we’ll obviously only insert them once. Checking the execution counts for the INSERT and UPDATE triggers and subtracting the number of executions from this proc, it seems we UPDATE the table several hundred times for each INSERT. So, we should be able to reduce trigger executions by a huge amount.

Well then. I’ll look forward to seeing those executions drop shortly.

If you liked this post, please follow me on twitter or contact me if you have questions.

Postscript

This change reduced trigger executions and INSERT statement executions from ~350 million per day, to ~5 million.

Graphs like this make me happy. 🙂

Slow Garbage Collection

I encountered a curious issue recently, and immediately knew I needed to blog about it. Having already blogged about implicit conversions and how the TOP operator interacts with blocking operators, I found a problem that looked like the combination of the two.

I reviewed a garbage collection process that’s been in place for some time. The procedure populates a temp table with the key values for the table that is central to the GC. We use the temp table to delete from the related tables, then delete from the primary table. However, the query populating our temp table was taking far too long, 84 seconds when I tested it. We’re scanning and returning 1.4 million rows from the first table, doing a key lookup on all of them. We scan another table to look up the retention period for the related account, as this database has information from multiple accounts, and return 8.4 million rows there. We join two massive record sets, then have a filter operator that gets us down to a more reasonable size.

In general, when running a complex query you want your most effective filter, that results in the fewest rows, to happen first. We want to get the result set small early, because every operation after becomes less expensive. Here’s part of our anonymized query:

So, Object5 is our main table. Column10 is a status column, and we compare that to a scalar value to make sure this record is ready for GC. That’s fine. The second is checking if the record is old enough to GC, and this is where things get interesting. The code here is odd; we’re subtracting a COALESCE of three values from a DATETIME variable, relying on behavior that subtracting an INT from a DATETIME subtracts that number of days from the DATETIME. There’s an implicit conversion, but it is on the variable side of the inequality. And the plan above appears to be our result.

So, Bad Date Math Code?

Seems like a good root cause, and I don’t like this lazy coding pattern, but let’s test it out with tables we can all look at. Would this be any better if we had written this to use the DATEADD function to do the math properly?

So, if we had an index on the OrderDate column, would it perform differently?

Apparently not. Same plan, same cost. But when you think about it, this tests the date math by subtracting an integer from the DATETIME provided by GETUTCDATE(). The original was subtracting a COALESCE of three values. One of those values was a float. Could the COALESCE or the resulting data type have made this more complicated?

Testing COALESCE

Running this, again we see a nice seek that reads and returns 100 rows. So the different data type and the COALESCE makes no difference.

Looking at the original query again, the first value isn’t a variable, it’s a column from a different table. We can’t filter by this column until we’ve read the other table, which affects our join order. But we have no criteria to seek the second table with.

Joined Filtering

One more test. Let’s see what the behavior looks like if we join to the Customers table to look for the RetentionPeriod. First, I’ll create and populate some data in that column:

I only populated a few of the records to better match the production issue; only some accounts have the value I’m looking for, hence the COALESCE.

Now we’re trying to filter records in SalesOrderHeader, based on the Customer’s RetentionPeriod. How does this perform?

 

Well, the row counts aren’t terrible, but we are scanning both tables. The optimizer opted to start with Customer table, which is much smaller. We’re not filtering on the date until the filter operator, after the merge join.

I’d be worried that with a larger batch size or tables that don’t line up for a merge join, we’d just end up doing a hash match. That would force us to scan the first table, and without any filter criteria that would be a lot of reads.

Solution

The solution I applied to my production query was to create a temp table that had the account ID and the related retention period. I joined this to my primary table, and the query that was taking 84 seconds was replaced with one that took around 20 milliseconds.

 

Here’s how I’d apply that thought to our example. I’ve created a temp table with CustomerID and its associated RetentionDate, so we could use that to search Sales.SalesOrderHeader.

You may have noticed my filter on the Sales.Customer table. In the live issue, the temp table had dozens of rows for dozens of accounts, not the tens of thousands I’d get from using all rows Sales.Customer in my example. So, I filtered it to get a similar size in my temp table.

Infinitely better. Literally in this case, since the ElapsedTime and ActualElapsedms indicators in the plan XML are all 0 milliseconds. I’m declaring victory.

If you liked this post, please follow me on twitter or contact me if you have questions.

Understanding Cost in SQL Server

Cost is an important concept in SQL Server. It is key in how plans are compared and chosen by the optimizer, and it can guide us to problem operators as we tune a query’s performance. It can also lead us astray if we follow it blindly. In this post, I want to explain what cost is and how we use it.

Many queries (that aren’t trivial) can be executed in a number of different ways. Each index on a table is a possible path for the optimizer to use, and statistics allow the optimizer to determine the cost of a given operation. SQL Server determines what potential plan it will use in large part based on cost. The optimizer isn’t exhaustive. It won’t compare all possible plans; that would take too long.

Let’s take a look at an example query from the AdventureWorks database.

There is a cost provided for each query relative to the batch; that can help you narrow down which statement is the issue if a large batch or procedure is performing poorly. This query is in the only one in the batch. 

This query hits 5 tables, and has a number of joins and other operations. Each operation has a cost displayed here as a percentage of the total query. You’ll notice one operation has a cost of 94%; let’s zoom in. 

This has a cost of 0.54456. This is broken further into an I/O Cost and a CPU Cost. There is also a Subtree Cost, which would include any operators that feed into this one, but in this case there aren’t any. The operator returned 234 rows, but it read over 31k. So the cost seems appropriate; we really are doing some work here. 

The number for cost is always presented without a unit, you may ask 0.54456 of what exactly? Pounds? Parsecs?

Calibration

The story as I recall was that cost was derived from how long it took an early developer of SQL Server to run a given operation on his desktop computer, in seconds. So initially cost was expressed in seconds, but that’s not the case anymore. It’s a more generic expression of how much work is involved in executing the operation.

Given the cost value is fairly generic, you need an idea of what is a cheap operator or an expensive one. Here’s one of the other index seeks from the query above.

So, we’re doing an index seek not a scan. We not only returned 1 row, we only read one row. In terms of reading a normal table, it’s not going to get any cheaper than this. This cost of about 0.003 is something you’ll see many times for that reason.

So, what constitutes an expensive query? That’s a matter of opinion. You could gauge this by the “Cost Threshold for Parallelism” setting in SQL Server. This is a server level setting that sets how expensive an operation has to be for SQL Server to consider going parallel to perform it. The default setting is 5, so you could argue that’s an expensive query. But this setting is probably quite low for current servers. I think at work, this setting is 100 for most of the servers I work on.

Keep in mind, the cost threshold is for the operation, not the entire plan. 

Cost isn’t exact

One thing to remember is that cost in SQL Server is always an estimate. This is a number SQL Server calculates when considering multiple potential plans to determine which would be the best. But the number of rows it expects a given operation to return or how many times that operation runs can be off. All of that is based on statistics.

It doesn’t then go back and update the cost number later if those numbers were incorrect. So while we can use the cost as an indicator of which query or operator we should focus on, don’t completely tunnel-vision that one thing.

I could talk at this point about estimated plans versus actual plans, but fortunately Grant Fritchey has already done so. The gist of his post is that an actual plan is one that has actual runtime metrics. For example, “Number of Rows Read”, “Actual Number of Rows”, and “Number of Executions” in the images above.

It’s helpful to have plans with these actual numbers; they can help you confirm if the costs look accurate. “Number of Rows Read” is the main data point I look at most of the time.

You may also find the following situation in plans as you look at them:

The optimizer estimated 887 or 888 rows for these operators, but the number of rows returned is much higher. So the cost of 72% for the one operator isn’t really accurate. That operator read and returned many more rows, as did the other related operators. If you saw an estimated plan without these runtime numbers, you may come away with a very different impression of how this query is running.

Conclusion

I’ve always felt cost is not well explained, so hopefully this post will help answer some questions. Understanding cost can be really helpful in troubleshooting poorly performing queries, but don’t focus solely on it when analyzing a problem.

If you liked this post, please follow me on twitter or contact me if you have questions.

The transitive Property

You may recall the transitive property from elementary school math class. It states:

The SQL Server optimizer can and will use this property, and it can lead to issues in your queries. When I’m writing a query, I have a clear idea of how I want it to operate. But using the transitive property, SQL Server has additional options one might not expect, and this may occasionally cause things to go awry. Consider this:

I’d expect the optimizer to seek the Orders table, then join to OrderLines. But since @OrderID is also equal to sol.OrderID, it could start the query there. There would be little difference; the OrderID column is indexed in both tables and neither table is that large. There’s only 3 OrderLines for this OrderID.

But what if:

  • The CustomerID was present in all tables. And was part of the primary key in each.
  • The data set was larger, with 100’s of millions of rows in our OrderLines table.
  • Statistics were last updated by an auto-update. A table of this size would have a very small sampling rate by default, <1%.
  • Our database has a large number of customers, and we’re running a query to get recent orders for a large customer that haven’t been picked yet.

We may end up with a query like this:

If our PK has the CustomerID first, we could use that to search either table. If we don’t have an index with the CustomerID and our dates on Orders, those fields will be in our predicate, not our seek predicate. With poor statistics, the optimizer may seek OrderLines first, returning all lines for that customer, before later joining an filtering it down. In this situation, that could be a mountain of reads.

Countering The transitive Property

So what if you experience or anticipate this issue with a query? I see a few approaches to prevent the bad plan from sneaking up on your server.

  • Join Hints or Force Order: The FORCE ORDER hint would direct the optimizer to hit the Orders table first. A join hint like INNER LOOP JOIN would set the join type, and also force the order. Either would results in seeking the Orders table before joining to OrderLines, preventing our worst case.
  • Index hints: If you hint the optimizer to use an index based on CustomerID and OrderDate, that would point it in the direction of using the Orders table first, avoiding the problem.
  • Better stats: An argument can definitely be made to not use auto-updated stats on a sufficiently large table. I’ve seen sampling rates below 0.1%, and mistakes can be made then. If you updated stats with a large sampling rate, the optimizer will have better information to work with, and that may avoid the worst case.

I thought this property of the optimizer was interesting when I first saw it in play, and I haven’t seen it referred to much. Hopefully this post will give you a little more insight into what the optimizer can do, and how we sometimes need to respond.

Reviewing the concept

A quick search will tell you that implicit conversions are pretty awful for performance, and in particular drive CPU usage. That’s not news. There is an aspect of this I think a lot of engineers don’t understand; why does it cause performance issues?

An implicit conversion occurs when you try to compare between or assign data across two different data types, without converting one yourself. If you used a CAST or CONVERT, it would be an explicit conversion. When SQL Server does it for you, it’s an implicit conversion, and these can have a real impact on your execution plans. Not all combinations of types can be converted implicitly, for a full list look here.

So let’s look at an example based on WideWorldImporters:

The Description column is an nvarchar type, but I’m comparing it to a string that is non-unicode. And for the sake of this example, I did create an index on this column.

And we see an implicit conversion. In this case, the optimizer converted our string to a nvarchar(4000) type to match the Description column. But in this case, the implicit conversion isn’t going to cause a problem. Converting one value isn’t that expensive; you can confirm that from the ~0.025 cost for this operation. We even used the index on the column.

The real problem comes when SQL Server converts the column to match the variable.

Reproducing the issue

I’ve seen implicit conversions many times, and have one example in my head I recall very clearly. So I was fairly astonished when I tried to reproduce this using WideWorldImporters. SQL Server tries very hard to convert the parameter or literal, rather than the column. It’s quite difficult to get it behave badly, in part because of the data types chosen for the tables. 

Microsoft’s documentation on data type precedence indicates that SQL Server will convert from one type to another based on the priority in the chart; but I have an example that seems to counter this.

The Quantity column is a int, which has a lower priority than our decimal type variable. So according to the chart, the column should be converted, meaning every value on every row; more on that later.

But looking at the plan, we don’t convert the Quantity column. Instead there is a scalar operator on the variable that appears to be obscuring the underlying conversion:

So, the behavior here doesn’t match expectations. In my testing, all the int\decimal\money types worked similarly, and the optimizer would convert the variable. One takeaway I had from writing this is that implicit conversion occur much more often than I thought, but SQL Server is smart enough to convert the variable unless it has no choice. You likely wouldn’t see the many plans that convert the column, because they aren’t causing a performance problem. Instead, you’ll see and remember the one that caused a massive issue.

another angle

An easy example would be to compare a varchar column to an nvarchar variable. This forces SQL Server’s hand, only in one direction. SQL can easily convert any varchar to an nvarchar, but the opposite is more problematic. If your nvarchar string contains kanji or something else that can’t be represented with a varchar string, the operation would fail. So, SQL Server would rather convert the varchar, even if that means converting a column.

Note, I had to change databases here, because there are no varchar columns in WideWorldImporters (thus preventing this convert).

Even though I’ve created an index on the AccountNumber column, this results in a scan.

Usually I’ve seen the CONVERT_IMPLICIT on the scan operator itself, but here we see it as part of the filter. The query only returns 1 row, but we read all 19820 rows on the table to fulfill this. Image the effect if this table had 10 million rows.

Why the Table Scan?

Reasonable question. SQL Server doesn’t really have a choice. The index is built on the AccountNumber, not the output of this function call:

So, SQL Server can’t search the index for this information; it’s not in the index. And SQL Server can’t really predict what the output of the function will be before it has converted the data, so it has to look at every AccountNumber value, convert it, then compare. The most expensive part isn’t the convert itself, but the table scan it forces.

So, pay attention to your data types when writing procedures and queries to avoid these issues. The effect of an implicit conversion on a column is much more pronounced the larger the table is, and it can easily drive your CPU to 100%.

Hidden Implicit Conversions

One of my attempts to create an implicit conversion worked, but didn’t show it clearly in the plan. I tried the following example using the sql_variant type:

The plan is similar to the last one, but we don’t see the CONVERT_IMPLICIT when we mouseover any of the three operators.

However, if you hit F4 and examine the Compute Scalar operator, you can see the definition of expression Expr1001 in the Properties window.

This clearly shows the column being converted, But it’s odd to need to do this to see the convert. 

This would be a good time to refer to a bit of brilliance Jonathan Kehayias posted quite some time ago. This script will allow you to see any implicit conversions on columns in the plan cache, along with the statement, plan, and other relevant details. This can be useful in a number of cases, but especially when the operation is hidden as in this example.

Also, Any Other Function CALL

This post is primarily about understanding how implicit conversions work and why the occur, but it bears mentioning that basically any function call wrapping a column you want to search on will work the same way. 

For example, using SUBSTRING on AccountNumber gives largely the same result:

If you need to use a function call in your WHERE, hopefully you have another clause that can do the majority of your filtering.

Happy New Year

Hope you had a great holiday season! I’ll keep up the posting going forward, so check back in, and reach out if you have any questions I can help answer.

How Relevant is the TOP operator?

I’ve explained what a blocking operator is and provided a few examples, but maybe this doesn’t seem important. It’s affecting the TOP operator, sure, but don’t people just use this to look at the TOP 1000 rows of their tables in SSMS?

The TOP operator is useful for many operations, especially in a large environment. Large operation can timeout or fail for a variety of reasons, consuming resources without providing the results you need. A small, batch-sized operation is more likely to succeed and tends to perform more consistently. Many maintenance operations make sense to run with a TOP operator, so we should make sure those operations aren’t stymied by blocking operators. Some examples:

  • Garbage collection on a table with many millions rows. You want this to perform quickly, but you really can’t afford for this to time out (whatever that timeout period may be). We can limit how long this runs by GC’ing a small batch at a time, but this can be hampered badly by an extra sort or a hash join.
  • Archiving data applies for the same reasoning as GC. If you archive data to another table\database\server, you’ll want to keep your operation small enough to manage.
  • Backfilling a new column. If the existing table is large, you can’t just UPDATE the whole table; you’ll lock the table and block all your users. A batched UPDATE in a loop or in a scheduled process can resolve this without causing an outage.
  • GDPR is here, and CCPA is coming. You may need to anonymize data across many related tables. We need this to perform well to cleanse our existing data, and we’ll continue running this process going forward.
  • Queries producing potentially large results to your application may need to be batched as well. If this times out, you’re still wasting a lot of time and reads, even if no data is changed.

Out the Window

I examined one process recently that was similar to this query, causing a GC operation to time out.

It was a SELECT statement, and it inserted the important fields into a temp table, and ran DELETEs against multiple tables based on the contents of the temp table. But it was the initial SELECT that had a poor plan and caused the timeout.

I quickly saw a SORT in the execution plan and wondered why. The actual query didn’t have an ORDER BY clause. But it did have a ROW_NUMBER OVER in the select list; took me a minute to notice that.

But was the sorting necessary? “We need to delete really old records in this table, but it’s vitally important that we delete them in the order they were originally inserted!”

It seemed a poor reason to sort a table with tens of millions of rows. Coupled with the very small batch size, we were doing an extraordinary amount of work to get rid of a few rows. So what if we commented the ROW_NUMBER and the ORDER BY out?

Even though the new plan has a scan operator, we only read 110 rows from it because we are using the TOP operator properly. Note the row counts from the first plan again. We have 479 thousand rows going through multiple operators in the first, but only 100 per operation in the second.

Avoid one Blocking Operator, find another

Here’s an example from recent work. I was looking at a query in a GC that was populating a temp table to use for later DELETEs. I was anticipating the optimizer to try a hash match join, so I used an INNER LOOP JOIN hint to avoid that blocking operator. The results were quite unpleasant, as you can see in this anonymized plan.

So, I avoided the hash match join, but the SQL Server optimizer didn’t see it the way I did. The first table is a temp table, Object6, which we are joining to a normal table, Object11. But this plan includes a table spool that not only forces us to read all 587741 rows in the seek against our table, it seems to create a cross join in memory between the results of the clustered index scan on the temp table and the clustered index seek against the base table (538 * 587741 = 316204658).

That hint obviously wouldn’t work. I reversed the order of the tables, then removed the hint altogether, giving the following:

 

2000 rows returned across the board. There is a SORT operation, but it’s after the TOP so there’s no harm.. Our results are being sorted before inserting them into a second temp table. And a much more performant query, taking ~8 ms instead of 75000 ms.

Sort out blocking operators

Hopefully this has been informative. I honestly wasn’t aware of blocking operators until a month or two ago. That’s the frustrating and interesting thing of SQL Server sometimes; you can work on this as long as I have and, even putting new development aside, there’s always more to learn.

Hope this helps you optimize some of your own operations.

It seems so simple

The TOP operator seems pretty straightforward. “Hey SQL Server, give me the first 100 rows that match this criteria, then stop.” But when certain operations get involved it can go sideways.

Let’s start with a simple example in the WorldWideImporters database.

The query here is simple, and we can see the TOP operator only returns 100 rows, but so does the index seek underneath. The way this works is that the TOP asks the operation connected to it for 1 row, and keeps asking until it has 100 or the operation below can’t find anymore rows that match.

 

Because of the WHERE clause I chose, we actually had to read 943 rows in the table to get 100 that matched. At that point the TOP stopped asking for more rows.

If the TOP operator kept asking for more rows and only displayed the first 100, that would have been a waste of effort. Still, we’re only querying 1 index here.

Let’s add a wrinkle

Now we have a key lookup, and we can follow the flow of what happened. We read and returned 270 rows from the index seek. There’s a nested loops operator connecting us to our key lookup, and that returned only 100 rows. So, we had to read 270 rows from the index seek to find 100 rows that met all our filters. 

Ultimately the TOP operator was asking for 1 row repeatedly until it met its quota, and then it stopped asking for more. As expected. 

What’s a blocking operator?

So, with another change we’ll see one of those operators I mentioned earlier.

Here, we’re seeking based on a range of OrderIDs, but getting results sorted in a different order. We’re reading many more records from the index and the key lookup (where the estimate is wayyy off). We don’t need to read this much to get our results, because we returned 2160 rows from the key lookup and the nested loops join, but then we reduce when we hit the sort.

It makes sense that you can’t return the top 100 rows that meet this criteria, until you’ve seen them all. Well, unless the data is in the order you want, and you can just seek it that way from the index. Our previous query had an ORDER BY clause but no sort operation because our sort matched our range seek.

Sort is a blocking operator. Don’t feel bad if you haven’t heard of the term; I’ve been working with SQL Server for 15 years, and I’m sure I never heard the term until the incomparable Grant Fritchey mentioned it while he was lecturing at my place of employment.

So sorts and several other types of operators (eager spools, remote query\scan\etc, hash match joins, and more) will block the normal flow and gather all their results before passing any rows on. The hash match join only blocks while building its hash table from the first input, before probing the second.

Let’s hash it out

We filtered on OrderID < 5000, and we read 4999 rows from the build table. So we read everything that fit that criteria; we didn’t stop early because we had our 100 rows. So, definitely blocking behavior.

Then we probe the second table and return 900 rows, and they pass through the hash match. The results get reduced to 100 by the TOP operator. 

Why 900 rows from the OrderLines table, not 100? That’s less than clear. As I vary the result set or the TOP size, I get a number of behaviors using different indexes and returning batches of various sizes. It appears the probe may be trying to do a batch of rows at a time, or it may be related to the memory allocated. (If I can get a clearer answer to this aspect, I’ll update the post).

Update: I consulted with my colleague, the superb Kevin Feasel, who suggested this was operating in batch mode. I was testing this using SQL Server 2019, and batch mode on rowstore is a new feature that everyone should be aware of.

I thought I had ruled this out however, and indeed the probe operation was in row mode:

However, the hash match operation was not!

So the hash match requested a batch of 900 rows, which is why we saw the unexpected number of rows.

If you try this in SQL Server 2019, you may see different behaviors as you vary the result set (when I removed LastEditedWhen from the result set, it changed to row mode and only returned 100 rows) or the TOP size (TOP 1000 dropped it back to row mode). I also saw some variation with the index used against OrderLines, including the columnstore index.

Summing up

These have been relatively simple examples of the TOP operator in action, and how it interacts with other operators. In my next post, I’ll provide some more complicated plans, and discuss how we can keep our TOPs in top shape.