Here is the query and its query plan
select
st.StockTakeDate
,st.StockOnHand
,st.StockOnHand-stPrev.StockOnHand StockChange
from tbStockTake st
inner join tbStockTake stPrev
on stPrev.ProductId=st.ProductId
and stPrev.StockTakeDate=DATEADD(dd,-1,st.StockTakeDate)

Notice how the join on the second tbStockTake table is sorted so it can be merged with the first tbStockTake table. I am scanning 750 Million rows, so this sort is not trivial. We know there is no need to sort the data, since the join will be in the same sequence as the first table.
Here is a very simple enhancement to the query
select
st.StockTakeDate
,st.StockOnHand
,st.StockOnHand-stPrev.StockOnHand StockChange
from tbStockTake st
inner join tbStockTake stPrev
on stPrev.ProductId=st.ProductId
and stPrev.StockTakeDate=st.StockTakeDate-1
Notice how this new query plan is able to Merge Join the two tables without a sort. This dramatically improves the query performance and tempdb resources. The table is simply scanned with two cursors and merged.
The SQL Server optimiser is very clever in understanding that a clustered scan by ProductId, StocktakeDate is in the same sequence (so can be merged) with st.ProductId=stPrev.ProductId and st.StocktakeDate=stPrev.StockTakeDate-1. It is not quite clever enough to determine this for st.StocktakeDate=DATEADD(dd,-1,stPrev.StocktakeDate), somewhat understandably since a function is involved.
Datetime in SQL Server (at least up until the current version 2008 R2) can have arithmetic applied to it. 1 is equivalent to 1 day, and fractions of 1 are equivalent to portions of the day. For example, 0.25 is equivalent to 6 hours.
Interestingly (or should I say unfortunately) we cannot use the same trick where a Date datatype is used instead of the Datetime datatype.
So, whenever you have something that you would like to go faster, simply apply some focus on it, since the chances are, there is a faster option.
No comments:
Post a Comment