Richard Lees on BI: SQL Functions and Plan Caching

I feel that I have become typecast as a Business Intelligence specialist. However I often employ my SQL skills and I still enjoy the challenge of solving pure SQL performance issues. Here's an issue, which presented itself as a table valued Function in an OLTP environment that was taking a lot of compilation energy (6.7 seconds) every execution, while the execution was only 0.15 seconds. It was executing about 1/second at peak.

Now, I am a great fan of functions, but I know that they are not compiled like stored procedures, so I am weary of using them for complex SQL queries. This query was very complex. I've got the function's code below, but it is based on a much more complex view, which joins 5 tables. There is a picture of the function execution above. See how the execution time is only 125ms, while the parse and compile time is 6638ms! The obvious solution is to convert this function to a stored procedure, which will enable SQL to store the plan.

However, this would involve changes to .NET code, which would involve a lot of testing and change control. I found another solution, which enabled us to fix the Function without having any change in .NET code, permitting a quick fix to production. The fix was to alter the function so that it has multiple lines! This way SQL Server will treat the 'multistatement table-valued function' as a stored procedure and enable it to be cached. You can see from the execution results below that their is no parse and compile time, and the execution time is the same 125ms.

How does it do this? It appears that SQL Server will treat the multistatement table-valued function as a stored procedure. We can see this from the plan cache query below.

Note, SQL Server is able to cache Adhoc queries, so that your execution of a function with a particular parameter can be cached. Don't let this cloud your analysis, unless your OLTP environment is going to execute the function with exactly the same Adhoc text. You will see these 'Adhoc' cached plans above with objtype='Adhoc'

Note, there is documentation that explains this behaviour, although you might have missed it. See http://msdn.microsoft.com/en-us/library/cc293623.aspx and read the text near 'multistatement table-valued functions'.

What does this tell us? Firstly, it reminds us that compile time is not always trivial. Secondly, it reminds us that we should treat Functions as different to Procedures, and only use Functions where they are appropriate. I find them most useful where they are doing something like string manipulation and optimisation is not appropriate.

Old function (with 6.8 second compile time every execution)
create function MyFunction
(@Parameter1 int)
RETURNS TABLE AS
RETURN (
SELECT t2.*,t1.MyDate
from (select max(t4.TheDate) MyDate,
t3.BranchNo,
t3.TranCode,
t3.Cash1,
t3.Cash2,
t3.Cash3,
t3.Cash4
from vBranchCount as t3
left outer join (SELECT *
FROM vBranchCount
WHERE vBranchCount.Status = 'Applied') AS t4
ON t3.TranCode = t4.TranCode
AND t3.BranchNo = t4.BranchNo
AND t3.Cash1 = t3.Cash1
AND t3.Cash2 = t4.Cash2
AND t3.Cash3 = t4.Cash3
AND t3.Cash4 = t4.Cash4
AND t4.TheDate < t3.TheDate
where t3.StockCountPlaceId = @Parameter1
group by t3.BranchNo,t3.TranCode,t3.Cash1,t3.Cash2,t3.Cash3, t3.Cash4) as t1
inner join vBranchCount as t2
on t1.BranchNo = t2.BranchNo
and t1.TranCode = t2.TranCode
and t1.MyDate = t2.CashDate
and t1.Cash1 = t2.Cash1
and t1.Cash2 = t2.Cash2
and t1.Cash3 = t2.Cash3
and t1.Cash4 = t2.Cash4 )

New Function with 6.8 second compile time only on first execution.

create function MyNewFunction

(@Parameter1 int)

RETURNS @t TABLE

( TeamID bigint,

Rain bit,

WitchHat varchar(50),

JoyStormID bigint,

Happyness int,

TeamName varchar(250),

CashIn money,

CashOut float,

TheDate datetime,

JoyClerkID bigint,

JoyClerkName varchar(150),

TranCode bigint,

Bills float,

CashDate datetime,

BranchNo bigint,

Temperature varchar(50),

StockCountPlaceId bigint,

Applicability int,

Readyness int,

SiteID bigint,

BillsToChange int,

Cash1 varchar(50),

Cash2 varchar(50),

Cash3 varchar(50),

Cash4 varchar(50),

JoyUnit1 int,

JoyUnit2 int,

JoyUnit3 int,

JoyUnit4 int,

TranSystemId bigint,

Handle1 varchar(35),

Handle2 varchar(35),

Consumer varchar(50),

JoyConsumerId bigint,

Rainbow varchar(50),

StormCode varchar(50),

ChangeDate datetime)

BEGIN

INSERT INTO @t

SELECT t2.*,t1.MyDate

from (select max(t4.TheDate) MyDate,

t3.BranchNo,

t3.TranCode,

t3.Cash1,

t3.Cash2,

t3.Cash3,

t3.Cash4

from vBranchCount as t3

left outer join (SELECT *

FROM vBranchCount

WHERE vBranchCount.Status = 'Applied') AS t4

ON t3.TranCode = t4.TranCode

AND t3.BranchNo = t4.BranchNo

AND t3.Cash1 = t3.Cash1

AND t3.Cash2 = t4.Cash2

AND t3.Cash3 = t4.Cash3

AND t3.Cash4 = t4.Cash4

AND t4.TheDate < t3.TheDate

where t3.StockCountPlaceId = @Parameter1

group by t3.BranchNo,t3.TranCode,t3.Cash1,t3.Cash2,t3.Cash3, t3.Cash4) as t1

inner join vBranchCount as t2

on t1.BranchNo = t2.BranchNo

and t1.TranCode = t2.TranCode

and t1.MyDate = t2.CashDate

and t1.Cash1 = t2.Cash1

and t1.Cash2 = t2.Cash2

and t1.Cash3 = t2.Cash3

and t1.Cash4 = t2.Cash4

RETURN

END

Query plan cache

SELECT usecounts, cacheobjtype, objtype, text

FROM sys.dm_exec_cached_plans

CROSS APPLY sys.dm_exec_sql_text(plan_handle)

WHERE (text like '%MyFunction%' or text like '%MyNewFunction%')

and objtype <> 'Adhoc'

and cacheobjtype='Compiled Plan'

ORDER BY usecounts DESC;

2 comments:

UnknownSeptember 15, 2014 at 11:18 PM
Aren't inline table valued functions execution plans cached as part of their calling SQL's execution plan? So if your TVF was being called within a stored procedure, then as long as your stored procedure has a cached plan then the TVF's plan is within it.
Richard LeesSeptember 16, 2014 at 7:35 PM
Good point Chris,

Actually, it doesn't appear quite that simple. If you have a single statement Table Value Function (tvf) called from inside a stored procedure, then you are correct. If the function is multiple statemented, as I have written the new tvf above, it is cached as it's own procedure.

The performance issue I uncovered above was SQL that was called directly from .NET, so it wasn't cached (single statement tvf). If it had been inside a proc, it would have been compiled and cached within the proc as you suggest. Now that it's a multi statement tvf, it will get compiled on its own. So if it is used within a stored procedure, it will not be optimized overall (there will be two plans). Arguably it will be slower as it can only support a nested loop.

Seems like you need to know how the tvf is being used before you can be sure on how to code it.

Thanks for pointing that out.

Wednesday, September 3, 2014

SQL Functions and Plan Caching

2 comments: