Now, I am a great fan of functions, but I know that they are not compiled like stored procedures, so I am weary of using them for complex SQL queries. This query was very complex. I've got the function's code below, but it is based on a much more complex view, which joins 5 tables. There is a picture of the function execution above. See how the execution time is only 125ms, while the parse and compile time is 6638ms! The obvious solution is to convert this function to a stored procedure, which will enable SQL to store the plan.
However, this would involve changes to .NET code, which would involve a lot of testing and change control. I found another solution, which enabled us to fix the Function without having any change in .NET code, permitting a quick fix to production. The fix was to alter the function so that it has multiple lines! This way SQL Server will treat the 'multistatement table-valued function' as a stored procedure and enable it to be cached. You can see from the execution results below that their is no parse and compile time, and the execution time is the same 125ms.
How does it do this? It appears that SQL Server will treat the multistatement table-valued function as a stored procedure. We can see this from the plan cache query below.
Note, SQL Server is able to cache Adhoc queries, so that your execution of a function with a particular parameter can be cached. Don't let this cloud your analysis, unless your OLTP environment is going to execute the function with exactly the same Adhoc text. You will see these 'Adhoc' cached plans above with objtype='Adhoc'
Note, there is documentation that explains this behaviour, although you might have missed it. See http://msdn.microsoft.com/en-us/library/cc293623.aspx and read the text near 'multistatement table-valued functions'.
What does this tell us? Firstly, it reminds us that compile time is not always trivial. Secondly, it reminds us that we should treat Functions as different to Procedures, and only use Functions where they are appropriate. I find them most useful where they are doing something like string manipulation and optimisation is not appropriate.
Old function (with 6.8 second compile time every execution)
create function MyFunction
(@Parameter1 int)
RETURNS TABLE AS
RETURN (
SELECT t2.*,t1.MyDate
from (select max(t4.TheDate) MyDate,
t3.BranchNo,
t3.TranCode,
t3.Cash1,
t3.Cash2,
t3.Cash3,
t3.Cash4
from vBranchCount as t3
left outer join (SELECT *
FROM vBranchCount
WHERE vBranchCount.Status = 'Applied') AS t4
ON t3.TranCode = t4.TranCode
AND t3.BranchNo = t4.BranchNo
AND t3.Cash1 = t3.Cash1
AND t3.Cash2 = t4.Cash2
AND t3.Cash3 = t4.Cash3
AND t3.Cash4 = t4.Cash4
AND t4.TheDate < t3.TheDate
where t3.StockCountPlaceId = @Parameter1
group by t3.BranchNo,t3.TranCode,t3.Cash1,t3.Cash2,t3.Cash3, t3.Cash4) as t1
inner join vBranchCount as t2
on t1.BranchNo = t2.BranchNo
and t1.TranCode = t2.TranCode
and t1.MyDate = t2.CashDate
and t1.Cash1 = t2.Cash1
and t1.Cash2 = t2.Cash2
and t1.Cash3 = t2.Cash3
and t1.Cash4 = t2.Cash4 )
New Function with 6.8 second compile time only on first execution.
create function MyNewFunction
(@Parameter1 int)
RETURNS @t TABLE
( TeamID bigint,
Rain bit,
WitchHat varchar(50),
JoyStormID bigint,
Happyness int,
TeamName varchar(250),
CashIn money,
CashOut float,
TheDate datetime,
JoyClerkID bigint,
JoyClerkName varchar(150),
TranCode bigint,
Bills float,
CashDate datetime,
BranchNo bigint,
Temperature varchar(50),
StockCountPlaceId bigint,
Applicability int,
Readyness int,
SiteID bigint,
BillsToChange int,
Cash1 varchar(50),
Cash2 varchar(50),
Cash3 varchar(50),
Cash4 varchar(50),
JoyUnit1 int,
JoyUnit2 int,
JoyUnit3 int,
JoyUnit4 int,
TranSystemId bigint,
Handle1 varchar(35),
Handle2 varchar(35),
Consumer varchar(50),
JoyConsumerId bigint,
Rainbow varchar(50),
StormCode varchar(50),
ChangeDate datetime)
AS
BEGIN
INSERT INTO @t
SELECT t2.*,t1.MyDate
from (select max(t4.TheDate) MyDate,
t3.BranchNo,
t3.TranCode,
t3.Cash1,
t3.Cash2,
t3.Cash3,
t3.Cash4
from vBranchCount as t3
left outer join (SELECT *
FROM vBranchCount
WHERE vBranchCount.Status = 'Applied') AS t4
ON t3.TranCode = t4.TranCode
AND t3.BranchNo = t4.BranchNo
AND t3.Cash1 = t3.Cash1
AND t3.Cash2 = t4.Cash2
AND t3.Cash3 = t4.Cash3
AND t3.Cash4 = t4.Cash4
AND t4.TheDate < t3.TheDate
where t3.StockCountPlaceId = @Parameter1
group by t3.BranchNo,t3.TranCode,t3.Cash1,t3.Cash2,t3.Cash3, t3.Cash4) as t1
inner join vBranchCount as t2
on t1.BranchNo = t2.BranchNo
and t1.TranCode = t2.TranCode
and t1.MyDate = t2.CashDate
and t1.Cash1 = t2.Cash1
and t1.Cash2 = t2.Cash2
and t1.Cash3 = t2.Cash3
and t1.Cash4 = t2.Cash4
RETURN
END
Query plan cache
SELECT usecounts, cacheobjtype, objtype, text
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
WHERE (text like '%MyFunction%' or text like '%MyNewFunction%')
and objtype <> 'Adhoc'
and cacheobjtype='Compiled Plan'
ORDER BY usecounts DESC;
Aren't inline table valued functions execution plans cached as part of their calling SQL's execution plan? So if your TVF was being called within a stored procedure, then as long as your stored procedure has a cached plan then the TVF's plan is within it.
ReplyDeleteGood point Chris,
ReplyDeleteActually, it doesn't appear quite that simple. If you have a single statement Table Value Function (tvf) called from inside a stored procedure, then you are correct. If the function is multiple statemented, as I have written the new tvf above, it is cached as it's own procedure.
The performance issue I uncovered above was SQL that was called directly from .NET, so it wasn't cached (single statement tvf). If it had been inside a proc, it would have been compiled and cached within the proc as you suggest. Now that it's a multi statement tvf, it will get compiled on its own. So if it is used within a stored procedure, it will not be optimized overall (there will be two plans). Arguably it will be slower as it can only support a nested loop.
Seems like you need to know how the tvf is being used before you can be sure on how to code it.
Thanks for pointing that out.