Wednesday, September 3, 2014

SQL Functions and Plan Caching

I feel that I have become typecast as a Business Intelligence specialist.  However I often employ my SQL skills and I still enjoy the challenge of solving pure SQL performance issues.  Here's an issue, which presented itself as a table valued Function in an OLTP environment that was taking a lot of compilation energy (6.7 seconds) every execution, while the execution was only 0.15 seconds.  It was executing about 1/second at peak.

Now, I am a great fan of functions, but I know that they are not compiled like stored procedures, so I am weary of using them for complex SQL queries.  This query was very complex.  I've got the function's code below, but it is based on a much more complex view, which joins 5 tables.  There is a picture of the function execution above.  See how the execution time is only 125ms, while the parse and compile time is 6638ms!  The obvious solution is to convert this function to a stored procedure, which will enable SQL to store the plan.

However, this would involve changes to .NET code, which would involve a lot of testing and change control.  I found another solution, which enabled us to fix the Function without having any change in .NET code, permitting a quick fix to production.  The fix was to alter the function so that it has multiple lines!  This way SQL Server will treat the 'multistatement table-valued function' as a stored procedure and enable it to be cached.  You can see from the execution results below that their is no parse and compile time, and the execution time is the same 125ms.
How does it do this?  It appears that SQL Server will treat the multistatement table-valued function as a stored procedure.  We can see this from the plan cache query below.


















Note, SQL Server is able to cache Adhoc queries, so that your execution of a function with a particular parameter can be cached.  Don't let this cloud your analysis, unless your OLTP environment is going to  execute the function with exactly the same Adhoc text.  You will see these 'Adhoc' cached plans above with objtype='Adhoc'

Note, there is documentation that explains this behaviour, although you might have missed it.  See http://msdn.microsoft.com/en-us/library/cc293623.aspx and read the text near 'multistatement table-valued functions'.

What does this tell us?  Firstly, it reminds us that compile time is not always trivial.  Secondly, it reminds us that we should treat Functions as different to Procedures, and only use Functions where they are appropriate.  I find them most useful where they are doing something like string manipulation and optimisation is not appropriate.

Old function (with 6.8 second compile time every execution)
create function MyFunction 
(@Parameter1 int) 
RETURNS TABLE AS
RETURN (
SELECT t2.*,t1.MyDate
from (select max(t4.TheDate) MyDate, 
  t3.BranchNo,
  t3.TranCode, 
  t3.Cash1, 
  t3.Cash2,  
  t3.Cash3, 
  t3.Cash4
      from vBranchCount as t3 
      left outer join  (SELECT *
                        FROM vBranchCount
                        WHERE vBranchCount.Status = 'Applied') AS t4 
ON t3.TranCode = t4.TranCode 
                AND t3.BranchNo = t4.BranchNo 
                AND t3.Cash1 = t3.Cash1 
                AND t3.Cash2 = t4.Cash2 
                AND t3.Cash3 = t4.Cash3 
                AND t3.Cash4 = t4.Cash4 
                AND t4.TheDate < t3.TheDate
      where t3.StockCountPlaceId = @Parameter1
      group by t3.BranchNo,t3.TranCode,t3.Cash1,t3.Cash2,t3.Cash3, t3.Cash4) as t1
inner join vBranchCount as t2 
  on t1.BranchNo = t2.BranchNo
 and t1.TranCode = t2.TranCode
 and t1.MyDate = t2.CashDate
 and t1.Cash1 = t2.Cash1
 and t1.Cash2 = t2.Cash2
 and t1.Cash3 = t2.Cash3
 and t1.Cash4 = t2.Cash4  )

New Function with 6.8 second compile time only on first execution.
create function MyNewFunction 
(@Parameter1 int) 
RETURNS @t TABLE
(  TeamID bigint,
           Rain bit,
           WitchHat varchar(50),
           JoyStormID bigint,
           Happyness int,
           TeamName varchar(250),
           CashIn money,
           CashOut float,
           TheDate datetime,
           JoyClerkID bigint,
           JoyClerkName varchar(150),
           TranCode bigint,
           Bills float,
           CashDate datetime,
           BranchNo bigint,
           Temperature varchar(50),
           StockCountPlaceId bigint,
           Applicability int,
           Readyness int,
           SiteID bigint,
           BillsToChange int,
           Cash1 varchar(50),
           Cash2 varchar(50),
           Cash3 varchar(50),
           Cash4 varchar(50),
           JoyUnit1 int,
           JoyUnit2 int,
           JoyUnit3 int,
           JoyUnit4 int,
           TranSystemId bigint,
           Handle1 varchar(35),
           Handle2 varchar(35),
           Consumer varchar(50),
           JoyConsumerId bigint,
           Rainbow varchar(50),
           StormCode varchar(50),
           ChangeDate datetime)
 AS
BEGIN  
INSERT INTO @t
SELECT t2.*,t1.MyDate
from (select max(t4.TheDate) MyDate, 
  t3.BranchNo,
  t3.TranCode, 
  t3.Cash1,
  t3.Cash2,
  t3.Cash3, 
  t3.Cash4
 from vBranchCount as t3 
 left outer join  (SELECT *
FROM vBranchCount
WHERE vBranchCount.Status = 'Applied') AS t4 
ON t3.TranCode = t4.TranCode 
AND t3.BranchNo = t4.BranchNo 
AND t3.Cash1 = t3.Cash1 
AND t3.Cash2 = t4.Cash2 
AND t3.Cash3 = t4.Cash3 
AND t3.Cash4 = t4.Cash4 
AND t4.TheDate < t3.TheDate
 where t3.StockCountPlaceId = @Parameter1
 group by t3.BranchNo,t3.TranCode,t3.Cash1,t3.Cash2,t3.Cash3, t3.Cash4) as t1
inner join vBranchCount as t2 
 on t1.BranchNo = t2.BranchNo
 and t1.TranCode = t2.TranCode
 and t1.MyDate = t2.CashDate
 and t1.Cash1 = t2.Cash1
 and t1.Cash2 = t2.Cash2
 and t1.Cash3 = t2.Cash3
 and t1.Cash4 = t2.Cash4   

RETURN
END

Query plan cache

SELECT usecounts, cacheobjtype, objtype, text 
FROM sys.dm_exec_cached_plans 
CROSS APPLY sys.dm_exec_sql_text(plan_handle) 
WHERE (text like '%MyFunction%' or text like '%MyNewFunction%')
and objtype <> 'Adhoc'
and cacheobjtype='Compiled Plan'
ORDER BY usecounts DESC;

2 comments:

Chris Harshman said...

Aren't inline table valued functions execution plans cached as part of their calling SQL's execution plan? So if your TVF was being called within a stored procedure, then as long as your stored procedure has a cached plan then the TVF's plan is within it.

Richard Lees said...

Good point Chris,

Actually, it doesn't appear quite that simple. If you have a single statement Table Value Function (tvf) called from inside a stored procedure, then you are correct. If the function is multiple statemented, as I have written the new tvf above, it is cached as it's own procedure.

The performance issue I uncovered above was SQL that was called directly from .NET, so it wasn't cached (single statement tvf). If it had been inside a proc, it would have been compiled and cached within the proc as you suggest. Now that it's a multi statement tvf, it will get compiled on its own. So if it is used within a stored procedure, it will not be optimized overall (there will be two plans). Arguably it will be slower as it can only support a nested loop.

Seems like you need to know how the tvf is being used before you can be sure on how to code it.

Thanks for pointing that out.