Sunday, February 21, 2010

Using Profiler to tune a SQL Server data warehouse

The last blog was on tuning TP applications. How about tuning a DW application? Surely it is quite similar. Well yes, the skills for tuning data warehouses and TP systems overlap. In this blog I will provide some hints for tracing and tuning a SQL Server data warehouse database. This covers any SQL statements (and utilities) run against the warehouse database.

A key difference is that we want to trace SQL statements rather than individual procedure statements. The number of statements tends to be an order of magnitude smaller, so we can (and should) trace for a longer period: 24 hours if possible, but at least spanning your main load process. It is still important to include a row limit, as some performance issues manifest as iterative SQL calls, such as INSERT statements where you are expecting a BULK INSERT, or iterative UPDATE statements from the SSIS Slowly Changing Dimension (SCD) task.

There is a complication. Some statements might be dynamically prepared and executed, in which case the trace will group executions under the meaningless name of the prepared procedure and its parameters. To ensure that we capture the prepare statements (that's where the SQL query text is), we will clear the plan cache at the start of the Profiler trace. Then we will be able to find the prepare statement for each execution by searching earlier records on the same SPID.
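As a sketch of why this matters (the handle number, parameter and statement text below are invented for illustration, not from a real trace), a dynamically prepared statement shows up in the trace as an sp_prepare call carrying the SQL text, followed by sp_execute calls that carry only the handle:

```sql
-- Illustrative only: what RPC TextData typically looks like for a
-- prepared statement. Handle (1), parameter and query text are made up.

-- First call: the prepare carries the actual SQL text
exec sp_prepare @p1 output, N'@CustomerId int',
     N'select * from dbo.FactSales where CustomerId = @CustomerId'

-- Subsequent calls: only the handle and parameter values appear
exec sp_execute 1, 42
exec sp_execute 1, 43
```

Had the statement been prepared before the trace started, the sp_prepare row (and hence the SQL text) would be missing, which is exactly what clearing the plan cache prevents.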

To trace all SQL requests
  1. Start Profiler, as in the previous blog, clear all events, and include only RPC:Completed (under Stored Procedures) and SQL:BatchCompleted (under TSQL). These two events will capture all SQL requests to SQL Server.
  2. Select the same columns as the previous blog.
  3. Limit the number of records to 100,000. This should be plenty, since a data warehouse tends to have a small number of large requests.
  4. Execute DBCC FREEPROCCACHE to clear the procedure cache.
  5. Now leave the Profiler trace to run over a long period, perhaps 24 hours. This way you will cover an entire load and process schedule.
  6. Check that the number of trace records is less than your limit. If you hit your limit, find out which query is being called iteratively. Ask yourself: can this iterative statement be executed more efficiently as a batch? If so, change it; if not, filter it out of the trace and start the trace again.
  7. It is worthwhile altering the trace table. Right-click the table, select Design, change ntext to varchar(max) and nvarchar() to varchar(), and remove any unnecessary columns. On SQL 2008 you might need to change your designer options to allow a table change that involves a drop and re-create. See Can't change SQL table.
  8. Some of the SQL requests won't have the actual SQL statement, since it has been prepared (optimised) into a temporary procedure for fast subsequent executions. To help query the results, we will update the trace table to include a pointer to the prepare statement. To do this, add a column PrepareRowNumber INT to the trace table and execute the following update statements to populate it.
  9. update t
     set PrepareRowNumber = (select top 1 RowNumber
                               from t tt
                               where tt.StartTime <= t.StartTime
                                 and tt.spid = t.spid
                                 and tt.TextData like '%sp_prepare%'
                               order by tt.StartTime desc)
     where TextData like '%sp_execute %'

     update t
     set PrepareRowNumber = (select top 1 RowNumber
                               from t tt
                               where tt.StartTime <= t.StartTime
                                 and tt.spid = t.spid
                                 and tt.TextData like '%sp_cursorprep%'
                               order by tt.StartTime desc)
     where TextData like '%sp_cursor %'
  10. Now you can execute a query to get the top 50 SQL batches by total resource.
  11. select top 50 *,
             Duration/Executions  as Duration_avg,
             CPU/Executions       as CPU_avg,
             Reads/Executions     as Reads_avg,
             Writes/Executions    as Writes_avg,
             RowCounts/Executions as RowCounts_avg
      from (select t.DatabaseName,
                   t.ApplicationName,
                   tt.TextData as PrepStmt,
                   convert(char(100), t.TextData) as TextData100,
                   sum(t.Duration)/1000.0 as Duration,
                   sum(t.CPU)/1000.0 as CPU,
                   sum(t.Reads) as Reads,
                   sum(t.Writes) as Writes,
                   sum(t.RowCounts) as RowCounts,
                   count(*) as Executions
            from t with (nolock)
                 left outer join t tt on tt.RowNumber = t.PrepareRowNumber
            where t.ObjectName is not null
            group by t.DatabaseName, t.ApplicationName,
                     convert(char(100), t.TextData), tt.TextData
           ) t
      order by t.Duration desc
  12. Notice how there is a PrepStmt column as well as the TextData. If the TextData contains an sp_execute (or sp_cursor) statement, the PrepStmt column should contain the corresponding prepare statement. That is why we cleared the procedure cache at the start of the Profiler trace.
  13. Now the fun starts. You can review the SQL Statements and see if there is a way of tuning them.
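The queries above assume the trace has been saved into a table named t with a RowNumber column, which is what Profiler creates when you save a trace to a table. If you saved the trace to a file instead, a sketch like the following imports it (the file path is illustrative; point it at your own .trc file):

```sql
-- Import a Profiler trace file into a table named t, adding the
-- RowNumber column the self-join queries rely on.
-- The path is a placeholder for your own trace file.
select identity(int, 1, 1) as RowNumber, *
into t
from sys.fn_trace_gettable(N'C:\traces\dw_trace.trc', default)
```

The second argument, default, tells fn_trace_gettable to read all rollover files belonging to the trace.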

  1. The above update queries don't match on the prepare handle number, which they should do to ensure the right prepare statement is fetched. The top 50 list displays both the execute statement and the prepare statement, so you can eyeball that the correct statement has been selected. If you find that incorrect statements are being selected, you might want to amend the update statements. An SQL function to parse the text for the handle number would work a treat.
  2. Sometimes I will include a CASE clause around the TextData, so that I can manually group together statements that I know about. You should see the above top 50 SQL query as a template, or starting point, which you can tailor for your Profiler data.
  3. The above steps will also work for a TP application that calls in SQL statements (i.e., without stored procedures). However, a TP system is likely to have many more overlapping dynamic procedures, so it is more difficult (and more important) to locate the right prepare statement.
  4. You could filter the Profiler trace to only record statements that consume a minimum resource level. However, be careful that you don't ignore repeated statements that individually consume very little resource, but collectively take a long time. A couple of examples of this would be the UPDATE statement on the SSIS SCD task, or a load process that is firing INSERTs.
  5. This Profiler trace covers only the SQL Server relational workload. Your data warehouse is likely to have Analysis Services processing work and cache warming as well, which you should also be interested in. Treat that as a separate tuning exercise.
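As a sketch of the manual grouping mentioned in point 2 (the LIKE patterns and group labels below are invented placeholders, not from a real warehouse), a CASE expression lets you roll known statement families into a single line of the report:

```sql
-- Hypothetical example: group statements you recognise under friendly
-- labels; everything else falls through to the raw statement prefix.
select case
         when TextData like 'insert into FactSales%'     then 'Fact load inserts'
         when TextData like '%update dbo.DimCustomer%'   then 'SCD customer updates'
         else convert(char(100), TextData)
       end as StatementGroup,
       sum(Duration)/1000.0 as Duration,
       count(*) as Executions
from t with (nolock)
group by case
           when TextData like 'insert into FactSales%'   then 'Fact load inserts'
           when TextData like '%update dbo.DimCustomer%' then 'SCD customer updates'
           else convert(char(100), TextData)
         end
order by sum(Duration) desc
```

Note the CASE expression must be repeated in the GROUP BY, since the column alias is not visible there.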

For a real-time data warehouse demonstration see This site inserts about 300 facts/second into SQL Server and incrementally processes a few OLAP cubes and data mining models. The above trace output was from a 24-hour trace of the SQL Server database server.
