Thursday 29 June 2017

Explain Plan in Azure Warehouse (sort of)

Simplistically, the purpose of an explain plan is to reveal how the database plans to deliver results for a given query. In Oracle we have "SQL> EXPLAIN PLAN FOR <QUERY>;" and in SQL Server we have "click->display estimated execution plan" which shows a graphical diagram showing all the details of the execution steps and path. Both are important to diagnose query performance related issues.

In Azure Data warehouse, we have very little. It takes a certain amount of effort to get something readable as the aforementioned SSMS "rclick->display" option does not work in the studio. There are a number of posts on the internet about how to get this information but the results seem somewhat lacklustre: one method is to get the execution plan in an XML format that is not compatible with SSMS - so one is left to look at an XML file. The other option is to go the Azure portal and see the query running via the web console. Both options falls short of what most db developers are accustomed to.

I have found a useful alternative although it is not exactly a traditional explain plan. What it is is the sequence of steps the database is taking to achieve the results.
These steps can be found in the object "sys.dm_pdw_request_steps" - how I use this is shown below.

I can only use the following steps while my query is running. Once the query is done, the steps vanish from the steps table. Therefore, this is more of a technical query progress report showingthe steps that ran, the steps which are running, and the steps which will run. The first key is identify my sessions REQUEST_ID in sys.dm_pdw_exec_requests. I use sys.dm_pdw_exec_requests and I find that the most laborious queries pop in the top results. Once I have those master request and their child requests, I use those to find the segments of steps in the "sys.dm_pdw_request_steps" table.

print ''
print '============================================='
print '============================TOP QUERIES IN DW'
print '============================================='
print ''
go

  select TOP 10
         status,
         request_id as request_id,
         submit_time as submit_time,
         start_time as start_time,
         (total_elapsed_time/1000) as total_elapsed_time,
         substring(command,0,50) command
    from sys.dm_pdw_exec_requests
order by 1 desc,total_elapsed_time DESC
GO

print ''
print '========================================='
print '============================RUNNING STEPS'
print '========================================='
print ''
go


  SELECT DATEDIFF(SECOND, s.start_time, s.end_time) Time_In_Seconds,
         s.request_id AS request,
         s.operation_type AS operation_type,
         s.distribution_type AS distribution_type,
         s.location_type AS location_type,
         s.status AS status,
         s.start_time AS start_time,
         s.end_time AS end_time,
         s.total_elapsed_time AS total_elapsed_time,
         s.row_count AS row_count,
         s.command AS command
    FROM sys.dm_pdw_request_steps s
   WHERE request_id IN (SELECT request_id
                          FROM sys.dm_pdw_exec_requests
                         WHERE status = 'Running'
                           AND total_elapsed_time > 5 ) 
ORDER BY s.request_id, s.step_index;

For simplicity, I reduced the result sets both column and row wise so I can focus on a session that is taking a while to run. Request QID89235213 is a massive query running in parallel across the warehouse. It is inserting data into a table. Its steps can be shown in the RUNNING STEPS block. I can see the time its taken for each step, the number of rows per step and the type of operation it is running. At the time of rendering, the results show that the process is running a ShuffleMoveOperation which is likely processing 66 million records. It still has a number of PENDING steps to process.

=============================================
============================TOP QUERIES IN DW
=============================================

status                           request_id           total_elapsed_time command
-------------------------------- -------------------- ------------------ --------------------------------------------------
Running                          QID89235213                        1881 INSERT INTO Warehouse.tb_FactTable_WHTest                                                      

=========================================
============================RUNNING STEPS
=========================================

request      operation_type       distribution_type location_type   status     total_elapsed_time row_coun command
------------ -------------------- ----------------- --------------- ---------- ------------------ -------- ------------------------------
QID89235213  RandomIDOperation    Unspecified       Control         Complete   0                  -1       TEMP_ID_1788
QID89235213  OnOperation          AllDistribution   Compute         Complete   78                 -1       CREATE TABLE [tempdb].[dbo].[
QID89235213  ShuffleMoveOperation Unspecified       DMS             Running    936636             -1       SELECT [T1_1].[EventName] AS
QID89235213  OnOperation          AllDistribution   Compute         Pending    NULL               -1       DROP TABLE [tempdb].[dbo].[TE
QID89235213  OnOperation          AllDistribution   Compute         Pending    NULL               -1       DROP TABLE [tempdb].[dbo].[TE


As the engine completes each step, the query is closer to completion. This helps me know whats happening under the hood of the Azure Warehouse Database.

No comments:

Post a Comment