Simplistically, the purpose of an explain plan is to reveal how the database plans to deliver results for a given query. In Oracle we have "SQL> EXPLAIN PLAN FOR <QUERY>;" and in SQL Server we have "click->display estimated execution plan" which shows a graphical diagram showing all the details of the execution steps and path. Both are important to diagnose query performance related issues.
In Azure Data warehouse, we have very little. It takes a certain amount of effort to get something readable as the aforementioned SSMS "rclick->display" option does not work in the studio. There are a number of posts on the internet about how to get this information but the results seem somewhat lacklustre: one method is to get the execution plan in an XML format that is not compatible with SSMS - so one is left to look at an XML file. The other option is to go the Azure portal and see the query running via the web console. Both options falls short of what most db developers are accustomed to.
I have found a useful alternative although it is not exactly a traditional explain plan. What it is is the sequence of steps the database is taking to achieve the results.
These steps can be found in the object "sys.dm_pdw_request_steps" - how I use this is shown below.
I can only use the following steps while my query is running. Once the query is done, the steps vanish from the steps table. Therefore, this is more of a technical query progress report showingthe steps that ran, the steps which are running, and the steps which will run. The first key is identify my sessions REQUEST_ID in sys.dm_pdw_exec_requests. I use sys.dm_pdw_exec_requests and I find that the most laborious queries pop in the top results. Once I have those master request and their child requests, I use those to find the segments of steps in the "sys.dm_pdw_request_steps" table.
print ''
print '============================================='
print '============================TOP QUERIES IN DW'
print '============================================='
print ''
go
select TOP 10
status,
request_id as request_id,
submit_time as submit_time,
start_time as start_time,
(total_elapsed_time/1000) as total_elapsed_time,
substring(command,0,50) command
from sys.dm_pdw_exec_requests
order by 1 desc,total_elapsed_time DESC
GO
print ''
print '========================================='
print '============================RUNNING STEPS'
print '========================================='
print ''
go
SELECT DATEDIFF(SECOND, s.start_time, s.end_time) Time_In_Seconds,
s.request_id AS request,
s.operation_type AS operation_type,
s.distribution_type AS distribution_type,
s.location_type AS location_type,
s.status AS status,
s.start_time AS start_time,
s.end_time AS end_time,
s.total_elapsed_time AS total_elapsed_time,
s.row_count AS row_count,
s.command AS command
FROM sys.dm_pdw_request_steps s
WHERE request_id IN (SELECT request_id
FROM sys.dm_pdw_exec_requests
WHERE status = 'Running'
AND total_elapsed_time > 5 )
ORDER BY s.request_id, s.step_index;
For simplicity, I reduced the result sets both column and row wise so I can focus on a session that is taking a while to run. Request QID89235213 is a massive query running in parallel across the warehouse. It is inserting data into a table. Its steps can be shown in the RUNNING STEPS block. I can see the time its taken for each step, the number of rows per step and the type of operation it is running. At the time of rendering, the results show that the process is running a ShuffleMoveOperation which is likely processing 66 million records. It still has a number of PENDING steps to process.
=============================================
============================TOP QUERIES IN DW
=============================================
status request_id total_elapsed_time command
-------------------------------- -------------------- ------------------ --------------------------------------------------
Running QID89235213 1881 INSERT INTO Warehouse.tb_FactTable_WHTest
=========================================
============================RUNNING STEPS
=========================================
request operation_type distribution_type location_type status total_elapsed_time row_coun command
------------ -------------------- ----------------- --------------- ---------- ------------------ -------- ------------------------------
QID89235213 RandomIDOperation Unspecified Control Complete 0 -1 TEMP_ID_1788
QID89235213 OnOperation AllDistribution Compute Complete 78 -1 CREATE TABLE [tempdb].[dbo].[
QID89235213 ShuffleMoveOperation Unspecified DMS Running 936636 -1 SELECT [T1_1].[EventName] AS
QID89235213 OnOperation AllDistribution Compute Pending NULL -1 DROP TABLE [tempdb].[dbo].[TE
QID89235213 OnOperation AllDistribution Compute Pending NULL -1 DROP TABLE [tempdb].[dbo].[TE
As the engine completes each step, the query is closer to completion. This helps me know whats happening under the hood of the Azure Warehouse Database.
No comments:
Post a Comment