27.01.2014

Some useful PDW Queries

Technical Value

During my work with the PDW there were plenty of times where I needed some information and out of that the following query set was created. It is a useful compilation of different queries, which should help you with your appliance. 

The first part includes some queries, which can be very handy if table or partitioning information is needed. The second part shows some query, loading and performance queries.

Tabledefinition

The first query part will give you a good overview of queries to get the table as well as the partition information out of your PDW.

Tabletype

The Most common question is probably about the Table Distribution. Is it either REPLICATE or HASH? With this simple query you can extract the information.

  1. SELECT
  2.  
  3. a.name AS [table]
  4.  
  5. , b.distribution_policy_desc AS [type] FROM sys.tables a
  6.  
  7. JOIN sys.pdw_table_distribution_properties b ON a.object_id = b.object_id 

Distributioncolumn

An easy way to figure out the distribution columns oft he tables is the following statement which gives you all the information.

  1. SELECT a.name AS [table]
  2.  
  3. , c.name AS [column]
  4.  
  5. , b.name AS [datatype] FROM sys.pdw_column_distribution_properties d INNER JOIN sys.columns c ON c.object_id = d.object_id INNER JOIN sys.tables a ON a.object_id = d.object_id LEFT OUTER JOIN sys.types b on c.user_type_id = b.user_type_id WHERE d.distribution_ordinal = 1 AND c.column_id = d.column_id

Partition column

The partition column can be shown with this query. It shows a simple overview oft he table as well as the column.

  1. SELECT
  2.  
  3. t.name AS [table]
  4.  
  5. , c.name AS [column]
  6.  
  7. FROM sys.tables AS t JOIN sys.indexes AS i ON t.object_id = i.object_id JOIN sys.columns AS c ON t.object_id = c.object_id JOIN sys.partition_schemes AS ps ON ps.data_space_id = i.data_space_id JOIN sys.index_columns AS ic ON ic.object_id = i.object_id AND ic.index_id = i.index_id AND ic.partition_ordinal > 0 WHERE i.type <= 1 AND c.column_id = ic.column_id

Partitiondetails

Now it gets a little bit more into detail on how the data is put on each partition, as well and the distribution on each partition and the boundary values Partitiondefinition

  1. LEFT CREATE TABLE myTable ( id int NOT NULL ,lastName varchar(20) ) WITH ( PARTITION ( id RANGE LEFT FOR VALUES (10, 20, 30, 40 ))) -       Partition 1:      [id]             <=    10 -       Partition 2:      10              <       [id]    <= 20 -       Partition 3:      20              <       [id]    <= 30 -       Partition 4:      30              <       [id]    <= 40 -       Partition 5:      40              <       [id]   Partitiondefinition RIGHT CREATE TABLE myTable ( id int NOT NULL ,lastName varchar(20) ) WITH ( PARTITION ( id RANGE RIGHT FOR VALUES (10, 20, 30, 40 ))) -       Partition 1:      [id]             <       10 -       Partition 2:      10              <=    [id]    < 20 -       Partition 3:      20              <=    [id]    < 30 -       Partition 4:      30              <=    [id]    < 40 -       Partition 5:      40              <=    [id]   Partitiondistribution and boundarys
  1. SELECT
  2.  
  3. t.name AS [table]
  4.  
  5. , p.partition_number
  6.  
  7. , f.type_desc as [function]
  8.  
  9. , CASE f.boundary_value_on_right
  10.  
  11. WHEN 1 THEN 'RIGHT'
  12.  
  13. ELSE 'LEFT'
  14.  
  15. END as [range]
  16.  
  17. , r.boundary_id
  18.  
  19. , r.value AS [boundary_value]
  20.  
  21. FROM sys.tables AS t JOIN sys.indexes AS i ON t.object_id = i.object_id JOIN sys.partitions AS p ON i.object_id = p.object_id AND i.index_id = p.index_id JOIN  sys.partition_schemes AS s ON i.data_space_id = s.data_space_id JOIN sys.partition_functions AS f ON s.function_id = f.function_id LEFT JOIN sys.partition_range_values AS r ON f.function_id = r.function_id and r.boundary_id = p.partition_number WHERE  i.type <= 1 ORDER BY p.partition_number

Monitoringqueries

The second part will include some useful PDW Monitoring Queries.  

Current Queries

The most common question is probably about the queries on your system. A good overview will give you the following query.

  1. select 
  2.  
  3. Q.request_id     as 'request_id'
  4.  
  5. ,Q.session_id     as 'session_id' ,S.login_name     as 'user_name' ,S.status         as 'user_status' ,ISNULL(S.app_name,'') as 'user_app' ,Q.submit_time         as 'request_start' ,Q.status              as 'request_status' ,ISNULL(Q.end_time,'2999-12-31 23:59:59')      as 'request_end' ,RIGHT(CONVERT(nvarchar(23)
  6.  
  7. ,DATEADD(ms ,Q.total_elapsed_time,'1990-01-01 00:00:00'),121),12) as 'elapsed' ,ISNULL(Q.&SQUARE_BRACKETS_OPEN;label],'') as 'label' ,Q.command             as 'query' ,ISNULL(Q.error_id,'') as 'error_id' ,SUBSTRING(ISNULL(E.details,''),2 ,50) as 'error_desc' ,S.client_id           as 'user_address' from sys.dm_pdw_exec_requests       AS Q inner join sys.dm_pdw_exec_sessions AS S on S.session_id = Q.session_id left join sys.dm_pdw_errors         AS E on Q.error_id=E.error_id and Q.session_id=E.session_id WHERE LEFT(S.client_id, 9) <> '127.0.0.1' AND ISNULL(Q.&SQUARE_BRACKETS_OPEN;label],'') <> 'QUERY-MONITOR' ORDER BY CASE UPPER(Q.status) WHEN 'RUNNING' THEN 0 WHEN 'SUSPENDED' THEN 1 ELSE 2 END ASC ,ISNULL(Q.end_time, SYSDATETIME()) DESC ,ISNULL(Q.submit_time, SYSDATETIME()) DESC OPTION( LABEL='QUERY-MONITOR' )

Current Loads via DWLoader

Another common question is the question about the ongoing loads. So far we checked the queries executing, but what about the loads committed to the appliance?

  1. select
  2.  
  3. run_id
  4.  
  5. ,submit_time
  6.  
  7. ,start_time
  8.  
  9. ,end_time
  10.  
  11. ,total_elapsed_time
  12.  
  13. ,table_name
  14.  
  15. ,status
  16.  
  17. ,progress
  18.  
  19. ,rows_processed
  20.  
  21. ,rows_rejected ,rows_inserted
  22.  
  23. from sys.pdw_loader_backup_runs where operation_type = 'LOAD' order by 1 desc

Performancecounters

An overview about the current performance can indicate you the following query. It shows you an overview about included nodes.

  1. SELECT  
  2.  
  3. PC.counter_category    as 'type'
  4.  
  5. ,PC.counter_name       as 'counter' ,PC.instance_name      as 'instance' ,n.name                as 'node' ,PC.counter_value      as 'value' FROM sys.dm_pdw_os_performance_counters as PC JOIN sys.dm_pdw_nodes as N on N.pdw_node_id = PC.pdw_node_id WHERE DATEDIFF(second, PC.last_update_time, SYSDATETIME()) < 10 order by 1 ,2 ,3 ,4

 

Teilen auf

Newsletter Anmeldung

Abonnieren Sie unseren Newsletter!
Lassen Sie sich regelmäßig über alle Neuigkeiten rundum ORAYLIS und die BI- & Big-Data-Branche informieren.

Jetzt anmelden