What you need to know about PDW stats

Technical Value

One of the most important things for a SQL Server, and this also includes the APS are up-to-date statistics. I would probably take it one step further and say that 80-90% of query execution problems can be solved by creating statistics and maintaining them. It happens quite often when customers contact us about poor execution times that the stats haven’t been updated or the customer tells me that they have auto-stats enabled. What most of them don’t know is that auto-stats only work when 20% of the data has either been updated or inserted. The 20% are based on the row count when the stats have last been created. The same behavior is also correct for the PDW. Even though the query is processed by several nodes we still need recent stats. Even though stats are created in some cases automatically depending the query it is really important to create them directly after the table has been created. What might happen is, that PDW creates stats on the nodes but doesn’t show them on the control node since they are not explicitly created. So when you create the stats on the Control Node, all stats which have been created in the background are also updated. Therefore create the stats right after you created the table. Otherwise you loose precious time during updating since you might be doing the same column twice. To make it easier I created the following generic scrip which can be used to either update existing stats, or create new stats for newly added columns.

  1. with cte as ( select SCHEMA_NAME(t1.schema_id) as [Schema], t1.name as Tabelle, t2.name as Spalte, t4.name as StatsName, t3.Stats_ID from sys.tables t1 inner join sys.columns t2 on t1.object_id = t2.object_id left outer join sys.stats_columns t3 on t1.object_id = t3.object_id and t2.column_id = t3.column_id and t3.stats_id != 1 left outer join sys.stats t4 on t1.object_id = t4.object_id and t3.stats_id = t4.stats_id and t3.stats_id != 1 ) select case WHEN (StatsName IS NULL ) THEN 'CREATE STATISTICS Stat' + &SQUARE_BRACKETS_OPEN;Schema] + Tabelle + Spalte + ' ON ' + &SQUARE_BRACKETS_OPEN;Schema] + '.' + Tabelle + '(' + Spalte + ');' ELSE 'UPDATE STATISTICS ' + &SQUARE_BRACKETS_OPEN;Schema] + '.' + Tabelle + ' (' + StatsName + ');' END from cte order by 1

In case you are not sure if you have created your stats right after creating the table it might be best to CTAS the Table again and rename it afterwards. Now you can create the stats and be sure that everything is correct. Of course the script can also be used after you have added new columns.

Teilen auf

Newsletter Anmeldung

Abonnieren Sie unseren Newsletter!
Lassen Sie sich regelmäßig über alle Neuigkeiten rundum ORAYLIS und die BI- & Big-Data-Branche informieren.

Jetzt anmelden