27.01.2014

Updating a table on your PDW iteratively

Technical Value

It is quite common in a SMP Environment to update a table. Sometimes it happens that you have to update a big table. And to make it even worse most of the data in the hug table needs to be updated. For example when you have added a new column and want to update it with a certain value.

It is important that the transaction log is not going to be maxed out and the whole transaction will be rolled back. In our SMP world we can use something like the query below to go over a partial result and update only a certain amount. In our case we

  1. declare @rows int = 1;
  2.  
  3. while (@rows > 0)
  4.  
  5. begin
  6.  
  7.           update top(100000) HugeTable
  8.  
  9.           set updatecol = updatecol
  10.  
  11.           where Month_ID = 201312
  12.  
  13.                and updatecol is null
  14.  
  15.   set @rows = @@ROWCOUNT
  16.  
  17. end

This query will only update 1.000.000 rows at a time to make sure the transaction log doesn’t get into FULL state. If you had the chance to experience the awesome performance of the PDW you will also experience a couple of things that won’t work right away because some things are not implemented yet. One of those features is the feature “update top (XXXXXX)”.

But what can we do if we have sort of the same requirement and need to update the table iteratively. The answer to this question is “modulo”. The best thing to do, is to select e.g. an integer column with a great selectivity as well as a great cardinality.

In our scenario I selected the Contract_ID with modulo and splittet the table into 10 equal result sets. Select the min and max values and iterate as you are used to over the table.

declare @min int

declare @max int

  1. set @min = (select min(Contract_ID%10) from HugeTable where Month_ID = 201312)
  2.  
  3. set @max = (select max(Contract_ID%10) from HugeTable where Month_ID = 201312)
  4.  
  5. while @min <= @max
  6.  
  7. begin
  8.  
  9.           update HugeTable
  10.  
  11.           set updatecol = updatecol
  12.  
  13.           where Month_ID = 201312
  14.  
  15.           and updatecole is null
  16.  
  17.                and Contract_ID %10 = @min
  18.  
  19. set @min = @min + 1        
  20.  
  21. end

  With this simple rewrite you can achieve and easy update for your tables.

Teilen auf

Newsletter Anmeldung

Abonnieren Sie unseren Newsletter!
Lassen Sie sich regelmäßig über alle Neuigkeiten rundum ORAYLIS und die BI- & Big-Data-Branche informieren.

Jetzt anmelden