11.01.2016

Fill down table in T-SQL (last-non-empty value)

Technical Value

A common task in data processing is to fill null values in a table with the latest existing value. For example, look at the following inventory table

image

Let’s assume we only get an inventory record, if the inventory changes but we want to fill the missing values (for example to create a snapshot fact table).

The final result should look like this with null values being filled with the last existing value:

image

In Power Query (or “Get and Transform” if you’re using Excel 2016) this is an easy task using the fill-down function, which does exactly what want:

image

So, how can we get the same effect in T-SQL? We could use a sub query, but in this post I’m using a window function. The idea is to generate row groups for which each row group starts with a not-null value and has a unique identifier. This allows us to use the first_value function for each group to get the existing value for that group.

In order to build the row groups I start with a change indicator for the table values. The idendicator is 1 if we have a balance, 0 if not. The resulting query is quite simple:

 

  1. select *, case when UnitsBalance is null then 0 else 1 end ChangeIndicator from [dbo].[Inventory]

image

The next step is to create a running total over the change indicator. In fact, this running total already defines our row groups. I’m using the sum() over window function here. The query from above is converted into a common table expression (cte) as this is easier to read:

  1. with
  2. help1 as (
  3. select *, case when UnitsBalance is null then 0 else 1 end ChangeIndicator
  4. from [dbo].[Inventory]
  5. )
  6.  
  7. select *, Sum(ChangeIndicator) over (order by DateKey) RowGroup from help1
  8. order by Datekey

image

As you see, the number for the row group increments with each existing balance value which means that the first row always contains the number (with the exception of the very first line for which we don’t have a value at all). In order to finalize the query, we just need the first_value function to get the value of the first row per row group:

  1. with
  2. help1 as (
  3. select *, case when UnitsBalance is null then 0 else 1 end ChangeIndicator
  4. from [dbo].[Inventory]
  5. )
  6. , help2 as (
  7. select *, Sum(ChangeIndicator) over (order by DateKey) RowGroup from help1
  8. )
  9. select *,
  10.  
  11.  
  12. case when UnitsBalance is not null then UnitsBalance
  13. else first_value(UnitsBalance) over (partition by RowGroup order by DateKey)
  14. end UnitsBalanceFillDown

 

  1. from help2

  2. order by datekey

image

So this is exactly what we wanted (if you remove the columns that we just needed during the process from the final output, you get exactly the result from above).

 

Conclusion

Windows functions in T-SQL are a good solution to fill gaps in your data set similar to last-non-empty.
As I’m using the first_value function here, just as a reminder to avoid frustration: first_value order by date descending and last_value order by date ascending do not give the same result.

Kommentare

Tobi
Di, 05.06.2018 - 10:51

Hi, this is a clever way of ding this. many thanks.
However, I am unsure if the CASE clause in line 12 is really required. I think it should work with 'first_value' alone. Do I miss something? Tobi

Neuen Kommentar schreiben

Der Inhalt dieses Feldes wird nicht öffentlich zugänglich angezeigt.

Klartext

  • Keine HTML-Tags erlaubt.
  • HTML - Zeilenumbrüche und Absätze werden automatisch erzeugt.
  • Web page addresses and email addresses turn into links automatically.
Teilen auf

Newsletter Anmeldung

Abonnieren Sie unseren Newsletter!
Lassen Sie sich regelmäßig über alle Neuigkeiten rundum ORAYLIS und die BI- & Big-Data-Branche informieren.

Jetzt anmelden

Unsere Website benutzt Cookies, um bestmögliche Funktionalitäten bieten zu können. Durch die Nutzung unserer Website, erklären Sie sich mit der Verwendung von Cookies einverstanden. In unserer Datenschutzerklärung finden Sie unter §5 Informationen und Links, wie Sie diesen Maßnahmen jederzeit widersprechen können.