02.10.2015

Dealing with hierarchies Part 4: How to flatten hierarchies in a view

Technical Value

There are different advantages and requirements to flatten a parent-child-hierarchy as mentioned in my previous blog posts. In the following example I will show you a way to flatten most parent-child-hierarchies in a standardized pattern. Our goal is a dimension which consists of two parts:

  • The left part resembles the original parent-child-structure and
  • the right part is a transposed version of the essential attributes.

Parent-child-hierarchies Let’s use the DimEmployee from AdventureWorksDW2012. First we write two common-table-expressions to define the left part and the essential attributes which need to be transposed. These are the main parts which you need to edit when using the code for different hierarchies.

  1. WITH left_part AS
  2. (
  3. SELECT
  4. NodeKey = EmployeeKey
  5. ,ParentKey = ParentEmployeeKey
  6. /* additional attributes */
  7. ,Name = LastName + ', ' + FirstName
  8. ,StartDate
  9. ,EndDate
  10. FROM DimEmployee
  11.  
  12. )
  13. , right_part AS
  14. (
  15. SELECT
  16. NodeKey = EmployeeKey
  17. ,ParentKey = ParentEmployeeKey
  18. /* Numerics */
  19. ,N01 = EmployeeKey
  20. ,N02 = NULL
  21. ,N03 = NULL
  22. ,N04 = NULL
  23. ,N05 = NULL
  24. /* Nvarchars */
  25. ,T01 = LastName
  26. ,T02 = MiddleName
  27. ,T03 = FirstName
  28. ,T04 = NULL
  29. ,T05 = NULL
  30. FROM DimEmployee
  31. )

The following steps are similar for all hierarchies and you can use them for all your hierarchies. In the next step we transform the “right_part” in a way that we have one row for each recursive child of an element together with its original element and the distance between the original element and each child:

  1. , rec AS
  2. (
  3. SELECT
  4. NodeKey
  5. ,ParentKey
  6.  
  7.         ,N01
  8. ,N02
  9. ,N03
  10. ,N04
  11. ,N05
  12.  
  13.         ,T01
  14. ,T02
  15. ,T03
  16. ,T04
  17. ,T05
  18.  
  19.         ,Distance = 0
  20. FROM right_part
  21.  
  22.     UNION ALL
  23.  
  24.     SELECT
  25. rec.NodeKey
  26. ,ri.ParentKey
  27.  
  28.         ,ri.N01
  29. ,ri.N02
  30. ,ri.N03
  31. ,ri.N04
  32. ,ri.N05
  33.  
  34.         ,ri.T01
  35. ,ri.T02
  36. ,ri.T03
  37. ,ri.T04
  38. ,ri.T05
  39.  
  40.         ,Distance = Distance + 1
  41. FROM Rec
  42. JOIN right_part AS ri
  43. ON
  44. ri.NodeKey = Rec.ParentKey
  45. WHERE
  46. Distance < 10
  47. )

Next step inverts the distance to hierarchy level simply by calculating maximum distance of the recursive part above and substracting the distance of the current element.

  1. , invLevel AS
  2. (
  3. SELECT
  4. Level = 1+MAX(Distance) OVER (Partition BY NodeKey)
  5. ,invLevel = 1+MAX(Distance) OVER (Partition BY NodeKey) - Distance
  6. ,*
  7. FROM rec
  8. )

Level contains the value for the NodeKey and invLevel is the Level of the child. Definition of a diagonal matrix:

  1. ,tr
  2. AS
  3. (
  4. SELECT * FROM
  5. (
  6. VALUES
  7. ( 1,'01','',NULL,NULL,NULL,NULL    ,0,NULL,NULL,NULL,NULL),
  8. ( 2,'02',NULL,'',NULL,NULL,NULL    ,NULL,0,NULL,NULL,NULL),
  9. ( 3,'03',NULL,NULL,'',NULL,NULL    ,NULL,NULL,0,NULL,NULL),
  10. ( 4,'04',NULL,NULL,NULL,'',NULL    ,NULL,NULL,NULL,0,NULL),
  11. ( 5,'05',NULL,NULL,NULL,NULL,''    ,NULL,NULL,NULL,NULL,0)
  12.  
  13.     ) AS T (invLevel, Lev, T01, T02, T03, T04, T05, N01, N02, N03, N04, N05)
  14. )

  Last step before transposing the table is to fill up the elements for upper hierarchy-nodes in order to have children for all levels.  

  1. , complete AS
  2. (
  3. SELECT
  4. ri.NodeKey,
  5. Level = MAX(inv.Level) OVER (PARTITION BY ri.NodeKey),
  6. Tr.invLevel
  7.  
  8.         ,N01 = ISNULL(inv.N01,ri.N01)
  9. ,N02 = ISNULL(inv.N02,ri.N02)
  10. ,N03 = ISNULL(inv.N03,ri.N03)
  11. ,N04 = ISNULL(inv.N04,ri.N04)
  12. ,N05 = ISNULL(inv.N05,ri.N05)
  13.  
  14.         ,T01 = ISNULL(inv.T01,ri.T01)
  15. ,T02 = ISNULL(inv.T02,ri.T02)
  16. ,T03 = ISNULL(inv.T03,ri.T03)
  17. ,T04 = ISNULL(inv.T04,ri.T04)
  18. ,T05 = ISNULL(inv.T05,ri.T05)
  19.  
  20.     FROM right_part ri
  21. CROSS JOIN tr
  22. Left Join invLevel AS inv
  23. ON
  24. inv.NodeKey = ri.NodeKey AND
  25. inv.invLevel = Tr.invLevel
  26. )

Now we use a diagonal matrix and a simple aggregation to transpose the table.

  1. , final AS
  2. (
  3. SELECT
  4. NodeKey
  5. ,Level
  6.  
  7.         ,L01N01 = MAX(tr.n01 + c.n01),L01N02 = MAX(tr.n01 + c.n02)
  8.         ,L01N03 = MAX(tr.n01 + c.n03),L01N04 = MAX(tr.n01 + c.n04)
  9.         ,L01N05 = MAX(tr.n01 + c.n05)
  10. ,L01T01 = MAX(tr.t01 + c.t01),L01T02 = MAX(tr.t01 + c.t02)
  11.         ,L01T03 = MAX(tr.t01 + c.t03),L01T04 = MAX(tr.t01 + c.t04)
  12.         ,L01T05 = MAX(tr.t01 + c.t05)
  13.  
  14.         ,L02N01 = MAX(tr.n02 + c.n01),L02N02 = MAX(tr.n02 + c.n02)
  15.         ,L02N03 = MAX(tr.n02 + c.n03),L02N04 = MAX(tr.n02 + c.n04)
  16.         ,L02N05 = MAX(tr.n02 + c.n05)
  17. ,L02T01 = MAX(tr.t02 + c.t01),L02T02 = MAX(tr.t02 + c.t02)
  18.         ,L02T03 = MAX(tr.t02 + c.t03),L02T04 = MAX(tr.t02 + c.t04)
  19.         ,L02T05 = MAX(tr.t02 + c.t05)
  20.  
  21.         ,L03N01 = MAX(tr.n03 + c.n01),L03N02 = MAX(tr.n03 + c.n02)
  22.         ,L03N03 = MAX(tr.n03 + c.n03),L03N04 = MAX(tr.n03 + c.n04)
  23.         ,L03N05 = MAX(tr.n03 + c.n05)
  24. ,L03T01 = MAX(tr.t03 + c.t01),L03T02 = MAX(tr.t03 + c.t02)
  25.         ,L03T03 = MAX(tr.t03 + c.t03),L03T04 = MAX(tr.t03 + c.t04)
  26.         ,L03T05 = MAX(tr.t03 + c.t05)
  27.  
  28.         ,L04N01 = MAX(tr.n04 + c.n01),L04N02 = MAX(tr.n04 + c.n02)
  29.         ,L04N03 = MAX(tr.n04 + c.n03),L04N04 = MAX(tr.n04 + c.n04)
  30.         ,L04N05 = MAX(tr.n04 + c.n05)
  31. ,L04T01 = MAX(tr.t04 + c.t01),L04T02 = MAX(tr.t04 + c.t02)
  32.         ,L04T03 = MAX(tr.t04 + c.t03),L04T04 = MAX(tr.t04 + c.t04)
  33.         ,L04T05 = MAX(tr.t04 + c.t05)
  34.  
  35.         ,L05N01 = MAX(tr.n05 + c.n01),L05N02 = MAX(tr.n05 + c.n02)
  36.         ,L05N03 = MAX(tr.n05 + c.n03),L05N04 = MAX(tr.n05 + c.n04)
  37.         ,L05N05 = MAX(tr.n05 + c.n05)
  38. ,L05T01 = MAX(tr.t05 + c.t01),L05T02 = MAX(tr.t05 + c.t02)
  39.         ,L05T03 = MAX(tr.t05 + c.t03),L05T04 = MAX(tr.t05 + c.t04)
  40.         ,L05T05 = MAX(tr.t05 + c.t05)
  41.  
  42.     FROM complete c
  43. JOIN tr
  44. ON
  45. c.invLevel = tr.invLevel
  46. GROUP BY
  47. NodeKey,
  48. Level
  49. )
  50.  
  51.  

Finally the SELECT query:

  1. SELECT
  2. le.*
  3. , ri.Level
  4.  
  5.     , L01N01, L01T01, L01T02, L01T03
  6. , L02N01, L02T01, L02T02, L02T03
  7. , L03N01, L03T01, L03T02, L03T03
  8. , L04N01, L04T01, L04T02, L04T03
  9. , L05N01, L05T01, L05T02, L05T03
  10.  
  11. FROM left_part AS le
  12. JOIN final AS ri
  13. ON
  14. le.NodeKey = ri.NodeKey
  15. ORDER BY
  16. Level, NodeKey

Here it is: parent-child-hierarchies

Conclusion

With this script you can flatten nearly every parent child hierarchy except you have explicit gaps like direct jumps from Lev2Nodes to Lev4Nodes. You only have to modify the first and second cte – so it’s really efficient to implement new hierarchies. In our scenario we use a version for up to 15 hierarchy-levels and use the script in a view definition.

Teilen auf

Newsletter Anmeldung

Abonnieren Sie unseren Newsletter!
Lassen Sie sich regelmäßig über alle Neuigkeiten rundum ORAYLIS und die BI- & Big-Data-Branche informieren.

Jetzt anmelden