BLOG

Moving Averages in DAX vs. MDX

04.09.2011 Hilmar Buchta

SQL Server 2008 |  SQL Server 2008R2 | SQL Server Denali | PowerPivot

Yes, I’m a supporter of equal rights of DAX and MDX. And like many others, I can’t wait to have BISM multidimensional (aka OLAP Cubes) supporting DAX so that we can use project Crescent on top of all those nice cubes. But back to my topic. My last post was about moving averages in DAX and I was so sure I blogged about calculating them in MDX before… but I didn’t. This is not fair.

On the other hand, Mosha Pasumansky, the godfather of MDX, wrote an excellent and very complete article about this topic and I can only suggest reading it. It doesn’t only cover simple moving averages but also weighted and exponential ones. Also Bill Pearson wrote a very good step-by-step guide about this topic. You can find it here and I can only suggest reading it.

So, basically there is no need for me to write another article about this. Therefore this will be a very short blog post… ah, I just remembered something I may write about. Mosha and Bill both investigated on the calculation of moving averages within a query. In the context of a specific query, things are sometimes easier compared to the situation where you create a cube measure that has to work under different query conditions. For example, you cannot be sure which hierarchy has been used.The first thing that comes into mind is the wizard for adding time intelligence. This wizard does a pretty good job. The main result is a short piece of code that is inserted into the cube script. This piece of code looks similar to the following example:

  1. Scope(
  2.        {
  3.          [Measures].members
  4.        }
  5. ) ;
  6. /*Three Month Moving Average*/
  7.   (
  8.     [Date].[Calendar Date Calculations].[Three Month Moving Average],
  9.     [Date].[Month Name].[Month Name].Members,
  10.     [Date].[Date].Members
  11.   )
  12.     =
  13.   Avg(
  14.        ParallelPeriod(
  15.                        [Date].[Calendar].[Month],
  16.                        2,
  17.                        [Date].[Calendar].CurrentMember
  18.        )
  19.        : [Date].[Calendar].CurrentMember
  20.        *
  21.        [Date].[Calendar Date Calculations].[Current Date]
  22.   ) ;
  23. End Scope ;

The result can be intuitively used by end users as they simply have to choose in which kind of view the data should appear (actual, three month moving average or any other calculation generated by the wizard, for example year-to-day or year-over-year growth). Also, this computation is focusing on the data dimension, not the specific measure, so it can be used for any measure in the cube.

In my last post I used a DAX calculation that computed the moving average based on the last date in the current interval. We can do pretty much the same in MDX by “translating” the DAX formula to MDX. Here is the calculation for a cube calculated member:

  1. CREATE MEMBER CURRENTCUBE.[Measures].SalesAmountAvg30d AS
  2. Avg(
  3.   LASTPERIODS(
  4.        30
  5.        , tail(descendants([Date].[Calendar].currentmember,[Date].[Calendar].[Date]),1).item(0)
  6.   )
  7.   , [Measures].[Internet Sales Amount]
  8. );

After defining this measure we can use it in a query or within a pivot table. Here’s the result from a query:

  1. select {[Measures].[Internet Sales Amount], [Measures].[SalesAmountAvg30d]} on 0,
  2. descendants([Date].[Calendar].[Calendar Year].&[2003])  on 1
  3. from [Adventure Works]

t1

If you compare these values with the values from my last post you see that the values are absolutely identical (just the order of the values differs because of the way I wrote the query). Here are both definitions side by side:

MDX DAX
  1. Avg(
  2. LASTPERIODS(
  3.    30
  4.    , tail(
  5.        descendants(
  6.          [Date].[Calendar].currentmember
  7.          ,[Date].[Calendar].[Date]
  8.        )
  9.        ,1
  10.      ).item(0)
  11.    )
  12.    , [Measures].[Internet Sales Amount]
  13. );
  1. AverageX(
  2. Summarize(
  3.   datesinperiod(‚Date‘[Date]
  4.    , LastDate(‚Date‘[Date]),-30,DAY)
  5.    ,‚Date‘[Date]
  6.    , „SalesAmountSum“
  7.    , calculate(
  8.      Sum(‚Internet Sales‘[Sales Amount]),
  9.      ALLEXCEPT(‚Date‘,‚Date‘[Date])
  10.      )
  11.    )
  12.    ,[SalesAmountSum]
  13. )

Again, the idea (approach) is the same in both cases, therefore both definitions are similar. However, in my opinion the DAX syntax is a bit harder to read in this case. Especially the CALCULATE(…, ALLEXCEPT(…)) makes it harder to understand. In MDX, we rely on attribute relationship for this purpose but in DAX we need to “break” the context manually.

Now, let’s do some performance tests. In order to compare performance I used these queries

MDX DAX
  1. with
  2. member SalesAmountAvg AS
  3.   Avg(
  4.     LASTPERIODS(
  5.       30
  6.       , tail(
  7.         descendants(
  8.           [Date].[Calendar].currentmember
  9.           ,[Date].[Calendar].[Date]),1
  10.         ).item(0)
  11.     )
  12.     , [Measures].[Internet Sales Amount]
  13.   )
  14.  select
  15. {
  16.   [Measures].[Internet Sales Amount]
  17.   , SalesAmountAvg
  18. } on 0,
  19. descendants([Date].[Calendar].[All Periods],,LEAVES) on 1
  20. from [Adventure Works]
  1. define
  2. measure ‚Internet Sales‘[SalesAmountAvg] =
  3.   AverageX(
  4.     Summarize(
  5.       datesinperiod(‚Date‘[Date]
  6.       , LastDate(‚Date‘[Date]),-30,DAY)
  7.       ,‚Date‘[Date]
  8.       , „SalesAmountSum“
  9.       , calculate(
  10.           Sum(
  11.           ‚Internet Sales‘[Sales Amount])
  12.           ,ALLEXCEPT(
  13.           ‚Date‘,‚Date‘[Date])
  14.         )
  15.     )
  16.    ,[SalesAmountSum]
  17. )
  18. evaluate (
  19.   addcolumns(
  20.     values(‚Date‘[Date])
  21.     ,„Internet Sales Amount“
  22.     , SumX(relatedtable
  23.        (‚Internet Sales‘),[Sales Amount])
  24.     ,„SalesAmountAvg“,
  25.     ‚Internet Sales‘[SalesAmountAvg]
  26.   )
  27. )

Both queries return exactly the same results (you may add an “order by ‘Date’[Date]” at the end of the DAX query in order to have the dates returned in the same order as from the MDX query).

For the MDX queries I cleared the cache before running the queries. I changed the number of days (number of days to include in the average, written bold, red in the queries above) and got the following results. For number of days = 0 I took out the calculation and left only the sales amount as aggregate. Time was measured in seconds using SQL Server Management Studio (on a virtual machine, old hardware).

  n=0 n=10 n=30 n=50 n=100 n=1000
MDX 1 3 3 3 3 4
DAX 0 9 9 9 9 12

t2

When looking at these results I was somewhat surprised. Not about the situation that the DAX query took longer to execute. Please keep in mind that I’m running the queries on an early preview of the product so I suppose there is still a lot of debugging and internal logging going on here. We will have to wait for the final product to make a comparison. What surprises me was the fact that the DAX query time did not go up significantly with higher values of n. For the MDX engine I was pretty sure that it would perform this way because we have mature and a well built cache behind it. So, although we’re increasing the number of computed cells dramatically (with higher values for n), the MDX query performance should almost be constant as we have a lot of overlapping calculations here. But also the current DAX engine performs in the same way that shows how very well the DAX engine is implemented. This is a pretty good result and we can expect a lot of performance from this new DAX query engine.

Your email address will not be published. Required fields are marked *

Join #teamoraylispeople

Gestalte mit uns
die Welt der Daten