October 5, 2013

MDX Where clause slicer axis – AND/OR confusion


This is one of the repeated question and causes confusion among the MDX developers, especially if they migrate from SQL development. Before continuing this article, it is better not to compare the MDX and SQL as this will only lead to confusion and error.

What is “where” clause in MDX?

As per MSDN “The slicer axis filters the data returned by the Multidimensional Expressions (MDX) SELECT statement, restricting the returned data so that only data intersecting with the specified members will be returned”

Does it support OR/AND syntax ?

No, not directly but can be achieved with caution

Simple example for where clause

SELECT
{[Measures].[Internet Sales Amount]} ON COLUMNS
,[Date].[Fiscal].[Fiscal Year] ON ROWS
FROM [Adventure Works]
WHERE [Product].[Subcategory].[caps]
The above query will simply slice the cube for Caps sales as shown below. So in SQL term it has restricted the result set for Subcategory = ‘caps’ sales. 
image
Now let us also check the same data with the below query
SELECT
{[Measures].[Internet Sales Amount]} ON COLUMNS
,{[Product].[Subcategory].Members }* {[Date].[Fiscal].[Fiscal Year]} ON ROWS
FROM [Adventure Works]
This will produce the complete set without slicing anything and as expected result are matching
image

Let us imitate SQL OR clause in the slicer axis

What if I want the helmets sale as well as caps sale. Then in SQL world we will simply say subcategory = ‘caps’ or Subcategory = ‘Helmet’
In MDX world this can be achieved by using multi select in where clause. This was introduced in 2005sp1
SELECT
{[Measures].[Internet Sales Amount]} ON COLUMNS
,[Date].[Fiscal].[Fiscal Year] ON ROWS
FROM [Adventure Works]
WHERE {[Product].[Subcategory].[caps]
       ,[Product].[Subcategory].[Helmets]}
image
Now let us verify the result by running the below query with cross join
SELECT
{[Measures].[Internet Sales Amount]} ON COLUMNS
,{[Product].[Subcategory].[caps]
       ,[Product].[Subcategory].[Helmets]} * {[Date].[Fiscal].[Fiscal Year]} ON ROWS
FROM [Adventure Works]
image
if you sum up the caps and Helmets then you will get the results
Year Helmets Caps Total
FY2008 215923.29 18834.05 234757.3
fy2009 9412.31 854.05 10266.36
Let us imitate SQL “AND” clause in the slicer axis
What if want to get the Helmets UK sales alone. In SQL world we would simply say this by subcategory = ‘caps’ and Country = ‘United Kingdom’
in MDX world we use the same multi select principle. Slight change in syntax as we use curved brackets to represent Tuple rather than Set . But as we are slicing by different dimension it will automatically act as AND condition. if you think about the two 2D graph, (x,y) will define a point in the space. If we mention this in words then we might say point  of “x AND y” in the space
SELECT
{[Measures].[Internet Sales Amount]} ON COLUMNS
,[Date].[Fiscal].[Fiscal Year] ON ROWS
FROM [Adventure Works]
WHERE([Product].[Subcategory].[Helmets]
       ,[Geography].[Country].&[United Kingdom])
image
We can verify the results with Crossjoin
SELECT
{[Measures].[Internet Sales Amount]} ON COLUMNS
,([Product].[Subcategory].[Helmets]
       ,[Geography].[Country].&[United Kingdom])  * [Date].[Fiscal].[Fiscal Year] ON ROWS
FROM [Adventure Works]
image

What is the Problem then?

Multi-select has it’s own disadvantage or in other words issue. The statement given above by MSDN is not entirely accurate. Please read this blog by mosha
One of the main advantage of Where clause is, it sets the context for the whole query and gets evaluated well before the calculated members and other axis. But unfortunately,  if you use the one of the member of slicer axis hierarchy in the where clause then it reset the context and produces wrong results.
select  {[Measures].[Internet Sales Amount]} ON COLUMNS      
        ,[Product].[Category].[Category].MEMBERS
          * {[Product].[Subcategory].[Helmets]
             , [Product].[Subcategory].[caps]
            }  ON ROWS
from [Adventure Works]    
image
If we write the same query with slicer axis as shown below then we get wrong results
SELECT
{[Measures].[Internet Sales Amount]} ON COLUMNS
,[Product].[Category].[Category].MEMBERS  ON ROWS
FROM [Adventure Works]
WHERE {[Product].[Subcategory].[caps]
      ,[Product].[Subcategory].[Helmets]
      }
image

Conclusion

MDX where clause and SQL where clause are not directly comparable. In MDX we can achieve filtering by three different ways "Where”,Subcube and “Filter” function. We need to use them based on the need and query context.
I am writing another blog to compare this three function and their right usage. As of now the above issue can easily resolved by using subcube. This is the one of the reason why SSRS always uses Subcube
SELECT
{[Measures].[Internet Sales Amount]} ON COLUMNS
,[Product].[Category].[Category].MEMBERS  ON ROWS
From  (select {[Product].[Subcategory].[caps]
               ,[Product].[Subcategory].[Helmets]
              } on columns
       FROM [Adventure Works])
image