3

I was wondering which of the following two queries is more performant?

Query 1:

SELECT NONEMPTY(CROSSJOIN({[Product].[Category].children}, 
                           {[Scenario].[Scenario].members} 
                          )
                 ) ON COLUMNS
FROM [Analysis Services Tutorial] 

Query 2:

SELECT CROSSJOIN(NONEMPTY({[Product].[Category].children}), 
                  NONEMPTY({[Scenario].[Scenario].members})
                 ) ON COLUMNS 
FROM [Analysis Services Tutorial] 

I would say query 2 is more performant/optimized because first you take out all the unnecessary members and then crossjoin them. The first query you crossjoin everything and then take out the nulls. That would be my guess but I want somebody who can clear me up.

Edit 1 In response of comments of an answer

Lets say I add a measure as a second parameter, so it does not go to the "default measure". How could second query return values with null? I am specifying to crossjoin between nonempty members. And I just really dont see how the can return different results no matter the dimensions involved. To me they seemed pretty equivalent. What am I not seeing?

Query 1:

SELECT NONEMPTY(CROSSJOIN({[Product].[Category].children}, 
                           {[Scenario].[Scenario].members} 
                          ), [Total Internet Sales]
                 ) ON COLUMNS
FROM [Analysis Services Tutorial] 

Query 2:

SELECT CROSSJOIN(NONEMPTY({[Product].[Category].children},[Total Internet Sales]), 
                  NONEMPTY({[Scenario].[Scenario].members},[Total Internet Sales])
                 ) ON COLUMNS 
FROM [Analysis Services Tutorial] 

Edit 2

As the answer said the queries are not the same. I realized when @GregGalloway presented other scenario. I did an excel with sample data so maybe someone can find it useful. enter image description here

elvainch
  • 1,369
  • 3
  • 15
  • 32

1 Answers1

3

They aren't equivalent since both queries we will return different results. For example, against the real Adventure Works (not some tutorial version) these two queries return different results. Notice that the Clothing/Kentucky column shows null on the second query:

SELECT NONEMPTY(CROSSJOIN({[Product].[Category].children}, 
                           {[Customer].[State-Province].[State-Province].Members} 
                          ), [Measures].[Internet Sales Amount]
                 ) ON COLUMNS
FROM [Adventure Works] 
where [Measures].[Internet Sales Amount]



SELECT CROSSJOIN(NONEMPTY({[Product].[Category].children},[Measures].[Internet Sales Amount]), 
                  NONEMPTY({[Customer].[State-Province].[State-Province].Members},[Measures].[Internet Sales Amount])
                 ) ON COLUMNS 
FROM [Adventure Works] 
where [Measures].[Internet Sales Amount]

Note that the Scenario dimension doesn't relate to the Internet Sales measure group, I don't think. So that may not be a good example. I chose the Product dimension and the Customer dimension for my example.

As discussed (and as you updated in your question) NonEmpty() should always have a second parameter so it is clear what measure you are doing NonEmpty against. Your query should also mention a measure on one axis or the WHERE clause so that you're not returning some vague "default measure". I've included a WHERE clause with a measure in my examples.

Anyway, to answer your question... assuming the measure is a physical measure or a well optimized calculated measure that runs in block mode I wouldn't be surprised if Query 1 is faster. But it depends on the measure and the size of dimensions and the sparsity of the cube. This question is very theoretical and the two queries don't return equivalent results.

GregGalloway
  • 11,355
  • 3
  • 16
  • 47
  • The NonEmpty works with one parameter. Both queries I tested against the adventureworks cube and the returned results and the same ones. – elvainch Dec 02 '15 at 12:56
  • It may work with one parameter but is is doing a NonEmpty against the "default measure". That may not be what you want so I recommend you include the measure you intend as the second parameter. – GregGalloway Dec 02 '15 at 13:26
  • @elvainch in this one instance it may have returned the same results but that's not guaranteed on other data. Query 2 could return rows that have null measure values for that combination of product and scenario. – GregGalloway Dec 02 '15 at 13:28
  • @elvainch I've updated my answer to show the two formulations of the query won't always return equivalent results and clarify my answer. – GregGalloway Dec 02 '15 at 20:09
  • Now seeing your query I understand why they dont return the same output. I had to see it in an example to understand it. Crossjoining two nonempty set, doesnt mean that wont get an empty cells as a result and make sense. Thanks for the help! – elvainch Dec 02 '15 at 23:31
  • I attached a pictured also, let me know what you think! Thanks again! – elvainch Dec 03 '15 at 00:04
  • @elvainch, yep! I think you have got it. The picture looks right on. – GregGalloway Dec 03 '15 at 04:44