1

I've got requirement to display top 5 customer sales by customer group, but with other customers sales within the group aggregated as 'Others'. Something similar to this question, but counted separately for each of customer groups.

According to MSDN to perform TopCount, over a set of members you have to use Generate function.

This part works ok:

with 

set [Top5CustomerByGroup] AS
GENERATE
( 
    [Klient].[Grupa Klientow].[Grupa Klientow].ALLMEMBERS,
    TOPCOUNT
    (
        [Klient].[Grupa Klientow].CURRENTMEMBER * [Klient].[Klient].[Klient].MEMBERS
        , 5
        , [Measures].[Przychody ze sprzedazy rzeczywiste wartosc]
    )
)

SELECT 
{ [Measures].[Przychody ze sprzedazy rzeczywiste wartosc]} ON COLUMNS,
{
[Klient].[Grupa Klientow].[Grupa Klientow].ALLMEMBERS * [Klient].[Klient].[All], --for drilldown purposes
[Top5CustomerByGroup]
}
ON ROWS
FROM 
(
  SELECT ({[Data].[Rok].&[2013]} ) ON COLUMNS
      FROM [MyCube]
)

however I've got problem with 'Others' part.

I think I was able to construct set with other customers by group (data looks good) as:

set [OtherCustomersByGroup] AS
GENERATE
( 
    [Klient].[Grupa Klientow].[Grupa Klientow].ALLMEMBERS,
    except
    (
        {[Klient].[Grupa Klientow].CURRENTMEMBER * [Klient].[Klient].[Klient].MEMBERS},
        TOPCOUNT
        (
            [Klient].[Grupa Klientow].CURRENTMEMBER * [Klient].[Klient].[Klient].MEMBERS
            , 5
            , [Measures].[Przychody ze sprzedazy rzeczywiste wartosc]
        )
    )
)

however I don't have idea how to aggregate it with grouping.

Doing this as in this question

member [Klient].[Klient].[tmp] as
aggregate([OtherCustomersByGroup])

produces one value, which is logical.

I think I need list of sets with 'other' customers in each group instead of single [OtherCustomersByGroup] set, but don't have idea how to construct them.

Does anyone have any ideas or suggestions?

UPDATE:

There is some misunderstanding of my needs. I need Top n customers within each of customer group by sales with sales of other customers in this group aggregated to one position (let's say called Others).

For example for this simplified input:

| Group  | Client   | Sales  |
|--------|----------|--------|
| Group1 | Client1  |    300 |
| Group1 | Client2  |      5 |
| Group1 | Client3  |    400 |
| Group1 | Client4  |    150 |
| Group1 | Client5  |    651 |
| Group1 | Client6  | null   |
| Group2 | Client7  |     11 |
| Group2 | Client8  |     52 |
| Group2 | Client9  |     44 |
| Group2 | Client10 |     21 |
| Group2 | Client11 |    201 |
| Group2 | Client12 |    325 |
| Group2 | Client13 |    251 |
| Group3 | Client14 |     15 |

I need such output (here is top 2):

| Group  | Client   | Sales  |
|--------|----------|--------|
| Group1 | Client5  |    651 |
| Group1 | Client3  |    400 |
| Group1 | Others   |    455 |
| Group2 | Client12 |    325 |
| Group2 | Client13 |    251 |
| Group2 | Others   |    329 |
| Group3 | Client14 |     15 |
| Group3 | Others   |  null  | <- optional row

Sorting isn't required, we are going to process it by client side.

Community
  • 1
  • 1
Piotr Sobiegraj
  • 1,775
  • 16
  • 26
  • What do you mean by "with grouping"? What is the difference in what you want, and what the other question you link to does? – Tab Alleman Feb 10 '15 at 21:09
  • @TabAlleman I want to have Top 5 and 'Others' for each of customer group not one general. As stated in linked MSDN article: "The most common practical use of Generate is to evaluate a complex set expression, such as TopCount, over a set of members. The following example query displays the top 10 Products for each Calendar Year on Rows [...] Note that a different top 10 is displayed for each year, and that the use of Generate is the only way to get this result." I also need 'Others' over a set of members – Piotr Sobiegraj Feb 10 '15 at 21:15
  • This is possible but quite involved: I can maybe provide an advWrks script – whytheq Feb 11 '15 at 13:46
  • @whytheq, hope you can develop something better than my answer, since it's not the best solution to add members even for single select query, and tons of multiplying too. – Alex Peshik Feb 11 '15 at 14:40
  • @AlexPeshik I have a `top N of top M with Other member` script that is against the AdvWrks cube - I can post it but it would totally need re-writing to fit in with this question: I don't have time to do this. – whytheq Feb 19 '15 at 09:19
  • @AlexPeshik - feel free to play with the script I've posted Alex - it is an adaptation of a script developed by Chris Webb - I still do not fully understand how it works - if you can add some more comments to help me then that would be appreciated. – whytheq Feb 19 '15 at 09:35
  • @PiotrSobiegraj I've just posted a working model against AdvWrks that replicates the pattern you have posted at the end of your question – whytheq Mar 04 '15 at 14:11
  • Alex & whytheq thanks both of you. We've done this grouping at client side already but solution looks good :) According to Alex request I've marked whytheq answer. – Piotr Sobiegraj Mar 05 '15 at 19:00

2 Answers2

3

Yes, you've got the main idea by using SET for Others, but several minor additions are required to complete the task.

I'll use my test DBs, but this can easily be transformed to yours.

  • [Report Date] - date dimension ([Klient] analogue)
  • [REPORT DATE Y] - years hierarchy ([Grupa Klientow])
  • [REPORT DATE YM] - months hierarchy ([Klient].[Klient])
  • [Measures].[Count] - measure for TopCount ([Measures].[Przychody ze sprzedazy rzeczywiste wartosc])

I also used top 3 just to show result image here.

And here's the code:

with

/* first, add empty [Other] member to the group level */
member [Report Date].[REPORT DATE Y].[Other] as null

/* second, copy measure by fixing the lowest level */
member [Measures].[Count with Other Groups] as ([Report Date].[REPORT DATE YM],[Measures].[Count])

/* third, create top 10 by group */
set [Report Date Top 10 Groups] as
Generate([Report Date].[REPORT DATE Y].Children
,TopCount([Report Date].[REPORT DATE Y].CurrentMember
 * [Report Date].[REPORT DATE YM].Children,3,[Measures].[Count with Other Groups]))

/* this is the part for Other group mapping */
set [Report Date Other Groups] as
[Report Date].[REPORT DATE Y].[Other]
 * ([Report Date].[REPORT DATE YM].Children
    - Extract([Report Date Top 10 Groups],[Report Date].[REPORT DATE YM]))

select {[Measures].[Count],[Measures].[Count with Other Groups]} on 0
,
{
[Report Date Top 10 Groups],[Report Date Other Groups]}
on 1
from 
[DATA]

And here is the result:

TopCount_withGroups

..all members till the last one (which is 201606) are on the Other group.

Hope this helps, bardzo dziękuję!

Update: code was optimized by removing one multiplying in Report Date Other Groups calculation.

Update-2: (not solved yet, but in progress)

(use 'Other' member under each group)

IMPORTANT! We need additional hierarchy: Group->Client ([Report Date].[REPORT DATE] with Year->Month is my case) to be able to determine parent for each low level member.

with

/* create top 10 by group */
set [Report Date Top 10 Groups] as
Generate([Report Date].[REPORT DATE Y].Children
,TopCount([Report Date].[REPORT DATE Y].CurrentMember
 * [Report Date].[REPORT DATE].Children,3,[Measures].[Count]))

/* this is the part for Other group the lowest level non-aggregated members */
set [Report Date Other Members] as
[Report Date].[REPORT DATE Y].Children
* ([Report Date].[REPORT DATE].[Month].AllMembers
    - [Report Date].[REPORT DATE].[All])
- [Report Date Top 10 Groups]

/* add empty [Other] member to the group level, HERE IS AN ISSUE */
member [Report Date].[REPORT DATE].[All].[Other] as null

set [Report Date Other Groups] as
[Report Date].[REPORT DATE Y].[All].Children
* [Report Date].[REPORT DATE].[Other]

member [Measures].[Sum of Top] as
IIF([Report Date].[Report Date].CurrentMember is [Report Date].[REPORT DATE].[Other]
,null /* HERE SHOULD BE CALCULATION, but only
 {[Report Date].[Report Date Y].[All].[Other]}
 is shown, because 'Other' is added to the entire hierarchy */
,SUM([Report Date].[REPORT DATE Y].CurrentMember
        * ([Report Date].[Report Date].CurrentMember.Parent.Children
            - Extract([Report Date Other Members],[Report Date].[REPORT DATE]))
    ,[Measures].[Count]))

member [Measures].[Sum of Group] as
([Report Date].[Report Date].CurrentMember.Parent,[Measures].[Count])

select {[Measures].[Count],[Measures].[Sum of Group],[Measures].[Sum of Top]} on 0
,
Order(Hierarchize({[Report Date Top 10 Groups]
,[Report Date Other Groups]}),[Measures].[Count],DESC)

on 1
from 
[DATA]

And here is the intermediate result:

TopN_v2

I need to move this result here, but have no idea how to do it.

I also tried using flat hierarchies of each level. Other member is shown correctly, but not able to calculate SUM, because both levels are independent. Maybe we can add a property like 'Group_Name' and use unlinked levels, but again - it decreases performance drastically. All this IIF([bla-bla-bla low level member].Properties("Group_Name")=[bla-bla-bla group level].Member_Name are extremely slow.

Update-3 (AdvWorks version of code above)

with

/* create top 10 by group */
set [Top 10 Groups] as
Generate([Customer].[Country].Children
,TopCount([Customer].[Country].CurrentMember
 * [Customer].[Customer Geography].Children,3,[Measures].[Internet Order Count]))

/* this is the part for Other group the lowest level non-aggregated members */
set [Other Members] as
[Customer].[Country].Children
* ([Customer].[Customer Geography].[State-Province].AllMembers
    - [Customer].[Customer Geography].[All])
- [Top 10 Groups]

/* add empty [Other] member to the group level */
member [Customer].[Customer Geography].[All].[Other] as
([Customer].[Country],[Measures].[Internet Order Count])

set [Other Groups] as
[Customer].[Country].[All].Children
* [Customer].[Customer Geography].[Other]

member [Measures].[Sum of Top] as
IIF([Customer].[Customer Geography].CurrentMember is [Customer].[Customer Geography].[Other]
,null
,SUM([Customer].[Country].CurrentMember
        * ([Customer].[Customer Geography].CurrentMember.Parent.Children
            - Extract([Other Members],[Customer].[Customer Geography]))
    ,[Measures].[Internet Order Count]))

member [Measures].[Sum of Group] as
([Customer].[Customer Geography].CurrentMember.Parent,[Measures].[Internet Order Count])

select {[Measures].[Internet Order Count],[Measures].[Sum of Group],[Measures].[Sum of Top]} on 0
,
Order(Hierarchize({[Top 10 Groups],[Other Groups]}),[Measures].[Internet Order Count],DESC) on 1
from [Adventure Works]

TopCount_withGroups_AdvWorks

Update-4 (with a solution in year/month example)

Amazing solution of @whytheq helped to do what I want:

WITH 
  SET [All Grupa Klientow]  AS ([Report Date].[Report Date Y].Children) 
  SET [All Klient] AS ([Report Date].[Report Date YM].Children)
  SET [Top N Members] AS 
    Generate
    (
      [All Grupa Klientow]
     ,TopCount
      (
        (EXISTING 
          [All Klient])
       ,3
       ,[Measures].[Count]
      )
    ) 
  MEMBER [Report Date].[Report Date YM].[Other] AS 
    Aggregate({(EXISTING {[All Klient]} - [Top N Members])}) 
SELECT 
  {[Measures].[Count]} ON 0
 ,{
      [All Grupa Klientow]
    * 
      {
        [Top N Members]
       ,[Report Date].[Report Date YM].[Other]
      }
  } ON 1
FROM [DATA];

And the image:

TopCount_withGroups_FinalResult

Task is solved, but please mark not this answer, but @whytheq's!

Alex Peshik
  • 1,515
  • 2
  • 15
  • 20
  • Thank you, however I needed sth opposite: Other group for each of year. Could you look at an update? In your example I need top 3 months of each year by count with other months of each year aggregated to Others item separately for each year. – Piotr Sobiegraj Feb 17 '15 at 15:45
  • Update-3 for the rows that return "Other" in the column "Sum of Group" your query is returning 3*Total for Country ....what should this show, or is that what is required? – whytheq Mar 04 '15 at 13:03
  • For `Internet Order Count` measure `Other` member of each continent may show `Sum of Group - Sum of Top` of ANY member of the same continent. I just don't know how to move it there. `Other for Sum of Group` means nothing, it's just parent for current member, and means ALL, but for each particular country this measure helps to calculate whole continent value. I've also added an image of my AdvWorks. – Alex Peshik Mar 04 '15 at 13:23
  • We just need to move subtraction to the position of green arrow. Blue 27659 value is just parent and means nothing for Other group. But makes sense for every member as a sum of Top N group (top 3 in my example). – Alex Peshik Mar 04 '15 at 13:25
  • 1
    @AlexPeshik ok - I have solution and will add to my answer (this is assuming you will up it! ) – whytheq Mar 04 '15 at 14:08
  • Amazing! This works perfectly! Solution is simple and smart, cool! (sorry, I mentioned continent->country instead of country->state above, but you've got the idea) – Alex Peshik Mar 04 '15 at 15:11
  • @AlexPeshik This solution is good to have - we have also been playing a long time to try to be able to do this within our environment. I'm very happy with the points (thanks you), and have upped your answer Alex, as you did most of the work and kept this thread going. – whytheq Mar 06 '15 at 08:51
2

The following is against AdvWrks and uses a technique I saw on Chris Webb's blog which he outlines here:
https://cwebbbi.wordpress.com/2007/06/25/advanced-ranking-and-dynamically-generated-named-sets-in-mdx/

The section of the script that creates the set MyMonthsWithEmployeesSets I find very difficult to get my head around - maybe @AlexPeshik could shed a little more light on what is happening in the following script.

WITH 
  SET MyMonths AS 
    TopPercent
    (
      [Date].[Calendar].[Month].MEMBERS
     ,20
     ,[Measures].[Reseller Sales Amount]
    ) 
  SET MyEmployees AS 
    [Employee].[Employee].[Employee].MEMBERS 
  SET MyMonthsWithEmployeesSets AS 
    Generate
    (
      MyMonths
     ,Union
      (
        {[Date].[Calendar].CurrentMember}
       ,StrToSet
        ("
             Intersect({}, 
             {TopCount(MyEmployees, 10, ([Measures].[Reseller Sales Amount],[Date].[Calendar].CurrentMember))
             as EmployeeSet"
            + 
              Cstr(MyMonths.CurrentOrdinal)
          + "})"
        )
      )
    ) 
  MEMBER [Employee].[Employee].[RestOfEmployees] AS 
    Aggregate
    (
      Except
      (
        MyEmployees
       ,StrToSet
        (
          "EmployeeSet" + Cstr(Rank([Date].[Calendar].CurrentMember,MyMonths))
        )
      )
    ) 
  MEMBER [Measures].[EmployeeRank] AS 
    Rank
    (
      [Employee].[Employee].CurrentMember
     ,StrToSet
      (
        "EmployeeSet" + Cstr(Rank([Date].[Calendar].CurrentMember,MyMonths))
      )
    ) 
SELECT 
  {
    [Measures].[EmployeeRank]
   ,[Measures].[Reseller Sales Amount]
  } ON 0
 ,Generate
  (
    Hierarchize(MyMonthsWithEmployeesSets)
   ,
      [Date].[Calendar].CurrentMember
    * 
      {
        Order
        (
          Filter
          (
            MyEmployees
           ,
            [Measures].[EmployeeRank] > 0
          )
         ,[Measures].[Reseller Sales Amount]
         ,BDESC
        )
       ,[Employee].[Employee].[RestOfEmployees]
      }
  ) ON 1
FROM [Adventure Works];

Edit - solution for Alex's third attempt:

WITH 
  SET [AllCountries] AS [Country].[Country].MEMBERS 
  SET [AllStates]    AS [State-Province].[State-Province].MEMBERS 
  SET [Top2States] AS 
    Generate
    (
      [AllCountries]
     ,TopCount
      (
        (EXISTING 
          [AllStates])
       ,3
       ,[Measures].[Internet Order Count]
      )
    ) 
  MEMBER [State-Province].[All].[RestOfCountry] AS 
    Aggregate({(EXISTING {[AllStates]} - [Top2States])}) 
SELECT 
  {[Measures].[Internet Order Count]} ON COLUMNS
 ,{
      [AllCountries]
    * 
      {
        [Top2States]
       ,[State-Province].[All].[RestOfCountry]
       ,[State-Province].[All]
      }
  } ON ROWS
FROM [Adventure Works];
whytheq
  • 34,466
  • 65
  • 172
  • 267
  • Unfortunately, this doesn't work with the same dimension (in my case with `Year` and `Month` levels). Engine treats both of them as one and selects only TopCount for the lowest level :( I'm trying to use another approach, but still have some issues. Hope, will finish it tomorrow. – Alex Peshik Feb 19 '15 at 18:22
  • @AlexPeshik this script does what it is supposed to do. It is an illustration of an approach, not a direct answer to the OP. – whytheq Feb 20 '15 at 13:25
  • yeah, I understand, but using the same modified approach didn't help. Maybe because lack of my brains. Now I'm trying to avoid StrToSet's and use only fast functions. Only moving to 'Other' member precalculated value for it is an issue. I can show this value as a measure near ANY member of the year, but have no ideas how to move it to 'Other'. I've updated my answer with this partial solution, hope you can see the bottleneck. – Alex Peshik Feb 20 '15 at 16:43
  • Do you have an AdvWrks version of your script - I would like to have a play. – whytheq Feb 20 '15 at 21:40
  • Sorry for the delay, had a lot of work. Here is AdvWorks script in Update-3 of my answer. Dimension `Customer` with `Customer Geography` hierarchy the highest two levels `Country` and `State-Province`. – Alex Peshik Mar 03 '15 at 08:31
  • @AlexPeshik looks like the magic command `EXISTING` is what we were missing! – whytheq Mar 04 '15 at 17:20