
Problem statement: Dynamically reallocate predefined category weights when one or more categories are missing.

Details: There are 3 categories of data, C1, C2 and C3, each with a predefined weight (let's say 0.5, 0.3 and 0.2 respectively). If data exists for all 3 categories, we take a weighted average: C1*d1 + C2*d2 + C3*d3, where C1, C2, C3 stand for the category weights and d1, d2, d3 for the corresponding data values.

The weights C1, C2 and C3 are predefined and will always sum to 1.
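
To make the baseline concrete, here is a minimal sketch of the all-present case. The dictionary layout and the function name `weighted_average` are only illustrative assumptions; the weights are the example values given above.

```python
# Example weights from the question; the dict layout is an assumption.
WEIGHTS = {"C1": 0.5, "C2": 0.3, "C3": 0.2}

def weighted_average(scores, weights=WEIGHTS):
    """Return C1*d1 + C2*d2 + C3*d3 when every category has a value."""
    return sum(weights[cat] * scores[cat] for cat in weights)

# Hypothetical data values d1, d2, d3 for illustration only.
result = weighted_average({"C1": 10, "C2": 20, "C3": 30})
```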

The problem arises when data is missing for d1, d2 or d3. In that case, the weight of each missing category must be distributed evenly among the categories that are present.

For example, if d1 and d2 are present and d3 isn't, C3's weight should be divided evenly between C1 and C2. The new calculation would be newC1*d1 + newC2*d2, where newC1 and newC2 are the revised weights after each receives half of C3 (0.6 and 0.4 with the example weights). The same applies to every other combination of present and missing d1, d2, d3.
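
The rule described so far could be sketched roughly as follows. This is only one possible reading, under the assumption that a missing value is represented as `None` (a dataframe would more likely use NaN, which would need a different check); the function names are hypothetical.

```python
def redistribute_evenly(weights, scores):
    """Split the weight of missing categories equally among present ones.

    Assumption: a category is missing when its score is absent or None.
    """
    present = [c for c in weights if scores.get(c) is not None]
    if not present:
        raise ValueError("no category has data")
    # Total weight that belongs to the missing categories.
    missing_total = sum(w for c, w in weights.items() if c not in present)
    share = missing_total / len(present)
    return {c: weights[c] + share for c in present}

def weighted_score(weights, scores):
    """Weighted average over the present categories with revised weights."""
    new_weights = redistribute_evenly(weights, scores)
    return sum(new_weights[c] * scores[c] for c in new_weights)

weights = {"C1": 0.5, "C2": 0.3, "C3": 0.2}
# d3 is missing, so C1 and C2 each receive half of C3's weight.
new_weights = redistribute_evenly(weights, {"C1": 10, "C2": 20, "C3": None})
```

Because the missing weight is handed out in full, the revised weights still sum to 1, and nothing in the sketch depends on there being exactly 3 categories.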

There will be ONLY 3 categories (C1, C2, C3), to keep this problem as simple as possible for now.

The input dataframe containing the d1, d2, d3 values (in the SCORE column) is as below:

Col1  Col2  SCORE      Col4                 Col5  Col6
123   987   53.357809  2017-05-03 16:39:20  456   'ABC'

Is there a generic way to solve this problem? Any help will be appreciated.

Shankar Pandey
  • That's a broad problem. Do you have anything in mind? How's your data structure (python lists, np arrays, pandas df...)? – rafaelc May 09 '18 at 15:47
  • The actual data resides in a table. It is a set of scores, each assigned to a category. The goal is to calculate a weighted average of the scores (so I'm going to group the scores by category to get the inputs d1, d2 and d3). A dataframe would be easiest for me, but I'm NOT restricted to it. Appreciate your help. – Shankar Pandey May 09 '18 at 15:51

0 Answers