
I'm having trouble using reindex() to get counts for every combination in my DataFrame, including combinations that never occur.

The code below reproduces my problem:

import numpy as np
import pandas as pd

a = [
    ['Brand A' if i==0 else 'Brand B' for i in np.random.randint(0,2,size=(100,))],
    ['Type 1' if i==0 else 'Type 2' for i in np.random.randint(0,2,size=(100,))],
    ['Red' if i==0 else 'Blue' for i in np.random.randint(0,2,size=(100,))]
]
b = pd.DataFrame(a, index=['Brand', 'Type', 'Color']).T
b.loc[(b.Brand=='Brand A')&(b.Type=='Type 1'), 'Color'] = 'Red'   # no Blue, Type 1, Brand A
b.loc[(b.Brand=='Brand B')&(b.Type=='Type 2'), 'Color'] = 'Blue'  # no Red, Type 2, Brand B

c = b.groupby(['Brand','Type','Color'])
c.size()\
 .reindex(['Blue','Red'], level=2, fill_value=0)

outputs:

Brand    Type    Color
Brand A  Type 1  Red      17
         Type 2  Blue     17
                 Red      19 
Brand B  Type 1  Blue     13
                 Red       9
         Type 2  Blue     25
dtype: int64

Is there any way to get this output instead:

Brand    Type    Color
Brand A  Type 1  Blue      0
                 Red      17
         Type 2  Blue     17
                 Red      19 
Brand B  Type 1  Blue     13
                 Red       9
         Type 2  Blue     25
                 Red       0
dtype: int64
Diziet Asahi

1 Answer


You can use unstack and stack:

print(b.groupby(['Brand','Type','Color']).size().unstack(2, fill_value=0).stack())
Brand    Type    Color
Brand A  Type 1  Blue      0
                 Red      21
         Type 2  Blue     20
                 Red      14
Brand B  Type 1  Blue     15
                 Red      11
         Type 2  Blue     19
                 Red       0
dtype: int64
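
Because the data in the question is random, the counts differ on every run. Here is a minimal, deterministic sketch of the same unstack/stack idea. The small DataFrame `df` and the names `counts` and `filled` are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical toy data: no Blue for (Brand A, Type 1), no Red for (Brand B, Type 2)
df = pd.DataFrame({
    "Brand": ["Brand A", "Brand A", "Brand A", "Brand B", "Brand B", "Brand B"],
    "Type":  ["Type 1", "Type 2", "Type 2", "Type 1", "Type 1", "Type 2"],
    "Color": ["Red", "Blue", "Red", "Blue", "Red", "Blue"],
})

# Only the combinations that actually occur are counted here
counts = df.groupby(["Brand", "Type", "Color"]).size()

# Pivot Color into columns (missing cells become 0), then pivot it back to rows
filled = counts.unstack("Color", fill_value=0).stack()
print(filled)
```

Note that this only fills in missing colors for Brand/Type pairs that already exist in the data. A Brand/Type pair that never occurs at all would still be absent from the result.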

Alternatively, reindex with a complete MultiIndex built by MultiIndex.from_product:

iterables = [['Brand A', 'Brand B'], ['Type 1', 'Type 2'], ['Blue','Red']]
idx = pd.MultiIndex.from_product(iterables, names=['Brand', 'Type', 'Color'])
print(b.groupby(['Brand','Type','Color']).size().reindex(idx, fill_value=0))
Brand    Type    Color
Brand A  Type 1  Blue      0
                 Red      21
         Type 2  Blue     20
                 Red      14
Brand B  Type 1  Blue     15
                 Red      11
         Type 2  Blue     19
                 Red       0
dtype: int64
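
If you would rather not hard-code the category lists, one possible variation is to build the full index from the levels of the grouped result itself. This is a sketch on made-up data; `df`, `counts`, `full_idx` and `full` are illustrative names, and it assumes every value you care about appears at least once in its column.

```python
import pandas as pd

# Hypothetical toy data: no Blue for (Brand A, Type 1), no Red for (Brand B, Type 2)
df = pd.DataFrame({
    "Brand": ["Brand A", "Brand A", "Brand A", "Brand B", "Brand B", "Brand B"],
    "Type":  ["Type 1", "Type 2", "Type 2", "Type 1", "Type 1", "Type 2"],
    "Color": ["Red", "Blue", "Red", "Blue", "Red", "Blue"],
})

counts = df.groupby(["Brand", "Type", "Color"]).size()

# Cartesian product of the values present in each index level
full_idx = pd.MultiIndex.from_product(counts.index.levels, names=counts.index.names)

# Reindex against the complete MultiIndex; absent combinations get 0
full = counts.reindex(full_idx, fill_value=0)
print(full)
```

Unlike unstack/stack, this also adds rows for Brand/Type pairs that never occur together, since the product covers all three levels.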
jezrael
  • I see, so my problem was that I was reindexing only one level instead of reindexing with the whole MultiIndex directly? Thanks for your help! – Diziet Asahi Aug 19 '16 at 14:49
  • Yes. I tried all the possible combinations, and only these two methods worked for me. – jezrael Aug 19 '16 at 14:49