Filtering pandas df with level values

Question

I have the following pandas df:

df
                        price           max    maxperhour
Site  Commodity Type                        
Mid   Biomass   Stock     6.0  1.500000e+15  1.500000e+15
      CO2       Env       0.0  1.500000e+15  1.500000e+15
      Coal      Stock     7.0  1.500000e+15  1.500000e+15
      Elec      Demand    NaN           NaN           NaN
      Gas       Stock    27.0  1.500000e+15  1.500000e+15
      Hydro     SupIm     NaN           NaN           NaN
      Lignite   Stock     4.0  1.500000e+15  1.500000e+15
      Solar     SupIm     NaN           NaN           NaN
      Wind      SupIm     NaN           NaN           NaN

I would like to filter above mentioned df and create a list of Commodity items as a list, when Site == 'Mid' and Type == ('Stock' or 'Demand').

so following list should be created with some pandas filtering function:

df.somefunction()
['Biomass', 'Coal', 'Gas', 'Lignite', 'Elec']

How would I achieve this?

Lastly if it is possible I would like to have 'Elec' as the last element, what I mean is; when the list is created, 'Elec' would be probably the third element of the list like:

['Biomass', 'Coal', 'Elec', 'Gas', 'Lignite']

However, it would be best if I can get 'Elec' as the last element like:

['Biomass', 'Coal', 'Gas', 'Lignite', 'Elec']

since it is the only element with Type == 'Demand'

From @jezrael

df[(df.index.get_level_values('Site') == 'Mid') & (df.index.get_level_values('Type') == 'Stock')].index.remove_unused_levels().get_level_values('Commodity').tolist()

Do you think need compare number of filtered values with `Mid, Stock` and `Mid, Demand` and by length of output need join together? — jezrael, Jan 09 '19 at 11:50

jezrael · Accepted Answer · 2019-01-09T12:06:00.303

1

Solution with MultiIndex:

m1 = (df.index.get_level_values('Site') == 'Mid')
m2 = (df.index.get_level_values('Type') == 'Stock')
m3 = (df.index.get_level_values('Type') == 'Demand')

idx1 = df[m1 & m2].index.remove_unused_levels().get_level_values('Commodity')
idx2 = df[m1 & m3].index.remove_unused_levels().get_level_values('Commodity')

idx = idx1.append(idx2)
print (idx)
Index(['Biomass', 'Coal', 'Gas', 'Lignite', 'Elec'], dtype='object', name='Commodity')

Alternative with columns:

df1 = df.reset_index()
m1 = (df1['Site'] == 'Mid')
m2 = (df1['Type'] == 'Stock')
m3 = (df1['Type'] == 'Demand')

idx1 = df1.loc[m1 & m2, 'Commodity']
idx2 = df1.loc[m1 & m3, 'Commodity']

idx = idx1.append(idx2).tolist()
print (idx)
['Biomass', 'Coal', 'Gas', 'Lignite', 'Elec']

edited Jan 09 '19 at 12:06

answered Jan 09 '19 at 11:38

jezrael

822,522
95
1,334
1,252

u are going over the lenght of the levels for the second requirement, but if the Demand are longer than it wont work, but still thank you for efford – oakca Jan 09 '19 at 11:50
@oakca - So not sure if understand, what is logic behind? Not length of outputs? – jezrael Jan 09 '19 at 11:51
well it is not your fault the question is too specific... The logic behind was the Demand elements should be always in the end, but does not matter. I am getting a weird error with the first one, because I did not want to create m1 and m2 instead I wrote all of it in 1 line via: `df[df.index.get_level_values('Site') == 'Mid' & df.index.get_level_values('Type').isin(['Stock', 'Demand'])].index.remove_unused_levels().get_level_values('Commodity') ` and the error is – oakca Jan 09 '19 at 11:54
`*** TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''` – oakca Jan 09 '19 at 11:54
@oakca - What is your pandas version? – jezrael Jan 09 '19 at 11:55
pandas version is 0.23.4 – oakca Jan 09 '19 at 11:56
@oakca - So in real data failed `print (m1 & m2)` ? – jezrael Jan 09 '19 at 12:00
Following fails :(, is it because I wrote it all in 1 line? `data['commodity'][data['commodity'].index.get_level_values('Site') == 'Mid' & data['commodity'].index.get_level_values('Type') == 'Stock'].index.remove_unused_levels().get_level_values('Commodity')` and the error: `*** TypeError: unsupported operand type(s) for &: 'str' and 'Index'` – oakca Jan 09 '19 at 12:02
1

@oakca - yes, there is problem, need `()` like `data['commodity'][(data['commodity'].index.get_level_values('Site') == 'Mid') & (data['commodity'].index.get_level_values('Type') == 'Stock')].index.remove_unused_levels().get_level_values('Commodity')` – jezrael Jan 09 '19 at 12:03
@oakca - oops, there is necesary remove `comodity` like `idx1 = data[(data.index.get_level_values('Site') == 'Mid') & (data.index.get_level_values('Type') == 'Stock')].index.remove_unused_levels().get_level_values('Commodity')` – jezrael Jan 09 '19 at 12:05

Filtering pandas df with level values

1 Answers1