I have a data frame to pivot to longer:
import pandas as pd
import io
_1 = pd.read_csv(io.StringIO(
"""date; origin; val_one; val_two; aaa; bbb; ccc; ddd; eee; fff
10/11/2009; aaa; 1; 0; 0; 0; 0; 0; 1; 0
10/11/2009; bbb; 0; 1; 1; 0; 0; 0; 0; 1
10/11/2009; ccc; 0; 1; 0; 0; 0; 0; 0; 0
10/11/2009; ddd; 0; 2; 0; 1; 1; 1; 0; 0"""),
sep=";").set_axis(['date', 'origin', 'val_one', 'val_two',
'aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff'], axis=1)
I now want to create origin-target (rows aaa
:fff
) dyads. Rows where no target exists (i.e. none of the target columns is 1
, e.g. the third row, index 2) should be dropped; rows where more than one target exists (e.g. the second row, index 1, where aaa
is 1
and fff
is one of the target columns is 1
) should be turned into two rows. The expected output is:
_2_targ = pd.read_csv(io.StringIO(
"""date; origin; val_one; val_two; target
10/11/2009; aaa; 1; 0; eee
10/11/2009; bbb; 0; 1; aaa
10/11/2009; bbb; 0; 1; fff
10/11/2009; ddd; 0; 2; bbb
10/11/2009; ddd; 0; 2; ccc
10/11/2009; ddd; 0; 2; ddd"""),
sep=';').set_axis(['date', 'origin', 'val_one', 'val_two', 'target'])
I have tried using pd.melt
to no avail (see below) - this creates all dyads, even the ones I don't want to keep (because no target column is 1
).
_2 = pd.melt(_1,
id_vars=['date', 'origin', 'val_one', 'val_two'],
value_vars=['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff'],
var_name='target', value_name='tmp')
What am I missing?