I have a dataframe of event data with two columns, primary and secondary. Each column contains a list of tags (e.g., ['Fun event', 'Dance party']). A third column, combined, holds the two lists concatenated:
primary             secondary             combined
['booze', 'party']  ['singing', 'dance']  ['booze', 'party', 'singing', 'dance']
['concert']         ['booze', 'vocals']   ['concert', 'booze', 'vocals']
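To make the example reproducible, here is a minimal sketch that builds the frame shown above. It assumes pandas; the variable name df is only illustrative, and combined is assumed to be the primary list followed by the secondary list.

```python
import pandas as pd

# Example data copied from the table above.
df = pd.DataFrame({
    "primary": [["booze", "party"], ["concert"]],
    "secondary": [["singing", "dance"], ["booze", "vocals"]],
})

# Assumption: "combined" is simply primary followed by secondary, row by row.
df["combined"] = [p + s for p, s in zip(df["primary"], df["secondary"])]
```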
I want to dummy code the data so that every tag gets its own column, with a value of 1 if the tag appears in that row's primary list, .5 if it appears in the secondary list, and 0 if it does not appear in the row at all. Like so:
combined                                booze  party  singing  dance  concert  vocals
['booze', 'party', 'singing', 'dance']  1      1      .5       .5     0        0
['concert', 'booze', 'vocals']          .5     0      0        0      1        .5
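One possible way to get this encoding is to build a dict of tag weights per row and let pandas align the dicts into columns. This is a sketch rather than a definitive solution: the helper name weighted_dummies and its parameters are hypothetical, and it assumes that a tag listed in both columns should keep the primary weight.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "primary": [["booze", "party"], ["concert"]],
    "secondary": [["singing", "dance"], ["booze", "vocals"]],
})


def weighted_dummies(frame, primary="primary", secondary="secondary",
                     primary_weight=1.0, secondary_weight=0.5):
    """Hypothetical helper: one column per tag, weighted by source column."""
    rows = []
    for prim, sec in zip(frame[primary], frame[secondary]):
        # Primary tags get the primary weight.
        weights = {tag: primary_weight for tag in prim}
        # Secondary tags get the secondary weight unless the tag is already
        # present as a primary tag (assumption: primary wins on overlap).
        for tag in sec:
            weights.setdefault(tag, secondary_weight)
        rows.append(weights)
    # Tags missing from a row come out as NaN; treat them as not observed.
    return pd.DataFrame(rows, index=frame.index).fillna(0.0)


dummies = weighted_dummies(df)
result = df.join(dummies)  # original columns plus one column per tag
```

Building the rows in plain Python keeps the weighting rule in one place, so changing the .5 weight or the overlap rule only means editing the helper.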