I have a dataset that looks like this:
postcode house_number col2 col3
xxx xxx xxx xxx
xxx xxx xxx xxx
I want to group the data by postcode and house_number. If two rows have the same postcode and house_number, they are the same property. I then want to construct a unique_id for each property (in other words, for a given unique_id, the postcode and house_number must be the same, but the values of col2 and col3 might differ), something like:
unique_id postcode house_number col2 col3
0 111 222 xxx xxx
0 111 222 xxx xxx
1 xxx xxx xxx xxx
.....
I tried new_df = ppd_df.groupby(['postcode','house_number']).reset_index(), but it raised AttributeError: 'DataFrameGroupBy' object has no attribute 'reset_index'. I'm also not sure how to construct the unique_id column. Can someone help, please? Thanks.
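For reference, here is a minimal sketch of the result I am after. The error itself seems to arise because groupby() returns a DataFrameGroupBy object rather than a DataFrame, so reset_index() is not available until something has been computed from the groups. The sketch assumes pandas' GroupBy.ngroup() is an acceptable way to number the groups; the sample values are made up for illustration and are not my real data.

```python
import pandas as pd

# Hypothetical sample data; the real ppd_df has the same four columns.
ppd_df = pd.DataFrame({
    "postcode": ["111", "111", "333", "111"],
    "house_number": ["222", "222", "444", "555"],
    "col2": ["a", "b", "c", "d"],
    "col3": ["e", "f", "g", "h"],
})

# ngroup() labels every row with the number of the group it belongs to.
# sort=False numbers the groups in order of first appearance, so the
# first (postcode, house_number) pair seen gets 0, the next new pair 1, etc.
ppd_df["unique_id"] = (
    ppd_df.groupby(["postcode", "house_number"], sort=False).ngroup()
)

# Move unique_id to the front to match the layout shown above.
new_df = ppd_df[["unique_id", "postcode", "house_number", "col2", "col3"]]
print(new_df)
```

Rows that share both postcode and house_number get the same unique_id, while col2 and col3 are left untouched. Dropping sort=False would number the groups in sorted key order instead.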