0

I'm working on building a bridge table between two tables with a many-to-many relationship. Table A contains employee IDs and job assignments (one employee can have more than one assignment). Table B has the same structure. The aim is to build a reconciliation report to find misalignment of job assignments for each employee between Table A and Table B.

To create the bridge table I created a union of Table A and Table B using the concat function. This creates a table with all the employee IDs and the different job assignments. Below is a table showing all the assignments for one employee:

employee_id  job_assignment
001          sales
001          marketing
001          management
001          sales
001          marketing
001          HR

But since this is a bridge table I don't want all these duplicate records. I want the output to look like this:

employee_id  job_assignment
001          sales
001          marketing
001          management
001          HR

I then run bridge.drop_duplicates(inplace = True) to drop all the duplicate records, but it is failing to do so.

The datatype for both of these columns are 'object'

I'm still pretty new to manipulating data and building dataframes in pandas and assistance would be much appreciated.

syd
  • 13
  • 1

0 Answers0