I have the table below (please see image). What I want to do is look at each department and determine which individual in every single department could serve as a subject matter expert for that department and explain to me what their department does. Therefore, what I did was group the DataFrame by Department ID then split it out into multiple DataFrames. I wrote some code to essentially say if there is an individual that 80% of the people in a given department are reporting up to, then use that individual. If not, NaN. The code is working as I would like, the issue that I am running into is that I can't get the DataFrames to recombine. i.e. when I export the DataFrame to CSV, it only exports one of the DataFrames. I think the information below should suffice, but if any more information is needed please let me know. Any help is greatly appreciated!
Current iteration of code:
import pandas as pd
import openpyxl
#Reads in the data.
df = pd.read_excel('C:/Users/WorleyMG/OneDrive - Crowe LLP/Desktop/Test/Test1.xlsx')
#Breaks the data out among the Department IDs.
df_group = df.groupby('Department ID')
df_split = [df_group.get_group(x) for x in df_group.groups]
#Figures out who has the most reports among teh different departments.
for df in df_split:
df_count_mgr_id = df['Manager ID'].value_counts()
df2 = pd.DataFrame(df_count_mgr_id)
df2['percent'] = (df2['Manager ID'] / df2['Manager ID'].sum())
df2 = df2.reset_index()
df2 = df2.rename(columns={'index':'manager id','Manager ID':'count'})
df2.loc[df2['percent'] >= .8, 'SME'] = 'This is the SME'
df2 = df2[df2['SME'] == 'This is the SME']
df = pd.concat((df2['manager id'],df), axis=0)
df = df.rename(columns={0: 'manager id'})
copy_down = df['manager id'].iloc[0]
df['manager id']=copy_down
df = df.dropna(subset=['ID'])
print(df)
Current output:
ID Name Department ID ... 3rd Manager ID 3rd Manager Name manager id
0 7838.0 Batman 11640.0 ... NaN NaN 7838.0
1 8636.0 Jacob 11640.0 ... 5356.0 William 7838.0
2 7618.0 Jingle 11640.0 ... 5356.0 William 7838.0
3 7848.0 Heimer 11640.0 ... 5356.0 William 7838.0
4 7872.0 Schmidt 11640.0 ... 5356.0 William 7838.0
5 7532.0 Sarah 11640.0 ... 5356.0 William 7838.0
6 7878.0 Connor 11640.0 ... 5356.0 William 7838.0
7 8500.0 Austin 11640.0 ... 5356.0 William 7838.0
8 7926.0 Powers 11640.0 ... 5356.0 William 7838.0
[9 rows x 10 columns]
ID Name Department ID ... 3rd Manager ID 3rd Manager Name manager id
9 8222.0 Freddy 12122.0 ... 7504.0 Chris NaN
10 9102.0 Kruger 12122.0 ... 7504.0 Chris NaN
11 6020.0 Tony 12122.0 ... 5356.0 William NaN
12 7608.0 Stark 12122.0 ... 5356.0 William NaN
13 7920.0 Bruce 12122.0 ... 5356.0 William NaN
14 6636.0 Wayne 12122.0 ... 5356.0 William NaN
[6 rows x 10 columns]
Desired Output:
ID Name Department ID ... 3rd Manager ID 3rd Manager Name manager id
0 7838.0 Batman 11640.0 ... NaN NaN 7838.0
1 8636.0 Jacob 11640.0 ... 5356.0 William 7838.0
2 7618.0 Jingle 11640.0 ... 5356.0 William 7838.0
3 7848.0 Heimer 11640.0 ... 5356.0 William 7838.0
4 7872.0 Schmidt 11640.0 ... 5356.0 William 7838.0
5 7532.0 Sarah 11640.0 ... 5356.0 William 7838.0
6 7878.0 Connor 11640.0 ... 5356.0 William 7838.0
7 8500.0 Austin 11640.0 ... 5356.0 William 7838.0
8 7926.0 Powers 11640.0 ... 5356.0 William 7838.0
9 8222.0 Freddy 12122.0 ... 7504.0 Chris NaN
10 9102.0 Kruger 12122.0 ... 7504.0 Chris NaN
11 6020.0 Tony 12122.0 ... 5356.0 William NaN
12 7608.0 Stark 12122.0 ... 5356.0 William NaN
13 7920.0 Bruce 12122.0 ... 5356.0 William NaN
14 6636.0 Wayne 12122.0 ... 5356.0 William NaN
[15 rows x 10 columns]