0

This is what I received from spreadsheet:

import pandas as pd
import numpy as np

arrays = [['Phase 1','Phase 1','Phase 1','Phase 1','Phase 1','Phase 1','Phase 1','Phase 1','Phase 1'],
          ['Function A','Function A','Function A','Function A','Function A','Function A','Function A','Function A'],
          ['Achieved on','Achieved on','Achieved on','Achieved on','Achieved on','Achieved on','Planned for', 'Due Date'],
          ['Deliverable 1','Status','True?','Deliverable 2','Status.1','True?.1','NaN','NaN']]
tuples=list(zip(*arrays))
index= pd.MultiIndex.from_tuples(tuples, names=['first','second','third','Project'])
s1=pd.Series(['10/10/2020','Updated','Yes','11/10/2020','Pending','','',''],
                index=index)
reset_df=s1.reset_index()
df=pd.DataFrame(reset_df,index=['Project A', 'Project B'], columns=index)
df2=pd.Series(['10/10/2020','Updated','Yes','11/10/2020','Pending','','',''],
                index=index)
df3=pd.Series(['06/06/2021','Issued','','','','','',''],index=index)
df=df.append([df2,df3], ignore_index=True)
df=df.drop([0,1])
df

Resulting in:

first         Phase 1
second     Function A
third     Achieved on                                               Planned for Due Date
Project Deliverable 1   Status True? Deliverable 2 Status.1 True?.1         NaN      NaN
2          10/10/2020  Updated   Yes    11/10/2020  Pending
3          06/06/2021   Issued

I'm trying to get this output:

data = {'Project':['2','2','3','3'],
        'Phase':['Phase 1','Phase 1','Phase 1','Phase 1'],
        'Function':['Function A','Function A','Function A','Function A'],
        'Requisite':['Deliverable 1','Deliverable 2','Deliverable 1','Deliverable 2'],
        'Achieved on':['10/10/2020','11/10/2020','06/06/2021',''],
        'Status':['Updated','Pending','Issued',''],
        'True?':['Yes','','','']}
output=pd.DataFrame(data)
output

Output is:

 Project    Phase    Function      Requisite Achieved on   Status True?
0       2  Phase 1  Function A  Deliverable 1  10/10/2020  Updated   Yes
1       2  Phase 1  Function A  Deliverable 2  11/10/2020  Pending
2       3  Phase 1  Function A  Deliverable 1  06/06/2021   Issued
3       3  Phase 1  Function A  Deliverable 2

Many posts here have me help a lot, but it seems so complex these operations in pandas.

I tried:

  1. working only with multiindex column level 4 when I read this df from excel (to minimize complexity), but it wasn't helpful.
  2. from this point, I tried to Stack 2 columns ("Status" and "True?") while use Melt to unpivot columns which have "Deliverables". Of course, it didn't work properly.

thanks in advance for any contribution,

mgrijo
  • 1
  • 1
  • please provide your data as text, or better as dataframe constructor as you have a MultiIndex. Images are not helping – mozway Oct 29 '21 at 18:15
  • thanks for your suggestion @mozway I'll try to provide thi way – mgrijo Oct 29 '21 at 19:07
  • Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. – Community Oct 30 '21 at 11:12

0 Answers0