1

I need to perform some data analysis on some excel files (that are saved as account numbers of respective customers). I also have a master sheet with all the account numbers in a column. I need to iterate over the "Account Number" column in MasterSheet.xlsx and read the respective account's excel file (for eg: for account number 123, there is a "123.xlsx", that is located in the same directory as Master Sheet). Then I need to assign the corresponding account number as their variable name.

For a rough understanding of what I want to do, please refer to the code below. I'd prefer to use pandas or openpyxl.

master = pd.read_excel("MasterSheet.xlsx")

for account in master.iterrows():
    filename= str(account)+'.xlsx'
    account= pd.read_excel(filename)

You can see, I have tried to create a filename from each account number read via for loop. And then assign the account number as the variable name for each account's dataframe.

I know this is a very badly framed question but I tried and couldn't frame it any better. I have just started using python. If you need any further information, please ask.

Redowan Delowar
  • 1,580
  • 1
  • 14
  • 36

4 Answers4

1

I also have a master sheet with all the account numbers in a column. I need python to iterate over "Account Number" column in "MasterSheet.xlsx" and read the respective account's excel file (for eg: for account number 123, i have a "123.xlsx", that is in the same location as Master Sheet) and then assign that account number as their variable name.

Since your account_number is saved in the df['Account Number'] column and the files are named as account_number.xlsx, you can just do the following:

import pandas as pd

master = pd.read_excel("MasterSheet.xlsx")

for account in master["Account Number"]:
    filename = str(account) + ".xlsx"
    account = pd.read_excel(filename)
Redowan Delowar
  • 1,580
  • 1
  • 14
  • 36
1
#Import
import pandas as pd

#Read Master file
master = pd.read_excel("MasterSheet.xlsx")

#Make a dictionary - 
#   - Keys will be the each account number
#   - Values will be each dataframe which are account number xlsx files

dictionary1 = {}
for index, row in master.iterrows():
    dictionary1[row['AccountNumber']] = pd.read_excel(str(row['AccountNumber']) + '.xlsx')

#Iterate each dataframe via this code
next(iter(dictionary1.values())) 
lil-wolf
  • 372
  • 2
  • 15
1

One way to do this:

import pandas as pd

master = pd.read_excel("MasterSheet.xlsx")

master['Account File'] = master['Account Number'].apply(lambda x: pd.read_excel(str(x)+'.xlsx'))

Now all your account numbers, account files and other data are in the same data-structure for easy re-use.

Let me know if this helps!

Ramsha Siddiqui
  • 460
  • 6
  • 20
0

For excel file test.xlsx:

  account
0   test1
1   test2

Loop over account column and load new file into new df2:

import pandas as pd

df = pd.read_excel("test.xlsx")
for index, row in df.iterrows():
    df2 = pd.read_excel(row['account'] + '.xlsx')

Output:

   data
0     1
1     2

   data
0     3
1     4
Zaraki Kenpachi
  • 5,510
  • 2
  • 15
  • 38