I went through some posts here about removing duplicates from a list of files in a folder using Python and openpyxl, but did not find one that suits my needs.
Below is the script I have. I want to modify it so that it loops through all the files in a folder and removes duplicate rows based on one column.
It would be even better to get a popup box where I can enter the column name (for example, column A), because at the moment I have to edit the column in the script every time I want to remove duplicates from the list of files.
Here is the code I have:
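import openpyxl

# Load the same workbook twice: one copy to read from, one to write to.
wb = openpyxl.load_workbook('Duplicates.xlsx')
wb2 = openpyxl.load_workbook('Duplicates.xlsx')
sh = wb['Sheet1']
sh2 = wb2['Sheet1']

# Collect the values of column A in order, skipping repeats.
values = []
for i in range(1, sh.max_row + 1):
    a = sh.cell(row=i, column=1).value
    if a not in values:
        values.append(a)

# Write the unique values back into column A of the copy.
# Rows below the last unique value keep their old contents.
for x in range(len(values)):
    sh2.cell(row=x + 1, column=1).value = values[x]

wb2.save('new_file.xlsx')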
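
To make the goal concrete, this is roughly the shape I have in mind. It is only a minimal sketch and rests on several assumptions: every workbook has a sheet called `Sheet1`, the column is given as a letter such as `A`, the first occurrence of a value is the one to keep, and the result is saved next to the original with a `_dedup` suffix. The function names (`duplicate_row_numbers`, `remove_duplicate_rows`, `dedupe_folder`, `ask_column_letter`) and the suffix are made up for illustration.

```python
from pathlib import Path

import openpyxl
from openpyxl.utils import column_index_from_string


def duplicate_row_numbers(values, first_row=1):
    """Return the row numbers whose value already appeared in an earlier row."""
    seen = set()
    duplicates = []
    for row_number, value in enumerate(values, start=first_row):
        if value in seen:
            duplicates.append(row_number)
        else:
            seen.add(value)
    return duplicates


def remove_duplicate_rows(sheet, column_letter):
    """Delete every row whose value in column_letter repeats an earlier row.

    Blank cells compare equal to each other, so only the first row with a
    blank in that column is kept.
    """
    col = column_index_from_string(column_letter)
    values = [sheet.cell(row=r, column=col).value
              for r in range(1, sheet.max_row + 1)]
    # Delete from the bottom up so the remaining row numbers stay valid.
    for row_number in reversed(duplicate_row_numbers(values)):
        sheet.delete_rows(row_number)


def dedupe_folder(folder, column_letter, sheet_name="Sheet1"):
    """Save a deduplicated copy of each .xlsx file in folder (assumed layout)."""
    for path in sorted(Path(folder).glob("*.xlsx")):
        # Skip Excel lock files and output from an earlier run.
        if path.name.startswith("~$") or path.stem.endswith("_dedup"):
            continue
        wb = openpyxl.load_workbook(path)
        remove_duplicate_rows(wb[sheet_name], column_letter)
        wb.save(path.with_name(path.stem + "_dedup.xlsx"))


def ask_column_letter():
    """Show a small popup asking for the column letter, e.g. 'A'."""
    # Imported here so the rest of the script works without a display.
    import tkinter as tk
    from tkinter import simpledialog

    root = tk.Tk()
    root.withdraw()  # hide the empty main window
    answer = simpledialog.askstring("Remove duplicates",
                                    "Column letter (e.g. A):")
    root.destroy()
    return answer
```

It would be called as `dedupe_folder("path/to/folder", ask_column_letter())`. Note that `askstring` returns `None` if the dialog is cancelled, so a real script would need to check for that before continuing.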