I am trying to get specific column data out of specific sheets in one big google sheet. For example, I have a list of sheets
Sheets = ['Sheet 1', 'Sheet 2', 'Sheet 3']
and from these sheets, I want to retrieve specific columns like
Column_headers = ['A', 'B']
What I am doing right now is getting the data from
import gspread
from gspread_dataframe import set_with_dataframe
import pandas as pd
pd.set_option("display.max_columns", None)
pd.set_option('display.max_rows', None)
sa = gspread.service_account(filename='file.json')
book = sa.open("book")
Sheets = ['Sheet 1', 'Sheet 2', 'Sheet 3']
Column_headers = ['A', 'B']
for i in Sheets:
2022_sheet = book.worksheet(i)
records = 2022_sheet.get_all_records()
data_2022 = zip(*(e for e in zip(*record) if e[0] in Column_headers))
getdata_2022 = pd.DataFrame(data_2022, columns = Column_headers)
print(getdata_2022)
I am getting the following error
GSpreadException: the given 'expected_headers' are not uniques
that is because the headers are not unique (obv) that's why I am retrieving specific columns, also I can't understand the bit where I loop through the 'Sheets' to get the data only from specific sheets. Eventually the end result should be two columns 'A' and 'B' with all the data from the specific 3 sheets.