How do I import rows of a Google Sheet into Pandas, but with column names?

Question

There are great instructions in a number of places to import a Google Sheet into a Pandas DataFrame using gspread, eg:

# Open our new sheet and read some data.
worksheet = gc.open_by_key('...').sheet1

# get_all_values gives a list of rows.
rows = worksheet.get_all_values()

# Convert to a DataFrame and render.
import pandas as pd
df = pd.DataFrame.from_records(rows)
df.head()

The problem is that this import treats the first row as a value rather than as a header.

How can I import the DataFrame and treat the first row as column names instead of values?

score 4 · Accepted Answer · answered Jun 05 '19 at 00:06

4

You can do

row=[[1,2,3,4]]*3
pd.DataFrame.from_records(row[1:],columns=row[0])
   1  2  3  4
0  1  2  3  4
1  1  2  3  4

answered Jun 05 '19 at 00:06

BENY

317,841
20
164
234

This works, but when I randomize the data using `mydf = df.reindex(np.random.permutation(mydf.index))`, the column names get lost. This doesn't happen with data imported via read_csv(). What gives? – emmby Jun 05 '19 at 00:35
@emmby I am not sure this cause the problem or not `mydf = df.reindex(np.random.permutation(mydf.index))` here should `mydf.reindex` not `df.reindex` right ? – BENY Jun 05 '19 at 01:02
I just figured out the same thing, thank you for the help debugging! – emmby Jun 05 '19 at 01:05
@emmby also , if you would like permutation, I think we can do `df.sample(len(df))` – BENY Jun 05 '19 at 01:06

score 2 · Answer 2 · answered Jul 07 '21 at 05:10

I had the same issue - instead of getting a list of lists via get_all_values(), I changed it to get_all_records() and the column headers populated just fine.

for sheet in sheet_names:
   worksheet = sh.worksheet(sheet)
   rows = pd.DataFrame(worksheet.get_all_records())
   database = pd.concat([database, rows])

How do I import rows of a Google Sheet into Pandas, but with column names?

2 Answers2