How to delete a row in a Pandas DataFrame and relabel the index?

Question

I am reading a file into a Pandas DataFrame that may contain invalid (i.e. NaN) rows. The data is sequential, so the row at row_id+1 refers to the row at row_id. When I use frame.dropna(), I get the desired structure, but the index labels stay as they were originally assigned. How can I reassign the index labels to 0 through N-1, where N is the number of rows remaining after dropna()?

If `df` is your dataframe after dropping NA, then try `df.index = range(len(df))` — visitor, Mar 19 '17 at 17:02
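A minimal sketch of the approach suggested in this comment, assuming a small made-up frame (the column names `a` and `b` and the values are hypothetical, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: rows 0 and 3 are entirely NaN.
df = pd.DataFrame({"a": [np.nan, 1.0, 2.0, np.nan, 3.0],
                   "b": [np.nan, 4.0, 5.0, np.nan, 6.0]})

clean = df.dropna()               # surviving rows keep labels 1, 2, 4
clean.index = range(len(clean))   # relabel them as 0 .. N-1
print(clean.index.tolist())
```

Assigning to `.index` changes the labels on the existing object, so no reassignment of the frame is needed here.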

score 17 · Accepted Answer · edited Dec 10 '12 at 20:02

Use pandas.DataFrame.reset_index(). Passing drop=True discards the old index labels instead of inserting them as a new column, which is what you are looking for.

In [13]: import numpy as np; import pandas as pd

In [14]: df = pd.DataFrame(np.random.randn(5,4))

In [15]: df.iloc[::3] = np.nan

In [16]: df
Out[16]:
          0         1         2         3
0       NaN       NaN       NaN       NaN
1  1.895803  0.532464  1.879883 -1.802606
2  0.078928  0.053323  0.672579 -1.188414
3       NaN       NaN       NaN       NaN
4 -0.766554 -0.419646 -0.606505 -0.162188

In [17]: df = df.dropna()

In [18]: df.reset_index(drop=True)
Out[18]:
          0         1         2         3
0  1.895803  0.532464  1.879883 -1.802606
1  0.078928  0.053323  0.672579 -1.188414
2 -0.766554 -0.419646 -0.606505 -0.162188

answered Dec 10 '12 at 19:57 by Aman · edited Dec 10 '12 at 20:02 by Andy Hayden

Yay! reset_index() was exactly what I needed! Thanks for the clear example! – user394430 Dec 10 '12 at 20:12
note to others finding this: you need to do `df = df.reset_index(drop=True)`, otherwise the change won't "persist" in df. – sh37211 Jun 29 '21 at 21:04
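To illustrate the point in this comment, here is a small sketch (the column name `x` and the data are made up for the example) showing that a bare call leaves `df` untouched, while reassigning keeps the new labels:

```python
import numpy as np
import pandas as pd

# Hypothetical one-column frame; dropna() leaves labels 1, 2, 4.
df = pd.DataFrame({"x": [np.nan, 1.0, 2.0, np.nan, 3.0]}).dropna()

df.reset_index(drop=True)           # returns a new frame, which is discarded here
labels_before = df.index.tolist()   # df still carries the old labels

df = df.reset_index(drop=True)      # reassign to keep the relabeled frame
labels_after = df.index.tolist()

print(labels_before, labels_after)
```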

score 4 · Answer 2 · answered Jul 23 '21 at 14:53

In addition to the accepted answer:

You can also pass inplace=True to modify the DataFrame directly instead of reassigning the result:

df.reset_index(drop=True, inplace=True)
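One thing to watch with this form, sketched below on made-up data (the column name `x` is hypothetical): with inplace=True the call returns None, so its result should not be assigned back to `df`.

```python
import numpy as np
import pandas as pd

# Hypothetical one-column frame; dropna() leaves labels 1, 2, 4.
df = pd.DataFrame({"x": [np.nan, 1.0, 2.0, np.nan, 3.0]}).dropna()

# With inplace=True the frame itself is modified and the call returns None.
result = df.reset_index(drop=True, inplace=True)

print(result)
print(df.index.tolist())
```

Writing `df = df.reset_index(drop=True, inplace=True)` would therefore replace the frame with None.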

answered Jul 23 '21 at 14:53 by Artem Fediai
