19

I'm trying to swap the rows within the same DataFrame in pandas.

I've tried running

a = pd.DataFrame(data = [[1,2],[3,4]], index=range(2), columns = ['A', 'B'])
b, c = a.iloc[0], a.iloc[1]
a.iloc[0], a.iloc[1] = c, b

but I just end up with both the rows showing the values for the second row (3,4).

Even the variables b and c are now both assigned to 3 and 4 even though I did not assign them again. Am I doing something wrong?

demonplus
Zac

4 Answers

18

Use a temporary variable to store the value using .copy(). Without it, b and c are not independent of a: a.iloc[0] can return a view onto the DataFrame's data, so assigning to a row also changes what b and c show. Unless you copy, the data is changed directly.

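import pandas as pd

a = pd.DataFrame(data = [[1,2],[3,4]], index=range(2), columns = ['A', 'B'])

temp = a.iloc[0].copy()  # independent copy of the first row
a.iloc[0] = a.iloc[1]    # overwrite the first row with the second
a.iloc[1] = temp         # write the saved first row into the second position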

Or you can directly use copy like

a = pd.DataFrame(data = [[1,2],[3,4]], index=range(2), columns = ['A', 'B'])
b, c = a.iloc[0].copy(), a.iloc[1].copy()
a.iloc[0],a.iloc[1] = c,b
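
As a possible alternative to the temporary variable, the swap can be sketched as a single assignment. This is only an illustration built on the frame from the question: it selects both rows in reversed order and converts them to a NumPy array, on the assumption that detaching the values from the DataFrame this way avoids both the shared-data problem and any index alignment.

```python
import pandas as pd

a = pd.DataFrame(data=[[1, 2], [3, 4]], index=range(2), columns=['A', 'B'])

# Pick rows 1 and 0 (in that order) and turn them into a plain NumPy array,
# so the right-hand side is independent of `a` and carries no index labels.
a.iloc[[0, 1]] = a.iloc[[1, 0]].to_numpy()

print(a)
```

The index labels stay 0 and 1; only the values change places.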
Bharath M Shetty
  • Oh yes this solved the problem. Many thanks! I'm still unsure about when values are on a chain. Would you say that it would be safer to just always use the `.copy()` method when I am assigning new variables? – Zac Oct 23 '17 at 14:39
  • 1
    Yes. By default they hold a reference to the main object, so whatever changes you make are reflected in the main object. That is why you need the copy. – Bharath M Shetty Oct 23 '17 at 14:44
12

The accepted answer does not change the index labels: it swaps the values and leaves each label where it was.

If you only want to alter the order of rows you should use dataframe.reindex(arraylike). Notice that the index has changed.
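
A minimal sketch of the reindex approach, reusing the frame from the question (the variable names `swapped` and `renumbered` are only illustrative). Each row keeps its original label, so the index of the result reads 1, 0.

```python
import pandas as pd

a = pd.DataFrame(data=[[1, 2], [3, 4]], index=range(2), columns=['A', 'B'])

# reindex returns a new DataFrame with the rows in the given label order.
swapped = a.reindex([1, 0])
print(swapped)

# Optional: renumber the rows 0..n-1 if the old labels should not travel
# with their rows.
renumbered = swapped.reset_index(drop=True)
print(renumbered)
```

Note that reindex does not modify `a` in place; assign the result back if the original name should hold the reordered frame.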


Chuan
5

This approach generalizes to more complex reorderings:

a = pd.DataFrame(data = [[1,2],[3,4]], index=range(2), columns = ['A', 'B'])
rows = a.index.to_list()
# Move the last row to the first index
rows = rows[-1:]+rows[:-1]
a=a.loc[rows]
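
The same list-of-labels idea could be wrapped in a small helper that exchanges any two rows. The function name `swap_rows` is hypothetical and not part of pandas, and the sketch assumes the index labels are unique.

```python
import pandas as pd

def swap_rows(df, first, second):
    """Return a new DataFrame with the rows labelled `first` and `second` exchanged.

    Hypothetical helper; assumes the index labels are unique.
    """
    order = df.index.to_list()
    i, j = order.index(first), order.index(second)
    order[i], order[j] = order[j], order[i]
    # Selecting with the reordered label list returns the rows in that order.
    return df.loc[order]

df = pd.DataFrame(data=[[1, 2], [4, 5], [6, 7]], index=['a', 'b', 'c'], columns=['A', 'B'])
result = swap_rows(df, 'b', 'c')
print(result)
```

As with reindex, the rows carry their labels with them, and the original frame is left unchanged.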
Seneo
3
df = pd.DataFrame(data = [[1,2],[4,5],[6,7]], index=['a','b','c'], columns = ['A', 'B'])

df

(image: the original DataFrame)

df.reindex(['a','c','b'])


Prageeth Jayathissa
Saurabh Kumar