2

I wrote the following code:

frame1 = DataFrame(np.arange(9).reshape(3,3), index=['a','b','c'], columns=['Ohio','Texas','California'])
states = ['Texas','Utah','California']

Then,
frame1.reindex(index=['a','b','c','d'],method='ffill',columns=states)

It raises an error stating 'index must be monotonic increasing or decreasing'. I have read the answer to this question. Then I rewrote it as
frame1.reindex(index=['a','b','c','d'],method='ffill',columns=states.sort())
Now the result is:

   Ohio  Texas  California
a     0      1           2
b     3      4           5
c     6      7           8
d     6      7           8

As you can see, the columns did not change as I expected. Why don't the columns change here, even though I use the reindex function?

honza_p
kyle chan

2 Answers

4

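As the linked question's answer suggests, an index must be sorted before it is reindexed with a fill method. Here the index on axis 0 is sorted, but the one on axis 1 (the columns) is not, so sort the columns of frame1 before reindexing. The columns did not change in your second attempt because states.sort() sorts the list in place and returns None, so no column labels were actually passed; use sorted(states) instead, i.e.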

frame1.sort_index(axis=1).reindex(index=['a','b','c','d'],method='ffill',columns=sorted(states))

Output :

 California  Texas  Utah
a           2      1     1
b           5      4     4
c           8      7     7
d           8      7     7
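To see why columns=states.sort() had no effect, here is a minimal plain-Python sketch (no pandas needed) of the difference between list.sort() and sorted(). The variable names in_place_result and sorted_states are only illustrative.

```python
states = ['Texas', 'Utah', 'California']

# list.sort() sorts the list in place and returns None, so
# columns=states.sort() passes columns=None to reindex.
# A copy is sorted here so that `states` itself stays untouched.
in_place_result = list(states).sort()

# sorted() leaves its argument alone and returns a new sorted list.
sorted_states = sorted(states)

print(in_place_result)  # None
print(sorted_states)    # ['California', 'Texas', 'Utah']
```

Passing None for columns tells reindex to leave the columns alone, which is why the frame came back with its original three columns.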

Hope that clears up your doubt.

Bharath M Shetty
  • Thank you, your answer does clear up my doubt. However, this is an example from the book 'Python for Data Analysis', which is based on a Python 2 environment, and there the column order in the output is [Texas, Utah, California] without sorting any index. Why is that different from our output and operation here? Is it because of a different version of Python, or something else? – kyle chan Nov 20 '17 at 17:55
  • Pandas keeps being updated; the current version is 0.21. Which version does the book use? Maybe the developers realized that requiring sorted assignment is an easier and more proper way of assigning data and leads to less ambiguity. – Bharath M Shetty Nov 20 '17 at 18:05
  • That explains it. The version in the book is old, and I guess that is why I keep encountering errors with the book's code, such as .ix, which is no longer recommended. Again, thank you for the quick response. – kyle chan Nov 20 '17 at 18:33
0

Actually, you are doing two things at once (adding rows and changing the columns). Why is it necessary to do both in one step? You can achieve what you want if you split it into two steps:

import pandas as pd
import numpy as np

frame1 = pd.DataFrame(np.arange(9).reshape(3,3), index=['a','b','c'], columns=['Ohio','Texas','California'])
states = ['Texas','Utah','California']
frame1 = frame1.reindex(index=['a','b','c','d'], method='ffill')
# Reindex the columns by label; assigning a list to frame1.columns would only rename them
frame1 = frame1.reindex(columns=states)
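One caveat with any two-step approach: assigning a list to .columns only relabels the existing columns by position, whereas reindex(columns=...) selects them by label and inserts NaN for labels that are missing. The sketch below contrasts the two on the frame from the question. The names filled, by_label and relabeled are hypothetical, and the behaviour assumes a reasonably recent pandas.

```python
import numpy as np
import pandas as pd

frame1 = pd.DataFrame(np.arange(9).reshape(3, 3),
                      index=['a', 'b', 'c'],
                      columns=['Ohio', 'Texas', 'California'])
states = ['Texas', 'Utah', 'California']

# Step 1: add row 'd', forward-filled from 'c' (the row index is already sorted).
filled = frame1.reindex(index=['a', 'b', 'c', 'd'], method='ffill')

# Step 2a: select columns by label. No fill method is used, so the columns
# need not be monotonic; 'Utah' does not exist and becomes all NaN.
by_label = filled.reindex(columns=states)

# Step 2b: assigning to .columns renames by position instead, so the old
# 'Ohio' data ends up under the label 'California'.
relabeled = filled.copy()
relabeled.columns = sorted(states)

print(by_label)
print(relabeled)
```

The by_label frame keeps each value under its original state and has the column order [Texas, Utah, California], while relabeled silently attaches the wrong labels to the data.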
honza_p
  • Thank you for answering. I understand that I can do it in two steps, but this is a reindexing example from the book 'Python for Data Analysis', and I wrote the same code as the book's and got an error. Therefore, I want to know why that happened and understand the logic behind this operation. – kyle chan Nov 20 '17 at 17:55