Is there an easier way to change the index values of a pandas dataframe?

Question

I am taking a dataframe, breaking it into two dataframes, and then I need to change the index values so that no number is greater than the total number of rows.

Here's the code:

import math
import pandas as pd

# Read only the needed columns and use row_id as the index
dataset = pd.read_csv("dataset.csv", usecols=['row_id', 'x', 'y', 'time'], index_col=0)
splitvalue = math.floor(0.9 * 786239)
train = dataset[dataset.time < splitvalue]
test = dataset[dataset.time >= splitvalue]

Here's the change that I am doing. I am wondering if there is an easier way:

test.index=range(test.shape[0])
test.index.rename('row_id',inplace=True)

Is there a better way to do this?

score 3 · Accepted Answer · answered Jun 09 '16 at 23:45

try:

test = test.reset_index(drop=True).rename_axis('row_id')
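A minimal sketch of what that one-liner does, using a small made-up frame in place of the question's dataset.csv (the column values and the split threshold of 40 are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical stand-in for the question's dataset.csv
dataset = pd.DataFrame(
    {"x": [0.1, 0.2, 0.3, 0.4, 0.5], "time": [10, 20, 30, 40, 50]},
    index=pd.Index([0, 1, 2, 3, 4], name="row_id"),
)

splitvalue = 40  # assumed threshold, not the one from the question
test = dataset[dataset.time >= splitvalue]  # keeps the old labels 3 and 4

# drop=True discards the old labels instead of turning them into a column;
# rename_axis puts the name back on the fresh 0..n-1 index.
test = test.reset_index(drop=True).rename_axis("row_id")

print(test)
```

Without drop=True the old row_id values would come back as an ordinary column, which is usually not what is wanted here.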

piRSquared

score 2 · Answer 2 · answered Jun 09 '16 at 23:53

You should shuffle your data before slicing:

dataset = dataset.reindex(np.random.permutation(dataset.index))

Otherwise you're biasing your train/test sets. Note that reindex returns a new dataframe, so the result has to be assigned back.
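As a rough sketch of shuffle-then-slice, assuming a positional 90/10 split (the question itself splits on a time threshold, which is unaffected by row order) and a fixed seed chosen only to make the run repeatable:

```python
import numpy as np
import pandas as pd

# Hypothetical ten-row frame standing in for the real data
dataset = pd.DataFrame(
    {"time": np.arange(10)},
    index=pd.Index(np.arange(10), name="row_id"),
)

rng = np.random.default_rng(0)  # seed is an assumption for reproducibility

# reindex returns a new frame with rows in the permuted label order
shuffled = dataset.reindex(rng.permutation(dataset.index))

# Positional split: first 90% of the shuffled rows for training
n_train = int(0.9 * len(shuffled))
train = shuffled.iloc[:n_train]
test = shuffled.iloc[n_train:]

print(len(train), len(test))
```

If the rows are ordered in time and the goal is to predict later rows from earlier ones, a random shuffle may not be appropriate, so this depends on what the split is meant to measure.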

Merlin

Thanks for the suggestion. I didn't realize that the shuffling could be done through reindexing. Cool. – Larry Freeman Jun 10 '16 at 00:01
@LarryFreeman, don't check with head() on the new dataframe. head() sorts on the index and then displays... Drove me nuts for a while. – Merlin Jun 10 '16 at 00:04
If I don't check with head(), what's the alternative? – Larry Freeman Jun 10 '16 at 00:05
I wasn't clever; I just ran the above command in an IPython notebook cell and looked at the top five rows. You could also try slicing with fancy indexing, but I didn't try that. – Merlin Jun 10 '16 at 00:12

score 2 · Answer 3 · answered Jun 10 '16 at 07:21

You can assign a new Index object directly to overwrite the index:

test.index = pd.Index(np.arange(len(test)), name='row_id')
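A direct assignment has to supply exactly one label per row, so the length should come from the frame being re-indexed. The sketch below uses a made-up frame and an assumed threshold of 40 to show both the working assignment and the ValueError pandas raises when the lengths differ:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; values are for illustration only
dataset = pd.DataFrame(
    {"time": [10, 20, 30, 40, 50]},
    index=pd.Index(range(5), name="row_id"),
)
test = dataset[dataset.time >= 40].copy()  # two rows, labels 3 and 4

# Works: the new index has one label per row of test
test.index = pd.Index(np.arange(len(test)), name="row_id")

# Fails: five labels for a two-row frame raises ValueError
try:
    test.index = pd.Index(np.arange(len(dataset)), name="row_id")
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(test.index.tolist(), mismatch_raised)
```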

EdChum
