start index at 1 for Pandas DataFrame

Question

I need the index to start at 1 rather than 0 when writing a Pandas DataFrame to CSV.

Here's an example:

In [1]: import pandas as pd

In [2]: result = pd.DataFrame({'Count': [83, 19, 20]})

In [3]: result.to_csv('result.csv', index_label='Event_id')

Which produces the following output:

In [4]: !cat result.csv
Event_id,Count
0,83
1,19
2,20

But my desired output is this:

In [5]: !cat result2.csv
Event_id,Count
1,83
2,19
3,20

I realize that this could be done by adding a sequence of integers shifted by 1 as a column to my data frame, but I'm new to Pandas and I'm wondering if a cleaner way exists.

score 170 · Accepted Answer · answered Nov 23 '13 at 21:57

The index is an object, and the default index starts at 0:

>>> result.index
Int64Index([0, 1, 2], dtype=int64)

You can shift this index by 1 with

>>> result.index += 1 
>>> result.index
Int64Index([1, 2, 3], dtype=int64)

answered Nov 23 '13 at 21:57

alko


Somehow it changes the index name, so the proper order with naming is: df.index += 1; df.index.name = 'name' – yourstruly Jul 10 '16 at 16:11
Caution when using this in an IPython kernel (such as Jupyter): don't run the cell containing this code more than once. It adds one to the index on every run, which will not produce the desired result. – Matt_Haythornthwaite Apr 26 '23 at 13:10
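
The two comments above can be illustrated with a minimal sketch. It assumes the `result` frame and the `Event_id` label from the question; the variable names `shifted_twice` and `stable` are made up for this example. Whether `+= 1` keeps or drops an existing index name may depend on the pandas version, so the sketch simply sets the name after replacing the index.

```python
import pandas as pd

# Re-running `index += 1` keeps shifting the index (the Jupyter caveat above).
shifted_twice = pd.DataFrame({'Count': [83, 19, 20]})
shifted_twice.index += 1
shifted_twice.index += 1  # simulates running the same cell a second time
print(list(shifted_twice.index))

# Assigning a fresh range instead gives the same labels however often it runs.
stable = pd.DataFrame({'Count': [83, 19, 20]})
for _ in range(2):
    stable.index = range(1, len(stable) + 1)
    stable.index.name = 'Event_id'  # name the index after replacing it
print(list(stable.index), stable.index.name)
```

Assigning the range rather than incrementing makes the cell safe to re-run, at the cost of discarding whatever index the frame had before.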

score 38 · Answer 2 · edited Sep 13 '22 at 02:32

Just set the index before writing to CSV.

df.index = np.arange(1, len(df) + 1)

And then write it normally.

edited Sep 13 '22 at 02:32

Troll


answered Nov 23 '13 at 21:54

TomAugspurger


where np is imported like so: import numpy as np – Dung Aug 29 '16 at 21:22
An efficient way: df.index = range(1, df.shape[0] + 1) – santhosh_dj Apr 03 '21 at 10:18
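
As a rough comparison of the answer and the last comment, the sketch below builds the same 1-based index both ways. The frame contents come from the question; the names `df_arange` and `df_range` are illustrative only. It is an assumption about the installed pandas version that a built-in `range` is stored lazily as a `RangeIndex`, so the code prints the index types rather than asserting them.

```python
import numpy as np
import pandas as pd

df_arange = pd.DataFrame({'Count': [83, 19, 20]})
df_arange.index = np.arange(1, len(df_arange) + 1)  # materializes every label

df_range = pd.DataFrame({'Count': [83, 19, 20]})
df_range.index = range(1, len(df_range) + 1)  # no NumPy import needed

# Both frames carry the same labels; only the index type may differ.
print(type(df_arange.index).__name__, list(df_arange.index))
print(type(df_range.index).__name__, list(df_range.index))
```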

score 26 · Answer 3 · edited May 23 '17 at 12:34

source: In Python pandas, start row index from 1 instead of zero without creating additional column

Working example:

import pandas as pd

# input_file is the path to the CSV file to load
dframe = pd.read_csv(input_file)
dframe.index = dframe.index + 1  # shift the default 0-based index to start at 1

edited May 23 '17 at 12:34

Community


answered Aug 29 '16 at 21:11

Dung


What's the difference with the top one and three years later? – Ynjxsjmh Mar 19 '23 at 00:56

score 8 · Answer 4 · answered Aug 25 '17 at 14:06

Another way in one line:

df.shift()[1:]

answered Aug 25 '17 at 14:06

Imran


This drops the last row. – Armali Sep 02 '20 at 08:18

What a scary answer! – Prashant Ghimire Feb 13 '23 at 00:17
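
To make the objection in the comments concrete, here is a small sketch using the question's data (the variable name `shifted` is illustrative). It shows that `shift()[1:]` does produce labels starting at 1, but only by discarding a row of data.

```python
import pandas as pd

df = pd.DataFrame({'Count': [83, 19, 20]})

# shift() moves the values down one row (filling the top with NaN) while the
# labels stay 0..2; slicing [1:] then discards the NaN row.
shifted = df.shift()[1:]
print(shifted)

# The labels now start at 1, but the value 20 is gone and one row is missing.
print(len(df), len(shifted))
```

The remaining values are also converted to floats because of the intermediate NaN, so this approach changes the data as well as the labels.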

score 8 · Answer 5 · answered Apr 28 '18 at 07:06

This worked for me

df.index = np.arange(1, len(df)+1)

answered Apr 28 '18 at 07:06

Liu Yu


score 8 · Answer 6 · edited May 19 '22 at 05:41

In my opinion, best practice is to set the index with a RangeIndex:

import pandas as pd

result = pd.DataFrame(
    {'Count': [83, 19, 20]}, 
    index=pd.RangeIndex(start=1, stop=4, name='index')
)
>>> result
       Count
index       
1         83
2         19
3         20

I prefer this because you can define the range, an optional step, and a name for the index in one line.
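
One possible way to apply this to the original question is sketched below. It assumes the frame already exists, derives the stop value from its length instead of hard-coding it, and reuses the `Event_id` label from the question as the index name. The variable `csv_text` is illustrative.

```python
import pandas as pd

result = pd.DataFrame({'Count': [83, 19, 20]})

# Derive the stop value from the frame instead of hard-coding 4.
result.index = pd.RangeIndex(start=1, stop=len(result) + 1, name='Event_id')

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = result.to_csv()
print(csv_text)
```

Because the index carries its own name, `to_csv` uses it as the header of the index column, so `index_label` is not needed here.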

edited May 19 '22 at 05:41

answered Mar 09 '21 at 20:46

mosc9575


score 5 · Answer 7 · answered Nov 23 '18 at 11:00

You can use this one:

import pandas as pd

result = pd.DataFrame({'Count': [83, 19, 20]})
result.index += 1
print(result)

Or this one, which uses the NumPy library:

import pandas as pd
import numpy as np

result = pd.DataFrame({'Count': [83, 19, 20]})
result.index = np.arange(1, len(result)+1)
print(result)

np.arange creates a NumPy array of consecutive integers in the half-open interval [1, len(result)+1), that is, 1 through len(result), and that array is then assigned to result.index.

score 2 · Answer 8 · answered Feb 20 '23 at 11:29

Following on from TomAugspurger's answer, we could use a list comprehension rather than np.arange(), which removes the need to import numpy. You can use the following instead:

df.index = [i+1 for i in range(len(df))]

answered Feb 20 '23 at 11:29

Matt_Haythornthwaite


score 1 · Answer 9 · answered Mar 23 '22 at 11:11

Append .shift()[1:] when creating the DataFrame:

data = pd.read_csv(r"C:\Users\user\path\data.csv").shift()[1:]

answered Mar 23 '22 at 11:11

prashantwitty


score 0 · Answer 10 · answered Jan 29 '19 at 16:44

Building on the original answer, here are my two cents:

If I'm not mistaken, starting from version 0.23 the default index object is of type RangeIndex.

From the official doc:

RangeIndex is a memory-saving special case of Int64Index limited to representing monotonic ranges. Using RangeIndex may in some instances improve computing speed.

For a huge index range this makes sense: the index is stored as a compact description (start, stop, step) instead of holding every value at once, which saves memory.

Therefore, an example (using Series, but it applies to DataFrame also):

>>> import pandas as pd
>>> countries = ['China', 'India', 'USA']
>>> ds = pd.Series(countries)
>>> type(ds.index)
<class 'pandas.core.indexes.range.RangeIndex'>
>>> ds.index
RangeIndex(start=0, stop=3, step=1)
>>> ds.index += 1
>>> ds.index
RangeIndex(start=1, stop=4, step=1)
>>> ds
1    China
2    India
3      USA
dtype: object

As you can see, incrementing the index object changes its start and stop parameters.

score 0 · Answer 11 · edited Nov 10 '21 at 08:29

This adds a column that accomplishes what you want:

df.insert(0,"Column Name", np.arange(1,len(df)+1))
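
For the CSV use case in the question, the column approach could look like the sketch below. It assumes the question's data and uses `Event_id` in place of the placeholder column name; `csv_text` is an illustrative variable. Passing `index=False` is what keeps the original 0-based index out of the file.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Count': [83, 19, 20]})

# Insert the 1-based counter as the first column.
df.insert(0, 'Event_id', np.arange(1, len(df) + 1))

# index=False omits the original 0-based index from the output.
csv_text = df.to_csv(index=False)
print(csv_text)
```

This is the workaround the question already mentions, so it mainly helps when the counter is also wanted as a regular column.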

edited Nov 10 '21 at 08:29

Flair


answered Nov 09 '21 at 23:00

Jen
