Pandas: add elements to index at even intervals

Question

I have a dataframe that looks like this:

             B
A             
0.00    5.7096
7.33    8.0280
25.82  15.7212
43.63  19.5156
55.24  20.1888

and I want to add rows with the index at regular intervals (say by 10), so that I can then interpolate the column B with method = 'index'. My desired output is this:

             B
A             
0.00    5.7096
7.33    8.0280
10.00      NaN
20.00      NaN
25.82  15.7212
30.00      NaN
40.00      NaN
43.63  19.5156
50.00      NaN
55.24  20.1888
60.00      NaN

I haven't found any reindex option that adds index elements instead of changing them. My best solution is create a new index, append it to the original dataframe, sort and remove duplicates (if any), but I'm pretty sure there is a better solution.

step = 10
idx = pd.DataFrame(index = df.index).reindex([round(i, 0) for i in np.arange(df.index[0], df.index[-1] + step, step)])
df = df.append(idx)
df.sort_index(inplace = True)
df = df[~df.index.duplicated()]

any suggestion? thanks

score 3 · Answer 1 · answered Feb 09 '21 at 19:01

3

Effectively do a union by doing an outer join.

df = pd.read_csv(io.StringIO("""A             B
0.00    5.7096
7.33    8.0280
25.82  15.7212
43.63  19.5156
55.24  20.1888"""), sep="\s+").set_index("A")

df = df.join(pd.DataFrame(index=pd.RangeIndex(0,60, 10)), how="outer")

	B
0	5.7096
7.33	8.028
10	nan
20	nan
25.82	15.7212
30	nan
40	nan
43.63	19.5156
50	nan
55.24	20.1888

answered Feb 09 '21 at 19:01

Rob Raymond

29,118
3
14
30

seems perfect (solving in just one line), but when I try to change the range to the first and last values of the original index (`df = df.join(pd.DataFrame(index=pd.RangeIndex(df.index[0], df.index[-1], 10)), how="outer")` I get this error: `TypeError: Wrong type for value 55.24` I guess RangeIndex doesn'g to well with float boundaries... – ilmatte Feb 09 '21 at 19:38
`df = df.join(pd.DataFrame(index=pd.RangeIndex(df.index[0].astype(int),df.index[-1].astype(int), 10)), how="outer")` works, changes sample data to go against *0.1* then it does insert 0.0 – Rob Raymond Feb 09 '21 at 19:47

score 1 · Answer 2 · answered Feb 09 '21 at 19:28

1

idx = sorted(set(list(np.arange(70, step=10)) + list(df.index)))
df = df.reindex(idx)

Output:

df Out[59]:

             B
A             
0.00    5.7096
7.33    8.0280
10.00      NaN
20.00      NaN
25.82  15.7212
30.00      NaN
40.00      NaN
43.63  19.5156
50.00      NaN
55.24  20.1888
60.00      NaN

answered Feb 09 '21 at 19:28

KRKirov

3,854
2
16
20

1

Perfect!!! works fine even if I change 70 in your code to `df.index[-1]`. Thanks! – ilmatte Feb 09 '21 at 19:41

score 0 · Answer 3 · answered Feb 09 '21 at 19:05

How about just reindexing the dataframe? The outcome is identical to the solution you provided:

lista = [0.00, 7.33, 25.82, 43.63, 55.24]
listb = [5.7096, 8.0280, 15.7212, 19.5156, 20.1888]
df = pd.DataFrame({'A':lista, 'B':listb})

# range(start, stop, step)
# Create new index and increment
my_increment = [i for i in range(10, 20, 10)]
# combine old and new index, here is where you can use the list function to sort them as well
new_index = df.index.tolist() + my_increment

print(df.reindex(new_index))

Output:

        A        B
0    0.00   5.7096
1    7.33   8.0280
2   25.82  15.7212
3   43.63  19.5156
4   55.24  20.1888
10    NaN      NaN

this is very similar to my solution, creating with a new index that combines the existing and the new values. slightly different (I'm not reindexing) but with a similar amount of code. — ilmatte, Feb 09 '21 at 19:32

Pandas: add elements to index at even intervals

3 Answers3