df.loc produces an error if the dtype of the index is mixed int/str

Question

I have a data set with mixed index values (int and str), which pd.read_csv reads in with object dtype.

If I try to slice the rows with df.loc, it does not work and I get a TypeError.

I know I can work around it by changing the index dtype, but I would like to understand why this happens, and whether there is a different way of slicing an index with mixed dtypes.

I've created the following test case:

import pandas as pd

# all str index
df1 = pd.DataFrame({'Col': [0, 20, 30, 10]}, index=['a', 'b', 'c', 'd'])
# all int index
df2 = pd.DataFrame({'Col': [0, 20, 30, 10]}, index=[1, 2, 3, 4])
# all str index with numbers
df3 = pd.DataFrame({'Col': [0, 20, 30, 10]}, index=['a', 'b', '3', '4'])
# mixed str/int
df4 = pd.DataFrame({'Col': [0, 20, 30, 10]}, index=['a', 'b', 3, 4])

df1.loc['b':'d']
   Col
b   20
c   30
d   10

df2.loc[2:4]
   Col
2   20
3   30
4   10

df3.loc['b':'4']
   Col
b   20
3   30
4   10

df4.loc['b':4]

TypeError

df4.index = df4.index.map(str)
df4.loc['b':'4']
   Col
b   20
3   30
4   10

Why does the slice not work for df4? Can you 'fix it' within the slice? Is changing the dtype of the index the only option?

anky · Answer 1 · 2020-02-03T17:10:33.763

is changing the dtype of the index the only option?

No. You can use Index.get_loc, which returns the integer position of a label, and pass those positions to iloc[]. The +1 makes the end label inclusive, as it is with loc:

df4.iloc[df4.index.get_loc('b') : df4.index.get_loc(4)+1]
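
The same idea can be wrapped in a small helper. This is only a sketch: slice_by_labels is a hypothetical name, not a pandas function, and it assumes each of the two labels occurs exactly once in the index. For a repeated label, get_loc returns a slice or a boolean mask instead of an integer, so the helper raises rather than guessing.

```python
import numbers

import pandas as pd


def slice_by_labels(df, start, stop):
    """Return the rows from label `start` through label `stop`, inclusive.

    Hypothetical helper for illustration; assumes both labels are unique.
    """
    lo = df.index.get_loc(start)
    hi = df.index.get_loc(stop)
    # get_loc only returns a single integer position for a unique label.
    if not (isinstance(lo, numbers.Integral) and isinstance(hi, numbers.Integral)):
        raise ValueError("start and stop must each match exactly one row")
    # iloc excludes the end position, so add 1 to keep the stop label.
    return df.iloc[lo:hi + 1]


# mixed str/int index, as in the question
df4 = pd.DataFrame({'Col': [0, 20, 30, 10]}, index=['a', 'b', 3, 4])
result = slice_by_labels(df4, 'b', 4)
print(result)
```

Because the slicing is done by position, the index keeps its original mixed labels and nothing has to be converted to str.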

edited Feb 03 '20 at 17:10

answered Feb 03 '20 at 17:05

