
I have a data set whose index mixes int and str values, so pd.read_csv reads the index as object dtype.

If I try to slice the rows by label, it fails with a TypeError.

I know I can work around it by changing the index dtype, but I would like to understand why this happens, and whether there is a different way of slicing these mixed-dtype indices.

I've created the following test case:

import pandas as pd

# all-str index
df1 = pd.DataFrame({'Col': [0, 20, 30, 10]}, index=['a', 'b', 'c', 'd'])
# all-int index
df2 = pd.DataFrame({'Col': [0, 20, 30, 10]}, index=[1, 2, 3, 4])
# all-str index, some labels are digits
df3 = pd.DataFrame({'Col': [0, 20, 30, 10]}, index=['a', 'b', '3', '4'])
# mixed str/int index
df4 = pd.DataFrame({'Col': [0, 20, 30, 10]}, index=['a', 'b', 3, 4])

df1.loc['b':'d']
   Col
b   20
c   30
d   10

df2.loc[2:4]
   Col
2   20
3   30
4   10

df3.loc['b':'4']
   Col
b   20
3   30
4   10

df4.loc['b':4]

TypeError

df4.index = df4.index.map(str)
df4.loc['b':'4']
   Col
b   20
3   30
4   10
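
For reference, the same workaround can be applied to a relabelled copy, so that df4 keeps its original mixed index. This is only a sketch: the names as_str and result are made up for illustration, and it assumes no int label collides with a str label once converted (for example 3 and '3' in the same index).

```python
import pandas as pd

df4 = pd.DataFrame({'Col': [0, 20, 30, 10]}, index=['a', 'b', 3, 4])

# Mixed str/int labels are stored as a generic object index.
print(df4.index.dtype)          # object
print(df4.index.inferred_type)  # mixed-integer

# Relabel a copy with str labels; df4 itself is left untouched.
as_str = df4.set_axis(df4.index.map(str), axis=0)

# Both slice bounds are now str, matching the labels of the copy.
result = as_str.loc['b':'4']
print(result)
```

The result carries the str labels 'b', '3', '4', so anything downstream that expects the int labels would need them converted back.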

Why does the slice not work for df4? Can you 'fix it' within the slice? Is changing the dtype of the index the only option?

Henry Ecker
Martin S.

1 Answer


is changing the dtype of the index the only option?

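No. You can use get_loc, which returns the integer position of a label in the index (for a unique index), and pass those positions to iloc[]. Since iloc excludes the end position, add 1 to the stop so the last row is included: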

df4.iloc[df4.index.get_loc('b') : df4.index.get_loc(4)+1]

   Col
b   20
3   30
4   10
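
If you need this in several places, the same idea can be wrapped in a small helper. This is a minimal sketch rather than a pandas API: slice_by_label is a made-up name, and it assumes the index is unique, because get_loc returns a slice or a boolean mask instead of an int when a label is duplicated.

```python
import pandas as pd

def slice_by_label(df, start, stop):
    """Return the rows from label `start` through label `stop`, inclusive.

    Hypothetical helper: labels are looked up by position, so the index
    may mix str and int labels. The index must be unique.
    """
    if not df.index.is_unique:
        raise ValueError("index labels must be unique")
    lo = df.index.get_loc(start)   # integer position of the first label
    hi = df.index.get_loc(stop)    # integer position of the last label
    return df.iloc[lo:hi + 1]      # +1 because iloc excludes the stop

df4 = pd.DataFrame({'Col': [0, 20, 30, 10]}, index=['a', 'b', 3, 4])
subset = slice_by_label(df4, 'b', 4)
print(subset)
```

A label that is not in the index raises KeyError from get_loc, and the returned rows keep their original mixed labels.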
anky