1

I would like to automate selecting of values in one column - Step_ID. Insted of defining which Step_ID i would like to filter (shown in the code below) i would like to define, that the first Step_ID and the last Step_ID are being to excluded.

df = df.set_index(['Step_ID'])

df.loc[df.index.isin(['Step_2','Step_3','Step_4','Step_5','Step_6','Step_7','Step_8','Step_9','Step_10','Step_11','Step_12','Step_13','Step_14','Step_15','Step_16','Step_17','Step_18','Step_19','Step_20','Step_21','Step_22','Step_23','Step_24'])]

Is there any option to exclude the first and last value in the column? In this example Step_1 and Step_25. Or include all values expect of the first and the last value? In this example Step_2-Step_24.

The reason for this is that files have different numbers of ''Step_ID''. Since I don't have to redefine it all the time I would like to have a solution that simplify filtering of those. It is necessary to exclude the first and last value in the column 'Step_ID', but the number of the STEP_IDs is always different.

By Step_1 - Step_X, I need to have Step_2 - Step_(X-1).

Parsyk
  • 321
  • 1
  • 3
  • 11

1 Answers1

1

Use:

df = pd.DataFrame({
    'Step_ID': ['Step_1','Step_1','Step_2','Step_2','Step_3','Step_4','Step_5',
                'Step_6','Step_6'],
    'B': list(range(9))})
print (df)
  Step_ID  B
0  Step_1  0
1  Step_1  1
2  Step_2  2
3  Step_2  3
4  Step_3  4
5  Step_4  5
6  Step_5  6
7  Step_6  7
8  Step_6  8

Select all index values without first and last index values extracted by slicing df.index[[0, -1]]:

df = df.set_index(['Step_ID'])
df = df.loc[~df.index.isin(df.index[[0, -1]].tolist())]
print (df)
         B
Step_ID   
Step_2   2
Step_2   3
Step_3   4
Step_4   5
Step_5   6
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/213312/discussion-on-answer-by-jezrael-define-column-values-to-be-selected-disselecte). – Samuel Liew May 07 '20 at 08:08