14

I want to get the interval bounds from a column of pandas Interval objects and write them to the columns 'left' and 'right'. iterrows does not work (the documentation says it should not be used for writing data) and, in any case, it would not be the best solution.

import pandas as pd

i1 = pd.Interval(left=85, right=94)
i2 = pd.Interval(left=95, right=104)
i3 = pd.Interval(left=105, right=114)
i4 = pd.Interval(left=115, right=124)
i5 = pd.Interval(left=125, right=134)
i6 = pd.Interval(left=135, right=144)
i7 = pd.Interval(left=145, right=154)
i8 = pd.Interval(left=155, right=164)
i9 = pd.Interval(left=165, right=174)

data = pd.DataFrame(
    {
    "intervals":[i1,i2,i3,i4,i5,i6,i7,i8,i9],
    "left"     :[0,0,0,0,0,0,0,0,0],
    "right"    :[0,0,0,0,0,0,0,0,0]
    },
    index=[0,1,2,3,4,5,6,7,8]
)

# This does not work (it has no effect):
for index, row in data.iterrows():
    print(row.intervals.left, row.intervals.right)
    row.left = row.intervals.left
    row.right = row.intervals.right

How can we do something like:

data['left']=data['intervals'].left

data['right']=data['intervals'].right

Thanks!

cs95
mike

3 Answers

27

Create a pandas.IntervalIndex from your intervals. You can then access its .left and .right attributes.

import pandas as pd

idx = pd.IntervalIndex([i1, i2, i3, i4, i5, i6, i7, i8, i9])  
pd.DataFrame({'intervals': idx, 'left': idx.left, 'right': idx.right})

    intervals  left  right
0    (85, 94]    85     94
1   (95, 104]    95    104
2  (105, 114]   105    114
3  (115, 124]   115    124
4  (125, 134]   125    134
5  (135, 144]   135    144
6  (145, 154]   145    154
7  (155, 164]   155    164
8  (165, 174]   165    174
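If the DataFrame from the question already exists, the same idea can probably be applied to its column directly. This is a minimal sketch, assuming every Interval in the column shares the same closed side; the three-row `data` frame is a shortened stand-in for the one in the question.

```python
import pandas as pd

# Shortened stand-in for the question's DataFrame (illustration only).
data = pd.DataFrame(
    {"intervals": [pd.Interval(85, 94), pd.Interval(95, 104), pd.Interval(105, 114)]}
)

# Wrap the existing column in an IntervalIndex to get vectorized access to the bounds.
idx = pd.IntervalIndex(data["intervals"])
data["left"] = idx.left
data["right"] = idx.right

print(data)
```

The bounds are assigned positionally, so the row order of `data` is preserved.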

Another option is using map and operator.attrgetter (look ma, no lambda...):

from operator import attrgetter

# Assumes df already has an 'intervals' column, e.g. df = pd.DataFrame({'intervals': idx})
df['left'] = df['intervals'].map(attrgetter('left'))
df['right'] = df['intervals'].map(attrgetter('right'))

df
    intervals left right
0    (85, 94]   85    94
1   (95, 104]   95   104
2  (105, 114]  105   114
3  (115, 124]  115   124
4  (125, 134]  125   134
5  (135, 144]  135   144
6  (145, 154]  145   154
7  (155, 164]  155   164
8  (165, 174]  165   174
Trenton McKinney
cs95
5

A pandas.arrays.IntervalArray is the preferred way to store interval data in Series-like structures.

For @coldspeed's first example, IntervalArray is basically a drop-in replacement:

In [2]: pd.__version__
Out[2]: '1.1.3'

In [3]: ia = pd.arrays.IntervalArray([i1, i2, i3, i4, i5, i6, i7, i8, i9])

In [4]: df = pd.DataFrame({'intervals': ia, 'left': ia.left, 'right': ia.right})

In [5]: df
Out[5]:
    intervals  left  right
0    (85, 94]    85     94
1   (95, 104]    95    104
2  (105, 114]   105    114
3  (115, 124]   115    124
4  (125, 134]   125    134
5  (135, 144]   135    144
6  (145, 154]   145    154
7  (155, 164]   155    164
8  (165, 174]   165    174

If you already have interval data in a Series or DataFrame, @coldspeed's second example becomes a bit simpler if you access the array attribute:

In [6]: df = pd.DataFrame({'intervals': ia})

In [7]: df['left'] = df['intervals'].array.left

In [8]: df['right'] = df['intervals'].array.right

In [9]: df
Out[9]:
    intervals  left  right
0    (85, 94]    85     94
1   (95, 104]    95    104
2  (105, 114]   105    114
3  (115, 124]   115    124
4  (125, 134]   125    134
5  (135, 144]   135    144
6  (145, 154]   145    154
7  (155, 164]   155    164
8  (165, 174]   165    174
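The .array accessor only exposes .left and .right when the column has interval dtype. If a column of Interval objects has ended up with object dtype, one option is to re-wrap it first. This is a sketch under that assumption; the object-dtype column is constructed here purely for illustration.

```python
import pandas as pd

# Illustration only: force object dtype to mimic a column that lost its interval dtype.
df = pd.DataFrame(
    {"intervals": pd.Series([pd.Interval(85, 94), pd.Interval(95, 104)], dtype=object)}
)

# Re-wrap the objects as an IntervalArray, then read the bounds from it.
ia = pd.arrays.IntervalArray(df["intervals"])
df["intervals"] = ia
df["left"] = ia.left
df["right"] = ia.right

print(df.dtypes)
```

After the re-wrap, `df['intervals'].array.left` works as in the example above.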
Trenton McKinney
root
3

A simple way is to use the apply() method:

    data['left'] = data['intervals'].apply(lambda x: x.left)
    data['right'] = data['intervals'].apply(lambda x: x.right)
    data
        intervals  left  right
    0    (85, 94]    85     94
    1   (95, 104]    95    104
    ...
    8  (165, 174]   165    174
m02ph3u5
denis_smyslov
    This does work; however, it is not vectorized like [a1](https://stackoverflow.com/a/53996040/7758804) and [a2](https://stackoverflow.com/a/54044765/7758804), so it will be much slower by comparison. – Trenton McKinney Dec 16 '20 at 01:55
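To check the speed difference on your own machine, a rough comparison along these lines may help. It is only a sketch: the 10,000-row frame is a made-up sample, and the timings will vary with hardware and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger sample, built only for this comparison.
ia = pd.arrays.IntervalArray.from_breaks(np.arange(10_001))
df = pd.DataFrame({"intervals": ia})

via_apply = df["intervals"].apply(lambda x: x.left)
via_array = pd.Series(df["intervals"].array.left, index=df.index)

# Both approaches should yield the same values.
same = via_apply.tolist() == via_array.tolist()

# Time each approach over a few repetitions.
t_apply = timeit.timeit(lambda: df["intervals"].apply(lambda x: x.left), number=5)
t_array = timeit.timeit(lambda: df["intervals"].array.left, number=5)
print(f"same values: {same}, apply: {t_apply:.4f}s, array: {t_array:.4f}s")
```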