Say I have a data frame like this:

              x   y   z
timestamp
some_date_1   5   2   4
some_date_2   1   2   6
some_date_3   7   3   5
 ...
some_date_50  4   3   6

and I want to apply a sliding window of size 10 (call this variable window_size) with a 50% overlap (so a variable step_size that is half of window_size) to the x, y, and z columns. That is, I would first print the 10 rows 0 - 9, then rows 5 - 14, 10 - 19, 15 - 24, and so on.

How would I do that if I had a function:

def sliding_window(df, window_size, step_size):

Assume timestamp is datetime.

I want to have separate structures for each window. So, for example, I want to have a separate DataFrame for the first ten rows, and then another for the next ten etc.

For simplicity, I will show an example with window size 4 and step size of 2.

              x   y   z
timestamp
some_date_1   5   2   4
some_date_2   1   2   6
some_date_3   2   3   1
some_date_4   5   4   4

              x   y   z
timestamp
some_date_3   2   3   1
some_date_4   5   4   4
some_date_5   6   7   9
some_date_6   2   1   8

1 Answer

Consider the dataframe df:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 73).reshape(-1, 3), columns=list('xyz'))
df

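def windows(d, w, t):
    # d: DataFrame, w: window size, t: step size
    r = np.arange(len(d))
    s = r[::t]                       # start position of every window
    z = list(zip(s, s + w))          # (start, stop) pairs; stop may run past the end of d
    f = '{0[0]}:{0[1]}'.format       # labels a window as 'start:stop'
    g = lambda p: d.iloc[p[0]:p[1]]  # slices one window; trailing windows may be shorter than w
    return pd.concat(map(g, z), keys=map(f, z))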

This returns a DataFrame with a pd.MultiIndex whose outer level is the window label, so we can easily access each window with loc:

wdf = windows(df, 10, 5)

wdf.loc['0:10']

    x   y   z
0   1   2   3
1   4   5   6
2   7   8   9
3  10  11  12
4  13  14  15
5  16  17  18
6  19  20  21
7  22  23  24
8  25  26  27
9  28  29  30

Or

wdf.loc['15:25']

     x   y   z
15  46  47  48
16  49  50  51
17  52  53  54
18  55  56  57
19  58  59  60
20  61  62  63
21  64  65  66
22  67  68  69
23  70  71  72
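
If separate DataFrames per window are wanted rather than one combined frame, a generator is one option. The following is only a minimal sketch and not part of the windows function above: it reuses the sliding_window signature from the question and, as an assumption, skips trailing windows that would be shorter than window_size (windows keeps them, which is why '15:25' above has only 9 rows).

```python
import numpy as np
import pandas as pd


def sliding_window(df, window_size, step_size):
    """Yield consecutive windows of df as separate DataFrames."""
    # Assumption: only full windows are wanted, so the last start position
    # is the one whose window still fits inside df.
    last_start = len(df) - window_size
    for start in range(0, last_start + 1, step_size):
        yield df.iloc[start:start + window_size]


# Same sample data as above: 24 rows, columns x, y, z
df = pd.DataFrame(np.arange(1, 73).reshape(-1, 3), columns=list('xyz'))

frames = list(sliding_window(df, 10, 5))
for window in frames:
    # Any per-window calculation can go here; the mean of x is just a placeholder
    print(window.index[0], window.index[-1], window['x'].mean())
```

Because each window is yielded one at a time, the generator can be passed directly to another function that loops over it, without first building the concatenated frame.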
piRSquared
  • I edited my question as this isn't really what I was looking for. –  Aug 01 '17 at 19:56
  • @dirtysocks45 can you show me what you're looking for? Otherwise, I'm just guessing. – piRSquared Aug 01 '17 at 19:57
  • I gave an example –  Aug 01 '17 at 20:02
  • @dirtysocks45 then you can use the structure I provided and access each window with `loc` as I've specified. If you don't like the way in which I've accessed each window, then show me how you'd like to access them. Otherwise, this satisfies all requirements you've laid out. – piRSquared Aug 01 '17 at 20:04
  • Is there a way I can access them iteratively? I would like to pass in this function to another one which performs calculations on each window. –  Aug 01 '17 at 20:06
  • Yes. You see the `map(g, z)` in the `windows` function? That's the iterator you want. – piRSquared Aug 01 '17 at 20:10