I have some Pandas (python) data frames that are created by collecting data roughly every 8 milliseconds. The data is broken down into blocks, where a sequence restarts. All the blocks have a label, and there is a timestamp column indicating the time (from the beginning of the file) that the sample was collected. To get an idea, the frame looks something like this:
| | EXPINDEX | EXPTIMESTAMP | DATA1 | DATA2 |
-----------------------------------------------------
| BLOCK | 0 | | | |
| Block1 | 1 | 0 | .423 | .926 |
| | 2 | 8.215 | .462 | .919 |
| | 3 | 17.003 | .472 | .904 |
| Block2 | 4 | 55.821 | .243 | .720 |
| | 5 | 63.521 | .237 | .794 |
| ... | ... | ... | ... | ... |
------------------------------------------------------
The EXPTIMESTAMP column is a DateTimeIndex. What I would like to do is keep that column for utility later on, but create a different sub-index with a block-relative DateTimeIndex, e.g.:
| | | EXPTIMESTAMP | DATA1 | DATA2 |
----------------------------------------------------------
| BLOCK | BLOCKTIMESTAMP | | | |
| Block1 | 0 | 0 | .423 | .926 |
| | 8.215 | 8.215 | .462 | .919 |
| | 17.003 | 17.003 | .472 | .904 |
| Block2 | 0 | 55.821 | .243 | .720 |
| | 7.700 | 63.521 | .237 | .794 |
| ... | ... | ... | ... | ... |
----------------------------------------------------------
I have gotten this working:
blockreltimestamp = []
blocks = list(df.index.levels[0])
for block in blocks:
dfblock = df.xs(block, level='BLOCK').copy()
dfblock["InitialVal"] = dfblock.iloc[0]["EXPTIMESTAMP"]
reltime = dfsblock["EXPTIMESTAMP"] - dfblock["InitialVal"]
blockreltimestamp.extend(list(reltime))
df["BLOCKTIMESTAMP"] = blockreltimestamp
df.set_index(["BLOCK","BLOCKTIMESTAMP"], drop=False, inplace=True)
But I was wondering if there were a cleaner / more efficient / more pandas-esque way of doing this type of transformation.
Thanks!