I'm working on an anomaly detection model for CPU data and my current dataframes are structured like this (with 10k+ servers and 168 hours):
Server | Hour1 | Hour2 | Hour3 |
---|---|---|---|
server1 | val1.1 | val1.2 | val1.3 |
server2 | val2.1 | val2.2 | val2.3 |
I need it to be in a structure like this:
Server | time | Value |
---|---|---|
server1 | 0 | value0 |
server1 | 1 | value1 |
server2 | 0 | value0 |
server2 | 1 | value1 |
The problem is there are 10k+ servers and 168 hourly values for each, so it's taking an eternity to iterate over. Is there a more efficient way to do this transformation?
My current attempt builds a new df with nested for loops like so:

    for index, row in df.iterrows():
        for n in range(0,167):
            new_df.loc[len(new_df.index)] = row
            new_df.iat[len(new_df.index)-1, 2] = n
    for index, row in new_df.iterrows():
        for i, r in df.iterrows():
            new_df_ts = row[2]
            if(row[0] == r[0]):
                new_df.iat[index, 3] = df.iat[i, 2 + new_df_ts]
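
For reference, the wide-to-long reshape described above can probably be done in one vectorized step with pandas `melt` instead of row-by-row loops. This is only a minimal sketch: it assumes the hour columns are literally named `Hour1`, `Hour2`, ... as in the first table, and the sample values and the names `long_df` and `hour` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data in the wide layout from the question.
df = pd.DataFrame({
    "Server": ["server1", "server2"],
    "Hour1": [0.10, 0.40],
    "Hour2": [0.20, 0.50],
    "Hour3": [0.30, 0.60],
})

# melt turns every HourN column into its own (Server, hour, Value) row.
long_df = df.melt(id_vars="Server", var_name="hour", value_name="Value")

# Map "Hour1" -> 0, "Hour2" -> 1, ... to match the 0-based time column.
# Assumption: column names follow the "Hour<N>" pattern exactly.
long_df["time"] = (
    long_df["hour"].str.replace("Hour", "", regex=False).astype(int) - 1
)

# Drop the helper column, group each server's rows together in time order,
# and put the columns in the target order.
long_df = (
    long_df.drop(columns="hour")
    .sort_values(["Server", "time"], kind="stable")
    .reset_index(drop=True)[["Server", "time", "Value"]]
)

print(long_df)
```

Because `melt` works on whole columns at once, it avoids the repeated `iterrows` and `.loc` appends in the loops above; with 10k+ servers and 168 hours the result would have roughly 1.7 million rows, one per server and hour.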