
When I run the code below over a list of values, I get this error:

-> 3088 raise ValueError('index must be monotonic increasing or decreasing')

However, when I run it for a single value, it executes.

Does not run:

def block(host):
    time_values = failedIP_df.ix[[host]].set_index(keys='index')['timestamp']
    if (return_seconds(time_values[2:3].values[0]) \
      - return_seconds(time_values[0:1].values[0]))<=20:
        blocked_host.append(time_values[3:].index.tolist())

list(map(block, failedIP_list))

Runs:

host='unicomp6.unicomp.net'
block(host)

Sample data:

failedIP_df:

                                       timestamp  index
    host
    199.72.81.55            01/Jul/1995:00:00:01      0
    unicomp6.unicomp.net    01/Jul/1995:00:00:06      1
    freenet.edmonton.ab.ca  01/Jul/1995:00:00:12     12
    burger.letters.com      01/Jul/1995:00:00:12     14
    205.212.115.106         01/Jul/1995:00:00:12     15
    129.94.144.152          01/Jul/1995:00:00:13     21
    unicomp6.unicomp.net    01/Jul/1995:00:00:07    415
    unicomp6.unicomp.net    01/Jul/1995:00:00:08    226
    unicomp6.unicomp.net    01/Jul/1995:00:00:21     99
    129.94.144.152          01/Jul/1995:00:00:14     41
    129.94.144.152          01/Jul/1995:00:00:15     52
    129.94.144.152          01/Jul/1995:00:00:17     55
    129.94.144.152          01/Jul/1995:00:00:18     75
    129.94.144.152          01/Jul/1995:00:00:21     84

failedIP_list = ['199.72.81.55', '129.94.144.152', 'unicomp6.unicomp.net']

Sample output: index values for all hosts that failed to log in within 20 seconds after three attempts

blocked_list=[99, 55, 75, 84]

I want my code to run for every value in the list (i.e. the list of IP addresses). I would really appreciate some help with this. Thanks.

jubins
  • Can you add sample data and desired output? – jezrael Apr 04 '17 at 05:44
  • @jezrael: I have added the sample data and output. Thanks. – jubins Apr 04 '17 at 15:48
  • @jezrael: I have been stuck on this since last night and would really appreciate your help. I have edited the question to make it easier to understand. If anything else is unclear, I'll do my best to explain. I think only a minor correction is needed, but I'm not sure what it is. – jubins Apr 04 '17 at 17:06
  • @DYZ: Can you please help. – jubins Apr 04 '17 at 17:46

1 Answer

print(df)
                                   timestamp  index
host                                               
199.72.81.55            01/Jul/1995:00:00:01      0
unicomp6.unicomp.net    01/Jul/1995:00:00:06      1
freenet.edmonton.ab.ca  01/Jul/1995:00:00:12     12
burger.letters.com      01/Jul/1995:00:00:12     14
205.212.115.106         01/Jul/1995:00:00:12     15
129.94.144.152          01/Jul/1995:00:00:13     21
unicomp6.unicomp.net    01/Jul/1995:00:00:07    415
unicomp6.unicomp.net    01/Jul/1995:00:00:08    226
unicomp6.unicomp.net    01/Jul/1995:00:00:33     99 <-change time for matching
129.94.144.152          01/Jul/1995:00:00:14     41
129.94.144.152          01/Jul/1995:00:00:15     52
129.94.144.152          01/Jul/1995:00:00:17     55
129.94.144.152          01/Jul/1995:00:00:18     75
129.94.144.152          01/Jul/1995:00:00:21     84

import pandas as pd

# convert the timestamp strings to datetimes
df.timestamp = pd.to_datetime(df.timestamp, format='%d/%b/%Y:%H:%M:%S')
failedIP_list = ['199.72.81.55', '129.94.144.152', 'unicomp6.unicomp.net']

# keep only the rows whose host is in failedIP_list
df = df[df.index.isin(failedIP_list)]

# per host: seconds since that host's previous row, and a 0-based row counter
g = df.groupby(level=0)['timestamp']
DIFF = pd.to_timedelta(g.transform(pd.Series.diff)).dt.total_seconds()
COUNT = g.cumcount()

# flag rows more than 20 seconds after the host's previous row,
# or any row after the host's third
mask = (DIFF > 20) | (COUNT >= 3)
L = df.loc[mask, 'index'].tolist()
print(L)
[99, 55, 75, 84]
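
The code above comes without a prose explanation, so here is a minimal plain-Python sketch of the rule the mask appears to encode. A row is flagged if it comes more than 20 seconds after the same host's previous row, or if the host already has three earlier rows. The function name `flagged_indexes` and the parameters `max_gap` and `max_attempts` are illustrative assumptions, not part of the original post. The rows are the filtered sample data from the answer, including the changed 00:00:33 timestamp.

```python
from collections import defaultdict
from datetime import datetime

# (host, timestamp, index) for the hosts in failedIP_list, in frame order.
rows = [
    ("199.72.81.55", "01/Jul/1995:00:00:01", 0),
    ("unicomp6.unicomp.net", "01/Jul/1995:00:00:06", 1),
    ("129.94.144.152", "01/Jul/1995:00:00:13", 21),
    ("unicomp6.unicomp.net", "01/Jul/1995:00:00:07", 415),
    ("unicomp6.unicomp.net", "01/Jul/1995:00:00:08", 226),
    ("unicomp6.unicomp.net", "01/Jul/1995:00:00:33", 99),
    ("129.94.144.152", "01/Jul/1995:00:00:14", 41),
    ("129.94.144.152", "01/Jul/1995:00:00:15", 52),
    ("129.94.144.152", "01/Jul/1995:00:00:17", 55),
    ("129.94.144.152", "01/Jul/1995:00:00:18", 75),
    ("129.94.144.152", "01/Jul/1995:00:00:21", 84),
]


def flagged_indexes(rows, max_gap=20, max_attempts=3):
    """Hypothetical helper mirroring (DIFF > 20) | (COUNT >= 3)."""
    previous = {}            # host -> datetime of that host's previous row
    seen = defaultdict(int)  # host -> number of rows already seen (cumcount)
    flagged = []
    for host, stamp, idx in rows:
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S")
        # The first row of a host has no previous row, like NaN in DIFF.
        gap = (when - previous[host]).total_seconds() if host in previous else None
        if (gap is not None and gap > max_gap) or seen[host] >= max_attempts:
            flagged.append(idx)
        previous[host] = when
        seen[host] += 1
    return flagged


result = flagged_indexes(rows)
print(result)
```

Like the pandas version, this compares each row with the same host's previous row in frame order. It does not sort by timestamp first.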
jezrael