I am seeing odd traffic to my web site. On occasion, probably a few times a day, I will get a flurry of requests for the same URI from the same IP address for several minutes, at rates from maybe one per second up to hundreds per second. Other than the timestamp, the requests seem completely identical. There doesn't seem to be any pattern to the URIs, the IPs, or any other aspect of the requests, other than their being identical within each individual flurry. Notably, the flurries come from all browsers, not just one.
At first glance it looks like a DoS attack, but most of the time there isn't enough traffic for that. Other characteristics of the requests also lead me to believe it's not malicious: many of them come from authenticated users, and every flurry I've investigated happens inside what otherwise looks like a normal session on the site.
I've pretty much come to the conclusion that it's unintentional. But that leads me to believe that either there's something on our web site that confuses browsers into making a multitude of requests, or there is some user behavior that is creating the flurries.
That raises these questions:
- If it's just a user behavior, it seems that this same pattern would exist on web sites other than mine. I haven't heard of any such thing, but it's worth asking if anyone else has seen this type of traffic. So has anyone?
- If there's something in our web content that could cause this, it seems that someone might have encountered it before. Anyone have any insight here?
I've put some throttling in place, and it's not likely to affect performance of the web site, but I'd really like to find some sort of root cause. If I can't find anything out here, I am going to start directly asking the users that I can identify, but I'd rather deal with it internally if at all possible.
The web servers I'm gathering these logs from are behind F5 load balancers. The logs are from Apache, and the entries within a flurry show different timestamps, so it's not a logging error. We can also see side effects of the repeated requests in the database server logs and elsewhere.
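In case anyone wants to check their own logs for the same pattern, here is a rough sketch of how such flurries could be pulled out of an access log. It assumes Apache's common or combined log format. The names `parse_line` and `find_flurries`, the 5-second maximum gap, and the 20-request minimum are illustrative choices only, not settings from my site.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Matches the start of a common/combined log line:
# host ident user [timestamp] "request line" ...
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)"')


def parse_line(line):
    """Return (ip, request_line, timestamp), or None if the line doesn't match."""
    m = LOG_RE.match(line)
    if not m:
        return None
    ip, ts, request = m.groups()
    # Apache's default time format, e.g. 10/Oct/2000:13:55:36 -0700
    when = datetime.strptime(ts, "%d/%b/%Y:%H:%M:%S %z")
    return ip, request, when


def find_flurries(lines, max_gap=timedelta(seconds=5), min_requests=20):
    """Group requests by (ip, request line) and report runs of at least
    min_requests entries in which consecutive entries are no more than
    max_gap apart. Thresholds are assumptions to tune, not known-good values."""
    by_key = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed:
            ip, request, when = parsed
            by_key[(ip, request)].append(when)

    flurries = []
    for (ip, request), times in by_key.items():
        times.sort()
        start = 0
        for i in range(1, len(times) + 1):
            # A run ends at the end of the list or at a gap larger than max_gap.
            if i == len(times) or times[i] - times[i - 1] > max_gap:
                count = i - start
                if count >= min_requests:
                    flurries.append((ip, request, times[start], times[i - 1], count))
                start = i
    return flurries
```

Calling `find_flurries` on an open log file (iterating over its lines) returns one tuple per flurry: IP, request line, first timestamp, last timestamp, and request count. Because it groups on the whole request line, requests that differ only in query string are counted separately, which matches the "completely identical" pattern described above.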
It's possible that users are scraping data, but it seems unlikely. I'm hoping to find a technical explanation first. If I can't find one, I'll move to looking for a social explanation.