
I have installed Fail2Ban v0.10.2 on Ubuntu 18.04 with Apache 2.4.29 and enabled the standard ssh and apache jails for basic protection, with email notifications when an IP is blocked.

Having a look at the documentation, I was not able to find a relevant filter that would help with the following situation:

I would like to ban IPs that hit the server and produce large numbers of 404 errors via fake URL requests, which is typical spam-bot behavior. Ideally, an IP that produces more than three 404 errors in a row would be blocked, with some exceptions for official search engine crawlers.

Is there a default regex for this situation?

I would appreciate your assistance on how to implement this.

3 Answers


I recommend you start by implementing the built-in apache-noscript filter for Fail2Ban. To do so, add the following lines to /etc/fail2ban/jail.local:

[apache-noscript]
enabled  = true
port     = http,https
filter   = apache-noscript
logpath  = /var/log/apache2/*error.log
maxretry = 3
bantime  = 600

Tweak the bantime setting to your liking and consider implementing the recidive filter/jail for repeat offenders (a sketch follows below).

Note: there is a possible bug with the filter regex
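As for the recidive jail, here is a minimal sketch for jail.local, assuming Fail2Ban's default log location. The recidive filter ships with Fail2Ban, so only the jail section is needed; the values (a one-week ban after five bans within a day) are just an example:

[recidive]
enabled   = true
logpath   = /var/log/fail2ban.log
banaction = %(banaction_allports)s
bantime   = 604800
findtime  = 86400
maxretry  = 5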

uSlackr

If Apache is not the component that handles 404s, e.g. a dynamic application server or a reverse proxy does, those errors never reach the error_log, so the apache-noscript filter is not the best solution. A quick custom filter is then an option, and writing one teaches you how to do it, which is useful because a unique problem often makes it worth elaborating the filter later.

Before starting, make sure the 404 responses actually return a status code of 404, as opposed to a 200 that serves a "404 page".
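A quick way to verify, assuming curl is installed; the URL is a made-up page on your own server:

curl -s -o /dev/null -w "%{http_code}\n" http://localhost/definitely-not-a-real-page

This prints only the status code, which should be 404.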

To make a 404 jail we need to make a custom filter, for which we need a working regex.

sudo fail2ban-regex /var/log/httpd/access_log "^<HOST> .* /.* 4\d\d .*$"

(The log is /var/log/httpd/access_log on CentOS, but /var/log/apache2/access.log on Ubuntu.) This will test the regex ^<HOST> .* /.* 4\d\d .*$ against the access log, and the match summary will look something like:

Lines: 2959 lines, 0 ignored, 1207 matched, 1752 missed

If you later have a more complex case, say 404s against a given URL being acceptable, fail2ban-regex takes a third argument for an ignore regex, as sketched after the next command. If the regex does not match anything, see what a log entry actually looks like:

sudo tail /var/log/httpd/access_log
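As for the ignore regex mentioned above, a sketch of the three-argument form, with a hypothetical rule that 404s on /favicon.ico should not count:

sudo fail2ban-regex /var/log/httpd/access_log "^<HOST> .* /.* 4\d\d .*$" "^<HOST> .* /favicon\.ico .*$"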

In the regex, <HOST> matches the visitor's IP and / is the requested page. Once we are happy, we can create a custom filter in the /etc/fail2ban/filter.d folder, say called app404.conf (.conf, not .config as for Apache2 and the like; the precise extension is important):

[Definition]
failregex   = ^<HOST> .* /.* 4\d\d .*$

Note the lack of double quotes. Test it again with

sudo fail2ban-regex /var/log/httpd/access_log /etc/fail2ban/filter.d/app404.conf
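If, as the question asks, official search engine crawlers should be exempt, one option is an ignoreregex in the same filter file matching their user-agent strings. This is only a sketch: it assumes the combined log format (which includes the user agent), and user agents can be spoofed, so verifying crawler IPs, e.g. via the jail's ignoreip option, is more robust:

[Definition]
failregex   = ^<HOST> .* /.* 4\d\d .*$
ignoreregex = (Googlebot|bingbot)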

Add a new jail to jail.local:

[app404]
enabled  = true
filter   = app404
port     = http,https
logpath  = %(apache_access_log)s
bantime  = 3600
findtime = 60
maxretry = 10

And reload Fail2Ban:

fail2ban-client reload
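To confirm the jail is active and watching the right log file:

sudo fail2ban-client status app404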

Test it from a different machine/IP (one never knows) and then unban that IP:

sudo fail2ban-client set app404 unbanip x.x.x.x

Mostly harmless

In a modern web app, 404s are mostly harmless, and blocking the offenders mainly keeps the logs cleaner and reduces odd errors. However, one cannot assume there is no vulnerability just because one is using the latest tech, so it is nice to make sure for oneself. Running a tool the hackers themselves misuse, a vulnerability scanner, against one's own server is therefore a good call. A popular one is the open-source Nuclei project.
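A minimal invocation, assuming Nuclei is installed and aimed at a server you own (never scan hosts you are not authorized to test):

nuclei -u https://your-server.example.com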

Matteo Ferla

I recommend configuring Apache not to log anything for the locations on which you frequently get 404s. That way, the CPU and disk I/O saved on writing the logs can be used for your real visitors.
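One way to do this is with Apache's SetEnvIf plus a conditional CustomLog; a sketch, where /wp-login.php stands in for whatever hypothetical path the bots keep probing (requires mod_setenvif):

# mark requests for a commonly probed path, then exclude them from the access log
SetEnvIf Request_URI "^/wp-login\.php" dontlog
CustomLog /var/log/apache2/access.log combined env=!dontlog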

CPU/IO time is also saved when you don't need fail2ban to scan through the logs.

Every real visitor is saved from being subject to IP/nftables rules slowing down their access.

You'll also be saved the anguish of looking at the logs and focusing on the background noise of the internet rather than the real visitors you care about.

danblack
  • PS, don't take me for a fail2ban hater; I was a core maintainer for a number of years before the number of requests like this were too much and distracted from writing real security benefits, so I quit. – danblack Sep 15 '18 at 11:17
  • I run a number of websites and do not 'frequently' get 404s. The majority of 404s are people attempting to find exploits on the websites, and I believe it is perfectly reasonable to block multiple attempts that result in a 404. – ChrisBint Dec 01 '18 at 15:17
  • That's actually very bad advice. If someone is trying to break your website, there should be traces and warnings about it, not silent ignores, because of "real visitors". If your server has trouble logging errors, then you should buy a better server, not ignore this problem. – quamis Mar 20 '19 at 07:19
  • 404s are harmless. Automatically classing them as a potential threat is a stretch, considering most are just automated, and even the concept that they could do damage is remote. Deploying f2b for low-end noise distracts from serious effort in hardening and vulnerability work, in the same way AV acts as a deficient placebo for managing malware risk. This is a really hard concept to grasp when you're invested in classifying 404 messages as threats and gain euphoria from seeing bans and fewer log messages. This relief is a distraction from doing stronger access controls and code review that will actually protect. – danblack Mar 20 '19 at 07:57
  • Saying 404s are harmless is ridiculous. If someone is port scanning your server and stumbles across an open port, is port scanning also harmless? – Barry May 14 '21 at 23:35
  • @Barry stick to the topic and don't try making a generality to make it true. If you have a 404 then you are running a server, so an http port is exposed. Processing a 404 on the web server takes very little time; fail2ban processing web logs to generate iptables rules takes more CPU time. You're taking away CPU time from your real visitors to save yourself from making monsters out of noisy bots looking for non-existent exploits. And yes, port scanning is harmless, because you shouldn't run services that don't need to be exposed. Use fail2ban to block brute force attempts that could result in damage only. – danblack May 15 '21 at 00:10
  • @danblack Sorry but you could not be more wrong. Fail2ban has a rolling ban/unban system, so the CPU overhead you speak of only exists under a DDoS scenario. And in that scenario, you shouldn't be relying on IP rules, let alone analyzing HTTP error codes via log files. On topic would be, per the title, blocking erroneous error codes from someone scanning your server for exploits. Saying that's not effective is ridiculous at a base level, let alone a deep level. That's fail2ban's precise job: analyze logs, block IPs. I am entirely on topic, you are not, which is why I gave the port scan example. – Barry May 15 '21 at 01:50
  • It's truly fascinating to see someone suggest not using protection in a security related topic. If we were talking about ddos, which we were not and are not, then I would agree there are far better alternative solutions. Like cloudflare @ dns level, or modsecurity w/ owasp. But to outright tell someone not to use fail2ban.... And to take that even a step further and say port scanning is harmless is beyond all recourse. If you can stop someone port scanning, they lose intel. If you can stop scripts testing for exploits, hacker loses. This is security 101, like bare minimum understanding... – Barry May 15 '21 at 02:02
  • Risk assessment. It's what security is about. And read the answers properly. – danblack May 15 '21 at 03:55