A strange bot (GbPlugin) is URL-encoding the URLs of my images, which causes 404 errors.
I tried to block the bot by adding the following at the bottom of my .htaccess, but it didn't work.

Options +FollowSymlinks  
RewriteEngine On  
RewriteBase /  
RewriteEngine on  
RewriteCond %{HTTP_REFERER} !^$  
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]  
RewriteCond %{HTTP_USER_AGENT} ^GbPlugin [NC]  
RewriteRule .* - [F,L]     

A sample log entry is below.

201.26.16.9 - - [10/Sep/2011:00:06:05 -0300] "GET /wp%2Dcontent/themes/my_theme%2Dpremium/scripts/timthumb.php%3Fsrc%3Dhttp%3A%2F%2Fwww.example.com%2Fwp%2Dcontent%2Fuploads%2F2011%2F08%2Fmy_image_name.jpg%26w%3D100%26h%3D65%26zc%3D1%26q%3D100 HTTP/1.1" 404 1047 "-" "GbPlugin"

Sorry for my language mistakes

Keyur Shah
Vera

3 Answers

Here's what you can put in your .htaccess file:

Options +FollowSymlinks  
RewriteEngine On  
RewriteBase /  
SetEnvIfNoCase Referer "^$" bad_user
SetEnvIfNoCase User-Agent "^GbPlugin" bad_user
SetEnvIfNoCase User-Agent "^Wget" bad_user
SetEnvIfNoCase User-Agent "^EmailSiphon" bad_user
SetEnvIfNoCase User-Agent "^EmailWolf" bad_user
SetEnvIfNoCase User-Agent "^libwww-perl" bad_user
Deny from env=bad_user

This will return:

HTTP request sent, awaiting response... 403 Forbidden
2011-09-10 11:15:48 ERROR 403: Forbidden.
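
The `Deny from env=` directive above belongs to the older access-control syntax. As a hedged sketch, assuming Apache 2.4 with mod_authz_core and mod_setenvif loaded (an assumption; the thread does not state the server version), the same idea could be expressed with `Require`. Only the GbPlugin line is repeated here to keep the example short:

```apache
# Flag requests whose User-Agent starts with "GbPlugin" (case-insensitive)
SetEnvIfNoCase User-Agent "^GbPlugin" bad_user

# Apache 2.4 style: allow everyone except requests flagged as bad_user
<RequireAll>
    Require all granted
    Require not env bad_user
</RequireAll>
```

On a 2.4 server without mod_access_compat, `Deny from` is not available, so this form may be the one that applies.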
Book Of Zeus
  • Ok, thank you. Uploading now. I will report the result. – Vera Sep 10 '11 at 16:31
  • An easy way to test is to use wget. This is what wget returned when I requested my site. – Book Of Zeus Sep 10 '11 at 16:32
  • This line blocks access by the Facebook external hit crawler: `5092 "-" "facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)"`; the referer is empty. I removed this line and am testing again – Vera Sep 11 '11 at 00:09
  • Sorry, I forgot to name the line: SetEnvIfNoCase Referer "^$" bad_user – Vera Sep 11 '11 at 00:16
  • So if you remove the line `SetEnvIfNoCase Referer "^$" bad_user`, it works? – Book Of Zeus Sep 11 '11 at 00:19
  • I tested using Wget and it is blocked, thank you. But it doesn't stop the 404 errors when the bot requests nonexistent images. – Vera Sep 12 '11 at 10:48
  • You can add this: "ErrorDocument 403 /path or domain". That way you can customize where the bot goes. – Book Of Zeus Sep 12 '11 at 10:57
  • I had already done this: "ErrorDocument 403 /erro/403.shtml". But even so, the error is still generated (thousands a day): 187.82.189.95 - - [12/Sep/2011:01:25:28 -0300] "GET /wp%2Dcontent/themes/my_theme%2Dpremium/scripts/timthumb.php%3Fsrc%3Dhttp%3A%2F%2Fwww.example.com%2Fwp%2Dcontent%2Fuploads%2F2011%2F08%2Fmy_image.jpg%26w%3D100%26h%3D65%26zc%3D1%26q%3D100 HTTP/1.1" 404 1047 "-" "GbPlugin". I would like to block all access so that there are not so many 404 errors in the log – Vera Sep 12 '11 at 11:20
  • Thinking aloud: maybe this URL could be redirected to an existing file to avoid so many 404s. I don't know how to do this; I just wondered whether it would be possible. – Vera Sep 12 '11 at 11:26
  • Finally, all accesses from GbPlugin are blocked. Thank you for your interest and collaboration. You helped me to find the solution. – Vera Sep 12 '11 at 16:55
  • @BookOfZeus: You start all those expressions with '^' which I thought meant beginning of string. But don't some of those patterns occur in the middle of the user agent string? Also, are the symlinks and rewrite lines part of the solution or just other things that happened to be in your file? – WGroleau Jan 09 '16 at 18:45
  • @WGroleau Sorry for the late response. These are the most common ones that start at the beginning of the string, hence the ^. Yes, they can occur in the middle of the string, so it is good to have them without the ^ (I copied and pasted this, as default practice) – Book Of Zeus Apr 03 '16 at 21:58
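
Following the last two comments, here is a minimal sketch of the same block list with the `^` anchors dropped, so each token matches anywhere in the User-Agent header. The `Referer "^$"` line is left out on the assumption that requests without a referer (such as the facebookexternalhit request quoted in the comments) should still be allowed; whether that suits a given site is a judgment call.

```apache
# Unanchored patterns: the token may appear anywhere in the User-Agent header
SetEnvIfNoCase User-Agent "GbPlugin" bad_user
SetEnvIfNoCase User-Agent "Wget" bad_user
SetEnvIfNoCase User-Agent "EmailSiphon" bad_user
SetEnvIfNoCase User-Agent "EmailWolf" bad_user
SetEnvIfNoCase User-Agent "libwww-perl" bad_user

# Refuse every request that was flagged above
Deny from env=bad_user
```

Unanchored patterns are broader, so a short token could also match an unrelated user agent that happens to contain it.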
May I recommend this method:

Put this in the .htaccess file in the root of your site.

ErrorDocument 503 "Your connection was refused"
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^(Mozilla.*537.36|Mozilla.*UCBrowser\/9.3.1.344)$ [NC]
RewriteRule .* - [R=503,L]

Where

^(Mozilla.*537.36|Mozilla.*UCBrowser\/9.3.1.344)$

matches the two user agents I wanted to block in this example.

You can use regex so a useragent like

Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0

could be

Mozilla.*Firefox\/40.0

^ anchors the match at the beginning and $ at the end, so you could block just one user agent with:

ErrorDocument 503 "Your connection was refused"
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Firefox\/40.0$ [NC]
RewriteRule .* - [R=503,L]

Or add several using the | character to separate them inside ( and ) like in the first example.

RewriteCond %{HTTP_USER_AGENT} ^(Mozilla.*537.36|Mozilla.*UCBrowser\/9.3.1.344)$ [NC]

You can test it by putting your own user agent in the rule and then trying to access the site. http://whatsmyuseragent.com/
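
One detail worth noting about the patterns above: an unescaped `.` in a regular expression matches any single character, so `537.36` would also match a string such as `537x36`. A slightly stricter variant, offered only as a sketch of the same rule with the literal dots escaped, might look like this:

```apache
ErrorDocument 503 "Your connection was refused"
RewriteEngine On
# Literal dots are escaped so that "537.36" only matches an actual dot
RewriteCond %{HTTP_USER_AGENT} ^(Mozilla.*537\.36|Mozilla.*UCBrowser/9\.3\.1\.344)$ [NC]
RewriteRule .* - [R=503,L]
```

The `.*` parts are left unescaped on purpose, since they are meant to match any run of characters between the fixed parts.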

Don King
To block empty referers, you can use the following rule:

RewriteEngine on

RewriteCond %{HTTP_REFERER} ^$
RewriteRule ^ - [F,L]

This will forbid all requests to your site whose HTTP_REFERER value is empty (that is, matches ^$).

To block user agents, you can use

RewriteEngine on

RewriteCond %{HTTP_USER_AGENT} opera|firebox|foo|bar [NC]
RewriteRule ^ - [F,L]

This will forbid all requests to your site if HTTP_USER_AGENT matches the Condition pattern.
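
Applied to the bot from the question, a minimal sketch could look like the following. It deliberately has no referer condition: the logged GbPlugin requests carry an empty referer ("-"), so a condition requiring a non-empty referer, like the `!^$` line in the question's rule, would likely keep the rule from ever matching them. Blocking empty user agents as well is carried over from the question and is an assumption about what is wanted.

```apache
RewriteEngine On

# Match an empty User-Agent OR one containing "GbPlugin" (case-insensitive)
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
RewriteCond %{HTTP_USER_AGENT} GbPlugin [NC]
# Answer 403 Forbidden; the F flag stops further rule processing on its own
RewriteRule ^ - [F]
```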

Amit Verma