How to Kill Website Spambots Dead
If your website has been up for more than a month or so, spambots will eventually target it. If your server supports .htaccess files, you can use one to stop the spambots dead in their tracks. I'm not talking only about spambots, because some of the most virulent worms use the same tactics that regular spambots do. There are ways to do what I'm going to describe without an .htaccess file, but you have to be able to do a little PHP programming. If your web server doesn't support PHP, you'll have to find some other way to do it.
Using WordPress as an Example
I don't like telling people to mess with their .htaccess files because something as simple as a trailing space can cause the web server to return an error code instead of one of your pages. If you're careful, however, nothing you do is irreversible. I recommend keeping a backup copy of the file on your own computer, just to be safe. When you enter the rules, check that there are no extra spaces or stray characters. I'll provide you with some code you can use.
With WordPress, the entire application loads whenever a spambot hits, just to throw up a 404 error page. That isn't good, especially when they do it 20 times in a row. It hogs server resources and makes it hard for anyone else (real people) to connect. While the redirection plugin works well for other types of redirections, the best thing to do with spambots is to redirect them away from the WordPress installation using an .htaccess file. What I have done is create an empty text file called "goodbye.txt" in the root directory, so the redirects have somewhere cheap to land.
A WordPress .htaccess file looks like this, if permalinks are turned on:
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
If you have permalinks turned on, you can insert some lines to take care of the nasties between the second and third lines. If you don't have permalinks turned on, you can copy the module tag lines and insert some lines between them.
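For the no-permalinks case, here is a minimal sketch of what copying the module tag lines could look like. It borrows one rule from the list further down purely as a placeholder. One assumption of mine that goes beyond the instructions above: I put the block in its own IfModule section above the "# BEGIN WordPress" marker, because WordPress can rewrite whatever sits between its BEGIN and END markers when it regenerates the permalink rules.

```apache
# Custom spambot rules in their own block.
# Assumption: this sits ABOVE "# BEGIN WordPress" (or stands alone
# if the WordPress permalink block is absent).
<IfModule mod_rewrite.c>
RewriteEngine On
# Placeholder rule taken from the list below; add the others here.
RewriteRule ^(.*)\.asp$ http://redirect.invalid [R=301,L]
</IfModule>
```

Because the redirect rules end in [L], a request that matches one of them never reaches the WordPress rules further down the file.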
Here are a few that I use:
# Redirect bad URIs
#
RewriteRule ^(.*)\+GET\+(.*)$ http://redirect.invalid [R=301,L]
RewriteRule ^(.*)\.asp$ http://redirect.invalid [R=301,L]
RewriteRule ^(.*)\.aspx$ http://redirect.invalid [R=301,L]
RewriteRule ^(.*)\.rar$ http://redirect.invalid [R=301,L]
RewriteRule ^(.*)MSOffice(.*)$ http://redirect.invalid [R=301,L]
RewriteRule ^(.*)\_vti\_bin(.*)$ http://redirect.invalid [R=301,L]
#
# Redirect bad QUERY STRINGs
#
RewriteCond %{QUERY_STRING} ^(.*)\?$ [NC,OR]
RewriteCond %{QUERY_STRING} ^(.*)/proc/self/(.*)$ [NC,OR]
RewriteCond %{QUERY_STRING} ^(.*)/etc/passwd(.*)$ [NC,OR]
RewriteCond %{QUERY_STRING} ^(.*)union\+select(.*)$
RewriteRule ^(.*)$ http://redirect.invalid/ [R=301,L]
You're welcome to use these. Just make sure there's one space only between "RewriteRule" and the rule, one space between the rule and the URL (and use your own URL), one space between the URL and the first bracket and no spaces after the bracket in each line.
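If you want the redirects to land on the empty goodbye.txt file mentioned earlier instead of an external URL, something like the following might work. Treat it as a sketch, not my exact file: it assumes goodbye.txt sits in the web root, it covers only some of the patterns above, and it folds several of them into single rules with regex alternation. The trailing question mark on the target is my addition. It drops the original query string so that the redirected request does not match the query string conditions a second time and loop.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# Assumption: goodbye.txt is an empty file in the web root.
# Bad URIs: .asp/.aspx pages, .rar archives, MSOffice and _vti_bin probes.
RewriteRule \.(asp|aspx|rar)$ /goodbye.txt? [R=301,L]
RewriteRule (MSOffice|_vti_bin) /goodbye.txt? [R=301,L]
# Bad query strings. The trailing "?" on the target discards the query
# string, which avoids a redirect loop back into this rule.
RewriteCond %{QUERY_STRING} (/proc/self/|/etc/passwd|union\+select) [NC]
RewriteRule ^ /goodbye.txt? [R=301,L]
</IfModule>
```

Since goodbye.txt is a real file, the WordPress "!-f" condition fails for it and Apache serves it directly without loading WordPress.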
Your mileage may vary. You should check your logs or use a plugin that shows you 404 errors in order to catch other nasty critters in action. The redirection plugin shows me the errors. With these lines, I've noticed a significant drop in QOS alerts (I administer my own server). I still have some memory-consuming things happening for other reasons, but these were causing the most problems.
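As you find more patterns in your logs, an alternative to redirecting is to refuse the request outright. This is not what I use above, just a hedged sketch of the same idea with mod_rewrite's F flag, which makes Apache answer with 403 Forbidden. The pattern reuses the .asp/.aspx example from my list; swap in whatever your own logs turn up.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# "-" means "no substitution"; [F] returns 403 Forbidden and stops
# rule processing for this request.
RewriteRule \.(asp|aspx)$ - [F]
</IfModule>
```

A 403 keeps the bot traffic out of your 404 reports, which makes genuine missing-page hits easier to spot.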


Hi RT,
Thanks for the very useful list of rules! I've been getting a few hits like this and wondering how to stop it.
Just a suggestion: Instead of a 404-response which might also be reached by legitimate visitors, I suggest a 403-forbidden for the spam bots, so that you can analyze your genuine "lost pages" traffic. It has the same effect.
Also could you write a bit of explanation on what exactly the rewrite rule does? Regular expressions can be quite misleading sometimes.
I'll have to do that in another post. In the meantime, I made a mistake when I pasted the first line. The question marks should be encoded or it won't work.
Thanks for this useful information. I really need this if I'm going to create my own website.
mohel
This is a bit advanced for me and I need some time to figure out how it works. By the way, will it also block Googlebot and other search engine access?
I am using the Bad Behavior plugin + Akismet, which can block almost 90% of spam comments. But then I saw the memory usage and bandwidth increase…
That particular code won't mess with search engine access. If you're not familiar with regular expressions and need some time to learn how they work, I suggest using the redirection plugin I mentioned. It will definitely help.
Hi RT. Thanks for the advice. I will try the redirection plugin and see how it works. Thanks.
Hi RT,
Wow, good stuff. You've really got some good stuff going lately with this and the spam redirects. I'm going to have to read up on this – I can normally keep up with your technical posts, but .htaccess rewrite rules are my weakness.
Thanks for the helpful information. I really need this for creating my own website.
You can try the Bad Behavior script for some of these purposes too.
http://bad-behavior.ioerror.us
A bit complicated for me, but maybe it is a good idea to get up to speed on this. In the meantime I use Akismet.
Thanks a lot for the post. Most websites I manage are on WordPress, and the bots are a real pain in the backside for me.
–
CCTV Eastbourne