Ever want to ban a specific robot from browsing your hosted website? For example, perhaps there's a bot that ignores the robot exclusion rules. Here's how to ban a specific robot or browser from your site. The same technique is useful for many other tasks, such as changing the server's behavior based on specific HTTP headers. See Apache's documentation on Module mod_setenvif and Environment Variables in Apache for more info.
It only takes two lines in your httpd.conf file to ban any specific browser or robot. The first is in the <IfModule mod_setenvif.c> section. Somewhere in that section, add:
    BrowserMatch "some_text" badbot

Replace some_text with a word that appears in the User-Agent HTTP request header field. For instance, if you wanted to (not that you would!) block the Mozilla browser from your site, use "Mozilla" to replace "some_text".
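As a rough sketch of what this might look like in practice, the fragment below flags a made-up crawler. "ExampleBot" and "spam.example.com" are hypothetical names chosen for illustration, not real User-Agent or Referer values. BrowserMatchNoCase is mod_setenvif's case-insensitive variant, and SetEnvIf is the general form that can test other request headers as well. The pattern is treated as a regular expression, and you would normally pick just one of these lines.

```apache
<IfModule mod_setenvif.c>
    # Case-sensitive match on the User-Agent header
    BrowserMatch "ExampleBot" badbot

    # Same idea, ignoring case
    BrowserMatchNoCase "examplebot" badbot

    # Long form of BrowserMatch: name the header explicitly
    SetEnvIf User-Agent "ExampleBot" badbot

    # The same mechanism applied to a different header (hypothetical host)
    SetEnvIf Referer "spam\.example\.com" badbot
</IfModule>
```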
The second line goes in the <Directory> section for your document root. Add the Deny line right after the Order directive:

    <Directory "/Library/WebServer/Documents">
        Options Indexes FollowSymLinks MultiViews
        AllowOverride All
        Order allow,deny
        # The next line is the new addition
        Deny from env=badbot
        Allow from all
    </Directory>

What we did is quite straightforward. If the User-Agent field contains "some_text" anywhere inside it, the environment variable 'badbot' is set; otherwise it does not exist. Then, when access is checked, the client is denied if 'badbot' exists. Because the order is allow,deny, the Deny rule is evaluated last and overrides "Allow from all" for matching clients.
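The Order, Allow, and Deny directives above are the access-control syntax of the Apache versions current when this hint was written. On Apache 2.4 and later, access control is normally expressed with Require directives instead. The block below is a minimal sketch of an equivalent rule, assuming a stock 2.4 setup with mod_authz_core loaded and the same 'badbot' variable set by BrowserMatch; check it against your own server's documentation before relying on it.

```apache
<Directory "/Library/WebServer/Documents">
    Options Indexes FollowSymLinks MultiViews
    AllowOverride All
    <RequireAll>
        # Allow everyone...
        Require all granted
        # ...except requests where the badbot variable is set
        Require not env badbot
    </RequireAll>
</Directory>
```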
Mac OS X Hints
http://hints.macworld.com/article.php?story=20030519102221674