Shows a simple way to log requests and deny those that come from <enter annoying bot name here>. It can be turned on or off with a database entry, without causing an app recycle.
If you have ever had a web site hit during peak hours by a nasty crawler or bot that doesn't fully observe the robots exclusion standard (robots.txt), tying up lots of your pages and causing heavy database access, then you know that you need good metrics to identify the problem.
This is a simple logging class that:
1) Grabs key information from each request and logs it into a SQL Server table.
2) Can be programmed to identify certain "nastybots" via their User-Agent string and reply with a 401 Access Denied.
3) Can easily be turned on and off by simply updating a row in a SQL Server Database table, which will NOT cause an application restart.
The basic concept here is to intercept a request before Page processing and any database access have begun. The easiest way to do that is to handle the application's PreRequestHandlerExecute event by adding an Application_PreRequestHandlerExecute method to Global.asax, where you can simply make a static class method call, like so:
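A minimal sketch of that hook, assuming the logging class is called RequestLogger and that LogRequest takes the current HttpContext. Both the class name and the parameter are assumptions; the article only names the static LogRequest method.

```csharp
// Global.asax.cs (sketch for an ASP.NET application on the .NET Framework).
using System;
using System.Web;

public class Global : HttpApplication
{
    // ASP.NET raises this event just before the page (or other handler) executes,
    // so no Page object exists yet when LogRequest runs.
    protected void Application_PreRequestHandlerExecute(object sender, EventArgs e)
    {
        RequestLogger.LogRequest(HttpContext.Current);
    }
}

// Placeholder so the sketch compiles on its own. The name RequestLogger and the
// HttpContext parameter are hypothetical, not taken from the article.
public static class RequestLogger
{
    public static void LogRequest(HttpContext context)
    {
        // Logging and the bot check would go here.
    }
}
```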
When this call is made to the LogRequest method, it checks two private fields, _loggingOn and _denyBots, and behaves accordingly. If _loggingOn is true, it grabs the items we want from the Request object and writes a row into your Requests SQL Server table. The list I have is short, but you can add many more items if your needs differ.
If _denyBots is true, it performs an advanced "IsCrawler" check using Regex test strings of your choosing and, on a match, issues a 401 Access Denied response. That stops the bot dead in its tracks before it can do any damage. Not even a Page object is created.
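The User-Agent check can be sketched roughly as follows. The class name BotFilter, the method name IsDeniedBot, and both patterns are made up for illustration; substitute the User-Agent fragments you actually see in your own logs.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BotFilter
{
    // Hypothetical patterns, not taken from the article.
    private static readonly Regex[] DeniedAgentPatterns =
    {
        new Regex(@"nastybot", RegexOptions.IgnoreCase | RegexOptions.Compiled),
        new Regex(@"evil-?crawler/\d+", RegexOptions.IgnoreCase | RegexOptions.Compiled)
    };

    // Returns true when the User-Agent matches any denied pattern.
    // A missing or empty User-Agent is treated as "not a known bad bot".
    public static bool IsDeniedBot(string userAgent)
    {
        if (string.IsNullOrEmpty(userAgent))
        {
            return false;
        }

        foreach (Regex pattern in DeniedAgentPatterns)
        {
            if (pattern.IsMatch(userAgent))
            {
                return true;
            }
        }

        return false;
    }
}
```

Inside the request hook, a match would then be answered by setting `Response.StatusCode = 401` and calling `Response.End()`, so the request never reaches the page.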
The class populates the two state variables itself, through a method that checks the Cache and reloads the values from the database every 10 minutes. So you can change the state in the database and know that the class will pick up the change within ten minutes, without recycling your app as rewriting web.config or a similar file might do.
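The ten-minute refresh can be illustrated with a small stand-alone sketch. Note the substitution: instead of the ASP.NET Cache the article describes, this version keeps a timestamp and reloads once the interval has passed, so that it runs without System.Web. The type names, the loader delegate, and the injectable clock are all assumptions made for the example. In an ASP.NET application, a similar effect comes from inserting the settings into HttpRuntime.Cache with an absolute expiration ten minutes out and reloading when the entry is missing.

```csharp
using System;

// Hypothetical container for the two flags the article calls _loggingOn and _denyBots.
public sealed class LoggerSettings
{
    public bool LoggingOn;
    public bool DenyBots;
}

public sealed class RefreshingSettings
{
    private static readonly TimeSpan RefreshInterval = TimeSpan.FromMinutes(10);

    private readonly Func<LoggerSettings> _load;  // e.g. a query against the settings row
    private readonly Func<DateTime> _utcNow;      // injectable clock, handy for testing
    private readonly object _sync = new object();

    private LoggerSettings _current;
    private DateTime _loadedAtUtc;

    public RefreshingSettings(Func<LoggerSettings> load, Func<DateTime> utcNow)
    {
        _load = load;
        _utcNow = utcNow;
    }

    // Returns the cached settings, reloading them on first use and again
    // whenever at least ten minutes have passed since the last load.
    public LoggerSettings Get()
    {
        lock (_sync)
        {
            DateTime now = _utcNow();
            if (_current == null || now - _loadedAtUtc >= RefreshInterval)
            {
                _current = _load();
                _loadedAtUtc = now;
            }
            return _current;
        }
    }
}
```

Because the flags are read through Get() on every request, a change to the database row takes effect at the next reload, with no edit to any file the application watches.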
Here's the code for the logging class: