Apache Logging Filter Robots

Problem

Sick of wading through loads of logs, or struggling to tell real hits from the robots? 🙂

Seriously reduce your Apache web logs by filtering out images, style sheets and your own hits.



Solution

Simple with Apache’s CustomLog and SetEnvIf directives.

I’ve also included capturing the user-agent in a separate file, as well as the referer, which is brill for seeing which Google searches brought traffic to you.

You can still capture robot hits and your own hits in separate logs; here is how.



Example


    # Flag static files, icons and the sitemap so they can be skipped
    SetEnvIf Request_URI "\.(png|gif|jpg|js|css)" image-req
    SetEnvIf Request_URI "favicon.ico" image-req
    SetEnvIf Request_URI "/icons" image-req
    SetEnvIf Request_URI "sitemap.xml.gz" image-req
    # Flag your own hits (swap 127.0.0.1 for your own address)
    SetEnvIf Remote_Addr "127.0.0.1" image-req
    SetEnvIf Remote_Addr "127.0.0.1" home-req
    # Flag robots by user agent
    SetEnvIf User-Agent "(Googlebot|msnbot|Spider|crawl|slurp|Jeeves|Mediapartners|FeedBurner)" image-req
    SetEnvIf User-Agent "(Googlebot|msnbot|Spider|crawl|slurp|Jeeves|Mediapartners|FeedBurner)" bot-req

    # Main log and user-agent log: only hits that were not flagged above
    # (\n puts the referer on its own line; drop it to keep each hit on one line)
    CustomLog logs/access_log.techieblogs "[\"%{Referer}i\"]\n %h %l %u %t \"%r\" %>s %b" env=!image-req
    CustomLog logs/access_log.agents.techieblogs "%h [\"%{Referer}i\"] [\"%{User-agent}i\"]" env=!image-req
    # Robot hits and own hits each go to their own log
    CustomLog logs/access_log.bots.techieblogs "[\"%{Referer}i\"] %h %l %u %t \"%r\" %>s %b" env=bot-req
    CustomLog logs/access_log.home.techieblogs "[\"%{Referer}i\"] \n %h %l %u %t \"%r\" %>s %b" env=home-req
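If you want to sanity check which log a given hit would end up in before touching the server, here is a rough Python sketch that mimics the rules above. It is only an approximation of what Apache does: the function name target_logs and the log labels "main", "agents", "bots" and "home" are made up for illustration, the dots in the patterns are escaped, and the sample address 203.0.113.7 is just a placeholder.

```python
import re

# Patterns adapted from the SetEnvIf lines in the example. SetEnvIf matches
# case-sensitively and unanchored, so re.search without flags is the closest fit.
STATIC = re.compile(r"\.(png|gif|jpg|js|css)|favicon\.ico|/icons|sitemap\.xml\.gz")
BOTS = re.compile(r"(Googlebot|msnbot|Spider|crawl|slurp|Jeeves|Mediapartners|FeedBurner)")
OWN_ADDR = re.compile(r"127\.0\.0\.1")  # assumption: stands in for your own address


def target_logs(uri, remote_addr, user_agent):
    """Return the (hypothetical) log labels a request would be written to."""
    env = set()
    if STATIC.search(uri):
        env.add("image-req")
    if OWN_ADDR.search(remote_addr):
        env.update({"image-req", "home-req"})
    if BOTS.search(user_agent):
        env.update({"image-req", "bot-req"})

    logs = set()
    if "image-req" not in env:   # env=!image-req
        logs.update({"main", "agents"})
    if "bot-req" in env:         # env=bot-req
        logs.add("bots")
    if "home-req" in env:        # env=home-req
        logs.add("home")
    return logs


if __name__ == "__main__":
    samples = [
        ("/index.html", "203.0.113.7", "Mozilla/5.0"),
        ("/logo.png", "203.0.113.7", "Mozilla/5.0"),
        ("/index.html", "203.0.113.7", "Googlebot/2.1"),
        ("/index.html", "127.0.0.1", "Mozilla/5.0"),
    ]
    for uri, addr, agent in samples:
        print(uri, addr, agent, "->", sorted(target_logs(uri, addr, agent)))
```

A normal page view lands in the main and agents logs, an image request from a visitor is dropped entirely, and robot or own hits only show up in their dedicated logs.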





