Greatly reduce traffic by blocking some nasty spider IPs
October 31st, 2005
A few days ago, I allowed Baiduspider in my Apache httpd.conf so that it could crawl my sites. But it still crawled them in such a stupid way that it drove my database mad, and my server's load average soon climbed to around 50. So I quickly blocked Baiduspider again.
After watching the AWStats report for my Squid log, I decided to block some nasty spiders that do not show their actual identities, unlike Googlebot, the Sohu agent, Yahoo Slurp, and msnbot.
It’s easy to block these IPs in squid.conf:
acl BADIP src 221.218.17.161/32 202.108.1.0/24 220.181.26.0/24 219.142.78.0/24
http_access deny BADIP
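If you don't have AWStats handy, a rough way to find candidate networks for an ACL like BADIP is to count requests per /24 directly from Squid's access.log. Here is a minimal Python sketch of that idea. It assumes Squid's default native log format, where the client address is the third whitespace-separated field; the function name `top_networks` and the log path in the comment are my own placeholders, not anything from my actual setup.

```python
from collections import Counter
import ipaddress


def top_networks(log_lines, prefix=24, limit=10):
    """Count requests per client network in Squid native-format log lines.

    Assumes the client IP is the third whitespace-separated field.
    Returns a list of (network, request_count), busiest first.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        try:
            # strict=False masks the host bits: 202.108.1.7 -> 202.108.1.0/24
            net = ipaddress.ip_network(f"{fields[2]}/{prefix}", strict=False)
        except ValueError:
            continue  # third field is not an IP address
        counts[str(net)] += 1
    return counts.most_common(limit)


# Example usage (the path is an assumption; adjust it to your install):
# with open("/var/log/squid/access.log") as log:
#     for network, hits in top_networks(log):
#         print(hits, network)
```

A busy network is only a suspect, not a verdict: check what it is actually requesting before adding it to the ACL, since a proxy or a large ISP can look just like a spider in these counts.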
There is a good article on configuring Squid; if I have time, I may write one on Squid myself.