How do I stop the frantic "Baidu" crawler?
Posted by Tintin in IT, MyWebsite, tags: apache, Baiduspider, 百度, htaccess, log
Looking through the Apache access log, I found the Baidu crawler hitting the site at a frantic rate.
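1. I do not want to block the Baidu crawler entirely yet.
2. I have already renamed the 09-tibet-photo-show link to 2009-tibet-photo-show, and blocked hotlinking of images under the nextgen gallery directory in the .htaccess file. Requests for pages like 09-tibet-photo-show now only get a 301 redirect.
3. I have set an access password on 2009-tibet-photo-show, but requests for 2009-tibet-photo-show still return a normal 200 response.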
Is there any way to reduce the system load caused by the Baidu crawler? Or are the current settings already enough?
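One option that would fit point 1 is a robots.txt rule that turns Baiduspider away from the old 09-tibet-photo-show URLs only. The sketch below uses Python's standard `urllib.robotparser` to check what such a rule would and would not cover. The robots.txt content is hypothetical (it is not taken from this site), and whether Baiduspider actually honors it is an assumption that would need to be confirmed in later logs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Baiduspider off the old gallery URL only.
ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /09-tibet-photo-show
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Old URL with query parameters, as seen in the access log.
print(parser.can_fetch("Baiduspider", "/09-tibet-photo-show?amp&replytocom=145001&show=gallery"))
# Renamed page: not covered by the rule.
print(parser.can_fetch("Baiduspider", "/2009-tibet-photo-show"))
# Other crawlers are not named in the rule.
print(parser.can_fetch("Googlebot", "/09-tibet-photo-show?pid=463"))
```

A rule like this only addresses the redirected requests. The repeated 200 responses for 2009-tibet-photo-show would need a separate measure.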
123.125.66.25 - - [11/May/2010:14:33:55 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.49 - - [11/May/2010:14:33:56 -0500] "GET /09-tibet-photo-show?amp&replytocom=145646&nggpage=2&show=slide HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.24 - - [11/May/2010:14:33:57 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.31 - - [11/May/2010:14:33:59 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&show=gallery&pid=151&nggpage=12 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.53 - - [11/May/2010:14:33:59 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&show=gallery&pid=150&nggpage=11 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.46 - - [11/May/2010:14:33:59 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&show=gallery&pid=477&nggpage=2 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.35 - - [11/May/2010:14:33:56 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.49 - - [11/May/2010:14:34:00 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.26 - - [11/May/2010:14:34:00 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
66.249.71.166 - - [11/May/2010:14:34:02 -0500] "GET /09-tibet-photo-show?pid=463&nggpage=9&show=slide HTTP/1.1" 301 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.71.166 - - [11/May/2010:14:34:02 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
123.125.66.18 - - [11/May/2010:14:34:00 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.101 - - [11/May/2010:14:34:02 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&show=gallery&nggpage=2&pid=150 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.101 - - [11/May/2010:14:34:03 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.49 - - [11/May/2010:14:34:04 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&pid=480&show=gallery&nggpage=3 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.29 - - [11/May/2010:14:34:04 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&show=gallery&nggpage=11&pid=151 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
220.181.125.71 - - [11/May/2010:14:34:05 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
123.125.66.18 - - [11/May/2010:14:34:05 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.37 - - [11/May/2010:14:34:05 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.107 - - [11/May/2010:14:34:09 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&pid=478&nggpage=7 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.40 - - [11/May/2010:14:34:09 -0500] "GET / HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.107 - - [11/May/2010:14:34:10 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&pid=153&nggpage=13&show=slide HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.50 - - [11/May/2010:14:34:10 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&pid=153&show=slide HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.28 - - [11/May/2010:14:34:10 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&pid=151&show=gallery&nggpage=4 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.18 - - [11/May/2010:14:34:10 -0500] "GET / HTTP/1.1" 200 48778 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.31 - - [11/May/2010:14:34:11 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.50 - - [11/May/2010:14:34:11 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.30 - - [11/May/2010:14:34:10 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.34 - - [11/May/2010:14:34:15 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&nggpage=8&pid=480 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.45 - - [11/May/2010:14:34:15 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&nggpage=5&pid=478 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.26 - - [11/May/2010:14:34:15 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&pid=151&nggpage=11&show=gallery HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.47 - - [11/May/2010:14:34:15 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&nggpage=4&pid=478&show=slide HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
220.181.125.71 - - [11/May/2010:14:34:16 -0500] "GET /09-tibet-photo-show?nggpage=9&pageid=3169&show=slide&pid=160 HTTP/1.1" 301 - "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
123.125.66.25 - - [11/May/2010:14:34:16 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.33 - - [11/May/2010:14:34:16 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.47 - - [11/May/2010:14:34:16 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.110 - - [11/May/2010:14:34:16 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.33 - - [11/May/2010:14:34:22 -0500] "GET /09-tibet-photo-show?amp&replytocom=144939&show=slide&pid=480 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.100 - - [11/May/2010:14:34:11 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.116 - - [11/May/2010:14:34:23 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&nggpage=2&pid=480&show=gallery HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.34 - - [11/May/2010:14:34:23 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&nggpage=2&pid=480 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.109 - - [11/May/2010:14:34:23 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.112 - - [11/May/2010:14:34:24 -0500] "GET /09-tibet-photo-show?amp&replytocom=145001&nggpage=13&pid=480&show=slide HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.22 - - [11/May/2010:14:34:24 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.32 - - [11/May/2010:14:34:25 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.47 - - [11/May/2010:14:34:24 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
220.181.125.71 - - [11/May/2010:14:34:26 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"
123.125.66.100 - - [11/May/2010:14:34:27 -0500] "GET /09-tibet-photo-show?amp&replytocom=144939&show=slide&pid=478 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.21 - - [11/May/2010:14:34:27 -0500] "GET /09-tibet-photo-show?replyto&sh&replytocom=145001&nggpage=13&show=gallery&pid=150 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.37 - - [11/May/2010:14:34:28 -0500] "GET /09-tibet-photo-show?replyto&sh&replytocom=145001&nggpage=13&show=gallery HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.102 - - [11/May/2010:14:34:28 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.107 - - [11/May/2010:14:34:28 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.113 - - [11/May/2010:14:34:29 -0500] "GET /09-tibet-photo-show?replyto&sh&replytocom=145001&nggpage=13&pid=149&show=gallery HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.102 - - [11/May/2010:14:34:29 -0500] "GET /09-tibet-photo-show?replyto&sh&replytocom=145001&nggpage=12&show=gallery&pid=478 HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.18 - - [11/May/2010:14:34:29 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.100 - - [11/May/2010:14:34:30 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.21 - - [11/May/2010:14:34:30 -0500] "GET /2009-tibet-photo-show HTTP/1.1" 200 31812 "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
123.125.66.102 - - [11/May/2010:14:34:32 -0500] "GET /09-tibet-photo-show?replyto&sh&replytocom=145001&nggpage=12&pid=151&show=slide HTTP/1.1" 301 - "-" "Baiduspider+(+http://www.baidu.com/search/spider.htm)"
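To judge whether the current settings are enough, it may help to count how many requests each crawler makes and which status codes they get. The following is a minimal sketch, assuming the log uses the Apache combined format shown in this post. The names `summarize`, `crawler_name`, and `CRAWLERS` are made up for illustration. Because blog software may swap ASCII quotes and hyphens for typographic ones, the sketch normalizes those characters before matching.

```python
import re
from collections import Counter

# Map typographic quotes/dashes back to the ASCII characters Apache writes.
TYPOGRAPHIC = str.maketrans({"“": '"', "”": '"', "″": '"', "–": "-"})

# Apache combined log format: ip, identity, user, [time], "request", status, size, "referer", "agent".
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)

# Crawler names to look for in the user agent (illustrative, not exhaustive).
CRAWLERS = ("Baiduspider", "Googlebot", "Sogou")


def crawler_name(agent):
    """Return the first known crawler name found in the user agent, else 'other'."""
    for name in CRAWLERS:
        if name in agent:
            return name
    return "other"


def summarize(lines):
    """Count requests per (crawler, status code); lines that do not parse are skipped."""
    counts = Counter()
    for raw in lines:
        match = LOG_LINE.match(raw.strip().translate(TYPOGRAPHIC))
        if match is None:
            continue
        counts[(crawler_name(match["agent"]), match["status"])] += 1
    return counts


# Four entries copied from the log in this post.
SAMPLE = [
    '123.125.66.25 – - [11/May/2010:14:33:55 -0500] “GET /2009-tibet-photo-show HTTP/1.1″ 200 31812 “-” “Baiduspider+(+http://www.baidu.com/search/spider.htm)”',
    '123.125.66.49 – - [11/May/2010:14:33:56 -0500] “GET /09-tibet-photo-show?amp&replytocom=145646&nggpage=2&show=slide HTTP/1.1″ 301 – “-” “Baiduspider+(+http://www.baidu.com/search/spider.htm)”',
    '66.249.71.166 – - [11/May/2010:14:34:02 -0500] “GET /09-tibet-photo-show?pid=463&nggpage=9&show=slide HTTP/1.1″ 301 – “-” “Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)”',
    '220.181.125.71 – - [11/May/2010:14:34:05 -0500] “GET /2009-tibet-photo-show HTTP/1.1″ 200 31812 “-” “Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)”',
]

for (crawler, status), hits in sorted(summarize(SAMPLE).items()):
    print(crawler, status, hits)
```

Running `summarize` over the whole access log instead of `SAMPLE` would show how much of the traffic is the 301 redirects and how much is repeated 200 fetches of the same page.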