Microsoft flooding sites with fake traffic
Bill McGonigle
bill at bfccomputing.com
Fri Feb 22 15:11:00 EST 2008
On Feb 21, 2008, at 10:00, Arc Riley wrote:
> msnbot accesses robots.txt more than any other
> search engine (seconded by Yahoo! Slurp).
I had an e-commerce client DoS'ed by MSNBot during the holiday
season. It was downloading 40GB of dynamic pages per day, for a site
with 4GB of possible data (I crawled it myself to measure). When idle,
the site could handle that kind of traffic, but during peak shopping
it was the proverbial last straw.
I wound up counting the total number of possible URIs on the site,
dividing the number of seconds in a month by that count, and giving
MSNBot:
Crawl-delay: 320
in robots.txt, so that it fetches each page about once per month. It
seems to have worked.
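The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the URI count of 8100 is a hypothetical value, picked because it yields 320 with a 30-day month (the real count isn't stated), and the function names are made up for the example. Crawl-delay is a non-standard robots.txt directive, so whether a given crawler honors it is an assumption to check per crawler.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def crawl_delay_seconds(url_count, period_days=30):
    """Delay between requests so that url_count fetches span period_days.

    Rounds up, so a full pass never takes less than the period.
    """
    if url_count <= 0:
        raise ValueError("url_count must be positive")
    total_seconds = period_days * SECONDS_PER_DAY
    # Integer ceiling division: smallest delay with delay * url_count >= total.
    return (total_seconds + url_count - 1) // url_count


def robots_stanza(user_agent, delay):
    """Build a robots.txt stanza applying the delay to one crawler."""
    return f"User-agent: {user_agent}\nCrawl-delay: {delay}\n"


if __name__ == "__main__":
    # 8100 is a hypothetical URI count, not a figure from the post.
    delay = crawl_delay_seconds(8100)
    print(robots_stanza("msnbot", delay), end="")
```

Run as a script, this prints a two-line stanza, "User-agent: msnbot" followed by "Crawl-delay: 320". Rounding up rather than down errs on the side of a slightly slower crawl.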
I found a webpage describing this problem that dates from the summer
of '06. Raise your hand if you're shocked...
-Bill
-----
Bill McGonigle, Owner Work: 603.448.4440
BFC Computing, LLC Home: 603.448.1668
bill at bfccomputing.com Cell: 603.252.2606
http://www.bfccomputing.com/ Page: 603.442.1833
Blog: http://blog.bfccomputing.com/
VCard: http://bfccomputing.com/vcard/bill.vcf