News

MediaCloud, a Berkman Center project, and StopBadware, a former Berkman Center project that has spun off as an independent organization, have each built systems to crawl websites and save the results ...
80legs is a web crawling service that runs on a distributed grid of 50,000 computers, spiders the web at a rate of 2 billion pages per day, and analyzes the content it finds.