Package org.archive.wayback.accesscontrol.robotstxt

Class Summary
RobotExclusionFilter A CaptureSearchResult filter that uses a LiveWebCache to retrieve robots.txt documents from the live web and filters SearchResults according to the rules those documents contain.
RobotExclusionFilterFactory  
RobotRules Parses a robots.txt file, stores the rules it contains, and tests whether a given path/userAgent pair is blocked by those rules.
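
The parse-then-test behaviour described for RobotRules can be illustrated with a minimal, self-contained sketch. The class below is hypothetical: the name SimpleRobotRules and the methods parse and isBlocked are assumptions made for illustration and are not the RobotRules API. It handles only User-agent and Disallow lines with plain prefix matching, ignores Allow lines and wildcards, and falls back to the "*" group when no group names the user agent.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; this is NOT the Wayback RobotRules class.
class SimpleRobotRules {
    // Maps a lower-cased user-agent token to its Disallow path prefixes.
    private final Map<String, List<String>> disallows = new HashMap<>();

    // Parses robots.txt content, collecting Disallow prefixes per user agent.
    void parse(String robotsTxt) {
        List<String> currentAgents = new ArrayList<>();
        boolean lastWasAgent = false;
        for (String line : robotsTxt.split("\\r?\\n")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // strip trailing comment
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                // Consecutive User-agent lines share one group of rules.
                if (!lastWasAgent) {
                    currentAgents = new ArrayList<>();
                }
                String agent = value.toLowerCase(Locale.ROOT);
                currentAgents.add(agent);
                disallows.computeIfAbsent(agent, k -> new ArrayList<>());
                lastWasAgent = true;
            } else {
                lastWasAgent = false;
                // An empty Disallow value blocks nothing, so it is skipped.
                if (field.equals("disallow") && !value.isEmpty()) {
                    for (String agent : currentAgents) {
                        disallows.get(agent).add(value);
                    }
                }
            }
        }
    }

    // Returns true if the path starts with a Disallow prefix that applies
    // to the user agent, using the "*" group when no specific group exists.
    boolean isBlocked(String path, String userAgent) {
        List<String> rules = disallows.get(userAgent.toLowerCase(Locale.ROOT));
        if (rules == null) {
            rules = disallows.get("*");
        }
        if (rules == null) {
            return false;
        }
        for (String prefix : rules) {
            if (path.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

A caller would construct one instance per robots.txt document, call parse once with the document text, and then call isBlocked for each path/userAgent pair to be checked. A filter in the style of RobotExclusionFilter could, under the same assumptions, drop every result whose path is reported as blocked.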
 



Copyright © 2005-2011 Internet Archive. All Rights Reserved.