org.archive.access.nutch
Class NutchwaxCrawlDb
java.lang.Object
  org.apache.hadoop.util.ToolBase
    org.apache.nutch.crawl.CrawlDb
      org.archive.access.nutch.NutchwaxCrawlDb
- All Implemented Interfaces:
- org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool
public class NutchwaxCrawlDb
- extends org.apache.nutch.crawl.CrawlDb
A CrawlDb subclass whose overridden update additionally sets the NutchwaxCrawlDbFilter.
- Author:
- stack
Field Summary
static org.apache.commons.logging.Log | LOG
Fields inherited from class org.apache.nutch.crawl.CrawlDb
CRAWLDB_ADDITIONS_ALLOWED, CURRENT_NAME, LOCK_NAME
Fields inherited from class org.apache.hadoop.util.ToolBase
conf
Method Summary
static void | main(java.lang.String[] args)
void | update(org.apache.hadoop.fs.Path crawlDb, org.apache.hadoop.fs.Path[] segments, boolean normalize, boolean filter, boolean additionsAllowed, boolean force)
Methods inherited from class org.apache.nutch.crawl.CrawlDb
createJob, install, run, update
Methods inherited from class org.apache.hadoop.util.ToolBase
doMain, getConf, setConf
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
LOG
public static final org.apache.commons.logging.Log LOG
Constructor Detail
NutchwaxCrawlDb
public NutchwaxCrawlDb()
NutchwaxCrawlDb
public NutchwaxCrawlDb(org.apache.hadoop.conf.Configuration conf)
Method Detail
update
public void update(org.apache.hadoop.fs.Path crawlDb,
                   org.apache.hadoop.fs.Path[] segments,
                   boolean normalize,
                   boolean filter,
                   boolean additionsAllowed,
                   boolean force)
            throws java.io.IOException
- Overrides:
update in class org.apache.nutch.crawl.CrawlDb
- Throws:
java.io.IOException
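The override described above follows a common pattern: a subclass re-declares update with the same six-argument signature and adds one extra setting to the job before running it. The sketch below illustrates only that pattern. Every type in it (SketchJob, BaseDb, FilteringDb), the helper methods buildJob and run, and the setting string "crawldb.filter=sketch-filter" are hypothetical stand-ins invented for illustration; they are not the Nutch, Hadoop, or NutchWAX API, and plain strings stand in for org.apache.hadoop.fs.Path so that the example compiles without those libraries.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class OverrideSketch {

    /** Hypothetical stand-in for a job configuration; NOT Hadoop's JobConf. */
    static class SketchJob {
        final List<String> settings = new ArrayList<>();
    }

    /** Hypothetical stand-in for the parent class; NOT org.apache.nutch.crawl.CrawlDb. */
    static class BaseDb {
        SketchJob lastJob; // records the most recently "run" job for inspection

        /** Assumed helper: collects the update parameters into a job description. */
        protected SketchJob buildJob(String crawlDb, String[] segments, boolean normalize,
                boolean filter, boolean additionsAllowed) throws IOException {
            if (segments.length == 0) {
                throw new IOException("no segments to merge into " + crawlDb);
            }
            SketchJob job = new SketchJob();
            job.settings.add("crawldb=" + crawlDb);
            job.settings.add("segments=" + segments.length);
            job.settings.add("normalize=" + normalize);
            job.settings.add("filter=" + filter);
            job.settings.add("additionsAllowed=" + additionsAllowed);
            return job;
        }

        /** Assumed helper: "runs" the job by simply remembering it. */
        protected void run(SketchJob job) {
            lastJob = job;
        }

        // The force flag is accepted but not modelled in this sketch.
        public void update(String crawlDb, String[] segments, boolean normalize,
                boolean filter, boolean additionsAllowed, boolean force) throws IOException {
            run(buildJob(crawlDb, segments, normalize, filter, additionsAllowed));
        }
    }

    /** Subclass that keeps the parent's signature and adds one extra job setting. */
    static class FilteringDb extends BaseDb {
        @Override
        public void update(String crawlDb, String[] segments, boolean normalize,
                boolean filter, boolean additionsAllowed, boolean force) throws IOException {
            SketchJob job = buildJob(crawlDb, segments, normalize, filter, additionsAllowed);
            // The one addition made by the subclass (hypothetical setting name).
            job.settings.add("crawldb.filter=sketch-filter");
            run(job);
        }
    }

    public static void main(String[] args) throws IOException {
        BaseDb db = new FilteringDb();
        db.update("crawldb", new String[] {"segments/a"}, true, true, true, false);
        System.out.println(db.lastJob.settings);
    }
}
```

Because the override keeps the parent's signature, callers that hold a reference of the parent type pick up the extra setting without any change on their side.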
main
public static void main(java.lang.String[] args)
throws java.lang.Exception
- Throws:
java.lang.Exception
Copyright © 2005-2007 Internet Archive. All Rights Reserved.