org.archive.access.nutch
Class NutchwaxCrawlDbFilter
java.lang.Object
org.apache.nutch.crawl.CrawlDbFilter
org.archive.access.nutch.NutchwaxCrawlDbFilter
- All Implemented Interfaces:
- org.apache.hadoop.io.Closeable, org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Mapper
public class NutchwaxCrawlDbFilter
- extends org.apache.nutch.crawl.CrawlDbFilter
Overrides CrawlDbFilter so we can modify the key before it is passed to the
superclass: the collection is stripped from the key and then, once the
superclass's mapper is done, the collection is put back.
- Author:
- stack
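The strip-and-restore pattern described above can be sketched without the Hadoop and Nutch classes. This is a minimal illustration, not the actual NutchWAX code: the key layout (collection name, one space, then the URL) is an assumption, and the names CollectionKeySketch, SEP, and superMap are hypothetical. Plain String keys and a BiConsumer stand in for WritableComparable and OutputCollector so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.BiConsumer;

public class CollectionKeySketch {
    /** Hypothetical separator between collection name and URL (an assumption). */
    static final char SEP = ' ';

    /**
     * Stand-in for the superclass mapper. It only understands bare URLs;
     * lowercasing is a toy substitute for real URL normalization/filtering.
     */
    static void superMap(String url, String value, BiConsumer<String, String> output) {
        output.accept(url.trim().toLowerCase(Locale.ROOT), value);
    }

    /**
     * Strips the collection from the key, delegates to superMap with the bare
     * URL, and wraps the output so the collection is put back on every key
     * the delegate emits.
     */
    static void map(String key, String value, BiConsumer<String, String> output) {
        int i = key.indexOf(SEP);
        if (i < 0) {
            // No collection prefix found: pass the key through unchanged.
            superMap(key, value, output);
            return;
        }
        String collection = key.substring(0, i);
        String url = key.substring(i + 1);
        // The wrapping collector restores the collection prefix.
        superMap(url, value, (k, v) -> output.accept(collection + SEP + k, v));
    }

    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        map("mycollection HTTP://Example.com/", "datum",
            (k, v) -> emitted.add(k + " -> " + v));
        System.out.println(emitted);
    }
}
```

Wrapping the output collector, rather than rewriting keys after the fact, keeps the delegate unaware of the collection prefix while ensuring every emitted key gets it back.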
Fields inherited from class org.apache.nutch.crawl.CrawlDbFilter
LOG, URL_FILTERING, URL_NORMALIZING, URL_NORMALIZING_SCOPE
Method Summary
void map(org.apache.hadoop.io.WritableComparable key,
org.apache.hadoop.io.Writable value,
org.apache.hadoop.mapred.OutputCollector output,
org.apache.hadoop.mapred.Reporter r)
Methods inherited from class org.apache.nutch.crawl.CrawlDbFilter
close, configure
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
NutchwaxCrawlDbFilter
public NutchwaxCrawlDbFilter()
map
public void map(org.apache.hadoop.io.WritableComparable key,
org.apache.hadoop.io.Writable value,
org.apache.hadoop.mapred.OutputCollector output,
org.apache.hadoop.mapred.Reporter r)
throws java.io.IOException
- Specified by:
map in interface org.apache.hadoop.mapred.Mapper
- Overrides:
map in class org.apache.nutch.crawl.CrawlDbFilter
- Throws:
java.io.IOException
Copyright © 2005-2007 Internet Archive. All Rights Reserved.