org.archive.access.nutch
Class NutchwaxLinkDb
java.lang.Object
org.apache.hadoop.util.ToolBase
org.apache.nutch.crawl.LinkDb
org.archive.access.nutch.NutchwaxLinkDb
- All Implemented Interfaces:
- org.apache.hadoop.conf.Configurable, org.apache.hadoop.io.Closeable, org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Mapper, org.apache.hadoop.mapred.Reducer, org.apache.hadoop.util.Tool
public class NutchwaxLinkDb
- extends org.apache.nutch.crawl.LinkDb
Subclass of the Nutch LinkDb that writes out LinkDb keys that include the
collection name.
The bulk of the code is copied from LinkDb, because LinkDb is not amenable to
subclassing.
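The idea of a link database keyed by collection name plus URL can be illustrated with a small, self-contained sketch. It uses only the Java standard library and is not the NutchWAX implementation: the class `CollectionKeySketch`, its methods, and the space-separated key layout are all hypothetical, chosen only to show how a collection-qualified key keeps inlinks from different collections apart.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from NutchWAX.
public class CollectionKeySketch {
    // Assumed separator. The real key layout is not documented on this page.
    static final char SEP = ' ';

    // Build a key that carries the collection name in front of the URL.
    static String makeKey(String collection, String url) {
        if (collection.indexOf(SEP) >= 0) {
            throw new IllegalArgumentException("collection must not contain the separator");
        }
        return collection + SEP + url;
    }

    // Recover the collection name from a key built by makeKey.
    static String collectionOf(String key) {
        int i = key.indexOf(SEP);
        if (i < 0) {
            throw new IllegalArgumentException("not a collection-qualified key: " + key);
        }
        return key.substring(0, i);
    }

    // Recover the URL from a key built by makeKey.
    static String urlOf(String key) {
        int i = key.indexOf(SEP);
        if (i < 0) {
            throw new IllegalArgumentException("not a collection-qualified key: " + key);
        }
        return key.substring(i + 1);
    }

    // Invert outlinks (source URL -> target URLs) into inlinks keyed by
    // the collection-qualified target URL.
    static Map<String, List<String>> invert(String collection,
                                            Map<String, List<String>> outlinks) {
        Map<String, List<String>> inlinks = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : outlinks.entrySet()) {
            for (String target : e.getValue()) {
                inlinks.computeIfAbsent(makeKey(collection, target),
                                        k -> new ArrayList<>())
                       .add(e.getKey());
            }
        }
        return inlinks;
    }

    public static void main(String[] args) {
        Map<String, List<String>> outlinks = new TreeMap<>();
        List<String> targets = new ArrayList<>();
        targets.add("http://b.example/");
        outlinks.put("http://a.example/", targets);
        // Print the inverted map for a collection named "demo".
        System.out.println(invert("demo", outlinks));
    }
}
```

With keys of this shape, the same URL crawled into two collections yields two distinct entries, which is the property the class description points at.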
- Author:
- stack
| Nested classes/interfaces inherited from class org.apache.nutch.crawl.LinkDb |
org.apache.nutch.crawl.LinkDb.Merger |
| Fields inherited from class org.apache.nutch.crawl.LinkDb |
CURRENT_NAME, LOCK_NAME, LOG |
| Fields inherited from class org.apache.hadoop.util.ToolBase |
conf |
| Method Summary |
| void | configure(org.apache.hadoop.mapred.JobConf job) |
| void | invert(org.apache.hadoop.fs.Path linkDb, org.apache.hadoop.fs.Path[] segments, boolean normalize, boolean filter, boolean force) |
| static void | main(java.lang.String[] args) |
| void | map(org.apache.hadoop.io.WritableComparable key, org.apache.hadoop.io.Writable value, org.apache.hadoop.mapred.OutputCollector output, org.apache.hadoop.mapred.Reporter reporter) |
| Methods inherited from class org.apache.nutch.crawl.LinkDb |
close, createMergeJob, install, invert, reduce, run |
| Methods inherited from class org.apache.hadoop.util.ToolBase |
doMain, getConf, setConf |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
NutchwaxLinkDb
public NutchwaxLinkDb()
NutchwaxLinkDb
public NutchwaxLinkDb(org.apache.hadoop.conf.Configuration conf)
- Construct a NutchwaxLinkDb.
configure
public void configure(org.apache.hadoop.mapred.JobConf job)
- Specified by:
configure in interface org.apache.hadoop.mapred.JobConfigurable
- Overrides:
configure in class org.apache.nutch.crawl.LinkDb
map
public void map(org.apache.hadoop.io.WritableComparable key,
org.apache.hadoop.io.Writable value,
org.apache.hadoop.mapred.OutputCollector output,
org.apache.hadoop.mapred.Reporter reporter)
throws java.io.IOException
- Specified by:
map in interface org.apache.hadoop.mapred.Mapper
- Overrides:
map in class org.apache.nutch.crawl.LinkDb
- Throws:
java.io.IOException
invert
public void invert(org.apache.hadoop.fs.Path linkDb,
org.apache.hadoop.fs.Path[] segments,
boolean normalize,
boolean filter,
boolean force)
throws java.io.IOException
- Overrides:
invert in class org.apache.nutch.crawl.LinkDb
- Throws:
java.io.IOException
main
public static void main(java.lang.String[] args)
throws java.lang.Exception
- Throws:
java.lang.Exception
Copyright © 2005-2007 Internet Archive. All Rights Reserved.