org.archive.wayback.resourcestore.indexer
Class IndexWorker

java.lang.Object
  extended by org.archive.wayback.resourcestore.indexer.IndexWorker
All Implemented Interfaces:
Shutdownable

public class IndexWorker
extends Object
implements Shutdownable

Simple worker that takes tasks from an IndexQueue (in this case, the names of ARC/WARC files to be indexed), retrieves each file's location from a ResourceFileLocationDB, creates the index and serializes it to a file, and then hands that file off to a ResourceIndex for merging, using an IndexClient.
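
The flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual IndexWorker implementation: the class IndexWorkerSketch, the MergeTarget interface, and the buildIndex helper are hypothetical, a java.util.Queue stands in for IndexQueue, and a Map stands in for ResourceFileLocationDB. The index is kept in memory here, whereas the description above says the real worker serializes it to a file. The meaning given to the boolean return value is also an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class IndexWorkerSketch {
    /** Hypothetical stand-in for the merge hand-off (the role IndexClient plays). */
    public interface MergeTarget {
        void merge(String name, List<String> indexLines);
    }

    private final Queue<String> queue;            // stand-in for IndexQueue
    private final Map<String, String> locations;  // stand-in for ResourceFileLocationDB
    private final MergeTarget target;

    public IndexWorkerSketch(Queue<String> queue, Map<String, String> locations,
                             MergeTarget target) {
        this.queue = queue;
        this.locations = locations;
        this.target = target;
    }

    /** One unit of work. Assumption: true means a file was indexed and handed off. */
    public boolean doWork() {
        String name = queue.poll();               // 1. take the next file name
        if (name == null) {
            return false;                         // nothing queued
        }
        String location = locations.get(name);    // 2. look up where the file lives
        if (location == null) {
            return false;                         // unknown file, nothing to index
        }
        List<String> index = buildIndex(location); // 3. create the index
        target.merge(name, index);                // 4. hand it off for merging
        return true;
    }

    /** Hypothetical indexer: emits one placeholder line instead of parsing an archive. */
    static List<String> buildIndex(String location) {
        List<String> lines = new ArrayList<>();
        lines.add("indexed " + location);
        return lines;
    }

    public static void main(String[] args) {
        Queue<String> queue = new ArrayDeque<>();
        queue.add("example.warc.gz");
        Map<String, String> locations = new HashMap<>();
        locations.put("example.warc.gz", "http://example.org/files/example.warc.gz");

        IndexWorkerSketch worker = new IndexWorkerSketch(queue, locations,
                (name, lines) -> System.out.println(name + " -> " + lines));
        while (worker.doWork()) {
            // keep going until the queue is drained
        }
    }
}
```

Keeping each collaborator behind a small interface mirrors the getter/setter pairs listed in the Method Summary, which suggest the queue, location database, and target are all injected rather than created by the worker.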

Version:
$Date: 2010-11-19 09:36:02 +0700 (Fri, 19 Nov 2010) $, $Revision: 3338 $
Author:
brad

Field Summary
static String ARC_EXTENSION
           
static String ARC_GZ_EXTENSION
           
static String WARC_EXTENSION
           
static String WARC_GZ_EXTENSION
           
 
Constructor Summary
IndexWorker()
           
 
Method Summary
 boolean doWork()
           
 UrlCanonicalizer getCanonicalizer()
           
 ResourceFileLocationDB getDb()
           
 long getInterval()
           
 IndexQueue getQueue()
           
 IndexClient getTarget()
           
 CloseableIterator<CaptureSearchResult> indexFile(String pathOrUrl)
           
 void init()
           
static void main(String[] args)
           
 void setCanonicalizer(UrlCanonicalizer canonicalizer)
           
 void setDb(ResourceFileLocationDB db)
           
 void setInterval(long interval)
           
 void setQueue(IndexQueue queue)
           
 void setTarget(IndexClient target)
           
 void shutdown()
          Release any resources used by this ResourceIndex cleanly
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

ARC_EXTENSION

public static final String ARC_EXTENSION
See Also:
Constant Field Values

ARC_GZ_EXTENSION

public static final String ARC_GZ_EXTENSION
See Also:
Constant Field Values

WARC_EXTENSION

public static final String WARC_EXTENSION
See Also:
Constant Field Values

WARC_GZ_EXTENSION

public static final String WARC_GZ_EXTENSION
See Also:
Constant Field Values
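
The four extension constants suggest that files are recognized by their name suffix. The sketch below shows one way such a check could look. The constant values used here (".arc", ".arc.gz", ".warc", ".warc.gz") are assumptions, since the actual values appear only on the Constant Field Values page, and the ArchiveExtensionSketch class, Format enum, and classify method are hypothetical.

```java
public class ArchiveExtensionSketch {
    // Assumed values; consult "Constant Field Values" for the real ones.
    static final String ARC_EXTENSION = ".arc";
    static final String ARC_GZ_EXTENSION = ".arc.gz";
    static final String WARC_EXTENSION = ".warc";
    static final String WARC_GZ_EXTENSION = ".warc.gz";

    public enum Format { ARC, WARC, UNKNOWN }

    /** Classifies a path or URL by its suffix; anything unrecognized is UNKNOWN. */
    public static Format classify(String pathOrUrl) {
        if (pathOrUrl.endsWith(WARC_EXTENSION) || pathOrUrl.endsWith(WARC_GZ_EXTENSION)) {
            return Format.WARC;
        }
        if (pathOrUrl.endsWith(ARC_EXTENSION) || pathOrUrl.endsWith(ARC_GZ_EXTENSION)) {
            return Format.ARC;
        }
        return Format.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(classify("crawl-0001.warc.gz"));
        System.out.println(classify("crawl-0001.arc"));
        System.out.println(classify("notes.txt"));
    }
}
```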
Constructor Detail

IndexWorker

public IndexWorker()
Method Detail

init

public void init()

shutdown

public void shutdown()
Description copied from interface: Shutdownable
Release any resources used by this ResourceIndex cleanly

Specified by:
shutdown in interface Shutdownable

doWork

public boolean doWork()
               throws IOException
Throws:
IOException

indexFile

public CloseableIterator<CaptureSearchResult> indexFile(String pathOrUrl)
                                                 throws IOException
Throws:
IOException
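
Because indexFile returns a CloseableIterator, callers are presumably expected to close it once they have consumed the results. The sketch below illustrates that pattern with local stand-ins only: the nested CloseableIterator interface, fakeIndexFile, and drain are hypothetical, and String records replace CaptureSearchResult so the example runs without the Wayback library.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IndexFileUsageSketch {
    /** Local stand-in for an iterator that must be closed after use. */
    public interface CloseableIterator<T> extends Iterator<T>, Closeable {}

    /** Hypothetical replacement for indexFile: yields two fake records per input. */
    public static CloseableIterator<String> fakeIndexFile(String pathOrUrl) {
        final Iterator<String> inner =
                Arrays.asList(pathOrUrl + "#1", pathOrUrl + "#2").iterator();
        return new CloseableIterator<String>() {
            @Override public boolean hasNext() { return inner.hasNext(); }
            @Override public String next() { return inner.next(); }
            @Override public void remove() { throw new UnsupportedOperationException(); }
            @Override public void close() {
                // A real iterator would release the underlying archive file here.
            }
        };
    }

    /** Reads every record, closing the iterator even if iteration fails. */
    public static List<String> drain(String pathOrUrl) throws IOException {
        List<String> records = new ArrayList<>();
        try (CloseableIterator<String> it = fakeIndexFile(pathOrUrl)) {
            while (it.hasNext()) {
                records.add(it.next());
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        for (String record : drain("example.warc.gz")) {
            System.out.println(record);
        }
    }
}
```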

main

public static void main(String[] args)
Parameters:
args - command-line arguments

getInterval

public long getInterval()
Returns:
the interval

setInterval

public void setInterval(long interval)
Parameters:
interval - the interval to set

getQueue

public IndexQueue getQueue()
Returns:
the queue

setQueue

public void setQueue(IndexQueue queue)
Parameters:
queue - the queue to set

getDb

public ResourceFileLocationDB getDb()
Returns:
the db

setDb

public void setDb(ResourceFileLocationDB db)
Parameters:
db - the db to set

getTarget

public IndexClient getTarget()
Returns:
the target

setTarget

public void setTarget(IndexClient target)
Parameters:
target - the target to set

getCanonicalizer

public UrlCanonicalizer getCanonicalizer()
Returns:
the canonicalizer

setCanonicalizer

public void setCanonicalizer(UrlCanonicalizer canonicalizer)
Parameters:
canonicalizer - the canonicalizer to set


Copyright © 2005-2011 Internet Archive. All Rights Reserved.