org.archive.wayback.resourceindex
Class NutchResourceIndex

java.lang.Object
  extended by org.archive.wayback.resourceindex.NutchResourceIndex
All Implemented Interfaces:
ResourceIndex

public class NutchResourceIndex
extends Object
implements ResourceIndex

Version:
$Date: 2010-09-29 05:28:38 +0700 (Wed, 29 Sep 2010) $, $Revision: 3262 $
Author:
brad

Constructor Summary
NutchResourceIndex()
           
 
Method Summary
protected  Document getHttpDocument(String url)
           
 int getMaxRecords()
           
protected  String getNodeContent(Element e, String key)
           
protected  String getNodeNutchContent(Element e, String key)
           
protected  String getRequestUrl(WaybackRequest wbRequest)
           
protected  NodeList getSearchChannel(Document d)
           
protected  NodeList getSearchItems(Document d)
           
 String getSearchUrlBase()
           
 void init()
           
 SearchResults query(WaybackRequest wbRequest)
          Transform a WaybackRequest into SearchResults.
 void setMaxRecords(int maxRecords)
           
 void setSearchUrlBase(String searchUrlBase)
           
 void shutdown()
          Release any resources used by this ResourceIndex cleanly.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

NutchResourceIndex

public NutchResourceIndex()
Method Detail

init

public void init()
          throws ConfigurationException
Throws:
ConfigurationException

query

public SearchResults query(WaybackRequest wbRequest)
                    throws ResourceIndexNotAvailableException,
                           ResourceNotInArchiveException,
                           BadQueryException,
                           AccessControlException
Description copied from interface: ResourceIndex
Transform a WaybackRequest into SearchResults.

Specified by:
query in interface ResourceIndex
Parameters:
wbRequest - WaybackRequest object from RequestParser
Returns:
SearchResults containing SearchResult objects matching the WaybackRequest
Throws:
ResourceIndexNotAvailableException - if the ResourceIndex is not available (remote host down, local files missing, etc.)
ResourceNotInArchiveException - if the ResourceIndex could be contacted, but no SearchResult objects matched the request
BadQueryException - if the WaybackRequest lacks information required to make a reasonable search of this ResourceIndex
AccessControlException - if SearchResult objects matched, but could not be returned due to access control restrictions (robots.txt documents, administrative URL blocks, etc.)
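
The four checked exceptions above each describe a distinct outcome, so a caller will usually want one catch clause per type. The following is a minimal sketch of that dispatch, not code from the Wayback distribution. To keep it compilable on its own, the exception classes and the `Index` interface are hypothetical stand-ins declared locally, and `String` stands in for both WaybackRequest and SearchResults.

```java
public class QueryOutcomeSketch {
    // Hypothetical stand-ins for the Wayback exception types, declared
    // locally only so this sketch compiles without the Wayback jar.
    static class ResourceIndexNotAvailableException extends Exception {}
    static class ResourceNotInArchiveException extends Exception {}
    static class BadQueryException extends Exception {}
    static class AccessControlException extends Exception {}

    /** Hypothetical stand-in for ResourceIndex.query(); String replaces the real types. */
    interface Index {
        String query(String request)
                throws ResourceIndexNotAvailableException,
                       ResourceNotInArchiveException,
                       BadQueryException,
                       AccessControlException;
    }

    /** Runs the query and maps each documented failure to a short message. */
    static String describe(Index index, String request) {
        try {
            return "results: " + index.query(request);
        } catch (ResourceIndexNotAvailableException e) {
            return "index unavailable";   // remote host down, local files missing
        } catch (ResourceNotInArchiveException e) {
            return "not in archive";      // index reachable, nothing matched
        } catch (BadQueryException e) {
            return "bad query";           // request lacked required information
        } catch (AccessControlException e) {
            return "blocked";             // matches exist but may not be returned
        }
    }
}
```

For example, `QueryOutcomeSketch.describe(r -> { throw new QueryOutcomeSketch.BadQueryException(); }, "x")` yields the "bad query" message, while an index that returns normally yields its result prefixed with "results: ".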

getSearchChannel

protected NodeList getSearchChannel(Document d)

getSearchItems

protected NodeList getSearchItems(Document d)

getRequestUrl

protected String getRequestUrl(WaybackRequest wbRequest)
                        throws BadQueryException
Throws:
BadQueryException

getNodeNutchContent

protected String getNodeNutchContent(Element e,
                                     String key)

getNodeContent

protected String getNodeContent(Element e,
                                String key)
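
This page does not describe what getSearchItems, getNodeContent and getNodeNutchContent do. Judging only by their names and signatures, they plausibly pick `item` elements out of an RSS-style search response and read the text of a named child element, with the Nutch variant adding a `nutch:` prefix. The sketch below illustrates that reading with the standard org.w3c.dom API. The element names, the `nutch:` prefix handling, the sample XML and the namespace URI `urn:example:nutch` are all assumptions made for illustration, not the class's documented behaviour.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SearchItemsSketch {
    // Illustrative response; field names and namespace URI are made up.
    static final String SAMPLE =
        "<rss xmlns:nutch=\"urn:example:nutch\"><channel><title>demo</title>"
        + "<item><link>http://example.org/a</link><nutch:arcname>a.arc.gz</nutch:arcname></item>"
        + "<item><link>http://example.org/b</link><nutch:arcname>b.arc.gz</nutch:arcname></item>"
        + "</channel></rss>";

    /** Parses an XML string; wraps checked parser errors for brevity. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception ex) {
            throw new IllegalArgumentException("could not parse XML", ex);
        }
    }

    /** Assumed analogue of getSearchItems: every item element in the document. */
    static NodeList searchItems(Document d) {
        return d.getElementsByTagName("item");
    }

    /** Assumed analogue of getNodeContent: text of the first descendant named key, or null. */
    static String nodeContent(Element e, String key) {
        NodeList matches = e.getElementsByTagName(key);
        if (matches.getLength() == 0) {
            return null;
        }
        return matches.item(0).getTextContent();
    }

    /** Assumed analogue of getNodeNutchContent: same lookup with a nutch: prefix. */
    static String nodeNutchContent(Element e, String key) {
        return nodeContent(e, "nutch:" + key);
    }
}
```

The lookup relies on getElementsByTagName matching the qualified tag name, so `nutch:arcname` is found without a namespace-aware parser.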

getHttpDocument

protected Document getHttpDocument(String url)
                            throws IOException,
                                   SAXException
Throws:
IOException
SAXException
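
The signature suggests that getHttpDocument retrieves a URL and parses the response body as XML, surfacing transport failures as IOException and malformed XML as SAXException. The sketch below shows only the parsing half using the JDK's DocumentBuilder, reading from an in-memory stream so that it runs without a network. A real fetch would presumably obtain the stream from the URL instead; that step, and the helper names `readDocument` and `rootName`, are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class HttpDocumentSketch {
    /** Parses a stream into a DOM Document, keeping the two documented exception types. */
    static Document readDocument(InputStream in) throws IOException, SAXException {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        } catch (ParserConfigurationException e) {
            // Not part of the documented signature, so fold it into IOException.
            throw new IOException("XML parser could not be configured", e);
        }
    }

    /** Returns the root element's tag name, or null if the input is not well-formed XML. */
    static String rootName(String xml) {
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            return readDocument(in).getDocumentElement().getTagName();
        } catch (SAXException e) {
            return null;   // malformed response body
        } catch (IOException e) {
            return null;   // stream or parser setup failure
        }
    }
}
```

Keeping IOException and SAXException separate lets a caller distinguish an unreachable search service from one that answered with an unparseable body.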

getSearchUrlBase

public String getSearchUrlBase()
Returns:
the searchUrlBase

setSearchUrlBase

public void setSearchUrlBase(String searchUrlBase)
Parameters:
searchUrlBase - the searchUrlBase to set

getMaxRecords

public int getMaxRecords()
Returns:
the maxRecords

setMaxRecords

public void setMaxRecords(int maxRecords)
Parameters:
maxRecords - the maxRecords to set

shutdown

public void shutdown()
              throws IOException
Description copied from interface: ResourceIndex
Release any resources used by this ResourceIndex cleanly.

Specified by:
shutdown in interface ResourceIndex
Throws:
IOException - if an I/O error occurs while releasing resources


Copyright © 2005-2011 Internet Archive. All Rights Reserved.