org.archive.wayback.resourceindex
Class RemoteResourceIndex

java.lang.Object
  extended by org.archive.wayback.resourceindex.RemoteResourceIndex
All Implemented Interfaces:
ResourceIndex

public class RemoteResourceIndex
extends Object
implements ResourceIndex

ResourceIndex implementation that relays a query to a remote index implementation over HTTP. The XMLQueryUI is assumed to be active on the remote server, and the query is sent as-is, formulated as an OpenSearch query. Results are likewise returned as-is; this class attempts to be as transparent as possible.

Version:
$Date: 2010-09-29 05:28:38 +0700 (Wed, 29 Sep 2010) $, $Revision: 3262 $
Author:
brad

Constructor Summary
RemoteResourceIndex()
           
 
Method Summary
protected  void checkDocumentForExceptions(Document document)
           
protected  SearchResults documentToSearchResults(Document document, ObjectFilter<CaptureSearchResult> filter)
           
 UrlCanonicalizer getCanonicalizer()
           
 int getConnectTimeout()
           
protected  Document getFileDocument(File f)
           
protected  Document getHttpDocument(String url)
           
protected  String getNodeContent(Element e, String key)
           
 int getReadTimeout()
           
protected  NodeList getRequestFilters(Document d)
           
protected  String getRequestUrl(WaybackRequest wbRequest)
           
protected  ObjectFilter<CaptureSearchResult> getSearchResultFilters(WaybackRequest wbRequest)
           
protected  NodeList getSearchResults(Document d)
           
 String getSearchUrlBase()
           
 void init()
           
 SearchResults query(WaybackRequest wbRequest)
          Transform a WaybackRequest into SearchResults.
 void setCanonicalizer(UrlCanonicalizer canonicalizer)
           
 void setConnectTimeout(int connectTimeout)
           
 void setReadTimeout(int readTimeout)
           
 void setSearchUrlBase(String searchUrlBase)
           
 void shutdown()
          Release any resources used by this ResourceIndex cleanly.
protected  SearchResults urlToSearchResults(String requestUrl, ObjectFilter<CaptureSearchResult> filter)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RemoteResourceIndex

public RemoteResourceIndex()

Method Detail

init

public void init()
          throws ConfigurationException
Throws:
ConfigurationException

query

public SearchResults query(WaybackRequest wbRequest)
                    throws ResourceIndexNotAvailableException,
                           ResourceNotInArchiveException,
                           BadQueryException,
                           AccessControlException
Description copied from interface: ResourceIndex
Transform a WaybackRequest into SearchResults.

Specified by:
query in interface ResourceIndex
Parameters:
wbRequest - WaybackRequest object from RequestParser
Returns:
SearchResults containing SearchResult objects matching the WaybackRequest
Throws:
ResourceIndexNotAvailableException - if the ResourceIndex is not available (remote host down, local files missing, etc.)
ResourceNotInArchiveException - if the ResourceIndex could be contacted, but no SearchResult objects matched the request
BadQueryException - if the WaybackRequest is lacking information required to make a reasonable search of this ResourceIndex
AccessControlException - if SearchResult objects actually matched, but could not be returned due to access control restrictions (robots.txt documents, administrative URL blocks, etc.)

urlToSearchResults

protected SearchResults urlToSearchResults(String requestUrl,
                                           ObjectFilter<CaptureSearchResult> filter)
                                    throws ResourceIndexNotAvailableException,
                                           ResourceNotInArchiveException,
                                           BadQueryException,
                                           AccessControlException
Throws:
ResourceIndexNotAvailableException
ResourceNotInArchiveException
BadQueryException
AccessControlException

checkDocumentForExceptions

protected void checkDocumentForExceptions(Document document)
                                   throws ResourceIndexNotAvailableException,
                                          ResourceNotInArchiveException,
                                          BadQueryException,
                                          AccessControlException
Throws:
ResourceIndexNotAvailableException
ResourceNotInArchiveException
BadQueryException
AccessControlException
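
This page does not describe how a remote error is represented in the returned XML. As a minimal sketch of the general idea, the example below looks for an error element in a parsed Document and raises an exception if one is present. The element names (error, title, message), the class ErrorCheckSketch, and the single exception type RemoteErrorException are all hypothetical; the real method declares the four index exceptions listed above.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch; element names and the exception type are assumptions.
class ErrorCheckSketch {

    /** Stand-in for the index exceptions declared by checkDocumentForExceptions. */
    static class RemoteErrorException extends Exception {
        final String title;

        RemoteErrorException(String title, String message) {
            super(message);
            this.title = title;
        }
    }

    /** Throws if the document carries an (assumed) error element. */
    static void check(Document document) throws RemoteErrorException {
        NodeList errors = document.getElementsByTagName("error");
        if (errors.getLength() == 0) {
            return; // no error reported; the caller can go on to read results
        }
        Element error = (Element) errors.item(0);
        throw new RemoteErrorException(
                childText(error, "title"), childText(error, "message"));
    }

    /** Text of the first descendant element with the given tag, or "" if absent. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
    }
}
```

In a real implementation the title would presumably be used to choose which of the declared exception types to throw.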

getSearchResultFilters

protected ObjectFilter<CaptureSearchResult> getSearchResultFilters(WaybackRequest wbRequest)

documentToSearchResults

protected SearchResults documentToSearchResults(Document document,
                                                ObjectFilter<CaptureSearchResult> filter)
                                         throws ResourceNotInArchiveException
Throws:
ResourceNotInArchiveException

getRequestFilters

protected NodeList getRequestFilters(Document d)

getSearchResults

protected NodeList getSearchResults(Document d)

getRequestUrl

protected String getRequestUrl(WaybackRequest wbRequest)
                        throws BadQueryException
Throws:
BadQueryException
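
The class description says the query is formulated as an OpenSearch query against the remote server. The sketch below shows one way such a URL could be assembled from a base URL (the role played by searchUrlBase) and a set of query terms. The class RequestUrlSketch, the method buildRequestUrl, and the space-separated key:value term syntax are assumptions for illustration; the exact format produced by getRequestUrl is not documented on this page.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical helper, not part of the Wayback API.
class RequestUrlSketch {

    /** Joins terms as "key:value" pairs and appends them as an encoded q parameter. */
    static String buildRequestUrl(String searchUrlBase, Map<String, String> terms) {
        if (searchUrlBase == null || searchUrlBase.isEmpty()) {
            throw new IllegalArgumentException("searchUrlBase must be set");
        }
        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> term : terms.entrySet()) {
            if (query.length() > 0) {
                query.append(' ');
            }
            query.append(term.getKey()).append(':').append(term.getValue());
        }
        try {
            return searchUrlBase + "?q=" + URLEncoder.encode(query.toString(), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }
}
```

Encoding the whole term string keeps characters such as ':' and '/' from being misread as part of the URL structure.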

getNodeContent

protected String getNodeContent(Element e,
                                String key)
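
The signature suggests a lookup of a named value beneath an element. A minimal sketch of that kind of lookup using the standard DOM API follows; the class NodeContentSketch and the choice to return null for a missing key are assumptions, not documented behavior of getNodeContent.

```java
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the Wayback API.
class NodeContentSketch {

    /** Returns the text of the first descendant element named key, or null if none exists. */
    static String nodeContent(Element e, String key) {
        NodeList matches = e.getElementsByTagName(key);
        if (matches.getLength() == 0) {
            return null;
        }
        return matches.item(0).getTextContent();
    }
}
```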

getHttpDocument

protected Document getHttpDocument(String url)
                            throws IOException,
                                   SAXException
Throws:
IOException
SAXException
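
The declared exceptions (IOException, SAXException) are consistent with fetching a URL and parsing the response as XML. The sketch below shows that pattern with only JDK classes, applying connect and read timeouts before reading. The class HttpDocumentSketch and the method fetch are hypothetical, and treating the timeouts as milliseconds is an assumption; this page does not state the units used by setConnectTimeout and setReadTimeout.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URLConnection;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the Wayback API.
class HttpDocumentSketch {

    /** Fetches the URL with the given timeouts (assumed milliseconds) and parses it as XML. */
    static Document fetch(String url, int connectTimeoutMs, int readTimeoutMs)
            throws IOException, SAXException {
        URLConnection conn = URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(connectTimeoutMs);
        conn.setReadTimeout(readTimeoutMs);
        try (InputStream in = conn.getInputStream()) {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(in);
        } catch (ParserConfigurationException e) {
            // Surface parser setup problems through the declared IOException.
            throw new IOException("XML parser unavailable", e);
        }
    }
}
```

Setting both timeouts keeps a slow or unreachable remote index from blocking the calling thread indefinitely.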

getFileDocument

protected Document getFileDocument(File f)
                            throws IOException,
                                   SAXException
Throws:
IOException
SAXException

getSearchUrlBase

public String getSearchUrlBase()
Returns:
the searchUrlBase

setSearchUrlBase

public void setSearchUrlBase(String searchUrlBase)
Parameters:
searchUrlBase - the searchUrlBase to set

shutdown

public void shutdown()
              throws IOException
Description copied from interface: ResourceIndex
Release any resources used by this ResourceIndex cleanly.

Specified by:
shutdown in interface ResourceIndex
Throws:
IOException - for usual causes

getCanonicalizer

public UrlCanonicalizer getCanonicalizer()

setCanonicalizer

public void setCanonicalizer(UrlCanonicalizer canonicalizer)

getConnectTimeout

public int getConnectTimeout()

setConnectTimeout

public void setConnectTimeout(int connectTimeout)

getReadTimeout

public int getReadTimeout()

setReadTimeout

public void setReadTimeout(int readTimeout)


Copyright © 2005-2011 Internet Archive. All Rights Reserved.