org.archive.wayback.surt
Class SURTTokenizer

java.lang.Object
  extended by org.archive.wayback.surt.SURTTokenizer

public class SURTTokenizer
extends Object

Provides iterative URL reduction for prefix matching, to find ever coarser-grained URL-specific configuration. Assumes that a prefix binary search is attempted for each returned value. The first value is the entire SURT URL String with a TAB appended. The second removes the CGI arguments. Each subsequent value then removes one more path segment ('/' separated). Next, the login:password is removed, if present. Then the port is removed, unless it was :80 or was omitted from the initial URL. Then the authority segments ('.' separated) are removed one at a time. Finally, nextSearch() returns null when no broader search can be attempted on the URL.
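
The reduction order described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the SURTTokenizer implementation: the class name PrefixReductionSketch, the method reductions, and the comma-separated input format are all hypothetical, login:password and port handling are left out, and the exact strings the real class returns may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of progressive key reduction.
// It is NOT the wayback implementation and skips login and port handling.
public class PrefixReductionSketch {

    // Returns progressively shorter keys for an assumed
    // "authority,segments/path/segments?query" string.
    static List<String> reductions(String key) {
        List<String> out = new ArrayList<>();

        // 1. The entire key with a TAB appended.
        out.add(key + "\t");

        // 2. The key with any query string (CGI arguments) removed.
        int q = key.indexOf('?');
        String current = (q >= 0) ? key.substring(0, q) : key;
        out.add(current);

        // 3. Remove one trailing path segment at a time.
        int firstSlash = current.indexOf('/');
        if (firstSlash >= 0) {
            int lastSlash = current.lastIndexOf('/');
            while (lastSlash > firstSlash) {
                current = current.substring(0, lastSlash);
                out.add(current);
                lastSlash = current.lastIndexOf('/');
            }
            // Drop what is left of the path, leaving only the authority.
            current = current.substring(0, firstSlash);
            out.add(current);
        }

        // 4. Remove one trailing authority segment at a time
        //    (comma-separated in this model's assumed input).
        int comma = current.lastIndexOf(',');
        while (comma >= 0) {
            current = current.substring(0, comma);
            out.add(current);
            comma = current.lastIndexOf(',');
        }
        return out;
    }

    public static void main(String[] args) {
        // Brackets make the trailing TAB on the first key visible.
        for (String k : reductions("org,archive,www/a/b?x=1")) {
            System.out.println("[" + k + "]");
        }
    }
}
```

Each key in the list is broader than the one before it, so a caller would normally stop at the first key that yields a match.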

Version:
$Date: 2010-09-29 05:28:38 +0700 (Wed, 29 Sep 2010) $, $Revision: 3262 $
Author:
brad

Constructor Summary
SURTTokenizer(String url)
          Constructs a tokenizer for the given URL.
 
Method Summary
static String exactKey(String url)
          Returns the SURT that exactly matches the argument url.
 String nextSearch()
          Updates internal state and returns the next shorter (broader) search string for the URL.
static String prefixKey(String url)
          Returns the SURT that matches URLs prefixed with the argument url.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SURTTokenizer

public SURTTokenizer(String url)
              throws org.apache.commons.httpclient.URIException
Constructs a tokenizer for the given URL.

Parameters:
url - String URL
Throws:
org.apache.commons.httpclient.URIException
Method Detail

nextSearch

public String nextSearch()
Updates internal state and returns the next shorter (broader) search string for the URL.

Returns:
the String to look up as a prefix match for relevant information, or null when no broader search can be attempted.
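
A caller would typically keep calling nextSearch() until it returns null. The sketch below shows that loop shape with a stand-in java.util.function.Supplier so that it runs without the wayback library; the class name NextSearchLoopSketch, the method drain, and the sample keys are hypothetical. Since nextSearch() declares no checked exceptions, a method reference such as tokenizer::nextSearch should fit the same Supplier parameter, though that is an assumption and untested here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper showing the "call until null" loop shape.
public class NextSearchLoopSketch {

    // Calls the supplier until it returns null and collects every key in order.
    static List<String> drain(Supplier<String> nextSearch) {
        List<String> keys = new ArrayList<>();
        String key;
        while ((key = nextSearch.get()) != null) {
            keys.add(key);
        }
        return keys;
    }

    public static void main(String[] args) {
        // Stand-in source: yields three made-up keys, then null.
        Iterator<String> fake = Arrays.asList("a/b\t", "a/b", "a").iterator();
        Supplier<String> source = () -> fake.hasNext() ? fake.next() : null;
        for (String k : drain(source)) {
            System.out.println("[" + k + "]");
        }
    }
}
```

In practice the loop body would perform a lookup for each key and break on the first hit instead of collecting every key.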

exactKey

public static String exactKey(String url)
                       throws org.apache.commons.httpclient.URIException
Parameters:
url - String URL
Returns:
String SURT that exactly matches the argument url
Throws:
org.apache.commons.httpclient.URIException

prefixKey

public static String prefixKey(String url)
                        throws org.apache.commons.httpclient.URIException
Parameters:
url - String URL
Returns:
String SURT that matches URLs prefixed with the argument url
Throws:
org.apache.commons.httpclient.URIException


Copyright © 2005-2011 Internet Archive. All Rights Reserved.