org.archive.wayback.surt
Class SURTTokenizer
java.lang.Object
org.archive.wayback.surt.SURTTokenizer
public class SURTTokenizer
- extends Object
Provides iterative URL reduction for prefix matching, to find progressively
coarser-grained URL-specific configuration. Assumes that a prefix binary
search is attempted for each returned value. The first value is the entire
SURT URL String, with a TAB appended. The second removes the CGI arguments
(the query string). Then each path segment ('/' separated) is removed in
turn. Then the login:password, if present, is removed. Then the port is
removed, if one was given on the initial URL and it is not :80. Then each
authority segment ('.' separated) is removed in turn.
Finally, the nextSearch() method returns null when no broader searches
can be attempted on the URL.
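The reduction order described above can be illustrated with a small, self-contained sketch. It is not the real SURTTokenizer: the class name ReductionSketch, the method reductions(), and the simplified input form "org,example,www/a/b?x=1" (reversed, comma-separated host followed by path and query) are assumptions made for illustration, and the login:password and port steps are left out for brevity. The exact key format produced by the real class is not shown on this page and is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration only; not the real SURTTokenizer. */
class ReductionSketch {

    /**
     * Returns progressively coarser keys for a simplified SURT-like string
     * such as "org,example,www/a/b?x=1" (assumed format).
     */
    static List<String> reductions(String key) {
        List<String> out = new ArrayList<>();

        // Step 1: the entire key, with a TAB appended.
        out.add(key + "\t");

        // Step 2: drop the CGI arguments (query string), if any.
        String rest = key;
        int q = rest.indexOf('?');
        if (q >= 0) {
            rest = rest.substring(0, q);
            out.add(rest);
        }

        // Split the remainder into host part and path part.
        int slash = rest.indexOf('/');
        String host = slash >= 0 ? rest.substring(0, slash) : rest;
        String path = slash >= 0 ? rest.substring(slash) : "";

        // Step 3: remove one trailing path segment at a time.
        while (!path.isEmpty()) {
            path = path.substring(0, path.lastIndexOf('/'));
            out.add(host + path);
        }

        // Step 4: remove one trailing host segment at a time.
        while (host.indexOf(',') >= 0) {
            host = host.substring(0, host.lastIndexOf(','));
            out.add(host);
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints one candidate key per line, most specific first.
        for (String s : reductions("org,example,www/a/b?x=1")) {
            System.out.println(s);
        }
    }
}
```

Because the host labels are stored in reversed order, each shorter string is a prefix of the longer ones, which is what makes a prefix search over sorted keys possible.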
- Version:
- $Date: 2010-09-29 05:28:38 +0700 (Wed, 29 Sep 2010) $, $Revision: 3262 $
- Author:
- brad
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SURTTokenizer
public SURTTokenizer(String url)
throws org.apache.commons.httpclient.URIException
- Constructs a tokenizer for the given URL.
- Parameters:
url - String URL
- Throws:
org.apache.commons.httpclient.URIException
nextSearch
public String nextSearch()
- Updates internal state and returns the next shorter (broader) search
string for the URL.
- Returns:
- String to look up as a prefix match for relevant information, or null when no broader search can be attempted.
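A caller would typically keep calling nextSearch() until it returns null, stopping at the first key that yields a hit. The sketch below shows that loop under stated assumptions: SearchSource is a hypothetical interface that only mirrors the nextSearch() contract, findMostSpecific() is an illustrative helper, and a plain map lookup stands in for the prefix binary search that the class description assumes.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical caller-side sketch; none of these names come from the library. */
class LookupSketch {

    /** Assumed stand-in exposing the same contract as nextSearch(). */
    interface SearchSource {
        String nextSearch(); // returns null when no broader search remains
    }

    /** Tries each candidate key, most specific first, and returns the first hit. */
    static String findMostSpecific(SearchSource source, Map<String, String> rules) {
        String key;
        while ((key = source.nextSearch()) != null) {
            // A real caller would run a prefix binary search here;
            // an exact map lookup keeps the sketch short.
            String value = rules.get(key);
            if (value != null) {
                return value;
            }
        }
        return null; // no rule applies at any granularity
    }

    public static void main(String[] args) {
        Map<String, String> rules = new TreeMap<>();
        rules.put("org,example", "rule for the whole domain");

        // Canned candidate keys in an assumed, simplified format.
        Iterator<String> it = Arrays.asList(
                "org,example,www/a\t", "org,example,www", "org,example", "org").iterator();
        SearchSource source = () -> it.hasNext() ? it.next() : null;

        System.out.println(findMostSpecific(source, rules));
    }
}
```

Returning on the first hit means the most specific matching configuration wins, which follows from the order in which the candidates are produced.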
exactKey
public static String exactKey(String url)
throws org.apache.commons.httpclient.URIException
- Parameters:
url - String URL
- Returns:
- SURT String that matches exactly the argument url
- Throws:
org.apache.commons.httpclient.URIException
prefixKey
public static String prefixKey(String url)
throws org.apache.commons.httpclient.URIException
- Parameters:
url - String URL
- Returns:
- SURT String that matches URLs prefixed with the argument url
- Throws:
org.apache.commons.httpclient.URIException
Copyright © 2005-2011 Internet Archive. All Rights Reserved.