org.archive.wayback
Interface UrlCanonicalizer

All Known Implementing Classes:
AggressiveUrlCanonicalizer, IdentityUrlCanonicalizer

public interface UrlCanonicalizer

Interface for implementations that transform an input String URL into a canonical form suitable for lookups in a ResourceIndex. A URL must be passed through the same canonicalizer before it is inserted into a ResourceIndex as will later be used to search for it; otherwise the stored key and the search key may not match.
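
The requirement to use the same canonicalizer for insertion and lookup can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: KeyMaker is a local stand-in that mirrors the shape of UrlCanonicalizer but declares a plain Exception so the example needs no external library, the two canonicalizers are hypothetical and do not reproduce the behavior of AggressiveUrlCanonicalizer or IdentityUrlCanonicalizer, and a HashMap stands in for a ResourceIndex.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class CanonicalizerSketch {

    // Local stand-in mirroring the shape of UrlCanonicalizer. The real
    // interface declares org.apache.commons.httpclient.URIException; a plain
    // Exception is used here only to avoid the external dependency.
    interface KeyMaker {
        String urlStringToKey(String url) throws Exception;
    }

    // Hypothetical canonicalizer: lower-cases the URL and drops one trailing
    // slash. This is an illustration, not the rule set of any real class.
    static final KeyMaker LOWERCASING = url -> {
        String key = url.toLowerCase(Locale.ROOT);
        return key.endsWith("/") ? key.substring(0, key.length() - 1) : key;
    };

    // Hypothetical canonicalizer that returns its input unchanged.
    static final KeyMaker PASS_THROUGH = url -> url;

    // Stand-in for a ResourceIndex: canonical key -> stored record.
    static final Map<String, String> INDEX = new HashMap<>();

    // Canonicalize the URL, then store the record under the resulting key.
    static void insert(KeyMaker canonicalizer, String url, String record) throws Exception {
        INDEX.put(canonicalizer.urlStringToKey(url), record);
    }

    // Canonicalize the URL, then look up the resulting key.
    static String lookup(KeyMaker canonicalizer, String url) throws Exception {
        return INDEX.get(canonicalizer.urlStringToKey(url));
    }

    public static void main(String[] args) throws Exception {
        insert(LOWERCASING, "http://Example.com/Page/", "capture-1");

        // Same canonicalizer on both sides: the keys agree.
        System.out.println(lookup(LOWERCASING, "HTTP://example.COM/page"));  // prints "capture-1"

        // Different canonicalizer at search time: the keys differ, so nothing is found.
        System.out.println(lookup(PASS_THROUGH, "HTTP://example.COM/page")); // prints "null"
    }
}
```

The second lookup fails even though the record is present, because the search key was produced by a different transformation than the stored key.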

Author:
brad

Method Summary
 String urlStringToKey(String url)
           Returns a lookup key for the given URL, appropriate for searching within a ResourceIndex.

Method Detail

urlStringToKey

String urlStringToKey(String url)
                      throws org.apache.commons.httpclient.URIException
Parameters:
url - String representation of a URL, as close to its original, unchanged form as possible.
Returns:
a lookup key appropriate for searching within a ResourceIndex.
Throws:
org.apache.commons.httpclient.URIException - if the input url String is not a valid URL.
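
Because urlStringToKey declares a checked exception, callers have to decide what to do with input that cannot be turned into a key. The following is a minimal sketch of one option, skipping such URLs. It rests on assumptions: java.net.URISyntaxException stands in for org.apache.commons.httpclient.URIException so the example needs no external library, and the static urlStringToKey shown here, including its host-plus-path key format, is hypothetical rather than the behavior of any implementing class.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class InvalidUrlSketch {

    // Hypothetical key function: rejects input that java.net.URI cannot parse
    // or that has no host. URISyntaxException stands in for URIException.
    static String urlStringToKey(String url) throws URISyntaxException {
        URI parsed = new URI(url);
        if (parsed.getHost() == null) {
            throw new URISyntaxException(url, "no host component");
        }
        // Illustrative key format only: lower-cased host followed by the raw path.
        return parsed.getHost().toLowerCase(Locale.ROOT) + parsed.getRawPath();
    }

    // Builds keys for the URLs that can be canonicalized and skips the rest.
    static List<String> keysFor(List<String> urls) {
        List<String> keys = new ArrayList<>();
        for (String url : urls) {
            try {
                keys.add(urlStringToKey(url));
            } catch (URISyntaxException e) {
                // Invalid input: no key can be produced, so the URL is skipped.
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("http://Example.com/a", "http://exa mple.com/");
        System.out.println(keysFor(urls)); // prints "[example.com/a]"
    }
}
```

Whether to skip, log, or abort on an invalid URL is a choice for the caller; the interface only guarantees that the failure is reported through the declared exception.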


Copyright © 2005-2011 Internet Archive. All Rights Reserved.