org.archive.wayback.resourcestore.indexer
Class ArcIndexer

java.lang.Object
  extended by org.archive.wayback.resourcestore.indexer.ArcIndexer

public class ArcIndexer
extends java.lang.Object

Transforms an ARC file into an Iterator of CaptureSearchResult objects.

Version:
$Date: 2008-07-01 16:45:04 -0700 (Tue, 01 Jul 2008) $, $Revision: 2374 $
Author:
brad

Field Summary
static java.lang.String CDX_HEADER_MAGIC
          CDX Header line for these fields.
 
Constructor Summary
ArcIndexer()
           
 
Method Summary
 UrlCanonicalizer getCanonicalizer()
           
 CloseableIterator<CaptureSearchResult> iterator(org.archive.io.arc.ARCReader arcReader)
           
 CloseableIterator<CaptureSearchResult> iterator(java.io.File arc)
           
 CloseableIterator<CaptureSearchResult> iterator(java.lang.String pathOrUrl)
           
static void main(java.lang.String[] args)
           
 void setCanonicalizer(UrlCanonicalizer canonicalizer)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CDX_HEADER_MAGIC

public static final java.lang.String CDX_HEADER_MAGIC
CDX Header line for these fields. Not very configurable.

See Also:
Constant Field Values
Constructor Detail

ArcIndexer

public ArcIndexer()
Method Detail

iterator

public CloseableIterator<CaptureSearchResult> iterator(java.io.File arc)
                                                throws java.io.IOException
Parameters:
arc - the ARC file to index
Returns:
CloseableIterator of CaptureSearchResult objects for the input ARC File
Throws:
java.io.IOException

iterator

public CloseableIterator<CaptureSearchResult> iterator(java.lang.String pathOrUrl)
                                                throws java.io.IOException
Parameters:
pathOrUrl - path or URL of the ARC file to index
Returns:
CloseableIterator of CaptureSearchResult objects for the input pathOrUrl
Throws:
java.io.IOException

iterator

public CloseableIterator<CaptureSearchResult> iterator(org.archive.io.arc.ARCReader arcReader)
                                                throws java.io.IOException
Parameters:
arcReader - ARCReader for the ARC file to index
Returns:
CloseableIterator of CaptureSearchResult objects for the input ARCReader
Throws:
java.io.IOException
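
All three iterator(...) overloads return a CloseableIterator, so callers are presumably expected to close it once iteration finishes. The following is a minimal sketch of that consume-then-close pattern, not the Wayback implementation: the nested CloseableIterator interface, the iterator(...) factory, and the drain(...) helper are hypothetical stand-ins so that the example compiles without the Wayback or Heritrix jars. It assumes the real CloseableIterator exposes a close() method, as its name suggests.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch only: the nested types below are hypothetical stand-ins, not Wayback classes.
class ArcIndexerUsageSketch {

    /** Hypothetical stand-in: an Iterator that also owns a resource to release. */
    interface CloseableIterator<E> extends Iterator<E>, Closeable {}

    /**
     * Hypothetical stand-in for an indexer's iterator(...) method. It walks a
     * fixed list instead of an ARC file and records whether close() was called.
     */
    static CloseableIterator<String> iterator(List<String> records, final boolean[] closedFlag) {
        final Iterator<String> inner = records.iterator();
        return new CloseableIterator<String>() {
            public boolean hasNext() { return inner.hasNext(); }
            public String next() { return inner.next(); }
            public void remove() { throw new UnsupportedOperationException(); }
            public void close() throws IOException { closedFlag[0] = true; }
        };
    }

    /** Copies every element into a list and closes the iterator even if iteration fails. */
    static <E> List<E> drain(CloseableIterator<E> it) throws IOException {
        List<E> out = new ArrayList<E>();
        try {
            while (it.hasNext()) {
                out.add(it.next());
            }
        } finally {
            it.close(); // release the underlying resource in all cases
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        boolean[] closed = { false };
        List<String> lines = drain(iterator(Arrays.asList("record-1", "record-2"), closed));
        System.out.println(lines + " closed=" + closed[0]);
    }
}
```

With the real class, the same try/finally shape would wrap the result of one of the iterator(...) overloads listed above; the String elements here merely stand in for CaptureSearchResult.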

getCanonicalizer

public UrlCanonicalizer getCanonicalizer()

setCanonicalizer

public void setCanonicalizer(UrlCanonicalizer canonicalizer)

main

public static void main(java.lang.String[] args)
Parameters:
args - command-line arguments


Copyright © 2005-2009 Internet Archive. All Rights Reserved.