Overview

Tomcat (like other servlet containers) is configured to listen on one or more ports. Each request received on one of those ports is routed to a particular webapp based on the first directory in the request path, which is matched against the names of the .war files deployed under the webapps/ directory.

If two webapps, webappA.war and webappB.war, are deployed under the webapps/ directory, then an incoming request for /webappA/file1 is received by the webapp inside webappA.war as the request /file1. Likewise, an incoming request for /webappB/images/foo.gif is received by the webapp inside webappB.war as /images/foo.gif.

Tomcat (and other servlet containers) allow a special .war file named ROOT.war to be deployed under the webapps/ directory; it receives requests that do not match any other webapp. If the above example also included ROOT.war, then requests starting with /webappA/ would be received by webappA.war, requests starting with /webappB/ would be received by webappB.war, and all other requests would be received by the ROOT.war webapp.

Deploying your webapp as ROOT.war, where possible, results in somewhat cleaner public URLs, but it is not a requirement. The examples below show how the URL prefixes differ depending on whether you deploy the Wayback .war file as ROOT.war or as wayback.war.

AccessPoint Names

Each AccessPoint Spring XML bean definition must include a name property:

<bean name="8080:wayback" class="org.archive.wayback.webapp.AccessPoint"> ... </bean>

The name property indicates how requests received by the Wayback webapp are routed to the appropriate AccessPoint. Wayback can target AccessPoints based on three components, which are combined in the AccessPoint bean name as hostname:port:first_path:

  • hostname
  • port
  • first path after the optional webapp deployment name (which is empty if you deploy your Wayback webapp as ROOT.war)

If you have configured DNS to resolve multiple hostnames to the same computer, you can use the hostname component to select an AccessPoint based on virtual hosts.

The port is the only required component of the AccessPoint name. If you have multiple Tomcat Connectors, you can vary the port in the AccessPoint name to target specific AccessPoints. Otherwise, all your AccessPoint names will have the same port, most likely 8080 or 80.
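
As a rough illustration of the multiple-Connector case, the fragment below sketches two HTTP Connectors as they might appear in Tomcat's conf/server.xml. The port numbers match the examples in this document; the connectionTimeout value is only an illustrative assumption, and your existing server.xml may already define one of these Connectors with additional attributes.

```xml
<!-- Fragment of Tomcat's conf/server.xml, placed inside the existing <Service> element. -->
<!-- Ports match this document's examples; connectionTimeout is an illustrative value. -->
<Connector port="8080" protocol="HTTP/1.1" connectionTimeout="20000" />
<Connector port="80" protocol="HTTP/1.1" connectionTimeout="20000" />
```

With both Connectors active, AccessPoint names beginning with 80 and 8080 can each be reached on their own port.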

A more commonly useful component of the AccessPoint name is the first path, which lets you expose multiple collections within a single Wayback webapp deployment without varying hostnames or ports (changes that often require assistance from a network or system administrator).

Example AccessPoint names and URLs

The following table shows how URLs map to particular AccessPoints, assuming you have deployed the Wayback webapp as ROOT.war on a host named "access.example.org", using port 8080.

| Access Point bean name | Archival URL prefix | Archival URL query example for http://archive.org |
|---|---|---|
| 8080:collectionA | http://access.example.org:8080/collectionA/ | http://access.example.org:8080/collectionA/*/http://archive.org/ |
| 8080:collectionB | http://access.example.org:8080/collectionB/ | http://access.example.org:8080/collectionB/*/http://archive.org/ |

If you deployed your Wayback webapp as wayback.war, the following table shows how URLs map to particular AccessPoints on a host named "access.example.org", using port 8080.

| Access Point bean name | Archival URL prefix | Archival URL query example for http://archive.org |
|---|---|---|
| 8080:collectionA | http://access.example.org:8080/wayback/collectionA/ | http://access.example.org:8080/wayback/collectionA/*/http://archive.org/ |
| 8080:collectionB | http://access.example.org:8080/wayback/collectionB/ | http://access.example.org:8080/wayback/collectionB/*/http://archive.org/ |
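
Note that the bean names are identical in the two tables above: the webapp deployment name affects the URL prefix but is not part of the AccessPoint name. A minimal sketch of the corresponding bean definitions follows. It assumes the beans live inside the existing beans element of your Wayback Spring XML configuration, and it omits every property other than the name, as in the earlier example.

```xml
<!-- Fragment of the Wayback Spring XML configuration, inside the existing <beans> element. -->
<!-- The same two definitions apply whether the webapp is deployed as ROOT.war or wayback.war. -->
<bean name="8080:collectionA" class="org.archive.wayback.webapp.AccessPoint">
  <!-- properties for collection A omitted -->
</bean>
<bean name="8080:collectionB" class="org.archive.wayback.webapp.AccessPoint">
  <!-- properties for collection B omitted -->
</bean>
```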

If you have configured multiple Connectors for your Tomcat server, listening on both port 80 and port 8080, and you deploy as ROOT.war, you can target different AccessPoints by port, as shown below. These examples assume your server's hostname is still "access.example.org".

| Access Point bean name | Archival URL prefix | Archival URL query example for http://archive.org |
|---|---|---|
| 80:collectionA | http://access.example.org/collectionA/ | http://access.example.org/collectionA/*/http://archive.org/ |
| 8080:collectionB | http://access.example.org:8080/collectionB/ | http://access.example.org:8080/collectionB/*/http://archive.org/ |
| 80:collectionC | http://access.example.org/collectionC/ | http://access.example.org/collectionC/*/http://archive.org/ |

If you have only a few AccessPoints to expose, you can do away with the first-path component to achieve very uncluttered Archival URLs. Assuming your Tomcat server has multiple Connectors listening on both port 80 and port 8080, and you deploy as ROOT.war, you can target different AccessPoints by port alone, as shown below. These examples still assume your server's hostname is "access.example.org".

| Access Point bean name | Archival URL prefix | Archival URL query example for http://archive.org |
|---|---|---|
| 80 | http://access.example.org/ | http://access.example.org/*/http://archive.org/ |
| 8080 | http://access.example.org:8080/ | http://access.example.org:8080/*/http://archive.org/ |

Getting somewhat fancy, you can use virtual hosts, doing away with non-standard ports, and use hostnames alone to specify AccessPoints. This means getting your Tomcat to listen on port 80, and deploying the webapp as ROOT.war. You'd have to configure your DNS so both "collection1.example.org" and "collection2.example.org" point to the host running Wayback:

| Access Point bean name | Archival URL prefix | Archival URL query example for http://archive.org |
|---|---|---|
| collection1.example.org:80 | http://collection1.example.org/ | http://collection1.example.org/*/http://archive.org/ |
| collection2.example.org:80 | http://collection2.example.org/ | http://collection2.example.org/*/http://archive.org/ |
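
The virtual-host setup above might be expressed with bean definitions along these lines. This is only a sketch: it shows the hostname:port form of the name and leaves out all other AccessPoint properties, which depend on your collections.

```xml
<!-- Fragment of the Wayback Spring XML configuration, inside the existing <beans> element. -->
<!-- Each bean is selected by the hostname the request was sent to; both use port 80. -->
<bean name="collection1.example.org:80" class="org.archive.wayback.webapp.AccessPoint">
  <!-- properties for collection 1 omitted -->
</bean>
<bean name="collection2.example.org:80" class="org.archive.wayback.webapp.AccessPoint">
  <!-- properties for collection 2 omitted -->
</bean>
```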

Getting really fancy

Assuming you've deployed your webapp as ROOT.war and have Tomcat listening on both port 80 and port 8080, with the hostnames "collection1.example.org" and "collection2.example.org" both pointing to the host running Wayback, you can combine hostname, port, and first path:

| Access Point bean name | Archival URL prefix | Archival URL query example for http://archive.org |
|---|---|---|
| collection1.example.org:80 | http://collection1.example.org/ | http://collection1.example.org/*/http://archive.org/ |
| collection1.example.org:8080:subset1 | http://collection1.example.org:8080/subset1/ | http://collection1.example.org:8080/subset1/*/http://archive.org/ |
| collection1.example.org:8080:subset2 | http://collection1.example.org:8080/subset2/ | http://collection1.example.org:8080/subset2/*/http://archive.org/ |
| collection2.example.org:8080 | http://collection2.example.org:8080/ | http://collection2.example.org:8080/*/http://archive.org/ |
| collection2.example.org:80:internal | http://collection2.example.org/internal/ | http://collection2.example.org/internal/*/http://archive.org/ |
| collection2.example.org:80:public | http://collection2.example.org/public/ | http://collection2.example.org/public/*/http://archive.org/ |
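
For the full three-part form, a sketch of two of the beans listed above is given below. Only the name follows from this document; all other properties are omitted and would need to be filled in for your own collections.

```xml
<!-- Fragment of the Wayback Spring XML configuration, inside the existing <beans> element. -->
<!-- Name format: hostname:port:first_path -->
<bean name="collection2.example.org:80:internal" class="org.archive.wayback.webapp.AccessPoint">
  <!-- properties for the internal AccessPoint omitted -->
</bean>
<bean name="collection2.example.org:80:public" class="org.archive.wayback.webapp.AccessPoint">
  <!-- properties for the public AccessPoint omitted -->
</bean>
```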