WERA Release Notes


Table of Contents

1. Release 0.4.2RC1
1.1. Known Limitations/Issues
1.2. Changes
2. Release 0.4.1
2.1. Known Limitations/Issues
2.2. Changes
3. Release 0.4.0
3.1. Known Limitations/Issues
3.2. Changes
4. Release 0.2.2
4.1. Changes
5. Release 0.2.1
5.1. Known Limitations/Issues
5.2. Changes

1. Release 0.4.2RC1

Abstract

URL Canonicalization, Proxy support (experimental).

First release candidate for the 0.4.2 version.

1.1. Known Limitations/Issues

1.1.1. Redirects not handled

Using WERA in combination with a proxy server takes care of some of the redirect issues. Determining which issues are handled and which are not remains to be done for the 0.4.2 release.

For further details, see the 0.4.1 release notes.

1.1.2. Advanced search not ready

The advanced search has not been prioritized for this release. The required functionality has not yet been specified.

1.2. Changes

1.2.1. URL Canonicalization

Canonicalization of URLs has been added. If a link points to a URL that is indexed in a different form (e.g. http://www.nb.no instead of http://nb.no), WERA will now find it in the index. Canonicalization is configurable: individual rules may be disabled, and the order in which the rules are applied may be changed.
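
The idea of configurable, ordered rules can be illustrated with a minimal Python sketch. This is not WERA's PHP implementation; the rule names (lowercase_host, strip_www, strip_trailing_slash) and the RULES list are assumptions made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit

# Each hypothetical rule takes a URL string and returns a rewritten URL string.
def lowercase_host(url):
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=parts.netloc.lower()))

def strip_www(url):
    parts = urlsplit(url)
    host = parts.netloc
    if host.startswith("www."):
        host = host[4:]
    return urlunsplit(parts._replace(netloc=host))

def strip_trailing_slash(url):
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=parts.path.rstrip("/")))

# Assumed configuration format: an ordered list of (rule, enabled) pairs.
RULES = [
    (lowercase_host, True),
    (strip_www, True),
    (strip_trailing_slash, True),
]

def canonicalize(url, rules=RULES):
    """Apply every enabled rule, in the configured order."""
    for rule, enabled in rules:
        if enabled:
            url = rule(url)
    return url

if __name__ == "__main__":
    # Both forms map to the same canonical key under the default rules.
    print(canonicalize("http://WWW.nb.no/"))
    print(canonicalize("http://nb.no"))
```

In this sketch the order matters: if strip_www runs before lowercase_host, an upper-case "WWW." prefix is not removed. That is one reason a configurable rule order can be useful.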

1.2.2. Proxy support

The JavaScript that WERA inserts before the HTML page is delivered to the user's browser does not catch all links. To prevent this undesired behaviour, the web server hosting WERA can be set up as a proxy server, so that all requests for hosts other than the WERA host are redirected back to WERA. The user must change the browser's proxy setting so that all requests go to the WERA host.
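
The routing decision described above can be sketched as follows. This is a hedged illustration in Python, not WERA's actual proxy setup; the host name wera.example.org and the lookup entry point are hypothetical placeholders.

```python
from urllib.parse import urlsplit, quote

WERA_HOST = "wera.example.org"                        # hypothetical WERA host
WERA_ENTRY = "http://wera.example.org/lookup?url="    # hypothetical entry point

def route_proxy_request(requested_url):
    """Decide what a proxy in front of WERA does with an incoming request.

    Requests for the WERA host itself are served normally; requests for
    any other host are sent back to WERA so the archived copy is shown.
    """
    host = urlsplit(requested_url).hostname
    if host == WERA_HOST:
        return ("serve", requested_url)
    return ("redirect", WERA_ENTRY + quote(requested_url, safe=""))

if __name__ == "__main__":
    print(route_proxy_request("http://wera.example.org/index.php"))
    print(route_proxy_request("http://www.nb.no/a"))
```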

1.2.3. Changes - detailed listing

Table 1. Changes

ID | Type | Summary | Open Date | By | Filer
1312202 | Fix | Exacturl needs to canonicalize | 2005-10-03 21:08 | sverreb | stack-sf
1454555 | Add | Possibility to use Wera with a proxy server for catching leaking links | 2006-03-20 15:09 | sverreb | sverreb

2. Release 0.4.1

Abstract

Improved URL encoding, added metadata view in Timeline, and Google-like result presentation.

2.1. Known Limitations/Issues

2.1.1. No canonicalization of URLs

Currently, no canonicalization of URLs is done in WERA. If a link points to a URL that is indexed in a different form (e.g. http://www.nb.no instead of http://nb.no), WERA will not find it in the index and will therefore report: Sorry, no documents with the given uri were found. See [1312202] exacturl needs to canonicalize.

2.1.2. Redirects not handled

WERA does nothing to handle redirects. Depending on the nature of the redirect, the result is either that the actual resource is not displayed at all in WERA, or that a redirect to the live web is executed within the WERA view without any notice to the user. See bugtracker issues [1312200] Pages at end of redirects not found and [1312214] More redirects to live web.

2.1.3. Advanced search not ready

The advanced search has not been prioritized for this release. The required functionality has not yet been specified.

2.2. Changes

2.2.1. URL encoding

The old index_encode algorithm used for encoding URLs in WERA has been replaced by PHP's native urlencode function. See [1354276] Still have URL encoding issues and [1247134] Beautify the wera URL.
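
For readers unfamiliar with PHP's urlencode, the following Python sketch approximates its documented behaviour (spaces become "+", and everything except letters, digits and "-_." is percent-encoded). The function name php_style_urlencode is an assumption for this example, and the sketch is not taken from WERA.

```python
from urllib.parse import quote_plus

def php_style_urlencode(text):
    """Approximate PHP's urlencode() with Python's standard library.

    quote_plus() leaves "~" unescaped, whereas PHP's urlencode() is
    documented to escape it, so the tilde is patched up afterwards.
    """
    return quote_plus(text, safe="").replace("~", "%7E")

if __name__ == "__main__":
    print(php_style_urlencode("http://www.nb.no/a b?x=1"))
```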

2.2.2. Possibility to display metadata

Metadata from the Retriever's getmeta request can now be displayed. A metadata check box has been added to the timeline view. When it is checked, the metadata is shown below the timeline (instead of the archived web page).

2.2.3. Google-like result presentation

The presentation of results has been changed so that the default view shows one hit per site. Each hit has a link to 'more from this site', which presents all hits within that site in the same way as the old WERA result list.
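
The grouping behind a one-hit-per-site view can be sketched as below. This is an illustrative Python example only; it assumes that "site" means the host part of the URL and that hits arrive as a ranked list of URLs, neither of which is confirmed by these notes.

```python
from urllib.parse import urlsplit

def one_hit_per_site(hits):
    """Collapse a ranked list of hit URLs to the best hit per site.

    Returns (top_url, more_count) pairs in rank order, where more_count
    is the number of further hits behind a 'more from this site' link.
    """
    first = {}
    counts = {}
    for url in hits:
        site = urlsplit(url).hostname  # assumption: site == host name
        if site not in first:
            first[site] = url
        counts[site] = counts.get(site, 0) + 1
    return [(first[site], counts[site] - 1) for site in first]

if __name__ == "__main__":
    ranked = ["http://nb.no/a", "http://example.org/x", "http://nb.no/b"]
    for top, more in one_hit_per_site(ranked):
        print(top, "more from this site:", more)
```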

2.2.4. Changes - detailed listing

Table 2. Changes

ID | Type | Summary | Open Date | By | Filer
1354276 | Fix | Still have URL encoding issues | 2005-11-11 19:27 | sverreb | stack-sf
1401204 | Add | Possibility to display metadata | 2006-01-10 08:42 | sverreb | sverreb
1346889 | Add | Google-like result presentation | 2005-11-03 13:34 | sverreb | sverreb
1247134 | Fix | Beautify the wera URL | 2005-07-28 23:47 | sverreb | stack-sf
1403277 | Add | Query term from search ui to timeline | 2006-01-11 21:33 | sverreb | sverreb
1403742 | Fix | Non-localized string in code | 2006-01-12 10:56 | sverreb | sverreb

3. Release 0.4.0

Abstract

Improved exacturl handling and error handling, and fixed encoding issues. Bug fixes and documentation.

3.1. Known Limitations/Issues

3.1.1. No canonicalization of URLs

Currently, no canonicalization of URLs is done in WERA. If a link points to a URL that is indexed in a different form (e.g. http://www.nb.no instead of http://nb.no), WERA will not find it in the index and will therefore report: Sorry, no documents with the given uri were found. See [1312202] exacturl needs to canonicalize.

3.1.2. Redirects not handled

WERA does nothing to handle redirects. Depending on the nature of the redirect, the result is either that the actual resource is not displayed at all in WERA, or that a redirect to the live web is executed within the WERA view without any notice to the user. See bugtracker issues [1312200] Pages at end of redirects not found and [1312214] More redirects to live web.

3.1.3. Advanced search not ready

The advanced search has not been prioritized for this release. The required functionality has not yet been specified.

3.2. Changes

3.2.1. Exacturl handling

The handling of exacturl searches has been improved considerably on both the WERA and the NutchWAX side. WERA uses the exacturl search functionality extensively, both for counting versions of a given URL and for determining the mapping between a given URL/timestamp and its ARC name and offset.

3.2.2. Error handling

WERA's error messages have been improved. Instead of printing cryptic PHP warnings and errors, WERA now prints more meaningful error messages that enable the user to understand what is wrong.

3.2.3. Query encoding issues

There were major problems with queries containing non-ISO8859 characters. To solve this issue, changes were made to both WERA and NutchWAX.

3.2.4. Encoding issues when viewing archived pages

WERA now sets the encoding in the header of a given web page before sending the page to the user's browser. The encoding sent is the encoding detected by NutchWAX at index time.
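
As a rough illustration, a header carrying the detected encoding could be built as in the Python sketch below. The function name and the choice to omit the charset when no encoding was detected are assumptions for this example; WERA's own PHP code may differ.

```python
def content_type_header(mime_type, detected_encoding=None):
    """Build a Content-Type header line for an archived page.

    detected_encoding stands for the value recorded at index time;
    when it is missing, the charset parameter is simply left out.
    """
    if detected_encoding:
        return "Content-Type: %s; charset=%s" % (mime_type, detected_encoding)
    return "Content-Type: %s" % mime_type

if __name__ == "__main__":
    print(content_type_header("text/html", "euc-jp"))
```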

3.2.5. Changes - detailed listing

Table 3. Changes

ID | Type | Summary | Open Date | By | Filer
1312159 | Add | wera overview doc based on dokuwiki text | 2005-10-03 11:29 | sverreb | stack-sf
1246834 | Add | Move arc path to retreiver (WAS Path...lib/seal/nutch.inc) | 2005-07-28 08:06 | sverreb | stack-sf
1244879 | Add | Add display of text snippets to wera search results page | 2005-07-25 17:32 | sverreb | stack-sf
1333042 | Fix | Search result list - Bad handling of dedup result list | 2005-10-20 03:08 | sverreb | stack-sf
1322601 | Fix | search ui - time param not set | 2005-10-10 05:32 | sverreb | sverreb
1324757 | Fix | debug on messes up displayed web page | 2005-10-12 04:37 | sverreb | sverreb
1249970 | Fix | Installer requires X though claimed not needing it | 2005-08-01 21:23 | stack-sf | stack-sf
1324161 | Fix | euc-jp page not displayed properly in wera | 2005-10-11 12:34 | sverreb | stack-sf
1322668 | Fix | wera help need update | 2005-10-10 05:59 | sverreb | sverreb
1324755 | Fix | Header sent from wera documentdispatcher of wrong format | 2005-10-12 04:32 | sverreb | sverreb
1322554 | Fix | exacturl query returnns 0 of X versions in result list | 2005-10-10 05:13 | sverreb | sverreb
1322594 | Fix | When time param not set url is not found | 2005-10-10 05:29 | sverreb | sverreb
1312442 | Fix | Date range missing in querystring | 2005-10-03 17:26 | sverreb | sverreb
1314403 | Fix | Use newly added 'encoding' in search results | 2005-10-05 19:34 | sverreb | stack-sf
1314098 | Fix | Encoding issue, wera displaying archived web page | 2005-10-05 10:53 | stack-sf | sverreb
1244894 | Fix | Cannot query for non-ISO8859 characters | 2005-07-25 18:38 | stack-sf | stack-sf
1312208 | Fix | Query time encoding issues | 2005-10-03 12:11 | stack-sf | stack-sf
1314360 | Fix | Remove all, any or phrase selection in search ui | 2005-10-05 18:14 | sverreb | sverreb
1312479 | Fix | indexSearch.inc need cleanup | 2005-10-03 18:40 | sverreb | sverreb
1313251 | Fix | Wera search, ugly and/or not useful error messages | 2005-10-04 13:32 | sverreb | sverreb
1282042 | Fix | WERA - Timeline - Warning when URL not found | 2005-09-05 03:28 | sverreb | stack-sf
1312484 | Fix | [wera] Ugly complaint about invalid argument | 2005-10-03 18:47 | sverreb | stack-sf
1312299 | Fix | WERA - Exacturl search not always working | 2005-10-03 13:51 | sverreb | sverreb
1281697 | Fix | searching czech words not working | 2005-09-04 10:36 | stack-sf | kranach
1277376 | Fix | WERA - Duplicate hits in result list | 2005-08-31 05:45 | sverreb | sverreb

4. Release 0.2.2

Abstract

Bug fixes

4.1. Changes

Fixed [1277376] Duplicate hits in result list. WERA now uses NutchWAX's dedup functionality to suppress duplicate hits in the result list, which also improves performance.

5. Release 0.2.1

Abstract

First release of WERA

5.1. Known Limitations/Issues

When X is not installed, the Java-based installer should fall back to console mode. There have been some reports of problems with this. If the fallback fails, install WERA manually; see the manual.

WERA does not work properly with PHP5. The problem is related to PHP5's new object model: when the 'NEAR' mode of the documentLocator is used, it returns the result sets for 'BEFORE' and 'AFTER' concatenated, instead of returning the single result closest in time. As a consequence, the wrong aid is passed to the documentRetriever when presenting inline objects.
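
The intended 'NEAR' behaviour, as opposed to the concatenation described above, can be sketched in Python. This is not the documentLocator's code; the function names and the 14-digit YYYYMMDDhhmmss timestamp format are assumptions for this example.

```python
from datetime import datetime

def parse_ts(ts):
    # Assumed timestamp format: 14 digits, YYYYMMDDhhmmss.
    return datetime.strptime(ts, "%Y%m%d%H%M%S")

def nearest(target, before, after):
    """Return the single timestamp closest in time to target.

    before and after stand for the 'BEFORE' and 'AFTER' result sets.
    Returning their concatenation would be the faulty behaviour; the
    expected result is the one candidate with the smallest distance.
    """
    candidates = list(before) + list(after)
    if not candidates:
        return None
    wanted = parse_ts(target)
    return min(candidates, key=lambda ts: abs(parse_ts(ts) - wanted))

if __name__ == "__main__":
    print(nearest("20051010120000",
                  ["20051001000000", "20051009000000"],
                  ["20051015000000"]))
```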

5.2. Changes

  1. Support for the NutchWAX search engine added.

  2. Support for nwalucene search removed (replaced by the above).

  3. Support for Fast Search Engine currently not working (will be added in a later version).

  4. Advanced search removed (may be added in a later version).

  5. Server-side link rewriting replaced by client-side JavaScript link rewriting.