Long-term deployment of a transactional archive requires a more complex, distributed setup. For example, several SiteStory archives can be created on different machines. Older data can be offloaded from SiteStory to the WARC file format and served from a Wayback archive, after which it can be purged from the SiteStory Archive instance. The WARC format is more compact in terms of the total number of files in the file system. For web servers with very dynamic pages, the total number of files in the SiteStory Archive relative to the disk's inode limit can be an important indicator of when to offload data to a Wayback archive or to start a new SiteStory Archive on a new disk. An archiving setup that distributes data between SiteStory and IA Wayback archives is demonstrated here. Note that the Wayback software supports the Memento interface; see the Wayback Memento install how-to for more information.
To make a Memento client work seamlessly with multiple archives, the TimeGate Redirector interface is provided. TimeGate Redirector dispatches a Memento client to the correct TimeGate, depending on the date requested in the Accept-Datetime request header.
Setting up a TimeGate Redirector server under Tomcat is straightforward. The server is contained in a single WAR file named "tg-redirector.war", so installation consists only of deploying that file and editing a configuration file. Once you have copied the WAR file into the [TOMCAT-HOME]/webapps directory, you can find the timegates.xml file in the [TOMCAT-HOME]/webapps/tg-redirector/WEB-INF/classes directory. Restart Tomcat after editing the timegates.xml file. In the example below, archived data is distributed between the Wayback archive (timegate: http://lanlproto.santafe.edu:8080/memento/timegate/) and SiteStory (timegate: http://www.theresourcedepot.com/000010/timegate/).
<?xml version="1.0" encoding="UTF-8"?>
<timegates>
  <timegate uri="http://lanlproto.santafe.edu:8080/memento/timegate/">
    <start>Tue\, 27 Mar 2012 22:08:10 GMT</start>
    <end>Mon\, 09 Apr 2012 23:35:02 GMT</end>
    <timemap uri="http://lanlproto.santafe.edu:8080/memento/timemap/link/" />
  </timegate>
  <timegate uri="http://www.theresourcedepot.com/000010/timegate/">
    <start>Mon\, 09 Apr 2012 23:35:03 GMT</start>
    <end>Tue\, 19 Apr 2031 12:00:02 GMT</end>
    <timemap uri="http://www.theresourcedepot.com/000010/timemap/link/" />
  </timegate>
</timegates>
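The dispatch rule described above can be illustrated with a minimal Python sketch. It is not the redirector's actual code: the function names are hypothetical, and the fallback for dates outside every configured range is an assumption (it matches the example request further down, where a 2010 date is sent to the Wayback TimeGate). The backslash before each comma in the config dates is treated as an escape and removed before parsing.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Same entries as the timegates.xml example (XML declaration and
# timemap elements omitted for brevity).
TIMEGATES_XML = r"""
<timegates>
  <timegate uri="http://lanlproto.santafe.edu:8080/memento/timegate/">
    <start>Tue\, 27 Mar 2012 22:08:10 GMT</start>
    <end>Mon\, 09 Apr 2012 23:35:02 GMT</end>
  </timegate>
  <timegate uri="http://www.theresourcedepot.com/000010/timegate/">
    <start>Mon\, 09 Apr 2012 23:35:03 GMT</start>
    <end>Tue\, 19 Apr 2031 12:00:02 GMT</end>
  </timegate>
</timegates>
"""


def parse_date(text):
    """Parse an HTTP date, dropping the backslash that escapes the comma."""
    return parsedate_to_datetime(text.strip().replace("\\,", ","))


def load_ranges(xml_text):
    """Return (start, end, timegate_uri) tuples sorted by start date."""
    root = ET.fromstring(xml_text)
    ranges = [
        (parse_date(tg.findtext("start")),
         parse_date(tg.findtext("end")),
         tg.get("uri"))
        for tg in root.findall("timegate")
    ]
    return sorted(ranges)


def select_timegate(ranges, accept_datetime):
    """Pick the TimeGate whose date range covers the requested datetime."""
    requested = parse_date(accept_datetime)
    for start, end, uri in ranges:
        if start <= requested <= end:
            return uri
    # Assumed fallback for dates outside every range: the earliest archive
    # for older dates, the latest archive otherwise.
    return ranges[0][2] if requested < ranges[0][0] else ranges[-1][2]


def redirect_location(ranges, accept_datetime, original_uri):
    """Build the redirect target: selected TimeGate URI + original URI."""
    return select_timegate(ranges, accept_datetime) + original_uri
```

Calling `redirect_location(load_ranges(TIMEGATES_XML), "Tue, 01 May 2012 00:00:00 GMT", "http://www.dans.knaw.nl")` with these entries selects the SiteStory TimeGate, since the date falls in the second range.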
If you have multiple archives behind a TimeGate Redirector, you will also need to change the TimeGate link header in the mod_sitestory section of your Apache configuration.
For example, if TimeGate Redirector is installed at http://www.theresourcedepot.com/000010D (the WAR file was renamed to 000010D.war), then the ArchiveTimeGate parameter needs to point to that URL.
Example output from TimeGate Redirector:
curl -D my.txt -I -H "Accept-datetime: Sun, 25 June 2010 12:00:00 GMT" \
  http://theresourcedepot.org/000010D/http://www.dans.knaw.nl

HTTP/1.1 302 Moved Temporarily
Date: Wed, 30 May 2012 18:46:13 GMT
Location: http://lanlproto.santafe.edu:8080/memento/timegate/http://www.dans.knaw.nl
Link:<http://www.dans.knaw.nl>;rel="original", <http://theresourcedepot.org/000010D/timemap/http://www.dans.knaw.nl>;rel="timemap index"; type="application/link-format"
Connection: close
Content-Type: text/plain; charset=UTF-8
Content-Language: nl
To facilitate discovery of multiple archives by clients, this application also provides a TimeMap index service. Example output from the TimeMap index service:
curl \
  http://theresourcedepot.org/000010D/timemap/http://www.dans.knaw.nl

<http://www.dans.knaw.nl>;rel="original",
<http://theresourcedepot.org/000010D/timemap/http://www.dans.knaw.nl>;rel="self"; type="application/link-format",
<http://lanlproto.santafe.edu:8080/list/timemap/link/http://www.dans.knaw.nl>;rel="timemap"; from="Tue, 27 Mar 2012 22:08:10 GMT"; until="Mon, 09 Apr 2012 23:35:02 GMT"; type="application/link-format",
<http://www.theresourcedepot.com/000010/timemap/link/http://www.dans.knaw.nl>;rel="timemap"; from="Mon, 09 Apr 2012 23:35:03 GMT"; until="Fri, 21 Sep 2012 20:24:40 GMT"; type="application/link-format"
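A client could use the TimeMap index above to discover which archive covers which period. The following is a minimal Python sketch of such a client-side step, assuming the response body has the shape shown in the example; the function names are hypothetical, and the regular expressions handle only quoted attribute values as they appear in this output, not the full link-format grammar.

```python
import re

# One link-format entry: <uri> followed by zero or more ;name="value" pairs.
ENTRY = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*"[^"]*")*)')
ATTR = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')


def parse_link_format(text):
    """Split a link-format body into dicts with 'uri' plus its attributes."""
    entries = []
    for uri, attrs in ENTRY.findall(text):
        entry = {"uri": uri}
        for name, value in ATTR.findall(attrs):
            entry[name] = value.strip()
        entries.append(entry)
    return entries


def archive_timemaps(text):
    """Keep only the entries whose rel value includes 'timemap'."""
    return [e for e in parse_link_format(text)
            if "timemap" in e.get("rel", "").split()]


# Body of the example response, joined into one string.
SAMPLE = (
    '<http://www.dans.knaw.nl>;rel="original", '
    '<http://theresourcedepot.org/000010D/timemap/http://www.dans.knaw.nl>'
    ';rel="self"; type="application/link-format", '
    '<http://lanlproto.santafe.edu:8080/list/timemap/link/http://www.dans.knaw.nl>'
    ';rel="timemap"; from="Tue, 27 Mar 2012 22:08:10 GMT"'
    '; until="Mon, 09 Apr 2012 23:35:02 GMT"; type="application/link-format", '
    '<http://www.theresourcedepot.com/000010/timemap/link/http://www.dans.knaw.nl>'
    ';rel="timemap"; from="Mon, 09 Apr 2012 23:35:03 GMT"'
    '; until="Fri, 21 Sep 2012 20:24:40 GMT"; type="application/link-format"'
)

for entry in archive_timemaps(SAMPLE):
    print(entry["from"], "->", entry["until"], entry["uri"])
```

The commas inside the quoted dates are not mistaken for entry separators, because each attribute value is consumed as a whole quoted string.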
The index TimeMap is constructed from the timemap entries of the timegates.xml config file.
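As a rough illustration of that construction, the Python sketch below turns timegates.xml entries into index TimeMap lines. It is an assumption-laden simplification, not the service's code: the function name and the index_uri parameter are hypothetical, and the start and end values are copied verbatim from the config. The example output above shows a different TimeMap path and until value than the example config, so the running service presumably used another configuration or applies extra logic that this sketch does not model.

```python
import xml.etree.ElementTree as ET

# Entries from the timegates.xml example (XML declaration omitted).
TIMEGATES_XML = r"""
<timegates>
  <timegate uri="http://lanlproto.santafe.edu:8080/memento/timegate/">
    <start>Tue\, 27 Mar 2012 22:08:10 GMT</start>
    <end>Mon\, 09 Apr 2012 23:35:02 GMT</end>
    <timemap uri="http://lanlproto.santafe.edu:8080/memento/timemap/link/" />
  </timegate>
  <timegate uri="http://www.theresourcedepot.com/000010/timegate/">
    <start>Mon\, 09 Apr 2012 23:35:03 GMT</start>
    <end>Tue\, 19 Apr 2031 12:00:02 GMT</end>
    <timemap uri="http://www.theresourcedepot.com/000010/timemap/link/" />
  </timegate>
</timegates>
"""


def build_index_timemap(xml_text, original_uri, index_uri):
    """Return a link-format index TimeMap for original_uri."""
    root = ET.fromstring(xml_text)
    links = [
        f'<{original_uri}>;rel="original"',
        f'<{index_uri}{original_uri}>;rel="self"; type="application/link-format"',
    ]
    for tg in root.findall("timegate"):
        # Remove the backslash that escapes the comma in the config dates.
        start = tg.findtext("start").strip().replace("\\,", ",")
        end = tg.findtext("end").strip().replace("\\,", ",")
        timemap_uri = tg.find("timemap").get("uri")
        links.append(
            f'<{timemap_uri}{original_uri}>;rel="timemap"; from="{start}"; '
            f'until="{end}"; type="application/link-format"'
        )
    return ",\n".join(links)


print(build_index_timemap(
    TIMEGATES_XML,
    "http://www.dans.knaw.nl",
    "http://theresourcedepot.org/000010D/timemap/",
))
```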