1. What are Apache Filters?
The Apache version 2 API was rewritten to make Apache much easier to
extend. One of the major changes was the introduction of a filter API
which allows you to write code which examines and possibly modifies the
request data flowing into the web server from the client and the response
data flowing back from the web server to the client.
This data may flow through various filters, which may transform it in
various ways: insert content (e.g. mod_include), encrypt it (mod_ssl), or
compress it (mod_deflate).
In a nutshell, the Apache 2 filter framework uses the same idea as Unix
command-line filters: ps ax | grep "apache.*httpd" | wc -l
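The pipeline idea can be sketched in plain C. This is a toy illustration
only: it does not use the Apache filter API, and the names filter_fn,
upper_filter, underscore_filter and run_chain are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* A filter transforms a buffer in place. Toy stand-in, not the Apache API. */
typedef void (*filter_fn)(char *buf, size_t len);

/* Upper-case every byte, much like `tr a-z A-Z`. */
void upper_filter(char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (char)toupper((unsigned char)buf[i]);
}

/* Replace each space with an underscore, much like `tr ' ' _`. */
void underscore_filter(char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] == ' ')
            buf[i] = '_';
}

/* Pass the buffer through each filter in order, as a shell pipeline would. */
void run_chain(const filter_fn *chain, size_t n, char *buf, size_t len)
{
    assert(chain != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        chain[i](buf, len);
}
```

Calling run_chain with the two filters above, in that order, on a buffer
holding "hello world" leaves "HELLO_WORLD" in the buffer. Each stage sees
only the output of the stage before it.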
2. What is the APR that mod_sitestory.c is written with?
The Apache Portable Runtime (APR) and Utilities (APR-UTILS or APU) are
a pair of C libraries used by the Apache daemon. The main purpose of the APR is to
provide a portable, platform-independent layer for applications. Prior to
the APR, Apache code was littered with platform-specific, conditionally
compiled code, which made the code hard to read and maintain. The APR
delivers an almost totally uniform API regardless of the runtime platform
by abstracting away operating system differences.
3. What does mod_sitestory do?
The mod_sitestory module is implemented as an Apache version 2 output filter. It
is inserted into the chain of Apache filters after the response body and
response headers have been generated. The filter opens a TCP/IP
socket to the SiteStory Web Archive and sends data
(request and response headers and the response body) using the HTTP PUT
method. By using chunked transfer encoding, the mod_sitestory filter can
send data to the archive as soon as it reads that data from the output of the previous
handler or filter. This means data processing is integrated cleanly and efficiently
with the Apache chain of events.
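Chunked transfer encoding is what allows data to be sent before the total
body length is known. Each piece is framed as its length in hexadecimal,
CRLF, the data, CRLF, and a zero-length chunk ends the body. The sketch
below shows that framing in standalone C. The function name format_chunk
is hypothetical, and this is not taken from the mod_sitestory source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Frame one piece of data as an HTTP/1.1 chunk:
 *   <length in hex> CRLF <data> CRLF
 * Returns the number of bytes written to out, or 0 if out is too small.
 * Passing len == 0 produces the terminating chunk "0\r\n\r\n". */
size_t format_chunk(const char *data, size_t len, char *out, size_t out_cap)
{
    assert(out != NULL && (data != NULL || len == 0));

    /* Reject early so the size arithmetic below cannot overflow. */
    if (len > out_cap)
        return 0;

    char head[32];
    int head_len = snprintf(head, sizeof head, "%zx\r\n", len);
    if (head_len < 0 || (size_t)head_len >= sizeof head)
        return 0;

    size_t total = (size_t)head_len + len + 2;
    if (total > out_cap)
        return 0;

    memcpy(out, head, (size_t)head_len);
    if (len > 0)
        memcpy(out + head_len, data, len);
    memcpy(out + head_len + len, "\r\n", 2);
    return total;
}
```

A sender would call format_chunk once for each buffer it reads from the
previous filter and write the result to the socket, then call it with a
length of 0 to finish the request body.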
4. What if I do not want to archive some selected directories on my Apache server?
The provided configuration options make it possible to suppress
insertion of the mod_sitestory filter for some directories on the Apache server. To
illustrate, assume that an institution does not want to archive search
result pages, which are served at http://<host>/search?author=bla.
The corresponding statement in the mod_sitestory module configuration is:
Excluded /search
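One way to picture such an exclusion rule is as a prefix test on the
request path. The sketch below assumes simple prefix matching; the
module's actual matching rules are not described here and may differ.
The function name is_excluded is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if path starts with any of the n excluded prefixes, else 0.
 * Assumption: plain prefix matching; the real module may behave differently. */
int is_excluded(const char *path, const char *const *prefixes, size_t n)
{
    assert(path != NULL && (prefixes != NULL || n == 0));

    for (size_t i = 0; i < n; i++) {
        size_t plen = strlen(prefixes[i]);
        /* Skip empty prefixes so they do not match every path. */
        if (plen > 0 && strncmp(path, prefixes[i], plen) == 0)
            return 1;
    }
    return 0;
}
```

Under this assumption, a prefix list containing "/search" would exclude
/search?author=bla and leave /index.html eligible for archiving.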
5. I have several virtual hosts configured on my Apache server. Can I
archive just one domain?
The mod_sitestory filter can be configured per virtual host. If the mod_sitestory
configuration is not present in a virtual host's configuration section, the mod_sitestory filter is, by default, disabled for the corresponding domain.
6. I want to turn off transactional archiving but keep the TimeGate Link header so that I can use a Memento client with the Internet Archive's (IA) Wayback archive.
Use the configuration
EnableArchiving Off
and point ArchiveTimeGate to the IA TimeGate:
ArchiveTimeGate http://api.wayback.archive.org/memento/timegate/