Introduction: A common resource versioning pattern |
|
The Memento protocol is widely supported by web archives. It provides a uniform
HTTP-based approach to access Mementos, archived versions of web resources, in distributed web archives around the world. Although
the Memento protocol can provide the same uniform interface to access resource versions in systems such as wikis, content management systems,
and software versioning systems, its applicability in that realm is not well understood. This document provides clarifications and details various
ways in which resource versioning systems can support the Memento protocol.
|
The Memento protocol is closely aligned with a common
resource versioning pattern that consists of:
|
|
Supporting Memento in systems that follow such a versioning pattern
does not necessarily require implementing all aspects of the protocol.
There are situations in which a little bit of Memento, for example only providing Memento HTTP headers and links with
Memento relation types, can go a long way.
|
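This document describes, from simple to more complex, the ways in which a system that hosts resource versions
can support the Memento framework: providing HTTP response headers for resource versions, publishing a TimeMap, and exposing a TimeGate that supports datetime negotiation.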
|
| The Memento HTTP headers and link relation types used in this document are registered in the Permanent IANA Message Header Registry, and the IANA Link Relation Type Registry, respectively. They are formally defined in RFC 7089. |
Providing HTTP response headers for resource versions to convey version date and links |
A version resource is a resource with frozen content that was current for a certain period in time.
It encapsulates a prior state of the generic resource, identified by its generic URI.
Hence, a version resource is a Memento as defined in the Memento specification,
and the appropriate Memento HTTP headers can be applied. The headers are used
to express the version datetime, to relate a version URI to the corresponding generic URI, and to support
navigation between versions. In essence, they convey, in a machine-actionable manner, the version information that is
displayed at the top of the W3C specification.
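As a rough illustration of the headers described above, the following Python sketch builds a Memento-Datetime header and a Link header for one version resource. The helper names `http_date` and `version_headers` are hypothetical, and the midnight GMT timestamps are assumptions, since only the dates of the versions are known here.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def http_date(dt):
    """Format a timezone-aware datetime as an HTTP date in GMT."""
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)


def version_headers(generic_uri, version_dt, prev_version=None, next_version=None):
    """Build Memento response headers for one version resource.

    prev_version and next_version are optional (uri, datetime) tuples.
    """
    # Relate the version URI to the generic URI.
    links = [f'<{generic_uri}>; rel="original"']
    # Support navigation between versions.
    for rel, neighbour in (("prev memento", prev_version), ("next memento", next_version)):
        if neighbour is not None:
            uri, dt = neighbour
            links.append(f'<{uri}>; rel="{rel}"; datetime="{http_date(dt)}"')
    return {
        "Memento-Datetime": http_date(version_dt),
        "Link": ", ".join(links),
    }


# Assumed timestamps: midnight GMT on the dates embedded in the version URIs.
headers = version_headers(
    "http://www.w3.org/TR/webarch/",
    datetime(2004, 8, 16, tzinfo=timezone.utc),
    next_version=(
        "http://www.w3.org/TR/2004/PR-webarch-20041105/",
        datetime(2004, 11, 5, tzinfo=timezone.utc),
    ),
)
for name, value in headers.items():
    print(f"{name}: {value}")
```

A server would add these headers to the response for the version URI; the version's own content stays unchanged.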
|
|
|
Publishing a TimeMap, a list of resource versions |
A system that hosts resource versions can publish a TimeMap, a document that lists the version URI and version date
of all resource versions as well as the associated generic URI.
In essence, a TimeMap is very similar to the calendar view the Wayback Machine provides of Mementos it holds for a certain resource,
with a TimeMap providing the information in a machine-actionable manner.
The picture below shows
the Wayback Machine's calendar view for Mementos of the W3C specification.
|
|
To be self-contained, the TimeMap also lists
its own URI. In a full Memento implementation, which also entails
exposing a TimeGate, the TimeMap lists the URI of the TimeGate as well.
Because HTTP Link headers are used throughout the Memento protocol, the default serialization of a TimeMap
is the same as that of the content of the HTTP Link header, defined in RFC 5988.
Its media type is application/link-format, defined in RFC 6690.
A JSON format for TimeMaps is currently under consideration.
|
In the ongoing example of the Architecture of the World Wide Web specification,
a TimeMap could be published at, for example, http://www.w3.org/TR/timemap/webarch/ and the response to an HTTP GET
on that URI would be:
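A minimal sketch of how a server might generate such a TimeMap body in application/link-format is given here. The function name `serialize_timemap` is hypothetical, only the two versions named in this document are listed, and their midnight GMT timestamps are assumptions.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def serialize_timemap(generic_uri, timemap_uri, versions, timegate_uri=None):
    """Serialize a TimeMap in application/link-format.

    versions is a list of (version_uri, timezone-aware datetime) tuples.
    """
    entries = [f'<{generic_uri}>; rel="original"']
    if timegate_uri is not None:
        # Only a full implementation that exposes a TimeGate lists it.
        entries.append(f'<{timegate_uri}>; rel="timegate"')
    # The TimeMap lists its own URI.
    entries.append(f'<{timemap_uri}>; rel="self"; type="application/link-format"')
    for uri, dt in sorted(versions, key=lambda version: version[1]):
        stamp = format_datetime(dt.astimezone(timezone.utc), usegmt=True)
        entries.append(f'<{uri}>; rel="memento"; datetime="{stamp}"')
    return ",\n".join(entries)


timemap = serialize_timemap(
    "http://www.w3.org/TR/webarch/",
    "http://www.w3.org/TR/timemap/webarch/",
    [
        ("http://www.w3.org/TR/2004/PR-webarch-20041105/", datetime(2004, 11, 5, tzinfo=timezone.utc)),
        ("http://www.w3.org/TR/2004/WD-webarch-20040816/", datetime(2004, 8, 16, tzinfo=timezone.utc)),
    ],
)
print(timemap)
```

The body would be served with the Content-Type application/link-format. Sorting by datetime keeps the listing chronological regardless of the order in which versions are supplied.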
|
The published TimeMap is made discoverable by means of a link with the timemap relation type
provided in the Link header in responses to HTTP HEAD/GET requests
against each version URI as well as the generic URI.
Using http://www.w3.org/TR/2004/PR-webarch-20041105/ as an example version URI, the
response to an HTTP HEAD/GET request against that URI, omitting links among versions for brevity, becomes:
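On the client side, discovering the TimeMap amounts to reading the Link header and picking the link with the timemap relation type. The sketch below uses a simplified regular-expression parser that is sufficient for headers of the shape used in this document but is not a complete Link header parser; `parse_link_header` and `find_by_rel` are hypothetical names, and the sample header value is an assumption about what such a response could contain.

```python
import re

# One link: <uri> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
PARAM_RE = re.compile(r'([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value):
    """Return a list of (uri, params) pairs from a Link header value."""
    links = []
    for uri, raw_params in LINK_RE.findall(value):
        params = {}
        for name, quoted, bare in PARAM_RE.findall(raw_params):
            params[name.lower()] = quoted or bare
        links.append((uri, params))
    return links


def find_by_rel(links, rel):
    """Return the URIs whose rel attribute contains the given relation type."""
    return [uri for uri, params in links if rel in params.get("rel", "").split()]


# Assumed Link header value for the example version URI.
link_header = (
    '<http://www.w3.org/TR/webarch/>; rel="original", '
    '<http://www.w3.org/TR/timemap/webarch/>; rel="timemap"; '
    'type="application/link-format"'
)
links = parse_link_header(link_header)
print(find_by_rel(links, "timemap"))
```

Splitting the rel value on whitespace matters because Memento links can carry several relation types at once, for example "prev memento".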
|
Exposing a TimeGate that supports datetime negotiation to access resource versions |
The most powerful feature of the Memento protocol is datetime negotiation, a variation on content negotiation.
In datetime negotiation, an HTTP HEAD/GET is issued against the generic URI, including
an Accept-Datetime request header that conveys the datetime of the resource version that is desired by the client.
Through the intermediation of a TimeGate that has access to the version
history of the original resource, the client will be led to the resource version (and hence the version URI)
that was the current one at the datetime expressed by the client.
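The core of a TimeGate is the selection of the version that was current at the requested datetime. The following sketch shows one way to do this with a binary search over the version history; `select_version` is a hypothetical name, the version start times are assumed to be midnight GMT, and falling back to the first version for datetimes that predate the history is a design choice of this sketch.

```python
from bisect import bisect_right
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def select_version(versions, accept_datetime):
    """Pick the version that was current at the requested datetime.

    versions: list of (version_uri, start_datetime), sorted by start_datetime,
    where start_datetime is timezone-aware.
    accept_datetime: the value of the Accept-Datetime request header.
    """
    wanted = parsedate_to_datetime(accept_datetime)
    starts = [start for _, start in versions]
    # Index of the last version that started at or before the wanted datetime.
    index = bisect_right(starts, wanted) - 1
    # A datetime before the first version falls back to the first version.
    return versions[max(index, 0)][0]


history = [
    ("http://www.w3.org/TR/2004/WD-webarch-20040816/", datetime(2004, 8, 16, tzinfo=timezone.utc)),
    ("http://www.w3.org/TR/2004/PR-webarch-20041105/", datetime(2004, 11, 5, tzinfo=timezone.utc)),
]
print(select_version(history, "Sat, 11 Sep 2004 00:00:00 GMT"))
```

The selected version URI is what the TimeGate would then communicate to the client, typically in the Location header of a redirect.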
|
Many datetime negotiation implementation patterns exist, detailed in the
Datetime Negotiation section of the Memento protocol,
to accommodate situations in which:
|
For wikis, content management systems, and software versioning systems that follow the versioning approach
illustrated at the beginning of this document, the typical implementation of datetime negotiation is one of the following
patterns detailed in RFC 7089:
|
Following Pattern 1.1,
for the ongoing example of the Architecture of the World Wide Web specification,
the TimeGate would
coincide with the generic URI http://www.w3.org/TR/webarch/. Let's assume a client wants to access the
version of the specification that was current on September 11 2004. In order to do so, the client issues the following HTTP HEAD (or GET) request
against the specification's generic URI http://www.w3.org/TR/webarch/:
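In Python, such a request could be prepared as sketched here with the standard library. The time of day in the Accept-Datetime value (midnight GMT) is an assumption, and the request is only constructed, not sent.

```python
from urllib.request import Request

# HEAD request against the generic URI, carrying the desired datetime.
request = Request(
    "http://www.w3.org/TR/webarch/",
    method="HEAD",
    headers={"Accept-Datetime": "Sat, 11 Sep 2004 00:00:00 GMT"},
)
print(request.get_method(), request.full_url)
print(request.header_items())
```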
The response to this request is shown below. It is a redirection to version URI http://www.w3.org/TR/2004/WD-webarch-20040816/,
which was the current version between August 16 2004 and November 5 2004. Note the use of the Vary header to indicate
the resource supports datetime negotiation. Note also the Link header that indicates that the original resource coincides
with its TimeGate. It also includes a link to the previously discussed TimeMap.
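A server-side sketch of assembling such a redirect is shown here. The function name `timegate_redirect` is hypothetical, and the exact header values are assumptions modelled on the description above; when the TimeGate has its own URI, the same helper emits a plain original link instead of a combined one.

```python
def timegate_redirect(generic_uri, timegate_uri, timemap_uri, version_uri):
    """Build the status code and headers for a TimeGate redirect."""
    # When the original resource acts as its own TimeGate, a single link
    # carries both relation types.
    original_rel = "original timegate" if generic_uri == timegate_uri else "original"
    links = [
        f'<{generic_uri}>; rel="{original_rel}"',
        f'<{timemap_uri}>; rel="timemap"; type="application/link-format"',
    ]
    headers = {
        "Vary": "accept-datetime",  # signals support for datetime negotiation
        "Location": version_uri,    # the selected version
        "Link": ", ".join(links),
    }
    return 302, headers


status, headers = timegate_redirect(
    "http://www.w3.org/TR/webarch/",
    "http://www.w3.org/TR/webarch/",
    "http://www.w3.org/TR/timemap/webarch/",
    "http://www.w3.org/TR/2004/WD-webarch-20040816/",
)
print(status)
for name, value in headers.items():
    print(f"{name}: {value}")
```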
The client can then proceed to access the desired resource version by issuing an HTTP GET request against the version URI http://www.w3.org/TR/2004/WD-webarch-20040816/. The response to that request
includes the Memento-Datetime header and all previously discussed links:
|
Following Pattern 2.1, the TimeGate would not
coincide with the generic URI and could, for example, be exposed at http://www.w3.org/TR/timegate/webarch/.
Let's again assume a client wants to access the
version of the specification that was current on September 11 2004. In order to do so, the client issues the following HTTP HEAD (or GET) request
against the specification's generic URI http://www.w3.org/TR/webarch/:
The response to that request includes a link to the TimeGate http://www.w3.org/TR/timegate/webarch/
associated with the original resource http://www.w3.org/TR/webarch/:
The client then engages in datetime negotiation with that TimeGate by issuing an HTTP HEAD request against its URI http://www.w3.org/TR/timegate/webarch/ including the Accept-Datetime header:
The response to this request is a redirection to the version URI http://www.w3.org/TR/2004/WD-webarch-20040816/,
which was the current version between August 16 2004 and November 5 2004. Note the use of the Vary header to indicate
the resource supports datetime negotiation. Note also the Link header that points to the original resource and
to the previously discussed TimeMap.
The client can then proceed to access the desired resource version by issuing an HTTP GET request against the version URI http://www.w3.org/TR/2004/WD-webarch-20040816/. The response to that request
includes the Memento-Datetime header and all previously discussed links:
|
Special Case: Resources frozen in time and place |
|
The content of many resources stops changing some time after their initial publication,
or is even intentionally stable from the time of publication.
Once such resources stop changing, they are frozen in time and are therefore Mementos. Examples are tweets, news stories, and blog posts.
In addition, these kinds of resources remain available at their original URI of publication
and hence are also frozen in place. As a result, they are at the same time the original resource and its Memento.
Take as an example the tweet https://twitter.com/hvdsomp/status/401134723210416128 posted on
November 14 2013 at 4:49 PM Mountain Time.
|
|
The Memento protocol allows expressing that a resource is frozen in time and in place:
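One possible way to express this in response headers, sketched under the assumption that the resource carries a Memento-Datetime header and a link with the original relation type that points to its own URI, is shown here. The helper name `frozen_resource_headers` is hypothetical, and the seconds of the posting time are assumed to be zero.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def frozen_resource_headers(uri, frozen_at):
    """Build headers for a resource that is both original and Memento."""
    stamp = format_datetime(frozen_at.astimezone(timezone.utc), usegmt=True)
    return {
        "Memento-Datetime": stamp,          # frozen in time
        "Link": f'<{uri}>; rel="original"',  # frozen in place: points to itself
    }


# Mountain Time in mid-November is Mountain Standard Time, UTC-7.
mountain_standard = timezone(timedelta(hours=-7))
posted = datetime(2013, 11, 14, 16, 49, tzinfo=mountain_standard)
headers = frozen_resource_headers(
    "https://twitter.com/hvdsomp/status/401134723210416128", posted
)
for name, value in headers.items():
    print(f"{name}: {value}")
```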
|