Put Protocol

The SiteStory software includes the mod_sitestory plugin for the Apache web server. A push plug-in for SiteStory can also be implemented for other content servers, or as a custom client in any programming language. The SiteStory archive exposes an interface that uses the HTTP PUT method to accept the data of a client's request to the original server together with the server's response. The following URL pattern is used to submit the request/response pair:

http://[host]:[port]/sitestory/put/[original_url]
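
As a minimal sketch of the pattern above, the PUT URL can be assembled by appending the original URL to the archive's put endpoint. The class name, the method name, and the host and port values (localhost, 8080) are hypothetical. The original URL is appended verbatim, as in the examples on this page; whether SiteStory expects any percent-encoding is not stated here, so treat that as an assumption.

```java
final class PutUrlSketch {

    // Follows the pattern http://[host]:[port]/sitestory/put/[original_url].
    // The original URL is appended as-is (assumption: no extra encoding needed).
    static String putUrl(String host, int port, String originalUrl) {
        return "http://" + host + ":" + port + "/sitestory/put/" + originalUrl;
    }

    public static void main(String[] args) {
        // Hypothetical archive location; replace with the real SiteStory host and port.
        System.out.println(putUrl("localhost", 8080,
                "http://www.test.knaw.nl/misc/jquery.js?r"));
    }
}
```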

The submitted data consists of the request line, the http_request_headers, an empty line, the status-line of the response sent to the client, the http_response_headers, an empty line, and an optional message body. The request line, the status-line, and every header must end with <CR><LF> (a carriage return followed by a line feed). Each empty line must consist of <CR><LF> only, with no other whitespace. The PUT request that carries this compound message has HTTP headers of its own: either calculate the length of the compound message body and send it in a Content-Length header (for example, Content-Length: 3495), or send the PUT request in chunked form (Transfer-Encoding: chunked). Note that the URL-M (the URL of the memento) is constructed from the original http_request_headers, using the Host header and the URL portion of the GET request line.
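
The layout described above can be sketched with plain JDK classes, which may help when writing a client without an HTTP library at hand. The class and method names are hypothetical, and UTF-8 is an assumed encoding for counting bytes. The value to send in the Content-Length header of the PUT request is the byte length of the whole compound message, not the length of the archived message body alone.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class CompoundBodySketch {
    static final String CRLF = "\r\n";

    // Builds "first line CRLF, header CRLF ..., CRLF": a request line or
    // status-line, its headers, and the empty line that terminates the block.
    static String headerBlock(String firstLine, Map<String, String> headers) {
        StringBuilder sb = new StringBuilder(firstLine).append(CRLF);
        for (Map.Entry<String, String> h : headers.entrySet()) {
            sb.append(h.getKey()).append(": ").append(h.getValue()).append(CRLF);
        }
        return sb.append(CRLF).toString();
    }

    // Request block, response block, then the optional message body.
    static String compoundMessage(String requestBlock, String responseBlock, String messageBody) {
        return requestBlock + responseBlock + (messageBody == null ? "" : messageBody);
    }

    public static void main(String[] args) {
        Map<String, String> reqHeaders = new LinkedHashMap<>();
        reqHeaders.put("Host", "www.test.knaw.nl");

        String page = "<html><title>Hello World!</title><body></body></html>";
        Map<String, String> resHeaders = new LinkedHashMap<>();
        resHeaders.put("Content-Type", "text/html; charset=UTF-8");
        resHeaders.put("Content-Length",
                String.valueOf(page.getBytes(StandardCharsets.UTF_8).length));

        String message = compoundMessage(
                headerBlock("GET /misc/jquery.js?R HTTP/1.1", reqHeaders),
                headerBlock("HTTP/1.1 200 OK", resHeaders),
                page);

        // Byte length of the compound message: the Content-Length of the PUT request.
        System.out.println(message.getBytes(StandardCharsets.UTF_8).length);
    }
}
```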

The body of a PUT request to SiteStory looks like this:


GET /misc/jquery.js?R HTTP/1.1
User-Agent:Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1)
Referer:http://www.dans.knaw.nl/en/content/geospatial-sciences-data-collection
Accept:*/*
Accept-Language:en-US,nl;q=0.5
Accept-Encoding:gzip, deflate
Connection:keep-alive
Host:www.test.knaw.nl 
X-Client-IP:127.255.255.255

HTTP/1.1 200 OK
Date: Mon, 30 Jul 2009 14:29:09 GMT
Server: Apache
Content-Length:136
Connection: close
Content-Type: text/html; charset=UTF-8

<html><title>Hello World!</title><body>
<p><font size="14">Hello World! Page created at  Sun, 30 Jul 2009 </font>
</p></body></html>
 

To illustrate the general idea of pushing content to SiteStory, here is an example client in Java:


....
import java.io.IOException;

import org.apache.commons.httpclient.Header;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.methods.PutMethod;
....

public static void test_put() throws HttpException, IOException {
    HttpClient mClient = new HttpClient();

    // Message body of the archived response.
    String bodyonly = "<html><title>Hello World!</title><body>" +
                      "<p><font size=\"14\">Hello World! Page created at  Sun, 30 Jul 2009 </font>" +
                      "</p></body></html>";
    // Content-Length counts bytes, not characters.
    int size = bodyonly.getBytes("UTF-8").length;

    // Request line and headers of the original client request; every line ends with CRLF.
    String req_header = "GET /misc/jquery.js?R HTTP/1.1\r\n" +
                        "User-Agent:Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1)\r\n" +
                        "Referer:http://www.dans.knaw.nl/en/content/geospatial-sciences-data-collection\r\n" +
                        "Accept:*/*\r\n" +
                        "Accept-Language:en-US,nl;q=0.5\r\n" +
                        "Accept-Encoding:gzip, deflate\r\n" +
                        "Connection:keep-alive\r\n" +
                        "Host:www.test.knaw.nl\r\n";

    // Status-line and headers of the original server response.
    String res_header = "HTTP/1.1 200 OK\r\nDate: Mon, 30 Jul 2009 14:29:09 GMT\r\nServer: Apache\r\nContent-Length:" + size +
                        "\r\nConnection: close\r\nContent-Type: text/html; charset=UTF-8\r\n";

    // Compound message: request block, empty line, response block, empty line, body.
    String body = req_header + "\r\n" + res_header + "\r\n" + bodyonly;

    PutMethod mPut = new PutMethod("http://www.theresourcedepot.org:8080/SiteStory_Testbed/put/http://www.test.knaw.nl/misc/jquery.js?r");
    try {
        mPut.setRequestBody(body);
        mClient.executeMethod(mPut);
    } finally {
        // Always release the connection, even if the request fails.
        mPut.releaseConnection();
    }
}
...
 

After executing the client code above, the TimeGate and the memento can be accessed as follows:


 curl -I http://www.theresourcedepot.org/SiteStory_Testbed/timegate/http://www.test.knaw.nl/misc/jquery.js?r
HTTP/1.1 302 Moved Temporarily
Date: Thu, 13 Jun 2013 19:45:51 GMT
Location: http://www.theresourcedepot.org/SiteStory_Testbed/memento/20090730142909/http://www.test.knaw.nl/misc/jquery.js?r
Vary: negotiate,accept-datetime
Link: <http://www.test.knaw.nl/misc/jquery.js?r>;rel="original", <http://www.theresourcedepot.org/SiteStory_Testbed/memento/20090730142909/http://www.test.knaw.nl/misc/jquery.js?r>;rel="memento first last"; datetime="Thu, 30 Jul 2009 14:29:09 GMT" , <http://www.theresourcedepot.org/SiteStory_Testbed/timemap/link/http://www.test.knaw.nl/misc/jquery.js?r>;rel="timemap"; type="application/link-format"
Connection: close
Content-Type: text/javascript


curl http://www.theresourcedepot.org/SiteStory_Testbed/memento/20090730142909/http://www.test.knaw.nl/misc/jquery.js?r
<base href="http://www.test.knaw.nl/misc/jquery.js?r" /><html><title>Hello World!</title><body><p><font size="14">Hello World! Page created at  Sun, 30 Jul 2009 </font></p></body></html>
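
In the output above, the 14-digit segment of the memento URL (20090730142909) matches the datetime given in the Link header (Thu, 30 Jul 2009 14:29:09 GMT). The following sketch, with hypothetical class and method names, converts an HTTP date into that 14-digit form and extracts the segment from a memento URL. It only illustrates the correspondence visible in this example and makes no claim about how SiteStory computes the value internally.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MementoTimestampSketch {
    private static final DateTimeFormatter URL_STAMP =
            DateTimeFormatter.ofPattern("uuuuMMddHHmmss");
    private static final Pattern MEMENTO_SEGMENT =
            Pattern.compile("/memento/(\\d{14})/");

    // Converts an HTTP date such as "Thu, 30 Jul 2009 14:29:09 GMT"
    // into the 14-digit UTC form seen in the memento URL.
    static String toUrlStamp(String httpDate) {
        ZonedDateTime t = ZonedDateTime.parse(httpDate, DateTimeFormatter.RFC_1123_DATE_TIME);
        return t.withZoneSameInstant(ZoneOffset.UTC).format(URL_STAMP);
    }

    // Returns the 14-digit segment of a memento URL, or null if there is none.
    static String stampFromMementoUrl(String mementoUrl) {
        Matcher m = MEMENTO_SEGMENT.matcher(mementoUrl);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(toUrlStamp("Thu, 30 Jul 2009 14:29:09 GMT"));
        System.out.println(stampFromMementoUrl(
                "http://www.theresourcedepot.org/SiteStory_Testbed/memento/20090730142909/"
                + "http://www.test.knaw.nl/misc/jquery.js?r"));
    }
}
```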

An optional X-Client-IP header can be appended to the client's http_request_headers; in that case the client IP information is stored in the database.
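
A minimal sketch of appending this header, assuming the request headers are held in a string whose lines each end with <CR><LF> and which does not yet include the terminating empty line (the same shape as req_header in the Java client). The class and method names are hypothetical.

```java
final class ClientIpSketch {
    static final String CRLF = "\r\n";

    // Appends X-Client-IP as one more header line; the empty line that ends
    // the request block must be added afterwards by the caller.
    static String withClientIp(String requestHeaders, String clientIp) {
        return requestHeaders + "X-Client-IP:" + clientIp + CRLF;
    }

    public static void main(String[] args) {
        String reqHeader = "GET /misc/jquery.js?R HTTP/1.1" + CRLF
                + "Host:www.test.knaw.nl" + CRLF;
        System.out.print(withClientIp(reqHeader, "127.255.255.255"));
    }
}
```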