HTTP Downgrade

From GridSiteWiki

HTTP Downgrade has now been superseded by GridHTTP, which uses the same concepts and much of the same code, but with a slightly different set of HTTP header and cookie names.




HTTP Downgrade is a protocol implemented by GridSite which supports bulk data transfers via unencrypted HTTP, while retaining authentication and authorization with the usual grid credentials over HTTPS.

The protocol allows clients to set an HTTP-Downgrade-Size: header when making an HTTPS request for a file. This header gives the minimum file size at which the client would prefer the file to be retrieved by HTTP rather than HTTPS, if possible. Authentication and authorization are done via HTTPS (with X.509, VOMS, GACL etc. deciding whether the request is allowed), and the server may then redirect the client to an HTTP version of the file using a standard HTTP 302 redirect response giving the HTTP URL (which, in the general case, can be on a different server). Smaller files can simply be returned over HTTPS as the response.
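The server-side decision described above can be sketched as follows. This is a minimal illustration, not GridSite's actual code: the function name choose_response and its parameters are hypothetical, and it assumes the HTTPS request has already passed authentication and authorization.

```python
from typing import Dict, Optional, Tuple


def choose_response(file_size: int,
                    downgrade_size_header: Optional[str],
                    http_url: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, extra_headers) for an already-authorized HTTPS GET.

    downgrade_size_header is the raw value of the client's
    HTTP-Downgrade-Size: header, or None if the header was not sent.
    """
    try:
        min_size = None if downgrade_size_header is None else int(downgrade_size_header)
    except ValueError:
        min_size = None  # unparsable header: behave as if it was absent

    # Redirect only when the client asked for it and the file is big enough.
    if min_size is not None and min_size >= 0 and file_size >= min_size:
        return 302, {"Location": http_url}

    # Otherwise the file is simply returned over HTTPS.
    return 200, {}
```

Because the server "may" redirect, a real implementation is free to return the file over HTTPS even when the size threshold is met; the sketch always redirects in that case for simplicity.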

For the redirect-to-HTTP response, a standard HTTP Set-Cookie header is used to send the client a one-time passcode in the form of a cookie, which must be presented to obtain the file via HTTP. This one-time passcode only works for the file in question, and only works once (the current implementation stores it in a file and deletes the file when the passcode is used). This is no worse than GridFTP with an unencrypted data channel: it is vulnerable to man-in-the-middle attacks or snooping that obtain a copy of the requested file, but not to replay attacks or to the attacker obtaining other files.
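The store-in-a-file, delete-on-use scheme for the passcode could look roughly like this. It is a sketch under assumptions: the names issue_passcode and redeem_passcode, the directory layout and the passcode format are all hypothetical and are not taken from the GridSite source.

```python
import os
import secrets


def issue_passcode(store_dir: str, file_path: str) -> str:
    """Create a one-time passcode bound to file_path and record it on disk."""
    passcode = secrets.token_hex(16)  # 32 random hex characters
    with open(os.path.join(store_dir, passcode), "w") as record:
        record.write(file_path)
    return passcode


def redeem_passcode(store_dir: str, passcode: str, requested_path: str) -> bool:
    """Return True once, and only for the file the passcode was issued for."""
    if not passcode.isalnum():
        return False  # rejects empty values and path separators
    record_path = os.path.join(store_dir, passcode)
    try:
        with open(record_path) as record:
            bound_path = record.read()
    except FileNotFoundError:
        return False  # unknown or already used passcode
    if bound_path != requested_path:
        return False  # passcode was issued for a different file
    try:
        os.remove(record_path)  # deleting the record makes it single-use
    except FileNotFoundError:
        return False  # a concurrent request redeemed it first
    return True
```

The passcode returned by issue_passcode would be sent in the Set-Cookie header of the 302 response, and redeem_passcode would be called when the plain HTTP request arrives with that cookie.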

Two extensions are being added to the GridSite implementation: support for variable TCP window sizes, so that it can be used for a mix of long and short distance connections (currently the TCP window size has to be set in the Apache configuration file), and support for third-party transfers using the HTTP COPY method from WebDAV.

A big advantage of redirecting to a pure HTTP GET transfer is not only that the server and client don't have to spend CPU time encrypting and decrypting the data, but also that Apache can use the sendfile() system call to tell the kernel to copy the file directly from the filesystem to the network socket (or you can use the Linux kernel module HTTP server, which has much the same effect). This means the data never has to be copied through userspace (the so-called zero-copy mode).
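To illustrate the sendfile() idea outside Apache, here is a small Python sketch. It is not how Apache or GridSite implement it; it relies on Python's socket.sendfile(), which uses the operating system's sendfile() where available and falls back to ordinary send() calls otherwise. The function name send_file_zero_copy is hypothetical.

```python
import socket


def send_file_zero_copy(sock: socket.socket, path: str) -> int:
    """Send the whole file at path over a connected, blocking stream socket.

    Where the platform supports it, the kernel copies the data straight
    from the file to the socket, so the bytes are never read into this
    process. Returns the number of bytes sent.
    """
    with open(path, "rb") as f:  # socket.sendfile() requires binary mode
        return sock.sendfile(f)
```

This only works on an unencrypted socket: with TLS the data has to pass through userspace to be encrypted, which is why the zero-copy path is available for the HTTP transfer but not the HTTPS one.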

As far as client-side APIs go, any client-side library which supports HTTP redirects and cookies and lets you add your own headers is sufficient. Even the curl command line tool lets you do this without any modifications to its code, using the -H and -c options (plus -L to follow the redirect).
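As an example of such a client, the following sketch uses only the Python standard library, whose urllib follows redirects by default and can keep cookies in a cookie jar. The helper names and the optional certificate parameters are assumptions made for illustration; the header name HTTP-Downgrade-Size is the one described above. Whether the cookie is actually returned on the redirected request depends on the cookie's attributes and the target host, which this sketch does not address.

```python
import http.cookiejar
import ssl
import urllib.request
from typing import Optional


def build_downgrade_request(url: str, min_size: int) -> urllib.request.Request:
    """HTTPS GET request asking for a downgrade if the file is at least min_size bytes."""
    # urllib normalises the header name's capitalisation; HTTP header
    # names are case-insensitive, so the server still recognises it.
    return urllib.request.Request(url, headers={"HTTP-Downgrade-Size": str(min_size)})


def build_opener(jar: http.cookiejar.CookieJar,
                 certfile: Optional[str] = None,
                 keyfile: Optional[str] = None) -> urllib.request.OpenerDirector:
    """Opener that stores cookies in jar and follows 302 redirects."""
    handlers = [urllib.request.HTTPCookieProcessor(jar)]
    if certfile is not None:
        # Present a client certificate (for example a grid credential) over HTTPS.
        context = ssl.create_default_context()
        context.load_cert_chain(certfile, keyfile)
        handlers.append(urllib.request.HTTPSHandler(context=context))
    return urllib.request.build_opener(*handlers)


def fetch(opener: urllib.request.OpenerDirector, url: str, min_size: int) -> bytes:
    """Fetch url; a 302 to plain HTTP is followed automatically by the opener."""
    with opener.open(build_downgrade_request(url, min_size)) as response:
        return response.read()
```

A caller would create a CookieJar, pass it to build_opener, and then call fetch with the HTTPS URL of the file and the size threshold it prefers.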