Background
My web application lives on a centralised server in the product's "network", and provides the means to manage/configure various distributed devices. The server also logs various statistics that arrive from each device, storing them on disk in /var/log/. The web GUI allows users to download those logs.
It also has a facility to download them in various other formats, which requires an on-the-fly translation/conversion. This conversion takes some time (on the order of thirty seconds) and produces files of around 300MB.
All of this is fine and a user can accept that downloading such a converted file is going to take some time. But I've architected myself into a bit of a corner in terms of how effectively I can actually deliver these files.
For the purposes of this question, I shall not be exploring AJAX/JavaScript/Java/Flash/multi-step/multi-page solutions. Assume that, from the user agent's perspective, the download shall be a straightforward HTTP GET request to a CGI script from clicking on an <a> element, and nothing more.
Problem
My web application is loosely MVC-architected in such a way that the controller chosen to satisfy the requested action (say, in this case: controller "devices" action "getConvertedLog") performs its business logic and sets various flags that describe how the HTTP response should be composed. Only after the controller has finished its work will the HTTP response be composed, with response headers generated and the body streamed from, in this case, a temporary file on disk.
The first problem with this is that the controller itself performs (or, at least, invokes) the file conversion, which takes some time. The HTTP headers are consequently not generated (let alone transferred) for thirty seconds or so. Not only does this result in thirty seconds of literally nothing happening in the browser (at least from my experience in Chrome) but it also puts the entire request at high risk of an HTTP 504 Gateway Timeout error from intervening proxies.
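To make the ordering concrete, here is a minimal Python sketch of the flow described above. The names `convert_log` and `handle_request` are hypothetical stand-ins rather than the application's real code, and the thirty-second conversion is simulated by a trivial transformation.

```python
import io
import tempfile


def convert_log(raw: bytes) -> bytes:
    """Hypothetical stand-in for the slow conversion step."""
    return raw.upper()


def handle_request(raw_log: bytes, out) -> None:
    """Run the controller to completion, then compose the whole CGI response."""
    # Controller phase: nothing is written to `out` while this runs.
    with tempfile.TemporaryFile() as tmp:
        tmp.write(convert_log(raw_log))
        size = tmp.tell()
        tmp.seek(0)

        # Response phase: the size is only known now, so the headers
        # cannot be generated any earlier than this point.
        headers = (
            "Status: 200 OK\r\n"
            "Content-Type: application/zip\r\n"
            f"Content-Length: {size}\r\n"
            "\r\n"
        )
        out.write(headers.encode("ascii"))

        # Stream the body from the temporary file in chunks.
        for chunk in iter(lambda: tmp.read(64 * 1024), b""):
            out.write(chunk)


if __name__ == "__main__":
    buf = io.BytesIO()
    handle_request(b"device stats\n", buf)
    print(buf.getvalue().decode("ascii"))
```

In this arrangement the first byte of the response can only leave the script after `convert_log` has returned, which is where the silent wait comes from.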
I could shuffle my code around a little so that some HTTP response headers can be transferred to the browser before the conversion begins, to at least give an indication that something is happening (and, hopefully, stave off the Gateway Timeout). But before the conversion completes I have no way of knowing how many bytes will comprise the result. Therefore, I cannot send a meaningful Content-Length header, so the user-agent cannot display progress to the user. And for a 300MB file I do not consider this to be acceptable.
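A rough Python sketch of that reshuffle follows, assuming a hypothetical `convert_log` helper and an output stream standing in for the CGI script's stdout. Whether the web server forwards the flushed headers to the client straight away depends on its own buffering, which this sketch does not model.

```python
import io


def convert_log(raw: bytes) -> bytes:
    """Hypothetical stand-in for the slow conversion step."""
    return raw.upper()


def handle_request_early(raw_log: bytes, out, convert=convert_log) -> None:
    """Send a minimal header block first, then convert, then send the body."""
    # No Content-Length here: the converted size is not known yet.
    # The status is also committed at this point.
    out.write(b"Status: 200 OK\r\nContent-Type: application/zip\r\n\r\n")
    out.flush()  # push the headers out before the slow work begins

    body = convert(raw_log)

    out.write(body)
    out.flush()


if __name__ == "__main__":
    buf = io.BytesIO()
    handle_request_early(b"device stats\n", buf)
    print(buf.getvalue().decode("ascii"))
```

The header block is complete before the conversion starts, so nothing learned during the conversion (size or failure) can be reflected in it.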
The second problem with this is that, if there is an error during conversion, the HTTP response code should be meaningful. So I cannot have sent a Status in these hypothetical pre-conversion headers.
Question
What would you do here? What's the least I need to do to indicate to user-agents and proxies that the request has been accepted and a response is coming (albeit slowly), before the success or failure and the size of the response have been determined?
I guess it would be ideal if it were legal and functional to send a very small set of headers, say:
Status: 200 OK
Content-Type: application/zip
Content-Disposition: attachment; filename="thefile.zip"
… then follow it up with the remaining headers (Set-Cookie, Cache-Control, Content-Length and so forth), some potentially replacing earlier ones (like a change in Status) and, finally, the response body.
I'm hoping the fact that Apache's CGI module translates and re-orders some headers (e.g. Status: 200 ends up in the first line of the response as HTTP/1.1 200 OK) can help here. How might HTTP 102 Processing help here?
(Update: "At least one CGI-Header must be supplied, but no CGI header can be repeated with the same field-name." [CGI 1.1, §9.2]. Rats.)
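To illustrate what that rule excludes, here is a small Python sketch that scans a CGI header block for repeated field names. The function name `find_repeated_fields` and the second `Status` line are hypothetical, made up for the example; field names are compared case-insensitively.

```python
def find_repeated_fields(header_block: str) -> list:
    """Return field names that appear more than once, in lower case."""
    seen = set()
    repeated = []
    for line in header_block.splitlines():
        if not line.strip():
            break  # a blank line ends the header block
        name = line.split(":", 1)[0].strip().lower()
        if name in seen and name not in repeated:
            repeated.append(name)
        seen.add(name)
    return repeated


# The early headers from the question, followed by a hypothetical
# later replacement of Status after a failed conversion.
proposed = (
    "Status: 200 OK\n"
    "Content-Type: application/zip\n"
    'Content-Disposition: attachment; filename="thefile.zip"\n'
    "Status: 500 Internal Server Error\n"
    "Content-Length: 0\n"
    "\n"
)

print(find_repeated_fields(proposed))  # prints ['status']
```

Under the quoted rule, a block like `proposed` would not be a valid CGI response, because `Status` is supplied twice.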