84

I'm building a RESTful API that supports queuing long-running tasks for eventual handling.

The typical workflow for this API would be:

  1. User fills in form
  2. Client posts data to API
  3. API returns 202 Accepted
  4. Client redirects user to a unique URL for that request (/results/{request_id})
  5. ~eventually~
  6. Client visits URL again, and sees the results on that page.

My trouble is on step 6. Any time a user visits the page, I file a request to my API (GET /api/results/{request_id}). Ideally, the task will have been completed by now, and I'd return a 200 OK with the results of their task.

But users are pushy, and I expect many overzealous refreshes, when the result is not yet finished processing.

What is my best option for a status code to indicate that:

  • this request exists,
  • it's not done yet,
  • but it also hasn't failed.

I don't expect a single code to communicate all of that, but I'd like something that lets me pass metadata instead of having the client expect content.

It could make sense to return a 202, since that would have no other meaning here: it's a GET request, so nothing is possibly being "accepted." Would that be a reasonable choice?

The obvious alternative to all this -- which functions, but defeats one purpose of status codes -- would be to always include the metadata:

200 OK

{
    status: "complete",
    data: {
        foo: "123"
    }
}

...or...

200 OK

{
    status: "pending"
}

Then client-side, I would (sigh) switch on response.data.status to determine whether the request was completed.
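For illustration, that client-side switch could look roughly like the Python sketch below. It assumes the body is valid JSON with quoted keys (unlike the shorthand above), and `handle_result` is a hypothetical name, not part of any real API.

```python
import json


def handle_result(body: str):
    """Interpret the always-200 envelope described above.

    Returns the task data when the task is complete, or None while it is
    still pending. Any other status is treated as an error.
    """
    payload = json.loads(body)
    status = payload.get("status")
    if status == "complete":
        return payload["data"]
    if status == "pending":
        return None
    raise ValueError(f"unexpected status: {status!r}")
```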

Is this what I should be doing? Or is there a better alternative? This just feels so Web 1.0 to me.

Matthew Haugen
  • 1
    Aren't 1xx codes made exactly for that purpose? – Andy Apr 19 '16 at 18:04
  • @Andy I was looking at 102, but that's for WebDAV stuff. Beyond that, no... They're mostly for in-transit communications. Useful in switching to Web Sockets and such. – Matthew Haugen Apr 19 '16 at 18:16
  • What kind of delay are you talking? 10 seconds? Or 6 hours? If the delays are short and generally within the same browser visit, you might do long polling or web sockets rather than periodic polling. – GrandmasterB Apr 19 '16 at 19:37
  • @GrandmasterB It's hours, potentially. I'm not responsible for the job processing itself, so I don't have a really good estimate, but it'll be a while. Otherwise, I'd just leave the first `POST` request open. The main issue with long polling or web sockets is that the user might close the browser and come back. I could open them again at that time (and that's what I do), but it seems cleaner to have a single API to call before I open those sockets, since it's an edge-case to have that problem arise. – Matthew Haugen Apr 19 '16 at 20:05
  • The response you should give depends on the headers used by the client making the request - at least if you feel that this draft is good [Reporting Progress of Long-Running Operations in HTTP](https://tools.ietf.org/id/draft-wright-http-progress-01.html) – AnorZaken Jan 18 '22 at 09:11

6 Answers

81

HTTP 202 Accepted (HTTP/1.1)

You are looking for HTTP 202 Accepted status. See RFC 2616:

The request has been accepted for processing, but the processing has not been completed.

HTTP 102 Processing (WebDAV)

RFC 2518 suggests using HTTP 102 Processing:

The 102 (Processing) status code is an interim response used to inform the client that the server has accepted the complete request, but has not yet completed it.

but it has a caveat:

The server MUST send a final response after the request has been completed.

I'm not sure how to interpret the last sentence. Should the server avoid sending anything during the processing and respond only after completion? Or does it merely require that the response end once the processing terminates? The latter reading could be useful if you want to report progress: send HTTP 102 and flush the response byte by byte (or line by line).

For instance, for a long but linear process, you can send one hundred dots, flushing after each character. If the client side (such as a JavaScript application) knows that it should expect exactly 100 characters, it can match it with a progress bar to show to the user.

Another example concerns a process which consists of several non-linear steps. After each step, you can flush a log message which would eventually be displayed to the user, so that the end user could know how the process is going.

Issues with progressive flushing

Note that while this technique has its merits, I wouldn't recommend it. One of the reasons is that it forces the connection to remain open, which could hurt in terms of service availability and doesn't scale well.

A better approach is to respond with HTTP 202 Accepted and either let the user get back to you later to determine whether the processing has ended (for instance by repeatedly calling a given URI such as /process/result, which would respond with HTTP 404 Not Found or HTTP 409 Conflict until the process finishes and the result is ready), or notify the user when the processing is done, if you're able to call the client back, for instance through a message queue service (example) or WebSockets.

Practical example

Imagine a web service which converts videos. The entry point is:

POST /video/convert

which takes a video file from the HTTP request and does some magic with it. Let's imagine that the magic is CPU-intensive, so it cannot be done in real time during the transfer of the request. This means that once the file is transferred, the server will respond with an HTTP 202 Accepted with some JSON content, meaning “Yes, I got your video, and I'm working on it; it will be ready sometime in the future and will be available through the ID 123.”

The client has a possibility to subscribe to a message queue to be notified when the processing finishes. Once it is finished, the client can download the processed video by going to:

GET /video/download/123

which leads to an HTTP 200.

What happens if the client queries this URI before receiving the notification? Well, the server will respond with HTTP 404 since, indeed, the video doesn't exist yet. It may currently be in preparation. It may never have been requested. It may have existed at some point in the past and been removed since. All that matters is that the resulting video is not available.

Now, what if the client cares not only about the final video, but also about the progress (which would be even more important if there is no message queue service or any similar mechanism)?

In this case, you can use another endpoint:

GET /video/status/123

which would return a response similar to this:

HTTP 200
{
    "id": 123,
    "status": "queued",
    "priority": 2,
    "progress-percent": 0,
    "submitted-utc-time": "2016-04-19T13:59:22"
}

Doing the request over and over will show the progress until it's:

HTTP 200
{
    "id": 123,
    "status": "done",
    "progress-percent": 100,
    "submitted-utc-time": "2016-04-19T13:59:22"
}

It is crucial to distinguish between those three types of requests:

  • POST /video/convert queues a task. It should be called only once: calling it again would queue an additional task.
  • GET /video/download/123 concerns the result of the operation: the resource is the video. The processing (that is, what happened under the hood to prepare the actual result prior to the request and independently of the request) is irrelevant here. It can be called once or several times.
  • GET /video/status/123 concerns the processing per se. It doesn't queue anything. It doesn't care about the resulting video. The resource is the processing itself. It can be called once or several times.
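
As a rough illustration of how the three endpoints differ, here is a minimal in-memory sketch in Python. It is not a real web service: the handler names, the `_jobs` store, and the `finish` helper standing in for the background worker are all assumptions made for the example, and each handler simply returns a `(status_code, body)` pair.

```python
import itertools

# Hypothetical in-memory job store; a real service would use a queue and a database.
_jobs = {}
_ids = itertools.count(123)


def post_convert(video_bytes: bytes):
    """POST /video/convert: queue a new task and answer 202 Accepted with its ID."""
    job_id = next(_ids)
    _jobs[job_id] = {"status": "queued", "progress-percent": 0, "result": None}
    return 202, {"id": job_id}


def get_status(job_id: int):
    """GET /video/status/{id}: the resource is the processing itself."""
    job = _jobs.get(job_id)
    if job is None:
        return 404, None
    return 200, {"id": job_id, "status": job["status"],
                 "progress-percent": job["progress-percent"]}


def get_download(job_id: int):
    """GET /video/download/{id}: the resource is the converted video."""
    job = _jobs.get(job_id)
    if job is None or job["result"] is None:
        return 404, None  # the video is not available, whatever the reason
    return 200, job["result"]


def finish(job_id: int, converted: bytes):
    """Stand-in for the background worker completing the conversion."""
    _jobs[job_id].update({"status": "done", "progress-percent": 100,
                          "result": converted})
```

Calling `post_convert` twice queues two separate jobs, while `get_status` and `get_download` can be repeated freely without changing anything.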
Arseni Mourzenko
  • RFC 2616 is out of date. See 7230 - 7235. – Eric Stein Apr 19 '16 at 18:19
  • 3
    Does a 202 make sense in response to a `GET`, though? That's certainly the correct choice for the initial `POST`, which is why I'm using it. But it seems semantically suspicious for a `GET` to say "accepted" when it didn't accept anything from that particular request. – Matthew Haugen Apr 19 '16 at 18:21
  • I totally agree with your last paragraph, but that's my question: "either let the user to get back to you later to determine whether the processing ended," how should I tell the client that it hasn't? – Matthew Haugen Apr 19 '16 at 18:22
  • @MatthewHaugen: I don't think `GET` is a good idea anyway in this context. You said that the request “queu[es] long-running tasks”, which should be done through `POST` (otherwise, you'll end up queuing the same item multiple times, or not queuing it at all). As for telling the client that the response isn't ready yet during polling, a simple `HTTP 404` would work. – Arseni Mourzenko Apr 19 '16 at 18:40
  • 2
    @MainMa Like I said, I `POST` up the job to be queued, then I `GET` the results, potentially after the client has closed the session. A 404 is something I've considered as well, but it seems wrong, since the request *is* found, it just hasn't been completed. That would indicate to me that the queued job was not found, which is a very different situation. – Matthew Haugen Apr 19 '16 at 19:06
  • 2
    @MatthewHaugen: when you do the `GET` part, don't think about it as a *incomplete request*, but as a request *to get the result of the operation*. For instance, if I tell you to convert a video and it takes you five minutes to do it, requesting for *a converted video* two minutes later **should** result in HTTP 404, because the video is simply not there yet. Requesting for the progress of the operation itself, on the other hand, will probably result in an HTTP 200 containing the number of bytes converted, the speed, etc. – Arseni Mourzenko Apr 19 '16 at 19:11
  • 9
    [HTTP Status Code for Resource not yet available](http://stackoverflow.com/a/11093566/18192) suggests returning a 409 conflict response ("The request could not be completed due to a conflict with the current state of the resource. "), rather than a 404 response, in the case that a resource doesn't exist because it is in the middle of being generated. – Brian Apr 20 '16 at 13:22
  • 1
    @Brian Your comment would make a reasonable answer to this question. Although I would then respond with "[t]his code is only allowed in situations where it is expected that the user might be able to resolve the conflict and resubmit the request," which isn't strictly true in my case, but that seems less wrong than "not found." A part of me is leaning toward a 409 with a Retry-After header pinned on. The only issue is that it seems weird to return a 409 for a GET, but I can live with that weirdness--it's unlikely to be otherwise-defined in the future. – Matthew Haugen Apr 20 '16 at 16:15
  • @MatthewHaugen: OK, it's an answer. – Brian Apr 20 '16 at 16:55
  • @ArseniMourzenko semantically, I think a more natural REST design is: `POST /videos`, and `GET /videos/:id`. _status_ would be a field on the video's record. The resource would also have url field where a client could query to get the actual video (which might be `/download/:id`, but also might be `/videos/:id/raw` perhaps with a content header or query param to select the desired format). – David Cowden Apr 30 '20 at 10:38
  • @ArseniMourzenko Thanks for this response, and it got me thinking. As our own web server provides server-sent-event streams it occurred to me that the original POST could be responded to with a "text/event-stream" stream, providing regular status messages culminating in the url to download the finished content. – balrob Dec 31 '20 at 02:37
  • Instead of still responding with `200`, what do you think of redirecting the client to the processing result with `303` after the request has been completed, like suggested in the article [*REST and long-running jobs*](https://farazdagi.com/posts/2014-10-16-rest-long-running-jobs/)? – Géry Ogam Jun 24 '21 at 00:32
  • 2
    @Maggyero: I think it is a bad idea for three reasons. (1) A resource which completely changes its type depending on some internal state of the system is counter-intuitive. (2) As a caller, when I ask for status, I may want to get back status, not the whole video; I might have a very different processing for videos themselves; maybe even I check for status from one machine/container, and grab the video itself from another one. (3) A video identified by `/video/status/123` URI doesn't make sense whatsoever. – Arseni Mourzenko Jun 24 '21 at 06:29
  • Thanks, these are very compelling arguments. And instead of a `303` redirection response with the product URI (`/video/status/123`) in the `Location` *header field* when the product resource is available, what do you think of a `200` success response with the product URI in the *body* when the product resource is available? – Géry Ogam Jun 24 '21 at 13:03
  • @Maggyero: the same arguments apply. – Arseni Mourzenko Jun 24 '21 at 13:48
  • Are you sure? Argument 1 does not seem to apply because in the JSON body of your response to a GET request to `/video/status/123` you could have an extra key `"result"` with the value `null` until the result resource is ready, and with the value `"/video/status/123"` after, so the JSON structure would not change. Argument 2 does not seem to apply because you still get the status (and a link to the video) when the video is available, not the video. Argument 3 does not seem to apply because `/video/status/123` always identifies the status (and a link to the video when available). – Géry Ogam Jun 24 '21 at 13:56
  • In the case of an asynchronous process like video conversion or the notorious long-running report it seems that the API server is acting as a proxy for the result, which would make a response to the `GET` for the result of `203 Non-Authoritative Information` perhaps viable when the result is still partial/in-process? I read the definition of "subset or superset" as perhaps including the trivial subset of "nothing yet" (i.e. information from the api service rather than the video conversion service, which seems to be "another source"). – Lee Oct 20 '22 at 17:27
19

I found the suggestions from this blog reasonable: REST and long-running jobs.

To summarize:

  1. The server responds to job requests with 202 Accepted and the Location header field set to the URI of the status monitor, e.g. /queue/12345.
  2. Until the job finishes, the server responds to status requests with 200 OK and some representation showing job status.
  3. After the job finishes, the server responds to status requests with 303 See Other and the Location header field set to the URI of the job result, e.g. /stars/97865.
  4. After the status monitor is deleted manually by the client or automatically by the server, the server responds to status requests with respectively 404 Not Found or 410 Gone.
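
The four rules above can be sketched as a small state-to-response mapping in Python. This is only an illustration of the summary: the `Monitor` enum, the function names, and the `{"status": "pending"}` body are assumptions, while the example URIs are the ones used in the list.

```python
from enum import Enum


class Monitor(Enum):
    """Hypothetical lifecycle states of the status monitor resource."""
    RUNNING = "running"
    FINISHED = "finished"
    DELETED_BY_CLIENT = "deleted_by_client"
    EXPIRED_BY_SERVER = "expired_by_server"


def submit_job(queue_id: int):
    """Rule 1: 202 Accepted, with Location pointing at the status monitor."""
    return 202, {"Location": f"/queue/{queue_id}"}


def get_monitor(state: Monitor, result_uri: str = "/stars/97865"):
    """Rules 2-4: answer a GET on the status monitor according to its state.

    Returns a (status_code, headers, body) triple.
    """
    if state is Monitor.RUNNING:
        return 200, {}, {"status": "pending"}
    if state is Monitor.FINISHED:
        return 303, {"Location": result_uri}, None
    if state is Monitor.DELETED_BY_CLIENT:
        return 404, {}, None
    return 410, {}, None  # removed automatically by the server
```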
Géry Ogam
Xiangming Hu
  • 1
    The blog post has moved here: https://farazdagi.com/posts/2014-10-16-rest-long-running-jobs/ – x5657 Jun 28 '21 at 09:31
9

The obvious alternative to all this -- which functions, but defeats one purpose of status codes -- would be to always include the metadata:

This is the correct way to go. The state a resource is in with regard to domain-specific logic (aka business logic) is a matter for the content type of the resource's representation.

Two different concepts are being conflated here. One is the status of the state transfer of a resource between client and server; the other is the state of the resource itself, in whatever context the business domain understands the different states of that resource to be. The latter has nothing to do with HTTP status codes.

Remember that the HTTP status codes correspond to the state transfer between client and server of the resource being dealt with, independently of any details of that resource. When you GET a resource, your client is asking the server for a representation of the resource in its current state. That could be a picture of a bird, it could be a Word document, it could be the current outside temperature. The HTTP protocol doesn't care. The HTTP status code corresponds to the result of that request. Did the POST from the client to the server transfer a resource to the server, where the server then gave it a URL that the client can view? Yes? Then that is a 201 Created response.

The resource could be an airline booking that is currently in the 'to be reviewed' state. Or it could be a product purchase order that is in the 'approved' state. Those states are domain-specific and not what the HTTP protocol is about. The HTTP protocol deals with the transfer of resources between client and server.

The point of REST and HTTP is that the protocol doesn't concern itself with the details of the resources. This is on purpose: it doesn't concern itself with domain-specific issues, so it can be used without having to know anything about them. You don't reinterpret what the HTTP status codes mean in each different context (an airline booking system, an image processing system, a video security system, etc.).

The domain specific stuff is for the client and server to figure out between themselves based on the Content Type of the resource. The HTTP protocol is agnostic to this.

As for how the client figures out that the Request resource has changed state, polling is your best bet, as it keeps control at the client and doesn't assume an unbroken connection, particularly if it is going to be potentially hours until the state changes. Even if you said to hell with REST and decided to just keep the connection open, keeping it open for hours and assuming nothing will go wrong would be a bad idea. What if the user closes the client or the network goes out? If the granularity is hours, the client can just request the state every few minutes until the Request changes from 'pending' to 'done'.
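
A polling client along these lines might look like the Python sketch below. It is a minimal illustration, not a definitive implementation: `poll_until_done` and its parameters are hypothetical, `fetch_status` stands in for whatever performs the GET and decodes the representation, and the five-minute default interval is just one reading of "every few minutes".

```python
import time
from typing import Callable, Optional


def poll_until_done(fetch_status: Callable[[], dict],
                    interval_seconds: float = 300.0,
                    max_attempts: int = 100,
                    sleep: Callable[[float], None] = time.sleep) -> Optional[dict]:
    """Poll until the resource leaves the 'pending' state.

    Returns the final representation, or None if max_attempts is exhausted.
    Each call to fetch_status is an independent request, so a closed browser
    or a dropped network only costs the attempt that was in flight.
    """
    for attempt in range(max_attempts):
        representation = fetch_status()
        if representation.get("status") != "pending":
            return representation
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before asking again
    return None
```

Passing `sleep` as a parameter keeps the loop easy to exercise without actually waiting.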

Hope that helps clarify things.

Cormac Mulhall
  • 1
    "Did the POST from the client to the server transfer a resource to the server, where the server then gave it a URL that the client can view? Yes? Then that is a 201 Created response." 202 Accepted is also acceptable as a response to this if the server cannot act immediately to process the resource, which is what the OP is doing. – Andy Apr 21 '16 at 22:24
  • 1
    The thing is the server is acting immediately. It creates the resource with a URL immediately. It is just the state of the resource is "Pending" (or something). That is a business domain state. As far as the HTTP Protocol is concerned the server acted as soon as it created the resource and gave the client the URL of the resource. You can GET that resource. The POST request itself is not pending. This is what I mean by keeping the two different conceptual domains separate. If the client was sending a fire and forget POST request not acted upon for hours then 202 would be applicable. – Cormac Mulhall Apr 22 '16 at 09:13
  • 1
    No one cares if the url exists but you can't get the data the resource represents because it's still being processed. Might as well NOT create the url until it can be used to get the video. – Andy Apr 22 '16 at 23:01
  • The resource is created, it is just in the state "pending". That is in itself relevant data. At some point in the future the server may change the resource's state to "completed" (or "failed") but that is a different concept to the HTTP domain specific task of "create the resource". Pending can be a perfectly valid state for a "Request" resource to be in, and the client obviously wants to know the server has created the resource in that state since it moves from asking the server to create the resource to now polling it to find out if the state changed. – Cormac Mulhall Apr 24 '16 at 00:06
  • Sorry but what you are describing here is [Asynchronous processing (done wrong)](https://farazdagi.com/posts/2014-10-16-rest-long-running-jobs/). Using the status code 201 instead of 202 for long request processing is wrong. 201 means that the request has been *completely* processed (and resulted in the creation of one or more new resources), which is not the case here. ‘It is just the state of the resource is "Pending" (or something).’ No, this is not the state of the *resource* which is pending, this is the state of the *request*. The resource is the product of this processing. – Géry Ogam Jun 23 '21 at 23:55
  • 1
    Again the business logic concept of 'pending' is nothing to do with the state transfer of the HTTP request. The request to create the resource is not 'pending'. For example if I open a mortgage application that application could be in the "pending" state for 6 months, but the actual HTTP request returns immediately. REST doesn't care about the states the resource can be in (including any business logic related 'pending' state) it cares only about the state transfer. – Cormac Mulhall Jun 25 '21 at 10:58
  • 1
    @Maggyero Have to agree with CormacMulhall here. The spec that is quoted in "Asynchronous processing (done wrong)" directly contradicts what he suggests doing in the "done right" part. Specifically the quoted spec says that whatever is put in the Location header _is_ the primary resource, then in the article he goes on to put the status-uri as the Location. This makes the status object the primary resource (and effectively relegates the result of our processing to a secondary resource). Thus the 201-status is the correct response, as the status resource is in Location and created immediately. – AnorZaken Jan 18 '22 at 08:37
  • From the perspective of agnostic web-software, the software receives the 201, with a location, and can proceed to that location, where it can retrieve a resource. The web-software doesn't care that the resource happens to contain some json describing the status of some other resource it knows nothing about - a piece of json data that only our business logic understands. To drive the point home: HTTP-statuses exist _so that the web-software can properly operate the communication between client and server_ - they do NOT exist to convey any business meaning to whatever runs on top of the web. – AnorZaken Jan 18 '22 at 08:42
  • @AnorZaken In hindsight, I think Cormac is right. I conflated the *duration* with the *start* of request processing. When the processing outlives the connection (long process), the server should immediately send a response describing the duration of the processing (with the 201 status code and Location header field if it creates a subordinate resource representing the process for monitoring, with 200 otherwise). When the processing of a request is delayed (asynchronous process), the server should immediately send a response describing the start of the processing (with the 202 status code). – Géry Ogam Jan 18 '22 at 13:46
3

HTTP Status Code for Resource not yet available suggests returning a 409 conflict response, rather than a 404 response, in the case that a resource doesn't exist because it is in the middle of being generated.

From the w3 spec:

10.4.10 409 Conflict

The request could not be completed due to a conflict with the current state of the resource. This code is only allowed in situations where it is expected that the user might be able to resolve the conflict and resubmit the request. The response body SHOULD include enough information for the user to recognize the source of the conflict. Ideally, the response entity would include enough information for the user or user agent to fix the problem; however, that might not be possible and is not required.

Conflicts are most likely to occur in response to a PUT request. For example, if versioning were being used and the entity being PUT included changes to a resource which conflict with those made by an earlier (third-party) request, the server might use the 409 response to indicate that it can't complete the request. In this case, the response entity would likely contain a list of the differences between the two versions in a format defined by the response Content-Type.

This is slightly awkward, since the 409 code is "only allowed in situations where it is expected that the user might be able to resolve the conflict and resubmit the request." I suggest the response body include a message (possibly in a response format matching the rest of your API) like, "This resource is currently being generated. It was initiated at [TIME] and is estimated to complete at [TIME]. Please try again later."

Note that I would only suggest the 409 approach if it is highly likely that the user who is requesting the resource is also the user who initiated generation of that resource. Users not involved with the generation of the resource would find a 404 error less confusing.
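
As a small illustration, the response body suggested above could be assembled like this in Python. The function name, the `error` key, and the use of ISO 8601 timestamps are assumptions for the sketch; only the message wording comes from the suggestion itself.

```python
from datetime import datetime, timezone


def not_ready_response(started: datetime, estimated: datetime):
    """Build a (status_code, body) pair for a resource still being generated."""
    body = {
        "error": "resource_not_ready",  # hypothetical machine-readable code
        "message": (
            "This resource is currently being generated. "
            f"It was initiated at {started.isoformat()} and is estimated "
            f"to complete at {estimated.isoformat()}. Please try again later."
        ),
    }
    return 409, body
```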

Brian
  • 1
    Seems like a stretch of what 409 is really meant for, which is in response to a put. – Andy Apr 21 '16 at 00:07
  • @Andy: True, but so is every other alternative. E.g., 202 is really meant to be a response to the request which *initiated* processing, not the request that requested the results of the processing. Really, the most spec-compliant response is 404, since the resource was not found (because it didn't exist yet). There's nothing stopping the API from providing the relevant api data within the 404 response. Mind you, 4xx/5xx responses tend to be annoying to consume; some languages will fire an exception rather than just providing a different status code. – Brian Apr 21 '16 at 14:39
  • 5
    No, especially the last few paragraphs of MainMa's answer. Separate end points to check the status of the request and to get the video itself. The status is not the same resource as the video and should be addressable on its own. – Andy Apr 21 '16 at 22:21
  • 1
    Sorry but what you are describing here is [Asynchronous processing (done wrong)](https://farazdagi.com/posts/2014-10-16-rest-long-running-jobs/). – Géry Ogam Jun 23 '21 at 23:16
  • If I try to convert a video and it’s not ready after five minutes, I can’t see any conflict that justifies a 409. – gnasher729 Nov 10 '22 at 22:45
1

The logical sequence is this: I submit a video for conversion. The server works on it. When the conversion finishes, the file can be downloaded. Some long time later the video will be automatically removed.

I would suggest: An endpoint “convert” returning 202 to say the conversion is queued up. An endpoint “download” which returns 404 before the video is ready and after it is deleted. An endpoint “status” which reports either progress, or that the video is ready, or that it has been deleted. Optionally an endpoint “cancel” to allow cancelling the conversion, with no effect when it’s already converted or cancelled. And an endpoint “remove” that will remove the video, including cancelling a conversion in progress.

The “status” endpoint should be functioning for a very long time even after cancelling or automatic or manual removal.

gnasher729
0

For the overzealous users, those who try too quickly, I would use the 503 code:

503 Service Unavailable

and include a Retry-After: ... field with a date when the data is expected to be available. Since it looks like you are communicating with servers (i.e. you show JSON structures), this would most certainly be the best code. It is then the calling server's responsibility to follow the instruction, i.e. use the date in Retry-After: ... and not retry before that date (your server can even keep track of offenders and, after too many offenses, block their IP).

Of course, the term "Service" may sound a bit off, since the service is responding to your request and it is currently working... but I think that's just semantics.

The text from the RFC:

15.6.4. 503 Service Unavailable

The 503 (Service Unavailable) status code indicates that the server is currently unable to handle the request due to a temporary overload or scheduled maintenance, which will likely be alleviated after some delay. The server MAY send a Retry-After header field (Section 10.2.3) to suggest an appropriate amount of time for the client to wait before retrying the request.

Note: The existence of the 503 status code does not imply that a server has to use it when becoming overloaded. Some servers might simply refuse the connection.
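
The Retry-After handling described in this answer could be sketched in Python as follows, using only the standard library. The function names are hypothetical. The sketch assumes the expected completion time is a timezone-aware UTC datetime, and on the client side it accepts both forms that Retry-After allows (a number of seconds or an HTTP-date).

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def not_ready_yet(expected_at: datetime):
    """Server side: 503 with Retry-After given as an HTTP-date.

    expected_at must be timezone-aware and in UTC, as required by
    format_datetime(..., usegmt=True).
    """
    headers = {"Retry-After": format_datetime(expected_at, usegmt=True)}
    return 503, headers


def seconds_to_wait(retry_after: str, now: datetime) -> float:
    """Client side: how long to wait before retrying.

    Retry-After may carry either delay-seconds or an HTTP-date.
    """
    if retry_after.isdigit():
        return float(retry_after)
    target = parsedate_to_datetime(retry_after)
    return max(0.0, (target - now).total_seconds())
```

A well-behaved caller would sleep for `seconds_to_wait(...)` before issuing the next request, which is the behaviour the server can then check for when tracking offenders.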

Alexis Wilke