
In designing a RESTful API, the question arises of how best to allow resources to be moved between collections.

Renaming a resource could be done with PATCH, but this is not the same thing as moving the resource between collections. It is also not clear whether it is the resource or the collection which should be patched. Does it make sense to PATCH the resource path of an object in the API if the resource path is not a direct attribute (content) of the resource?

Clearly a DELETE/POST sequence could be used, but this involves multiple operations and is not atomic. In the post "How to handle a request to delete and post?" the issue of performance is raised and POST is suggested as a solution. However, POST by itself should not (imho) imply a DELETE. Server performance is not an issue for me; the question is more about the integrity of the RESTful API.

Using PUT is not an option. RFC 2616 states:

The PUT method requests that the enclosed entity be stored under the supplied Request-URI. If the Request-URI refers to an already existing resource, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server.

Hence, either the resource is replaced in situ or it is created.

Is there a RESTful way to implement this whilst maintaining atomicity of the operation?

Jon Guiton
  • Does this answer your question? [How to handle a request to delete and post?](https://softwareengineering.stackexchange.com/questions/392642/how-to-handle-a-request-to-delete-and-post) – gnat Nov 25 '20 at 15:30
  • What *exactly* do you mean by "mv operation"? If you are talking about the Unix utility `mv`, then that does not perform a single operation. It depends on the circumstance. Within the same filesystem, it will typically perform a `rename`, but between different filesystems, there is no such thing, it performs a `creat` followed by `write`, followed by an `unlink` of the original file … in other words it performs roughly the same sequence that a `cp` followed by `rm` would perform. – Jörg W Mittag Nov 25 '20 at 15:34
  • The same question has been asked on Stack Overflow. However, so far, it hasn't attracted an authoritative answer that is also compliant with all standards. There are customized extensions that are used by individual services, but the syntax might not be standardized. https://stackoverflow.com/questions/46151747/how-to-implement-a-rename-function-for-a-http-based-file-server – rwong Nov 25 '20 at 15:37
  • Also note that in Unix, neither the location nor the name of the file is a property *of the file*. It is a property *of the containing directory*. In fact, directories are actually files themselves, that simply map names to files. So, in Unix, a `mv` operation within the same filesystem would actually *never touch the file at all*, it would `PATCH` the new directory to add the name -> file mapping, and then `PATCH` the original directory to remove the name -> file mapping. The `rename` system call generally guarantees that these two operations are performed as a single atomic operation. – Jörg W Mittag Nov 25 '20 at 15:38
  • The mentioning of the Unix `mv` command is simply intended to illustrate the desire for an operation that apparently makes a resource disappear under its old path and reappear under a new path, without changes to its content, and also history-preserving, and somewhat atomic. – rwong Nov 25 '20 at 15:40
  • It likely depends on how you have the container-object relationship mapped in your model. – Dan Wilson Nov 25 '20 at 15:55
  • @gnat It is a perspective, though I disagree with it since I do not think a POST should imply a DELETE. I have commented on the reply as I feel there is some misunderstanding (there) of what it means for an operation to be idempotent. – Jon Guiton Nov 25 '20 at 15:57
  • @Jorg W Mittag, The use of the word "mv" in quotes was just to illustrate and clarify the idea. Whereas wrapping two operations in a UNIX system call makes them atomic (wrt users of libc) unfortunately this is not the case for HTTP requests where other things can happen in between times. Similarly, it is not possible to wrap two PATCH operators together. I agree though that the move part (though not the rename) is an operation on the collections and not the resource. – Jon Guiton Nov 25 '20 at 16:02
  • RESTful dogmatism never works, exactly because of the situations you've just encountered. It's okay to introduce a little bit of RPC here and there, even to a REST API. Therefore you could simply have `POST .../move` and `POST .../copy` endpoints. – Andy Nov 26 '20 at 09:33

5 Answers


Is there a RESTful way to implement this whilst maintaining atomicity of the operation?

Short Answer

Just use POST

Medium Answer

Seriously; it is okay to use POST.

POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.” -- Fielding, 2009

Long Answer

REST doesn't have collections. REST has resources and representations, and a uniform interface that includes a vocabulary of self-descriptive messages that are common to all resources.

HTTP doesn't have collections either. It defines a vocabulary of standardized self-descriptive messages that are common all over the web. In other words, when interpreting a message we don't need any specialized knowledge of either the producer or consumer of the message. GET means GET, HEAD means HEAD, 200 means OK, 404 means Not Found, conditional requests, authentication, caching... it's all the same everywhere.

The application domain of HTTP is the transfer of documents over a network. We're just sending each other little copies of documents telling the other guy what to do. If I want you to move a document (A) into a "collection" (B), then I send to you a document (C) that looks something like:

Please move document A into collection B

All of the other stuff -- the method-token, the headers values, the response codes -- that's all meta-data of the document transfer domain; information that we attached to the document so that general purpose HTTP components can do useful things.

In other words, the meta-data allows us to take advantage of the intelligence that we've built into the document transfer application so that we get more value out of it than mere transport.

So, how can we surface the idea of "collection" so that our document transfer application can take advantage of it?

There are at least two answers to this. One answer is WebDAV, which offers a definition of collection resources. And no joke, if what you want is remote web content authoring, you should give it a serious look. RFC 4918 defines the standard semantics for the COPY and MOVE method-tokens.
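As a rough illustration of what a WebDAV move looks like on the wire, here is a minimal Python sketch. RFC 4918 defines the MOVE method and the Destination and Overwrite request headers; the function names, the example URLs, and the choice to default Overwrite to "F" are assumptions made for this sketch, and it only works against a server that actually implements WebDAV.

```python
import http.client
from urllib.parse import urlsplit


def build_move_request(source_url, destination_url, overwrite=False):
    """Return (method, path, headers) for an RFC 4918 MOVE request."""
    headers = {
        # Destination names the URI the resource should end up at.
        "Destination": destination_url,
        # "F" asks the server to fail rather than replace an existing target.
        "Overwrite": "T" if overwrite else "F",
    }
    return "MOVE", urlsplit(source_url).path, headers


def send_move(source_url, destination_url):
    """Send the MOVE; needs a WebDAV-capable server, so it is not called here."""
    method, path, headers = build_move_request(source_url, destination_url)
    conn = http.client.HTTPSConnection(urlsplit(source_url).netloc)
    try:
        conn.request(method, path, headers=headers)
        return conn.getresponse().status
    finally:
        conn.close()
```

The move is a single request, so atomicity becomes the server's responsibility rather than the client's.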

The other, and I think more common, approach is to describe the relationships between resources. We've got web linking, which gives us standardized forms for describing Target/Context/Relation triples. And we've got RFC 6573, which defines the semantics of the item and collection link relation types.

So we can get kind of close: if we have a representation schema like Collection+JSON which has a mechanism for describing a document's own links, then any client familiar with that schema will be able to identify the link relations within it, and those link relations can be changed by sending to the server a representation of the document with the new link values using the same messages that we would use for any other edit (ie: PUT/PATCH). The server can easily understand the request, and decide on its own whether or not to fulfill it.

But it's only close; it doesn't generalize particularly well (where do you embed link relations in a CSV file?). So that leaves you either sending multi-part documents around (ugh) or trying to embed the relations in the headers. And yes, we've already standardized a header for that.

But what we haven't defined is how the semantics of an HTTP request are further refined when link relation headers are present in the request.

And that means that general purpose components aren't going to have a clue what is going on, and aren't going to be able to act intelligently.

And that leaves you with two choices

  • drive the specification and adoption of new standard(s)
  • recognize that the action isn't worth standardizing

Discussion

The problem is that POST isn't idempotent.

Yes, but let's look carefully at what that means here.

The standard doesn't say "POST is restricted to use for non-idempotent actions". It is perfectly satisfactory to use POST for idempotent actions that aren't worth standardizing.

What it says is that general-purpose components are not allowed to assume that POST is idempotent; that the idempotent semantic constraint doesn't apply to all POST messages. Because the constraint is missing, our document transfer application can't do intelligent things like autonomously retry lost messages.

That's something we could potentially fix via a new method token (let's call it TSOP); you could write up the semantics of TSOP, and guide it through the standards process, and get it registered with IANA, and drive adoption. Ta-da! You now have general-purpose browsers that will resubmit lost form submissions.

Failing that, you are left to look into other registered methods that are unsafe and idempotent. Of the obvious ones, you are limited to PUT.

And PUT is fine -- every general purpose component in the world will understand the document transfer semantics of

PUT /a36c586a-cf90-46aa-b098-b3ffa038bebd HTTP/1.1
Content-Type: text/plain

Please move document A into collection B

So we have a resource model that is documents about changes to documents, and affordances so that clients can find the documents that are documents about documents, and we can design all that.
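To make the idempotency point concrete, here is a minimal Python sketch of how a server might treat the PUT above. Everything in it is an assumption for illustration: the `MoveServer` class, the dictionary shape of the move document, and the choice of 201/200/409 status codes are hypothetical, not something the HTTP specification prescribes for this case.

```python
class MoveServer:
    """Toy server: applies each move document once, keyed by its request URI."""

    def __init__(self, collections):
        self.collections = collections  # collection name -> set of document ids
        self.requests = {}  # request URI -> move document already stored there

    def put(self, request_uri, move):
        """Handle PUT of a move document {'doc', 'source', 'target'}."""
        if request_uri in self.requests:
            if self.requests[request_uri] != move:
                return 409  # assumption: a different document at the same URI is refused
            return 200  # replay of a lost or retried message: nothing changes
        source = self.collections[move["source"]]
        if move["doc"] not in source:
            return 409
        source.remove(move["doc"])
        self.collections[move["target"]].add(move["doc"])
        self.requests[request_uri] = move
        return 201


server = MoveServer({"A": {"doc1"}, "B": set()})
uri = "/a36c586a-cf90-46aa-b098-b3ffa038bebd"
move = {"doc": "doc1", "source": "A", "target": "B"}
first = server.put(uri, move)
second = server.put(uri, move)  # a client retry after a lost response
```

Because the client chooses the request URI, a retry after a lost response lands on the same resource and leaves the collections unchanged.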

I am trying to understand exactly what isn't achieved here.

The piece that we don't have is any sort of standardized language for describing changes to previously cached resources in an HTTP response. We only have invalidation, and that applies only to a limited number of meta-data elements that have other uses.

Consider that 200, in response to a PUT, means that the payload of the response is a representation of the status of the action. So we might imagine that the response payload to our PUT request could look something like:

SUCCESS

Document /A has been moved to /B
Document /A/1 has been moved to /B/1
Document /B/2 has been removed
Document /C/4 has been moved to /K/9

Of course, we're just making up a language here -- if we want adoption, then this idea would need to be tightened up into a standard. That might look like using the link header (these do appear to be link triples), and standardizing a new link relation, and then standardizing the semantics of link headers with that relation in the context of an HTTP response.

And then driving adoption.

VoiceOfUnreason
  • Thank you for your informed and detailed answer. I do actually have serious doubts about the use of POST, more so since I asked this question tbh - look at the comments to Robert Bräutigam's answer below. The problem is that POST isn't idempotent. As you say, the solution seems close but not achieved. Really I am trying to understand *exactly what* isn't achieved here. My thought is that it is something quite fundamental which tweaks to the standards can mitigate but won't solve. – Jon Guiton Nov 26 '20 at 14:03
  • I'm not sure that just defining a TSOP verb to be idempotent would actually work except for things that actually are idempotent, but yes, if we keep adding new verbs there is no problem. Your second line of reasoning seems better to my mind, except that we are now sending some kind of script to the server which is to be handled by the server as an idempotent transaction. That is clearly out of band wrt the HTTP protocol = essentially SOAP and so isn't at all RESTful. I'm liking my own answer below tbh, that we need client side state in the form of a stack. – Jon Guiton Nov 27 '20 at 17:58
  • That would mean developing verbs to support server side stack operations e.g. `PUSH` to push the present URL either as a backtracking point for the client using the api or as a bracket token such as `BEGIN TRANSACTION` which could be defined as a URI resource. Then we would need `POP` which returned the client to its savepoint or carried the semantics of a `COMMIT` depending on the resource semantics. Maybe `TOP`, and `EMPTY` verbs are needed also to define a stack semantics. That way we would have a clean and standardised interface with the CFL semantics. – Jon Guiton Nov 27 '20 at 18:02

One possible solution might indeed be to drop that "should not (imho) imply a DELETE".

While it is somewhat counterintuitive that a constructive REST verb (POST) could have a destructive side effect, in general it is relatively normal to have side effects beyond the creation of a new resource (for example, POST on a comments endpoint might have a side effect on the num_comments attribute of a base article resource.)

The exact composition of a POST message is somewhat unspecified. In many cases, you'd want to have it look mostly like the resource that should be the operation's result, but you could also have request bodies that specify how the resource should be created instead of specifying its exact content.

To create a resource as a clone of an existing one you could use

POST /api/v1/collection1
{
    "copy_from":"/api/v1/collection2/1234"
}

while to move a resource as in your use case you would use

POST /api/v1/collection1
{
    "move_from":"/api/v1/collection2/1234"
}
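On the server side, the two request bodies above could be handled along these lines. This is only a sketch under stated assumptions: the `InMemoryApi` class, the in-memory dictionary store, the lock, and the 400/404/409/201 status choices are all hypothetical stand-ins for whatever storage and transaction mechanism the real service has.

```python
import copy
import threading


class InMemoryApi:
    """Toy store: collection path -> {resource id: representation}."""

    def __init__(self, collections):
        self._collections = collections
        self._lock = threading.Lock()

    def post(self, collection_path, body):
        """Handle POST with a 'copy_from' or 'move_from' body; return (status, location)."""
        key = "move_from" if "move_from" in body else "copy_from"
        if key not in body:
            return 400, None
        source_path, _, resource_id = body[key].rpartition("/")
        with self._lock:  # lookup, insert and delete happen as one step
            source = self._collections.get(source_path, {})
            target = self._collections.get(collection_path)
            if target is None or resource_id not in source:
                return 404, None
            if resource_id in target:
                return 409, None
            target[resource_id] = copy.deepcopy(source[resource_id])
            if key == "move_from":
                del source[resource_id]
        return 201, f"{collection_path}/{resource_id}"


api = InMemoryApi({
    "/api/v1/collection1": {},
    "/api/v1/collection2": {"1234": {"name": "example"}},
})
result = api.post("/api/v1/collection1", {"move_from": "/api/v1/collection2/1234"})
```

The client sees one request and one response; whether the move succeeded or failed, the resource is never missing from both collections.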

Would it solve your problem? - I think so.

Would it work with HTTP infrastructure such as proxies, caches, etc.? - I think so.

Is it fully RESTful in spirit? - I don't really know.

Hans-Martin Mosner
  • Thank you for the (upvoted) answer, Hans-Martin. It certainly solves the atomicity problem and I suspect this is the best and 'normal' way to achieve this. As you say, RESTful - not so sure. A POST/copy followed by a DELETE would also be safer than a DELETE followed by a POST, so either of your solutions improves on mine. There are consequences to 'abusing' POST this way, e.g. for permissions and authorisation. I don't think it is fully RESTful, but perhaps fully RESTful doesn't actually work. – Jon Guiton Nov 25 '20 at 17:40
  • `perhaps fully RESTful doesn't actually work` REST is not a golden hammer and it's not intended to be. – Laiv Nov 26 '20 at 12:28
  • @Laiv I am agreeing that it isn't a golden hammer but there are plenty of people who believe it is. Indeed, many actual and proposed standards are being built around the idea that REST and HATEOAS (note *engine* of application state) are exactly that. There are clearly some great concepts being developed around REST but there do seem to be some deficiencies and gaps too. – Jon Guiton Nov 26 '20 at 13:55
  • Well, they are deficiencies when we want it to be a golden hammer. After a while, I started to embrace the idea of CQRS. On one side, there's a REST interface for query and, sometimes, for CRUD. But when I need sophistication, I enable a second interface. More RPC-like. I just ensure principles like statelessness, cache, idempotency, safety, etc. – Laiv Nov 26 '20 at 14:00

You can have a move resource to which you can POST.

POST's semantics are defined by the resource itself, so you're right that doing that on the original resource is somewhat dodgy, but doing so on a move resource should be completely ok (RFC 7231):

The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics.

The caller shouldn't know where the POST goes anyway, since you should be using forms to do this, if you are doing REST which includes HATEOAS.

Robert Bräutigam
  • You do point out in this answer https://softwareengineering.stackexchange.com/a/415314/379992 though that POST is not idempotent which "mv" certainly is. Consequently if something goes wrong the client state becomes indeterminate for the client i.e. the client would not know if the move had happened or not. Because POST is not idempotent it is not necessarily a good idea for the client to try again. That would depend on the semantics of the move resource which are only known to the server and not the client. – Jon Guiton Nov 25 '20 at 17:54
  • @JonGuiton Hm I see your point. If you want to have the move idempotent, you could define a `belongsTo` resource under your main one, and `PUT` a list of collections you want the main resource to belong to. Also, you don't have to necessarily represent ownership in the URI. If you don't, you don't need to change any of the URIs at all when ownership changes. – Robert Bräutigam Nov 25 '20 at 21:27

I'd like to thank all those who have responded to my question for their time and effort. Many of the answers are very well informed and the discussion here has been very helpful for my own understanding and also for my current project. After having carefully reviewed the answers I would like to offer my own answer to my own question.

No. There is no RESTful way to safely move a resource between collections, nor is this possible within the framework of HATEOAS. Without the definition of a new HTTP verb MOVE, moving resources safely between collections must involve the use of server-side state representations.

My reasoning is this. Without the use of a new verb, the "mv" operation is intrinsically a two step process. It can be seen as either a DELETE/PUT operation on the subject resource or PATCH/PATCH operation on the source and destination collections. It is also an idempotent operation. As @Jörg_W_Mittag stated, under UNIX this is a system call and hence an atomic operation.

There will always be a possibility that some kind of error can occur between the first operation and the second. In order that the client is free to try the operation again under conditions of failure the verbs used must preserve the intrinsic idempotent nature of the "mv" operation. In order to make "mv" a safe operation there are really only two possibilities.

a) Use a new verb such as MOVE or construct verbs outside of the HTTP protocol and pass the information out of band (i.e. on some protocol built on top of HTTP). In this way "mv" is made into an atomic operation which can be performed safely by the client. WebDAV, as suggested by @VoiceOfUnreason, is one possibility for this.

b) Use a server side representation of client state to facilitate a simple form of transaction processing. Either

BEGIN TRANSACTION
   DELETE R
   POST R || ROLLBACK
COMMIT

or

BEGIN TRANSACTION
   PATCH C1
   PATCH C2 || ROLLBACK
COMMIT

The reason why this cannot be achieved within the framework of HATEOAS and hence in a RESTful way is that it is necessary for the server to match the bracket terms 'BEGIN TRANSACTION' with their bracket closures 'ROLLBACK' or 'COMMIT'. In order to do this the server must maintain at least 1 bit of client state information.

More generally, HATEOAS treats the client server interaction within the framework of regular languages https://nordicapis.com/designing-a-true-rest-state-machine/.

At any one time the client owns its own state and is presented with a set of links which it can follow to arrive at a new state as determined by the server. The server therefore acts as a state transition table.

It is well known that matching brackets is easily performed within the framework of context free languages but is not possible within the framework of regular languages (see "About CFLs"). In other words, HATEOAS restricts the server to acting as a regular automaton precisely because it cannot store client state. The server is therefore unable to match brackets and hence unable to bracket sequences of operations into transactions.

I think there are likely many other examples like this where REST fails to provide a solution but there is certainly a way to augment REST to give it the power of CFLs rather than RLs. This would require the use of a client stack object on the server.

The important verbs here would be PUSH, POP, EMPTY etc. These operations could apply to symbols or to URIs. The resulting system would be far more powerful than REST, being able to support transaction processing models and also backtrack search points for the client. In short, make the server into a stack automaton.

This approach would not be at all RESTful because it definitely implies that the server maintains some kind of user state, however it is very close with just the addition of stack operations to the standard set of verbs. Furthermore, it is probably the simplest representation of client state possible (other than a single variable); is safe (with timeout and stack limits) and would provide the full power of CFLs to the interoperation of the client and server.

Actually not the conclusion I expected or hoped to reach. I'm now really unsure how or whether to proceed with a REST based API for my current project, but I have learnt a lot by asking this question. Thank you, contributors.

Jon Guiton
  • I disagree with your analysis and conclusion. The server is *not* a state transition table. It is a state transition table potentially *for each client state* (times resource states, but that's irrelevant for my point). The server can easily lead the client through stack-like state transitions, with new options for each "client-side" stack state. Therefore the server in effect *can* use a stack, therefore it is at least a context-free grammar to use your analogy. – Robert Bräutigam Nov 27 '20 at 08:42
  • Also note the server can easily define an *infinite* number of resources, therefore it is definitely more than a finite automaton. – Robert Bräutigam Nov 27 '20 at 08:46
  • I was quite careful to not use the word finite in my answer. I assert that the action of the machine is regular. Even if the number of states is very large it is still finite although a theory does exist for infinite alphabets as these can be modelled using words on a finite alphabet e.g. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.622.3938&rep=rep1&type=pdf. I agree that the transition table contains a row for each client state *where a client state consists of its set of available links*. The server therefore models a transition table (of rows). – Jon Guiton Nov 27 '20 at 17:43
  • Further, the assumption that a client side stack is a solution is false. For example, the server state could change independently of the client and invalidate the client stack. Basically the client cannot model a server stack at its end. For example, if the client uses its stack to perform a backtracking search of the server content there is no way for the server to know that it should not `DELETE` one of the backtracking points, whereas if the stack is kept on the server a safe semantics could be defined such as locking objects on the stack or versioning as in an RDBMS. – Jon Guiton Nov 27 '20 at 17:48
  • I'm not sure what you mean. The client state is *submitted on each request*, so the server is aware of all the state changes, or rather the server actually offers all the state changes. A REST client *doesn't* backtrack on its own, the *server* needs to backtrack based on the choices the client makes. Here's a REST stack: /stack (with two forms), client submits 'push a', server returns /stack/a, client submits 'push b', server returns /stack/ab, client submits pop, server returns /stack/a again, etc. With no persistence on either side. – Robert Bräutigam Nov 27 '20 at 19:50
  • The client chooses from amongst the `OPTIONS` available and so changes to a new state according to the rules of the server which *depend only on the current client state* since the server does not record client state in between requests. The behaviour is therefore regular. The server doesn't need to backtrack - the server just maps a client state to a new client state according to the `OPTION` the client chooses and the existing client state - ah I repeat myself - the action is regular. The server has no memory of previous client states *that is the whole point of REST/HATEOAS* - the behavi... – Jon Guiton Nov 27 '20 at 21:52
  • Your stack example cannot work in a RESTful way because it involves the *server* recording the client state between requests, so it is, by definition, *not* RESTful. The server is precisely forbidden from recording client state, that is what HATEOAS means. Consequently your stack example is *another* example of something which cannot be done in a RESTful way. – Jon Guiton Nov 27 '20 at 21:55
  • As I see it, your stack example can only work if the stack is private to the client. Otherwise we have client A: push X, client A: push Y, client B: pop, client B: pop, client A: pop - error. I see what you mean, but isn't private client state just ... server side client state? – Jon Guiton Nov 27 '20 at 22:00

I think you can solve this by examining your assumptions about the concept of a "collection".

To help us, let's work with a concrete example:

  • The resource is currently located at /clients/acme/users/bob
  • We want it to instead be located at /clients/zebedee/users/bob

It's common to interpret that as meaning:

The resource bob is in the /clients/acme/users collection, and we want to move it to the /clients/zebedee/users collection.

But that only makes sense if you assume that:

  • Resource identifiers are strictly hierarchical
  • Each component of a resource identifier represents a collection
  • A resource resides in exactly one collection

But these assumptions are incorrect:

  • Resource identifiers are arbitrary strings; this is part of the idea behind HATEOAS, that the structure of a URL should not determine its relationships.
  • A "collection" is just a type of resource that represents some complex structure. Its name can be just as arbitrary as any other resource.
  • A resource can't "contain" other resources, it can only link to them, so it's perfectly valid for more than one "collection resource" to link to the same resources in different combinations.

So, we can re-construct our problem in various different ways, giving us lots of choices to represent the required action:

  • /clients/acme/users/bob is just an alias or search query, and the canonical identifier for the user is /users/29d123bb-d0ff-488d-b81f-37ffe6a945f7. That resource has a client field in its representation, and a PUT or PATCH request can be used to update that field.
  • An additional "collection" resource exists at /users/ which manages users across all clients, and a PUT or PATCH request to that resource can define the required state of a user {"id": "29d123bb-d0ff-488d-b81f-37ffe6a945f7", "client": "zebedee"}.
  • The /users/ collection accepts a representation of users grouped by client, and you can send a PATCH request which atomically deletes one user and creates the other
  • The /users/ collection accepts a PATCH request which defines the move directly as "from acme/bob to zebedee/bob".
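The first of these options can be sketched in a few lines of Python. The data layout and the helper names `patch_user` and `client_users` are assumptions made for illustration; the point is only that the "move" becomes a single-field update on the canonical resource, and the per-client collections are derived views.

```python
def patch_user(users, user_id, changes):
    """Apply a partial update (PATCH-like) to one canonical user record."""
    users[user_id] = {**users[user_id], **changes}
    return users[user_id]


def client_users(users, client):
    """Derive the /clients/<client>/users view from the canonical records."""
    return sorted(u["name"] for u in users.values() if u["client"] == client)


# Canonical store, keyed by the identifier used in /users/<id>.
users = {
    "29d123bb-d0ff-488d-b81f-37ffe6a945f7": {"name": "bob", "client": "acme"},
}

# The "move" is one update to one resource, so there is no intermediate state.
patch_user(users, "29d123bb-d0ff-488d-b81f-37ffe6a945f7", {"client": "zebedee"})
```

After the update, the acme view no longer lists bob and the zebedee view does, without any URI for the canonical resource having changed.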
IMSoP