How to deal with abandoned idempotent operations?

Question

I have implemented idempotent order placement (mostly to avoid accidental double submissions) but I am not sure how to handle incomplete operations. Example scenario:

1. User tries to place an order.
2. An order instance with status PENDING_PAYMENT is created in the DB.
3. Order payment succeeds (3rd party processor, supporting idempotence keys, e.g. Stripe).
4. My DB fails to update order status to PAID (e.g. it suddenly went down for a minute) and user receives some error.

Since the whole operation is idempotent, it is safe to retry the operation, and some (most?) users would choose to do that.

But what if the user abandons the operation?

I could implement a Completer process, which would push all incomplete operations through to completion. However, this might come as a surprise to the user.
I could combine the Completer with the assumption that it will eventually be able to successfully place the order, in which case I wouldn't even have to alert the user. However, in an odd case of a failure, I'd have an even more surprising outcome - the once successful order would now appear to be failed.

Questions:

1. What are some ways of dealing with this situation?
2. What would the user typically expect?

2.1. Should I let the user know exactly what happened (i.e. payment ok, status not ok), inform the user of a generic failure (something went wrong, please retry), or let them know nothing at all?
2.2. If I inform the user of a generic error, they might decide to update their basket and then resubmit the order. I was thinking that the way to deal with this is to simply generate a fresh idempotence key and create a second order. What are the alternatives?

Additional details:

I don't expect a high rate of failures, but I want to be prepared.
I am not dealing with big money or sensitive data - consider this a simple e-shop.

Update

I actually followed this article from Brandur Leach whilst implementing my idempotent operations, in case you're interested: https://brandur.org/idempotency-keys.

I contacted Brandur directly regarding my problem and you can see what he had to say for yourselves: https://github.com/brandur/sorg/issues/268. The gist is that I should always push all operations to completion, which agrees with the answers here. I can then decide what to do with the result. There may be multiple ways of informing the user too.

This process sounds like it happens entirely on your servers -- the user has placed their order and the payment has gone through -- so what exactly does it mean for a user to "abandon" it at this stage? – B. Ithica Jan 06 '21 at 14:20
Payments are not processed on my servers - I use 3rd party payment providers. So after I execute a payment I also need to update the order status in my DB. At this point a few things might go wrong - the payment service might fail to respond, leaving the payment status ambiguous, or my database might fail to update the order to `PAID`. At this point I report an unexpected error to the user. Since the whole operation is 100% retryable (including the call to the payment provider), I expect most people to actually retry the operation, but some might decide not to, leaving the order state ambiguous. – Avius Jan 06 '21 at 15:02
The biggest issue is that the user might walk away without retrying even though their payment already went through. – Avius Jan 06 '21 at 15:03

score 8 · Accepted Answer · answered Jan 06 '21 at 15:44

If I order an item from a website and I can see in my online banking that my payment has gone through, and the website still says "Payment is pending", I'm not likely to walk away. I'm also not likely to retry the order, since I have no idea that you're using an idempotent payment process. I'm far more likely to contact your support channel and complain.

But if the website says "Payment is pending, please wait..." which periodically refreshes, while in the background you have a task that retries all orders that have been sitting in PENDING_PAYMENT for too long (which is perfectly safe to do because it's idempotent), then I'm most likely to sit there and stare at it until the message switches to "Thanks for your payment", which (in the scenario we're considering) will happen shortly, when the background task retries successfully.
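The background task described above could look roughly like the sketch below. It is only an illustration under stated assumptions: `Order`, `FakeGateway`, `complete_stale_orders` and the two-minute `max_age` are hypothetical names and values, not part of any real payment provider's API, and the gateway is an in-memory stand-in that mimics idempotency keys by remembering the first charge per key.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

PENDING_PAYMENT = "PENDING_PAYMENT"
PAID = "PAID"


@dataclass
class Order:
    order_id: str
    idempotency_key: str
    amount_cents: int
    status: str
    created_at: datetime


class FakeGateway:
    """Hypothetical stand-in for a payment provider that honours idempotency keys."""

    def __init__(self):
        self.charges = {}  # idempotency_key -> amount_cents

    def charge(self, idempotency_key, amount_cents):
        # Replaying a known key keeps the original charge; nothing is billed twice.
        self.charges.setdefault(idempotency_key, amount_cents)
        return "succeeded"


def complete_stale_orders(orders, gateway, now, max_age=timedelta(minutes=2)):
    """Retry every order that has been in PENDING_PAYMENT for at least max_age."""
    completed = []
    for order in orders:
        if order.status != PENDING_PAYMENT or now - order.created_at < max_age:
            continue
        try:
            result = gateway.charge(order.idempotency_key, order.amount_cents)
        except Exception:
            continue  # outcome still unknown; the next sweep will try again
        if result == "succeeded":
            order.status = PAID
            completed.append(order.order_id)
    return completed
```

Because the charge is replayed with the same key, the sweep is safe even when the original payment already went through and only the status update was lost.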

B. Ithica


Was thinking about that myself. The only thing is that I'm not sure if the presumption that this "will happen shortly" is always correct. If I receive a timeout from the payment provider or my own database, then it's hard to say what went wrong and how soon things will recover. – Avius Jan 06 '21 at 15:54
That's why I said "in the scenario we're considering". It is of course possible that the payment provider is down for some reason, and "Payment is pending, please wait..." will not change. Then the customer will complain, which is appropriate. You are dependent on an external service; there is not much you can do about this. – B. Ithica Jan 06 '21 at 15:57
I have a feeling that this might be the way to go! In this case I might not even display any errors to the user at all (this is probably what you meant), just retry the operation in the background until it works or decides to give up. Most errors should be temporary anyway. I'll try to implement this and accept your answer if it works, unless, of course, something even better comes along in the thread : ] thanks! – Avius Jan 06 '21 at 16:47
@Avius No problem. It depends what you consider an error, of course. If the transaction failed, you should definitely display that. If you know you can't contact the payment processor, that could be an error. If you did contact the payment processor and the payment looks like it is still pending after a short time, that's not an error, but if it's been pending for an unusually long time, you might want to display additional messaging (but you still don't know if it succeeded or not, so I wouldn't consider it to be an error. It might have succeeded on their end, but you just don't know.) – B. Ithica Jan 07 '21 at 11:53

score 2 · Answer 2 · answered Jan 05 '21 at 14:58

It depends on what you tell the user.

If you tell the user it failed you’re done until the user resubmits. Which could be done with a single click.

If you tell the user to wait you can simply keep them waiting while you resubmit. You could make this seamless or you can keep them updated by explaining the delay.

Automatically trying again after announcing a failure without consent will leave the user angry and confused.

candied_orange


However, the payment already went through, the app failed afterwards. Does that mean that it would be best if I reverted the operation, be it manually or automatically? – Avius Jan 05 '21 at 15:11
Maybe send a confirmation email? That's a backend operation, not an app operation, and it bypasses the app, using their email client instead. – Robert Harvey Jan 05 '21 at 15:18
That’s a case of failing to communicate the result of the transaction. Just communicate the result when you can communicate. – candied_orange Jan 05 '21 at 15:19
From a user experience perspective (U/X) the operation should either succeed or fail. Any outcome that surprises the end user is not good U/X. User facing messages along the lines of "we're resubmitting your order because our first try failed" may not be well received. Do your best to keep it simple and avoid surprises. – Jason Weber Jan 05 '21 at 20:48

Jason Weber · Answer 3 · 2021-01-08T03:50:41.933

2

Consider unwinding the idempotent transaction to the pre-buy state or a "may try again" state on error. This leaves you with two or three terminal states to handle (in total).

By doing this you can also safely reuse the id. Reusing the id preserves the duplicate prevention benefits of the idempotent approach.

You may discover that reliably unwinding the transaction is very hard. If so, it may be time to rethink the design and/or use of idempotency.

Addition #1 (based on comments): Consider starting with a database update to e.g. PAYMENT_STARTING or whatever. Include a timestamp. If this fails, you're at "sorry we're offline right now, please try again later."

Next, call the gateway itself. If this fails you're at "something went wrong, your credit card cannot be charged, please try again later."

Finally, update the database to e.g. PAYMENT_COMPLETE. If this fails you are responsible for retrying based on e.g. the timestamp from the first update. The ideas in other answers about how to handle the user experience are valid.

However you choose to handle the user experience, the goal should be a solution that converges on one of two states: order completed or fully unwound.
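A minimal sketch of the three steps above, assuming an in-memory store and gateway. All names here (`InMemoryDb`, `FakeGateway`, `place_order`, `DbDown`, `GatewayError`, and the returned outcome strings) are hypothetical and exist only for illustration. The sketch also treats any gateway exception as a definite failure, which would not hold for a timeout, where the outcome is unknown.

```python
import time


class DbDown(Exception):
    """Raised by the hypothetical store when a write cannot be performed."""


class GatewayError(Exception):
    """Raised by the hypothetical gateway when a charge is rejected."""


class InMemoryDb:
    def __init__(self):
        self.rows = {}
        self.calls = 0
        self.fail_on_calls = set()  # test hook: save() call numbers that fail

    def status(self, order_id):
        return self.rows.get(order_id, {}).get("status")

    def save(self, order_id, status):
        self.calls += 1
        if self.calls in self.fail_on_calls:
            raise DbDown()
        # The timestamp is set once, on the first successful write for the order.
        row = self.rows.setdefault(order_id, {"started_at": time.time()})
        row["status"] = status


class FakeGateway:
    def __init__(self):
        self.charges = {}  # idempotency_key -> amount_cents

    def charge(self, idempotency_key, amount_cents):
        self.charges.setdefault(idempotency_key, amount_cents)  # replay-safe


def place_order(db, gateway, order_id, idempotency_key, amount_cents):
    if db.status(order_id) == "PAYMENT_COMPLETE":
        return "completed"  # an earlier attempt already finished
    try:
        db.save(order_id, "PAYMENT_STARTING")  # step 1: record intent
    except DbDown:
        return "offline"  # nothing was charged; safe to ask for a later retry
    try:
        gateway.charge(idempotency_key, amount_cents)  # step 2: charge
    except GatewayError:
        return "charge_failed"
    try:
        db.save(order_id, "PAYMENT_COMPLETE")  # step 3: record success
    except DbDown:
        return "pending"  # charged but not recorded; the server must retry
    return "completed"
```

Calling `place_order` again with the same key after a `"pending"` result replays the charge harmlessly and finishes the write, so every path converges on a completed order or one where nothing was charged.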

Addition #2 (based on comments): It seems that an underlying question is how to handle unreliable calls e.g. in a public cloud. One common approach is a retry library e.g. Polly.
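Polly is a .NET library, so the sketch below is not Polly itself. It is a hand-rolled Python illustration of the same general idea, retrying with exponential backoff and jitter. The function name `retry_with_backoff` and its parameters are hypothetical, and retrying like this is only safe because the wrapped operation is idempotent.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.2,
                       max_delay=5.0, retry_on=(Exception,), sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to tell the user
            # Delay doubles on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

A call such as `retry_with_backoff(lambda: mark_order_paid(order_id))`, where `mark_order_paid` is a hypothetical database write, would cover a short outage like the one in the question without involving the user.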

edited Jan 08 '21 at 03:50

answered Jan 05 '21 at 20:41

Jason Weber


The whole transaction is always in a retryable state, and I _am_ reusing the ID, so that part is taken care of. My concern is what if the user does not retry? I am then left with a half-finished transaction, which is particularly problematic if the payment already went through and something failed afterwards (so I have no record of the payment ever going through). I could attempt to revert the payment somehow, which would _indeed_ be hard. Not sure if I could do that in real time either, since any failure indicates that the system is in an unreliable state. – Avius Jan 06 '21 at 12:34
This answer assumes the server can tell there *was* an error, but the way I read OP's question, the server doesn't have a clue if there was an error or not -- it just thinks the payment is still pending. – B. Ithica Jan 06 '21 at 15:45
Correct, as I was also unable to save any reference to the external payment to check its status later. – Avius Jan 06 '21 at 15:56
@Avius updated answer addresses these aspects of your question. – Jason Weber Jan 07 '21 at 15:17
It seems that you are suggesting pushing the operation through no matter what, and that the UX is not quite related - the only expectation is that the user is well-informed about what's going on. This falls in line with what B.Ithica suggested. Except that I wouldn't instruct the user to "try again later" because I am going to take care of that myself. – Avius Jan 07 '21 at 15:33
@Avius - the user has made their intent clear by clicking "buy" (or whatever). Putting the onus of retrying on them seems like a sub-optimal UX choice. – Jason Weber Jan 08 '21 at 03:40
@B.Ithica - if there is no return value then that is the root issue that should be addressed. If there is a return value then doing something with it (e.g. reversing on a failed write, retrying the write, etc.) seems in order. In a sense this question is less about idempotent operations and more about "how do I compensate for unreliable operations?" One common approach is to use a retry library e.g. Polly https://docs.microsoft.com/en-us/dotnet/architecture/microservices/implement-resilient-applications/implement-http-call-retries-exponential-backoff-polly – Jason Weber Jan 08 '21 at 03:47
@JasonWeber The "4. My DB fails to update order status to PAID" part of the question implies there is no return value -- under these conditions, the value (if any) returned by the payment API is unknown to the system. Idempotency is crucial to the question because if your payment operation _isn't_ idempotent you _can't_ retry it under these conditions -- you'd risk double-billing the customer. – B. Ithica Jan 09 '21 at 19:39
