8

The specific scenario in my case involves PayPal but it could easily be any other external system.

According to a lot of sources on the Internet, including the official documentation, a basic PayPal integration (say, for an e-store) would look like this:

  1. Call the PayPal API to execute the payment for an order.
  2. Save the order details to the database.

What if the payment is successful but there is some problem with the database? This would mean that I charged the user but failed to record that.

If I create the order before I charge the user, it is possible that the payment fails and now I have an unpaid order.

To better manage this issue I came up with this:

  1. Create an order with a pending payment.
  2. Execute the payment.
  3. Update the order status.
    1. If update fails, try again (DB retries)
    2. If the retries fail, send an alert to the website admins with the information of what failed and inform the user that there was this problem.
    3. It is also possible that the email fails to be sent, in which case I try to send an error message back to the user, saying that there was a huge error and we'd like them to get in touch asap (because otherwise this incident would not be recorded at all).

Unfortunately it is also possible that the user never gets this message.
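The flow above can be sketched roughly as follows. This is only a minimal illustration, not PayPal's API: `create_pending`, `charge`, `save_status` and `alert_admins` are hypothetical callables standing in for the database, the payment provider and the email system.

```python
import time
from typing import Callable


def update_with_retries(update: Callable[[], None],
                        attempts: int = 3,
                        delay_s: float = 0.0) -> bool:
    """Run `update` up to `attempts` times; return True on the first success."""
    for attempt in range(attempts):
        try:
            update()
            return True
        except Exception:
            if attempt < attempts - 1:
                time.sleep(delay_s)
    return False


def checkout(order_id: str,
             create_pending: Callable[[str], None],
             charge: Callable[[str], bool],
             save_status: Callable[[str, str], None],
             alert_admins: Callable[[str], None]) -> str:
    # All callables are hypothetical placeholders for real integrations.
    create_pending(order_id)                          # 1. order with pending payment
    paid = charge(order_id)                           # 2. execute the payment
    status = "paid" if paid else "payment_failed"
    if update_with_retries(lambda: save_status(order_id, status)):  # 3. + 3.1
        return status
    try:
        # 3.2: retries exhausted, tell the admins what could not be recorded
        alert_admins(f"order {order_id}: could not record status {status!r}")
        return "admins_alerted"
    except Exception:
        # 3.3: last resort, ask the user to get in touch
        return "ask_user_to_contact_support"
```

Even in this sketch the last branch only produces a return value; nothing guarantees that it reaches the user.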

It seems like a basic flow, but I have no idea how to handle it properly. How should I do it?

Avius
  • This flow seems fine to me. Why is it that the user never gets the message? – robbrit Aug 19 '19 at 16:56
  • Just like at any other point of this flow, some sort of network failure might occur on both the client and/or the server side. – Avius Aug 19 '19 at 17:57
  • Also consider writing whatever would have gone into the DB, had it not failed, to a text file (e.g. response codes). – GrandmasterB Aug 20 '19 at 05:46

3 Answers

9

Trying to cover failures like this with business logic and retries is in my view a mistake.

You can always think up a scenario that isn't covered by your logic. In your case, say the DB write has failed because the network card on the box is non-functional. Then you can't send emails or a response to the user, retrying won't help, and so on.

Instead you should try to make your systems transactional, i.e. wherever a failure can occur you will know that it occurred and be able to take remedial action afterwards.

In the worst case this can be a purely manual rollback. So again with your example: before you send the request to the 3rd party you write a log entry saying "about to send payment request xxxx", and after you get the response back you write "received payment response xxx".

Now even if your computer switches off completely between the begin and complete logs you can look back through them, determine that an error occurred, phone up the payment provider, ask about transaction xxxx and refund the user.
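As a rough sketch of that begin/complete logging, assume each log entry is one JSON line; the event names `payment_request_sent` and `payment_response_received` are made up for illustration. Scanning the lines afterwards yields the transactions that were started but never completed.

```python
import json
from typing import Iterable, List, Set


def journal_line(event: str, txn_id: str) -> str:
    """Format one log entry; in practice it would be appended to durable storage."""
    return json.dumps({"event": event, "txn": txn_id})


def unfinished_transactions(lines: Iterable[str]) -> List[str]:
    """Return IDs that have a 'sent' entry but no matching 'received' entry."""
    started: List[str] = []
    finished: Set[str] = set()
    for line in lines:
        entry = json.loads(line)
        if entry["event"] == "payment_request_sent":
            started.append(entry["txn"])
        elif entry["event"] == "payment_response_received":
            finished.add(entry["txn"])
    return [txn for txn in started if txn not in finished]
```

The IDs it returns are the ones to ask the payment provider about.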

Nb.

From your comments it seems that the missing part of your solution is what is loosely referred to as "monitoring": some system that collects all your logs and metrics in a single place, e.g. "the ELK stack" or similar. With such a system in place you can configure alerts based on the results of searches over your logs, for example:

  • if system X logs an error, email me
  • at 0900 email me a report showing all the transactions that have been open for more than 60sec
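The second rule can be pictured with a plain-Python stand-in. This is not ELK configuration; it only shows the kind of search such an alert would run. The `opened_at` and `closed` inputs are assumed to have been extracted from the logs already.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Set


def stale_open_transactions(opened_at: Dict[str, datetime],
                            closed: Set[str],
                            now: datetime,
                            max_age: timedelta = timedelta(seconds=60)) -> List[str]:
    """Return IDs of transactions still open after more than `max_age`."""
    return sorted(
        txn for txn, started in opened_at.items()
        if txn not in closed and now - started > max_age
    )
```

A scheduled job could email this list every morning, or raise an alert as soon as it is non-empty.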
Ewan
  • Hmm. Are you saying that this elaborate "exception handling" is _completely_ useless? Couldn't it be that the DB write failed because the DB itself is down, in which case an alert email _would_ be useful? On the other hand, if the problem is with the server's NIC, then it would not be able to revert automatically anyway, which would always lead me to the prescribed "worst case" example, where I have to dig through the logs. Also, slightly off-topic, hope you don't mind: how would I know that something went wrong? Would I have to be constantly monitoring the logs for specific patterns? – Avius Aug 19 '19 at 18:41
  • Yes, if the DB failed your application is non-functional. Log the error (which will trigger an alert via your monitoring) and bubble it up to a friendly error message for the user. Your audit trail of logs lets you go through a refund, or complete all the in-flight orders that were affected by the crash, the next day. – Ewan Aug 20 '19 at 00:19
1

Scenarios like this come up a lot when dealing with services external to your system, especially in microservices architectures. In your case, the only external dependency is PayPal. In my opinion, the scenario you suggested seems alright: set the order to pending/awaiting payment, then go to PayPal, then update the order status.

Instead of specific error handling you could implement some sort of monitoring, where you have a dashboard showing the number of pending orders in the database. If the database is down, your monitor is down, so you know you have a problem. If the number of pending orders is too high, something might be wrong as well.

Another pattern you could look at is the Saga pattern, where you define compensating actions for the various steps your process needs to take. However, I don't think PayPal offers endpoints which you can use to reverse the transaction, so this might not be helpful in your case.
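For reference, the core of the Saga idea is small. The sketch below is generic and assumes every step comes with a compensating action; whether a usable compensation (such as a refund) exists for the payment step depends on the provider. The `run_saga` name and the step shape are hypothetical.

```python
from typing import Callable, List, Tuple

# Each step is a pair: (action, compensation that undoes the action).
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run actions in order; on failure, undo completed steps in reverse order."""
    compensations: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(compensations):
                undo()
            return False
        compensations.append(compensate)
    return True
```

In a real system the compensations can fail too, so they usually need retries and an audit trail of their own.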

Mike
0

Ask yourself two things: What is worse, if you ship a product without payment, or if a customer pays and receives no shipment? And what is more likely, that a payment goes through or that it is not accepted?

I suggest that just before you send the payment request off, you write the shipment as “likely paid” to the database. When the payment request goes through or is denied, you update the database. Most likely everything is fine. If not, it is more likely that the payment was made and shipping is the correct thing to do. In the worst case you make the less likely and less damaging mistake of shipping without payment, instead of the more likely and reputation-damaging mistake of not shipping when the customer paid.
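A minimal sketch of this idea, with hypothetical names throughout: `charge` and `update` stand in for the payment call and the database update, and a plain `dict` stands in for the database.

```python
from enum import Enum
from typing import Callable, Dict


class PaymentState(Enum):
    LIKELY_PAID = "likely_paid"
    PAID = "paid"
    DENIED = "denied"


def process_payment(db: Dict[str, PaymentState],
                    order_id: str,
                    charge: Callable[[str], bool],
                    update: Callable[[str, PaymentState], None]) -> None:
    db[order_id] = PaymentState.LIKELY_PAID   # written before the request is sent
    accepted = charge(order_id)
    try:
        update(order_id, PaymentState.PAID if accepted else PaymentState.DENIED)
    except Exception:
        pass  # the update failed, so the record stays LIKELY_PAID


def should_ship(state: PaymentState) -> bool:
    """Ship unless the payment is known to have been denied."""
    return state is not PaymentState.DENIED
```

When the update fails, the stored state remains "likely paid" and the order ships, which is the guess this answer argues for.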

gnasher729
  • Whilst good advice, there is something here that I don't quite follow. Could you please elaborate on the following: 1) By "denied" payment are you referring to an unexpected service/network problem or some user problem such as insufficient funds? 2) Given your logic, what is the point of updating the database after the payment? It does not look like it affects anything. – Avius Aug 19 '19 at 18:31
  • If the payment is denied and you can update the database, you don’t ship. If the payment is accepted or denied and you can’t update the database (which is rare), then when the database is read to decide whether to ship or not, you have to guess. – gnasher729 Aug 19 '19 at 20:34