3

Let's say you have 10 database records which you need to process in the following way:

  1. Start
  2. Pull 2 records from the database with the 'Processed' flag set to 'false'
  3. Call the external web service with the custom data from these 2 records.
  4. Update those 2 records in the database and set them as 'Processed' = 'true'
  5. Goto 1

Now, what can happen in this scenario?

  1. -
  2. An error can occur while pulling the records from the database
    • that's fine, as in the next round we will try to get them again.
  3. An error can occur while calling an external web service
    • that's also fine, as in the next round we will try to send those 2 records again
  4. An error can occur while calling the database for the processed record update
    • what to do now? Are you optimistic that this will never happen?
    • keep in mind that you already sent those 2 records via web service and you must not send them again at any cost
    • one option is to call a database update in a loop until it runs successfully !?
  5. -

The question is, how (would) you handle this situation?

**UPDATE**

Code illustration:

loop
{
    db = database
    ws = webservice

    trans = db.open_transaction

    recs = trans.query_for_two_records_and_update_processed_to_true

    ws_response = ws.send_text_messages(recs)

    if (ws_response == success)
    {
        trans.commit() // what would you do if this fails?
    }
    else
    {
        trans.rollback()
    }
}
Doc Brown
HABJAN
  • "you already sent those 2 records via web service and you must not send them again at any cost" - can you elaborate this - *why* is this not allowed, is there a possibility to "undo" what the web service did to make it possible to send the records again? – Doc Brown Mar 19 '14 at 15:54
  • Let's say the web service is a text messaging service. I sent text messages to the contacts in those two records and I don't want to send the same text messages to the same contacts again. Once the text messages are out, they cannot be undone. – HABJAN Mar 19 '14 at 16:03
  • Then you have two choices. You can make the text message service "unreliable" (maybe it got sent, maybe it didn't) or you can report the failure to send to the caller. Your cell phone does this; it will tell you if it was unable to send your text message. Email does this too; it sends you an Undeliverable message if your email could not be delivered. – Robert Harvey Mar 19 '14 at 16:11
  • Let me put it this way: I don't care what the web service does or whether it processed my request. Let's say the web service returns a SUCCESS response 100% of the time. On my side I just need to ensure that I don't call the web service twice for the same record. – HABJAN Mar 19 '14 at 16:35
  • If the web service call is always going to be successful, then just make sure all other activity is completed before calling the service. – JeffO Mar 19 '14 at 18:40
  • At the very least I would do LIFO (last in, first out) processing, because if there are 10 lines to process and an error happens, then in the next run you're going to get the same 2 records and the same error, and you will have a stalled application. If you always process the newest records, at least the newer records get a chance to be processed. – Pieter B Jun 28 '18 at 13:06
  • Is there not a way of checking with the web service if you already sent your record for processing? – Darkhogg Jun 28 '18 at 13:40

4 Answers

6

It's always possible to come up with an even worse case that ruins whatever solution you come up with. Assume a hostile web service, or a hostile database, and you're screwed.

But in the scenario you describe, you mainly need a more reliable action log than an external database. Let's assume that the local file system is reliable if you don't increase file sizes (which could easily fail if the disk is full).

Then the program should, at startup, create a small file which contains just two fixed-size records, each consisting of space for an ID and a status flag. In addition, we change the database to something slightly more elaborate: the processed flag can now be "new", "in progress" and "processed".
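As a rough sketch of such a file (not part of the original answer): the snippet below preallocates two fixed-size slots and then only overwrites them in place, so the file never grows after startup. The 8-byte ID, the numeric status codes, and the function names are assumptions chosen for illustration.

```python
import os
import struct

# Hypothetical layout: each slot holds an 8-byte record ID and a 1-byte status.
SLOT = struct.Struct("<qB")
SLOTS = 2
CLEAR, SENDING, SENT, FAILED = 0, 1, 2, 3


def create_log(path):
    """Preallocate the file at startup so later writes never grow it."""
    with open(path, "wb") as f:
        f.write(SLOT.pack(0, CLEAR) * SLOTS)
        f.flush()
        os.fsync(f.fileno())


def write_slot(path, index, record_id, status):
    """Overwrite one slot in place and force the change to disk."""
    with open(path, "r+b") as f:
        f.seek(index * SLOT.size)
        f.write(SLOT.pack(record_id, status))
        f.flush()
        os.fsync(f.fileno())


def read_slots(path):
    """Return a list of (record_id, status) tuples, one per slot."""
    with open(path, "rb") as f:
        data = f.read()
    return [SLOT.unpack_from(data, i * SLOT.size) for i in range(SLOTS)]
```

Calling `os.fsync` after each write is what makes the log usable after a power loss; without it the update may still be sitting in an OS buffer.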

Then you want something like this pseudocode:

txlog = disk struct { id, status }

record = db -> load record
txlog -> id = record.id, status = sending
db -> processed = in progress, on failure txlog -> clear and start over
web -> send request, on error response goto service error, on timeout abort
txlog -> status = sent
db -> processed = done, on failure abort
txlog -> clear

The service error handler looks like this:

txlog -> status = failed
db -> processed = new, on failure abort
txlog -> clear
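For readers who prefer something executable, the two pseudocode fragments above could be sketched in Python roughly as follows. This is only an illustration: `process_one`, the three exception classes, and the in-memory `MemoryLog`, `MemoryDb` and `FakeWeb` stand-ins are all invented here, and a real txlog would live on disk rather than in memory.

```python
class Abort(Exception):
    """Stop and leave the txlog untouched so recovery can inspect it."""

class ServiceError(Exception):
    """The web service answered with an explicit error response."""

class DbError(Exception):
    """A database call failed."""

def process_one(db, web, txlog):
    record = db.load_record()
    if record is None:
        return "idle"
    rid = record["id"]
    txlog.write(rid, "sending")
    try:
        db.set_status(rid, "in progress")
    except DbError:
        txlog.clear()  # nothing was sent yet, so starting over is safe
        return "retry"
    try:
        web.send(record)
    except ServiceError:
        # The service error handler from the second pseudocode fragment.
        txlog.write(rid, "failed")
        try:
            db.set_status(rid, "new")
        except DbError as exc:
            raise Abort("could not reset record") from exc
        txlog.clear()
        return "service error"
    except TimeoutError as exc:
        raise Abort("no response, outcome unknown") from exc
    txlog.write(rid, "sent")
    try:
        db.set_status(rid, "processed")
    except DbError as exc:
        raise Abort("sent, but not marked processed") from exc
    txlog.clear()
    return "done"

# In-memory stand-ins, only here so the sketch can be run.
class MemoryLog:
    def __init__(self):
        self.entry = None
    def write(self, record_id, status):
        self.entry = (record_id, status)
    def clear(self):
        self.entry = None

class MemoryDb:
    def __init__(self, statuses, fail_on=()):
        self.statuses = dict(statuses)
        self.fail_on = set(fail_on)  # status values whose update should fail
    def load_record(self):
        for record_id, status in self.statuses.items():
            if status == "new":
                return {"id": record_id}
        return None
    def set_status(self, record_id, status):
        if status in self.fail_on:
            raise DbError(status)
        self.statuses[record_id] = status

class FakeWeb:
    def __init__(self, outcome=None):
        self.outcome = outcome  # None means success, otherwise an exception
        self.sent = []
    def send(self, record):
        if self.outcome is not None:
            raise self.outcome
        self.sent.append(record["id"])
```

Running `process_one(MemoryDb({1: "new"}, fail_on={"processed"}), FakeWeb(), log)` raises `Abort` and leaves `log.entry` at `(1, "sent")`, which is exactly the leftover state the recovery step has to interpret.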

Now, the interesting part comes next. If you get a db error and abort, or your program suddenly disappears (power loss), or if you never get a response from the web service (webservice crashed, or connectivity loss), your program will eventually start again and realize that the txlog file is not cleared. Then it needs to go into error recovery mode.

If the txlog has status "sending", check the db record. If that has status "new", your program died before it could send the request. Clear the log and proceed normally.

If the db record has status "in progress", you're in trouble: it means something went wrong at a point where you might have sent the request to the web service and it might have been processed. Maybe you lost power or network, or the web service did, but you can't know whether the request went through. In this case, you can only alert a human operator to take a look. I'll talk about that later.

If the txlog has status "sent", the record was sent. If the db still has status "in progress", then try to set the status to "processed" again. If it does not, either your program died just before it could clear the txlog, or the db did its update but disappeared without sending a success message. Either way, clear the log and move on.
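The recovery rules above amount to a small decision table, which might be sketched like this. The function name and the returned strings are made up for illustration, and the handling of a leftover "failed" entry is an assumption, since the answer does not spell that case out.

```python
def recovery_action(log_status, db_status):
    """Map (txlog status, db status) to the recovery step described above."""
    if log_status is None:
        return "nothing to recover"
    if log_status == "sending":
        if db_status == "new":
            return "clear log"       # died before the request was sent
        if db_status == "in progress":
            return "alert operator"  # the request may or may not have gone out
    if log_status == "sent":
        if db_status == "in progress":
            return "mark processed, then clear log"
        return "clear log"           # the db update already happened
    if log_status == "failed":
        # Assumption: the service rejected the request, so the record can
        # safely go back to "new".
        return "reset to new, then clear log"
    return "alert operator"          # unexpected combination
```

Only one combination, "sending" in the log with "in progress" in the database, needs a human; every other state can be resolved mechanically.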

But what about the error case? This is the "worse case" that I mentioned in the beginning, the real issue of the unreliable system that you're dealing with. Power loss at just the wrong time. A network cable that suddenly gets cut. Your system needs to adapt, and if, as in the case you're describing, it can't (the web service you describe just doesn't give that option), you're screwed if the worst case happens. The process I described will at least alert you to it. The human operator can then perhaps contact the web service administrators to ask them about their logs, or otherwise make an informed decision what to do about the operation that may or may not have been performed.


On a side note, could we design, say, a text message service that is idempotent? Sure. The service would require, as a first step in every process, allocating a GUID for the message to be sent. Once you have that GUID, you can store it in the transaction log. Then you send your message with the GUID. The service can then check its own records and discover that the message with this GUID has already been sent, and therefore not send it again. And even if the web service failed to properly record the fact that it sent the message, the final receiver can do the same - the phone, for example, could see that it already received a message with that GUID and ignore the new one.
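To make the GUID idea concrete, here is a toy in-memory version of such a service. `MessageService` and its methods are hypothetical, and a real service would have to persist the set of seen IDs, but the duplicate check has the same shape.

```python
import uuid


class MessageService:
    """Toy idempotent sender: a repeated message ID is not sent again."""

    def __init__(self):
        self.delivered = {}  # message ID -> (recipient, text)
        self.outbox = []     # messages that actually went out

    def allocate_id(self):
        """First step of every send: hand out a fresh GUID."""
        return str(uuid.uuid4())

    def send(self, message_id, recipient, text):
        if message_id in self.delivered:
            return "duplicate"  # already sent, so do nothing
        self.delivered[message_id] = (recipient, text)
        self.outbox.append((recipient, text))
        return "sent"
```

With this in place the client can store the GUID in its transaction log and simply retry after any failure; the second `send` with the same ID returns `"duplicate"` and nothing new goes out.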

Sebastian Redl
3

I believe your requirements are theoretically impossible to implement.

Is there no way to check whether the web service has received the records? Some other service operation? If so you could check before you submit.

Also check with the service owner to see if they can make their service idempotent.

Or set up a manual process to handle the failures.

Martin
  • `I believe your requirements are theoretically impossible to implement.` -- Why? – Robert Harvey Mar 19 '14 at 15:45
  • If he is only allowed to call the web service once, and it's not possible for him to keep track of whether it has been called, how would you prevent calling the web service twice? What if his machine dies after the call has been submitted but before the service has responded? He would not even know whether the service has been called or not. – Martin Mar 19 '14 at 15:49
  • @Martin: I think you misunderstood the question. The web service problem I mentioned is more like a connection problem, the server being down, etc. – HABJAN Mar 19 '14 at 15:51
  • @HABJAN No no, I understood. It was more of an example of an even worse scenario you may encounter. You may call the web service and the call is received. The web server attempts to send "OK" back to you but your machine just died. I would absolutely love to be proven wrong though. – Martin Mar 19 '14 at 15:54
  • @HABJAN: please answer Martin's questions in his answer. – Doc Brown Mar 19 '14 at 15:57
  • @Martin: yep, that's true, this can happen, but as far as I'm aware it has not happened. :) – HABJAN Mar 19 '14 at 15:57
  • @DocBrown: Answer to Martin's question: there is a way but at this point it's not important. What's important is the step after that. – HABJAN Mar 19 '14 at 15:59
  • @HABJAN: it is 100% important, you want to make these work as just one combined transaction. Since a combined Web service/ database system does not provide you with a transactional service, you have to implement your own. Which means: one has to check if the whole operation (including the web service operation!) was successful, and if not, roll it completely back manually. So you need two things: a test if the web service operation was successful, and a possibility to undo it. – Doc Brown Mar 19 '14 at 16:04
  • @DocBrown. As a side note: In my opinion, the only sane way to handle this is to implement idempotence throughout the system (call site, web service site). If you are going to make changes to the web service, you may be better off implementing idempotence, rather than adding "Check support". This will lead to a much cleaner interface where you don't have to first check if you have done something, but rather just re-do it if some part failed the first time. Implementing 'Checks' will get messy as the system grows imo. – Nitra Mar 19 '14 at 16:10
  • @Nitra: What does idempotence have to do with this? It is a *write* service. Perhaps you meant atomicity? – Robert Harvey Mar 19 '14 at 16:14
  • @RobertHarvey No. As I interpret the question, the main problem is that you must not send the same records to the web service twice, but since systems are unreliable, there's no way for you to keep track of whether you've already sent the records. So the pretty solution to this issue would be to allow the client to perform the same web service call many times with the same data without side effects. This is idempotence and it's what you use to solve this type of issue. More info here: http://servicedesignpatterns.com/WebServiceInfrastructures/IdempotentRetry – Nitra Mar 19 '14 at 16:21
  • @Nitra: Ah, I see what you mean. But you still need some reasonable guarantee that the update executed, or be notified that it didn't. – Robert Harvey Mar 19 '14 at 16:24
  • I've updated my question with a code illustration to clear up the confusion. – HABJAN Mar 19 '14 at 16:47
0

Since you are so confident you can reprocess step #3 (sending to web service), why can't you just mark processed = true right after step #2?

  1. start
  2. fetch 2 unprocessed records
  3. update processed = true
  4. send to web service.

I only recommend this because of how critical it is to never send to the web service more than once AND you are so confident you can continue to reprocess any failures to send to the web service.
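A minimal sketch of that ordering, with an invented `InMemoryDb` standing in for the real database and `send` standing in for the web service call, could look like this:

```python
class InMemoryDb:
    """Hypothetical stand-in for the real table with a 'Processed' flag."""

    def __init__(self, ids):
        self.processed = {i: False for i in ids}

    def fetch_unprocessed(self, limit):
        pending = [i for i, done in self.processed.items() if not done]
        return [{"id": i} for i in pending[:limit]]

    def mark_processed(self, ids):
        for i in ids:
            self.processed[i] = True


def process_batch(db, send):
    """Mark first, send second: a record is sent at most once."""
    records = db.fetch_unprocessed(2)
    if not records:
        return []
    # If this raises, nothing has been sent and the next round retries.
    db.mark_processed([r["id"] for r in records])
    # If this raises, the records stay marked and are never picked up again.
    send(records)
    return records
```

The trade-off is visible in the two comments: a database failure is harmless, while a failed send leaves records marked as processed without having been delivered.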

JeffO
  • Then I need to ensure that the call to the web service is successful for sure... which gets me to the same situation. – HABJAN Mar 19 '14 at 16:28
  • That's true, but this way, there is one less point of failure since there's no need to update the database after sending to the web service and risk records not being marked as processed. – JeffO Mar 19 '14 at 18:37
0

If you cannot use transactions as @RobertHarvey said, use three states: 'unprocessed', 'processing' and 'processed'. Log all failures in a persistent log of some sort (a file), and if step 3 or 4 fails, try to set the record back to 'unprocessed'.

Then manually resolve the ones stuck in 'processing' using the information in the log.

herby