We currently have a 'global catch all exceptions' section in our application. When an uncaught exception is thrown, the stack trace is displayed and the application continues running.

More often than not, this leaves the application in an invalid state going forward, especially when the cause is a NullReferenceException or a threading exception.

I decided to have the application log the exception in the 'global' section, then shut down and restart. This was met with criticism from management, who asserted that the user should choose whether or not to restart under these conditions, since the application never shut down before. (I tried my best to explain that the real issue was allowing the app to continue running in the first place.)

I am looking for things to watch out for now, and specific approaches I can take to handle code that is now allowed to continue to run while in an invalid state.

Sheldon Warkentin
  • It's in an invalid state - there is nothing you can reason about it except that the unexpected can happen. If you explain to management that "invalid state" can mean "loss of earnings" or similar, perhaps they will listen. – Oded Oct 31 '11 at 19:44
  • I agree with @Oded, the risk of losing user data from your crashing app and thereby damaging the credibility of your brand is something your managers should definitely pick up on. – Carlo Kuip Oct 31 '11 at 20:06
  • If you want management leverage, tell them this state possibly means that someone is trying to hack into the app and steal company trade secrets and customer data. – hotpaw2 Oct 31 '11 at 22:57
  • Save the stack trace and fix the bug instead. – Martin Wickman Nov 01 '11 at 09:20
  • Unless you're writing software for life support machines or other safety critical systems - there's really no reason to leave an application limping along in a crippled state. – MattDavey Nov 01 '11 at 09:31
  • or Formula 1 cars... :) – MattDavey Nov 01 '11 at 11:46
  • @MattDavey there is really no reason to do this in safety critical software either; in fact you would probably be forbidden from doing it. – jk. Nov 01 '11 at 13:13
  • @jk that's not been my experience - which industry sector are you from? (Not questioning your statement, just interested to know where this difference of opinion stems from) – MattDavey Nov 01 '11 at 13:25

4 Answers

Gee this sounds familiar.

I was once the manager for a group that had a large application with a few unhandled exceptions that were eventually caught by a global catch-all-and-display-the-world-has-ended handler. By default this allowed the application to keep running.

Your point about application state in this case is the same one I made: the application should not continue running because its internal state is unknown, and most likely suspect (after all that's how an exception came to be raised in the first place).

I wanted the application modified so that the user could save data and then the application would exit - no opportunity to continue. This was met by fierce resistance, mainly from developers who argued that it had continued for a long time and there had been minimal damage - just lots of complaints. In the end I had to back down - the user was allowed to continue but dialogs were changed to strongly suggest to users that they should exit and restart. I still think this was wrong.

In the end, every exception that is seen needs to be analysed to find the underlying cause, and appropriate fixes, code changes, or whatever else is needed must be applied so that exceptions don't happen - or if they do, they are caught as part of normal program flow and handled locally.

(Sometimes, sadly, exceptions are propagated out of things like libraries. One of the rules I was taught a long, long time ago was: "do not ever rely on exceptions for normal program flow". It seems this is not always regularly practiced any more.)

I feel your pain - but in the end, the principle is a simple one and you should be sticking to it: When the last chance exception handler is fired, things are really sick. The program should exit. Doing otherwise gives users the misleading impression that they can continue working, but internally the program has an unknown or broken state. Continuing to work only leads to things getting progressively worse. In the end this translates into loss of customer satisfaction. Customers are also pissed off by a program that raises an exception and exits - if there is unsaved program state you should try and save it, perhaps in a manner that makes it clear that what is saved may not be reliable. At least you reduce the chance of data loss. But continuing to operate... bad bad move. You have to choose the lesser of 2 evils.
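
To make that policy concrete, here is a minimal sketch of a last-chance handler. It is written in Python purely as a language-neutral illustration (the application in the question is .NET), and the names `last_chance_handler`, `emergency_save`, and the `.recovered` suffix are my own assumptions, not anything from the original application. The idea is: log the exception, make a best-effort save to a separate file that is clearly labelled as possibly unreliable, and hand back a non-zero exit code rather than carrying on.

```python
import json
import logging
import traceback
from pathlib import Path

log = logging.getLogger("last_chance")


def emergency_save(state: dict, original: Path) -> Path:
    """Write state next to the original file, never over it (hypothetical helper)."""
    recovery = original.with_name(original.name + ".recovered")
    payload = {
        "warning": "saved after an unhandled exception; may be unreliable",
        "state": state,
    }
    recovery.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return recovery


def last_chance_handler(exc: BaseException, state: dict, original: Path) -> int:
    """Log the failure, attempt a best-effort save, and return a non-zero exit code."""
    stack = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    log.critical("Unhandled exception:\n%s", stack)
    try:
        emergency_save(state, original)
    except Exception:
        # The state is suspect, so the save itself may fail; exit regardless.
        log.exception("Emergency save failed")
    return 1
```

A real program would presumably call something like `sys.exit(last_chance_handler(...))` from its top-level `except` block, so that continuing is simply not an option.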

quickly_now

I can understand the point that your application managers are trying to make, that simply shutting down or restarting immediately upon encountering an unexpected error can confuse a user. You always want to give the user information about what happened and if possible give them a choice on how they would like to handle it.

You may also want to try to localize the problem to a particular component that can be reloaded or re-synced, so that other working areas of the application do not need to be brought down.

On the other hand, if continuing to work could corrupt persisted data, causing future problems for the user or others, or if that corrupted data would need manual intervention to clean up, then you have a case for shutting the application down as soon as possible. Just make sure that the shutdown is informative and graceful (... as graceful as an application crash could possibly be :)

Perhaps your efforts are better focused on stabilizing the application to the point where unexpected exceptions occur far less often and expected exceptions far more often.

maple_shaft
  • We have been working on stabilizing the application, which is why the scenario I am describing is more of a rare occurrence. It came up recently because of a regression we had introduced; fixing it uncovered this larger problem. – Sheldon Warkentin Oct 31 '11 at 20:47

It's definitely best practice to shut down an application that has thrown an unhandled exception. Under WinForms in .NET, there used to be a dialog that popped up with a big warning and the option to continue running. But since, I think, .NET 3.5 that continue option is no longer there and you are forced to shut down the app.

Whether you then restart the application automatically is up to you - I have seen this done by some other major software like browsers.

The dialog for our 'global' catch-all exception handler displays the stack trace plus other information (machine name, time, error message, current user name, etc.) in a textbox which can be copied to the clipboard. A button named 'Send Error Report' emails support with the information. In my experience, the user probably won't bother informing you that it happened unless they are given an easy way like this to email you. And writing the exception details to a log file, while still necessary, won't be of much use to the developer until you get access to the file or to the machine.
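
For illustration, gathering that information into one copyable block of text might look roughly like this. It is a Python sketch rather than our actual .NET code, and `build_error_report` and the field labels are made up for the example; the dialog and the email step are left out.

```python
import getpass
import platform
import traceback
from datetime import datetime, timezone


def build_error_report(exc: BaseException) -> str:
    """Collect the details support usually asks for into one text blob (hypothetical helper)."""
    stack = "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    try:
        user = getpass.getuser()
    except Exception:
        # Some environments have no resolvable user name; don't fail the report over it.
        user = "<unknown>"
    return "\n".join([
        f"Time (UTC): {datetime.now(timezone.utc).isoformat()}",
        f"Machine:    {platform.node()}",
        f"User:       {user}",
        f"Error:      {type(exc).__name__}: {exc}",
        "",
        stack,
    ])
```

The same string can go into the textbox, the log file, and the body of the email, so the user and the developer are looking at identical details.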

dodgy_coder

Depending on the situation, there's an answer I used once: if something goes haywire but it's not a certain fatality, I reset the document name, so that if the work gets corrupted, saving it won't overwrite the original. It was only intended to give users time for a graceful exit.
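
A rough sketch of the idea, in Python for brevity. The `Document` class, `mark_suspect`, and the " (recovered)" naming are hypothetical, invented for this example and not taken from the original program. Once the error handler marks the document as suspect, every later save goes to a new file name.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Document:
    path: Path
    content: str = ""
    suspect: bool = False

    def mark_suspect(self) -> None:
        """Point future saves at a new file so the original stays untouched."""
        if not self.suspect:  # rename only once, even if several errors occur
            new_name = f"{self.path.stem} (recovered){self.path.suffix}"
            self.path = self.path.with_name(new_name)
            self.suspect = True

    def save(self) -> Path:
        self.path.write_text(self.content, encoding="utf-8")
        return self.path
```

The error handler would call `mark_suspect()` on every open document before letting the user carry on, so a later save cannot replace the last known-good file.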

Loren Pechtel