A lot of code that's designed to convert or parse some data of type Foo into a Bar is written with the assumption that one wouldn't intentionally pass it invalid inputs. As such, it assumes that everything is correct and throws exceptions if it later realizes that something is wrong.

This is typically an appropriate thing to do: expect the inputs to be valid, throw if they're not. Unfortunately, there are some cases where the caller has a piece of data that could represent any of several types. The simplest way to handle that would be to try each of the possible parsers in turn and use the output of whichever parser doesn't complain.

As a slightly forced but suitable example, let's say you're parsing incoming JSON data. As per some specification, a given string property can be a number (as a string), one of a few magical keywords defined in an enum, or some other plaintext, and each of these needs to call a corresponding add(int), add(MyEnum) or add(String) method.

The simplest way would basically boil down to

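try {
    add(Integer.parseInt(input));
} catch (NumberFormatException notAnInt) {
    // Not a number, so try the enum next.
    try {
        add(MyEnum.valueOf(input));
    } catch (IllegalArgumentException notAnEnum) {
        // Neither a number nor a keyword: treat it as plain text.
        add(input);
    }
}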

This blatantly goes against the oft-repeated mantra of "Don't control flow using exceptions", and it falls prey to the dreaded performance cost of using try and throw. It also goes against the notion that exceptions are only for, well, exceptional circumstances that typically shouldn't happen; here, we're basically explicitly saying "yep, there'll probably be an exception or two here before we're done".

Using parseInt as the main example, a slightly prettier approach would be to create a utility method along the lines of

public static OptionalInt tryParseInt(String s) {
    try {
        return OptionalInt.of(Integer.parseInt(s));
    } catch (NumberFormatException e) {
        return OptionalInt.empty();
    }
}

This is basically just hiding the exact same problems behind a pretty facade. Granted, it makes the calling code cleaner and it encapsulates the ugliness into a method dedicated to containing it, but it doesn't solve the underlying problems. It does however fit well into the "Keep it simple, stupid" rule of thumb.
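To illustrate what "cleaner calling code" means here, below is a minimal sketch of the three-way dispatch rewritten on top of such helpers. The class name TryParseDemo, the enum constants FOO and BAR, the tryParseEnum helper and the classify method are all made up for the example; classify just returns a label saying which add(...) overload would have been called.

```java
import java.util.Optional;
import java.util.OptionalInt;

final class TryParseDemo {
    // Hypothetical stand-in for the MyEnum mentioned in the question.
    enum MyEnum { FOO, BAR }

    // Same wrapper as above: the exception is still thrown, just hidden.
    static OptionalInt tryParseInt(String s) {
        try {
            return OptionalInt.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    // Hypothetical counterpart for the enum case.
    // Note: MyEnum.valueOf(null) throws NullPointerException, which is not caught here.
    static Optional<MyEnum> tryParseEnum(String s) {
        try {
            return Optional.of(MyEnum.valueOf(s));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    // Stand-in for the add(int) / add(MyEnum) / add(String) dispatch:
    // returns a label describing which overload would be chosen.
    static String classify(String input) {
        OptionalInt asInt = tryParseInt(input);
        if (asInt.isPresent()) {
            return "int:" + asInt.getAsInt();
        }
        Optional<MyEnum> asEnum = tryParseEnum(input);
        if (asEnum.isPresent()) {
            return "enum:" + asEnum.get();
        }
        return "text:" + input;
    }

    public static void main(String[] args) {
        for (String input : new String[] {"42", "FOO", "hello"}) {
            System.out.println(classify(input));
        }
    }
}
```

The nested try-catch is gone from the caller, but an input like "hello" still costs two thrown exceptions under the hood.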

A third option would be to simply implement my own tryParseInt from scratch. Unfortunately, that means I've got to duplicate well-established and likely well-optimized functionality in the JDK. In the more general case as well, if I'm using a library to parse my Foo into a Bar, it's likely because parsing it is complicated to begin with. So, this is a case of reinventing the wheel instead of reusing existing functionality, which almost certainly makes it a considerably worse solution in all but the most trivial of cases.

Option 4A is to write a tryParseInt method that first validates the input, returning OptionalInt.empty() if the validation fails, and passes the input to Integer.parseInt if the validation succeeds. This has two drawbacks though. First, while not reimplementing the full functionality of Integer.parseInt, it still performs the same validation that's done implicitly by the JDK method. We are avoiding exceptions altogether, but at the cost of having to essentially replicate well-established logic. The second problem is that it's not always easy to write completely correct validation without effectively writing a parser, thus...

Option 4B is to write optimistic validation logic that works in all the common and likely cases, but may still let some special cases slip through. The call to the underlying parser would then still be wrapped in a try-catch, with the assumption that the catch will rarely be used. Again using tryParseInt as an example, we could check if the string matches the regex [+\-]?\d{1,10} to cover 99% of the cases. Values in [2147483648, 9999999999] and [-9999999999, -2147483649] would still pass the check, make Integer.parseInt throw and be caught, but they're unlikely to ever occur. This also doesn't solve the underlying structural problems, but it avoids something like 99% of the performance impact of throwing exceptions (assuming that the validation itself is fast).
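For concreteness, here is a rough sketch of what options 4A and 4B might look like side by side. The class and method names are invented for the example. The strict variant finishes its validation by parsing into a long and range-checking, which is just one assumed way of doing it and not necessarily the best; the optimistic variant keeps the try-catch as a safety net for the out-of-range values mentioned above.

```java
import java.util.OptionalInt;
import java.util.regex.Pattern;

final class ValidatingParse {
    // Optional sign followed by 1 to 10 digits (the regex from the text above).
    private static final Pattern LIKELY_INT = Pattern.compile("[+\\-]?\\d{1,10}");

    // Option 4A (sketch): validate completely, so nothing is thrown or caught.
    static OptionalInt tryParseIntStrict(String s) {
        if (s == null || !LIKELY_INT.matcher(s).matches()) {
            return OptionalInt.empty();
        }
        // A sign plus at most 10 digits always fits in a long, so this cannot throw.
        long value = Long.parseLong(s);
        if (value < Integer.MIN_VALUE || value > Integer.MAX_VALUE) {
            return OptionalInt.empty();
        }
        return OptionalInt.of((int) value);
    }

    // Option 4B (sketch): cheap pre-check, then let the JDK do the real work.
    static OptionalInt tryParseIntOptimistic(String s) {
        if (s == null || !LIKELY_INT.matcher(s).matches()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            // Only reached for 10-digit values outside the int range.
            return OptionalInt.empty();
        }
    }
}
```

Both variants are stricter than Integer.parseInt in one respect: by default \d matches only ASCII digits, whereas parseInt, as far as I know, also accepts other Unicode decimal digits. That is a small instance of the "hard to validate completely correctly" problem from option 4A.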

Again, Integer.parseInt and the rest are just simple examples of a more general problem when dealing with third-party parsers and similar.

Basically, my question boils down to how to compromise between the different, inherently incompatible, rules of thumb. Do we want to blindly abolish intentional exceptions altogether? Do we want to limit most of the exceptions' performance impact without adding too much complexity? Do simplicity and maximal code reuse outweigh those "nit-picky details" and "premature micro-optimizations"?

Is there a commonly accepted best practice for situations like this?

  • As a rule, efficiency concerns should predominate only in small, critical, oft-repeated areas of a code base. Handling textual input is usually not one of these areas. Therefore, in many such cases I would probably ignore the received wisdom about exception cost. – Kilian Foth Aug 02 '18 at 14:04
  • How critical is performance in your context? In other words, how much would it hurt you to have the try/catch overhead? – Aganju Aug 02 '18 at 14:55
  • It is not hard to parse this or any other well defined text properly, you should not even consider doing this any other way. – Martin Maat Aug 02 '18 at 17:12
  • If you are mass importing a bunch of text files that require parsing, then the overhead of exception handling is very noticeable and measurable. If you are attempting to parse the occasional user input, the impact of the exception handling is hardly noticeable. The `tryParse` idea does have some merit __if__ you aren't just hiding an exception. The process of acquiring the current stack trace is what causes the impact. You'll have to make your decisions based on how your application does things. – Berin Loritsch Aug 02 '18 at 17:58

1 Answer


The reason you're struggling with this is that there is no clear answer. The situation is ambiguous.

It helps to remember that rules of thumb are guidelines, not absolute requirements.

Exception handling works best when the error is detected deep in the code and handled remotely (in some high-level recovery code). But it's a tool you can use as you like. The most important rule is pragmatism.

As for things like parsing ints, or JSON - these are clearly ambiguous. Probably more often than not, failures in these sorts of tools will result in exceptions - and probably more often than not, this is best. But if you have an application where that isn't the case, there is NOTHING wrong with turning those exception-throwing utilities into error-code-returning utilities (via the wrapping you described).

Generally - another guideline I'd recommend you consider, to go along with the two you are already struggling with - is "when in Rome, do as the Romans do", or the "principle of least surprise". If you are using Java, and parseInt throws, prefer writing your code to work with throwing. If you are using C++, where sscanf/strtol etc. don't throw, check the error codes.

Another good rule of thumb is "how can you make your code more terse?" Brevity isn't always helpful, but if you look at your code and there is a lot of boilerplate because of doing things one way (exceptions or error codes), try it the other way and see if it's simpler or clearer.

The good news is that the one factor that USED to be (somewhat) relevant, but is now far less relevant, is performance. It's very unlikely in any modern application that the performance difference between these approaches will matter.

Lewis Pringle
  • It's not possible in Java, but if your language supports output parameters, you could use a `TryParse(string text, out T value)` construct like in C#. It returns a boolean for whether the parse was successful, and the output parameter returns the value. Or if your language supports native tuples, you can return the pair of the value and whether it was successful or not. `[value, success] = TryParse(text)`. It helps bridge the gap without actually throwing exceptions. If you are mass parsing documents, it makes a big enough difference. – Berin Loritsch Aug 02 '18 at 17:51