Sunday, January 30, 2011

Formalization of Exception Mechanism

The notion of exception has been around for a long time. The purpose of this note is to unify various models and scattered terminology into a coherent model, and to complement what is already known with new mechanisms.

Formalization should be based on feasibility research in the context of a programming language. Furthermore, in order to demonstrate that the model is sound, the algorithms should be verified. However, the discussion of verification is not within the scope of this note. The language selected for research is Z++, which supports the entire model described here.

We begin with the notion of exception layers and handlers. Then we discuss two forms of resumption, followed by the handling of exceptions occurring in threads and child processes. Finally, we discuss a mechanism for ensuring, at compile time, that every raised exception has a handler.

Layers of exception

A model for an exception mechanism starts with the notion of circles or layers. Generally, each layer deals with certain kinds of exceptions. If an exception reaches the final layer, usually the operating system, without being handled, the program crashes. In particular, an exception trapped in the layer of the operating system, such as an invalid pointer, is terminal.

Below is a linguistic representation of the notion of layer for the exception mechanism. The scope of a layer lies between the terms layer and handler. In the handler section, each exception that can be handled in this layer has a case, followed by the code for handling the caught exception.

layer
    // body of layer
handler
    case exception_1:
        // handler
    // more cases
    else
        // catchall handler
endlayer;

The else leg acts as a catchall and is optional. It should be reserved for the final stage before an un-handled exception would be handed over to the operating system. Everywhere else, it is better to have a handler case for each exception of interest for the layer at hand.

Each handler begins a new scope so objects can be declared and further exception layers may be opened, just as is the case with the layer section. Finally, the term endlayer closes the entire layer statement.

The specification exception-type between brackets helps the reader know the category of exceptions dealt with in the layer. It also allows the compiler to trap handler cases that do not belong to the specified category.
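As a rough analogue outside Z++, nested layers can be sketched in Python with nested try/except blocks. This is only an illustration of the idea, not Z++ semantics: the exception classes, the helper raise_ and the two layer functions are hypothetical names invented for the example, and the final except Exception plays the role of the optional else leg.

```python
class DatabaseError(Exception):
    """Hypothetical category: exceptions related to database operations."""

class RecordNotFound(DatabaseError):
    pass

class ConnectionLost(DatabaseError):
    pass

def raise_(exc):
    # Helper so that a lambda can raise an exception.
    raise exc

def inner_layer(action, log):
    # Inner layer: has a case only for RecordNotFound.
    try:
        action()
    except RecordNotFound:
        log.append("inner: record not found")

def outer_layer(action, log):
    # Outer layer: deals with whatever the inner layer let through.
    try:
        inner_layer(action, log)
    except ConnectionLost:
        log.append("outer: connection lost")
    except Exception as exc:  # analogue of the optional else (catchall) leg
        log.append(f"outer: catchall {type(exc).__name__}")

log = []
outer_layer(lambda: raise_(RecordNotFound()), log)
outer_layer(lambda: raise_(ConnectionLost()), log)
outer_layer(lambda: raise_(ValueError()), log)
```

Each exception is handled in the first layer that has a case for it; only the ValueError, for which no layer has a case, falls through to the catchall.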

Raising an exception

The Z++ statement for raising an exception is simply raise(exception).

Resumption

Resumption from exception is an indispensable mechanism. Consider a scenario in which deleting an entity from a database fails because the entity was not in the database. Why should we skip all the statements following the failure? The appropriate approach is to log the failure in a file and/or inform the user in a handler, then go back to the statement following the delete statement and continue with the rest of the work.

Unlike repetition, which we discuss next, the semantics of resumption are quite simple. Execution simply returns to the statement following the one that caused the exception.

The term for resumption in Z++ is resume, which can only appear in a handler. Upon resumption, all objects created in the layer section (before the statement that caused the exception) will be available with the states they had prior to the occurrence of the exception. However, if an object was passed by reference to the statement that caused the exception, its state could be different.
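Python has no resume, so the database scenario above can only be approximated. The sketch below models the body of a layer as a list of statements (callables) and continues with the next statement after a handler has run. The names run_layer, delete and database are hypothetical, and handlers are looked up by exact exception type to keep the example short.

```python
def run_layer(statements, handlers):
    """Run statements in order; after a handled exception, resume at the next one."""
    for statement in statements:
        try:
            statement()
        except Exception as exc:
            handler = handlers.get(type(exc))  # exact-type lookup, for brevity
            if handler is None:
                raise  # no case in this layer: pass on to the enclosing layer
            handler(exc)  # falling out of the handler acts like "resume"

database = {"alice", "bob"}
log = []

def delete(name):
    database.remove(name)  # set.remove raises KeyError if the name is absent

statements = [
    lambda: delete("carol"),  # fails: carol is not in the database
    lambda: delete("bob"),    # still runs, because the layer resumes
]
handlers = {KeyError: lambda exc: log.append(f"delete failed: {exc.args[0]}")}

run_layer(statements, handlers)
```

The failed delete is logged, and the statement that follows it is still executed, which is the behavior the scenario asks for.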

Repetition

Repetition is a form of resumption that puts control back to the start of the statement that caused the exception. This mechanism allows fixing the problem that caused the exception and trying the operation again.

For function calls, loops and switch statements, as well as all simple statements such as assignments, repeat returns control to the start of the statement. In an if-else statement, however, control returns to the start of the leg in which the exception occurred. The reason is that we need to return to where we got into trouble, whereas testing the condition again might put us in a different leg.

One can argue that we should treat a switch statement the same way as an if-else statement. Nonetheless, it seems reasonable to provide two different mechanisms: one that allows selecting a different leg, as is the case with the switch statement, and one that puts us back in the leg that caused the exception. Otherwise we would need to make up two terms, one to go to the start of a selection statement and another to return to the leg that caused the exception. Such terminology would be of no use with regard to iterations.

The term for repetition in Z++ is repeat, which can only appear in a handler. Upon repetition, all objects created in the layer section (before the statement that caused the exception) will be available with the states they had prior to the occurrence of the exception. However, if an object was passed by reference to the statement that caused the exception, its state could be different.
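Repetition for a single statement can be approximated in Python with a retry loop: the handler fixes the problem, and the statement is then executed again from its start. This is a minimal sketch under stated assumptions. The names run_with_repeat, read_timeout and supply_default are hypothetical, and the max_attempts cap is an addition of the sketch to rule out endless repetition; the text above does not say that Z++ imposes such a limit.

```python
def run_with_repeat(statement, handlers, max_attempts=3):
    """Run a statement; after a handled exception, run it again from its start."""
    for attempt in range(max_attempts):
        try:
            return statement()
        except Exception as exc:
            handler = handlers.get(type(exc))  # exact-type lookup, for brevity
            if handler is None or attempt == max_attempts - 1:
                raise  # no case for it, or out of attempts: pass it on
            handler(exc)  # fix the problem; looping again acts like "repeat"

settings = {}
attempts = []

def read_timeout():
    attempts.append(1)          # record that the statement was started
    return settings["timeout"]  # raises KeyError until a value is supplied

def supply_default(exc):
    settings["timeout"] = 30    # the handler repairs the cause of the exception

result = run_with_repeat(read_timeout, {KeyError: supply_default})
```

The statement runs twice: the first attempt raises, the handler supplies the missing value, and the second attempt succeeds.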

Threads

Exceptions can occur in child threads created in a layer statement. A child thread in Z++ passes its un-handled exceptions on to its parent and terminates. If the parent has already executed statements following the one that created a global thread, then resuming or repeating from such an exception will result in an incorrect program.

In order to use resumption and repetition correctly, create the global thread inside a plain function and call that function. The call to the function will block until all child threads terminate.

Calls to methods of a task thread block. Thus, the use of threads of task type is safer than global threads for purposes of using resumption and repetition.
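The blocking pattern can be sketched in Python with the standard threading module. The function run_in_child_threads is a hypothetical name; it starts one thread per task, waits for all of them, and only then re-raises the first un-handled exception in the caller. Because the call blocks, the caller has not moved past it when the exception arrives, so a handler in the caller could still meaningfully continue after the call or repeat it.

```python
import threading

def run_in_child_threads(tasks):
    """Start one thread per task, block until all finish, then pass an
    un-handled exception from a child on to the caller."""
    errors = []
    lock = threading.Lock()

    def wrapper(task):
        try:
            task()
        except Exception as exc:
            with lock:
                errors.append(exc)  # remember it; the thread then terminates

    threads = [threading.Thread(target=wrapper, args=(task,)) for task in tasks]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # block until every child thread has terminated
    if errors:
        raise errors[0]  # surface the child's exception in the parent

results = []

def good_task():
    results.append("done")

def bad_task():
    raise ValueError("bad input")

caught = None
try:
    run_in_child_threads([good_task, bad_task])
except ValueError as exc:
    caught = str(exc)
```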

Components

A Z++ component starts as a child process, locally or remotely. Un-handled exceptions occurring in a component are passed on to the parent process, but the component process continues to execute in good state waiting for future invocations. Since calls to entry points of a component block, resumption and repetition behave nicely. This is true even when a component is loaded as a remote child process.

However, a component could create its own threads, which will continue to execute even when the parent has not called one of its entry points. Exceptions can occur in one of the child threads of the component. Such exceptions must be dealt with within the child component. In particular, resumption from such an exception may result in an incorrect program unless it is properly designed to deal with the situation.

An un-handled exception passed from a component to its parent can be handled in different ways. The parent may choose to handle and ignore the exception, to terminate the component, or to do something else, such as loading a different component to do the job or loading the same component from another location.
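One of these choices, loading the same component from another location, can be sketched as follows. Everything here is hypothetical: ComponentFailure stands for an un-handled exception passed up from a component, load is a stand-in for whatever loads a component, and the location names are invented. Real components would be separate processes; plain functions are used only to keep the sketch self-contained.

```python
class ComponentFailure(Exception):
    """Hypothetical: an un-handled exception passed up from a component."""

def invoke_with_fallback(locations, load, request):
    """Call the component's entry point, trying each location in turn."""
    failures = []
    for location in locations:
        entry_point = load(location)
        try:
            return entry_point(request), failures
        except ComponentFailure as exc:
            # Handle the exception in the parent, then try another location.
            failures.append((location, str(exc)))
    raise ComponentFailure(f"all locations failed: {failures}")

def load(location):
    # Stand-in loader: the component loaded from "local" always fails.
    def entry_point(request):
        if location == "local":
            raise ComponentFailure("disk full")
        return f"{location} handled {request}"
    return entry_point

result, failures = invoke_with_fallback(["local", "remote"], load, "job-1")
```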

Exception categories

Having exception categories helps in designing programs, as well as in understanding a written program. For instance, one set of exceptions may be related to database operations, another to the correct state of a particular class (for invariants and constraints), and yet another to loading and interacting with a component. This can be accomplished by naming the exceptions appropriately.

However, the type specification for the exception layer requires that all exceptions in a category be related. In Z++ this is accomplished via the type of the exception. In other languages it can be done through inheritance or other mechanisms supported by that language.
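In a language with inheritance, such as Python, a category can be modeled as a base class, and the check that every handler case belongs to the category of its layer becomes a subclass test. The class names and the function foreign_cases below are hypothetical. A compiler would perform this check statically, while the sketch runs it at run time.

```python
class DatabaseError(Exception):
    """Category: exceptions related to database operations."""

class RecordNotFound(DatabaseError):
    pass

class ComponentError(Exception):
    """Category: exceptions related to loading and using components."""

class LoadFailed(ComponentError):
    pass

def foreign_cases(category, handler_cases):
    """Return the handler cases that do not belong to the layer's category."""
    return [case for case in handler_cases if not issubclass(case, category)]

# A layer declared for DatabaseError with a stray case for LoadFailed.
stray = foreign_cases(DatabaseError, [RecordNotFound, LoadFailed])
```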

One category of exceptions that helps in writing more robust programs is the system exceptions. These include division-by-zero, null-pointer and index out of bounds (arrays etc.). The point is that their existence as named exceptions helps engineers deal with such cases. Otherwise one has to rely on engineers remembering all of them.

In Z++, system exceptions are raised by the run-time libraries, allowing programs to deal with such cases before they reach the layer of the operating system.
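Python offers a comparable facility: its run time raises the built-in ZeroDivisionError and IndexError, so a program can deal with these cases itself. The function safe_ratio is a hypothetical example, not a Z++ construct.

```python
def safe_ratio(values, index, divisor):
    """Divide values[index] by divisor, handling run-time raised exceptions."""
    try:
        return values[index] / divisor
    except ZeroDivisionError:
        return "division by zero"
    except IndexError:
        return "index out of bounds"

normal = safe_ratio([10, 20], 0, 4)
by_zero = safe_ratio([10, 20], 1, 0)
out_of_bounds = safe_ratio([10, 20], 5, 2)
```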

Beyond system exceptions, all other categories of exceptions are user-defined. This simply means that engineers must be able to introduce their own categories of exceptions.

Compiler involvement in catching exceptions

It is possible to have the compiler inform us of exceptions that might be raised but for which we have not defined a handler.

Exceptions occur inside the bodies of functions and methods. The list of exceptions expected to be raised in the body of a function is specified via a throws list after the header of a function.

The compiler uses the throws lists when checking nested function calls. Starting with the innermost function, the compiler verifies that its throws list is covered by the cases of the layer statements in the body of that function. If some exceptions are not covered, the compiler carries them forward and checks them against the layer statements in the next outer call. This continues until the compiler reaches the initial point of call. The function in which the initial call started must have a handler for every exception that has not been handled up to this point; otherwise the compiler reports an error.

This design requires a little more writing for the specification of throws list. For instance, the function calling another function with a throws specification must include the same specification to avoid compiler errors. However, the extra writing helps the understanding of the code and allows the compiler to inform us of our misses in handling raised exceptions.
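The check can be sketched in Python over a simplified description of a program. This is an illustration under stated assumptions, not the Z++ compiler's algorithm: each function is described by a hypothetical dictionary giving the exceptions it raises directly, the functions it calls, the cases its layers handle (assumed to cover the whole body), and its declared throws list.

```python
def check_throws(functions, entry):
    """Return error messages; an empty list means every exception is covered."""
    errors = []
    for name, spec in functions.items():
        # Exceptions that can arrive in this body: raised here or by a callee.
        incoming = set(spec.get("raises", ()))
        for callee in spec.get("calls", ()):
            incoming |= set(functions[callee].get("throws", ()))
        # Whatever the layers of this body do not handle escapes to the caller.
        escaping = incoming - set(spec.get("handles", ()))
        if name == entry:
            if escaping:  # nothing may escape the initial point of call
                errors.append(f"{name}: un-handled {sorted(escaping)}")
        else:
            undeclared = escaping - set(spec.get("throws", ()))
            if undeclared:
                errors.append(
                    f"{name}: missing from throws list {sorted(undeclared)}"
                )
    return errors

program = {
    "read_record": {
        "raises": {"NotFound", "IoError"},
        "handles": {"IoError"},
        "throws": {"NotFound"},
    },
    "update": {"calls": ["read_record"], "throws": {"NotFound"}},
    "main": {"calls": ["update"], "handles": {"NotFound"}},
}

ok = check_throws(program, "main")

# The same program, but main no longer has a case for NotFound.
broken = dict(program, main={"calls": ["update"]})
problems = check_throws(broken, "main")
```

Note that update must repeat NotFound in its own throws list even though it raises nothing itself, which is the extra writing mentioned above.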
