Friday, July 15, 2005

Exception Mechanism Correctness

A compiler deals with several notions of correctness, with syntactic correctness paramount in its list. In this note I would like to discuss a notion of correctness regarding exception mechanism.

We have learned the hard way that even expensive testing can miss uncaught exceptions with disastrous consequences. Therefore, the following notion of correctness seems justified.

Are all thrown exceptions caught?

That is, how can we direct the compiler to verify that all thrown exceptions are actually caught?

Note that the compiler can only be responsible for exceptions raised by the application. For instance, division by zero is an exception that the underlying system may throw at any point. The compiler would not know where to assume that catching such an exception is related to correctness. For instance, we may only perform a division after testing that the object in the denominator has a value more than 77. In such cases we are responsible for proper error checking.

Let us abbreviate the notion of correctness we are seeking to Exception Mechanism Correctness (EMC). Furthermore, the mechanism is easier to illustrate in the context of Z++, where it has been implemented. However, a paragraph of digression will suffice to provide a suitable context.

A Z++ program is also a component that can participate in composing larger programs. The components of a program could be on different physical machines. A component has a set of entry points that marks its boundary. A component cannot call its own entry points, while other components can only call its entry points.

That was our digression. So, the abstract view is that, a call can enter a component through an entry point, and leave the component from the same entry point. An entry point to a module is what a method is to a class.

An exception is thrown from within a function (global or method). Thus, the place to look for catching exceptions is where functions are called. Compiler uses prototype specification at the point of call without necessarily having seen the definition of the function being called. So, the first step in mechanizing EMC is to specify the list of exceptions that a function can throw, at the prototype of that function.

The compiler will have to validate EMC at two points. When compiling the definition of the function the compiler must verify that the set of exceptions thrown in the body is a subset of those specified at the prototype of the function. Then, at the point of call it must ensure that all exceptions specified at the prototype of the function are actually caught.

Requiring all thrown exceptions to be caught right at the point of call is too limiting. Suppose a function F has been specified to throw a set of exceptions, say FEX. Let G be another function with a set of exceptions GEX. Assume G is called from inside the definition of F, and that F does not catch any exceptions surrounding the call to G. We will allow the exceptions that G could throw to pass on to F. That is, the compiler will be happy to see that GEX is a subset of FEX.

In the above scenario if some of the exceptions thrown by G are caught inside the body (definition) of F, the required set of exceptions that F could throw reduces by those actually caught. Thus, FEX needs only contain the subset of GEX that is not caught.

Let us now consider a sequence of nested calls. A function F can be called by many different functions in an unknown sequence. In general, at compile time we cannot determine which function marks the end of nested calls prior to program termination. This is where the entry points of a module (component) come into play.

The Z47 processor will deliver exceptions raised in an entry point to the parent module that invoked the entry point. The parent module must have the prototype of the entry point, which lists the exceptions that the entry point could throw. With this infrastructure in place, the compiler does not need to worry about physical decomposition of a large program, and where each component may reside.

Note that, for a sequence of nested calls that does not return to an entry point we do not need to catch all exceptions, and the compiler will not complain. However, the functionally significant start of any sequence of calls is from within an entry point. Therefore, the compiler can determine the effective (functionally significant) uncaught exceptions that must be specified as part of set of exceptions thrown by an entry point.

The following rules present an abstract view of EMC for design and understanding the Z++ compiler messages with regard to EMC.

First EMC Rule. The set of exceptions that a function could throw must be a subset of those specified at its prototype.

Second EMC Rule. All exceptions specified at the prototype of a function are eventually caught, unless ineffective.

The actual algorithm is considerably more complex. For instance, try-catch blocks can be nested, exceptions can be raised without making a call, etc. Furthermore, the Z++ compiler must take into accoount the exceptions thrown by class invariants and method constraints, none of which a C++ compiler has to deal with.

Home of distributed platform-free software development language Z++ is ZH Micro.

Labels: , , , , ,