Sunday, December 12, 2010

Differentiating between exception mechanism and signaling

An operating system in its role as a supervisor oversees the operations of processes. For a variety of reasons, including an attempt by a process to reach out to addresses outside of its scope, the OS must terminate the offending process. In a number of other situations, the OS cannot allow the process to proceed unless it corrects its state. The OS informs a process of its illegal behavior by raising an exception in the context of the process.

An application can also encounter situations that call for raising an exception. In this scenario the purpose is generally to remedy the situation so that the execution of the program can proceed. For instance, in Z++ the invariants of a class, or the constraints of a method at the time of its invocation, may be violated. In Z++ these violations can trigger a private method to rectify the problem. Sometimes, however, it may be necessary for the user of the object to correct the problem. Hence in Z++ violations of invariants and constraints can also raise exceptions that report the problem to the next level.
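The two responses to a violated invariant can be sketched as follows. This is a Python analogy rather than Z++ syntax, and every name in it (Account, _repair, InvariantViolation) is hypothetical. The invariant assumed here is that the balance never goes negative: a private method first tries to repair the object, and only if that fails is the problem reported to the caller as an exception.

```python
class InvariantViolation(Exception):
    """Reported to the caller when the object cannot repair itself."""


class Account:
    """Hypothetical class with the invariant: balance >= 0."""

    def __init__(self, balance, reserve):
        self.balance = balance
        self.reserve = reserve

    def _repair(self):
        # Private remedy: cover the shortfall from the reserve if possible.
        shortfall = -self.balance
        if shortfall <= self.reserve:
            self.reserve -= shortfall
            self.balance = 0
            return True
        return False

    def withdraw(self, amount):
        previous = self.balance
        self.balance -= amount
        if self.balance < 0 and not self._repair():
            # The object cannot fix itself: restore the old state and
            # report the problem to the next level.
            self.balance = previous
            raise InvariantViolation("withdrawal exceeds balance and reserve")


acct = Account(balance=50, reserve=20)
acct.withdraw(60)            # repaired privately; the caller sees no exception

reported = False
try:
    acct.withdraw(30)        # cannot be repaired; the caller must handle it
except InvariantViolation:
    reported = True
```

In the first call the user of the object never learns that anything went wrong. In the second, the exception hands the decision to the caller.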

We observe that a process can receive exceptions from two sources only: the OS and the process itself. If the only response to an exception were termination, there would be no point in talking about exceptions as anything other than crashes. When an exception occurs, UNIX intercepts the process and directs it to a handler that the process has specified for that exception. After handling the exception, the process is allowed to continue its execution from where it was intercepted. However, C++ makes it impossible to use this mechanism for a number of reasons, among them the lack of resumption after an exception.

When it is possible to remedy the situation, the execution of a program should continue, just as we carry on in our daily lives. Simply crashing is not an intelligent solution for all scenarios. In particular, we expect to see more embedded computers in the future, where a manual override or resetting the computer may not always be a desirable solution.

Turning our attention to signaling, this mechanism is not intended for reporting unusual situations that a process must correct or else be terminated. Signaling is a form of asynchronous communication among threads and processes. Unlike exceptions, a process can receive signals from processes other than itself. The general pattern of signaling is that one thread or process generates a signal or an event. Another thread or process responds to the generated signal at a later time.

The generation of a signal should not intercept the recipient process. Instead, the recipient should be able to respond to the signal when it is able to do so. But this requires the type of abstractions that a programming language should provide. UNIX, as an operating system, uses signals for exceptions, terminating or stopping a process, as well as plain signaling as a means of communication.
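As a minimal sketch of this idea, consider a recipient that owns a queue of pending signals. The sketch is in Python, and the Mailbox class is a hypothetical stand-in, not part of UNIX or Z++. Generating a signal only places it in the queue. The recipient keeps working and looks at its pending signals when it chooses to.

```python
import queue


class Mailbox:
    """Hypothetical queue of pending signals owned by one recipient."""

    def __init__(self):
        self._pending = queue.Queue()

    def deliver(self, name):
        # The sender only enqueues; the recipient's control flow is untouched.
        self._pending.put(name)

    def drain(self):
        # Called by the recipient when it is ready to look at its signals.
        heard = []
        while True:
            try:
                heard.append(self._pending.get_nowait())
            except queue.Empty:
                return heard


mailbox = Mailbox()
work_log = []

mailbox.deliver("refresh")          # the signal is generated here...
work_log.append("step 1")           # ...but the recipient keeps working
work_log.append("step 2")
work_log.extend(mailbox.drain())    # the recipient responds when it is able
```

The signal was generated before "step 1", yet it appears in the log only after "step 2", because the recipient was never intercepted.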

In order to better see the difference between signaling and raising an exception we will briefly describe tell/hear distributed signaling in Z++ and illustrate the need for two forms of resumption from exception.

Z++ hear signals add the dimension of transfer of data from the generator of the signal to the handling thread or process in a manner similar to a function call. For that reason, the generation of a hear signal may be thought of as an asynchronous function call. In particular, hear signals travel among distributed nodes allowing us to also speak of asynchronous remote procedure call (RPC).

The generator of a hear signal tells its signal in the following general form.

Tell Recipient Signal List-of-Data

The recipient will eventually hear the signal in the following general form.

Hear Signal List-of-Data

The tell statement, as is the case with any signal, contacts the remote Z47 Processor, not the receiving process. The actual contact and communication protocol takes place between the local and the remote Signal Managers of Z47 Processors. Part of the protocol is to ask the remote Z47 whether the recipient process exists prior to sending the data. This is like verifying an address with the post office before dropping the mail.

Now suppose the remote Z47 replies negatively. Clearly, the local Z47 must inform the teller process that the recipient is not a process at the given address. At this point we are back to the Z++ source program that executed the tell statement. The proper course of action would be to change the address and try the tell statement again in search of a recipient.

One pattern that may come to mind is to make the tell statement return a value indicating its success or failure. One can then use a loop and continue to check the returned value of the tell statement until a recipient is found.
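For comparison, the return-value pattern might look like the following Python sketch. The tell function, the LIVE_ADDRESSES set and the node names are all hypothetical and stand in for whatever the runtime would actually consult.

```python
# Hypothetical set of addresses at which a recipient process exists.
LIVE_ADDRESSES = {"node-c"}


def tell(address, signal, data):
    """Return True if a recipient exists at the address, False otherwise."""
    return address in LIVE_ADDRESSES


def tell_until_found(candidates, signal, data):
    # Keep checking the returned value until some address succeeds.
    for address in candidates:
        if tell(address, signal, data):
            return address
    return None


found = tell_until_found(["node-a", "node-b", "node-c"], "update", [1, 2])
```

Every use of tell now has to be wrapped in a check of this kind, which is exactly what makes a signal look like an ordinary function call.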

However, the semantics of signaling is not the same as that of a function call. Traditionally, we do not generate a signal and then check its returned value. A more familiar approach would be to raise an exception when the remote Z47 replies negatively to a tell statement. This approach clearly separates the notion of a signal from that of raising an exception. In Z++ the failure of a tell statement causes an exception.
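The exchange described above, in which the remote side is asked whether the recipient exists and a negative reply surfaces as an exception, can be sketched in Python. The SignalManager class, the RecipientNotFound exception and the tell and hear functions are hypothetical illustrations. They do not reproduce the actual Z47 protocol or Z++ syntax.

```python
class RecipientNotFound(Exception):
    """Raised in the teller when the remote side has no such process."""


class SignalManager:
    """Hypothetical stand-in for the signal manager of one node."""

    def __init__(self):
        self.inboxes = {}    # process name -> list of (signal, data)

    def has_process(self, name):
        return name in self.inboxes

    def accept(self, name, signal, data):
        self.inboxes[name].append((signal, data))


def tell(remote, recipient, signal, data):
    # First ask the remote manager whether the recipient exists.
    if not remote.has_process(recipient):
        raise RecipientNotFound(recipient)
    # Only then hand over the data; the teller does not wait for a reply.
    remote.accept(recipient, signal, data)


def hear(manager, name):
    # The recipient picks up its oldest pending signal, if there is one.
    inbox = manager.inboxes[name]
    return inbox.pop(0) if inbox else None


remote = SignalManager()
remote.inboxes["printer"] = []

tell(remote, "printer", "job", ["page1"])     # succeeds silently

failure = None
try:
    tell(remote, "scanner", "job", [])        # no such process
except RecipientNotFound as exc:
    failure = str(exc)
```

The successful tell returns nothing for the teller to inspect. Only the failure interrupts the normal flow.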

It should be clear that, without resumption to re-execute the tell statement, the abstraction of hear signaling is essentially useless. In UNIX, an interrupted process continues from where it left off to handle a signal. However, in our scenario we really need to go back to the tell statement that caused the exception and try it again, rather than continue the execution from the statement following it. That is the reason for the two forms of resumption in Z++.
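Python has no resumption from exceptions, so the two forms can only be emulated. The sketch below does so with a small helper, under the assumption that the handler decides which form it wants. The names RETRY, CONTINUE and run_with_resumption, as well as the node addresses, are hypothetical.

```python
RETRY = "retry"          # re-execute the statement that raised
CONTINUE = "continue"    # carry on with the statement that follows it


class RecipientNotFound(Exception):
    pass


def run_with_resumption(statement, handler, max_attempts=5):
    """Run statement; on failure let the handler pick the form of resumption."""
    for _ in range(max_attempts):
        try:
            return statement()
        except RecipientNotFound as exc:
            if handler(exc) == CONTINUE:
                return None      # skip the failed statement and move on
            # Otherwise loop around and execute the same statement again.
    raise RuntimeError("gave up after repeated retries")


live = {"node-c"}                                  # where a recipient exists
candidates = iter(["node-a", "node-b", "node-c"])
target = {"address": next(candidates)}
attempts = []


def tell_statement():
    attempts.append(target["address"])
    if target["address"] not in live:
        raise RecipientNotFound(target["address"])
    return target["address"]


def change_address(exc):
    # Correct the state, then ask for the tell statement to be retried.
    target["address"] = next(candidates)
    return RETRY


def always_fails():
    raise RecipientNotFound("nowhere")


delivered_to = run_with_resumption(tell_statement, change_address)
skipped = run_with_resumption(always_fails, lambda exc: CONTINUE)
```

The first call illustrates going back to the failed tell statement with a corrected address. The second illustrates the UNIX-style behavior of continuing past the point of interruption.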

It may be worth pointing out that hear signaling is a generalization resulting from dropping the requirement of rendezvous. In Communicating Sequential Processes each process that reaches the rendezvous point waits for the other until they synchronize and exchange data. By contrast, a process generates a hear signal through a tell statement and moves on. Eventually, the recipient will hear the signal and receive the data. In this scenario neither process waits for the other. For that reason we refer to this form of communication as Communicating Concurrent Processes.

In conclusion, we summarize our discussion as follows. Exceptions and signals are not equivalent. An exception interrupts the normal execution of a process and may result in its termination. A process cannot receive exceptions from other processes. On the other hand, a process can receive signals from other processes, and may choose to respond to them when it is able to do so.