Thursday, June 16, 2011

Requirements for unrestricted distributed computing

We begin by considering some of the major techniques in use for developing distributed applications, observing that the main tool is communication via sending messages. We then look at other forms of communication that extend our ability to tackle more problems in the space of distributed computing, and simplify our current methods. Finally, we introduce a medium that alleviates the prevailing restrictions in distributed computing.

A distributed algorithm, whether centralized, hierarchical or of some other kind, deals with the states of its objects residing on different nodes. This requires only a communication mechanism between the parts of the algorithm executing on the various nodes. Specifically, it does not impose any conditions of distributivity on the operating systems in control of those nodes.

A distributed algorithm is a special case of distributed computing. The client-server model provides a general medium for solving many problems of a distributed nature. For instance, Remote Procedure Call (RPC) uses the client-server model, as does remote linkage, where the library code is downloaded to the client node and executed by the client instead of the server. Nonetheless, the client-server model does not impose conditions of distributivity on the operating systems controlling the client or the server nodes.

As things are, linguistic abstractions for communication can be defined, and in most cases implemented, without dependence on system peculiarities. This allows defining universal linguistic abstractions for the forms of distributed computing that rely solely on communication mechanisms. In other words, the notions of distributed computing in these categories are not constrained by limitations of the available operating systems.

Modes of Communication

For clarity, we will refer to the operating system in control of a node as the host operating system (HOS).

At the application level, three main forms of communication, and their combinations, are used to solve problems of a distributed nature. The first is the protocol mechanism, used in applications such as email. The second is to request the execution of a function on a remote node, as in RPC or Remote Method Invocation (RMI). The third is blocking with a timeout: a distributed algorithm generally blocks, with a timeout, on sending or receiving the awaited data, at which point synchronization takes place.
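
As an illustration of blocking with a timeout, here is a minimal Python sketch. It assumes two endpoints on the same machine standing in for two nodes; the helper name receive_with_timeout and the variable names are ours, chosen only for this example.

```python
import socket

def receive_with_timeout(sock, timeout_seconds, max_bytes=1024):
    """Block until data arrives or the timeout expires; return None on timeout."""
    sock.settimeout(timeout_seconds)
    try:
        return sock.recv(max_bytes)
    except socket.timeout:
        return None

# Two connected endpoints standing in for two nodes (an assumption of this sketch).
node_a, node_b = socket.socketpair()

# Nothing has been sent yet, so the receiver gives up after the timeout.
first_attempt = receive_with_timeout(node_b, 0.1)

# Once node_a sends, the receiver obtains the awaited data: the synchronization point.
node_a.sendall(b"partial result")
second_attempt = receive_with_timeout(node_b, 0.1)

node_a.close()
node_b.close()
```

Note that nothing here asks the operating system for anything beyond ordinary data transmission, which is the point made above about communication-based techniques.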

On the other hand, on a single node, many problems are tackled via signaling. In particular, Graphical User Interface (GUI) programming depends heavily on the notion of an event.

Communication via signaling for solving problems of a distributed nature requires assistance from the HOS of each node engaged in the operation. The HOS creates and owns the processes; therefore only the HOS can inform a process of the arrival of a signal. In particular, if a signal generated on one node is to be delivered to a process on another node, the HOSs of the two nodes must support a compatible system of signaling.

Distributed Operating System

The notion of a distributed operating system (DOS) can be defined in terms of its processes. A communication mechanism is a necessary but not a sufficient condition for a DOS. An HOS controlling two or more nodes is distributed only if its processes can get their next time-slice on a different node. In other words, a process can enter the waiting queue on one node and gain its time-slice on a different node. This is equivalent to the ability of processes to move from one node to another with strong mobility. Thus, an operating system is distributed if and only if it supports strong mobility of its processes.

The notion of strong mobility came about in research on Autonomous Agents. Our observation implies that the materialization of the notion of autonomous agent is only feasible as a process of a distributed operating system.

The notion of strong mobility bears some vagueness, though. Loosely speaking, only the soul of a process can be transported, not its body. For instance, a process may have open files, may be interacting with a remote server, and so on. Resources of this kind attached to a process cannot be transported. On the other hand, all objects, threads, the stack of nested calls, nested scopes of loops and selections, raised exceptions and registrations of (distributed) signals, as well as the point of control, are transportable. Thus, prior to transportation all files must be closed, and reopened at the destination as needed. After all, the same files may or may not exist at the destination. Communication with another server, for instance a database server, will have to be re-established at the destination as well. However, these are simple matters that one can easily remember before being reminded by raised exceptions.
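
The distinction between what travels and what does not can be sketched in a few lines of Python. This is only an analogy under stated assumptions: the Agent class, its depart and arrive methods, and the log file names are hypothetical, and JSON stands in for whatever packaging a real system would use. Only plain state is packaged; the open file is closed before departure and a new one is opened at the destination.

```python
import json
import os
import tempfile

class Agent:
    """Hypothetical traveler: its data is transportable, its open file is not."""

    def __init__(self, log_path, visited=None):
        self.visited = list(visited or [])   # transportable state
        self.log = open(log_path, "a")       # resource tied to the current node

    def visit(self, node_name):
        self.visited.append(node_name)
        self.log.write(f"arrived at {node_name}\n")
        self.log.flush()

    def depart(self):
        """Close local resources and package only the transportable state."""
        self.log.close()
        return json.dumps({"visited": self.visited})

    @classmethod
    def arrive(cls, package, log_path):
        """Rebuild the agent and open a file that exists at the destination."""
        state = json.loads(package)
        return cls(log_path, state["visited"])

workdir = tempfile.mkdtemp()
agent = Agent(os.path.join(workdir, "node_a.log"))
agent.visit("node_a")
package = agent.depart()

moved = Agent.arrive(package, os.path.join(workdir, "node_b.log"))
moved.visit("node_b")
moved.log.close()
```

The sketch carries only data. Transporting the stack of calls and the point of control, as strong mobility requires, is exactly what an ordinary language runtime does not offer.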

Separation of Responsibilities

An HOS is responsible for managing resources and allocating them for use by its processes. Data transmission is one such resource. The HOS packages data and sends it to its destination. The receiving node unpacks the data and delivers it to the target process. The interpretation of the contents of the packages being transmitted is up to the communicating processes. Think of it as the phone system, which establishes a connection so that two people can talk.

At the end of the section on “Modes of Communication” we introduced the notion of distributed signals (signals sent from one node to another). Given that an HOS should not be aware of the contents of data packages, the transmission and arrival of distributed signals requires a runtime library specific to the language used for writing such software. Since each HOS has its own way of handling signals, these libraries will be expensive to maintain. Furthermore, libraries can do nothing about an HOS that does not support signaling. This follows from our observation that the HOS creates and owns the processes, and therefore only the HOS can inform a process of the arrival of a signal.
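
To make the role of such a runtime library concrete, here is a small Python sketch. Everything in it is hypothetical: the SignalRuntime class, the JSON message layout and the signal names are assumptions made for illustration. The transport sees only opaque bytes; the library on the receiving node decodes them and invokes the handler that a process registered.

```python
import json

class SignalRuntime:
    """Hypothetical per-node runtime library that delivers distributed signals."""

    def __init__(self, node_name):
        self.node_name = node_name
        self.handlers = {}   # (process_id, signal_name) -> callable

    def register(self, process_id, signal_name, handler):
        self.handlers[(process_id, signal_name)] = handler

    @staticmethod
    def encode(process_id, signal_name, payload):
        # To the transport layer this is just a sequence of bytes.
        message = {"pid": process_id, "signal": signal_name, "payload": payload}
        return json.dumps(message).encode("utf-8")

    def deliver(self, raw_bytes):
        """Decode an arriving package and call the matching handler, if any."""
        message = json.loads(raw_bytes.decode("utf-8"))
        handler = self.handlers.get((message["pid"], message["signal"]))
        if handler is None:
            return False
        handler(message["payload"])
        return True

received = []
node_b = SignalRuntime("node_b")
node_b.register("thermostat", "temperature_changed", received.append)

packet = SignalRuntime.encode("thermostat", "temperature_changed", {"celsius": 21})
delivered = node_b.deliver(packet)
ignored = node_b.deliver(SignalRuntime.encode("thermostat", "door_opened", {}))
```

The handler here is called by the library, inside the receiving program. Interrupting a process that is busy elsewhere would still need the cooperation of whoever owns that process, which is the limitation described above.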

Now, with regard to agents traveling among nodes, we run into an impasse. Assume that the sending HOS packages the state and the code of an agent and transmits it to another node. The runtime library on the receiving node must use the contents of the package to create a process and set its state, and finally tell the HOS where to set its instruction pointer when it allows this new process to execute. However, apart from a system call to create a process, we shall need a fairly complex set of system calls to set the state of a new process and its point of execution. Providing such a set of system calls violates the abstraction of the notion of process. Furthermore, an obvious requirement is that every HOS handle processes in a compatible, if not identical, manner, from a desktop to a PDA and everything else.

A rather obvious conclusion from the preceding observation is that an HOS cannot simply evolve into a Distributed Operating System (DOS). It may now appear that the materialization of the notion of DOS is not feasible. Well, it is not so.

First, we must distinguish between programming a device and developing an application for use on that device. An application is solely a user of system resources and has no part in controlling or managing them. Therefore, an application must run as a different type of process than system processes, which are responsible for allocating and managing system resources. This separation has been known for a long time with regard to processes created by an HOS. Some processes run in supervisor mode and others in user mode. However, that is not the point we are making here about the separation of processes.

For clarity, we shall refer to a process created by an HOS as a system process or a real process. Below we define the notion of a virtual process.

Definition. A process not created by the HOS controlling a computing device will be called a virtual process. A virtual process may also be referred to as an application process.

The notion of a process, real or virtual, has certain characteristics. For the above definition to be valid, a virtual process must possess these characteristics. A virtual process is the representation of an application in execution. A virtual process needs to be created, will have a state and will consist of one or more threads. Furthermore, the creator of virtual processes must manage them and facilitate Inter-Process Communication (IPC), signaling and other related notions. This observation leads to the realization that we need some kind of operating system for the creation and management of virtual processes.

A virtual process, as the representation of an application, also needs access to system resources managed by the HOS. But if a virtual process directly requests resources and holds them, it will be indistinguishable from a real process. We therefore observe that the operating system creating virtual processes should respond to the needs of its processes. This operating system, as a system application, can then interact with the HOS to gain access to the resources needed by its virtual processes.

These observations lead us to the conclusion that instead of the impossible task of creating a distributed HOS we can create a distributed operating system (DOS) that runs like any other system process. The DOS will be responsible for creating and managing virtual processes, as well as handling distributed signals. Remember that for an operating system to be distributed its (virtual) processes must be able to travel while retaining their context.
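
The idea of a virtual process gaining its next time-slice on a different node can be pictured with a toy Python model. It is a sketch under loud assumptions: VirtualProcess, Node and run are hypothetical names, the nodes are ordinary objects in one program, and the whole context of the process is reduced to a counter.

```python
from collections import deque

class VirtualProcess:
    """Hypothetical virtual process whose whole context is plain data."""

    def __init__(self, name, limit):
        self.name = name
        self.counter = 0      # stands in for the state of the process
        self.limit = limit
        self.trace = []       # which node granted each time-slice

    def run_slice(self, node_name):
        self.counter += 1
        self.trace.append(node_name)
        return self.counter < self.limit   # True while work remains

class Node:
    """Stand-in for a node with its own waiting queue."""

    def __init__(self, name):
        self.name = name
        self.ready = deque()

def run(nodes, process):
    """Grant one slice per node in turn, moving the process between queues."""
    nodes[0].ready.append(process)
    index = 0
    while True:
        node = nodes[index]
        current = node.ready.popleft()
        if not current.run_slice(node.name):
            return current
        index = (index + 1) % len(nodes)
        nodes[index].ready.append(current)   # the next slice is granted elsewhere

nodes = [Node("a"), Node("b")]
finished = run(nodes, VirtualProcess("counter", 3))
```

The process keeps its context (the counter) while each successive slice is granted by a different queue. A real system would have to do the same for threads, call stacks and the point of control.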

Conditions for a feasible DOS

Our conclusion does not seem to solve our problem in that developing a distributed operating system to manage all user applications is not an easy task, if feasible at all. Thus, we have to choose between continuing to write distributed software using our restricted abstractions, or engaging in research towards building a viable DOS.

Suppose we choose to build a DOS. Unless a few general conditions are satisfied, our DOS will have limited use. For instance, it should be small enough to run on a PDA or any small computing device that we may utilize, say, in home appliances. The latter is to facilitate the building of smart appliances capable of interacting with one another and perhaps with our PDA or a home computer.

Perhaps the more important condition is that our DOS should be independent of any HOS except for requesting resources for its processes. Otherwise, porting and maintaining the DOS will be even more expensive than porting contemporary virtual machines. In other words, the DOS must be self-contained in creating and managing its processes, threads, signals, etc., without making any calls to its HOS for these purposes. In particular, a DOS application can be multi-threaded on an HOS that does not support threading.
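
That multi-threading need not come from the HOS can be shown with a well-known technique: cooperative threads scheduled entirely in user code. The Python sketch below uses generators for this; it is an illustration of the general idea only, not a description of how Z47 works, and the names worker and run_all are ours.

```python
from collections import deque

def worker(name, steps, output):
    """A cooperative thread: does one unit of work, then yields control."""
    for step in range(steps):
        output.append(f"{name}:{step}")
        yield   # hand control back to the scheduler

def run_all(threads):
    """Round-robin scheduler that makes no threading calls to the host system."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)          # grant one time-slice
        except StopIteration:
            continue              # the thread has finished
        ready.append(thread)

output = []
run_all([worker("a", 2, output), worker("b", 3, output)])
# output is now ["a:0", "b:0", "a:1", "b:1", "b:2"]
```

The two workers interleave although the whole program runs in a single thread of the host system.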

The self-contained distributed operating system Z47 Virtual Processor came to its completion in 2008, after over two decades of intense research and development. Z47 satisfies the conditions of viability mentioned above.

Linguistic abstractions

A mechanical tool needs an operator. Thus, the mechanization of computation, even in its most primitive form such as an abacus, also needs an operator. In our day this has evolved into an operating system and a language. At a lower level, a computing device consists of a set of devices and their drivers under the control of an operating system. This arrangement makes it possible for user applications to interact with the operating system via a language.

The theory of computation is the skeleton of the theory of automation. However, automation has its own flesh and organs. In particular, notions of distributed computing are not topics of study within the theory of computation. A language like FORTRAN replaces an abacus expert but offers no linguistic abstractions for distributed computing.

The strength of the language of mathematics in formalizing science is in its cumulative nature as a single language as opposed to a collection of languages. Linguistic abstractions of notions of computing should not be dispersed among languages. Applications built by gluing multiple languages are fragile and too expensive to maintain. On the other hand, the use of a single contemporary language requires the implementation of abstractions provided by other languages, resulting in expensive software with a large maintenance crew.

Views of software development

There are two distinct views of software development. Ultimately, programming a computing device must be completely separated from developing user applications. In the latter case linguistic abstractions should replace system calls, which are indispensable for system-oriented software such as virtual machines. User applications, on the other hand, have a meaning independent of the computing device on which they run, except for their user interface.

Applications for users of a computing device need to use the resources available on the device without having to manage those resources. In addition, distributed applications need further linguistic abstractions for signaling, in particular signaling with data transfer, and for the mobility of processes. The actual transportation of virtual processes, for instance, should be completely transparent to an application, just as paging and virtual memory are to system processes. This avoids obscure defects resulting from violations of abstractions.

A developer of user applications should be able to use the language like a poet. In writing her poetry the poet relies on the language capabilities to express herself. The clay tablet, or pencil and paper or a keyboard, are not the determining factors in the construction of the poetry. Analogously, a developer wishes to express his solution relying only on the language he is using, not the system on which he is constructing his solution.

The Z++ abstract language

The language Z++ is the matching counterpart of the Z47 distributed operating system. However, Z++ contains its own linguistic abstractions for distributed computing and does not expose Z47 internals the way C++ exposes those of UNIX.

An application developer expresses a solution in Z++ the same way an author writes a book. The developer relies entirely on the linguistic abstractions relevant to his domain of the problem. Furthermore, the Z++ language coherently contains all successful abstractions of other languages since the beginning of programming. This eliminates the need for gluing multiple languages together for a complete solution. All aspects of a solution can be expressed in Z++ alone.


We have identified messaging as the only currently available means for distributed computing. The contents of a message could be elements of a communication protocol, programming data, remote function execution or executable code.

In our attempt to include signaling in solving problems of a distributed nature, we arrived at the notion of a virtual process and consequently a distributed operating system. The characteristics of a distributed operating system were listed, and the Z47 Virtual Processor was introduced as a viable representative.

The mechanization of automation rests on two elements: an operating system and a corresponding language. The Z++ abstract language was introduced as the language corresponding to the Z47 distributed operating system. In particular, we emphasized that Z++ is an evolutionary language by design, coherently containing all the successful mechanisms and notions of other languages. Furthermore, Z++ is monotonically extensible in a manner similar to the language of mathematics.
