Tuesday, September 16, 2008

The feature of data-type misalignment

In this article I would like to share some of the interesting issues that we ran into in the course of making Z47 Processor available for the Windows Mobile 6 platform and for devices powered by processors such as ARM.

Crafting a driver for a device is a low-level activity requiring specialized skills. A properly managed, coherent sum of a large set of device drivers does not add up to an operating system. For an operating system, such a set is complemented with abstractions such as the notion of a process, and with a logically designed set of system calls. One then starts with a bare operating system and promotes it to a platform by enhancing it with system programs for developing applications for that operating system. That was done with UNIX by the hands of graduate students of those days.

The goal of an operating system is to facilitate the development of applications for its platforms. However, in the case of Windows, developing an application is equivalent to putting up a fight with Microsoft engineers. The number of combinations of system calls and of choices made in creating a window is vast, and in most cases they interfere with one another. A particular system call may respond differently depending on what flags were set in the creation of the window receiving the call. That makes system calls context sensitive, depending on what flags are set or what style attributes are selected. To some extent this may be necessary, but not to the extent that we encountered.

Developing an application poses sufficient challenge. Engineers should not have to put up a fight with the mysteries planted by random choices made by the folks who maintain the code of their operating system. No wonder one needs to reboot the machine every so often. I really hope no one has considered Windows for developing critical applications.

Developing C++ (or C# for that matter) applications for Windows Mobile is like writing a driver in assembly code for a very complicated device (not to be confused with a complex device). It is good to know that Z++ essentially cushions the obscurity of Windows through its clear abstractions. So now let us look at the ARM processor.

The purpose of a universal programming abstraction, such as Z++, is to allow complex data types to move around among computing devices without any knowledge of, or concern for, the kinds of platforms involved. In such cases, one may use packed arrays as a low-level canonical data type representation. On each computing device, Z47 Processor then provides the illusion of data-type abstractions at the Z++ language level. This is as it should be.

Restricting memory access to multiples of 2 or 4 reduces the complexity of a processor and is not of any consequence. A system compiler simply pads data without having to generate any extra code for handling such data. The problem begins when the location and the type of data must match, as is the case for the ARM family of processors. For instance, a two-byte datum must be located at an address that is a multiple of 2, while a four-byte datum must be located at an address that is a multiple of 4. There is no reason for a processor to be aware of the size of the data it is trying to reach. Quite to the contrary, the address of a piece of data and its representation are two distinct notions.
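To make the rule concrete, here is a minimal sketch in C of the check that the processor is in effect imposing on every access. The helper names is_aligned and check_alignment_cases and the sample addresses are my own illustrations, not anything taken from ARM or Windows Mobile.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when `addr` is a multiple of `size`,
   0 otherwise. `size` is assumed to be nonzero. */
int is_aligned(uintptr_t addr, size_t size)
{
    return addr % size == 0;
}

/* Walks through the cases described above, using made-up addresses. */
void check_alignment_cases(void)
{
    assert(is_aligned(0x1000, 2));   /* two-byte datum at an even address */
    assert(!is_aligned(0x1001, 2));  /* two-byte datum at an odd address */
    assert(is_aligned(0x1000, 4));   /* four-byte datum at a multiple of 4 */
    assert(!is_aligned(0x1002, 4));  /* even, but not a multiple of 4 */
}
```

Note that the address 0x1002 is acceptable for a two-byte datum and unacceptable for a four-byte one. The same location is valid or invalid depending only on the type being read, which is exactly the coupling of address and type at issue here.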

If I can get around data-type misalignment by telling the compiler to generate the code to avoid it, I am consciously working around a defect. Do not insult my intelligence by telling me that I must redesign my program because I could not understand the notion of data abstraction. Data abstraction is not supposed to start at the level of a processor. Otherwise why don’t we just make object-oriented hardware processors?

What ARM is doing has nothing to do with the idea of RISC processors, unless RISC now means “CHEAP”. The fact that I can get around the defect is not a good reason to accept the defect as having any value. Generally, software is more expensive to develop than hardware. Besides, I have no desire to drive cars of a century ago and start the engine manually. By contemporary standards, lack of a starter is an unacceptable defect regardless of the car’s price tag.

Selling a cheap design as a feature is nothing new. Actually we do that every day in politics. More to the point, IBM would love to sell COBOL by telling everyone that threading is a hazardous operation. But multi-threaded programs have earned a sufficient reputation to make that type of sales strategy impossible.

Java is still selling the idea that pointers must be avoided. Obviously, except in low-level system programming, accessing locations directly and changing their values must be avoided. Thus, Z++ has all the pointer capabilities of C++ except the ability to reach actual physical locations. Incidentally, Java is just a third-party library such as Qt, presented in an awkward language, in which you also have to write a lot of your own libraries, in Java.

Let me digress a little bit and discuss pointers. In other articles, I have clarified the point that programming pointers are the equivalent of variables in mathematics. Consider the statements “Let n be an integer. Then …”. Basically, the statement means that the result we are going to establish holds for any integer. In the same vein, when we use a programming statement involving a pointer, we mean that the statement will correctly execute for all actual objects that are substituted for the pointer (by de-referencing the pointer). This is true so long as the type of the pointer and the objects match (with polymorphism in mind). In parallel, a mathematical assertion established for the variable n as an integer does not necessarily hold for a rational number. On the other hand, a result established for real numbers generally holds for rational numbers (polymorphism).

Just as the use of variables allows making more complex statements in mathematics, the use of pointers increases expressiveness. It is not a surprise that a more expressive language is more complex and may cause mistakes until the user learns it well. A high percentage of students have difficulty passing an elementary required course in mathematics. Many students pass calculus II (integration concepts and techniques) with a B or lower grade. It will be difficult for such students to manipulate pointers. But should everyone be entitled to be a software engineer?

Going back to ARM, developers are familiar with “data-type misalignment” crashes of their programs. The question is whether combining the notions of address and type of data is a consequence of the technology, or a blunder that is here to stay with us. It is more reasonable to assume the latter. Either way, an admission of the issue is more satisfying than selling it as a feature.

I find it odd that some people argue that developers must do better in designing their code when even a simple double indirection fails, limiting the use of pointers. Is this advice for low-level programming of devices, or for writing Z++ applications? The point is that a processor should not interfere with notions beyond its scope, such as data types.

It is true that inexperienced engineers abuse pointer casting. Pointer casting eliminates compilation errors, which is taken to mean that the program is correct. Sometimes you find experienced engineers using an untagged union representing a variety of unrelated types. Nonetheless, it is not the role or responsibility of a hardware processor to educate engineers. Such education must come from academia.

Among many correct and necessary uses of pointer casting is the one I mentioned earlier, namely packed arrays. Any traveler knows the reason for carrying a suitcase. In the same vein, we need a canonical data representation for sending data over to remote destinations. A canonical representation by its very meaning cannot be of a high-level form, such as a class, or even a double or an integer, because it is meant to represent them all. Unpacking canonical representations will probably involve double indirection, and at times even triple indirection. Actually, the ARM defect required many more workarounds than the one case I have mentioned here.
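For readers who have not met the problem, here is a small sketch in C of one common way to unpack a value from a packed array without tripping over alignment. It is not how Z47 Processor does it. The names pack_u32, unpack_u32 and roundtrip_at_odd_offset, and the layout of a one-byte tag followed by a four-byte integer, are assumptions made up for this illustration. The sketch also keeps the host byte order, whereas a real canonical representation would have to fix one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: stores a 32-bit value at any byte offset of a
   packed buffer. memcpy moves bytes, so the offset need not be a
   multiple of 4. */
void pack_u32(unsigned char *buf, size_t offset, uint32_t value)
{
    memcpy(buf + offset, &value, sizeof value);
}

/* Hypothetical helper: reads the value back through a properly aligned
   local variable. The tempting alternative, *(const uint32_t *)(buf + offset),
   is the kind of cast that can fault on a processor that ties the address
   to the type. */
uint32_t unpack_u32(const unsigned char *buf, size_t offset)
{
    uint32_t value;
    memcpy(&value, buf + offset, sizeof value);
    return value;
}

/* Assumed layout: a one-byte type tag at offset 0, then the integer at
   offset 1, which is not a multiple of 4. */
uint32_t roundtrip_at_odd_offset(uint32_t value)
{
    unsigned char buf[8] = {0};
    buf[0] = 0x01;              /* made-up type tag */
    pack_u32(buf, 1, value);
    assert(buf[0] == 0x01);     /* the tag is left untouched */
    return unpack_u32(buf, 1);
}
```

The byte-wise copy is the workaround, not the design. The packed layout itself is perfectly legitimate, and the extra copy exists only because the processor insists on knowing the size of what it is fetching.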
