Thursday, July 21, 2005

The curse of Stroustrup

Remark. This note is about a specific case of a process, not about a person nor about the process as a whole.

It is interesting to participate in a newsgroup so long as you can ignore unreasonable replies to your postings. However, I am not recommending that you expose yourself to thousands who will tolerate nothing but their own view.

The dogma gets dangerously worse when their views are supported by the weaknesses of a language, and sometimes directly by an official standardization document. The danger is in spreading the distortions that their views create, at times even as best sellers.

I will mention some of the questions that reflected religious beliefs, without presenting the actual views, starting with section 11.6 of the standard. This item in the standard states that the access of a virtual method in derived classes is ignored.
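To make the rule concrete, here is a minimal sketch of the behavior in question. The class and function names (Base, Derived, call_through_base) are my own illustrations and do not come from the standard or from any of the postings.

```cpp
#include <string>

class Base {
public:
    virtual ~Base() {}
    // Declared public in the base class.
    virtual std::string name() const { return "Base"; }
};

class Derived : public Base {
private:
    // The override is declared private in the derived class.
    virtual std::string name() const { return "Derived"; }
};

// Access is checked against the static type of the expression (Base),
// so the call compiles and still dispatches to the private override.
std::string call_through_base(const Base& b) {
    return b.name();
}
```

Given a Derived object d, call_through_base(d) reaches the private override, whereas a direct call d.name() from outside the class is rejected by the compiler.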

The more important question about the inclusion of 11.6 is whether we should standardize a general-purpose language based on someone’s implementation at the time of proposal. Sadly, that is what has happened, and now a decade later you will hear engineers supporting this defect with all kinds of philosophical arguments.

Going back to section 11.6, access is part of the specification of a method. In redefining a virtual method, its entire specification must be preserved. Otherwise, it is a different method, not a redefinition of the same method. The term redefinition refers to the body of a method, not its prototype specification.

However, keeping language features consistent reduces confusion. Let us use the term “lowering access” to mean changing the access of an item from private to protected or public. The term “raising access” will be used analogously.

A private derivation raises the access of items in the base to private (in the derived class). It is therefore reasonable to allow raising the access of a virtual method in a derived class. However, the ability to lower the access may result in practices that could compromise the original design.
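The two directions can be sketched as follows. The names (Account, OpenAccount, Hidden) are hypothetical and serve only to illustrate the terms "raising" and "lowering" as used above.

```cpp
class Account {
public:
    virtual ~Account() {}
    // Public entry point that forwards to the private virtual method.
    int audit() const { return balance_check(); }
private:
    // Private in the base class.
    virtual int balance_check() const { return 1; }
};

class OpenAccount : public Account {
public:
    // Lowering access: the method is private in Account but public here.
    virtual int balance_check() const { return 2; }
};

// Raising access: private derivation makes the public members of Account
// private members of Hidden.
class Hidden : private Account {
public:
    int audit_hidden() const { return audit(); }
};
```

Clients of OpenAccount can now call balance_check() directly, which the designer of Account had ruled out. Clients of Hidden, on the other hand, cannot call audit() at all and must go through audit_hidden().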

Another interpretation of flexibility that was brought up is the freedom to return a reference to a private member of a class. Considering how easily this feature is misunderstood, how did it crawl into the standard?

Misunderstandings aside, what is the point of having private access specification when we can also use a private member just like a public one? I really think instead of BOOST stuff we should start thinking about reducing some of the flexibilities of C++.
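A small sketch of what is being objected to, using a hypothetical Counter class of my own invention: once a public method returns a non-const reference to a private member, any client can write to that member as if it were public.

```cpp
class Counter {
public:
    Counter() : count_(0) {}
    // Hands out a writable reference to the private member.
    int& count_ref() { return count_; }
    int count() const { return count_; }
private:
    int count_;
};

// Code outside the class modifies the private member through the reference.
int bypass(Counter& c) {
    c.count_ref() = 42;
    return c.count();
}
```

Returning const int& or returning by value would keep the member read-only to clients, but the language does nothing to require either.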

In Einstein’s opinion, Mozart was greater than Beethoven because Mozart sought simplicity while Beethoven created his music. Einstein also considered imagination more important than knowledge, which favors the influence of Beethoven’s creations. One has to hear Einstein relative to the point of view that he was trying to convey.

C++ started out by nicely incorporating Simula classes into C. It then adjusted itself to the C exception mechanism, and replaced C header files with an elegant linguistic construct for templates. However, since about 1990 the main additions have been a primitive namespace construct and some ad hoc casting keywords.

The initial object-oriented simplicity of C++ was properly borrowed from previous research. Generally, successful research rests on earlier discoveries that have become part of our intuition. However, C++ has not provided simple intuition for further improvements. Its formidable learning curve is mainly a consequence of its design flaws and deficiencies rather than the inherent nature of object-orientation.

The domains of imagination for a musician and a language designer are not the same, nor are the consequences of their imaginations. We admire Beethoven for a reason quite apart from admiring the elegance of a programming language. In particular, the terms flaw and defect are inapplicable to musical creations.

The problem domains for C and C++ overlap, but they do not coincide. C++ can become the universal object-oriented abstract language for developing system programs, which can also link with C libraries. There is no need to include C as part of the C++ language, and sporadically make adjustments to one of them merely to keep them in sync. C++ has been in recession for too long.

The same techniques that were used in extending C to C++ can give C++ a life of its own. Moreover, there is no barrier to endowing C with an exception mechanism and templates without incorporating the object-oriented view. As things are, C++ is merely an extension of C rather than a language in its own right. Interestingly, many have used Stroustrup’s extension technique in extending their favorite language. This dimension of the curse has nothing to do with the inventor of the technique and his creativity.

C++ is considered a strongly typed language. Define an enumeration type, say My_enum, and do not set any of its literals equal to 99. The following statement produced a large number of views in its favor. The statement compiles without warning.

My_enum e = My_enum(99);

Actually, the person posting the question would have been happy if 99 were one of the values in the definition of the enumeration type. This view is a consequence of the dual nature of C++, where the C types do not follow object-oriented rules.

For C types, the expression My_enum(99) has the same semantics as Ada’s type conversion. Many participants insisted that this was a plain cast. But My_enum is a user-defined type of enumeration kind. For instance, if My_enum were a class type, the expression would result in creating an instance of its type. Taking the latter view, My_enum(99) is a constructor that creates an instance of the enumeration type while setting its value to 99. It is natural to view this as an implicit cast. However, the result is still an object and not a literal of the type.
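The two readings can be placed side by side in a short sketch. My_class and the enumerator names are hypothetical, and I have chosen the largest enumerator as 127 so that 99 falls within the range of values the enumeration can represent and the conversion itself is well defined.

```cpp
enum My_enum { low = 0, mid = 1, high = 127 };  // no enumerator equals 99

class My_class {
public:
    explicit My_class(int v) : value_(v) {}
    int value() const { return value_; }
private:
    int value_;
};

// For the enumeration, T(99) is a conversion, equivalent to static_cast.
int enum_via_functional_notation() {
    return My_enum(99);
}

// For the class, the same notation invokes a constructor.
int class_via_functional_notation() {
    return My_class(99).value();
}
```

The syntax is identical in both functions. Only the kind of type decides whether a constructor runs or an integer is simply reinterpreted as an enumeration value.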

In either interpretation, at the point of declaration the compiler knows about the values of the enumeration type. Therefore, it can easily determine that the integer 99 does not match as an equivalent integer value for any of the literals of the type. Since an enumeration type is identified with its extent, the mapping of 99 to a literal will fail at compile time. Nevertheless, it was argued that the standard only states that the results will be undefined at run-time! So once again a particular implementation has enforced the standard.
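As a rough illustration of the check argued for here, the following sketch assumes C++11 or later (for constexpr and static_assert). The helpers names_an_enumerator and to_my_enum are hypothetical and must be written by hand for each enumeration, since the language offers no way to enumerate the literals of a type. The enumerators are again chosen so that 99 lies within the representable range.

```cpp
#include <stdexcept>

enum My_enum { low = 0, mid = 1, high = 127 };  // no enumerator equals 99

// True only if v is the value of one of the declared enumerators.
constexpr bool names_an_enumerator(int v) {
    return v == low || v == mid || v == high;
}

// Checked conversion: in a constant expression a bad value is a compile
// error; at run time it throws std::out_of_range.
constexpr My_enum to_my_enum(int v) {
    return names_an_enumerator(v)
        ? static_cast<My_enum>(v)
        : throw std::out_of_range("value matches no enumerator of My_enum");
}

static_assert(names_an_enumerator(1), "1 is the value of mid");
static_assert(!names_an_enumerator(99), "99 matches no enumerator");
```

With these definitions, constexpr My_enum e = to_my_enum(99); is rejected at compile time, while the plain My_enum(99) is still accepted.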

I suppose if Whorf were a programmer he would have said something like “The programming language marks the boundaries of our ability to express solutions”.

A programming language comprises keywords, syntax rules, libraries and programming techniques specific to the paradigm of the language. In our design and implementations we are essentially limited to these four items. Tools like debuggers are extremely helpful, as are the methods of analysis and techniques of testing. Nonetheless, our ability to express solutions is limited to the facilities of the language we are using.

Limited expressiveness and a frozen, inextensible model of computation encourage the use of so-called flexibilities, as was the case with C exceptions and templates. Furthermore, our ability to think about abstractions and invent new ideas is a direct consequence of the number of relevant abstractions that we are able to express linguistically. The flaws of a programming language encourage ad hoc expressions, which eventually become part of our intuition and shape the way we think about solutions.

Surely, bright graduate students will be able to invent (or discover) rules so we can express meaningful design-related issues in such a way that the compiler will be able to inform us of our design flaws. Any new ideas in this direction will provide us with lasting means of producing software with fewer architectural defects. Simple computational defects can always be found via debugging.

However, it takes several years after graduation to learn all the C++ flexibilities well enough to repair the defects of the products one is assigned. The Whorfian foundation for software engineering is properly shaped while at college. However, the academic preparation proves inadequate by far when confronted with flexible C++ code. That is because academic training concentrates on scientific practices and deliberately avoids harmful, and at times absurd, linguistic flexibilities.

Below, I will add the titles of (design-related) questions as I run into them in newsgroups.

1. Where is the end of a namespace using?
2. How do I declare arrays using non-default constructor?
3. Is there any way I can resume after handling an exception?
4. When a class member is a reference, is it like a static member?
5. Is mutable related to const method or const object?

Although Z++ is a superset of C++, it corrects the defects of C++ mentioned here, among a large number of others. In particular, Z++ treats all types equally thereby completely wiping out the confusing duality of interpretations between the built-in types and user-defined types.

It is important to keep in mind that C++ is a system programming language, while Z++ is an abstract language for developing platform-free distributed applications. Hence, Z++ is not in competition with C++. The focus of this article is repairing C++. Indeed, Z++ is the (ideal) limit point for the sequence of corrections to C++ as it diverges from the Curse of Stroustrup.

Most respondents never got over their intimate familiarity with C++ practices resulting from inadequacies of the language design and its harmful deviations from object-oriented practices. In order to encourage productive discussions towards correcting C++ design flaws, without the discouraging words of fanatics, I invite you to post your concerns at NewsGroup.
