
Basic definitions regarding computers and programming languages

Lecture Notes for Com Sci 221, Programming Languages

Last modified: Fri Jan 27 15:12:21 1995



Begin Monday 9 January

What is a programming language?

Well, first it's a language. I can't remember where I got this definition:

language
    1: a system of signs used to communicate

Let's specialize this to a definition of programming language:

programming language
    1: a system of signs used by a person to communicate a task/algorithm to a computer, causing the task to be performed

Ponder briefly why I included the details that I did ("person", "task/algorithm", "computer", "causing"), and left out other details ("electronic", "ASCII", "command", etc.). Notice that I insisted in the introduction that the use of a programming language for communication between persons is very important, but that's not a defining property of a programming language.

I think the definition above captures the properties of programming languages that give the best insight into their nature, while distinguishing them from other sorts of languages. But, I have begged one important question:

What is a computer?

Clearly, a computer is something that computes. Actually, a few decades ago "computer" was a job title. My mother was a computer for a research project at Syracuse, operating a desk calculator. Notice that the machine on the desk was called "calculator", rather than "computer". When Alan Turing designed one of the first electronic digital computers, he called it the "Automatic Computing Engine", and went to some pains to make clear that it did not require a human computer to direct its calculations.

The English language has changed a bit since Turing's day, and now "computer" has connotations of the inhuman, instead of a presumptive humanity. For our purposes in this course, "computer" will always mean some sort of high-speed automatic digital computer. Although all automatic computers in general use today are electronic, the essential requirements for programming languages would be the same if they were optical, mechanical, or chemical. "High-speed" is a deliberately vague term. It means fast enough for programming languages to be very useful. If our automatic computer were slower than a human computer, we might have a different attitude toward programs. Certainly, thousands of arithmetic operations per second is fast enough, and the current standard of millions per second is plenty fast enough.

I've still avoided the question of what it means to compute. The essential characteristic of computation is that it follows absolutely precise and unambiguous rules, so that in principle one may always look at each step and be sure that it is performed correctly (of course, this does not ensure that the computation itself is the right one to solve the intended problem).

Drifting into philosophical mode, it seems that computation is not an objectively defined sort of behavior, but rather it is a type of criterion that may be applied to judge certain aspects of a behavior. When we say that a certain box full of electronic gizmos is "computing", we mean that we have found it useful to regard the actions by which it takes input in the form of buttons pushed and produces output in the form of symbols on a printed page or an electronic display as a form of computation. We deliberately disregard aspects of behavior that are irrelevant to these particular symbol manipulations: the sounds caused by the fan and other mechanical parts, the heat produced by the machine, the way that old disk drives used to walk around on the floor due to the motion of the heads, the exact voltages and currents at particular points in the gizmos, the fading of the colors in the protective covering, etc.

Even with respect to the aspects of behavior that we deem relevant, the judgement that a certain box is "computing" is not so much an objective statement describing the behavior, as it is a prescription of our expectations of its behavior, and the conditions under which we will consider it to be working correctly vs. broken. That is, the concepts of "computing" and "computer" are essentially teleological: they have to do with our intended uses of things more than with their objective behaviors.

Well, for our practical purposes in class, it's enough that we recognize typical boxes of electronic gizmos as computers. There are fascinating applications of the concept of computation to other realms of thought, from physics to philosophy, and perhaps some speculation about the exact meaning of computation will help you think about those applications another day. Although the specific electronic nature of computers is not important to our understanding of programming languages, two other qualities of computers and our relations with them are important:

These two qualities make the use, and therefore the structure, of programming languages quite different from natural languages.

The structure of typical computers


This section is awfully wordy, and the directly relevant technical information is rather sparse. It's full of esoteric-sounding philosophical discussion. Nonetheless, it is crucial for you to understand (not merely to notice) the central point of the section:

In the physical and biological sciences, actual objects in the real world are the important things to study, and ideal theoretical models are merely convenient mechanisms for approximately predicting the behavior of actual objects. In computing, this relation between the actual and the ideal is usually reversed. Ideal theoretical models of computers provide the requirements that actual real world computers are built to satisfy.

The structure of a programming language is influenced, sometimes profoundly, by the structure of the computers that it is intended to communicate with. In principle, we would like to understand the broadest possible scope of different structures that computers and programming languages might have. In practice, one quarter is insufficient, and we will restrict attention to the conventional class of machines that has been most important from the beginning of automatic computing to the present day: the Random-Access Machine, also called the von Neumann machine in honor of John von Neumann who pioneered its design. "Random-Access" is a conventional, but not very helpful, term referring to the ability of these machines to deal with an arbitrarily chosen (not really random) memory location at each step. For the state of knowledge today, "Control-Flow Machine" might be a more useful name.
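
To make "control flow" and "random access" a bit more concrete, here is a minimal sketch, in ordinary Pascal, of such a machine: a single memory holding both instructions and data, a program counter, and a fetch-execute loop. The tiny three-instruction repertoire and the particular cell numbers are invented purely for illustration; they are not the instruction set of any machine we will actually study.

    program ControlFlowSketch;
    { A hypothetical control-flow (von Neumann) machine: one memory for   }
    { instructions and data, a program counter, and a fetch-execute loop. }
    { The instruction set below is invented for illustration only.        }
    const
      MemMax  = 63;
      opHalt  = 0;
      opLoad  = 1;   { ACC := Mem[addr]       }
      opAdd   = 2;   { ACC := ACC + Mem[addr] }
      opStore = 3;   { Mem[addr] := ACC       }
    var
      Mem: array [0..MemMax] of integer;
      PC, ACC, op, addr: integer;
    begin
      { a stored program: Mem[22] := Mem[20] + Mem[21], then halt }
      Mem[0] := opLoad;  Mem[1] := 20;
      Mem[2] := opAdd;   Mem[3] := 21;
      Mem[4] := opStore; Mem[5] := 22;
      Mem[6] := opHalt;
      Mem[20] := 2;  Mem[21] := 3;

      PC := 0;                        { control starts at cell 0 }
      op := Mem[PC];                  { fetch the first operation code }
      while op <> opHalt do
      begin
        addr := Mem[PC + 1];          { the operand names an arbitrarily chosen cell }
        case op of                    { decode and execute one atomic step }
          opLoad:  ACC := Mem[addr];
          opAdd:   ACC := ACC + Mem[addr];
          opStore: Mem[addr] := ACC
        end;
        PC := PC + 2;                 { control flows on to the next instruction }
        op := Mem[PC]                 { fetch again }
      end;
      writeln('Mem[22] = ', Mem[22])  { prints 5 }
    end.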

End Monday 9 January
%<----------------------------------------------
Begin Wednesday 11 January

In the bad old days, the creation of a physical computer took precedence over the design of a programming language. The natural order of business was to first build a machine with wonderful sounding computing power, and then to try to figure out how to use it. Programming languages were initially thought of as aids for controlling individual machines. The physical solidity of the machines, contrasted with the abstract informational nature of the programs, made this order very natural. But, it turned out in later practice to be wrong.

Useful programs, in fact, survive many replacements of the machines that execute them (the same is true of the data in databases that are controlled by these machines). So, in practice, programs and other abstract pieces of information (called "software") are often more durable than physical machines (called "hardware"). For this reason, programming languages should be understood in relation to abstract ideal machines, which are approximated by the actual physical machines sold by the hardware companies. Figure 1 shows the essential structure of a typical ideal von Neumann machine, and compares it to an actual machine, and to an intermediate level of abstraction that I call a realistic machine.

Whew! Fortunately for you, it's not important to understand actual machines in any detail. The point of the discussion above is to give you a foundation for appreciating the significance of abstractions such as the ideal and realistic machine. From an ontological point of view, the actual machine is a piece of reality, and the ideal and realistic machines are theoretical approximations to it. In astronomy, for example, an actual star might be our sun, and an ideal star might be a simplified mathematical model that predicts the behavior of the sun in an approximate way. If the predictions of the ideal star disagree with the reality of the actual sun, then the fault is clearly with the theoretical ideal. In this respect, the world of computing works opposite to the world of astronomy. In computing, an actual machine might be the QRX 42000, and an ideal machine might be a design that the QRX 42000 is intended to implement. If the specifications in the ideal machine disagree with the real behavior of the actual QRX 42000, then the fault is usually assigned to the actual machine, which is apparently implemented incorrectly. Rather than changing the ideal machine to agree with the actual, we may demand a correction to the actual machine, or buy another one that agrees better with the ideal.

Notice that our proper attitudes toward theoretical ideals vs. actual objects are not predetermined by any general principles regarding the precedence of reality over theory or vice versa, but by the economic facts that determine the costs and consequences of changing the two sorts of things. The actual sun is a relatively long-lasting phenomenon, changing rather slowly, extremely important to us, and particularly difficult for us to modify. A theoretical ideal sun is quite easy for us to change, and the only consequences of the change are better or worse predictions of the behavior of the actual sun. On the other hand, an actual computing machine lasts only a few years, and computer vendors have the ability to change it for mere millions of dollars. A theoretical ideal machine may be even easier to change, but the consequences are obsolescence of all software that has been built according to that ideal design. In the economics of computing, the capital investment in software is much more significant than the investment in hardware.

So, in the world of computers, the actual is taken as an approximation to the ideal. That is, when the two disagree, we tend to demand changes to the actual machine so that it fulfills its ideal specifications. The ideal machine changes infrequently, and those changes tend to be revolutions that invalidate most existing software, and require completely new approaches to computing. For example, the change from a sequential ideal machine to a massively parallel one has been the object of research for a couple of decades, and still has not been accomplished.

Given a single ideal machine, the realistic machine changes frequently, but in simple parameterized ways. Essentially, a realistic machine is an ideal machine with a limit on its memory. As users, we accept that programs that theoretically work on the ideal machine, but that exceed the memory limits of the realistic machine, will fail. We demand, however, that such failure be reported in a way that distinguishes it clearly from normal operation. As the memory gets larger, more and more programs will work, but there is no need to rewrite programs to react to the larger memory.
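
Here is a minimal sketch of this contract, with names and a limit invented purely for illustration: every store into memory is checked against the limit, and a violation is reported as an unmistakable failure rather than quietly producing a wrong answer.

    program RealisticMemorySketch;
    { The ideal machine's unbounded memory, cut down to a bounded one.     }
    { The limit and the names are invented for illustration; the point is  }
    { only that exceeding the limit is reported as a failure, clearly      }
    { distinct from normal operation.                                      }
    const
      MemoryLimit = 1000;     { the parameter that grows from one model to the next }
    var
      Mem: array [1..MemoryLimit] of integer;

    procedure Store(addr, value: integer);
    begin
      if (addr >= 1) and (addr <= MemoryLimit) then
        Mem[addr] := value
      else
      begin
        writeln('FAILURE: address ', addr, ' exceeds the memory limit');
        halt                  { stop; never pretend the store succeeded }
      end
    end;

    begin
      Store(10, 42);          { fine on this realistic machine }
      Store(5000, 7)          { fine on the ideal machine, fails loudly here }
    end.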

Our actual machine changes about as often as our realistic version, and in more complicated ways. We generally demand that all differences between the realistic machine and the actual one be completely hidden. That is, they are differences in internal organization, but not in behavior. When the behavior of the actual machine disagrees with the realistic machine that it is supposed to implement, we may complain to the manufacturer, as people are now complaining about the error in the Pentium chip.

Programming languages related to machines

Every machine comes with a programming language: the one in which programs stored in the machine are written. Such programming languages are called machine languages. It is traditional to write machine language programs in particularly ugly formats, using all capital letters and arcane abbreviations. But, the essential structural quality of a machine language is that each command corresponds directly to an atomic action of the machine. In this course, we will write machine-language programs as highly restricted Pascal programs, so that they are easier to read. The history of programming language design is essentially determined by attempts to allow the structure of programs to fit more closely the structure of the problems that they solve, rather than the code that must be stored in a machine to cause them to be executed. By fitting the structure of problems, we make it easier to operate on programs in response to issues regarding the problems that they solve.
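
As a rough preview of the flavor of that restricted notation (the exact restrictions we will use come later; this fragment is only my illustration), a machine-language style program treats the whole store as one array of numbered cells, and each statement performs a single atomic action on those cells:

    program MachineStyleSketch;
    { A guess at "machine-language style" restricted Pascal: the whole }
    { store is one array of numbered cells, and every statement is a   }
    { single atomic action of the machine.  The cell numbers are       }
    { assigned by hand, by the programmer.                             }
    var
      M: array [0..63] of integer;
    begin
      { compute M[2] := M[0] * M[0] + M[1], one atomic step per line }
      M[0] := 6;
      M[1] := 7;
      M[3] := M[0] * M[0];
      M[2] := M[3] + M[1];
      writeln(M[2])    { prints 43 }
    end.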

Historically, the next step after machine languages was assembly languages. These are structurally almost the same as machine languages, but they allow memory addresses to be represented symbolically and assigned automatically, they allow program addresses to be calculated and recalculated according to the area in memory where the program is loaded, and they usually have a primitive form of user-defined command called macros.
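
Staying with the restricted-Pascal notation, the main service of an assembler can be suggested, though only suggested, by letting symbolic names stand for the numbered cells, so that nobody assigns cell numbers by hand. Macros and relocation have no direct Pascal analogue, so they are not shown; again, the notation is invented for illustration.

    program AssemblyStyleSketch;
    { The same computation as the machine-style sketch above, but the    }
    { memory cells are named symbolically; an assembler would choose the }
    { actual cell numbers.                                               }
    const
      X      = 0;
      Y      = 1;
      Answer = 2;
      Temp   = 3;
    var
      M: array [0..63] of integer;
    begin
      M[X] := 6;
      M[Y] := 7;
      M[Temp] := M[X] * M[X];
      M[Answer] := M[Temp] + M[Y];
      writeln(M[Answer])    { prints 43 }
    end.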

Programming languages more advanced than assembly languages are called high-level languages. They are normally thought of as divorcing the structure of programs from the structures of the machines that execute them. But, once a high-level language becomes popular, it influences the designs of machines. In some cases, machines have even been built with high-level languages, such as Algol and LISP, as their machine languages.

End Wednesday 11 January