Conventional Programming with Assignment Commands

Lecture Notes for Com Sci 221, Programming Languages

Last modified: Wed Feb 8 14:33:20 1995


Continue Monday 30 January and Wednesday 1 February

A small language for structured programming

The essential core of conventional procedural programming languages (such as FORTRAN, Pascal, Algol, C) is the set of programs built from

  1. simple assignment commands,
  2. sequential composition,
  3. conditional commands (if-then-else),
  4. iterative commands (while-do).

Sequential composition (usually denoted by writing one program before another, possibly separated by semicolon) is easy to forget, since it seems so trivial, but it is absolutely crucial, and many errors are made by neglecting it. These constructs abstract in very useful ways from the restricted assignments and conditional gotos of typical machine languages, while defining computations that are very similar in structure to machine language computations.

"Simple" assignment commands are those that assign the value of an expression to a scalar variable (not an array, and not a pointer). The expression must not have side effects (the same is required of the boolean expressions in conditionals and iterations).

We will study the power of these constructs in some detail, because they provide a solid basis for understanding the richer control-flow constructs of real programming languages. Sections 3.1-3.3 of the text present this material rather well, so I will not repeat it in detail. Make sure that you understand the material in those sections.


End Wednesday 1 February
%<----------------------------------------------
Begin Friday 3 February

Reasoning about computations

Conventional procedural languages have succeeded very well in practice because they offer substantially greater power and convenience than typical machine languages, but they are still close enough in structure to machine languages that we know how to write efficient compilers. The structural similarity to machine language carries a cost, however, in the difficulty of understanding the operation of a program. The essence of the problem is that computations of relatively short programs may be outrageously long. It is impossible in practice to understand the operation of a program by tracing through its computations. If we could produce computations easily by hand, there would be little value in running the program on an automatic computer anyway.

Somehow, we must reason about very long computations with an amount of effort closer to the size of the programs that generate them. And, we must reason about all possible computations of a particular program, not just one at a time. Direct application of intuition to all the computations of a program is not usually very accurate, which is one of the reasons that introductory programming courses are often difficult. C. A. R. Hoare, refining some ideas of Robert Floyd, proposed a formal system for reasoning about structured control-flow programs. We will not be concerned with actual formal derivations in this course, but we will go through the rules of Hoare's system in order to gain an intuitive grasp of the tools they offer us for reasoning about computations.

Floyd's basic insight was that the instantaneous state of a computation is exactly the sort of structure that conventional mathematical notation talks about. In the context of a single state, program variables have fixed (but sometimes unknown to us) values, so they behave precisely like mathematical variables. Only in the passage from one state to another through assignment commands do program variables change their values, and thus behave differently from mathematical variables. A formula, such as x>y, may give useful information about an infinite set of different states. So, Floyd proposed to associate mathematical assertions with control points in a program (actually, in a flow chart). The intended meaning was that every state associated with that control point must satisfy the given assertion. We will limit ourselves to integer numerical computations in this discussion, because the mathematical notation for integers and their operations and relations is so well known. The basic ideas apply to other datatypes as well, but their application requires a notation for each type that is used.

Hoare took Floyd's idea, and refined it into a form more like the conventional presentations of mathematical logic. This form is the easiest one in which to present and understand the rules of reasoning, so I will use it here. But, the practical intuitive application of these rules normally goes back to the Floyd style, except that we use the control points in a higher level program rather than in a flow chart. Hoare's assertions are called "Hoare triples," and they have the form A{P}B, where A and B are mathematical assertions and P is a program or program fragment. Some people reverse the use of braces, and write {A}P{B}, since this makes {A} and {B} look like comments in certain programming languages. I will take the lazy typist road, and use the form with only 2 braces.

A{P}B asserts that, if we start in a state for which A is true, and if the program P runs to normal termination (no crash, division by 0, infinite computation, etc.), then the final state makes B true. This is called the partial correctness interpretation of A{P}B, since it makes correct termination a hypothesis, rather than a conclusion. A is called the preassertion of the triple, and B is the postassertion. Notice that the triple A{P}B expresses a real three-way relation between the preassertion A, the program P, and the postassertion B, rather than a combination of two-way relations. That is, it makes no sense to assert a postassertion for a program without a corresponding preassertion.

Some people prefer to study the total correctness interpretation: if we start in a state for which A is true, then the program P runs to normal termination and produces a final state making B true. Even the total correctness interpretation usually refers to an ideal machine, so for example there is never any overflow due to calculation of a huge value. While total correctness is almost always the real goal of reasoning, the rules for partial correctness are simpler, so I will present them. The idea is not to abandon reasoning about termination, but just to do it separately and in a different style.

Let's consider some examples of true Hoare triples, to firm up understanding of their meaning:

True { x := 0 } x=0
No matter what the starting state, after executing x := 0, x must have the value 0.
True { x := x*2 } x is even
No matter what the starting state, after doubling x, x must be even.
x>0 { x := x+1 } x>1
If x starts out greater than 0, and we increment x, then x ends up greater than 1.
True { while True do x := 0 od } False
This looks crazy at first, and it's not the sort of assertion that we usually want to make about a program, but it's true, and by studying it you can understand the essence of the partial correctness interpretation. No matter what the starting state, this program goes into an infinite loop. If the infinite loop halts, then the moon is made of cheese, 0=1, etc. Of course, this is just a backhanded way of saying that the program does not halt.
True { x := y } x=y
This one should be obvious by now.
x=y { x := x+y } x=2*y
This one depends on x and y being different variables.
False { P } False
It doesn't matter what the program P looks like here. And, the postassertion may be replaced by anything you like.
Notice that a triple becomes stronger when its postcondition is made stronger, or its precondition is made weaker. To describe the results of a program for all possible starting states, the preassertion must be the formula True.
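
For example, since x=0 implies x>=0, the true triple True { x := 0 } x=0 yields the weaker (but still true) triple True { x := 0 } x>=0. The consequence rule, presented below, makes this kind of step precise.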

For contrast, make sure that you understand why the following triples are false:

True { x := 0 } x=1
The final state always makes x=0, which contradicts the postassertion x=1.
x>0 { x := x-1 } x>0
If x starts out exactly 1, it ends up 0, which is not greater than 0.
True { while x>0 do x := x-1 od } False
Unlike the earlier loop, this one always halts, so the postassertion False would have to hold in the final state, which is impossible.

The important restrictions in the notation of Hoare triples are:

  1. The pre- and postassertions each describe a single state, so there is no way for the postassertion to refer to the previous values of variables.
  2. A triple constrains only the initial and final states of a program, saying nothing directly about the intermediate states of its computations.

Hoare proposed rules for reasoning about programs, using partial correctness triples. I will follow conventional notation from mathematical logic in presenting these rules. Each rule looks something like a fraction, with material above and below a horizontal line. The material above the line gives the hypotheses for the rule, and the item below the line is the conclusion. The meaning of such a rule is that, if we have made sure that the items above the line are true, then it is safe to conclude that the item below the line is also true.

------------
A[E/x]{x:=E}A

Assignment rule



A{P}B  B{Q}C
------------
  A{P;Q}C

Composition rule



  A&B{P}C  A&(not B){Q}C
-------------------------
A{if B then P else Q fi}C

Conditional rule




          A&B{P}A
---------------------------
A{while B do P od}A&(not B)

Iteration rule




A implies B   B{P}C   C implies D
---------------------------------
             A{P}D

Consequence rule

Assignment rule

The assignment rule looks odd, because there is nothing above the line. Some people omit the line, and call such a rule an "axiom" (I think that "postulate" is a bit less presumptuous). The point is that a triple of the form A[E/x]{x:=E}A is always true. The problem is that I've used weird notation.

A[E/x] is intended to mean the result of replacing every global (called "free" in logic textbooks) occurrence of the variable x by the expression E. This operation is called syntactic substitution, and it is essentially a kind of macroexpansion. If A contains local scopes of variables (these normally happen in mathematical formulae with things like derivatives, integrals, and the quantifiers for all and there exists), then there are two superficial but tricky problems:

  1. An occurrence of x inside a local scope of x refers to a different variable, so it must not be replaced.
  2. If E mentions a variable that has a local scope in A, then substituting E inside that scope would accidentally "capture" that variable; the local variable must be renamed before the substitution.

Your programmer's intuition about local and global variables should be enough to figure out how these problems arise. Normally, we avoid them entirely by naming local variables differently from global ones. Unfortunately, the notation for syntactic substitution is not well standardized. If you read other sources, you will find it presented as A[x/E], [E/x]A, [x->E]A, and just about every similar variation that you can think of. Also, the square brackets may be curly or round, and some people use superscripts and subscripts. You just have to figure out each new notation as you run into it in the literature.

To understand the assignment rule, consider what it tells us about the assignment command that increments x. If A is the assertion x=1, and P is the assignment x:=x+1, the assignment rule tells us that x+1=1{x:=x+1}x=1. The preassertion is equivalent to x=0, but the form given above is the one that we get directly by substituting x+1 for x. Notice how the substitution of the expression on the right-hand side of the assignment for the variable in the preassertion causes the expression to be evaluated in the state before the assignment. This trickery is required because there is no way, in the postassertion, to refer to the previous values of variables. Conventional mathematical notation doesn't provide a good way for taking an arbitrary preassertion and projecting it forward through an assignment. Consider for example the difficulty of describing by a formal rule how to go from the preassertion x=0, through the assignment x:=x+1, to a sensible postassertion. While the variable x is an atomic symbol, and we may identify all of its occurrences in the postassertion unambiguously, the expression x+1 has much more subtle relations with preassertions.
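
If you like to see such things mechanized, here is a rough sketch of syntactic substitution in Python, using the standard library's ast module. It is my own illustration: it handles only global variables, and ignores the local-scope problems discussed above.

# A[E/x] for assertions written as Python expressions.
# (Requires Python 3.9+ for ast.unparse.)
import ast

def substitute(assertion, x, e):
    """Return assertion with every occurrence of the variable x
    replaced by the expression e (all given as source strings)."""
    replacement = ast.parse(e, mode='eval').body

    class Subst(ast.NodeTransformer):
        def visit_Name(self, node):
            return replacement if node.id == x else node

    return ast.unparse(Subst().visit(ast.parse(assertion, mode='eval')))

# The assignment rule for x := x+1 with postassertion x == 1:
print(substitute('x == 1', 'x', 'x + 1'))   # prints: x + 1 == 1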

The assignment rule as given here applies only to assignments to simple scalar variables. There is a corresponding rule for array assignments, but it is tricky to get right. Pointer variables require a substantial reworking of the whole system of reasoning.

The 3 control-flow rules are understood backwards

Although the official meaning of a rule of reasoning is usually expressed in terms of first establishing the hypotheses, and then establishing the truth of the conclusion, the Hoare rules for composition, conditional, and iteration are best understood backwards. Suppose that we want to know something about a large program. Depending on the outermost operator in the syntax tree for that program, one of these three rules tells us what we need to know about the component programs in order to establish our desired result. So, these rules can be used to take an assertion about a program, and break it up into separate assertions about each of the assignment commands in it.

The backwards application of the composition and conditional rules is straightforward, because the conclusion (the assertion under the line) has an arbitrary preassertion and postassertion. Backwards application of the iteration rule requires the consequence rule as a helper. It is possible to set up an iteration rule that works backwards by itself, but then the rule becomes harder to explain.

Composition rule

This one is pretty simple. If we want to know that A{P;Q}C, then we must find a description B of the intermediate state that occurs after executing P but before executing Q, and show that A{P}B and B{Q}C.
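
For example, to show x=0{x:=x+1; x:=x+1}x=2, we may take the intermediate description B to be x=1. The assignment rule (helped by the consequence rule, presented below) gives x=0{x:=x+1}x=1 and x=1{x:=x+1}x=2, and the composition rule knits the two triples together.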

Conditional rule

This is just proof by cases, the two cases being that the condition B of the if-then-else is true or false. Notice that it is important that every condition in a program may also be used as a mathematical assertion about the state. This rule does not work if the evaluation of the condition B has side-effects. (The partial correctness interpretation does make the conclusion vacuously true when the evaluation of B crashes or runs forever.) As an exercise, you should figure out the rule for a 1-branched conditional (that is, with no else clause).

End Friday 3 February
%<----------------------------------------------
The material is not quite in the same order here as in the class lecture. I discussed the consequence rule in class on 3 February, but did not even show the form of the iteration rule.

Begin Wednesday 8 February

Iteration rule

This rule is where the rubber meets the road. The whole point of our study of Hoare logic is the insight that we can get from understanding the iteration rule. The other rules are just there as the supporting cast.

The iteration rule uses ideas from the composition and conditional rules. After all, a while-loop is essentially an infinite composition of 1-branched conditionals. The problem is how to deal with the infinite aspect: we have no way of limiting the number of times the loop body must be composed.

The key to the iteration rule is the assertion A, which is called an invariant because we have to show that, if it is true initially, then it is true before and after every execution of the body of the loop. That's what the hypothesis A&B{P}A does for us. If A holds initially, and if the condition B of the while-loop is true (which is the only way that the body P will be executed), then A continues to hold after the loop body is executed once. Once we have established that A is an invariant for the loop (that is, A&B{P}A), then it follows that A{while B do P od}A&(not B). A continues to hold in the postcondition, because it holds initially, the first execution of the loop body keeps it true, the second keeps it true, .... B is false at the end because that's the only way that the loop can terminate.

If you are familiar with mathematical induction from a math course, notice that the argument above is essentially a shorthand for a proof, by induction on the number of iterations of the loop, that after any number of iterations A continues to hold. By merely saying that A is an invariant, we avoid repeating the whole mechanism of inductive proofs for each different loop that we argue about. If you are not familiar with mathematical induction, don't worry. Just visualize the iterations of the loop body P laid end to end on a time line. You can march along that time line, making sure that A is true at each of the joints between iterations. If the loop halts (which is a given hypothesis in the partial correctness interpretation), then A must still hold at the end.

Consequence rule

The consequence rule is not associated with any particular program construct, but it is needed to knit together the applications of the other rules. For example, when the assignment rule gives us x+1=1{x:=x+1}x=1, we use the arithmetic fact that x=0 implies x+1=1, and the rule of consequence, to get the more useful form x=0{x:=x+1}x=1. Unless someone asks, I will assume that you can figure out why the consequence rule is correct. Notice that one of the hypotheses of the consequence rule is a Hoare triple, and the other two are plain old mathematical assertions. When a mathematical assertion is given by itself, rather than as the pre- or postassertion in a Hoare triple, then it must be true of all possible states, not just those occurring in a particular program.

Using the rules of Hoare logic

In principle, the Hoare rules provide a way of building up proven-correct information about a program from the simple assignment commands in the program, through all of the intermediate-sized components of the program, finally to the program as a whole. A fully written out proof would be ridiculously long, and smaller components of the program would be mentioned repeatedly as parts of larger components that contain them. Instead of writing out proofs in this fashion, we attach assertions in the form of comments at various control points in a program. The Hoare rules give enough information to determine whether those assertions are justified. To avoid ambiguity, we need to mark preassertions for the whole program, so that they are taken as assumptions rather than things to be proved. We must also mark loop invariants, or they may be mistaken for assertions preceding the loop or its body.

It's time for examples. We will first take the obvious iterative program for exponentiation, and use an invariant to prove that it is correct. This will not be very impressive, and you will wonder why we bothered to prove something so obvious. But, then we will fine tune the invariant, and use it to develop and prove correct a much cleverer and more efficient program, which I hope you will not find so obvious.

The obvious way to exponentiate

Suppose that we have a language with addition and multiplication as primitives, but not exponentiation (this is in fact the case with most machine languages, although not so usual in high level languages). We want to write a simple iterative program to exponentiate. Let the inputs be given as x0 and y0, and we want to compute x0 to the y0 power (which I will write x0^y0, but remember this is not a built-in operation in our programming language) as the value of z. You could (and maybe did) program this in your first programming class:

{preassertion: y0>=0}

z := 1;
y := y0;

{invariant: z=x0^(y0-y) and 0<=y<=y0}

while y>0 do

   z := z*x0;
   y := y-1

od

{z=x0^y0}

Let's define 0^0=1 for simplicity. If you prefer that 0^0=0, or that 0^0 is undefined, then the program and/or the assertions must be a bit more complicated. We need to assume the preassertion y0>=0, since negative powers of integers are not integers. A slightly weaker, but more complicated, preassertion will work: you may figure it out as an exercise.

The main work involved in proving the correctness of the exponentiation program above is showing that z=x0^(y0-y) is an invariant for the while loop.

  1. Notice that the initialization assignments leave z=1 and y=y0 (you can prove this by two applications of the assignment rule, and one application of the composition rule, but it's intuitively obvious anyway). So, the first clause of the invariant holds when the loop is first reached, because 1=x0^0 for all values of x0. The second clause holds because y0=y>=0 after the initialization.
  2. Assume that z=x0^(y0-y) and 0<=y<=y0 and y>0 just before executing the body of the loop. Then, by elementary algebra on the first clause, z*x0=x0^(y0-(y-1)). By two applications of the assignment rule (and one application of the composition rule to knit them together), z=x0^(y0-y) just after executing the body of the loop. Since 0<y<=y0 initially, by the properties of the ordering relations on the integers, 0<=y-1<=y0. Then, by two applications of the assignment rule and one of the composition rule, 0<=y<=y0 just after executing the body of the loop.
This is all that we need to show that z=x0^(y0-y) and 0<=y<=y0 is an invariant for the while loop.

Now, by the iteration rule, z=x0^(y0-y) and 0<=y<=y0 and (not y>0) if and when the loop terminates. From the last two clauses, y=0, and substituting in the first clause we get z=x0^(y0-0)=x0^y0.

Because we have used partial correctness rules, we have only proved that if the loop halts, then z=x0^y0 holds for the final state. Partial correctness reasoning does not provide any way of asserting, much less proving, termination. To see that the loop terminates, notice that the value of y decreases by 1 each time through the loop, and when y<=0 the loop must terminate.
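
As a concrete check on all of this reasoning, here is the slow program transcribed into Python (my own sketch, not part of the notes), with the preassertion, the invariant, and the postassertion turned into assert statements at the corresponding control points:

def slow_power(x0, y0):
    assert y0 >= 0                                   # preassertion
    z = 1
    y = y0
    while y > 0:
        assert z == x0 ** (y0 - y) and 0 <= y <= y0  # invariant at the top of the loop
        z = z * x0
        y = y - 1
    assert z == x0 ** (y0 - y) and 0 <= y <= y0      # invariant, with y now 0
    assert z == x0 ** y0                             # postassertion
    return z

print(slow_power(3, 5))   # prints 243

Running it does not prove anything, of course: the asserts only spot-check the invariant on particular computations, while the Hoare-logic argument covers all of them at once.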

Now, we know that the loop terminates and z=x0^y0 in the final state. Does this tell us that the program exponentiates correctly? Not quite. Consider the following very silly program:

{preassertion: y0>=0}

z := 1;
y0 := 0;

{z=x0^y0}

Two applications of the assignment rule, and one of the composition rule, demonstrate that the final assertion must hold, and this program certainly terminates normally. But, it in no sense computes x0^y0. In order to know that a program to compute x0^y0 is correct, we must also observe that it does not change the value of x0 or y0 in order to make z=x0^y0 hold in the final state. It is very easy to verify by inspection that the original exponentiation program leaves its input variables unchanged, since neither one ever appears on the left-hand side of an assignment. Hoare triples provide no way to assert, much less to prove, that a particular variable is unchanged in a program, since there is no control point at which an assertion may refer to both the initial and final values. Since such properties are very easy to verify by direct inspection, this omission doesn't seem to diminish the value of Hoare logic very much. But, it is important to remember the issue of which variables may be changed when applying Hoare logic in a practical case.

Cleverer and more efficient exponentiation

The simple exponentiation program that we belabored so pedantically in the last section takes a number of iterations equal to the value of y0. That may not seem so bad, but for large values of y0, such as 2^32, it is less than speedy. (All right, some of you have noticed with a sneer that if y0>32 you'll get an overflow anyway, and 32 iterations isn't such a big deal. Well, these integer exponentiation methods generalize to floating point methods, and with floating point you really can go to very large powers. And, in principle, we might be using unlimited precision integer arithmetic).

There is, in fact, a very short and simple program to compute x0^y0 in a number of steps that is roughly 2*log y0. And, very few people can get this program correct using the usual vague intuitions that we apply to programming tasks. An understanding of loop invariants really helps. In fact, the best way to develop this program is to first think of the right invariant, and then write the program that may be proved correct by such an invariant.

From the point of view of manipulating mathematical assertions, what the slow exponentiation program does is to maintain the truth of z=x0^(y0-y), while changing the values of z and y until y=0. We can accomplish this, because z=x0^(y0-y) implies z*x0=x0^(y0-(y-1)), and the two assignments z:=z*x0 and y:=y-1 convert that back into the invariant form, while progressing toward y=0. But, that progress is too slow.

We need a slightly more sophisticated invariant, one that allows the exponent to be divided by 2, rather than decreased by 1. The crucial algebraic fact allowing the slow program to work is x0^(y+1)=(x0^y)*x0. We should be able to make faster progress with the fact that x0^(2*y)=(x0*x0)^y. But, the details of how to exploit this are not at all obvious. It appears that we should only try to divide the exponent by two when it is even, so we will need to interleave two different sorts of computation steps.

End Wednesday 8 February
%<----------------------------------------------
Begin Friday 10 February

Rework the invariant for the old slow exponentiation program into the equivalent form x0^y0=z*x0^y. z is the accumulator for the result, and y keeps track of progress. For the faster program, we need two different accumulators, one for the results of dividing the exponent by 2, and the other for the results of decrementing it. An invariant with the required flexibility is x0^y0=z*x^y. We merely allow the base as well as the exponent to change. To establish the invariant before executing our loop, we need the initializations x:=x0, y:=y0, and z:=1. In the body of the loop, we may maintain the invariant by halving y and squaring x (only when y is even), or by decrementing y and multiplying z by x. Notice that z may be multiplied by larger values than x0, because of the squaring of x. As before, we may control the loop by the condition y>0. Here is the resulting program:

{preassertion: y0>=0}

z := 1;
y := y0;
x := x0;

{invariant: x0^y0=z*x^y and 0<=y<=y0}

while y>0 do

   if y is even then

      x := x*x;
      y := y/2

   else

      z := z*x;
      y := y-1

   fi

od

{z=x0^y0}

The precondition and the final assertion are the same as before, since we want exactly the same results from the new program. The proof that x0^y0=z*x^y and 0<=y<=y0 is an invariant is similar, but slightly more complicated. We must use the conditional rule, which is essentially a proof by cases. When y is even, we notice that before executing the body of the loop,

x0^y0 = z*x^y = z*(x*x)^(y/2)

So, the two assignments in the then clause make x0^y0=z*x^y true again after the loop body. When y is odd, we get

x0^y0 = z*x^y = (z*x)*x^(y-1)

before the loop body, and the two assignments in the else clause make x0^y0=z*x^y true again after the loop body. It is easy to see that the second clause, 0<=y<=y0, is also invariant.

When the loop terminates, x0^y0=z*x^y and 0<=y<=y0 and (not y>0). As with the old slow program, we get y=0, so x0^y0=z*x^0=z.

We may observe that the loop terminates, and that the values of x0 and y0 are not changed, essentially as before.
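
Here is the same kind of transcription for the fast program (again my own sketch), with the invariant asserted at every joint between iterations:

def fast_power(x0, y0):
    assert y0 >= 0                                      # preassertion
    z, y, x = 1, y0, x0
    while y > 0:
        assert x0 ** y0 == z * x ** y and 0 <= y <= y0  # invariant
        if y % 2 == 0:
            x = x * x
            y = y // 2
        else:
            z = z * x
            y = y - 1
    assert z == x0 ** y0                                # postassertion
    return z

print(fast_power(3, 5))    # prints 243
print(fast_power(2, 32))   # prints 4294967296, in far fewer than 2^32 iterations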