Concrete Evaluation

We begin our journey into the foundations of computer science by first studying one of its key applications to computer programming: program correctness. Rigorously stating and proving program properties will require us to deep-dive into mathematical logic, the sub-field of mathematics devoted to modeling deductive reasoning. Deductive reasoning is the foundation for virtually all activities in computer science, whether you are designing an algorithm, verifying the correctness of a circuit, or assessing a program's complexity. In this manner, this initial portion of our investigation of the foundations of computer science is, quite literally, the cornerstone of everything you do in this field moving forward.

Why Functional Programming?

In this course, we'll use the Racket programming language as our vehicle for studying program correctness. In particular, we'll focus on a pure, functional subset of Racket, as studied in CSC 151, that readily translates to other languages. We do this because, as we will see shortly, pure, functional languages admit for a simple, substitution-based model of computation. By "pure," we mean that the language does not allow functions to produce side effects. A side effect is some behavior observable to the caller of the function beyond the function's return value.

The canonical example of a side effect is the mutation of global variables. For example:

(define glob 0)

(define increment
  (lambda ()
    (set! glob (+ glob 1))))

(displayln glob) ; 0
(increment)
(displayln glob) ; 1
(increment)
(displayln glob) ; 2
(increment)
(increment)
(displayln glob) ; 4

The increment function takes no parameters and produces no output. Its sole task is to produce a side effect: change the state of glob by incrementing its value via the set! function. Since the function does not return a value, the program can only use increment for its side effect by calling the function and observing the change to glob over several calls. Finally, note that the displayln function prints its argument to the console (another example of a side effect!).

We'll have more to say about side effects and their relationship with the substitutive model of computation we introduce later in this reading. For now, note that ultimately it is not the language that is important here. What is important is whether we are considering a pure fragment of that language, instead. Indeed, the lessons we learn here regarding program correctness apply to any program that behaves in a pure manner, i.e., has no side effects. For example, we can easily translate our model of computation to the pure fragment Python!

This fact supports a general maxim about programming you have hopefully heard at some point:

Minimize the number of side effects in your code.

Intuitively, complicated code with side effects is frequently tricky to reason about, e.g., explicit pointer manipulation in C. Our study of program correctness in this course will give us concrete reasons why this is indeed the case.

Racket vs. Scamper

Versions of CSC 151 alternate between using Racket and Scamper, the department's homegrown dialect of Scheme. When we discuss programs in this course, we will use features common to both Racket and Scheme (really, Scheme as well), so that you can use either Racket and Scamper as an aid when we talk about code. Where there are differences, we will favor the Racket version of the library function or language construct although conversations between the two language's features of said library function or construct should be relatively straightforward.

Program Correctness

Program correctness is a particularly topical application of logic to start with because every developer cares that their programs are correct. However, what do we mean by correctness?

For this example, we'll first briefly review how we operate over lists in a recursive fashion. Recall that a list in Racket is defined recursively:

A list is either:

Empty, i.e., null or

Non-empty with a head value and the remainder of the list, its tail.

Notably, if a list is non-empty, we can use the car function to retrieve the head and cdr to retrieve the tail.

(define example (list 1 2 3 4 5))

(displayln (car l)) ; 1
(displayln (cdr l)) ; (list 2 3 4 5)

We can use the cons function to create a list consisting of a head value (the first argument) appended on the front of a list (the second argument):

(define example (list 1 2 3 4 5))

(displayln (cons 0 example)) ; (list 0 1 2 3 4 5)

Finally, with the racket/match extension, we can perform pattern matching on a list which combines case analysis (whether the list is empty or non-empty) along with getting a handle on the head and tail directly without invoking car and cdr:

(require racket/match)

(match (list "a" "b" "c")
  ['()              "empty"]
  [(cons head tail) (string-append "non-empty. head: " head)])
  ; Evaluates to "non-empty. head: a"

In this text, we'll favor using pattern matching because mathematical prose also employs pattern matching to concisely describe relationships between symbols! If you are new to pattern matching, you might find it useful to keep this correspondence in mind between using car/cdr and pattern matching. The following if and match are equivalent:

(require racket/match)

(if (null? lst)
    <base case>
    (let ([head (car lst)]
          [tail (cdr lst)])
      <recursive case>))

(match lst
  ['() <base case>]
  [(cons head tail) <recursive case>])

With these in mind, consider the following function that appends two lists together recursively:

(define list-append
  (lambda (l1 l2)
    (match l1
      ['() l2]
      [(cons head tail) (cons head (list-append tail l2))])))

What does it mean for list-append to be "correct"? Of course, we have a strong intuition about how list-append should behave: the result of list-append should be a list that "glues" l2 onto the end of l1. But being more specific about what this means without saying the word "append" is tricky! As we try to crystallize what it means for list-append to be correct, we run into two issues:

Generality

A natural way to specify the correctness of list-append is through a test suite. For example, in Racket, we can write a simple collection of tests using the rackunit module:

(require rackunit)

(check-equal?
  (list-append (list 1 2) (list 3 4 5))
  (list 1 2 3 4 5)
  "Appending non-empty lists")

(check-equal?
  (list-append null (list 1 2 3 4 5))
  (list 1 2 3 4 5)
  "Appending empty list as first argument")

(check-equal?
  (list-append (list 1 2 3 4 5) null)
  (list 1 2 3 4 5)
  "Appending empty list as second argument")

Note that these tests exemplify very specific properties about the behavior of list-append. A test case demands that when a function is given a particular set of inputs, e.g., (list 1 2) and (list 3 4 5), that the function must produce the output (list 1 2 3 4 5). You can't get more specific about the behavior of list-append than that!

However, are these tests enough? While they cover a wide variety of the potential behavior of list-append, they certainly don't cover every possible execution of the function. For example, the tests don't cover when list-append is given two empty (null or '()) lists. They also don't cover the case where the expected output is not (list 1 2 3 4 5)! In this sense, while tests are highly specific, they do not necessarily generalize to the full behavior of the function. This always leaves us with an inkling of doubt that our functions are truly correct.

Specificity

How might we arrive at a more general notion of program correctness? Rather than thinking about concrete test cases, we might consider more abstract propositions about the behavior of the function. Such abstract propositions are typically framed in terms of relationships between the inputs and outputs of the function. With list-append, we might think about the lengths of the input and output lists and how we might phrase our intuition about the function "gluing" its arguments together. This insight leads to the following property that we might check of the function:

For all lists l1 and l2, (+ (length l1) (length l2)) $=$ (length (list-append l1 l2)).

To put another way, the lengths of l1 and l2 are preserved through list-append. That is, list-append doesn't (a) add any elements to the output beyond those found in the input, and (b) does not remove any elements.

By virtue of its statement, this property applies to all possible inputs to list-append. So if we were able to prove that this property held, we knew that it holds for the function, irrespective of the inputs that you pass to the function (assuming that the inputs are lists, of course). This stands in contrast to test cases where we make claims over individual pairs of inputs and outputs.

However, unlike a test case which is highly indicative of our intuition of correctness for this function, this property is only an approximation of our intuition. To put it another way, even if we know this property holds of every input to list-append, it doesn't mean that the function is correct! Here is an example implementation of list-append where the property holds, but the function is not "correct":

(define bad-list-append
  (lambda (l1 l2)
    (match l1
      ['() l2]
      [(cons head tail) (cons 0 (bad-list-append tail l2))])))

The lengths of the input lists are preserved with bad-list-append. However, the output is not the result of gluing together the input lists!

> (bad-list-append (list 1 2 3) (list 4 5))
(list 0 0 0 4 5)

Every element of l1 is replaced with 0! In this sense, our length-property of list-append is general---it applies to every possible set of inputs to the function---but not specific---the property only partially implies correctness.

Balancing Generality and Specificity

In summary, we've seen that test cases and general program properties sit on opposite ends of the generality-specificity spectrum. When we think about program correctness and verifying the behavior of programs, we are always trying to find the balance between generality and specify in our verification techniques. Ultimately, you can never be 100% sure that your program works in all cases---after all, a stray gamma ray can ruin whatever verification you have done about your program. So when employing program verification techniques---testing, automated analysis, or formal proof---you must ultimately make an engineering decision based on context, needs, and available time and resources. We won't have time to dedicate substantial time to these pragmatic concerns in this course, but keep them in the back of your mind as we introduce formal reasoning for program correctness in the sections that follow.

Exercise

Consider the function (lcm x y) which returns the least-common multiple (LCM) of the numbers x and y. For example, the LCM of 3 and 4 is 12, and the LCM of 6 and 15 is 30.

Write a few tests for this function.
Write a few properties of this function's behavior that implies its correctness.

A Substitutive Model of Computation

In order to rigorously prove that a property holds of a program, we must first have a rigorous model of how a program executes. You likely have some intuition about how Python programs operate. For example, try to deduce what this poorly named function does:

(define foo
  (lambda (bar)
    (match bar
      ['() 0]
      [(cons _ tail) (+ 1 (foo tail))])))

How might you proceed? You might imagine taking an example input, e.g., (list 9 1 2), and predicting how the function processes that input. With a few examples and prior knowledge of how conditionals and recursion work, you can likely deduce that this function calculates the length of the input list given to it. Our first goal, therefore, is to formalize this intuition, namely, explicate the rules that govern how Racket programs operate.

For a pure, functional programming language like the subset of Racket that we work with, we can use a simple substitutive model of computation that extends how we evaluate arithmetic expressions. While simple, this model is capable of capturing the behavior of the Racket programs we might consider in this course.

Review: Arithmetic Expressions

First, let's recall the basic definitions and rules surrounding arithmetic expressions. We can divide these into two sorts:

Syntactic definitions and rules that govern when a collection of symbols is a well-formed arithmetic expression.
Semantic definitions and rules that give meaning to a well-formed arithmetic expression. "Meaning," in this case, can be thought of as "how the arithmetic expression computes."

Syntax

Here is an example of a well-formed arithmetic expression:

$8 \times (10 - 5 \div 2)$

In contrast, here is an ill-formed arithmetic expression:

$(4 + 5) - (; \times 9)$

In this ill-formed arithmetic expression, the multiplication operator (×) is missing a left-hand argument.

We can formalize this intuition about how to create a well-formed arithmetic expression with a grammar. Similarly to natural language, grammars concisely define syntactic categories as well as rules of formation for various textual objects. Here is a grammar for arithmetic expressions:

e ::= <number> | e1 + e2 | e1 - e2 | e1 × e2 | e1 ÷ e2 | (e)

To the left of the ::= symbol is a variable, e, that represents a particular syntactic category. To the right are the rules for forming an element of that category. Here, the syntactic category is "arithmetic expressions," traditionally represented with the variable e. The rules are given as a collection of possible forms or alternatives, separated by pipes (|). The grammar says that a well-formed expression is either:

A number (<number> is a placeholder for any number), or
An addition of the form e1 + e2 where e1 and e2 are expressions, or
A subtraction of the form e1 - e2 where e1 and e2 are expressions, or
A multiplication of the form e1 × e2 where e1 and e2 are expressions, or
A division of the form e1 ÷ e2 where e1 and e2 are expressions, or
A parenthesized expression of the form (e) where e is an expression.

Note

About Pronunciation: it may not seem like it, but pronunciation is an excellent tool for internalizing the meaning of a collection of mathematical symbols. Every collection of mathematical symbols is really a sentence in the natural language sense. In our grammar, we can pronounce ::= as "is" and | as "or," so we obtain: "e is a number, or an addition, or … ." This makes it more intuitive and clear what the symbols represent. As we introduce more mathematical symbols throughout the course, always look for ways to pronounce these symbols as cohesive sentences.

Importantly, the grammar is recursive in nature: the various alternatives contain expressions inside of them. This allows us to systematically break down an arithmetic expression into smaller parts, a fact that we leverage when we evaluate expressions. For example:

$8 \times (10 - 5 \div 2)$ is an expression.
$8$ is an expression.
$(10 - 5 \div 2)$ is an expression.
$10 - 5 \div 2$ is an expression.
$10$ is an expression.
$5 \div 2$ is an expression.
$5$ is an expression.
$2$ is an expression.

Note that when interpreting symbols, there is sometimes ambiguity as to how different symbols group together. For example, consider the sub-expression from above:

$10 - 5 \div 2.$

Should we interpret the expression as $10$ and $5 \div 2$ as in the example or alternatively $10 - 5$ and $2$ . Our knowledge of arithmetic tells us that the first interpretation is correct because the $(\div)$ operator takes precedence over the $(-)$ operator. Rules of precedence are, therefore, additional syntactic rules that we must consider in some cases. There also exist rules of associativity for some grammars that govern the order in which we group symbols when the same operator is invoked multiple times. For example, we traditionally interpret subtraction as left-associative. That is $1 - 2 - 3$ is understood to be $(1 - 2) - 3$ rather than $1 - (2 - 3)$ , i.e., $1 - 2$ goes first rather than $2 - 3$ .

Semantics

Once we have established what a well-formed expression looks like, we then need to define what it means. In general, the meaning of a well-formed object is dependent entirely on the purpose of said object. For arithmetic expressions, we care about computing the result of such an expression, so we take that to be the meaning or semantics of an expression.

Traditionally, we define the rules for computing or evaluating an arithmetic expression as follows:

Find the next sub-expression to evaluate using the rules of associativity and precedence to resolve ambiguity.
Substitute the value that the sub-expression evaluates to for that sub-expression in question.
Repeat the process until you are left with a final value.

A value is simply an expression that can not be evaluated or simplified any further. For arithmetic expressions, any number is a value.

By repeating this process, we obtain a series of evaluation steps that an arithmetic expression takes before arriving at a final value. For example, here are the steps of evaluation taken for our sample expression from above:

    8 × (10 - 5 ÷ 2)
--> 8 × (10 - 2.5)
--> 8 × 7.5
--> 60

Note that at each step, the resulting expressions are equivalent, e.g., $8 \times (10 - 5 \div 2)$ and $8 \times 7.5$ are equivalent; both of these expressions are also equivalent to the value $60$ .

Exercise

For each of the following arithmetic expressions, determine if (a) they are well-formed and (b) if they are well-formed, what is their step-by-step evaluation to a final value.

$100 \div (25 \times 5 - 3)$ .
$38 - \times 5 + 2)$ .
$1 + 2 - 3 + 4 \times 5 \div 6 - 7$ .

Racket and the Substitutive Model

Our substitutive computational model for Racket extends the rules for evaluating arithmetic expressions. Like arithmetic expressions, we will think of Racket programs as an expression that evaluates to a final value. And like arithmetic, programs operate by stepwise evaluation, where we find the next sub-expression to evaluates and substitute. Racket's rules for precedence mirror those from arithmetic. So we only need to consider precedence for Racket's additional constructs beyond arithmetic.

Expressions

For our purposes, well-formed Racket expressions can be defined by the following grammar:

e ::= x                       (variables)
    | <number>                (numbers)
    | #t                      (true)
    | #f                      (false)
    | (lambda (x1 ... xk) e)  (lambdas)
    | (e e1 ... ek)           (function application)
    | (let ([x e1]) e2)       (let-binding)

In other words, a Racket expression is either:

A variable x.
A number.
The true value #t.
The false value #f.
A Lambda with arguments x1, ..., xk and a body expression e.
A function application applying arguments e1, ..., ek to function e.
A let-binding that evaluates expression e1 to a value, v and binds v to x while evaluating expression e2.

As an example, the following expression taken from our definition of list-append above:

(cons head (list-append tail l2))

Is a well-formed expression that is a function application of two arguments to the function bound to the variable cons. Each argument is, itself, a function application.

Values

In arithmetic, numbers were our only kind of value. In Racket, we have multiple kinds of values corresponding to the different types of Racket:

Numbers are values (of type int or float depending on whether number is integral or floating-point).
#t and #f are values (of bool type)
(lambda (x1 ... xk) e) is also a value (of function type).

Substitutive Evaluation for Racket

The heart of Racket evaluation is function application. Function application directly generalizes arithmetic evaluation for operators: In arithmetic evaluation:

We first evaluate the left- and right-hand arguments to the operator to values.
We then apply the operator to those values.

For example, for $e_{1} + e_{2}$ , we first evaluate $e_{1}$ to a value, call it $v_{1}$ and then evaluate $e_{2}$ to a value, call it $v_{2}$ . We then carry out the addition of $v_{1}$ and $v_{2}$ .

To evaluate a function application, we first evaluate each of the sub-expressions of the function application to values. For example, consider the inc function defined to be:

(define inc
  (lambda (n)
    (+ n 1)))

if we have the expression, (inc (+ 3 (* 5 2))), it evaluates as follows:

    (inc (+ 3 (* 5 2)))
--> (inc (+ 3 10))
--> (inc 13)

Once we arrive at this point, we need to evaluate the actual function application itself. In the case of primitive operations, we just carry them out directly and substitute their resulting values into the overall expression. However, arbitrary functions will perform arbitrary operations, so we cannot simply just "substitute the result"; we must calculate it by hand!

To do this, we want to perform the following steps:

We substitute the body of the function be applied to for the overall function application.
We substitute the actual arguments passed to the function for each of its parameters.
We, then, continue evaluation of the resulting expression like normal.

For our above example, this means our expression finishes evaluating as follows:

    (inc 13)
--> (+ 13 1)
--> 14

These rules apply to functions of multiple arguments as well. For example, consider the following definition of a function that averages three numbers:

(define avg3
  (lambda (x1 x2 x3)
    (/ (+ x1 x2 x3) 3)))

Then the expression (avg3 (* 2 5) 8 (+ 1 2)) evaluates as follows:

    (avg3 (* 2 5) 8 (+ 1 2))
--> (avg3 10 8 (+ 1 2))
--> (avg3 10 8 3)
--> (/ (+ 10 8 3) 3)
--> (/ 21 3)
--> 7

Note how we evaluate the arguments to avg3 in left-to-right order, one argument at a time.

Exercise

Consider the following Racket top-level definitions:

(define f
  (lambda (x y)
    (+ x y)))

(define g
  (lambda (a b)
    (- (f a a) (f b b))))

Use your mental model of computation to give the step-by-step evaluation of the following Racket expression: (g 5 3). Check your work by evaluating this expression in DrRacket and verifying that you get the same final result.

Conditionals

Recall that conditional expressions have the form:

(if e1 e2 e3)

To evaluate a conditional:

We first evaluate the guard e1 to a value. This value ought to be a boolean value, i.e., #t or #f.
If the guard evaluates to #t, we substitute expression e2 for the entire conditional.
Otherwise, the guard must have evaluated to #f. We, therefore, substitute e3 for the entire conditional.
We then continue evaluating the resulting expression.

Here is an example of a conditional within our substitutive model.

    (+ 1 (if (< 3 5) 2 (* 5 5)))
--> (+ 1 (if #t 2 (* 5 5)))
--> (+ 1 2)
--> 3

Let-bindings

In Racket, we use let-bindings to assign names to intermediate computations for the purposes of readability or performance (i.e., to avoid redundant work). To evaluate a variable declaration of the form (let ([x e1]) e2).

We evaluate e1 to a value v.
We substitute every occurrence of x for v in e2.
Finally, we then evaluate e2 under this substitution.

To bind multiple variables, we can nest let expressions:

(let ([x 0])
  (let ([y 1])
    (let ([z 2])
      (+ x y z))))

However, Racket provides a shorthand macro, let*, that produces the same effect with less nesting:

(let* ([x 0]
       [y 1]
       [z 2])
  (+ x y z))

let* behaves similarly to let except we peel off the bindings in top-down order, evaluating and substituting for one binding before moving onto the next.

Consider the following function that declares local variables:

(define f
  (lambda ()
    (let* ([x (+ 1 1)])
          ([y (* x 2)])
          ([z (* x y)])
      (+ x y z))))

Here's an example of the step-by-step evaluation of a call to this no-argument function:

    (f)
--> (let* ([x (+ 1 1)])
          ([y (* x 2)])
          ([z (* x y)])
      (+ x y z))
--> (let* ([x 2])
          ([y (* x 2)])
          ([z (* x y)])
      (+ x y z))
--> (let* ([y (* 2 2)])
          ([z (* 2 y)])
      (+ 2 y z))
--> (let* ([y 4])
          ([z (* 2 y)])
      (+ 2 y z))
--> (let* ([z (* 2 4)])
      (+ 2 4 z))
--> (let* ([z 8])
      (+ 2 4 z))
--> (+ 2 4 8)
--> 14

Note

A note on precision: while performing these step-by-step derivations, you might be tempted to skip steps, e.g., evaluating all the arguments of a function call at once down to a single value. Resist this urge! One of the "hidden" skills we want to develop in working through these derivations is unfailing attention to detail. We want to be able to carry out mechanical processes like these derivations exactly as advertised.

This is useful for two purposes:

Such attention to detail is a skill necessary for effective algorithmic design. By maintaining this level of precision, you are more likely to catch corner cases and oddities in your design that you can fix before they later become bugs in code.
More immediate for our purposes, skipping steps in deductive reasoning is the number one way to introduce erroneous reasoning in your proofs! Mechanics don't lie, so being able to explicate every step of a logical argument is necessary to be certain that your reasoning is sound.

Later in the course, we will discuss lifting our reasoning to more high-level concerns where we are free to skip low-level steps so that we don't lose sight of the bigger picture of what we are trying to prove. However, our high-level reasoning is always backed by low-level mechanics. It is imperative your develop those mechanics now, otherwise, your high-level reasoning will be baseless!

Additional Exercises

Exercise (Additional Tracing, ‡)

Consider the following Racket functions:

(define compute3
  (lambda (x y z)
    (let ([intermediate (+ x y)])
      (+ (* z intermediate) intermediate))))

(define triple
  (lambda (n)
    (compute3 n n n)))

Give the step-by-step evaluation (i.e., evaluation traces) for each of the following expressions. Make sure to write down all steps of evaluation as required by our substitutive model of computation!

(+ 3 (* 5 (/ 2 (- 10 5))))
(compute3 (* 2 3) (+ 8 3) (/ 1 2))
(triple (+ 5 0.25))

Make sure to check your work by entering these expressions into Python!

CSC 208: Discrete Structures