Types and Programming Languages

Benjamin C. Pierce

‎

Introduction

Chapter 1: Introduction

1.1 Types in Computer Science

Formal methods are mathematically rigorous techniques for the specification, development, analysis, and verification of software systems.
Lightweight formal methods emphasize partial specification, and often are modest enough in power that they more easily be implemented into compilers, linkers, and/or program analyzers.
Model checkers are lightweight formal methods which are tools that search for errors in finite-state systems.
Run-time monitoring is a collection of techniques that allow a system to dynamically detect when one of its components is not behaving according to specification.
A type system is a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute.

Chapter 2: Mathematical Preliminaries

Part I: Untyped Systems

Chapter 3: Untyped Arithmetic Expressions

3.2 Syntax

Inference rules are employed in "natural deduction style" presentations of logical systems as shorthand for inductive definitions of terms.
Axioms are inference rules which have no premises.
Rule schemas are inference rules whose premises and/or conclusions include metavariables.
- Formally, each schema represents the infinite set of concrete rules that can be obtained by replacing each metavariable in consistently by all phrases form the appropriate syntactic category.

3.4 Semantic Styles

Operational semantics specifies the behavior of a programming language by defining a simple abstract machine for it.
- This machine is "abstract" in the sense that is uses the terms of the language as its machine code, rather than some low-level microprocessor instruction set.
- For simple languages, a state of the machine is just a term, and the machine's behavior is defined by a transition function that, for each state, either gives the next state by performing a step of simplification on the term or declares that the machine has halted.
- The meaning of a term t can be taken to be the final state that the machine reaches when started with t as its initial state.
- Strictly speaking, what is described here is the so-called small-step style of operational semantics, sometimes called structural operational semantics.
In denotational semantics, the meaning of a term is taken to be some mathematical object, such as a number or a function.
- Giving denotational semantics for a language consists of finding a collection of semantic domains and then defining an interpretation function mapping terms into elements of these domains.
Instead of first defining the behaviors of programs and then deriving laws from this definition, axiomatic semantics takes the laws themselves as the definition of the language.

3.5 Evaluation

In small-step operational semantics, there are two types of rules:
- Congruence rules find the place in the program where a reduction happens.
- Computation rules actually perform the reduction.
An instance of an inference rule is obtained by consistently replacing each metavariable by the same term in the rule's conclusion and all its premises (if any).
A rule is satisfied by a relation if, for each instance of the rule, either the conclusion is in the relation or one of the premises is not.
The one-step evaluation relation → is the smallest binary relation on terms satisfying the evaluation rules.
Then the pair (t, t′) is in the evaluation relation, we say that the evaluation statement (or judgement) t → t′ is derivable.
A term t is in normal form if no evaluation rule applies to it–i.e., if there is no t′ such that t → t′.
The multi-step evaluation relation →* is the reflexive, transitive closure of one-step evaluation.
A closed term is stuck if it is in normal form but not a value.
- "Stuckness" gives us a simple notion of run-time error for our simple machine. Intuitively, it characterizes the situations where the operational semantics does not know what to do because the program has reached a "meaningless state."
Big-step semantics (sometimes called natural semantics) directly formulates the notion of "this term evaluates to that final value", written t ⇓ v.

Chapter 5: The Untyped Lambda Calculus

Lambda-calculus is a formal system invented in the 1920s by Alonzo Church, in which all computation is reduced to the basic operations of function definition and application.

5.1 Basics

A lambda-term is an arbitrary term in the lambda-calculus.
A lambda-abstraction is a lambda-term which begins with λ.

Abstract and Concrete Syntax
- The concrete syntax (or surface syntax) of a languaghe refers to the strings of characters that programmers directly read and write.
- The abstract syntax is a much simpler representation of programs as labeled trees (called abstract syntax trees, or ASTs). The tree representation renders the structure of terms immediately obvious, making it a natural fit for the complex minupulations involved in both rigorous language definitions and the internals of compilers and interpreters.
- Lexical analyzers (or lexers) convert the string of characters written by the programmer into a sequence of tokens–identifiers, keywords, constants, punctuation, etc.
- Parsers transform a sequence of tokens into an abstract syntax tree.
Scope
- An occurrence of the variable x is said to be bound when it occurs in the body t of an abstraction λ(x.t). More precisely, it is bound by this abstraction. And we say that λ(x) is a binder whose scope is t.
- An occurrence of the variable x is said to be free if it appears in a position where it is not bound by an enclosing abstraction on x.
- A term with no free variables is said to be closed.
- Closed terms are also known as combinators.
- The simplest combinator, called the identity function, does nothing but return its argument, and is defined as id=λ(x.x).
Operational Semantics
- A term of the form (λ x.t₁₂) t₂ is called a redex ("reducible expression").
- Beta-reduction is the operation of rewriting a redex according to the following rule: (λ x.t₁₂) t₂ → [x → t₂]t₁₂, where [x → t₂]t₁₂ means "the term obtained by replacing all free occurrences of x in t₁₂ by t₂."
- Under full beta-reduction, any redex may be reduced at any time.
- Under the normal order strategy, the leftmost, outermost redex is always reduced first.
- Under the call by name strategy, no reductions are allowed inside abstractions.
- Under the call by value strategy, only outermost redexes are reduced and where a redex is reduced only when its right-hand side has already been reduced to a value–a term that is finished computing and cannot be reduced any further.
- A strategy is considered strict when arguments to functions are always evaluated, whether or not they are used by the body of the function (e.g.: call by value).
- A strategy is considered non-strict (or lazy) when arguments to functions are only evaluated when the arguments are actually used (e.g.: call by name).

5.2 Programming in the Lambda-Calculus

Multiple Arguments
- Higher-order functions are functions that yield functions as results.
- Currying is the transformation of multi-argument functions into higher-order functions.
Church Booleans
- Church encoding is a means of representing data and operators in the lambda-calculus.
- Church Booleans are terms in the lambda-calculus which are used to represent boolean values.
  - tru = λ t. λ f. t;
  - fls = λ t. λ f. f;
Pairs
- Using booleans, we can encode pairs of values as terms:
  - pair = λ f. λ s. λ b. b f s
  - fst = λ p. p tru;
  - snd = λ p. p fls;
Church Numerals
- Church numerals are terms in the lambda-calculus which are used to represent the natural numbers, where each number n is represented by a combinator c_n that takes two arguments, s and z (for "successor" and "zero"), and applies s, n times, to z.
  - c₀ = λ s. λ z z;
  - c₁ = λ s. λ s. λ z;
  - c₂ = λ s. λ s. λ s. λ z;
  - etc.
- Other arithmetic operations can be defined as follows:
  - scc = λ n. λ s. λ z. s (n s z);
  - plus = λ m. λ n. λ s. λ z. m s (n s z);
  - times = λ m. λ n. m (plus n) c₀;
  - iszro = λ m. m (λ x. fls) tru;
- Subtraction can be implemented with the rather tricky "predecessor function," which, given c₀ as argument, returns c₀ and, given c_i+1, returns c_i:
  - zz = pair c₀ c₀;
  - ss = λ p. pair (snd p) (plus c_i (snd p));
  - prd = λ m. fst (m ss zz);
Enriching the Calculus
- For simplicity, we introduce primitive booleans and numerals, as well as functions which convert from one to the other:
  - realbool = λ b. b true false;
  - churchbool = λ b. if b then tru else fls;
  - realeq = λ m. λ n. (equal m n) true false;
  - realnat = λ m. m (λ x. succ x) 0;
Recursion
- Terms with no normal form are said to diverge.
- The divergent combinator is defined as follows: omega = (λ x. x x) (λ x. x x); Reducing this redex yields exactly omega again.
- The fixed-point combinator, also known as the call-by-value Y-combinator is defined as follows:
  - fix = λ f. (λ x. f (λ y. x x y)) (λ x. f (λ y. x x y));
- The fixed-point combinator can be used to define a factorial function:
  - g = λ fct. λ n. if realeq n c₀ then c₁ else (times n (fct (prd n)));
  - factorial = fix g;

5.3 Formalities

Substitution
- Variable capture is the phenomenon of free variables in a term s becoming bound when s is naively substituted into a term t.
- Capture-avoiding substitution occurs when bound variable names of a term t are kept distinct from the free variable names of a term s when substituting s into t.
- Alpha-conversion is the operation of consistently renaming a bound variable in a term.

Chapter 6: Nameless Representation of Terms

The Barendregt convention describes the general condition that the names of all bound variables must all be different from each other and from any free variables that may be used.
Combinatory logic refers to a language based directly on combinators instead of on variables.

6.1 Terms and Contexts

De Bruijn's idea was that terms can be more straightforwardly represented by making variable occurrences point directly to their binders, rather than referring to them by name.
- This can be accomplished by replacing named variabled by natural numbers, where the number k stands for "the variable bound by the k'th enclosing λ ."
- For example, the orginary term λ x. x corresponds to the nameless term λ.0, while λ x. λ y. x (y x) corresponds to λ.λ. 1 (0 1).
- Namesless terms are sometimes called de Bruijn terms, and the numeric variables in them are called de Bruijn indices.
- Compiler writers use the term static distances for the same concept.
A naming context is a mapping from de Bruijn indices to free variables.

6.2 Shifting and Substitution

Shifting is the operation of renumbering the indices of free variables in a term to allow a substitution that does not conflict with existing indices.
- Since we do not want to shift bound variables, we only shift those above some cutoff.
- We use the notation ↑^d_c(t) to describe the d-place shift of a term t above cutoff c.

6.3 Evaluation

A variation of de Bruijn indices, which number lambda-binders "from the inside out", are de Bruijn levels, which number binders "from the outside in."

Chapter 7: An ML Implementation of the Lambda-Calculus

7.4 Notes

An environment describes a data structure which is carried along with the term being evaluated, and records associations between bound variable names and the argument value.
Environments can be regarded as a kind of explicit substitution, since it effectively moves the mechanism of substitution from the meta-language into the object language and makes it a part of the syntax of terms manipulated by the evaluator.

Part II: Simple Types

Chapter 8: Typed Arithmetic Expressions

8.2 The Typing Relation

The typing relation for arithmetic expressions is the smallest binary relation between terms and types satisfying all instances of the typing rules.
A term t is typable (or well typed) if there is some type T such that t ∈ T.
The generation lemma for a typing relation (sometimes called an inversion lemma), given a valid typing statement, shows how a proof of this statement could have been generated.

8.3 Safety = Progress + Preservation

Safety (also called soundness) is a property which means that well-typed terms do not reach a "stuck" state when undergoing evaluation.
The progress theorem states that a well-typed term is not stuck (either it is a value or it can take a step according to the evaluation rules).
The preservation theorem states that if a well-typed term takes a step of evaluation, then the resulting term is also well-typed.
The canonical forms of a type are the well-typed values of that type.
The preservation theorem is often called subject reduction or subject evaluation.

Chapter 9: Simply Typed Lambda-Calculus

9.1 Function Types

The type constructor → takes two types T₁ and T₂, and produces a type for a function T₁ → T₂, describing a function which takes an input of type T₁ and produces an output of type T₂.

9.2 The Typing Relation

Languages in which type annotations in terms are used to help guide the typechecker are called explicitly typed.
Languages in which we ask the typechecker to infer or reconstruct type information are called implicitly typed.
A typing context (also called a type environment) Γ is a sequence of variables and their types.
- The expression ∅ \vdash t : T means "the closed term t has type T undery the empty set of assumptions". The ∅ is often omitted for simplicity.
- The "comma" operator is used to extend Γ, for example, in the following expression: Γ,x:T₁ \vdash t₂:T₂
We say that a language is degenerate if it has no well-typed terms at all.

9.3 Properties of Typing

Uniqueness of Types Theorem: In a given typing context Γ, a term t (with free variables all in the domain of Γ) has at most one type. That is, if a term is typable, then its type is unique. Moreover, there is just one derivation of this typing built from the inference rules that generate the typing relation.
Progress Theorem: Suppose t is a closed, well-typed term (that is, \vdash t:T for some T). Then either t is a value or else there is some t' with t → t'.
Preservation Theorem: If Γ \vdash t:T and t → t', then Γ \vdash t':T.

9.4 The Curry-Howard Correspondence

The → type constructor comes with typing rules of two kinds:
- an introduction rule (e.g., rules for abstraction) describing how elements of the type can be created, and
- an elimination rule (e.g., rules for application) describing how elements of the type can be used

The Curry-Howard correspondence (also: Curry-Howard isomorphism, propositions as types analogy) lays out a relationship between type theory and logic where a term of the simply typed lambda-calculus can be seen as a proof of the logical proposition corresponding to its type.

Computation–reduction of lambda-terms–corresponds to the logical operation of proof simplification by cut elimination

Logic	Programming Languages
propositions	types
proposition P ⊃ Q	type P → Q
proposition P ∧ Q	type P × Q
proof of proposition of P	term t of type P
proposition P is provable	type P is inhabited (by some term)

Linear logic gives rise to the idea of linear type systems and modal logics have been used to help design frameworks for partial evaluation and run-time code generation.

9.5 Erasure and Typability

Erasure is the process of mapping simply typed terms to corresponding untyped terms.

9.6 Curry-Style vs. Church-Style

A language is said to be defined in Curry-style if the semantics are defined first, then a type system is added to reject unwanted terms.
A language is said to be defined in Church-style if typing is defined before semantics.
Implicitly typed presentations of lambda-calculi are often given in the Curry style, while Church-style presentations are common only for explicitly typed systems.

Chapter 10: An ML Implementation of Simple Types

Chapter 11: Simple Extensions

11.1 Base Types

Base types are sets of simple, unstructured values, such as numbers, booleans, or characters.
Base types are also referred to as atomic types, because they have no internal structure as far as the type system is concerned.

11.3 Derived Forms: Sequencing and Wildcards

The sequencing notation t₁;t₂ has the effect of evaluating t₁, throwing away its trivial result, and going on to evaluate t₂.
- This can be understood as an abbreviation for (λ x: Unit.t₂) t₁, where the variable x is chosen different than all the free variables of t₂.
Derived forms (also called syntactic sugar) are constructs that can be derived from constructs of the more fundamental operations of abstraction and application.
Replacing a derived form with its lower-level definition is called desugaring.
The wildcard binder, written "_", is used to abbreviate a lambda-abstraction in which the parameter variable is not actually used in the body of the abstraction.
- For example, we can write λ _:S.t to abbreviate λ x:S.t where x is some variable not occurring in t.

11.4 Ascription

To ascribe a particular type to a given term is to record in the text of the program an assertion that this term has this type.
- We say "t as T" to mean "the term t, to which we ascribe the type T".

11.5 Let Bindings

We write let x=t₁ in t₂ to mean "evaluate the expression t₁ and bind the name x to the resulting value while evaluating t₂."
- This feature can be equivalently defined as: (λ x:T₁. t₂) t₁

11.6 Pairs

Pairs can be written as {t₁, t₂}, and projection is written t.1 for the first projection and t.2 for the second.
The type of a pair is T₁ × T₂, which is called the product (or sometimes cartesian product) of T₁ and T₂.

11.7 Tuples

A generalization of pairs, a tuple of n terms, is expressed as {t_i^i∈1..n} and has the type {T_i^i∈1..n}.

11.8 Records

A generalization of tuples, is a record, which uses labels instead of sequential indices. A record is expressed as {l_i=v_i^i∈1..n}, where l_i is the label for value v_i.
While projection can be used to extract the fields of a record one at a time, pattern matching syntax can be used to extract all the fields at the same time.

11.9 Sum Types

A sum type describes a set of values drawn from exactly two given types, and is expressed as T₁+T₂.
Elements of a sum type can be created by tagging elements of the component types.

11.10 Variants

Binary sums generalize to variants, and are expressed as <l_i:T_i^i∈1..n>, where l_i is the field label for type T_i.

Options
- An optional value is an element of the type <none:Unit, some:T>, where T is some specific type.
Enumerations
- An enumerated type (or enumeration) is a variant type in which the field type associated with each label is Unit.
  - e.g.: Weekday = <monday:Unit, tuesday:Unit, …>
Single-Field Variants
- Single-field variants can be used to enforce rules about what operations can be performed on certain values via the type system itself.
Variants as Disjoint Unions
- Sum and variant types are sometimes called disjoint unions, because the sets of elements of type T_i are tagged before they are combined.
- The phrase union type is used to refer to untagged (non-disjoint) union types.

11.11 General Recursion

The simply typed lambda-calculus with numbers and fix is a favorite experimental subject for programming researchers, and is often called PCF.
The recursive binding construct letrec is easily defined as a derived form of fix: letrec x:T₁=t₁ in t₂ ⇔ let x = fix(λ x:T₁. t₁) in t₂

11.12 Lists

Type constructors, like → and × build new types from old ones.
Another useful type constructor is List, which, for every type T, describes finite-length lists whose elements are drawn from T.

Chapter 12: Normalization

An important property of the simply typed lambda-calculus is that the evaluation of a well-typed program is guaranteed to halt in a finite number of steps–i.e., every well-typed term is normalizable.

Chapter 13: References

A programming language feature is called impure if it has side-effects (computational effects), and pure otherwise.

Author: Matthew DiBello

Created: 2025-03-14 Fri 12:05

Validate