Types and Programming Languages
Benjamin C. PierceIntroduction
Chapter 1: Introduction
1.1 Types in Computer Science
- Formal methods are mathematically rigorous techniques for the specification, development, analysis, and verification of software systems.
- Lightweight formal methods emphasize partial specification, and often are modest enough in power that they more easily be implemented into compilers, linkers, and/or program analyzers.
- Model checkers are lightweight formal methods which are tools that search for errors in finite-state systems.
- Run-time monitoring is a collection of techniques that allow a system to dynamically detect when one of its components is not behaving according to specification.
- A type system is a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute.
Chapter 2: Mathematical Preliminaries
Part I: Untyped Systems
Chapter 3: Untyped Arithmetic Expressions
3.2 Syntax
- Inference rules are employed in "natural deduction style" presentations of logical systems as shorthand for inductive definitions of terms.
- Axioms are inference rules which have no premises.
- Rule schemas are inference rules whose premises and/or conclusions include metavariables.
- Formally, each schema represents the infinite set of concrete rules that can be obtained by replacing each metavariable in consistently by all phrases form the appropriate syntactic category.
3.4 Semantic Styles
- Operational semantics specifies the behavior of a programming language by defining a simple abstract machine for it.
- This machine is "abstract" in the sense that is uses the terms of the language as its machine code, rather than some low-level microprocessor instruction set.
- For simple languages, a state of the machine is just a term, and the machine's behavior is defined by a transition function that, for each state, either gives the next state by performing a step of simplification on the term or declares that the machine has halted.
- The meaning of a term t can be taken to be the final state that the machine reaches when started with t as its initial state.
- Strictly speaking, what is described here is the so-called small-step style of operational semantics, sometimes called structural operational semantics.
- In denotational semantics, the meaning of a term is taken to be some mathematical object, such as a number or a function.
- Giving denotational semantics for a language consists of finding a collection of semantic domains and then defining an interpretation function mapping terms into elements of these domains.
- Instead of first defining the behaviors of programs and then deriving laws from this definition, axiomatic semantics takes the laws themselves as the definition of the language.
3.5 Evaluation
- In small-step operational semantics, there are two types of rules:
- Congruence rules find the place in the program where a reduction happens.
- Computation rules actually perform the reduction.
- An instance of an inference rule is obtained by consistently replacing each metavariable by the same term in the rule's conclusion and all its premises (if any).
- A rule is satisfied by a relation if, for each instance of the rule, either the conclusion is in the relation or one of the premises is not.
- The one-step evaluation relation → is the smallest binary relation on terms satisfying the evaluation rules.
- Then the pair (t, t′) is in the evaluation relation, we say that the evaluation statement (or judgement) t → t′ is derivable.
- A term t is in normal form if no evaluation rule applies to it–i.e., if there is no t′ such that t → t′.
- The multi-step evaluation relation →* is the reflexive, transitive closure of one-step evaluation.
- A closed term is stuck if it is in normal form but not a value.
- "Stuckness" gives us a simple notion of run-time error for our simple machine. Intuitively, it characterizes the situations where the operational semantics does not know what to do because the program has reached a "meaningless state."
- Big-step semantics (sometimes called natural semantics) directly formulates the notion of "this term evaluates to that final value", written t ⇓ v.
Chapter 5: The Untyped Lambda Calculus
- Lambda-calculus is a formal system invented in the 1920s by Alonzo Church, in which all computation is reduced to the basic operations of function definition and application.
5.1 Basics
- A lambda-term is an arbitrary term in the lambda-calculus.
- A lambda-abstraction is a lambda-term which begins with λ.
- Abstract and Concrete Syntax
- The concrete syntax (or surface syntax) of a languaghe refers to the strings of characters that programmers directly read and write.
- The abstract syntax is a much simpler representation of programs as labeled trees (called abstract syntax trees, or ASTs). The tree representation renders the structure of terms immediately obvious, making it a natural fit for the complex minupulations involved in both rigorous language definitions and the internals of compilers and interpreters.
- Lexical analyzers (or lexers) convert the string of characters written by the programmer into a sequence of tokens–identifiers, keywords, constants, punctuation, etc.
- Parsers transform a sequence of tokens into an abstract syntax tree.
- Scope
- An occurrence of the variable x is said to be bound when it occurs in the body t of an abstraction λ(x.t). More precisely, it is bound by this abstraction. And we say that λ(x) is a binder whose scope is t.
- An occurrence of the variable x is said to be free if it appears in a position where it is not bound by an enclosing abstraction on x.
- A term with no free variables is said to be closed.
- Closed terms are also known as combinators.
- The simplest combinator, called the identity function, does nothing but return its argument, and is defined as id=λ(x.x).
- Operational Semantics
- A term of the form (λ x.t12) t2 is called a redex ("reducible expression").
- Beta-reduction is the operation of rewriting a redex according to the following rule: (λ x.t12) t2 → [x → t2]t12, where [x → t2]t12 means "the term obtained by replacing all free occurrences of x in t12 by t2."
- Under full beta-reduction, any redex may be reduced at any time.
- Under the normal order strategy, the leftmost, outermost redex is always reduced first.
- Under the call by name strategy, no reductions are allowed inside abstractions.
- Under the call by value strategy, only outermost redexes are reduced and where a redex is reduced only when its right-hand side has already been reduced to a value–a term that is finished computing and cannot be reduced any further.
- A strategy is considered strict when arguments to functions are always evaluated, whether or not they are used by the body of the function (e.g.: call by value).
- A strategy is considered non-strict (or lazy) when arguments to functions are only evaluated when the arguments are actually used (e.g.: call by name).
5.2 Programming in the Lambda-Calculus
- Multiple Arguments
- Higher-order functions are functions that yield functions as results.
- Currying is the transformation of multi-argument functions into higher-order functions.
- Church Booleans
- Church encoding is a means of representing data and operators in the lambda-calculus.
- Church Booleans are terms in the lambda-calculus which are used to represent boolean values.
- tru = λ t. λ f. t;
- fls = λ t. λ f. f;
- Pairs
- Using booleans, we can encode pairs of values as terms:
- pair = λ f. λ s. λ b. b f s
- fst = λ p. p tru;
- snd = λ p. p fls;
- Using booleans, we can encode pairs of values as terms:
- Church Numerals
- Church numerals are terms in the lambda-calculus which are used to represent the natural numbers, where each number n is represented by a combinator cn that takes two arguments, s and z (for "successor" and "zero"), and applies s, n times, to z.
- c0 = λ s. λ z z;
- c1 = λ s. λ s. λ z;
- c2 = λ s. λ s. λ s. λ z;
- etc.
- Other arithmetic operations can be defined as follows:
- scc = λ n. λ s. λ z. s (n s z);
- plus = λ m. λ n. λ s. λ z. m s (n s z);
- times = λ m. λ n. m (plus n) c0;
- iszro = λ m. m (λ x. fls) tru;
- Subtraction can be implemented with the rather tricky "predecessor function," which, given c0 as argument, returns c0 and, given ci+1, returns ci:
- zz = pair c0 c0;
- ss = λ p. pair (snd p) (plus ci (snd p));
- prd = λ m. fst (m ss zz);
- Church numerals are terms in the lambda-calculus which are used to represent the natural numbers, where each number n is represented by a combinator cn that takes two arguments, s and z (for "successor" and "zero"), and applies s, n times, to z.
- Enriching the Calculus
- For simplicity, we introduce primitive booleans and numerals, as well as functions which convert from one to the other:
- realbool = λ b. b true false;
- churchbool = λ b. if b then tru else fls;
- realeq = λ m. λ n. (equal m n) true false;
- realnat = λ m. m (λ x. succ x) 0;
- For simplicity, we introduce primitive booleans and numerals, as well as functions which convert from one to the other:
- Recursion
- Terms with no normal form are said to diverge.
- The divergent combinator is defined as follows: omega = (λ x. x x) (λ x. x x); Reducing this redex yields exactly omega again.
- The fixed-point combinator, also known as the call-by-value Y-combinator is defined as follows:
- fix = λ f. (λ x. f (λ y. x x y)) (λ x. f (λ y. x x y));
- The fixed-point combinator can be used to define a factorial function:
- g = λ fct. λ n. if realeq n c0 then c1 else (times n (fct (prd n)));
- factorial = fix g;
5.3 Formalities
- Substitution
- Variable capture is the phenomenon of free variables in a term s becoming bound when s is naively substituted into a term t.
- Capture-avoiding substitution occurs when bound variable names of a term t are kept distinct from the free variable names of a term s when substituting s into t.
- Alpha-conversion is the operation of consistently renaming a bound variable in a term.
Chapter 6: Nameless Representation of Terms
- The Barendregt convention describes the general condition that the names of all bound variables must all be different from each other and from any free variables that may be used.
- Combinatory logic refers to a language based directly on combinators instead of on variables.
6.1 Terms and Contexts
- De Bruijn's idea was that terms can be more straightforwardly represented by making variable occurrences point directly to their binders, rather than referring to them by name.
- This can be accomplished by replacing named variabled by natural numbers, where the number k stands for "the variable bound by the k'th enclosing λ ."
- For example, the orginary term λ x. x corresponds to the nameless term λ.0, while λ x. λ y. x (y x) corresponds to λ.λ. 1 (0 1).
- Namesless terms are sometimes called de Bruijn terms, and the numeric variables in them are called de Bruijn indices.
- Compiler writers use the term static distances for the same concept.
- A naming context is a mapping from de Bruijn indices to free variables.
6.2 Shifting and Substitution
- Shifting is the operation of renumbering the indices of free variables in a term to allow a substitution that does not conflict with existing indices.
- Since we do not want to shift bound variables, we only shift those above some cutoff.
- We use the notation ↑dc(t) to describe the d-place shift of a term t above cutoff c.
6.3 Evaluation
- A variation of de Bruijn indices, which number lambda-binders "from the inside out", are de Bruijn levels, which number binders "from the outside in."
Chapter 7: An ML Implementation of the Lambda-Calculus
7.4 Notes
- An environment describes a data structure which is carried along with the term being evaluated, and records associations between bound variable names and the argument value.
- Environments can be regarded as a kind of explicit substitution, since it effectively moves the mechanism of substitution from the meta-language into the object language and makes it a part of the syntax of terms manipulated by the evaluator.
Part II: Simple Types
Chapter 8: Typed Arithmetic Expressions
8.2 The Typing Relation
- The typing relation for arithmetic expressions is the smallest binary relation between terms and types satisfying all instances of the typing rules.
- A term t is typable (or well typed) if there is some type T such that t ∈ T.
- The generation lemma for a typing relation (sometimes called an inversion lemma), given a valid typing statement, shows how a proof of this statement could have been generated.
8.3 Safety = Progress + Preservation
- Safety (also called soundness) is a property which means that well-typed terms do not reach a "stuck" state when undergoing evaluation.
- The progress theorem states that a well-typed term is not stuck (either it is a value or it can take a step according to the evaluation rules).
- The preservation theorem states that if a well-typed term takes a step of evaluation, then the resulting term is also well-typed.
- The canonical forms of a type are the well-typed values of that type.
- The preservation theorem is often called subject reduction or subject evaluation.
Chapter 9: Simply Typed Lambda-Calculus
9.1 Function Types
- The type constructor → takes two types T1 and T2, and produces a type for a function T1 → T2, describing a function which takes an input of type T1 and produces an output of type T2.
9.2 The Typing Relation
- Languages in which type annotations in terms are used to help guide the typechecker are called explicitly typed.
- Languages in which we ask the typechecker to infer or reconstruct type information are called implicitly typed.
- A typing context (also called a type environment) Γ is a sequence of variables and their types.
- The expression ∅ \vdash t : T means "the closed term t has type T undery the empty set of assumptions". The ∅ is often omitted for simplicity.
- The "comma" operator is used to extend Γ, for example, in the following expression: Γ,x:T1 \vdash t2:T2
- We say that a language is degenerate if it has no well-typed terms at all.
9.3 Properties of Typing
- Uniqueness of Types Theorem: In a given typing context Γ, a term t (with free variables all in the domain of Γ) has at most one type. That is, if a term is typable, then its type is unique. Moreover, there is just one derivation of this typing built from the inference rules that generate the typing relation.
- Progress Theorem: Suppose t is a closed, well-typed term (that is, \vdash t:T for some T). Then either t is a value or else there is some t' with t → t'.
- Preservation Theorem: If Γ \vdash t:T and t → t', then Γ \vdash t':T.
9.4 The Curry-Howard Correspondence
- The → type constructor comes with typing rules of two kinds:
- an introduction rule (e.g., rules for abstraction) describing how elements of the type can be created, and
- an elimination rule (e.g., rules for application) describing how elements of the type can be used
The Curry-Howard correspondence (also: Curry-Howard isomorphism, propositions as types analogy) lays out a relationship between type theory and logic where a term of the simply typed lambda-calculus can be seen as a proof of the logical proposition corresponding to its type.
- Computation–reduction of lambda-terms–corresponds to the logical operation of proof simplification by cut elimination
Logic Programming Languages propositions types proposition P ⊃ Q type P → Q proposition P ∧ Q type P × Q proof of proposition of P term t of type P proposition P is provable type P is inhabited (by some term) - Linear logic gives rise to the idea of linear type systems and modal logics have been used to help design frameworks for partial evaluation and run-time code generation.
9.5 Erasure and Typability
- Erasure is the process of mapping simply typed terms to corresponding untyped terms.
9.6 Curry-Style vs. Church-Style
- A language is said to be defined in Curry-style if the semantics are defined first, then a type system is added to reject unwanted terms.
- A language is said to be defined in Church-style if typing is defined before semantics.
- Implicitly typed presentations of lambda-calculi are often given in the Curry style, while Church-style presentations are common only for explicitly typed systems.
Chapter 10: An ML Implementation of Simple Types
Chapter 11: Simple Extensions
11.1 Base Types
- Base types are sets of simple, unstructured values, such as numbers, booleans, or characters.
- Base types are also referred to as atomic types, because they have no internal structure as far as the type system is concerned.
11.3 Derived Forms: Sequencing and Wildcards
- The sequencing notation t1;t2 has the effect of evaluating t1, throwing away its trivial result, and going on to evaluate t2.
- This can be understood as an abbreviation for (λ x: Unit.t2) t1, where the variable x is chosen different than all the free variables of t2.
- Derived forms (also called syntactic sugar) are constructs that can be derived from constructs of the more fundamental operations of abstraction and application.
- Replacing a derived form with its lower-level definition is called desugaring.
- The wildcard binder, written "_", is used to abbreviate a lambda-abstraction in which the parameter variable is not actually used in the body of the abstraction.
- For example, we can write λ _:S.t to abbreviate λ x:S.t where x is some variable not occurring in t.
11.4 Ascription
- To ascribe a particular type to a given term is to record in the text of the program an assertion that this term has this type.
- We say "t as T" to mean "the term t, to which we ascribe the type T".
11.5 Let Bindings
- We write let x=t1 in t2 to mean "evaluate the expression t1 and bind the name x to the resulting value while evaluating t2."
- This feature can be equivalently defined as: (λ x:T1. t2) t1
11.6 Pairs
- Pairs can be written as {t1, t2}, and projection is written t.1 for the first projection and t.2 for the second.
- The type of a pair is T1 × T2, which is called the product (or sometimes cartesian product) of T1 and T2.
11.7 Tuples
- A generalization of pairs, a tuple of n terms, is expressed as {tii∈1..n} and has the type {Tii∈1..n}.
11.8 Records
- A generalization of tuples, is a record, which uses labels instead of sequential indices. A record is expressed as {li=vii∈1..n}, where li is the label for value vi.
- While projection can be used to extract the fields of a record one at a time, pattern matching syntax can be used to extract all the fields at the same time.
11.9 Sum Types
- A sum type describes a set of values drawn from exactly two given types, and is expressed as T1+T2.
- Elements of a sum type can be created by tagging elements of the component types.
11.10 Variants
- Binary sums generalize to variants, and are expressed as <li:Tii∈1..n>, where li is the field label for type Ti.
- Options
- An optional value is an element of the type <none:Unit, some:T>, where T is some specific type.
- Enumerations
- An enumerated type (or enumeration) is a variant type in which the field type associated with each label is Unit.
- e.g.: Weekday = <monday:Unit, tuesday:Unit, …>
- An enumerated type (or enumeration) is a variant type in which the field type associated with each label is Unit.
- Single-Field Variants
- Single-field variants can be used to enforce rules about what operations can be performed on certain values via the type system itself.
- Variants as Disjoint Unions
- Sum and variant types are sometimes called disjoint unions, because the sets of elements of type Ti are tagged before they are combined.
- The phrase union type is used to refer to untagged (non-disjoint) union types.
11.11 General Recursion
- The simply typed lambda-calculus with numbers and fix is a favorite experimental subject for programming researchers, and is often called PCF.
- The recursive binding construct letrec is easily defined as a derived form of fix: letrec x:T1=t1 in t2 ⇔ let x = fix(λ x:T1. t1) in t2
11.12 Lists
- Type constructors, like → and × build new types from old ones.
- Another useful type constructor is List, which, for every type T, describes finite-length lists whose elements are drawn from T.
Chapter 12: Normalization
- An important property of the simply typed lambda-calculus is that the evaluation of a well-typed program is guaranteed to halt in a finite number of steps–i.e., every well-typed term is normalizable.
Chapter 13: References
- A programming language feature is called impure if it has side-effects (computational effects), and pure otherwise.
Created: 2025-03-14 Fri 12:05