Introduction
Type theory is the formal presentation and study of type systems. It is a complex topic composed of relatively simple primitives, but its relationship to category theory, and implementations ranging from the nonexistent to the strictly practical to the formalization-of-the-interconnectedness-of-all-things-in-the-universe, leave it with a mixed, misunderstood, and often unfavorable reputation.
"What the fuck is a type?"
- @Zac_Aztec (circa 2023)
The objective of this article is to begin from first principles and explain what types are, how they are composed, what their utility is, and how they map to commonly recognized programming languages.
The explanations here are a bit abstract, but are intended to create a more fundamental model for reasoning about type theory and its application in programming.
A rudimentary understanding of programming is preferred for this article, but not assumed.
Primitive Type
The primitive type is a type enshrined in a programming language itself that represents one of a few basic categories of terms. These may include integers, decimal numbers, logical booleans, and so on.
Some languages in the ML (meta language) family do not enshrine the primitive types listed above; rather, they rely on compositions of types to create the types that are more useful for practical programs. This seems like an unhelpful explanation, but there is a chicken-and-egg problem of prerequisite knowledge needed to form a model of it. This will be explained throughout the article, but for now take the following at face value and don't think too hard about it.
It's types all the way down.
Type is also a type.
// rust example
type MyEightBitUnsignedInteger = u8;
type MyLogicalBoolean = bool;
-- haskell example
data Boolean = True | False
Function
The function is a mapping from one term to another. These terms may be of the same type or they may be of different types. Languages with "first class function support" or "higher-order functions" treat the function as its own type. This means functions can take terms as inputs and return terms as outputs that can be of any type including other functions.
Higher-order functions also enable function composition. This is connected to category theory: given two functions, one that maps term A to term B and another that maps term B to term C, these two functions can be composed such that, by applying both functions sequentially, there exists a mapping from term A to term C.
That is to say:
if `A -> B`
and `B -> C`
then the chain `A -> B -> C` exists
therefore `A -> C` exists
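To make this concrete, here is a Rust sketch of composition (the `compose` helper is illustrative, not a standard library item):
// rust example
fn compose<A, B, C>(f: impl Fn(A) -> B, g: impl Fn(B) -> C) -> impl Fn(A) -> C {
    // apply f first, then g, yielding a single mapping from A to C
    move |a| g(f(a))
}
fn main() {
    let double = |x: u8| x * 2; // u8 -> u8
    let is_even = |x: u8| x % 2 == 0; // u8 -> bool
    let double_then_check = compose(double, is_even); // u8 -> bool
    let result = double_then_check(3); // 6 is even, so `result` is true
}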
Functions may be defined and applied. Function definition denotes how the function will map the input to the output while function application performs the mapping.
This will be expanded upon in later sections, but technically Turing completeness, the ability to express any computation a Turing machine can perform, is possible with only the following.
terms: objects with or without types
function definition: defining the map from one term to another
function application: performing the mapping from one term to another
// rust example
fn double(term: u8) -> u8 {
return term * 2;
}
fn do_twice(func: impl Fn(u8) -> u8, term: u8) -> u8 {
    return func(func(term));
}
fn main() {
let result = do_twice(double, 10);
}
-- haskell example
double x = x * 2
doTwice f x = f (f x)
result = doTwice double 2
Product Type
The product type is a compound structure consisting of two or more types. That is to say given two types A and B, a product type of A and B would internally contain both A and B. Some programming languages represent product types as "structs" or "tuples" where structs associate each internal type with an identifier while tuples associate each internal type with a numerical index. The terms associated with each internal type are often referred to as "fields".
The name "product type" relates to category theory, where a product type C is the direct product of types A and B.
// rust example
struct NamedProductType {
    integer_field: u8,
    boolean_field: bool,
}
type AnonymousProductType = (u8, bool);
-- haskell example
data NamedProductType =
NamedProductType { integerField :: Int, booleanField :: Bool }
data AnonymousProductType = AnonymousProductType (Int, Bool)
Sum Type
The sum type, also known as a tagged union, holds one of a set of types. That is to say, given two types A and B, a sum type of A and B would contain either type A or type B. Some programming languages represent sum types as "enums" or "unions". However, it is important to note that not every language's "enum" type is a sum type: in languages where the variants, or members, of the enum are simply names with no internal type, the enum is generally defined as an enumeration over integers or natural numbers.
The name "sum type" relates to category theory, where a sum type C is a categorical sum of types A and B.
// rust example
enum TaggedUnion {
IntegerVariant(u8),
BooleanVariant(bool),
}
enum Enumeration {
VariantA,
VariantB,
}
-- haskell example
data TaggedUnion = IntegerVariant Int | BooleanVariant Bool
data Enumeration = VariantA | VariantB
Polymorphism
Polymorphism applies a single interface to terms of different types. It is divided broadly into three forms. Note that some programming languages can contain one, two, or all three forms of polymorphism.
Ad Hoc Polymorphism
Ad hoc polymorphism refers to polymorphic functions whose behavior depends on the types of the arguments. This is often referred to as function overloading. In some languages, "operator overloading" is also enabled, such that arithmetic, bitwise, assignment, and logical operators can be overridden with custom functionality. Since operators are, in a sense, functions with syntactic sugar, function overloading and operator overloading are the same.
Overloading operators is particularly useful because the algebraic laws that apply to ordinary expressions also apply to more abstract objects. For example, the commutative law of addition applies to both integers and elliptic curve points. The commutative property of elliptic curve point arithmetic enables protocols like the Diffie-Hellman key exchange to exist; therefore, a programming language with operator overloading can allow elliptic curve points to be added with the familiar addition (`+`) operator.
// rust example
use std::ops::Add;

#[derive(Clone, Copy)]
struct EllipticCurvePoint {
    x: FieldElement,
    y: FieldElement,
}
impl EllipticCurvePoint {
pub fn generator_point() -> Self {
// ..
}
}
impl Add for EllipticCurvePoint {
    type Output = Self;

    fn add(self, rhs: Self) -> Self {
        // ..
    }
}
fn main() {
let g = EllipticCurvePoint::generator_point();
let g1 = g + g;
}
Parametric Polymorphism
Parametric polymorphism allows functions and data types to be written over generic type parameters, which are then instantiated with concrete types as needed. Languages with parametric polymorphism often have two different sets of arguments, one containing types and the other containing expressions.
When parametrically polymorphic code is compiled, some languages (Rust among them) perform a process called monomorphization. This transforms polymorphic code into monomorphic code by essentially creating a copy of each function and data type for each unique type parameter it is used with.
A useful way to consider parametric polymorphism is given a type `T<U>` where `T` is the type and `U` is a type parameter to `T`, "for every possible type `U`, there exists a distinct type `T<U>`".
Reread that last sentence; it is important and a prerequisite to later sections.
// rust example
struct PolymorphicStruct<T> {
polymorphic_field: T,
}
fn polymorphic_function<T>(argument: PolymorphicStruct<T>) -> T {
return argument.polymorphic_field;
}
fn main() {
let integer_struct: PolymorphicStruct<u8> = PolymorphicStruct {
polymorphic_field: 1,
};
let integer: u8 = polymorphic_function::<u8>(integer_struct);
let boolean_struct = PolymorphicStruct {
polymorphic_field: true,
};
let boolean: bool = polymorphic_function::<bool>(boolean_struct);
}
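Conceptually, monomorphization of the example above produces something like the following sketch (real compilers use mangled internal names, not these illustrative ones):
// rust example
// hypothetical output of monomorphizing PolymorphicStruct for T = u8 and T = bool
struct PolymorphicStructU8 {
    polymorphic_field: u8,
}
struct PolymorphicStructBool {
    polymorphic_field: bool,
}
fn polymorphic_function_u8(argument: PolymorphicStructU8) -> u8 {
    return argument.polymorphic_field;
}
fn polymorphic_function_bool(argument: PolymorphicStructBool) -> bool {
    return argument.polymorphic_field;
}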
Subtype Polymorphism
Subtype polymorphism, or subtyping, is a type system feature establishing relationships between supertypes and subtypes, where functions and data types written for a given supertype are also applicable to terms of each of its subtypes.
It is important to note that subtyping is not the same as inheritance from object-oriented languages: subtyping is a relationship between types or interfaces, whereas inheritance is a relationship between implementations. In some object-oriented languages, subtyping is called interface inheritance while inheritance is referred to as implementation inheritance.
// rust example
trait Supertype {
fn id(self: Self) -> Self;
}
fn identity<S: Supertype>(term: S) -> S {
return term.id();
}
struct IntegerSubtype(u8);
impl Supertype for IntegerSubtype {
fn id(self: Self) -> Self {
return IntegerSubtype(self.0);
}
}
struct BooleanSubtype(bool);
impl Supertype for BooleanSubtype {
fn id(self: Self) -> Self {
return BooleanSubtype(self.0);
}
}
fn main() {
let integer_term = IntegerSubtype(1);
let integer_term = identity(integer_term);
let boolean_term = BooleanSubtype(true);
let boolean_term = identity(boolean_term);
}
Dependent Type
Dependent types are types that depend on values as well as on other types. Parametric polymorphism covers part of the picture, since it lets types depend on other types, but a dependent type may depend on both other types and values.
A useful way to consider dependent types is given a type `T<U, n>` where `T` is the type, `U` is a type parameter, and `n` is a value parameter to `T`, "for every possible type `U` and every possible value `n`, there exists a distinct type `T<U, n>`". Reread that last sentence. Now read it again.
There are very few languages that have true dependent typing. Coq, Idris, Agda, Lean, and a few others contain dependent types. These are often considered too far into theory territory, though they carry unique benefits in terms of program correctness, compile-time formal verification, and theorem proving.
Rust has a rudimentary form with its "const generic" system, but this is not a fully realized dependent type.
In the examples below, the Rust code used is pseudocode, meant to mimic the syntax of Rust but with dependent types for the sake of demonstration. Notice the return type and how it contains a value expression. In the example, `T` is the internal type of a vector `Vec` and `n` is its length.
// pseudo-rust example
type Vec<T, n>; // ..
fn append<T, n, m>(
lhs: Vec<T, n>,
rhs: Vec<T, m>
) -> Vec<T, n + m> {
// ..
}
/- lean example -/
inductive Vect (α : Type u) : Nat → Type u where
| nil : Vect α 0
| cons : α → Vect α n → Vect α (n + 1)
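For comparison, here is what Rust's const generics can express on stable today (a sketch; `Buffer` and `zeroed` are illustrative names). The length is a value parameter, so `Buffer<4>` and `Buffer<8>` are distinct types, but arithmetic such as the `n + m` in the `append` signature above is not yet expressible on stable Rust.
// rust example
struct Buffer<const N: usize> {
    data: [u8; N],
}
fn zeroed<const N: usize>() -> Buffer<N> {
    Buffer { data: [0; N] }
}
fn main() {
    // Buffer<4> and Buffer<8> are distinct types
    let small: Buffer<4> = zeroed();
    let large: Buffer<8> = zeroed();
}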
Theorem Proving
This section is much less concrete, both because this is the topic of thousands of pages of academic papers and because my personal understanding of it is not fully developed. Instead, the objective is to present the value of such systems from a high level.
Theorem proving is a use case for type systems where theorems are proven at compile time. In some cases, the compiler does not even generate code; it simply proves a theorem, and its successful execution indicates a valid proof.
So why is this useful?
This is useful because the declaration of a type is a theorem, and the existence of a term with the declared type is a proof of that theorem, that is, a proof that the type is inhabited. Consider the following example, eloquently explained by John Wiegley.
If you declare, say, an `Integer` type, you are creating a theorem that says "there exists a value of type `Integer`." When you assign a value to that type, say `10`, if the program compiles, the number `10` is a proof that there exists an `Integer`!
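In Rust terms, a trivial sketch of the same idea:
// rust example
fn main() {
    let proof: u8 = 10; // the term `10` inhabits, and thereby proves, the type `u8`
}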
Combine this knowledge with parametric polymorphism and dependent types and you can prove properties of surprisingly complex systems. Revisiting the pseudo-Rust example with dependent types, we can describe it as a set of theorems and proofs.
Theorem: "There exists a distinct vector `Vec` for all types `T` and all lengths `n`"
Proof: `let vec: Vec<u8, 2> = Vec::from([1_u8, 2_u8]);`
Theorem: "There exists a function `append` that, for two distinct vectors `Vec` with type `T` and of lengths `n` and `m` respectively, constructs a new vector `Vec` of type `T` and of length `n + m`."
Proof: `let both: Vec<u8, 4> = append(vec, vec);`
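The same idea can be written directly in a language with theorem proving support. A minimal Lean sketch (the theorem name is illustrative):
/- lean example -/
-- the type `2 + 2 = 4` is the theorem; the term `rfl` is its proof
theorem two_plus_two : 2 + 2 = 4 := rfl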
As mentioned above, this is an incredibly deep topic and has more information to explore than could ever be condensed into a reasonably sized article, so we will leave this topic for now.
The Lambda Cube
    λω--------------λC
   /|              /|
  / |             / |
 /  |            /  |
λ2--+-----------λP2 |
|   |           |   |
|  λω_----------+--λPω_
|  /            |  /
| /             | /
|/              |/
λ->-------------λP
If you truly have shit to do, this is not strictly necessary, so you can close the article and be on your way. This is deep in theory territory and here there be lambdas.
The lambda cube, aka λ-cube, is a framework introduced by Henk Barendregt to explore the dimensions that map the simply typed lambda calculus to the calculus of constructions.
"wot"
- @tomscott (circa 2013)
In the diagram above, consider the bottom-left point on the cube, `λ->`, to be the simply typed lambda calculus, and the upper-right point on the cube, `λC`, to be the calculus of constructions.
In human words, the simply typed lambda calculus is nothing but terms and functions that operate on those terms (as mentioned in the function section, the untyped version of this is technically Turing complete). Meanwhile, the calculus of constructions is the generalization of the lambda calculus where terms and types are the same.
"what"
- @steveaustinBSR (circa 2001)
Before diving into each point on the cube, it may be useful to mention the Curry-Howard correspondence, which indicates a direct relationship between programs and proofs. As mentioned in the theorem proving section, this means there is a correspondence between terms and proofs, and between types and propositions. This is why the calculus of constructions is possible: terms and types are not merely related to one another, they are one another. Languages that are designed with this in mind offer proof assistants that can formally prove program correctness during compilation.
Back to the lambda cube. The x (`--`), y (`|`), and z (`/`) axes each correspond to a dimension of the type system.
x : corresponds to dependent types
y : corresponds to polymorphism
z : corresponds to type operators
—
`λ->` aka the Simply Typed Lambda Calculus introduces terms, the definition of functions that map terms to terms, and the application of functions to terms.
`λ2` aka System F introduces polymorphism, where terms can depend on types.
`λP` aka First-Order Predicate Calculus introduces dependent types, where types can depend on terms.
`λω_` aka System Fω_ introduces type constructors, where types can depend on other types.
`λω` aka System Fω is a composition of `λω_` and `λ2`, introducing polymorphic type constructors, where terms can depend on types and types can depend on types.
`λP2` aka Second-Order Predicate Calculus is a composition of `λP` and `λ2`, introducing polymorphic dependent types, where terms can depend on types and types can depend on terms.
`λPω_` aka Weak Higher-Order Predicate Calculus is a composition of `λω_` and `λP`, introducing types that can depend on terms and types that can depend on types.
`λC` aka the Calculus of Constructions is the composition of `λ2`, `λP`, and `λω_`, completing the lambda cube and introducing terms that depend on types, types that depend on terms, and types that depend on types.
—
Conclusion
This is where our type theory deep dive stops for now. In summary, we have explored simple types from primitives to functions, algebraic data types from product types to sum types, and higher-order types like polymorphic types and dependent types. We also began to scratch the surface of how these can be applied to things such as theorem proving and correctness checking. Finally, we took a look at the dimensionality of type systems, ranging from the simply typed lambda calculus to the Coq-like calculus of constructions.
The primary takeaway here is to consider, for any language we see, how the designers of that language structured its type system, what consequences that design has on the extensibility, correctness, and readability of the language, and how we can take full advantage of the tools provided through the type system to create robust, correct, and performant systems.
".. finally, our code can be perfect."
- @noboilerplate (circa 2022)
Until next time, good hacking 🤘