Algebraic data types
Now that we understand the essentials of the theory of type systems, the next big question is: where to? There are many aspects of type systems we could potentially explore, but so little time. I’ve picked three particular topics in type systems (algebraic data types, recursive types, and polymorphic types) that are both practically useful in day-to-day programming and embody a wide range of core concepts in type theory.
ADTs by example
The basic idea behind algebraic data types (ADTs) is to represent complex data structures in our type system, beyond just integers and functions. We want to represent relations between data, specifically the notions of “and” and “or”. We need types that represent a composite of A and B, or a choice of A or B.
Records
The former idea is quite common in programming languages. C has structs:
#include <stdio.h>

typedef struct {
  int x;
  int y;
} point_t;

int main() {
  point_t origin;
  origin.x = 0;
  origin.y = 0;
  printf("(%d, %d)", origin.x, origin.y);
  return 0;
}
Python has tuples:
origin = (0, 0)
print(origin[0], origin[1])
These are both instances of the same core idea: types that contain multiple components. If the components are anonymous, like in the Python example, we call them tuples1, and if the components have names, we call them records (or structs in the C case). Tuples are distinct from lists, as tuples have a fixed size. Records are distinct from dictionaries (aka maps, aka associative arrays) for the same reason. You have most certainly seen tuples or records before, so not much more detail is necessary. Here’s what they look like in OCaml:
type point = {
  x : int;
  y : int;
}

let origin: int * int = (0, 0) in
let (x, y) = origin in
Printf.printf "(%d, %d)" x y;

let origin: point = { x = 0; y = 0; } in
Printf.printf "(%d, %d)" origin.x origin.y
Two points of interest. First, in order to access the elements of the tuple, we destructured the tuple into its components with let (x, y) = origin. This, and the more general notion of pattern matching, are common features of functional programming languages (and sadly uncommon in other languages). Second, when constructing the point record, the record literal itself did not have to be annotated with the type name (e.g. point { x = 0; y = 0 }); the type is instead inferred.
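As an aside, pattern matching works on records too, not just tuples. Here’s a quick self-contained sketch (with made-up values) that destructures a point by field name, using OCaml’s field punning:

type point = { x : int; y : int }

let () =
  (* Bind the fields x and y directly, just like the tuple pattern. *)
  let { x; y } = { x = 1; y = 2 } in
  Printf.printf "(%d, %d)" x y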
Variants
This latter idea, of having a type that represents one of many choices, is far less common in traditional programming languages, but just as important as records. In OCaml, this is usually called a “variant” type. The simplest example is the option type (OCaml):
type option = Some of int | None

let div (m : int) (n : int) : option =
  if n = 0 then None
  else Some (m / n)

let k = div 5 0 in
match k with
| Some k' -> Printf.printf "Success: %d" k'
| None -> Printf.printf "Failure :("
A value of the option type can be one of two things: either a None, or a Some n for any integer n. In the div function, we use this to indicate the absence or presence of a result: if we divide by zero, then return None, else return Some (m / n).
That creates a value of the option type, but in order to use it, we have to be able to ask: is this result a Some, or a None? For that, we can use the match statement, which allows us to define a branch for each possible value of the result. Here, we say “if k is a Some, then get the integer result and print it, otherwise print failure.”
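As a small sketch of how this looks in everyday code (here I use OCaml’s built-in option type, and the helper name div_or_default is my own), we can collapse an option back into a plain integer by supplying a default. The compiler also checks that a match covers every constructor, so forgetting the None branch would produce a warning:

(* Same division function, but using the built-in option type. *)
let div (m : int) (n : int) : int option =
  if n = 0 then None
  else Some (m / n)

(* Collapse the option into an int by falling back to a default. *)
let div_or_default (m : int) (n : int) (default : int) : int =
  match div m n with
  | Some k -> k
  | None -> default

let () = Printf.printf "%d\n" (div_or_default 5 0 42)  (* prints 42 *)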
As an extended example, variants are amazingly useful for representing errors. Consider some alternative approaches to error handling. In C, programs traditionally use error codes:
#include <stdio.h>

#define ERROR_DIV_BY_ZERO -1

char* error_to_string(int err) {
  if (err == ERROR_DIV_BY_ZERO) {
    return "Div by zero";
  } else {
    return "<Unknown>";
  }
}

int div(int m, int n, int* result) {
  if (n == 0) {
    return ERROR_DIV_BY_ZERO;
  } else {
    *result = m / n;
    return 0;
  }
}

int main() {
  int result;
  int err = div(5, 0, &result);
  if (err < 0) {
    printf("Error: %s\n", error_to_string(err));
    return 1;
  }
  return 0;
}
This interface is horrendous.
- We have to pass a pointer to our result, only for it to possibly be filled in, and we could still ignore the error and use result’s uninitialized value.
- We have to maintain a gross morass of #defines and error-to-string conversions.
- We have to agree on a clear convention for error codes: is 0 a success or a failure? Does a value greater than 0, or less than 0, indicate success or failure?
Some modern languages still get this wrong2. Any language with a notion of null pointers (e.g. Java) often uses pointers as “implicit” option types—if the returned pointer is null, then the operation failed, otherwise it succeeded. By contrast, look at using our div function above:
let k = div 5 0 in
match k with
| Some k' -> Printf.printf "Success: %d" k'
| None -> Printf.printf "Failure :("
We are disallowed from using the result value unless it succeeded. We simply cannot accidentally mess up the error handling. Moreover, we could use a more expressive result type to have richer kinds of errors:
type error = DivByZero | ...
type result = Ok of int | Error of error

let error_to_string e = match e with
  | DivByZero -> "Div by zero"
  | ...

let div (m : int) (n : int) : result =
  if n = 0 then Error (DivByZero)
  else Ok (m / n)

let k = div 5 0 in
match k with
| Ok k' -> Printf.printf "Success: %d" k'
| Error e -> Printf.printf "Failure: %s" (error_to_string e)
We can reuse the variant mechanism not just to represent our results, but our errors too! One thing this makes clear: the power of a variant over the traditional notion of an “enum” (like you find in C) is that variant branches have fields. This enables rich, safe interfaces like the one above.
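To make that difference from C enums concrete, here’s a small sketch with illustrative names of my own (shape and area), where each branch carries different fields:

(* Each constructor carries its own data, unlike a C enum. *)
type shape =
  | Circle of float          (* radius *)
  | Rect of float * float    (* width, height *)

let area (s : shape) : float =
  match s with
  | Circle r -> 3.14159 *. r *. r
  | Rect (w, h) -> w *. h

let () = Printf.printf "%f\n" (area (Circle 1.0))

A C enum can only tell you which kind of shape you have; the variant also carries the data you need to actually compute with it.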
ADT theory
Hopefully that gives you an intuition for what ADTs are and how they work. Now, we’re going to explore a formalism that extends our simply typed lambda calculus from last week with ADTs.
Product types
First, we’re going to look at simplified records: pairs (or 2-uples). Before I provide you the solution, I want you to think for a moment—how would you implement pairs? What do you need to add to the syntax, statics, and dynamics?
As a general methodology for developing new language features, a given feature usually has two “forms” (syntax extensions): an introduction form, and an elimination form. Introductions create a thing, and eliminations destroy the thing, usually producing a value in the process. For example, functions are introduced by lambdas, and eliminated by function application.
For pairs, we’ll do the same thing. We need a form to create pairs, and a form to destroy pairs (i.e. access their elements):

$$e ::= \ldots \mid (e_1, e_2) \mid e.d \qquad\qquad d ::= L \mid R$$

We add a new “direction” syntax $d$ to specify which element of the pair we’re accessing. The operation of accessing an element of a pair is called a “projection.” For example, a function that adds two pairs of integers component-wise:

$$\lambda \, p : \texttt{int} \times \texttt{int} . \ \lambda \, q : \texttt{int} \times \texttt{int} . \ (p.L + q.L, \ p.R + q.R)$$
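If it helps to see this in a real language, the analogous OCaml function can be written with tuples and the built-in projections fst and snd (the name add_pairs is mine):

(* Add two pairs of integers component-wise. *)
let add_pairs (p : int * int) (q : int * int) : int * int =
  (fst p + fst q, snd p + snd q)

let () =
  let (x, y) = add_pairs (1, 2) (3, 4) in
  Printf.printf "(%d, %d)\n" x y  (* prints (4, 6) *)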
We also added a new type: $\tau ::= \ldots \mid \tau_1 \times \tau_2$. In the world of type theory, this is usually called a “product type.” The reasoning for the name will become more apparent when we discuss the “algebra” part of ADTs. In the meantime, we’ll need new static semantics:

$$\frac{\Gamma \vdash e_1 : \tau_1 \quad \Gamma \vdash e_2 : \tau_2}{\Gamma \vdash (e_1, e_2) : \tau_1 \times \tau_2}
\qquad
\frac{\Gamma \vdash e : \tau_1 \times \tau_2}{\Gamma \vdash e.L : \tau_1}
\qquad
\frac{\Gamma \vdash e : \tau_1 \times \tau_2}{\Gamma \vdash e.R : \tau_2}$$
Read them carefully to make sure you parse the syntax. The rules themselves should seem fairly commonsense: if a pair is composed of two expressions with types $\tau_1$ and $\tau_2$, then the pair type is $\tau_1 \times \tau_2$. If a pair has type $\tau_1 \times \tau_2$ and you’re taking out the left element, then the result is of type $\tau_1$, and vice versa for the right element. Lastly, the dynamic semantics:

$$\frac{e_1 \ \mathrm{val} \quad e_2 \ \mathrm{val}}{(e_1, e_2) \ \mathrm{val}}
\qquad
\frac{e_1 \mapsto e_1'}{(e_1, e_2) \mapsto (e_1', e_2)}
\qquad
\frac{e_1 \ \mathrm{val} \quad e_2 \mapsto e_2'}{(e_1, e_2) \mapsto (e_1, e_2')}$$

$$\frac{e \mapsto e'}{e.d \mapsto e'.d}
\qquad
\frac{e_1 \ \mathrm{val} \quad e_2 \ \mathrm{val}}{(e_1, e_2).L \mapsto e_1}
\qquad
\frac{e_1 \ \mathrm{val} \quad e_2 \ \mathrm{val}}{(e_1, e_2).R \mapsto e_2}$$
There are a good number of rules, but as usual, most of the clutter just comes from precise treatment of eager vs. lazy evaluation. Here, we specify that a pair is only a value if both of its components are values. The projection rules should be self-explanatory at this point.
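To make the dynamics concrete, here’s a sketch of a single-step evaluator for a tiny language with just integers and pairs. The encoding is my own (the constructor names are made up), but each branch corresponds to one of the rules above:

type expr =
  | Int of int
  | Pair of expr * expr
  | ProjL of expr
  | ProjR of expr

(* A pair is a value only if both of its components are values. *)
let rec is_val (e : expr) : bool =
  match e with
  | Int _ -> true
  | Pair (e1, e2) -> is_val e1 && is_val e2
  | _ -> false

(* One small step of evaluation; None means no rule applies. *)
let rec step (e : expr) : expr option =
  match e with
  (* Step the left component, then the right, until both are values. *)
  | Pair (e1, e2) when not (is_val e1) ->
    (match step e1 with Some e1' -> Some (Pair (e1', e2)) | None -> None)
  | Pair (e1, e2) when not (is_val e2) ->
    (match step e2 with Some e2' -> Some (Pair (e1, e2')) | None -> None)
  (* Project out of a fully-evaluated pair. *)
  | ProjL (Pair (v1, v2)) when is_val v1 && is_val v2 -> Some v1
  | ProjR (Pair (v1, v2)) when is_val v1 && is_val v2 -> Some v2
  (* Otherwise keep stepping underneath the projection. *)
  | ProjL e' -> (match step e' with Some e'' -> Some (ProjL e'') | None -> None)
  | ProjR e' -> (match step e' with Some e'' -> Some (ProjR e'') | None -> None)
  | _ -> None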
Sum types
As with products, we will present a simplified version of variants, usually called “sum types.” And as before, we will start by defining the syntax of the introduction and elimination forms, and then defining their semantics.

$$e ::= \ldots \mid \mathsf{inj}_d \ e \ \mathsf{as} \ \tau \mid \mathsf{case} \ e \ \{ L(x_1) \to e_1 \mid R(x_2) \to e_2 \} \qquad\qquad \tau ::= \ldots \mid \tau_1 + \tau_2$$

This syntax looks a little more intimidating! First, to introduce a value of sum type, we use an “injection”. For example, $\mathsf{inj}_L \ 3 \ \mathsf{as} \ \texttt{int} + (\texttt{int} \to \texttt{int})$ creates a value that could be an integer, or could be a function (but it’s actually an integer). Then we use a case statement, like a more limited match, to conditionally run code based on the value of a sum type. For example, a function that either returns a number or calls a function:

$$\lambda \, x : \texttt{int} + (\texttt{int} \to \texttt{int}) . \ \mathsf{case} \ x \ \{ L(n) \to n \mid R(f) \to f \ 0 \}$$
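The closest OCaml analogue of this example is a two-constructor variant (the names int_or_fn, Num, Fn, and get are mine):

(* A value that is either an integer or a function on integers. *)
type int_or_fn =
  | Num of int
  | Fn of (int -> int)

(* Either return the number directly, or call the function. *)
let get (v : int_or_fn) : int =
  match v with
  | Num n -> n
  | Fn f -> f 0

let () = Printf.printf "%d %d\n" (get (Num 3)) (get (Fn (fun x -> x + 7)))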
We need to add typing rules for our new syntax:

$$\frac{\Gamma \vdash e : \tau_1}{\Gamma \vdash \mathsf{inj}_L \ e \ \mathsf{as} \ \tau_1 + \tau_2 : \tau_1 + \tau_2} \ \text{(T-Inject-L)}
\qquad
\frac{\Gamma \vdash e : \tau_2}{\Gamma \vdash \mathsf{inj}_R \ e \ \mathsf{as} \ \tau_1 + \tau_2 : \tau_1 + \tau_2} \ \text{(T-Inject-R)}$$

$$\frac{\Gamma \vdash e : \tau_1 + \tau_2 \quad \Gamma, x_1 : \tau_1 \vdash e_1 : \tau \quad \Gamma, x_2 : \tau_2 \vdash e_2 : \tau}{\Gamma \vdash \mathsf{case} \ e \ \{ L(x_1) \to e_1 \mid R(x_2) \to e_2 \} : \tau} \ \text{(T-Case)}$$

The first two rules are fairly simple—if an injection says “this is a sum type $\tau_1 + \tau_2$”, then double-check the type of $e$ and roll with it, no further questions asked. Case statements are more tricky. First, we check that the expression we’re casing on is actually a sum type $\tau_1 + \tau_2$. Then we have two expressions with variable bindings, and type-checking rules reminiscent of the original rule for lambdas. For $e_1$, we have to check what type it returns assuming $x_1 : \tau_1$, getting some $\tau$. We then check the type of $e_2$ assuming $x_2 : \tau_2$. Importantly, the two expressions have to return the same type. This is implicitly written in the rule by using the same variable name $\tau$ for both judgments.
Lastly, the dynamic semantics:

$$\frac{e \mapsto e'}{\mathsf{inj}_d \ e \ \mathsf{as} \ \tau \mapsto \mathsf{inj}_d \ e' \ \mathsf{as} \ \tau} \ \text{(D-Inject-1)}
\qquad
\frac{e \ \mathrm{val}}{\mathsf{inj}_d \ e \ \mathsf{as} \ \tau \ \mathrm{val}} \ \text{(D-Inject-2)}$$

$$\frac{e \mapsto e'}{\mathsf{case} \ e \ \{ L(x_1) \to e_1 \mid R(x_2) \to e_2 \} \mapsto \mathsf{case} \ e' \ \{ L(x_1) \to e_1 \mid R(x_2) \to e_2 \}} \ \text{(D-Case-1)}$$

$$\frac{e \ \mathrm{val}}{\mathsf{case} \ (\mathsf{inj}_L \ e \ \mathsf{as} \ \tau) \ \{ L(x_1) \to e_1 \mid R(x_2) \to e_2 \} \mapsto [e / x_1] \, e_1} \ \text{(D-Case-2)}$$

$$\frac{e \ \mathrm{val}}{\mathsf{case} \ (\mathsf{inj}_R \ e \ \mathsf{as} \ \tau) \ \{ L(x_1) \to e_1 \mid R(x_2) \to e_2 \} \mapsto [e / x_2] \, e_2} \ \text{(D-Case-3)}$$
As an exercise, you should read through these yourself and see if you can understand the semantics. A goal of this course is to equip you with the necessary tools and methodologies to explore programming languages on your own, so this is a valuable lesson for improving your fluency in the language of PL.
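Once you’ve worked through the rules, here’s one way to transcribe them into code as a sanity check. This sketch is my own, and it cheats a bit: the branches of a case are represented as OCaml functions, so the substitution in D-Case-2 and D-Case-3 becomes plain function application:

type expr =
  | Int of int
  | InjL of expr
  | InjR of expr
  (* case scrutinee { L(x1) -> e1 | R(x2) -> e2 }, with each branch
     encoded as a function from the injected value to the branch body. *)
  | Case of expr * (expr -> expr) * (expr -> expr)

(* An injection is a value once its payload is a value (D-Inject-2). *)
let rec is_val (e : expr) : bool =
  match e with
  | Int _ -> true
  | InjL e' | InjR e' -> is_val e'
  | Case _ -> false

let rec step (e : expr) : expr option =
  match e with
  (* D-Inject-1: step underneath an injection. *)
  | InjL e' when not (is_val e') ->
    (match step e' with Some e'' -> Some (InjL e'') | None -> None)
  | InjR e' when not (is_val e') ->
    (match step e' with Some e'' -> Some (InjR e'') | None -> None)
  (* D-Case-2 / D-Case-3: pick the branch matching the injected value. *)
  | Case (InjL v, left, _) when is_val v -> Some (left v)
  | Case (InjR v, _, right) when is_val v -> Some (right v)
  (* D-Case-1: otherwise keep stepping the scrutinee. *)
  | Case (scrut, left, right) ->
    (match step scrut with
     | Some scrut' -> Some (Case (scrut', left, right))
     | None -> None)
  | _ -> None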
ADT metatheory
Now that we’ve changed our simply typed lambda calculus, we’re going to have to re-prove our type safety theorems. After all, we could have introduced potentially incorrect/breaking changes. However, one of the neat things about structural induction proofs is that they are naturally extensible. Rather than having to go through the entire language again, I can just prove the cases of the theorem for each new rule I provided. Below, I’ll walk through a proof of progress for sum types as an example.
Theorem: if $\vdash e : \tau$, then either $e \ \mathrm{val}$ or there exists an $e'$ such that $e \mapsto e'$.
Proof: by structural induction (extending from last time).
- T-Inject-L: if $\vdash \mathsf{inj}_L \ e \ \mathsf{as} \ \tau_1 + \tau_2 : \tau_1 + \tau_2$ then either $\mathsf{inj}_L \ e \ \mathsf{as} \ \tau_1 + \tau_2 \ \mathrm{val}$ or there exists an $e'$ such that $\mathsf{inj}_L \ e \ \mathsf{as} \ \tau_1 + \tau_2 \mapsto e'$.
  First, by the premises of T-Inject-L, $\vdash e : \tau_1$.
  Second, by the inductive hypothesis, either $e \ \mathrm{val}$ or $e \mapsto e'$.
  Third, casing on each possibility for $e$:
  - If $e \ \mathrm{val}$, then by D-Inject-2, $\mathsf{inj}_L \ e \ \mathsf{as} \ \tau_1 + \tau_2 \ \mathrm{val}$.
  - If $e \mapsto e'$, then by D-Inject-1, $\mathsf{inj}_L \ e \ \mathsf{as} \ \tau_1 + \tau_2 \mapsto \mathsf{inj}_L \ e' \ \mathsf{as} \ \tau_1 + \tau_2$.
  Hence the theorem holds for both cases.
- T-Inject-R: the proof is the same as for T-Inject-L, but with $\mathsf{inj}_R$ and $\tau_2$.
- T-Case: if $\vdash \mathsf{case} \ e \ \{ L(x_1) \to e_1 \mid R(x_2) \to e_2 \} : \tau$ then either $\mathsf{case} \ e \ \{ L(x_1) \to e_1 \mid R(x_2) \to e_2 \} \ \mathrm{val}$ or there exists an $e'$ such that $\mathsf{case} \ e \ \{ L(x_1) \to e_1 \mid R(x_2) \to e_2 \} \mapsto e'$.
  First, assume the premises of T-Case (listed above).
  Second, assume the inductive hypothesis for $e$: either $e \ \mathrm{val}$ or $e \mapsto e'$.
  Third, casing on each possibility for $e$:
  - If $e \ \mathrm{val}$, then by inversion of T-Inject-L and T-Inject-R, because $\vdash e : \tau_1 + \tau_2$, there are two possible values for $e$:
    - $e = \mathsf{inj}_L \ e'' \ \mathsf{as} \ \tau_1 + \tau_2$: then by D-Case-2, $\mathsf{case} \ e \ \{ L(x_1) \to e_1 \mid R(x_2) \to e_2 \} \mapsto [e'' / x_1] \, e_1$.
    - $e = \mathsf{inj}_R \ e'' \ \mathsf{as} \ \tau_1 + \tau_2$: then by D-Case-3, $\mathsf{case} \ e \ \{ L(x_1) \to e_1 \mid R(x_2) \to e_2 \} \mapsto [e'' / x_2] \, e_2$.
  - If $e \mapsto e'$, then by D-Case-1, $\mathsf{case} \ e \ \{ L(x_1) \to e_1 \mid R(x_2) \to e_2 \} \mapsto \mathsf{case} \ e' \ \{ L(x_1) \to e_1 \mid R(x_2) \to e_2 \}$.
  Hence, in each case, there exists an expression that the case expression steps to, and the progress theorem holds.
Algebra of ADTs
Product types and sum types are collectively called “algebraic” data types because they have algebraic properties similar to normal integers. It’s a neat little construction that might help you understand the relation between products/sums, although it probably won’t change much in the design of your programs.
Generally, you can understand the algebraic properties in terms of the number of values that inhabit a particular type. For example, the type $\texttt{bool}$ is inhabited by two values, true and false. We write this as $|\texttt{bool}| = 2$. To fully develop the algebra, we first need to add two concepts to our language:

$$\tau ::= \ldots \mid \texttt{void} \mid \texttt{unit} \qquad\qquad e ::= \ldots \mid ()$$
Above we introduce the void and unit types3. The idea behind these types is that there are 0 expressions that have the type $\texttt{void}$, and there is 1 expression that has the type $\texttt{unit}$ (the unit value $()$). With these in the language, now we have an identity for both of our data types:

$$\tau \times \texttt{unit} \cong \tau \qquad\qquad \tau + \texttt{void} \cong \tau$$
For example, if $\tau = \texttt{bool}$, then the terms that have the type $\texttt{bool} \times \texttt{unit}$ are $(\texttt{true}, ())$ and $(\texttt{false}, ())$. In an information-theoretic sense, adding the unit type to our pair provides no additional information about our data structure. The number of terms inhabiting the type remains the same. Similarly, for the sum case, the type $\texttt{bool} + \texttt{void}$ has two values: $\mathsf{inj}_L \ \texttt{true}$ and $\mathsf{inj}_L \ \texttt{false}$. There are no values of type $\texttt{void}$, so there are no other possible injections.
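You can see the same thing in OCaml (a sketch of mine; the empty-variant syntax needs OCaml 4.07+, and the names empty, all_values, and bool_plus_empty are made up):

(* The void type: a variant with no constructors, so it has no values. *)
type empty = |

(* bool * unit still has exactly two inhabitants. *)
let all_values : (bool * unit) list = [ (true, ()); (false, ()) ]

(* bool + void: the Right branch can never be constructed, so this type
   also has exactly two inhabitants, Left true and Left false. *)
type bool_plus_empty = Left of bool | Right of empty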
More generally, the sum and product type operators follow these basic rules:

$$|\tau_1 \times \tau_2| = |\tau_1| \cdot |\tau_2| \qquad\qquad |\tau_1 + \tau_2| = |\tau_1| + |\tau_2|$$
These rules then give rise to a number of algebraic properties. For example:

$$\tau_1 \times \tau_2 \cong \tau_2 \times \tau_1$$
One way to interpret this is as calculating information content. A pair of two numbers is logically the same as the same pair reversed. That is, there’s no real difference between { int x; int y; } and { int y; int x; }. For more on this, I recommend The algebra (and calculus!) of algebraic data types (it uses Haskell syntax, but the ideas are the same).
1. “Tuple” is a generalization of single, double, triple, quadruple, quintuple, etc. to $n$-uple, a composite of $n$ elements. Hence, tuple. ↩
2. It’s always a strong statement to say something is categorically “wrong” in a programming language, but I feel pretty strongly about this one. ↩
3. While the void type does not exist in OCaml, the unit type is used frequently to represent side effects. For example, Printf.printf returns a unit type. ↩