Traits
As I mentioned last time, there are really only two features that truly distinguish Rust as a programming language: the borrow checker, and traits. Every other part of the language you should be able to map onto a feature you already know in either OCaml or C. Including: primitive types (integers, booleans, strings), lexically scoped variables, mutation, functions, nested functions, product types, sum types, match expressions, polymorphic types, and namespaces. Rust has a number of smaller features like syntatic sugar for iterators, but the list provided comprises Rust’s core feature set. We talked last time about the borrow checker, and how it allows us to do compile-time memory management by careful restrictions on aliasing and mutation through ownership/borrowing. Today, we’re going to focus on a different issue: partial parametric polymorphism.
Partial parametric polymorphism
Recall from last time that both OCaml and Rust support what’s called parametric polymorphism, where functions and types can be parameterized by an arbitrary input type.
let reverse_pair (x, y) = (y, x)
(* reverse_pair : 'a * 'b -> 'b * 'a *)
type 'a list = Nil | Cons of 'a * 'a list
fn reverse_pair<A, B>((x, y): (A, B)) -> (B, A) {
(y, x)
}
enum List<T> { Nil, Cons(T, Box<List<T>>) }
However, a common issue with parametric polymorphism is that you would like to write generic functions assuming certain capabilities of a type. For example, if you want to write a list to string method, then you need to assume some way of turning list elements into strings. In OCaml, this is done with higher-order functions:
let rec to_string (f : 'a -> string) (l : 'a list) : string =
match l with Nil -> "" | Cons(x, l') -> (f x) ^ (to_string f l')
More broadly, OCaml uses a programming style where the programmer is expected to explicitly link types with functions. For example, as you saw in the modules lecture, a list library would look like:
module type List = sig
type 'a list = Nil | Cons of 'a * 'a list
val len : 'a list -> int
val to_string : ('a -> string) -> 'a list -> string
end
module ListTest(L : List) = struct
let l : int L.list = L.Cons(2, L.Nil) in
Printf.printf "%s" (L.to_string int_to_string l)
end
Note how invoking code related to lists requires explicitly naming the List module. By contrast, Rust uses a style more similar in some respects to object-oriented programming languages, where functions are associated with types, and are called using the value.method()
notation.
// T is the most canonical type variable in Rust, not A
enum List<T> { Nil, Cons(T, Box<List<T>>) }
impl<T> List<T> {
fn len(&self) -> i32 {
match self {
List::Nil => 0,
List::Cons(_, l) => 1 + l.len()
}
}
}
fn main() {
let l: List<i32> = List::Cons(3, Box::new(List::Nil));
assert_eq!(l.len(), 1);
}
Here, the impl
block means “associate this list of functions with the given type”. Functions in an impl block (almost) always have self
as the first parameter, which refers to the thing on the left side of the period. Above, calling l.len()
invokes the len
function with self = l
.
Stepping back for a second, you probably have a lot of preconceptions about the dot operator from object oriented languages (Java, Python, …). For example, you might think that a class instance is like a struct, where accessing a method is like reading a field from a struct. I want you to toss those ideas in the garbage and think about it this way: the dot operator defines type-directed name resolution. When I say l.len()
, then len
is a function that exists somewhere, and we resolve which function it is based on the type of l
. Because l
is a List<T>
, then we know we can apply the len
function from the impl<T> List<T>
block.
Keep this in mind as we return to the question: how do we implement the list to string operation? As the title of the lecture implies, we’re going to use traits. At their core, traits are interfaces, just like you may have seen in Java. We can implement traits for types, and write functions generic over traits.
// ToString interface defines function signatures. You can read this as:
// "to_string takes as input the thing implementing the trait and returns a String."
trait ToString {
fn to_string(&self) -> String;
}
// We can specify implementations of traits using `impl Trait for Type` syntax.
// Here, we make a simple one for integers using the format!() routine in the
// standard library.
impl ToString for i32 {
fn to_string(&self) -> String { format!("{}", self) }
}
// Like the normal impl blocks before, traits associate functions with types.
// Below, I can use n.to_string() because `to_string` now is associatd with i32.
fn main() {
let n: i32 = 3;
println!("{}", n.to_string());
}
Traits define interfaces, and we can implement those interfaces for types similarly to the normal impl blocks. However, the important thing they permit is functions defined over types that implement a trait. For example:
fn to_string_parens<T>(t: T) where T: ToString -> String {
format!("({})", t.to_string())
}
fn main() {
assert_eq!(to_string_parens(3), "(3)");
}
This is an example of partial parametric polymorphism. Our to_string_parens
function works over all types T
that implement a particular trait. In contrast with our OCaml example, we no longer need an explicit function f : 'a -> string
passed in as an argument. Instead, we have an external system for associating functions with types Then define our generic function only over types that are associated with certain functions. This is particularly powerful when dealing with containers. Returning to the list to string example, we can implement that function in a few ways:
// Option 1: standalone generic function
fn list_stringify<T>(l: &List<T>) -> String where T: ToString {
match l {
List::Nil => String::from(""),
List::Cons(x, l) => x.to_string() + &list_stringify(l)
}
}
// Option 2: generic function in impl block
impl<T> List<T> where T: ToString {
fn stringify(&self) -> String {
match self {
List::Nil => String::from(""),
List::Cons(x, l) => x.to_string() + &l.stringify()
}
}
}
// Option 3: implement trait for container
impl<T> ToString for List<T> where T: ToString {
fn to_string(&self) -> String {
match self {
List::Nil => String::from(""),
List::Cons(x, l) => x.to_string() + &l.to_string()
}
}
}
// Note the different syntaxes for calling our new functions.
fn main() {
let l: List<i32> = List::Cons(3, Box::new(List::Nil));
assert_eq!(list_stringify(&l), l.stringify());
assert_eq!(list_stringify(&l), l.to_string());
}
Let’s take a moment to reflect on the benefits of this trait-based approach.
- Type-directed name resolution (with
impl
blocks) obviates the need for the programmer to know where a particular function comes from. For example, if I ever calllist.to_string()
, I just need to know that the elements of the list have theirto_string
method defined somewhere, and the compiler can figure out the rest. - Partially polymorphic functions (with traits and
where
clauses) enable the programmer to concisely define required functionality for input types. A trait provides a convenient identifier for a bundle of functions, which means we don’t need to reproduce the type signature of required function each time in the generic function, e.g. as we had to do in OCaml.
On a theoretical note, traits were originally invented as a mechanism for avoiding problems with inheritance (particularly multiple inheritance) in object-oriented languages. See “Traits: A Mechanism for Fine-grained Reuse”. There are clear links between traits and existential types. For example, the Rust function:
fn foo<T: ToString>(t: T) -> String { t.to_string() }
In the lambda calculus, we could emulate this behavior with the function:
Where the function would be called with a module containing the element element t
along with its corresponding methods.
Comparison with C++ templates
In the Parametric types lecture, I mentioned how C++ templates are a form of “unchecked” polymorphism. This was valid:
#include <iostream>
template<typename T>
void print(T t) {
t.print();
}
class Foo {
public:
void print() { std::cout << "Foo"; }
};
int main() {
print(Foo());
}
But then using print
with a class that didn’t have the print()
method caused a type error in print
to occur:
class Bar {};
int main() {
print(Bar());
}
Let’s see what happens if we do the same thing in Rust.
fn print<T>(t: T) {
t.print();
}
struct Foo;
impl Foo { fn print(&self) { println!("Foo"); } }
fn main() {
print(Foo {});
}
This fails to compile with the error:
error[E0599]: no method named `print` found for type `T` in the current scope
--> test.rs:2:5
|
2 | t.print();
| ^^^^^
Aha! In Rust, polymorphic functions are fully type-checked when they are declared, not when they are used. This means you can never call a Rust function and get a type error within the function because you gave it the wrong type. The solution here would be to define a Print
trait and define print<T>(t: T) where T: Print
.
Trait composition
Another important part of traits is how they compose. For example, let’s say I want to write a generic function to return a copy of all elements in a list less than a target element.
fn lt_filter<T>(l: &List<T>, target: &T) -> List<T> where T: PartialOrd + Clone {
match l {
List::Cons(x, l) => {
let rest = lt_filter(l, target);
if x < target { List::Cons(x.clone(), Box::new(rest)) }
else { rest }
}
List::Nil => List::Nil
}
}
This function is defined over list element types that implement PartialOrd
(i.e. the <
operator) and Clone
(i.e. the .clone()
function). The syntax T: PartialOrd + Clone
means T
is required to implement both traits. This is the basis of trait composition: you can establish requirements on arbitrary sets of traits. This may not seem earth-shattering, but it’s particularly meaningful in contrast to similar patterns in object-oriented programming (discussed later).
Trait polymorphism
Sometimes it’s useful for an interface itself to be polymorphic over an input type. For example, consider the general case of converting from one type to another. Not all conversions are well defined (e.g. what does it mean to turn a number into a list?), but for conversions that are well defined, we want a common interface for all of them. This is the From
trait:
trait From<T> {
fn from(T) -> Self;
}
Here, From<T>
means the trait itself is polymorphic over some type T
. A From<T>
interface defines a function that takes as input a T
, and outputs Self
, a special keyword referring to the type implementing the trait. For example:
enum MyBool { True, False }
impl From<bool> for MyBool {
fn from(b: bool) -> MyBool {
if b { MyBool::True } else { MyBool::False }
}
}
fn main() {
let b = MyBool::from(true);
}
Given this definition, we can define a reciprocal trait Into
:
trait Into<T> {
fn into(self) -> T;
}
impl<S, T> Into<T> for S where T: From<S> {
fn into(self) -> T { T::from(self) }
}
fn main() {
let b: MyBool = true.into();
}
Read the impl
block very carefully to understand what’s going on. It says: given some types S
and T
, we can implement Into<T>
for S
when T
implements From<S>
. The definition of into
is essentially just a call to T::from
. Note that this impl block is defined over all such types S
. Previously, we implemented traits for a specific type, but now we’re implementing functions for all types! This should give you a sense of how powerful Rust’s trait system can be.
Now take a second look at main
. What’s happening? Remember that the dot operator is type-directed name resolution. When we say true.into()
, that does a search for all the functions into(bool) -> MyBool
that could be associated with the type bool
. (Note: the fact that into
is supposed to return MyBool
is inferred from the fact that b
is supposed to be MyBool
.) The Into
trait could possibly implement into
, if MyBool: From<bool>
. Since we implemented that trait earlier, the name resolution engine can deduce that .into()
refers to the Into
trait.
Note: this process of resolving names with complex trait implementations relies on an idea called logic programming. This is a huge and amazing area of PL we won’t cover in this course, but you should totally read up on. Here’s my notes from last year, and here’s a blog post describing how Rust traits become logic programs.
The last thing I want to point out is how we can make neat interfaces using these polymorphic traits. For example:
fn print_two<S, T>(s: S, t: T) where S: Into<String>, T: Into<String> {
let s: String = s.into();
let t: String = t.into();
println!("{}, {}", s, t);
}
fn main() {
print_two(1, "hello");
print_two(true, false);
}
It’s… type-safe… dynamic typing?
Associated types
The last main feature of types to discuss is associated types. Let’s start with an example. In Rust, a popular trait is the Iterator
, which represents a stream of elements. Concretely, the iterator trait looks like:
trait Iterator {
type Item;
fn next(&mut self) -> Option<Self::Item>;
}
This trait defines an interface over a function next
, that takes a mutable reference to the type implementing the trait, and returns an option of Self::Item
. Item
refers to a type associated with the Iterator
trait which is specified by the implementor of the trait. For example, we can implement an iterator for our list:
impl<T> Iterator for &List<T> where T: Clone {
type Item = T;
fn next(&mut self) -> Option<T> {
match self {
List::Nil => None,
List::Cons(x, l) => {
*self = l;
Some(x.clone())
}
}
}
}
fn main() {
let l: List<i32> =
List::Cons(1, Box::new(List::Cons(3, Box::new(List::Nil))));
let mut iter = &l;
assert_eq!(iter.next(), Some(1));
assert_eq!(iter.next(), Some(3));
assert_eq!(iter.next(), None);
}
Here, we’re treating an immutable pointer to a list (&List<T>
) as an iterator that produces elements of type T
. Calling next on this pointer moves the pointer to the next element in the list, and returns a copy of the element it just saw (relying on T: Clone
).
If you’re wondering: isn’t associated types similar to polymorphic traits? Read the Advanced traits chapter of the Rust book to understand the distinction.
That’s it! That encompasses (most) of the core features of traits. To recap:
- Traits define interfaces, i.e. sets of function signatures. An interface can be polymorphic in a type (e.g.
From<T>
), and can have associated types (e.g.Iterator::Item
). impl Trait for T
blocks implement traits for types, potentially under conditions defined by awhere
clause.- Polymorphic functions can limit their type parameters using trait bounds, e.g.
fn foo<T>() where T: A + B
. - To use a function defined in a trait, the compiler applies type-directed name resolution with the dot operator to find implementations of requested functions, e.g.
true.into()
.
Comparison with OOP
While I’ve tried to give you a little intuition for why traits are designed as they are, it’s hard to teach you in a single lecture all the implications of traits. The material above covers the mechanisms of traits moreso than the design patterns. That, really, is the key part of traits—they fundamentally change the way you structure your code and think about modular, generic programming.
To accept traits into your heart, you really just have to program with them for a while, either in Rust or in languages with equivalent features (namely Haskell, and somewhat Scala). It’s particularly hard to convey their design patterns because traits are fairly new (formulated in 2003, as opposed to object-oriented programming which dates back to 1967 with Simula), and there is not much institutional knowledge about trait patterns in the same way OOP has. Still, I’ll try to highlight a few ways Rust traits relate to the canonical OOP language, Java.
Encapsulation
Both Rust and Java have a notion of private versus public methods and fields.
// Java
public class Animal {
private string name;
public int age;
public Animal(string name_, int age_) { name = name_; age = age_; }
public string getName() { return name; }
public void setName(string name_) { name = name_; }
}
Animal animal = new Animal("Sparky", 3);
animal.setName("Fido");
animal.name = "Foobar"; // Invalid!
// Rust
struct Animal { name: String, pub age: i32 }
impl Animal {
pub fn new(name: String, age: i32) -> Animal { Animal { name, age } }
pub fn get_name(&self) -> String { self.name.clone() }
pub fn set_name(&mut self, name: String) { self.name = name; }
}
let mut animal = Animal::new("Sparky".to_string(), 3);
animal.set_name("Fido".to_string());
animal.name = "Foobar".to_string(); // Invalid!
In Java, subclasses (e.g. Dog
of Animal
) can access protected
fields. Rust has no notion of inheritance, so it does not have the same notion.
Polymorphism
In OOP languages, classes are the means to specify shared functionality between different kinds of things. Writing polymorphic functions usually means casting subclasses to superclasses. For example:
// Java
public class Dog extends Animal { ... }
public class Cat extends Animal { ... }
string decorateName(Animal animal) {
return "Animal name: " + animal.name();
}
void printAnimals(List<Animal> animals) {
for (Animal animal : animals) {
System.out.println(decorateName(animal));
}
}
Dog dog = new Dog(...);
Cat cat = new Cat(...);
List<Animal> animals = new ArrayList((Animal) dog, (Animal) cat);
printAnimals(animals);
In Rust, traits are used for the same purpose.
// Rust
trait Animal { fn name(&self) -> String; }
impl Animal for Box<Animal> { fn name(&self) -> String { (**self).name() } }
struct Cat { name: String, ... }
struct Dog { name: String, ... }
impl Animal for Cat { fn name(&self) -> String { self.name.clone() } }
impl Animal for Dog { fn name(&self) -> String { self.name.clone() } }
fn decorate_name<T>(animal: &T) -> String where T: Animal {
format!("Animal name: {}", animal.name())
}
fn print_animals(animals: &Vec<Box<Animal>>) {
for animal in animals.iter() {
println!("{}", decorate_name(&*animal));
}
}
let dog = Dog::new(...);
let cat = Cat::new(...);
let animals: Vec<Box<Animal>> = vec![Box::new(dog), Box::new(cat)];
print_animals(&animals);
Note: in order to have a heterogenous collection of elements, i.e. a list containing objects of many different concrete types that all implement the same trait, we have to use trait objects. The
impl Animal for Box<Animal>
line ensures that passng a boxed trait (fromprint_animals
) into an unboxed trait (decorate_name
) works correctly.
The challenge for OOP (and benefit of traits) arises when considering extensibility. Imagine that the Cat/Dog/Animal classes were developed in a separate module by another programmer, and we want to extend them with another method. In Rust, this is simple:
use animal::{Animal, Cat, Dog};
trait Noise { fn noise(&self) -> String; }
impl Noise for Cat { fn noise(&self) -> String { "meow".to_string() } }
impl Noise for Dog { fn noise(&self) -> String { "woof".to_string() } }
fn name_and_noise<T>(animal: T) -> String where T: Animal + Noise {
format!("{}: {}", animal.name(), animal.noise())
}
let dog = Dog::new(...);
println!("{}", name_and_noise(dog));
In Java, solving this issue would require changing the interface in the base class (Animal
), which means modifying an upstream dependency.