CS 242: Typestate

Throughout the last several lectures, we have seen ways of taking best practices in systems programming and encoding them in Rust’s type system. For example:

Memory safety issues are caused by simultaneous mutation and aliasing. Rust’s ownership model codifies best practice by categorically preventing a value from being changed while two or more variables point to it. This avoids dangling pointers, iterator invalidation, and many other such bugs.
Improper use of the system memory allocator causes issues. In Rust, determining the size of allocation is handled by the type system (see mem::size_of). The ownership system determines when to insert calls tofree(). Rust automatically derives destructors for composite data types, ensuring the user cannot forget to free memory, or free in the wrong order.
The use of destructors avoids many common pitfalls of stateful objects. In the Python C API, reference counts to pointers must be manually incremented and decremented, which can be done incorrectly in either direction. Rust’s Rc always decrements the reference count on destruction. Similarly, MutexGuards ensure that the programmer won’t forget to call pthread_mutex_unlock.
Polymorphic data types protect against invalid access to data. Option types prevent accessing the result of a computation that might error, avoiding the need for null values. Mutex types ensure that a concurrently accessed value must be locked before being accessed—you can’t accidentally forget to take the lock.
Marker traits tag types with their capabilities, ensuring e.g. that one thread can’t Send a thread-unsafe type (like a non-atomic reference counted pointer) to another thread.

These are all excellent examples of type-safe systems programming, but to me they feel qualitatively small-scale. Yes, we can avoid individual memory issues or null pointer checks, but what about higher-level invariants of our program?

State machines

Specifically, we’re going to focus on state machines. Many applications have logical structure that can be represented as a finite state machine, i.e. an object that takes action to transition through a series of states. For example, consider a basic API for reading a file. The state machine looks something like:

Ignoring for a moment all the failure cases, the happy path of a state machine looks like this. After opening a file, we can read it until reaching the end of the file, entering an EOF state. In either the read or EOF states, we can choose to close the file. As you’ve seen in a class like CS 110, you can interact with a state machine as follows.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>

int main() {
  int fd = open("test.txt", 0);
  if (fd == -1) {
    printf("Error opening file: %s\n", strerror(errno));
    exit(1);
  }

  char buf[1024];
  int bytes = 128;
  int bytes_read = read(fd, buf, bytes);
  if (bytes_read == -1) {
    printf("Error reading file: %s\n", strerror(errno));
    exit(1);
  } else if (bytes == bytes_read) {
    printf("Read 128 bytes\n");
  } else {
    printf("Read %d bytes, reaching EOF\n", bytes_read);
  }

  int err = close(fd);
  if (err == -1) {
    printf("Error closing file: %s\n", strerror(errno));
  }
}

This is an example of high-quality C that carefully adheres to every convention of the file API. But let’s think of different mistakes we could make using this API. Specifically, think in terms of the state machine. What transitions can we make, or what states can we enter that aren’t permitted from the graph above?

Starting in a non-start state: let’s say we invent a file descriptor, e.g. int fd = 5. The API allows me to call read(fd, ...) because it cannot distinguish between a descriptor generated by the API, versus one I invented myself.
Attempting to take a valid transition on an invalid operation: if open fails, then we should check fd == -1 and step to an error state. But we can forget to check this condition, and attempt to read(fd) as if we were in the read state.
Taking an incorrect transition: if we don’t check that bytes == bytes_read, we can’t distinguish between whether a read has entered an EOF state or a read state. So we could potentially transition incorrectly to a read or EOF state after calling read().
Stopping before an accept state: we can forget to close the file descriptor, meaning exiting the middle of the state machine before entering an accept state. This bug is particularly common in cases of early return, e.g. if read fails then we potentially need to close the file descriptor before exiting.
Reusing old states: once we’ve close the file descriptor (or hit EOF), we can still attempt to read from fd as if it was still in a read state.

At a high-level, the issue is that the state machine invariants of the file API are not encoded in the C type system. At best, they are enforced as runtime checks with error codes. In this lecture, we’ll see how each mistake can be avoided with careful type-level design in Rust, a design pattern called “typestate”.

Typestate in Rust

Let’s start with a simplified example. Consider the following state machine of three states:

The main idea is that we can represent each state as a different type, and we restrict which operations are available on each type to enforce the state machine transitions. The corresponding Rust code for the above state machine:

mod statemachine {
  pub struct A { _secret: () }
  pub struct B { _secret: () }
  pub struct C { _secret: () }

  impl A {
    pub fn new() -> A { A { _secret: () } }
    pub fn b(self) -> B { B { _secret: () } }
  }

  impl B {
    pub fn b(self) -> B { B { _secret: () } }
    pub fn c(self) -> C { C { _secret: () } }
  }

  impl C {
    pub fn a(self) -> A { A { _secret: () } }
  }
}

use statemachine::*;

fn main(){
  let a = A::new();
  let b = a.b();
  let b = b.b();
  let c = b.c();
  let a = c.a();
}

The state machine is embedded inside a mod block so Rust treats the main function and A/B/C types as being in separate modules with distinct privacy scopes. Each state is a struct with no fields, and each struct has a set of methods implemented for it. Each method describes a transition on the state machine. For example, A::new creates a value A from nothing, and A::b takes an A and converts it to a B. The self transition B::b takes a B and returns another B. The main function shows an example of transitioning through the graph.

Let’s consider how this API relates to some of the classes of state machine errors above:

Starting in a non-start state: because each struct is pub but its field _secret is private, this means a client outside the module cannot construct a state directly. For example:
```
fn main(){
  let a = A { _secret: () };
}
```
This program fails to compile with the error:
```
error[E0451]: field `_secret` of struct `statemachine::A` is private
  --> test.rs:24:15
   |
24 |   let a = A { _secret: () };
   |               ^^^^^^^^^^^ field `_secret` is private
```
We can only instantiate structs through provided constructor methods, e.g. A::new(). Because B and C do not have constructors, we enforce that their states can only be entered through A.
Taking an incorrect transition: if we have a value of type A, it is impossible for us to get a value of type C without having gone through B. A does not expose a method that returns C, only B does.
Reusing old states: notice that each method takes as input self, meaning it consumes ownership of the state. This ensures that after transitioning out of a state, it cannot be used again. For example:
```
fn main(){
  let a = A::new();
  let b = a.b();
  a.b();
}
```
This program fails to compile with the error:
```
error[E0382]: use of moved value: `a`
--> test.rs:26:3
25 |   let b = a.b();
 |           - value moved here
26 |   a.b();
 |   ^ value used here after move
```

File API in Rust

Now, let’s apply the typestate pattern to the file API. Specifically, we are going to wrap the C API in Rust to show how to encode the file state machine in Rust types.

Calling C from Rust

First, let’s take a look at how to call C functions from Rust. The Rust libc crate includes many C standard library functions, marked as unsafe due to raw pointer usage. Here’s a code snippet that reads a file completely into a buffer, but does not check any error codes.

use std::ffi::CString;
use libc;
use std::str;

fn read_all() {
  unsafe {
    // Convert string into a char*
    let path = CString::new("test.txt").unwrap();
    let fd = libc::open(path.as_ptr(), 0);

    let mut buf: Vec<u8> = Vec::new();;
    loop {
      // Ensure buffer has enough capacity for next 128 bytes
      buf.reserve(128);

      // Read the bytes into the vector
      let count = libc::read(fd, buf.as_mut_ptr() as *mut libc::c_void, 128);

      // Manually set the vector's length
      buf.set_len(buf.len() + count as usize);

      // Check for EOF
      if count < 128 {
        break;
      }
    }

    // Print final string
    println!("{}", str::from_utf8(&buf).unwrap());
  }
}

Basic typestate

Let’s turn this into a proper API. First, we need a type to represent an open file.

pub struct File {
  fd: libc::c_int,
  buf: Vec<u8>,
}

Then we can associate an open constructor with the File type through an impl block:

impl File {
  pub fn open(path: String) -> Result<File, String> {
    unsafe {
      let fd = libc::open(CString::new(path).unwrap().as_ptr(), 0);
      if fd == -1 {
        /* assume errno_string() reads errno and converts it
         * into a Rust string through strerror() */
        Err(errno_string())
      } else {
        Ok(File { fd, buf: Vec::with_capacity(1024) })
      }
    }
  }
}

As we’ve seen before with sum types, the use of Result allows us to represent the failure case. Specifically, if File::open fails, the client gets a Result::Err. The client does not get a File struct upon failure, meaning they cannot accidentally use an invalid file.

With a valid File struct in hand, now we want to implement a read method that reads bytes from the file descriptor. Applying the typestate pattern, we will consume ownership of the file, and return one of three possible outcomes.

If the file has not reached EOF, then return back the File.
If the file has reached EOF, return a new state FileEof.
If the read caused an error, then return an Error.

First, we need a new FileEof struct that looks the same as File, but is a distinct type:

pub struct FileEof {
  fd: libc::c_int,
  buf: Vec<u8>,
  closed: bool
}

Then we can implement read as:

pub enum ReadResult { File(File), FileEof(FileEof), Error(String) }

impl File {
  // Read takes ownership of the file, indicated by self without & or &mut
  pub fn read(mut self, bytes: usize) -> ReadResult {
    // Ensure buf has enough space
    self.buf.reserve(bytes);
    unsafe {
      // Read the file
      let bytes_read = libc::read(
        self.fd, self.buf.as_mut_ptr() as *mut libc::c_void, bytes);

      // If the read failed, then immediately return an error
      if bytes_read == -1 {
        return ReadResult::Error(errno_string());
      }

      // Increase length of the vector for the leements copied
      self.buf.set_len(self.buf.len() + bytes_read as usize);

      if (bytes_read as usize) < bytes {
        // Return EOF if we've reached the end of the file.
        // Copy all the fields of self into a new struct.
        ReadResult::FileEof(FileEof {
          fd: self.fd,
          buf: self.buf
        })
      } else {
        // If we haven't reached EOF, then return back ownership
        // of self.
        ReadResult::File(self)
      }
    }
  }
}

Finally, we need a way to close a file. This should consume ownership of the file, but return an error if one occurs.

impl File {
  pub fn close(mut self) -> Result<(), String> {
    unsafe {
      // If the close fails, return the error
      if libc::close(self.fd) == -1 {
        Err(errno_string())
      } else {
        Ok(())
      }
    }
  }
}

All together, we can use this implementation like so:

fn main() {
  let f = File::open("test.txt".into()).unwrap();
  // read returns a enum, so we need to match on it
  match f.read(10) {
    ReadResult::File(f) => {
      // assume f.contents() returns f.buf as a string
      println!("{}", f.contents());
      f.close().unwrap();
    },
    _ => {}
  }
}

One issue with this implementation is that some functionality should be shared across multiple states. For example, close should work the same way on File and FileEof. But currently, we would have to duplicate close across impl File and impl FileEof. To address this, rather than representing each unique state as a different struct, we can have a single polymorphic wrapper type File<S> where each S is a different type.

use std::marker::PhantomData;

// Each struct is a separate state
pub struct Reading;
pub struct Eof;

// File is parameterized by the current state
pub struct File<S> {
  fd: libc::c_int,
  buf: Vec<u8>,
  // Rust complains if a struct has a type parameter that is unused in the
  // definition, so this is a hack that adds a zero-size "phantom" field
  // which references the unused type S.
  _marker: PhantomData<S>
}

// Now we return File<Reading> and File<Eof> instead of File and FileEof
pub enum ReadResult { File(File<Reading>), FileEof(File<Eof>), Error(String) }

// Close is implemented on all states
impl<S> File<S> {
  pub fn close(mut self) -> Result<(), String> {
    // same as before
  }
}

// impl blocks allow us to only implement methods for certain types
// this ensures that only a file in the Reading state can be read
impl File<Reading> {
  pub fn open(path: String) -> Result<File<Reading>, String> {
    unsafe {
      let fd = libc::open(CString::new(path).unwrap().as_ptr(), 0);
      if fd == -1 {
        Err(errno_string())
      } else {
        Ok(File {
          fd,
          buf: Vec::with_capacity(1024),
          // Add PhantomData value for _marker
          _marker: PhantomData
        })
      }
    }
  }

  pub fn read(mut self, bytes: usize) -> ReadResult {
    self.buf.reserve(bytes);
    unsafe {
      let bytes_read = libc::read(
        self.fd, self.buf.as_mut_ptr() as *mut libc::c_void,
        bytes);
      if bytes_read == -1 {
        return ReadResult::Error(errno_string());
      }

      self.buf.set_len(self.buf.len() + bytes_read as usize);

      if (bytes_read as usize) < bytes {
        // We return a File<Eof> instead of a FileEof
        ReadResult::FileEof(File {
          buf: self.buf,
          fd: self.fd,
          _marker: PhantomData,
        })
      } else {
        ReadResult::File(self)
      }
    }
  }
}

Now, we can start to define higher-level API constructs. For example, a method that takes a file and reads it until EOF:

impl File<Reading> {
  pub fn read_all(mut self) -> Result<String, String> {
    loop {
      match self.read(1024) {
        ReadResult::File(f) => {
          // If we aren't done reading, return ownership of `f` back into `self`
          self = f;
        }
        ReadResult::FileEof(f) => {
          let ok = Ok(f.contents());
          // the ? is shorthand for "if f.close() gives an error, then return
          // the error out of read_all"
          f.close()?;
          // Also note that we can do f.close() now because close is implemented
          // for all states
          return ok;
        }
        ReadResult::Error(s) => {
          return Err(s);
        }
      }
    }
  }
}

To recap, which mistakes have we solved so far?

Starting in a non-start state: in the typestate convention, File<Eof> cannot be constructed without going through File, ensuring we start in the right part of the state machine.
Attempting to take a valid transition on an invalid operation: the Result types ensure that if e.g. open fails, then File will only be accessible in the success case.
Taking an incorrect transition: separating the File<Reading> and File<Eof> cases ensures that we only get access to the correct state after a transition.
Reusing old states: due to ownership, we cannot reuse a file state after it has been consumed.

Close and drop

The one mistake we have yet to avoid is stopping before an accept state. For example, in this snippet:

fn main() {
  let f = File::open("test.txt".into()).unwrap();
  f.read(10);
}

If we don’t explicitly call f.close(), then the file will be dropped without calling the close(fd) operation, leaving the OS to potentially leak file descriptors or other resources. One possible solution is to automatically close upon drop:

impl<S> Drop for File<S> {
  fn drop(&mut self) {
    unsafe {
      if libc::close(self.fd) == -1 {
        panic!("{}", errno_string());
      }
    }
  }
}

Is this the best design? The issue is that drop can’t return an error, so if one occurs, here we immediately exit the process. In general, Rust libraries avoid panicking on drop because it’s preferable that all errors are recoverable. Rust’s canonical File API says:

Files are automatically closed when they go out of scope. Errors detected on closing are ignored by the implementation of Drop.

Under this design guideline, the best choice is to allow users to handle close errors if they explicitly call close, and otherwise attempt to close while ignoring errors in Drop. However, we need to be careful–if File::close calls close as does File::drop, we could potentially call close() twice. One possibility is to track a closed bit on the File struct:

pub struct File<T> {
  ...
  closed: bool
}

impl File<Reading> {
  pub fn open(path: String) -> Result<File<Reading>, String> {
    ...
     File {
       ...
       closed: false
     }
   }
}

impl<S> File<S> {
  pub fn close(mut self) -> Result<(), String> {
    self.closed = true;
    unsafe {
      if libc::close(self.fd) == -1 {
        Err(errno_string())
      } else {
        Ok(())
      }
    }
    // self is dropped here, calling File::drop(&mut self)
  }
}

impl<S> Drop for File<S> {
  fn drop(&mut self) {
    if !self.closed {
      unsafe { libc::close(self.fd); }
    }
  }
}