Rust, Builder Pattern, Trait Objects, Box<T> and Rc<T>

25 Jul 2017 — 7 min read

One of the intimidating parts of learning Rust is to master all the basic container types:

Box<T>, Rc<T>, Arc<T>, RefCell<T>, Mutex<T>, etc.

The least we can say is that they are not really intuitive to use and they contribute to the steep Rust learning curve.

In this post we will focus on a specific use case for Trait Objects with the Builder Pattern. The goal is to highlight some of the differences between Box<T> (Heap allocated object) and Rc<T> (Reference Counting pointer), which are both very important container types you should master early on.

The Basics

`Box<T>` and `Rc<T>`

Box<T> is a container type designed to allocate and "hold" an object on the heap. It's the simplest form of heap allocation, and the content is dropped when the Box goes out of scope.

For example, a Box<u32> would look like:

[Figure: memory layout of a Box<u32>]

Rc<T> (short for Reference Counting) is used when several parts of a program need read-only access to the same value, which gives them shared ownership of that value. It counts the references pointing to the same piece of data on the heap. This ensures that when the last reference is dropped, the data itself is dropped and the memory properly freed.

For example, an Rc<String> would look like:

[Figure: memory layout of an Rc<String>]

Note that you can't pass an Rc between threads because it is not thread-safe. That's what Arc (Atomic Reference Count) is for.

Whether you use a Box<T> or an Rc<T>, the Box or Rc itself is on the stack, but the data they "contain" lives on the heap.

In short, Box and Rc are nothing more than pointers to objects stored on the heap. Rc provides shared ownership, Box doesn't.
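As a minimal sketch of that difference (the function names `rc_counts` and `box_move` are made up for this illustration), the following shows the reference counter of an Rc going up and down, and a Box changing owner without any copy:

```rust
use std::rc::Rc;

/// Returns the strong reference count at three points:
/// after creation, after one clone, and after that clone is dropped.
fn rc_counts() -> (usize, usize, usize) {
    let first = Rc::new(String::from("shared"));
    let created = Rc::strong_count(&first);

    // Copies the pointer and bumps the counter; the String is not duplicated.
    let second = Rc::clone(&first);
    let cloned = Rc::strong_count(&first);

    // One owner is gone, but the String stays alive for `first`.
    drop(second);
    let dropped = Rc::strong_count(&first);

    (created, cloned, dropped)
}

/// Moves a boxed value to a new owner and reads it back.
fn box_move() -> u32 {
    let boxed: Box<u32> = Box::new(42); // the u32 lives on the heap
    let new_owner = boxed; // ownership moves; the heap content is not copied
    // `boxed` can no longer be used here. The u32 is freed when
    // `new_owner` goes out of scope at the end of this function.
    *new_owner
}

fn main() {
    println!("{:?}", rc_counts());
    println!("{}", box_move());
}
```

The counter only tracks how many Rc handles exist; the String itself is freed once the last handle is dropped.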

Trait Objects

A Trait Object represents a pointer to some concrete type that implements a Trait (think interface if you are unfamiliar with the term Trait).

Trait Objects are Dynamically Sized Types, and because Rust needs to know everything at compile time about the size of the types it works with, Trait Objects are handled a bit differently.

As with polymorphism in other languages, the method that actually runs is looked up at run time through a virtual table that dispatches to the right implementation (unsurprisingly called dynamic dispatch). This is achieved by putting the value behind a basic pointer type like &, Box<T>, Rc<T> (or Arc<T>).

To sum up, once we have a pointer to something that implements a Trait, typed by the Trait rather than by the concrete type, we have a Trait Object.
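To make this concrete, here is a small sketch using a hypothetical `Greeter` trait with two made-up implementations. The same function receives a trait object obtained from a plain reference, a Box and an Rc, and the call is dispatched at run time in each case. It uses the `dyn` keyword, which current compilers expect in front of a trait used as a type:

```rust
use std::rc::Rc;

/// Hypothetical trait used only for this illustration.
trait Greeter {
    fn greet(&self) -> String;
}

struct English;
struct French;

impl Greeter for English {
    fn greet(&self) -> String {
        String::from("hello")
    }
}

impl Greeter for French {
    fn greet(&self) -> String {
        String::from("bonjour")
    }
}

/// Accepts any trait object. The concrete type is unknown here,
/// so `greet` is resolved through the virtual table at run time.
fn greet_with(greeter: &dyn Greeter) -> String {
    greeter.greet()
}

/// Builds trait objects behind three pointer types and calls each one.
fn greetings() -> Vec<String> {
    let by_ref: &dyn Greeter = &English;
    let boxed: Box<dyn Greeter> = Box::new(French);
    let counted: Rc<dyn Greeter> = Rc::new(English);

    vec![
        greet_with(by_ref),
        greet_with(&*boxed),
        greet_with(&*counted),
    ]
}

fn main() {
    for line in greetings() {
        println!("{}", line);
    }
}
```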

Use case

Container Engine Trait

Let's pick a specific use case to explain one usage of Trait Objects. We assume that we are heavy users of Docker and rkt and want a common interface (Trait) to run containers.

/// A Trait for container engines
pub trait Engine {
    fn run(&self, image: &str) -> Result<(), &str>;
    // etc.
}

We can have multiple implementations of the Engine trait, let's say with Docker:

extern crate shiplift;

use shiplift::Docker as DockerClient;
use shiplift::ContainerOptions;

struct Docker {
    client: DockerClient,
}

impl Docker {
    pub fn new() -> Docker {
        Docker { client: DockerClient::new() }
    }
}

impl Engine for Docker {
    fn run(&self, image: &str) -> Result<(), &str> {
        let containers = self.client.containers();
        let opts = &ContainerOptions::builder(image).build();
        if containers.create(opts).is_err() {
            return Err("Cannot create container");
        }
        Ok(())
    }
}

You get the picture: the same goes for Rkt, which will have its own implementation of the Engine Trait. This is enough for now, so let's jump to the Server.

Our Server type

For the purpose of running our program, we also want a common Server object holding our various configuration options and our Engine Trait implementation:

pub struct Server {
    // [...] Important configuration fields
}

For convenience, we want to use the builder pattern to instantiate an immutable version of the Server object. How it works is that we initially create an empty mutable Server object and then populate fields using init_*() methods. Let's start with a skeleton:

pub struct Server {
    // [...] Important configuration fields
}

impl Default for Server {
    fn default() -> Self
    {
        // Instantiate a default Engine client
        let client = /* ??? */;

        Server {
            engine: client,
        }
    }
}

impl Server {
    pub fn new() -> Server
    {
        Default::default()
    }

    // Overrides the default engine
    pub fn init_engine(&mut self, engine: /* ??? */) -> &mut Server
    {
        self.engine = engine;
        self
    }

    // [...] collection of init functions

    pub fn build(&self) -> Server
    {
        Server {
            engine: self.engine.clone(),
        }
    }
}

Creating our Server instance will thus be as simple as:

let server = Server::new()
                .init_engine(/* Some Engine impl */)
                .init_storage(/* Some Storage impl */)
                .init_something_else(/* (ノ^_^)ノ */)
                .build();

In the end, the build method gives us back a new Server, which we bind to an immutable variable. This same build method borrows the mutable Server we have been configuring and creates the new Server from its content.

Note: We need to use clone() in the build method because build only borrows the builder through &self, so it cannot move the Engine Trait implementation out of it. The new Server needs its own handle to the engine. Rust is very strict about ownership of content!

Now back to the Server type and a question should cross our mind:

struct Server {
    engine: /* What should I use here? */
}

We can't use a bare Engine as the field type: a trait has no single size, because each implementation we might store there can have a different one. To guarantee memory safety, Rust must know the size and alignment of the things it manipulates. If we tried to do so, our code would be filled with errors like this one:

[...] the trait `std::marker::Sized` is not implemented for `engine::Engine + 'static`

Because Trait Objects are Dynamically Sized Types, we can put them behind pointer types that allocate on the heap to get around this requirement, which is why this post focuses on Box<T> and Rc<T>.

As the Rust documentation puts it:

Putting the value behind a pointer means the size of the value is not relevant when we are tossing a trait object around, only the size of the pointer itself.

Considering this, would allocating the object on the heap using the Box<T> container type help in this case? We know that a Trait Object is basically a pointer to something implementing a Trait. Box<T> seems to fit the bill.
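The point about pointer size can be checked with std::mem::size_of. The sketch below assumes two made-up implementations, `Small` and `Large`, of a simplified Engine trait. On current compilers a Box holding a trait object is a fat pointer made of two machine words (data pointer plus virtual table pointer), whatever the size of the value behind it:

```rust
use std::mem::size_of;

/// Simplified version of the Engine trait for this illustration.
trait Engine {
    fn run(&self, image: &str) -> Result<(), &'static str>;
}

/// Hypothetical implementation with no data at all.
struct Small;

/// Hypothetical implementation carrying a 1 KiB buffer.
struct Large {
    buffer: [u8; 1024],
}

impl Engine for Small {
    fn run(&self, _image: &str) -> Result<(), &'static str> {
        Ok(())
    }
}

impl Engine for Large {
    fn run(&self, _image: &str) -> Result<(), &'static str> {
        if self.buffer.is_empty() {
            return Err("empty buffer");
        }
        Ok(())
    }
}

/// Sizes of both concrete types, then of the boxed trait object.
fn sizes() -> (usize, usize, usize) {
    (
        size_of::<Small>(),
        size_of::<Large>(),
        size_of::<Box<dyn Engine>>(),
    )
}

fn main() {
    // Values of very different sizes fit in the same Vec once boxed.
    let engines: Vec<Box<dyn Engine>> =
        vec![Box::new(Small), Box::new(Large { buffer: [0; 1024] })];
    for engine in &engines {
        assert!(engine.run("alpine").is_ok());
    }
    println!("{:?}", sizes());
}
```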

Let's try by wrapping our Engine in a Box type like the following:

use std::boxed::Box;

struct Server {
    // `dyn` marks the trait object; older compilers also accepted `Box<Engine>`
    engine: Box<dyn Engine>,
}

We can use it to initialize our Server object and fill our initial skeleton:

impl Server {
    pub fn new() -> Server { /* [...] */ }

    // Overrides the default engine
    pub fn init_engine(&mut self, engine: Box<dyn Engine>)
      -> &mut Server
    {
        self.engine = engine;
        self
    }

    // [...]
}

Let's check if this compiles:

$ cargo check

This should fail:

error: no method named `clone` found for type `std::boxed::Box<Engine + 'static>` in the current scope
  --> src/lib.rs:72:33
   |
72 |             engine: self.engine.clone(),
   |                                 ^^^^^
   |
   = note: the method `clone` exists but the following trait bounds were not satisfied: `Engine : std::marker::Sized`, `Engine : std::clone::Clone`
[...]

The reason why the code above fails is that cloning a Box duplicates the object it owns rather than just the fat pointer to it. Box<T> only implements Clone when T itself implements Clone, and Clone in turn requires Sized. That's why the compiler says that our object must be Sized (and Clone) if we want to use the clone method. We can't copy something if we don't know its size!

In this example, Engine is a Dynamically Sized Type and is therefore unsized: it does not satisfy the Sized bound required for clone to work. The error makes sense, so let's find out more by looking at the Rc<T> container type.

Wrapping a Trait object with Rc

That's where Rc comes into play. Unlike Box, Rc does not copy the data when calling clone. It only copies and hands out a pointer to the object on the heap: the "fat pointer" that carries the virtual table for the right Trait implementation.

On top of that, this highlights a main difference between Box and Rc:

With Rc, ownership of the object living on the heap is shared. Calling clone only copies the pointer to the object and increments the counter by 1, so we never need to know the size of the object we point to. Dropping an Rc decrements the counter, and the object is dropped and its memory freed only when the counter reaches 0, that is, once all the references to it are gone.
Box is the simplest form of heap allocation and has exactly one owner. When the Box goes out of scope, its content is dropped. Moving a Box transfers ownership without copying anything, but clone has to produce a second, independent owner, so it must duplicate the whole content. To do that, the content must implement Clone, which in turn requires it to be Sized.

In short: Box<T> copies values, Rc<T> clones references and keeps track of references in use.
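A short sketch of this contrast, using a plain String (which is Sized and Clone, so both containers can be cloned); the function names are made up for the example:

```rust
use std::rc::Rc;

/// Cloning a Box duplicates the heap content: changing the copy
/// leaves the original untouched.
fn box_clone_is_independent() -> (String, String) {
    let original: Box<String> = Box::new(String::from("docker"));
    let mut copy = original.clone(); // requires String: Clone
    copy.push_str("-copy");
    (*original, *copy)
}

/// Cloning an Rc copies only the pointer: both handles refer to
/// the same allocation and the counter goes from 1 to 2.
fn rc_clone_is_shared() -> (bool, usize) {
    let original: Rc<String> = Rc::new(String::from("docker"));
    let shared = Rc::clone(&original);
    (Rc::ptr_eq(&original, &shared), Rc::strong_count(&original))
}

fn main() {
    println!("{:?}", box_clone_is_independent());
    println!("{:?}", rc_clone_is_shared());
}
```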

Now that we've highlighted the difference between Box<T> and Rc<T> regarding their relationship to the clone method and the link to Trait Objects, let's finally switch from using Box<T> to using Rc<T> instead:

use std::rc::Rc;

pub struct Server {
    // `dyn` marks the trait object; older compilers also accepted `Rc<Engine>`
    engine: Rc<dyn Engine>,
}

impl Default for Server {
    fn default() -> Self
    {
        // Instantiate a default Engine client
        let client = Rc::new(Docker::new());

        Server {
            engine: client,
        }
    }
}

impl Server {
    pub fn new() -> Server
    {
        Default::default()
    }

    // Overrides the default engine
    pub fn init_engine(&mut self, engine: Rc<dyn Engine>)
      -> &mut Server
    {
        self.engine = engine;
        self
    }

    // [...] collection of init functions

    pub fn build(&self) -> Server
    {
        Server {
            engine: self.engine.clone(),
        }
    }
}

This should compile successfully. Indeed in the build method, we only clone a reference to the Engine implementation, not the actual implementation itself.

If this sounds confusing, just remember that cloning an Rc only copies a pointer to the object. Because Rc only gives shared, read-only access to its content, there is no risk of several owners modifying the object at the same time. That's why we mentioned that Rust is very strict about ownership: it wouldn't allow us to hand out several mutable references to the Engine implementation.

With init_engine we can now override the default Engine used to spin up our containers.

let rkt = Rc::new(Rkt::new());

let server = Server::new()
                .init_engine(rkt)
                .build();
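For readers who want to run the whole thing without a container runtime, here is a self-contained sketch of the same builder. It makes several assumptions for the sake of illustration: `Docker` and `Rkt` are empty stand-ins that do not talk to any real engine, and the `name`, `engine_name`, `run` (on Server) and `build_with_rkt` items are hypothetical additions that only exist to make the result observable.

```rust
use std::rc::Rc;

pub trait Engine {
    fn name(&self) -> &'static str; // hypothetical, for display only
    fn run(&self, image: &str) -> Result<(), &'static str>;
}

// Stand-in engines: they only validate the image name.
struct Docker;
struct Rkt;

impl Engine for Docker {
    fn name(&self) -> &'static str {
        "docker"
    }
    fn run(&self, image: &str) -> Result<(), &'static str> {
        if image.is_empty() {
            return Err("Cannot create container");
        }
        Ok(())
    }
}

impl Engine for Rkt {
    fn name(&self) -> &'static str {
        "rkt"
    }
    fn run(&self, image: &str) -> Result<(), &'static str> {
        if image.is_empty() {
            return Err("Cannot create container");
        }
        Ok(())
    }
}

pub struct Server {
    engine: Rc<dyn Engine>,
}

impl Default for Server {
    fn default() -> Self {
        // Rc<Docker> coerces to Rc<dyn Engine> in the field position
        Server { engine: Rc::new(Docker) }
    }
}

impl Server {
    pub fn new() -> Server {
        Default::default()
    }

    // Overrides the default engine
    pub fn init_engine(&mut self, engine: Rc<dyn Engine>) -> &mut Server {
        self.engine = engine;
        self
    }

    pub fn build(&self) -> Server {
        // Only the pointer is cloned; the engine itself is shared
        Server { engine: Rc::clone(&self.engine) }
    }

    pub fn engine_name(&self) -> &'static str {
        self.engine.name()
    }

    pub fn run(&self, image: &str) -> Result<(), &'static str> {
        self.engine.run(image)
    }
}

/// Builds a Server with the rkt stand-in and reports the engine name
/// plus the number of Rc handles alive afterwards.
fn build_with_rkt() -> (&'static str, usize) {
    let rkt: Rc<dyn Engine> = Rc::new(Rkt);
    let server = Server::new().init_engine(Rc::clone(&rkt)).build();
    // The temporary builder is already dropped: `rkt` and `server` remain.
    (server.engine_name(), Rc::strong_count(&rkt))
}

fn main() {
    let server = Server::new().build();
    println!("{} -> {:?}", server.engine_name(), server.run("alpine"));
    println!("{:?}", build_with_rkt());
}
```

Note that the temporary Server created by Server::new() lives until the end of the statement, which is what allows build() to be chained on the &mut Server returned by init_engine.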

Mix and match container types

This post was just about Box and Rc usage with Trait Objects, but remember that you can always mix and match container types to achieve particular goals. For example:

Rc<RefCell<T>>: Shared ownership with interior mutability and dynamically checked borrow rules.
Arc<Mutex<T>>: Thread-safe shared ownership with interior mutability and mutual exclusion (the Mutex already provides the interior mutability, so no RefCell is needed inside it).
etc.
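As a rough sketch of two such combinations (the function names and the data they hold are invented for this example), the first function shares a mutable Vec between two Rc handles through a RefCell, and the second shares a counter between threads through an Arc wrapping a Mutex:

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Single-threaded shared ownership with interior mutability:
/// both handles can push to the same Vec, with borrow rules
/// checked at run time by the RefCell.
fn shared_log() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let writer = Rc::clone(&log);

    writer.borrow_mut().push(String::from("started"));
    log.borrow_mut().push(String::from("stopped"));

    let snapshot = log.borrow().clone();
    snapshot
}

/// Thread-safe shared ownership: each thread owns an Arc handle
/// and takes the lock before incrementing the counter.
fn shared_counter(threads: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{:?}", shared_log());
    println!("{}", shared_counter(4));
}
```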

Bookmark this cheat sheet or this periodic table of Rust types: both are incredibly useful resources.

Alexandre Beslic