Rust in 2021: Leveraging the Type System for Infallible Message Buffers

The Rust programming language features a versatile type system. It gives memory safety by distinguishing between raw pointers, pointers to valid data and pointers to data that may be written to. It helps with concurrency by marking types that may be moved between threads. And it helps keep API users on the right track with typestate programming.

With some features of current nightly builds, this concept can be extended to statically check that a function will succeed. Let's explore that road!

Current state: Fallible message serialization

For a close-to-real-world example, we will look into how CoAP messages are built on embedded devices. CoAP is a networking protocol designed for the smallest of devices, and enables REST-style applications even on devices with less than 100 KiB of flash memory. The devices can take both the server and the client role.

Writing a response message nowadays may[1] look like this:

fn build_response<W: WritableMessage>(&self, request_data: ..., mut response: W) -> Result<(), W::Error> {
    // Look up the block of data the client asked for.
    let chunk = self.data.view(request_data.block);
    match chunk {
        Ok((payload, option)) => {
            response.set_code(code::Content);
            // Adding options or payload can fail when the buffer is too small.
            response.add_opaque_option(option::ETag, &self.etag)?;
            response.add_block_option(option::Block2, option);
            response.add_payload(payload)?;
        }
        Err(_) => {
            // The requested block could not be served.
            response.set_code(code::BadRequest);
        }
    };
    Ok(())
}

As these devices typically don't come with dynamic management of the little RAM they have, it is common to build responses right into the memory area from which they are sent; that memory area is provided in the WritableMessage trait.

A CoAP server for large systems with dynamic memory management may never need to err out here -- it could just grow its message buffer and set its error type to the (unstable) never_type. In a constrained system with a fixed-size message buffer, that is not an option. Moreover, while in some cases a full buffer may be the client's fault and call for an error response, many such cases indicate programming errors: all the size requirements of a chunk and the options can be known in advance, so if they don't fit in the allocated buffer, that's a programming error.
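
The contrast between the two kinds of systems can be sketched as follows. This is a minimal illustration under stated assumptions, not the coap-message API: the one-method WritableMessage trait, FixedMessage, GrowableMessage, OutOfSpace and into_ok are all hypothetical names, and std::convert::Infallible stands in for the unstable never_type.

```rust
use std::convert::Infallible;

/// Hypothetical, heavily simplified stand-in for a writable-message trait.
trait WritableMessage {
    type Error;
    fn add_payload(&mut self, data: &[u8]) -> Result<(), Self::Error>;
}

/// Error of the fixed-size implementation: the data did not fit.
#[derive(Debug, PartialEq)]
struct OutOfSpace;

/// Fixed-size buffer, as found on a constrained device.
struct FixedMessage<const N: usize> {
    buf: [u8; N],
    len: usize,
}

impl<const N: usize> FixedMessage<N> {
    fn new() -> Self {
        Self { buf: [0; N], len: 0 }
    }
}

impl<const N: usize> WritableMessage for FixedMessage<N> {
    type Error = OutOfSpace;

    fn add_payload(&mut self, data: &[u8]) -> Result<(), OutOfSpace> {
        // Runtime size check: this is the error path the article wants to avoid.
        let end = self.len.checked_add(data.len()).ok_or(OutOfSpace)?;
        if end > N {
            return Err(OutOfSpace);
        }
        self.buf[self.len..end].copy_from_slice(data);
        self.len = end;
        Ok(())
    }
}

/// Heap-backed buffer that grows on demand (assumes an allocator is available).
#[derive(Default)]
struct GrowableMessage {
    buf: Vec<u8>,
}

impl WritableMessage for GrowableMessage {
    // Uninhabited error type: no error value can ever be constructed.
    type Error = Infallible;

    fn add_payload(&mut self, data: &[u8]) -> Result<(), Infallible> {
        self.buf.extend_from_slice(data);
        Ok(())
    }
}

/// With an uninhabited error type, a result can be unwrapped without a panic path.
fn into_ok<T>(result: Result<T, Infallible>) -> T {
    match result {
        Ok(value) => value,
        Err(never) => match never {},
    }
}
```

With the fixed buffer, every caller has to deal with OutOfSpace; with the growable one, the empty match in into_ok shows that there is no error left to handle.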

Typed messages: No runtime errors

Programming errors should trigger at compile time whenever possible. So what could that look like here? Let's dream:

fn build_response<W: WritableMessage<520>>(&self, request_data: ..., response: W) -> impl WritableMessage<0> {
    let chunk = self.data.view(request_data.block);
    match chunk {
        Ok((payload, option)) => {
            response
                .set_code(code::Content)
                .add_opaque_option(option::ETag, &self.etag)
                .add_block_option(option::Block2, option)
                .add_payload(payload)
        }
        Err(_) => {
            response
                .set_code(code::BadRequest)
                .into()
        }
    }
}

    Checking myserver v0.1.0
error[E0277]: the trait bound `impl WritableMessage<520>: WritableMessage<522_usize>` is not satisfied
  --> src/main.rs:9:10
   |
1  | fn build_response(&self, request_data: ..., response: W) -> impl WritableMessage<0> {
...
9  |     .add_payload(payload)
   |          ^^^ the trait `WritableMessage<522_usize>` is not implemented for `impl WritableMessage<520>`
   |
help: consider further restricting this bound
   |
1  | fn build_response<W: WritableMessage<522_usize>>(...) {

error: aborting due to previous error

If the function can indicate clearly in its signature what it needs as a response buffer, static error checking can happen on all sides:

Inside the function, operations on the message can rely on the required space being available.

Error conditions only arise when inputs need to be converted from unbounded to bounded values. They are, however, not the typical programming errors of underestimating the needed space, but reflect actual application error conditions that are now more visible and do not get buried in boilerplate error handling.

In the particular example above, the program only asked for 520 bytes of memory, but would need 522 bytes in the worst case. Even with some testing, this would likely have gone unnoticed, only to fail suddenly once a file larger than 8KiB was transferred.

Outside the function, the caller can be sure to provide adequate space, or will be notified at compile time that build_response is not implemented for too small a buffer.

How to get there

Const generics are a cornerstone of tracking memory requirements in types. They are what makes it possible to create types and traits like WritableMessage<522> in the first place.
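
As a minimal sketch of what a size-carrying trait could look like: the trait below is a hypothetical marker meaning "this message has room for N more bytes", and Buffer, remaining and needs_522 are names made up for this illustration.

```rust
/// Hypothetical marker trait: "this message has room for N more bytes".
trait WritableMessage<const N: usize> {
    fn remaining(&self) -> usize {
        N
    }
}

/// A buffer whose size is part of its type.
struct Buffer<const N: usize> {
    bytes: [u8; N],
}

impl<const N: usize> Buffer<N> {
    fn new() -> Self {
        Self { bytes: [0; N] }
    }
}

/// In this sketch a buffer offers exactly its own size and nothing else.
impl<const N: usize> WritableMessage<N> for Buffer<N> {}

/// The space requirement is stated in the signature.
fn needs_522(message: &impl WritableMessage<522>) -> usize {
    message.remaining()
}

// Buffer<520> and Buffer<522> are distinct types, so this line would be
// rejected because `WritableMessage<522>` is not implemented for `Buffer<520>`:
// needs_522(&Buffer::<520>::new());
```

Accepting any buffer that is at least as large as required, rather than exactly as large, is where comparisons and arithmetic on const generics come in.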

Aside: typenum and generic-array

The typenum and generic-array crates do provide similar functionality.

Using them gets complicated very quickly, though, and debugging with them even more so.

While they do a great job with the language features available so far, const generics can provide a much smoother development experience due to their integration into the language.

The first batch of const generics, min_const_generics, is already on the road to stabilization; withoutboats' article on the topic summarizes well what can and what can't be done with them.

Applications like this will need more than that, and as far as I can tell, what they need is not even complete behind the const_generics feature gate: computation based on const generics will be essential for matching the input and output types of operations on size-typed buffers.

With those, methods like add_option<const L: usize>(option: u16, data: &[u8; L]) can be implemented on messages, with additional limitations on L being small enough to fit in the current message type.
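
A rough approximation of such a method is sketched below. Because this sketch avoids arithmetic on const generics in type positions, the caller names the space left over (REST) through the result type, and an associated constant checks the relation while the program is being built. Message, Fits, CAP, FREE and REST are hypothetical names, and the bytes a real CoAP option header would occupy are ignored.

```rust
/// Build-time check that an option of length L turns FREE bytes into REST bytes.
struct Fits<const FREE: usize, const L: usize, const REST: usize>;

impl<const FREE: usize, const L: usize, const REST: usize> Fits<FREE, L, REST> {
    const OK: () = assert!(FREE == L + REST, "option does not match the free space");
}

/// Message backed by CAP bytes, of which FREE are still unused.
struct Message<const CAP: usize, const FREE: usize> {
    buf: [u8; CAP],
}

impl<const CAP: usize> Message<CAP, CAP> {
    /// A new message has all of its capacity free.
    fn new() -> Self {
        Self { buf: [0; CAP] }
    }
}

impl<const CAP: usize, const FREE: usize> Message<CAP, FREE> {
    /// Appends the option value; the option number is not encoded in this sketch.
    fn add_option<const L: usize, const REST: usize>(
        mut self,
        _option: u16,
        data: &[u8; L],
    ) -> Message<CAP, REST> {
        // Referencing the constant forces the assertion to be evaluated for
        // every combination of FREE, L and REST that is actually used.
        let () = Fits::<FREE, L, REST>::OK;
        let start = CAP - FREE;
        self.buf[start..start + L].copy_from_slice(data);
        Message { buf: self.buf }
    }

    /// The bytes written so far.
    fn written(&self) -> &[u8] {
        &self.buf[..CAP - FREE]
    }
}

// Usage: L is inferred from the array, REST from the annotated result type.
// let msg: Message<8, 4> = Message::<8, 8>::new().add_option(4, b"etag");
// Asking for `Message<8, 5>` instead would fail the build, since 8 != 4 + 5.
```

Having to spell out REST by hand is exactly the kind of boilerplate that computed const generic arguments would remove.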

Why this matters

Managing errors in this way ensures that out-of-memory errors do not come as a surprise, and enhances visibility of those error cases that do need consideration.

Furthermore, each error that is checked for at runtime contributes to program size. Without help from the type system, the compiler can only rarely be sure that the size checks are unnecessary. Not only do these checks contribute to the machine code, they often also come with additional error messages (even line numbers or file names) generated for the handler that eventually acts on them. (This is less of an issue when error printing is carefully avoided, or a tool like defmt is employed.)

Last but not least, there's a second point to in-place creation of CoAP messages: Options must be added in ascending numeric order by construction of the message; adding a low-number option later is either an error or needs rearranging the options, depending on the implementation. The same const generics that can ensure that sufficient space is available can just as well be used to statically show that all options are added in the right sequence.
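
The ordering idea can be sketched with the same associated-constant trick: the message type carries the number of the last option added, and adding an option with a lower number fails the build. Message, Ascending, LAST and NEXT are hypothetical names, and the Vec storage is only there to keep the sketch short; 4 and 23 are the CoAP option numbers of ETag and Block2.

```rust
/// Build-time check that option numbers never decrease.
struct Ascending<const LAST: u16, const NEXT: u16>;

impl<const LAST: u16, const NEXT: u16> Ascending<LAST, NEXT> {
    const OK: () = assert!(NEXT >= LAST, "options must be added in ascending order");
}

/// Message whose type records the number of the last option added.
struct Message<const LAST: u16> {
    options: Vec<(u16, Vec<u8>)>,
}

impl Message<0> {
    /// An empty message accepts any option number next.
    fn new() -> Self {
        Message { options: Vec::new() }
    }
}

impl<const LAST: u16> Message<LAST> {
    /// Adds option NEXT; equal numbers are allowed because options may repeat.
    fn add_option<const NEXT: u16>(mut self, value: &[u8]) -> Message<NEXT> {
        // Referencing the constant forces the ordering check for this pair.
        let () = Ascending::<LAST, NEXT>::OK;
        self.options.push((NEXT, value.to_vec()));
        Message { options: self.options }
    }
}

// Usage: ETag (4) followed by Block2 (23) is accepted.
// let msg = Message::<0>::new().add_option::<4>(b"tag").add_option::<23>(&[0x0e]);
// Swapping the two calls would fail the build, since 4 < 23.
```

No arithmetic on const generics is needed here, because the resulting type depends only on NEXT.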

Context, summary and further features

This post has been inspired by the Rust Core Team's blog post on creating a 2021 roadmap.

The features I hope to use more in 2021 are:

never_type, to indicate when an implementation claims to never fail,
min_const_generics, to support numbers as a part of types, and
more const_generics, especially with arithmetic operations.

On the receiving side of the CoAP implementations, the generic_associated_types and type_alias_impl_trait features can simplify message parsing a lot. I have a marvellous example of this, but this summary is too narrow to contain it.

[1]	While the examples are influenced by the coap-message crate, they do not reflect its current state of error handling. Also, they assume a few convenience functions to be present for the sake of brevity.

Tags:	blog-chrysn