Programming Thoughts & Paradigms

Segfaults in safe code

2021-12-14

I was recently working on an interesting use case in a Rust project, specifically a Linux PAM library, where we pass a raw pointer to a C function.

For the sake of brevity, the code looks something like this:

use std::os::raw::c_uint;

// Let's create a type that represents a raw pointer
// we might receive from a C function.
type SomeT = *const c_uint;

// This function disguises the raw pointer behind the SomeT type.
// In Rust, this is considered to be unsound, meaning that undefined
// behavior is possible from safe code.
fn maybe_safe_who_knows(x: SomeT) -> c_uint {
    unsafe { *x }
}

The type alias hides the fact that x is a raw pointer. Constructing a dangling pointer isn't considered unsafe in itself - but using it is. Because the function is not marked unsafe yet dereferences whatever it is given, it is unsound: undefined behavior is reachable from safe code. Nothing stops us from calling maybe_safe_who_knows(0x1234 as _) and causing a segfault.
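To make the distinction concrete, here is a minimal sketch of which steps the compiler accepts in safe code. The helpers `make_dangling` and `read_valid` are made up for illustration, and the dangling pointer is deliberately never dereferenced.

```rust
use std::os::raw::c_uint;

type SomeT = *const c_uint;

// The same unsound function as above, repeated so the sketch is self-contained.
fn maybe_safe_who_knows(x: SomeT) -> c_uint {
    unsafe { *x }
}

// Hypothetical helper: building a pointer from an arbitrary address is
// allowed in safe code, because nothing is read through it yet.
fn make_dangling() -> SomeT {
    0x1234_usize as SomeT
}

// Hypothetical helper: a reference coerces to a raw pointer, so this call
// reads a live value. The compiler accepts
// `maybe_safe_who_knows(make_dangling())` just as readily, with no `unsafe`
// at the call site, and that is the problem.
fn read_valid() -> c_uint {
    let value: c_uint = 7;
    maybe_safe_who_knows(&value)
}
```

Both helpers compile without a single `unsafe` keyword on the caller's side, so nothing in the signature warns the caller about the obligation they are taking on.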

This is because raw pointers are not guaranteed to point to a valid instance of the data they represent, so dereferencing one that is null, dangling, or unaligned is undefined behaviour. A segmentation fault is the most visible outcome. In more severe cases the read appears to succeed and leads to type confusion, where data is misrepresented, thus corrupting the state of the program.
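As a rough illustration of why this cannot simply be checked at runtime, the sketch below tests the two defects that are visible from the address alone. The function name `obviously_invalid` is an assumption made for this example, not something from the PAM library.

```rust
use std::mem::align_of;
use std::os::raw::c_uint;

// Hypothetical helper: reports whether a pointer is null or misaligned.
// It cannot tell whether the address actually holds a live `c_uint`.
fn obviously_invalid(ptr: *const c_uint) -> bool {
    ptr.is_null() || (ptr as usize) % align_of::<c_uint>() != 0
}
```

A null pointer is rejected, but `0x1234 as *const c_uint` passes on typical platforms, since the address is non-null and a multiple of four. Whether the memory behind it is mapped, initialised, and really a `c_uint` is something only the code that produced the pointer can know.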

Potential improvements

There are two ways to improve on this: mark the function itself as unsafe, or wrap the raw pointer in a safe type. While the first option is the easiest to implement, it's also the dirtiest. Marking a whole function as unsafe means we are opting out of having the compiler enforce certain guarantees, and every caller has to uphold them instead. This might be useful when we want to give up guaranteed safety in exchange for greater performance or, as in this case, for the ability to interface with another language where Rust’s guarantees don’t apply. It is, however, possible to write a safe wrapper around the types we need, which gives us the best of both worlds.
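The two approaches could look roughly like the sketch below. This is not the code from the PAM library: `read_raw` and `Checked` are hypothetical names, and the wrapper assumes the pointed-to value stays alive for as long as the wrapper is used.

```rust
use std::os::raw::c_uint;

type SomeT = *const c_uint;

/// Option 1: be honest in the signature and make the caller write `unsafe`.
///
/// # Safety
/// `x` must be non-null, aligned, and point to a live `c_uint`.
unsafe fn read_raw(x: SomeT) -> c_uint {
    unsafe { *x }
}

/// Option 2: a hypothetical wrapper that holds a reference instead of a raw
/// pointer, so everything after construction is safe code.
struct Checked<'a> {
    value: &'a c_uint,
}

impl<'a> Checked<'a> {
    /// Fully safe constructor for values that already live in Rust.
    fn from_ref(value: &'a c_uint) -> Self {
        Checked { value }
    }

    /// The single unsafe entry point, meant for pointers received from C.
    /// Returns `None` for a null pointer.
    ///
    /// # Safety
    /// If non-null, `ptr` must be aligned and point to a `c_uint` that
    /// stays valid and unmodified for the lifetime `'a`.
    unsafe fn from_raw(ptr: SomeT) -> Option<Self> {
        // `as_ref` performs the null check and yields a reference.
        unsafe { ptr.as_ref() }.map(|value| Checked { value })
    }

    /// Safe to call: the invariant was established at construction.
    fn get(&self) -> c_uint {
        *self.value
    }
}
```

With the wrapper, the unsafe obligation is concentrated in `from_raw` at the FFI boundary, and the rest of the program only ever calls `get`, which cannot be handed a made-up address.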

Credit to one of my awesome colleagues for pointing this out to me during code review, and introducing me to The Rustonomicon.