Hacking around the Rust type-system to provide ergonomic bitfields

So for a while now, I've been trying to figure out how to create a bitfield macro that allows easy, ergonomic use. A few days ago, I found the solution, and now I've published bitutils to crates.io!

Some background

I've been working on and off on an emulator for the 3DS' security processor, as well as a test binary that I cross-compile to the ARM9 processor on the 3DS. Both are written in Rust, and both require heavy use of bitfields. I wanted a solution that allowed me to name the individual fields, and to be somewhat ergonomic (my previous solution required me to do bf!( reg.field1 = newval ) for instance).

First, the syntax

I defined a bf! macro, used like so:

bf!(BitfieldName[u8] {
    field1: 0:3,
    field2: 4:6,
    field3: 7:7
});

Then, I can access and modify fields like this:

let mut bitfield = BitfieldName::new(0);
bitfield.field1.set(0xF);
return bitfield.field2.get();

What hackery is this?

Well, the bf! macro expands to some pretty gnarly stuff. Let's check it out.

// This is a module, not a struct, because we have to contain the namespace pollution
pub mod BitfieldName {
    #[repr(C)]
    pub struct field1 {
        _dont_instantiate_pls: ()
    }

    #[allow(dead_code)]
    impl field1 {
        #[inline(always)]
        pub fn get(&self) -> u8 {
            // &self MUST always point to the appropriate Bf type.
            let bfptr = self as *const Self as *const Bf;
            let _ = self;
            let val = unsafe { (*bfptr).val };
            bits!(val, 0:3)
        }

        #[inline(always)]
        pub fn set(&mut self, new: u8) {
            // &mut self MUST always point to the appropriate Bf type.
            let bfptr = self as *mut Self as *mut Bf;
            let _ = self;
            let val = unsafe { &mut (*bfptr).val };
            *val ^= bits!(*val, 0:3) << 0;
            *val |= bits!(new, 0:(3 - 0)) << 0;
        }
    }

    // ditto for field2, field3

    #[repr(C)]
    pub struct Fields {
        pub field1: field1,
        pub field2: field2,
        pub field3: field3
    }

    #[repr(transparent)]
    #[derive(Copy, Clone)]
    pub struct Bf {
        pub val: u8,
    }
    impl Bf {
        pub fn new(val: u8) -> Self {
            Self {
                val: val
            }
        }
    }
    impl bitutils::Deref for Bf {
        type Target = Fields;
        fn deref(&self) -> &Fields {
            // We go through Deref here because Fields MUST NOT be moveable.
            unsafe { &*(self as *const Self as *const Fields) }
        }
    }
    impl bitutils::DerefMut for Bf {
        fn deref_mut(&mut self) -> &mut Fields {
            unsafe { &mut *(self as *mut Self as *mut Fields) }
        }
    }

    pub fn new(val: u8) -> Bf {
        Bf::new(val)
    }
}
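
To make the get/set arithmetic above easier to follow, here is a minimal sketch of it as plain functions. The names extract_bits and replace_bits are made up for illustration and are not part of bitutils; extract_bits only stands in for what bits!(val, lo:hi) appears to do in the expansion (mask out the inclusive range and shift it down to bit 0).

```rust
/// Extract bits `lo..=hi` of `val`, shifted down so the field starts at bit 0.
/// Hypothetical helper: an assumption about what `bits!(val, lo:hi)` computes.
pub fn extract_bits(val: u8, lo: u32, hi: u32) -> u8 {
    assert!(lo <= hi && hi < 8);
    let width = hi - lo + 1;
    // Build the mask in u16 so a full-width field (width == 8) cannot overflow.
    let mask = ((1u16 << width) - 1) as u8;
    (val >> lo) & mask
}

/// Return `val` with bits `lo..=hi` replaced by the low bits of `new`.
pub fn replace_bits(val: u8, lo: u32, hi: u32, new: u8) -> u8 {
    // XOR-ing the field with itself clears it, like the `^=` line in `set`.
    let cleared = val ^ (extract_bits(val, lo, hi) << lo);
    // Truncate `new` to the field width, then OR it into place, like the `|=` line.
    cleared | (extract_bits(new, 0, hi - lo) << lo)
}
```

With this, replace_bits(0b1011_0101, 4, 6, 0b110) rewrites only bits 4 through 6 and leaves the rest of the byte alone.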

What's this pointer casting about?

I needed a way to get around a few things:

(a) The lack of ability to parametrize over mutability
(b) The lack of ability to concatenate identifiers in macros (which would let me get around (a))
(c) The syntactic noise of chaining method calls
(d) The lack of ability to encode self-referential structs in the type system

I had decided that using struct fields, somehow, was the best way to reference individual field members. But how? Unions were the obvious choice, but I didn't want to use unsafe every time I accessed a bitfield member.
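
For comparison, a rough sketch of what the union route could look like. Reg and Nibbles are hypothetical names invented for this example. The point is only that reading a union field needs unsafe at the access site.

```rust
/// Hypothetical typed view of a register byte.
#[derive(Copy, Clone)]
#[repr(transparent)]
pub struct Nibbles(pub u8);

impl Nibbles {
    /// Low nibble (bits 0..=3) of the wrapped byte.
    pub fn low(self) -> u8 {
        self.0 & 0x0F
    }
}

/// The raw byte and its typed view share the same storage.
#[repr(C)]
pub union Reg {
    pub raw: u8,
    pub fields: Nibbles,
}

pub fn low_nibble(reg: Reg) -> u8 {
    // Constructing a union is safe, but every field read needs `unsafe`,
    // which is the per-access noise the post wants to avoid.
    unsafe { reg.fields }.low()
}
```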

But then I remembered: zero-sized types! ZSTs allow me to add functionality to the type system without bloating my bitfield type, while keeping it Copy.
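
A quick way to check the "no bloat" claim, sketched with made-up types (FlagA, FlagB, Views and Raw are illustrative names, not what bf! generates): a struct made only of unit-typed fields takes up no space, so the value type stays as small as its backing integer.

```rust
/// Two zero-sized marker types; the private unit field keeps other modules
/// from constructing them.
pub struct FlagA { _private: () }
pub struct FlagB { _private: () }

/// A struct made only of ZSTs is itself zero-sized.
#[repr(C)]
pub struct Views {
    pub a: FlagA,
    pub b: FlagB,
}

/// The value type stays exactly as big as its backing integer, and is Copy.
#[repr(transparent)]
#[derive(Copy, Clone)]
pub struct Raw {
    pub val: u8,
}

/// Sizes in bytes of (FlagA, Views, Raw).
pub fn sizes() -> (usize, usize, usize) {
    (
        std::mem::size_of::<FlagA>(),
        std::mem::size_of::<Views>(),
        std::mem::size_of::<Raw>(),
    )
}
```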

Additionally, I can cast any pointer to a ZST pointer and have it be a valid representation (since, after all, there's nothing to represent!). So the &self and &mut self in the fieldN types really just point to the Bf struct, and I can pointer cast it right back.

And then I added a little Deref trickery to prevent anyone from copying or moving the Fields type out of the Bf struct, which would get them into serious trouble if they then called any method that relies on the &self type punning!
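
Putting the pieces together, here is a stripped-down, hand-written sketch of the whole trick with a single field. It is not the macro's real output: Low, the hard-coded masks and demo are assumptions made for illustration. It also does not settle whether dereferencing a pointer derived from a zero-sized reference is sound under Rust's aliasing rules. It only shows the mechanics.

```rust
/// Zero-sized handle for the low nibble (bits 0..=3).
pub struct Low { _private: () }

impl Low {
    pub fn get(&self) -> u8 {
        // Assumption: `self` was reached through `Bf`'s Deref impl, so its
        // address is the address of a live `Bf`.
        let bf = self as *const Self as *const Bf;
        unsafe { (*bf).val & 0x0F }
    }

    pub fn set(&mut self, new: u8) {
        let bf = self as *mut Self as *mut Bf;
        // Keep the high nibble, overwrite the low one.
        unsafe { (*bf).val = ((*bf).val & 0xF0) | (new & 0x0F) };
    }
}

/// Not Copy or Clone, and only reachable behind a reference via Deref,
/// so safe code cannot move it out of a `Bf`.
#[repr(C)]
pub struct Fields { pub low: Low }

#[repr(transparent)]
#[derive(Copy, Clone)]
pub struct Bf { pub val: u8 }

impl std::ops::Deref for Bf {
    type Target = Fields;
    fn deref(&self) -> &Fields {
        // Same address, reinterpreted as the zero-sized field view.
        unsafe { &*(self as *const Self as *const Fields) }
    }
}

impl std::ops::DerefMut for Bf {
    fn deref_mut(&mut self) -> &mut Fields {
        unsafe { &mut *(self as *mut Self as *mut Fields) }
    }
}

/// Sets the low nibble of 0xA0 to 0x5 and returns (field value, raw byte).
pub fn demo() -> (u8, u8) {
    let mut bf = Bf { val: 0xA0 };
    bf.low.set(0x5); // auto-deref goes through DerefMut
    (bf.low.get(), bf.val)
}
```

Something like let f = *bf; is rejected by the compiler here, because Fields is not Copy and cannot be moved out from behind the reference that deref returns.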
