I recently got into Linux kernel development by rewriting a driver from C to Rust. To support this driver, I needed to add Rust abstractions for kernel APIs written in C. Some of this work is making its way upstream and now seems like a good time to write about that experience.

Because I was a noob when I started (and mostly still am), this post should be pretty approachable for people who don't have any kernel development experience.

Spoiler alert: Writing the code was the easy part!

Why did I do this?

I love Linux as a user. I also love Rust, having worked in various domains with it. Command line applications, web apps that run in the browser with WebAssembly, bare-metal embedded stuff... you name it. I am continuously impressed by how versatile this language is. I work as a research assistant at the Institute of Embedded Systems at the Zurich University of Applied Sciences. That's also where I'm studying for my master's degree in computer science.

So, translating an embedded Linux driver from C to Rust was the perfect idea for a semester project for me. We had some custom hardware at the institute with an out-of-tree driver written in C, exactly what I needed. My work, studies and passions all came together in this project.

Compiling the kernel

Before doing anything Rust-related, I needed to make sure I could compile and run what was already there. That way, I could later confirm that everything behaves the same.

I ran my driver on a Raspberry Pi 4, so I needed their fork of the kernel; presumably they maintain some patches supporting their hardware. There is great documentation for running a custom kernel on a Raspberry Pi. I followed that and everything worked well. I stored the snippets I had to repeat often in a justfile, a quality-of-life decision I don't regret. That file grew continuously while I worked on the project.

Kconfig, I love to hate you

The next step was to add a hello-world module to the build, to confirm I can compile Rust code. There is also documentation about building the kernel with Rust support enabled.

That part didn't work immediately though. I had to spend quite some time configuring the build with everything I needed. It was my first contact with the kernel's configuration system: Kconfig. It's a complex and flexible system for tweaking every possible thing about the kernel.

I feel like I still only understand 5% of it at best, but here's my working model that got me through the project: Throughout the kernel source tree, there are these Kconfig files littered around. Here's a random snippet:

config MEDIA_SUBDRV_AUTOSELECT
	bool "Autoselect ancillary drivers (tuners, sensors, i2c, spi, frontends)"
	depends on HAS_IOMEM
	select I2C
	select I2C_MUX
	default y if MEDIA_SUPPORT_FILTER
	help
	  By default, a media driver auto-selects all possible [...]

This snippet defines a boolean configuration value MEDIA_SUBDRV_AUTOSELECT. It has a (conditional..?) default value and a help text. But most importantly, dependencies are defined with depends on and select. depends on works intuitively: something can only be enabled if the things it depends on are also enabled. I'm less certain about how select works exactly, but the idea seems to be that if something is enabled, everything it "selects" is enabled automatically as well. Some variables aren't meant to be set by the user at all; they're determined automatically and serve only as dependencies for other variables. For example this one:

config RUST_IS_AVAILABLE
	def_bool $(success,$(srctree)/scripts/rust_is_available.sh)
	help
	  This shows whether a suitable Rust toolchain is available (found).

For a newbie, this distinction is confusing and far from obvious. But I can definitely see how configuring the kernel is simply a difficult task with a decent amount of inherent complexity.

But how is this used, how do you actually create a kernel configuration? Here is where things start getting strange! The config file (.config in the root of the kernel tree) looks pretty straightforward, it's all just name=value. But you're not supposed to edit this file yourself, because it would be hard to honor all the dependency constraints.
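
For illustration, entries in .config look roughly like this (variable names borrowed from the snippets in this post; disabled booleans show up as comments):

CONFIG_MEDIA_SUBDRV_AUTOSELECT=y
CONFIG_I2C=y
# CONFIG_RUST is not set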

There are several tools that help you create and modify a config. For example, there is the make target make defconfig, which creates a default config for you. I needed to cross-compile for the Raspberry Pi, so my base config was created with:

make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- bcm2711_defconfig

For more flexibility, there are interactive tools to modify your configuration. make menuconfig will open an ncurses-based TUI. It's reasonably intuitive, but there are a lot of submenus to explore. I'm not sure how people are supposed to figure out what they could or should enable. Finding stuff can be difficult, not least because variables with unmet dependencies are hidden! The painful memory of trying to find the hidden RUST variable is still fresh. Apparently it has a dependency on MODVERSIONS being disabled, which it wasn't. How was I supposed to know? Well, I guess I was supposed to know based on its definition:

config RUST
	bool "Rust support"
	depends on HAVE_RUST
	depends on RUST_IS_AVAILABLE
	select EXTENDED_MODVERSIONS if MODVERSIONS
	depends on !MODVERSIONS || GENDWARFKSYMS
	depends on !GCC_PLUGIN_RANDSTRUCT
	depends on !RANDSTRUCT
	depends on !DEBUG_INFO_BTF || (PAHOLE_HAS_LANG_EXCLUDE && !LTO)
	depends on !CFI_CLANG || HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC
	select CFI_ICALL_NORMALIZE_INTEGERS if CFI_CLANG
	depends on !CALL_PADDING || RUSTC_VERSION >= 108100
	depends on !KASAN_SW_TAGS
	depends on !(MITIGATION_RETHUNK && KASAN) || RUSTC_VERSION >= 108300
	help
	  Enables Rust support in the kernel.

Thanks a lot, that's not overwhelming at all. Also, that definition is in init/Kconfig. There are tons of Kconfig files all over the place, how are you supposed to find stuff? I guess grep works, if you know what you're looking for.

But that's not how I actually found it. The more convenient way is to open make menuconfig and hit / to open a search view. Typing "rust" immediately brings up the right thing. That said, you're still met with an unreadable specification of its dependencies:

Depends on: HAVE_RUST [=y] && RUST_IS_AVAILABLE [=y] && (!MODVERSIONS [=y] || GENDWARFKSYMS [=n]) && !GCC_PLUGIN_RANDSTRUCT [=n] && !RANDSTRUCT [=n] && (!DEBUG_INFO_BTF [=n] || PAHOLE_HAS_LANG_EXCLUDE [=n] && !LTO [=n]) && (!CFI_CLANG [=n] || HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC [=n]) && (!CALL_PADDING [=y] || RUSTC_VERSION [=107800]>=108100) && !KASAN_SW_TAGS [=n] && (!MITIGATION_RETHUNK [=y] || !KASAN [=n] || RUSTC_VERSION [=107800]>=108300)

While this is still overwhelming, it tells you the current values of each variable. That's a huge help! That way you can figure out what you need to change in order to make the whole expression true. With that, I managed to cobble together a config that allowed me to compile a Rust module. I continued to squabble sporadically with kconfig while working on the driver and needing to enable more features, but nothing major.

Lastly, I'm very pedantic about my setup being scripted. I work on up to four different machines in different locations and I want to be able to nuke a machine at will, reinstall Linux, set everything up according to my preferences with scripts, clone all my repos and keep working as if nothing happened. So, this make menuconfig nonsense was no permanent solution.

After some digging, I found ./scripts/config. It allows you to set variables on the command line. If you know which variables to set and in what order, you can script your config:

make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- bcm2711_defconfig
./scripts/config --disable "MODVERSIONS"
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- rust.config

Actually, that last line is weird. You'd expect it to be ./scripts/config --enable "RUST" or something. Well, ./scripts/config is not very smart and just sets the one variable you ask it to. It doesn't take any dependencies into consideration. I suppose the provided make target rust.config also has some necessary side effects. I wish there were a smarter ./scripts/config that gave you an error if you tried to enable something without satisfying its dependencies.

O patches, where art thou?

With my newly acquired ability to compile Rust code, I could start rewriting the driver. The code itself is not difficult at all. If you know a little bit of C and a little bit of Rust, there is usually a straightforward mapping you can apply. You'll want to make things reasonably idiomatic, but that still isn't a big challenge. I think the most annoying part was replacing C macros with Rust constants, where you have to declare the type upfront. I started by declaring them all as usize, but I had to go back several times and fix them up as the driver took shape. It didn't take much time, but it was one of those things that felt more burdensome than it was, due to its repetitive nature.
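
A made-up illustration (these constants are not from the actual driver) of why the type choice matters: the C macro carries no type at all, while the Rust constant forces you to commit to one, and the right one only becomes obvious once the constant is used.

// C originals (hypothetical):
//   #define REG_CONFIG   0x02
//   #define NUM_CHANNELS 4
const REG_CONFIG: u8 = 0x02;   // ends up being written to an 8-bit register
const NUM_CHANNELS: usize = 4; // ends up being used as an array length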

The real challenge of the rewrite was the lack of safe access to the kernel APIs, which are written in C. Now, it would technically be possible to do a one-to-one rewrite and call the C functions directly in unsafe blocks. But that would be boring and miss the point. The goal of the Rust for Linux project is to enable writing drivers without the use of unsafe.
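
For a taste of what such a one-to-one port would look like, here's a hypothetical sketch of a direct call, using a C function we'll meet again later (parent_ptr is a raw pointer we'd have to manage ourselves):

// Hypothetical: call the C function through the bindgen-generated bindings
// and juggle raw pointers by hand.
let child = unsafe {
    bindings::fwnode_get_next_child_node(parent_ptr, core::ptr::null_mut())
};
// We now hold a reference on `child` and must remember to release it with
// fwnode_handle_put() -- nothing stops us from forgetting that, or from using
// the pointer after it has been released.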

So, what's the solution? You go on the internet searching for patches that add safe API wrappers in Rust. There used to be a GitHub repo with a list of out-of-tree patches. But I won't post the link, because it's not maintained anymore. At this point, I think the best idea is just to say hello on the Rust for Linux Zulip chat and ask. The people there are very kind and helpful! I couldn't have finished this project without them.

I ended up finding something for most of the APIs I needed, but none of them were in a state where I could just use them as-is. No wonder they weren't merged upstream. So that was the start of the actually interesting work: Improving these abstractions to the point where I could use them.

But first, we have to talk about...

Git gud

I consider myself a proficient user of Git. I read the Pro Git book cover to cover, nurtured a beautiful config file over the years, and I always look forward to cleaning up history with interactive rebase.

But the Linux kernel is built different! It presented me with challenges I hadn't encountered in any other project. Here are the main reasons why I think version control in the kernel is especially difficult:

  1. The kernel moves really fast. There is no internal API stability (which is good), but it means that patches that haven't been touched in a couple of months probably don't apply cleanly anymore. Depending on how broad a surface area your changes touch, you're gonna spend a lot of effort just keeping pace with upstream changes. If you're maintaining patches for several unrelated kernel subsystems at the same time, this adds up.

  2. There are a lot of actively used forks. I get it, that's what Git is made for. It's a DVCS, a distributed version control system. Everybody including me loves this fact. But pragmatically, juggling dependencies from a bunch of different remotes that move at different speeds is still hard.

It wouldn't surprise me at all if there are a bunch of highly specialized "Git tricks" that experienced kernel devs use so they don't have any problems. Whatever they are, I don't know about them. And whoever starts working on the kernel for the first time probably doesn't either, because few other projects are like this.

The approach that seemed to work best for me was to pick one branch as my "main base". That was rust-next for me. I pulled in other branches not with a merge-commit, because that would've made interactive rebase of parallel branches really annoying. Instead, I --squash merged them. It feels wrong, but works well. These squash-merged branches are easy to identify, rebase and recreate if you need to pull in new changes from upstream. Your mileage may vary.
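
For reference, pulling in a branch that way is nothing fancy (the remote and branch names here are just examples):

git fetch driver-core driver-core-next
git merge --squash driver-core/driver-core-next
git commit -m "squash-merge driver-core-next"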

On top of the "base branch", I kept my commits as parallel as possible, taking care that commits which don't have a logical dependency on each other also have no dependency in the commit tree. At the end, all of these parallel strands are merged together (which does cause some conflicts, but mostly isolated to trivial things like import statements) so you have one commit to work on top of.

There is some git configuration that's absolutely necessary to make any of this work:

[rebase]
	autoSquash = true
	autoStash = true
	rebaseMerges = true # == no-rebase-cousins
	updateRefs = true
[rerere]
	enabled = true
	autoupdate = true

With that, you can work on several branches in parallel and use git rebase -i <common base> to modify the entire tree at once. All branches are updated automatically if you make sure your current HEAD is a descendant of them.

That being said, Jujutsu is much better at this kind of workflow, and it's what I was actually using most of the time. I first discovered this workflow while using Jujutsu and only later found out you could do something similar with Git.

a preview of the commits I was carrying around
@  yxstvo (empty) (no description set)
○  lmpxkx devel add justfile
○  vtrqzo suppress warnings in gpio consumer
○  ooxxwu add devicetree overlay for raspberry pi setup
○  xkvmyq squash-merge rpi-6.15.y
○  pwtkou ds90ub954 add DS90UB954 FPD-link driver
○  stulrs abstractions (empty) Merge branch 'property-samples'
○    tmzrox WIP add some shady pointer casts
├─╮
│ ○  qkwmyy property-samples samples: rust: platform: Add property read examples
│ ○  zmuolw rust: device: Add property_get_reference_args
│ ○  wmxqvv rust: device: Add child accessor and iterator
│ ○  mokttv rust: device: Implement accessors for firmware properties
│ ○  nvqtpo rust: device: Introduce PropertyGuard
│ ○  ztlyrs rust: device: Enable printing fwnode name and path
│ ○  zxnmnp rust: device: Add property_present() to FwNode
│ ○  orwutq rust: device: Enable accessing the FwNode of a Device
│ ○  rxkzvu rust: device: Create FwNode abstraction for accessing device properties
│ ○  tksmrr property-cover-letter (empty) More Rust bindings for device property reads
│ ○  mmpktl squash-merge driver-core-next
○ │    styoku (empty) Merge branch 'i2c-new-client'
├───╮
│ │ ○  yylymw i2c-new-client WIP: rust: i2c: add new_client_device
○ │ │    svnpmx Merge branch regmap-read-write
├─────╮
│ │ │ ○  vvlqvr regmap-read-write rust: regmap: add read & write methods
│ │ │ ○  nmzrzs regmap-without-fields rust: regmap: remove requirement of using fields
│ │ ├─╯
│ │ ○  nwlvzp rebase/fabo/b4/ncv6336 fixup! rust: i2c: add basic I2C client abstraction
│ │ ○  zuwnxw arm64: dts: qcom: apq8039-t2: add node for ncv6336 regulator
│ │ ○  wuuvwk regulator: add driver for ncv6336 regulator
│ │ ○  nsmoos regulator: dt-bindings: add binding for ncv6336 regulator
│ │ ○  kxzqsz rust: regulator: add support for regmap
│ │ ○  xkkozn rust: regulator: add Regulator Driver abstraction
│ │ ○  quzqpp rust: regulator: add abstraction for Regulator's modes
│ │ ○  msnmmu rust: error: add declaration for ENOTRECOVERABLE error
│ │ ○  wzmvqk rust: add abstraction for regmap
│ │ ○  wnrpon rust: i2c: add basic I2C client abstraction
│ │ ○  oszory (empty) Regulator driver with I2C/Regmap Rust abstractions
│ │ ○  npwzrw Fix KUNIT for genmask
│ │ ○  qmzpnz rust: kernel: add support for bits/genmask macros
│ │ ○  olktmv rust: Add a Sealed trait
│ ├─╯
○ │    wtwsyy (empty) Merge branch gpio-cansleep and delay-msleep
├───╮
│ │ ○  suosxr delay-msleep rust: delay: add msleep function
│ ├─╯
○ │  zxlyqy rust: gpio: discard error
○ │  xwyvvv gpio-cansleep rust: gpio: add set_value_cansleep method
○ │  vmtzmo gpio-consumer (empty) Cherry-pick rust gpio consumer
○ │  mtovzn rust: gpio: add GPIOD consumer abstractions
├─╯
  mtuwzr rust-next rust: platform: fix docs related to missing Markdown code spans
│
~

I'd be interested to know how other people deal with similar situations.

I assume the problem of version control in the kernel is worse with Rust right now, because several basic abstractions are being very actively worked on in parallel. If I were writing a driver in C, I would expect fewer problems with multiple subsystems changing from under me (with these breaking changes possibly being scattered among several remotes).

Reading device tree properties from Rust

The driver I was rewriting was configurable via the device tree. The kernel API to read these configuration values from the driver is in drivers/base/property.c - the "unified device property interface". This is where I did my most interesting work, and the resulting patches are expected to land in Linux 6.17. Let's look at some examples of what a safe kernel API wrapper looks like.

Wrapping a C struct in Rust

Pretty much all functions related to reading device properties take a pointer to a struct fwnode_handle. In order to write a safe API around that, we need to wrap it in a custom Rust type. Here's what that looks like:

#[repr(transparent)]
pub struct FwNode(Opaque<bindings::fwnode_handle>);

The binding to the C struct is automatically generated with bindgen. #[repr(transparent)] guarantees that this Rust type and the C type it wraps have the same memory layout. The last interesting bit is Opaque<T>. It removes many of the assumptions the Rust compiler usually makes for optimization reasons. For example, the wrapped value may be mutated behind our back through aliased pointers. The value can also be invalid according to its type, e.g. Opaque<bool> may sometimes store 3. This is necessary, because these values are shared with C.
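
The snippets below also call self.as_raw(). That's a small accessor on the wrapper which, in the usual pattern for such types, just hands out the raw pointer stored inside the Opaque (a sketch, not necessarily the exact upstream code):

impl FwNode {
    /// Returns a raw pointer to the wrapped C struct.
    pub(crate) fn as_raw(&self) -> *mut bindings::fwnode_handle {
        self.0.get()
    }
}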

Reference counting is the GOAT

Before I started doing kernel development myself, my opinions on the Rust for Linux project were entirely based on my affinity for the Rust language and other people's opinions online. One opinion that I remember being reiterated constantly on the internet is that kernel development in Rust is too hard and not worth it, because the kernel has to deal with raw pointers all the time and it's basically impossible to track their lifetimes statically, meaning you'll have major problems appeasing the borrow checker. That sounded plausible to me and I partially believed it.

Turns out that opinion is totally wrong. In reality, C has to deal with the lifetimes of pointers too. The fact that the C compiler doesn't help you find problems doesn't make the underlying problem go away. So how does the kernel deal with this? Reference counting, lots of it. This is brilliant, because it nips any potential altercation with the borrow checker in the bud.

There's just one small technical hurdle: We still need some abstraction on the Rust side to ensure the reference count is handled safely. Thankfully, this was already a solved problem when I joined the party. Here's what the solution looks like:

unsafe impl crate::types::AlwaysRefCounted for FwNode {
    fn inc_ref(&self) {
        unsafe { bindings::fwnode_handle_get(self.as_raw()) };
    }

    unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
        unsafe { bindings::fwnode_handle_put(obj.cast().as_ptr()) }
    }
}

So, for a reference-counted type, you implement the AlwaysRefCounted trait on the Rust wrapper. The trait specifies how to increment and decrement the refcount. This is usually just calling the binding to the correct C function. This trait implementation makes FwNode usable with the smart pointer ARef<T>. Cloning and dropping an ARef<FwNode> automatically handles the refcount. Since the type is always reference counted, there is even a blanket implementation to convert a &T to an ARef<T>. So you can see, this is very ergonomic to use on the Rust side!
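
A small usage sketch of what that looks like in practice (the variable names are made up):

// `node` is a `&FwNode` we got from somewhere; the blanket impl bumps the
// refcount and gives us an owned, counted reference.
let owned: ARef<FwNode> = node.into();

// Cloning increments the refcount, dropping decrements it...
let another = owned.clone();
drop(another);
// ...and when `owned` goes out of scope, dec_ref() is called automatically.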

Iterators: as enjoyable in the kernel as anywhere else

With the basic abstraction over fwnode_handle in place, we're ready to read some device tree properties. Reading integers and strings is relatively boring, you just make sure to call the C function correctly. So let's skip to the interesting bits.

A node in the device tree may have a variable number of children. In order to read them, we have to do some kind of iteration. How does it work on the C side?

#define fwnode_for_each_child_node(fwnode, child)			\
    for (child = fwnode_get_next_child_node(fwnode, NULL); child;	\
         child = fwnode_get_next_child_node(fwnode, child))

struct fwnode_handle *child;
fwnode_for_each_child_node(parent, child) {
    // process child
}

So, there's a macro to help with the iteration. The function fwnode_get_next_child_node takes a pointer to the previous node and returns a pointer to the next one. It also decrements the refcount for the pointer you pass in and increments it for the one it returns, so the caller doesn't have to do anything in the normal case. The iteration starts and ends with a null pointer.

We can't call the macro from Rust, so we have to reimplement the iteration logic:

impl FwNode {
    pub fn children<'a>(&'a self) -> impl Iterator<Item = ARef<FwNode>> + 'a {
        let mut prev: Option<ARef<FwNode>> = None;

        core::iter::from_fn(move || {
            let prev_ptr = match prev.take() {
                None => ptr::null_mut(),
                Some(prev) => ARef::into_raw(prev).as_ptr().cast(),
            };
            let next = unsafe {
                bindings::fwnode_get_next_child_node(self.as_raw(), prev_ptr)
            };
            if next.is_null() {
                return None;
            }
            let next = unsafe { FwNode::from_raw(next) };
            prev = Some(next.clone());
            Some(next)
        })
    }
}

What's happening here is essentially the same thing as in the C macro. First, we declare the variable prev so we can remember which node we looked at in the last iteration. Then we create an iterator with iter::from_fn, passing it a closure that's run to get the next item. The closure first determines the pointer to the previous element, which may be a null pointer. That is passed to fwnode_get_next_child_node, which yields the next item. If it's null, we're done. Otherwise, we remember it for the next iteration and return it.

While this implementation is more verbose, the way to use it is even nicer than the C version:

for child in parent.children() {
    // process child
}

It's a normal Rust iterator, so you can just write a regular for loop! Lovely.

Also, did you notice that we accidentally fixed a footgun of the C macro? If you somehow break out of the loop in C, the refcount is not decremented for the node you were processing at that time. That would cause a memory leak. But in the Rust version, this does not happen. If you break out of the loop, the iterator is dropped. The iterator has ownership of the prev variable, so its destructor is run as well. That decrements the refcount of any remaining node, preventing a memory leak.
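
To make that concrete, here's a sketch (the condition is made up):

for child in parent.children() {
    if child_is_interesting(&child) {
        // Breaking out drops the iterator, which drops its `prev` node and
        // releases the refcount. The C macro would leak a reference here.
        break;
    }
}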

Buffer overflows begone!

Another interesting example is the function fwnode_property_get_reference_args. Semantically, it returns a variable number of these "reference args". But C has no good way to express that, so what it actually does is take a pointer to a struct fwnode_reference_args, which looks like this:

struct fwnode_reference_args {
    struct fwnode_handle *fwnode;
    unsigned int nargs;
    u64 args[NR_FWNODE_REFERENCE_ARGS];
};

The function then fills a certain number of elements in args and indicates that number with nargs. Users of this function can introduce undefined behavior by reading from an element beyond the bound of nargs. So, to make this safe, a Rust wrapper function needs to be a little smarter. Here's what we ended up doing:

#[repr(transparent)]
pub struct FwNodeReferenceArgs(bindings::fwnode_reference_args);

impl FwNodeReferenceArgs {
    pub fn as_slice(&self) -> &[u64] {
        unsafe {
            core::slice::from_raw_parts(
                self.0.args.as_ptr(),
                self.0.nargs as usize,
            )
        }
    }
}

We created a thin wrapper around the struct fwnode_reference_args. The encapsulation prevents users from accessing the array directly. Instead, users access it with the as_slice method, which creates a regular Rust slice over the filled-in portion of the array. In Rust, it's impossible for safe code to read beyond the bounds of a slice. Mission accomplished!
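
Usage then looks like working with any other slice (a sketch; in the real driver, `args` would come out of a property lookup):

// `args` is a FwNodeReferenceArgs obtained from a property lookup. Only the
// `nargs` elements that were actually filled in are visible through the slice;
// out-of-bounds indexing panics instead of reading garbage.
for value in args.as_slice() {
    pr_info!("reference arg: {value}\n");
}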

LKML: The end boss of kernel development

Contributing to Linux was my first time interacting with a mailing list, at least for the purpose of sharing and reviewing code. I thoroughly hated the entire process. I tried in vain to write about my experience in a constructive manner, but it always turned into an unhinged rant, so I gave up. In summary, I think that sending and reviewing patches via email is exactly as insane as it sounds.

Part of me wants kernel.org to host an instance of Forgejo and for everyone to just migrate to that. Another part of me recognizes that this would actually be a step back for the kernel. I self-host Forgejo and it's great, but it inherits the pull request UI from GitHub. GitHub PRs don't deal well with force-pushes, which kernel developers would be doing a lot. They iterate on their patches many times, working toward a clean history, which is laudable. A GitHub-style pull request UI simply doesn't benefit such a workflow (and may even be actively harmful).

As much as I hate the mailing list, I don't think there's a strictly better option at the moment.

I have some ideas about what a worthy successor to the mailing list could look like, but I'll save that for another time. Ideally when I have something to show!

Closing thoughts

I love Rust, so I was already biased to be positive about the Rust for Linux project, even before dabbling with it myself. I'm genuinely surprised to be even more optimistic now than before. The coding part was much easier than I imagined, thanks to the use of reference counting in the kernel.

And the promised benefits of Rust over C? They're absolutely real. The Rust version of the driver feels way more robust than the C code, not just regarding memory safety. It didn't have a single bug: Once it compiled, it worked. That's not a huge deal considering it was a direct rewrite, but it counts for something.

The Rust for Linux project still has a long way to go. There are many APIs waiting to be wrapped. I've seen enough for now, but maybe there's a challenge in there for you?


Update: Someone asked for links to the actual code. The original driver is split into a .c and a .h file. The Rust driver is here. In my opinion, the driver code itself is really not all that interesting. It's surprisingly standard Rust code. That is of course thanks to the heavy lifting done by the safe API wrappers, which are way more interesting. I had to make small modifications here and there, so I won't list them all. But the most interesting ones (including all the code snippets discussed above) are part of a patch series that I posted to the LKML. A mostly finished version of it can be found here.