Rust4Linux with Danilo Krummrich
About bringing Rust to the Linux kernel
2025-12-11 61 min
Description & Show Notes
Bringing Rust into the Linux kernel is one of the most ambitious modernization efforts in open source history. The Linux kernel, with its decades of C code and deeply ingrained development practices, is now opening its doors to a memory-safe language. It's the first time in over 30 years that a new programming language has been officially adopted for kernel development. But the journey is far from straightforward.
In this episode, we speak with Danilo Krummrich, Linux kernel maintainer and Rust for Linux core team member, about the groundbreaking work of integrating Rust into the Linux kernel. Among other things, we talk about the Nova GPU driver, a Rust-based successor to Nouveau for NVIDIA graphics cards, and discuss the technical challenges and cultural shifts required for large-scale Rust adoption in the kernel as well as the future of the Rust4Linux project.
About Rust for Linux
Rust for Linux is a project aimed at bringing the Rust programming language into the Linux kernel. Started to improve memory safety and reduce vulnerabilities in kernel code, the project has been gradually building the infrastructure, abstractions, and tooling necessary for Rust to coexist with the kernel's existing C codebase.
About Danilo Krummrich
Danilo Krummrich is a software engineer at Red Hat and a core contributor to the Rust for Linux project. His fundamental contribution to Rust for Linux is the driver-core infrastructure, the foundational framework that makes it possible to write drivers in Rust at all. This includes both C and Rust code that provides the core abstractions for device drivers in the kernel. Danilo is a maintainer for multiple critical kernel subsystems, including Driver Core, DRM (GPUVM, Rust, GPU Scheduler), GPU drivers for NVIDIA GPUs (Nova, Nouveau), Firmware Loader API, as well as Rust bindings for PCI, DMA, and ALLOC. He is the primary developer of the Nova GPU driver, a fully Rust-based driver for modern NVIDIA GPUs.
Links From The Episode
- AOSP - The Android Open Source Project
- Kernel Mailing Lists - Where Linux development happens
- Miguel Ojeda - Rust4Linux maintainer
- Wedson Almeida Filho - Retired Rust4Linux maintainer
- nouveau driver - The old driver for NVIDIA GPUs
- Vulkan - A low level graphics API
- Mesa - Vulkan and OpenGL implementation for Linux
- vtable - Indirect function call, a source of headaches in nouveau
- DRM - Direct Rendering Manager, Linux subsystem for all things graphics
- Monolithic Kernel - Linux's kernel architecture
- The Typestate Pattern in Rust - A very nice way to model state machines in Rust
- pinned-init - The userspace crate for pin-init
- rustfmt - Free up space in your brain by not thinking about formatting
- kunit - Unit testing framework for the kernel
- Rust core crate - The only part of the Rust Standard Library used in the Linux kernel
- Alexandre Courbot - NVIDIA-employed co-maintainer of nova-core
- Greg Kroah-Hartman - Linux Foundation fellow and major Linux contributor
- Dave Airlie - Maintainer of the DRM tree
- vim - not even neovim
- mutt - classic terminal e-mail client
- aerc - a pretty good terminal e-mail client
- Rust4Linux Zulip - The best entry point for the Rust4Linux community
Transcript
Welcome to another episode of Rust in Production, a podcast about companies
who use Rust to shape the future of infrastructure.
My name is Matthias Endler from corrode, and today I'm joined by Danilo Krummrich
from the Linux Kernel Project to talk about bringing Rust into the kernel.
Danilo, thanks so much for taking the time for the interview today.
Can you quickly say a few words about yourself and about Red Hat,
the company you work for?
Of course, thanks for the invitation. Yeah, so my name is Danilo and I'm a Linux
kernel engineer working at Red Hat in the accelerators and GPU group.
And I'm also a maintainer in the Linux kernel of various subsystems,
components and drivers, and a member of the Rust4Linux team.
It's almost intimidating to think that you work on such an important project
and also on such a low level.
I guess a lot of people that are listening in either use Linux or get in contact
with Linux on a daily basis because, you know, we use a ton of servers nowadays
and a lot of these run on Linux.
What does it feel like to work on that level and how did you even get started
with kernel development?
I don't think it necessarily feels different than working on any other software
project, I would guess. I cannot really tell.
I worked on a few other projects in the past. Ten years ago, I worked a little
bit on AOSP, the Android Open Source Project.
I worked there at the kernel level as well, but also on some user space portions.
I was always interested in the kernel since I finished my exams at university.
When I joined the first company I worked at as a student,
I just told people: hey, I'm interested in the kernel, just get me to the base layer team.
And then I started working there, and I started learning a lot. The learning curve was very, very steep.
When was the first time you personally heard of the Rust4Linux project?
When was the first time you heard about, I mean, discussions,
adding Rust to the Linux kernel?
I would say almost from the beginning, because I'm following what's going on
in the project, what's discussed in mailing lists.
But this was more reading along. It was not that I immediately thought,
okay, this is definitely something I need to get involved in. But I was aware.
Did you know Rust back then?
So this was all new for you, but what did it feel like?
Can you remember the time, the feeling, maybe, when thinking about the adoption
of Rust? And was that the time when you also looked at the language for the first time?
Can you talk about your first impressions of the language coming from a mostly
C background, I'm guessing?
It was a bit intimidating, I would say. The C language is very, very simple.
And in the kernel, we are doing very, very complicated things.
With a very, very simple language.
With Rust, it's different. Rust is a very difficult language, I would say.
But at the same time, it makes some of the problems we have to solve much easier to solve.
So I guess that's a good trade-off.
Did you ever second-guess the adoption of Rust in a Linux kernel,
or was it always clear to you that there were very obvious benefits?
So after I got used to the concepts of Rust and understood what they are about
and understood how I can utilize them, I don't have any doubt.
This is really... or let me say it the other way around: I'm very, very convinced
that Rust definitely has a big future in the kernel.
How long did it take to build out the entire infrastructure that
is needed for such a project, inside the kernel and outside the kernel, from the Rust side, I mean?
So a lot of the initial work was done before I joined the project,
so I cannot say a lot about that.
But I think, and if I'm forgetting someone I'm very sorry about it,
it was mainly Miguel and Wedson, so Miguel Ojeda and Wedson Almeida,
who started the Rust4Linux project and, in a downstream branch,
worked out a lot of infrastructure
and the initial support in the Linux kernel, build system integration, and so on.
And I think the initial lift there
was to convince people that it's worth adding it to the Linux kernel.
There were approaches adding a second language into the kernel in the past.
I think C++ was tried a couple of times, and it failed badly.
So adding a new language to the Linux kernel is definitely not an easy thing to attempt.
And they were successful. So I think this milestone was really one of the big ones.
Why has Rust succeeded, quote unquote, to be adopted in the Linux kernel and C++ not?
So I think that a subset of C++ would have been a great addition to the kernel.
Just to name two examples, the first one being inheritance.
The C kernel code relies a lot on the inheritance pattern. So getting some language
support for that would definitely have been a great improvement.
And the second one that I think is very, very useful is as simple as field visibilities of structures.
So we have the case in the kernel that generic components obviously have fields that are,
or should be, accessed by design by the users, but there are also private fields.
And what we often end up with is that drivers, instead of representing their needs
by contributions to the common component in the kernel,
abuse private fields of the structure and basically peek into component internals,
which obviously is not considered by the component maintainers in
the end, and hence causes problems in maintainability and can actually lead to bugs in the kernel.
So those are problems that are solved by Rust as well.
And I think overall, Rust is just the much better fit for the kernel because
we have much less things that are not a good fit for the kernel.
Plus, it has all the additional features that help with memory safety that C++ does not have.
However, it also brings some disadvantages with it, which is the common practice
in Rust to just unwrap results or unwrap options. So basically operations that panic the program.
And in user space, that's a sane option in a lot of cases.
But in the kernel, it almost never is a sane option because a panic in the kernel
is really just the last resort when the system is in a state where it's basically
non-recoverable and otherwise you would corrupt memory or in the worst case
even corrupt file systems.
So this is something that we have to take care of when using Rust in the kernel:
making it hard for people to accidentally trigger panics in the kernel.
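To make the difference concrete, here is a minimal userspace sketch (not kernel code; the `try_allocate` helper is made up) of propagating a fallible operation with `?` instead of unwrapping it:

```rust
// Hypothetical fallible allocation that reports failure instead of panicking.
fn try_allocate(len: usize) -> Result<Vec<u8>, &'static str> {
    if len > 4096 {
        return Err("allocation too large");
    }
    Ok(vec![0u8; len])
}

fn fill_buffer(len: usize) -> Result<Vec<u8>, &'static str> {
    // In user space, `try_allocate(len).unwrap()` is often acceptable: the
    // process aborts and the rest of the system keeps running. In the kernel,
    // a panic can take down the whole machine, so the error is propagated to
    // the caller with `?` and handled there.
    let buf = try_allocate(len)?;
    Ok(buf)
}
```

This is the shape the kernel-side abstractions push you toward: fallible operations return a Result rather than offering a panicking shortcut.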
So what was your first task when starting with Rust4Linux?
Was it something that you voluntarily picked up, or was there a need at work
where you had to touch the Rust part?
I think it really started with me taking maintainership of the Nouveau driver.
And it's also a result out of the way forward with Nouveau and the introduction of Nova.
So with the Nouveau driver, we have a lot of issues which are...
So back then when I started at Red Hat, I started working on a component that's
named GPUVM for the DRM subsystem.
I mean, it's basically managing the GPU's virtual address space and doing some
more things in the context of providing
helpers for drivers to implement Vulkan-compatible user space APIs.
So APIs that are given out to user space drivers, for instance, Mesa.
So the work of GPU VM I did in the context of adding a new UAPI to Nouveau,
which is Vulkan-compatible.
And in order to get that going in Nouveau, we needed to do quite some changes
on how the page table management of the GPU itself works.
But it was never possible to do it to the full and correct extent.
We still have problems there.
And that's because the design of how Nouveau is written just doesn't really
work out well for the kind of requirements that we get from a Vulkan-compatible UAPI.
So those design reworks would be a lot of effort,
especially considering that Nouveau
supports a huge range of GPUs and GPU architectures and generations.
So you can basically split them up into the generations before NVIDIA introduced
GSP and after NVIDIA introduced GSP.
So GSP means GPU system processor, and it's a firmware processor that lives within the GPU.
And instead of programming the hardware directly, if you have a GSP GPU,
you can just talk to the firmware through a ring buffer and a firmware-specific
protocol and instruct the firmware
to do the things that you would usually program through registers.
And by reworking all those layers to get a Vulkan-compatible UAPI, we would need
to consider the whole range of GPUs, the ones pre-GSP and the ones since GSP introduction.
So that would be a lot of effort just to rework something we also have other problems with.
And the other problems we have in Nouveau are how well it's documented and how it's designed.
Nouveau consists of lots and lots of vtables with callbacks that
are not really documented in terms of ownership and lifetimes of the objects
that are allocated, returned, and handled, which is one of the huge problems.
And one problem that is really important to me personally to address is that Nouveau
is just not really accessible to users. We don't have a lot of contributors, and I
think the reason for that is that just no one really understands how it works
and doesn't get comfortable working with that driver. And for an open source
project, that's not really what you want.
So where we ended up was to decide at a certain point: we should do a new driver,
a driver that is GSP-only. And if we do a new driver now, we saw
that Rust4Linux is progressing.
We thought, okay, if we do a new driver now, we have the choice: do we do another C driver,
or do we do a Rust driver, where the language solves a lot of problems for us that DRM drivers have?
So DRM drivers, GPU drivers, are usually very, very complicated,
suffer from race conditions, suffer from memory issues, just because they're
complicated and it's hard to get it right.
And Rust helps a lot with that. So we decided to go for a Rust driver, or
actually, as a first step, to evaluate if we want to do a Rust driver. And this is where
I got in touch with Rust in the kernel for the first time, because I started doing this
evaluation of: is a Rust driver for the successor of Nouveau doable, is it reasonable, and what are the
things that we need to consider, what are the things that we need to do to get it going?
And so at this point in time, you're
thinking: is Rust a better choice for writing a completely new driver for NVIDIA
graphics cards for the Linux kernel, and moving away from the legacy codebase, which
was Nouveau? Back in the day... I'm not even sure when Nouveau started, but
it's been around for a while, right?
Yeah, I think it's around for 15, 20 years.
Yeah, okay. So that must mean the codebase is relatively evolved, and it's probably
complicated to maintain. I'm not sure about the status, but I'm just guessing
that with 15 or 20 years of development time, it might not be the easiest
codebase to work with, especially for beginners.
It absolutely grew over a long period of time, yeah.
And the lack of documentation, and the fact that Nouveau was maintained
for a long time by a single person, and all the knowledge about how things are
meant to work within the driver only really exists in the head of this person,
doesn't really help either.
Okay, but when you describe it like that, it sounds really daunting to start
a new driver because you kind of need to get that business logic,
port it over first to a new language,
second from an older code base, which maybe you are not 100% familiar with.
I'm not sure. Maybe you were. Yeah.
There might be a lot of internals or assumptions or implications in the code,
which will make that port harder as well.
And on top of it, you also have to understand the structure of the Linux kernel
itself and how subsystems and drivers interact and so on.
So it sounds like a very challenging task.
Oh yeah, it is challenging, no question. But it's not like we're porting over the logic.
The good thing is that we're starting from GSP only,
which means that not only, but to a large extent, we really have to understand
how the firmware interface works.
And this is absolutely doable and doesn't really require understanding every
single bit of what the Nouveau code does.
And so you can start to port over that part alone in isolation.
Yeah, so what we went for is we basically split the Nova driver into two separate drivers.
One of them, which we call NovaCore, provides the hardware and firmware abstraction layer,
which is needed not only by Nova DRM, which is the second Nova driver, basically,
and implements all the DRM parts.
So DRM is the Direct Rendering Manager, the subsystem in the kernel that
handles GPUs and accelerators.
So this is then the second driver, and it sits on top of NovaCore and
utilizes the hardware through this abstraction layer.
This is necessary for two reasons. The first is that there is also the NVIDIA vGPU driver,
which is currently making its way upstream, and which is a virtualization layer,
or actually it's only really the manager of the virtualization layer, because
it manages the PCI virtual functions.
So the GPU has one physical function that is exposed to the system,
to the host kernel, which is where a normal driver would sit on top of.
But additionally, there are protocols in the PCI layer that allow
the GPU to also expose virtual functions, which look as if
they were additional GPUs, but actually they are virtualized within the graphics card.
And vGPU manages those through NovaCore, by utilizing NovaCore to expose those virtual
functions to virtual machines. And then virtual machines can run NovaCore again,
and then Nova DRM on top, to actually expose the full graphics stack.
So that's one reason why we have this driver split, because we have basically
multiple clients for the firmware and hardware abstraction layer that lives within NovaCore.
The second reason is that the firmware API that is exposed by the GPU
is not stable, or at least we cannot rely on this API being stable.
And that's one of the things where Rust helps a lot as well,
because abstracting the firmware API is much easier with the powerful Rust type
system and things like proc macros, compared to C, where we,
in the end, would need to build endless vtables to differentiate between the
different firmware versions. And then you can build more vtables to gather the
common code that's maybe similar between certain versions,
which becomes a huge mess and is one of the reasons why Nouveau is also not sustainable.
And in Rust, that would be a proc macro, which could be expanded,
which might end up being the same Vtable, but you don't have to write it yourself.
Yes. Yes.
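As an illustration only (these are not the actual Nova types, and the version numbers are just examples), the idea is that each firmware version gets its own implementation of a common trait, and version-independent driver code is written once against that trait; a proc macro can then generate the per-version plumbing that C would express as hand-written vtables:

```rust
// Illustrative sketch, not the real Nova firmware bindings.
trait GspFirmware {
    const VERSION: &'static str;
    fn boot_args_size(&self) -> usize;
}

struct FwR535;
struct FwR570;

impl GspFirmware for FwR535 {
    const VERSION: &'static str = "535";
    fn boot_args_size(&self) -> usize { 0x100 }
}

impl GspFirmware for FwR570 {
    const VERSION: &'static str = "570";
    fn boot_args_size(&self) -> usize { 0x180 }
}

// Version-independent code, written once; the compiler generates the
// per-version dispatch instead of a hand-maintained vtable.
fn boot<F: GspFirmware>(fw: &F) {
    println!("booting GSP firmware {}: {} boot-arg bytes",
             F::VERSION, fw.boot_args_size());
}

fn main() {
    boot(&FwR535);
    boot(&FwR570);
}
```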
Now, a bit of a silly question, but bear with me for a second.
How do you even start with developing such a driver in the kernel?
You can't just possibly say cargo new.
How do you initiate a new driver?
From the technical side of things, it's very easy because the major work of
integrating the Rust compiler into the Linux kernel has already been done before
I joined the Rust4Linux project.
And this is also what the Rust tree in the kernel is actually about.
It's about compiler and build system support.
It also carries a lot of other things, a lot of other base infrastructure
that we have in the kernel as well.
But its main focus is compiler and build system.
So you basically create a new driver the exact same way as you would create
a new driver in C in the kernel.
You create a Kconfig file, create a new config option for the kernel to include this new
component, which is, I don't know, 10, 15, 20 lines, most of
that being just the module description. And then
you have a Makefile where you add the core .rs file, which then pulls in all other
modules, to the variable that gathers all the source files for the kbuild build system.
And that's it, and then it builds. So this part is actually pretty easy. We're not using Cargo.
So we're integrating the Rust compiler into kbuild.
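For illustration, the boilerplate Danilo describes is roughly the following (the driver name here is made up; the real examples live under samples/rust/ in the kernel tree, and details vary):

```
# Kconfig: a new config option so kbuild knows about the component
config MY_RUST_DRIVER
	tristate "My Rust driver"
	depends on RUST
	help
	  An example driver written in Rust.

# Makefile: hook the top-level .rs file into the kbuild object list
obj-$(CONFIG_MY_RUST_DRIVER) += my_rust_driver.o
```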
Okay, but then you have a driver which does essentially nothing.
You still have to communicate with the outside world. What does that look like?
Exactly. So then you basically have a driver stub.
Actually, you don't even have a driver stub. The only thing you really have
is a so-called kernel module, which has an entry point and an exit point,
which is the init function and the exit function.
That's it. In the Rust abstraction, it's basically an init function, and the
exit is basically just the drop of the module structure.
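That bare module looks roughly like this; this is paraphrased from memory of the kernel's Rust sample modules, so the exact macro fields and trait names may differ between kernel versions:

```rust
use kernel::prelude::*;

module! {
    type: MyModule,
    name: "my_module",
    author: "Jane Doe",
    description: "A module that does nothing yet",
    license: "GPL",
}

struct MyModule;

impl kernel::Module for MyModule {
    // The entry point, called when the module is loaded.
    fn init(_module: &'static ThisModule) -> Result<Self> {
        pr_info!("my_module loaded\n");
        Ok(MyModule)
    }
}

// The exit point is just Drop on the module structure.
impl Drop for MyModule {
    fn drop(&mut self) {
        pr_info!("my_module unloaded\n");
    }
}
```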
But that's all you got. Now you need some driver infrastructure.
And that's where the first challenge was for the Nova project because a lot
of work has been done before I joined the Rust4Linux project.
Like the initial integration of the Rust compiler into the build system,
module support, lots of generic infrastructure that you need.
So abstractions for handling C reference counted structures on the Rust side, for instance.
Some locking stuff was there. So really just the absolute fundamental basics.
But there was no driver infrastructure. So you weren't able to write drivers.
So the first thing that we actually needed to come up with was core driver infrastructure,
which is the representation of a device, representation of a driver,
representation of a bus, and all the glue code to connect things together.
But I think before I talk more about that, we need to go one step back and
talk about what we call abstractions in Rust4Linux.
Right.
Yeah, so the approach we take, and what Rust focuses on in the kernel, is really drivers.
This is where we can get most of the value of the language in the short term.
I think also because of the architecture of Linux: Linux is a monolithic kernel,
and therefore all drivers also run in kernel space, and if a driver messes things
up, it breaks the whole machine, right?
So if we are able to provide support for writing drivers in Rust, a huge part
of the safety and reliability of the kernel is achieved, also if we consider that
subsystem code, so really kernel core code, is reviewed and tested much more
thoroughly than driver code is.
So the most value in the shortest time really comes from drivers.
And the mechanism that people have worked out to interface with C infrastructure
is to write so-called abstractions. So we're not directly calling C functions
from Rust because that would be fundamentally unsafe.
Because then we would need to deal with raw pointers all the time.
And then we wouldn't really get a lot of the benefits from Rust.
So what we're doing instead is we're building abstractions, which means that
for a C structure and its corresponding functions, to achieve the functionality
that's intended by the structure in Rust, we write a Rust structure that in some way
embeds the C structure, or embeds a pointer to the C structure. It really depends on the actual case.
And then we write some Rust code around that API, so you can only use that API in a way that is safe.
So, for instance, if you obtain a device, a representation of a device in the
system, you would not get a pointer, but rather a reference to the device.
And instead of now dealing with only the reference of the device,
you also have certain device states. So, in Rust, that's just type states.
And in the kernel, a device can be in multiple states,
and we represent that by type states, and then only allow the functions that
are applicable for a certain state if the user calls them on
the abstraction type with the corresponding type state.
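A very rough sketch of that pattern (self-contained; the `bindings` module stands in for hypothetical bindgen output, not real kernel APIs): the raw C pointer and all unsafe code stay inside the wrapper, and users only ever see the safe methods:

```rust
// Stand-in for bindgen-generated declarations of a hypothetical C struct.
mod bindings {
    #[repr(C)]
    pub struct foo {
        _private: [u8; 0],
    }
    extern "C" {
        pub fn foo_enable(ptr: *mut foo) -> i32;
    }
}

/// Safe Rust wrapper around the C `struct foo`.
pub struct Foo {
    ptr: core::ptr::NonNull<bindings::foo>,
}

impl Foo {
    /// # Safety
    /// `ptr` must point to a valid `struct foo` that outlives this wrapper.
    pub unsafe fn from_raw(ptr: *mut bindings::foo) -> Option<Self> {
        core::ptr::NonNull::new(ptr).map(|ptr| Self { ptr })
    }

    /// Safe method: the invariant above guarantees the pointer is valid,
    /// so callers never handle raw pointers or write unsafe code themselves.
    pub fn enable(&self) -> Result<(), i32> {
        let ret = unsafe { bindings::foo_enable(self.ptr.as_ptr()) };
        if ret == 0 {
            Ok(())
        } else {
            Err(ret)
        }
    }
}
```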
Okay, that sounds super cool because type state is a very nice way to model state machines,
obviously, but I don't know if you had an equivalent on the C side before or
if that was a new integration or if that was a new abstraction that you added in that process.
So on the C side, if we stay at the device example, that's not existent on the C side.
On the C side, it's really just a raw struct device pointer,
and it's on the user to use it in the correct way.
Okay. Well, I made it sound like it was an obviously good idea,
but maybe we can also take a step back here and maybe describe what is the benefit
of the type state pattern, and maybe even what it is in the first place.
Yeah. So let me explain real quick how device, driver, and bus fits into the bigger picture.
So if you want to write a driver in the kernel, you have to implement the driver
structure of the particular subsystem. Let's say PCI.
So you implement the PCI structure, which requires you to implement a couple of callbacks.
Lots of them are optional. Some of them are mandatory.
The two most interesting callbacks are probe and remove.
The probe callback of the driver is called once the bus, in this case the PCI
bus, detects a device in the system that matches the driver.
So the bus basically detects there is a new device, and then it checks: what is
the vendor ID of the PCI device? What is the device ID of the PCI device?
It checks a couple more things, of course. This is a bit oversimplified.
But then it looks for a driver in the system that is registered that matches those attributes.
And if it finds a driver in the system that matches the attributes,
it calls the probe function that the driver registered and passes in the PCI
device structure in the probe function.
And then the driver can start operating the device.
It can obtain device resources such as I/O memory or interrupt handlers and
just do its business logic.
It gets a mutable pointer to that device, or what would be the input?
So in C, it's really just a pointer. If we're in Rust, it's a reference to a
PCI device structure with a certain type state.
So for probe, or actually for all the callbacks that come from a bus, like probe and remove.
Remove is obviously called when the device is unbound from the driver.
But in the bus callbacks, it gets the type state Core.
So then the device has the Core type state, and then you have access to certain
functions of the PCI device that you would otherwise not have.
And that's important, because in the bus callbacks a global bus lock is taken,
and that allows you to modify certain fields within the device structure that
are needed, for instance, to enable I/O memory in the first place, or bus mastering.
So by having this type state, we can ensure that those functions that are only
allowed to be called from bus callbacks are not called from anywhere else.
And the way they could be called from anywhere else is because device structures
are fundamentally reference counted in the kernel.
So everyone can hold on to a reference of that thing.
So it is theoretically possible, but in Rust, it's not.
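A simplified, self-contained sketch of that idea (not the real kernel Device type): the state is a type parameter, and the operations that need the bus lock only exist on the Core state that probe() receives:

```rust
use std::marker::PhantomData;

// Typestates as zero-sized marker types.
struct Core;   // inside bus callbacks like probe/remove, bus lock held
struct Normal; // everywhere else, e.g. a stored reference

struct Device<State = Normal> {
    id: u32,
    _state: PhantomData<State>,
}

// Available in every state.
impl<S> Device<S> {
    fn id(&self) -> u32 {
        self.id
    }
}

// Only available while in the Core state, i.e. from bus callbacks.
impl Device<Core> {
    fn enable_io(&self) {
        println!("enabling I/O memory for device {}", self.id);
    }
}

// The bus hands the driver a Device<Core> in probe(); code holding a
// long-lived reference only ever sees Device<Normal>, so calling
// enable_io() there simply does not compile.
fn probe(dev: &Device<Core>) {
    println!("probing device {}", dev.id());
    dev.enable_io();
}

fn main() {
    let dev = Device::<Core> { id: 7, _state: PhantomData };
    probe(&dev);
    // let stored = Device::<Normal> { id: 7, _state: PhantomData };
    // stored.enable_io(); // error: method only exists for Device<Core>
}
```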
Yeah, so a simplified version of that would be: you get a device, you are allowed
to call a couple of methods on it which give you a new state of that device,
a thing that you can hold on to, but then, crucially, you cannot call methods
that would be invalid. You try to make invalid state unrepresentable this way.
Yes.
That's pretty cool. It feels like the type state pattern is a very core part
of the abstractions that you needed for kernel development from Rust. Are there
any other components that you needed to build out which come to mind?
Anything that other systems programmers could also learn from?
I think one example may be how we deal with reference counted structures on the C side.
So in C, reference counting works by structures embedding a struct kref,
which is in the end an atomic counter that gives you a release callback once
it drops to zero, and then you have to implement that release callback in a certain
way to clean up the structure.
And this is a pattern that we have very often. Things in the kernel are very
often reference counted.
And in the abstractions, we had to deal with that.
So what people did was write a trait that is called AlwaysRefCounted, which,
implemented by the corresponding Rust structure representing the reference counted
C struct, provides common helpers and a common interface for reference counting.
That interface is the ARef type with a generic: you basically have your Rust
abstraction structure, let's say Device, and since the device is fundamentally
reference counted, you then end up with an ARef<Device>.
And this ARef knows that AlwaysRefCounted is implemented for the device,
because it's a trait bound, and then we can make those parts common.
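A loose, self-contained sketch of that pattern (the real kernel types differ in detail): a trait for objects whose reference count lives on the C side, and a smart pointer that bumps and drops that count:

```rust
use std::ptr::NonNull;

/// Objects that are always reference counted by the C side.
///
/// # Safety
/// Implementers must keep the count valid as long as any reference is held.
unsafe trait AlwaysRefCounted {
    fn inc_ref(&self);
    /// # Safety
    /// Must only be called to give up one previously acquired reference.
    unsafe fn dec_ref(obj: NonNull<Self>);
}

/// An owned reference to an always-reference-counted object,
/// analogous to the kernel's ARef<T>.
struct ARef<T: AlwaysRefCounted>(NonNull<T>);

impl<T: AlwaysRefCounted> Clone for ARef<T> {
    fn clone(&self) -> Self {
        // Cloning the handle takes an additional reference.
        unsafe { self.0.as_ref() }.inc_ref();
        ARef(self.0)
    }
}

impl<T: AlwaysRefCounted> Drop for ARef<T> {
    fn drop(&mut self) {
        // Dropping the handle gives the reference back.
        unsafe { T::dec_ref(self.0) };
    }
}

impl<T: AlwaysRefCounted> std::ops::Deref for ARef<T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { self.0.as_ref() }
    }
}
```

A device abstraction then shows up in driver code as ARef&lt;Device&gt; rather than as a raw, manually counted pointer.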
Well, typically in a pure Rust world, you would probably lean into ownership a
little more, like a lot of things just move between different components in Rust.
But I'm assuming that would be very hard to do in the Linux kernel, because the
rest of the code expects references and passes references around, or ref counted
structures, to be more precise.
We try to avoid reference counting if it's not absolutely necessary.
But if the C structure already does it, there's usually a very good reason for it.
So this is something we just have to adapt to.
I may have one other good example that is worth mentioning in this context.
You mentioned that in Rust, you usually use move semantics and try to not
reference count things or put things on the heap.
And this is another implication of writing abstractions for existing C code,
which is that a lot of the C structures and a lot of C code are actually self-referential.
Linked lists in the kernel are self-referential.
Lots of locks are subsequently self-referential. So this is why one of the Rust4Linux
team members came up with a solution for that, which is called pin-init.
And it basically does what the name implies,
which is in-place initialization and pinning.
I don't know if you have heard about pin-init, but there is also a user space crate.
It's actually maintained by Benno, who maintains pin-init as a user space crate
that can be used there but also in the kernel, but I think it originated from the kernel.
So moving things around is oftentimes not a possibility because of that.
So pin-init is a great example of one of those common problems that had to be solved.
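To illustrate why moving is off the table, here is a minimal userspace sketch (plain std, not the pin-init API itself) of a self-referential structure like the kernel's list_head: it must be initialized at its final address and then pinned, which is exactly the kind of initialization pin-init packages up ergonomically:

```rust
use std::pin::Pin;
use std::ptr::NonNull;

// A toy intrusive list head, like the kernel's `struct list_head`:
// an empty list points at itself, making the value self-referential.
struct ListHead {
    next: NonNull<ListHead>,
    prev: NonNull<ListHead>,
}

impl ListHead {
    // Must run *in place*: the pointers record the final address,
    // so the value may never move afterwards.
    fn init(self: Pin<&mut Self>) {
        // SAFETY: we only write through the reference and never move out of it.
        let this = unsafe { self.get_unchecked_mut() };
        let addr = NonNull::from(&mut *this);
        this.next = addr;
        this.prev = addr;
    }
}

fn main() {
    // Allocate first, then initialize at the final address.
    let mut head = Box::pin(ListHead {
        next: NonNull::dangling(),
        prev: NonNull::dangling(),
    });
    head.as_mut().init();
    // `head` is pinned now: safe code cannot move it, so the
    // self-references stay valid for its whole lifetime.
    println!("list head initialized at {:p}", &*head);
}
```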
From what we discussed so far, it feels like you built very solid abstractions,
but mostly just to bridge the
world from Rust to C, and also bridging the safe and the unsafe world.
But I did wonder, was it worth it?
Are there any improvements now in working with kernel drivers now that we have Rust support?
And was all of the investment into Rust worth it? Is it more than just safety?
Absolutely. So maybe to go one step back to the abstractions first.
So writing those abstractions is really the absolute crucial part.
And it's very, very difficult.
And I see people who know the kernel very well, who start getting a good idea
of how Rust works, really having trouble to get those abstractions right.
Because this translation layer is really the hard part.
Now, for drivers it's still very, very worth it, because of multiple things.
The first is what your development cycle usually looks like: I write a piece of
new code in my driver, I compile it, I deploy it to some test machine, I boot
the machine up, and I get a kernel panic with a random page fault, and then I
have to take the stack trace and try to figure out what happened.
That's what usually happens. Usually you don't write C code that just works in a new driver.
This now went to: once it compiles, it works in 99% of the cases,
to a point where it at least doesn't fault.
Maybe it doesn't work semantically as you intended it, but it doesn't give you
all the hard debug problems that you have when you do it in C in the first
place, because the compiler would complain first.
So this is really the goal of the abstractions: to get that away and to get the
memory safety here. I mean, this is kind of the memory safety part,
or an implication of the memory safety part.
Another aspect, which is, I think, the second one that's very important: we can
use the powerful Rust type system not only to ensure additional safety and
correctness, but, due to the limitations an API can impose on users,
we can also guide the user in the direction of using the API in the correct way.
Where in C, you're basically free to do whatever you want. You allocate a structure
and then you call random functions on it that you just find in a header file.
You can design things in Rust in a way that they are used in the right way,
that are used in a semantic way that makes sense. So you can encourage good
code just by designing your abstraction in that way.
And this is another big advantage that I really see from Rust.
There are a lot of other advantages, I think, as well, which makes everyday
work a little bit easier.
We have rustfmt, so you don't have to think about how to format
code anymore, or run a tool that tells you how to do it.
It depends on your IDE, but having rustfmt is really nice,
since a lot of kernel developers are just using Vim and nothing else,
and not a fancy IDE that already formats code the way you want it.
But it's just a minor thing. Another, less minor thing, I would say,
is the possibility of writing doc tests.
So we have a unit test framework in the kernel that's called kUnit, which is pretty nice.
Yet, a lot of code that goes into the kernel does not come with unit tests.
Actually, I would say it's not common to send new kernel code with unit tests already.
But in Rust we have those doc tests, which are really just small snippets of code, right?
We compile them into KUnit tests, and then you can enable them with a kernel config option.
When you boot the kernel, they run on boot, and you get immediate feedback
if something broke your API. Previously that was only possible if you
really write a real KUnit test, insert the KUnit test module, and run it,
so it is much more overhead than running those doc tests,
which give you an immediate sanity check of whether your API still does the right thing.
So that's another advantage as well.
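For readers who haven't seen doc tests: they are just assertions inside a doc comment. A sketch (my_driver is a placeholder crate name); in user space, cargo test runs the snippet, and in the kernel such snippets are turned into KUnit tests that run at boot when the corresponding config option is enabled:

```rust
/// Clamps a timeout to the range the hardware supports.
///
/// # Examples
///
/// ```
/// assert_eq!(my_driver::clamp_timeout_ms(0), 1);
/// assert_eq!(my_driver::clamp_timeout_ms(50), 50);
/// assert_eq!(my_driver::clamp_timeout_ms(10_000), 1_000);
/// ```
pub fn clamp_timeout_ms(ms: u32) -> u32 {
    ms.clamp(1, 1_000)
}
```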
Now, the Rust standard library is split up into multiple sub-crates,
obviously, and one of them is the no-std part.
Can you use the no-std module?
Can you make use of the Rust standard library, and to what extent? Or do you really
have to use all of the kernel abstractions and bring everything,
like data structures, yourself?
So we're using what Rust exposes as the core crate.
This is what we use. We don't use any other crates.
We had for a while, we had the alloc crate as well in the kernel.
But we removed that because it just didn't fulfill the requirements that we
needed for the kernel allocators.
So one and a half years ago, I wrote the kernel Allocator trait and the allocator abstractions,
including all the stuff that you need as basic allocation primitives, like
you have in Rust's alloc crate,
which is Box and Vec and those kinds of things.
So in the kernel, we have KBox and KVec, VBox, KVBox, which are basically just
the abstractions for the corresponding kernel allocators like kmalloc, vmalloc, kvmalloc.
And that was necessary because kernel allocators have more arguments than the
allocator API of Rust's alloc crate allows us to use.
We have specific allocation flags, we have to consider NUMA nodes, and the upstream
trait just didn't fit, so we had to do our own thing.
Back when we used the alloc crate, we only supported the kmalloc allocator,
so just one allocator, and had a flags extension for it, no NUMA node support.
But long term we just need to support the other ones as well, so that's why I did this work,
also in preparation for drivers, because drivers absolutely need that.
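Paraphrased from memory of the in-kernel allocator APIs (exact names and signatures may differ between kernel versions), driver code ends up looking roughly like this, with explicit allocation flags and allocation failure as an ordinary error:

```rust
use kernel::prelude::*;

fn allocate_example() -> Result {
    // Flags are explicit: GFP_KERNEL may sleep, GFP_ATOMIC may not.
    let boxed = KBox::new(42u32, GFP_KERNEL)?;
    pr_info!("boxed value: {}\n", *boxed);

    // Growing a vector also takes flags, and failure is a normal Err,
    // never a panic.
    let mut ids = KVec::new();
    ids.push(1u32, GFP_KERNEL)?;
    ids.push(2u32, GFP_KERNEL)?;

    Ok(())
}
```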
It's not really a question, but an observation. It feels like you learned Rust
in hardcore mode by implementing all of these basic data structures and the allocator.
Yeah, that's actually what happened. So the way I learned Rust is probably a
way I wouldn't recommend to other people.
So what I did is I knew the kernel very well because I'm a kernel engineer for
a long time. So I knew a lot of parts.
In the past, I've worked at all different kinds of subsystems in the kernel.
So I'm more a generalist here than I was an expert for a very specific subsystem.
In fact, I'm only working for about three years in DRM and on GPU drivers.
All the time before, I was working on various other subsystems in the kernel.
But that was to my advantage in this case, because from that I obviously knew a
lot about various different areas in the kernel.
So what I was doing is: I took advantage of that knowledge, looked at existing
Rust code in the kernel, and, knowing what it is supposed to do semantically,
I basically reverse engineered for myself what the Rust code must be doing there.
So this is how I learned Rust and how I approached it.
A pretty painful way, but it worked out well for me in the end.
Yeah, true. Now, about the community perception of some of the conflicts on
the mailing lists, with people maybe getting a little angry and maybe
some people even leaving the project at some point: what's your perspective on
this? What's your take? What's the inside perception versus the outside perception here?
I think that hits a very, very good point that is also very important to me.
So I saw how those news about some controversies or some discussions that were
a little bit controversial on the mailing list went through the news and sounded
like if it would have been the biggest drama on earth.
But in the end, we have thousands of contributors in the kernel.
So you have thousands of different opinions, I think, for such fundamental addition
to the kernel, like a new language.
It's absolutely normal that people have different opinions and some people have
stronger opinions than other people.
Some people express their opinions stronger than other people, right?
So I haven't seen anything that is dramatic or unexpected, to a certain extent,
and I feel like a bigger deal was made out of it than it actually is.
Another aspect of this is everyone talked about it and lots of news sites picked it up.
But in the end, it was one interaction from thousands of other interactions
that went in the absolute other direction.
So just to name a few, and I know the terms may not make a lot of sense to a
lot of people who are not super familiar with the kernel,
but I just want to list a few of them just to give an impression of where things go very well, actually.
And we have new contributors that have not been contributing to the kernel ever
before who stepped up doing kernel work because of Rust and because they were interested in Rust.
We have people that were a long time around in the kernel and are now doing Rust work as well.
And between those people who are doing this Rust work,
so those Rust contributors and C-maintainers,
we have so many interactions where great relationships have been established,
and where ideas or implementations of Rust abstractions actually brought back
improvements to the C side as well,
which are not going through the news pages, right?
And just to name a few of them: this is driver core, this is memory management,
PCI, Open Firmware, ACPI, DRM, networking, timekeeping,
misc device, I2C, the clock framework, PWM, regulators,
primitives like maple tree, xarray, workqueues, the firmware API, and a lot more, to be
honest, where people have established relationships with the maintainers and
are working together, and we have great interactions and get great results.
And I think it's only fair to mention those as well.
Wow, I had no idea. And you know, as someone who does not know anything about
the Linux kernel but a few things about Rust, to me it feels enabling as well.
Just as someone who thinks I might potentially, at some point, want to
dabble with this, I don't have any plans, but it would
be more likely now than before, because there's the Rust4Linux project, you
have a website, and I checked it out, it's very approachable. And I'm guessing
that the people behind it are as well, and so I feel more welcome than I was before.
That's kind of a great achievement as well, besides the code.
Absolutely agree.
Now, let's say you look forward and you say a year or two from now,
we have another chat and you're enthusiastic about what was achieved.
What would you be proud of?
What's next for the Nova driver and for your work?
So as for the Nova project, one thing that I want to say here is that my role
so far has been to do all the enablement work.
I implemented the driver core infrastructure in Rust, the PCI bus and
generic I/O stuff, and a lot of other things that I now also maintain.
And I will keep doing this work, because I think it's important. It's important
for Nova, but it's also important for Rust4Linux, and I also think it's important
for the kernel overall. So a lot of my time goes into that.
When it comes to Nova, one could ask who's doing Nova now, and I have a very,
very good answer to that, which I'm very happy about: it turned out that a lot
of NVIDIA folks actually stepped up writing code for Nova.
Actually, the majority of the code that goes into NovaCore is written by NVIDIA people.
And we just got a co-maintainer for NovaCore, which is Alexandre Courbot,
who is helping me a lot in maintaining the project and moving the project forward.
And so I'm very happy about the NVIDIA folks stepping up.
So what I'm hopefully proud of a year from now is that
Nova has actually developed in the direction of solving the problems that we intended
to solve when we decided that doing a whole new driver from scratch, and doing it
in Rust, is the correct way to go.
Which specifically means it solves the design problems that we had in Nouveau.
It solves the problem of abstracting the firmware.
It solves the problem of building a modular driver stack that supports virtualization
and a solid compute and graphics driver on the host, as well as in VMs, of course.
I'm looking forward to people who run an NVIDIA GPU saying: hey, I'm
running this out of the box, it comes with my kernel installation
from my distribution, and it's not something that I
have to install afterwards from an out-of-tree source. And I mean, we have
that with Nouveau, but honestly, oftentimes it's a bit sad, because I see all
the effort that went into Nouveau, and there were brilliant minds working on Nouveau.
So I really want to give my appreciation for the work here.
But there is also the reality that oftentimes Nouveau is used to install the proprietary driver.
And that's also a bit sad. So I'm looking forward to changing that.
At least to me and potentially to a lot of listeners.
The day-to-day work of an actual Linux kernel developer is completely unknown,
and it sounds almost a bit daunting to work on that.
What's it really like? What's your day-to-day? What does working on the Linux
kernel really look like?
Yeah, so there are many different aspects to that.
One of them is obviously the maintainer work, which means that I have to review
a lot of patches, have to give feedback, but also guide people in the right direction,
design-wise for the subsystem or for the component, and ideally serve as a multiplier here.
Make sure to get people interested in doing that work in the first place,
get more contributors involved, and also scale myself,
which means find people to work with who may want to take responsibility themselves,
which is, I think, a part that oftentimes is forgotten by maintainers.
And how much of that time do you spend in email, in your IDE,
and during code reviews?
So writing emails and doing code reviews, kind of the same thing.
We're doing code review by mail. Patches are sent through email.
I would say so roughly about half of the time goes definitely into writing mails
and having discussions about code, reviewing patches.
A significant amount of time also goes into dealing with patches.
It's not only about reviewing them as a maintainer, but you also have to handle
them, which means you have to apply them to your tree.
You have to sanity check if everything about the patch is correct in terms of process and form.
And ideally, you also do a build check, at least.
And then you take it in your tree, then it goes into the next tree,
which is a kernel tree that is basically capturing the trees of all kernel maintainers
who want to be part of Linux Next.
And it regularly, I think even nightly, merges all the maintainer trees into
one single branch and reports conflicts.
Most of the time, the conflicts are already solved by the Linux Next maintainer himself.
Sometimes that's not possible; then he just drops the branch for the night and
comes back to the maintainers asking: hey, what should I do here?
What's the right solution?
So part of the job as a maintainer is also to deal with that.
And of course, you have to send the pull requests either to Linus himself or
to the next maintainer in the hierarchy at the end of the cycle.
So a development cycle is roughly about three months.
Then you send the changes that you accumulated over the time to Linus or the
next maintainer in the hierarchy.
And then also per release candidate cycle, not release cycle,
you send the fixes for the last release, or after the last merge window for the
next release candidates, to Linus or the maintainers as well.
And this goes on for the three months and then you have the next release and
then this cycle starts from the beginning.
Otherwise, yeah, I spend a lot of time in my editor, of course, writing actual code,
But there's also a lot of time that just goes into coordinating,
coordinating with other developers, coordinating with companies we're partnered
with or coordinating with people from the community.
And who's next in line for you? Do you send a patch to Greg Kroah-Hartman
or someone else in the hierarchy?
So that depends. I'm maintaining a couple of trees and a couple of different
components or subsystems and drivers.
So I'm sending patches to Dave Airlie, who's the maintainer of DRM.
Previously I did that for the Nova tree, and now I'm doing it with the DRM Rust tree.
We have our own DRM tree for Rust components, which is something we do differently
than other subsystems.
Then I submit patches to, or actually submit pull requests to Miguel,
who's maintaining the Rust tree.
This is for the allocator stuff that I mentioned previously. And I'm also sending
pull requests to Linus himself for the driver core tree that I co-maintain.
Nice. And what does the tooling look like then? You mentioned that you need
to use an email tool and at least an IDE or editor, so what's the weapon of choice?
Right, yeah. So this is where my past of being a C kernel engineer from the
get-go kind of shines through:
I'm really just using Vim, not even Neovim, just Vim, and mutt.
That's basically about it. I recently switched to aerc for email, which is
another console client, but that's really all I use.
No fancy IDE, no language server, nothing.
And what about the Rust tooling?
Do you use any tooling other than cargo format?
Is there any kernel-specific tooling as well, that maybe you've built or someone
else built, which does various random tasks, administrative tasks?
No, I mean, the formatting and all the Rust-specific things are kind of built
into the kbuild system, so you're not really running cargo rustfmt.
I don't even know if that's an actual command of cargo, but it's integrated
into the kernel build system.
In DRM, we have some maintainer tools which are called dim.
It's a helper to apply patches, do back merges from other trees, send pull requests,
so it basically makes a couple of maintainer and committer tasks a bit easier.
But other than that, no, just that.
I find it fascinating how much you can get done with these tools,
just sticking to them and putting in the hours and putting in the work.
That's fascinating. Right, Danilo, if someone is now interested and intrigued
to learn more, and perhaps even wants to contribute to the Linux kernel at some
point, especially the Rust part, where would they go? How could they learn more?
So I think the best entry points here are the Zulip we have,
where the Rust4Linux community is usually around, and also the Rust4Linux
team is around and can give hints on where to start, if people have a specific
topic of interest they want to work on in the kernel or may be interested in.
Otherwise, if you just want to get started doing your first steps,
the Rust4Linux project in the GitHub repository maintains an issue list with good first issues.
So we try to keep the list big enough so that everyone has something interesting
to find and pick out of it. We keep things around where we know,
okay, this would be a nice rework,
it's not super pressing to get it done, but it would be something we would
like to see happening, and it's a good thing for someone new to start contributing to.
Those are the things we keep around there and I can really recommend to have a look at that.
That was amazing. Everyone who's interested, please check out these resources.
We will also link to them in the show notes.
Finally, what's your message to the Rust community?
Oh, I don't know if I have the message to the Rust community.
I probably want to say thank you, because learning Rust and writing Rust code
made me a better engineer, also for the C parts in the kernel that I maintain.
It made me think differently about certain problems and just gave me a different
perspective on approaching problems, which is very helpful.
So my message is probably: keep up the great work.
Otherwise, when it comes to the things that the kernel needs from the Rust project,
and there are definitely things that are needed,
I also want to say thank you for the great collaboration.
I'm not that involved myself.
I definitely want to make that clear. We have other members from the Rust4Linux
team who are regularly talking to people from the Rust core team and from
the Rust project in general.
And there is great collaboration already ongoing.
So, yeah, I'm very happy to see how things go.
Amazing. Danilo, thanks so much for taking the time for the interview today.
Thanks.
Rust in Production is a podcast by corrode. It is hosted by me,
Matthias Endler, and produced by Simon Brüggen.
For show notes, transcripts, and to learn more about how we can help your company
make the most of Rust, visit corrode.dev.
Thanks for listening to Rust in Production.