Rust in Production

Matthias Endler

Rust4Linux with Danilo Krummrich

About bringing Rust to the Linux kernel

2025-12-11 61 min

Description & Show Notes

Bringing Rust into the Linux kernel is one of the most ambitious modernization efforts in open source history. The Linux kernel, with its decades of C code and deeply ingrained development practices, is now opening its doors to a memory-safe language. It's the first time in over 30 years that a new programming language has been officially adopted for kernel development. But the journey is far from straightforward.

In this episode, we speak with Danilo Krummrich, Linux kernel maintainer and Rust for Linux core team member, about the groundbreaking work of integrating Rust into the Linux kernel. Among other things, we talk about the Nova GPU driver, a Rust-based successor to Nouveau for NVIDIA graphics cards, and discuss the technical challenges and cultural shifts required for large-scale Rust adoption in the kernel as well as the future of the Rust4Linux project.

About Rust for Linux

Rust for Linux is a project aimed at bringing the Rust programming language into the Linux kernel. Started to improve memory safety and reduce vulnerabilities in kernel code, the project has been gradually building the infrastructure, abstractions, and tooling necessary for Rust to coexist with the kernel's existing C codebase.

About Danilo Krummrich

Danilo Krummrich is a software engineer at Red Hat and a core contributor to the Rust for Linux project. His fundamental contribution to Rust for Linux is the driver-core infrastructure, the foundational framework that makes it possible to write drivers in Rust at all. This includes both C and Rust code that provides the core abstractions for device drivers in the kernel. Danilo is a maintainer for multiple critical kernel subsystems, including Driver Core, DRM (GPUVM, Rust, GPU Scheduler), GPU drivers for NVIDIA GPUs (Nova, Nouveau), Firmware Loader API, as well as Rust bindings for PCI, DMA, and ALLOC. He is the primary developer of the Nova GPU driver, a fully Rust-based driver for modern NVIDIA GPUs.

Transcript

Welcome to another episode of Rust in Production, a podcast about companies who use Rust to shape the future of infrastructure. My name is Matthias Endler from corrode, and today I'm joined by Danilo Krummrich from the Linux Kernel Project to talk about bringing Rust into the kernel. Danilo, thanks so much for taking the time for the interview today. Can you quickly say a few words about yourself and about Red Hat, the company you work for?
Danilo
00:00:28
Of course, thanks for the invitation. Yeah, so my name is Danilo and I'm a Linux kernel engineer working at Red Hat in the accelerators and GPU group. I'm also a maintainer of various subsystems, components, and drivers in the Linux kernel, and a member of the Rust4Linux team.
Matthias
00:00:48
It's almost intimidating to think that you work on such an important project and also on such a low level. I guess a lot of people that are listening in either use Linux or get in contact with Linux on a daily basis because, you know, we use a ton of servers nowadays and a lot of these run on Linux. What does it feel like to work on that level and how did you even get started with kernel development?
Danilo
00:01:15
I don't think it necessarily feels different than working on any other software project, I would guess. I cannot really tell. I worked on a few other projects in the past. Ten years ago, I worked a little bit on AOSP, the Android Open Source Project. I worked there at the kernel level as well, but also on some user space portions. I was always interested in the kernel, ever since I finished my exams at university. When I joined the first company I worked at as a student, I just told people: hey, I'm interested in the kernel, I want to work on that, just get me to the base layer team. And then I just started working there, and I started learning a lot. The learning curve was very, very steep.
Matthias
00:02:05
When was the first time you personally heard of the Rust4Linux project? When was the first time you heard about discussions on adding Rust to the Linux kernel?
Danilo
00:02:17
I would say almost from the beginning, because I follow what's going on in the project and what's discussed on the mailing lists. But this was more reading along. It was not that I immediately thought, okay, this is definitely something I need to get involved in. But I was aware.
Matthias
00:02:37
Did you know Rust back then, or was this all new for you? What did it feel like? Can you remember the time, the feeling, when you thought about the adoption of Rust? Was that also when you looked at the language for the first time? Can you talk about your first impressions of the language, coming from a mostly C background, I'm guessing?
Danilo
00:03:01
It was a bit intimidating, I would say. The C language is very, very simple. And in the kernel, we are doing very, very complicated things. With a very, very simple language. With Rust, it's different. Rust is a very difficult language, I would say. But at the same time, it makes some of the problems we have to solve much easier to solve. So I guess that's a good trade-off.
Matthias
00:03:27
Did you ever second-guess the adoption of Rust in the Linux kernel, or was it always clear to you that there were very obvious benefits?
Danilo
00:03:37
So after I got used to the concepts of Rust and understood what they are about and how I can utilize them, I don't have any doubt. Or let me say it the other way around: I'm very, very convinced that Rust definitely has a big future in the kernel.
Matthias
00:03:59
How long did it take to build out the entire infrastructure that is needed for such a project, inside the kernel and outside the kernel, I mean from the Rust side?
Danilo
00:04:09
So a lot of the initial work happened before I joined the project, so I cannot say a lot about that. But I think, and if I'm forgetting someone, I'm very sorry about it, I think it was mainly Miguel and Wedson, so Miguel Ojeda and Wedson Almeida, who started the Rust4Linux project and, in a downstream branch, worked out a lot of infrastructure and the initial support in the Linux kernel, build system integration, and so on. And I think the initial lift there was to convince people that it's worth adding it to the Linux kernel. There were approaches to adding a second language to the kernel in the past. I think C++ was tried a couple of times, and it failed badly. So adding a new language to the Linux kernel is definitely not an easy thing to attempt. And they were successful. So I think this milestone was really one of the big ones.
Matthias
00:05:16
Why has Rust succeeded, quote unquote, in being adopted in the Linux kernel where C++ did not?
Danilo
00:05:23
So I think that a subset of C++ would have been a great addition to the kernel. Just to name two examples, the first one being inheritance. The C kernel code relies a lot on the inheritance pattern, so getting some language support for that would definitely have been a great improvement. And the second one that I think is very, very useful is as simple as field visibility in structures. We have the case in the kernel that generic components obviously have fields that are, or should be, accessed by design by the users, but there are also private fields. What we often end up with is that drivers, instead of representing their needs through contributions to the common component in the kernel, abuse private fields of the structure and basically peek into component internals, which obviously is not considered by the component maintainers and hence causes problems in maintainability and can actually lead to bugs in the kernel. Those are problems that are solved by Rust as well. And I think overall, Rust is just the much better fit for the kernel, because we have far fewer things that are not a good fit for the kernel. Plus, it has all the additional features that help with memory safety that C++ does not have. However, it also brings some disadvantages with it, one being the common practice in Rust to just unwrap results or unwrap options, so basically operations that panic the program. In user space, that's a sane option in a lot of cases. But in the kernel, it almost never is a sane option, because a panic in the kernel is really just the last resort when the system is in a state where it's basically non-recoverable and you would otherwise corrupt memory or, in the worst case, even corrupt file systems. So this is something that we have to take care of when using Rust in the kernel: making it hard for people to accidentally trigger panics in the kernel.
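To make the panic point concrete, here is a minimal sketch in the style of kernel Rust. The `State` type and `setup` function are made up for illustration; `Result`, `KBox`, and `GFP_KERNEL` are assumed to come from the kernel crate's prelude. Fallible operations return `Result` and are propagated with `?` rather than unwrapped.

```rust
use kernel::prelude::*; // assumed to provide Result, KBox, GFP_KERNEL

struct State {
    value: u32,
}

fn setup() -> Result<KBox<State>> {
    // Allocation can fail in the kernel. Propagate the error with `?`
    // instead of calling `.unwrap()`: an unwrap would panic, and a panic
    // in kernel context can take down the whole machine.
    let state = KBox::new(State { value: 0 }, GFP_KERNEL)?;
    Ok(state)
}
```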
Matthias
00:07:52
So what was your first task when starting with Rust4Linux? Was it something you voluntarily picked up, or was there a need at work where you had to touch the Rust part?
Danilo
00:08:09
I think it really started with me taking maintainership of the Nouveau driver. And it's also a result of the way forward with Nouveau and the introduction of Nova. So with the Nouveau driver, we have a lot of issues. Back then, when I started at Red Hat, I started working on a component named GPUVM for the DRM subsystem. It basically manages the GPU's virtual address space and does some more things in the context of providing helpers for drivers to implement Vulkan-compatible user space APIs, so APIs that are given out to user space drivers, for instance Mesa. The work on GPUVM I did in the context of adding a new UAPI to Nouveau which is Vulkan-compatible. And in order to get that going in Nouveau, we needed to make quite some changes to how the page table management of the GPU itself works. But it was never possible to do it to the full and correct extent. We still have problems there. And that's because the design of how Nouveau is written just doesn't really work out well for the kind of requirements that we get from a Vulkan-compatible UAPI. Those design reworks would be a lot of effort, especially considering that Nouveau supports a huge range of GPUs and GPU architectures and generations. You can basically split them up into the generations before NVIDIA introduced GSP and after NVIDIA introduced GSP. GSP means GPU System Processor, and it's a firmware processor that lives within the GPU. Instead of programming the hardware directly, if you have a GSP GPU, you can just talk to the firmware through a ring buffer and a firmware-specific protocol and instruct the firmware to do the things that you would usually program through registers. And by reworking all those layers to get a Vulkan-compatible UAPI, we would need to consider the whole range of GPUs, the ones pre-GSP and the ones since GSP's introduction. So that would be a lot of effort just to rework something we also have other problems with.
And the other problems we have in Nouveau are how well it's documented and how it's designed. Nouveau consists of lots and lots of vtables with callbacks that are not really documented in terms of ownership and lifetimes of the objects that are allocated, returned, and handled, which is one of the huge problems. And one problem that is really important to me personally to address is that Nouveau is just not really accessible to users. We don't have a lot of contributors, and I think the reason for that is that just no one really understands how it works or gets comfortable working with that driver. For an open source project, that's not really what you want. So where we ended up was deciding, at a certain point, that we should do a new driver, a driver that is GSP-only. And since we saw that Rust4Linux was progressing, we thought: okay, if we do a new driver now, we have the choice. Do we do another C driver, or do we do a Rust driver, where the language solves a lot of the problems that DRM drivers have? DRM drivers, GPU drivers, are usually very, very complicated; they suffer from race conditions and memory issues, just because they're complicated and it's hard to get it right. And Rust helps a lot with that.
So we decided to go for a Rust driver. Or actually, as a first step, to evaluate whether we want to do a Rust driver. And this is where I got in touch with Rust in the kernel for the first time, because I started doing this evaluation: is a Rust driver for the successor of Nouveau doable, is it reasonable, and what are the things that we need to consider and to do to get it going?
Matthias
00:13:21
And so at this point in time, you're thinking: is Rust a better choice for writing a completely new driver for NVIDIA graphics cards for the Linux kernel, and moving away from the legacy codebase, which was Nouveau? I'm not even sure when Nouveau started, but it's been around for a while, right?
Danilo
00:13:43
Yeah, I think it's been around for 15, 20 years.
Matthias
00:13:47
Yeah, okay. So that must mean the codebase is relatively evolved, and it's probably complicated to maintain. I'm not sure about the status, but I'm just guessing that with 15 or 20 years of development time, it might not be the easiest codebase to work with, especially for beginners.
Danilo
00:14:10
It absolutely grew over a long period of time. And yeah, the lack of documentation, and the fact that Nouveau was maintained for a long time by a single person, with all the knowledge about how things are meant to work within the driver only really existing in the head of this person, doesn't really help either.
Matthias
00:14:33
Okay, but when you describe it like that, it sounds really daunting to start a new driver because you kind of need to get that business logic, port it over first to a new language, second from an older code base, which maybe you are not 100% familiar with. I'm not sure. Maybe you were. Yeah. There might be a lot of internals or assumptions or implications in the code, which will make that port harder as well. And on top of it, you also have to understand the structure of the Linux kernel itself and how subsystems and drivers interact and so on. So it sounds like a very challenging task.
Danilo
00:15:16
Oh yeah, it is challenging, no question. But it's not like we're porting over the logic. The good thing is that we're starting GSP-only, which means that, not exclusively but to a large extent, we really have to understand how the firmware interface works. And this is absolutely doable and doesn't really require understanding every single bit of what the Nouveau code does.
Matthias
00:15:46
And so you can start to port over that part alone in isolation.
Danilo
00:15:51
Yeah, so what we went for is we basically split the Nova driver into two separate drivers. One of them, which we call NovaCore, provides the hardware and firmware abstraction layer. That layer is needed not only by Nova DRM, which is the second Nova driver and implements all the DRM parts. DRM is the Direct Rendering Manager, the subsystem in the kernel that handles GPUs and accelerators. So Nova DRM is the second driver; it sits on top of NovaCore and utilizes the hardware through this abstraction layer. This split is necessary for two reasons. The first is that there is also the NVIDIA vGPU driver, which is currently making its way upstream. It's a virtualization layer, or actually only really the manager of the virtualization layer, because it manages the PCI virtual functions. The GPU has one physical function that is exposed to the system, to the host kernel, which is where a normal driver would sit on top. But additionally, there are protocols in the PCI layer that allow the GPU to also expose virtual functions, which look as if they were additional GPUs, but actually they are virtualized within the graphics card. vGPU manages those through NovaCore, by utilizing NovaCore to expose those virtual functions to virtual machines. Then virtual machines can run NovaCore again, and Nova DRM on top, to actually expose the full graphics stack. So that's one reason why we have this driver split: we basically have multiple clients for the firmware and hardware abstraction layer that lives within NovaCore.
The second reason is that the firmware API that is exposed by the GPU is not stable, or at least we cannot rely on this API being stable. And that's one of the things where Rust helps a lot as well, because abstracting the firmware API is much easier with the powerful Rust type system and things like proc macros, compared to C, where we would in the end need to build endless vtables to differentiate between the different firmware versions. And then you build more vtables to gather the common code that's maybe similar between certain versions, which becomes a huge mess and is one of the reasons why Nouveau is not sustainable.
Matthias
00:18:46
And in Rust, that would be a proc macro, which could be expanded, and might end up being the same vtable, but you don't have to write it yourself.
Danilo
00:18:55
Yes. Yes.
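To make the vtable comparison concrete, here is a loose userspace illustration, not Nova's actual code (all type and function names here are hypothetical): each firmware revision implements a common trait, shared driver logic is written once against that trait, and a proc macro can then generate the per-version boilerplate that a C driver would maintain as hand-written vtables.

```rust
// Hypothetical sketch: abstracting an unstable firmware API behind a trait.
trait GspInterface {
    fn init(&self) -> Result<(), i32>;
    fn send_command(&self, cmd: u32) -> Result<(), i32>;
}

// One zero-sized type per supported firmware release (names made up).
struct GspR535;
struct GspR570;

impl GspInterface for GspR535 {
    fn init(&self) -> Result<(), i32> { Ok(()) } // version-specific setup
    fn send_command(&self, _cmd: u32) -> Result<(), i32> { Ok(()) }
}

impl GspInterface for GspR570 {
    fn init(&self) -> Result<(), i32> { Ok(()) }
    fn send_command(&self, _cmd: u32) -> Result<(), i32> { Ok(()) }
}

// Common driver code is written once against the trait and dispatched at
// compile time; no hand-maintained per-version vtables.
fn bring_up<F: GspInterface>(fw: &F) -> Result<(), i32> {
    fw.init()?;
    fw.send_command(0x01)
}

fn main() {
    bring_up(&GspR535).unwrap();
    bring_up(&GspR570).unwrap();
}
```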
Matthias
00:18:57
Now, a bit of a silly question, but bear with me for a second. How do you even start with developing such a driver in the kernel? You can't just possibly say cargo new. How do you initiate a new driver?
Danilo
00:19:12
From the technical side of things, it's very easy, because the major work of integrating the Rust compiler into the Linux kernel had already been done before I joined the Rust4Linux project. And this is also what the Rust tree in the kernel is actually about: it's about compiler and build system support. It carries a lot of other things and a lot of other base infrastructure that we have in the kernel as well, but its main focus is compiler and build system. So you basically create a new driver the exact same way as you would create a new driver in C in the kernel. You create a Kconfig file with a new config option for the kernel to include this new component, which is, I don't know, 10, 15, 20 lines, most of that being just the module description. Then you have a Makefile where you add the core .rs file, which then pulls in all other modules, to the global variable that gathers all the source files for the kbuild build system. And that's it, and then it builds. So this part is actually pretty easy. We're not using Cargo; we're integrating the Rust compiler into kbuild.
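As a rough sketch of the two files he describes (all names here are hypothetical, following the kernel's usual Kconfig and Makefile conventions):

```
# Kconfig — hypothetical entry for a new Rust driver
config MY_RUST_DRIVER
	tristate "My Rust driver"
	depends on RUST
	help
	  Example config option pulling a Rust driver into the kernel build.

# Makefile — hand the top-level .rs file to kbuild
obj-$(CONFIG_MY_RUST_DRIVER) += my_rust_driver.o
```

kbuild then compiles my_rust_driver.rs through the kernel's integrated rustc support; Cargo is not involved.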
Matthias
00:20:33
Okay, but then you have a driver which does essentially nothing. You still have to communicate with the outside world. How does that look like?
Danilo
00:20:40
Exactly. So then you basically have a driver stub. Actually, you don't even have a driver stub. The only thing you really have is a so-called kernel module, which has an entry point and an exit point: the init function and the exit function. That's it. In the Rust abstraction, it's basically an init function, and the exit function is basically just the drop of the module structure. But that's all you've got. Now you need some driver infrastructure. And that's where the first challenge was for the Nova project, because a lot of work had been done before I joined the Rust4Linux project, like the initial integration of the Rust compiler into the build system, module support, and lots of generic infrastructure that you need: abstractions for handling C reference-counted structures on the Rust side, for instance, and some locking stuff. So really just the absolute fundamental basics. But there was no driver infrastructure, so you weren't able to write drivers. The first thing that we actually needed to come up with was core driver infrastructure, which is the representation of a device, the representation of a driver, the representation of a bus, and all the glue code to connect things together.
But I think before I talk more about that, we need to go one step back and talk about what we call abstractions in Rust4Linux. So the approach we take, and what Rust focuses on in the kernel, is really drivers. This is where we can get the most value out of the language in the short term, I think, also because of the architecture of Linux. Linux is a monolithic kernel, and therefore all drivers also run in kernel space, and if a driver messes things up, it breaks the whole machine. So if we are able to provide support for writing drivers in Rust, a huge part of the safety and reliability of the kernel is achieved, also if we consider that subsystem code, so really kernel core code, is reviewed and tested much more thoroughly than driver code is. So the most value in the shortest time really comes from drivers.
And the mechanism that people have worked out to interface with C infrastructure is to write so-called abstractions. We're not directly calling C functions from Rust, because that would be fundamentally unsafe: we would need to deal with raw pointers all the time, and then we wouldn't really get a lot of the benefits of Rust. What we're doing instead is building abstractions, which means that for a C structure and its corresponding functions, in order to achieve in Rust the functionality that's intended by the structure, we write a Rust structure that in some way embeds the C structure, or embeds a pointer to the C structure; it really depends on the actual case. And then we write some Rust code around that API, so you can only use that API in a way that is safe. For instance, if you obtain a representation of a device in the system, you would not get a pointer, but rather a reference to the device. And instead of dealing only with the reference to the device, you also have certain device states. In Rust, that's just typestates. In the kernel, a device can be in multiple states; we represent that by typestates, and then only allow the functions that are applicable for a certain state if the user calls them on the abstraction type with the corresponding typestate.
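For reference, the bare kernel module he describes looks roughly like this, modeled on the kernel's rust_minimal sample (field names such as `author` vs. `authors` can differ between kernel versions):

```rust
use kernel::prelude::*;

module! {
    type: MyModule,
    name: "my_module",
    author: "Example Author",
    description: "A bare Rust kernel module: just an entry and an exit point",
    license: "GPL",
}

struct MyModule;

impl kernel::Module for MyModule {
    // The entry point, run when the module is loaded.
    fn init(_module: &'static ThisModule) -> Result<Self> {
        pr_info!("module loaded\n");
        Ok(MyModule)
    }
}

// The exit point: the module struct is dropped on unload.
impl Drop for MyModule {
    fn drop(&mut self) {
        pr_info!("module unloaded\n");
    }
}
```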
Matthias
00:25:11
Okay, that sounds super cool, because typestates are a very nice way to model state machines, obviously. But I don't know if you had an equivalent on the C side before, or if that was a new abstraction that you added in the process.
Danilo
00:25:33
So on the C side, if we stay with the device example, that doesn't exist. On the C side, it's really just a raw struct device pointer, and it's on the user to use it in the correct way.
Matthias
00:25:51
Okay. Well, I made it sound like it was an obviously good idea, but maybe we can take a step back here and describe what the benefit of the typestate pattern is, and maybe even what it is in the first place.
Danilo
00:26:06
Yeah. So let me explain real quick how device, driver, and bus fit into the bigger picture. If you want to write a driver in the kernel, you have to implement the driver structure of the particular subsystem, let's say PCI. So you implement the PCI driver structure, which requires you to implement a couple of callbacks. Lots of them are optional; some of them are mandatory. The two most interesting callbacks are probe and remove. The probe callback of the driver is called once the bus, in this case the PCI bus, detects a device in the system that matches the driver. The bus basically detects there is a new device, and then it checks: what is the vendor ID of the PCI device? What is the device ID of the PCI device? It checks a couple more things, of course; this is a bit oversimplified. Then it looks for a registered driver in the system that matches those attributes. And if it finds one, it calls the probe function that the driver registered and passes the PCI device structure into the probe function. Then the driver can start operating the device. It can obtain device resources such as I/O memory or interrupt handlers and just do its business logic.
Matthias
00:27:38
It gets a mutable pointer to that device, or what would be the input?
Danilo
00:27:44
So in C, it's really just a pointer. In Rust, it's a reference to a PCI device structure with a certain typestate. For probe, so actually for all the callbacks that come from a bus, like probe and remove (remove is obviously when the driver is unbound from the device, or the other way around, when the device is unbound from the driver), in the bus callbacks it gets the Core typestate. So the device has the Core typestate, and then you have access to certain functions of the PCI device that you would otherwise not have. And that's important because in the bus callbacks a global bus lock is taken, and that allows you to modify certain fields within the device structure that are needed, for instance, to enable I/O memory in the first place, or bus mastering. So by having this typestate, we can ensure that those functions that are only allowed to be called from bus callbacks are not called from anywhere else. And the way they could be called from anywhere else is that device structures are fundamentally reference counted in the kernel, so everyone can hold on to a reference of that thing. So it is theoretically possible, but in Rust, it's not.
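A small userspace sketch of that idea (hypothetical states and methods, not the kernel's exact Device API): the state is a type parameter, and privileged methods are only implemented for the state in which they are legal, so misuse fails to compile.

```rust
use std::marker::PhantomData;

// Zero-sized marker types for the device's lifecycle states.
struct Core;   // inside a bus callback such as probe() or remove()
struct Normal; // an ordinary, long-lived reference

struct Device<State = Normal> {
    _state: PhantomData<State>,
}

impl Device<Core> {
    // Only exists in the Core typestate, i.e. while the bus lock is held.
    fn enable_device_mem(&self) { /* touch fields protected by the bus lock */ }
}

impl<S> Device<S> {
    // Available in every state.
    fn name(&self) -> &str {
        "0000:01:00.0"
    }
}

fn probe(dev: &Device<Core>) {
    dev.enable_device_mem(); // fine: probe() runs in the Core typestate
    let _ = dev.name();
}

fn main() {
    let dev = Device::<Core> { _state: PhantomData };
    probe(&dev);
    // A Device<Normal> does not even have enable_device_mem():
    // calling it fails at compile time, not at runtime.
}
```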
Matthias
00:29:13
Yeah. So a simplified version of that would be: you get a device, you are allowed to call a couple of methods on it which give you a new state of that device, a thing that you can hold on to, but then, crucially, you cannot call methods that would be invalid. You try to make invalid state unrepresentable this way. Yes, that's pretty cool. It feels like the typestate pattern is a very core part of the abstractions that you needed for kernel development in Rust. Are there any other components that you needed to build out which come to mind? Anything that other systems programmers could also learn from?
Danilo
00:29:55
So I think one example may be how we deal with reference-counted structures on the C side. In C, reference counting works by structures embedding a struct kref, which is in the end an atomic counter that gives you a release callback once it drops to zero, and then you have to implement the release callback in a certain way to clean up that structure. And this is a pattern that we have very often; things in the kernel are very often reference counted. And in the abstractions, we had to deal with that. So what people did was write a trait called AlwaysRefCounted which, implemented by the corresponding Rust representation of the reference-counted C struct, provides common helpers and a common interface for reference counting, which is the ARef type with a generic. So you basically have your Rust abstraction structure, let's say Device, and since the device is fundamentally reference counted, you end up with an ARef<Device>. This ARef knows that AlwaysRefCounted is implemented for the device, because it's a trait bound, and then we can make those parts common.
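From memory, the trait he describes looks roughly like this (simplified; the authoritative definition lives in rust/kernel/types.rs):

```rust
use core::ptr::NonNull;

/// Simplified sketch of the kernel's `AlwaysRefCounted` trait. Types whose
/// underlying C struct embeds a `struct kref` implement it, and `ARef<T>`
/// then acts as a smart pointer that bumps and drops the refcount.
unsafe trait AlwaysRefCounted {
    /// Increment the reference count (e.g. forwards to `get_device()`).
    fn inc_ref(&self);

    /// Decrement the reference count; the object may be freed once it
    /// reaches zero (e.g. forwards to `put_device()`), which is why this
    /// takes a raw pointer rather than a reference.
    unsafe fn dec_ref(obj: NonNull<Self>);
}

// Conceptual usage on the abstraction side:
//     let dev: ARef<Device> = device.into(); // refcount incremented
//     drop(dev);                             // refcount decremented
```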
Matthias
00:31:23
Well, typically in a pure Rust world, you would probably lean into ownership a little more; a lot of things just move between different components in Rust. But I'm assuming that would be very hard to do in the Linux kernel, because the rest of the code expects references and passes references around, or ref-counted structures, to be more precise.
Danilo
00:31:46
We try to avoid reference counting if it's not absolutely necessary. But if the C structure already does it, there's usually a very good reason for it. So this is something we just had to adapt to. I may have one other good example that is worth mentioning in this context. You mentioned that in Rust, you usually use move semantics and try not to reference count things or put things on the heap. And this is another implication of writing abstractions to existing C code: a lot of the C structures and a lot of C code is actually self-referential. Linked lists in the kernel are self-referential. Lots of locks are subsequently self-referential. This is why one of the Rust4Linux team members came up with a solution for that, which is called pin-init. It basically does what the name implies, which is in-place initialization and pinning. I don't know if you've heard about pin-init, but there is also a user space crate. His name is Benno, and he maintains pin-init as a user space crate that can be used there, but also in the kernel; I think it originated from the kernel. So moving things around is oftentimes not a possibility because of that, and pin-init is a great example of one of those common problems that had to be solved.
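A short sketch of the pattern, following pin-init's documented usage (exact import paths and macro names may vary between kernel versions; the `Config` type is made up):

```rust
use kernel::prelude::*;
use kernel::sync::Mutex;

// `#[pin_data]` marks the struct as containing pinned fields; a kernel
// Mutex is self-referential internally, so it must never move after init.
#[pin_data]
struct Config {
    #[pin]
    lock: Mutex<u32>,
}

impl Config {
    // Instead of returning a value, return an in-place initializer: the
    // caller picks where the memory lives, and the struct is built there.
    fn new() -> impl PinInit<Self> {
        pin_init!(Self {
            lock <- kernel::new_mutex!(0),
        })
    }
}

// Usage: allocate and initialize in place, pinned on the heap:
//     let cfg = KBox::pin_init(Config::new(), GFP_KERNEL)?;
```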
Matthias
00:33:26
From what we discussed so far, it feels like you built very solid abstractions, but mostly just to bridge the world from Rust to C, and also to bridge the safe and the unsafe world. But I did wonder: was it worth it? Are there any improvements in working with kernel drivers now that we have Rust support? Was all of the investment into Rust worth it? Is it more than just safety?
Danilo
00:33:59
Absolutely. So maybe to go one step back to the abstractions first. Writing those abstractions is really the absolute crucial part, and it's very, very difficult. I see people who know the kernel very well, and who are starting to get a good idea of how Rust works, really having trouble getting those abstractions right, because this translation layer is really the hard part. Now, for drivers, it's still very much worth it, for multiple reasons. The first is what your development cycle usually looks like: I write a piece of new code in my driver, I compile it, I deploy it to some test machine, I boot the machine up, and I get a kernel panic with a random page fault. And then I have to take the stack trace and try to figure out what happened. That's what usually happens; usually you don't write C code that just works in a new driver. This now went to: once it compiles, it works in 99% of the cases, at least to the point where it doesn't fault. Maybe it doesn't work semantically as you intended, but it doesn't give you all the hard debugging problems that you have when you do it in C, because the compiler complains first. So this is really the goal of the abstractions: to take that away and to get the memory safety, or rather this implication of the memory safety. Another aspect, I think the second one that's very important, is that we can use the powerful Rust type system not only to ensure additional safety and correctness, but, through the limitations an API can impose on users, we can also guide the user in the direction of using the API in the correct way. In C, you're basically free to do whatever you want: you allocate a structure and then you call random functions on it that you just find in a header file. In Rust, you can design things in a way that they are used in the right way, in a semantic way that makes sense. You can encourage good code just by designing your abstraction that way. And this is another big advantage that I really see from Rust. There are a lot of other advantages as well, which make everyday work a little bit easier. We have the rustfmt tool, so you don't have to think about how to format code anymore, or run a tool that tells you how to do it. It depends on your IDE, but having rustfmt is really nice, since a lot of kernel developers are just using Vim and nothing else, and not a fancy IDE that already formats code the way you want it. But that's just a minor thing. A less minor thing, I would say, is the possibility of writing doc tests. We have a unit test framework in the kernel called KUnit, which is pretty nice. Yet a lot of code that goes into the kernel does not come with unit tests. Actually, I would say it's not common to send new kernel code with unit tests already.
But in Rust, we have those doc tests, which are really just small snippets of code, and we compile them into KUnit tests. Then you can enable them with a kernel config option, and when you boot the kernel, they run on boot, and you get immediate feedback if something broke your API. Previously, that was only possible if you wrote a real KUnit test, inserted the KUnit test module, and ran it, which is much more overhead than running those doc tests, which give you an immediate sanity check of whether your API still does the right thing. So that's another advantage as well.
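For instance, a doc comment like the following carries a snippet that the build can compile into a KUnit test and run at boot (the function here is hypothetical; in-tree doctests live in the kernel crate's documentation):

```rust
/// Clamps `value` into the inclusive range `min..=max`.
///
/// # Examples
///
/// ```
/// assert_eq!(clamp_range(5, 0, 3), 3);
/// assert_eq!(clamp_range(2, 0, 3), 2);
/// ```
pub fn clamp_range(value: u32, min: u32, max: u32) -> u32 {
    value.max(min).min(max)
}
```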
Matthias
00:38:06
Now, the Rust standard library is split up into multiple crates, obviously, and one of them is the no-std part. Can you use that? Can you make use of the Rust standard library, and to what extent? Or do you really have to use the kernel abstractions and bring everything, like data structures, yourself?
Danilo
00:38:34
So we're using what Rust exposes as the core crate. This is what we use; we don't use any other crates. For a while, we had the alloc crate in the kernel as well, but we removed it, because it just didn't fulfill the requirements that we have for the kernel allocators. So one and a half years ago, I wrote the kernel Allocator trait and the allocator abstractions, including all the basic allocation primitives that you have in Rust's alloc crate, which is Box and Vec and those kinds of things. In the kernel, we have KBox and KVec, VBox, KVBox, which are basically just the abstractions for the corresponding kernel allocators, like kmalloc, vmalloc, kvmalloc. And that was necessary because kernel allocators have more arguments than the allocator API of the Rust alloc crate allows us to use: we have specific allocation flags, we have to consider NUMA nodes, and the upstream trait just didn't fit, so we had to do our own thing. Back when we used the alloc crate, we only supported the kmalloc allocator, so just one allocator, with a flags extension for it and no NUMA node support. But long term, we just needed to support the other ones as well. So that's why I did this work, also in preparation for drivers, because drivers absolutely need that.
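In practice, that means every allocation names its allocator and its flags explicitly. A minimal sketch, assuming the kernel's prelude exports `KBox`, `VBox`, `KVec`, and `GFP_KERNEL` (the function itself is made up):

```rust
use kernel::prelude::*; // assumed to provide KBox, VBox, KVec, GFP_KERNEL

fn allocate_things() -> Result {
    // kmalloc-backed box; allocation flags are explicit at every call site.
    let small = KBox::new(42u32, GFP_KERNEL)?;

    // vmalloc-backed box, for allocations that only need to be virtually
    // contiguous (typically larger buffers).
    let big = VBox::new(0u64, GFP_KERNEL)?;

    // kmalloc-backed vector; growing it can allocate, so push takes flags too.
    let mut numbers = KVec::new();
    numbers.push(*small, GFP_KERNEL)?;

    let _ = (big, numbers);
    Ok(())
}
```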
Matthias
00:40:36
Okay, it's not really a question, but an observation. It feels like you learned Rust in hardcore mode by implementing all of these basic data structures and the allocator.
Danilo
00:40:46
Yeah, that's actually what happened. The way I learned Rust is probably a way I wouldn't recommend to other people. I knew the kernel very well, because I've been a kernel engineer for a long time, so I knew a lot of parts. In the past, I've worked on all different kinds of subsystems in the kernel, so I'm more of a generalist here than an expert for one very specific subsystem. In fact, I've only been working on DRM and GPU drivers for about three years; all the time before, I was working on various other subsystems in the kernel. But that was to my advantage in this case, because from that I obviously knew a lot about various different areas in the kernel. So what I did was take advantage of that knowledge: I looked at existing Rust code in the kernel, and knowing what it is supposed to do semantically, I basically reverse engineered for myself what the Rust code must be doing. This is how I learned Rust and how I approached it. A pretty painful way, but it worked out well for me in the end.
Matthias
00:42:04
Yeah, true. Now, about the community perception of some of the conflicts on the mailing lists, people maybe getting a little angry, and maybe some people even leaving the project at some point: what's your perspective on this? What's your take? What's the inside perception versus the outside perception here?
Danilo
00:42:29
I think that hits a very, very good point that is also very important to me. So I saw how the news about some controversies, or some discussions that were a little bit controversial on the mailing list, went through the news and sounded as if it had been the biggest drama on earth. But in the end, we have thousands of contributors in the kernel, so you have thousands of different opinions. I think, for such a fundamental addition to the kernel, like a new language, it's absolutely normal that people have different opinions, and some people have stronger opinions than others. Some people express their opinions more strongly than others, right? So I haven't seen anything that is dramatic or unexpected, to a certain extent, and I feel like a bigger deal was made out of it than it actually is. Another aspect of this is that everyone talked about it, and lots of news sites picked it up. But in the end, it was one interaction out of thousands of other interactions that went in the absolute other direction. So just to name a few, and I know the terms may not make a lot of sense to people who are not super familiar with the kernel, but I just want to list a few of them to give an impression of where things go very well, actually. We have new contributors who have never contributed to the kernel before and who stepped up doing kernel work because of Rust and because they were interested in Rust. We have people who were around the kernel for a long time and are now doing Rust work as well. And between those Rust contributors and C maintainers, we have so many interactions where great relationships have been established, and where ideas or implementations of Rust abstractions actually brought back improvements to the C side as well, which do not go through the news pages, right? Just to name a few of them: driver core, memory management, PCI, OpenFirmware, ACPI, DRM, networking, timekeeping, misc device, I2C, the clock framework, PWM, regulators, primitives like maple tree, xarray, workqueues, the firmware API, and a lot more, to be honest, where people have established relationships with the maintainers and are working together, and we have great interactions and get great results. I think it's only fair to mention those as well.
Matthias
00:45:37
Wow, I had no idea. And you know, as someone who does not know anything about the Linux kernel but a few things about Rust, to me it feels enabling as well. Just as someone who thinks, I might potentially at some point want to dabble with this, I don't have any plans, but it would be more likely now than before, because there's the Rust4Linux project. You have a website, and I checked it out; it's very approachable, and I'm guessing that the people behind it are as well. So I feel more welcome than I was before. That's kind of a great achievement as well, besides the code.
Danilo
00:46:22
Absolutely agree.
Matthias
00:46:24
Now, let's say you look forward and you say a year or two from now, we have another chat and you're enthusiastic about what was achieved. What would you be proud of? What's next for the Nova driver and for your work?
Danilo
00:46:43
So as for the Nova project, one thing that I want to say here is that my role so far has been to do all the enablement work. I implemented the driver core infrastructure in Rust, the PCI bus pieces, and generic I/O stuff, and a lot of other things that I now also maintain. And I will keep doing this work, because I think it's important: it's important for Nova, but it's also important for Rust4Linux, and I also think it's important for the kernel overall. So a lot of my time goes into that. When it comes to Nova, one could ask who's doing Nova now, and I have a very, very good answer to that, which I'm very happy about: it turned out that a lot of NVIDIA folks actually stepped up to write code for Nova. Actually, the majority of the code that goes into NovaCore is written by NVIDIA people. And we just got a co-maintainer for NovaCore, Alexandre Courbot, who is helping me a lot in maintaining the project and moving it forward. So I'm very happy about the NVIDIA folks stepping up. What I'm hopefully proud of a year from now is that Nova has actually developed in the direction of solving the problems that we intended to solve when we decided that doing a whole new driver from scratch, and doing it in Rust, is the correct way to go. That specifically means it solves the design problems that we had in Nouveau, it solves the problem of abstracting the firmware, and it solves the problem of building a modular driver stack that supports virtualization and a solid compute and graphics driver on the host, as well as in VMs, of course. I'm looking forward to people who run an NVIDIA GPU saying: hey, I'm running this out of the box, it comes with my kernel installation from my distribution, and it's not something that I have to install afterwards from an out-of-tree source. I mean, we have that with Nouveau, but honestly, oftentimes it's a bit sad, because I see all the effort that went into Nouveau, and there were brilliant minds working on Nouveau, so I really want to give my appreciation for the work here. But there is also the reality that oftentimes Nouveau is just used to install the proprietary driver. And that's also a bit sad. So I'm looking forward to changing that.
Matthias
00:49:44
At least to me, and potentially to a lot of listeners, the day-to-day work of an actual Linux kernel developer is completely unknown, and it sounds almost a bit daunting to work on that. What's it really like? What's your day-to-day? What does working on the Linux kernel really look like?
Danilo
00:50:06
Yeah, so there are many different aspects to that. One of them is obviously the maintainer work, which means that I have to review a lot of patches, give feedback, but also guide people in the right direction, design-wise, for the subsystem or for the component, and ideally serve as a multiplier here: make sure to get people interested in doing that work in the first place, get more contributors involved, and also scale myself, which means finding people to work with who may want to take responsibility themselves, which is, I think, a part that is oftentimes forgotten by maintainers.
Matthias
00:50:59
And how much of that time do you spend in email, in your IDE, and during code reviews?
Danilo
00:51:07
So writing emails and doing code reviews are kind of the same thing: we do code review by mail; patches are sent through email. I would say roughly about half of the time definitely goes into writing mails, having discussions about code, and reviewing patches. A significant amount of time also goes into dealing with patches. It's not only about reviewing them as a maintainer; you also have to handle them, which means you have to apply them to your tree, you have to sanity check whether everything about the patch is correct in terms of process and form, and ideally you also do at least a build check. Then you take it into your tree, and then it goes into the next tree, which is a kernel tree that is basically capturing the trees of all kernel maintainers who want to be part of Linux Next. It regularly, I think even nightly, merges all the maintainer trees into one single branch and reports conflicts. Most of the time, the conflicts are solved by the Linux Next maintainer himself. Sometimes that's not possible; then he just drops the branch for the night and comes back to the maintainers asking: hey, what should I do here? What's the right solution? Part of the job as a maintainer is also to deal with that. And of course, you have to send pull requests, either to Linus himself or to the next maintainer in the hierarchy, at the end of the cycle. A development cycle is roughly about three months; then you send the changes that you accumulated over that time to Linus or the next maintainer in the hierarchy. And then, per release candidate cycle, you send the fixes for the last release, or, after the last merge window, for the next release candidates, to Linus or the maintainers as well. This goes on for the three months, then you have the next release, and the cycle starts from the beginning. Otherwise, yeah, I spend a lot of time in my editor, of course, writing actual code. But there's also a lot of time that just goes into coordinating: coordinating with other developers, with companies we're partnered with, or with people from the community.
Matthias
00:53:44
And who's next in line for you? Do you send a patch to Greg Kroah-Hartman or someone else in the hierarchy?
Danilo
00:53:53
So that depends. I'm maintaining a couple of trees and a couple of different components or subsystems and drivers. I'm sending patches to Dave Airlie, who's the maintainer of DRM. Previously, I did that for the Nova tree, and now I'm doing it with the DRM Rust tree; we have our own DRM tree for Rust components, which is something we do differently than other subsystems. Then I submit pull requests to Miguel, who's maintaining the Rust tree; this is for the allocator stuff that I mentioned previously. And I'm also sending pull requests to Linus himself for the driver core tree that I co-maintain.
Matthias
00:54:33
Nice. And what does the tooling look like, then? You mentioned that you need to use an email tool, and you need to use at least an IDE. So what's the weapon of choice?
Danilo
00:54:44
Right, yeah. So this is where my past of being a C kernel engineer from the get-go kind of shines through, which is: I'm really just using Vim, not even Neovim, just Vim, and Mutt, and that's basically about it. I recently switched to aerc for email, which is another console client. But that's really all I use. No fancy IDE, no language server, nothing.
Matthias
00:55:15
What about the Rust tooling? Do you use any tooling other than cargo format? Is there any kernel-specific tooling as well, maybe that you've built or someone else built, which does various administrative tasks?
Danilo
00:55:30
No. I mean, the formatting and all the Rust-specific things are kind of built into the kbuild system, so you're not really running cargo format, I don't even know if that's an actual cargo command; it's integrated into the kernel build system. In DRM, we have some maintainer tooling called dim. It's a helper to apply patches, do back merges from other trees, and send pull requests, so it basically makes a couple of maintainer and committer tasks a bit easier. But other than that...
Matthias
00:56:08
No, just... I find it fascinating how much you can get done with these tools, just sticking to them and putting in the hours and putting in the work. That's fascinating. Right, Danilo, if someone is now interested and intrigued to learn more, and perhaps even wants to contribute to the Linux kernel at some point, especially the Rust part, where would they go? How could they learn more?
Danilo
00:56:38
So I think the best entry points here are the Zulip we have, where the Rust4Linux community is usually around, and also the Rust4Linux team is around and can give hints on where to start if people have a specific topic of interest they want to work on in the kernel or may be interested in. Otherwise, if you just want to get started doing your first steps, the Rust4Linux project maintains an issue list with good first issues in the GitHub repository. We try to keep the list big enough so that everyone has something interesting to find and to pick out of it. We keep things around where we know, okay, this would be a nice rework; it's not super pressing to get it done, but it's something we would like to see happening, and it's a good thing for someone new to start contributing with. Those are the things we keep around there, and I can really recommend having a look at that.
Matthias
00:57:45
That was amazing. Everyone who's interested, please check out these resources. We will also link to them in the show notes. Finally, what's your message to the Rust community?
Danilo
00:57:55
Oh, I don't know if I have a message to the Rust community. I probably want to say thank you, because learning Rust and writing Rust code made me a better engineer, also for the C parts in the kernel that I maintain. It made me think differently about certain problems and gave me a different perspective on approaching problems, which is very helpful. So my message is probably: keep up the great work. Otherwise, when it comes to the things that the kernel needs from the Rust project, and there are definitely things that are needed, I also want to say thank you for the great collaboration. I'm not that involved myself, I definitely want to make that clear; we have other members of the Rust4Linux team who are regularly talking to people from the Rust core team and from the Rust project in general. And there is great collaboration already ongoing. So yeah, I'm very happy to see how things go.
Matthias
00:59:18
Amazing. Danilo, thanks so much for taking the time for the interview today.
Danilo
00:59:23
Thanks.
Matthias
00:59:25
Rust in Production is a podcast by corrode. It is hosted by me, Matthias Endler, and produced by Simon Brüggen. For show notes, transcripts, and to learn more about how we can help your company make the most of Rust, visit corrode.dev. Thanks for listening to Rust in Production.