Rust in Production

Matthias Endler

Microsoft with Victor Ciura

In this first episode of season 4, we talk to Victor Ciura about large-scale Rust adoption at Microsoft

2025-04-17 73 min

Description & Show Notes

Victor Ciura is a veteran C++ developer who worked on Visual C++ and the Clang Power Tools. In this first episode of season 4, we talk to him about large-scale Rust adoption at Microsoft.

Victor works as a Principal Engineer on the Rust team in Microsoft's Developer Division, building the compiler toolchain and libraries needed for the broader Rust efforts across the organization. He is a regular speaker at conferences like CppCon and also spoke at EuroRust 2024.

We talk about Microsoft's first steps with Rust, widespread implementation across key products and services, and Hyrum's Law.

About Microsoft

Microsoft is a company that needs no further introduction. From Windows to Azure, Microsoft has a wide range of products and services. A while ago, Microsoft started an initiative to bring Rust into core parts of their ecosystem.

About Victor Ciura

Victor is a well-known C++ expert leading Microsoft's Rust adoption efforts. His work focuses on ensuring Rust interoperates with existing C++ codebases and integrating the Rust compiler and toolchain into the Microsoft ecosystem.

Transcript

Welcome back to a fresh season of Rust in Production, a podcast about companies who use Rust to shape the future of infrastructure. My name is Matthias Endler from corrode, and today we talk to Victor Ciura about large-scale Rust adoption at Microsoft. Victor, thanks so much for joining us today. Can you say a few words about yourself and about Microsoft, the company you work for?
Victor
00:00:27
Hey, thanks for inviting me. I'm a principal engineer in the Developer Division at Microsoft. Before coming to Rust, I spent over 20 years doing C++ exclusively: systems programming in general and developer tools. I worked on various projects, including being part of the Visual C++ team at Microsoft. In recent years, I transitioned to working on Rust tooling in the Developer Division.
Matthias
00:00:56
We met at EuroRust in 2024, and I had a really pleasant experience listening to your talk. It was very honest, brutally honest, let me say, and I really appreciated that. At the same time, one could see that you're a very experienced engineer. So I'm really looking forward to this conversation, because I think we will not only talk about the highlights of using Rust in production, but maybe also some of the downsides. There's probably a lot to talk about, but before we start, can you give us an overview, and maybe an update, of what has happened since EuroRust 2024?
Victor
00:01:40
Yeah, definitely. Microsoft is a very big place, but across all organizations, things in terms of Rust have moved a lot in the past year, and there are so many things brewing. Many of the projects are public facing, so they're public knowledge, but many of the things we're working on are not yet ready to be shared. So stay tuned: more is going to be made public this calendar year. Back then, we talked about a few of these early efforts with Rust across the company, but I have a few more that I wish I could spotlight here. There are all sorts of interesting projects, at least from my position in the Developer Division, because we are tasked with improving the lives of Rust developers in the company: improving developer tooling, unblocking them, and making sure they're productive and doing their best work across all the products and services we have. As you can imagine, I hear from teams across the company. They're either telling us how they're using Rust and what they love about it, or complaining about the stuff that doesn't work for them. And, oh, they're so loud. So in this role, I hear about a lot of interesting projects, and they all fascinate me. They're in areas that I've never worked in, and they're so diverse in nature, from firmware and hardware-related work to services and SDKs. Now I get to meet these projects and see how they're using Rust, the kinds of challenges they're facing, and their successes along the way. So I wanted to share some of these with you.
Matthias
00:03:47
Yeah, sure. Let's kick it off by highlighting some of the projects.
Victor
00:03:49
So there's work we've been doing, for example, with TockOS. Your audience is probably familiar with the TockOS kernel, which is written in Rust. We've used it in a project for what we call the Microsoft Pluton security processor. We've done some collaboration in open source here, because we're using this and we've bet on this open-source project. It's used in Surface PCs, and we're also using it in collaboration with our OEM partners in how we're building laptops and PCs.
Matthias
00:04:28
Just for the people who might not be familiar with TockOS: as far as I'm aware, and I'm not an expert on the topic, it is a real-time operating system that is very low level, very close to the hardware. It's perfect for when you need a very lean runtime and you want to run certain processes or tasks on top of that operating system, which more or less have to react to real-time data.
Victor
00:04:59
Exactly. For example, in the case of these Copilot PCs, we're using the Intel Partner Security Engine, which is specialized hardware for security on these devices. We bet on TockOS as a good fit for this, and that's how we ended up collaborating on this existing project, which is an amazing project and was deemed a good fit.
Matthias
00:05:27
And by collaborating, you mean contributing actual code to the project?
Victor
00:05:32
I can give you a funny story. People think that 32-bit is by all means dead. Many people don't even stop to ask, do I need 32-bit here? But this particular Intel chip that we use for security actually needs 32-bit. So we did the work to port TockOS to a 32-bit architecture. That's just one example.
Matthias
00:05:59
Is that already merged into mainline or what's the status on that?
Victor
00:06:04
I don't know. The PR is there, but I don't know if it's merged yet or not. But it's definitely public.
Matthias
00:06:11
Wow. Okay.
Victor
00:06:12
I think it should be merged by now.
Matthias
00:06:14
Yeah, I don't know how many 32-bit devices are out there, but I would say there must be millions, probably billions of these devices.
Victor
00:06:24
Yeah, when you think about these specialized cards or specialized hardware connected to other, more complex infrastructure, there's definitely still a need for 32-bit in some places, which people don't actually expect. They like to think just in terms of ARM64 or Intel 64-bit architectures.
Matthias
00:06:46
It's funny, because when I think of big tech, of very large organizations, I sometimes hear that it's easier for them to build their own solution. For example, Google used that as an excuse to build Fuchsia, which is also written in Rust. They could also have contributed to an existing project. But in your case, for TockOS, you very much contributed to an existing open-source project. Can you elaborate on that? What was the decision-making process behind it?
Victor
00:07:19
So I think for many decades, Microsoft suffered from the not-invented-here syndrome. We tended to think we're the smartest and do everything from scratch, because we can do it better than anyone else. I think we're cured of that. Right now, we're very much in the process of looking for what's best out there. If the best in class in open source fits our needs and aligns with our project needs and our roadmaps, then we're going to bet on open source. The Microsoft of today is very much about doing work in the open and leveraging work that is open source, so both consuming good-quality libraries and projects and contributing back. If we don't find such a thing, we will start it, of course. But in many cases, we find good projects out there that align almost perfectly with what we need, and then what's the point in reinventing the wheel, right?
Matthias
00:08:28
Yeah, true.
Victor
00:08:29
When we can improve it for everyone.
Matthias
00:08:31
Yeah, absolutely. You said TockOS was used in Surface devices, but isn't Surface a 64-bit device?
Victor
00:08:38
No, this is a specialized chip that's for the security processor.
Matthias
00:08:43
Ah, okay.
Victor
00:08:43
It's not the main integrated CPU.
Matthias
00:08:47
Okay, perfect. Yeah, so we already covered one very low-level project, one that is close to the hardware.
Victor
00:08:56
Another one, and I think we did this a while back, is Project Mu, where we did a UEFI implementation that's actually used in Surface laptops and even in Azure boxes. So there's a custom UEFI implementation in Rust. And this is, again, an open-source project. People can actually Google for the GitHub repo for it.
Matthias
00:09:22
We can probably link to it in the show notes.
Victor
00:09:23
I will. Yeah, I can provide you with a bunch of links to these things. Then again, if we're thinking about these kinds of things, we have the Caliptra project, which is about a hardware root of trust, a foundation for building more advanced security capabilities. Again, this is a fully transparent industry collaboration effort that we're contributing to. And again, Azure Integrated HSM: this is Microsoft's new in-house security chip. It's meant for isolating cryptographic keys in dedicated vaults. It's FIPS compliant. It's about securing key exchange between Azure VMs and all these things. So again, we're using Rust in all these security-sensitive areas of our cloud infrastructure, whether we're talking about hardware that supports the cloud and devices, or about the hypervisors and the services themselves and how we're securing this trusted compute platform in the cloud. So it's all across the board, from firmware and hardware and dedicated cards up the software stack at all levels. Take Azure Boost agents, where we offload onto hardware cards the management of Azure VM workloads and the virtualization for storage and networking in Azure. Even Hyper-V, which underpins the whole Azure infrastructure for all the VMs. Classic Hyper-V was written, I don't know, 25 years ago now. I don't even remember. It was written in C++, and the bulk of it still is C++. But we're rewriting parts of it in Rust right now: either parts that we want to secure, so we want to harden various parts of the hypervisor, or new parts that we're writing. For example, the instruction emulation for ARM64 in Hyper-V is written completely in Rust.
So in the whole hypervisor virtualization stack, more and more components in Hyper-V are being rewritten from C++ to Rust, because we want to harden those components. And I think that's a theme across Microsoft, not just in Azure, although Azure may be the most aggressive organization in Microsoft in doing these Rust rewrites. On the one hand, we're trying to harden C++, because we have a lot of C++ at Microsoft, as you might imagine: billions of lines of C++ code. We need to harden C++ because it's not going to go away anytime soon. So we're investing a lot in securing C++, both in terms of methodologies around writing C++ and especially in terms of tooling. At the same time, we're doing these tactical Rust migrations or Rust rewrites, where we identify pieces of code or components that have traditionally been targeted for vulnerabilities over the years: the most sensitive parts, or the things that have the most CVEs or exploits caused by memory exploitation. We're rewriting them in Rust to secure them and to be able to prove their security. It's not just, oh, it's Rust, it must be secure. Rewriting them in Rust gives us ways to reason about proving their safety and to do all sorts of better analysis on them.
Matthias
00:13:06
Can you share any internal metrics about how security improved after rewriting parts of it in Rust?
Victor
00:13:12
I don't have numbers on hand, but I've seen plenty of presentations. We have dedicated security teams in the company that analyze these kinds of patterns across multiple organizations and projects, and they regularly post statistics. So I've seen such presentations internally, but I don't have the numbers from them. In general, people tend to think this is anecdotal: you rewrite it in Rust and it will just magically be better. And they deride this. But we have hard data showing that components that have been rewritten are much more solid. We can even formally prove some of them to be memory safe, in terms of both spatial and temporal safety. I've seen similar studies from Google on their security blog as well, providing statistical information to show that these rewrites are improving the overall security of systems. More interesting, what I saw in the Google security study that I read last year is that they noticed the incidence of memory safety vulnerabilities decreasing just from writing new components in Rust, without even needing to touch existing components. That's because the vast majority of exploits and bugs tend to be in recent code, not in code that has been tested and tried and patched for decades. The most vulnerable code tends to be the most recent code, because it has been less tested or less battle-hardened, let's say.
Matthias
00:14:54
Do you see the same pattern at Microsoft?
Victor
00:14:56
Yeah, we're seeing similar things. We're not madly going across the board and saying, just for the fun of it, let's rewrite everything in Rust. We're making tactical choices where we say, okay, this new component, it's better if we write it in Rust. Or this component, we feel that rewriting it in Rust will give us a chance to rethink it, re-architect it, prove different pre- and postconditions about it, and be more deliberate and more precise about what we can prove about this particular piece of code. So it's not a blanket statement that if we rewrite everything, everything will be magically better. We're making very tactical choices here.
Matthias
00:15:49
Google came forward and said 70% of their issues with code in general were memory safety issues. And I would assume that it's similar across different organizations. I think I even remember that there was a similar study from Microsoft.
Victor
00:16:06
Yeah, I think I presented the slide on that at one or two of the conferences last year. We're definitely seeing the same thing: memory safety dominates all CVEs across Microsoft. It's definitely over 70 percent at Microsoft, so the majority are memory safety issues, and they're all across the board: heap corruption, heap out-of-bounds accesses, stack corruption, type confusion, uninitialized variables, use-after-frees, all sorts of things. So we're definitely seeing the same kinds of things. In terms of mitigations, yes, like I told you, we're constantly developing newer and newer technologies for C++ to address this. But with Rust, you get most of these things out of the box. Again, there are challenges even there, but in terms of memory vulnerabilities, we can definitely do a lot better.
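To make "out of the box" concrete, here is a minimal sketch of how safe Rust handles three of the bug classes listed above: out-of-bounds access, use-after-free, and uninitialized variables. The function names `read_at` and `consume` are made up for illustration and are not taken from any Microsoft code.

```rust
/// Spatial safety: an out-of-bounds read yields `None` instead of touching
/// memory past the end of the buffer. (Plain indexing `buf[index]` would
/// panic rather than read out of bounds.)
fn read_at(buf: &[u8], index: usize) -> Option<u8> {
    buf.get(index).copied()
}

/// Temporal safety: ownership of the buffer moves into this function, so the
/// caller cannot touch it after it has been freed at the end of the body.
fn consume(buf: Vec<u8>) -> usize {
    buf.len()
} // `buf` is dropped (freed) here

fn main() {
    let data = vec![10u8, 20, 30];

    assert_eq!(read_at(&data, 1), Some(20));
    assert_eq!(read_at(&data, 99), None); // no out-of-bounds read

    let len = consume(data);
    assert_eq!(len, 3);
    // println!("{:?}", data); // would not compile: use of moved value `data`

    // Uninitialized variables: the compiler rejects reads before assignment.
    let total: u32;
    // println!("{total}"); // would not compile: `total` is not yet initialized
    total = len as u32 + 1;
    println!("total = {total}");
}
```

The two commented-out lines are the interesting part: in C++ the equivalent code compiles and misbehaves at run time, while here the compiler refuses to build it.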
Matthias
00:17:08
Was Azure the first Rust project at Microsoft? Was Azure the first to adopt Rust?
Victor
00:17:16
I don't know for sure at this point. When Rust started at Microsoft several years ago, it was very much a bottom-up thing, where various teams or enthusiastic engineers tried things out, tried to rewrite various components for various reasons, and reported results: Did it work? What were the challenges? What didn't work? So it was very much an experimentation phase. In the following years, based on early successes, there were a bunch of incubation projects across the company in all organizations, not just Azure: in Windows, in Office, in Azure, in M365, all across the company, really. We did this for a couple of years with various degrees of success. I think the most impressive thing is that teams were super enthusiastic, and they felt the experience was, in general, a win, even if a particular project didn't get the green light to go forward. Maybe it was just proving or prototyping something; not all projects get the green light, for various reasons. But even in cases where we just learned from a particular experience, teams were super enthusiastic, and they felt like this could really change how we do software at Microsoft. After that incubation phase, more and more projects started doing Rust seriously. We have internal hackathons a couple of times a year. And we're long past the phase where people enthusiastically say, let's try something, or let's incubate this and see how it goes. Right now, we're in a phase where Rust at Microsoft is taken really seriously. And that's both good and scary at the same time. It's good because it means more teams and developers are empowered to use Rust. So it's no longer a sort of covert pirate operation to do something in Rust.
It's applauded and encouraged when it's appropriate. But it also sets a very high bar in terms of expectations, because across the company, people feel that Rust has graduated to being a premier programming language. That means they have the same kinds of expectations of Rust that they have of C++, C#, and TypeScript internally, since those were traditionally the tier-one, premier languages in the company. Now that Rust is coming to be seen as a tier-one language at the company, all these developers, who are getting more and more ambitious in building cool stuff with Rust, have high expectations. They come from these worlds, from C++ and from C#, and they expect developer tools to be at the same level of maturity. They expect their developer experience, their debugging experience, their tooling, and their infrastructure to be ready. So they have all these expectations about where Rust should be, at least internally.
Matthias
00:20:57
I would assume that also very much depends on the project they are working on. For example, if I work on the Windows kernel, I have different expectations about my tooling than, let's say, if I work with cloud providers or cloud services.
Victor
00:21:11
Definitely, yes.
Matthias
00:21:12
Or do you see similar patterns even across low level and high level? Does it even matter where people use Rust? Does it have the same deficiencies across the board?
Victor
00:21:23
There are definitely things that are universal. As I mentioned, Microsoft is a very big place, and various organizations have their own internal protocols, specific compliance rules, specific infrastructure needs, and so on. So while there are many commonalities, there are also very specific needs depending on the project. I would say the likes, the enthusiasm for Rust, are maybe the universal thing across Microsoft organizations. Everyone loves fast dev/compile iterations, that "if it compiles, it most likely works" kind of feel. Or, if it's easier to write tests, people will write more tests, or be more encouraged to write a lot of tests. And there's the richness of the ecosystem and the libraries out there, the memory safety guarantees that we talked about earlier, and even the data-race-related concurrency bugs that you can reduce with this. So the things they love seem to be universal. And performance, let's not forget that. Many of the people coming to Rust in the company are coming from garbage-collected programming languages like C#. So they like the native systems programming language feel, the deterministic destruction, and the snappiness of Rust codegen. Although the likes are almost universal across the company, the dislikes are very much specific to each team: what problem they face most, or what they struggle with. For example, in terms of infrastructure, compliance, and security promises, there's a common bar across the company, but some organizations have more stringent rules about what they need to prove about the compliance and security of the software they ship. So this is where engineering systems need to work to fill in the gaps. We've worked for decades to implement all sorts of workflows and engineering systems to deal with C++ codebases, C# codebases, and TypeScript codebases.
And we built all sorts of tools, from static analysis to checkers to binary tools that check various promises and guarantees about the produced binary artifacts, or binary hardening tools. We have a plethora of internal tools that we use in our engineering systems. As you can imagine, a lot of these are not there for Rust. They either don't exist at all, or they just fall flat on their face with Rust-generated code. So we need to fill in these gaps, or improve existing tools if there are any. Static analysis is one example, but there are also lots of tools at the binary level.
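The "deterministic destruction" mentioned above is what developers coming from garbage-collected languages tend to notice first: a value's cleanup runs at a known point, the end of its scope, and not whenever a collector gets to it. A minimal sketch, assuming a hypothetical `Resource` type that records its own cleanup in a shared log:

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical resource that logs when it is released.
struct Resource {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Resource {
    // Runs exactly once, when the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("released {}", self.name));
    }
}

fn run() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Resource { name: "a", log: Rc::clone(&log) };
        let _b = Resource { name: "b", log: Rc::clone(&log) };
        log.borrow_mut().push("end of scope".to_string());
    } // `_b` is dropped first, then `_a`: reverse declaration order

    let events = log.borrow().clone();
    events
}

fn main() {
    for event in run() {
        println!("{event}");
    }
}
```

The order of the log entries is fixed by the language rules (locals are dropped in reverse declaration order), which is why the cleanup is predictable without a finalizer or an explicit `close` call.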
Matthias
00:24:43
So do the different teams even share that feedback with one another? Say, for example, someone at Azure finds a great tool. How do you share that internally? You probably have some sort of Microsoft Teams channel or multiple even about tooling for Rust.
Victor
00:25:00
So there are two things to unpack. Yeah, we definitely have a thriving Rustaceans community inside. We actually have internal forums and Teams channels for Rustaceans across the company, where we share learnings with each other, ask questions, and share things we discover. So that's the community side of things. We have many, many Rust developers internally who are discussing all sorts of interesting stuff there, from sharing pieces of code or a tool to asking for help.
Matthias
00:25:34
Just roughly how many people are in these channels?
Victor
00:25:37
It's in the thousands, I think. I haven't looked in a while, but it's a lot of people. It always generates a lot of noise in those channels, a lot of discussions every day. So that's for sharing, guidance, asking, and discovery. These internal communities are amazing, and I've learned so much from just being in those threads. But going back to your original question, in terms of tooling and engineering systems, that's a totally different story. That's not something that happens randomly. We have dedicated engineering systems teams inside the company, and we have policies across the board, so a team cannot just build something the way they like to build it. There are policies and specific rules about how you set up infrastructure, how you set up pipelines, what kinds of tools and tasks you run, and how you do all sorts of processing and audits on the build systems. So there's a big branch of engineering systems in the company that's responsible for supporting all the programming languages used in all projects across the company, regardless of the organization.
Matthias
00:26:57
Can you share a few of the best practices that might be helpful for people outside of Microsoft? Maybe general things that you found out about how to set up a project in Rust.
Victor
00:27:09
Oh, I think we need more than an hour just for that. What we are doing internally is setting up what we call paved paths for various programming languages. There's a paved path for C++ projects, a paved path for C#, and a paved path for Rust, which we're building right now. We're setting up workflows and templates so that people don't have to do this kind of setup manually for each project, because you don't want to end up with snowflakes across the company that are configured slightly differently or are using slightly different things. We very much want to standardize best practices. So we have pipeline templates, we have project templates, and we have infrastructure tasks that are shared, so it's not a manual bring-up process. That's why it's such a high bar to make Rust a tier-one supported language in the company: there are so many of these things that we need to make sure are ready for prime time and satisfy the needs of all the projects. This is very much a unified rollout, regardless of whether we're talking about secret scanning, static analysis, any kind of SDL requirement in terms of security analysis, or any kind of binary checks that we're doing on software. There are lots of such tools and policies, and they all need to be applied uniformly. When a new Rust project comes online in the company, it needs to have this proper setup, these pipelines, and these templates in place in terms of engineering systems and infrastructure. It's one thing to start something as a hackathon project, build it as a proof of concept, and share it with a few colleagues in a team. That's a different story.
But when you promote it to being a proper project in the company, then it needs to be integrated into this engineering system.
Matthias
00:29:19
I can see how this can work for a microservice environment where, for example, you build a new Azure component. You have a bootstrapping system for that. But I wonder how it works on the lower-level layers. For example, in one of your talks, you mentioned that you chose DirectWrite as one of your first Rust experiments. DirectWrite is a text layout and rendering component that's used across Windows and Office, and you mentioned that it has a COM-like interface, so it has this Component Object Model interface. I wonder if you can even have some sort of bootstrapping pipeline for this when you start such a project.
Victor
00:30:00
Yeah, that's actually a very good segue. I wanted to mention this: not all of these things happen as greenfield projects, or as some perfectly isolated thing where I'm going to extract this new functionality or this new component, and it's so nice and pristine, it's self-contained, it has everything you need, and I can test it. It's rarely the case that something is so nicely cut away from everything else. Sometimes you need to work on something that's highly intertwined or has tight integrations into something bigger. Then it becomes more challenging: how you're doing this bridge, how you're doing these migrations, how you're testing it, and definitely how you're doing infrastructure for it. Just for your audience, in case they don't know: DirectWrite, which you mentioned earlier, is a full-stack text analysis framework. It handles layout and rendering for textual output, so it does font shaping and all those things. It ships in Windows as a DLL. Part of it is even cross-platform, which is DirectWrite Core, and this is the part that was rewritten in Rust. For example, Office components depend on DirectWrite Core, and Office is cross-platform, as you know. I believe we did this rewrite about four, maybe five years ago, if I remember correctly. This component does indeed use COM-like interfaces, as you mentioned. That sounds potentially like a scary thing, but it can actually be very helpful when you're doing these gradual rewrites. One of the biggest challenges we're seeing for gradual Rust adoption at Microsoft is around interop, and specifically interop with C++.
And I specifically wanted to spend some time talking about this, because it's such a big and challenging topic. But for components that have these nice COM-like boundaries, although it might sound scary, it's actually a blessing, because it provides the natural surface separation between the worlds that you want to bridge. COM is sort of the grandfather of all interop technologies, so it's been solving interop problems for, I don't know, 40-plus years now. It allows for gradual rewrites and gradual migration, and it shapes how you do this carving out for incremental porting. So that actually turned out to be a good thing. Rust code is directly callable from app code through COM interfaces, and you have this almost natural separation that was provided architecturally beforehand. But testing is not easy. For this component specifically, for DirectWrite, I can give you some numbers; I was just looking. The ported code was about 150,000 lines. It was pretty much a natural translation from C and C++ to Rust. It didn't require a major re-architecting or rethinking of the component. It was fairly straightforward.
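To illustrate why a COM-like boundary makes incremental porting tractable, here is a heavily simplified sketch of the underlying idea: the caller only ever sees a pointer to an object whose first field points to a table of C-ABI function pointers, so the implementation behind that table can be C++ or Rust without the caller knowing. Everything here (`TextMeasurer`, `TextMeasurerVtbl`, `create_text_measurer`) is hypothetical and is not the DirectWrite API. Real COM additionally requires `IUnknown` with `QueryInterface`, `AddRef`, and `Release`, which this sketch leaves out.

```rust
use std::ffi::c_void;

/// Hypothetical COM-style vtable: a table of C-ABI function pointers.
#[repr(C)]
struct TextMeasurerVtbl {
    measure: unsafe extern "C" fn(this: *mut c_void, text: *const u8, len: usize) -> u32,
    release: unsafe extern "C" fn(this: *mut c_void),
}

/// Object layout seen by callers: the first field is the vtable pointer.
#[repr(C)]
struct TextMeasurer {
    vtbl: *const TextMeasurerVtbl,
    glyph_width: u32, // implementation detail hidden behind the interface
}

unsafe extern "C" fn measure(this: *mut c_void, text: *const u8, len: usize) -> u32 {
    // Safety: the caller passes the pointer obtained from `create_text_measurer`
    // and a valid (pointer, length) pair for the text.
    let this = unsafe { &*(this as *const TextMeasurer) };
    let bytes = unsafe { std::slice::from_raw_parts(text, len) };
    match std::str::from_utf8(bytes) {
        Ok(s) => s.chars().count() as u32 * this.glyph_width,
        Err(_) => 0, // invalid UTF-8 is reported as zero width in this sketch
    }
}

unsafe extern "C" fn release(this: *mut c_void) {
    // Safety: `this` came from `Box::into_raw` in `create_text_measurer`.
    drop(unsafe { Box::from_raw(this as *mut TextMeasurer) });
}

static VTBL: TextMeasurerVtbl = TextMeasurerVtbl { measure, release };

/// Factory function; a real library would also export it under a stable
/// symbol name so that C++ code can link against it.
pub extern "C" fn create_text_measurer(glyph_width: u32) -> *mut c_void {
    Box::into_raw(Box::new(TextMeasurer { vtbl: &VTBL, glyph_width })) as *mut c_void
}

/// Plays the role of the caller: it knows only the vtable layout.
fn call_through_interface(glyph_width: u32, text: &str) -> u32 {
    let obj = create_text_measurer(glyph_width);
    unsafe {
        // Read the vtable pointer stored at offset 0 of the object.
        let vtbl = *(obj as *const *const TextMeasurerVtbl);
        let width = ((*vtbl).measure)(obj, text.as_ptr(), text.len());
        ((*vtbl).release)(obj);
        width
    }
}

fn main() {
    println!("width = {}", call_through_interface(8, "Rust"));
}
```

Because the caller depends only on the table layout, one interface at a time can be swapped from a C++ implementation to a Rust one, which is the "carving out" described above.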
Matthias
00:33:55
That's nice, because one thing that a lot of people cherish about Microsoft and Windows is that there's such a long guarantee of backwards compatibility. You basically never break APIs if possible. I imagine that might be super hard if you do a translation from C++ to Rust. For example, you need to make sure that all of the invariants are upheld while you do the translation.
Victor
00:34:24
It's very hard, and I would say the bar is even higher than people expect. In general, people expect that the contract is the API that you provide: this is the API, and if the API offers the same guarantees, we can change the implementation however we want, and nobody will be affected. In reality, sometimes, not always, it's even worse than that. Even when there's a detectable change in behavior that is not enforced by an API contract, some people get angry. And when I say some people, I don't just mean people outside the company; I mean even internal consumers of various components, if there's a change in a behavior that they ended up depending on, even if it was out of contract. You know Hyrum's Law: if someone somewhere can take a dependency on some observable behavior of your component, they will. It's a statistical fact. And given the sheer scale of Microsoft, and Windows in general, you will end up with components that rely on out-of-contract behavior from various other components. If you change that behavior slightly, they will complain.
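Hyrum's Law is easy to reproduce in miniature. In the sketch below, two implementations honor the same documented contract ("returns the distinct names, in unspecified order"), but a consumer has quietly come to depend on the first implementation happening to return sorted output. All names here (`font_names_v1`, `font_names_v2`, `default_font`) are invented for illustration.

```rust
/// Contract: returns the distinct font names, in unspecified order.
/// This implementation happens to return them sorted.
fn font_names_v1(input: &[&str]) -> Vec<String> {
    let mut names: Vec<String> = input.iter().map(|s| s.to_string()).collect();
    names.sort();
    names.dedup();
    names
}

/// Same contract, new implementation: keeps first-seen order instead.
fn font_names_v2(input: &[&str]) -> Vec<String> {
    let mut names: Vec<String> = Vec::new();
    for s in input {
        if !names.iter().any(|n| n.as_str() == *s) {
            names.push(s.to_string());
        }
    }
    names
}

/// A consumer that relies on out-of-contract behavior: it assumes the first
/// entry is the alphabetically smallest name.
fn default_font(names: &[String]) -> Option<&str> {
    names.first().map(String::as_str)
}

fn main() {
    let installed = ["Segoe UI", "Arial", "Consolas", "Arial"];

    let v1 = font_names_v1(&installed);
    let v2 = font_names_v2(&installed);

    // Both results contain the same three names, so the contract holds,
    // yet the consumer's observable behavior changes between versions.
    println!("v1 default: {:?}", default_font(&v1));
    println!("v2 default: {:?}", default_font(&v2));
}
```

Nothing in the second implementation is a bug by the letter of the contract. The breakage comes entirely from the consumer's undocumented assumption, which is the situation described above.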
Matthias
00:35:52
Do you have an example for that?
Victor
00:35:54
Not off the top of my head, but it's not just isolated cases, right? There are many such cases where you end up making a change and somebody gets mad at you.
Matthias
00:36:11
Yeah, I would assume, for example, if you have a library like DirectWrite or GDI, which is the Graphics Device Interface, which goes back to the 80s and early 90s. You also mentioned that in your EuroRust talk: if you have such old components, the side effects even become part of your ABI. Exactly. Yeah, even just the runtime of a certain function might have an impact on your call, on your system. Yeah.
Victor
00:36:39
When we're talking about this, and for GDI, we're talking sort of late 80s, early 90s kind of thing, right? It was designed for the 286. That kind of time frame. So when you're talking about these kinds of components, and if you imagine how many millions of lines of code use those APIs, and in what weird ways they've been used and abused (because I think the right word is sometimes abused, not used), you will end up breaking someone. So yeah, it's highly critical to make sure that these changes do not upset the existing ecosystem. And for Windows, there's a very high bar in terms of how they're testing stuff. It's not just: oh, we can do unit tests on this functionality, and all unit tests passed, and we're good, ship it. It's not like that. For every change, you need to build the whole operating system, and you need to pass an enormous stress test suite. It might be record-breaking stuff in terms of the amount of tests that you need to wait for to make sure that everything just works. All sorts of high-level, low-level integration tests, real workloads, all sorts of apps. Windows tests tons of apps, and they automate all sorts of workflows through those apps, even apps that are super old. So you wouldn't believe the high bar of what it means to change something in Windows. And especially in a component as low-level as GDI, for example, you can break so many things.
Matthias
00:38:28
Especially with those old components like GDI, people will probably find new ways to make unsafe calls and maybe try to break it. Maybe sometimes intentionally, sometimes not intentionally, but just trying to achieve their goal. And they make calls which maybe the original authors haven't anticipated. And I'm thinking this is probably where Rust can help you, and you alluded to that in a talk where you said the more code you ported to Rust, the less unsafe code you need. Could you walk us through that journey? Like, why is that a thing?
Victor
00:39:10
Yeah, because when you start sort of cutting up pieces that you start rewriting, you need to interact with the old code. Sometimes it's even a catch-22 problem, but anyway, even if it's not a circular thing, you still need to call the old code, let's say old C code or old C++ code, and that is sort of on the untrusted boundary. So when you have a little bit of Rust and 98% is still the old code, your unsafe surface is ginormous. So you need to have a lot of sort of sprinkling of unsafe and crossing the FFI boundary and doing some really unsafe interop, because most of your surface is unsafe. But as you're doing this gradual process of rewriting more and more code in Rust, this unsafe boundary reduces over time. So you have less and less unsafe parts until you, ideally, achieve the total rewrite of that component. And even if you end up consuming some unsafe bits, you can at least provide a safe wrapper around them. If, let's say, there's a last 2% that you can't get rid of for some reason, I don't know, you can at least wrap a safe projection around the unsafe part that still remains and provide some preconditions or assertions about how it's supposed to be used in contract, so that it still is sound, right? Even if underneath it is unsafe in some way. So, yeah, this is sort of a general trend you see when you're doing these kinds of C or C++ to Rust migrations: the more you do, the less unsafe you need. And since you mentioned unsafe, a common misconception that people have comes up in many situations, because we're now talking about low-level code where you want intimate access to hardware and high performance and all this good stuff, as in the close-to-the-metal kind of thing in systems programming.
They just assume, like, of course, that's why they've been using C and C++, because that's what you get. You get the sharpest tool, the closest to the metal, the highest there is, right? And when you're doing this in Rust, then the expectation is: oh, I'm going to need to do a lot of unsafe Rust, because I need to talk to hardware, I need to do a lot of unsafe things, I'm a clever systems engineer here and I do all sorts of dangerous things. So the expectation is that you're going to write unsafe Rust, which is not the case. Or they expect that you're going to need to write a lot of unsafe Rust to be as performant as C or as performant as C++ when you're doing that rewrite, and they say: I need to touch pointers, I need to do the unchecked thing because it's faster. This is a misconception in most cases. What we've seen is totally the opposite: in many situations, we're actually getting better codegen out of the compiler when we're writing the equivalent safe abstractions, because the compiler has more guarantees about what's going on in your code in terms of aliasing, in terms of interf, in terms of how that code is structured. Whereas if you're writing a lot of unsafe, all bets are off, as in the compiler needs to be conservative and not as aggressive in what it can do in terms of code movement and optimization. So in many situations, against the preconception of let's throw a lot of unsafe here because we're coming from C++ anyway and we want to have the raw hardware performance, we've seen the total opposite. Yeah. We're very much recommending that people use safe Rust and not reach for unsafe just because it's an easily available escape hatch, right? Because sometimes it's actually worse.
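The "safe projection around the unsafe part" idea can be sketched as follows. This is a minimal illustration, not code from any Microsoft component: `legacy_sum` is a hypothetical stand-in for an old C routine. It is written in Rust here only so the example is self-contained; in a real port it would be declared in an `extern` block and linked from the existing C or C++ code.

```rust
use std::os::raw::c_int;

/// Hypothetical stand-in for a legacy C routine that sums an array.
///
/// # Safety
/// `data` must point to `len` readable, initialized `c_int` values.
unsafe extern "C" fn legacy_sum(data: *const c_int, len: usize) -> c_int {
    let mut total: c_int = 0;
    for i in 0..len {
        // SAFETY: the caller guarantees `data` is valid for `len` elements.
        total = total.wrapping_add(unsafe { *data.add(i) });
    }
    total
}

/// Safe wrapper: the slice type carries the pointer and the length together,
/// so callers cannot pass a mismatched pair or a null pointer.
fn sum(values: &[c_int]) -> c_int {
    // SAFETY: a slice is always valid for reads of `values.len()` elements,
    // which is exactly the precondition `legacy_sum` documents.
    unsafe { legacy_sum(values.as_ptr(), values.len()) }
}
```

Callers only ever see `sum`, so the single `unsafe` block in the wrapper is the one place where the precondition has to be reviewed.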
Matthias
00:43:33
Well, there's another large project that is very close to the hardware, which is currently trying to adopt Rust. And that is the Linux kernel. And, you know, on the mailing list, you can see this play out a little bit nowadays, where you have a lot of infighting between very experienced, longstanding developers. And on the other side, you have these people who want to try the new, who want to have safer interfaces. Do you see similar patterns at Microsoft, where you have a lot of infighting between older, more senior kernel developers and people that maybe want to introduce Rust just because they think, you know, it will improve the situation in the future? But I guess short-term it will also incur a maintenance overhead, because you have to keep the Rust interface and the C or C++ interface in sync.
Victor
00:44:29
So I'll preface this with the fact that I'm not directly involved in Linux or the Rust in Linux project. But in terms of what I'm seeing at Microsoft, I would say there's no such polarization that I'm perceiving. Again, I'm not part of the Windows team. I just collaborate with a lot of people from there, because I'm in the Developer Division and they use our stuff. So I'm not a kernel developer there. So again, it's sort of a semi-outsider opinion, right? But definitely, from what I'm seeing internally, I don't see this strong polarization. Yes, in the company, you have your Rust enthusiasts and evangelists who are trying to convince everyone that they should do everything in Rust, because they're so enthusiastic and it's been so successful for them and their projects. So they're genuinely trying to help everyone. And sometimes they can be a bit overwhelming. And what I see is, in general, when you apply a lot of pressure, you automatically get some resistance from people, right? Even if they don't have strong counterarguments, they don't like that you're coming on strong, so they would just naturally resist. But I definitely don't see polarization as such. People are genuinely looking at stuff at face value, as in: does this thing help me, right? Does it make the component better? Does it improve the security of this thing? If we're talking about coming from C and C++ to Rust, it's rarely the case that they're after extra performance, right? Because they already have the performance side of the story. They're mostly interested in: can I improve the ergonomics of the code? Can I improve the security of the code? So it's very much about: can I secure this component? Can I reduce my headaches, right?
Matthias
00:46:28
Let's say I wanted to start a new project in C++ in 2025 at Microsoft, would that raise some eyebrows?
Victor
00:46:35
There is no company-wide mandate banning new projects using C++, or even C for that matter. It's maybe frowned upon, if I were to put a label on it, but there's no mandate as in: if you're starting something new, you better not touch C++. There's no such thing in the company. There is a strong mandate like that in Azure and in highly security-critical areas, right? If you're thinking of security modules, cryptography, those sorts of things, and hypervisor things, highly sensitive areas of cloud infrastructure, right? In those areas, yes, there is sort of a mandate for memory-safe languages. And in many situations, Rust is the only one that fits the bill in terms of being memory safe and performant. In other situations, C# fits the bill just fine, right? So it depends on the kind of workload, but there's no such thing as: across the company, you're not supposed to write new stuff in C and C++. And statistically, just because of the size of the stuff around, there's more C++ code being written every day than Rust code. Just imagine the scale of Microsoft products. So although the quantity of Rust code increases exponentially in the company, there's so much C++ in the company that new code that is written or fixed or improved in those components will need to be C++, right? And it will take a while before this ever changes, right? Just imagine the billions of lines of C++ there in the company.
Matthias
00:48:35
It's kind of crazy to think that Rust's growth is exponential at Microsoft. That's kind of a positive.
Victor
00:48:42
I'm not casually calling it that way. So being in the Developer Division, my team is in a unique position to see this firsthand, because everyone sort of comes to us when they have a problem. The Developer Division is tasked to improve the tools and create the tools to help developers across the company be successful. So based on how I see requests coming in, and the conversations I'm part of, and the investments that I'm seeing across the board in terms of supporting these tools and infrastructures, this is the only natural conclusion you can draw, right? Because I think we reached an inflection point in the company where we're all in on Rust, as in, we're committed to making Rust succeed at Microsoft, because we've seen so many successes in various projects, and teams are no longer treating it as a hobby experiment, a happy accident that this thing works great in Rust. No, they're seeing it as: we're betting on this thing. We're betting on this technology, we're betting on this language, we're betting on the community of libraries out there, and so on.
Matthias
00:50:03
So your team has to make sure that Rust projects at Microsoft are set up to be a success, at least from an infrastructure perspective, or tooling perspective, sorry. What are the top three, let's say, issues that people come and approach you with, and why is number one compile times?
Victor
00:50:29
Well, yeah. Well, people coming from C++ don't complain about compile times, let me put it this way. Only people coming from C# and .NET complain about compile times. C++ people are used to getting their coffee or reading Reddit while they compile. No, I would say, top things: number one, meeting compliance and security promises. And that ties into our earlier conversation about engineering systems and missing tools, and improving the guidance, improving the templates, improving the security checks. There are a lot of things that are missing. I can't talk about the details of many of those things because they don't make sense outside of Microsoft, and also many of them are not public tools; they're used in internal infrastructure. But we are leveraging external tools when they make sense inside the company, right? For example, for static analysis, we're using Clippy and compiler lints. We're investing in CodeQL for Rust, and so on. So there are many, I would say, in the engineering systems and compliance area. That's sort of the top concern, because many projects are ready to go into production and they're blocked in such compliance stages where they need to check some boxes so that they can go into production, right? And so maybe number two would be interop. Again, I mentioned this earlier: in order for us to succeed, you need to have a good bridge for C++ developers. But even for C# developers: we're not talking just interop with C++, we're talking even interop with C# in some scenarios. So the interop story needs to be ergonomic, needs to be low cost. And interop is many things. Again, if you talk to a particular kind of project, you will find some requirements that they have for interop, but at the scale of Microsoft and how diverse the projects are across the board, depending on who you talk to, they will tell you different things when they mean interop, right?
Some teams care a lot about high-fidelity language interop, right? And some teams care a lot about the glue code and the automation of generating the glue code for the interop. Other teams will say: oh, I have this enormous API surface and I just don't want to duplicate these 10,000 functions that I have in C++. I don't want to have them on the Rust side as well and check in two versions of the functions in the repo, right? So I want some compiler magic or metadata to generate that stuff for me without me having to check it in. Other teams care less about the language interop part and more about the binary interop aspect. Because again, depending on the kind of project you're talking about, some projects are big monoliths that compile everything from source, so you have just one ginormous binary at the end. Other projects are very much a mix of multiple shared libraries, right? And many, many projects in the company are very much in this bucket-of-DLLs kind of thing, right? And there you need to properly handle interop, as you might have a DLL written in C++ and a DLL written in Rust and the process written in Rust, and they need to talk to each other, right? And they're not always leaf dependencies. Sometimes there may be more complex interactions between them. So there are very complicated challenges around dynamic linking, right? Challenges around ABI stability and the fact that it's clunky for Rust to provide any kind of ABI stability and resilience. `repr(C)` just isn't an ABI story, let's face it.
Matthias
00:54:45
What's missing? I'm not an expert in that area, but what else do we need other than `repr(C)`?
Victor
00:54:53
I did a talk at Rust Nation UK last week, an hour-long talk about ABI resilience. Okay, I will listen to that one. If I can use this opportunity to plug that, of course, whenever it becomes available on YouTube. Yeah, we will link to it in the show notes. So I guess I don't want to do it an injustice here in just five minutes. Awesome. But in general, going down to C as your FFI thing is fine in many situations, as in dumbing down your interfaces and your interop layer to just C and C-like types might be fine for many projects. But in some scenarios, when you have fancier vocabulary types and you're actually using C++, and you're not in the bucket of saying, oh, it's a C++ project, but it's actually a C project with some C++ constructs in it, right? So if you're actually using vocabulary types that are richer, from the C++ language or, on the other side, from the Rust language, you don't want to have a sort of common denominator and go through C and dumb down your interfaces and make them way less expressive and ergonomic. So that's why it matters.
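As a rough illustration of what "going down to C" costs in expressiveness, the sketch below projects a small Rust function through a C-compatible boundary. All names (`parse_port`, `PortResult`, `parse_port_c`) are hypothetical and chosen only for this example. The Rust-side signature uses `&str` and `Result<u16, String>`; the C-facing version has to flatten those into a raw pointer and a `repr(C)` struct, and the descriptive error message does not survive the crossing.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Rich Rust-side API: borrowed string in, descriptive error out.
fn parse_port(input: &str) -> Result<u16, String> {
    input
        .trim()
        .parse::<u16>()
        .map_err(|e| format!("invalid port {input:?}: {e}"))
}

/// C-compatible result: the `Result<u16, String>` collapses into a flag and
/// a value, and the error message is lost.
#[repr(C)]
pub struct PortResult {
    pub ok: bool,
    pub port: u16,
}

/// C-callable projection of `parse_port`. A real export would also need a
/// stable symbol name (a `no_mangle` attribute), omitted here for brevity.
///
/// # Safety
/// `input` must be null or point to a NUL-terminated string.
pub unsafe extern "C" fn parse_port_c(input: *const c_char) -> PortResult {
    if input.is_null() {
        return PortResult { ok: false, port: 0 };
    }
    // SAFETY: checked for null above; the caller guarantees NUL termination.
    let text = unsafe { CStr::from_ptr(input) };
    match text.to_str().ok().and_then(|s| parse_port(s).ok()) {
        Some(port) => PortResult { ok: true, port },
        None => PortResult { ok: false, port: 0 },
    }
}
```

A Rust caller of `parse_port` can tell why parsing failed, while a caller on the other side of `parse_port_c` only learns that it failed, which is one small instance of the lowest-common-denominator effect described above.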
Matthias
00:56:19
When I think of interop, I also think of the OS interop side of things. For example, path handling or string handling and so on.
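String handling is a concrete case of this OS-level interop. Windows wide-character APIs expect NUL-terminated UTF-16, while a Rust `str` is UTF-8 without a terminator, so every crossing re-encodes and allocates. The helper names `to_wide` and `from_wide` below are hypothetical; the sketch uses only portable standard-library calls so it runs on any platform, and it is meant as an illustration of the conversion cost rather than a recommended wrapper.

```rust
/// Converts a UTF-8 `&str` into a NUL-terminated UTF-16 buffer, the shape
/// that Windows wide-character APIs expect. Every call allocates a new
/// buffer and re-encodes the whole string.
fn to_wide(s: &str) -> Vec<u16> {
    s.encode_utf16().chain(std::iter::once(0)).collect()
}

/// Converts a UTF-16 buffer back into an owned `String`, stopping at the
/// first NUL if there is one. This allocates and validates again.
fn from_wide(buf: &[u16]) -> Result<String, std::string::FromUtf16Error> {
    let end = buf.iter().position(|&unit| unit == 0).unwrap_or(buf.len());
    String::from_utf16(&buf[..end])
}
```

A round trip through `to_wide` and `from_wide` returns the original text, but it costs two allocations and two encoding passes, which is work that a caller may not expect from what looks like a plain function call across the boundary.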
Victor
00:56:29
That's another business, string handling. You mentioned that. Oh, that's such a big can of worms on Windows. It's a bit more civilized on the Linux side of things, but on Windows, it's a nightmare. Just the plethora of types of strings and how they're represented and their encodings and what APIs expect, it just bites people a lot. And sometimes they even attribute string handling to interop costs, because they don't realize that when crossing these interop boundaries, sometimes you're required to represent strings differently, and then you're paying a cost for this translation. So they're not zero-cost interop abstractions. So it is painful. And again, it depends on your project. If your project has very low needs in terms of boundary interaction, and your interfaces are low-level, C-like things, you might get away with simple automation and simple tools like bindgen and cbindgen, or do your FFI even manually. I've seen people do manual FFI and be just happy with it, right? But if you have fancier things and more complex APIs and your interop surface is larger, then even tools like cxx will not cut it. People struggle. So we need to do better. Not just in Microsoft. As a community, we need to do better. We need to build better tools here, because I believe the success of Rust very much depends on whether we can build these bridges that are easy to cross from C++, or from C# in our case. And I didn't even talk about build systems. Another top problem here is fitting Cargo into existing build systems.
Right? How do you fit Cargo in? Let's say you have a CMake project that has all sorts of components and you want to fit some Rust component in there, and fit the Cargo build in with the CMake build. Or, in our case, we have a lot of MSBuild-based projects, right, that are building C++ or C# solutions, and you want to fit Cargo in with the build system and make sure that everything works properly: dependencies, cleanup, caching infrastructure, everything that entails and that people expect to just work, right? It's a big challenge. And I think for making Rust succeed, we need to check all these boxes. Developers are already enthusiastic about the language itself and the problems it solves for them, right? We just need to solve the tooling. Because it's an enthusiasm ramp, as in they're super excited and they want to do stuff, and they build their prototypes, and they show some wins, either performance or security or whatever. They're super excited, and then they face this roadblock of: oh, now I need to make it work for real, as in I need official pipelines, I need tooling, I need checks, I need whatever. And then frustration starts.
Matthias
00:59:55
Just staying up to date with that incoming stream of changes is even hard. That's also why you need interop, why you need code generation and so on, because a lot of people don't really know how large Microsoft is. For example, I think we talked earlier about how Windows, the operating system, gets around 1,000 pull requests a day, and that's a huge scale that you're operating on, so you cannot manually do all of these things. You kind of need tooling for that. I wonder, if there was someone listening right now and they were thinking about how to help you in, let's say, a product kind of way, would there even be a chance for them to sell Microsoft a tool? For example, I don't know, a private registry or a better code generator or a better search engine for Rust code. Would you even be interested in any of these products, or do you have a product in mind that you would encourage people to build that could be useful outside of Microsoft as well?
Victor
01:00:55
So I believe, and it's not just my opinion, it's how we see things for Rust in Microsoft: sure, as I mentioned earlier, we're building a lot of stuff in terms of tooling that makes sense for our engineering systems and solves our particular needs internally, and it doesn't make much sense for everyone else. But when we see an opportunity where something that we're building for internal needs makes sense for other teams as well, which are outside of our engineering systems, outside of our company, we very much contribute to those projects, like I mentioned earlier with some of the examples I gave when I highlighted projects. I think the key here is not building products, or building better products that we can license or leverage or whatever. I think we need to move the bar in terms of the community, as in each company, when it sees an opportunity to improve something, if it makes sense to improve it for everyone, that's what it should be doing. And that's what we're doing. Whenever we see a gap in an open source tool, we do the work in open source to improve it. When we see a gap in the compiler and in the Rust Project itself, we have a lot of Rust Project contributors working at Microsoft. And we love to do all that work upstream, because who would want to maintain a private fork, right? So everyone benefits if we're doing this. And we're not the only ones, right? All the big companies who are betting on Rust and need a lot of similar things internally for their engineering systems end up doing the work in open source, right? Either spinning off new projects or new open source things, or contributing to existing open source projects. And that's what we're doing as well. So I think the secret sauce here is pooling resources together and working so that everyone can take them further.
Matthias
01:02:55
Is the opposite also true where you adopt a product that might be mature and you integrate it into your build process if it makes sense? Do you have examples of maybe tools that are missing right now that people could build and maybe productize in the Rust space?
Victor
01:03:16
I think we need more investment in static analysis for Rust, and I'm seeing an opportunity there. We have good stuff already, but I see an opportunity to invest more.
Matthias
01:03:33
Shouldn't Clippy solve that?
Victor
01:03:35
Yes, but we need more checks. Yeah.
Matthias
01:03:39
More checks.
Victor
01:03:40
So it's, and again, it's a perfect example where it's an open source project. Everyone benefits. Checks are opt-in, so you're not disturbing anyone, right? So anyone who can contribute more stuff there, I think it's a win for everyone. And dynamic analysis and projects like Miri, for example, it's another area where I think we can invest more and everyone would benefit.
Matthias
01:04:08
I truly believe that Miri also has potential to be maybe a part of a larger product. You say that this is a great feature, but again, you probably want to have a thing that takes care of this, a platform that you can point people to and say, look, we degraded in that certain metric, or can you take a look at this specific thing? And when people use Tokio in production, that might also be an issue where, you know, maybe you have a race condition that is very hard to find. Maybe there's a tool out there that could analyze that and highlight that issue for you. There are tools like this for the Go ecosystem, for example; we just don't really have them in Rust.
Victor
01:04:53
And another thing where we're actively trying to drive community work is the language itself and stabilizing things. We see a lot of internal teams who are enthusiastic about particular functionality, either a tool or some language or library thing, but it takes a long time to stabilize them. In general, we want to discourage people internally from being on nightlies, aside from experimentation, of course. So anything in production needs to be on stable Rust. But we see a lot of teams excited about either something in library space or some particular tool, sometimes even language constructs, but mostly tooling and library stuff. And they say: oh, this is only on nightly and I can't go into production with it. We're actively working here to drive some of the stuff that we care about. But I think in general, what we need as a community is more deliberate paths to moving proposals along and stabilizing stuff that's been tried, stuff that has been vetted in industry and works, not just on our side, but where many other companies are vetting the stuff and saying: okay, this works for us. Maybe making a more deliberate push to stabilize stuff. Because there are some things that have been in nightly for years and are nowhere near stabilized.
Matthias
01:06:29
I completely understand what you mean here. A lot of, let's say, really helpful features are stuck in Nightly or even in the RFC stage or somewhere in between. So, yeah, I fully understand.
Victor
01:06:45
But work doesn't get done by itself. So we need volunteers to drive these things. And we're trying to do our part as best as we can. But the more people are reviewing PRs and RFCs, the better. So it's a community effort. It truly is.
Matthias
01:07:04
True. Anyhow, we're getting close to the end. So it's not all flowers and roses, obviously. But it feels like, in general, what would you say? How would you summarize Rust at Microsoft in the beginning of 2025? What would be the mantra?
Victor
01:07:24
It's the, no bullshit about it, it's literally what we're calling it inside, at least on my team, which is responsible for driving this: 2025 is the year of Rust at Microsoft. So literally this is what we're calling it. It's the year where we need to solve enablement across the board for Rust at Microsoft. So we're all in on making that happen and providing all the tools and filling all the gaps. So although it's a lot of work, and not to lie, it's not easy and there are a lot of moving pieces, people are very excited about this. Both the teams who are waiting on us to deliver the stuff that we promised, but even on our side, we're super excited to unblock all these teams. Because when I'm looking at the projects that I'm seeing internally using Rust, I can't help but get excited. I see all this interesting stuff going on, and it's just a shame to see them blocked on some missing tool or something that just doesn't really work for them, and I want to solve that problem. I very much see myself as unblocking them to do really cool stuff.
Matthias
01:08:41
Can you give us a quick teaser, maybe, of a thing that is on the midterm horizon that people might be looking forward to?
Victor
01:08:52
Debugging Rust on Windows will become a lot better.
Matthias
01:08:57
Wow, that's good to hear.
Victor
01:09:00
And maybe that's because it really sucks right now.
Matthias
01:09:04
That's what I meant with honesty in the beginning. Yeah, I appreciate it. So it feels like Microsoft is all in when it comes to Rust. Definitely. So you're kind of making it succeed. Or I would say, could you say Microsoft devs love Rust? Could you say that?
Victor
01:09:23
All the devs that I talk to love Rust, because they come to us for Rusty things. So they're all super excited and they love it. There are no Rust haters, at least not that I'm seeing. Everyone that comes to us comes with passion, and they want us to help them.
Matthias
01:09:50
Traditionally, our last question is: what's your message to the Rust community? It could be anything, about the community, about technology. I mean, you have a very strong C++ background. You told me that you go to CppCon almost, like, since the very beginning, if I'm...
Victor
01:10:10
Not mistaken, so...
Matthias
01:10:12
Maybe this would be an interesting angle to come from, to say, okay, from your personal perspective as someone who is very involved in C++ and has a lot of experience, what would be your message to the Rust community?
Victor
01:10:27
It's a message both for the Rust community and for the C++ community: in order for us to succeed, it does not mean C++ has to die. So it's not a zero-sum game. It's very much about many, many years from now, where we need to learn to coexist, both in terms of software interop and in terms of community interop. So I think my message would be: let's spend more time on improving these bridges and making sure the languages work well together, so that we can successfully have this, and as communities learn to collaborate better. Awesome.
Matthias
01:11:13
Yeah, I would love to see that. Victor, thanks so much. I really appreciate you taking the time today. Thank you so much.
Victor
01:11:21
Thanks for having me. Bye-bye.
Matthias
01:11:23
Rust in Production is a podcast by corrode. It is hosted by me, Matthias Endler, and produced by Simon Brüggen. For show notes, transcripts, and to learn more about how we can help your company make the most of Rust, visit corrode.dev. Thanks for listening to Rust in Production.