Rust in Production

Matthias Endler

Cloudflare with Edward Wang & Kevin Guthrie

About handling 90 million web requests per second with Rust

2025-10-30 68 min

Description & Show Notes

How do you build a system that handles 90 million requests per second? That's the scale that Cloudflare operates at, processing roughly 25% of all internet traffic through their global network of 330+ edge locations.

In this episode, we talk to Kevin Guthrie and Edward Wang from Cloudflare about Pingora, their open-source Rust-based proxy that replaced nginx across their entire infrastructure. We'll find out why they chose Rust for mission-critical systems handling such massive scale, the technical challenges of replacing battle-tested infrastructure, and the lessons learned from "oxidizing" one of the internet's largest networks.

About Cloudflare

Cloudflare is a global network designed to make everything you connect to the Internet secure, private, fast, and reliable. Their network spans 330+ cities worldwide and handles approximately 25% of all internet traffic. Cloudflare provides a range of services including DDoS protection, CDN, DNS, and serverless computing—all built on infrastructure that processes billions of requests every day.

About Kevin Guthrie

Kevin Guthrie is a Software Architect and Principal Distributed Systems Engineer at Cloudflare working on Pingora and the production services built upon it. He specializes in performance optimization at scale. Kevin has deep expertise in building high-performance systems and has contributed to open-source projects that power critical internet infrastructure.

About Edward Wang

Edward Wang is a Systems Engineer at Cloudflare who has been instrumental in developing Pingora, Cloudflare’s Rust-based HTTP proxy framework. He co-authored the announcement of Pingora’s open source release. Edward’s work focuses on performance optimization, security, and building developer-friendly APIs for network programming.

Transcript

It's Rust in Production, a podcast about companies who use Rust to shape the future of infrastructure. My name is Matthias Endler from corrode and today we talk to Kevin Guthrie and Edward Wang from Cloudflare about handling 90 million web requests per second with Rust. Kevin and Edward, thanks so much for taking the time. Can you introduce yourselves and Cloudflare, the company you work for?
Kevin
00:00:26
Sure. I'll go first. My name is Kevin Guthrie. I'm a Principal Software Engineer, or Systems Engineer, at Cloudflare. I've been here about a year and a half. I've been a Rust developer for about four-ish years, on and off professionally. I've done some side projects, some games, some really stupid projects, some really complex projects. I just love the Rust language, and it's hard to do anything else. What about you, Edward?
Edward
00:00:53
Yeah, hey, my name is Edward. I'm also a systems engineer here at Cloudflare, and I've been working in Rust since I joined the company, almost five years ago at this point. I've been working on, well, the Pingora framework that we're going to talk about today. Previously I was working at a game studio, so working on internet plumbing was a pretty big difference.
Matthias
00:01:25
Yes, we will talk about Pingora today, but I'm not sure if people are aware of the scale that Cloudflare operates at. Can you share some numbers, just to fill everyone in?
Kevin
00:01:38
Yeah, okay. So we have some changing data here, and this is just based on public data; all the things we're going to share today are publicly available. About 20% of the internet goes through Cloudflare. The reference here is a tweet from one of our engineers. This is up from a couple of years ago, when you had Steve Klabnik on and he talked about how Cloudflare handled 10% of the internet. So we're a little bit up from that. Currently, from the internal Pingora side, we handle about 90 million requests per second worldwide, occasionally going up above 100 million requests per second.
Matthias
00:02:21
That is crazy. I guess beyond comprehension for most people. That would mean that probably a huge majority of the traffic goes through Cloudflare, and maybe to some extent through Rust. We will talk about that today. But what is the setup internally to handle that scale, to handle that amount of requests?
Edward
00:02:45
Yeah, if you're not familiar with Cloudflare to begin with, Cloudflare operates a global network of more than 300 points of presence around the globe. These points of presence are data centers in many different countries, and traffic is routed to them via Anycast addresses, something we've talked about on the blog before. So there are all sorts of setups, both on the layer 4 side and the layer 7 side, to load balance and distribute traffic and capacity accordingly. Internally, we, and by we I mean our team, operate some of the services that your request travels through in order to get served a response. Those are the services built on our Pingora framework. But internally, yes, there are a bunch of different mechanisms to balance and distribute traffic: external to the data centers, routing to the data centers, and then within the data centers themselves throughout the life of a request, as we call it, traveling through a few different CDN services or proxies.
Kevin
00:04:24
Some Rust, some not.
Matthias
00:04:26
A lot of people who have been in the Rust community for quite a while know that Cloudflare was one of the earliest adopters of Rust at a larger scale, but what was rarely talked about was the reasoning behind why Cloudflare chose Rust. Can you shed some light on that? Was it for performance reasons? Was it for memory safety reasons? What was the big driver behind Rust adoption at Cloudflare?
Edward
00:04:56
I think neither Kevin nor I was around for the very beginning of this, right? But we know that our teams, the content delivery network, compute, Workers compute teams, etc., have all been eyeing Rust for a long time, as you mentioned. And I think it was really all of the compile-time checks you'd be able to do, all of the classes of bugs that you would essentially eliminate from production. On the content delivery side, it's no secret that a lot of our company was built on these proxy services using NGINX, which is C code, with Lua business logic on top of that. Let's just say that there were certainly a number of core dumps and invalid memory accesses associated with us making changes within our NGINX fork. Over time, we've had to implement more and more complicated features within NGINX internally, and these core dumps are really impactful, obviously. When a core dump happens on a worker process in NGINX, that drops thousands of requests. So we had executive visibility and support on that. Something our team has talked about before is that, at least in the earlier years, prior to Rust adoption, our former CTO, John Graham-Cumming, would actually get an email for each core dump. These crashes were very much top of mind for folks. So if you're able to build something that leverages all of those advantages, hey, I can just completely eliminate these classes of errors, then you're definitely going to pursue something like that. And at Cloudflare we are certainly not shy about considering every new technology and advantage we can come by.
Matthias
00:07:43
The one thing that I noticed when you explained the reasoning behind Rust was that a big chunk of the business logic was written in Lua. And I wondered immediately, couldn't you just use another language with a static type system? Like, I don't know, Go, for example, wouldn't that have been easier to integrate?
Edward
00:08:04
Sure, but we were using NGINX, which was relatively new at the time we were adopting it. I believe we had built a lot of our features, the firewall, DDoS features, etc., on top of filters that would run in NGINX. OpenResty, as it's called, is a framework that lets you implement business logic that you can plug into each of the NGINX filters that run across the life of a request, without necessarily touching all of the, perhaps arcane to a lot of folks, C code. To integrate something like Go, there might be similar efforts out there, but I think none have been as mature as OpenResty and its Lua logic and Lua filters. For a while, some of the OpenResty folks were working with us on our CDN teams. So generally speaking, I think Go was one of the possibilities when we were evaluating other languages to switch to, but Rust was the definite forerunner for all the reasons I mentioned: zero-cost abstractions for great performance and, obviously most importantly, eliminating all sorts of memory safety issues and the bugs that can arise from them.
Kevin
00:09:59
One other thing that impacted our decision to go to Rust, I think. Like I said, I've only been here a year and a half, so I was definitely not around when the decision was being made. But a lot of the forerunners, the celebrities of the Rust community, were working at Cloudflare at various points in time. Steve Klabnik himself worked at Cloudflare. Ashley Williams also worked here. So there was a lot of popularity of the Rust language in Cloudflare to begin with.
Matthias
00:10:26
When you want to integrate a language like Go into an existing infrastructure that runs on NGINX, that would probably be a little harder because Go has a small but not negligible runtime. It has a garbage collector and so on. Whereas with Rust, you could integrate very deeply with basic C FFI. Did that also play a big role? And also, did you end up integrating Rust into your NGINX server for a while before you moved on to build your own solution?
Edward
00:11:01
Yeah, I think this speaks to how you migrate and switch over pieces of your infrastructure to Rust gradually as well, right? As I mentioned, a lot of the core business logic historically has been built in Lua via OpenResty, all of that business logic built up over time. So initially there were questions of how you change those filters, how you extract the business logic so that it uses your Rust-based logic instead. And there were varying approaches to this; there are a lot of teams that work on the CDN in addition to us. Some of the logic you can extract into different services, either in-band with the request processing or out-of-band, where you make calls to other services. The approach that we ended up choosing was, at a high level, to extract a particular responsibility of one of our NGINX proxies into a separate service. That's what we were doing when we were first developing Pingora. At the time, NGINX would reach out and make origin connections and origin requests directly. We decided, hey, what if we situated an in-band proxy just behind that NGINX proxy and routed requests to it instead? Then that service would decide how to make origin requests and to which origins, handling all of the origin communication responsibility. And then you're able to divert traffic to it selectively, depending on how ready that service is to handle certain classes of requests. I think this is generally a strategy that has been working out pretty well for various services at Cloudflare, as long as you have some sort of control plane that sits in front and decides the routing of that request in-band.
Kevin
00:13:46
It's the unsurprising answer: how do you solve a problem at a proxy company? By adding more proxies.
Edward
00:13:55
Yeah. And certainly at first, I think we were a little concerned. Whenever you're adding another proxy hop, injecting another service, you're worried about complexity, you're worried about performance regressions and, obviously, latency, right? Generally speaking, what we noticed was that adding another service hop is unconditionally going to add some amount of latency. Thankfully, the new logic that we were adding on top of that was generally able to offset a lot of those detriments. The example that we tended to point out in our blog a while ago was how our Pingora service, and I don't know how much we want to get into the why exactly in terms of NGINX versus Pingora architecture, was a lot more efficient, and this was definitely top of mind for us, a lot more efficient in terms of how it was making and reusing origin connections. Something like that brings down the latency of making an origin request significantly if you can skip all of the TLS handshake latency and so on. Fortunately for us as well, when it comes to replacing an origin-facing proxy, the cost of origin latency significantly dwarfs any additional proxy hop latency you have.
Matthias
00:15:39
Okay, but was the project already called Pingora back then, or was it some sort of intermediate step?
Edward
00:15:47
I guess I have to shout out some folks who were working on Pingora. I say I was working on it, but in reality, it was Yuchen Wu, as well as Andrew Hauck, who were the primary and first drivers of Pingora. And at first it was called OpenRusty, I think. You still see this term in some of the old tests, because it was very much meant to replace OpenResty and NGINX itself and be, I don't want to say a drop-in replacement, but do all the things and model a lot of its logic off of NGINX and OpenResty. Because honestly, that worked for us; NGINX's logic model and the way it thought about request processing worked for us. So we wanted to do a lot of things pretty similarly to NGINX.
Matthias
00:16:45
I really like the name, OpenRusty, but of course.
Kevin
00:16:49
I don't remember why they didn't go with that name in the end. I think it partially was sort of pejorative, but also could have been confused as a typo. I think Pingora is a much better name. The name, I think, came from the manager of the team who almost slipped and died off of the mountain, the literal mountain that's actually called Pingora.
Edward
00:17:10
I believe the story is that a particular trip to the Pingora Mountain almost cost him his life. And now we've been ascending that summit ever since.
Matthias
00:17:26
Sounds like negative foreshadowing, but in reality it worked out well. But one thing I wondered about: let's say you add another hop, a proxy behind a proxy, and you have a ton of requests coming in, and then you want to switch to a new version, you just want to do a release, basically. Wouldn't that be an easy source of dropped connections and dropped requests?
Kevin
00:17:54
It is. So it's something we have to do very carefully, and since we handle things like WebSockets, we have lots of long-running requests, so upgrades and updates are something we don't do very often. The way Pingora does this is a really slick system: when you bring up a new update, the process that you want to move everything to can start, and it can know about the old instance of Pingora that's currently running. That old instance can gracefully hand over the socket so the new instance of Pingora starts listening for new connections while old requests finish out on the old one. Then the old instance can finish handling all of its requests and gracefully shut down, while the new one is accepting any new connections and handling those.
Matthias
00:18:44
Is that safe or does that happen between processes? I wonder if you can even make it safe.
Kevin
00:18:51
It is. I mean, I'm sure in Rust this is classified as some form of unsafe code, because you're passing around raw file descriptors for sockets. But it is also a really common thing. This is something I first heard of at Facebook, where their HTTP servers, or even their load balancing system, do the same exact thing. I think it's a very common process, but I never worked with the actual code to do it until working on the Pingora project.
Edward
00:19:21
Yeah, there's this process of transferring these listen file descriptors. I could be wrong, but I think it's one of the few places where, because we're dealing with those raw file descriptors, there's a bit of unsafe code. There's actually also a crate that we've put out, not our team ourselves. Maybe I haven't mentioned yet that Cloudflare is not a monolith when it comes to Rust, and we are not the only folks developing in the Rust ecosystem. The folks who are working on another proxy framework called Oxy have open sourced a crate specifically for these kinds of graceful process restarts, called shellflip. It uses a very similar mechanism of transferring file descriptors and doing that handover.
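(For readers curious what that handover looks like in code, here is a minimal sketch of the adopting side only, using std's raw file descriptor APIs. This is not Pingora's or shellflip's actual implementation; the fd is assumed to have already been passed from the old process, for example via SCM_RIGHTS over a Unix domain socket.)

```rust
use std::net::TcpListener;
use std::os::unix::io::{FromRawFd, RawFd};

/// Adopt a listening socket whose file descriptor was handed over by the
/// previous process. This helper is hypothetical; `fd` is assumed to come
/// from the handover mechanism described above.
fn adopt_listener(fd: RawFd) -> std::io::Result<tokio::net::TcpListener> {
    // SAFETY: the caller must guarantee `fd` is a valid, open, listening
    // socket and that nothing else will close it. This is the small island
    // of unsafe code the conversation refers to.
    let std_listener = unsafe { TcpListener::from_raw_fd(fd) };
    std_listener.set_nonblocking(true)?;
    // Hand the inherited socket to the async runtime; new connections keep
    // arriving without the listen queue ever being dropped.
    tokio::net::TcpListener::from_std(std_listener)
}
```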
Matthias
00:20:29
ShellFlip sounds like another really cool name. You have a way with names, I guess, at Cloudflare.
Edward
00:20:36
I think it was taken from the tableflip Go package. And I think some of our engineers decided to...
Kevin
00:20:44
That's a good name.
Edward
00:20:45
I don't know why it's Shell in particular, but maybe it has to do with crabs.
Matthias
00:20:50
Maybe, yeah. But do you share a lot of code with other teams at Cloudflare? Rust crates, that is. You mentioned that you have Pingora and you have Oxy, but there's probably more stuff at Cloudflare which uses Rust. What does code sharing look like? Because from another company, or actually from a few, I heard that sort of by serendipity, they start to use different crates in completely different contexts, and sharing code happens very naturally.
Kevin
00:21:22
Yeah, that's true. We have a sort of a haphazard way of sharing code. We do have our own internal repository for uploading crates, like it's an internal copy of crates.io.
Edward
00:21:33
Internal registry.
Kevin
00:21:34
Sorry, internal registry. That's the right terminology. But a lot of it is done through referencing crates via Git URLs, so it's a little bit on the Go side of things. You have a crate that you want to share with other people in Cloudflare, it's up on our internal Git server, you can write a blog post about it, and anybody can just include it. Putting it on the internal registry should be a more common thing to do, but I have literally never done it. I've shared a couple of crates with different teams for various stupid things, but most of those are incorporated one of two ways: either making the project open source and having people consume it from the open internet, from the actual crates.io, or consuming it internally from an internal Git repo.
Edward
00:22:21
I will say that I think usage of the internal registries is pretty common now. One of the major points of the registry was to avoid the git-commit kind of references that Cargo allows you to do. That's still used in some cases, right? But more and more, I think the internal ecosystem around Rust has become a lot more shared in the past few years. In our code, we use both approaches.
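(To illustrate the two approaches, here is roughly what both styles look like in a Cargo.toml; the crate name, Git host, and registry name are made up.)

```toml
[dependencies]
# Git-reference style: point Cargo straight at an internal repository.
shared-metrics = { git = "https://git.internal.example.com/platform/shared-metrics.git", tag = "v0.4.2" }

# Registry style: publish the crate to an internal registry (configured in
# .cargo/config.toml) and depend on it like any crates.io dependency.
# shared-metrics = { version = "0.4", registry = "internal" }
```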
Matthias
00:23:02
And when you publish code, do you have a formal process for the publication? Do you run any cargo tools to make sure that the code quality is on par with the rest?
Kevin
00:23:14
We really use the standard open source tools. We use Clippy, and we use the auditing tool whose name I couldn't think of for a second, cargo audit, to make sure we are not publishing anything with insecure code. But that's really about it. We're very stringent on our internal code reviews. Like I said, we have open source projects, and all of the open source contributions that come in go through external review as well as internal review before they go into the main branch for Pingora. But as far as automated tools, yeah, it's really just Clippy and testing.
Matthias
00:23:54
Let's come back to Pingora for a while. We established that we have a system called OpenRusty, it sits behind NGINX, and that's the current place we're at. When was that, roughly? Like, what year did you have that system running in production?
Edward
00:24:10
Oh boy. So the blog came out around 2022, I want to say. The first forays into Pingora started around 2020 or a little before that. I would need to look into the exact dates for production, but I want to say it wasn't long after 2020 that the Pingora services first started to get used and deployed.
Matthias
00:24:48
Yeah. And pretty early on, you saw some advantages. I guess, Edward, you also mentioned that: the additional hop didn't really make a big difference because the connection to the origin was the bottleneck, and the new Rust-based system was already pretty fast. But then one could argue NGINX was already plenty fast. What were some of the other NGINX limitations that you ran into which triggered you to find a different approach, other than, say, the lack of a type system that you had with the Lua solution in the past?
Edward
00:25:27
Yeah, I can definitely speak to that. I was working on one particular feature that was a bit hard to implement. I mentioned that we have an internal NGINX fork that we've added more and more complexity to over time for developing our own internal features, whenever we want to futz around with how NGINX does its request processing and response serving. Eventually there was a moment at which the straw broke the camel's back, where we were trying to implement more complicated logic on top of the things we were already doing. For example, we've blogged about concurrent streaming acceleration, which is a fancy name for serving your cached response body as it gets pulled from the origin. Those are pretty intrusive C changes, and as we iterated on top of that, any feature of decent complexity would cause core dumps. As I mentioned before, that was highly visible to leadership. So if we were to make significant progress at all, we would usually be debugging what sort of invariant we were violating inside of NGINX. NGINX is great in a lot of ways, and the developers themselves are experts in what is valid to access when, what you can do asynchronously from the lifetime of the main request, for example, and what is not safe to do. But those things are not strictly enforced within the code, the way you can encapsulate those exact kinds of lifetime and memory restrictions in Rust. So that was the point at which we said: we were already developing Pingora, and for any feature of significant complexity we need to start moving it into the new proxy system and developing features there as much as possible, instead of in NGINX itself.
Kevin
00:28:19
One thing we want to make clear is that we are definitely not here to complain about NGINX or to bash NGINX in any way. NGINX is the foundation of Cloudflare, and the actual NGINX and OpenResty projects are amazing, stable, and used in millions, billions of places, I don't actually know. But the modifications we were doing were not as stable and were leading to the core dumps. As someone who came to Cloudflare not having done internet plumbing, similar to Edward: the C code for NGINX is asynchronous, but it's not written in an async/await kind of way like you're used to with Rust or TypeScript or anything. It's async code, but you're working with it in the time domain; you are managing the state yourself as you go, waiting for different files, waiting for sockets to open and close. So it is a very complex thing and almost impossible to debug.
Edward
00:29:17
Yeah, for sure. Honestly, it's developer ergonomics and developer velocity on top of the classes of bugs that you're able to avoid and not worry about. Avoiding whole classes of bugs speeds up your productivity, because you don't have to worry about introducing those things. This is actually why a lot of our business logic was written in Lua filters as well: you're not going to get segfaults from manipulating Lua objects, right? But that often comes at a performance cost with the Lua VM and Lua runtime, even with LuaJIT. The other primary advantage of switching to Pingora and Rust was honestly just, as Kevin mentioned, that the expressiveness of async Rust is extremely powerful. Especially for onboarding new engineers, learning NGINX and how it manually handles the event loop is hard. When a request comes into NGINX, it is handling those epoll events and propagating them to the request event handlers, and then it needs to decide what comes next: assign the next handlers once your header is done, then assign the event handler for the body, etc. There's a lot of manual mental effort involved with that kind of coding model, where you are handling the HTTP processing logic in tandem with handling the event loop. With async/await constructs, all of that logic becomes linear. You can very much see: after this, you're going to do this next in the life of a request. And that, I believe, has been really helpful for onboarding new engineers, for learning the code base, etc. Those ergonomics were just as important to us, honestly, because we need to ship things fast here.
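(A minimal sketch of what "linear" means here, a toy TCP pass-through written with tokio. This is not Pingora's API and the addresses are placeholders; the point is that the read, connect, forward, and stream steps read top to bottom, while the runtime wires up the underlying events.)

```rust
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::{TcpListener, TcpStream};

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:8080").await?;
    loop {
        let (mut downstream, _) = listener.accept().await?;
        tokio::spawn(async move {
            // 1. Read (part of) the client's request.
            let mut buf = vec![0u8; 16 * 1024];
            let n = downstream.read(&mut buf).await?;
            // 2. Connect upstream and forward it.
            let mut upstream = TcpStream::connect("192.0.2.10:80").await?;
            upstream.write_all(&buf[..n]).await?;
            // 3. Stream the response back until the origin is done.
            tokio::io::copy(&mut upstream, &mut downstream).await?;
            Ok::<_, std::io::Error>(())
        });
    }
}
```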
Matthias
00:32:12
Sounds super crazy, because not many people will be familiar with how NGINX, or to be more specific C, handles asynchronous execution. That was sort of a selling point for NGINX in the beginning: it was event-driven in comparison to Apache, which was more or less process-driven, and NGINX changed that model, but you need to shoehorn your logic into it. Is it similar to the state machine that async code gets converted to on the Rust side? In Rust, we don't really need to write a big state machine ourselves; we just write async Rust, and the compiler generates the state machine for us. Is the code similar on the C side, or is it completely different?
Edward
00:33:10
Got it, yeah. I would say in NGINX a lot of that is hand-unrolled, the way we were talking about: the events, and the next state you go to for the next event you encounter, are manually defined within NGINX. And you were also talking about how NGINX was really revolutionary in terms of its asynchronous, event-driven model. That's not exactly related to how async Rust gets converted into a state machine, but it is a big part of why NGINX is already a great performer and does really well in benchmarks. The underlying mechanism, I mentioned epoll before, is that there's an event loop that an NGINX worker process goes through, and with the help of operating system utilities like epoll it can determine when I/O events are ready on certain file descriptors, and otherwise block if there are no events ready. So you can think of it as processing all of these events as they become ready, as I/O events come in, in this literal loop. That was a really powerful and efficient way to respond to network and file I/O. We got to cheat with Pingora, because we are able to reap the benefits of tokio, which does something really similar. I am not a tokio expert by any means, I would say, but it is doing something really similar with the help of mio (metal IO) underneath, where it's also handling the file descriptors and this event loop within its reactor, using a lot of the same operating system mechanisms, be it epoll or kqueue or what have you, to listen to these I/O events, propagate them, and wake the corresponding tokio tasks that are relying on them. So all of a sudden, from Pingora's perspective, we have a great abstraction over the actual underlying event handling systems that we can build upon. A big part of why we were able to develop Pingora relatively quickly, I would say, is because we were able to build upon the success of tokio and all of its performance considerations and mechanisms. tokio also has a bunch of internal optimizations, we could get into how it tries to load balance tasks and work between threads, but basically we were able to do so much because we already had a great underlying async runtime and event handling mechanism.
Kevin
00:37:39
You said you weren't a tokio expert; I don't know if that came across, because it sounds like you are quite the tokio expert now. I can say as a definite non-tokio expert that you don't really even need to know all this to use tokio for async code. You can come into it as a TypeScript developer and be like, oh yeah, it's async/await, I get it.
Edward
00:38:00
We have an actual tokio maintainer on our team now, Noah Kennedy, who is actually a tokio expert. So I don't generally say I'm an expert on things unless I really feel like I know them very, very, very well.
Matthias
00:38:17
Well, also, the error messages have gotten a lot better in recent years. I can still remember the early days when you got a page full of gibberish about the type system and so on. But recently they improved a lot, thanks to compiler internals which help infer what really went wrong and present the information in a more consumable way, and also new language mechanisms like impl Trait which allow you to focus more on the core issue at hand. I don't know if you saw any of the older error messages before, but...
Kevin
00:38:58
Definitely, yeah. I did some work with tokio a long time ago, well before it went 1.0, so things were a little bit different in terms of error messages. It was basically back in the old days of dealing with Java error messages, or other languages' error messages, where you could almost ignore them: just look at where in the code it was pointing, go there, and try to figure things out yourself. As opposed to right now, where the error messages in async are practically as good as they are in synchronous Rust code, which is to say really good.
Matthias
00:39:32
Did you ever run into problems with tokio? For example, starving the executor when you block too many threads, or async cancellation, where you had a sub-request that you didn't want to kill when you killed the main task and wanted to keep it running, and the Rust futures ecosystem was sort of getting in your way?
Kevin
00:39:58
At least a little bit. There was one scenario that I ran into within the past month where I was downloading a file, downloading a large file, running on a hyperscaler VM, so really excellent internet connection, and just the process of downloading that file on a small VM with a limited number of threads and a small number of tokio workers was enough to starve every other connection because I wasn't being smart. I wasn't using budgeting. I was just letting this one task take over the entire runtime and blocking everything else. Like no other async things were being taken care of. It was just happily downloading this one giant file and using the entire CPU to do that.
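(A rough sketch of that failure mode and one mitigation; the download source is generic and made up. The point is that a task which is always ready never reaches an await that actually suspends, so an explicit yield, or moving the work to its own runtime or spawn_blocking, lets other tasks keep making progress.)

```rust
use tokio::io::{AsyncRead, AsyncReadExt};

async fn drain<R: AsyncRead + Unpin>(mut body: R) -> std::io::Result<u64> {
    let mut buf = vec![0u8; 64 * 1024];
    let mut total = 0;
    loop {
        let n = body.read(&mut buf).await?;
        if n == 0 {
            break;
        }
        total += n as u64;
        // On a fast connection the read above may never actually block, so
        // yield periodically to let other tasks on this worker make progress.
        tokio::task::yield_now().await;
    }
    Ok(total)
}
```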
Matthias
00:40:39
Yeah, what happens a lot during testing is that people have these multi-core machines, 16, 32, 64 cores, what have you, and they run the system on their laptop in their development environment. Everything works because you have plenty of threads; by default, tokio spawns as many worker threads as you have cores. But then you move to production, where maybe you have to make do with two cores, and suddenly you have two blocking tasks, you run out of threads, and the thread pool gets exhausted. That's a thing I saw with some clients, at least. And with you running 20% of the internet, I did wonder if you ever ran into this sort of problem, which can be hard to troubleshoot as well, because you look at a dashboard and everything looks normal, as if it were working, but it doesn't do any work, it doesn't make any progress on the futures.
Kevin
00:41:43
That's true. Recently, at least, the Pingora team has incorporated tokio's internal metrics into our dashboards to give us visibility into these sorts of things. But you're right, before the past couple of months that's something we didn't have visibility into, and if we were running into that problem we wouldn't have known. Since then we've encountered a few problems in production that we've seen in our measurements of runtime queue size and the other metrics, basically how much the scheduling system is getting backed up by all of the threads being busy.
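(A hedged sketch of what wiring those numbers into a dashboard can look like. Method names vary between tokio versions and some per-worker metrics still require the tokio_unstable cfg, so treat the exact calls as assumptions to verify against your release.)

```rust
use std::time::Duration;

async fn report_runtime_metrics() {
    let handle = tokio::runtime::Handle::current();
    loop {
        let m = handle.metrics();
        // A persistently deep global queue means tasks are waiting because
        // every worker thread is busy: the "backed up scheduler" signal.
        println!(
            "workers={} global_queue_depth={}",
            m.num_workers(),
            m.global_queue_depth()
        );
        tokio::time::sleep(Duration::from_secs(10)).await;
    }
}
```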
Edward
00:42:23
I think we were also running into issues where we had certain file I/O operations that were taking a while. For those, tokio actually has a separate blocking thread pool that usually doesn't get saturated because it's really large, but it can, right? And on that note, you also brought up async cancellation. I think it's really easy to mess up cancel safety. It's really easy to mess up a tokio select in a while loop if any of the branches aren't cancel safe. That's purely an async Rust problem, and I think it generally doesn't get introduced very well to people who are entering Rust. There was a great Rust talk by Rain from Oxide that I would love to shout out, because it was really helpful in thinking about why exactly async cancellation is hard to reason about. Being mindful of cancel safety, being mindful of when a future is cancelled and thus cancels everything else under it, is both really, really useful in async Rust and also very, very easy to mess up. They introduced a framing that helped me think about this: the problem is that you can't determine whether or not something is safe to cancel just within the function itself. You have to look at every child future under it and determine what else is going to get cancelled when you decide to cancel this. So it becomes a really hard problem to think about, because suddenly you have to think about everything else, all of the global context.
Kevin
00:44:35
This is structured futures?
Edward
00:44:37
I think it's just that child futures will also get cancelled when you cancel the parent future, I believe. So it's tough, and async Rust definitely has its sharp edges. That's not necessarily a tokio-specific thing. But the very last bit: to shout out another Cloudflare crate, one that we are not yet using in Pingora but other services at Cloudflare are, there's a crate called Foundations that helps you export a lot of these tokio metrics and such out of the box. It has a lot of nice functionality for that, and it should be a pretty minimal, again, foundation layer for folks if you're interested in more easily exposing a lot of those runtime operational concerns and getting observability into them.
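(A small illustration of the select-loop pitfall Edward describes; the socket and shutdown channel are made up. read_exact is documented as not cancel safe: whenever the shutdown branch wins the race, the partially filled buffer from the losing branch is dropped and those bytes are lost from the stream.)

```rust
use tokio::io::AsyncReadExt;
use tokio::net::TcpStream;
use tokio::sync::mpsc;

async fn run(mut sock: TcpStream, mut shutdown: mpsc::Receiver<()>) -> std::io::Result<()> {
    let mut frame = [0u8; 128];
    loop {
        tokio::select! {
            // Pitfall (for illustration): if the shutdown branch completes
            // first, this future is cancelled mid-read and any bytes it had
            // already pulled off the socket are silently discarded.
            res = sock.read_exact(&mut frame) => {
                res?;
                // ... handle one complete frame ...
            }
            _ = shutdown.recv() => break,
        }
    }
    Ok(())
}
```

The usual fixes are to race only cancel-safe operations in the select branches, or to create the long-lived future once outside the loop and poll it by reference so a lost race doesn't throw its progress away.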
Kevin
00:45:42
One thing to build on that: the thing Pingora does to help you avoid the problem of going from one machine with many cores to another machine with few cores, the problem of an unexpected number of tokio threads, is to make you be explicit. Pingora uses tokio under the hood, but it doesn't really expose tokio to the caller. Instead it talks about things in terms of backends: how many threads do you want to use on this backend? And we don't do a default number, we don't default to the number of cores. We make you be explicit and say, okay, you want to run this, tell us how many tasks you want to run. And we do a lot of things like isolating services to a certain subset of threads. It's not one giant tokio runtime, which is what you would get if you were just running tokio main. It's a good way of isolating business-critical things from things that need to run in the background and can take a little extra time.
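(This is not Pingora's configuration API, just the underlying tokio knob the principle maps to: make the worker count an explicit decision instead of inheriting the core count of whatever machine you happen to be on, and give separate concerns separate runtimes.)

```rust
fn build_runtime(name: &str, worker_threads: usize) -> std::io::Result<tokio::runtime::Runtime> {
    tokio::runtime::Builder::new_multi_thread()
        .worker_threads(worker_threads) // explicit, not "number of cores"
        .thread_name(format!("{name}-worker"))
        .enable_all()
        .build()
}

// e.g. small, isolated runtimes per service rather than one global runtime:
// let proxy_rt = build_runtime("proxy", 8)?;
// let background_rt = build_runtime("background", 2)?;
```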
Matthias
00:46:39
Yeah. And since we're on the topic of using a certain number of cores in Rust code, I always wonder why everyone sort of defaults to the number of cores that you have on your system, because if every dependency, every library, does that, you end up with a multiple of the number of cores you have. I'm just saying this so that people are mindful about the resources they request from a system. And speaking of resources, that's the other part about performance, or efficiency, let's say, that I wondered about. When you compare NGINX with Pingora, were you able to squeeze out even more requests per server now that you switched over to Pingora? Because NGINX must have already handled a ton of requests, I'm assuming, because it was written in C.
Edward
00:47:34
I don't know if we were able to squeeze out more necessarily; people squeeze all of the resources out of us with the amount of requests per second that they drop on our network. But the resources that we are really saving, as far as I can recall: NGINX, just bare request processing without extra compute futzing with the requests via Lua filters or whatnot, is generally already pretty efficient, and it tries to limit what it does to being pipes. We have similar goals, right? We want to be as minimal as we can, just ferrying the bytes through and making the necessary modifications on the layer 7 stuff. Now, the things that we were saving: I mentioned earlier that we were more efficient at reusing origin connections, for example. You can squeeze out and save compute, for both yourself and the origin, if you have better origin connection reuse when you're making requests upstream. The reason there was such a fundamental difference, and I think in the blog we mentioned that we had lowered the number of origin connections we were making by about two-thirds, was a fundamental architecture reason: NGINX worker processes are individual processes and weren't able to share a connection pool, unlike the thread-based model that we have in Pingora, where we have an upstream connection pool that all the threads on a particular server can share. Except for those particular fundamental architecture and design points, which we were really conscious of when we were first optimizing Pingora, because making origin connections is a big deal for our team, I think generally we would expect performance to be pretty much on par with what NGINX is doing. And that's the promise of Rust, right? You can do that and be just as expressive and easy to understand.
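(A toy sketch of the architectural difference: one process with many threads can check idle origin connections in and out of a single shared pool, which per-process NGINX workers cannot do. The types and keys are stand-ins, not Pingora's.)

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use tokio::net::TcpStream;

/// Keyed by upstream identity, e.g. "origin.example.com:443" plus TLS settings.
type OriginKey = String;

#[derive(Clone, Default)]
struct SharedPool {
    idle: Arc<Mutex<HashMap<OriginKey, Vec<TcpStream>>>>,
}

impl SharedPool {
    /// Reuse an idle connection if any thread on this server returned one earlier.
    fn checkout(&self, key: &str) -> Option<TcpStream> {
        self.idle.lock().unwrap().get_mut(key)?.pop()
    }

    /// Return a healthy connection so any other thread can skip a fresh
    /// TCP + TLS handshake to the origin.
    fn checkin(&self, key: OriginKey, conn: TcpStream) {
        self.idle.lock().unwrap().entry(key).or_default().push(conn);
    }
}
```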
Kevin
00:50:32
Yeah, a lot of this comes down to physical limits. So NGINX is optimized to the max to the level of what you can do on a network card and what you can do reading files from a disk. We are limited by the same physical constraints. We are reading from the same network, reading from the same disk effectively. The place where Rust excels here is an ability to make it easy to read and easy to write and easy to onboard as opposed to requiring a PhD to unroll C code.
Edward
00:50:59
And when it comes to playing with the shiny new tools that the kernel gives you, like io_uring and stuff, that's perhaps a lot easier; I would certainly want to be doing that within our framework instead of trying to roll it into NGINX, right? At this point, I think we're just a lot more comfortable working within our own ecosystem.
Matthias
00:51:29
Which brings us to today. Just to wrap up the part about Pingora, can you share some numbers about the project, where we are at today, and maybe about Cloudflare in general?
Kevin
00:51:45
Sure. The first thing that I always tell people about Pingora, or Cloudflare in general, is that the teams are really small, surprisingly small, especially if you look at teams at other big companies like Amazon or Facebook or Google. There are only between six and eight people on the Pingora team, depending on the time of day. So a large chunk, 20% of the internet traffic in the world, is handled by this team of six or seven people, most of whom are asleep at the same time. In terms of lines of code, we for some reason are not giving out the official number of lines of code at Cloudflare written in Rust, but for Pingora, even on the open source side, there are about 130,000 lines of code.
Edward
00:52:29
To be clear, we're not the only content delivery network team, not the only proxy service these requests pass through; there are lots of other folks. But yeah, where it makes sense, the engineering teams are definitely small. We have a lot of autonomy as engineers, and certainly a lot of responsibility, and we're each driven to do what we want within the team. So each of us carries a lot of load, while trying not to stress the bus factor too much in that case.
Kevin
00:53:10
True. Yeah, the reason the team size fluctuates is that we, as a company, are open to working across teams, to the extent that for the past three or four months I've not been working on the Pingora team but on the Speed team, for as-yet-undisclosed projects that are also written in Rust, more on the core side than the edge side, but still very interesting and all async Rust, just like Pingora.
Matthias
00:53:38
Amazing.
Edward
00:53:40
Though I don't think we can share how many lines of Rust code there are across all of Cloudflare, I will say that, as we mentioned, Rust has been of interest to Cloudflare for a long time. Pretty much every new service on the edge is written in Rust, I believe, unless there's some significant reason not to. I think all of the services that are running are proof enough that it provides significant value, especially in our performance-critical, segfault-avoidant environment.
Kevin
00:54:26
Yeah, there are at least a few requests that go through Cloudflare that touch only Rust. It's not the majority yet, but it is a significant number.
Matthias
00:54:35
Speaking of which, have we talked about the number of requests per second that Pingora handles right now?
Kevin
00:54:42
Oh, yeah, I mentioned it briefly in a ramble. So for Pingora itself, there are multiple Pingora projects, but the most prevalent one, the one that talks to upstream origins, handles on average about 90 million requests a second. There was a blog post that came out that the Primeagen read out loud, and he got to that number and said, wow, is that a billy? No, that's a trillion, that's a trillion requests per day.
Matthias
00:55:09
Wow, that's crazy. That's a lot of requests.
Kevin
00:55:13
Yeah.
Matthias
00:55:14
Does the Rust ecosystem cover everything that you need right now? Are there any crates that you wanted to mention that are amazing, that are invaluable for you? And are there any things lacking in the ecosystem right now?
Kevin
00:55:29
One crate that I think is sort of underutilized is the valuable crate. As part of the tokio project, it ties in really nicely with tokio tracing. It allows you to give a controllable summary of objects that you want to show up in your traces. It's got a usage pattern that's similar to serde: you annotate the structs that you want to be able to display. It's got some great features that allow you to omit fields if you don't want PII to show up in your traces. It's really well written. We've added our own features on top of it for things like structures from external crates that you don't have access to add annotations to; you can give them a special valuable annotation so that instead of a full object representation of the structure, the Debug representation or the Display representation shows up in your logs. It's just a really simple way of avoiding the boilerplate that comes with wanting to give a summary of an entire object structure, which I've seen in lots and lots of places, especially in other languages. You want an object to show up in multiple ways, but you can't interfere with how it's serialized to JSON, so you have to go through all the boilerplate of writing: okay, this field goes in, oh no, skip this field, it's got IP addresses. Having a crate that's designed to do this, and is also thoughtful about it, is great, because it doesn't add a lot of implementation overhead. You're implementing it as a blanket trait implementation, but it's done in a dynamic way, so it doesn't even add a lot of monomorphization. It just gives you one implementation for anything that implements Display or is Valuable. It's a great crate. I can't get enough of it.
Matthias
00:57:14
One could say it's invaluable.
Kevin
00:57:16
It is invaluable, yes.
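(A minimal sketch of that usage pattern, assuming the valuable crate's derive feature; the struct and field names are made up, and the tracing integration mentioned in the comment is feature-gated, so check the crate docs before relying on it.)

```rust
use valuable::Valuable;

#[derive(Valuable)]
struct UpstreamAttempt {
    origin: String,
    retries: u32,
}

fn main() {
    let attempt = UpstreamAttempt {
        origin: "203.0.113.7:443".to_string(),
        retries: 2,
    };
    // `as_value()` yields a structured, visitor-friendly view of the struct,
    // which subscribers can record as fields rather than one opaque Debug
    // string (the tracing side is feature-gated at the time of writing).
    println!("{:?}", attempt.as_value());
}
```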
Edward
00:57:22
No, yeah, I'm glad you mentioned that, because I feel like I'm cheating: everything that comes to mind is a core dependency, right? tokio, obviously, has so many great utilities for expressing things like message passing in async fashion, etc., and we've already sung its praises. Other things that come to mind are really foundational too, like reference-counted byte buffers with the bytes crate. Very foundational. DashMap: how do you get a concurrent hash map with as little lock contention as possible? Something like DashMap with a bunch of shards is great. The other things are the Cloudflare crates that I've already mentioned before: shellflip, when it comes to graceful process restarts, and Foundations for various telemetry and observability things, among other operational service concerns. I don't know if we wanted to shout out some community work on top of Pingora, too.
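(A tiny example of the DashMap pattern Edward mentions; the counters are made up. Each shard carries its own lock, so threads touching different keys rarely contend, and no outer Mutex is needed.)

```rust
use dashmap::DashMap;
use std::sync::Arc;
use std::thread;

fn main() {
    let hits: Arc<DashMap<String, u64>> = Arc::new(DashMap::new());

    let handles: Vec<_> = (0..4)
        .map(|i| {
            let hits = Arc::clone(&hits);
            thread::spawn(move || {
                // Concurrent writers only lock the shard their key hashes to.
                *hits.entry(format!("origin-{}", i % 2)).or_insert(0) += 1;
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    for entry in hits.iter() {
        println!("{} -> {}", entry.key(), entry.value());
    }
}
```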
Kevin
00:58:53
Oh, yeah.
Edward
00:58:55
Originally, we had been working with some folks within the Prossimo memory safety org on a more batteries-included, actual drop-in NGINX replacement called River. I do believe a lot of that work may be on pause right now, though. But there are a lot of other great community folks who come in, report issues, contribute, etc. There's the Pingap crate as well, one of the most significant and popular, where they've also dealt with our more arcane APIs around caching and stuff. So definitely, that's a tremendous effort. And I think we have been so flattered and excited by the community engagement with Pingora. It has been monumental and humbling.
Matthias
01:00:08
What does River do?
Edward
01:00:10
Both of these projects that I mentioned, River and Pingap, are meant to be more batteries-included, NGINX-like, actual binary deployments. Pingora is meant to be a library, and it can be a bit difficult to work with if all you're trying to do is use it as a drop-in for NGINX, right? You have to actually define the proxy service and things like that in code. It is not a batteries-included, plug-and-play sort of deployment, versus something like one of these other projects where you can in theory just build it and run it as if it were an NGINX binary. So we were really trying to build the foundations of a proxy framework and allow the community to expand on it, since we haven't yet needed that generalized solution ourselves, with the amount of heavy customization and fiddly bits we handle ourselves when spinning up a Pingora service.
Kevin
01:01:40
Yeah. And as we mentioned, we're only six or seven people; we don't have that much time to add additional features. I mean, we love adding features to Pingora. The River project, when it was envisioned, was supposed to have things like WebAssembly integration, so you can do all these things but expose them as WebAssembly. That was one of those things that I would love to implement myself, but there's enough Cloudflare work to go around, and it's also a significant project to take on. The community has been really good at putting things into Pingora directly, though. Some notable ones that come to mind are the Rustls integration. We internally use OpenSSL; the Rustls integration was a huge undertaking that one person did themselves, and we're very grateful for that. Harold, if you're listening, thank you very much. There's another similar integration for another TLS implementation, I think AWS's s2n TLS. That one is still yet to be reviewed, and obviously it's assigned to me. I'm slacking off on my open source job there.
Edward
01:02:43
We really try, we are really trying to stay on top of open source, but sometimes I wish I just had more. I think we all wish we had just more...
Kevin
01:02:58
Open source time. Yeah, the open source stuff is so fun.
Matthias
01:03:01
Yeah, there's never enough time. Speaking of which, we have to conclude as well because we've run out of time, but it was amazing to talk to both of you. If you could phrase a statement to the Rust community, anything that you always wanted to share, what would it be?
Edward
01:03:23
There's a bunch of HTTP ecosystem things, the literal http crate, h2, you know, a lot of those are core dependencies for Pingora as well, and there's a great maintainer for all of it, all open source. The maintainer, Sean, is incredible at what he does.
Matthias
01:03:50
Yeah, I agree. Shout out to Sean.
Kevin
01:03:54
Yeah, the thing I was going to thank the Rust community for is being so coherent, especially around HTTP things: the hyper ecosystem, h2, all of those things are so ubiquitous that it makes integrating with existing projects much easier. Specifically, I was working with the official ClickHouse client that the ClickHouse team puts out, but I needed to add a new feature for rotating mTLS certificates, which their client does not support. Because they expose access to the hyper HTTP client under the hood, it was an easy thing to do. It's just such a good experience. If you need a feature, you already have the tools necessary to add functionality to tools published by other people in a coherent way, something that you don't get in Java, something that, I don't know if you get it in Go, that's not my ecosystem, but as a former and recovering Java programmer, it's very nice.
Matthias
01:04:51
That's a very nice closing statement as well. Edward, anything that you want to add?
Edward
01:04:56
I'm glad you had a specific answer, because really, I am mainly just thankful. It is true that, though I'm sure there are gaps from time to time, generally, if you are looking for a particular pattern or thing, you will either find out that it is hard to do, or that someone else has already tried to do it, at least to some extent, and has a working, if not production-ready then nearly production-ready, implementation of it. The Rust ecosystem in general, the amount of excitement that folks have within the community, is a great sign of promise. And obviously, I think Rust has already eaten up a lot of the internet, if we are a good example. But we're just, once again, so thankful that people are interested in what we do, are patient with us, and are great contributors.
Matthias
01:06:13
Kevin and Edward, thanks so much for taking the time for the interview today.
Kevin
01:06:17
Thanks, Matthias. We appreciate it.
Edward
01:06:19
Thank you, yes, yeah.
Kevin
01:06:21
I mean, thanks for putting on this podcast. Yes, I cannot believe it took five seasons for me to catch on.
Matthias
01:06:26
It's never too late. Rust in Production is a podcast by corrode. It is hosted by me, Matthias Endler, and produced by Simon Brüggen. For show notes, transcripts, and to learn more about how we can help your company make the most of Rust, visit corrode.dev. Thanks for listening to Rust in Production.