WEBVTT

00:00:01.550 --> 00:00:05.830
<v Matthias>It's Rust in Production, a podcast about companies who use Rust to shape the

00:00:05.830 --> 00:00:06.710
<v Matthias>future of infrastructure.

00:00:07.190 --> 00:00:11.390
<v Matthias>My name is Matthias Endler from corrode and today we talk to Kevin Guthrie and

00:00:11.390 --> 00:00:16.830
<v Matthias>Edward Wang from Cloudflare about handling 90 million web requests per second with Rust.

00:00:19.510 --> 00:00:24.030
<v Matthias>Kevin and Edward, thanks so much for taking the time. Can you introduce yourselves

00:00:24.030 --> 00:00:25.930
<v Matthias>and Cloudflare, the company you work for?

00:00:26.550 --> 00:00:29.450
<v Kevin>Sure. I'll go first. My name is Kevin Guthrie.

00:00:29.850 --> 00:00:33.310
<v Kevin>I'm Principal Software Engineer or Systems Engineer at Cloudflare.

00:00:33.510 --> 00:00:37.250
<v Kevin>I've been here about a year and a half. I've been a Rust developer for about

00:00:37.250 --> 00:00:40.890
<v Kevin>four-ish years, on and off professionally.

00:00:41.310 --> 00:00:45.410
<v Kevin>I've done some side projects, some games, some really stupid projects,

00:00:45.550 --> 00:00:46.570
<v Kevin>some really complex projects.

00:00:46.670 --> 00:00:53.010
<v Kevin>I just love the Rust language, and it's hard to do anything else. What about you, Edward?

00:00:53.010 --> 00:00:56.150
<v Edward>Yeah, hey, my name is Edward.

00:00:56.150 --> 00:00:58.890
<v Edward>I'm also a systems engineer here at

00:00:58.890 --> 00:01:02.370
<v Edward>Cloudflare, and I've been working

00:01:02.370 --> 00:01:05.310
<v Edward>on Rust, I mean, since I joined the

00:01:05.310 --> 00:01:11.990
<v Edward>company, essentially, almost five years ago now at this point. And I've been

00:01:11.990 --> 00:01:16.390
<v Edward>working on, well, we're going to talk about the Pingora framework today. Previously,

00:01:16.390 --> 00:01:21.630
<v Edward>I was working at a game studio, so working on internet plumbing, essentially, was

00:01:21.630 --> 00:01:25.030
<v Edward>a pretty big difference now.

00:01:25.030 --> 00:01:31.310
<v Matthias>Yes, we will talk about Pingora today, but I'm not sure if people are aware of

00:01:31.310 --> 00:01:37.110
<v Matthias>the scale that Cloudflare is at. Can you share some numbers, just to fill everyone in?

00:01:38.290 --> 00:01:42.710
<v Kevin>Yeah, okay. So we have some changing data.

00:01:43.130 --> 00:01:46.210
<v Kevin>This is just based on public data. All the things we're going to share today

00:01:46.210 --> 00:01:47.910
<v Kevin>are things that are publicly available.

00:01:48.830 --> 00:01:52.590
<v Kevin>About 20% of the internet goes through Cloudflare.

00:01:54.730 --> 00:01:59.270
<v Kevin>The reference here is from a tweet from one of our engineers.

00:01:59.770 --> 00:02:02.410
<v Kevin>This is up from a couple years ago when you had,

00:02:03.760 --> 00:02:07.240
<v Kevin>Steve Klabnik was on and talked about how Cloudflare had 10% of the world's

00:02:07.240 --> 00:02:10.560
<v Kevin>internet, or 10% of the internet. So we're a little bit up from that.

00:02:11.240 --> 00:02:16.880
<v Kevin>Currently, from the internal Pingora side, we handle about 90 million requests

00:02:16.880 --> 00:02:21.100
<v Kevin>per second worldwide, occasionally going up above 100 million requests per second.

00:02:21.660 --> 00:02:27.340
<v Matthias>That is crazy. I guess beyond comprehension for most people.

00:02:27.680 --> 00:02:33.300
<v Matthias>That would mean that probably a huge majority of the traffic goes through Cloudflare,

00:02:33.300 --> 00:02:36.500
<v Matthias>and maybe to some extent through Rust. We will talk about that today.

00:02:36.780 --> 00:02:44.480
<v Matthias>But what is the setup internally to handle that scale, to handle that amount of requests?

00:02:45.560 --> 00:02:50.860
<v Edward>Yeah, I think we have a,

00:02:51.300 --> 00:02:54.720
<v Edward>I mean, if you're not familiar with Cloudflare to begin with,

00:02:55.000 --> 00:03:03.100
<v Edward>Cloudflare operates a global network of more than 300 points of presence around the globe.

00:03:03.100 --> 00:03:08.440
<v Edward>There are all these points of presence data centers in many different countries,

00:03:08.440 --> 00:03:15.260
<v Edward>and traffic is routed to them via these Anycast addresses, something that we've

00:03:15.260 --> 00:03:17.200
<v Edward>talked about on the blog before.

00:03:17.480 --> 00:03:23.560
<v Edward>So there are all sorts of setups, both on the layer 4 side and layer 7 side,

00:03:23.720 --> 00:03:28.940
<v Edward>to be able to load balance, distribute traffic and capacity accordingly.

00:03:28.940 --> 00:03:33.860
<v Edward>Internally, we operate one of the,

00:03:34.100 --> 00:03:41.280
<v Edward>and by we, I mean our team operates one of the services that your request travels

00:03:41.280 --> 00:03:46.440
<v Edward>through in order to get served in a response.

00:03:46.840 --> 00:03:54.320
<v Edward>And those are the services that are using our Pingora framework, which is our team's.

00:03:54.320 --> 00:04:00.960
<v Edward>But internally, yes, there's a bunch of different mechanisms to balance and

00:04:00.960 --> 00:04:04.860
<v Edward>distribute the traffic outside of data centers.

00:04:05.400 --> 00:04:09.740
<v Edward>External to the data centers, routing to the data centers, and then within our

00:04:09.740 --> 00:04:16.440
<v Edward>data centers themselves throughout the life of a request, as we call it.

00:04:17.080 --> 00:04:23.380
<v Edward>Traveling through a few different CDN services or proxies.

00:04:24.870 --> 00:04:26.010
<v Kevin>Some Rust, some not.

00:04:26.010 --> 00:04:29.730
<v Matthias>A lot of people who have been in the Rust community for

00:04:29.730 --> 00:04:32.870
<v Matthias>quite a while know that Cloudflare was one

00:04:32.870 --> 00:04:35.950
<v Matthias>of the earliest adopters of Rust at a

00:04:35.950 --> 00:04:43.730
<v Matthias>larger scale. But what was rarely talked about was the reasoning behind why Cloudflare

00:04:43.730 --> 00:04:49.690
<v Matthias>chose Rust. Can you shed some light on that? Was it for performance reasons? Was

00:04:49.690 --> 00:04:55.470
<v Matthias>it for memory safety reasons? What was the big driver behind Rust adoption at Cloudflare?

00:04:56.430 --> 00:05:02.650
<v Edward>I think neither Kevin nor I were around for the very beginning of this, right?

00:05:02.830 --> 00:05:10.250
<v Edward>But we know that our teams have the content delivery network,

00:05:10.870 --> 00:05:14.090
<v Edward>you know, compute, workers compute teams, etc.

00:05:14.470 --> 00:05:18.770
<v Edward>They've all been eyeing Rust for a long time, as you mentioned.

00:05:19.650 --> 00:05:26.070
<v Edward>And I think it was really, yeah, all of the, like, compile-time...

00:05:27.010 --> 00:05:32.430
<v Edward>checks that you'd be able to do, all of the classes of bugs that you would essentially

00:05:32.430 --> 00:05:37.510
<v Edward>eliminate from production, right?

00:05:38.310 --> 00:05:42.210
<v Edward>You would be, I mean, on the content delivery side,

00:05:42.910 --> 00:05:49.670
<v Edward>it's no secret that a lot of our company was built on these proxy services using

00:05:49.670 --> 00:05:57.070
<v Edward>NGINX, right, which is based in C code, as well as Lua business logic on top of that.

00:05:58.370 --> 00:06:07.690
<v Edward>The amount of, let's just say that there were certainly a number of core dumps

00:06:07.690 --> 00:06:17.330
<v Edward>and invalid memory accesses associated with us perhaps making changes within our NGINX fork.

00:06:17.330 --> 00:06:23.830
<v Edward>Over time, we've had to implement more and more complicated features within NGINX internally.

00:06:25.150 --> 00:06:29.710
<v Edward>And these core dumps are really impactful, obviously.

00:06:30.110 --> 00:06:35.330
<v Edward>When a core dump happens on a worker process in NGINX, that drops like thousands of requests.

00:06:36.230 --> 00:06:44.350
<v Edward>So we had executive, honestly, we had executive visibility and support on that.

00:06:44.350 --> 00:06:52.110
<v Edward>But something that our team has talked about before was that at least in the

00:06:52.110 --> 00:06:57.530
<v Edward>earlier years, prior to Rust adoption,

00:06:58.690 --> 00:07:05.190
<v Edward>for each core dump, I recall that our former CTO, John Graham-Cumming,

00:07:05.270 --> 00:07:07.570
<v Edward>would actually get an email for each of those.

00:07:08.210 --> 00:07:13.590
<v Edward>These crashes were very much top of mind for folks.

00:07:14.850 --> 00:07:23.170
<v Edward>So when you have that kind of, if you're able to build something to leverage

00:07:23.170 --> 00:07:24.810
<v Edward>all of those advantages of,

00:07:25.130 --> 00:07:29.990
<v Edward>hey, I can just completely erase, eliminate these classes of errors,

00:07:30.170 --> 00:07:33.830
<v Edward>then you're definitely going to be pursuing something like that.

00:07:33.950 --> 00:07:41.710
<v Edward>And at Cloudflare, we are certainly not shy to consider every new technology,

00:07:41.710 --> 00:07:43.650
<v Edward>every advantage we can come by.

00:07:43.650 --> 00:07:49.050
<v Matthias>The one thing that I noticed when you explained the reasoning behind Rust was

00:07:49.050 --> 00:07:53.110
<v Matthias>that a big chunk of the business logic was written in Lua.

00:07:53.310 --> 00:07:58.790
<v Matthias>And I wondered immediately, couldn't you just use another language with a static type system?

00:07:59.030 --> 00:08:03.650
<v Matthias>Like, I don't know, Go, for example, wouldn't that have been easier to integrate?

00:08:04.810 --> 00:08:13.470
<v Edward>Sure, but we were using NGINX, which was relatively new at the time that we were adopting it.

00:08:13.650 --> 00:08:21.110
<v Edward>I believe we had built a lot of our features, the firewall,

00:08:21.470 --> 00:08:28.310
<v Edward>etc., and DDoS features on top of filters that would run in NGINX.

00:08:29.190 --> 00:08:41.490
<v Edward>So OpenResty, I believe, as it's called, OpenResty is a framework, a set of...

00:08:42.710 --> 00:08:48.670
<v Edward>It allows you to implement business logic that you can plug into each of the

00:08:48.670 --> 00:08:56.230
<v Edward>NGINX filters that run across the life of a request without necessarily touching all of the very,

00:08:56.230 --> 00:08:59.650
<v Edward>perhaps arcane to a lot of folks, C code.

00:09:00.050 --> 00:09:07.510
<v Edward>So in order to do something like integrate Go and stuff, there might be certain

00:09:07.510 --> 00:09:09.810
<v Edward>similar efforts to do that.

00:09:09.810 --> 00:09:17.230
<v Edward>But I think none have been as mature as OpenResty and its Lua logic and Lua filters.

00:09:17.670 --> 00:09:24.530
<v Edward>I think for a while, some of the OpenResty folks were working with us on our CDN teams.

00:09:25.270 --> 00:09:28.310
<v Edward>So generally speaking, I

00:09:28.310 --> 00:09:31.930
<v Edward>think Go was one of the possibilities, I

00:09:31.930 --> 00:09:39.930
<v Edward>believe, when we were evaluating other languages, right, to switch to.

00:09:39.930 --> 00:09:44.210
<v Edward>But I think Rust is the definite front-runner for all the reasons that I mentioned:

00:09:44.210 --> 00:09:49.810
<v Edward>you know, zero-cost abstractions for great performance. And obviously,

00:09:50.050 --> 00:09:51.870
<v Edward>most importantly, I think,

00:09:52.250 --> 00:09:58.370
<v Edward>eliminating all sorts of memory safety issues and bugs that can arise from memory safety issues.

00:09:59.070 --> 00:10:02.990
<v Kevin>One other thing that impacted our decision to go to Rust, I think.

00:10:03.170 --> 00:10:05.670
<v Kevin>Like I said, I've only been here a year and a half, so I was definitely not around when the

00:10:05.670 --> 00:10:06.550
<v Kevin>decision was being made.

00:10:06.750 --> 00:10:11.510
<v Kevin>But we had a lot of the forerunners, like the celebrities in the Rust community,

00:10:11.710 --> 00:10:15.110
<v Kevin>were working at Cloudflare at various points in time.

00:10:15.250 --> 00:10:19.170
<v Kevin>Like Steve Klabnik himself worked at Cloudflare. Ashley Williams also worked here.

00:10:20.150 --> 00:10:25.850
<v Kevin>So, I mean, there was a lot of popularity of the Rust language in Cloudflare to begin with.

00:10:26.270 --> 00:10:32.070
<v Matthias>When you want to integrate a language like Go into an existing infrastructure

00:10:32.070 --> 00:10:38.530
<v Matthias>that runs on NGINX, that would probably be a little harder because Go has a

00:10:38.530 --> 00:10:40.910
<v Matthias>small but not negligible runtime.

00:10:40.910 --> 00:10:45.630
<v Matthias>It has a garbage collector and so on. Whereas with Rust, you could integrate

00:10:45.630 --> 00:10:49.610
<v Matthias>very deeply with basic C FFI.

00:10:49.890 --> 00:10:55.010
<v Matthias>Did that also play a big role? And also, did you end up integrating Rust into

00:10:55.010 --> 00:11:00.550
<v Matthias>your NGINX server for a while before you moved on to build your own solution?

00:11:01.450 --> 00:11:08.810
<v Edward>Yeah, I think this speaks to the matters of how do you migrate and switch over

00:11:08.810 --> 00:11:13.790
<v Edward>pieces of your infrastructure gradually to Rust as well, right?

00:11:16.310 --> 00:11:23.610
<v Edward>So all of the, as I mentioned, a lot of the core business logic historically

00:11:23.610 --> 00:11:27.850
<v Edward>has been built on Lua via OpenResty.

00:11:28.170 --> 00:11:31.910
<v Edward>All of that, you know, business logic built up over time.

00:11:32.630 --> 00:11:38.410
<v Edward>So I think there were initially some notions of how do you integrate,

00:11:38.410 --> 00:11:40.590
<v Edward>how do you maybe integrate,

00:11:40.690 --> 00:11:46.110
<v Edward>change those filters to how do you extract the business logic, right?

00:11:46.370 --> 00:11:50.930
<v Edward>To use, to be using your Rust-based logic instead.

00:11:51.450 --> 00:11:56.870
<v Edward>And there, I think, were varying approaches to this. There are a lot of

00:11:56.870 --> 00:11:59.630
<v Edward>teams that work on the CDN in addition to us.

00:12:00.130 --> 00:12:05.570
<v Edward>Some of these, some of the logic you can kind of extract into different services,

00:12:05.570 --> 00:12:10.970
<v Edward>either in-band with the request processing or out-of-band.

00:12:10.970 --> 00:12:14.710
<v Edward>You can make calls to other services.

00:12:14.710 --> 00:12:26.370
<v Edward>For example, the approach that we ended up choosing was to extract a specific, on a high level,

00:12:26.570 --> 00:12:33.550
<v Edward>extract a particular responsibility of one of our NGINX proxies into a separate service.

00:12:33.550 --> 00:12:38.310
<v Edward>That was what we were doing when we were first developing Pingora,

00:12:38.330 --> 00:12:46.670
<v Edward>which was at the time, NGINX would reach out to make origin connections directly

00:12:46.670 --> 00:12:48.370
<v Edward>and make origin requests directly.

00:12:48.890 --> 00:12:55.890
<v Edward>We decided, hey, what if we situated a proxy, in-band with the request, that sits

00:12:55.890 --> 00:13:01.230
<v Edward>just behind that NGINX proxy, and routed requests to that instead?

00:13:01.510 --> 00:13:07.310
<v Edward>And then that service would decide how to make origin requests to which origins.

00:13:07.950 --> 00:13:11.510
<v Edward>Handling all of the origin communication responsibility.

00:13:12.470 --> 00:13:17.810
<v Edward>And then you're able to do something like divert traffic to that selectively

00:13:17.810 --> 00:13:23.910
<v Edward>depending on how ready that service is to handle certain classes of requests.

00:13:23.910 --> 00:13:29.230
<v Edward>I think this is generally the strategy that has been working out pretty well

00:13:29.230 --> 00:13:32.810
<v Edward>for various services at Cloudflare,

00:13:32.830 --> 00:13:39.610
<v Edward>as long as you're able to have something that has some sort of control plane that sits in front and,

00:13:40.140 --> 00:13:44.780
<v Edward>decides, you know, the routing in band of that request.

00:13:46.480 --> 00:13:52.040
<v Kevin>It's the unsurprising answer of how do you solve a problem at a proxy company

00:13:52.040 --> 00:13:53.340
<v Kevin>is by adding more proxies.

00:13:55.260 --> 00:14:00.040
<v Edward>Yeah. And certainly at first, I think we were a little concerned about,

00:14:00.040 --> 00:14:05.840
<v Edward>I think whenever you're adding another hop, another proxy hop, and injecting

00:14:05.840 --> 00:14:06.820
<v Edward>another service into it.

00:14:06.920 --> 00:14:10.920
<v Edward>You're worried about complexity. You're worried about performance regressions

00:14:10.920 --> 00:14:13.660
<v Edward>and things like that, latency, obviously, right?

00:14:14.340 --> 00:14:24.340
<v Edward>Generally speaking, what we had noticed was that adding another service hop is

00:14:24.340 --> 00:14:28.040
<v Edward>certainly unconditionally going to add some amount of latency.

00:14:29.300 --> 00:14:32.560
<v Edward>Thankfully, the feature, the

00:14:32.560 --> 00:14:35.580
<v Edward>new logic that we were adding on top

00:14:35.580 --> 00:14:41.620
<v Edward>of that was generally able to offset a lot of those detriments that you

00:14:41.620 --> 00:14:46.640
<v Edward>would face. The example that we tended to point out in our blog a

00:14:46.640 --> 00:14:53.380
<v Edward>while ago was how our Pingora service, and I don't know how much we want to get into

00:14:54.320 --> 00:14:57.060
<v Edward>the why exactly, in terms

00:14:57.060 --> 00:15:00.100
<v Edward>of, like, NGINX versus Pingora architecture and stuff,

00:15:00.100 --> 00:15:03.360
<v Edward>but Pingora was a lot more

00:15:03.360 --> 00:15:06.300
<v Edward>efficient, and this was definitely top of mind for us,

00:15:06.300 --> 00:15:09.280
<v Edward>a lot more efficient in terms of how it was making and

00:15:09.280 --> 00:15:12.360
<v Edward>reusing origin connections. And so

00:15:12.360 --> 00:15:16.520
<v Edward>something like that generally brings down, you

00:15:16.520 --> 00:15:19.560
<v Edward>know, the latency of making an origin

00:15:19.560 --> 00:15:26.300
<v Edward>request significantly, if you can skip all of the TLS handshake latency, etc. Fortunately

00:15:26.300 --> 00:15:31.780
<v Edward>for us as well, when it comes to replacing an origin-facing proxy, the

00:15:31.780 --> 00:15:38.440
<v Edward>cost of origin latency significantly dwarfs any additional proxy hop latency you have.
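
NOTE
Editor's sketch, not Pingora's actual code: a minimal illustration of why reusing
origin connections saves latency, as described in the cues above. OriginConn,
ConnPool and handshakes_for are hypothetical names, and the "handshake" here is
only a counter standing in for a real TCP connect plus TLS handshake.
```rust
use std::collections::HashMap;
/// Hypothetical stand-in for an established (TCP + TLS) connection to an origin.
struct OriginConn {
    origin: String,
}
/// Idle connections kept per origin so later requests can skip the handshake.
#[derive(Default)]
struct ConnPool {
    idle: HashMap<String, Vec<OriginConn>>,
    handshakes: u64,
}
impl ConnPool {
    /// Reuse an idle connection if one exists; otherwise pay for a new handshake.
    fn checkout(&mut self, origin: &str) -> OriginConn {
        if let Some(conn) = self.idle.get_mut(origin).and_then(|conns| conns.pop()) {
            return conn;
        }
        // A real proxy would perform the TCP connect and TLS handshake here.
        self.handshakes += 1;
        OriginConn { origin: origin.to_string() }
    }
    /// Return a connection to the pool once the response has been fully read.
    fn checkin(&mut self, conn: OriginConn) {
        self.idle.entry(conn.origin.clone()).or_default().push(conn);
    }
}
/// Send `requests` sequential requests to one origin and count the handshakes paid.
fn handshakes_for(requests: usize) -> u64 {
    let mut pool = ConnPool::default();
    for _ in 0..requests {
        let conn = pool.checkout("origin.example");
        pool.checkin(conn);
    }
    pool.handshakes
}
```
In this sketch, five sequential requests to the same origin pay for a single handshake;
a production pool would also need health checks, idle timeouts and size limits.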

00:15:39.020 --> 00:15:42.140
<v Matthias>Okay, but was the project already

00:15:42.140 --> 00:15:47.060
<v Matthias>called Pingora back then, or was it some sort of intermediate step?

00:15:47.480 --> 00:15:53.840
<v Edward>I guess I have to shout out some folks who were working on Pingora.

00:15:54.000 --> 00:15:59.580
<v Edward>I say I was working on it, but in reality, it was Yuchen, Yuchen Wu,

00:15:59.820 --> 00:16:06.980
<v Edward>as well as Andrew Houck, who were kind of the primary and first drivers of Pingora.

00:16:06.980 --> 00:16:09.920
<v Edward>And at first it was called OpenRusty, I think.

00:16:10.180 --> 00:16:14.920
<v Edward>You still see this term in some of the old tests because it was very much meant

00:16:14.920 --> 00:16:23.600
<v Edward>to replace OpenResty and NGINX itself and be a, I don't want to say a drop-in replacement,

00:16:23.840 --> 00:16:29.260
<v Edward>but do all the things and model a lot of its logic off of NGINX and OpenResty.

00:16:29.440 --> 00:16:37.340
<v Edward>Because honestly, that worked for us, and NGINX's logic models and the way that

00:16:37.340 --> 00:16:39.660
<v Edward>it thought about request processing worked for us.

00:16:39.780 --> 00:16:44.740
<v Edward>So we wanted to do a lot of things pretty similarly to NGINX.

00:16:45.310 --> 00:16:48.930
<v Matthias>I really like the name, OpenRusty, but of course.

00:16:49.850 --> 00:16:53.790
<v Kevin>I don't remember why they didn't go with that name in the end.

00:16:53.990 --> 00:17:01.370
<v Kevin>I think it partially was sort of pejorative, but also could have been confused as a typo.

00:17:01.730 --> 00:17:05.050
<v Kevin>I think Pingora is a much better name. The name, I think, came from the manager

00:17:05.050 --> 00:17:07.510
<v Kevin>of the team who almost slipped and died off of the mountain,

00:17:07.810 --> 00:17:09.550
<v Kevin>the literal mountain that's actually called Pingora.

00:17:10.230 --> 00:17:18.650
<v Edward>I believe the story is that a particular trip to the Pingora Mountain almost cost him his life.

00:17:18.670 --> 00:17:23.630
<v Edward>And now we've been ascending that summit ever since.

00:17:26.250 --> 00:17:30.790
<v Matthias>Sounds like negative foreshadowing, but in reality it worked out well.

00:17:33.170 --> 00:17:37.490
<v Matthias>But the one thing that I wondered was that, let's say you add another hop.

00:17:37.610 --> 00:17:39.210
<v Matthias>You add a proxy behind a proxy.

00:17:39.890 --> 00:17:45.130
<v Matthias>And then you have a ton of requests coming in, and then you want to switch to

00:17:45.130 --> 00:17:49.610
<v Matthias>a new version. You just kind of want to do a release, basically. Wouldn't that

00:17:49.610 --> 00:17:54.750
<v Matthias>be an easy source for dropping connections and dropping requests?

00:17:54.750 --> 00:17:58.690
<v Kevin>It is. So it's something we have to do very carefully, and since we handle

00:17:58.690 --> 00:18:03.770
<v Kevin>things like WebSockets, we have lots of long-running requests. So upgrades, updates

00:18:03.770 --> 00:18:08.890
<v Kevin>are something we don't do very often now or then.

00:18:09.170 --> 00:18:13.270
<v Kevin>The way Pingora does this is a really slick system where when you bring up a

00:18:13.270 --> 00:18:18.750
<v Kevin>new update, the process that you want to move everything to, it can start.

00:18:19.210 --> 00:18:25.210
<v Kevin>It can know about the old instance of Pingora that's currently running.

00:18:25.490 --> 00:18:30.230
<v Kevin>And that old instance can gracefully hand over the socket to start listening

00:18:30.230 --> 00:18:34.850
<v Kevin>for new connections on the new instance of Pingora while old requests finish

00:18:34.850 --> 00:18:39.250
<v Kevin>out on the old one and then the old instance can handle all of its requests

00:18:39.250 --> 00:18:40.790
<v Kevin>and then gracefully shut down

00:18:40.790 --> 00:18:43.830
<v Kevin>whereas the new one is bringing up any new connections and handling those.
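
NOTE
Editor's sketch, not Pingora's actual code: a simplified, single-process analogue
of the listener handover described above, using only the Rust standard library.
Here TcpListener::try_clone stands in for the cross-process step; between real
processes the listening file descriptor would instead be passed over a Unix domain
socket. The function name handover_demo is hypothetical.
```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
/// The "new instance" takes over a listening socket from the "old instance".
fn handover_demo() -> std::io::Result<String> {
    // Port 0 asks the OS for any free port on the loopback interface.
    let old_instance = TcpListener::bind("127.0.0.1:0")?;
    let addr = old_instance.local_addr()?;
    // Duplicate the listening socket so both handles refer to the same socket.
    let new_instance = old_instance.try_clone()?;
    // The old instance stops accepting; the socket stays open via the duplicate.
    drop(old_instance);
    // A client connecting after the handover is served by the new instance.
    let mut client = TcpStream::connect(addr)?;
    client.write_all(b"ping")?;
    let (mut conn, _peer) = new_instance.accept()?;
    let mut buf = [0u8; 4];
    conn.read_exact(&mut buf)?;
    Ok(String::from_utf8_lossy(&buf).into_owned())
}
```
The port never stops listening during the switch, which is the property the handover
relies on; draining in-flight requests on the old instance is left out of this sketch.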

00:18:44.650 --> 00:18:50.090
<v Matthias>Is that safe or does that happen between processes? I wonder if you can even make it safe.

00:18:51.290 --> 00:18:56.830
<v Kevin>It is. I mean, I don't know. I'm sure in Rust this is classified as some form

00:18:56.830 --> 00:19:01.230
<v Kevin>of unsafe code because you're passing around raw file descriptors for sockets.

00:19:01.830 --> 00:19:06.150
<v Kevin>But it is also a really common thing. This is something I first heard of at Facebook,

00:19:06.910 --> 00:19:13.750
<v Kevin>where their networking, their HTTP servers do the same exact thing or even their

00:19:13.750 --> 00:19:18.430
<v Kevin>load balancing system. I think it's a very common process, but I never worked

00:19:18.430 --> 00:19:21.190
<v Kevin>with the actual code to do it until working on the Pingora project.

00:19:21.190 --> 00:19:27.910
<v Edward>Yeah, there's actually, so yeah, there's this process of transferring these

00:19:27.910 --> 00:19:30.710
<v Edward>listen file descriptors. I think it's one of the few places,

00:19:31.590 --> 00:19:35.450
<v Edward>I could be wrong, but I think it's one of the few places where, yes, because

00:19:35.450 --> 00:19:38.810
<v Edward>we're dealing with those raw file descriptors, there's a bit of unsafe code there.

00:19:39.170 --> 00:19:49.010
<v Edward>There's actually also a crate that I believe we've put out, not us ourselves, but I mentioned,

00:19:49.390 --> 00:19:54.550
<v Edward>or maybe I haven't mentioned yet, that Cloudflare is not a monolith when it

00:19:54.550 --> 00:19:57.750
<v Edward>comes to Rust, and we are not the only folks

00:19:59.030 --> 00:20:01.150
<v Edward>developing in the Rust ecosystem.

00:20:01.350 --> 00:20:04.850
<v Edward>So the folks who are working on another proxy

00:20:04.850 --> 00:20:08.570
<v Edward>framework called Oxy have actually open sourced a

00:20:08.570 --> 00:20:13.410
<v Edward>crate that is specifically for these kinds of graceful process restarts, and

00:20:13.410 --> 00:20:19.830
<v Edward>it's called shellflip. It uses a very, you know, similar mechanism of, you

00:20:19.830 --> 00:20:28.350
<v Edward>know, transferring file descriptors and doing that handover.

00:20:29.280 --> 00:20:35.500
<v Matthias>ShellFlip sounds like another really cool name. You have a way with names, I guess, at Cloudflare.

00:20:36.000 --> 00:20:39.820
<v Edward>I think it was taken from the tableflip Go package.

00:20:40.080 --> 00:20:44.840
<v Edward>And I think some of our engineers decided to...

00:20:44.840 --> 00:20:45.560
<v Kevin>That's a good name.

00:20:45.640 --> 00:20:49.040
<v Edward>I don't know why it's Shell in particular, but maybe it has to do with crabs.

00:20:50.020 --> 00:20:56.140
<v Matthias>Maybe, yeah. But do you share a lot of code with other teams at Cloudflare?

00:20:57.100 --> 00:21:01.920
<v Matthias>Rust crates, that is. You mentioned that you have Pingora and you have Oxy,

00:21:02.080 --> 00:21:05.240
<v Matthias>but there's probably more stuff at Cloudflare which uses Rust.

00:21:05.500 --> 00:21:06.940
<v Matthias>What does code sharing look like?

00:21:07.420 --> 00:21:13.400
<v Matthias>Because from another company, or actually from a few, I heard that sort of by

00:21:13.400 --> 00:21:18.280
<v Matthias>serendipity, they start to use different crates in completely different contexts.

00:21:18.280 --> 00:21:21.160
<v Matthias>And it kind of happens very naturally to share code.

00:21:22.130 --> 00:21:25.270
<v Kevin>Yeah, that's true. We have a sort of a haphazard way of sharing code.

00:21:25.430 --> 00:21:29.970
<v Kevin>We do have our own internal repository for uploading crates,

00:21:30.010 --> 00:21:32.010
<v Kevin>like it's an internal copy of crates.io.

00:21:33.310 --> 00:21:34.230
<v Edward>Internal registry.

00:21:34.590 --> 00:21:38.590
<v Kevin>Sorry, internal registry. That's the right terminology. But for a lot of it,

00:21:38.750 --> 00:21:42.890
<v Kevin>it's done through referencing crates through Git URLs.

00:21:43.170 --> 00:21:47.370
<v Kevin>So it's a little bit on the Go side of things. So you have a crate that you

00:21:47.370 --> 00:21:49.350
<v Kevin>want to share with other people in Cloudflare.

00:21:50.090 --> 00:21:54.310
<v Kevin>It's up on our internal Git server. You can write a blog post about it.

00:21:54.390 --> 00:21:55.790
<v Kevin>Anybody can just include it.

00:21:56.470 --> 00:22:00.490
<v Kevin>Putting it on the internal registry should be a more common thing to do,

00:22:00.510 --> 00:22:02.670
<v Kevin>but I have literally never done it.

00:22:02.670 --> 00:22:06.790
<v Kevin>I've shared a couple of crates with different teams for various stupid things,

00:22:06.950 --> 00:22:11.950
<v Kevin>but most of those are just incorporated either through one way,

00:22:12.050 --> 00:22:15.170
<v Kevin>which is making the project open source and then having people consume it just

00:22:15.170 --> 00:22:19.750
<v Kevin>from the open internet or from the actual crates.io or consuming it internally

00:22:19.750 --> 00:22:21.030
<v Kevin>from an internal Git repo.

00:22:21.310 --> 00:22:27.490
<v Edward>I will say that I think usage of the internal registries is pretty common now.

00:22:27.690 --> 00:22:36.410
<v Edward>The whole point of the registry, one of the major points, was to avoid the

00:22:36.410 --> 00:22:42.850
<v Edward>Git commit kind of references that Cargo allows you to do.

00:22:43.950 --> 00:22:47.070
<v Edward>It's still used in some cases, right?

00:22:47.590 --> 00:22:55.050
<v Edward>But yeah, more and more, I think the ecosystem around Rust has become a lot

00:22:55.050 --> 00:23:01.030
<v Edward>more shared maybe in the past few years. In our code, we use both approaches.

00:23:02.370 --> 00:23:06.790
<v Matthias>And when you publish code, do you have a formal process for the publication?

00:23:07.030 --> 00:23:13.070
<v Matthias>Do you run any cargo tools to make sure that the code quality is on par with the rest?

00:23:14.440 --> 00:23:17.680
<v Kevin>Uh, we use the, I mean, really we

00:23:17.680 --> 00:23:20.440
<v Kevin>use the standard open source tools. We use

00:23:20.440 --> 00:23:23.480
<v Kevin>Clippy. We use the auditing tool, the

00:23:23.480 --> 00:23:26.640
<v Kevin>name I can't think of right now, to make sure we are not publishing anything with

00:23:26.640 --> 00:23:29.500
<v Kevin>insecure code. cargo audit, yes, cargo audit.

00:23:29.500 --> 00:23:32.260
<v Kevin>Yeah, exactly. But that's really

00:23:32.260 --> 00:23:35.260
<v Kevin>about it. We're very stringent on our internal code reviews.

00:23:35.260 --> 00:23:40.240
<v Kevin>So like I said, we have open source projects. All of the open source contributions

00:23:40.240 --> 00:23:44.180
<v Kevin>that come in go through external review as well as internal review

00:23:44.180 --> 00:23:50.060
<v Kevin>before they go into the main branch for Pingora. But as far as automated

00:23:50.060 --> 00:23:54.780
<v Kevin>tools, yeah, it's really just Clippy and testing.

00:23:54.780 --> 00:24:01.040
<v Matthias>Now, let's come back to Pingora for a while. We established that we have a system

00:24:01.040 --> 00:24:06.540
<v Matthias>called OpenRusty. It's behind NGINX. That's the current place we're at. When was

00:24:06.540 --> 00:24:10.300
<v Matthias>that, roundabout, like the year that you had that system running in production?

00:24:10.300 --> 00:24:20.060
<v Edward>Oh boy. So the blog came about 2022, I want to say. The first forays into Pingora started

00:24:21.420 --> 00:24:30.840
<v Edward>around 2020 or a little before that. I would need to look into the exact dates

00:24:30.840 --> 00:24:34.580
<v Edward>for production. But I want to say that the service,

00:24:34.580 --> 00:24:37.960
<v Edward>I think it was around,

00:24:37.980 --> 00:24:43.360
<v Edward>it wasn't long after 2020 that these services,

00:24:43.560 --> 00:24:46.940
<v Edward>that the Pingora services first started to get used and deployed.

00:24:48.000 --> 00:24:52.460
<v Matthias>Yeah. And pretty early on, you saw some advantages. I guess,

00:24:52.560 --> 00:24:53.900
<v Matthias>Edward, you also mentioned that.

00:24:55.340 --> 00:25:00.040
<v Matthias>The additional hop didn't really make a big difference because the connection

00:25:00.040 --> 00:25:07.360
<v Matthias>to the origin was the bottleneck and the new Rust-based system was already pretty fast.

00:25:07.540 --> 00:25:11.260
<v Matthias>But then one could argue NGINX was already plenty fast.

00:25:11.900 --> 00:25:17.380
<v Matthias>What were some of the other NGINX limitations that you ran into which kind of

00:25:17.380 --> 00:25:22.000
<v Matthias>triggered you to find a different approach other than, say,

00:25:22.280 --> 00:25:27.060
<v Matthias>the lack of the type system, for example, that you had with the Lua solution in the past?

00:25:27.060 --> 00:25:35.740
<v Edward>Yeah i can definitely speak to that i think i was working on one particular

00:25:35.740 --> 00:25:41.540
<v Edward>feature that was a bit hard to.

00:25:43.360 --> 00:25:46.720
<v Edward>It was i mentioned that every time

00:25:46.720 --> 00:25:49.800
<v Edward>we have we i mentioned

00:25:49.800 --> 00:25:53.380
<v Edward>that we have an internal NGINX fork that we've added

00:25:53.380 --> 00:25:56.880
<v Edward>more and more complexity into for developing our

00:25:56.880 --> 00:26:00.040
<v Edward>own internal features whenever we want to futz around with

00:26:00.040 --> 00:26:03.500
<v Edward>how NGINX does its request

00:26:03.500 --> 00:26:08.320
<v Edward>processing and response serving right over time

00:26:08.320 --> 00:26:11.220
<v Edward>and eventually there was

00:26:11.220 --> 00:26:14.620
<v Edward>a moment, the straw

00:26:14.620 --> 00:26:19.060
<v Edward>that broke the camel's back, where

00:26:19.060 --> 00:26:22.040
<v Edward>we were

00:26:22.040 --> 00:26:24.940
<v Edward>trying to implement more complicated logic on top

00:26:24.940 --> 00:26:28.000
<v Edward>of the things we are already doing. For example, I

00:26:28.000 --> 00:26:31.180
<v Edward>think we've blogged about Concurrent

00:26:31.180 --> 00:26:38.500
<v Edward>Streaming Acceleration, which is a fancy name for serving your

00:26:38.500 --> 00:26:44.140
<v Edward>cached response body as it gets pulled from the origin. Those changes are

00:26:44.140 --> 00:26:48.040
<v Edward>pretty intrusive C changes. As we iterated on top of that,

00:26:49.230 --> 00:26:53.010
<v Edward>any feature of decent complexity would cause core dumps.

00:26:53.750 --> 00:26:59.470
<v Edward>And as I mentioned before, that was highly visible to leadership.

00:27:00.490 --> 00:27:07.030
<v Edward>So if we were to make significant progress at all, we would usually be debugging

00:27:07.030 --> 00:27:12.530
<v Edward>what sort of invariant we were violating inside of NGINX.

00:27:16.090 --> 00:27:19.050
<v Edward>NGINX is great in a lot of ways.

00:27:19.230 --> 00:27:26.790
<v Edward>And it, like the developers themselves are experts in what is valid, you know,

00:27:26.870 --> 00:27:34.770
<v Edward>to access when and what can you do asynchronously from the, from the lifetime of the main request,

00:27:34.890 --> 00:27:37.330
<v Edward>for example, and what is not safe to do so.

00:27:37.330 --> 00:27:42.090
<v Edward>But those things are not necessarily, I mean, they're not as enforced within

00:27:42.090 --> 00:27:45.150
<v Edward>the code strictly, right?

00:27:45.350 --> 00:27:53.030
<v Edward>The way that you can encapsulate those exact kinds of lifetime and memory restrictions in Rust.

00:27:53.030 --> 00:27:56.130
<v Edward>So that was the

00:27:56.130 --> 00:27:59.850
<v Edward>point at which we said

00:27:59.850 --> 00:28:05.050
<v Edward>we were already developing Pingora and then we said actually for any feature

00:28:05.050 --> 00:28:08.990
<v Edward>of significant complexity we need to start moving it into the new system we

00:28:08.990 --> 00:28:14.890
<v Edward>need to start migrating to the new to the new proxy system and developing features

00:28:14.890 --> 00:28:19.190
<v Edward>there as much as possible instead of NGINX itself.

00:28:19.860 --> 00:28:24.100
<v Kevin>One thing we want to make clear is we definitely are not here to complain about

00:28:24.100 --> 00:28:26.260
<v Kevin>NGINX or to bash NGINX in any way.

00:28:26.500 --> 00:28:31.420
<v Kevin>NGINX is like the foundation of Cloudflare. And the actual NGINX and OpenResty

00:28:31.420 --> 00:28:37.000
<v Kevin>projects are amazing and stable and used in millions, billions of places. I don't actually know.

00:28:37.360 --> 00:28:43.360
<v Kevin>But the modifications we were doing were not as stable and leading to the core dumps.

00:28:43.460 --> 00:28:45.840
<v Kevin>As someone who came to this, came

00:28:45.840 --> 00:28:49.280
<v Kevin>to Cloudflare not having done internet plumbing, just similar to Edward.

00:28:49.860 --> 00:28:57.360
<v Kevin>Seeing the C code for NGINX, which is asynchronous, it's not written in an async

00:28:57.360 --> 00:29:00.860
<v Kevin>await kind of way like you're used to with Rust or TypeScript or anything.

00:29:01.020 --> 00:29:04.520
<v Kevin>It is literally, it's async code, but you're working with it in the time domain,

00:29:04.680 --> 00:29:08.820
<v Kevin>like you are managing the state literally as you go through these,

00:29:08.840 --> 00:29:13.160
<v Kevin>waiting for different files, waiting for sockets to open, close.

00:29:13.380 --> 00:29:16.600
<v Kevin>So it is a very complex thing and almost impossible to debug.

00:29:17.580 --> 00:29:20.880
<v Edward>Yeah, for sure.

00:29:20.880 --> 00:29:25.660
<v Edward>Honestly, the developer ergonomics

00:29:25.660 --> 00:29:28.560
<v Edward>and developer velocity on top of

00:29:28.560 --> 00:29:31.720
<v Edward>the, you know, the

00:29:31.720 --> 00:29:35.420
<v Edward>classes of bugs that you're able to avoid and not

00:29:35.420 --> 00:29:39.520
<v Edward>worry about that avoiding whole

00:29:39.520 --> 00:29:42.820
<v Edward>classes of bugs speeds up your productivity where you

00:29:42.820 --> 00:29:48.320
<v Edward>don't have to worry about introducing those things. This is actually why a lot

00:29:48.320 --> 00:29:52.600
<v Edward>of our business logic was written in Lua filters as well, because

00:29:52.600 --> 00:30:00.980
<v Edward>you're not going to segfault from manipulating Lua objects, right? But that

00:30:01.620 --> 00:30:10.040
<v Edward>often comes at a performance cost with the Lua VM and Lua runtime, even if you use LuaJIT.

00:30:11.240 --> 00:30:19.580
<v Edward>The other, like, main primary advantage of switching to Pingora and Rust was

00:30:19.580 --> 00:30:22.000
<v Edward>honestly just, like, as Kevin had mentioned,

00:30:22.740 --> 00:30:28.940
<v Edward>the expressiveness of async Rust is extremely powerful.

00:30:28.940 --> 00:30:32.160
<v Edward>And when you're

00:30:32.160 --> 00:30:36.700
<v Edward>looking at especially for onboarding new engineers learning

00:30:36.700 --> 00:30:41.800
<v Edward>NGINX and how it manually handles

00:30:41.800 --> 00:30:49.600
<v Edward>the event loop events, right? Because when a request

00:30:49.600 --> 00:30:57.120
<v Edward>comes into NGINX, it is handling those epoll events and propagating

00:30:57.120 --> 00:30:59.400
<v Edward>them to the request event handlers,

00:30:59.460 --> 00:31:03.720
<v Edward>and then it needs to decide what comes next.

00:31:04.540 --> 00:31:11.180
<v Edward>You assign the next handlers: once your header is done, then you assign the event

00:31:11.180 --> 00:31:12.520
<v Edward>handler for the body, etc.

00:31:12.840 --> 00:31:20.060
<v Edward>There's a lot of manual mental effort involved with that kind of coding model

00:31:20.060 --> 00:31:29.020
<v Edward>where you are both handling the HTTP processing logic in tandem with handling the event loop.

00:31:29.760 --> 00:31:37.700
<v Edward>And with async await constructs, all of that logic then becomes linear,

00:31:37.920 --> 00:31:40.280
<v Edward>actually. You can very much see...

00:31:41.310 --> 00:31:46.330
<v Edward>After this, you're going to do this next in the life of a request.

00:31:46.530 --> 00:31:54.550
<v Edward>And that, I think, I believe, has been really helpful for folks who are,

00:31:54.550 --> 00:31:58.910
<v Edward>for onboarding new engineers, for learning the code base, etc.

00:31:58.910 --> 00:32:07.930
<v Edward>I think those ergonomics were just as important

00:32:07.930 --> 00:32:12.210
<v Edward>to us honestly because we need to ship things fast here.

00:32:12.210 --> 00:32:17.390
<v Matthias>Sounds super crazy because not many people will be familiar with how

00:32:17.390 --> 00:32:23.950
<v Matthias>NGINX or, to be more specific, C handles asynchronous execution.

00:32:24.470 --> 00:32:28.730
<v Matthias>That's a thing that was sort of a selling point for NGINX in the beginning.

00:32:28.910 --> 00:32:33.950
<v Matthias>It was event-driven in comparison to Apache, which was not very much event-driven.

00:32:34.130 --> 00:32:39.010
<v Matthias>It was more or less process-driven, and NGINX kind of changed that model,

00:32:39.030 --> 00:32:42.730
<v Matthias>but you kind of need to shoehorn your logic into that.

00:32:44.320 --> 00:32:51.320
<v Matthias>But is it similar to the state machine that gets converted to something more

00:32:51.320 --> 00:32:53.320
<v Matthias>maintainable on the Rust side?

00:32:53.500 --> 00:32:57.520
<v Matthias>So on Rust, we don't really need to write a big state machine ourselves.

00:32:57.800 --> 00:33:03.680
<v Matthias>We just use async Rust as we do, and then the compiler will just generate the state machine for us.

00:33:03.840 --> 00:33:09.180
<v Matthias>Is the code similar on the C side, or is it completely different?

00:33:10.120 --> 00:33:15.860
<v Edward>Got it, yeah. Yeah, I would say NGINX,

00:33:16.360 --> 00:33:25.620
<v Edward>I guess a lot of that is hand unrolled, the way we were talking about, right?

00:33:25.980 --> 00:33:32.440
<v Edward>Where the events and the next state that you're going to go to for the next

00:33:32.440 --> 00:33:37.020
<v Edward>event that you encounter are manually defined within NGINX.
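
NOTE
A minimal sketch of the contrast discussed here, using only the Rust standard library. Every name in it (ManualRequest, WaitReadable, handle_request, drive) is hypothetical; this is not NGINX or Pingora code. ManualRequest picks its next state by hand on every event, while handle_request is a linear async function whose state machine is generated by the compiler. The drive function is a toy stand-in for a runtime.
```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
// Hand-rolled style: each "socket readable" event re-enters the handler,
// which must remember where the request is and choose the next state itself.
#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    ReadHeaders,
    ReadBody,
    Done,
}
struct ManualRequest {
    state: State,
    log: Vec<&'static str>,
}
impl ManualRequest {
    fn on_readable(&mut self) {
        self.state = match self.state {
            State::ReadHeaders => {
                self.log.push("headers");
                State::ReadBody // manually assign the next handler
            }
            State::ReadBody => {
                self.log.push("body");
                State::Done
            }
            State::Done => State::Done,
        };
    }
}
pub fn run_manual() -> Vec<&'static str> {
    let mut req = ManualRequest { state: State::ReadHeaders, log: Vec::new() };
    while req.state != State::Done {
        req.on_readable();
    }
    req.log
}
// Stand-in for "wait until the socket is readable": pending once, then ready.
struct WaitReadable(bool);
impl Future for WaitReadable {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}
// async/await style: the same steps, written in request order.
async fn handle_request(log: &mut Vec<&'static str>) {
    WaitReadable(false).await;
    log.push("headers");
    WaitReadable(false).await;
    log.push("body");
}
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}
// Toy driver standing in for a runtime: polls one future until it completes
// and reports how many polls that took.
fn drive<F: Future>(fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
    }
}
pub fn run_async() -> (Vec<&'static str>, usize) {
    let mut log = Vec::new();
    let ((), polls) = drive(handle_request(&mut log));
    (log, polls)
}
```
Both run_manual() and run_async() record the steps "headers" then "body". The difference is that the async version reads top to bottom in request order, while the manual version spreads that order across match arms.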

00:33:37.020 --> 00:33:41.820
<v Edward>And then you were also talking about how NGINX was...

00:33:43.610 --> 00:33:46.570
<v Edward>Really revolutionary in terms of

00:33:46.570 --> 00:33:53.230
<v Edward>how it was doing the asynchronous event driven model and that kind of touches

00:33:53.230 --> 00:33:58.910
<v Edward>on... it's not exactly related

00:33:58.910 --> 00:34:06.130
<v Edward>to how, you know, async Rust gets kind of converted into a state machine,

00:34:06.430 --> 00:34:16.370
<v Edward>but it is something that it's a big explanation for why NGINX is already a great performer,

00:34:16.730 --> 00:34:18.990
<v Edward>right? Does really well in benchmarks already.

00:34:19.370 --> 00:34:26.230
<v Edward>The underlying mechanisms for how that works, I talked about something called epoll before.

00:34:26.450 --> 00:34:31.390
<v Edward>The underlying mechanisms of NGINX

00:34:31.390 --> 00:34:34.770
<v Edward>are such that there's an event

00:34:34.770 --> 00:34:37.770
<v Edward>loop that an NGINX worker process goes

00:34:37.770 --> 00:34:40.850
<v Edward>through, and it's able,

00:34:40.850 --> 00:34:44.730
<v Edward>with the help of operating system utilities

00:34:44.730 --> 00:34:49.330
<v Edward>like epoll, to determine when I/O

00:34:49.330 --> 00:34:53.210
<v Edward>events are ready on certain file descriptors

00:34:53.210 --> 00:34:56.330
<v Edward>and otherwise block if

00:34:56.330 --> 00:34:59.230
<v Edward>there are no events ready there. So

00:34:59.230 --> 00:35:02.130
<v Edward>you can think of it as essentially processing all of

00:35:02.130 --> 00:35:05.270
<v Edward>these events as they become ready

00:35:05.270 --> 00:35:08.710
<v Edward>as I/O events come in, in

00:35:08.710 --> 00:35:16.930
<v Edward>this, you know, literal loop. That was

00:35:16.930 --> 00:35:26.210
<v Edward>a really powerful and, you know, efficient way to do all of this and respond to

00:35:26.950 --> 00:35:28.970
<v Edward>network and file IO, right?

00:35:30.340 --> 00:35:37.160
<v Edward>We got to cheat with Pingora because we are able to reap the benefits of tokio

00:35:37.160 --> 00:35:39.920
<v Edward>that does something really similar.

00:35:39.920 --> 00:35:47.900
<v Edward>It has a, and I am not a tokio expert by any means, I would say,

00:35:48.120 --> 00:35:55.300
<v Edward>but it is doing something really similar also with the help of Mio underlying, right,

00:35:55.820 --> 00:36:02.840
<v Edward>Metal I/O, where it's also handling the file descriptors and this event loop, etc.

00:36:03.320 --> 00:36:11.040
<v Edward>Within its reactor that is using a lot of the same operating system mechanisms,

00:36:11.400 --> 00:36:20.100
<v Edward>be it epoll or kqueue or what have you, to listen to these I/O events and then propagate them,

00:36:21.080 --> 00:36:26.620
<v Edward>wake the corresponding tokio tasks that are relying on them, right?

00:36:26.800 --> 00:36:37.300
<v Edward>So all of a sudden, we are able to look at that as Pingora and we have a great abstraction,

00:36:37.300 --> 00:36:46.180
<v Edward>you know, from the actual underlying event handling systems that we can build upon.

00:36:46.380 --> 00:36:49.220
<v Edward>And so a lot of, I would say that, like.

00:36:50.430 --> 00:36:53.110
<v Edward>Great big success when it comes

00:36:53.110 --> 00:36:58.230
<v Edward>to why we were able to develop Pingora relatively quickly, I would say,

00:36:58.430 --> 00:37:04.950
<v Edward>is because we were able to build upon the success of tokio and all of its great

00:37:04.950 --> 00:37:08.530
<v Edward>performance considerations and mechanisms, right?

00:37:08.790 --> 00:37:11.890
<v Edward>Because tokio also has a bunch

00:37:11.890 --> 00:37:16.450
<v Edward>of internal optimizations for especially

00:37:16.450 --> 00:37:19.790
<v Edward>when it comes to, well, we can get into how

00:37:19.790 --> 00:37:26.470
<v Edward>it does things between threads, etc., and tries to load balance

00:37:26.470 --> 00:37:33.630
<v Edward>tasks and work between threads. But basically, we were able to do so much

00:37:33.630 --> 00:37:39.110
<v Edward>on top of it because we already had a great underlying async runtime and event handling mechanism.

00:37:39.850 --> 00:37:42.870
<v Kevin>I don't know if you sold, you said you weren't a tokio expert.

00:37:42.990 --> 00:37:46.750
<v Kevin>I don't know if that came across because it sounds like you are quite the tokio expert now.

00:37:46.850 --> 00:37:52.130
<v Kevin>I can say as a definite non-tokio expert that you don't really even need to

00:37:52.130 --> 00:37:54.690
<v Kevin>know all this to use tokio as an async code.

00:37:54.770 --> 00:37:58.370
<v Kevin>You can come into it as a TypeScript developer and be like, oh yeah,

00:37:58.610 --> 00:37:59.810
<v Kevin>it's async/await, I get it.

00:38:00.570 --> 00:38:05.730
<v Edward>We have an actual tokio maintainer on our team now, Noah Kennedy,

00:38:05.730 --> 00:38:08.450
<v Edward>who is actually a tokio expert.

00:38:08.790 --> 00:38:13.850
<v Edward>So I don't generally say I'm an expert on things unless I really feel like I

00:38:13.850 --> 00:38:15.810
<v Edward>know them very, very, very well.

00:38:17.260 --> 00:38:20.860
<v Matthias>Well, also the error messages have

00:38:20.860 --> 00:38:23.960
<v Matthias>gotten a lot better in recent years. I can still

00:38:23.960 --> 00:38:27.000
<v Matthias>remember the early days when you got

00:38:27.000 --> 00:38:30.280
<v Matthias>a page full of, you know,

00:38:30.280 --> 00:38:33.560
<v Matthias>gibberish about the type system and so on. But recently they

00:38:33.560 --> 00:38:37.640
<v Matthias>improved it a lot thanks to compiler internals

00:38:37.640 --> 00:38:43.360
<v Matthias>which helped infer what really went wrong and also present the information in

00:38:43.360 --> 00:38:48.920
<v Matthias>a more consumable way and also just new mechanisms inside of the components

00:38:48.920 --> 00:38:54.640
<v Matthias>of, like, impl Trait, which allow you to focus more on the core issue at hand.

00:38:55.340 --> 00:38:58.580
<v Matthias>I don't know if you saw any of the older error messages before but.

00:38:58.580 --> 00:39:01.840
<v Kevin>Definitely, yeah. So

00:39:01.840 --> 00:39:04.620
<v Kevin>I did some work with tokio a long

00:39:04.620 --> 00:39:07.240
<v Kevin>time ago, like well before it went

00:39:07.240 --> 00:39:11.960
<v Kevin>1.0, so things were a little bit different then in terms of error messages. So

00:39:11.960 --> 00:39:16.640
<v Kevin>it was basically like back in the old days of dealing with Java error messages

00:39:16.640 --> 00:39:21.020
<v Kevin>or other languages' error messages, where you can almost ignore them, just look

00:39:21.020 --> 00:39:24.700
<v Kevin>at where in the code it was pointing and go there and try to figure things out yourself,

00:39:25.460 --> 00:39:29.600
<v Kevin>as opposed to right now, where the error messages in async are practically as

00:39:29.600 --> 00:39:32.100
<v Kevin>good as they are in synchronous Rust code, which is to say really good.

00:39:32.340 --> 00:39:36.660
<v Matthias>Did you ever run into problems with tokio, like, for example,

00:39:37.420 --> 00:39:44.480
<v Matthias>starving the executor on threads when you block too many futures or async cancellation

00:39:44.480 --> 00:39:50.120
<v Matthias>where you had a sub-request that you didn't want to kill when you killed main

00:39:50.120 --> 00:39:52.520
<v Matthias>task, for example, and you wanted to keep it running?

00:39:52.520 --> 00:39:57.000
<v Matthias>And the Rust Futures ecosystem was sort of getting in your way?

00:39:58.220 --> 00:40:02.720
<v Kevin>At least a little bit. There was one scenario that I ran into within the past

00:40:02.720 --> 00:40:06.080
<v Kevin>month where I was downloading a file, downloading a large file,

00:40:06.480 --> 00:40:10.500
<v Kevin>running on a hyperscaler VM, so really excellent internet connection,

00:40:10.680 --> 00:40:14.300
<v Kevin>and just the process of downloading that file on a small VM with a limited number

00:40:14.300 --> 00:40:21.020
<v Kevin>of threads and a small number of tokio workers was enough to starve every other

00:40:21.020 --> 00:40:25.120
<v Kevin>connection because I wasn't being smart. I wasn't using budgeting.

00:40:25.340 --> 00:40:30.920
<v Kevin>I was just letting this one task take over the entire runtime and blocking everything else.

00:40:31.040 --> 00:40:34.780
<v Kevin>Like no other async things were being taken care of. It was just happily downloading

00:40:34.780 --> 00:40:38.260
<v Kevin>this one giant file and using the entire CPU to do that.

00:40:39.180 --> 00:40:43.920
<v Matthias>Yeah, what happens a lot during testing is that people have these multi-core

00:40:43.920 --> 00:40:46.500
<v Matthias>machines, they have, I don't know, 16, 32,

00:40:46.820 --> 00:40:53.520
<v Matthias>64 cores, what have you, and they run that system in their test laptop in their

00:40:53.520 --> 00:40:54.380
<v Matthias>development environment.

00:40:54.380 --> 00:40:56.960
<v Matthias>And everything works because you have plenty of threads.

00:40:57.380 --> 00:41:03.580
<v Matthias>By default, tokio just spawns as many background threads as you have cores.

00:41:03.860 --> 00:41:09.040
<v Matthias>But then you move to production where maybe you have to make do with two cores

00:41:09.040 --> 00:41:12.500
<v Matthias>and then suddenly you have two blocking

00:41:12.970 --> 00:41:15.730
<v Matthias>tasks, and then you run

00:41:15.730 --> 00:41:18.470
<v Matthias>out of threads and the

00:41:18.470 --> 00:41:21.270
<v Matthias>thread pool gets exhausted. That's a thing, at least, that I

00:41:21.270 --> 00:41:24.550
<v Matthias>saw with some clients. And with you running

00:41:24.550 --> 00:41:29.510
<v Matthias>20% of the internet, I did wonder if you ever ran into these sort of problems.

00:41:29.510 --> 00:41:35.510
<v Matthias>These might be hard to troubleshoot as well, because you look at a dashboard

00:41:35.510 --> 00:41:39.630
<v Matthias>or so and everything looks normal, as if it was sort of working, but it doesn't

00:41:39.630 --> 00:41:43.130
<v Matthias>do any work. It doesn't make any progress on the futures.

00:41:43.130 --> 00:41:49.570
<v Kevin>That's true. Recently, our team, the Pingora team, has

00:41:49.570 --> 00:41:54.770
<v Kevin>incorporated tokio internal metrics into our dashboards to give us visibility

00:41:54.770 --> 00:41:59.350
<v Kevin>into these sort of things. But you're right, that's something that, before the

00:41:59.350 --> 00:42:01.190
<v Kevin>past couple months, we didn't have visibility into,

00:42:01.850 --> 00:42:06.050
<v Kevin>and if we were running into that problem, we wouldn't have known. Since then, we've

00:42:06.050 --> 00:42:10.850
<v Kevin>encountered a few problems in production that we've seen in our measurements

00:42:10.850 --> 00:42:13.750
<v Kevin>of runtime queue size and,

00:42:14.690 --> 00:42:19.150
<v Kevin>I forget the other metrics, but basically how much the scheduling system is

00:42:19.150 --> 00:42:23.210
<v Kevin>getting backed up by all of the threads being busy.

00:42:23.210 --> 00:42:25.890
<v Edward>I think we were also running into issues where we

00:42:25.890 --> 00:42:33.790
<v Edward>had certain file I/O operations that were taking a while, and for those,

00:42:33.790 --> 00:42:40.130
<v Edward>tokio actually has like a separate blocking thread pool that usually doesn't

00:42:40.130 --> 00:42:43.870
<v Edward>get saturated because it's really large, but it may, right?

00:42:44.690 --> 00:42:51.210
<v Edward>And on the note of, I would say you also brought up async cancellation as well.

00:42:51.410 --> 00:42:54.590
<v Edward>I think it's really easy to mess up.

00:42:55.430 --> 00:42:59.590
<v Edward>Async, you know, thoughts about cancel safety and things.

00:42:59.610 --> 00:43:05.710
<v Edward>It's really easy to mess up a while loop around tokio select

00:43:05.990 --> 00:43:10.110
<v Edward>if any of the branches aren't necessarily cancel safe. That's

00:43:10.110 --> 00:43:14.610
<v Edward>purely an async Rust problem that generally, I

00:43:14.610 --> 00:43:19.410
<v Edward>think doesn't get introduced very well for other people who are entering Rust.

00:43:19.930 --> 00:43:23.650
<v Edward>There was a great Rust talk actually by Rain

00:43:23.650 --> 00:43:26.670
<v Edward>from Oxide that I would love to shout out, because

00:43:26.670 --> 00:43:30.050
<v Edward>it was really helpful in thinking about why exactly

00:43:30.050 --> 00:43:33.030
<v Edward>async cancellation is hard to reason about. It's

00:43:33.030 --> 00:43:41.190
<v Edward>because there are a lot of ways in which cleaning up async Rust,

00:43:41.190 --> 00:43:48.950
<v Edward>being mindful of cancel safety, being mindful of when a future is canceled and

00:43:48.950 --> 00:43:51.570
<v Edward>thus cancels everything else under it, right?

00:43:52.310 --> 00:43:57.330
<v Edward>That's both really, really useful in async Rust, but also very,

00:43:57.490 --> 00:43:59.570
<v Edward>very easy to mess up. And it's a problem.

00:43:59.910 --> 00:44:03.590
<v Edward>They had introduced this concept, which I was really...

00:44:04.670 --> 00:44:11.430
<v Edward>Which helped me think about this, right? It is a problem that you can't determine

00:44:11.430 --> 00:44:17.850
<v Edward>whether or not something is safe to cancel just within the function itself.

00:44:17.850 --> 00:44:25.390
<v Edward>You have to look at every other child future under it and determine what else

00:44:25.390 --> 00:44:28.110
<v Edward>is going to get canceled when I decide to cancel this.

00:44:28.110 --> 00:44:31.570
<v Edward>And so it becomes a really hard problem to think about because suddenly you

00:44:31.570 --> 00:44:35.430
<v Edward>have to think about everything else, you know, all of the global context.
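
NOTE
A small illustration of the point about child futures, using only the Rust standard library. All names (recv_pair, sum_pair, YieldOnce, Queue) are hypothetical and not taken from Pingora, tokio, or the talk mentioned above. sum_pair looks harmless on its own, but it is not cancel safe because the child future it awaits holds a message it already removed from the queue. Dropping the future after one poll stands in for a select-style branch losing the race.
```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
type Queue = Rc<RefCell<VecDeque<u32>>>;
// Stand-in for any await point: pending on the first poll, ready on the second.
struct YieldOnce(bool);
impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}
// NOT cancel safe: `first` has left the queue but lives only in this future's
// state. If the future is dropped at the await, that message is gone.
async fn recv_pair(queue: Queue) -> (u32, u32) {
    let first = queue.borrow_mut().pop_front().expect("queue empty");
    YieldOnce(false).await;
    let second = queue.borrow_mut().pop_front().expect("queue empty");
    (first, second)
}
// Holds no state of its own, yet inherits the problem from its child future.
async fn sum_pair(queue: Queue) -> u32 {
    let (a, b) = recv_pair(queue).await;
    a + b
}
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}
// Polls `sum_pair` once, then drops it. Returns what is left in the queue.
pub fn cancel_after_first_poll() -> Vec<u32> {
    let queue: Queue = Rc::new(RefCell::new(VecDeque::from(vec![1, 2, 3])));
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(sum_pair(Rc::clone(&queue)));
    assert!(fut.as_mut().poll(&mut cx).is_pending());
    drop(fut); // cancellation: `recv_pair` and `YieldOnce` are dropped with it
    let remaining: Vec<u32> = queue.borrow().iter().copied().collect();
    remaining
}
```
After the cancellation the queue holds only 2 and 3. Message 1 was consumed by the child future and never delivered, which cannot be seen by reading sum_pair alone.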

00:44:35.630 --> 00:44:37.030
<v Kevin>This is structured futures?

00:44:37.330 --> 00:44:42.510
<v Edward>I think it's just because when a future, like child futures,

00:44:43.110 --> 00:44:45.590
<v Edward>that will also get canceled, right?

00:44:45.770 --> 00:44:49.470
<v Edward>When you cancel the parent future, I believe.

00:44:49.750 --> 00:44:53.590
<v Edward>So it's tough.

00:44:53.890 --> 00:44:57.670
<v Edward>And so async Rust definitely has its sharp edges.

00:44:58.390 --> 00:45:02.170
<v Edward>That's not necessarily a tokio-specific thing, but the last bit,

00:45:02.330 --> 00:45:08.170
<v Edward>the very last bit, is that actually there is, to shout out another Cloudflare

00:45:08.170 --> 00:45:11.230
<v Edward>crate that we are not yet using in Pingora either,

00:45:11.450 --> 00:45:14.410
<v Edward>but other services at Cloudflare are.

00:45:14.650 --> 00:45:19.070
<v Edward>There's a crate called Foundations that I think helps you export a lot of these

00:45:19.070 --> 00:45:22.030
<v Edward>tokio metrics and such out of the box.

00:45:22.030 --> 00:45:25.570
<v Edward>So it has a lot of nice functionality to be able to do that.

00:45:25.570 --> 00:45:30.330
<v Edward>And it should be a pretty minimal, again,

00:45:30.590 --> 00:45:36.810
<v Edward>foundation layer for folks if you're interested in more easily exposing a lot

00:45:36.810 --> 00:45:41.190
<v Edward>of those runtime operational concerns and getting observability into that.

00:45:42.340 --> 00:45:47.420
<v Kevin>One thing to build on that, the one thing that Pingora does to help you avoid

00:45:47.420 --> 00:45:52.020
<v Kevin>the problem of going from one machine with many cores to another machine with

00:45:52.020 --> 00:45:55.460
<v Kevin>few cores, the problem about unexpected number of tokio tasks,

00:45:55.820 --> 00:45:57.860
<v Kevin>is to make you be explicit.

00:45:57.860 --> 00:46:01.020
<v Kevin>Like, while Pingora uses tokio under

00:46:01.020 --> 00:46:04.360
<v Kevin>the hood, it

00:46:04.360 --> 00:46:07.780
<v Kevin>doesn't really expose tokio to the caller. Instead,

00:46:07.780 --> 00:46:11.000
<v Kevin>it talks about things in terms of backends: how

00:46:11.000 --> 00:46:13.840
<v Kevin>many threads do you want to use on this backend? And we don't do

00:46:13.840 --> 00:46:16.480
<v Kevin>a default number. We don't default to the

00:46:16.480 --> 00:46:19.600
<v Kevin>number of cores. We make you be explicit, to say, okay, you want

00:46:19.600 --> 00:46:23.420
<v Kevin>to run this, tell us how many tasks you want to run. And we do a lot of things

00:46:23.420 --> 00:46:28.240
<v Kevin>like isolating services to a certain subset of tasks. It's not like one giant

00:46:28.240 --> 00:46:33.020
<v Kevin>tokio runtime, which you would get if you were just running tokio main. It's a

00:46:33.020 --> 00:46:36.700
<v Kevin>good way of isolating business critical things from things that need to run

00:46:36.700 --> 00:46:39.080
<v Kevin>in the background that can take a little extra time.

00:46:39.080 --> 00:46:41.700
<v Matthias>Yeah, and since we're on the

00:46:41.700 --> 00:46:44.700
<v Matthias>topic of using a certain number of

00:46:44.700 --> 00:46:48.820
<v Matthias>cores in Rust code, I always wonder why everyone sort

00:46:48.820 --> 00:46:51.660
<v Matthias>of defaulted to the number of cores that you have on your system,

00:46:51.660 --> 00:46:54.700
<v Matthias>because if every dependency, if every library does that,

00:46:54.700 --> 00:46:57.640
<v Matthias>you end up with a multiple of

00:46:57.640 --> 00:47:00.440
<v Matthias>the number of cores you have. And I'm just

00:47:00.440 --> 00:47:03.440
<v Matthias>saying this so that people are mindful

00:47:03.440 --> 00:47:08.700
<v Matthias>about the resources that they request from a system. And speaking of resources,

00:47:08.700 --> 00:47:13.860
<v Matthias>that's the other part about performance, or efficiency, let's say, that I wondered

00:47:13.860 --> 00:47:22.920
<v Matthias>about. When you compare NGINX with Pingora, were you able to squeeze out even more

00:47:24.040 --> 00:47:29.460
<v Matthias>requests per server now that you switched over to Pingora? Because NGINX must

00:47:29.460 --> 00:47:33.440
<v Matthias>have already handled a ton of requests I'm assuming because it was written in C.

00:47:34.190 --> 00:47:38.030
<v Edward>I don't know if we were able to squeeze out more necessarily,

00:47:38.230 --> 00:47:42.970
<v Edward>like people squeeze out all of the resources from us with the amount of requests

00:47:42.970 --> 00:47:46.070
<v Edward>per second that they drop on our network.

00:47:46.470 --> 00:47:55.950
<v Edward>But I think, again, I think the resources that we are really saving,

00:47:57.430 --> 00:48:02.070
<v Edward>as far as I can recall, because

00:48:02.070 --> 00:48:05.410
<v Edward>NGINX, you know, just bare request

00:48:05.410 --> 00:48:08.910
<v Edward>processing without, you know, extra compute

00:48:08.910 --> 00:48:12.330
<v Edward>futzing with the request processing with

00:48:12.330 --> 00:48:16.710
<v Edward>Lua filters or whatnot, is generally already pretty efficient and

00:48:16.710 --> 00:48:23.470
<v Edward>just trying to limit what it does to being pipes. We have similar goals, right?

00:48:23.470 --> 00:48:30.530
<v Edward>We want to be as minimal as we can and just ferrying the bytes through and making

00:48:30.530 --> 00:48:34.070
<v Edward>the necessary modifications on the layer seven stuff.

00:48:34.350 --> 00:48:40.370
<v Edward>Now, the things that we were saving, I had mentioned earlier that we were, like,

00:48:40.850 --> 00:48:45.330
<v Edward>more efficient at reusing origin

00:48:45.330 --> 00:48:48.350
<v Edward>connections, for example. You can, in theory,

00:48:48.350 --> 00:48:51.170
<v Edward>I guess, yeah, you

00:48:51.170 --> 00:48:54.550
<v Edward>can squeeze out and save compute

00:48:54.550 --> 00:48:57.830
<v Edward>for both yourself and the

00:48:57.830 --> 00:49:00.950
<v Edward>origin, right, if you have better

00:49:00.950 --> 00:49:04.370
<v Edward>origin connection reuse when

00:49:04.370 --> 00:49:07.750
<v Edward>you're trying to make requests upstream. The reason

00:49:07.750 --> 00:49:11.770
<v Edward>why there was

00:49:11.770 --> 00:49:14.550
<v Edward>such a fundamental difference, I think, in

00:49:14.550 --> 00:49:17.550
<v Edward>the blog I think we had mentioned that we had

00:49:17.550 --> 00:49:20.610
<v Edward>lowered it by like

00:49:20.610 --> 00:49:23.470
<v Edward>two-thirds, the amount of origin connections we

00:49:23.470 --> 00:49:26.430
<v Edward>were making. The fundamental reason for that was like a fundamental

00:49:26.430 --> 00:49:29.770
<v Edward>architecture reason, which was

00:49:29.770 --> 00:49:33.550
<v Edward>that NGINX worker processes, right,

00:49:33.550 --> 00:49:37.370
<v Edward>because they're individual processes, weren't able

00:49:37.370 --> 00:49:43.770
<v Edward>to share a connection pool, unlike the thread-based model that we have in

00:49:43.770 --> 00:49:51.550
<v Edward>Pingora, where we have an upstream connection pool that

00:49:51.550 --> 00:49:56.590
<v Edward>all the threads on a particular server can share from.
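
NOTE
A toy model of the architectural difference described here, using only the Rust standard library. The Pool type, the origin name, and both functions are hypothetical, and this is not how NGINX or Pingora implement pooling. It assumes requests never overlap (the lock is held for a whole request) so that the counts are deterministic. With one private pool per worker, every worker opens its own connection to the origin; with one pool shared across threads, the first connection is reused by all of them.
```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;
const ORIGIN: &str = "origin.example";
#[derive(Default)]
struct Pool {
    idle: HashMap<String, Vec<u32>>, // idle connection ids, keyed by origin
    opened: u32,                     // how many connections were ever opened
}
impl Pool {
    // Reuse an idle connection to `origin` if one exists, else "open" a new one.
    fn checkout(&mut self, origin: &str) -> u32 {
        if let Some(conn) = self.idle.get_mut(origin).and_then(|v| v.pop()) {
            return conn;
        }
        self.opened += 1;
        self.opened
    }
    // Return a connection to the pool so a later request can reuse it.
    fn checkin(&mut self, origin: &str, conn: u32) {
        self.idle.entry(origin.to_string()).or_default().push(conn);
    }
}
// Process-per-worker model: each worker owns a private pool it cannot share.
pub fn opened_with_private_pools(workers: usize, requests_per_worker: usize) -> u32 {
    let mut total = 0;
    for _ in 0..workers {
        let mut pool = Pool::default();
        for _ in 0..requests_per_worker {
            let conn = pool.checkout(ORIGIN);
            pool.checkin(ORIGIN, conn);
        }
        total += pool.opened;
    }
    total
}
// Thread model: every worker thread draws from one shared pool.
pub fn opened_with_shared_pool(workers: usize, requests_per_worker: usize) -> u32 {
    let pool = Arc::new(Mutex::new(Pool::default()));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let pool = Arc::clone(&pool);
            thread::spawn(move || {
                for _ in 0..requests_per_worker {
                    // Simplification: hold the lock for the whole "request".
                    let mut p = pool.lock().unwrap();
                    let conn = p.checkout(ORIGIN);
                    p.checkin(ORIGIN, conn);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let opened = pool.lock().unwrap().opened;
    opened
}
```
In this model, four workers with ten requests each open four origin connections with private pools and one with the shared pool. Real traffic overlaps, so actual savings depend on concurrency and keep-alive behavior.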

00:49:58.110 --> 00:50:05.430
<v Edward>Except in those particular design, fundamental architecture ways that we were

00:50:05.430 --> 00:50:09.970
<v Edward>really conscious of when we were first optimizing Pingora, because for our team,

00:50:09.970 --> 00:50:12.310
<v Edward>making origin connections, that's a big deal for us.

00:50:13.250 --> 00:50:19.250
<v Edward>I think generally we would expect performance to be pretty much on par, right?

00:50:19.970 --> 00:50:26.210
<v Edward>With what NGINX is doing. And that's the promise of Rust, right? You can do that.

00:50:26.970 --> 00:50:31.030
<v Edward>And be just as expressive and easy to understand.

00:50:32.010 --> 00:50:36.710
<v Kevin>Yeah, a lot of this comes down to physical limits. So NGINX is optimized to

00:50:36.710 --> 00:50:40.650
<v Kevin>the max to the level of what you can do on a network card and what you can do

00:50:40.650 --> 00:50:43.810
<v Kevin>reading files from a disk. We are limited by the same physical constraints.

00:50:44.010 --> 00:50:47.110
<v Kevin>We are reading from the same network, reading from the same disk effectively.

00:50:47.650 --> 00:50:53.910
<v Kevin>The place where Rust excels here is its ability to make code easy to read and easy

00:50:53.910 --> 00:50:59.470
<v Kevin>to write and easy to onboard as opposed to requiring a PhD to unroll C code.

00:50:59.470 --> 00:51:03.670
<v Edward>And when it comes to, like, implementing new,

00:51:03.990 --> 00:51:07.030
<v Edward>playing with the shiny new tools that the kernel allows you to,

00:51:07.270 --> 00:51:12.230
<v Edward>like io_uring and stuff, that's perhaps a lot easier to,

00:51:12.230 --> 00:51:19.010
<v Edward>like, I would certainly want to be doing that within our framework instead of

00:51:19.010 --> 00:51:22.490
<v Edward>trying to roll that into NGINX, right?

00:51:22.490 --> 00:51:29.470
<v Edward>At this point, I think it's just that we are a lot more comfortable working with our ecosystem.

00:51:29.470 --> 00:51:37.650
<v Matthias>Which brings us to today. And just to wrap up the part about Pingora, can you

00:51:37.650 --> 00:51:44.390
<v Matthias>share some numbers about the project? Where are we at today, and maybe about Cloudflare in general?

00:51:45.370 --> 00:51:49.470
<v Kevin>Sure. So, I mean, the first thing that I always tell people about Pingora or

00:51:49.470 --> 00:51:51.670
<v Kevin>Cloudflare in general is that the teams are really small,

00:51:52.230 --> 00:51:56.570
<v Kevin>surprisingly small, especially if you look at teams at other big companies like

00:51:56.570 --> 00:52:01.130
<v Kevin>Amazon or Facebook or Google. There's only between six and eight people,

00:52:01.430 --> 00:52:04.090
<v Kevin>depending on the time of day, on the Pingora team.

00:52:04.530 --> 00:52:10.070
<v Kevin>So this is a team that handles a large share, 20%, of the internet traffic in the world,

00:52:10.510 --> 00:52:14.150
<v Kevin>handled by six or seven people, most of whom are asleep at the same time.

00:52:16.210 --> 00:52:21.990
<v Kevin>In terms of lines of code, we, for some reason, are not giving out the official

00:52:21.990 --> 00:52:24.130
<v Kevin>number of lines of code in Cloudflare that is written in Rust.

00:52:24.270 --> 00:52:29.290
<v Kevin>But for Pingora, even on the open source side, there's about 130,000 lines of code.

00:52:29.290 --> 00:52:33.010
<v Edward>To be clear, we're not the only content

00:52:33.010 --> 00:52:36.070
<v Edward>delivery network team, not the

00:52:36.070 --> 00:52:39.130
<v Edward>only proxy service through which, you know,

00:52:39.130 --> 00:52:43.170
<v Edward>these requests are passing. There are lots of other folks.

00:52:43.170 --> 00:52:49.250
<v Edward>But yeah, I think that, where it makes sense, the engineering teams are,

00:52:49.250 --> 00:52:53.650
<v Edward>definitely, we have a lot of autonomy, each of us as engineers, and certainly

00:52:53.650 --> 00:52:58.430
<v Edward>a lot of responsibility, and we're each kind of, you know,

00:52:58.470 --> 00:53:02.850
<v Edward>driven to do what we want within the team.

00:53:03.130 --> 00:53:08.350
<v Edward>So each of us carries, I think, a lot of load, while trying not to stress the

00:53:08.350 --> 00:53:09.970
<v Edward>bus factor, though, in that case.

00:53:10.490 --> 00:53:16.010
<v Kevin>True. Yeah, the reason the team size fluctuates is because we, as a company,

00:53:16.310 --> 00:53:21.190
<v Kevin>are open to working across teams, to the extent that for the past,

00:53:21.370 --> 00:53:24.130
<v Kevin>I don't know, three or four months, I've not been working on the Pingora team

00:53:24.130 --> 00:53:28.010
<v Kevin>and have been working on the Speed team on other, as yet undisclosed projects,

00:53:28.290 --> 00:53:33.110
<v Kevin>which are also written in Rust, more on the core side than the edge side,

00:53:33.150 --> 00:53:37.650
<v Kevin>but still very interesting and all async Rust, just like Pingora.

00:53:38.820 --> 00:53:39.300
<v Matthias>Amazing.

00:53:40.080 --> 00:53:46.280
<v Edward>Though I don't think we can share across all of Cloudflare how many lines of

00:53:46.280 --> 00:53:53.200
<v Edward>Rust code there are, I will say that, as we mentioned, Rust has been of

00:53:53.200 --> 00:53:54.900
<v Edward>interest to Cloudflare for a long time.

00:53:55.140 --> 00:54:01.920
<v Edward>Pretty much every new service on the edge is written in Rust,

00:54:02.180 --> 00:54:05.760
<v Edward>I believe, unless there's some significant reason not to.

00:54:05.760 --> 00:54:14.200
<v Edward>I think all of the services that are running are proof enough that it provides

00:54:14.200 --> 00:54:19.860
<v Edward>significant value, especially in our performance-critical, like,

00:54:20.540 --> 00:54:24.640
<v Edward>segfault-avoidant environment.

00:54:26.140 --> 00:54:29.500
<v Kevin>Yeah, there are at least a few requests that go through Cloudflare that touch only Rust.

00:54:30.180 --> 00:54:34.960
<v Kevin>It's not the majority yet, but it is a significant number.

00:54:35.760 --> 00:54:39.780
<v Matthias>Speaking of which, have we talked about the number of requests per second that

00:54:39.780 --> 00:54:42.040
<v Matthias>Pingora handles right now?

00:54:42.220 --> 00:54:46.900
<v Kevin>Oh, yeah, I mentioned it briefly in a ramble. But yeah, so Pingora itself,

00:54:47.000 --> 00:54:48.980
<v Kevin>there are multiple Pingora projects.

00:54:49.380 --> 00:54:54.860
<v Kevin>But the one most prevalent that talks to upstream origins handles on average

00:54:54.860 --> 00:54:56.700
<v Kevin>about 90 million requests a second.

00:54:58.200 --> 00:55:01.800
<v Kevin>There was a blog post that came out that ThePrimeagen read out loud.

00:55:01.960 --> 00:55:05.660
<v Kevin>And he got to that number and said, wow, is that a billy? No, that's a trillion,

00:55:05.660 --> 00:55:07.440
<v Kevin>that's a trillion requests per day.

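NOTE
As a quick check of the figure quoted above: a sustained average of 90 million requests per second over 86,400 seconds comes to roughly 7.8 trillion requests per day, so "trillions" is the right order of magnitude. A minimal sketch of the arithmetic, where the function name requests_per_day is just an illustrative choice:
```rust
/// Seconds in one day: 24 h * 60 min * 60 s = 86,400.
pub const SECONDS_PER_DAY: u64 = 24 * 60 * 60;
/// Requests per day at a constant average rate given in requests per second.
pub fn requests_per_day(requests_per_second: u64) -> u64 {
    requests_per_second * SECONDS_PER_DAY
}
fn main() {
    // 90 million * 86,400 = 7,776,000,000,000
    let per_day = requests_per_day(90_000_000);
    println!("{per_day} requests per day");
}
```
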
00:55:09.220 --> 00:55:12.600
<v Matthias>Wow, that's crazy. That's a lot of requests.

00:55:13.280 --> 00:55:13.920
<v Kevin>Yeah.

00:55:14.750 --> 00:55:18.670
<v Matthias>Does the Rust ecosystem cover everything that you need right now?

00:55:18.890 --> 00:55:22.730
<v Matthias>Are there any crates that you wanted to mention that are amazing,

00:55:22.910 --> 00:55:24.430
<v Matthias>that are invaluable for you?

00:55:24.610 --> 00:55:28.770
<v Matthias>And are there any things lacking in the ecosystem right now?

00:55:29.130 --> 00:55:33.130
<v Kevin>One crate that I think is sort of underutilized is the valuable crate.

00:55:33.430 --> 00:55:38.170
<v Kevin>As part of the tokio project, it ties in really nicely with tokio tracing.

00:55:38.850 --> 00:55:45.530
<v Kevin>It allows you to basically give a controllable summary of objects that you want

00:55:45.530 --> 00:55:46.710
<v Kevin>to show up in your traces.

00:55:48.230 --> 00:55:52.590
<v Kevin>It's got a usage pattern that's similar to serde. You annotate your structs

00:55:52.590 --> 00:55:53.790
<v Kevin>that you want to be able to display.

00:55:54.190 --> 00:55:57.350
<v Kevin>It's got some great new features that allow you to omit fields if you don't

00:55:57.350 --> 00:56:00.970
<v Kevin>want PII to show up in your traces. It's really well written.

00:56:01.330 --> 00:56:05.790
<v Kevin>We've added our own features on top of it that allow you to do things like this: when

00:56:05.790 --> 00:56:11.510
<v Kevin>you have structures from external crates that you can't add annotations to,

00:56:11.770 --> 00:56:17.310
<v Kevin>you can give it a special valuable annotation so that, instead of giving a full object

00:56:17.310 --> 00:56:21.870
<v Kevin>representation of this structure, you can give it the Debug representation or

00:56:21.870 --> 00:56:24.290
<v Kevin>the Display representation and have that show up in your logs.

00:56:24.430 --> 00:56:28.790
<v Kevin>It's just a really simple way of avoiding the boilerplate that comes up with

00:56:28.790 --> 00:56:32.590
<v Kevin>wanting to give a summary of an entire object structure, which I've seen in

00:56:32.590 --> 00:56:35.110
<v Kevin>lots and lots of places, especially in other languages.

00:56:35.110 --> 00:56:38.590
<v Kevin>You want an object to show up in multiple ways, but you can't interfere with

00:56:38.590 --> 00:56:42.390
<v Kevin>how it's serialized to JSON, so you have to go through all the boilerplate of

00:56:42.390 --> 00:56:46.250
<v Kevin>writing, okay, this field goes in; oh no, skip this field, it's got IP addresses.

00:56:47.350 --> 00:56:50.970
<v Kevin>So it's about having a crate that's designed to do this and is also thoughtful,

00:56:51.870 --> 00:56:57.630
<v Kevin>because it doesn't add the overhead of implementing it. You're implementing it as a

00:56:57.630 --> 00:57:00.930
<v Kevin>blanket trait implementation, but it's done in a dynamic way.

00:57:01.030 --> 00:57:04.130
<v Kevin>So it doesn't even add a lot of monomorphization.

00:57:04.410 --> 00:57:11.210
<v Kevin>It just gives you one implementation for anything that implements Display or is Valuable.

00:57:11.610 --> 00:57:14.210
<v Kevin>It's a great crate. I can't get enough of it.

00:57:14.730 --> 00:57:16.350
<v Matthias>One could say it's invaluable.

00:57:16.790 --> 00:57:18.530
<v Kevin>It is invaluable, yes.

00:57:22.810 --> 00:57:27.910
<v Edward>No, yeah, I'm glad you mentioned that because I feel like I'm cheating.

00:57:29.550 --> 00:57:35.290
<v Edward>Everything that comes to mind is a core dependency, right?

00:57:35.490 --> 00:57:42.630
<v Edward>Like, tokio, obviously, has so many great utilities for us to express,

00:57:42.830 --> 00:57:45.950
<v Edward>like, message passing in an async fashion, etc.

00:57:45.950 --> 00:57:50.530
<v Edward>And obviously we've extolled its, I think we've sung its praises.

00:57:51.410 --> 00:57:56.930
<v Edward>And other things that come to mind seem really foundational,

00:57:57.390 --> 00:58:04.410
<v Edward>just like, you know, reference-counted byte buffers with the bytes crate.

00:58:05.090 --> 00:58:08.130
<v Edward>Very foundational. DashMap:

00:58:08.130 --> 00:58:11.230
<v Edward>how do you get a concurrent hash map

00:58:11.230 --> 00:58:14.270
<v Edward>with as little lock contention as

00:58:14.270 --> 00:58:21.230
<v Edward>possible? Something like DashMap, with a bunch of shards. It's great. Oh,

00:58:21.230 --> 00:58:28.350
<v Edward>the other things I've already mentioned as well, which are our, you know, our Cloudflare

00:58:28.350 --> 00:58:34.150
<v Edward>crates that I mentioned before: shellflip when it comes to process restart,

00:58:34.390 --> 00:58:41.050
<v Edward>graceful process restarts, and foundations for various telemetry and observability

00:58:41.050 --> 00:58:45.210
<v Edward>things among other operational service things.

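NOTE
The idea mentioned above, a concurrent hash map split into "a bunch of shards" to reduce lock contention, can be sketched with the standard library alone. This is a simplified illustration of the sharding technique, not DashMap's actual implementation or API: ShardedMap, shard_for and the shard count of 8 are hypothetical choices for the example.
```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;
/// Illustrative sharded map: each shard has its own lock, so writers whose
/// keys hash to different shards do not contend with each other.
pub struct ShardedMap<K, V> {
    shards: Vec<RwLock<HashMap<K, V>>>,
}
impl<K: Hash + Eq, V: Clone> ShardedMap<K, V> {
    /// Create a map with `shard_count` independent shards (must be > 0).
    pub fn new(shard_count: usize) -> Self {
        assert!(shard_count > 0, "need at least one shard");
        let shards = (0..shard_count).map(|_| RwLock::new(HashMap::new())).collect();
        Self { shards }
    }
    /// Pick the shard for a key by hashing it.
    fn shard_for(&self, key: &K) -> &RwLock<HashMap<K, V>> {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        &self.shards[(hasher.finish() as usize) % self.shards.len()]
    }
    /// Insert while holding the write lock of a single shard only.
    pub fn insert(&self, key: K, value: V) -> Option<V> {
        self.shard_for(&key).write().unwrap().insert(key, value)
    }
    /// Read while holding the read lock of a single shard only.
    pub fn get(&self, key: &K) -> Option<V> {
        self.shard_for(key).read().unwrap().get(key).cloned()
    }
}
fn main() {
    let map: ShardedMap<u32, u32> = ShardedMap::new(8);
    // Four scoped threads insert concurrently through a shared reference.
    std::thread::scope(|scope| {
        for worker in 0..4u32 {
            let map = &map;
            scope.spawn(move || {
                map.insert(worker, worker * 10);
            });
        }
    });
    println!("{:?}", map.get(&3));
}
```
A single lock around one big map would serialize every writer; with shards, only keys that land in the same shard wait on each other.
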
00:58:46.520 --> 00:58:53.640
<v Edward>I don't know if we wanted to shout out, like, some community work on top of Pingora too.

00:58:53.980 --> 00:58:54.680
<v Kevin>Oh, yeah.

00:58:55.160 --> 00:59:02.320
<v Edward>Like, originally, I think we had been working with some folks within

00:59:02.320 --> 00:59:13.680
<v Edward>the Prossimo memory safety org on a more batteries-included, actual drop-in NGINX

00:59:13.680 --> 00:59:14.800
<v Edward>replacement called River.

00:59:14.800 --> 00:59:21.760
<v Edward>I do believe that a lot of that work maybe is on pause right now, though.

00:59:22.300 --> 00:59:27.740
<v Edward>But there are a lot of other great community folks who come in,

00:59:27.820 --> 00:59:31.660
<v Edward>report issues, etc., contribute, who are working on, I think,

00:59:32.080 --> 00:59:37.560
<v Edward>there's this Pingap crate as well, which is one of the most significant and popular,

00:59:37.740 --> 00:59:46.600
<v Edward>where they've also dealt with all of our other, more arcane APIs

00:59:46.600 --> 00:59:48.060
<v Edward>around caching and stuff.

00:59:48.300 --> 00:59:52.140
<v Edward>So definitely, that's a tremendous effort.

00:59:52.460 --> 01:00:01.320
<v Edward>And I think we have been so flattered and excited by the community engagement with Pingora.

01:00:01.600 --> 01:00:08.000
<v Edward>It was monumental and humbling.

01:00:08.660 --> 01:00:10.280
<v Matthias>What does River do?

01:00:10.920 --> 01:00:15.360
<v Edward>Both of these projects that I mentioned, River and Pingap,

01:00:15.500 --> 01:00:22.580
<v Edward>are meant to be more batteries-included, NGINX-like actual binary deployments.

01:00:22.580 --> 01:00:26.380
<v Edward>So Pingora is meant to be a library,

01:00:26.380 --> 01:00:35.900
<v Edward>and it can be a bit difficult to work with if all you're trying to do is use

01:00:35.900 --> 01:00:37.740
<v Edward>it as a drop-in for NGINX, right?

01:00:37.800 --> 01:00:44.360
<v Edward>You have to actually implement all of, you know, define the proxy service and

01:00:44.360 --> 01:00:45.720
<v Edward>things like that in code.

01:00:45.720 --> 01:00:49.620
<v Edward>And it is

01:00:49.620 --> 01:00:56.000
<v Edward>not a batteries-included, like, plug-and-play sort of deployment,

01:00:56.000 --> 01:01:02.060
<v Edward>versus something like one of these other projects, where you can, in theory,

01:01:02.060 --> 01:01:07.200
<v Edward>just build it and run it as if it were an NGINX binary.

01:01:08.300 --> 01:01:14.620
<v Edward>So we were really trying to build the foundations of a lot of this, of a proxy framework.

01:01:14.620 --> 01:01:23.880
<v Edward>and allow the community to expand on it, since we haven't yet

01:01:23.880 --> 01:01:30.140
<v Edward>necessarily needed that generalized solution ourselves, with the amount of heavy

01:01:30.140 --> 01:01:32.200
<v Edward>customization and heavy, like,

01:01:33.340 --> 01:01:39.380
<v Edward>you know, surface fiddly bits that we do ourselves in spinning up a Pingora service.

01:01:40.140 --> 01:01:44.080
<v Kevin>Yeah. And as we mentioned, we're only six or seven people. We don't have so

01:01:44.080 --> 01:01:47.840
<v Kevin>much time to add additional features. I mean, we love adding features to Pingora.

01:01:48.500 --> 01:01:51.660
<v Kevin>The River project, when it was envisioned, was supposed to have things like

01:01:51.660 --> 01:01:55.560
<v Kevin>WebAssembly integration. So you can do all these things, but expose them as WebAssembly.

01:01:55.740 --> 01:01:58.080
<v Kevin>That was one of those things that I would love to implement myself.

01:01:58.560 --> 01:02:02.240
<v Kevin>But, you know, there's just, there's enough Cloudflare work to go around,

01:02:02.340 --> 01:02:06.600
<v Kevin>and also it's a significant project to take on.

01:02:06.800 --> 01:02:10.880
<v Kevin>The community has been really good at putting things into Pingora directly, though.

01:02:11.740 --> 01:02:14.140
<v Kevin>Some notable ones that come to mind are the Rustls integration.

01:02:14.420 --> 01:02:16.620
<v Kevin>We internally use OpenSSL.

01:02:17.160 --> 01:02:21.500
<v Kevin>The Rustls integration was a huge undertaking that one person did themselves,

01:02:21.580 --> 01:02:25.180
<v Kevin>and we're very grateful for that. Harold, if you're listening, thank you very much.

01:02:25.520 --> 01:02:34.680
<v Kevin>There's another similar integration for another TLS implementation, I think the AWS s2n-tls.

01:02:35.260 --> 01:02:39.540
<v Kevin>That one is yet to be reviewed and, obviously, is assigned to me.

01:02:40.440 --> 01:02:42.480
<v Kevin>I'm slacking off on my open source job there.

01:02:43.560 --> 01:02:49.120
<v Edward>We really try to, we are really trying to stay on top of open source,

01:02:49.340 --> 01:02:54.700
<v Edward>but there's only sometimes, I wish I just had more.

01:02:55.580 --> 01:02:58.360
<v Edward>I think we all wish we had just more.

01:02:58.360 --> 01:03:00.820
<v Kevin>Open source time. Yeah, the open source stuff is so fun.

01:03:01.660 --> 01:03:05.600
<v Matthias>Yeah, there's never enough time. Speaking of

01:03:05.600 --> 01:03:12.280
<v Matthias>which, we have to conclude as well because we ran out of time. But it was amazing

01:03:12.280 --> 01:03:18.460
<v Matthias>to talk to both of you. Likewise. If you could phrase a statement to the Rust

01:03:18.460 --> 01:03:23.000
<v Matthias>community, anything that you always wanted to share, what would it be? Yeah.

01:03:23.000 --> 01:03:31.180
<v Edward>No, like, there's a bunch of HTTP ecosystem things. There's a great

01:03:31.180 --> 01:03:37.040
<v Edward>maintainer for all of it. It's, like, open source: the http,

01:03:37.680 --> 01:03:44.760
<v Edward>literally the http crate, h2, you know, a lot of those are our core dependencies for Pingora as well.

01:03:45.220 --> 01:03:50.160
<v Edward>And the maintainer, Sean, is incredible at what he does.

01:03:50.800 --> 01:03:52.840
<v Matthias>Yeah, I agree. Shout out to Sean.

01:03:54.110 --> 01:03:57.850
<v Kevin>Yeah, the thing I was going to thank the Rust community for is for being so coherent,

01:03:58.230 --> 01:04:02.770
<v Kevin>especially around HTTP things like the hyper ecosystem, the h2,

01:04:03.110 --> 01:04:09.210
<v Kevin>all of those things are so ubiquitous that it makes integrating with existing projects much easier.

01:04:09.210 --> 01:04:13.990
<v Kevin>Specifically, I was working with a ClickHouse client that is an official ClickHouse

01:04:13.990 --> 01:04:17.810
<v Kevin>client that the ClickHouse team puts out, but I needed to add a new feature

01:04:17.810 --> 01:04:22.130
<v Kevin>for rotating mTLS certificates, which obviously their client does not support.

01:04:22.310 --> 01:04:28.470
<v Kevin>But because they expose access to the hyper HTTP client under the hood,

01:04:28.650 --> 01:04:29.990
<v Kevin>it made it an easy thing to do.

01:04:30.270 --> 01:04:33.930
<v Kevin>It's just such a good experience to come to.

01:04:33.930 --> 01:04:38.230
<v Kevin>Like, if you need a feature, you already have the tools necessary to add functionality

01:04:38.230 --> 01:04:42.490
<v Kevin>to tools that are published by other people in a coherent way. That's something that

01:04:42.490 --> 01:04:45.730
<v Kevin>you don't get in Java. I don't know if you get it in Go; that's

01:04:45.730 --> 01:04:51.790
<v Kevin>not my ecosystem. But as a former and recovering Java programmer, it's very nice. Yeah.

01:04:51.790 --> 01:04:55.850
<v Matthias>That's a very nice closing statement as well. Edward, anything that you want to add?

01:04:56.890 --> 01:05:04.550
<v Edward>I'm glad you had a specific answer because really, I am mainly just thankful

01:05:04.550 --> 01:05:08.370
<v Edward>for, I mean, it is true that the ecosystem,

01:05:08.970 --> 01:05:12.430
<v Edward>though I'm sure there are gaps from time to time, generally,

01:05:12.450 --> 01:05:17.110
<v Edward>if you are looking for a particular pattern or thing,

01:05:17.370 --> 01:05:21.610
<v Edward>you will either find out that it is hard to do so, or that someone else has

01:05:21.610 --> 01:05:26.390
<v Edward>already tried, at least to some extent, to do it and has a working, very much, like,

01:05:26.390 --> 01:05:30.690
<v Edward>you know, if not production-ready, nearly production-ready implementation of it.

01:05:31.450 --> 01:05:34.370
<v Edward>So the Rust ecosystem in general has

01:05:34.370 --> 01:05:37.870
<v Edward>just been, kind of, the amount of excitement

01:05:37.870 --> 01:05:40.670
<v Edward>that folks have within the

01:05:40.670 --> 01:05:44.270
<v Edward>community is a great sign

01:05:44.270 --> 01:05:47.610
<v Edward>of promise. And, I mean, obviously, I

01:05:47.610 --> 01:05:50.370
<v Edward>think Rust has already eaten up a lot

01:05:50.370 --> 01:05:53.350
<v Edward>of the internet, if we are

01:05:53.350 --> 01:05:59.310
<v Edward>anything to go by, if we are a good example. But no, we're just, once again, just

01:05:59.310 --> 01:06:06.170
<v Edward>so thankful that people are interested in what we do and are patient with us

01:06:06.170 --> 01:06:11.830
<v Edward>and are, you know, great contributors. So.

01:06:13.330 --> 01:06:17.350
<v Matthias>Kevin and Edward, thanks so much for taking time for the interview today.

01:06:17.990 --> 01:06:19.630
<v Kevin>Thanks, Matthias. We appreciate it.

01:06:19.630 --> 01:06:21.630
<v Edward>Thank you, yes. Yeah.

01:06:21.630 --> 01:06:24.690
<v Kevin>I mean, thanks for putting on this podcast. Yes, I cannot believe it took five

01:06:24.690 --> 01:06:25.730
<v Kevin>seasons for me to catch on.

01:06:26.550 --> 01:06:31.230
<v Matthias>It's never too late. Rust in Production is a podcast by corrode.

01:06:31.430 --> 01:06:35.510
<v Matthias>It is hosted by me, Matthias Endler, and produced by Simon Brüggen.

01:06:35.690 --> 01:06:39.970
<v Matthias>For show notes, transcripts, and to learn more about how we can help your company

01:06:39.970 --> 01:06:42.850
<v Matthias>make the most of Rust, visit corrode.dev.

01:06:43.090 --> 01:06:45.450
<v Matthias>Thanks for listening to Rust in Production.