WEBVTT

00:00:00.010 --> 00:00:05.250
<v Matthias>Hello and welcome to Season 6 of Rust in Production, a podcast about companies

00:00:05.250 --> 00:00:07.850
<v Matthias>who use Rust to shape the future of infrastructure.

00:00:08.170 --> 00:00:13.130
<v Matthias>My name is Matthias Endler from corrode, and today I chat with Cian Butler from

00:00:13.130 --> 00:00:16.790
<v Matthias>Cloudsmith about oxidizing Python backends with Rust.

00:00:18.970 --> 00:00:22.070
<v Matthias>Cian, thanks so much for taking the time for the interview today.

00:00:22.330 --> 00:00:23.950
<v Matthias>Can you say a few words about yourself?

00:00:24.690 --> 00:00:29.210
<v Cian>Yep. I'm a performance engineer and SRE at Cloudsmith.

00:00:30.130 --> 00:00:35.830
<v Cian>I've been doing Rust in some form or another for the last 10 years,

00:00:36.290 --> 00:00:42.310
<v Cian>mostly as side projects, but I have been doing it professionally for nearly three years now.

00:00:42.590 --> 00:00:48.550
<v Cian>Working at Cloudsmith, trying to build on the Edge team, where we work on our

00:00:48.550 --> 00:00:51.350
<v Cian>CDN and all that fun networking stuff.

00:00:51.650 --> 00:00:56.870
<v Cian>Cloudsmith, we're a package management company. So we do package management as a SaaS.

00:00:57.190 --> 00:01:03.110
<v Cian>We support like 36 different formats of packages for Node, Cargo,

00:01:03.830 --> 00:01:05.790
<v Cian>Python, all the big ones.

00:01:06.330 --> 00:01:10.590
<v Cian>We do public repositories, private repositories and open source repositories.

00:01:11.750 --> 00:01:15.570
<v Cian>We're growing pretty fast. We've got some big customers that I don't know who

00:01:15.570 --> 00:01:17.710
<v Cian>I can mention. So I won't mention anyone just in case.

00:01:18.170 --> 00:01:25.510
<v Cian>Because of that, we process about 110 million API requests daily.

00:01:26.580 --> 00:01:31.640
<v Cian>That equates to petabytes of packages downloaded every day.

00:01:32.080 --> 00:01:35.280
<v Cian>A lot of that is done in Python right now.

00:01:35.640 --> 00:01:41.880
<v Cian>We have a very old Django monolith that we've had since day one, which is 10 years ago.

00:01:42.420 --> 00:01:48.580
<v Cian>It's grown. And as we attempt to scale it, we needed to find new ways to scale it.

00:01:48.680 --> 00:01:54.200
<v Cian>So we started looking at Rust as a way of making it faster and more efficient.

00:01:54.200 --> 00:02:03.320
<v Matthias>Great. That means the monolith is exactly as old as your Rust experience is long.

00:02:03.500 --> 00:02:07.520
<v Matthias>So it's 10 years for the monolith and 10 years of Rust for you.

00:02:08.020 --> 00:02:13.280
<v Cian>Yeah, I hadn't even thought about it, but yeah, it's a nice little commonality there.

00:02:13.480 --> 00:02:19.040
<v Matthias>And I could imagine you want to use Cloudsmith in a situation where you have

00:02:19.040 --> 00:02:22.340
<v Matthias>an organization that manages a bunch of packages,

00:02:22.920 --> 00:02:29.080
<v Matthias>maybe a bunch of packages in different ecosystems, and you want to have a hosted

00:02:29.080 --> 00:02:33.600
<v Matthias>version of that that is secure and safe, like we're talking about supply chain

00:02:33.600 --> 00:02:37.200
<v Matthias>security, or are there any other reasons for using Cloudsmith?

00:02:38.230 --> 00:02:43.790
<v Cian>Oh, 100%. Supply chain security is one of those things we're very big on, very focused on.

00:02:44.270 --> 00:02:49.870
<v Cian>It's not just security, though. You could run multiple

00:02:49.870 --> 00:02:52.850
<v Cian>different formats of packages or just one format.

00:02:53.370 --> 00:02:57.010
<v Cian>You'd use us to be a proxy to your upstreams. So you could say,

00:02:57.230 --> 00:03:01.450
<v Cian>pull all your packages through Cloudsmith, and that gets you better caching

00:03:01.450 --> 00:03:06.010
<v Cian>on them because you get access to our CDN, and then you can apply a security posture on it.

00:03:06.010 --> 00:03:11.790
<v Cian>So don't download any packages that have these vulnerabilities or CVEs published

00:03:11.790 --> 00:03:15.930
<v Cian>on them, which are decision engines for that kind of tooling.

00:03:16.470 --> 00:03:19.930
<v Cian>But as well, you might just publish your own packages for internal use.

00:03:20.050 --> 00:03:26.570
<v Cian>So if you are a big company that's building lots of packages that you use internally

00:03:26.570 --> 00:03:31.430
<v Cian>for other services, so you could be having, let's say, a logging library with your custom logs.

00:03:31.430 --> 00:03:38.030
<v Cian>You push it up there, it gets pulled in by all your microservices or CLIs and they can get built.

00:03:38.430 --> 00:03:42.690
<v Cian>That's much more the traditional way of like people have private packages they

00:03:42.690 --> 00:03:47.790
<v Cian>don't want to put on the internet and they don't want to have the insane tooling

00:03:47.790 --> 00:03:49.430
<v Cian>of putting all their packages in one repo.

00:03:49.670 --> 00:03:51.970
<v Cian>So they have private repositories.

00:03:52.390 --> 00:03:55.630
<v Cian>There's a lot more focus now in the industry on supply chain security.

00:03:55.890 --> 00:03:59.310
<v Cian>So that's where you see a lot of our development happening right now in like

00:03:59.310 --> 00:04:02.130
<v Cian>securing different supply chains.

00:04:02.650 --> 00:04:06.190
<v Cian>I won't say I'm an expert on that side of it. We have people who are a lot smarter

00:04:06.190 --> 00:04:07.790
<v Cian>about that, who focus on that.

00:04:08.010 --> 00:04:14.390
<v Cian>I mostly focus on the low-level networking stuff and data processing side of it all.

00:04:15.810 --> 00:04:19.970
<v Matthias>I realize that you might not have been around, but can you maybe,

00:04:20.350 --> 00:04:27.670
<v Matthias>from conversations with other employees, remember why Python was chosen to start

00:04:27.670 --> 00:04:28.850
<v Matthias>a project in the first place?

00:04:30.040 --> 00:04:37.020
<v Cian>I think it's a comfort situation. We had two founders who started it.

00:04:39.640 --> 00:04:45.340
<v Cian>Their story is not one I will be the best person to repeat, so I won't repeat it.

00:04:45.460 --> 00:04:48.420
<v Cian>If you want to look it up, we've definitely done some posts on it.

00:04:48.640 --> 00:04:51.480
<v Cian>Our CTO likes to talk about his history.

00:04:52.660 --> 00:04:56.960
<v Cian>Lee Skillen, if you want to look up anything from his blog and his LinkedIn.

00:04:57.120 --> 00:05:03.060
<v Cian>But the reason we chose Python is that it's just familiarity.

00:05:03.580 --> 00:05:13.700
<v Cian>Like, it's a really good language. Like, it's powerful in that you can write so much code so easily.

00:05:14.520 --> 00:05:19.660
<v Cian>It's very business friendly. It's not overly verbose.

00:05:19.760 --> 00:05:23.700
<v Cian>So it leads to rapid prototyping very quickly.

00:05:24.520 --> 00:05:29.800
<v Cian>Same is true with Django. Django makes it so easy to spin up a web server and

00:05:29.800 --> 00:05:34.780
<v Cian>hook it up to a database and start playing around and getting your proof of

00:05:34.780 --> 00:05:37.740
<v Cian>concepts ready, getting your POC built.

00:05:38.560 --> 00:05:41.980
<v Cian>I think that, no, I don't think, I know.

00:05:42.260 --> 00:05:46.620
<v Cian>I know that we wouldn't have scaled as fast as we did without Django and Python

00:05:46.620 --> 00:05:50.580
<v Cian>because we wouldn't be able to roll features out as quickly as we did.

00:05:50.880 --> 00:05:55.160
<v Cian>They definitely helped us scale the company and get to where we are today.

00:05:55.960 --> 00:06:02.260
<v Cian>Saying that, after 10 years of code being written, I think I said there's like

00:06:02.260 --> 00:06:08.820
<v Cian>200,000 lines of code in our thing, somewhere over 20,000 files in our monolith.

00:06:09.500 --> 00:06:12.700
<v Cian>That's a lot of code. It's a lot of

00:06:13.630 --> 00:06:17.230
<v Cian>code that not everyone understands, and we're

00:06:17.230 --> 00:06:20.070
<v Cian>constantly going back, reading it,

00:06:20.070 --> 00:06:24.070
<v Cian>trying to figure out how it works. Even

00:06:24.070 --> 00:06:26.750
<v Cian>this morning I was trying to read a set of

00:06:26.750 --> 00:06:29.650
<v Cian>mixins, trying to figure out what path a

00:06:29.650 --> 00:06:32.970
<v Cian>request goes through, as we have multiple different layers

00:06:32.970 --> 00:06:40.710
<v Cian>of Python code to process it. It adds up over time. So what made it really

00:06:40.710 --> 00:06:46.410
<v Cian>good for scaling on day one has kind of caught up with us and made it really difficult

00:06:46.410 --> 00:06:49.710
<v Cian>to understand and handle now.

00:06:50.910 --> 00:06:53.050
<v Cian>So double-edged sword of Python there.

00:06:54.010 --> 00:06:58.650
<v Matthias>Yeah, and a lot of people might say, let's just remove everything,

00:06:58.870 --> 00:07:01.070
<v Matthias>start from scratch, rewrite it in Rust.

00:07:01.490 --> 00:07:06.990
<v Matthias>But what people forget is that those 200,000 lines of code, they contain a lot

00:07:06.990 --> 00:07:09.070
<v Matthias>of business logic and a lot of value.

00:07:09.170 --> 00:07:12.330
<v Matthias>I'm assuming, and please correct me if I'm wrong here,

00:07:12.710 --> 00:07:19.170
<v Matthias>that a lot of the logic is also about handling different package manager formats,

00:07:19.170 --> 00:07:24.810
<v Matthias>file formats, lots of parsing, lots of error handling, and so on. Can you talk a

00:07:24.810 --> 00:07:30.030
<v Matthias>little bit about what's in there? What's the bread and butter for you to make

00:07:30.030 --> 00:07:31.210
<v Matthias>that infrastructure work?

00:07:31.210 --> 00:07:39.610
<v Cian>Yeah, yeah, it's all that kind of stuff. As I said, each package format is its own distinct

00:07:41.010 --> 00:07:43.690
<v Cian>concept. They all have a lot of

00:07:43.690 --> 00:07:46.690
<v Cian>similarities under the hood. The data types are all very

00:07:46.690 --> 00:07:50.010
<v Cian>similar in our infrastructure, but each

00:07:50.010 --> 00:07:53.330
<v Cian>one has its own idiosyncratic ways of

00:07:53.330 --> 00:08:00.050
<v Cian>being handled, and its own request flows. The flow for uploading

00:08:00.050 --> 00:08:05.950
<v Cian>a package, under the hood, is that we take a binary and we store it somewhere. But

00:08:05.950 --> 00:08:11.510
<v Cian>the handshake you do with that and the metadata you store differ in each package format,

00:08:11.870 --> 00:08:18.070
<v Cian>which means that you could go into our code base and go into the slash packages

00:08:18.070 --> 00:08:23.250
<v Cian>folder and then you'll just see 36 different code bases in there that are similar.

00:08:23.610 --> 00:08:29.070
<v Cian>They have shared bits of code for logging and for metadata processing and tracking

00:08:29.070 --> 00:08:34.070
<v Cian>of events used internally and all that kind of business logic that's shared,

00:08:34.070 --> 00:08:40.210
<v Cian>but each format is different and their code paths are different. So

00:08:40.210 --> 00:08:47.010
<v Cian>we could, like, sit down and very quickly scaffold out

00:08:47.010 --> 00:08:52.210
<v Cian>a brand new service in Go or Rust that hits those same things.

00:08:52.690 --> 00:08:56.710
<v Cian>But we then have the weird edge case of, like,

00:08:57.330 --> 00:09:08.230
<v Cian>how does that interact with our SBOM generation?

00:09:09.850 --> 00:09:15.410
<v Cian>And then we need to store that in a way that can be queried

00:09:15.410 --> 00:09:21.690
<v Cian>by our API to be displayed in our UI. And we also need to track all that

00:09:21.690 --> 00:09:26.190
<v Cian>data, all those bytes. You care about how many bytes are being downloaded. We need

00:09:26.190 --> 00:09:29.090
<v Cian>to ensure that all that data is being tracked correctly.

00:09:29.850 --> 00:09:33.030
<v Cian>We're still in that scale-up

00:09:33.030 --> 00:09:35.950
<v Cian>phase of startup life, so we're hiring, we're bringing on

00:09:35.950 --> 00:09:40.650
<v Cian>new engineers, but we're still a small enough team.

00:09:40.650 --> 00:09:44.130
<v Cian>When they brought

00:09:44.130 --> 00:09:46.810
<v Cian>in me, Lee, our CTO, made the joke that one day he's going

00:09:46.810 --> 00:09:50.870
<v Cian>to wake up and everything's going to be Rust after hiring me. And we all

00:09:50.870 --> 00:09:54.010
<v Cian>laugh and it's funny, but we know it's not really going to happen. We're going

00:09:54.010 --> 00:09:58.410
<v Cian>to have some core bits that are Rust, but there's still going to be that

00:09:58.410 --> 00:10:04.190
<v Cian>core Python code that's not changing, because everyone in our shop knows Python.

00:10:04.810 --> 00:10:06.670
<v Cian>We have a couple of people who know Go.

00:10:06.890 --> 00:10:09.430
<v Cian>We have me, who knows Rust.

00:10:10.010 --> 00:10:14.010
<v Cian>We have some people willing to learn and who have tried Rust and Go at different

00:10:14.010 --> 00:10:15.410
<v Cian>times, but they're not like,

00:10:16.900 --> 00:10:21.420
<v Cian>ready to jump in on a project and start developing today or tomorrow.

00:10:21.420 --> 00:10:24.980
<v Matthias>Right. But also, even if you were, let's say,

00:10:24.980 --> 00:10:28.000
<v Matthias>an expert in Go, it would be harder to

00:10:28.000 --> 00:10:31.640
<v Matthias>integrate Go into the project, because Go

00:10:31.640 --> 00:10:38.860
<v Matthias>has its own runtime and a garbage collector. You could do so across

00:10:38.860 --> 00:10:44.780
<v Matthias>a network boundary, but not necessarily by integrating it into the existing project

00:10:44.780 --> 00:10:48.140
<v Matthias>as you could do with, for example, PyO3.

00:10:48.140 --> 00:10:53.800
<v Cian>100%. And we have actually experimented with Go, and that's where it ended up.

00:10:53.800 --> 00:11:00.580
<v Cian>We've moved logic for doing specific things out into a Go microservice previously.

00:11:00.580 --> 00:11:06.180
<v Cian>Nothing core to the business. It was specifically for supporting one format

00:11:06.180 --> 00:11:07.600
<v Cian>and for scaling that format.

00:11:09.140 --> 00:11:14.460
<v Cian>And yeah, it's nice, it works, it's there, and it's solid,

00:11:14.460 --> 00:11:19.720
<v Cian>but it is a separate microservice, and it goes against that belief we have that

00:11:19.720 --> 00:11:22.880
<v Cian>everything should be in the monolith. This is one of those core tenets we have:

00:11:22.880 --> 00:11:25.580
<v Cian>that we should scale our monolith, we should

00:11:26.510 --> 00:11:30.390
<v Cian>focus on making sure code is in the monolith.

00:11:30.390 --> 00:11:38.230
<v Matthias>Interesting point, because at least in the last decade or so, monoliths were sort of frowned

00:11:38.230 --> 00:11:42.850
<v Matthias>upon, weirdly enough, and now it feels like the industry is circling back on that

00:11:42.850 --> 00:11:47.790
<v Matthias>idea. Can you maybe explain from your perspective what's so great about having a monolith?

00:11:47.790 --> 00:11:50.950
<v Cian>Yeah, yeah. I think I

00:11:50.950 --> 00:11:54.590
<v Cian>came into the industry when we were just heading

00:11:54.590 --> 00:11:57.610
<v Cian>for that peak of microservices, or on our way up to it, and

00:11:57.610 --> 00:12:00.610
<v Cian>I've never liked them. I'm a big hater

00:12:00.610 --> 00:12:03.570
<v Cian>of them. I've always hated them. Now, maybe I got cut very

00:12:03.570 --> 00:12:07.190
<v Cian>early on and I've never recovered from it, but I

00:12:07.190 --> 00:12:11.010
<v Cian>think the thing about scaling microservices is

00:12:11.010 --> 00:12:14.170
<v Cian>it seems like it's a really easy thing

00:12:14.170 --> 00:12:17.550
<v Cian>to do. You can just throw a little service at it and everything

00:12:17.550 --> 00:12:20.510
<v Cian>works. It's like, I

00:12:20.510 --> 00:12:23.390
<v Cian>just call that service and it gives me a response, and that's

00:12:23.390 --> 00:12:26.810
<v Cian>great. And when you're running one box

00:12:26.810 --> 00:12:30.330
<v Cian>talking to another box, that does scale pretty nicely. And when

00:12:30.330 --> 00:12:33.210
<v Cian>you have a small bit of traffic, that scales

00:12:33.210 --> 00:12:36.230
<v Cian>really nicely, because you have a small bit of traffic. But in the real world,

00:12:36.230 --> 00:12:42.490
<v Cian>it's never that simple. You deploy 10 replicas of your microservice,

00:12:42.490 --> 00:12:48.110
<v Cian>and you have 20 replicas of your other

00:12:48.110 --> 00:12:53.310
<v Cian>service, let's say. You need to ensure that you're properly load balancing across those 10 replicas.

00:12:53.650 --> 00:12:59.230
<v Cian>You need to account for the network delay in your one service as it waits on the other.

00:12:59.490 --> 00:13:08.650
<v Cian>You start running into issues about managing connection pools and blocking IO resources.

00:13:09.830 --> 00:13:14.990
<v Cian>Like this is one of those things that we actually ran into a lot in our monolith.

00:13:15.530 --> 00:13:18.130
<v Cian>The way Python blocks

00:13:19.210 --> 00:13:25.890
<v Cian>can be quite problematic, because it doesn't just go to sleep and poll,

00:13:26.090 --> 00:13:27.810
<v Cian>it could just sit there and wait.

00:13:27.970 --> 00:13:31.170
<v Cian>And then you just have resources that are blocked waiting for that.

00:13:31.350 --> 00:13:36.450
<v Cian>You need to know how to sleep and pick up more work in the background

00:13:36.450 --> 00:13:38.810
<v Cian>while you wait on resources to fill up.
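
NOTE
Editor's sketch, not part of the conversation: one common way to "pick up more work while you wait" in Python is asyncio.
The speaker does not name a mechanism, so asyncio, the fetch_upstream helper, and the delays below are assumptions for illustration only.
asyncio.sleep stands in for a real network or database call.
```python
import asyncio
async def fetch_upstream(name: str, delay: float) -> str:
    # Hypothetical stand-in for a network call: awaiting yields control to the
    # event loop, so other tasks can run instead of the thread sitting blocked.
    await asyncio.sleep(delay)
    return f"{name}: done"
async def main() -> list:
    # gather() runs the three waits concurrently and keeps results in call order.
    return await asyncio.gather(
        fetch_upstream("index", 0.2),
        fetch_upstream("metadata", 0.2),
        fetch_upstream("package", 0.2),
    )
results = asyncio.run(main())
print(results)
```
With blocking calls the three waits would run one after another. Here they overlap on a single thread.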

00:13:39.070 --> 00:13:42.510
<v Cian>But if you never have to call across that network boundary,

00:13:42.950 --> 00:13:49.470
<v Cian>if you have all your logic in a monolith, you can avoid the overhead

00:13:49.470 --> 00:13:56.590
<v Cian>of a network. You have a much simpler cognitive design that you can account for.

00:13:56.590 --> 00:14:01.890
<v Matthias>Right. I fully agree, and also refactoring across microservices is never fun.

00:14:01.890 --> 00:14:04.910
<v Cian>No, no. And yeah,

00:14:04.910 --> 00:14:07.790
<v Cian>this is a problem we're running into. I

00:14:07.790 --> 00:14:11.030
<v Cian>keep saying this is a problem we're running into. We don't

00:14:11.030 --> 00:14:14.770
<v Cian>have microservices, but we do have a CDN,

00:14:14.770 --> 00:14:17.690
<v Cian>and how we roll code out

00:14:17.690 --> 00:14:20.370
<v Cian>to that CDN versus how it interacts with the

00:14:20.370 --> 00:14:25.550
<v Cian>monolith is a core part of what we do on the Edge team. You need to really

00:14:25.550 --> 00:14:30.930
<v Cian>ensure that you have that two-factor step of, like, we add a feature in the monolith,

00:14:30.930 --> 00:14:35.170
<v Cian>in the CDN, so it can start using it, and then you enable the feature

00:14:35.170 --> 00:14:38.210
<v Cian>in the monolith, and then we can remove the old legacy.

00:14:38.430 --> 00:14:45.770
<v Cian>So you have that three-step deploy phase, and it's such

00:14:46.450 --> 00:14:51.210
<v Cian>a hassle, is the only way to say it, remembering that. And if you

00:14:51.210 --> 00:14:55.290
<v Cian>don't do that, you end up with all these dead code paths, which we have. We have

00:14:55.290 --> 00:15:00.410
<v Cian>hundreds of lines of dead code paths in our edge, because we just didn't go back and clean it up.

00:15:00.410 --> 00:15:03.930
<v Matthias>Yeah, it's such a bespoke process

00:15:03.930 --> 00:15:07.290
<v Matthias>to make releases across microservices. All

00:15:07.290 --> 00:15:10.290
<v Matthias>the ceremony: adding a

00:15:10.290 --> 00:15:13.570
<v Matthias>feature, but also putting it behind a feature flag, making

00:15:13.570 --> 00:15:16.730
<v Matthias>sure that the other one is bumped up

00:15:16.730 --> 00:15:19.790
<v Matthias>to the correct version, and then

00:15:19.790 --> 00:15:27.450
<v Matthias>slowly migrating over. Whereas if you have a monolith, you can just make all

00:15:27.450 --> 00:15:30.750
<v Matthias>of those changes in one pull request and then review all of those changes, and

00:15:30.750 --> 00:15:39.190
<v Matthias>your debugger still works and your linter still works, and all of those niceties.

00:15:39.190 --> 00:15:43.050
<v Cian>Yep. And I think it's the debugger that still works.

00:15:43.050 --> 00:15:45.110
<v Cian>I think it's one of the nicest ones as well.

00:15:45.630 --> 00:15:52.890
<v Cian>I'm not a big debugger fan myself, but I know that a lot of people in our industry love debuggers.

00:15:53.070 --> 00:15:57.730
<v Cian>And it's the fact that you don't have to pull out something like Jaeger or Datadog

00:15:57.730 --> 00:16:01.030
<v Cian>to do that debugging because you're calling across different services.

00:16:01.790 --> 00:16:07.130
<v Cian>Like, tracing is great. I love tracing tooling, like all the OpenTelemetry kind of stuff.

00:16:07.330 --> 00:16:13.010
<v Cian>It's great. But when you need to run a dedicated OpenTelemetry stack for debugging

00:16:13.010 --> 00:16:17.470
<v Cian>one simple request, that's a lot of overkill on my laptop.

00:16:17.770 --> 00:16:21.110
<v Cian>And like, I have a nice laptop, but I don't know that I need to be running a data

00:16:21.110 --> 00:16:23.390
<v Cian>center on my laptop just to do a little bit of debugging.

00:16:23.930 --> 00:16:28.410
<v Matthias>Coming back to Rust, because that's kind of what I want to talk about.

00:16:28.910 --> 00:16:36.210
<v Matthias>It's nicer in Rust. Yes, you can integrate Rust with PyO3, but I'm not sure

00:16:36.210 --> 00:16:38.870
<v Matthias>how that process went for you.

00:16:39.010 --> 00:16:45.210
<v Matthias>Did you even use PyO3 for that work, or did you decide on doing it a different way?

00:16:47.160 --> 00:16:54.160
<v Cian>So let's step back a little. I came into Cloudsmith last year as a performance engineer.

00:16:55.360 --> 00:17:00.700
<v Cian>We decided as a company, we wanted to focus on building and scaling.

00:17:02.260 --> 00:17:06.300
<v Cian>And it was known that I was a Rust developer coming from a Rust shop.

00:17:06.580 --> 00:17:12.200
<v Cian>So there was a known value that I was probably going to write some Rust at some

00:17:12.200 --> 00:17:19.280
<v Cian>point. But we didn't sit down and say, how can we bring Rust in to scale this service?

00:17:19.560 --> 00:17:22.440
<v Cian>I sat down and I just started looking at those traces.

00:17:22.960 --> 00:17:28.720
<v Cian>Started looking at Datadog, started looking at where the bottlenecks in our

00:17:28.720 --> 00:17:30.980
<v Cian>service were. We had load tests running.

00:17:31.220 --> 00:17:38.100
<v Cian>We were getting information back about what was slow, what were our slowest

00:17:38.100 --> 00:17:39.260
<v Cian>endpoints, all that kind of stuff.

00:17:40.460 --> 00:17:43.340
<v Cian>So the things that came out when you

00:17:43.340 --> 00:17:46.460
<v Cian>look at that data were that we would sit waiting

00:17:46.460 --> 00:17:53.160
<v Cian>on IO, and there would be serialization. These were two of our biggest things.

00:17:53.160 --> 00:17:59.640
<v Cian>The IO was two different types of IO. Our database: we query the database

00:17:59.640 --> 00:18:03.420
<v Cian>a lot, probably too much, but we do, and that

00:18:04.200 --> 00:18:07.260
<v Cian>eats up a lot of resources. The other side is the network.

00:18:07.480 --> 00:18:13.820
<v Cian>So we call out to upstreams like PyPI and Cargo to pull in information.

00:18:14.200 --> 00:18:21.300
<v Cian>And then we have the inbound requests. So that's requests from our customers to us.

00:18:22.520 --> 00:18:28.380
<v Cian>So how many requests per second can we pull in from the

00:18:28.380 --> 00:18:30.380
<v Cian>network and process concurrently?

00:18:31.200 --> 00:18:37.500
<v Cian>The other bits being serialization, that's serializing large JSON payloads,

00:18:37.800 --> 00:18:39.680
<v Cian>large XML payloads, and that kind of stuff.

00:18:40.020 --> 00:18:44.060
<v Cian>So we sat down and said, how can we go about fixing this?

00:18:44.380 --> 00:18:49.040
<v Cian>And it wasn't a one shot of like, we need to fix it all at once,

00:18:49.240 --> 00:18:53.080
<v Cian>or we need to roll everything out, switch everything up at once,

00:18:53.120 --> 00:18:55.520
<v Cian>or let's build it ourselves.

00:18:56.160 --> 00:19:02.200
<v Cian>We try not to be a shop that suffers from the "not built here" kind of thing. We like

00:19:02.200 --> 00:19:04.620
<v Cian>to use open source software where

00:19:05.040 --> 00:19:09.680
<v Cian>possible, or use SaaSes where possible, because there's only so many people we

00:19:09.680 --> 00:19:17.260
<v Cian>have. So I started googling, because I knew a solution to the JSON serialization already.

00:19:18.880 --> 00:19:27.100
<v Cian>Two jobs ago, back when I worked in video games, we had

00:19:27.100 --> 00:19:34.640
<v Cian>a very large logging pipeline where we would serialize everything to JSON across the whole fleet.

00:19:35.840 --> 00:19:39.440
<v Cian>And so we were also a Python shop,

00:19:39.660 --> 00:19:45.180
<v Cian>and I was working on the metrics team, and we rolled out a logging change that

00:19:45.180 --> 00:19:51.340
<v Cian>switched how we serialize JSON in all of our microservices with a Rust library called orjson.

00:19:51.780 --> 00:19:52.300
<v Matthias>Oh, yeah.

00:19:52.700 --> 00:19:56.600
<v Cian>It's a great library. Well, it's a Rust library and a Python library.

00:19:56.980 --> 00:20:00.460
<v Cian>It's written in Rust, and it's

00:20:00.460 --> 00:20:04.260
<v Cian>got nice Python bindings that look similar

00:20:04.260 --> 00:20:07.140
<v Cian>enough to the normal ones, the normal

00:20:07.140 --> 00:20:10.060
<v Cian>JSON Python bindings. So I knew

00:20:10.060 --> 00:20:14.140
<v Cian>from then that the speed-up varies somewhere

00:20:14.140 --> 00:20:17.580
<v Cian>between 7 and 10x, depending

00:20:17.580 --> 00:20:20.400
<v Cian>on what you're doing and what it looks

00:20:20.400 --> 00:20:29.720
<v Cian>like. And I know that when we did the change in that company, I saw about a one

00:20:29.720 --> 00:20:36.420
<v Cian>to two percent change in CPU usage across our data center over a couple of weeks.

00:20:36.420 --> 00:20:40.260
<v Cian>It takes time for changes to go out, but we definitely saw improvements.

00:20:41.020 --> 00:20:46.120
<v Cian>And at that scale it was really important, because

00:20:46.780 --> 00:20:51.760
<v Cian>those small gains really add up over time. So I reached for that library

00:20:51.760 --> 00:20:58.100
<v Cian>because I'd had such success with it before. And when we went to reach for it, it

00:20:58.100 --> 00:21:01.980
<v Cian>turns out Django already has a wrapper. It was even easier than that.

00:21:02.200 --> 00:21:11.020
<v Cian>So we installed the Django orjson serialization library, and it swapped out our...

00:21:12.710 --> 00:21:17.910
<v Cian>JSON serialization, which is just the normal Python JSON serialization with a Rust-based one.

00:21:19.650 --> 00:21:25.790
<v Cian>We then had to go through all the code base and find every place we imported

00:21:25.790 --> 00:21:27.750
<v Cian>JSON and replace it with orjson.
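
NOTE
Editor's sketch, not part of the conversation: what swapping the standard json module for orjson can look like at one call site.
The helper names dumps_text and loads_text and the sample payload are hypothetical. The orjson facts relied on are that orjson.dumps returns bytes rather than str and emits compact output with no spaces after separators.
The fallback branch only exists so the sketch still runs where orjson is not installed.
```python
import json
try:
    import orjson  # third-party package, assumed installed with `pip install orjson`
except ImportError:
    orjson = None  # fall back to the standard library so the sketch still runs
def dumps_text(obj) -> str:
    """Hypothetical helper: serialize obj to a compact JSON string."""
    if orjson is not None:
        # orjson.dumps returns bytes, unlike json.dumps, which returns str.
        return orjson.dumps(obj).decode("utf-8")
    # Compact separators mirror orjson's whitespace-free output.
    return json.dumps(obj, separators=(",", ":"), ensure_ascii=False)
def loads_text(data):
    """Hypothetical helper: parse JSON from str or bytes."""
    if orjson is not None:
        return orjson.loads(data)
    return json.loads(data)
payload = {"name": "demo-package", "version": "1.0.0", "size": 2048}
encoded = dumps_text(payload)
print(encoded)
```
Call sites that expect a str from json.dumps, or that rely on its default spacing, are the places that need checking file by file during such a migration.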

00:21:28.810 --> 00:21:34.970
<v Cian>And then we did each of these in incremental steps. We didn't just flip the switch.

00:21:38.330 --> 00:21:41.250
<v Cian>We flipped the switch on

00:21:41.250 --> 00:21:46.930
<v Cian>updating all the JSON files one at a time, until I think one day I just got very

00:21:46.930 --> 00:21:52.510
<v Cian>bored, sat down on a train ride, and just banged through every single one. I just

00:21:52.510 --> 00:21:56.130
<v Cian>grabbed everywhere we imported the JSON library and just iterated through those

00:21:56.130 --> 00:21:58.390
<v Cian>files, making sure they were all correct.

00:21:58.390 --> 00:22:00.430
<v Matthias>Nice. It's always the train rides, right?

00:22:00.430 --> 00:22:01.570
<v Cian>Yeah, it's always.

00:22:01.570 --> 00:22:03.870
<v Matthias>That's when we get the work done.

00:22:03.870 --> 00:22:04.490
<v Cian>Yeah.

00:22:04.490 --> 00:22:08.690
<v Matthias>But also, what I find particularly interesting about that story is,

00:22:10.400 --> 00:22:15.420
<v Matthias>if you didn't know it was written in Rust, you might not even have cared about

00:22:15.420 --> 00:22:20.480
<v Matthias>it because it was yet another Python package that you just integrate into your workflow.

00:22:20.860 --> 00:22:26.480
<v Matthias>And it was a drop-in replacement. But I wonder how many organizations out there

00:22:26.480 --> 00:22:31.040
<v Matthias>run Rust without even knowing it this way because orjson happens to be written in Rust.

00:22:31.680 --> 00:22:33.360
<v Cian>I think there's probably so

00:22:33.360 --> 00:22:38.140
<v Cian>many places. Like, if you asked my previous employer if they're on Rust,

00:22:38.240 --> 00:22:41.080
<v Cian>they would say, "Nope, we have no Rust," and I know for a fact there's Rust

00:22:41.080 --> 00:22:43.920
<v Cian>in every service, because I put

00:22:43.920 --> 00:22:47.360
<v Cian>it there through that Python library. And I

00:22:47.360 --> 00:22:50.440
<v Cian>think that's actually a nice thing. It's

00:22:50.440 --> 00:22:53.900
<v Cian>also true of the cryptography library in Python. It's

00:22:53.900 --> 00:22:56.880
<v Cian>Rust-based now.

00:22:56.880 --> 00:22:59.700
<v Cian>And there's Rust in Linux now.

00:22:59.700 --> 00:23:02.680
<v Cian>Like, Rust is everywhere. It's getting rolled out

00:23:02.680 --> 00:23:05.680
<v Cian>everywhere. But it's those

00:23:05.680 --> 00:23:13.320
<v Cian>nice places like orjson where someone has sat down and said, how can I make this

00:23:13.320 --> 00:23:15.800
<v Cian>faster without breaking the API,

00:23:15.800 --> 00:23:22.920
<v Cian>or in such a way that it doesn't take a massive lift to switch it out?

00:23:22.920 --> 00:23:26.080
<v Matthias>Yes, but also, as some sort of counter argument

00:23:26.080 --> 00:23:32.180
<v Matthias>to that, someone might listen to it and think, well, JSON is sort of a nice, easy

00:23:32.180 --> 00:23:37.260
<v Matthias>interface to integrate with, because there's a nice API surface. But how

00:23:37.260 --> 00:23:41.660
<v Matthias>often does that happen in practice, that you can just use a drop-in replacement?

00:23:41.660 --> 00:23:43.280
<v Matthias>What would you say to that?

00:23:44.070 --> 00:23:49.510
<v Cian>Yeah, not as often as I would like. It's totally not as often as I'd like. We've managed to get...

00:23:51.890 --> 00:23:56.290
<v Cian>I talked about this before in previous talks at FOSDEM about our experience with it.

00:23:57.310 --> 00:24:00.630
<v Cian>We switched to orjson first, and it worked great.

00:24:01.570 --> 00:24:06.850
<v Cian>Well, it worked great. One customer broke because they were parsing JSON with

00:24:06.850 --> 00:24:11.210
<v Cian>bash and grep and seds and all those things. Don't do that.

00:24:12.330 --> 00:24:15.210
<v Cian>It's bad. they realized it was bad and they moved on

00:24:15.210 --> 00:24:18.090
<v Cian>so ordreson great drop replacement

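NOTE
A minimal sketch of one plausible way a drop-in JSON swap can break text-based parsing.
orjson emits compact JSON (no space after ":" or ","), while the standard library's
json.dumps inserts those spaces by default. The conversation does not say this was the cause
of the customer's breakage; the payload and the regex are hypothetical, and the compact form
is imitated with the standard library so the example runs without orjson installed.
```python
import json
import re
payload = {"name": "demo", "version": "1.0"}  # hypothetical API response
default_text = json.dumps(payload)  # stdlib default: spaces after ":" and ","
compact_text = json.dumps(payload, separators=(",", ":"))  # compact layout, as orjson emits
# A grep/sed-style pattern that silently depends on the space after the colon.
fragile = re.compile(r'"version": "([^"]+)"')
# A real JSON parser does not care about the whitespace difference.
assert json.loads(default_text) == json.loads(compact_text)
default_match = fragile.search(default_text)
compact_match = fragile.search(compact_text)
print(default_match.group(1) if default_match else None)
print(compact_match.group(1) if compact_match else None)
```
The data is identical in both forms; only the scraper that assumed a layout stops matching.
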
00:24:18.090 --> 00:24:21.830
<v Cian>After the success of orjson, I

00:24:21.830 --> 00:24:26.750
<v Cian>didn't want to... like, I knew PyO3 existed, so I sat down and said, where could

00:24:26.750 --> 00:24:33.690
<v Cian>PyO3 be used next? The other one that I wanted to look at was XML parsing.

00:24:33.690 --> 00:24:39.410
<v Cian>Not parsing, serialization. We have to serialize very large XML payloads.

00:24:39.770 --> 00:24:45.210
<v Cian>So I was interested in seeing how could I come up with a more efficient way

00:24:45.210 --> 00:24:49.550
<v Cian>of doing this for our use case using Rust and PyO3.

00:24:50.430 --> 00:24:55.050
<v Cian>But I got distracted when I went on to PyO3's docs and I noticed...

00:24:55.940 --> 00:24:59.480
<v Cian>that they had a JSON Schema

00:24:59.480 --> 00:25:02.620
<v Cian>library, and I thought, oh cool, I wonder

00:25:02.620 --> 00:25:05.880
<v Cian>if this is faster than our JSON Schema library. So I

00:25:05.880 --> 00:25:09.880
<v Cian>went to see our usage of the JSON Schema library and found out we were already using

00:25:09.880 --> 00:25:15.140
<v Cian>the Rust one, but we were also using the Python one. We had both installed and

00:25:15.140 --> 00:25:20.660
<v Cian>were using them both at different parts in the code. And I kind of just looked

00:25:20.660 --> 00:25:24.080
<v Cian>at myself going, how do we... What happened here?

00:25:24.220 --> 00:25:29.380
<v Cian>Did someone just not look at our folder and say, do we have a JSON schema library already?

00:25:29.680 --> 00:25:32.560
<v Cian>Or were we planning to do the migration?

00:25:32.800 --> 00:25:38.620
<v Matthias>My suggestion would be to use Cloudsmith because they handle package management for you.

00:25:38.780 --> 00:25:41.200
<v Matthias>And this is how you could avoid the problem.

00:25:41.580 --> 00:25:45.300
<v Cian>Yes, yeah, totally. Well, I think you could at least catch it sooner.

00:25:45.760 --> 00:25:48.660
<v Cian>Maybe we wouldn't have been running the two things for so long.

00:25:49.000 --> 00:25:53.500
<v Cian>But saying that, it gave another opportunity for us to like continue the rollout

00:25:53.500 --> 00:25:56.460
<v Cian>of like switching to Rust because we clearly knew it worked for us.

00:25:56.900 --> 00:26:01.980
<v Cian>We had success already. So all I needed to do for that one was, again,

00:26:02.180 --> 00:26:06.500
<v Cian>just switching the imports in all the other ones and removing the pure Python implementation.

00:26:07.300 --> 00:26:10.680
<v Cian>And we rolled it out and it was smooth as butter.

00:26:10.940 --> 00:26:15.860
<v Cian>Like that, it wasn't, I didn't change any code. I just changed the import statements.

00:26:16.710 --> 00:26:21.510
<v Cian>So there definitely is like that ability to do those drop-in replacements that

00:26:21.510 --> 00:26:24.430
<v Cian>work so well right there.

00:26:25.070 --> 00:26:31.750
<v Matthias>What I find cool about that story is that these initial quick wins gave you

00:26:31.750 --> 00:26:39.250
<v Matthias>a lot of confidence in integrating Rust into the stack without really requiring

00:26:39.250 --> 00:26:41.290
<v Matthias>a lot of backing from the entire organization.

00:26:41.290 --> 00:26:45.350
<v Matthias>You can just go step by step and you can see the success right away.

00:26:45.350 --> 00:26:50.430
<v Matthias>But then eventually you might have hit a wall where this is no longer possible

00:26:50.430 --> 00:26:55.850
<v Matthias>because all of the quick wins are gone. So I wonder how you transitioned from

00:26:55.850 --> 00:27:04.550
<v Matthias>there to maybe introducing more Rust, because, well, obviously it was kind of a success.

00:27:04.550 --> 00:27:10.010
<v Cian>Yeah. Like I said, I was playing around with PyO3 and different ideas, and when I

00:27:10.010 --> 00:27:15.050
<v Cian>started looking at our bottleneck for the network, I started thinking about how

00:27:15.050 --> 00:27:17.550
<v Cian>we manage work in the service.

00:27:18.070 --> 00:27:21.690
<v Cian>And the way our request model worked was,

00:27:22.680 --> 00:27:29.660
<v Cian>We were using WSGI, W-S-G-I, and effectively, when requests came

00:27:29.660 --> 00:27:32.480
<v Cian>in, we'd give them to a Python worker,

00:27:32.800 --> 00:27:38.600
<v Cian>and it would do the request to completion and then hand the response back.

00:27:38.820 --> 00:27:45.240
<v Cian>So for Rust developers, they might look at that, and the model is very similar

00:27:45.240 --> 00:27:51.100
<v Cian>to a Tokio service that we had, and that was my instant thought about it.

00:27:51.100 --> 00:27:55.340
<v Cian>I looked at it and said, that looks like a Tokio service that has one event

00:27:55.340 --> 00:27:56.780
<v Cian>loop that does some processing,

00:27:57.280 --> 00:28:01.960
<v Cian>hands it off to a background task, and then it waits for the task to complete

00:28:01.960 --> 00:28:06.200
<v Cian>and get back the results onto the main event loop and throws it back over the wire.

00:28:06.880 --> 00:28:11.140
<v Cian>Of course, it doesn't use serialization to bytes or any of that kind of stuff, but it looks like it.

00:28:11.840 --> 00:28:17.200
<v Cian>One of the bottlenecks I found was we were wasting cycles doing work for connections

00:28:17.200 --> 00:28:18.380
<v Cian>that had already closed.

00:28:18.780 --> 00:28:19.440
<v Matthias>Oh, wow.

00:28:19.580 --> 00:28:19.740
<v Cian>Yeah.

00:28:19.880 --> 00:28:20.220
<v Matthias>Why is that?

00:28:21.300 --> 00:28:25.520
<v Cian>It's a little to do with our queuing model and a little bit to do with request

00:28:25.520 --> 00:28:29.020
<v Cian>management in uWSGI, the process we were using.

00:28:29.940 --> 00:28:34.600
<v Cian>Effectively, if a request had sat in the queue for too long,

00:28:34.800 --> 00:28:38.020
<v Cian>it would be handed over to uWSGI.

00:28:38.220 --> 00:28:42.460
<v Cian>uWSGI would do it and it would time out in the upstream because it had been

00:28:42.460 --> 00:28:44.040
<v Cian>processing for longer than a minute.

00:28:45.100 --> 00:28:48.640
<v Cian>But there's no way to cancel the request once it's in flight.

00:28:49.180 --> 00:28:52.040
<v Cian>We would benefit from

00:28:52.040 --> 00:28:55.340
<v Cian>it because we'd do all the work and cache the result. So another

00:28:55.340 --> 00:28:58.820
<v Cian>request would be... so the request would have been retried and would be in the

00:28:58.820 --> 00:29:02.660
<v Cian>queue, and by the time it gets to the front of the queue, all its results

00:29:02.660 --> 00:29:08.980
<v Cian>are cached. So it was a nasty flow, but we kind of optimized for it.

00:29:08.980 --> 00:29:12.860
<v Cian>Yeah, but I thought to myself, this feels insane. There's no...

00:29:15.120 --> 00:29:17.820
<v Cian>...the first is going to be cached, or it's

00:29:17.820 --> 00:29:20.500
<v Cian>going to be re-driven. Like, most of the time it

00:29:20.500 --> 00:29:23.640
<v Cian>is. We're dealing with some of the... I think someone

00:29:23.640 --> 00:29:26.580
<v Cian>recently described some of our clients as some of the best and worst

00:29:26.580 --> 00:29:30.300
<v Cian>clients in the world, because they're designed for public

00:29:30.300 --> 00:29:33.640
<v Cian>infrastructure, since they're all package management clients. So

00:29:33.640 --> 00:29:37.060
<v Cian>they have a lot of retries, but they have a lot of weird formats. So

00:29:37.060 --> 00:29:39.720
<v Cian>we're dealing with some of the best and worst clients. So we

00:29:39.720 --> 00:29:43.160
<v Cian>know a lot of things are going to be retried and attempted again.

00:29:43.160 --> 00:29:46.040
<v Cian>But it's also not a

00:29:46.040 --> 00:29:48.960
<v Cian>perfect cache, because some of our caches are in memory and

00:29:48.960 --> 00:29:54.280
<v Cian>some of them are in memcache. So things that were in memcache, those were quick. But

00:29:54.280 --> 00:29:59.240
<v Cian>if it was in the in-memory cache, unless you hit the exact same node again, that in-memory

00:29:59.240 --> 00:30:04.160
<v Cian>cache is useless. And like I said, we're running lots of replicas, so there's no

00:30:04.160 --> 00:30:06.560
<v Cian>real guarantee on that.

00:30:06.560 --> 00:30:10.080
<v Matthias>Yeah. That's a thing that I heard a couple of times already:

00:30:10.080 --> 00:30:13.340
<v Matthias>that if you think about a

00:30:13.340 --> 00:30:16.380
<v Matthias>highly performant service that does not waste a

00:30:16.380 --> 00:30:19.340
<v Matthias>lot of CPU cycles, then you need fewer of

00:30:19.340 --> 00:30:23.460
<v Matthias>those, which means you have higher cache locality. If

00:30:23.460 --> 00:30:29.620
<v Matthias>you have a service that is not as fast, you need more instances, so you lose the

00:30:29.620 --> 00:30:35.560
<v Matthias>ability to have things in your in-memory cache. So that's kind of another way

00:30:35.560 --> 00:30:42.060
<v Matthias>in which more performant languages or more performant code is effective.

00:30:42.060 --> 00:30:43.120
<v Cian>And.

00:30:43.120 --> 00:30:44.120
<v Matthias>Helps with performance.

00:30:44.120 --> 00:30:46.960
<v Cian>Yeah, like, those... I'm a big believer

00:30:46.960 --> 00:30:49.860
<v Cian>that in-memory caches are only

00:30:49.860 --> 00:30:53.460
<v Cian>good when you can have a small footprint, because

00:30:53.460 --> 00:30:59.400
<v Cian>effectively they build up in that small footprint. And if you need to have lots

00:30:59.400 --> 00:31:05.660
<v Cian>of replicas for whatever reason, be that budgetary or a limit

00:31:05.660 --> 00:31:10.780
<v Cian>of, like, only having one CPU mapped to a process or something like that,

00:31:11.040 --> 00:31:14.680
<v Cian>you end up with these very disparate caches that have different information

00:31:14.680 --> 00:31:18.360
<v Cian>and your load kind of ends up going all over the place.

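NOTE
A minimal sketch of the point about per-replica in-memory caches: if a retried request is
routed to a random replica, the chance that it lands on the one replica holding the warm
in-process cache shrinks as the replica count grows. Uniformly random routing, the function
name retry_hit_rate, and the trial count are assumptions made for illustration; a real load
balancer may route differently.
```python
import random
def retry_hit_rate(replicas, trials=10_000, seed=0):
    """Fraction of retries that reach the replica holding the warm in-process cache."""
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        first = rng.randrange(replicas)  # replica that did the work and cached it locally
        retry = rng.randrange(replicas)  # replica the retried request is routed to
        hits += first == retry  # True counts as 1
    return hits / trials
for n in (1, 4, 16):
    print(n, retry_hit_rate(n))
```
With a single replica every retry is a hit; with more replicas the expected hit rate is about
1/replicas, which is why a shared cache behaves so differently from a per-process one.
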
00:31:19.240 --> 00:31:24.340
<v Matthias>But wouldn't you have been able to query the in-memory cache and then,

00:31:24.520 --> 00:31:26.680
<v Matthias>if that fails, go to memcache right away?

00:31:28.200 --> 00:31:31.540
<v Cian>Yes, you would think that. But the issue isn't that we were,

00:31:31.680 --> 00:31:37.200
<v Cian>it was, it's not that we have one caching mechanism, it's that we have different caching mechanisms.

00:31:37.660 --> 00:31:42.700
<v Cian>So we were using the Python caching library for in-memory cache.

00:31:42.880 --> 00:31:47.720
<v Cian>And then we were using our memcache with our database to cache responses from

00:31:47.720 --> 00:31:50.160
<v Cian>the database. So these are actually two different caches.

00:31:50.300 --> 00:31:54.380
<v Cian>The memcache one is just, could we stop ourselves from going to database?

00:31:54.620 --> 00:31:59.360
<v Cian>And we would totally check that on every request. So if we had done a very expensive

00:31:59.360 --> 00:32:02.100
<v Cian>DB query, it should be in that memcache.

00:32:02.220 --> 00:32:04.640
<v Cian>So on the retry, it would come from the memcache.

00:32:04.980 --> 00:32:09.180
<v Cian>What wasn't being cached were those pure functions we were running inside the

00:32:09.180 --> 00:32:10.840
<v Cian>monolith that were in the Python cache.

00:32:11.380 --> 00:32:11.900
<v Matthias>Got it.

00:32:12.300 --> 00:32:12.440
<v Cian>Yeah.

00:32:12.440 --> 00:32:16.840
<v Matthias>So the new bottleneck right now is between the network layer,

00:32:17.140 --> 00:32:23.040
<v Matthias>which was your uWSGI, and the Django monolith.

00:32:23.340 --> 00:32:26.660
<v Matthias>There's where you lose a lot of the performance now.

00:32:27.060 --> 00:32:33.180
<v Cian>Yeah. And my goal was something we're still working on, was I wanted to be able

00:32:33.180 --> 00:32:34.380
<v Cian>to do request cancellation.

00:32:34.380 --> 00:32:38.400
<v Cian>So I wanted to be able to say, that's timed out upstream, I want to cancel it.

00:32:38.400 --> 00:32:43.140
<v Cian>Something I had previously done in a Tokio service, so I kind of was like, totally,

00:32:43.140 --> 00:32:51.120
<v Cian>let's do this. So I sat down to try and figure out how I could map a Tokio request-

00:32:51.120 --> 00:32:55.820
<v Cian>managed service to our WSGI app. And

00:32:56.660 --> 00:33:02.760
<v Cian>I was reading the PyO3 docs and I was playing around with a library called

00:33:05.020 --> 00:33:12.320
<v Cian>rustimport, which lets you very quickly write PyO3 bindings for your Rust libraries.

00:33:12.800 --> 00:33:20.180
<v Cian>You can get a very rough and ready code in 20 lines with some macros.

00:33:23.210 --> 00:33:29.470
<v Cian>And you can have this very rough importing of Rust code directly into your Python

00:33:29.470 --> 00:33:31.250
<v Cian>code without a lot of overhead.

00:33:32.110 --> 00:33:33.610
<v Cian>Great for prototyping.

00:33:35.810 --> 00:33:40.310
<v Cian>I had found some places where I thought I would probably change this if I wanted

00:33:40.310 --> 00:33:45.790
<v Cian>to bring it to prod and just use PyO3 for creating the interface exactly as I wanted to.

00:33:46.130 --> 00:33:48.890
<v Cian>But it was definitely great for prototyping.

00:33:49.730 --> 00:33:54.770
<v Cian>But saying that, while prototyping I started looking at prior art, and I found

00:33:54.770 --> 00:33:59.470
<v Cian>someone had this idea already. Which is, I want to say, the best thing about, like,

00:33:59.470 --> 00:34:03.410
<v Cian>open source: sometimes you go and look and say, has someone already

00:34:03.410 --> 00:34:07.810
<v Cian>had this idea? And more often than not, someone has. So...

00:34:07.810 --> 00:34:10.950
<v Matthias>Yeah, and also you could have gone and

00:34:10.950 --> 00:34:14.590
<v Matthias>completely ignored that and not have done

00:34:14.590 --> 00:34:20.490
<v Matthias>any more research, and you would have that liability on your side. Whereas now

00:34:20.490 --> 00:34:26.370
<v Matthias>you looked at prior art, as you said, and you found a thing that someone else

00:34:26.370 --> 00:34:32.530
<v Matthias>worked on before. So that also shows that you took a very level-headed approach to that.

00:34:33.420 --> 00:34:40.420
<v Cian>Yeah, 100%. The project we found, it was called Granian, or I might mispronounce

00:34:40.420 --> 00:34:43.040
<v Cian>it a handful of times because I got so used to call it Granian at one point.

00:34:44.660 --> 00:34:52.660
<v Cian>But it's effectively a replacement for that WSGI service we were using that is written 100% in Rust.

00:34:53.140 --> 00:35:04.580
<v Cian>It's a Tokio event loop that hands off to Python processes for doing the actual processing of the code.

00:35:04.740 --> 00:35:08.900
<v Cian>So all your business logic runs there and it just ensures that all the network

00:35:08.900 --> 00:35:12.300
<v Cian>logic is done inside Rust.

00:35:13.280 --> 00:35:17.720
<v Cian>This was really cool for me because I was like, cool, here's a project that

00:35:17.720 --> 00:35:20.160
<v Cian>does exactly what I wanted to do.

00:35:20.700 --> 00:35:26.140
<v Cian>And I started reading the code and I learned that the concept of request cancellation,

00:35:26.380 --> 00:35:29.640
<v Cian>the thing I was doing all of this for, was not

00:35:29.640 --> 00:35:32.960
<v Cian>possible in uWSGI at all. Like, there was never going to be a chance of

00:35:32.960 --> 00:35:37.840
<v Cian>doing it in WSGI, because it's just not supported by the protocol. Granian

00:35:37.840 --> 00:35:45.880
<v Cian>only supports it if you're using ASGI, which is an async version of WSGI that's

00:35:45.880 --> 00:35:49.060
<v Cian>more like a traditional event loop style of async/await.

00:35:50.060 --> 00:35:52.020
<v Matthias>Mm-hmm. Similar to an io_uring or so.

00:35:52.360 --> 00:35:52.600
<v Cian>Exactly.

00:35:52.960 --> 00:35:53.660
<v Matthias>Completion-based.

00:35:54.060 --> 00:35:57.820
<v Cian>It's exactly the same kind of design.

00:35:58.440 --> 00:36:02.540
<v Cian>And you get to reuse all that kind of code that's designed for those io_uring loops.

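NOTE
A minimal sketch of why request cancellation fits an async, event-loop model, using only
Python's asyncio. This is not Granian or ASGI code; handler, serve, and the timeout values
are hypothetical. A coroutine gives control back at each await, so the code driving it has a
point where it can inject a cancellation, whereas a blocking worker runs until it returns.
```python
import asyncio
async def handler(log):
    """Stand-in for slow request processing that can be interrupted at an await."""
    try:
        await asyncio.sleep(10)  # pretend this is a slow query
        log.append("finished")
    except asyncio.CancelledError:
        log.append("cancelled")  # cleanup hook: the work stops here
        raise
async def serve(timeout):
    """Run the handler, giving up (and cancelling it) once the deadline passes."""
    log = []
    try:
        await asyncio.wait_for(handler(log), timeout)
    except asyncio.TimeoutError:
        log.append("timed out upstream")
    return log
events = asyncio.run(serve(0.01))
print(events)
```
The handler never reaches "finished": once the deadline passes, the pending await is
cancelled and no further cycles are spent on a response nobody is waiting for.
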
00:36:04.880 --> 00:36:08.600
<v Cian>But we sat down and we'd already started looking at it.

00:36:08.740 --> 00:36:14.000
<v Cian>So that concept of prior art saved me from walking down paths

00:36:14.000 --> 00:36:18.920
<v Cian>where we could have lost so much time if I had kept working on it.

00:36:18.920 --> 00:36:25.860
<v Cian>But it did have a feature that I loved, and that was it had a built-in queue

00:36:25.860 --> 00:36:27.280
<v Cian>for managing the requests.

00:36:28.520 --> 00:36:36.400
<v Cian>So right now, to this day and at the time, we were running HAProxy in front

00:36:36.400 --> 00:36:39.240
<v Cian>of uWSGI to allow us to scale.

00:36:40.140 --> 00:36:45.460
<v Cian>HAProxy was effectively doing the queuing for us, managing work in a queue,

00:36:45.640 --> 00:36:51.020
<v Cian>and then handing it off to a uWSGI process that would hand it off to a Django

00:36:51.020 --> 00:36:54.560
<v Cian>process and do the request.

00:36:56.120 --> 00:37:01.620
<v Cian>And for reasons that elude me of why an engineer decided to do this,

00:37:01.700 --> 00:37:09.760
<v Cian>we also are running an Nginx in front of the HAProxy to do very light routing

00:37:09.760 --> 00:37:11.820
<v Cian>control and optimizations.

00:37:12.380 --> 00:37:16.580
<v Cian>Nothing that couldn't have been done in HAProxy, but it was just being done

00:37:16.580 --> 00:37:18.140
<v Cian>in Nginx for some reason.

00:37:19.160 --> 00:37:20.800
<v Cian>And there was a ticket on a backlog

00:37:20.800 --> 00:37:26.080
<v Cian>for years of merge HAProxy and Nginx together and just have HAProxy.

00:37:26.780 --> 00:37:32.960
<v Matthias>It's interesting that you make that decision. One could have made the decision to go with Nginx.

00:37:32.960 --> 00:37:38.480
<v Matthias>Personally, I find the Nginx config to be easier to read and write in comparison

00:37:38.480 --> 00:37:44.700
<v Matthias>to the HAProxy config. Maybe that was the reason for Nginx.

00:37:44.700 --> 00:37:46.460
<v Cian>You're probably right. It's like, I...

00:37:47.230 --> 00:37:50.370
<v Cian>Nginx has a really nice config. It's super, like,

00:37:50.370 --> 00:37:53.390
<v Cian>readable and simple, and probably, of all the

00:37:53.390 --> 00:37:56.290
<v Cian>tools I'm going to talk about, it's the easiest to

00:37:56.290 --> 00:38:01.190
<v Cian>work with, and was pretty bulletproof and doing some amazing things for us.

00:38:01.190 --> 00:38:08.230
<v Cian>But the HAProxy one was... I think HAProxy is just a better queuing tool, or at

00:38:08.230 --> 00:38:14.670
<v Cian>least my experience of doing request management in HAProxy has been better.

00:38:16.670 --> 00:38:19.630
<v Cian>But I think when no one knew which

00:38:19.630 --> 00:38:22.570
<v Cian>way we wanted to actually go, the idea of, like, let's replace them

00:38:22.570 --> 00:38:25.610
<v Cian>both, let's replace one with the other, this

00:38:25.610 --> 00:38:28.670
<v Cian>was the idea. And when I

00:38:28.670 --> 00:38:32.650
<v Cian>found Granian, I looked at it and said, oh, this

00:38:32.650 --> 00:38:36.930
<v Cian>can not only replace our WSGI management interface,

00:38:36.930 --> 00:38:41.190
<v Cian>but it can also replace HAProxy, because it

00:38:41.190 --> 00:38:44.610
<v Cian>can do that queuing internally, and it

00:38:44.610 --> 00:38:51.650
<v Cian>has dials for tuning that queuing as we needed it to work. And

00:38:51.650 --> 00:38:58.270
<v Cian>we also had this intense dislike of uWSGI, because uWSGI is quite

00:38:58.270 --> 00:39:04.250
<v Cian>difficult to tune. uWSGI being the tool we use for managing WSGI requests.

00:39:05.690 --> 00:39:06.410
<v Cian>So...

00:39:08.600 --> 00:39:11.860
<v Cian>So I started chatting to a principal and said, have a look at this.

00:39:12.040 --> 00:39:13.060
<v Cian>What do you think about this?

00:39:13.300 --> 00:39:18.820
<v Cian>And I got the thumbs up of, ah, sure, let's try it out and see what happens,

00:39:19.320 --> 00:39:25.640
<v Cian>which is a very Irish way of going, let's run a load test and see how it performs.

00:39:27.420 --> 00:39:34.980
<v Cian>So we threw up a version into our load test environment that replaced uWSGI

00:39:34.980 --> 00:39:37.380
<v Cian>with Granian. Granian. Granian.

00:39:37.740 --> 00:39:38.620
<v Matthias>One of these.

00:39:38.800 --> 00:39:44.100
<v Cian>One of these two. So we threw up a version into our load test environment that

00:39:44.100 --> 00:39:49.740
<v Cian>replaced uWSGI with Granian and we began load testing it.

00:39:49.900 --> 00:39:53.180
<v Cian>We just started throwing lots of different types of requests.

00:39:53.360 --> 00:39:57.660
<v Cian>We have some nice load testing tooling that simulates some request flows.

00:39:57.820 --> 00:40:01.320
<v Cian>So we just had it run. And the

00:40:01.320 --> 00:40:04.800
<v Cian>numbers we got back were marginally better.

00:40:04.800 --> 00:40:07.980
<v Cian>It wasn't like a night and day, like, oh my

00:40:07.980 --> 00:40:11.000
<v Cian>God, this thing is going to save us, we found

00:40:11.000 --> 00:40:16.200
<v Cian>the savior of scaling. No, nothing like that. But what it did say was, it changed

00:40:16.200 --> 00:40:23.260
<v Cian>the numbers in our percentiles, in our P50s and our P90s. Our P50s went down

00:40:23.260 --> 00:40:29.360
<v Cian>and our P90s went up, which meant we just had a lot more outliers and our averages were better,

00:40:29.780 --> 00:40:34.560
<v Cian>which was enough of a signal for us to sit down and go, there's something here.

00:40:35.200 --> 00:40:39.700
<v Cian>It could just be a better tool for us to be able to tune.

00:40:40.660 --> 00:40:46.120
<v Cian>It could just be that more queues are helping us scale in some way or another.

00:40:46.740 --> 00:40:50.820
<v Cian>But it was definitely, it was a signal that we said, we need to test a little

00:40:50.820 --> 00:40:53.640
<v Cian>bit more with this. This isn't something we need to just walk away from.

00:40:53.640 --> 00:41:02.680
<v Matthias>Right, because if you see that your P50 is better, that means the outliers are now more prominent.

00:41:03.020 --> 00:41:07.340
<v Matthias>So there might be things in your business logic or timeouts with upstream,

00:41:07.340 --> 00:41:10.620
<v Matthias>which means that they drive up the

00:41:11.360 --> 00:41:19.780
<v Matthias>P90 or P95 signal. But overall, this is also a thing that you see a lot when replacing

00:41:19.780 --> 00:41:25.340
<v Matthias>code with faster code on the backend side: if you do it right, then the

00:41:25.340 --> 00:41:26.900
<v Matthias>outliers become more prominent.

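NOTE
A minimal sketch of how speeding up only the fast paths moves P50 while leaving the tail
pinned, so the outliers stand out more. The latency numbers are invented for illustration
and are not Cloudsmith measurements; percentile is a hypothetical helper using the
nearest-rank definition.
```python
def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct/100 * n
    return ordered[max(rank, 1) - 1]
# Hypothetical latencies in milliseconds: eight fast requests, two stuck on a slow path.
before = [40, 42, 45, 47, 50, 52, 55, 60, 900, 950]
after = [20, 21, 22, 23, 25, 26, 27, 30, 900, 950]  # fast paths quicker, slow paths unchanged
for name, data in (("before", before), ("after", after)):
    p50 = percentile(data, 50)
    p90 = percentile(data, 90)
    mean = sum(data) / len(data)
    print(name, p50, p90, mean)
```
In this toy data the median and the mean both drop while P90 stays where it was, so the gap
between the typical request and the slow one widens even though nothing got slower.
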
00:41:28.200 --> 00:41:35.420
<v Cian>Yeah, no, 100%. We were definitely seeing that, where these very slow paths

00:41:35.420 --> 00:41:40.100
<v Cian>that were blocking us were still the slow ones.

00:41:40.140 --> 00:41:43.460
<v Cian>But the very quick paths, they just became quicker.

00:41:45.360 --> 00:41:52.300
<v Cian>And there's a lot of differences in how uWSGI and Granian were configured in

00:41:52.300 --> 00:41:59.840
<v Cian>those early load tests that I now know were silently masking different things.

00:42:00.080 --> 00:42:06.540
<v Cian>They were handling switching contexts differently, how thread management worked.

00:42:06.960 --> 00:42:11.020
<v Cian>So the memory footprint was a

00:42:12.000 --> 00:42:15.140
<v Cian>little more stable in one,

00:42:15.140 --> 00:42:18.520
<v Cian>while it correlated to workload better

00:42:18.520 --> 00:42:23.480
<v Cian>in the other. That's got good and bad. It meant that previously we would have,

00:42:23.480 --> 00:42:28.460
<v Cian>like, the memory and CPU would stay flat, but now, as requests went

00:42:28.460 --> 00:42:31.740
<v Cian>up, you could actually see the CPU was going up and down because we were doing

00:42:31.740 --> 00:42:37.140
<v Cian>more work. And we're like, that's a good signal for us scaling now. We could use that to do some

00:42:37.260 --> 00:42:39.600
<v Cian>auto-scaling, where previously we couldn't do that.

00:42:40.380 --> 00:42:42.820
<v Matthias>Yeah, because you could never go down to zero.

00:42:43.640 --> 00:42:48.740
<v Cian>Exactly, yeah. So we sat down and we drew up a testing scenario,

00:42:49.020 --> 00:42:52.680
<v Cian>like some numbers we wanted to see, some testing we wanted to do.

00:42:52.900 --> 00:42:57.000
<v Cian>Which parts of the stack could we try removing now that we just,

00:42:57.080 --> 00:42:59.420
<v Cian>and could we just replace it with Granian?

00:43:00.540 --> 00:43:06.900
<v Cian>So we did a lot of different load tests to the point we actually managed to

00:43:06.900 --> 00:43:09.600
<v Cian>bottleneck in the load test tooling.

00:43:09.880 --> 00:43:14.320
<v Cian>We hadn't scaled the load test tooling up high enough that it could push enough

00:43:14.320 --> 00:43:20.200
<v Cian>throughput in one of our tests that we needed to step back and change the load test tooling out.

00:43:21.880 --> 00:43:28.520
<v Cian>We were previously using Locust, which is a fantastic load test tool where you

00:43:28.520 --> 00:43:30.960
<v Cian>write your load test in Python,

00:43:33.040 --> 00:43:37.900
<v Cian>and then you spin up lots of Python workers that are managed and it does the

00:43:37.900 --> 00:43:38.980
<v Cian>load test from different places.

00:43:40.420 --> 00:43:44.960
<v Cian>But those workers were becoming our bottleneck. So, well, they're not really a bottleneck.

00:43:45.420 --> 00:43:48.380
<v Cian>How much money we were willing to spend on those workers became the bottleneck.

00:43:48.480 --> 00:43:52.440
<v Cian>Like how many workers could you spin up for a load test was the bottleneck.

00:43:52.680 --> 00:43:57.780
<v Cian>So we switched out for a tool called Goose, which was a reimagining of that in Rust.

00:43:59.220 --> 00:44:03.880
<v Cian>With the same number of workers, we were able to push more requests, like,

00:44:04.540 --> 00:44:09.620
<v Cian>I think 100 or 1,000x more requests per worker, which meant that bottleneck was out the window.

00:44:10.720 --> 00:44:17.140
<v Matthias>It's somewhat funny that in the process of oxidization, you also have to swap

00:44:17.140 --> 00:44:18.740
<v Matthias>out the load testing tool.

00:44:19.460 --> 00:44:27.020
<v Cian>I think that was the biggest signal of we can push more was when we had to swap

00:44:27.020 --> 00:44:29.840
<v Cian>out the load testing tool because that was what was being saturated.

00:44:31.100 --> 00:44:35.000
<v Cian>Yeah. Yeah, and it was really good.

00:44:35.800 --> 00:44:42.020
<v Cian>At the end of it all, we had a test scenario that showed we were able to push about 2x

00:44:43.090 --> 00:44:46.350
<v Cian>more per compute resource than we previously were.

00:44:46.850 --> 00:44:52.190
<v Cian>And there's a lot of reasons for that. One is we were running fewer intermediate services.

00:44:52.210 --> 00:44:55.730
<v Cian>We weren't running Nginx after this. We weren't running HAProxy.

00:44:56.230 --> 00:45:01.850
<v Cian>And Granian was effectively doing all of that for us in a nice Rust event loop

00:45:01.850 --> 00:45:05.410
<v Cian>and handing it off to background processes in Python.

00:45:06.130 --> 00:45:13.150
<v Cian>And on the Python side, that original P50 gain was adding up, along with there being

00:45:13.150 --> 00:45:15.110
<v Cian>fewer resources that had to be run.

00:45:15.110 --> 00:45:18.430
<v Matthias>It's great because, yeah, as

00:45:18.430 --> 00:45:21.770
<v Matthias>a first step you could say you handled twice

00:45:21.770 --> 00:45:24.630
<v Matthias>the load, which means you could have half

00:45:24.630 --> 00:45:27.510
<v Matthias>the servers if you wanted to. But then on top of

00:45:27.510 --> 00:45:30.710
<v Matthias>it, you have better memory locality now,

00:45:30.710 --> 00:45:33.650
<v Matthias>so maybe you even need fewer

00:45:33.650 --> 00:45:41.310
<v Matthias>cache servers, if you had those. And on top of it, even before the request

00:45:41.310 --> 00:45:46.470
<v Matthias>hits your monolith, you can also optimize a lot, because now you don't need Nginx

00:45:46.470 --> 00:45:51.070
<v Matthias>and HAProxy. You could replace all of that with one service.

00:45:51.070 --> 00:45:55.710
<v Cian>And I think that was the biggest one for us. Management saw that... what we're saying:

00:45:55.710 --> 00:46:00.990
<v Cian>we could squeeze more requests out of what we're already paying. We could

00:46:00.990 --> 00:46:04.770
<v Cian>scale... We said we could scale down, but we knew scaling down was not going to

00:46:04.770 --> 00:46:05.450
<v Cian>be what we were going to do.

00:46:05.690 --> 00:46:10.810
<v Cian>We're signing customers on every day. We're growing every day. We're scaling up.

00:46:11.150 --> 00:46:13.590
<v Cian>So the idea of scaling down,

00:46:14.150 --> 00:46:21.210
<v Cian>of compressing the amount of work we can do in compute, is big for us. It got

00:46:21.210 --> 00:46:25.870
<v Cian>us the time to experiment more and continue our testing and see what's next

00:46:25.870 --> 00:46:29.790
<v Cian>for us, what can we improve.

00:46:29.790 --> 00:46:34.990
<v Matthias>And you need that time, because I'm assuming that there are differences between the

00:46:34.990 --> 00:46:41.110
<v Matthias>old stack and the new stack, especially if you deal with a lot of real-world HTTP traffic.

00:46:41.110 --> 00:46:49.030
<v Cian>Yeah, yeah, there were two big differences for us that caused two annoying outages for us as well.

00:46:50.070 --> 00:46:53.270
<v Cian>The one that is burned into my brain was

00:46:53.270 --> 00:47:02.490
<v Cian>to do with Docker. So Docker has a lot of interesting clients, is the

00:47:02.490 --> 00:47:07.710
<v Cian>best way I can describe it. It's a standard of how you do stuff, but

00:47:07.710 --> 00:47:10.510
<v Cian>every client can kind of

00:47:10.950 --> 00:47:17.790
<v Cian>do the implementation slightly differently and handle edge cases slightly differently than each other.

00:47:19.110 --> 00:47:22.230
<v Cian>So for scaling reasons of our

00:47:22.230 --> 00:47:26.310
<v Cian>CDN, we would often respond with 307s

00:47:26.310 --> 00:47:30.350
<v Cian>and say, the resource is

00:47:30.350 --> 00:47:35.230
<v Cian>over in this other location for storage, go get it, and you download it yourself

00:47:35.230 --> 00:47:38.470
<v Cian>rather than me downloading it for you and handing it off. Like, you don't want

00:47:38.470 --> 00:47:42.170
<v Cian>a Python service doing a download and sending it back over

00:47:42.170 --> 00:47:46.730
<v Cian>the wire. You want something that's built to scale and serve those requests.

00:47:46.930 --> 00:47:48.710
<v Cian>So it's our CDN out at the edge.

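NOTE
A minimal sketch of the redirect pattern described here: the application answers a download
request with a 307 and a Location header, and the client fetches the bytes from storage
itself. It uses only the standard library's wsgiref helpers; CDN_BASE, app, call, and the
request path are hypothetical, and this is not Cloudsmith's implementation.
```python
from wsgiref.util import setup_testing_defaults
CDN_BASE = "https://cdn.example.invalid"  # hypothetical CDN origin
def app(environ, start_response):
    """Answer every download request with a redirect instead of streaming the file."""
    location = CDN_BASE + environ.get("PATH_INFO", "/")
    headers = [("Location", location), ("Content-Length", "0")]
    start_response("307 Temporary Redirect", headers)
    return [b""]  # no body: the client is expected to follow Location
def call(path):
    """Invoke the WSGI app in-process and capture the status, headers, and body."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}
    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)
    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
status, headers, body = call("/v2/demo/blobs/abc")
print(status, headers["Location"], len(body))
```
The Python worker finishes almost immediately, and the heavy transfer happens between the
client and the storage location rather than through the application.
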
00:47:48.990 --> 00:47:56.070
<v Matthias>The Docker clients that you meant are things like the implementations of things on your local machine.

00:47:56.150 --> 00:47:59.690
<v Matthias>Like if you do Docker pull or you use Podman or...

00:48:00.270 --> 00:48:07.690
<v Cian>No, yeah, it's... When I say Docker clients, I mean Podman versus Docker versus BuildX versus...

00:48:09.050 --> 00:48:09.690
<v Matthias>OrbStack.

00:48:09.890 --> 00:48:13.070
<v Cian>OrbStack, yeah. And there's hundreds more multiple...

00:48:14.130 --> 00:48:19.450
<v Cian>You work in a company, you'll be running different versions and different developer machines sometimes.

00:48:19.930 --> 00:48:23.790
<v Cian>And you'll be, so one developer is doing one thing and that could be different

00:48:23.790 --> 00:48:27.510
<v Cian>to prod because you're not running in prod, you're actually running Kubernetes,

00:48:27.750 --> 00:48:29.390
<v Cian>which is different again to Docker.

00:48:29.610 --> 00:48:33.990
<v Cian>Like the Docker clients are all different and unique and there's many of them

00:48:33.990 --> 00:48:36.250
<v Cian>with different edge cases.

00:48:36.250 --> 00:48:42.110
<v Matthias>Right. So back to your story: we were at a point where you don't want to handle the

00:48:42.110 --> 00:48:46.530
<v Matthias>requests for the clients instead you tell them look elsewhere for the resource

00:48:46.530 --> 00:48:48.250
<v Matthias>that you're trying to pull.

00:48:48.250 --> 00:48:51.470
<v Cian>Yeah, exactly. So we'd give them a nice 307 to

00:48:51.470 --> 00:48:54.390
<v Cian>our CDN location, and they respect it

00:48:54.390 --> 00:48:57.130
<v Cian>and they pull it. It's part of the protocol that

00:48:57.130 --> 00:49:00.550
<v Cian>they can do that. But for reasons

00:49:00.550 --> 00:49:03.310
<v Cian>that are very legacy and to do

00:49:03.310 --> 00:49:06.170
<v Cian>with how Go implemented its

00:49:06.170 --> 00:49:10.250
<v Cian>first HTTP client, they were

00:49:10.250 --> 00:49:14.110
<v Cian>accepting bodies. They accepted

00:49:14.110 --> 00:49:17.550
<v Cian>307s with content lengths

00:49:17.550 --> 00:49:21.950
<v Cian>that were not zero. And because

00:49:21.950 --> 00:49:24.930
<v Cian>of that, the Docker

00:49:24.930 --> 00:49:27.950
<v Cian>client, for some reason, used that first content

00:49:27.950 --> 00:49:31.290
<v Cian>length it saw as the metadata, as

00:49:31.290 --> 00:49:34.610
<v Cian>the content size of the image it

00:49:34.610 --> 00:49:37.550
<v Cian>was eventually going to be. So

00:49:37.550 --> 00:49:42.350
<v Cian>it would look at the response and say, cool, I have a content length of

00:49:42.350 --> 00:49:47.190
<v Cian>200 megabytes, going to put that in the metadata for my

00:49:47.190 --> 00:49:53.670
<v Cian>eventual Docker image. So it then follows the 307 and goes and grabs all the

00:49:53.670 --> 00:49:55.950
<v Cian>other layers,

00:49:56.190 --> 00:50:00.470
<v Cian>and then it signs it and says, here you go, this is your built image.

00:50:02.590 --> 00:50:11.630
<v Cian>The issue came in when we were getting, so we're sending back this 307 saying

00:50:11.630 --> 00:50:15.550
<v Cian>it's got a content length of 200 megabytes, let's say.

00:50:17.130 --> 00:50:21.830
<v Cian>The ALB we were using, the load balancer we were using, started to have errors

00:50:21.830 --> 00:50:24.790
<v Cian>on this. It started saying, nope, that's an invalid request.

00:50:25.010 --> 00:50:27.870
<v Cian>I don't remember exactly what error it started returning, but it started throwing

00:50:27.870 --> 00:50:31.630
<v Cian>random errors that were not the correct error as well.

00:50:32.330 --> 00:50:39.270
<v Cian>So it was processing something internally and it broke its serialization.

00:50:40.610 --> 00:50:45.890
<v Cian>It's kind of scary when I kind of start saying it internally because these were SaaS products.

00:50:46.050 --> 00:50:50.110
<v Cian>We didn't have proper logs for them. We just had metrics of error rates going up and down.

00:50:50.810 --> 00:50:57.550
<v Cian>So we sat down, started digging in, and we managed to map the error rates to the Docker requests.

00:50:59.170 --> 00:51:04.710
<v Cian>And we decided we needed to flip some, we needed to move some stuff around and try some stuff out.

00:51:05.510 --> 00:51:11.970
<v Cian>So we started encoding, we said, oh, this is encoding 307s as,

00:51:13.270 --> 00:51:17.810
<v Cian>it's saying these 307s have a content length of 200 megabytes or whatever the

00:51:17.810 --> 00:51:20.150
<v Cian>eventual image size is going to be. Mm-hmm.

00:51:21.030 --> 00:51:24.870
<v Cian>Let's not do that. That's what's breaking this. Let's respond with actual valid

00:51:24.870 --> 00:51:27.270
<v Cian>HTTP and say the content length is zero.
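
NOTE
Editor's sketch, not part of the conversation. The mismatch described above is a 307 whose Content-Length header advertises the eventual blob size while the redirect itself carries no body. The Rust below is a minimal, std-only illustration of that mismatch. The helper names and the cdn.example.com URL are hypothetical, and nothing here reproduces how Granian, hyper, Nginx, the load balancer, or any Docker client actually behaves.
```rust
// Hypothetical helper: build the raw bytes of an HTTP/1.1 307 response
// with whatever Content-Length the caller chooses to declare.
pub fn redirect_307(location: &str, declared_len: usize, body: &[u8]) -> Vec<u8> {
    let head = format!(
        "HTTP/1.1 307 Temporary Redirect\r\nLocation: {}\r\nContent-Length: {}\r\n\r\n",
        location, declared_len
    );
    let mut raw = head.into_bytes();
    raw.extend_from_slice(body);
    raw
}
// Hypothetical check: does the declared Content-Length equal the number of
// body bytes actually present? Returns None if the response has no header
// terminator or no parsable Content-Length header.
pub fn declared_length_matches(raw: &[u8]) -> Option<bool> {
    // The head ends at the first empty line (CRLF CRLF).
    let split = raw.windows(4).position(|w| w == b"\r\n\r\n")?;
    let head = std::str::from_utf8(&raw[..split]).ok()?;
    let body_len = raw.len() - (split + 4);
    // Skip the status line, then look for Content-Length (case-insensitive).
    let declared = head.lines().skip(1).find_map(|line| {
        let (name, value) = line.split_once(':')?;
        if name.trim().eq_ignore_ascii_case("content-length") {
            value.trim().parse::<usize>().ok()
        } else {
            None
        }
    })?;
    Some(declared == body_len)
}
```
Under these assumptions, a bodyless redirect that declares 200_000_000 bytes fails the check, and the same redirect declaring 0 passes it.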

00:51:28.190 --> 00:51:31.990
<v Cian>So we did that and Docker freaked out.

00:51:32.590 --> 00:51:36.890
<v Cian>It started, well, actually we ran tests and they were working.

00:51:37.070 --> 00:51:40.110
<v Cian>Like, we're like, great, our end-to-end tests are still working in this, this is fine.

00:51:40.710 --> 00:51:46.230
<v Cian>And then one of our developers came in and said, hey, I can't get my local dev to start.

00:51:46.610 --> 00:51:55.510
<v Cian>So we started debugging it, and it turned out that their local dev was getting the wrong metadata.

00:51:55.830 --> 00:51:58.370
<v Cian>And my local dev was working completely fine.

00:51:59.050 --> 00:52:04.530
<v Cian>And that's where it became really weird. I was using BuildX and they were not

00:52:04.530 --> 00:52:08.330
<v Cian>using BuildX for building their Docker images and running their Docker images.

00:52:10.130 --> 00:52:14.870
<v Cian>And that's when we realized it was very specific clients were doing stuff differently.

00:52:15.190 --> 00:52:18.150
<v Cian>Some of them were checking the metadata from the header,

00:52:18.470 --> 00:52:22.290
<v Cian>and some of them were doing the maths themselves and putting it in there.

00:52:23.590 --> 00:52:31.730
<v Cian>We rolled back the change of the header, and we moved the logic around.

00:52:31.730 --> 00:52:36.230
<v Cian>We moved the validation of the content length out to the edge network so we

00:52:36.230 --> 00:52:40.970
<v Cian>could do some like after our load balancers had done all their work and hyper had changed.

00:52:42.780 --> 00:52:46.660
<v Cian>Nginx was just handling that 307 completely

00:52:46.660 --> 00:52:50.540
<v Cian>differently and, arguably, incorrectly.

00:52:50.540 --> 00:52:53.620
<v Cian>It was massaging it

00:52:53.620 --> 00:52:57.720
<v Cian>into a form that the load balancer was accepting. And

00:52:57.720 --> 00:53:00.620
<v Cian>we needed to work around all of

00:53:00.620 --> 00:53:03.680
<v Cian>those kind of weird edge cases that we had previously

00:53:03.680 --> 00:53:06.760
<v Cian>just got Nginx working on. Nginx was just doing

00:53:06.760 --> 00:53:10.080
<v Cian>stuff. We moved it out to our CDN

00:53:10.080 --> 00:53:12.720
<v Cian>layer, so our request processing was at the

00:53:12.720 --> 00:53:18.320
<v Cian>edge then, and it works. Like, once you move those things around, you

00:53:18.320 --> 00:53:25.000
<v Cian>can see that it does work. But there are so many weird edge cases in

00:53:25.000 --> 00:53:30.660
<v Cian>HTTP that I can't say this is a drop-in replacement. It's one

00:53:30.660 --> 00:53:31.980
<v Cian>of those you really have to test.

00:53:32.790 --> 00:53:42.570
<v Matthias>Yeah, I remember that in one of our earlier episodes with the maintainer of cURL, Daniel Stenberg,

00:53:42.910 --> 00:53:50.110
<v Matthias>he mentioned a very similar problem, which is that Hyper was very strict about

00:53:50.110 --> 00:53:53.210
<v Matthias>certain ways HTTP traffic should be handled.

00:53:53.790 --> 00:53:59.470
<v Matthias>And cURL needs to be extremely permissive because people expect it.

00:53:59.570 --> 00:54:02.510
<v Matthias>That's kind of the API of the command line tool.

00:54:03.070 --> 00:54:08.730
<v Matthias>And he needed people to go in and either soften the edges on hyper or make parts

00:54:08.730 --> 00:54:13.530
<v Matthias>of that transition layer a bit more permissive on the cURL side.

00:54:13.770 --> 00:54:16.870
<v Matthias>But that was a tough job for them.

00:54:17.030 --> 00:54:20.190
<v Matthias>And eventually they removed the Rust backend because of that.

00:54:20.330 --> 00:54:24.850
<v Matthias>So because they couldn't fix it or there were not enough people who wanted to put in the work.

00:54:25.290 --> 00:54:30.090
<v Matthias>And you hit this because, I just want to reemphasize that, you hit that because

00:54:30.090 --> 00:54:34.830
<v Matthias>Granian, the WSGI server, uses hyper, correct?

00:54:34.830 --> 00:54:37.890
<v Cian>Yeah, exactly. It's the exact same stuff. hyper

00:54:37.890 --> 00:54:41.230
<v Cian>was doing everything technically correct.

00:54:41.230 --> 00:54:47.810
<v Cian>The fun sentence of "everything is technically correct." Granian uses hyper and

00:54:47.810 --> 00:54:55.610
<v Cian>Tokio and PyO3. It's just core libraries. So it was using hyper to serialize

00:54:55.610 --> 00:54:59.790
<v Cian>a response, and it was just...

00:54:59.790 --> 00:55:06.110
<v Matthias>And do you believe that we need more permissive libraries, more permissive Rust crates,

00:55:06.110 --> 00:55:13.750
<v Matthias>for real-world HTTP usage or other areas where things have historically grown,

00:55:13.750 --> 00:55:18.570
<v Matthias>to make those, you know, Rust adaptations easier for people?

00:55:19.450 --> 00:55:25.650
<v Matthias>Or would you rather say, well, no, instead we should work with better standards

00:55:25.650 --> 00:55:29.530
<v Matthias>and maybe fix our code?

00:55:30.510 --> 00:55:34.810
<v Cian>Yeah. As I've noted, I work with some of the best and worst clients.

00:55:35.270 --> 00:55:43.630
<v Cian>They do retries, they expect really good responses, but I don't own the API contract on them.

00:55:43.770 --> 00:55:46.250
<v Cian>I have to just follow the API contract.

00:55:46.990 --> 00:55:50.210
<v Cian>I would love to say that we as an industry should be following the standards

00:55:50.210 --> 00:55:55.010
<v Cian>and being strict about them. And I can totally see that. If I look back at me five

00:55:55.010 --> 00:56:00.350
<v Cian>years ago, I would be there shouting, no, no, follow the standards, we should make

00:56:00.350 --> 00:56:02.610
<v Cian>everyone who doesn't follow the standards feel the pain.

00:56:03.210 --> 00:56:05.930
<v Cian>The issue there is that's a

00:56:05.930 --> 00:56:09.010
<v Cian>lot of people, that's a lot of pain, and it's not

00:56:09.010 --> 00:56:12.890
<v Cian>something you can fix overnight. Like, I

00:56:12.890 --> 00:56:15.830
<v Cian>know, because I work in a package company, a lot

00:56:15.830 --> 00:56:18.970
<v Cian>of people run a lot of different versions of the same software.

00:56:18.970 --> 00:56:26.990
<v Cian>So even if we started making tools stricter, and everyone on

00:56:26.990 --> 00:56:31.870
<v Cian>February 28th decided to do one launch where everything switched to strict

00:56:31.870 --> 00:56:37.770
<v Cian>mode in every library, we'd then have to get that rolled out to every version of that software.

00:56:38.010 --> 00:56:41.370
<v Cian>It's going to be a painful rollout.

00:56:42.270 --> 00:56:46.730
<v Cian>You need to have a level of permissiveness in the clients.

00:56:47.840 --> 00:56:52.360
<v Cian>Saying that, I don't want the default to be permissive. The default should be perfect.

00:56:52.720 --> 00:56:55.660
<v Cian>It should be the best way a client should run.

00:56:55.840 --> 00:57:01.900
<v Cian>The client should have timeouts. It should have sane defaults and should follow the standard.

00:57:02.320 --> 00:57:07.800
<v Cian>But when you run a legacy system, you're going to have a lot of weird legacy

00:57:07.800 --> 00:57:12.400
<v Cian>issues. And you need to be able to flip those switches off to mean that you

00:57:12.400 --> 00:57:13.660
<v Cian>can enable these things.

00:57:13.660 --> 00:57:14.760
<v Cian>Otherwise,

00:57:14.760 --> 00:57:18.980
<v Cian>you're going to end up with a lot of duct tape around your very strict system

00:57:18.980 --> 00:57:20.180
<v Cian>to flip those switches off.

00:57:20.180 --> 00:57:27.480
<v Matthias>Yeah, be very strict initially and then lower the guard. Yeah, exactly. Now, when

00:57:27.480 --> 00:57:34.000
<v Matthias>you look back on the project, what would you say were your key learnings? I'm

00:57:34.000 --> 00:57:37.300
<v Matthias>talking about things that you would have done differently, but also things where

00:57:37.300 --> 00:57:39.420
<v Matthias>you believe Rust is a good fit.

00:57:40.600 --> 00:57:42.000
<v Matthias>How did that project go?

00:57:42.280 --> 00:57:44.360
<v Matthias>Maybe you can summarize it in a few sentences.

00:57:46.250 --> 00:57:50.430
<v Cian>The project could have gone a lot better. It's still underway.

00:57:51.550 --> 00:57:54.810
<v Cian>We're using it in specific environments now.

00:57:55.050 --> 00:58:00.550
<v Cian>We haven't rolled out 100% everywhere because of these weird edge cases we found with Docker.

00:58:01.630 --> 00:58:06.150
<v Cian>And the other issue we found was about connection management to our database.

00:58:06.770 --> 00:58:13.070
<v Cian>It's a big problem. We need to do some upgrades, which means we've held off

00:58:13.070 --> 00:58:14.310
<v Cian>and we haven't got there.

00:58:15.030 --> 00:58:17.950
<v Cian>And that was the biggest thing about the project.

00:58:17.950 --> 00:58:22.030
<v Cian>That was the unknown unknowns. We

00:58:22.030 --> 00:58:28.050
<v Cian>sat down, and I keep saying we, there was maybe me, a principal to review

00:58:28.050 --> 00:58:36.250
<v Cian>my work, and a manager to sign off on it, and set out that

00:58:36.250 --> 00:58:38.530
<v Cian>we'd leverage our end-to-end tests to do stuff.

00:58:38.750 --> 00:58:44.810
<v Cian>We'd use our load tests to validate our request throughput and that kind of stuff.

00:58:45.590 --> 00:58:50.470
<v Cian>But we never had a plan. And we had rollback and rollout plans.

00:58:50.590 --> 00:58:55.250
<v Cian>We had rollout plans that were like, we'll canary in lower environments, raise them up.

00:58:55.910 --> 00:58:59.570
<v Cian>We'll do it in off regions in quiet times.

00:59:00.510 --> 00:59:05.330
<v Cian>Following the SRE handbook of how do you roll out changes safely.

00:59:06.510 --> 00:59:10.150
<v Cian>But we had issues

00:59:10.150 --> 00:59:15.450
<v Cian>with that business logic. At the start of all this, we started

00:59:15.450 --> 00:59:22.050
<v Cian>pulling in Rust tools to speed up Python because we didn't want to do a full

00:59:22.050 --> 00:59:27.210
<v Cian>rewrite, for many different reasons. We wanted to use

00:59:28.410 --> 00:59:31.170
<v Cian>small bits of Rust in our stack to speed it

00:59:31.170 --> 00:59:34.170
<v Cian>up, or small bits of C as well, if that was going to be there

00:59:34.170 --> 00:59:40.450
<v Cian>as well. We were very much just looking for faster ways to do what we were currently

00:59:40.450 --> 00:59:50.210
<v Cian>doing. But what we were currently doing wasn't understood well enough by myself and

00:59:51.130 --> 00:59:54.070
<v Cian>others, because we have that 10 years of legacy.

00:59:54.070 --> 00:59:56.890
<v Cian>There are edge cases where the person

00:59:56.890 --> 00:59:59.650
<v Cian>who worked on it has come and gone, or that

00:59:59.650 --> 01:00:03.070
<v Cian>same person has come and gone from the company three times. He's

01:00:03.070 --> 01:00:07.730
<v Cian>my principal, and he's like, this is kind of reminding me of an outage I had five

01:00:07.730 --> 01:00:12.670
<v Cian>years ago. And he's trying to remember it, and we're trying to fix it. These

01:00:12.670 --> 01:00:18.710
<v Cian>are the things I would have loved to know beforehand. I would have loved to have

01:00:18.710 --> 01:00:21.370
<v Cian>known that we were going to run into these

01:00:22.090 --> 01:00:25.130
<v Cian>weird edge cases. And I don't know

01:00:25.130 --> 01:00:28.310
<v Cian>how I would have known, how I would have got there. How much

01:00:28.310 --> 01:00:31.750
<v Cian>more time researching could we have come up with? How much

01:00:31.750 --> 01:00:34.530
<v Cian>more testing could we have done? Were these things we were

01:00:34.530 --> 01:00:41.970
<v Cian>only going to find in prod? Probably. But I wish we had a

01:00:41.970 --> 01:00:48.510
<v Cian>better way of validating these things, like a better test suite for HTTP testing,

01:00:48.510 --> 01:00:51.810
<v Cian>a better test suite for different clients.

01:00:53.810 --> 01:00:58.170
<v Matthias>I guess the question I would ask to myself is, had I known all of these things

01:00:58.170 --> 01:00:59.910
<v Matthias>before, would it have made any difference?

01:01:00.870 --> 01:01:07.690
<v Matthias>Would I have made a different choice? And maybe the outcome was kind of still worth it?

01:01:08.450 --> 01:01:14.270
<v Cian>Yeah. No, we retroed. Like I said, we're still in process, but we do regular retros.

01:01:14.570 --> 01:01:17.710
<v Cian>And that question came up. Was this the right choice?

01:01:17.850 --> 01:01:23.730
<v Cian>Should we have made a different choice? And I said, and we all agreed this is the right choice.

01:01:25.210 --> 01:01:29.950
<v Cian>There is something here that is worth testing. It's worth using.

01:01:30.870 --> 01:01:35.590
<v Cian>If we're not moving as fast as we want to and we've just introduced a new thing,

01:01:35.790 --> 01:01:39.070
<v Cian>we still know where we're going. We all agreed on a roadmap.

01:01:40.190 --> 01:01:43.510
<v Cian>The roadmap was just a lot longer than we thought we'd agreed upon.

01:01:43.770 --> 01:01:51.870
<v Cian>But it's still worth it. The speed increases we're seeing, they're totally worth it.

01:01:54.610 --> 01:01:59.890
<v Matthias>Now, looking back at your 10 years of Rust experience,

01:01:59.890 --> 01:02:03.250
<v Matthias>three of them professionally, how has

01:02:03.250 --> 01:02:06.430
<v Matthias>your perception of Rust changed over time?

01:02:06.430 --> 01:02:09.470
<v Matthias>Remember that maybe when

01:02:09.470 --> 01:02:14.590
<v Matthias>you started, you might have been enthusiastic about the language, just trying

01:02:14.590 --> 01:02:19.430
<v Matthias>to explore what's there. But now that you use it professionally, what would you

01:02:19.430 --> 01:02:26.730
<v Matthias>say has shaped your perception of Rust in the last couple of years?

01:02:29.370 --> 01:02:34.470
<v Cian>Rust has changed a lot over 10 years. Like, I can remember a time when you'd

01:02:34.470 --> 01:02:37.790
<v Cian>get a Clippy warning that would tell you, don't do that, do this.

01:02:37.950 --> 01:02:40.370
<v Cian>You'd do that, and it would produce a different Clippy warning.

01:02:40.590 --> 01:02:44.050
<v Cian>And you could be 10 Clippy warnings deep before you had the working code.

01:02:44.870 --> 01:02:51.170
<v Cian>Rust today is a lot different than that was. Like, you get one Clippy warning, and then you're fixed.

01:02:51.190 --> 01:02:56.190
<v Cian>Or maybe you get one Clippy warning, and your fix is ever so slightly different

01:02:56.190 --> 01:02:59.050
<v Cian>because you didn't turn on pedantic mode or something like that.

01:02:59.050 --> 01:03:05.730
<v Cian>But Rust is a lot friendlier now than it used to be. And when I started

01:03:05.730 --> 01:03:11.030
<v Cian>writing Rust, I was very much just looking at the cool, hip language. I think

01:03:11.030 --> 01:03:16.230
<v Cian>I first found Rust at FOSDEM, going full circle in my life.

01:03:16.230 --> 01:03:17.650
<v Matthias>Me too,

01:03:17.650 --> 01:03:18.110
<v Matthias>by the way.

01:03:18.110 --> 01:03:23.410
<v Cian>Yeah. Mozilla was so big on it, and it seemed so interesting. And I had

01:03:23.410 --> 01:03:29.090
<v Cian>just come off learning Go, and Google

01:03:29.650 --> 01:03:34.090
<v Cian>was going, Go is great, Go is great. And I was talking to SREs who were like, this

01:03:34.090 --> 01:03:40.790
<v Cian>Go thing seems really cool. But there were some idiosyncratic

01:03:40.790 --> 01:03:47.030
<v Cian>things about Go that I was never a big fan of, so that's why I started learning Rust.

01:03:47.830 --> 01:03:55.630
<v Cian>I think it's still got some rough edges that are not fully sanded out.

01:03:56.670 --> 01:04:02.850
<v Cian>The story's not there yet. Like, when to choose a framework is still like an

01:04:02.850 --> 01:04:04.330
<v Cian>interesting problem you have in Rust.

01:04:04.330 --> 01:04:13.790
<v Cian>You have the issue of, do you use Hyper or do you use Axum or do you use Rocket?

01:04:14.030 --> 01:04:18.970
<v Cian>And I'm not even sure is Rocket still a thing. Like, I remember when that came

01:04:18.970 --> 01:04:22.250
<v Cian>out and it was really good, but I've never used it professionally.

01:04:22.250 --> 01:04:25.250
<v Cian>I think I've always reached for Axum and Hyper professionally.

01:04:26.990 --> 01:04:31.070
<v Cian>Because the smaller projects don't

01:04:31.070 --> 01:04:33.990
<v Cian>move as fast

01:04:33.990 --> 01:04:37.270
<v Cian>sometimes. But, saying that, hyper moves

01:04:37.270 --> 01:04:41.110
<v Cian>very slowly. hyper only went v1 a

01:04:41.110 --> 01:04:43.930
<v Cian>year and a bit ago. It took a

01:04:43.930 --> 01:04:46.990
<v Cian>long time to get to v1, and v1 was a big change

01:04:46.990 --> 01:04:49.950
<v Cian>as well. So that switch was almost

01:04:49.950 --> 01:04:54.010
<v Cian>a full rewrite of services. And I

01:04:54.010 --> 01:04:57.530
<v Cian>think that's the thing I appreciate, though, about Rust: we took

01:04:57.530 --> 01:05:00.490
<v Cian>our time to get an API that was

01:05:00.490 --> 01:05:03.330
<v Cian>going to be stable, that wasn't going to change a

01:05:03.330 --> 01:05:06.010
<v Cian>lot, and that you can

01:05:06.010 --> 01:05:09.410
<v Cian>work against. But when you

01:05:09.410 --> 01:05:12.550
<v Cian>look, how many of the projects have never

01:05:12.550 --> 01:05:15.470
<v Cian>hit v1 is scary. I look

01:05:15.470 --> 01:05:18.990
<v Cian>at my Cargo.lock file

01:05:18.990 --> 01:05:21.810
<v Cian>or my Cargo.toml, and a lot of

01:05:21.810 --> 01:05:27.790
<v Cian>my projects are still 0.8, 0.7, 0.1239,

01:05:27.790 --> 01:05:33.450
<v Cian>these values where I'm like, they could break at any time. But I need to keep

01:05:33.450 --> 01:05:37.450
<v Cian>track of these things because I work at a package management company where people

01:05:37.450 --> 01:05:42.590
<v Cian>need to track stable and secure versions constantly.
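
NOTE
Editor's sketch, not part of the conversation. One reason 0.x crates tend to break less often than the version numbers suggest is Cargo's default ("caret") requirement: for a 0.x dependency, a minor bump such as 0.7 to 0.8 is treated as incompatible, so it is not picked up until Cargo.toml is edited. The Rust below is a simplified, std-only model of that rule for full major.minor.patch versions. The type and function names are hypothetical, and pre-release tags and partial versions are deliberately left out.
```rust
// Simplified model of a semantic version; field order matters because the
// derived ordering compares major, then minor, then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}
// Parse "1.2.3" into a Version; returns None for anything else.
pub fn parse(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let v = Version {
        major: parts.next()??,
        minor: parts.next()??,
        patch: parts.next()??,
    };
    // Reject trailing components such as "1.2.3.4".
    if parts.next().is_some() { None } else { Some(v) }
}
// True if `candidate` satisfies a caret requirement on `required`: it must
// not be older, and the left-most non-zero component must be unchanged.
pub fn caret_compatible(required: Version, candidate: Version) -> bool {
    if candidate < required {
        return false;
    }
    if required.major > 0 {
        candidate.major == required.major
    } else if required.minor > 0 {
        candidate.major == 0 && candidate.minor == required.minor
    } else {
        candidate == required
    }
}
```
In this model, 0.7.9 is accepted for a requirement of 0.7.1, while 0.8.0 is not.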

01:05:43.070 --> 01:05:44.110
<v Matthias>I would say...

01:05:45.300 --> 01:05:51.860
<v Matthias>Yes, a lot of versions are still not 1.0, or there are a lot of unstable crates out there.

01:05:52.000 --> 01:05:57.680
<v Matthias>But at least from my experience, they break less often than in other ecosystems.

01:05:57.680 --> 01:06:03.500
<v Matthias>Even though there might be a feature release bump or so, rarely do I need to

01:06:03.500 --> 01:06:07.480
<v Matthias>go in and make any bigger sweeping changes.

01:06:07.480 --> 01:06:11.240
<v Matthias>It's rather just minor things, or sometimes I'm not even affected by that.

01:06:11.240 --> 01:06:17.260
<v Matthias>And so, I don't want to devalue your point, but it's just to give people

01:06:17.260 --> 01:06:21.840
<v Matthias>some perspective, maybe people that don't work with Rust a lot: it might not

01:06:21.840 --> 01:06:24.320
<v Matthias>be the biggest problem right now in the ecosystem.

01:06:24.320 --> 01:06:27.200
<v Cian>No, I'd say I don't think it's the biggest problem

01:06:27.200 --> 01:06:30.460
<v Cian>in the ecosystem right now. I think

01:06:30.460 --> 01:06:33.840
<v Cian>it's one of those problems that is a bit of a perception problem,

01:06:33.840 --> 01:06:36.600
<v Cian>and it's very much one that

01:06:36.600 --> 01:06:40.140
<v Cian>newbies might

01:06:40.140 --> 01:06:42.920
<v Cian>feel a lot more. Like, I say

01:06:42.920 --> 01:06:46.520
<v Cian>that I'm bumping crates and I'm

01:06:46.520 --> 01:06:49.580
<v Cian>maybe being apprehensive, but

01:06:49.580 --> 01:06:52.340
<v Cian>I have the same experience you have: I rarely have to go

01:06:52.340 --> 01:06:55.980
<v Cian>in and actually change an API. I

01:06:55.980 --> 01:06:59.760
<v Cian>don't. I did a very big bump on some stuff recently, and

01:06:59.760 --> 01:07:03.380
<v Cian>we're using a new framework,

01:07:03.380 --> 01:07:06.120
<v Cian>we're testing a new framework for writing parts of our edge code in

01:07:06.120 --> 01:07:09.020
<v Cian>Rust, and it went from version

01:07:09.020 --> 01:07:11.800
<v Cian>0.6 to 0.7, and I think

01:07:11.800 --> 01:07:14.560
<v Cian>it just added a lot of optional args to a lot of stuff, so we had

01:07:14.560 --> 01:07:18.120
<v Cian>to read the docs and add those optional args. That was not a

01:07:18.120 --> 01:07:22.000
<v Cian>big change. It took maybe an hour of my time to just do that, and that was fine.

01:07:22.000 --> 01:07:25.940
<v Cian>And, saying that, it came with improvements. It came with cache improvements

01:07:25.940 --> 01:07:31.800
<v Cian>and all that kind of stuff, so taking in those changes was good. It was obviously

01:07:31.800 --> 01:07:33.720
<v Cian>feature changes, not bug fixes and that.

01:07:33.860 --> 01:07:37.860
<v Cian>So I'm like happy to take that stuff in. But when I...

01:07:38.630 --> 01:07:43.050
<v Cian>We're trying out more Rust and I'm bringing more people in to look at Rust.

01:07:43.550 --> 01:07:49.490
<v Cian>They're coming from a Python world and coming from different worlds.

01:07:50.110 --> 01:07:55.530
<v Cian>And they look at a lock file and they say, why are none of these things stable?

01:07:55.770 --> 01:08:00.330
<v Cian>I have to have that conversation with them about why we're still using pre-release

01:08:00.330 --> 01:08:04.810
<v Cian>software and why it might be years before that pre-release software comes in.

01:08:05.310 --> 01:08:09.750
<v Cian>And I don't think it's a problem you need to fix, but maybe it's a problem of education.

01:08:10.190 --> 01:08:19.370
<v Cian>And how do we talk about the v0 of packages to make people understand that this

01:08:19.370 --> 01:08:23.110
<v Cian>is, should this be production or should this not be production?

01:08:23.470 --> 01:08:27.370
<v Cian>A v1 isn't a signal that this should be production or not.

01:08:27.530 --> 01:08:30.190
<v Cian>It's just a signal of stability of the API.

01:08:30.650 --> 01:08:32.590
<v Matthias>Do you think you will use Rust in 10 years?

01:08:33.210 --> 01:08:36.590
<v Cian>I hope so. Like, there's an answer of, I hope so.

01:08:37.110 --> 01:08:43.130
<v Cian>I think languages change a lot, and language ecosystems change a lot.

01:08:44.370 --> 01:08:48.510
<v Cian>I didn't think 10 years ago I'd be still writing Python or JavaScript,

01:08:48.870 --> 01:08:50.690
<v Cian>but I'm still writing Python and JavaScript.

01:08:50.990 --> 01:08:56.170
<v Cian>But you look at them, and they're a lot different to the Python and JavaScript you wrote 10 years ago.

01:08:56.490 --> 01:08:59.370
<v Cian>So I think Rust is here to stay.

01:08:59.890 --> 01:09:02.550
<v Cian>It's, I said earlier, it's in the Linux kernel now.

01:09:03.190 --> 01:09:10.050
<v Cian>It's in low-level libraries for Python. It's in uv. It's in ty.

01:09:11.170 --> 01:09:14.530
<v Cian>It's becoming a core part of our industry.

01:09:14.830 --> 01:09:19.590
<v Cian>But how will I be writing it? Or will someone else be writing it? I don't know.

01:09:19.890 --> 01:09:26.110
<v Cian>Maybe we'll have got to a point where we have saturated the amount of rust we

01:09:26.110 --> 01:09:28.330
<v Cian>need to write. And we can use...

01:09:29.550 --> 01:09:32.870
<v Cian>higher-level tooling built on top of that Rust. Could

01:09:32.870 --> 01:09:36.290
<v Cian>we have a language that's less verbose than

01:09:36.290 --> 01:09:38.990
<v Cian>Rust that gives us

01:09:38.990 --> 01:09:42.050
<v Cian>the same memory safety? Could we take the

01:09:42.050 --> 01:09:45.810
<v Cian>lessons we learned from the borrow checker and apply

01:09:45.810 --> 01:09:50.930
<v Cian>that to a language that looks something like Python for business logic,

01:09:50.930 --> 01:09:56.630
<v Cian>and call in and out of it? Maybe that's better for us. Maybe that's actually

01:09:56.630 --> 01:10:03.170
<v Cian>what I want: a language that takes all the learnings from Rust and takes the stability from Rust,

01:10:03.190 --> 01:10:09.190
<v Cian>but is a little friendlier for newcomers or a little easier for people fresh,

01:10:09.450 --> 01:10:14.250
<v Cian>for graduates fresh out of college to get started with without feeling like

01:10:14.250 --> 01:10:17.830
<v Cian>they're writing a systems language. Because that's something you always hear.

01:10:18.030 --> 01:10:23.510
<v Cian>Rust is a systems language. It's for systems programming. It's for systems problems,

01:10:23.510 --> 01:10:26.390
<v Cian>which isn't true. You can write anything.

01:10:26.690 --> 01:10:29.510
<v Cian>Rust is a language. It's a tool. You can do whatever you want with that tool.

01:10:29.870 --> 01:10:34.190
<v Cian>I've written business APIs in it. I've written load balancers in it.

01:10:34.330 --> 01:10:37.870
<v Cian>I've written CLIs in it. It's great for all of those things.

01:10:38.450 --> 01:10:43.170
<v Cian>And we've learned a lot from it that we could apply to other places.

01:10:43.690 --> 01:10:46.550
<v Cian>So will I be writing Rust? I hope so.

01:10:47.230 --> 01:10:53.530
<v Cian>Will everyone be writing Rust? Probably not. Will there be a new language that

01:10:53.530 --> 01:10:55.730
<v Cian>hopefully is inspired by Rust?

01:10:56.190 --> 01:10:58.510
<v Cian>Probably. Will there be a new language? Definitely.

01:11:00.330 --> 01:11:05.330
<v Matthias>And finally, what's your message to the Rust community as a whole?

01:11:05.330 --> 01:11:08.910
<v Cian>I think it's gonna start with a thanks, because I

01:11:08.910 --> 01:11:12.070
<v Cian>wouldn't be doing what I enjoy right now without the Rust community. Like,

01:11:12.070 --> 01:11:15.350
<v Cian>that knowledge sharing, the

01:11:15.350 --> 01:11:18.230
<v Cian>Rust books, people willing to talk about it.

01:11:18.230 --> 01:11:21.870
<v Cian>And whenever someone's willing to talk about it, it's always that very enthusiastic

01:11:21.870 --> 01:11:24.910
<v Cian>talking about it. Like, I

01:11:24.910 --> 01:11:27.550
<v Cian>was lucky enough at FOSDEM to go for dinner with a lot

01:11:27.550 --> 01:11:30.710
<v Cian>of the other speakers for the Rust room. The enthusiasm

01:11:30.710 --> 01:11:33.510
<v Cian>people have about their projects and not the language, it's

01:11:33.510 --> 01:11:36.630
<v Cian>great. And people are always willing to have a

01:11:36.630 --> 01:11:40.010
<v Cian>very open conversation and talk about different things,

01:11:40.010 --> 01:11:44.890
<v Cian>lessons learned, and all that kind of stuff. And I gotta say, that's such a great

01:11:44.890 --> 01:11:49.130
<v Cian>thing, and we need to keep that. It's so important. I don't think I would have

01:11:49.130 --> 01:11:55.430
<v Cian>got into Rust without that. Reading the Rust book

01:11:55.430 --> 01:11:56.910
<v Cian>is what let me learn Rust.

01:11:57.370 --> 01:12:02.710
<v Cian>It's such a nice way to learn. And I think we have to keep focusing on ways

01:12:02.710 --> 01:12:06.590
<v Cian>to make it easy to get new people into learning the language,

01:12:06.830 --> 01:12:13.090
<v Cian>to make it a better language, and to make people not think of it as a fad or

01:12:13.090 --> 01:12:15.070
<v Cian>a systems programming language.

01:12:15.070 --> 01:12:18.390
<v Cian>We have to focus on that path for beginners.

01:12:19.170 --> 01:12:22.070
<v Cian>Tools like Clippy have made massive improvements there.

01:12:22.070 --> 01:12:24.890
<v Cian>It's more than just a linter;

01:12:24.890 --> 01:12:28.050
<v Cian>it's a tool for helping you learn how to

01:12:28.050 --> 01:12:32.550
<v Cian>write good and idiomatic Rust. And

01:12:32.550 --> 01:12:37.370
<v Cian>when we focus on tooling that's natural to humans, I think we just come up with

01:12:37.370 --> 01:12:41.830
<v Cian>a better language. And I think we have to keep that in mind when we develop Rust:

01:12:41.830 --> 01:12:50.450
<v Cian>it's tooling to make you, as a human, enjoy writing Rust and make sure it's not a pain.

01:12:50.450 --> 01:12:52.350
<v Matthias>Where can people learn more about Cloudsmith?

01:12:53.210 --> 01:12:58.090
<v Cian>So cloudsmith.com is our website. If you want to use Cloudsmith or

01:12:58.090 --> 01:13:00.870
<v Cian>think that you need better package management, check it out.

01:13:01.490 --> 01:13:06.010
<v Cian>If you are interested in joining us, we are always hiring.

01:13:06.530 --> 01:13:11.190
<v Cian>My team is experimenting with Rust. So if you're a Rust developer and want to

01:13:11.190 --> 01:13:15.710
<v Cian>write some Rust in production, reach out to me.

01:13:16.950 --> 01:13:20.150
<v Cian>I'll get my email dropped in the show notes so people can reach out.

01:13:20.770 --> 01:13:25.910
<v Cian>And if you just want to talk about Cloudsmith, package management, or Rust, you can also reach out.

01:13:27.190 --> 01:13:31.370
<v Matthias>Amazing. Cian, thanks so much for taking the time for the interview today.

01:13:31.670 --> 01:13:34.190
<v Cian>Thank you. It's been a very pleasurable chat.

01:13:35.090 --> 01:13:38.770
<v Matthias>Rust in Production is a podcast by corrode. It is hosted by me,

01:13:39.070 --> 01:13:41.850
<v Matthias>Matthias Endler, and produced by Simon Brüggen.

01:13:42.010 --> 01:13:46.290
<v Matthias>For show notes, transcripts, and to learn more about how we can help your company

01:13:46.290 --> 01:13:49.190
<v Matthias>make the most of Rust, visit corrode.dev.

01:13:49.370 --> 01:13:51.790
<v Matthias>Thanks for listening to Rust in Production.