WEBVTT

00:00:01.530 --> 00:00:05.810
<v Matthias>It's time for another episode of Rust in Production, a podcast about companies

00:00:05.810 --> 00:00:08.350
<v Matthias>who use Rust to shape the future of infrastructure.

00:00:08.790 --> 00:00:13.370
<v Matthias>I'm Matthias Endler from corrode and today's guest is Charlie Marsh from Astral.

00:00:13.590 --> 00:00:16.970
<v Matthias>We talk about improving the Python ecosystem with Rust.

00:00:18.970 --> 00:00:23.670
<v Matthias>Charlie, thanks for being a guest. Can you say a few words about yourself and

00:00:23.670 --> 00:00:25.850
<v Matthias>about Astral, the company you work for?

00:00:26.570 --> 00:00:28.690
<v Charlie>Yeah, of course. Thanks so much for having me, first of all.

00:00:28.810 --> 00:00:29.690
<v Charlie>My name is Charlie Marsh.

00:00:30.130 --> 00:00:36.790
<v Charlie>I run a company called Astral. We build high-performance developer tools for the Python ecosystem.

00:00:36.790 --> 00:00:44.050
<v Charlie>So we're best known for two tools: Ruff, which is a sort of combined linter, formatter,

00:00:44.050 --> 00:00:45.490
<v Charlie>code transformation tool.

00:00:45.630 --> 00:00:48.670
<v Charlie>You could think of it kind of like rustfmt and Clippy together,

00:00:49.030 --> 00:00:51.370
<v Charlie>which is again for Python code, but written in Rust.

00:00:51.610 --> 00:00:56.590
<v Charlie>And then uv, which is our project manager, Python package manager,

00:00:56.770 --> 00:00:59.690
<v Charlie>Python toolchain manager. You could think of it a little

00:00:59.690 --> 00:01:01.010
<v Charlie>bit like Cargo and rustup.

00:01:01.130 --> 00:01:05.790
<v Charlie>So it tries to bootstrap or will bootstrap Python for you and then help you

00:01:05.790 --> 00:01:10.650
<v Charlie>manage your dependencies, install things, lock them into lock files and reproducible

00:01:10.650 --> 00:01:12.390
<v Charlie>versions and all that kind of stuff.

00:01:12.810 --> 00:01:18.130
<v Charlie>So yeah, everything we've built so far is open source and written in Rust.

00:01:18.350 --> 00:01:26.530
<v Charlie>We're a team of about 15 people spread from the Pacific, like Pacific time in the U.S.

00:01:26.530 --> 00:01:29.690
<v Charlie>We have people in Pacific time, Central time, Eastern time,

00:01:29.690 --> 00:01:35.110
<v Charlie>and we have one person in the UK. We have four people in CET, like Germany,

00:01:35.110 --> 00:01:39.150
<v Charlie>Switzerland, the Netherlands, and then one person in India. So not only are we

00:01:39.150 --> 00:01:43.390
<v Charlie>remote, we're very distributed, kind of like a lot of open source. And

00:01:43.390 --> 00:01:47.150
<v Charlie>we spend all our time basically writing Rust to try and make Python better.

00:01:47.150 --> 00:01:56.630
<v Matthias>I heard about Astral the first time when you published Ruff, but it took off with uv, I would say. Like,

00:01:56.630 --> 00:01:57.190
<v Matthias>there was

00:01:57.190 --> 00:02:04.030
<v Matthias>definitely a huge sympathy for what you do in the packaging space as well. Did

00:02:04.030 --> 00:02:08.330
<v Matthias>you see that as well from a company perspective, that there was huge growth?

00:02:08.920 --> 00:02:12.400
<v Charlie>I think so. Yeah. I mean, I think, you know, we started with Ruff.

00:02:12.640 --> 00:02:15.400
<v Charlie>I started working on Ruff before the company existed.

00:02:16.040 --> 00:02:18.940
<v Charlie>And it was just something I was building.

00:02:19.620 --> 00:02:24.040
<v Charlie>Well, for a lot of reasons, but largely because I saw these were problems that

00:02:24.040 --> 00:02:25.700
<v Charlie>I had experienced in my own projects.

00:02:25.860 --> 00:02:30.080
<v Charlie>And I was like, what would it be like if I wrote, you know, Python tooling in Rust instead?

00:02:30.280 --> 00:02:33.200
<v Charlie>And that was sort of the genesis of Ruff. And Ruff

00:02:33.200 --> 00:02:36.460
<v Charlie>grew extremely quickly. Like, we had, you know,

00:02:36.460 --> 00:02:39.940
<v Charlie>decades-old projects like SciPy and

00:02:39.940 --> 00:02:44.180
<v Charlie>stuff adopting it while it was still what I would consider to be very

00:02:44.180 --> 00:02:49.280
<v Charlie>unstable. So it was clear that people wanted, you know, something in this

00:02:49.280 --> 00:02:55.840
<v Charlie>arena of faster Python tooling. So Ruff grew very fast, and, you know,

00:02:55.840 --> 00:02:58.480
<v Charlie>we were seeing what was happening with Ruff and we were like, well,

00:02:59.080 --> 00:03:01.900
<v Charlie>as a company and in terms of the problems we're trying to

00:03:01.900 --> 00:03:04.680
<v Charlie>solve, we're not trying to

00:03:04.680 --> 00:03:07.980
<v Charlie>build just that. We want to build a Python

00:03:07.980 --> 00:03:11.020
<v Charlie>toolchain, effectively. We want to solve the hard

00:03:11.020 --> 00:03:14.120
<v Charlie>problems in Python. And for me, packaging was

00:03:14.120 --> 00:03:17.280
<v Charlie>kind of like, if you want to be the Python tooling

00:03:17.280 --> 00:03:20.000
<v Charlie>company, which, aspirationally, I guess we'd like

00:03:20.000 --> 00:03:22.840
<v Charlie>to be, then you have to work on packaging, I think,

00:03:22.840 --> 00:03:25.740
<v Charlie>because it's the thing that everyone has

00:03:25.740 --> 00:03:28.840
<v Charlie>trouble with and the thing that so many people have tried to

00:03:28.840 --> 00:03:31.560
<v Charlie>solve and done good work on, but I

00:03:31.560 --> 00:03:34.440
<v Charlie>don't think anyone would consider it solved. And so,

00:03:34.440 --> 00:03:38.920
<v Charlie>you know, I kind of knew from early on that we wanted to do something in packaging.

00:03:38.920 --> 00:03:44.920
<v Charlie>And before we released it, I definitely felt nervous in that Ruff had grown

00:03:44.920 --> 00:03:48.580
<v Charlie>well and people really liked it, and it kind of felt like it was going

00:03:48.580 --> 00:03:52.400
<v Charlie>to be a tough act to follow. Like, we don't want to be a one-hit wonder, you know,

00:03:52.400 --> 00:03:53.600
<v Charlie>with the tools that we built.

00:03:54.060 --> 00:03:58.080
<v Charlie>Like I wanted to build something that hopefully would be as exciting and grow

00:03:58.080 --> 00:04:00.960
<v Charlie>as well and have as much of an impact as Ruff.

00:04:01.080 --> 00:04:03.920
<v Charlie>And I think uv has actually really surpassed that in a lot of ways.

00:04:04.060 --> 00:04:09.300
<v Charlie>It's like, I think the impact it's had on Python has probably been more significant.

00:04:09.580 --> 00:04:13.000
<v Charlie>And I think like the, not that I don't love Ruff.

00:04:13.120 --> 00:04:18.540
<v Charlie>I mean, it was my first big project, and half the people on the

00:04:18.540 --> 00:04:21.440
<v Charlie>team are still working on Ruff and the static analysis tooling.

00:04:21.440 --> 00:04:25.280
<v Charlie>It's a big focus area for us. But I think I've been really amazed by how

00:04:25.280 --> 00:04:30.920
<v Charlie>quickly uv has had the impact that it's had. Like, we released it in February

00:04:30.920 --> 00:04:36.820
<v Charlie>of last year, so it just turned one year old. And in that time it's grown to, like,

00:04:37.940 --> 00:04:43.380
<v Charlie>I think last I checked, it's like a little over 12.5% of all requests to the

00:04:43.380 --> 00:04:50.500
<v Charlie>Python index come from uv, which is like over 200 million requests a day, which is wild, right?

00:04:50.640 --> 00:04:55.780
<v Charlie>And I spend a lot of my time talking to companies. There are tons of, like, $1

00:04:55.780 --> 00:04:59.260
<v Charlie>billion, $10 billion, even $100 billion companies that are using this thing in

00:04:59.260 --> 00:05:01.060
<v Charlie>production and have been for a long time.

00:05:01.220 --> 00:05:04.380
<v Charlie>So just the way it grew, I think, really surpassed

00:05:04.380 --> 00:05:07.240
<v Charlie>my expectations, and I feel

00:05:07.240 --> 00:05:12.160
<v Charlie>like we now have a platform to hopefully keep making Python better. But

00:05:12.160 --> 00:05:15.680
<v Charlie>I think you're right that, with uv, I think we shifted into a little bit of

00:05:15.680 --> 00:05:19.880
<v Charlie>another gear. We spread beyond... like, the company was just Ruff before,

00:05:19.880 --> 00:05:27.120
<v Charlie>and now we are Python tooling, and we try to solve a much wider surface area of problems now.

00:05:27.120 --> 00:05:31.640
<v Matthias>You said that Ruff was your first major Rust project,

00:05:31.760 --> 00:05:34.800
<v Matthias>maybe the first actual project that you tried after

00:05:34.800 --> 00:05:37.960
<v Matthias>learning Rust. I'm not sure about this, but it certainly

00:05:37.960 --> 00:05:41.520
<v Matthias>was a big milestone for you in your Rust journey. But then

00:05:41.520 --> 00:05:49.560
<v Matthias>you compared it with uv. Did you already learn a few things that you wanted

00:05:49.560 --> 00:05:55.540
<v Matthias>to avoid with uv that you did with Ruff? And also, did you set up the uv code

00:05:55.540 --> 00:05:58.060
<v Matthias>base early on for such growth?

00:05:58.060 --> 00:06:04.580
<v Charlie>Yeah, so I would say I really learned Rust in the process of writing Ruff.

00:06:05.420 --> 00:06:10.660
<v Charlie>I had done some Rust at my previous job, but someone else on the team introduced

00:06:10.660 --> 00:06:12.240
<v Charlie>Rust, really great engineer.

00:06:13.480 --> 00:06:16.800
<v Charlie>And when I was contributing to that code base, I was mostly trying to get in

00:06:16.800 --> 00:06:19.820
<v Charlie>and out as quickly as possible because I didn't really know Rust.

00:06:19.840 --> 00:06:22.480
<v Charlie>I was like, I need to fix a bug. How do I get this to compile?

00:06:22.680 --> 00:06:26.640
<v Charlie>I wasn't really taking the time to learn it. I was just using it when I had

00:06:26.640 --> 00:06:28.480
<v Charlie>to. So I had some Rust exposure.

00:06:29.540 --> 00:06:33.060
<v Charlie>But part of why I worked on Ruff in the first place was I felt like I wanted

00:06:33.060 --> 00:06:35.820
<v Charlie>to really learn Rust. And I thought that in order to do that,

00:06:35.940 --> 00:06:38.800
<v Charlie>I had to build something from scratch. And I had to like, it's just kind of

00:06:38.800 --> 00:06:41.280
<v Charlie>the way that I like to learn. I was like, I need to build something.

00:06:41.480 --> 00:06:46.380
<v Charlie>And like, even if that requires like, wasting a ton of time trying to understand

00:06:46.380 --> 00:06:50.060
<v Charlie>like lifetimes, like maybe I'll burn like two days trying to get this to compile.

00:06:50.220 --> 00:06:53.240
<v Charlie>But like, that's kind of how I learn is just like, I have to build things and

00:06:53.240 --> 00:06:56.120
<v Charlie>like fight through the problems. And I wanted to learn the ecosystem and just

00:06:56.120 --> 00:06:57.100
<v Charlie>the tooling and everything.

00:06:57.800 --> 00:07:02.200
<v Charlie>So I did really learn Rust in the process of writing Ruff.

00:07:02.420 --> 00:07:06.920
<v Charlie>I think that showed, and it probably still shows a little bit today,

00:07:07.180 --> 00:07:09.680
<v Charlie>like, you know, especially early on, there were a lot of things I was doing

00:07:09.680 --> 00:07:14.420
<v Charlie>that were just that I would now, I mean, this is, this is part of personal growth, right?

00:07:14.500 --> 00:07:17.160
<v Charlie>There's lots of things I was doing that I would now look at and laugh about

00:07:17.160 --> 00:07:19.300
<v Charlie>and say, that's obviously not the right way to do it.

00:07:19.800 --> 00:07:23.960
<v Charlie>Whether it was like silly performance things or the way I was structuring the

00:07:23.960 --> 00:07:26.780
<v Charlie>code or, I don't know, anything.

00:07:27.600 --> 00:07:32.600
<v Charlie>By the time that we started working on uv, I think I felt a lot more comfortable in Rust.

00:07:33.380 --> 00:07:37.100
<v Charlie>And I'd also learned more from looking at other projects too,

00:07:37.380 --> 00:07:40.220
<v Charlie>I think. And we had kind of evolved Ruff a little bit over time.

00:07:40.460 --> 00:07:46.420
<v Charlie>So for example, we have a fairly wide crate structure.

00:07:46.720 --> 00:07:48.580
<v Charlie>Like we just create a lot of crates.

00:07:48.900 --> 00:07:52.640
<v Charlie>Like, if you go and open up Ruff or uv, the crates subdirectory,

00:07:52.800 --> 00:07:54.360
<v Charlie>in uv, there's probably...

00:07:55.460 --> 00:08:00.020
<v Charlie>There's at least 20 crates, maybe like 30. So we just create lots of crates.

00:08:00.460 --> 00:08:04.900
<v Charlie>And that was something that I started to do in Ruff because

00:08:05.040 --> 00:08:07.140
<v Charlie>our build and compile times got really bad.

00:08:07.160 --> 00:08:11.360
<v Charlie>And it became like, if I just put everything in one crate, it was like our build

00:08:11.360 --> 00:08:14.720
<v Charlie>and compile times got worse and worse. And the development loop was painful.

00:08:15.480 --> 00:08:18.300
<v Charlie>And, you know, at the time I was reading a

00:08:18.300 --> 00:08:21.000
<v Charlie>lot about how crates are sort of the atomic unit for

00:08:21.000 --> 00:08:23.700
<v Charlie>rustc, and you can parallelize across crates, and you need

00:08:23.700 --> 00:08:27.180
<v Charlie>to be thinking about how your crate graph looks so

00:08:27.180 --> 00:08:30.080
<v Charlie>that there's a lot of fan-out and stuff. And I was like, okay, we're going

00:08:30.080 --> 00:08:33.480
<v Charlie>to start creating lots more smaller crates, and we're going to sort of change

00:08:33.480 --> 00:08:38.220
<v Charlie>the atomic unit of what the code base is from the module to the crate.

00:08:38.220 --> 00:08:42.440
<v Charlie>And so we started carving out lots of little crates, and we did basically

00:08:42.440 --> 00:08:47.060
<v Charlie>the same thing. And, for example, in Ruff, you know,

00:08:48.000 --> 00:08:50.940
<v Charlie>the core linter crate doesn't

00:08:50.940 --> 00:08:53.700
<v Charlie>depend on clap, for example. It doesn't depend on

00:08:53.700 --> 00:08:56.500
<v Charlie>any of the CLI stuff. The CLI stuff is in a separate crate

00:08:56.500 --> 00:08:59.260
<v Charlie>that depends on the linter crate. And you just try and get this structure, this

00:08:59.260 --> 00:09:02.260
<v Charlie>organization. Or, like, the parser and the AST, those are

00:09:02.260 --> 00:09:05.000
<v Charlie>all their own crates. So if you just need to test the

00:09:05.000 --> 00:09:07.700
<v Charlie>parser, right, it's really fast to compile and build. Or if

00:09:07.700 --> 00:09:10.760
<v Charlie>other people even just need to pull in the parser as a library, which

00:09:10.760 --> 00:09:13.420
<v Charlie>some people do, it's much easier. So by the time we got

00:09:13.420 --> 00:09:16.240
<v Charlie>to uv, I think we had ironed out a

00:09:16.240 --> 00:09:19.060
<v Charlie>lot of things, so we just

00:09:19.060 --> 00:09:22.320
<v Charlie>put a lot of stuff in place from the start. Like, we just had a better separation

00:09:22.320 --> 00:09:25.520
<v Charlie>in terms of what went into what crates. I think

00:09:25.520 --> 00:09:31.100
<v Charlie>we'd ironed out a lot of our workflows. For example, what's our Clippy configuration?

00:09:31.100 --> 00:09:36.500
<v Charlie>How do we pin the Rust toolchain version? How do we install

00:09:36.500 --> 00:09:39.780
<v Charlie>Rust in CI? There were just all these little things that we were able to

00:09:39.780 --> 00:09:43.180
<v Charlie>just copy over from Ruff, and I think that made things a lot easier.

00:09:44.050 --> 00:09:48.430
<v Charlie>At the same time, there were still a lot of design decisions in uv architecturally

00:09:48.430 --> 00:09:52.230
<v Charlie>that were pretty different because the design of a package manager and the design

00:09:52.230 --> 00:09:56.570
<v Charlie>of the linter... the things they need to do are just very different.

00:09:57.230 --> 00:10:01.430
<v Charlie>So, for example, uv has a whole networking stack, right?

00:10:01.550 --> 00:10:06.410
<v Charlie>It needs to make lots of HTTP requests and the linter doesn't need to do any of that.

00:10:06.550 --> 00:10:13.350
<v Charlie>So suddenly we had to think about requests or basically the whole HTTP stack.

00:10:13.350 --> 00:10:18.150
<v Charlie>We had to think about OpenSSL. We had to think about all these system dependencies

00:10:18.150 --> 00:10:21.930
<v Charlie>like Git. That was another one: we support Git dependencies, so suddenly you

00:10:21.930 --> 00:10:25.070
<v Charlie>have to think about how you depend on Git. I think there was just a lot of complexity

00:10:25.070 --> 00:10:27.950
<v Charlie>that came with building a package manager that we didn't necessarily

00:10:27.950 --> 00:10:29.890
<v Charlie>have to encounter when we were working on the linter,

00:10:30.790 --> 00:10:34.350
<v Charlie>which is good because I kind of encountered those problems over time.

00:10:34.350 --> 00:10:41.370
<v Matthias>If I understand you correctly, to summarize, you didn't have a rough start with uv because

00:10:41.370 --> 00:10:42.750
<v Matthias>you could copy over

00:10:42.750 --> 00:10:48.870
<v Matthias>a lot of the, let's say, templates or a lot of the best practices from the previous project.

00:10:48.870 --> 00:10:52.430
<v Charlie>At least what we thought were best practices, yeah. And I guess to some degree

00:10:52.430 --> 00:10:55.390
<v Charlie>we still think are best practices.

00:10:55.390 --> 00:11:03.010
<v Matthias>But when you started with uv, did you already compartmentalize some of

00:11:03.010 --> 00:11:08.790
<v Matthias>the functionality into smaller crates somewhat subconsciously? Did you say, oh

00:11:08.790 --> 00:11:15.070
<v Matthias>yeah, this definitely goes into a separate crate now, or did you still start with a single large crate?

00:11:15.070 --> 00:11:19.270
<v Charlie>No, I think we were compartmentalizing stuff basically from the start. And

00:11:19.270 --> 00:11:24.530
<v Charlie>for example, you know, we had one crate for parsing Python versions.

00:11:24.530 --> 00:11:27.210
<v Charlie>There's a whole spec around versioning that requires...

00:11:27.750 --> 00:11:32.730
<v Charlie>there's a lot of intricacies to it, so that's its own crate. We had one crate for

00:11:33.330 --> 00:11:36.190
<v Charlie>parsing requirements.txt files. That's another thing that's common in

00:11:36.190 --> 00:11:39.590
<v Charlie>Python. There's actually no spec for that; that's implementation-defined.

00:11:39.590 --> 00:11:42.270
<v Charlie>And so we have one crate for that. You know, we have a crate for

00:11:42.270 --> 00:11:45.330
<v Charlie>creating virtual environments. We had a crate for the CLI. We

00:11:45.330 --> 00:11:48.930
<v Charlie>had a crate for the resolver. So yeah, we broke things down, I think, pretty early

00:11:48.930 --> 00:11:55.010
<v Charlie>on, and that's generally served us very well. So I'm a big fan of

00:11:55.010 --> 00:11:58.710
<v Charlie>that structure. I think the trade-off is really different when

00:11:58.710 --> 00:12:03.330
<v Charlie>you're publishing a library. Like, we don't

00:12:03.870 --> 00:12:06.890
<v Charlie>publish anything in Ruff or uv as

00:12:06.890 --> 00:12:09.750
<v Charlie>a library. We don't publish to crates.io any of

00:12:09.750 --> 00:12:13.290
<v Charlie>the stuff that we build right now. The public API for our stuff tends to

00:12:13.290 --> 00:12:17.930
<v Charlie>be the command line, the CLI, the command-line interface. So if we had

00:12:17.930 --> 00:12:21.710
<v Charlie>to publish things, I think we would have to think a lot more carefully. Well, there's

00:12:21.710 --> 00:12:25.010
<v Charlie>all sorts of considerations that come with publishing that you don't have to

00:12:25.010 --> 00:12:27.650
<v Charlie>think about if you aren't publishing.

00:12:27.990 --> 00:12:32.350
<v Charlie>But one is, if we had a really granular crate structure, I think it becomes

00:12:32.350 --> 00:12:35.770
<v Charlie>a bit harder for people to use your... There's a lot more of a maintenance burden,

00:12:35.810 --> 00:12:38.510
<v Charlie>and it's also harder for people to use and compose your things.

00:12:39.910 --> 00:12:41.330
<v Matthias>But you don't go and...

00:12:42.890 --> 00:12:48.330
<v Matthias>introduce a crate for types specifically? Because that's one thing that I see

00:12:49.550 --> 00:12:52.410
<v Matthias>some people do, some companies do: they have

00:12:52.410 --> 00:12:55.630
<v Matthias>a types crate where they put everything that is related to,

00:12:55.630 --> 00:12:58.510
<v Matthias>you know, their basic types into one thing.

00:12:58.510 --> 00:13:02.150
<v Matthias>And then at the end they don't

00:13:02.150 --> 00:13:07.070
<v Matthias>really get a lot of benefits from using a workspace, because you need that types

00:13:07.070 --> 00:13:13.390
<v Matthias>crate anyway in all of your other crates. And so it kind of goes against what

00:13:13.390 --> 00:13:18.430
<v Matthias>a workspace is about, in my opinion. But I want to hear it from you: what

00:13:18.430 --> 00:13:21.490
<v Matthias>are some anti-patterns for building workspaces?

00:13:21.490 --> 00:13:24.350
<v Charlie>Yeah, so, I guess,

00:13:24.350 --> 00:13:28.210
<v Charlie>yeah, we try not to do things like

00:13:28.210 --> 00:13:30.970
<v Charlie>that. I don't know what we would call that pattern, but, like,

00:13:30.970 --> 00:13:34.210
<v Charlie>create a crate that's just kind of used everywhere. There's

00:13:34.210 --> 00:13:37.250
<v Charlie>maybe some of that, but we try to generally avoid it.

00:13:37.250 --> 00:13:40.970
<v Charlie>I think the other anti-patterns are, like, you can

00:13:40.970 --> 00:13:44.770
<v Charlie>make your crate structure too granular. Like, if

00:13:44.770 --> 00:13:47.830
<v Charlie>you find yourself creating a crate that's like

00:13:47.830 --> 00:13:51.550
<v Charlie>one function. And sometimes

00:13:51.550 --> 00:13:54.750
<v Charlie>you'll find yourself doing that because you have circular dependencies

00:13:54.750 --> 00:13:57.510
<v Charlie>or something in the dependency chain that's requiring you to do this.

00:13:57.510 --> 00:14:01.290
<v Charlie>Like, I have two crates that need this functionality and

00:14:01.290 --> 00:14:04.250
<v Charlie>one can't depend on the other or something, and then you find yourself putting

00:14:04.250 --> 00:14:08.030
<v Charlie>a function into, like, it's just a crate that's a function. At that point

00:14:08.030 --> 00:14:11.610
<v Charlie>you've probably gone too granular, and you kind of need to rethink the organization,

00:14:11.610 --> 00:14:14.810
<v Charlie>because there is some overhead to having all these crates, and,

00:14:15.330 --> 00:14:18.770
<v Charlie>you know, it's not always super fun to maintain. I think

00:14:18.770 --> 00:14:21.670
<v Charlie>a couple things that we do, though, that are kind of nice for this:

00:14:21.670 --> 00:14:24.870
<v Charlie>we prefix all the crates

00:14:24.870 --> 00:14:31.310
<v Charlie>in the workspace with uv or with ruff, so it's easy to differentiate what's

00:14:31.310 --> 00:14:35.110
<v Charlie>a crate in the workspace versus what's a third-party crate. Like, all of our

00:14:35.110 --> 00:14:40.650
<v Charlie>crates are like uv-virtualenv or uv-resolver or whatever else. That's

00:14:40.650 --> 00:14:42.910
<v Charlie>pretty nice. And then the other thing is,

00:14:44.080 --> 00:14:51.760
<v Charlie>we declare them all in the workspace root so that every crate that depends on

00:14:51.760 --> 00:14:54.640
<v Charlie>other crates in the workspace can just use workspace = true.

00:14:54.880 --> 00:14:59.460
<v Charlie>This is a little bit nuanced and maybe hard to visualize, but it basically means

00:14:59.460 --> 00:15:03.380
<v Charlie>that in the Cargo.toml for all the crates that are in the workspace,

00:15:03.400 --> 00:15:05.780
<v Charlie>it's very obvious what else is in the workspace.

00:15:07.040 --> 00:15:10.480
<v Charlie>Another thing that we do is we actually try

00:15:10.480 --> 00:15:13.980
<v Charlie>to use workspace = true for basically

00:15:13.980 --> 00:15:17.900
<v Charlie>all dependencies in the workspace. So we

00:15:17.900 --> 00:15:21.240
<v Charlie>put everything that we depend on in the root Cargo.toml,

00:15:21.240 --> 00:15:24.080
<v Charlie>like the workspace root, and then we use workspace = true

00:15:24.080 --> 00:15:27.020
<v Charlie>everywhere. And that tends to simplify

00:15:27.020 --> 00:15:30.440
<v Charlie>things a lot. It basically means we have one dependency specifier for

00:15:30.440 --> 00:15:34.160
<v Charlie>reqwest, for example, and a

00:15:34.160 --> 00:15:37.220
<v Charlie>common set of, like, default extras or no default extras, or,

00:15:37.220 --> 00:15:40.220
<v Charlie>sorry, not extras, features. Extras is

00:15:40.220 --> 00:15:43.300
<v Charlie>the Python analog to features. But, like, no default features

00:15:43.300 --> 00:15:46.260
<v Charlie>or, you know, whatever else. We have one dependency declaration for

00:15:46.260 --> 00:15:49.800
<v Charlie>everything that we need, and then all the workspace members, their Cargo.toml files

00:15:49.800 --> 00:15:52.820
<v Charlie>are just very straightforward. It's the things they require with

00:15:52.820 --> 00:15:56.880
<v Charlie>workspace = true. So we don't have to think about, oh, do we have reqwest dependency

00:15:56.880 --> 00:16:00.460
<v Charlie>specifiers across 10 or 15 different crates? We just have one definition

00:16:00.460 --> 00:16:04.480
<v Charlie>for it. So that's also, I think, been a really handy thing.
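
NOTE
Editor's illustration, not part of the spoken interview: a minimal sketch of the workspace = true pattern described above.
It assumes a layout with member crates under crates/. The crate names, paths, and version numbers are placeholders for illustration and are not taken from the actual Ruff or uv repositories.
```toml
# Root Cargo.toml (the workspace root); all values here are illustrative assumptions.
[workspace]
members = ["crates/*"]
[workspace.dependencies]
# One declaration per third-party dependency, including its feature selection.
reqwest = { version = "0.12", default-features = false }
# Workspace members are declared here as well, so other members can inherit them by name.
uv-resolver = { path = "crates/uv-resolver" }
# ---- separate file: crates/uv-cli/Cargo.toml (a hypothetical member crate) ----
[dependencies]
reqwest = { workspace = true }
uv-resolver = { workspace = true }
```
With this layout, a version or feature change to reqwest is made once in the root manifest and every member picks it up.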

00:16:04.480 --> 00:16:09.980
<v Matthias>Right. Workspaces seem to be one of these things that I keep missing in other ecosystems,

00:16:09.980 --> 00:16:16.160
<v Matthias>and I can't remember if there's even another language that has such a feature, off the top of my head.

00:16:16.160 --> 00:16:19.000
<v Charlie>Yeah, think about one... I mean, we've thought about

00:16:19.000 --> 00:16:21.860
<v Charlie>it a fair amount in Python. So uv

00:16:21.860 --> 00:16:24.760
<v Charlie>has workspace functionality that's very

00:16:24.760 --> 00:16:27.940
<v Charlie>much modeled after Cargo. So

00:16:27.940 --> 00:16:30.920
<v Charlie>you have a root, it defines the members.

00:16:30.920 --> 00:16:34.500
<v Charlie>We use very similar, actually, members and excludes syntax.

00:16:34.500 --> 00:16:37.620
<v Charlie>And there's nice things like that. You know,

00:16:37.620 --> 00:16:40.360
<v Charlie>you can do, like, uv run dash p just like you can do

00:16:40.360 --> 00:16:43.120
<v Charlie>cargo run, right? So we've just

00:16:43.120 --> 00:16:46.100
<v Charlie>copied a lot of things from Cargo because it's great

00:16:46.100 --> 00:16:48.980
<v Charlie>and we're like, we want to have those things in Python.

00:16:48.980 --> 00:16:51.780
<v Charlie>And for workspaces, I think that's

00:16:51.780 --> 00:16:55.140
<v Charlie>worked really well. It's been very interesting to introduce them because for

00:16:55.140 --> 00:16:57.980
<v Charlie>a lot of people it's a fairly new concept when they come to Python, right?

00:16:57.980 --> 00:17:00.680
<v Charlie>Like, most people who use uv have never used Rust, right? They've never

00:17:00.680 --> 00:17:03.500
<v Charlie>heard of a Cargo workspace before, and why should they,

00:17:03.500 --> 00:17:06.920
<v Charlie>you know, if they're just working in Python? So it's been interesting to try and

00:17:06.920 --> 00:17:12.420
<v Charlie>communicate that and help people understand what it's for and

00:17:12.420 --> 00:17:16.140
<v Charlie>why you might use it and what an example workspace might look

00:17:16.140 --> 00:17:19.880
<v Charlie>like in practice. There are some things that we miss that are a little

00:17:19.880 --> 00:17:22.160
<v Charlie>bit hard to get without standards. Like, for example,

00:17:23.190 --> 00:17:26.750
<v Charlie>We don't support... I just spent all this time talking about this incredibly boring

00:17:26.750 --> 00:17:29.690
<v Charlie>workspace = true thing that all your listeners are probably like,

00:17:29.990 --> 00:17:31.310
<v Charlie>why is he spending so much time talking about that?

00:17:31.550 --> 00:17:35.610
<v Charlie>But, like, we can't really do that in uv because there isn't really a way

00:17:35.610 --> 00:17:37.550
<v Charlie>to express that in the standards.

00:17:38.050 --> 00:17:41.130
<v Charlie>And so, you know, there are some things that we have to kind of either can't

00:17:41.130 --> 00:17:43.110
<v Charlie>support or have to get creative about how we support.

00:17:43.350 --> 00:17:46.370
<v Charlie>But I think the workspace concept is excellent. And like, I'm really glad that

00:17:46.370 --> 00:17:48.270
<v Charlie>we made it such a first-class thing in uv.

00:17:49.190 --> 00:17:50.750
<v Matthias>Now, I'm pretty sure that we could

00:17:50.750 --> 00:17:54.710
<v Matthias>talk about workspaces for hours because there's a lot of nuance to it.

00:17:54.870 --> 00:17:57.510
<v Matthias>And I think a lot of people that haven't tried it themselves,

00:17:57.510 --> 00:18:00.170
<v Matthias>they just don't know what the fuss is all about.

00:18:00.290 --> 00:18:03.590
<v Matthias>But I do believe that there's more to it than just the term.

00:18:05.050 --> 00:18:09.070
<v Matthias>Now, one other thing that you mentioned, though, which is very close to my heart

00:18:09.070 --> 00:18:15.530
<v Matthias>is parsing stuff, especially the requirements.txt, which is unspecified, I heard.

00:18:16.470 --> 00:18:18.270
<v Matthias>Now, isn't it super simple?

00:18:18.270 --> 00:18:20.390
<v Charlie>Yeah.

00:18:20.390 --> 00:18:23.930
<v Matthias>Isn't it super simple? Like, someone might hear this and say, oh yeah, I just open my

00:18:23.930 --> 00:18:28.170
<v Matthias>editor, I string-split every line on the equals sign, and that's my requirements.txt

00:18:28.170 --> 00:18:32.170
<v Matthias>parser. Why is that not the case?

00:18:32.170 --> 00:18:35.550
<v Charlie>Yeah, so for requirements.txt, it's like,

00:18:35.550 --> 00:18:38.530
<v Charlie>this is basically a file format that

00:18:38.530 --> 00:18:42.130
<v Charlie>exists for pip, really. pip is

00:18:42.130 --> 00:18:45.370
<v Charlie>like the, I guess, sort of, how should

00:18:45.370 --> 00:18:48.070
<v Charlie>I describe it, like the reference implementation for a lot of things.

00:18:48.070 --> 00:18:50.770
<v Charlie>It's really been the Python package installer for a long

00:18:50.770 --> 00:18:53.850
<v Charlie>time. And requirements.txt

00:18:53.850 --> 00:18:56.770
<v Charlie>is a file format that exists for pip. And, you know,

00:18:56.770 --> 00:18:59.530
<v Charlie>kind of the way that you can think about it, which

00:18:59.530 --> 00:19:03.130
<v Charlie>maybe people don't think about it this way, is, like, it's

00:19:03.130 --> 00:19:06.370
<v Charlie>basically like each line is a command-line argument, because you

00:19:06.370 --> 00:19:10.410
<v Charlie>can not only put requirements in there, you can also put command-line arguments

00:19:10.410 --> 00:19:15.210
<v Charlie>and settings, which is interesting. So, like, in pip you can pass an

00:19:15.210 --> 00:19:18.910
<v Charlie>index URL on the command line, which is like, what registry should I use for fetching

00:19:18.910 --> 00:19:23.410
<v Charlie>packages? You can actually put that in requirements.txt too. You can just do --index-url.

00:19:24.490 --> 00:19:29.530
<v Charlie>You can also nest them. So, like, within a requirements.txt

00:19:29.530 --> 00:19:35.070
<v Charlie>you can do -r and point to a different requirements.txt, and then it's

00:19:35.070 --> 00:19:39.830
<v Charlie>sort of like it gets inlined, roughly. So there's a lot of complexity to the stuff

00:19:39.830 --> 00:19:42.950
<v Charlie>that's in requirements.txt that people don't really think about.

00:19:43.570 --> 00:19:51.230
<v Charlie>And there are also lots of very subtle behaviors where, especially for us, we often have to decide,

00:19:52.270 --> 00:19:55.810
<v Charlie>well, what do we want to do? Do we want to be like bug for bug compatible with

00:19:55.810 --> 00:19:58.190
<v Charlie>pip or do we want to do things like slightly different?

00:19:58.430 --> 00:20:02.330
<v Charlie>And a lot of these are edge cases, but there's just a lot of nuance to it.

00:20:02.470 --> 00:20:10.570
<v Charlie>Like in a requirements.txt file, you can have, I guess what I would call like a named dependency.

00:20:10.850 --> 00:20:15.110
<v Charlie>Like you could say like flask and then the version that you want,

00:20:15.710 --> 00:20:19.930
<v Charlie>or you could do flask at, and then the URL that you should fetch it from or

00:20:19.930 --> 00:20:22.010
<v Charlie>like a Git repository or something like that.

00:20:22.270 --> 00:20:26.010
<v Charlie>But you can also just do the URL or just do the Git repository.

00:20:26.730 --> 00:20:31.910
<v Charlie>And it turns out that in pip, there are slightly different behaviors around how

00:20:31.910 --> 00:20:37.110
<v Charlie>those things are parsed, like the URL, if it's just the URL versus if it's after a named dependency.

00:20:38.550 --> 00:20:43.810
<v Charlie>And how white space is handled, like error recovery, all this stuff is just a little bit different.
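
NOTE
Editorial sketch, not from the episode: a rough Python illustration of why splitting each line on "==" is not enough for requirements.txt. The function name and the category labels are hypothetical, and this is neither pip's nor uv's parser. It only sorts a line into the forms Charlie lists: options, nested files, named requirements, the "name @ URL" form, and bare URLs.
```python
def classify_requirement_line(line: str) -> str:
    """Return a rough category for one requirements.txt line (simplified)."""
    text = line.strip()
    if not text or text.startswith("#"):
        return "skip"  # blank line or comment
    if text.startswith(("-r ", "--requirement ")):
        return "include"  # nested requirements file, e.g. "-r other.txt"
    if text.startswith("-"):
        return "option"  # command-line style setting, e.g. "--index-url ..."
    scheme = text.split("://", 1)[0]
    if "://" in text and scheme.replace("+", "").isalnum():
        return "bare_url"  # only a URL or VCS reference, no package name
    if "@" in text:
        return "named_url"  # "name @ URL" form
    return "named"  # name plus an optional version specifier
# Each of these is a plausible line; only the first survives a naive split on "==".
for example in [
    "flask==3.0.0",
    "flask @ https://example.com/flask.tar.gz",
    "git+https://github.com/pallets/flask",
    "--index-url https://example.com/simple",
    "-r other.txt",
]:
    print(f"{classify_requirement_line(example):9} {example}")
```
A real parser also has to deal with line continuations, environment markers, and the whitespace and error-recovery differences mentioned above, all of which this sketch ignores.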

00:20:44.110 --> 00:20:48.310
<v Charlie>And so over time, we've had to add, basically, the only way to know if we're

00:20:48.310 --> 00:20:52.230
<v Charlie>doing the right thing is we just see what pip does, and then we try to mimic that to some degree.

00:20:52.470 --> 00:20:55.750
<v Charlie>Or if we think we can do things that are unambiguously clearer, then we'll do that.

00:20:56.710 --> 00:21:01.290
<v Charlie>But yeah, that one's especially hard. It's a little bit easier for other things

00:21:01.290 --> 00:21:06.570
<v Charlie>like Python version specifiers where there's a clear standard on how these things should be parsed.

00:21:07.470 --> 00:21:12.230
<v Charlie>And if we see something that's not clear, we can actually ask about it through

00:21:12.230 --> 00:21:15.530
<v Charlie>the standards process and be like, how should this be handled?

00:21:15.730 --> 00:21:16.590
<v Charlie>Like blah, blah, blah, blah, blah.

00:21:16.830 --> 00:21:19.990
<v Matthias>In essence, you make the standards stronger for everyone,

00:21:19.990 --> 00:21:23.170
<v Matthias>which is great. What kind of piqued

00:21:23.170 --> 00:21:26.070
<v Matthias>my interest there was: did you have to

00:21:26.070 --> 00:21:30.510
<v Matthias>go and read the pip source code to be able to understand what's going on? Because

00:21:30.510 --> 00:21:36.870
<v Matthias>this is a standard that sort of developed over 30 years. I think Python is 30

00:21:36.870 --> 00:21:42.330
<v Matthias>years old by now. It's from 1994, as far as I remember. So did you really have

00:21:42.330 --> 00:21:44.030
<v Matthias>to read the source code?

00:21:44.030 --> 00:21:45.850
<v Charlie>Yes. Sometimes we definitely have to go read source code,

00:21:46.740 --> 00:21:49.460
<v Charlie>which is fine. I don't

00:21:49.460 --> 00:21:53.080
<v Charlie>really mind that, as long as we can figure out why a certain behavior exists

00:21:53.080 --> 00:21:57.240
<v Charlie>in a certain way and what it's motivated by. But yes, we often have to go

00:21:57.240 --> 00:22:01.560
<v Charlie>read the source code to understand how this tool handles this case,

00:22:01.560 --> 00:22:05.940
<v Charlie>and also why, right? A lot of the time you're trying to understand

00:22:05.940 --> 00:22:09.480
<v Charlie>why people made certain decisions, not just how it works.

00:22:09.480 --> 00:22:15.260
<v Matthias>That means you don't only have to read the source code, you also have to read the history.

00:22:15.260 --> 00:22:19.400
<v Charlie>Yes, the version history, yes.

00:22:19.400 --> 00:22:24.260
<v Matthias>But do they have proper tests, at least, where you can go through and maybe implement

00:22:24.260 --> 00:22:26.580
<v Matthias>these tests in your implementation?

00:22:26.580 --> 00:22:29.700
<v Charlie>Yeah, they do have tests. It's not always super

00:22:29.700 --> 00:22:33.180
<v Charlie>straightforward to take their tests and move them

00:22:33.180 --> 00:22:36.500
<v Charlie>over to uv. A lot of the time, I don't

00:22:36.500 --> 00:22:39.320
<v Charlie>know, the way that you write tests in Python and Rust often tends to

00:22:39.320 --> 00:22:42.460
<v Charlie>be quite different, or at least the way that we write tests

00:22:42.460 --> 00:22:45.400
<v Charlie>tends to be quite different. You know, in Python,

00:22:45.400 --> 00:22:48.340
<v Charlie>mocking is very popular, for example, and we

00:22:48.340 --> 00:22:51.480
<v Charlie>have almost no mocking in uv.

00:22:51.480 --> 00:22:54.360
<v Charlie>Almost all of our tests, I think,

00:22:54.360 --> 00:22:57.480
<v Charlie>are what you would probably consider integration tests.

00:22:57.480 --> 00:23:00.900
<v Charlie>The whole way that we test uv is we

00:23:00.900 --> 00:23:03.960
<v Charlie>actually run real scenarios against

00:23:03.960 --> 00:23:10.200
<v Charlie>a real package registry, and it's almost all based on snapshot testing. So

00:23:10.200 --> 00:23:14.940
<v Charlie>we snapshot the CLI output a lot of the time. That's how we detect if things

00:23:14.940 --> 00:23:18.420
<v Charlie>are working correctly or not: we snapshot the CLI output, or we snapshot the

00:23:18.420 --> 00:23:22.080
<v Charlie>lock file, or we snapshot things like that. But the vast majority of the

00:23:22.080 --> 00:23:23.740
<v Charlie>way that uv gets tested is

00:23:24.560 --> 00:23:26.180
<v Charlie>integration tests of running real

00:23:26.180 --> 00:23:32.160
<v Charlie>commands on the CLI and snapshotting the output. We have

00:23:32.970 --> 00:23:36.390
<v Charlie>a lot of tests and they all run through that.
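
NOTE
Editorial sketch, not uv's actual test harness: a minimal Python illustration of snapshot testing a command's output. The helper name check_snapshot and the ".snap" file are assumptions made for this example. It records stdout on the first run and compares against the stored text on later runs.
```python
import subprocess
import sys
import tempfile
from pathlib import Path
def check_snapshot(command: list[str], snapshot: Path) -> bool:
    """Run a command and compare its stdout with a stored snapshot file."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    if not snapshot.exists():
        snapshot.write_text(result.stdout)  # first run: record the snapshot
        return True
    return snapshot.read_text() == result.stdout  # later runs: must match
# Demo with a stand-in command; a real suite would invoke the tool under test.
with tempfile.TemporaryDirectory() as tmp:
    snap = Path(tmp) / "install.snap"
    stable = [sys.executable, "-c", "print('Resolved 3 packages')"]
    changed = [sys.executable, "-c", "print('Resolved 4 packages')"]
    print(check_snapshot(stable, snap))   # records the snapshot
    print(check_snapshot(stable, snap))   # output unchanged
    print(check_snapshot(changed, snap))  # output differs from the snapshot
```
The appeal of this style is that one assertion covers the whole user-visible output, at the cost of having to review and re-record snapshots whenever the output changes on purpose.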

00:23:36.650 --> 00:23:39.350
<v Charlie>So it's a little different sometimes because if I go and look,

00:23:39.410 --> 00:23:43.850
<v Charlie>we do have, obviously, more traditional unit tests for the requirements.txt parser.

00:23:44.130 --> 00:23:47.610
<v Charlie>That's something that's very testable, right? And we do have tests for that.

00:23:47.730 --> 00:23:53.690
<v Charlie>But often, if it's more complex scenarios around how pip behaves versus how we

00:23:53.690 --> 00:23:57.510
<v Charlie>behave, it's a little bit harder to shoehorn it into what we have.

00:23:58.290 --> 00:24:00.350
<v Matthias>Do you always run the full test harness?

00:24:01.770 --> 00:24:02.450
<v Charlie>In CI?

00:24:02.970 --> 00:24:08.370
<v Matthias>Yeah, or maybe locally as well, but I wonder that's probably too much.

00:24:08.950 --> 00:24:13.730
<v Charlie>Yeah, we do always run, we almost always run the full test harness.

00:24:14.070 --> 00:24:20.070
<v Charlie>So like we just, we skip it for changes that are purely documentation.

00:24:20.750 --> 00:24:24.890
<v Charlie>Which we just detect with some file filters in GitHub Actions,

00:24:25.050 --> 00:24:26.470
<v Charlie>but otherwise we run the full test suite.

00:24:26.770 --> 00:24:30.250
<v Charlie>Every change we run it on Linux, macOS, and Windows.

00:24:30.830 --> 00:24:34.690
<v Charlie>And that's because, like,

00:24:34.690 --> 00:24:37.590
<v Charlie>that's not as critical for Ruff, but

00:24:37.590 --> 00:24:41.050
<v Charlie>it's very critical for uv, because with a

00:24:41.050 --> 00:24:44.110
<v Charlie>package manager, more things tend to differ across

00:24:44.110 --> 00:24:47.410
<v Charlie>platforms and more behaviors differ across platforms. And

00:24:47.410 --> 00:24:50.470
<v Charlie>so we always test everything on Linux, macOS, and Windows.

00:24:50.470 --> 00:24:55.130
<v Charlie>I guess one interesting piece is that we

00:24:55.130 --> 00:24:58.450
<v Charlie>build uv for a lot of different targets. So, you

00:24:58.450 --> 00:25:01.510
<v Charlie>know, when we publish to PyPI or publish anywhere,

00:25:01.510 --> 00:25:04.430
<v Charlie>or publish to GitHub Releases, we're

00:25:04.430 --> 00:25:07.250
<v Charlie>building, like, 15, probably somewhere between

00:25:07.250 --> 00:25:12.990
<v Charlie>15 and 20 different builds, or maybe, is it 15? Somewhere between 10 and 20, let's

00:25:12.990 --> 00:25:19.550
<v Charlie>just say, different build variants. And so that's Linux x86, you know, dynamically

00:25:19.550 --> 00:25:25.990
<v Charlie>linked against glibc; Linux x86, sorry, Linux x86 musl; Linux aarch64 glibc; Linux aarch64 musl.

00:25:26.110 --> 00:25:29.910
<v Charlie>We build for some more obscure platforms that are supported in Python,

00:25:30.210 --> 00:25:33.790
<v Charlie>like PowerPC and S390X and stuff like that.

00:25:34.070 --> 00:25:41.890
<v Charlie>We build for ARM and x86 Windows, ARM and x86 macOS.

00:25:42.190 --> 00:25:45.710
<v Charlie>So we just have a lot of different builds. And...

00:25:46.150 --> 00:25:48.930
<v Charlie>I mean, first of all, setting those up to actually

00:25:48.930 --> 00:25:51.690
<v Charlie>build is fairly complex, and so if you ever find yourself needing

00:25:51.690 --> 00:25:54.790
<v Charlie>to do that, you should look at our CI, because we've,

00:25:54.790 --> 00:25:57.870
<v Charlie>you know, we've just figured out how to build a Rust project across all

00:25:57.870 --> 00:26:00.730
<v Charlie>those different, like, a lot of those are cross-compiled, right? And

00:26:00.730 --> 00:26:03.810
<v Charlie>we've just figured out how to build those across lots of different machines. What

00:26:03.810 --> 00:26:07.450
<v Charlie>is our CI? Our CI? Your CI?

00:26:07.450 --> 00:26:10.330
<v Charlie>Oh yeah, yeah, just go look at our .github folder and

00:26:10.330 --> 00:26:13.290
<v Charlie>take stuff from it. But the

00:26:13.290 --> 00:26:17.070
<v Charlie>other piece is, we actually

00:26:17.070 --> 00:26:20.490
<v Charlie>rerun all of those builds whenever certain files

00:26:20.490 --> 00:26:23.690
<v Charlie>change. So if we add a new dependency, we rebuild across

00:26:23.690 --> 00:26:26.890
<v Charlie>all those machines, because

00:26:26.890 --> 00:26:30.210
<v Charlie>otherwise we've run into situations where we add

00:26:30.210 --> 00:26:33.710
<v Charlie>a new dependency and then we go to release, and then, I

00:26:33.710 --> 00:26:38.050
<v Charlie>don't know, the ARM musl build fails for some reason that we don't, or the Windows

00:26:38.050 --> 00:26:41.750
<v Charlie>ARM build fails for some reason that we don't fully understand. And

00:26:41.750 --> 00:26:46.870
<v Charlie>so now we run those. I guess the only nuance is that we do rebuild everything

00:26:46.870 --> 00:26:51.070
<v Charlie>on every platform if we, for example, change our dependencies.

00:26:51.070 --> 00:26:58.510
<v Matthias>Yeah, I think this is kind of where the Rust build system also has its limits at

00:26:58.510 --> 00:27:03.830
<v Matthias>this point in time, because especially when you talk about feature flags

00:27:03.830 --> 00:27:08.010
<v Matthias>and maybe optional dependencies, dev dependencies, there are a lot of

00:27:09.010 --> 00:27:12.390
<v Matthias>loose ends, and sometimes you

00:27:12.390 --> 00:27:18.310
<v Matthias>can't really specifically and exactly say which dependencies you want to enable

00:27:18.310 --> 00:27:23.390
<v Matthias>for which platform. And, you know, on top of it, you have various system dependencies

00:27:23.390 --> 00:27:28.630
<v Matthias>and the build.rs files to go with it, and sometimes you don't want to have a system

00:27:28.630 --> 00:27:31.150
<v Matthias>dependency on a certain target and so on.

00:27:31.430 --> 00:27:34.210
<v Matthias>And it's very hard to work around these limitations.

00:27:34.750 --> 00:27:38.470
<v Charlie>Yeah. I mean, a lot of the complexity has come from things like accelerators,

00:27:38.490 --> 00:27:39.930
<v Charlie>sorry, not accelerators, allocators.

00:27:40.330 --> 00:27:45.070
<v Charlie>Like, you know, we use, I don't even know how to pronounce these things, jemalloc, right?

00:27:45.290 --> 00:27:50.470
<v Charlie>On most platforms, but it just doesn't work on,

00:27:50.570 --> 00:27:52.250
<v Charlie>like, it doesn't compile on some platforms.

00:27:52.510 --> 00:27:55.250
<v Charlie>So we have to have a bunch of build configuration around that.

00:27:55.470 --> 00:27:58.130
<v Charlie>The other one that caused a lot of trouble

00:27:58.630 --> 00:28:01.610
<v Charlie>was our zlib implementation. We

00:28:01.610 --> 00:28:04.790
<v Charlie>wanted to use, which is for decompression, we

00:28:04.790 --> 00:28:08.030
<v Charlie>wanted to use... so,

00:28:08.030 --> 00:28:11.430
<v Charlie>by default you get this pure-Rust miniz_oxide implementation

00:28:11.430 --> 00:28:14.450
<v Charlie>if you just use, like, flate2

00:28:14.450 --> 00:28:17.270
<v Charlie>and reqwest. If you just use the reqwest crate to

00:28:17.270 --> 00:28:20.170
<v Charlie>decompress stuff, you get this pure-Rust implementation by default, I

00:28:20.170 --> 00:28:23.350
<v Charlie>think. But there's this zlib-ng

00:28:23.350 --> 00:28:26.710
<v Charlie>version that you can use, which is a lot

00:28:26.710 --> 00:28:29.570
<v Charlie>faster, but it adds

00:28:29.570 --> 00:28:32.510
<v Charlie>a CMake dependency and it needs to be built. It's

00:28:32.510 --> 00:28:35.610
<v Charlie>very hard to build, and so we could never get that to build on certain platforms. And

00:28:35.610 --> 00:28:38.510
<v Charlie>so we had to have configuration around where

00:28:38.510 --> 00:28:41.570
<v Charlie>we enable it and where we don't. We actually totally tore

00:28:41.570 --> 00:28:45.070
<v Charlie>that out recently and moved over to zlib-rs,

00:28:45.070 --> 00:28:47.790
<v Charlie>the pure-Rust implementation of a lot

00:28:47.790 --> 00:28:50.510
<v Charlie>of these zlib optimizations, which in our benchmarking at least

00:28:50.510 --> 00:28:53.590
<v Charlie>is actually both faster and way easier to build. So

00:28:53.590 --> 00:28:56.770
<v Charlie>when you get something that's pure Rust,

00:28:56.770 --> 00:28:59.490
<v Charlie>it simplified things dramatically. We just tore out this

00:28:59.490 --> 00:29:02.210
<v Charlie>CMake dependency and all this stuff

00:29:02.210 --> 00:29:05.150
<v Charlie>and got rid of all this configuration, because now we can just use

00:29:05.150 --> 00:29:08.770
<v Charlie>the faster, easier-to-build, more portable Rust

00:29:08.770 --> 00:29:11.550
<v Charlie>version on all platforms. So a lot

00:29:11.550 --> 00:29:15.750
<v Charlie>of it comes from, you know, if you're trying to do things that are more customized

00:29:15.750 --> 00:29:20.810
<v Charlie>or bleeding edge, like if you use really fast system dependencies

00:29:20.810 --> 00:29:24.430
<v Charlie>or allocators, you end up running into these kinds of configuration

00:29:24.430 --> 00:29:28.890
<v Charlie>problems. But we've tried to make that simpler over time.

00:29:28.890 --> 00:29:31.550
<v Matthias>Must have been such a relief to pull out that CMake.

00:29:31.550 --> 00:29:34.410
<v Charlie>Oh, it's amazing. It's amazing. I actually tried to do it,

00:29:34.410 --> 00:29:37.790
<v Charlie>sorry, I tried to do it a few months ago, and

00:29:37.790 --> 00:29:40.650
<v Charlie>then we realized that they were using compile-time feature

00:29:40.650 --> 00:29:44.050
<v Charlie>detection for a lot of the optimizations, which isn't

00:29:44.050 --> 00:29:46.910
<v Charlie>great for us, because it meant that on x86 things

00:29:46.910 --> 00:29:49.630
<v Charlie>were actually a lot slower, because we have to build for

00:29:49.630 --> 00:29:52.890
<v Charlie>a common CPU target. And they

00:29:52.890 --> 00:29:55.710
<v Charlie>recently re-released with runtime detection, and

00:29:55.710 --> 00:29:59.230
<v Charlie>I was like, okay, cool, we're doing it again. So I redid the change

00:29:59.230 --> 00:30:03.270
<v Charlie>twice and redid all the benchmarking on my Windows machine and

00:30:03.270 --> 00:30:08.450
<v Charlie>on macOS for ARM and all that kind of stuff. So yeah,

00:30:08.450 --> 00:30:14.410
<v Charlie>when you can get rid of stuff like that, it's immensely helpful.

00:30:14.410 --> 00:30:17.330
<v Matthias>I realized, whenever I read one of your posts or whenever I see you talk,

00:30:17.330 --> 00:30:20.290
<v Matthias>I realize that you care a

00:30:20.290 --> 00:30:23.570
<v Matthias>lot about performance, and you benchmark

00:30:23.570 --> 00:30:31.330
<v Matthias>a lot and meticulously. Do you have any tips for people that want to

00:30:31.330 --> 00:30:37.430
<v Matthias>make their Rust code even faster? From doing it so often, there must have

00:30:37.430 --> 00:30:40.930
<v Matthias>been patterns that evolved over time. What is a

00:30:40.930 --> 00:30:41.570
<v Matthias>good benchmark?

00:30:41.570 --> 00:30:46.590
<v Matthias>What's a bad benchmark? What are some good libraries out there in general? What

00:30:46.590 --> 00:30:49.930
<v Matthias>are best practices? What don't you measure, for example? That sort of...

00:30:49.930 --> 00:30:52.910
<v Charlie>Yeah, there's so much to share. I mean, yeah,

00:30:53.680 --> 00:30:59.080
<v Charlie>It's like a whole, actually like learning the tooling too around benchmarking is like itself a skill.

00:30:59.280 --> 00:31:03.040
<v Charlie>And like, we have people like, there are certain things that I only know how

00:31:03.040 --> 00:31:06.620
<v Charlie>to do on macOS and I don't really know how to do on Linux or Windows.

00:31:06.880 --> 00:31:09.400
<v Charlie>And there are other people on my team who would know how

00:31:09.400 --> 00:31:11.920
<v Charlie>to do these kinds of... like, I know how to use Instruments, for example,

00:31:12.160 --> 00:31:14.780
<v Charlie>on macOS. And that's its own set of things.

00:31:15.600 --> 00:31:20.280
<v Charlie>You know, I think for us, there's like a bunch of different forms of benchmarking.

00:31:20.280 --> 00:31:23.860
<v Charlie>One is microbenchmarking, which is, I guess,

00:31:23.860 --> 00:31:27.200
<v Charlie>generally the kind of thing that you could do with Criterion or

00:31:27.200 --> 00:31:30.140
<v Charlie>a similar crate, where you're running typically fairly

00:31:30.140 --> 00:31:33.120
<v Charlie>well-isolated code on test inputs

00:31:33.120 --> 00:31:36.500
<v Charlie>and running thousands and thousands and thousands, or millions,

00:31:36.500 --> 00:31:40.880
<v Charlie>of iterations of it and trying to detect very small changes.

00:31:40.880 --> 00:31:44.140
<v Charlie>That is an incredibly useful thing if

00:31:44.140 --> 00:31:46.840
<v Charlie>you can isolate your code in that way. So if I'm trying

00:31:46.840 --> 00:31:49.620
<v Charlie>to figure out, for example, if there are two different ways

00:31:49.620 --> 00:31:52.640
<v Charlie>that I could write this function that parses this very simple string

00:31:52.640 --> 00:31:55.380
<v Charlie>in a simple way, and I want to know which one's

00:31:55.380 --> 00:31:58.520
<v Charlie>the faster way, then I'll do something like that. I'll try

00:31:58.520 --> 00:32:01.320
<v Charlie>and just isolate two methods and then use

00:32:01.320 --> 00:32:04.180
<v Charlie>Criterion or something similar. We also do

00:32:04.180 --> 00:32:08.860
<v Charlie>microbenchmarking on CI. We use a tool called CodSpeed, which does continuous

00:32:08.860 --> 00:32:13.300
<v Charlie>benchmarking, continuous profiling. That again is really good when you have things

00:32:13.300 --> 00:32:18.220
<v Charlie>that... and they run it on Valgrind. So basically, if you just try and

00:32:18.220 --> 00:32:20.680
<v Charlie>do this kind of benchmarking on like a GitHub Actions machine,

00:32:20.680 --> 00:32:22.540
<v Charlie>it tends to be extremely unreliable.

00:32:22.740 --> 00:32:26.280
<v Charlie>You have to have a very high error tolerance. Very early on

00:32:26.280 --> 00:32:29.960
<v Charlie>in Ruff, when we started adding continuous benchmarking, we did it on GitHub Actions

00:32:29.960 --> 00:32:34.460
<v Charlie>and we would just regularly see, like, five to 10% fluctuations,

00:32:35.100 --> 00:32:37.900
<v Charlie>even for no-op changes, because those machines are very noisy.

00:32:39.040 --> 00:32:44.060
<v Charlie>So CodSpeed kind of solves this problem because they run all your stuff under Valgrind.

00:32:44.200 --> 00:32:47.740
<v Charlie>So they get a much cleaner, basically a much cleaner snapshot of what's

00:32:47.740 --> 00:32:53.380
<v Charlie>actually changing in terms of performance. There's some nuance to it, but

00:32:53.380 --> 00:32:58.600
<v Charlie>basically, if you have code... I/O can make things really messy with benchmarking

00:32:58.600 --> 00:33:01.760
<v Charlie>in general, but if you can write benchmarks that don't have any I/O,

00:33:01.760 --> 00:33:03.480
<v Charlie>that are very pure, like pure CPU,

00:33:03.940 --> 00:33:07.100
<v Charlie>and you can fit them into that kind of microbenchmarking framework, it's incredibly

00:33:07.100 --> 00:33:10.320
<v Charlie>useful, especially in Ruff, where we have a lot of that. I mean, we just

00:33:10.320 --> 00:33:15.100
<v Charlie>run that in CI. It catches things all the time. Real positive changes show up,

00:33:15.100 --> 00:33:19.120
<v Charlie>you know, get flagged as positive changes. It's a fairly high

00:33:19.120 --> 00:33:23.520
<v Charlie>signal. So if you can do that, that's great.
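
NOTE
Editorial sketch, not from the episode: the "two ways to write one small parsing function" microbenchmark described above, using Python's standard timeit module as a stand-in for Criterion. The function names and the "name==version" input are hypothetical. Taking the minimum of several repeats is one common way to dampen machine noise; absolute numbers will differ from machine to machine.
```python
import timeit
# Candidate 1: split on the first "==".
def parse_with_split(spec: str) -> tuple[str, str]:
    name, version = spec.split("==", 1)
    return name, version
# Candidate 2: partition around the first "==".
def parse_with_partition(spec: str) -> tuple[str, str]:
    name, _, version = spec.partition("==")
    return name, version
# Time many iterations of one candidate and keep the fastest repeat.
def best_time(func, arg: str, repeat: int = 5, number: int = 20_000) -> float:
    return min(timeit.repeat(lambda: func(arg), repeat=repeat, number=number))
for candidate in (parse_with_split, parse_with_partition):
    print(candidate.__name__, best_time(candidate, "flask==3.0.0"))
```
Both candidates are pure CPU work with no I/O, which is the property that makes this kind of measurement repeatable.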

00:33:24.260 --> 00:33:27.840
<v Charlie>We can't always do that in uv. We often do more, I just sort of make these

00:33:27.840 --> 00:33:29.400
<v Charlie>words up, like macro benchmarking.

00:33:29.520 --> 00:33:34.520
<v Charlie>I don't know if that's a word, but I use hyperfine usually, and

00:33:34.520 --> 00:33:40.500
<v Charlie>I will compile two uv release builds, and I will run some operation over and over,

00:33:40.500 --> 00:33:42.880
<v Charlie>and I will try to see if I can detect a difference.

00:33:43.340 --> 00:33:47.740
<v Charlie>And I'll typically try and minimize IO as much as I can, but maybe it has to

00:33:47.740 --> 00:33:49.380
<v Charlie>read from disk for the on-disk cache.

00:33:49.500 --> 00:33:52.980
<v Charlie>And like there, it's like, you're trying to, you know, you're trying to find much smaller.

00:33:53.820 --> 00:33:56.640
<v Charlie>It either needs to be really obvious or very

00:33:56.640 --> 00:33:59.300
<v Charlie>consistent in order for you to catch anything. This is

00:33:59.300 --> 00:34:02.220
<v Charlie>one form of benchmarking: oh, it made everything 10%

00:34:02.220 --> 00:34:04.920
<v Charlie>faster. Then it's very obvious that it made

00:34:04.920 --> 00:34:08.720
<v Charlie>things faster, and you can catch it in that kind of benchmark. The other

00:34:08.720 --> 00:34:11.720
<v Charlie>pieces that you can look at, which are a

00:34:11.720 --> 00:34:14.440
<v Charlie>little bit harder to do direct comparisons with, but we will

00:34:14.440 --> 00:34:17.640
<v Charlie>do a fair amount of... there's a great tool called samply,

00:34:17.640 --> 00:34:20.900
<v Charlie>which we use. It's a sampling-based

00:34:20.900 --> 00:34:24.620
<v Charlie>profiler. So you can just run a uv command,

00:34:24.620 --> 00:34:27.460
<v Charlie>basically you just prefix it with samply, and then it opens up

00:34:27.460 --> 00:34:30.600
<v Charlie>a flame graph in your browser that has,

00:34:30.600 --> 00:34:33.440
<v Charlie>you know, all the flame graphs

00:34:33.440 --> 00:34:36.340
<v Charlie>of everything that got called and where you spent time. And so

00:34:36.340 --> 00:34:39.020
<v Charlie>that's often a way that... it's a

00:34:39.020 --> 00:34:41.800
<v Charlie>little bit harder. You can't always tell

00:34:41.800 --> 00:34:45.540
<v Charlie>just from a flame graph if you made your whole program faster, right?

00:34:45.540 --> 00:34:48.500
<v Charlie>You can tell if you ran the

00:34:48.500 --> 00:34:51.320
<v Charlie>flame graph, you found something that was taking up a lot of time, and then you make

00:34:51.320 --> 00:34:54.000
<v Charlie>changes and then it's gone. That's good,

00:34:54.000 --> 00:34:57.160
<v Charlie>that means you got rid of that time. But that alone

00:34:57.160 --> 00:35:00.220
<v Charlie>is not enough to tell that your program got faster, right? That's

00:35:00.220 --> 00:35:03.000
<v Charlie>just that you removed that thing that you were spending time

00:35:03.000 --> 00:35:06.080
<v Charlie>on, but maybe it went somewhere else. So we'll often

00:35:06.080 --> 00:35:10.100
<v Charlie>do that as a way to diagnose issues. If something's slow, then we'll be like,

00:35:10.100 --> 00:35:12.840
<v Charlie>let's run it under samply and see where we're spending time, for example.

00:35:12.840 --> 00:35:16.380
<v Charlie>And so there are certain tools you want to use to diagnose issues, and

00:35:16.380 --> 00:35:21.180
<v Charlie>then certain tools that we often use to try and confirm findings or understand

00:35:21.180 --> 00:35:24.240
<v Charlie>if there are regressions or anything like that.

00:35:24.240 --> 00:35:29.040
<v Matthias>Yeah, and in my mind, when you said sampling, profiling, and flame graphs, I was sort of hoping

00:35:29.040 --> 00:35:33.840
<v Matthias>that the tool would kind of update the chart as the program runs. I'm not sure

00:35:33.840 --> 00:35:35.540
<v Matthias>if that's the behavior of that program.

00:35:35.540 --> 00:35:38.720
<v Charlie>I don't think samply does that. I think there might be a way to compare two

00:35:38.720 --> 00:35:41.740
<v Charlie>flame graphs, and CodSpeed tries to do this. They'll

00:35:41.740 --> 00:35:47.700
<v Charlie>try to diff the profiles. It works okay. Sometimes it's hard for them to... I mean,

00:35:47.700 --> 00:35:50.840
<v Charlie>inherently that seems like a very hard problem. They have to try

00:35:50.840 --> 00:35:55.160
<v Charlie>and align which two function calls are the same. It's not necessarily

00:35:55.160 --> 00:35:59.160
<v Charlie>always trivial, because code is changing. But there's some stuff like that.

00:36:01.140 --> 00:36:06.700
<v Matthias>Just for context, maybe someone might not know, a sampling profiler.

00:36:06.880 --> 00:36:09.280
<v Matthias>How would you describe that in a sentence or two?

00:36:10.080 --> 00:36:16.900
<v Charlie>I mean, my understanding is it basically watches your program execute and it samples at some rate.

00:36:17.220 --> 00:36:21.940
<v Charlie>So it takes a bunch of samples as your program is running to try and get a representation

00:36:21.940 --> 00:36:23.240
<v Charlie>of where you're spending time.

00:36:23.500 --> 00:36:29.100
<v Charlie>So maybe it takes, just for stupid numbers, maybe it takes a thousand samples.

00:36:29.100 --> 00:36:32.060
<v Charlie>Or let's say 100 samples, and

00:36:32.060 --> 00:36:34.860
<v Charlie>in 10 of those you're in this function. Then, you know, it would say, hey,

00:36:34.860 --> 00:36:37.840
<v Charlie>I found we were running this function 10% of the time that I sampled,

00:36:37.840 --> 00:36:41.040
<v Charlie>and here is where it was being called from, and here's how you're spending

00:36:41.040 --> 00:36:45.500
<v Charlie>time. So it basically watches your program execute and tries to figure out,

00:36:45.500 --> 00:36:50.020
<v Charlie>probabilistically, by sampling the behavior, where you're spending your time.

00:36:50.020 --> 00:36:52.660
<v Charlie>That's my understanding. I've never built one of these, so it actually might work

00:36:52.660 --> 00:36:55.940
<v Charlie>totally differently than that, but that's my understanding.

00:36:55.940 --> 00:36:56.760
<v Matthias>That's also my understanding.
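
NOTE
Editorial sketch, not from the episode: the arithmetic behind "100 samples, 10 of them in this function" written out in Python. It covers only the aggregation step. The stacks are made-up data and the function names are hypothetical; a real sampling profiler such as samply captures the stacks from a running process.
```python
from collections import Counter
def summarize(samples: list[list[str]]) -> dict[str, float]:
    """Return the percentage of samples in which each function was on the stack."""
    if not samples:
        return {}
    seen = Counter()
    for stack in samples:
        seen.update(set(stack))  # count each function at most once per sample
    return {name: 100.0 * count / len(samples) for name, count in seen.items()}
# Made-up data: 100 samples, outermost frame first, 10 of which reach parse_version.
samples = [["main", "resolve"]] * 90 + [["main", "resolve", "parse_version"]] * 10
report = summarize(samples)
for name in sorted(report, key=report.get, reverse=True):
    print(f"{name:15} {report[name]:5.1f}% of samples")
```
Because the result is a share of samples rather than a measured duration, it is an estimate that gets more reliable as the number of samples grows.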

00:36:56.760 --> 00:37:01.180
<v Charlie>Which is a little different than, like, in Valgrind, for example. Like, rustc,

00:37:01.320 --> 00:37:06.780
<v Charlie>I know rustc, I think, does this too. You can do profiling based on instruction

00:37:06.780 --> 00:37:08.900
<v Charlie>counting, so you actually look at,

00:37:09.630 --> 00:37:12.850
<v Charlie>like the instructions that are generated or executed or whatever.

00:37:13.030 --> 00:37:15.550
<v Charlie>And you, you try and do something that's, you don't look at the behavior of

00:37:15.550 --> 00:37:17.850
<v Charlie>the program. You don't like run the program.

00:37:18.370 --> 00:37:23.390
<v Charlie>You try to look at like, or sorry, you run the program, but you look at a different

00:37:23.390 --> 00:37:26.130
<v Charlie>thing. You're not, you're not focused on time, like wall time.

00:37:26.410 --> 00:37:29.010
<v Charlie>You're looking at like what instructions are being executed,

00:37:29.010 --> 00:37:30.870
<v Charlie>how many instructions are being executed.

00:37:31.090 --> 00:37:36.310
<v Charlie>So it gives you a little bit more of a quantitative look, I guess, at

00:37:36.310 --> 00:37:41.090
<v Charlie>where you might be spending time. And I believe rustc has continuous benchmarks

00:37:41.090 --> 00:37:44.270
<v Charlie>that look at... actually, I could be totally wrong. There's something that I've

00:37:44.270 --> 00:37:47.310
<v Charlie>been shown before in the Rust ecosystem that does continuous benchmarking based

00:37:47.310 --> 00:37:50.030
<v Charlie>on instruction counting.

00:37:50.030 --> 00:37:54.270
<v Matthias>I was kind of thinking that cargo flamegraph would do that, in comparison to

00:37:54.270 --> 00:37:58.690
<v Matthias>samply. I never used samply before, but my go-to tool for flame graphs is

00:37:58.690 --> 00:38:01.030
<v Matthias>always cargo flamegraph. I'm not sure if you've used that one.

00:38:01.030 --> 00:38:05.510
<v Charlie>I think I have, but it's been a while, yeah.

00:38:05.510 --> 00:38:07.290
<v Matthias>And you find samply to be more ergonomic?

00:38:07.290 --> 00:38:10.230
<v Charlie>I found it to be easier, yeah. Yeah, it's just

00:38:10.230 --> 00:38:13.170
<v Charlie>what we tend to use on the team. But, you know, I'm sure

00:38:13.170 --> 00:38:15.890
<v Charlie>that there are lots of good options. It's just that

00:38:15.890 --> 00:38:19.490
<v Charlie>it just works seamlessly across macOS and Linux, which is really nice. So

00:38:19.490 --> 00:38:22.770
<v Charlie>with a lot of these tools, it's maybe hard to get them to work on one

00:38:22.770 --> 00:38:25.810
<v Charlie>or the other, for example. Not always. I don't know if that's true of cargo

00:38:25.810 --> 00:38:28.810
<v Charlie>flamegraph. I just mean, of things we've used in the past, sometimes it's like,

00:38:28.810 --> 00:38:32.310
<v Charlie>"Oh, this works great on Linux, but not really on macOS." And I think samply tends

00:38:32.310 --> 00:38:35.110
<v Charlie>to work really well on both.

00:38:35.110 --> 00:38:39.570
<v Matthias>Cargo flamegraph gives you an SVG file that you can open in your browser, and it's sort of

00:38:39.570 --> 00:38:42.950
<v Charlie>Interactive yeah yeah yeah pretty basic as well.

00:38:42.950 --> 00:38:47.890
<v Matthias>I don't know, there's probably support for the Chrome profiler, which also

00:38:47.890 --> 00:38:50.250
<v Matthias>supports flame graphs, but I'm not sure.

00:38:50.250 --> 00:38:53.790
<v Charlie>Yeah, yeah. So the thing that samply gives you in the browser, you can

00:38:53.790 --> 00:38:57.890
<v Charlie>click into the flame graph, you can filter by name, you can filter traces

00:38:57.890 --> 00:39:02.370
<v Charlie>by name and stuff. It's pretty nice. But again, some things

00:39:02.370 --> 00:39:06.390
<v Charlie>are just really hard to understand in a flame graph. Like, you know, if you have a lot of, like,

00:39:07.280 --> 00:39:10.760
<v Charlie>like uv is very async and, like, we have lots, you know, we have like different

00:39:10.760 --> 00:39:14.780
<v Charlie>stuff going on. And so sometimes it's hard to tell like where time is actually being spent.

00:39:15.000 --> 00:39:19.400
<v Charlie>Like it's really things that are spending lots of CPU time are always nice because

00:39:19.400 --> 00:39:21.700
<v Charlie>they're obvious and you can find them and fix them.

00:39:22.300 --> 00:39:26.000
<v Charlie>But things where it's like, oh, the scheduling is like slightly off or like

00:39:26.000 --> 00:39:27.820
<v Charlie>we're waiting here, but like blah, blah, blah.

00:39:27.920 --> 00:39:31.880
<v Charlie>Like, those tend to be harder, like more pernicious bugs to find.

00:39:31.880 --> 00:39:38.040
<v Matthias>By that definition, isn't it true that profiling or benchmarking ruff is easier

00:39:38.040 --> 00:39:43.260
<v Matthias>than uv, because ruff is inherently CPU-bound? You do a lot of computations, I'm assuming.

00:39:43.260 --> 00:39:46.220
<v Charlie>Yeah, I think it's much easier, yeah. Like, ruff does

00:39:46.220 --> 00:39:49.000
<v Charlie>have a fair amount of, uh, it does have IO, right, in

00:39:49.000 --> 00:39:52.040
<v Charlie>the sense that we're reading files from disk to analyze

00:39:52.040 --> 00:39:55.860
<v Charlie>them. But that IO

00:39:55.860 --> 00:39:58.700
<v Charlie>is, I would say, a

00:39:58.700 --> 00:40:02.360
<v Charlie>lot more stable and more minimal than

00:40:02.360 --> 00:40:05.520
<v Charlie>in uv. And also it sort of

00:40:05.520 --> 00:40:08.400
<v Charlie>happens up front. Like, we read the files and then we analyze them,

00:40:08.400 --> 00:40:11.900
<v Charlie>whereas in uv there's kind of just constantly IO

00:40:11.900 --> 00:40:14.760
<v Charlie>happening, whether it's we're reading

00:40:14.760 --> 00:40:17.540
<v Charlie>stuff from disk from the user's project, or

00:40:17.540 --> 00:40:21.940
<v Charlie>reading stuff from our on-disk cache, or we're making small

00:40:21.940 --> 00:40:24.880
<v Charlie>HTTP requests or large HTTP requests. We're downloading

00:40:24.880 --> 00:40:28.180
<v Charlie>and unzipping files. Like, there's just constantly IO

00:40:28.180 --> 00:40:30.720
<v Charlie>happening. So it tends to be a bit

00:40:30.720 --> 00:40:33.840
<v Charlie>harder to benchmark. Often, yeah, like,

00:40:33.840 --> 00:40:36.960
<v Charlie>often we'll, you know, often we'll benchmark by,

00:40:36.960 --> 00:40:40.140
<v Charlie>we'll actually run uv with the cache, because at

00:40:40.140 --> 00:40:43.120
<v Charlie>least then it's a lot more stable. It's like,

00:40:43.120 --> 00:40:46.660
<v Charlie>we're doing the work, but we're reading from the cache every time. As opposed

00:40:46.660 --> 00:40:51.460
<v Charlie>to, if you want to benchmark anything that requires network IO, it's really hard.

00:40:51.460 --> 00:40:56.800
<v Charlie>It's very hard, because the amount of variance that you'll get will usually

00:40:56.800 --> 00:41:00.580
<v Charlie>dwarf anything that you would see in CPU code changes.

00:41:00.580 --> 00:41:06.540
<v Matthias>That is true, but at the same time, you have to be careful not to change your targets. You need to be sure

00:41:06.540 --> 00:41:07.560
<v Charlie>That you're benchmarking.

00:41:07.560 --> 00:41:08.280
<v Matthias>The right thing.

00:41:08.280 --> 00:41:13.760
<v Charlie>Yes, absolutely, yes. And yeah, we do make a distinction between that.

00:41:13.760 --> 00:41:17.560
<v Charlie>We think of it as warm performance versus cold performance. It's like performance

00:41:17.560 --> 00:41:21.120
<v Charlie>when you have stuff in the cache versus when you don't. And we do look at both.

00:41:21.120 --> 00:41:24.840
<v Charlie>There's some things you can do, like you can set up a network link conditioner.

00:41:25.080 --> 00:41:28.940
<v Charlie>That's what it's called on macOS at least. So you can intentionally throttle

00:41:28.940 --> 00:41:32.760
<v Charlie>your own network connection to try and get it to be a bit more consistent,

00:41:32.980 --> 00:41:35.960
<v Charlie>like bring it down to something that would hopefully be a bit more consistent.

00:41:36.990 --> 00:41:40.670
<v Charlie>But again, that's also different because, well, it measures something,

00:41:40.850 --> 00:41:43.650
<v Charlie>but it measures something slightly different than what you would get on your machine typically.

00:41:43.990 --> 00:41:47.450
<v Charlie>Like I have a very high speed internet connection. So like, you know,

00:41:47.530 --> 00:41:50.750
<v Charlie>the bottlenecks that I experience are different when I throttle versus when

00:41:50.750 --> 00:41:54.190
<v Charlie>I don't, because when I throttle, the network is slower.

00:41:54.670 --> 00:41:58.450
<v Charlie>And so if we need to do things at the same time, it's easier.

00:41:58.610 --> 00:42:01.930
<v Charlie>But when my network connection is really fast, like

00:42:01.930 --> 00:42:04.810
<v Charlie>actually operations on the CPU can actually become,

00:42:04.810 --> 00:42:08.210
<v Charlie>or like local, sorry, on-disk IO can become blocking, because

00:42:08.210 --> 00:42:11.450
<v Charlie>maybe I'm streaming it really fast

00:42:11.450 --> 00:42:14.530
<v Charlie>and trying to unzip a file and write it to disk. Like, the bottlenecks are

00:42:14.530 --> 00:42:17.390
<v Charlie>just a little bit different. The other thing we'll do sometimes is,

00:42:17.390 --> 00:42:20.130
<v Charlie>if we do need to look at anything related to, like,

00:42:20.130 --> 00:42:23.270
<v Charlie>the HTTP stack, we'll just run a local server.

00:42:23.270 --> 00:42:26.770
<v Charlie>So we're still... again, it's different, but it

00:42:26.770 --> 00:42:30.010
<v Charlie>gives you some information, some predictability. Yeah,

00:42:30.010 --> 00:42:32.770
<v Charlie>yeah, yeah. At least, because at

00:42:32.770 --> 00:42:35.990
<v Charlie>least then you get consistency, you know,

00:42:35.990 --> 00:42:40.670
<v Charlie>to a greater degree at least, around what your measurements

00:42:40.670 --> 00:42:44.810
<v Charlie>look like. Yeah, the really hard thing with network IO is just that it can

00:42:44.810 --> 00:42:49.410
<v Charlie>be all over the place. And if you're trying to measure, like, a one to five percent

00:42:49.410 --> 00:42:54.870
<v Charlie>performance change, it's very hard to do it in the presence of making lots of network requests.

00:42:54.870 --> 00:43:00.850
<v Matthias>And in that context, I was surprised to learn that you're using a single-threaded tokio runtime.

00:43:01.510 --> 00:43:08.790
<v Matthias>Isn't that what you're supposed to do when you want super amazing high-performance IO in Rust,

00:43:08.950 --> 00:43:13.930
<v Matthias>to always use multiple threads in tokio? Like, why did you decide against that?

00:43:14.370 --> 00:43:19.210
<v Charlie>Yeah, I mean, I guess like the... Sorry, the simplest answer is just that,

00:43:19.950 --> 00:43:23.690
<v Charlie>we benchmarked it and we looked at it a lot. And then we found out that like,

00:43:24.190 --> 00:43:27.950
<v Charlie>basically we use a single tokio thread for IO is sort of the way that I would put it.

00:43:28.090 --> 00:43:33.590
<v Charlie>And we like, we found at least the theory, I guess, is that it reduces synchronization

00:43:33.590 --> 00:43:39.070
<v Charlie>costs and that we don't perform like enough IO for multiple threads to be worth it.

00:43:39.550 --> 00:43:43.650
<v Charlie>So like we're often, you know, what's like a lot of IO, right?

00:43:43.710 --> 00:43:46.550
<v Charlie>If you think about a web server, or

00:43:46.550 --> 00:43:49.630
<v Charlie>something that's super high throughput, and they're trying to do, like, thousands

00:43:49.630 --> 00:43:53.050
<v Charlie>of requests per second, that's pretty different than what we're doing. Because

00:43:53.050 --> 00:43:58.710
<v Charlie>for us, maybe we're downloading 20 packages at the same time, right? That's

00:43:58.710 --> 00:44:02.270
<v Charlie>20 different requests that are happening. It's a pretty

00:44:02.270 --> 00:44:05.670
<v Charlie>different amount of throughput. So, like,

00:44:06.290 --> 00:44:10.850
<v Charlie>we found that using a single runtime for, sorry, a single thread for IO, and

00:44:10.850 --> 00:44:11.790
<v Charlie>then being really careful,

00:44:13.050 --> 00:44:15.790
<v Charlie>about compute work, so trying to

00:44:15.790 --> 00:44:19.350
<v Charlie>make sure that we run compute work off that main thread, or off the main

00:44:19.350 --> 00:44:25.090
<v Charlie>thread, is also really important. So, for example, we have a solver in uv. We

00:44:25.090 --> 00:44:29.350
<v Charlie>have to solve for dependencies, right? Like, we get a big graph of the things that

00:44:29.350 --> 00:44:32.430
<v Charlie>people depend on, and then we look at the things that those packages depend on,

00:44:32.430 --> 00:44:34.070
<v Charlie>and we have to solve this, like, big

00:44:34.590 --> 00:44:37.690
<v Charlie>constraint satisfaction problem. That solver runs

00:44:37.690 --> 00:44:40.550
<v Charlie>on its own thread. So we move

00:44:40.550 --> 00:44:43.150
<v Charlie>that compute off to a different thread. So we have to be a little

00:44:43.150 --> 00:44:46.330
<v Charlie>bit careful about what we do on which thread

00:44:46.330 --> 00:44:49.170
<v Charlie>and how we orchestrate it.
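
NOTE
A minimal sketch of the split described here: keep the thread that drives IO
free, and run CPU-heavy work such as a solver on its own thread. This is not
uv's code. It assumes only the Rust standard library (no tokio), and `solve`
and `solve_off_thread` are hypothetical names made up for illustration.
```rust
use std::sync::mpsc;
use std::thread;
// Hypothetical stand-in for a CPU-heavy solver: sums the squares of its input.
fn solve(requirements: Vec<u64>) -> u64 {
    requirements.iter().map(|n| n * n).sum()
}
// Runs `solve` on a dedicated thread and hands the result back over a channel,
// so the calling thread (standing in for the IO thread) is not doing the compute.
fn solve_off_thread(requirements: Vec<u64>) -> u64 {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || {
        // A send error only means the receiver was dropped, so it is ignored here.
        let _ = tx.send(solve(requirements));
    });
    // The calling thread could keep servicing IO here before blocking on the result.
    let result = rx.recv().expect("solver thread dropped the sender");
    worker.join().expect("solver thread panicked");
    result
}
```
Calling `solve_off_thread(vec![1, 2, 3])` returns 14. In an async program the
blocking `recv` would typically be replaced by an awaitable channel.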

00:44:49.170 --> 00:44:52.130
<v Charlie>But we did actually find in practice that it was faster to just use

00:44:52.130 --> 00:44:54.890
<v Charlie>the single-threaded runtime. There's some

00:44:54.890 --> 00:44:57.770
<v Charlie>nice other things to it too. Like, we can

00:44:57.770 --> 00:45:00.710
<v Charlie>use Rc instead of Arc in some places, you know,

00:45:00.710 --> 00:45:04.990
<v Charlie>there's some minor quality-of-life things like that. But yeah,

00:45:04.990 --> 00:45:09.190
<v Charlie>we just make a lot of small network requests, and

00:45:09.190 --> 00:45:12.050
<v Charlie>we found that using the

00:45:12.050 --> 00:45:14.790
<v Charlie>single-threaded runtime empirically was faster for our

00:45:14.790 --> 00:45:17.650
<v Charlie>program. There was also a long conversation when we started uv

00:45:17.650 --> 00:45:21.150
<v Charlie>about whether we should use async Rust at all, which is

00:45:21.150 --> 00:45:24.070
<v Charlie>sort of another topic. Like, the thinking

00:45:24.070 --> 00:45:27.190
<v Charlie>there was, do we actually have enough

00:45:27.190 --> 00:45:30.650
<v Charlie>IO to demand async Rust

00:45:30.650 --> 00:45:33.310
<v Charlie>and/or some kind of multi-threaded runtime? Because that's maybe what we

00:45:33.310 --> 00:45:36.430
<v Charlie>would see as the main benefit: okay, we get all this thread coordination

00:45:36.430 --> 00:45:39.290
<v Charlie>and stuff. And I was pretty into it at the time.

00:45:39.290 --> 00:45:42.330
<v Charlie>Other people on the team sort of tried to talk me out of it and were

00:45:42.330 --> 00:45:45.210
<v Charlie>more like, okay, why don't we just manage our threads for

00:45:45.210 --> 00:45:48.430
<v Charlie>doing this kind of IO? I felt like,

00:45:48.430 --> 00:45:51.430
<v Charlie>I

00:45:51.430 --> 00:45:54.570
<v Charlie>felt like doing it with async Rust would be more like doing

00:45:54.570 --> 00:45:58.130
<v Charlie>it right, and maybe moving slightly more in the direction of the arc of

00:45:58.130 --> 00:46:02.870
<v Charlie>what the ecosystem wants. I don't actually know if that's proven to be true.

00:46:02.870 --> 00:46:07.030
<v Charlie>Like, I think we could have built uv without using async Rust

00:46:07.030 --> 00:46:10.130
<v Charlie>and it would have been fine, maybe I'll just put it that way. At the same time,

00:46:10.130 --> 00:46:12.090
<v Charlie>I actually find working with async Rust to be...

00:46:13.560 --> 00:46:16.640
<v Charlie>And I think it's actually improved fairly dramatically,

00:46:16.640 --> 00:46:20.120
<v Charlie>even since we started the project. Like, since we started the project,

00:46:20.120 --> 00:46:23.000
<v Charlie>the Rust team just keeps shipping, you know? Like, things

00:46:23.000 --> 00:46:25.720
<v Charlie>just keep getting... like, async now has, like,

00:46:25.720 --> 00:46:29.400
<v Charlie>async closures. Async,

00:46:29.400 --> 00:46:32.320
<v Charlie>async in traits didn't exist when we started the project.

00:46:32.320 --> 00:46:35.080
<v Charlie>We had to use the async-trait macro, for

00:46:35.080 --> 00:46:37.900
<v Charlie>example. There's just a lot of things in async Rust

00:46:37.900 --> 00:46:41.320
<v Charlie>that have actually been stabilized and improved in the last year. And

00:46:41.320 --> 00:46:44.200
<v Charlie>I really don't find myself having

00:46:44.200 --> 00:46:47.220
<v Charlie>to work around async or fight async, quote unquote, very much,

00:46:47.220 --> 00:46:49.960
<v Charlie>for whatever reason. I'm not saying that it's worth it

00:46:49.960 --> 00:46:52.660
<v Charlie>for every project, but I actually don't think it has been a big

00:46:52.660 --> 00:46:56.820
<v Charlie>challenge for us. I think the main challenge has been

00:46:56.820 --> 00:47:01.840
<v Charlie>this stuff around scheduling, and trying to understand, like, how do we schedule

00:47:01.840 --> 00:47:07.840
<v Charlie>really efficiently, and maybe feeling like we have slightly less control than

00:47:07.840 --> 00:47:11.460
<v Charlie>if we were hand-rolling our own approach to threading.

00:47:11.620 --> 00:47:16.740
<v Matthias>I mean, in this situation, there's sort of a middle ground as well,

00:47:16.920 --> 00:47:21.160
<v Matthias>which is instead of having a global tokio runtime,

00:47:21.840 --> 00:47:28.540
<v Matthias>let's say by annotating your main function with tokio::main, you could also use

00:47:28.540 --> 00:47:32.480
<v Matthias>structured concurrency, where the core of it is sync.

00:47:33.300 --> 00:47:38.760
<v Matthias>And when you use a lot of IO, when you need a lot of IO, you can start your

00:47:38.760 --> 00:47:46.480
<v Matthias>own little runtime, even within a function, say, or within a struct. Did you consider that approach as well?

00:47:46.480 --> 00:47:49.640
<v Charlie>Yeah, I think that's probably roughly what we would have done if we didn't do

00:47:49.640 --> 00:47:56.080
<v Charlie>this. And again, I think that could have been totally fine. But this also worked,

00:47:56.080 --> 00:48:01.040
<v Charlie>and so, I don't know, it's been good. I think there are challenges

00:48:01.040 --> 00:48:04.680
<v Charlie>with async, right? Like, the things that I mentioned before are becoming less of

00:48:04.680 --> 00:48:06.160
<v Charlie>a pain, but there's still a lot of pain,

00:48:06.580 --> 00:48:09.260
<v Charlie>like async closures, right? Things have to be Send and Sync.

00:48:10.820 --> 00:48:11.560
<v Charlie>It's sort of,

00:48:12.830 --> 00:48:16.810
<v Charlie>it's slightly infectious, right? Whereby like if one thing, if we need to call

00:48:16.810 --> 00:48:21.090
<v Charlie>one async function from another function, suddenly like the async propagates upward.

00:48:21.970 --> 00:48:26.290
<v Charlie>And there were some challenges like that. Like we, just as a random example,

00:48:26.650 --> 00:48:33.990
<v Charlie>like our Git implementation, I originally basically vendored from Cargo.

00:48:34.270 --> 00:48:37.990
<v Charlie>Not exactly, but like I looked at Cargo's Git implementation and I was like,

00:48:38.090 --> 00:48:40.010
<v Charlie>okay, Cargo is good at Git.

00:48:40.190 --> 00:48:42.930
<v Charlie>How do we deal with Git? And I sort of

00:48:42.930 --> 00:48:46.250
<v Charlie>started with what they had, and then we changed it pretty significantly

00:48:46.250 --> 00:48:49.470
<v Charlie>over time. But that wasn't async, right?

00:48:49.470 --> 00:48:52.370
<v Charlie>And so we were calling into

00:48:52.370 --> 00:48:57.390
<v Charlie>it from an async, you know, from an async runtime. That ended up actually causing

00:48:57.390 --> 00:49:01.150
<v Charlie>kind of a lot of problems that were pretty annoying to debug. Because,

00:49:01.150 --> 00:49:05.910
<v Charlie>for example, they need to make network requests, and it was using,

00:49:05.910 --> 00:49:11.750
<v Charlie>I think it was using curl or something. And I was like, okay, but we use reqwest.

00:49:12.290 --> 00:49:15.230
<v Charlie>I only want to use one networking stack. So let's replace it with reqwest.

00:49:15.350 --> 00:49:17.870
<v Charlie>But okay, if we want to replace it with reqwest, and it's going to be sync,

00:49:18.050 --> 00:49:19.470
<v Charlie>then we have to use reqwest blocking.

00:49:20.290 --> 00:49:24.270
<v Charlie>And then it was actually for a period of time, I can't remember how we fixed

00:49:24.270 --> 00:49:27.930
<v Charlie>this, it was actually impossible, because then reqwest blocking actually starts,

00:49:28.130 --> 00:49:30.130
<v Charlie>I believe it uses async internally.

00:49:30.450 --> 00:49:33.290
<v Charlie>And so it was like we were starting a tokio runtime within a tokio runtime.

00:49:33.570 --> 00:49:34.930
<v Charlie>And like that would just error, right?

00:49:35.070 --> 00:49:38.690
<v Charlie>And so, so it was like, we were trying to take some code that wasn't async and

00:49:38.690 --> 00:49:39.830
<v Charlie>use it in an async context.

00:49:39.830 --> 00:49:44.030
<v Charlie>And it was just kind of, they kind of butt heads a little bit. So you do run

00:49:44.030 --> 00:49:48.170
<v Charlie>into stuff like that by buying into async. But in general, I think the ergonomics

00:49:48.170 --> 00:49:52.810
<v Charlie>of it are actually quite good, and I don't think we pay much of a high

00:49:52.810 --> 00:49:55.930
<v Charlie>cost for using it. And hopefully we'll get more and more out of it over time,

00:49:55.930 --> 00:49:58.670
<v Charlie>is a little bit of how I think about it, yeah.

00:49:58.670 --> 00:50:05.190
<v Matthias>Exactly. So I would assume that the majority of the problems with async Rust

00:50:05.190 --> 00:50:06.910
<v Matthias>that you mentioned, they don't really

00:50:08.450 --> 00:50:12.770
<v Matthias>affect you, because you don't think

00:50:12.770 --> 00:50:19.730
<v Matthias>library-first. Your user-facing interface is the CLI. But if you were

00:50:19.730 --> 00:50:20.270
<v Matthias>to build

00:50:20.270 --> 00:50:29.270
<v Matthias>a library, and it used tokio as a sort of public interface, so it was async to

00:50:29.270 --> 00:50:34.410
<v Matthias>begin with, then that can cause some headaches with integration, because...

00:50:34.410 --> 00:50:37.910
<v Charlie>100%. Yeah, sorry. This is actually a great point. That's actually maybe

00:50:37.910 --> 00:50:43.130
<v Charlie>the thing that's most painful about using async is like there are all these

00:50:43.130 --> 00:50:48.230
<v Charlie>crates that we depend on that we have to depend on basically like the async

00:50:48.230 --> 00:50:50.230
<v Charlie>versions of those crates or the async interfaces.

00:50:50.510 --> 00:50:53.950
<v Charlie>And if I look at all those crates, they have to actually maintain like a tokio

00:50:53.950 --> 00:50:58.870
<v Charlie>interface and a sync interface and maybe, like, an async-std interface sometimes.

00:50:59.790 --> 00:51:02.770
<v Charlie>And for example, like we have...

00:51:03.660 --> 00:51:07.780
<v Charlie>tar-rs, okay? Really popular, common

00:51:07.780 --> 00:51:11.120
<v Charlie>crate for creating and untarring,

00:51:11.120 --> 00:51:18.580
<v Charlie>creating tarballs and untarring them. And we needed, like, we wanted

00:51:18.580 --> 00:51:22.720
<v Charlie>an async tar implementation. So it turns out there's something called... I'm

00:51:22.720 --> 00:51:25.120
<v Charlie>going to get the exact names of these things wrong because they're all so similar,

00:51:25.120 --> 00:51:29.140
<v Charlie>but there's something called async-tar, which is an async-std

00:51:30.420 --> 00:51:33.100
<v Charlie>port of tar-rs. It's meant to be... they took

00:51:33.100 --> 00:51:35.820
<v Charlie>tar-rs and made it async with async-std. Well, we can't

00:51:35.820 --> 00:51:38.560
<v Charlie>really use that, because we're using tokio and not async-std, and we

00:51:38.560 --> 00:51:41.700
<v Charlie>don't want to have these two huge dependencies on different

00:51:41.700 --> 00:51:44.460
<v Charlie>async runtimes. Okay, so it turns out

00:51:44.460 --> 00:51:47.880
<v Charlie>that got forked to something called tokio-tar.

00:51:47.880 --> 00:51:50.960
<v Charlie>So it went from tar-rs to async-tar

00:51:50.960 --> 00:51:53.740
<v Charlie>to tokio-tar. And then that crate

00:51:53.740 --> 00:51:57.360
<v Charlie>actually got forked, like, two more times, um, just

00:51:57.360 --> 00:52:00.600
<v Charlie>by different people, because it wasn't really maintained. And then eventually

00:52:00.600 --> 00:52:03.620
<v Charlie>we forked it ourselves to fix

00:52:03.620 --> 00:52:07.020
<v Charlie>some bugs, and now we fully maintain that. Like,

00:52:07.020 --> 00:52:10.160
<v Charlie>we just maintain it. That's actually a public crate that we published outside

00:52:10.160 --> 00:52:13.240
<v Charlie>of uv, where we've, like, we upstreamed, or

00:52:13.240 --> 00:52:16.560
<v Charlie>we downstreamed, I guess, a lot of things from tar-rs, and

00:52:16.560 --> 00:52:19.320
<v Charlie>we fixed some bad bugs that users were hitting.

00:52:19.320 --> 00:52:22.420
<v Charlie>And so the whole, like, the ecosystem problem,

00:52:22.420 --> 00:52:26.360
<v Charlie>I actually have no idea how to solve that, which is, like, it

00:52:26.360 --> 00:52:29.200
<v Charlie>seems like a huge pain for crates to have to maintain all

00:52:29.200 --> 00:52:31.960
<v Charlie>these different interfaces. And then for us, it's like,

00:52:31.960 --> 00:52:34.920
<v Charlie>we have two different zip crates. We use zip-rs

00:52:34.920 --> 00:52:38.480
<v Charlie>and then we also use async_zip, because

00:52:38.480 --> 00:52:41.800
<v Charlie>we have slightly different contexts in which they run. And that's maybe

00:52:41.800 --> 00:52:47.060
<v Charlie>an us problem. But that is, I think, probably the most challenging

00:52:47.060 --> 00:52:50.460
<v Charlie>piece: just the touch points with the rest of the ecosystem when you want

00:52:50.460 --> 00:52:53.440
<v Charlie>to pull in async versions of things, or people have to maintain async versions

00:52:53.440 --> 00:52:56.600
<v Charlie>of things. Like, okay, we're going to maintain

00:52:56.820 --> 00:53:01.220
<v Charlie>this async tar crate forever, because the tar-rs crate is sync? Like, that's

00:53:01.220 --> 00:53:04.280
<v Charlie>kind of a bad outcome, but I have no idea how to solve it.

00:53:04.280 --> 00:53:06.040
<v Charlie>I don't know if that's what you're getting at. You're probably talking about

00:53:06.040 --> 00:53:09.400
<v Charlie>it more from the library-maintaining perspective of having to expose tokio interfaces.

00:53:09.400 --> 00:53:12.860
<v Charlie>Like, a lot of crates will have a tokio feature, right, that pulls that stuff

00:53:12.860 --> 00:53:16.180
<v Charlie>in. But yeah, it's a little painful.

00:53:16.180 --> 00:53:19.280
<v Matthias>I don't know what's the way out of it here, because in reality,

00:53:19.280 --> 00:53:27.580
<v Matthias>well, the ecosystem is still evolving. async-std is sort of dead, so that's off

00:53:27.580 --> 00:53:31.700
<v Matthias>the table. But I do believe that there's merit in having a sync interface and

00:53:31.700 --> 00:53:36.260
<v Matthias>then an async interface on top of that, which is a separate crate, and

00:53:36.260 --> 00:53:39.080
<v Matthias>the crate ecosystem allows for it. But yeah.

00:53:39.080 --> 00:53:45.600
<v Charlie>Yeah, yeah, yeah. I actually did once, maybe just to illustrate how confusing this

00:53:45.600 --> 00:53:47.360
<v Charlie>stuff is and how little I know about it.

00:53:47.520 --> 00:53:52.680
<v Charlie>Like early on, I did actually pull in a crate that used async-std.

00:53:53.380 --> 00:53:57.100
<v Charlie>And I wasn't really thinking about it very hard. And I just saw I had like an

00:53:57.100 --> 00:54:00.360
<v Charlie>async interface and I needed to call like .compat on like a couple of things

00:54:00.360 --> 00:54:02.240
<v Charlie>or something. And I was like, okay, cool, like this works.

00:54:02.440 --> 00:54:07.540
<v Charlie>And then like, obviously like massively bloated our dependency graph and everything.

00:54:07.960 --> 00:54:10.240
<v Charlie>And people on my team were like, did you do this intentionally?

00:54:10.380 --> 00:54:12.120
<v Charlie>And I was like, I don't even really know what the difference,

00:54:12.280 --> 00:54:14.840
<v Charlie>like at that point in time, I was like, I don't really know like what the difference

00:54:14.840 --> 00:54:15.440
<v Charlie>is between these things.

00:54:15.620 --> 00:54:18.920
<v Charlie>They're just async, right? Blah, blah, blah. So like, it's super confusing.

00:54:19.660 --> 00:54:23.020
<v Charlie>It is helpful that things have centralized more on tokio, I think.

00:54:23.020 --> 00:54:26.680
<v Charlie>But it's a very easy mistake to make, and it's not at all clear

00:54:26.680 --> 00:54:29.640
<v Charlie>how these things relate to one another, honestly.

00:54:29.640 --> 00:54:36.080
<v Matthias>Yeah, yeah. It had a ripple effect on the ecosystem, which we still deal with today.

00:54:36.080 --> 00:54:41.620
<v Matthias>But it's a step in the right direction that the Future trait is now in the

00:54:41.620 --> 00:54:47.740
<v Matthias>Rust prelude. So it feels like, as you say, things are progressing. Right.

00:54:48.280 --> 00:54:54.160
<v Matthias>Now, another thing that I wanted to touch on, which also kind of is interesting

00:54:54.160 --> 00:54:58.580
<v Matthias>because you diverge from the norm a bit, is parser generators.

00:54:59.080 --> 00:54:59.820
<v Charlie>Oh, yeah.

00:55:00.480 --> 00:55:06.120
<v Matthias>I think you decided to switch to a handwritten parser in ruff.

00:55:07.040 --> 00:55:10.660
<v Matthias>And I wonder, first off, what was the decision process there?

00:55:10.820 --> 00:55:15.260
<v Matthias>And second, how do you handle the complexity that comes with it?

00:55:15.910 --> 00:55:21.110
<v Charlie>Yeah, great question. So like, originally, the parser in ruff came from a project

00:55:21.110 --> 00:55:25.230
<v Charlie>called RustPython, which is, I guess, in some ways, like an even more ambitious

00:55:25.230 --> 00:55:26.810
<v Charlie>project, because it's a whole Python interpreter.

00:55:26.990 --> 00:55:29.370
<v Charlie>So they're actually trying to build like in Rust, they're trying to build a

00:55:29.370 --> 00:55:30.450
<v Charlie>run, you know, a whole runtime.

00:55:30.730 --> 00:55:34.950
<v Charlie>So like, as part of that, though, it has to parse code. And so we took that

00:55:34.950 --> 00:55:37.850
<v Charlie>I took that parser, and we depended on it as a library.

00:55:38.670 --> 00:55:43.990
<v Charlie>And that parser was based on a parser generator called LALRPOP, or L-A-L-R-P-O-P.

00:55:43.990 --> 00:55:46.370
<v Charlie>Again, I don't really know how to pronounce anything because I spend all my

00:55:46.370 --> 00:55:47.670
<v Charlie>time just on the internet.

00:55:48.550 --> 00:55:54.090
<v Charlie>Same. But it's basically like you create a LALRPOP file.

00:55:54.790 --> 00:55:59.090
<v Charlie>It's a DSL, but it also can include Rust code verbatim at different points.

00:55:59.090 --> 00:56:01.050
<v Charlie>So you have something that kind of looks like the grammar.

00:56:02.270 --> 00:56:06.570
<v Charlie>But the thing that we were finding, when we started using that,

00:56:06.990 --> 00:56:09.810
<v Charlie>the first, I guess, sign of trouble was that RustPython

00:56:09.810 --> 00:56:13.030
<v Charlie>didn't support several syntax

00:56:13.030 --> 00:56:17.090
<v Charlie>features in Python, newish features from the last couple of years,

00:56:17.090 --> 00:56:21.930
<v Charlie>like match statements. Python has support for pattern matching, and that was added

00:56:21.930 --> 00:56:27.350
<v Charlie>in 3.10, I think, and it didn't support it. And I was like, okay. And it turns out

00:56:27.350 --> 00:56:30.470
<v Charlie>that that's because in Python 3...

00:56:30.950 --> 00:56:34.570
<v Charlie>Oh my god, I'm gonna mess this up. I think in 3.9, maybe, they switched

00:56:34.570 --> 00:56:39.810
<v Charlie>their parser. So they moved from an LR(1)... this is not important if you don't know what

00:56:39.810 --> 00:56:44.470
<v Charlie>these things are, but they moved from an LR(1) parser to a PEG parser, a PEG parser.

00:56:44.990 --> 00:56:48.310
<v Charlie>And basically it meant that the grammar got more flexible.

00:56:48.630 --> 00:56:52.930
<v Charlie>So they were able to have things that would be called soft keywords.

00:56:53.190 --> 00:56:57.010
<v Charlie>So for example, match is a valid variable name.

00:56:57.210 --> 00:57:01.670
<v Charlie>You could do match equals one, but it's also a keyword because you can do match

00:57:01.670 --> 00:57:06.330
<v Charlie>object, colon, and then patterns, right? So it's both a valid variable name and a keyword.

00:57:06.610 --> 00:57:09.910
<v Charlie>And whether it's a keyword depends on the context around it.

00:57:10.270 --> 00:57:14.510
<v Charlie>So it depends, like the parser has to be able to support both those use cases.
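
NOTE
Editor's sketch, not from the episode. A minimal illustration of why a soft
keyword needs context: `match = 1` and `match command:` begin with the same
word. The function name match_is_keyword and the line-based heuristic are
assumptions made for this example; CPython's PEG parser and Ruff's handwritten
parser decide this with real token lookahead, not like this.
```rust
/// Returns true when a single source line uses `match` as a statement keyword.
/// Hypothetical, line-based heuristic for illustration only.
fn match_is_keyword(line: &str) -> bool {
    // The line must start with the text `match`.
    let Some(rest) = line.trim().strip_prefix("match") else {
        return false;
    };
    // A match statement ends with a colon; `match = 1` does not.
    let Some(subject) = rest.strip_suffix(':') else {
        return false;
    };
    // Keyword use: whitespace right after `match`, then a non-empty subject.
    rest.starts_with(char::is_whitespace) && !subject.trim().is_empty()
}
```
Under these assumptions, match_is_keyword("match command:") is true, while
"match = 1" and "matches = [1]" are treated as ordinary identifiers.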

00:57:15.820 --> 00:57:20.420
<v Charlie>I'm just thinking, I think async, now I can't remember.

00:57:20.560 --> 00:57:23.780
<v Charlie>Async might be a soft keyword as well, but largely it's for things that were

00:57:23.780 --> 00:57:25.720
<v Charlie>added where they don't want to actually,

00:57:25.880 --> 00:57:29.380
<v Charlie>they don't necessarily want to make changes that are backwards incompatible

00:57:29.380 --> 00:57:33.760
<v Charlie>in the sense that there's a lot of existing Python code that needs to run on

00:57:33.760 --> 00:57:37.120
<v Charlie>new versions of Python that might use match as a variable. And so that code

00:57:37.120 --> 00:57:38.040
<v Charlie>should continue to work.

00:57:38.400 --> 00:57:44.040
<v Charlie>So the grammar got more complicated and they added a more powerful parser to support that.

00:57:44.040 --> 00:57:48.620
<v Charlie>Now, the parser that we were using, based on LALRPOP, couldn't support that. There were, like,

00:57:49.420 --> 00:57:53.500
<v Charlie>ambiguities in the grammar that were very hard to represent in LALRPOP,

00:57:53.500 --> 00:58:01.420
<v Charlie>and we had to get increasingly good at LALRPOP in order to do it. So, like, in

00:58:01.420 --> 00:58:05.600
<v Charlie>terms of, like, the precedence of the statements, the way that you do, like,

00:58:05.600 --> 00:58:08.620
<v Charlie>basically we were just learning this tool really deeply and having to put

00:58:08.620 --> 00:58:11.580
<v Charlie>more and more work into actually supporting the Python grammar.

00:58:13.100 --> 00:58:18.420
<v Charlie>So what it felt like at the time was there were a few properties that we wanted,

00:58:19.060 --> 00:58:20.980
<v Charlie>that we thought we could get out of a new parser.

00:58:21.560 --> 00:58:26.600
<v Charlie>One was we thought we could make it a lot faster to start just out of the box.

00:58:27.160 --> 00:58:32.540
<v Charlie>Two, we thought it would be much easier for us to optimize further because it's

00:58:32.540 --> 00:58:34.640
<v Charlie>very hard to optimize like a parser generator.

00:58:34.820 --> 00:58:38.340
<v Charlie>Like the code is generated, right? As it sounds. And so like,

00:58:38.540 --> 00:58:41.820
<v Charlie>you can't like, I mean, there's only so much you can do to like optimize the

00:58:41.820 --> 00:58:44.040
<v Charlie>generated code, right? It's like kind of out of your control.

00:58:44.620 --> 00:58:49.540
<v Charlie>And the third was error recovery. We wanted like much better control over error

00:58:49.540 --> 00:58:54.300
<v Charlie>recovery so that, especially because we're building tooling that's designed to work in an editor.

00:58:54.520 --> 00:58:57.920
<v Charlie>So, like, if you type a syntax error, like if you type

00:58:58.660 --> 00:59:02.220
<v Charlie>"def", space, and you're, like, starting to type a function, we still

00:59:02.220 --> 00:59:05.000
<v Charlie>want to be able to parse as much of the file as we

00:59:05.000 --> 00:59:08.720
<v Charlie>can, even though it's not syntactically valid anymore. And

00:59:08.720 --> 00:59:11.740
<v Charlie>that's, like, pretty hard. Like, there are

00:59:11.740 --> 00:59:14.920
<v Charlie>probably parser generators that support that to different degrees, but,

00:59:14.920 --> 00:59:18.220
<v Charlie>like, a lot of that requires fairly bespoke

00:59:18.220 --> 00:59:21.440
<v Charlie>handling of, like, what happens when you hit an error and,

00:59:21.440 --> 00:59:24.540
<v Charlie>like, what the different fallback cases could be. So there

00:59:24.540 --> 00:59:28.000
<v Charlie>were things we wanted out of a parser generator, which were, like, performance

00:59:28.000 --> 00:59:33.580
<v Charlie>and error recovery. And then we also found that we were spending a lot of time

00:59:33.580 --> 00:59:36.940
<v Charlie>just trying to get the, the parser generator was supposed to save you time, but ultimately

00:59:36.940 --> 00:59:39.600
<v Charlie>we were spending a lot of time trying to get it to work for the grammar in the

00:59:39.600 --> 00:59:44.400
<v Charlie>first place. So we were like, we want to rewrite the parser. We were pretty sure about that.

00:59:45.000 --> 00:59:50.680
<v Charlie>And actually, a contributor came to us and said that they were working on, they

00:59:50.680 --> 00:59:54.320
<v Charlie>wanted to write, like, a handwritten parser. If I recall correctly, I think it

00:59:54.320 --> 00:59:55.800
<v Charlie>was actually for, like, a master's project.

00:59:57.620 --> 01:00:02.380
<v Charlie>And they were like, would you ever, like, use it in Ruff if I can get it to work

01:00:02.380 --> 01:00:06.100
<v Charlie>with the Ruff AST? Like, I'm building around the Ruff AST. Sorry, they were building

01:00:06.100 --> 01:00:08.840
<v Charlie>around the Ruff AST, and they were like, if I can get it to pass the Ruff test

01:00:08.840 --> 01:00:12.440
<v Charlie>suite, would you want to use it in Ruff? And we were like, yeah, absolutely, and

01:00:12.440 --> 01:00:15.680
<v Charlie>we'll pay you for it. And so we, like,

01:00:16.300 --> 01:00:20.080
<v Charlie>there was a fair amount of, so this contributor, like, brought us this parser, and

01:00:20.080 --> 01:00:24.100
<v Charlie>there's a fair amount of kind of, like, last-mile work of making sure that,

01:00:25.190 --> 01:00:27.710
<v Charlie>you know, because we're being used by all these big projects,

01:00:27.910 --> 01:00:31.030
<v Charlie>we need to make sure shipping a parser is like a huge change.

01:00:31.250 --> 01:00:36.290
<v Charlie>So we did end up investing a lot of time in like the final 10% of like making

01:00:36.290 --> 01:00:39.190
<v Charlie>sure that it works in all cases and that it won't panic for people and that

01:00:39.190 --> 01:00:41.390
<v Charlie>it will work exactly as we expect.

01:00:41.590 --> 01:00:44.870
<v Charlie>It's the kind of thing that it's pretty easy to test at very large scale because

01:00:44.870 --> 01:00:51.070
<v Charlie>we just, we can compare, basically, for example, the diagnostics that we get on very large projects.

01:00:51.450 --> 01:00:54.290
<v Charlie>There are arbitrarily large projects, and we can run Ruff before and

01:00:54.290 --> 01:00:57.190
<v Charlie>after on those projects and make sure the diagnostics are completely unchanged,

01:00:57.190 --> 01:01:01.170
<v Charlie>for example. So we did a lot of large-scale testing, and it

01:01:01.170 --> 01:01:03.950
<v Charlie>was actually, like, an incredibly smooth release. Like, we

01:01:03.950 --> 01:01:06.830
<v Charlie>didn't, I don't think we got, like, we got, like, maybe one bug report, which

01:01:06.830 --> 01:01:09.530
<v Charlie>is amazing. I didn't even work on this, so I can brag about it a lot.

01:01:09.530 --> 01:01:12.830
<v Charlie>But, like, it's amazing that that happened. And ultimately,

01:01:12.830 --> 01:01:16.610
<v Charlie>it means we now have our own parser. It's completely handwritten, and

01:01:16.610 --> 01:01:20.070
<v Charlie>it's been way easier for us to modify it over time, because,

01:01:20.070 --> 01:01:23.270
<v Charlie>like, it is work, for sure, like,

01:01:23.270 --> 01:01:26.270
<v Charlie>the grammar gets extended, but we have complete control

01:01:26.270 --> 01:01:29.410
<v Charlie>over, like, how the parser works. And, like, the idea

01:01:29.410 --> 01:01:32.230
<v Charlie>of adding new syntax to that grammar is so much

01:01:32.230 --> 01:01:35.650
<v Charlie>less daunting than adding it to, like, the parser generator, which

01:01:35.650 --> 01:01:38.430
<v Charlie>isn't, it's really not meant to be a knock on the parser generator. Like,

01:01:38.430 --> 01:01:41.550
<v Charlie>I think, like, that's a great project. I think parser generators

01:01:41.550 --> 01:01:44.630
<v Charlie>can definitely be great. It just depends on what you're doing. But for

01:01:44.630 --> 01:01:48.150
<v Charlie>us, like, Python, there's just a lot in the grammar. There are lots of ambiguities,

01:01:48.150 --> 01:01:52.450
<v Charlie>and it's evolving, and it's something that we have to change. So we felt more

01:01:52.450 --> 01:01:57.670
<v Charlie>comfortable doing it ourselves. And it was also, like, way faster. Like, it sped

01:01:57.670 --> 01:02:04.170
<v Charlie>up all of Ruff by, like, 20 to 40 percent or something.

01:02:04.170 --> 01:02:10.290
<v Matthias>That's pretty impressive, and it sort of paid off for you to take ownership of this

01:02:10.290 --> 01:02:16.090
<v Matthias>entire parsing part, because it's such an integral part of what you're building

01:02:16.090 --> 01:02:22.630
<v Matthias>at Astral in general. You can probably use that for uv and for Ruff, I'm assuming? I don't know.

01:02:22.630 --> 01:02:27.150
<v Charlie>We don't use it in uv today, the Python parser. We don't use it in uv today, but

01:02:27.150 --> 01:02:32.490
<v Charlie>we could. We have other parsers in uv, thankfully, like the version parser, the

01:02:32.490 --> 01:02:37.570
<v Charlie>version specifier parser, the requirements.txt parser.

01:02:37.570 --> 01:02:39.630
<v Matthias>Yeah, a lot of the work that you do is parsing, right?

01:02:39.630 --> 01:02:43.630
<v Charlie>I guess so, yeah. I mean, often, like, we need to implement standards, things

01:02:43.630 --> 01:02:48.210
<v Charlie>that have been specified in Python but only really have Python implementations.

01:02:48.210 --> 01:02:52.250
<v Charlie>So, like, versions would be a good example. Like, sounds like a simple thing, like

01:02:52.250 --> 01:02:58.210
<v Charlie>1.0.0, right? But, like, they get actually fairly complicated. And so, like, it's not

01:02:58.210 --> 01:03:00.290
<v Charlie>like that parser is incredibly complicated but,

01:03:00.910 --> 01:03:04.790
<v Charlie>It does run, I don't know, it probably, we're probably analyzing,

01:03:04.890 --> 01:03:06.270
<v Charlie>I have no idea what the number would be.

01:03:06.390 --> 01:03:11.090
<v Charlie>We're probably analyzing billions and billions of versions a day,

01:03:11.270 --> 01:03:14.310
<v Charlie>right? Think about how many times that version parser is parsing versions, right?

01:03:14.450 --> 01:03:19.670
<v Charlie>Like, so, you know, we think a lot about how do we make that fast and how do

01:03:19.670 --> 01:03:22.410
<v Charlie>we also, how do we make sure that it's fully standard compliant?

01:03:22.470 --> 01:03:24.470
<v Charlie>So yeah, we do build a lot of parsers.

01:03:24.470 --> 01:03:27.570
<v Matthias>To me, that was probably one

01:03:27.570 --> 01:03:31.090
<v Matthias>of the highlights of EuroRust last year, where

01:03:31.090 --> 01:03:34.010
<v Matthias>you shared that story about version

01:03:34.010 --> 01:03:37.170
<v Matthias>parsing. I had such a good laugh. So if someone

01:03:37.170 --> 01:03:40.290
<v Matthias>hasn't seen it yet, we will link it in the show notes. That's

01:03:40.290 --> 01:03:43.010
<v Matthias>a really fun example. Really great. Yeah, it was such a fun

01:03:43.010 --> 01:03:46.070
<v Matthias>exercise, and I think you could make an entire course

01:03:46.070 --> 01:03:49.150
<v Matthias>around that. But anyhow, probably a PhD, but yeah, that's

01:03:49.150 --> 01:03:52.230
<v Matthias>another topic. Still, it's different

01:03:52.230 --> 01:03:55.470
<v Matthias>when you build that as a fun hobby side

01:03:55.470 --> 01:04:00.690
<v Matthias>project or when you do that as a business with multiple employees, with a larger

01:04:00.690 --> 01:04:05.050
<v Matthias>codebase with multiple crates, and so on. So I just wanted to take the

01:04:05.050 --> 01:04:12.630
<v Matthias>opportunity to talk about day two with Rust. As it stands today, what's the verdict?

01:04:12.630 --> 01:04:16.910
<v Charlie>I love it. And I think it was, I think it's been such a good choice for our projects.

01:04:16.910 --> 01:04:23.850
<v Charlie>So we get to, we can build extremely stable, extremely fast software

01:04:24.530 --> 01:04:30.510
<v Charlie>that also has the benefit of being memory safe for, like, you know, thousands and

01:04:30.510 --> 01:04:35.470
<v Charlie>thousands, or millions, whatever, of, like, Python users. And the day-to-day is excellent.

01:04:35.470 --> 01:04:38.570
<v Charlie>Like Rust, I've always thought that Rust kind of, like,

01:04:39.560 --> 01:04:44.140
<v Charlie>I don't know. It's not that much of a secret, but like the secret behind Rust

01:04:44.140 --> 01:04:46.340
<v Charlie>is I've always thought is like the tooling.

01:04:46.980 --> 01:04:52.900
<v Charlie>Like I, for me at least, and I've never really written any C++ and people can

01:04:52.900 --> 01:04:54.680
<v Charlie>also just think I'm a moron, which is fine.

01:04:54.820 --> 01:04:58.700
<v Charlie>But like, whenever I look at a C++ project, it's like, it's super intimidating

01:04:58.700 --> 01:05:02.100
<v Charlie>to figure out how do I even get this thing to build or run or like, what am I doing?

01:05:02.720 --> 01:05:06.080
<v Charlie>And I don't know that I ever would have become a quote unquote systems programmer

01:05:06.080 --> 01:05:09.760
<v Charlie>through C++, or I think it would have been a lot harder for me. That's my prediction.

01:05:09.760 --> 01:05:15.780
<v Charlie>Maybe I'm wrong, I don't know. But, like, Rust, it's just such a high-confidence experience. It's like,

01:05:16.300 --> 01:05:19.160
<v Charlie>you install this thing with rustup, you run cargo run, cargo

01:05:19.160 --> 01:05:21.920
<v Charlie>build, and, like, that's how you build a project. Like, I can clone any

01:05:21.920 --> 01:05:24.640
<v Charlie>Rust project and, like, feel fairly confident that I know

01:05:24.640 --> 01:05:27.660
<v Charlie>how to run it and how to test it and how to build it and

01:05:27.660 --> 01:05:30.480
<v Charlie>how to understand it. And so for me, like, the tooling

01:05:30.480 --> 01:05:33.260
<v Charlie>story is excellent. I think the only

01:05:33.260 --> 01:05:36.140
<v Charlie>things I really have complaints about, and these are just, like, things that people are

01:05:36.140 --> 01:05:39.060
<v Charlie>always going to, that I'm always going to complain about no matter what, is, like,

01:05:39.060 --> 01:05:41.800
<v Charlie>obviously I'd like compile times to be faster. That would be

01:05:41.800 --> 01:05:44.800
<v Charlie>nice, because as you build a bigger and bigger project, it becomes

01:05:44.800 --> 01:05:48.180
<v Charlie>more and more of an issue, and it just,

01:05:48.180 --> 01:05:51.440
<v Charlie>it's just, like, a tax on development. But, you know,

01:05:51.440 --> 01:05:54.240
<v Charlie>even if they were faster, I'd probably be saying the same thing and saying

01:05:54.240 --> 01:05:58.340
<v Charlie>I just want them to be faster. Like, that's

01:05:58.340 --> 01:06:01.140
<v Charlie>like the main thing for me. I think one thing I'm really impressed by in

01:06:01.140 --> 01:06:04.020
<v Charlie>Rust, too, is just the rate of,

01:06:04.020 --> 01:06:07.340
<v Charlie>the rate at which the language and the tooling keeps improving.

01:06:07.340 --> 01:06:10.400
<v Charlie>Like, every Rust release has something

01:06:10.400 --> 01:06:13.160
<v Charlie>I want and something I've been waiting for, which is

01:06:13.160 --> 01:06:16.100
<v Charlie>kind of crazy. Like, just even looking at, and I'm

01:06:16.100 --> 01:06:18.780
<v Charlie>not even, I'm not one of the, like, I talked to, there are people on

01:06:18.780 --> 01:06:21.540
<v Charlie>our team who have been doing Rust since before 1.0, right? And I've been doing

01:06:21.540 --> 01:06:24.300
<v Charlie>Rust for a long time, and, like, I started doing Rust in, like,

01:06:24.300 --> 01:06:27.240
<v Charlie>2022 is when I started writing

01:06:27.240 --> 01:06:29.960
<v Charlie>Rust, I think.</v> <v Matthias>Wow.</v> <v Charlie>Yeah, I've, like, I haven't been writing Rust for</v>

01:06:29.960 --> 01:06:32.760
<v Charlie>that long, but, like, now I'm like, every Rust release, like,

01:06:32.760 --> 01:06:35.500
<v Charlie>there's just, like, there's just so much progress, and it's

01:06:35.500 --> 01:06:38.420
<v Charlie>like, awesome to feel like you're part of this community that's, like, building and

01:06:38.420 --> 01:06:45.300
<v Charlie>growing all the time. So it's both, it feels so stable and, like, so mature, but

01:06:45.300 --> 01:06:49.240
<v Charlie>it's also, the rate of change, I think, is, like, great, and the rate of progress

01:06:49.240 --> 01:06:52.220
<v Charlie>and the rate at which things are being addressed. My only complaint is compile

01:06:52.220 --> 01:06:57.740
<v Charlie>times, but, you know, I'll take what I can get, really.

01:06:57.740 --> 01:06:59.660
<v Matthias>Same. Let's make it faster, definitely.

01:06:59.660 --> 01:07:00.460
<v Charlie>Yeah.

01:07:00.460 --> 01:07:04.460
<v Matthias>We need more crates in the workspaces. But when you refactor across

01:07:05.080 --> 01:07:09.080
<v Matthias>crates, or maybe even within crates, how does it feel to you? Like, what's that

01:07:09.080 --> 01:07:13.780
<v Matthias>experience like? Like, do you refactor with confidence? Is it something that you

01:07:13.780 --> 01:07:18.680
<v Matthias>look forward to? Is it a choice refactor, is it more of a, you know, dread?

01:07:18.680 --> 01:07:24.680
<v Charlie>No, I like refactoring in Rust a lot. I think it's not, like,

01:07:26.120 --> 01:07:28.120
<v Charlie>Maybe it's the nature of what we're building. It's, you know,

01:07:28.280 --> 01:07:31.300
<v Charlie>like, I guess the thing that people often say about like very well typed languages

01:07:31.300 --> 01:07:34.320
<v Charlie>or even like functional languages is like, if it compiles, you know,

01:07:34.360 --> 01:07:36.220
<v Charlie>it, you know, it will work.

01:07:37.340 --> 01:07:39.340
<v Charlie>I don't know if I believe that about Rust.

01:07:40.700 --> 01:07:42.220
<v Charlie>There's still ways that your code

01:07:42.220 --> 01:07:47.240
<v Charlie>can be wrong, but I do feel like I'm constantly guided by the compiler.

01:07:47.380 --> 01:07:52.200
<v Charlie>And actually more and more, I think the way I write code is that I try and make

01:07:52.200 --> 01:07:55.060
<v Charlie>it such that I will be guided by the compiler in the future.

01:07:55.060 --> 01:07:57.900
<v Charlie>So, like, for example, I think about this

01:07:57.900 --> 01:08:00.660
<v Charlie>with, like, okay, let's say I'm passing, like, a

01:08:00.660 --> 01:08:03.380
<v Charlie>struct into a function and, like, I need to do

01:08:03.380 --> 01:08:06.220
<v Charlie>something with every field. I want

01:08:06.220 --> 01:08:09.860
<v Charlie>to make sure that if I add a new field to that struct, I'm reminded

01:08:09.860 --> 01:08:13.180
<v Charlie>that I need to handle that field in that function. And so

01:08:13.180 --> 01:08:16.020
<v Charlie>the thing I will often do is destructure it, because that makes

01:08:16.020 --> 01:08:19.160
<v Charlie>sure that if I add it, as opposed to referencing all the fields inline, like

01:08:19.160 --> 01:08:23.460
<v Charlie>object.a, object.b, I will destructure it at the top of the function. It's

01:08:23.460 --> 01:08:27.080
<v Charlie>a very small thing, but it means that the compiler will remind

01:08:27.080 --> 01:08:31.140
<v Charlie>you that you need to look at it if you add a field. So, like, more and more I find

01:08:31.140 --> 01:08:34.900
<v Charlie>myself trying to find ways to be guided by the compiler, because it's such a powerful thing.

01:08:35.480 --> 01:08:40.600
<v Charlie>And it just makes progress, I just think it makes, like, working

01:08:40.600 --> 01:08:43.840
<v Charlie>on large, complex products, like, so much easier.

01:08:43.840 --> 01:08:48.420
<v Matthias>The destructuring part I never heard in such detail, because...

01:08:48.420 --> 01:08:52.560
<v Charlie>Yeah, that's kind of a silly little thing, but you understand what I'm saying, right?

01:08:52.560 --> 01:08:55.660
<v Matthias>No, I totally like it. I will totally steal that idea.

01:08:55.660 --> 01:08:58.080
<v Charlie>Like, I mean, hopefully you don't have structs that are that complicated, that need that

01:08:58.080 --> 01:09:01.740
<v Charlie>that much, but, like, you know, I just more and more try to think about,

01:09:01.740 --> 01:09:05.740
<v Charlie>like, how will the compiler make sure that I do this correctly in the future?

01:09:05.740 --> 01:09:11.500
<v Matthias>Yeah, I know about that tip in a serialization context, when, for example, you want

01:09:11.500 --> 01:09:15.760
<v Matthias>to ensure that all the fields get serialized and deserialized properly. Then I

01:09:15.760 --> 01:09:16.220
<v Matthias>know some

01:09:16.220 --> 01:09:20.100
<v Matthias>people destructure that, but it never crossed my mind that you can do that

01:09:20.100 --> 01:09:23.960
<v Matthias>in just, you know, normal function code. You could even destructure it right there

01:09:23.960 --> 01:09:26.220
<v Matthias>in your arguments. You don't

01:09:26.220 --> 01:09:26.640
<v Matthias>even have

01:09:26.640 --> 01:09:29.200
<v Matthias>to do it in the first line of the body. You can do it in the

01:09:29.200 --> 01:09:30.880
<v Matthias>arguments, because it's just a, you

01:09:30.880 --> 01:09:36.300
<v Matthias>know, pattern match, essentially, or, like, it's a destructuring pattern. Yes.
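
NOTE
Editor's sketch, not from the episode. A minimal illustration of the
destructuring tip discussed here, in both forms: at the top of the function
body and directly in the parameter list. The Settings struct, its fields, and
both function names are hypothetical and not taken from Ruff or uv. Because
the patterns list every field and do not use `..`, adding a field to Settings
makes both functions fail to compile until the new field is handled.
```rust
// Hypothetical struct for illustration only.
struct Settings {
    line_length: u32,
    preview: bool,
    target: String,
}
// Form 1: destructure at the top of the body instead of writing
// `settings.line_length`, `settings.preview`, and so on inline.
fn describe(settings: &Settings) -> String {
    let Settings { line_length, preview, target } = settings;
    format!("line_length={line_length} preview={preview} target={target}")
}
// Form 2: the same exhaustive pattern written in the parameter list.
fn describe_in_args(Settings { line_length, preview, target }: &Settings) -> String {
    format!("line_length={line_length} preview={preview} target={target}")
}
```
A field that is deliberately unused can be bound as `field: _`, which keeps
the pattern exhaustive while making the decision to ignore it explicit.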

01:09:36.300 --> 01:09:38.080
<v Charlie>Yeah.

01:09:38.080 --> 01:09:42.700
<v Matthias>Do you have more such tips? Like, where can people learn more about idiomatic Rust

01:09:42.700 --> 01:09:45.460
<v Matthias>and best practices? Where did you learn it?

01:09:45.460 --> 01:09:48.460
<v Charlie>I mean, a lot of it I learned from having great teammates,

01:09:48.460 --> 01:09:52.160
<v Charlie>which is, that's sort of a bad answer, because, like, it just depends

01:09:52.160 --> 01:09:55.880
<v Charlie>on your life situation. Like, you don't have that much control over that. You have

01:09:55.880 --> 01:09:59.520
<v Charlie>some control, but it is a real thing, which is, I started working on Ruff

01:09:59.520 --> 01:10:04.660
<v Charlie>on my own, and then as we grew the team, I ended up hiring, thankfully, people who

01:10:04.660 --> 01:10:06.420
<v Charlie>knew a lot more about Rust than me. And, like,

01:10:07.740 --> 01:10:13.560
<v Charlie>like Micha, on our team, who was the second employee to join the company,

01:10:14.100 --> 01:10:15.660
<v Charlie>he just taught me so much about Rust.

01:10:15.860 --> 01:10:21.780
<v Charlie>And then later we hired Andrew Gallant, BurntSushi, who I will often just send

01:10:21.780 --> 01:10:24.720
<v Charlie>him random Rust questions rather than Googling them.

01:10:24.820 --> 01:10:29.660
<v Charlie>I mean, not in a way that is exploitative of that relationship or overly burdensome

01:10:29.660 --> 01:10:34.660
<v Charlie>on me, but he loves being the elder statesman at the company that can help people

01:10:34.660 --> 01:10:36.420
<v Charlie>with hard Rust questions or problems.

01:10:36.680 --> 01:10:40.020
<v Charlie>And so finding great people to learn from is maybe the slightly higher level

01:10:40.020 --> 01:10:42.560
<v Charlie>lesson, but like I know that's not always easy.

01:10:42.780 --> 01:10:45.660
<v Charlie>The other thing that I did is I read a lot of code. Like,

01:10:47.760 --> 01:10:52.260
<v Charlie>ChatGPT and LLMs are great, but, like, you should also remember that, like, GitHub

01:10:52.260 --> 01:10:58.520
<v Charlie>code search exists, and, like, all these amazing codebases are open source. And so, for example,

01:10:58.800 --> 01:11:03.780
<v Charlie>something I'll often do now is like if I look at a crate and I want to use it

01:11:03.780 --> 01:11:06.080
<v Charlie>and it doesn't have, maybe it has great examples.

01:11:06.440 --> 01:11:09.800
<v Charlie>Okay, great. if it doesn't have great examples and I don't feel the need to,

01:11:09.920 --> 01:11:13.860
<v Charlie>I don't want to read all the documentation myself, I will actually just go into

01:11:13.860 --> 01:11:18.560
<v Charlie>GitHub code search and I will just search for the struct name or the function name.

01:11:18.580 --> 01:11:24.540
<v Charlie>And I will go find real examples in one second of real projects using that crate.

01:11:24.840 --> 01:11:30.460
<v Charlie>And so like, you can just go read code, you know, like, like reading code is

01:11:30.460 --> 01:11:33.400
<v Charlie>like, you know, all this stuff is accessible to some degree.

01:11:33.700 --> 01:11:36.340
<v Charlie>And so that's, I don't know, that's how I've tried to pick up

01:11:36.340 --> 01:11:39.000
<v Charlie>things and at least learn. Like, I looked at

01:11:39.000 --> 01:11:41.720
<v Charlie>Cargo a lot when we were building uv and tried to understand, like, how do

01:11:41.720 --> 01:11:45.140
<v Charlie>they do certain things? Like, I don't know how to implement Git, like, let me go

01:11:45.140 --> 01:11:48.880
<v Charlie>look at what Cargo does, and then let me read about their design decisions,

01:11:48.880 --> 01:11:52.160
<v Charlie>because it's all documented in their PRs, and you can understand the trade-offs

01:11:52.160 --> 01:11:55.680
<v Charlie>and you can understand, like, why they did things a certain way. So, you know, you

01:11:55.680 --> 01:11:58.620
<v Charlie>can also go hunt people down and talk to them about this stuff, but there's plenty

01:11:58.620 --> 01:12:01.260
<v Charlie>that you can find without doing that.

01:12:01.260 --> 01:12:05.900
<v Matthias>Fully agree. Before there were LLMs, there was BurntSushi, but unfortunately you hired

01:12:05.900 --> 01:12:11.060
<v Matthias>him, so there's just one of them. But you can still read their open source

01:12:11.060 --> 01:12:16.200
<v Matthias>code. So ripgrep: whenever someone asks me what is an idiomatic Rust crate that

01:12:16.200 --> 01:12:18.500
<v Matthias>I should read, I always point them to ripgrep, because...

01:12:18.500 --> 01:12:21.580
<v Charlie>Oh yeah, I liked that a lot, too. Also, when we were figuring out how

01:12:21.580 --> 01:12:24.720
<v Charlie>to structure our crates and how to manage workspaces and our release pipeline

01:12:24.720 --> 01:12:28.400
<v Charlie>and all that stuff, there's just so much good code

01:12:28.400 --> 01:12:30.980
<v Charlie>out there. And so, you know, go read it.

01:12:30.390 --> 01:12:32.580
<v Matthias>Yeah, I always cry

01:12:32.580 --> 01:12:36.160
<v Matthias>when I open the ripgrep code. Out of joy, of course. I

01:12:36.160 --> 01:12:38.760
<v Matthias>really like reading this. It's amazing.

01:12:39.600 --> 01:12:44.540
<v Matthias>Yeah, unfortunately, we have to come to an end. But I wonder if you have any

01:12:44.540 --> 01:12:46.680
<v Matthias>final statement to the Rust community.

01:12:47.360 --> 01:12:51.660
<v Charlie>Yeah, I mean, I think for me, like I, how do I, how do I say this correctly?

01:12:51.960 --> 01:12:56.560
<v Charlie>Like I, it's kind of amazing, I think that I only started writing Rust like a few years ago.

01:12:56.720 --> 01:13:00.540
<v Charlie>And now we've shipped, I mean, along with a great team, like we've shipped two

01:13:00.540 --> 01:13:01.760
<v Charlie>of these tools that are having,

01:13:02.040 --> 01:13:05.220
<v Charlie>I think at least a huge impact on Python, which

01:13:05.220 --> 01:13:09.160
<v Charlie>is, like, the most popular or the second most popular programming ecosystem

01:13:09.160 --> 01:13:12.000
<v Charlie>on Earth. So if you think about it, like, Rust is

01:13:12.000 --> 01:13:15.040
<v Charlie>kind of, in a lot of ways, Rust is kind of, like, powering Python.

01:13:15.040 --> 01:13:17.720
<v Charlie>At least, you know, if I have a say about it, at least,

01:13:17.720 --> 01:13:20.580
<v Charlie>Rust is powering Python. And so,

01:13:20.580 --> 01:13:25.360
<v Charlie>I don't know, I've always just felt, again, I never considered myself to be, like,

01:13:25.360 --> 01:13:30.460
<v Charlie>a systems programmer, quote unquote. In most of my career, I was writing TypeScript,

01:13:30.460 --> 01:13:35.840
<v Charlie>Python. I mean, I did some Java professionally, but, like, I had never, except for

01:13:35.840 --> 01:13:39.480
<v Charlie>like a course in college, done any C. I really hadn't done any C++.

01:13:40.020 --> 01:13:45.580
<v Charlie>And like in the span of a few years, I like learned to build this kind of software.

01:13:45.740 --> 01:13:49.780
<v Charlie>So I don't know. I've had just like great experiences with the community and

01:13:49.780 --> 01:13:52.160
<v Charlie>being welcomed into it and learning the language.

01:13:52.320 --> 01:13:58.580
<v Charlie>And I think that should continue to be a very important part of Rust is like

01:13:58.580 --> 01:14:00.480
<v Charlie>welcoming people in and helping them learn.

01:14:00.680 --> 01:14:02.120
<v Charlie>Because the impact that we can

01:14:02.120 --> 01:14:05.100
<v Charlie>have by building this kind of stuff is just huge, even outside of Rust.

01:14:06.160 --> 01:14:10.040
<v Matthias>Perfect. I couldn't have said it better. Both languages are really close to my

01:14:10.040 --> 01:14:16.240
<v Matthias>heart, and I really like to see that synergy. Charlie, your presence was much appreciated.

01:14:16.240 --> 01:14:20.100
<v Matthias>Thank you so much for taking the time today.

01:14:20.100 --> 01:14:23.300
<v Charlie>Yeah, thank you so much for having me and for all the great questions. It was,

01:14:23.300 --> 01:14:24.580
<v Charlie>it was really fun.

01:14:24.580 --> 01:14:29.940
<v Matthias>Rust in Production is a podcast by corrode. It is hosted by me, Matthias Endler,

01:14:29.940 --> 01:14:34.620
<v Matthias>and produced by Simon Brüggen. For show notes, transcripts, and to learn more

01:14:34.620 --> 01:14:39.340
<v Matthias>about how we can help your company make the most of Rust, visit corrode.dev.

01:14:39.560 --> 01:14:41.880
<v Matthias>Thanks for listening to Rust in Production.