Matic with Eric Seppanen
Matthias Endler interviews Eric Seppanen about Rust's impact on privacy-focused home automation robots, emphasizing concurrency, security, and community collaboration.
2024-06-13 84 min
Description & Show Notes
The idea of smart robots automating away boring household chores sounds enticing, yet these devices rarely work as advertised: they get stuck, they break down, or they're security nightmares. And so it's refreshing to see a company like Matic taking a different approach by attempting to build truly smart, reliable, and privacy-respecting robots. They write 95% of their codebase in Rust and use camera vision to navigate, vacuum, and mop floors.
I sit down with Eric Seppanen, Software Engineer at Matic, to learn about vertical integration in robotics, on-device sensor processing, large Rust codebases, and why Rust is a great language for the problem space.
About Matic
Matic is on a mission to solve everyday problems with robotics. Design Milk wrote in an article about Matic: "Matic Robot Vacuum Collects Dust but Not Your Personal Data" and I really love that quote. It's a great summary of what Matic is about: privacy-respecting, truly smart robots. The San Francisco-based startup recently raised a $24M Series A round.
About Eric Seppanen
Eric is a systems engineer with a passion for reliable, well-designed software. He has a background in kernel development and high-performance computing with C++ and now works on robotics with Rust.
With his calm and insightful demeanor, Eric is the ideal person to talk about Rust's strengths for people with a C++ background.
Links From The Show
- Why Rust? It's the Safe Choice
- Folkert's episode
- egui
- SLAM
- Tokio Async Runtime
- reqwest
- Axum web framework
- The Cargo Book: "features should be additive"
- Jon Gjengset's YouTube channel
- Jon's book "Rust for Rusteaceans"
- Mara Bos' book "Rust Atomics and Locks"
- Will Wilson from antithesis about FoundationDB: "Is something bugging you?"
Official Links
About corrode
"Rust in Production" is a podcast by corrode, a company that helps teams adopt Rust. We offer training, consulting, and development services to help you succeed with Rust. If you want to learn more about how we can help you, please get in touch.
Transcript
This is Rust in Production, a podcast about companies using Rust to shape the
future of infrastructure.
My name is Matthias Endler from corrode, and today we are joined by Eric Seppanen
from Matic to talk about using Rust for advanced privacy-first home automation robots.
Eric, welcome. Can you talk a little bit about yourself?
Well, I started as an electrical engineer. I was designing circuit boards and
FPGAs, but I'd been running Linux since its early days.
So I started working with embedded Linux, and that led to doing some kernel work in C.
And then I found a job using C++ for almost 10 years at a company called Pure Storage.
I'm the kind of person who can't work on the same thing for more than a few
years, because I always want to try something new and different.
And I keep going higher and higher in the software stack, but I still like those
physical things that you can touch.
And I think projects that mix hardware and software, they really have a unique set of challenges.
And I find that really interesting and fun.
When I first reached out to you, it was because of a blog post that you wrote,
which is titled "Why Rust? It's the Safe Choice."
And I really liked it. It really resonated with me.
And then I dug deeper and looked at where you're working.
And I found out that you work at Matic. And maybe in your own words,
what would you say does Matic do?
Matic's goal is building intelligent robots that save people time.
And that means that we're basically
using computer vision to help robots navigate inside of your home.
And that means a big software stack to help the robot see where it's going and
understand the shape of the rooms it's in and be able to plan its movements
in a way that makes sense.
What we've seen so far is that a lot of people are not happy with the previous
generation of home robots.
They do dumb things and they crash and they get stuck a lot and they're not
saving people time because they need to be watched and they need to be rescued
constantly. And we'd like to do better than that.
So we'd like to save you time, and we'd like the technology to be so good
that it's magic, that you don't need to think about it.
Now, with regards to your job there, how would you describe yourself at Matic?
I work on the platform team at Matic, and that means I work on the general parts
of our Rust software stack.
I work on dependency management and build systems and memory management,
telemetry, all the things that you might find in any large software project.
Did Matic hire you because of your ROS skills or because you worked on lower
level kernel stuff or because you were generally just curious about what they did and you had a mix,
a wide variety of skills? What was it like?
Like, well, I can't be sure what things got them to hire me.
But the reason that I wanted to work there is because it's a company that feels
a little bit like living in the future because easily 95% of what we do is in Rust.
And I don't think there are many companies that can say that.
It's a really good way of working.
Most of the code is then written in Rust. That's really surprising to me because
you have so many different components. You have so many different layers of abstraction, right?
How much of it was kind of done by the vendor, by the tools and components that
you use, and how much of it was done by the company itself?
I don't think we get hardware vendors delivering code in Rust.
I think that's something that we'll see in the future, but it's not really the reality today.
So we do have to spend some of our time building Rust FFI wrappers around vendor
libraries and other system libraries.
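A common shape for such a wrapper is a safe Rust type that owns the vendor handle and releases it on `Drop`. The `vendor_open`/`vendor_read_frame`/`vendor_close` functions below are hypothetical stand-ins, implemented in Rust so the sketch compiles on its own; a real wrapper would declare them in an `extern "C"` block (often generated with bindgen) and mark the calls `unsafe`:

```rust
// Hypothetical stand-ins for a vendor C library. A real build would
// declare these via `extern "C"` / bindgen instead of defining them here.
mod vendor {
    pub type Handle = i32;
    pub fn vendor_open() -> Handle { 42 }
    pub fn vendor_read_frame(_h: Handle) -> u32 { 7 }
    pub fn vendor_close(_h: Handle) {}
}

/// Safe wrapper: owns the vendor handle and releases it on Drop,
/// so callers can't leak it or free it twice.
pub struct Camera {
    handle: vendor::Handle,
}

impl Camera {
    pub fn open() -> Self {
        Camera { handle: vendor::vendor_open() }
    }

    pub fn read_frame(&mut self) -> u32 {
        // In a real wrapper this would be an `unsafe` FFI call, with the
        // safety argument documented right here.
        vendor::vendor_read_frame(self.handle)
    }
}

impl Drop for Camera {
    fn drop(&mut self) {
        vendor::vendor_close(self.handle);
    }
}

fn main() {
    let mut cam = Camera::open();
    println!("frame id: {}", cam.read_frame()); // prints "frame id: 7"
}
```

The point of the pattern is that `unsafe` stays confined to one small module, and the rest of the codebase only ever sees the safe `Camera` type.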
Okay, but when you say 95% is in Rust, what does that entail?
What are the different layers, the different components of the system right now?
I'm not sure how to split our software stack up into layers,
honestly, because there are so many elements of it that I personally don't have a great view into.
I guess what I can say is that we're past the point where we even try to build
things in other languages because we've had so much success with Rust that we
will just reach for it anytime we need to do anything.
We build native GUIs for debugging.
We build web services.
We build code that runs in the cloud. We build clients for other third-party
services, and we keep extending our reach and doing new things in Rust,
and it works out for us every single time.
And that's... some of these environments are a lot less polished, you know?
Running WebAssembly or maybe doing stuff in the cloud.
Some of that stuff may not be as polished as other areas, but it's maybe only
a year or two behind the rest of the Rust ecosystem.
So mostly it's just amazing that with a single language, you can move around
very fluidly between all of these very different platforms.
It feels a bit like Matic is a company that values vertical integration.
You have a lot of different layers, but you see it as one product. Is that correct?
Yeah, I think it's really interesting because the company does all of the hardware design in-house.
So I work in the same building as mechanical engineers and electrical engineers.
So we can get a great deal done in very little time because you can just go
talk to the person who worked on the hardware and solve problems very quickly.
With regards to this, when you build new hardware and you do that in a close-knit
environment together with the team, is it so that you have Rust in mind from the very beginning?
Or is that more of an afterthought? Because those people might not necessarily
know Rust too well. Maybe they come from a C background or they don't really
have any programming experience at all.
And I wonder, do you go and have specific things that you want from the hardware
that is easier to use from the Rust side, like a chip or some controller that you want to use?
Or is it something that they build on by themselves and then later on you decide
how to navigate it in Rust?
I think at this point we have enough confidence in our Rust abilities that we're
willing to choose the best hardware and just trust that we will figure out a way to make it work.
Anybody who's done these mixed hardware and software designs probably recognizes
that this job is never easy.
You know, the way it often works is a hardware vendor will deliver you a thousand
pages of documentation and they will deliver a bunch of code and that code is
probably kind of buggy and missing a bunch of the features that you need.
And now it's your problem. You need to make it work. And you're basically on
your own. And that's a lot of work no matter what language you're using.
So in my opinion, you might as well choose the language that's going to make
you the most productive.
So writing those FFI layers or maybe taking some C code and rewriting it from
scratch, that's definitely a lot of work, but it's a known amount of work.
And you were probably doing quite a bit of that integration work anyway,
just to be able to make that hardware and software work in your environment.
And so I can kind of amortize that work over years and years of product development.
And I think in the long term, it's a big win, because now you can really work
effectively with that code and make changes and debug things.
So if quality is important, then I think Rust starts to look really attractive.
Very nice. Which hardware do you use specifically?
Can you talk a little bit about that? Or is it a company secret?
The Matic robot is basically a small computer on wheels with a bunch of cameras,
and then it has a vacuum and a mop roller sticking out the front so it can drive
around cleaning your floors.
Each one of those systems takes a lot of time and expertise to design and many
generations of trial and error,
and so we've built so many generations of the robot in-house that we have gotten
pretty good at designing that set of hardware and software.
In terms of the actual computer hardware, I probably shouldn't name the exact
parts, but it's really not that exotic.
There are a few platforms out there that give you a bunch of ARM cores and some
GPU hardware and some camera interfaces, and that's basically all that we need.
We have in the past switched from one hardware vendor to another,
and so we're not really that dependent on the details there,
as long as it gives us the same basic set of resources that we need.
Okay, then you have those hardware components, you somehow integrate them into
one bigger piece, you write the Rust drivers for it, I guess.
Is that the next step that you need to take? Once you have the hardware,
you need to write the drivers or are they provided already?
We don't have to do that much in terms of driver work because the camera interfaces
are pretty straightforward.
We consume the camera images, the raw camera images directly into our user space stack.
And at that point, it's all of our own code. And in terms of controlling motors,
we have a small microcontroller that we use to drive the motors.
Okay. You take the feed and then you do some commutation on it,
but does all of that happen on the device or do you transfer that to the cloud
or some other computer outside of the robot itself?
Yeah. So the goal is for the robot to be completely autonomous.
It seems like a good decision for customer privacy.
And it's really one of the things we wanted to do is to see if you could do
this task entirely on the robot.
If we can do this smart mapping and navigating with minimal resources,
that would be a great accomplishment.
We do use the local Wi-Fi network to be able to talk to a mobile phone app.
So if you want to pull out your phone, you can tell the robot to do something
and you can even play around a little bit, you can move through a 3D
map of your house and draw where you want it to clean.
We have limited memory and CPU. I guess that makes it a little different from
some other software projects.
But really, the fundamental question is, can it be done at all?
Some hardware platforms would just be too slow to be able to do the mapping
and the navigation in real time.
But once you know what's possible, it just becomes software engineering,
making sure that the code can continue to get its job done with limited CPU and memory.
Yes, and on top of it, I might imagine that it wasn't always certain that it
would be possible to even run all of this on the machine itself.
Is that correct? Or would you say from the early days, you already knew that
you could use the hardware to its fullest extent and make it work?
I've only been at the company a few years, but I think the founders had
enough of a background in computer vision that they could see that this sort
of thing was starting to become possible.
And I think they got the timing exactly right.
Now, when I think about your blog post, initially, when I read the title,
the only sort of connection I made with the term safety was safety in a sense of memory safety,
like what Rust was good at.
And then there was this other spin, which was about safety being the safe choice: the choice that is kind of obvious, the one that you should take given all of the constraints. But now there's a third angle to the title, because it is also about safety in terms of security or privacy. And this is interesting too, because I have one of the Chinese brands, and for me it always leaves a bad taste, because I don't know what they do with my data when they send it to the cloud. I could potentially use one of the homebrew versions of it and run my own distribution on it, sort of, but it is a lot of work. And at the end of the day, this also made me feel like I had to look for alternatives at some point. So there's a third angle to the title of your blog post.
Yeah, kind of. It was unintentional that there were three meanings there. The one that I mostly meant was the second one. You know, Rust has memory safety; everybody knows that by now.
It took me a while to figure out what I wanted to say, because there are a lot of articles that say "Rust is good and we like it." But what I discovered is that I can't imagine doing this job with anything less. Going to another systems programming language like C or C++ would feel like banging rocks together to make fire. These are things that people used to do, but we've learned how to be more productive than that.
And there's this popular idea that startups get innovation tokens, right?
You can only spend your tokens a certain number of times.
And so startups should use boring technology wherever possible.
And I think that's mostly right.
You wouldn't want your brand new company to take a chance on a new thing that
nobody understands because that could be the end of the company.
But the engineers that I talk to, you know, they don't have to use Rust for
very long before it just starts to be really obvious that this is a better way of working.
It's just a very pragmatic set of tools.
And these are not the only good tools out there.
I don't think it's worthwhile to argue about Go versus Rust because they have very similar goals.
You know, the languages are very different and they're maybe good at different things.
But the ways that they improve the productivity is really similar.
You know, it's all about eliminating the pain points and simplifying the tool
chain and bringing all the developers sort of into a single ecosystem.
I just think that if you reduce friction between the different projects in that
language, it's one of those things that makes you really productive.
I don't think I would want to go back to doing systems engineering and C or C++.
Well, it certainly sounds true. Do you have any specific examples where you would say Rust helped you or your team achieve something that seemed impossible with other languages?
Yeah, that's an interesting question. I don't think there are small examples where things would be impossible in another language and Rust makes them possible. I think it's more that individual things become so much easier that large-scale projects become possible that maybe would have been infeasible in another language.
And it's because of that productivity angle again.
The big ones for me, of course, memory safety is huge because there's a whole
range of bugs that you just don't waste your time on.
Use after free and buffer overruns, and you save time because you don't have
to go debug those, and you're also saving time because you don't even think about them.
This whole concept of coding fearlessly is so great because you don't have to
hold back. You can try to be really ambitious.
And when you step over the line, the compiler will stop you and say, please don't do that.
Another aspect is thread safety. Writing concurrent and parallel programs,
I think we as software engineers still really have a hard time at doing this correctly.
And it's such an incredible accomplishment when you can have your software fan
out across multiple CPUs and be well-behaved.
I think 10 years ago that would have been nearly impossible to do without a
tremendous effort on the programmer's part.
I have seen this done successfully in C++, and I've seen how much effort and discipline it takes.
And the moment you have a bad day, you break the entire system.
It can be really stressful.
So as a programmer, you sort of self-limit yourself, where if you're doing something
that doesn't need super high performance, you try to restrict yourself to only the simplest tools.
Because the good tools have sharp edges, and you get tired of getting cut by them all the time.
And so having that sort of safe concurrency is just amazing.
And, sorry, from your time as a kernel developer, if you can still remember it, and maybe you still have PTSD from it: can you remember times where you coded in fear, where you were afraid that you would break things or miss an edge case? Was that a recurring topic for you, or was it rather a safe space, a safe environment as well?
I think the last time I wrote much kernel code was probably 15 years ago.
And I think at that point, I had never really seen a really high-quality test environment. Until you've seen a test environment so good that it will catch errors that occur only once in every million executions, I don't think you, as a programmer, have a good view of how bad you really are.
And maybe it's just me, but the longer I do this, the more I get the feeling
that I'm my own worst enemy and I just need better tools to protect me from myself.
It's interesting, because I share a similar sentiment. In the past, I wrote software which I thought was pretty decent, but in hindsight it was broken in so many ways, and there was no one who told me; there was no mentor, no guide. Sometimes I wished for better tooling or a better linter. It didn't even cross my mind that there would be a better way to write software in and of itself, in a different, safer language.
Yeah. And I think the fact that there is a Rust community is an amazing advantage.
The fact that there's one build system, one code formatter, one documentation
system, and it's shared by a much larger group than you'd ever find at any one company,
that means that I can be working on something that I built and I can switch
to something that a co-worker built and then switch to some random crate that
I downloaded from crates.io,
all of that code mostly looks familiar, uses the same idioms,
uses a lot of the same libraries as I use.
That's also really great for productivity. In some other language,
you can have such a huge impedance mismatch between codebases that it can really
take forever to get up to speed on a new codebase.
Bringing in a new dependency can be a really major decision because you have
to figure out how to integrate it with your build system and your documentation system.
Not having that pain point in Rust is really a wonderful thing.
We had an episode with Folkert de Vries from Tweede Golf.
And one thing that he mentioned which really resonated with me was that it is
easier in Rust to jump into the standard library and just read what's going
on and understand how the standard library works.
That was something that I hadn't really reflected on, but it is certainly true for me.
Rust the application code feels like Rust the systems-level code; to some extent, it feels like Rust from the standard library. And whenever I jumped into C or C++ and wanted to learn how it worked under the hood, I was lost, because I didn't speak the dialect of C or C++ that they used there.
Yeah. I think that Rust as a language favors expressiveness, and it's a community value as well.
You know, the language doesn't stop you from doing bad things.
It's a community value that types express things.
You can express whether this is a valid URL or whether this number is a valid
error code or things like that.
And it's a community value to express invariants in code instead of in
comments. And I think that's wonderful.
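The "valid error code" example Eric mentions is usually done with a newtype whose constructor enforces the invariant once. The type name and the 1..=99 range below are invented for illustration, not anything from Matic's codebase:

```rust
/// An error code known to be in the valid range.
/// The range 1..=99 is an invented example.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ErrorCode(u16);

impl ErrorCode {
    pub fn new(raw: u16) -> Option<Self> {
        // The invariant lives here, once, instead of in comments
        // scattered across every function that handles codes.
        if (1..=99).contains(&raw) {
            Some(ErrorCode(raw))
        } else {
            None
        }
    }

    pub fn get(self) -> u16 {
        self.0
    }
}

// Any function taking `ErrorCode` can rely on the invariant
// without re-checking it.
fn describe(code: ErrorCode) -> String {
    format!("error {}", code.get())
}

fn main() {
    assert!(ErrorCode::new(0).is_none()); // out of range, rejected
    let code = ErrorCode::new(7).expect("in range");
    println!("{}", describe(code)); // prints "error 7"
}
```

This is what "expressing invariants in code instead of in comments" buys you: a fourth-generation engineer can't construct an invalid `ErrorCode` even if they've never read the design docs.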
That means you mostly talk about types and how they relate to one another; you talk about composability rather than, say, systems-level things and interfaces and very specific implementation details. Is that correct? You talk about it on a higher level.
Yeah, I think that's all true. I think also, I just think about 20 years of
programming or 25 years of programming.
And I think about what it's like to be sort of the first generation of programmers
on a project, where you are taught how to maintain invariants by your peers and
you learn why some unusual design decisions were made.
But if you get to watch a successful project that goes on for 10 years,
you're eventually on these third and fourth generation engineers,
and all they have is the source code and some myths and rumors.
They don't know where the important invariants are, and they don't know why
these design decisions were made in the past. And I think that's the point at
which projects can start to go out of control.
And I put myself in their shoes. If I'm one of those engineers,
I would really hope the code is in a language like Rust.
Because now the code can carry this message down through the generations.
If the language is expressive enough, you can communicate those things.
And it's also because I feel really bad about some of the code I wrote 10 years
ago. I bet it's still in place and nobody knows how to get rid of it because
I was not able to do a good enough job of expressing how it fits into the larger system.
How much of it is because of the lack of rust in the past and how much of it
is because of you growing as an engineer?
That's a good question. I'm sure it's both. In a sense, I do try to make up
for all of my mistakes of the past.
But it's also just that I find the job more satisfying when I feel like I'm
leaving behind code that can be better understood and maintained by other people.
When your experience grows, do you think about your legacy a lot?
How you leave the code base for other people?
I don't think I'm old enough to think about legacy yet,
but I think that's one of the higher-level challenges of software design.
Writing code that works, in a sense, is the easy part.
Writing code that can also be understood by other human beings,
to the degree where they could take it apart and put it back together again
and make changes, that's to me a much more interesting challenge.
And as software projects grow, I think that's a challenge that a lot of companies face.
And in a sense, it's a limiting factor to how they can continue to release new
product year after year.
When you look at the current code base for the vacuum cleaner,
what are the main abstractions that you work with and some of the types that
might be specific to this robot?
For example, when you take the camera feed, do you convert it to something that is, quote-unquote, safe? Do you convert it to your internal types so that you can handle it better, or do you deal with a lot of raw data throughout the system, for performance reasons or otherwise?
I think mostly we're able to deal with safe types. We have relatively little unsafe code in our tree, and mostly it's there to do these sorts of FFI layers for interacting directly with the hardware.
Mostly the abstraction that I think of in terms of how does the robot manage
data is we have what are called map layers.
And the map is what you might think it is. It's sort of a representation of your house.
And those camera images get digested into this 3D map.
Think of it as a 3D point cloud: we can look at all of those points and determine where there are obstacles, where the room boundaries are, what type of floor we're traveling across, and where there are places we're maybe not allowed to go. For example, if you leave your charging cable on the floor, we don't want to drive over that.
And so there are many map layers to help us with the navigation and making good decisions.
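One way to picture the "map layers" idea is several per-cell grids stacked over the same floor plan. Everything below (the layer names, the boolean and enum cell types, the grid representation) is invented for illustration; it only sketches the concept from the conversation:

```rust
// A toy sketch of "map layers": multiple per-cell grids over one floor plan.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Floor {
    Unknown,
    Hardwood,
    Carpet,
}

struct Grid<T: Copy> {
    width: usize,
    cells: Vec<T>,
}

impl<T: Copy> Grid<T> {
    fn new(width: usize, height: usize, fill: T) -> Self {
        Grid { width, cells: vec![fill; width * height] }
    }
    fn set(&mut self, x: usize, y: usize, v: T) {
        self.cells[y * self.width + x] = v;
    }
    fn get(&self, x: usize, y: usize) -> T {
        self.cells[y * self.width + x]
    }
}

struct MapLayers {
    obstacles: Grid<bool>, // e.g. a charging cable spotted by the cameras
    keep_out: Grid<bool>,  // user-drawn no-go zones from the phone app
    floor: Grid<Floor>,    // surface type per cell
}

impl MapLayers {
    // Navigation consults several layers to decide where the robot may go.
    fn traversable(&self, x: usize, y: usize) -> bool {
        !self.obstacles.get(x, y) && !self.keep_out.get(x, y)
    }
}

fn main() {
    let mut map = MapLayers {
        obstacles: Grid::new(4, 4, false),
        keep_out: Grid::new(4, 4, false),
        floor: Grid::new(4, 4, Floor::Unknown),
    };
    map.obstacles.set(1, 1, true); // cable detected at (1, 1)
    map.floor.set(2, 2, Floor::Carpet);
    println!("can enter (1,1): {}", map.traversable(1, 1)); // prints "can enter (1,1): false"
}
```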
And how does it work in practice? For example, let's say you have a charging
cable that's lying across the floor.
Do you recognize that before you even cross the cable or do you have to go over
the bump once and then you remember?
How does it work visually with other sensors?
Yeah, it's visually from the camera images itself.
We do have some neural networks that have been trained to recognize cables,
and they will sort of highlight that in the image, which then gets translated
into the 3D space so that the robot knows where it's allowed to go and where it's not.
So it's a bit like Tesla with their 3D model based on cameras,
or do you also have depth sensors?
It's entirely cameras. Fortunately, our robot is fairly lightweight and doesn't move very quickly.
So we don't have the sort of extreme safety risks that an autonomous car would have.
And when you built this map view, how much of it did you have to build yourself?
And how much of it could you use from the Rust ecosystem?
Were there any existing crates that you could build on?
Or did you really have to build everything from scratch?
I think we used existing crates wherever we could, but most of it was built from scratch.
We have an internal visualization tool that we built using the egui crates and its ecosystem.
And I haven't personally worked on that code base, but the people who I talk
to who have seem to like it a lot.
Okay, then do you also update the map from time to time or is it a static version of it?
So do you scan the room once and then store that, because essentially your room
doesn't change that much?
Or would you say there are update steps as well with every single iteration? How does that work?
The map is constantly updated. So every image that the robot sees through the
camera causes a change in the map.
And so it's constantly re-evaluating what's in its way.
And so one of the interesting experiences is to walk in front of the robot while
it's driving right at you, and to see it slow and navigate carefully around
you, which is not something that people are used to seeing with other robot vacuum cleaners.
Absolutely. Okay, then when you update the map, how does that part work? You take an image from the camera feed, and then you need to know the location where you are; you need to keep track of that. And then do you update it in some sort of tree-like structure, where you have different areas of your apartment and you only need to update a specific part and know where to map it to? Or is it a flat structure? How does that part work?
So, for all of the robots out there that are able to navigate using cameras, the general set of algorithms is called SLAM: simultaneous localization and mapping. It's taking those camera images and comparing data in those images to data it has stored in the past.
So the robot first has to decide, am I in a place that I've ever seen before?
And that could be yes or no. And so if you were to drop the robot in a new room
that it had never seen before, it essentially has to start a new map.
And every new surface that it sees, it will sort of gradually,
image by image, start to stitch together a new 3D environment based on the way
that the camera images overlap.
And because these are stereo cameras, it's getting not just images;
it can also figure out the depth information, how far away everything is.
And that gets reinforced as it sort of moves around the space.
And then the robot may drive through a doorway and suddenly it discovers,
oh, you know, I've seen this part of the room before and now it actually has
to stitch two entire maps together at the doorway where they meet.
And so it's a pretty neat process to watch it happening. It's rather amazing
that it can be done on a robot this small.
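The decision Eric describes ("am I in a place that I've seen before?") can be caricatured in a few lines. Real SLAM matches visual features geometrically; the integer landmark IDs below are an invented stand-in for that matching step, and stitching two maps together at a doorway is left out:

```rust
use std::collections::HashSet;

// Each camera frame yields a set of recognizable landmark IDs (a toy
// stand-in for visual-feature matching in real SLAM).
struct Map {
    landmarks: HashSet<u32>,
}

impl Map {
    fn overlaps(&self, frame: &HashSet<u32>) -> bool {
        frame.iter().any(|id| self.landmarks.contains(id))
    }
    fn absorb(&mut self, frame: &HashSet<u32>) {
        self.landmarks.extend(frame);
    }
}

/// "Am I in a place that I've seen before?" If yes, extend that map;
/// if no, start a new one (e.g. the robot was dropped in an unknown room).
/// Returns the index of the map the frame was placed in.
fn place_frame(maps: &mut Vec<Map>, frame: HashSet<u32>) -> usize {
    if let Some(i) = maps.iter().position(|m| m.overlaps(&frame)) {
        maps[i].absorb(&frame);
        i
    } else {
        maps.push(Map { landmarks: frame });
        maps.len() - 1
    }
}

fn main() {
    let mut maps = Vec::new();
    let a = place_frame(&mut maps, [1, 2, 3].into_iter().collect());
    let b = place_frame(&mut maps, [10, 11].into_iter().collect()); // new room
    let c = place_frame(&mut maps, [3, 4].into_iter().collect());   // overlaps first map
    println!("{} {} {} ({} maps)", a, b, c, maps.len()); // prints "0 1 0 (2 maps)"
}
```

The real system also has to do the reverse of the last step: when the robot drives through a doorway and recognizes an old map, it merges two whole maps where they meet.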
Yes, and it can be done in real time, which is kind of crazy.
Where would you say is the hardware bottleneck? Is it the CPU, the computation part,
or is it the I/O part, in terms of storing and retrieving information from storage?
I think, in a sense, it's the cost part, in that if you want to spend more money
on CPU cores and GPU cores,
you can make the robot run faster and consume more detailed images and have more detailed maps.
And so it's really up to us to choose a cost point that makes sense for this particular product.
But if you're willing to spend more money, you can do things even faster.
And so I imagine you could design a whole family of different robots with different
capabilities, depending on how much you want to pay for them.
Did Rust help you save hardware costs?
I would say in a broad sense, yes, because Rust as a language is one of these
environments where you're very in control of your own memory use.
So a garbage collected language would probably put some significant memory pressure
on us. I think that would not work out very well.
As it is, we are constantly under memory pressure anyway, just because the amount
of data we're trying to ingest is fairly large.
So Rust gives us the tools to do the most with the CPU and the memory we have available to us.
Is some of that code executed in parallel or concurrently?
Do you do, and I guess you have to at some point, do multiple things concurrently
because you get a camera feed, you need to update the map, all of that has to
happen more or less at the same time.
Yeah, we have a lot of things going on in parallel. We have four ARM cores,
and then some GPU hardware that's also operating concurrently.
And so for some of it we use the tokio async executor to schedule tasks.
And then some work is really just CPU intensive.
So we have, I guess what you'd call blocking threads to execute that work on.
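The split Eric describes, lightweight coordination on the async side and CPU-heavy work on dedicated threads, can be sketched without any external crates. This standard-library version uses plain threads and a channel; in Matic's tokio-based stack, `tokio::task::spawn_blocking` plays the role the worker threads play here, and the summing loop is only a stand-in for digesting a camera frame:

```rust
use std::sync::mpsc;
use std::thread;

/// Fan CPU-bound work out across worker threads and collect the results
/// on a coordinating thread. Stand-in for the tokio + blocking-threads split.
fn fan_out(workers: usize) -> Vec<(usize, u64)> {
    let (tx, rx) = mpsc::channel();

    let handles: Vec<_> = (0..workers)
        .map(|id| {
            let tx = tx.clone();
            thread::spawn(move || {
                // CPU-bound stand-in for per-frame processing.
                let sum: u64 = (0..100_000u64).sum();
                tx.send((id, sum)).unwrap();
            })
        })
        .collect();
    drop(tx); // drop the original sender so `rx.iter()` ends when workers finish

    // The coordinating thread just collects results as they arrive.
    let results: Vec<(usize, u64)> = rx.iter().collect();
    for h in handles {
        h.join().unwrap();
    }
    results
}

fn main() {
    let results = fan_out(4);
    println!("collected {} results", results.len()); // prints "collected 4 results"
}
```

The thread-safety guarantees Eric mentions show up concretely here: the compiler rejects any version of this code that shares mutable state across the threads without synchronization.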
Pretty impressive. Does it mean we will have a vacuum robot that runs on futures
or the async stack in the future?
Yeah.
It already does, sort of, right?
It already does. That doesn't seem terribly surprising to me.
Async code seems pretty mature in Rust, and we don't have any fear about using it in production.
That is extremely cool, because I guess it should serve as a sort of testament
to what Rust can do in different environments.
And maybe there are people that know tokio from a purely web environment,
for example, and hearing that you could use some of the same technologies in
such a constrained environment, one that is at least close to the hardware,
is kind of encouraging, right?
It is. And it's really helpful that as we move between different parts of the
code base that things don't have to change all that radically.
So if the robot needs to download a software update from the cloud,
we're using the same async HTTP client that the rest of the world does.
And we use all of that code for our own debugging tools as well.
And the fact that that can live in our code base side by side with all of this
magic stuff that's very specific to our robot and our problem is great for productivity
because you can shift seamlessly between the code and use all of the same strategies and coding techniques.
Is it true that the updater component is also written in Rust?
Yeah. Yeah. In fact, I hadn't really done much cloud development before working at Matic.
And I happened to get picked to do that particular part of our system.
And so I spun up my first ever cloud service using Axum and wrote clients for
it using, you know, tokio and reqwest and all of the standard Rust crates.
And it's worked out very well. I find it a really pleasant environment to work in.
From your background, given that you worked closer to the system before,
what was your first impression about the async web environment?
And also, maybe on a broader scale, the web ecosystem in Rust in general?
I haven't done that much in the web ecosystem. I sort of feel like I'm pretending
a little bit because I don't necessarily know what all of the expected ways
of working are, but it seems very mature to me.
I've done interactions with AWS services and Cloudflare services and built web
servers and web clients.
And for the most part, everything seems very polished to me and very easy to use.
The only place where I felt like I was reaching for one of the more extreme use cases was TLS client certificates. I think that's a little bit of an exotic use case.
But if you want to do basic sort of token-based authentication,
I think it's a bit easier.
What does the development cycle look like? Like, do you push updates every day
and do they get updated more or less?
Do they get pushed to the clients, to the robots automatically?
Or do you have weekly update cycles?
What is it like? Continuous deployment?
For the robots that are in our office, it's fairly continuous.
It sort of depends if a particular engineer has a robot of their own that they're
debugging on. They control the deployment procedure.
For beta customers right now, we might release every week or two,
and I would imagine that's going to slow down a lot once the software stabilizes
and we kill the last bugs.
Did you ever run into issues with the update process?
For example, the update got stuck or you couldn't download the entire package.
You probably have some way to verify that the update image is correct based
on some checksum, I would assume.
But for example, things during the flashing process might not work as expected.
Do you keep that in sort of a sandbox where you can always recover to a clean state, or how do you handle that part?
Yeah, we always want to be able to fall back to the previous version if something goes wrong.
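As a rough illustration of that fall-back idea, here is a toy A/B-slot updater in Rust. Everything here (the types, the toy checksum) is invented for the sketch; a real updater would write to flash partitions and verify a cryptographic hash of the image before switching boot slots.

```rust
// Toy A/B update scheme: only switch the active slot if the new
// image verifies; otherwise keep booting the previous version.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Slot { A, B }

struct Updater {
    active: Slot,
    images: [Vec<u8>; 2],
}

// Toy checksum standing in for a real cryptographic hash.
fn checksum(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, b| acc.wrapping_mul(31).wrapping_add(*b as u32))
}

impl Updater {
    fn inactive(&self) -> Slot {
        match self.active { Slot::A => Slot::B, Slot::B => Slot::A }
    }

    /// Write the new image into the inactive slot; switch only on a
    /// successful verification. On mismatch, fall back and retry later.
    fn apply(&mut self, image: Vec<u8>, expected: u32) -> bool {
        if checksum(&image) != expected {
            return false; // bad download: keep the previous version
        }
        let idx = self.inactive() as usize;
        self.images[idx] = image;
        self.active = self.inactive();
        true
    }
}

fn main() {
    let mut up = Updater { active: Slot::A, images: [vec![1], vec![]] };
    let good = vec![2, 3, 4];
    let sum = checksum(&good);
    assert!(up.apply(good, sum));   // verified: switch to slot B
    assert!(!up.apply(vec![9], 0)); // corrupt: stay on slot B
    println!("active slot: {:?}", up.active);
}
```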
I think the set of problems that we have is going to change as we start to ship
larger and larger numbers of robots.
Right now, I think most of the problems that I've seen around updates is that
home Wi-Fi networks can have a lot of strange failure modes.
And teaching our robot how to deal with those failure modes is mostly what we see so far.
And the other thing that we notice is that everybody has that one corner in
the house where the Wi-Fi reception is really bad.
And it's fairly common that that's where the robot is parked on its charger.
So figuring out how to deal with that problem is kind of tricky.
In a sense, it would be interesting if, while it's mapping the house, we taught the robot to also map the Wi-Fi signal strength, so it knows where in the house it can go if it needs good network connectivity.
Can it update while it's doing its job, or can it only update when it's at the home base?
There's no reason why it couldn't do it at the same time as it's cleaning.
It's not a big CPU load to be downloading stuff over the network.
I think mostly we want to do the update while it's idle, just because as we
start committing to writing down the new software version, we don't want the
robot to lose power if it can't find its dock.
So the dock is always the safest place to do things that you can't take back.
Right. Do you even think about all of these things when you develop new software and you ship it?
Or do you just push to a repository and it will handle itself somehow?
Mostly, I think we don't worry about it. I think when you ship a hardware product,
testing for quality and preventing regressions is the hardest part of the engineering, because if all you ship is, say, a web service,
you can build that anywhere and you can test it anywhere.
And it's really easy to say that things work the same way as the previous generation.
When you have a custom piece of hardware, if you really want to do effective
testing, you need that custom piece of hardware in the test loop.
But if you have a robot that moves and sees different things every time,
now testing has become extremely difficult.
Did Rust help you with that somehow? For example, did you have to touch the updater a lot and change its behavior, or patch it and maybe add ways to handle edge conditions? Or was it mostly a done job and you don't touch it that much anymore?
I think that no software is perfect the first time, and I think Rust makes us
more productive in all aspects.
So figuring out what the bug was, I think I can do that faster than I could in other languages.
And applying the fix and testing it and being confident that it's correct,
those things go a lot faster as well.
And that doesn't mean that I'm perfect and never introduce bugs, so sure, there have been plenty of updates since the first version of the updater came out.
Okay, then that means you still work on this component, and you probably have a way to handle errors when things like the update fail. There is probably a way to also log errors, but do you also handle panics? How's the error handling story right now, on the device specifically? It must be quite a challenge.
Yeah, I think I always like to start with the simplest possible error handling.
And in the case of a software update, that means if we try to download something
and the download doesn't complete, we just give up for now and we'll try again
later. And hopefully that will succeed.
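That "give up for now and retry later" policy fits in a few lines of Rust; `fetch` here is a made-up stand-in for the real download, which in practice would be an async HTTP request:

```rust
// Made-up download that fails transiently on the first two attempts.
fn fetch(attempt: u32) -> Result<&'static str, &'static str> {
    if attempt < 2 { Err("connection reset") } else { Ok("update-1.2.3.img") }
}

/// Try a few times, log each failure, and return None if we give up;
/// the caller just schedules another try later instead of escalating.
fn try_download(max_attempts: u32) -> Option<&'static str> {
    for attempt in 0..max_attempts {
        match fetch(attempt) {
            Ok(image) => return Some(image),
            Err(e) => eprintln!("download failed (attempt {attempt}): {e}"),
        }
    }
    None
}

fn main() {
    match try_download(3) {
        Some(img) => println!("got {img}"),
        None => println!("giving up until the next retry window"),
    }
}
```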
And we do log everything.
So the hope is that if we discover that there's a robot that's having trouble
updating that we may be able to get access to those logs to figure out what the problem is.
The nice thing about running a lot of robots ourselves is really significant
bugs will eventually happen in our office or in one of our homes,
and we can go and investigate what it is that went wrong.
But we also use fairly standard Linux tools, so the updater is a systemd service,
and if it panics and goes down, systemd will restart it, and it will try again later.
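That restart-on-crash behavior is plain systemd configuration; a minimal sketch of such a unit file might look like this (the service and binary names are made up, not Matic's actual ones):

```ini
# Hypothetical unit file for a self-restarting updater service.
[Unit]
Description=Robot software updater

[Service]
ExecStart=/usr/bin/robot-updater
# If the process panics and exits, restart it after a short delay.
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```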
Did you ever have to catch panics in Rust?
We do catch panics, but not for very good reasons.
I think we just wanted the ability to print out a few extra values before the
stack trace gets emitted. So for the most part, a panic is a crash.
And we try to build our software in a way that if a process crashes, it just gets restarted.
And somebody who's looking at
the robot would really never notice that anything had gone wrong at all.
Impressive. We covered the graphical user interface to some extent.
We covered the map functionality. There's this web client which takes on the updates and the requests from the robots. We briefly touched on cloud components, because this is deployed in the cloud, I assume. There's this user-facing app, and the third-party libraries. But one thing that we haven't covered yet is the communication between hardware and software, and I did wonder about this: what are the communication protocols?
What system is used internally to communicate between the different parts of the robot?
Is it a message bus? Is it a custom protocol? Do you use something like MQTT? What does it look like?
At varying times, we've used some of everything.
We do have some MQTT protocols and we've had some custom built protocols.
And we've rewritten our messaging between the robot and the mobile phone app many times, trying to get to the right sweet spot where it tolerates bad network conditions and also tries to be kind to other users of the network. But we'd also like it to be fairly portable, because it has to run on iOS as well as on the robot.
It's one code base. So we do run Rust code on iOS.
And we also need, there's a large number of shared data structures.
So we need those to be serialized in a very portable way.
And so this is one of those areas
where I think we maybe haven't found the exact right combination yet.
So we keep trying and we keep learning and getting better at it.
Internally to the robot, we tend to put most of the functionality into a small
number of large binaries and so most of the internal communication can be direct
function calls or channels.
With regards to the libraries that you tried for external communication,
can you still remember what was missing there? Anything that,
comes to mind, why you were not extremely happy about all of them?
Most of the ones that we've been unhappy with were things we built internally.
And it wasn't that we were unhappy with the performance exactly.
I think it was just a case where the code got so complicated that we were having
a tough time maintaining it.
And so we gradually had to stop using it and replace it with a simpler protocol.
Is it correct that there's no message queue on the system or do you have one
that you use to communicate between different separate components?
I'm sure there are many message queues. I can tell you this.
We have a really large dependency tree.
We write a lot of code very quickly and we check it all into a single workspace.
So it can actually be somewhat intimidating in that there are dozens of crates
that I have no idea what they do.
At any one time, you know, we probably have at least 500 crates of our own.
And our dependency tree, the last time I looked was more than a thousand third-party crates.
And not all of that is code for the robot.
A great deal of it is just the fact that we drop in all of our developer tools
into the same workspace because, you know, there's always some connection where
they want to have a shared data structure or something.
So, you know, sometimes people talk about the monorepo strategy; we kind of have the mono-workspace strategy.
And it feels like we're really straining the limits of what can be done within
a single Rust workspace.
Sometimes it's really convenient, you know, if you see a bug in a library and
you want to update it, you can update it one place in the workspace and know
that everybody's handled.
But at the same time, I worry a little bit that we may be stressing the system
beyond what it was really designed to do.
And at some point, we may need to start breaking that apart.
I can imagine, because it must be pretty common for you to run into limitations of cargo workspaces.
Are there any missing features that come to mind?
Or is it mostly about the size of the workspace that becomes a problem at some point?
It's kind of a maintenance problem once you have a very large project,
because once you have that many third-party dependencies, you have to worry
about security concerns, maintenance concerns.
We run cargo deny a few times a week to catch new security advisories and alerts
about crates that might be unmaintained.
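For reference, cargo-deny is driven by a `deny.toml` in the repository root; a minimal sketch might look like the following (the schema has changed across cargo-deny versions, so check the cargo-deny book for the current field names):

```toml
# Minimal deny.toml sketch for security advisories and license policy.
[advisories]
# Fail the check on known security advisories.
vulnerability = "deny"
# Warn when a crate is flagged as unmaintained or yanked.
unmaintained = "warn"
yanked = "warn"

[licenses]
allow = ["MIT", "Apache-2.0"]
```

Running `cargo deny check` in CI then surfaces new advisories as they are published, even when no code has changed.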
One thing that I think is probably missing is I would love to be able to do
sort of a confidence estimate about a new dependency.
And there might be many, many different things that would need to be checked.
Is this crate maintained?
Do they have good test coverage? Is there a broader community of people using this crate?
We don't want to necessarily be the only ones using it.
Does it use a lot of unsafe code? There's so many small data points that you
could imagine a tool that might aggregate them all together and say,
you know, this looks like a pretty solid dependency,
looks good, we should allow it into our tree.
Or maybe every once in a while, somebody will just grab the first thing that
popped up in a search on crates.io, and you'd like there to be a safety check
that says, maybe we should look harder at this one.
Very nice idea.
Maybe it's not something we want to depend on. And it doesn't solve the problem
though, because even well-maintained dependencies, several years later,
sometimes the maintainer moves on and the crate becomes unmaintained.
And as things change, sometimes unsoundness is discovered years after the fact,
because I think the Rust compiler's idea
of what's allowed and what's undefined behavior has subtle shifts over time
and sometimes miri gets better at discovering things it wasn't aware of previously
so we need to be constantly aware of the impact of having that many dependencies
and it's an ongoing challenge.
You know, I think about that a lot, and I fully agree with this. I just wonder what such a service might look like.
Would that be a web service where you can have all of the dependencies as a table?
Would it be something like cargo deny that you run from your terminal?
What is the most convenient way to integrate it into your development workflow?
Would it be a Visual Studio Code plugin or all of the above?
Would it be a CI/CD check? So many questions.
I think the conventional answer would be this is something that you put in CI.
You know, not on the pull request path because you wouldn't want a pull request
that was okay five minutes ago to suddenly spontaneously break.
It's maybe something that you kind of run against the master branch once a week or something.
Now that I've said that, maybe I should change my answer a little bit in that
in the pull request, if you're adding a new dependency, maybe that should trigger a deeper inspection.
But I'd have to imagine that having that data set of information about all the
crates that have ever been published, you would have to have some kind of centralized
system for aggregating that data.
And this also sounds like a very opinionated tool, right? So my idea of a good
dependency might be very different from your idea.
And so maybe there's even sort of the idea of how you aggregate all these different
data points into a single metric might need to be customized for different users.
I think it's a really interesting problem, and I hope someday it exists.
It reminds me a bit of Lighthouse scores for websites, where they have a scale from 0 to 100. It is still somewhat subjective, granted, but they also try to make it objective, because it is more of a range of values. You have something that is reasonably good or something that is reasonably bad, and you can kind of gauge, from just a single number, what shape a crate or a dependency, or a website in this case, is in. Would that help? Would that be a useful metric for you?
Yeah.
I think that that's a place that you get to after you've done this for many
years and you've had a lot of discussions with your user base about what makes
a good metric. And after they've used it a bunch and learned to trust you, then yeah,
maybe your default metric ends up being the one that everybody trusts because
they've seen how you manage all of these competing values.
But at first, I think it would need to be done in a way where you could be respectful
of differing opinions and work to build that trust over time.
It's funny because in the Rust world, we get very used to the compiler being
very opinionated, right?
And even the fact that we've all learned to trust rustfmt. It used to be that people would argue a lot about coding styles, and I'm really, really happy that, at least in the Rust and Go worlds, and mostly Python as well, people have moved on from that kind of argument. You know, it's so much more valuable to have us all share a single format. We accept that it's opinionated, and we accept that everybody disagrees a little bit, but we can all come together on a single, I don't know, shared opinion, I guess, and say that it's good enough.
But when it comes to dependency management and things at the cargo level,
I think the tools have not been as opinionated in that you can publish things
that might really violate the expectations of the community.
You could publish crates that are unsound. You could publish crates that are
hostile, that do strange things in build.rs.
And, you know, unless you do something so egregious that you get kicked off
of crates.io, there's really not a layer that gets opinionated about,
is this a thing that people will want to use as a dependency?
And I think that would be an interesting space to explore.
This is very true. We did have quite a few interviews before with companies
that use Rust in production.
You're the first one to raise that request.
And if I had to put my product head on for a moment, I would ask,
if that was such a highly demanded feature, why has no one built it so far?
Is it because this is mostly a necessity by people that use Rust in the enterprise,
Rust for production for something more mature, and then they start to see the
lack of tooling for enterprise?
Or is it because it might actually be more of a community feature and there's no product behind that?
I think it's probably something you're more likely to see in companies and larger enterprises.
In the open source world, everything tends to be modular, fairly small crates
with a limited set of dependencies, and you really have a good familiarity with
all of your dependencies.
And once you get to these mega projects with hundreds or thousands of dependencies,
that's where it starts to feel like some additional tooling is needed.
And that's something that you're probably more likely to see in closed source
codebases at companies.
Wouldn't it be easy for a company to build such a product?
Not sure if a bigger company already built something like this for themselves,
or maybe it's available internally, but I'm really curious why that doesn't exist yet.
Well, there are shared code review systems out there already,
and companies like Google and AWS do contribute to those.
I think the system that I might be interested in is something that even if nobody
else in the world has ever reviewed this crate, I could still get some kind
of a metric based on a little more of an automated analysis.
But at the same time, I think those two different kinds of systems can interact with one another.
The fact that somebody at Google has done a positive code review on a crate
is a valuable input to an overall metric.
And if the automated system flags maybe what seems like a suspiciously high
amount of unsafe code, maybe that would also be useful data for somebody who
decides to go and and do a code review.
Certainly, if I were looking at our set of dependencies,
and I had a program that would give me a list and say, these are sort of the
top 1% of my dependencies that maybe look like they have perhaps quality issues,
then at least now that helps me know where to focus my attention.
Would that be something that a company like Matic would pay money for if that
was part of their critical infrastructure?
You bet big on Rust, so maybe it might pay off for you to have such a system.
Yeah, probably. Or maybe I'll decide to build it myself.
You could also go one step further and think not only about single dependencies,
but more about stacks of dependencies.
For example, you could say which crates are commonly used together. I don't know much about embedded, for example; maybe I could go and try to understand how other embedded Rust applications are built. Maybe I get the choice of three or four different stacks where I know these abstractions, these crates, these libraries, work well together. Would that also be helpful?
Yeah. I mean, certainly the fact that one crate is used by another crate that is high quality does seem to be a good vote of confidence.
In the end, you have a sort of graph-like structure with relationships. You have child and parent relationships: you have a crate which uses many dependencies, which might be the children, but you also have an application that uses this crate, which might be the parent. So you have this graph structure. Actually, what I describe is more like a tree, but you can also look at it as a graph, because there are relations between the crates on the same level.
Actually, it might be something like a tree of graphs, where each layer is sort of the siblings for one crate: crates that get used together in the same Cargo.toml.
Yeah. And another interesting problem that happens when you have these large
projects is because of cargo's feature unification.
You know, it says in bold text in the cargo manual that features must be additive.
And I think the more time I spend looking at many, many dependencies,
the more you realize that there are a lot of non-additive features out there
in the world. I see them all over the place.
And they're in some very common crates. But once you start including all of
your code in a single workspace, you realize that a feature that you turned
on in one project is now turned on everywhere.
And so non-additive features start to have a bigger and bigger impact the larger your workspace gets.
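A hypothetical sketch of that unification effect, with invented crate and feature names: when two workspace members depend on the same crate and are built together, the union of their requested features is what everyone gets.

```toml
# robot-a/Cargo.toml -- wants the default behavior of some-lib.
[dependencies]
some-lib = "1"

# tool-b/Cargo.toml -- opts into a feature for its own reasons.
[dependencies]
some-lib = { version = "1", features = ["no_std_mode"] }
```

When both packages are compiled in the same workspace build, Cargo unifies the feature sets, so `some-lib` is built with `no_std_mode` enabled for `robot-a` as well. That is harmless for additive features and surprising for non-additive ones.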
And I have actually had to go in and vendor and patch some very common crates
to make sure that a feature never got turned on.
Can you give me an example?
I mean, I don't want to shame any particular crates, but
there's a crate that implements spinlocks, and there are crates that will, given a particular feature flag, change over all of their internal logic from mutexes to spinlocks. So once you have a very large dependency tree, any one of your crates or any one of your dependencies may decide to throw that switch, and suddenly everything in your code that uses that crate gets spinlocks instead of mutexes.
Now that's a pretty significant change. And so that's the sort of thing that I keep an eye out for.
And I was actually aware that the problem existed because I happened to look
through the code base previously and I thought, well, this crate's been around for a long time.
Surely nobody would ever publish a crate that unilaterally throws that switch.
And about two months later, it happened. And I only noticed because when it switched from mutexes to spinlocks, it made a subtle change: a type was no longer Sync or Send, I think.
And so it actually caused a compile error in our code.
And that was how I discovered that suddenly spinlocks were enabled everywhere in our system.
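The vendor-and-patch workaround Eric describes can be done with Cargo's `[patch]` mechanism; a hypothetical sketch (the crate name is invented):

```toml
# Workspace-root Cargo.toml: force the whole dependency graph to use
# our vendored copy of `locklib`, with the spinlock feature stripped
# out, no matter which dependency tries to enable it.
[patch.crates-io]
locklib = { path = "vendor/locklib" }
```

Every crate in the tree that asks crates.io for `locklib` then resolves to the local, patched copy instead.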
Wow, that's really a drastic change. But isn't it possible to disable certain
features for certain members of the workspace only?
I'm not aware of a way to disable a feature if one of your dependencies enables that feature.
Ah, yeah, that's a little harder. I guess it's still potentially doable by introducing your own feature, which describes how you can disable this dependency's feature. I'm not 100% certain about it. But that would even mean that you would have to be aware of all the features that could potentially be enabled in the dependencies. So it doesn't really help you there.
Yeah. I think mostly it's just an awareness that as you get a very large workspace,
that dependency management is a very real task and ignoring that obligation
can get you into trouble in any number of ways.
I think the top level one is security.
And once you start paying attention to security, it's very easy to start paying
attention to all of these other issues as well.
Are there any other enterprise features that you might be missing right now?
I would assume that you use a private crate registry for your internal dependencies, or no?
We haven't done that yet; we use public crates.io dependencies and cache the downloads.
But for your own crates that you need to maintain, you probably don't have that problem, because everything is in one single workspace, right? That's one of the advantages.
Exactly. The easy path is just to keep piling everything into a single workspace.
Then we don't have to work out when is the right time to publish and how do
we manage that internal registry.
The master branch is where the code lives and everything is consistent based on that.
But are there any other enterprise features that are lacking right now?
I could think of monitoring, metrics, the development environment,
debuggers, things that mostly enterprise users need for bigger projects.
Of course, everyone should use a debugger, but there might be certain things
that you only use in an enterprise context.
Authentication, how tokens get handled, I don't know, anything that comes to mind.
I think that there are enough enterprises using and contributing to Rust that
most of the basics are already handled.
And anything that happens to be missing that we need, we try to build our own
or contribute it back to the right crate upstream.
You know, one of the things, like I mentioned earlier, TLS client certificates
are more heavily used in enterprise environments than in the open source world.
And so, you know, occasionally that might be a place where in the HTTP libraries,
you're more likely to find a bug relating to TLS client certificates than you
are in kind of the mainstream code path that everybody else uses.
And so what's the fix to that?
You know, I try to contribute a fix here and there.
And I published a couple of crates that supply canned TLS test certificates for use in unit tests.
And I could go a lot further, I should go a lot further.
But that's one of those things that that I would like to see tested more comprehensively.
Because if you have something that implements an HTTP client, you probably write a unit test against localhost port 80, right? But you probably don't want to do the excessive amount of work to figure out how to spin up a proper HTTPS service running on localhost just for the unit test. You need private keys and public keys and certificates, and you need to have a root certificate that somehow gets plumbed into your system, or at least into your HTTP client.
And all of that is a lot of work for somebody who just doesn't really use that
code as part of their development process.
So I think that's one of those areas where when I see a gap,
I like to pitch in a little bit and contribute if I can.
And speaking of edge cases, early on you mentioned that 95%, if I remember correctly, of the code base is in Rust.
It's just a guess.
What about the other 5%?
So we have a Linux kernel, and it's somewhat customized for this embedded CPU ecosystem that we get. And we have a lot of system libraries that are written in C and C++. And so we include those in our tree and build them as we're building all of the Rust code for the robot.
So it's not that different from what you would find on a Linux server appliance, perhaps.
Do you even have any of the other, say, more traditional languages like Java
or like web things, for example, TypeScript or JavaScript?
Or because you even mentioned that the iOS application is at least in parts
written in Rust, do you even have such cases where you need to cross the language boundary?
Yeah, lots. I mean, for iOS, most of the application gets written in Swift or
whatever the language of choice is for building iOS applications.
And really, we've just made a Rust library that understands how to speak our
protocol to talk to the robot, and it understands the data structures that are specific to our robot.
We don't use Java. I would say we probably have small tools here and there written in JavaScript.
Speaking for myself personally, I am apparently just a really terrible JavaScript programmer.
I am so bad at JavaScript that when I need to build code that runs in a web
page, I will just write it in Rust and compile it to WebAssembly.
It's not a perfect environment. You certainly have to experience a little bit
of pain and jump through some hoops to get it to work.
But I find that a lot easier than writing code in JavaScript, to be honest.
And I think that's mostly just me, because I find it so disconcerting
to be in a programming language with a weak type system where mysterious values
can propagate through the code.
And by the time they explode, I have no idea what went wrong.
I can share the sentiment. Probably I don't take it that far.
Most of the time, I still have some JavaScript projects, and I still use JavaScript on the front end. But there are things like Leptos, for example, which allow you to build parts of your front end in Rust as well, with WebAssembly, and you have this deep tool integration. It feels just right once you hit that sweet spot where you can talk about the same structures both on the back-end and the front-end side. It's really cool.
Yeah. And in my personal time, I've also experimented a little bit with Bevy.
And it's really amazing in bevy that you can build a native application.
And then you can also recompile for WebAssembly and see your very same game
run in a web browser. And the first time I tried it, it worked.
In your blog post, you mentioned that Rust was an easy-to-pick-up language for an intern.
And now that you describe your background and all the things that you do,
I wonder if it was more about you with all of your knowledge and maybe your
expertise and the way you can learn things pretty quickly,
or if it was also about the language.
It kind of contradicts the common belief that Rust is a difficult language to learn. Or it might also be that you just got lucky with that intern, and maybe your perception is that Rust is easier to learn, whereas in reality it's easy to learn for you, but not for others. Can you comment on that?
Sure. So to repeat the story at Matic, we have this visualizer application,
and it's used by engineers to sort of monitor and interact with the robot.
And we had originally built this program in Python, but last summer,
an intern rewrote the entire visualizer in Rust.
And everybody was really happy because it made the application a lot faster.
And because the code was a lot easier to extend, people started contributing
a lot more detailed visualizations and it helped us fix a bunch of bugs.
And it turned out that this was the intern's first project in Rust.
But that's not the only time that that has happened. We've had a lot of new
hires that didn't know Rust before they started and things seem to be going pretty well.
By comparison, I learned Rust all by myself, and it seemed fairly difficult for me. I think the difference is, if you have somebody nearby to talk to, to look at your code, to give advice, that really seems to speed things up a lot.
So really having a mentor is a big help.
It also helps that once you have a large group of developers,
they don't all need to know how to do the tricky things.
They don't all need to know how to write unsafe code.
They don't all need to know how to do FFI.
They can kind of specialize a little bit. So that helps. They can grow their
Rust knowledge at their own pace, and they don't really have to feel pressured
to know every single detail.
But I was also surprised, you know, my background is sort of systems languages, C and C++.
I was really surprised, too, that that sort of a background doesn't seem that
important to being able to learn Rust.
I would have expected that, geez, if you've never used pointers or memory management
before, you might have a hard time. But that really doesn't seem to be a big issue.
I have a 12-year-old who's a pretty decent Python programmer,
and he's never learned C.
So maybe one day soon, I'll see if he wants to learn Rust, and then I'll really
get to the bottom of this.
Does the Rust learning experience depend on the person's previous experience with other languages?
I would have thought so, and I'm sure every person's experience is a little different.
But in terms of, is it a lot easier or a lot harder if you learned these languages
previously? That doesn't seem to be the case in our experience.
I think the hardest part of learning Rust for me, I really wanted to understand
what idiomatic Rust was.
I wanted to know how do experts write Rust code?
And that's something that's really hard to learn, I think.
And particularly, you know, how do you capture complicated ideas and make them look simple?
People who are really good at Rust have a talent for, I guess,
making their code really expressive and friendly.
And I think mostly I learned a lot by watching Jon Gjengset's YouTube videos.
If you haven't seen them, he does these three and four and five hour long videos
where he tackles some big giant project.
And he's really good at Rust code. So you get to watch an expert and you get
to watch him make mistakes and figure out the problem and move on.
And having watched that, I think it's really helpful because you learn a lot
about how people mentally construct their code while they're writing it.
I can relate to that. Jon is really a great person, not only a great Rustacean,
but also someone who is very happy to share his knowledge.
And it's much appreciated to have such a person in the community.
Are there any book resources or other non-video resources that you would recommend
for learning idiomatic Rust? Or is that another gap in the ecosystem right now that should be filled?
Yeah. So Jon's book is great, Rust for Rustaceans.
I also, I have a great love for efficient data structures and lock-free data
structures and algorithms.
And so the book Rust Atomics and Locks, I think, is really great.
By Mara Bos.
By Mara Bos. And I think that's a wonderful book as well.
And what's funny is, even though the Rust atomic semantics are basically identical to those in C++,
I have never seen a C++ resource that explains that memory model as well as Mara's book.
So I think it's a great resource even for people who don't necessarily care that much about Rust.
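To make the memory-model point a bit more concrete: the classic example that book-style treatments of release/acquire ordering use is message passing between threads. Here is a minimal sketch of that pattern (the function name `message_pass` and the values are mine, not from the book): the `Release` store publishes the earlier write to `DATA`, and any thread that observes `READY == true` through an `Acquire` load is guaranteed to also see the published value.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::thread;

static DATA: AtomicU32 = AtomicU32::new(0);
static READY: AtomicBool = AtomicBool::new(false);

// Release/acquire message passing: the Release store to READY
// "publishes" the earlier (Relaxed) write to DATA. A thread that
// sees READY == true via an Acquire load must also see DATA == 42.
fn message_pass() -> u32 {
    let producer = thread::spawn(|| {
        DATA.store(42, Ordering::Relaxed);
        READY.store(true, Ordering::Release); // publish
    });

    // Spin until the flag is set; Acquire pairs with the Release above.
    while !READY.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }

    let observed = DATA.load(Ordering::Relaxed);
    producer.join().unwrap();
    observed
}

fn main() {
    println!("observed: {}", message_pass());
}
```

The same two orderings exist under the same names in C++ (`memory_order_release` / `memory_order_acquire`), which is why a good explanation of this model transfers directly between the two languages.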
Idiomatic code is also very close to my heart. And I do believe that we need
more resources in order to convey how to write both maintainable,
ergonomic, but also reliable code.
And if I understood you correctly, we are on the same page here:
we think about systems a lot, how the small parts interact with each other,
and how to build something that is robust and reliable with a language like Rust.
And Rust lends itself to these sorts of problems. How do you learn about reliable
systems? Are there any resources out there, maybe even outside of Rust,
that you used to learn how to build such environments, such applications?
I think I mostly just learned by seeing projects go right and go wrong.
My opinions about Rust are really opinions about where software engineering
is, and that's mostly based on the projects I've seen in the last 25 years.
When I was at Pure Storage, I got to witness a startup grow through two orders of magnitude.
And that company was successful because they found a way to build really high quality code.
They sell data storage appliances with incredible uptime and reliability.
And doing that needs a lot of things to go right.
And the really cool part is that when things are right, the day-to-day experience
of being a software engineer starts to change.
You spend a lot less time chasing dumb stuff, and you get to spend a lot more
time on the really unusual failures. Have you ever heard the phrase "horses, not zebras"?
No, I have not.
This is a bit of a tangent, but it comes from medical doctors.
So a medical student is training to be a doctor.
They learn about all of this cutting edge research and exotic diseases,
and they go in to meet patients.
And there's a tendency to think that every patient has some exotic disease.
And it's so much more likely that they have something common and boring.
So the advice they give to medical students is when you hear hoofbeats,
think horses, not zebras.
Always go for the easy stuff first. But in software, it sometimes goes differently.
If you have a software project, once you've put in the effort to drive out all
of the really common bugs, then the only thing that's left are the zebras,
the really interesting, surprising bugs.
You start to see strange failures from all over the place.
You can see design mistakes in the hardware.
You can see bad CPU microcode.
You might start to see kernel bugs that nobody's ever identified before,
or compiler bugs that manifest incredibly rarely.
And that is a really interesting job to have.
You get to spend all day being creative, trying to imagine how something could
possibly have happened. And it's really a lot of fun.
And how do you get to this really high level quality, I think the discipline
is just to always be looking for better tools.
And sometimes these tools just come along and fall in your lap.
And sometimes you have to build them yourself.
Every large software project has built some of their own tools.
But I think it's important to always be thinking, what could I build that would
fix not only this bug, but every other bug in this category.
And those category solutions are really hard and expensive to build,
but they're incredibly satisfying and worthwhile once you have them.
Is there a good example of such a tool that you have in mind?
If you find yourself running out of memory a lot...
One time I had a coworker who wrote a tool to help profile all of the memory
that had ever been allocated and tell you where it went.
And once I saw that, I realized I want this in every project.
The fact that memory allocations are kind of mysterious data that nobody gets to see seems wrong.
And so given a day or two, you can build your own memory allocator and print
out some data about who the caller was and where the memory went.
And you can learn a lot of interesting things.
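The "day or two" version of that idea can be sketched in Rust with a `#[global_allocator]` that wraps the system allocator and keeps running counters. This is my own minimal illustration, not Matic's tool: a real profiler would also capture the caller (e.g. a backtrace) per allocation, which is omitted here.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Running totals, visible to the whole program.
static TOTAL_BYTES: AtomicUsize = AtomicUsize::new(0);
static NUM_ALLOCS: AtomicUsize = AtomicUsize::new(0);

// A wrapper around the system allocator that counts every allocation.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        TOTAL_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        NUM_ALLOCS.fetch_add(1, Ordering::Relaxed);
        System.alloc(layout)
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout)
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

fn main() {
    let before = NUM_ALLOCS.load(Ordering::Relaxed);
    let v: Vec<u64> = (0..1_000).collect();
    let after = NUM_ALLOCS.load(Ordering::Relaxed);
    println!(
        "building the Vec took {} allocation(s); {} bytes allocated so far",
        after - before,
        TOTAL_BYTES.load(Ordering::Relaxed)
    );
    drop(v);
}
```

Even this toy version makes allocation behavior visible instead of mysterious, which is the point Eric is making: the data was always there, nobody was looking at it.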
And I think the ultimate is this blog post out there by Will Wilson,
who's the CEO of Antithesis. He describes how the developers of FoundationDB
had such a good test environment that their engineers were 50 times more effective.
And it sounds like it must be an exaggeration, but I really do believe that
it's true because it's a really incredible feeling when your tools start multiplying
your productivity and you can do incredible things without wasting your time.
And so that's how I come to Rust
is that search for better tools and the desire to stop wasting my time.
One traditional question that I ask every interviewee is what would be your
message to the community?
I think my message is, thank you. I'm very grateful to have better tools.
They make programming fun again.
I think things in the Rust world are going really well.
There are always things I'd like to see get done faster, but I try to be patient
because I understand that sometimes people need time to think and make good
decisions. And open source developer burnout is a real problem.
So I don't want to add to the pressure in any way.
I'm amazed that the Rust toolchain releases every six weeks and it's so reliable.
I usually upgrade our toolchain at Matic the week it comes out.
I've been doing that for more than a year and we've never had to revert to an older version.
So it's great. We don't need to stress about the quality we get to play with
all the new toys. Everybody's happy.
And for other device vendors?
I think the world has a long way to go in terms of software security.
Almost everything is terrible. There are maybe five or 10 companies in the world
that can make secure consumer devices, and even they have problems from time to time.
Everybody else is just broken constantly.
And there's a lot of reasons for that, default passwords and directory traversals and so on.
But I think it's really hard to get any traction on the problem when there's
this constant undertow of memory unsafety.
You know, you send a big packet and you can cause the server to crash.
You can send a malformed URL and get a remote code execution.
It still feels like we're kind of living in the dark ages. And I really have
to wonder: if you built these devices with Rust everywhere, would it start to get better?
I think in the next 10 years, we're going to start to find out.
And the other problem is, a lot of hardware is built by companies that
just want it to be as cheap as possible. They just want to grab whatever
open source stack is out there and ship it, and never care about the future.
And they don't ship security updates either. So that's a really hard problem to resolve.
But I think it's at least possible for that cheap open source stack to reduce the
security problems by a lot, and then maybe it's at least possible for cheap devices
to stay secure for at least a few years. But it's a really interesting problem.
A great testament to Rust and to the community. Thanks for being a guest today.
Thanks so much for having me on the show.
Rust in Production is a podcast by corrode. It is hosted by me,
Matthias Endler, and produced by Simon Brüggen.
For show notes, transcripts, and to learn more about how we can help your company
make the most of Rust, visit corrode.dev.
Thanks for listening to Rust in Production.