Rust in Production

Matthias Endler

OxidOS with Alexandru Radovici

Alex, CEO of OxidOS, discusses enhancing safety in car ECUs, Rust's advantage over other languages in embedded, certification, and teaching Rust to students. He highlights testing with OxidOS, Rust in industrial systems, and tool selection for success.

2024-06-27 69 min

Description & Show Notes

It has become a trope by now: "Cars are computers on wheels." In modern cars, not only the infotainment system but also the engine, brakes, and steering wheel are controlled by software. Better make sure that software is safe.
Alexandru Radovici is a Software Engineer at OxidOS, a company that builds a secure, open-source operating system for cars built on Rust and Tock. We talk about the challenges of certifying Rust code for the automotive industry and the new possibilities with Rust-based car software.

About OxidOS

OxidOS is a Rust-based secure ecosystem for safety critical automotive ECUs. Their solution consists of a Rust-based Secure Operating System and DevTools for medium-size microcontrollers inside automotive ECUs, designed for safety-critical applications. The OxidOS ecosystem provides significant security and safety enhancements while reducing development and certification time by half for automotive ECU software development projects. This is achieved through the usage of Rust that brings benefits such as memory and thread safety enforced at compile time. The OxidOS architecture runs memory sandboxed applications, which have cryptographic credentials and are digitally signed.

About Alexandru Radovici

Alexandru Radovici is an Associate Professor at the Politehnica University in Bucharest, Romania, where he has been using Rust to teach for a few years. Alexandru is also one of the maintainers of the Tock embedded operating system, written fully in Rust.

About corrode

"Rust in Production" is a podcast by corrode, a company that helps teams adopt Rust. We offer training, consulting, and development services to help you succeed with Rust. If you want to learn more about how we can help you, please get in touch.

Transcript

This is Rust in Production, a podcast about companies who use Rust to shape the future of infrastructure. My name is Matthias Endler from corrode, and today we talk to Alexandru Radovici from OxidOS about embedded development and putting Rust back into cars. Hey Alex, welcome to the show. Can you talk a little bit about yourself and about OxidOS, the company that you work for?
Alex
00:00:26
Hi, Matthias. Thank you for the invitation. My name is Alex. I'm the CEO of OxidOS. OxidOS wants to provide a new kind of operating system for small ECUs in order to make cars safer and the development cycle faster. My background is in computer engineering. I have a PhD in computer science and have been working in operating systems and compilers for the last 15 years, probably.
Matthias
00:00:53
Maybe for some clarification, what is an ECU?
Alex
00:00:57
An ECU would be an electromechanical component that has a chip on it. So it's a small part in a car that is controlled electronically by software. The normal device, we would say it's a microcontroller. In the car, we would say it's an electronic control unit.
Matthias
00:01:13
Okay, perfect. You say that you have a background in compilers, and I wonder how long it has been since you started with Rust, and maybe even with low-level programming.
Alex
00:01:28
Low-level programming, probably in high school, in the days of Windows 95, actually the days of DOS. We would do really crazy things in DOS. With Rust, I'm not that old. I actually had three attempts to start Rust. Somewhere in 2015 or 2016 I dropped it because of the complicated syntax. I tried two years later but didn't have time. And finally I did have time during the pandemic. The pandemic hit us in 2020. I was teaching at university, we started teaching online, which significantly reduced the time that I had to spend at the university, and I started contributing to Tock OS, which is an operating system written in Rust. That's how I learned Rust. I needed a project, and this one seemed really nice.
Matthias
00:02:18
It's kind of funny that you say that, because not many people have that depth of experience then, and also maybe drop the language in between. So I do wonder, has the syntax changed significantly since your first attempt?
Alex
00:02:31
Yes, it did. Lifetime elision rules were added, and I think the manual was better. But most probably I dropped it because I didn't have enough time. It was a really, really small side project, which required more time than I could afford.
Matthias
00:02:46
And what did you use before Rust? Probably some C, C++, assembly?
Alex
00:02:52
Not really. I started with Pascal and doing stuff in NASM and Pascal. I used Delphi afterwards. I did learn C by writing some Linux software. I never learned C++. I don't like it. I think it's a nice playground and it was super necessary, but I'm not a fan of C++. And knowing compilers, I can give several reasons why not. And then I used, surprisingly, Node.js as soon as it came out. My previous business was performing remote updates and controlling boards, with everything written in Node.js.
Matthias
00:03:28
And what was so great about Node.js that you jumped on it immediately?
Alex
00:03:33
It was very similar in syntax to C. I didn't have to do mandatory indents and get errors that they don't fit, like in Python. I'm not saying Python is not great. It is. It just was not easy to fix indent problems. And, I don't know, Node.js was running everywhere. It was running in the browser, it was running on the computer, and it was really easy and fast to program. We started the previous business, controlling boards, by writing a client in C, and it took us about a month and a half to be able to start sending messages on the protocols that we needed. In Node.js, it took us half a day. The Node.js ecosystem had exploded, probably, and you would find modules for everything. Some were good, some were bad, but it just worked really nicely.
Matthias
00:04:26
Are there any similarities between Node.js and Rust?
Alex
00:04:30
I think there's a lot of them in syntax. If you think of structures, if you think of TypeScript and the type system, but this is more or less normal. Somehow it was the same foundation or company behind them. So they borrowed many things. In the functional part from Node.js, they borrowed it. I think so. Of course, they have different purposes.
Matthias
00:04:53
Well, yeah, that's true. But at the same time, when you said Node.js runs on the backend and the frontend in every environment, I was reminded of Rust somehow, because Rust is kind of similar there. You have backend applications, and you can write things all the way to embedded, which is what we talk about today, right?
Alex
00:05:12
Yes, that's why we actually migrated to Rust. So we tried using Go for the previous company, and it didn't work very well for several reasons. And then when we found Rust, after the pandemic, we said, okay, this is something that we want to use. And now we have the code base more or less in Rust, and everything fits nicely together.
Matthias
00:05:33
What were some reasons why you dropped Go?
Alex
00:05:37
Package management was really bad. We are talking about 2015 to 2017. They had several package managers, and none of them worked. You had a global Go variable. The project structure was very fixed: you needed to have a fixed folder in your home directory. So there were several reasons related to project organization, I think.
Matthias
00:06:02
Yeah, these things are mostly fixed by now, so I wonder, would you give Go another chance today or would you not? Obviously, you know more about Rust nowadays, but I do wonder, if you had a team and you started out fresh without really any Rust background, would you still try with Go or would you nowadays start with Rust right away?
Alex
00:06:27
Uh, that's a difficult question. Probably I'd look at it. The idea that you have a garbage collector can be good and bad. I mean, it's bad if you have a constrained system, but if you don't have a lot of constraints and you can consume a decent amount of memory, the garbage collector will be way more efficient because it will delete memory when it has time, not when it doesn't have time. The second thing in Go, the fact that you don't have to explicitly start threads, is not that bad. So probably for concurrent programming, I would give Go a try. But that would be the case if I didn't know Rust and didn't have a team with experience in Rust. Today I wouldn't choose Go, because I have a team that knows Rust and it would not be economically wise to do this.
Matthias
00:07:24
How is the support for embedded devices in Go?
Alex
00:07:27
I know there's TinyGo. I saw this at FOSDEM in 2020, but I didn't follow it. My understanding is that it's basically some kind of RTOS modified to be a Go runtime. But I might be saying something that's not true anymore. I didn't follow this.
Matthias
00:07:43
Yeah, the two things I learned lately were, first, that they really improved their garbage collector times. This is one huge thing, because that was a blocker back in the day. And then I do think that they need some sort of runtime for an embedded environment. There's no other way that I see, because of goroutines and so on. I do wonder, though, if the runtime itself is limited or if you can write the same Go code that you can write in a backend application, for example?
Alex
00:08:16
I don't know. I don't know what to say. From talking to the maintainer at FOSDEM, I know the runtime that they were using in 2020 was something based on an RTOS. But I wouldn't know more than that. What I do know, and what I'm really happy about, is that they finally use something different than C. Don't get me wrong, C is a really useful language, but it's really old and has a lot of problems. So if you can avoid it in one way or another, that's really good.
Matthias
00:08:49
Can you list some of the problems, especially in an embedded context?
Alex
00:08:53
Memory allocation will always be a problem. On big computers you have paging, and if you do something that's not just right, you might get a segmentation fault because the operating system protects you. In many cases in embedded, this is not happening. So you can do a lot of memory overflows or incorrect memory usages, and you will not get an error; you just end up in undefined behavior. And for beginners, and I have been teaching embedded systems for some time, doing small lab projects is fine in C. But as soon as the project becomes a little bit more complicated, it becomes increasingly hard for students. They will fight with C syntax and pointers instead of focusing on what they need to do in the app. Another problem is that there is no package management. So you are fully responsible for making the packages compatible. And the third problem is no interfaces. I mean, you have the .h files, but that's not enough for an interface.
Matthias
00:10:01
I can certainly relate to that. I didn't write any embedded code in C, but it was always frustrating just to get past the compiler. And even if you did get past the compiler, it doesn't mean it runs. It just means it compiles. But runtime is another thing. There definitely were segfaults. And maybe you want to avoid that in an embedded environment.
Alex
00:10:27
In an embedded environment, there's no segfault. It just does something different, which Rust calls undefined behavior for a good reason.
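As an illustration of the contrast Alex draws here, the following is a minimal Rust sketch, not code from OxidOS or Tock. The function name `read_sample` and the buffer contents are made up for the example. It shows how safe Rust turns an out-of-range read or an arithmetic overflow into a value the program has to handle, where the equivalent C code on a chip without memory protection could silently read stray memory. Plain indexing such as `buffer[7]` would panic in safe Rust instead of invoking undefined behavior.

```rust
/// Reads one byte from a sensor buffer without ever indexing out of bounds.
/// `read_sample` and the buffer contents are hypothetical, for illustration only.
fn read_sample(buffer: &[u8], index: usize) -> Option<u8> {
    // `get` returns None instead of reading past the end of the slice.
    buffer.get(index).copied()
}

fn main() {
    let buffer = [10u8, 20, 30, 40];

    // In-range access yields the value.
    println!("{:?}", read_sample(&buffer, 2));

    // Out-of-range access is reported as None, not as a silent stray read.
    println!("{:?}", read_sample(&buffer, 7));

    // Arithmetic can be checked the same way: overflow becomes None.
    let counter: u8 = 250;
    println!("{:?}", counter.checked_add(10));
}
```

The caller is forced by the `Option` return type to decide what an invalid index means, which is the kind of decision a static analyzer would otherwise have to flag after the fact.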
Matthias
00:10:35
Right. So does that mean that the code that runs in my car also is prone to undefined behavior?
Alex
00:10:44
More or less. I mean, they use C for cars, but if the car manufacturer respects good practices, they will require software to comply with the ASIL standard, A, B, C, D, depending on what that software does. Technically, what they do is write the software and then run a lot of static analysis with very expensive tools, which do what the Rust compiler does out of the box: try to find undefined behavior and ask the developer to fix it. And to some extent that works fairly well, but it takes a lot of time and it is very expensive.
Matthias
00:11:26
And now suddenly we got Rust and all of these problems go away, quote unquote. But I do wonder, does it save companies time to start with Rust, because maybe they don't run into a lot of these problems?
Alex
00:11:43
It depends on which company. Until recent years, car manufacturers would not write software directly. They would buy the part, like they would buy the headlight, they would buy the brake system, the ABS system, from a tier one. The tier one would actually write the software, choose the microcontroller, make the ECU, and make all these checks. I think, and we actually think at OxidOS, on an educated guess, that it will save some time because the Rust compiler forces you to write software in the right way. Somehow, the community that wrote Rust just got inspired by this functional safety and good practices of writing code. It's one thing to write the whole code, then go through static analysis and then retrofit. It's another thing when your code does not compile at the beginning. So in Rust, it will simply not compile. So you will be forced to think a little bit differently. And this is where we think they will save some time. And probably, due to the nature of the language and the expressivity of the language, it will be easier to read the source code.
Matthias
00:12:55
The Rust compiler itself got recently approved or it got certified for those critical embedded environments like automotive. And I do wonder, as a Rust developer, how much work do you have to do on top of what is already certified or qualified? Do you also have to qualify your code even if you run it in Rust? I assume so.
Alex
00:13:20
Yes, and that is still an ongoing issue. So the compiler is certified. What is not certified yet is the core library. And the compiler is not like the C compiler, where the compiler is completely standalone. In Rust, the compiler needs a subset of the core library, the language features. And I think several companies are working to certify that as well. We, as a software provider, need to certify the software as well. So we need to do what is called ASIL D out-of-context certification. So we need to prove that the software does what it says it does. Out of context, because we have no hardware that we run on; the client that buys the operating system from us is providing the hardware. The client is the final step in the certification because he needs to take the software together with the hardware, respect some parameters and guidelines, and certify the whole ECU. We as a software provider need to provide a lot of documentation for the software. There's a V model where we need to provide specific requirements, we need to provide the architecture, and we need to provide the detailed design up to almost every variable that we use. And then we need to provide tests for everything in the inverse direction: testing the small components, functional testing, and then testing that the requirements are met. The Rust ecosystem is really rich and moving really fast, but we are still missing some tools. For instance, we need to prove that our tests do full coverage of the code, and this will be available in LLVM 18. I'm saying will be because Rust doesn't use LLVM 18 yet. It's in progress. And we are also missing some tools that can do some kind of static analysis that the compiler doesn't do. But it's in the works. I mean, the Rust ecosystem is really good at doing things fast.
Matthias
00:15:26
Just yesterday, I used a tool called llvm-cov or something like this, which allows you to print the coverage of your Rust code. But I assume that what you mean is it's not available for your environment, for your context. Is that correct? Or is it the same software, essentially?
Alex
00:15:47
I think it's the same software. My colleagues were looking at it a month ago, so it wasn't available. This is really, really new because they included this in LLVM 18, and I know the Rust compiler is working towards having support. And I know that the team that does certification says that what we currently have is not enough, but it's very, very close to what we need, and it will be done. Talking to some compiler vendors, Rust compiler vendors, my understanding is that they are working to integrate this fully in the compiler.
Matthias
00:16:15
Yeah, what I learned yesterday was that I needed to pass in one additional flag which was not stable yet. There was a config flag, I can't remember, but the rest was in fact running on stable Rust, which is kind of...
Alex
00:16:28
That's the problem. I think that's the problem that my colleagues had.
Matthias
00:16:31
Okay, well, coming back to the architecture of OxidOS and also the software that you need to certify, I do wonder: if you have to go down to a function level and even sometimes a variable level, does it make redesigning the architecture super hard? I imagine that you need to be extremely conservative about what you put in and how you move things around, because you know that it will mean you will have to update all these documents and maybe trigger another round of certification.
Alex
00:17:06
Yes, that is correct. So how the system was designed, and this was designed in the 90s, is you are building specific requirements, you are building the architecture, you are building the detailed design up to the function and almost variable level. And this is the point where you start writing the software. Which is very interesting because in our case, we use an open source operating system, which we professionalize. And we have the software already, but we need to retrofit the documents. So this is the challenge that we have. The second is we are working very hard to make these architecture changes easy. Because imagine, it's open source software. It moves online. It changes. It gets new features. It gets redesigned from time to time. We need to have a really fast way of bringing in those changes, at least the changes that are security related, and be able to update the documentation. And this is a process that my colleagues are working very hard on.
Matthias
00:18:11
If I understand you correctly, TockOS existed before this process started. So now you are in a situation where you have this ongoing open source project and you somehow need to retrofit it into the certification process. Is that correct?
Alex
00:18:27
Exactly. Yes, that is correct.
Matthias
00:18:29
And then whenever you get a pull request from an external contributor, you need to somehow see if that fits into the bigger equation.
Alex
00:18:40
Exactly.
Matthias
00:18:41
Okay, how long is that feedback loop if you look at it from an open source maintainer's perspective? Hours, days, weeks, months?
Alex
00:18:51
For the open source project, it's fast. For certification, we are still assessing this, but probably it's weeks in our case. In a standard certification process, it would be months. But we are trying to shrink this to weeks. We can't follow the upstream to the letter because we will provide a stable operating system with a certain set of features. What we need to pull in from upstream is security patches. Changing the architecture will be the case only if this seriously impacts security. And from time to time we'll pull another version of Tock and certify that one. So this is the process that we are following.
Matthias
00:19:37
OxidOS is a fork of Tock and you maintain your mirror of it. Is that correct?
Alex
00:19:43
It is a fork which is very close to upstream. So we are trying to avoid things that different companies did: forking an operating system, forking some software, having a downstream version, and then having thousands of out-of-upstream patches. So we are following Tock really, really closely. Half of our team are contributors, steady contributors to Tock. And we're trying to update as soon as possible and ship back code as soon as possible.
Matthias
00:20:14
And how are you planning to reduce the feedback cycle for the official certification? Do the certifiers, the qualifiers, get access to a repository? Do you have to send them a CD with all the code or an email or a fax? And also, on the other side, are they used to checking Rust code? What is the process there? Do they also have to upskill and learn new technologies?
Alex
00:20:46
Certification is a bureaucratic process. So assessors verify documents. And for instance, they start asking questions to the teams. Who is responsible for this? Okay, where can you show me that this was tested? And you need to show the test. You see this function? Excellent. Which requirement does it fulfill? And then you need to quickly show the requirement. So it's more or less a bureaucratic process, not a process of verifying code. The assessor is very much interested in the fact that you as a company have a clear process: what happens if, and what needs to be done when, something happens. He needs to see that everybody knows the process and knows how to find and track down something. For instance, if some ECU misbehaves and they determine it's the operating system, we need to be able to very, very quickly track it down to the line of code. The language is not important here, but we have the responsibility to actually prove that, hey, this line actually covered this requirement and this could not have happened, or this happened, but not because of a faulty process or a faulty code submit, but because nobody ever thought of this corner case. Basically avoiding the XZ problem. So we're open source, but we need to prove that we did everything possible to avoid a problem such as XZ. So if XZ had been certified, what that contributor did would have been impossible, because that testing framework would have had some requirements and they would have had to prove that everything in the test framework has a requirement and there's no additional code beyond that.
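The traceability questions Alex describes (which requirement does this function fulfill, and where is it tested) can be pictured as a simple audit over tagged items. The sketch below is only an illustration under assumptions: the `TracedFunction` type, the `audit` function, and the `REQ-…` identifiers are hypothetical and do not describe OxidOS tooling or any certification standard's required format.

```rust
use std::collections::BTreeSet;

/// One function in the code base, tagged with the requirement it implements.
/// Hypothetical structure, for illustration only.
struct TracedFunction {
    name: &'static str,
    requirement: Option<&'static str>,
}

/// Returns (functions with no requirement, requirements with no test).
fn audit(
    functions: &[TracedFunction],
    tested_requirements: &[&'static str],
) -> (Vec<&'static str>, Vec<&'static str>) {
    let tested: BTreeSet<&'static str> = tested_requirements.iter().copied().collect();

    // Code that traces to no requirement is exactly what an assessor would question.
    let untraced: Vec<&'static str> = functions
        .iter()
        .filter(|f| f.requirement.is_none())
        .map(|f| f.name)
        .collect();

    // Requirements that are implemented but never verified by a test.
    let untested: BTreeSet<&'static str> = functions
        .iter()
        .filter_map(|f| f.requirement)
        .filter(|r| !tested.contains(r))
        .collect();

    (untraced, untested.into_iter().collect())
}

fn main() {
    let functions = [
        TracedFunction { name: "start_timer", requirement: Some("REQ-001") },
        TracedFunction { name: "stop_timer", requirement: Some("REQ-002") },
        TracedFunction { name: "debug_backdoor", requirement: None },
    ];
    let tested = ["REQ-001"];

    let (untraced, untested) = audit(&functions, &tested);
    println!("functions without a requirement: {:?}", untraced);
    println!("requirements without a test: {:?}", untested);
}
```

In this toy data set the audit flags `debug_backdoor` as code with no requirement behind it and `REQ-002` as a requirement with no test.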
Matthias
00:22:36
From what I heard, and correct me if I'm wrong here, the entire process is sensible. It does make sense because they don't really look at the code itself. They look at the process, they look at the high-level documentation, they look at whatever else you need, but they don't really tell you how to write the code. They just tell you what you need to guarantee or what things to look out for. Do you agree with this?
Alex
00:23:06
Exactly. Actually, you are telling yourself how to write the code, because the detailed design document shows sequence diagrams and exactly how you need to write the code. And what they check is that the code respects the sequence diagram and that there's no additional code beyond that sequence diagram. There's no additional significant code.
Matthias
00:23:30
Okay. Significant means?
Alex
00:23:33
You have an auxiliary variable, that's fine. But not calling additional functions, or having the function way more complex, or having a branch of a match that's not covered in the detailed design. It is down to a process, to a very efficient, hopefully, and well-documented process where a team knows exactly what it needs to do.
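The remark about match branches connects to a language feature that can be shown briefly. In the minimal sketch below, the `ValveCommand` enum and the duty-cycle values are invented for illustration and are not taken from OxidOS, Tock, or any real design document. Because the `match` has no wildcard arm, the Rust compiler checks that every variant is handled, so adding a variant that the design never described forces a compile error at each such match until someone handles it explicitly.

```rust
/// Hypothetical commands from a detailed design document.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ValveCommand {
    Open,
    Close,
    Hold,
}

/// Maps each designed command to an actuator duty cycle in percent.
/// There is deliberately no `_ =>` arm: if a variant is added to the enum,
/// this function stops compiling until the new case is handled.
fn duty_cycle(command: ValveCommand) -> u8 {
    match command {
        ValveCommand::Open => 100,
        ValveCommand::Close => 0,
        ValveCommand::Hold => 50,
    }
}

fn main() {
    for command in [ValveCommand::Open, ValveCommand::Close, ValveCommand::Hold] {
        println!("{:?} -> {}%", command, duty_cycle(command));
    }
}
```

This does not replace the review against the sequence diagram, but it makes one class of undocumented branch visible at compile time.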
Matthias
00:23:55
Is it something that can be exciting and also rewarding in a sense that you learn more about software development and best practices? Or is it something that needs to be done?
Alex
00:24:06
It's half and half. It depends how you apply it. I mean, if you apply it correctly, and if you're changing it, and that's what we are doing, if you're trying to change the process a little bit to be more modern than in the 90s, then yes, it's very rewarding. And the team is very efficient. If you follow the process that was described in the 90s to the line, you will have to send documents over email for approval. This is not something that we want to do. So we use a repository. The documents are Markdown files. Of course, digitally signed, that's fine. But bringing this process to the modern era is one of our biggest concerns.
Matthias
00:24:44
Amazing.
Alex
00:24:45
And we are talking to assessors to allow us to do this because you can do deviations from the process as long as you can prove it makes sense.
Matthias
00:24:56
That sounds like a way more modern development environment than I envisioned.
Alex
00:25:03
We're struggling with this. So we are not there yet. We are working actively in doing this, but sadly, it's a very costly process.
Matthias
00:25:12
Does it also depend on who will verify your code, the agency that will do it in the end? Or is it more or less normative and people are always using the same process?
Alex
00:25:24
The process standard is ASPICE. Now, every manufacturer uses a different version, which they adapted for their purpose. We are trying to modernize it and adapt it for open source, because in the automotive industry, open source was not a thing until a few years ago. And in the safety-critical parts of automobiles, which are not connected to the internet, like engine or brake control, open source still does not exist. And adapting this, and making the industry understand what open source means and that they need to change the process a little bit, takes time.
Matthias
00:26:00
I saw a talk by Pietro Albini at Rust Nation UK lately, and they used a Google document to document or maybe to start to qualify the Rust compiler. And they had one huge document, even with different color schemes for different areas or for different sections, I guess. And it felt like such a monstrous process to me. What you described is way more manageable. I do hope that you can do it your way.
Alex
00:26:38
Don't get me wrong. We still have documents, but they're in the form of markdown files. But there's a lot of documents. What we're trying to do is to use open standards. So there's a lot of software that allows you to do the documentation. But the problem is, it's not that it's costly. That would be fine. The problem is the format is closed. So we are trying to build all the documents and everything in open formats so we can share it with customers and export it into the platform that they use. This is another challenge. But it's getting there.
Matthias
00:27:14
Right. But once you're there, that means your customers don't have to go through the same process, which is kind of great.
Alex
00:27:21
Exactly.
Matthias
00:27:21
I wanted to touch on the customers a bit. Who would be your ideal customers? Would it be big automotive companies, anyone else in the safety space? Or might it also be hobbyists, people that are makers and want to try new ways of building safe environments? Can you talk about the ideal customer a little bit?
Alex
00:27:43
Ideally, it would be the big automotive customers, aerospace, and big functional safety customers like medical or, let's say, heavy industrial. But this is our hope, that in 10 years this will happen. Today we are hoping to get to customers which are smaller OEMs, customers that build, let's say, electric bikes or electric scooters. Because as we are talking, they can sell them without certification, but in a few years, probably, they won't be able to do this because of European regulation. So we are trying to get a system for them which they can use to build these kinds of small micromobility devices. And our second target is also smaller OEMs that build these smaller, usually electric cars that will probably take over the city. I mean, I doubt that in five to 10 years, you will have a big SUV running around in a city.
Matthias
00:28:48
Visit me in Dusseldorf, then I might convince you of the opposite. SUVs are very much a thing here.
Alex
00:28:55
Yes, but for instance, Paris, I think, has double or triple parking fees for SUVs. I read some news somewhere. I don't know if it's actually true, but probably. So that they can create more space for smaller vehicles. But in the long run, I really think people will have a small electric car, which they will use in the city. I don't want to say names, but we all know smaller electric cars. And then for going, let's say, on vacation, they will have a bigger SUV.
Matthias
00:29:25
Yeah.
Alex
00:29:25
Because literally you don't need it in the city. I mean, you can get around in a city with a small car and park easier.
Matthias
00:29:32
Yeah, yeah, totally. And I also like that you think long-term here because not many companies, especially startups, have a 10-year window or they think about what happens in 10 years from now. And it feels like this is part of the equation as well. Can you talk a little bit about this? Why do you think in such long time spans?
Alex
00:29:55
Because we have no other choice. That's how the automotive industry works. I mean, there are long selling cycles in cars, so software that is being written today will be in cars in three to four years. So we really have no option other than to think in the long run. And even with Rust, they are looking at Rust, but they don't fully understand it. Most of the car companies want to use it, but don't know how. So we are trying to figure this out.
Matthias
00:30:28
Are there any important or big automotive companies that are ahead of the curve here, that understand it better than other companies?
Alex
00:30:38
Yes, there are, but I can't and I don't want to name them because we work with some of them. It might be under NDA, but if you search online, you will see there are a couple of companies that have invested in Rust. And there are some companies that have Rust in cars already, not in safety-critical systems. But as a better example, every car manufacturer that will use Android Automotive, which is Android's version for the infotainment of the car, for now at least, will use Rust because Google is rewriting Binder in Rust. Binder is the communication mechanism, the driver, for Android applications. So as soon as Google ships this, everybody will have it in the car. And usually car manufacturers are modifying this Android because they adapt it to themselves, and they will have no choice but to use Rust.
Matthias
00:31:32
It's kind of funny that in the 90s we tried to move away from rust in cars and now it's back, if you allow me that pun.
Alex
00:31:40
I've heard this. I have heard this a lot. I was talking to people that are not in the technical space, and when they heard Rust in cars, they said, oh, this is bad. I mean, how can I do marketing for you when you're telling me that I need to put rust in the car? Oh, yeah, what can I do?
Matthias
00:32:00
We can oxidize it. We can call it off.
Alex
00:32:02
Exactly.
Matthias
00:32:04
I watched a few things in that space. One was also a talk by Renault at Rust Nation UK. They talk openly about Rust usage, so it's not a trade secret. I know that Mercedes had a prototype internally. That was a leak that came out, where they had a Rust repository, but it was not for a safety-critical environment. I also know that Volvo builds Volvo OS or something like this, and it apparently was based on Rust, but I'm not sure if it still is or if they even develop that anymore. And also a German manufacturer, VW, at least experimented with it. There were a few companies that I know of.
Alex
00:32:49
Yeah, I want to refrain from saying names because we work with car companies and I don't want to imply anything there.
Matthias
00:32:55
And how does a conversation like this usually go when you go in and you say you should definitely have Rust in your car? Apart from the obvious pun, they might also have questions about why this is a necessary step. What do you tell them then?
Alex
00:33:14
Well, it's not about Rust, because no car manufacturer will actually take Rust because it's Rust and it's a new language. They don't care, and they shouldn't care. They sell cars. What we tell them is, hey, you know, we can shorten the selling cycle, we can be faster in the time to market. Anyway, your developers need to learn. Car companies are internalizing a lot. They took the model of Tesla and just internalize everything, but they need to train the teams because they're not big tech companies. And we're telling them, hey, anyway, you need to train your people. Why not train them directly in Rust? It's more expressive. You can do it faster. That's one. The second one is we tell them, hey, you know, you will have a safety issue because, one, you need to respect the European regulations. Second, you will have to respect the White House and the NSA's guidelines. And at some point, you will not be able to get away with C and C++ because these languages were specifically named in those papers. So it's a good idea to start with this. What we are trying to tell them now is, let's make a POC, a proof of concept. So I'm not expecting you to change your whole operating system. Let's take something small and make a proof of concept and evaluate how fast this is working for you. And if we can prove that, hey, this was way faster than writing it in C or in legacy things, then maybe you would consider working with us. And this is the selling point that we have.
Matthias
00:34:46
And then you come and you have some skeleton of their application that they want to build. You build the proof of concept for them on top of OxidOS and you demonstrate it.
Alex
00:34:57
Exactly. Many of them want to run OxidOS alongside something else. And that is also possible. I mean, if you have a beefier chip, you can always hypervise OxidOS. Many want to do this.
Matthias
00:35:09
OxidOS is just a good guest in this case. And it runs on a host and it has its own little runtime, which actually we maybe should touch on soon. But then you can embed it into another bigger system.
Alex
00:35:24
Exactly. I mean, it's an operating system that works on small chips, Cortex-M-like or RISC-V IMAC. So anything that respects these can run OxidOS. And this is interesting because for RISC-V, there are very few operating systems that actually work in this small chip space. And OxidOS works perfectly on RISC-V.
Matthias
00:35:48
Is that the hardware that the automotive companies use?
Alex
00:35:51
Not yet. Most companies use PowerPCs and ARMs. Yes, PowerPCs, the old ones; they never changed the chips. But probably in the future they will shift towards RISC-V, for several reasons.
Matthias
00:36:09
Let's talk about TockOS itself: the components that it consists of, the architecture, the high-level picture of the entire thing.
Alex
00:36:19
All right, so TockOS is different from other operating systems in its class, mostly because it's an embedded operating system for small chips, but it behaves like a normal operating system for computers, in the sense that the kernel of the operating system is completely separated from the apps and is compiled separately. The kernel is written fully in Rust. There is actually not a single line of C. Yes, we do have some assembly, but the rest of the kernel is written fully in Rust. It has zero external dependencies; I mean, everything is contained in the repository. We do not allow external dependencies except for crypto libraries or things that are super well vetted, but at the moment there is no external dependency. Applications are compiled separately from the kernel, so completely separately, into their own binaries. You could write an application in C, in D, in Rust, in C++, or anything that compiles for the target. As long as you can produce a static relocatable binary, which comes with an asterisk for Rust, because Rust still can't produce this on ARM, you can run it on top of OxidOS and on top of Tock, actually. So right now, applications are exactly like Linux applications. They're their own binaries. They're not dependent on the kernel. Whenever you flash the operating system, you are flashing only the kernel, and then you can flash the apps separately, which is not the case for other operating systems in its class. Secondly, we do require memory protection, and applications are sandboxed. So applications run in user mode, exactly like a Linux application. Applications can never touch hardware. Absolutely never touch hardware. So you cannot have the case where one application configures a DMA differently than another application and then everything breaks, which is the case with FreeRTOS.
Matthias
00:38:25
If they can't touch hardware, how do they get any work done?
Alex
00:38:27
Through drivers. The kernel provides drivers. So in the Tock kernel we have two types of drivers. One is low-level drivers, which control hardware and export a set of traits, a set of standard traits. These drivers do have some unsafe code because we need to do memory-mapped input/output, MMIO. But other than that, unsafe code is completely avoided. And then you have upper-level drivers, which are called capsules. These are forbidden from having unsafe code. So they can only use safe Rust, which means a driver, even though it runs with full privileges on the processor, cannot interfere with other drivers and applications, because the compiler has checked this at compile time. Of course, that holds as long as the kernel binary stays the same, but you can always digitally sign the kernel. As soon as the driver compiles, you know for a fact that it can't interfere with another driver or another application, even though it runs with full privileges. And this allows us to guarantee safety with zero runtime penalty.
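The driver split described above can be illustrated with a minimal sketch. This is not Tock's or OxidOS's actual API: the names `Led`, `FakeLed`, and `Blinker` are hypothetical, and the "register" is a plain field so the example runs on a desktop. It only shows the shape of the idea: a low-level driver exports a trait, and an upper-level capsule is written against that trait with `unsafe` forbidden by the compiler.

```rust
// Hypothetical sketch; these names are not Tock's or OxidOS's real API.

/// Trait exported by a low-level, chip-specific driver.
pub trait Led {
    fn set(&mut self, on: bool);
    fn is_on(&self) -> bool;
}

/// Stand-in for a low-level driver. A real one would do memory-mapped I/O
/// inside a small `unsafe` block; here a plain field fakes the register.
pub struct FakeLed {
    register: u32,
}

impl FakeLed {
    pub fn new() -> Self {
        FakeLed { register: 0 }
    }
}

impl Led for FakeLed {
    fn set(&mut self, on: bool) {
        self.register = if on { 1 } else { 0 };
    }

    fn is_on(&self) -> bool {
        self.register == 1
    }
}

/// Upper-level driver ("capsule"), generic over the trait above.
pub struct Blinker<L: Led> {
    led: L,
    toggles: u32,
}

// Any `unsafe` block inside this impl is a compile error. A real project
// would more likely put `#![forbid(unsafe_code)]` at the top of the crate.
#[forbid(unsafe_code)]
impl<L: Led> Blinker<L> {
    pub fn new(led: L) -> Self {
        Blinker { led, toggles: 0 }
    }

    /// Flips the LED and returns its new state.
    pub fn toggle(&mut self) -> bool {
        let next = !self.led.is_on();
        self.led.set(next);
        self.toggles += 1;
        next
    }

    pub fn toggles(&self) -> u32 {
        self.toggles
    }
}
```

Because `Blinker` only sees the `Led` trait, the same capsule code could be exercised against a fake driver in a unit test and against a real MMIO driver on a board.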
Matthias
00:39:38
Well, you have this really magical compile-time system, but at the same time you run applications that are compiled independently, and they run on this small runtime. So technically, if I could find an exploit, I could still try to access the hardware directly, no?
Alex
00:39:55
No, because you run in an MPU sandbox. MPU is memory protection. So the application can access only its small part of memory. None of the peripherals are ever mapped, or rather, not "mapped", but allowed to be accessed by the application. So as long as the hardware doesn't misbehave, you are not able to touch the peripherals, because it will fault. The processor should fault. Of course, if the processor doesn't support memory protection, then there's nothing that we can do. Applications can never access peripherals.
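A software model of that check may help. On real chips the MPU does this in hardware on every access; the sketch below only mimics the decision, and the `Region` type and the addresses are assumptions chosen for illustration.

```rust
// Hypothetical software model of an MPU check; real MPUs do this in hardware.

#[derive(Debug, PartialEq)]
pub enum Access {
    Allowed,
    Fault,
}

/// The single memory window an application is allowed to touch.
pub struct Region {
    pub start: u32,
    pub len: u32,
}

impl Region {
    /// Decides whether an access of `size` bytes at `addr` stays inside the region.
    pub fn check(&self, addr: u32, size: u32) -> Access {
        // Widen to u64 so that `addr + size` cannot overflow.
        let start = self.start as u64;
        let end = start + self.len as u64;
        let first = addr as u64;
        let last = first + size as u64;
        if first >= start && last <= end {
            Access::Allowed
        } else {
            Access::Fault
        }
    }
}
```

With a 4 KiB region at an assumed RAM address of `0x2000_0000`, a read inside the region is allowed, while a read at an assumed peripheral address such as `0x4000_0000`, or one that straddles the end of the region, yields `Access::Fault`.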
Matthias
00:40:30
Can you restart applications at runtime?
Alex
00:40:34
Of course.
Matthias
00:40:35
Okay. That means you can also update them if you want to.
Alex
00:40:38
Of course. I mean, we do need some support from hardware, because some flash systems on some microcontrollers behave in a really strange way. But for instance, for Nordic Semiconductor it works very well, for STMs it works very well. I mean, you can update just the application. The application can fault. It's the equivalent of a segfault. It can be restarted, it can be stopped. You can do with an application the same things that you would do with a Linux application, just that the number of applications is significantly reduced.
Matthias
00:41:12
I can imagine that this is a really nice demonstration for vehicle manufacturers as well, when they for the first time see that you can safely restart or even update an application that might enable them to do more in the future.
Alex
00:41:29
Yes, this is one of the selling points. But as I said, we have three to four applications per chip. So don't expect it to be like in Linux, where you have thousands of applications. No, three to four applications usually.
Matthias
00:41:44
And what do these applications do?
Alex
00:41:47
It's the business logic of your device. So anything that is not related to hardware is done in an application, including networking. I mean, we have a running networking stack written in C, which runs in an application. So if your networking stack misbehaves, you just shoot it down and restart it.
Matthias
00:42:10
That's pretty incredible.
Alex
00:42:12
I mean, the operating system provides the layer two of the networking, but anything else goes into the application.
Matthias
00:42:19
It's kind of cool that you can do that in application space, in user space, quote unquote, because in traditional operating systems that would be a core part of the operating system, but for you it's just another application that runs next to the others. Is there some sort of communication between those apps, then? Can they send messages to one another? Because if you want to use that application and make a network request, you have to reach it somehow.
Alex
00:42:46
Yes, so that's what we do. We have an IPC system that sends messages between applications. We would love it if hardware helped us more in this sense, but it doesn't. So many, many times it results in copying buffers, because we have no other way. We have a way of sharing memory, but due to the way hardware is constructed, this is super limited.
Matthias
00:43:10
Shout-out to James Munns, who worked on, or is still working on, Postcard, which is a message format. I wondered: do you use your own message format for this IPC bus, or do you use something out there that already exists?
Alex
00:43:24
We transfer buffers. Applications use whatever they want.
Matthias
00:43:28
So you deal with it on a memory level and you say, "This is the blob that got sent to you, deal with it," and then you need to interpret it.
Alex
00:43:38
So out of the box, that's what the operating system does. Of course, you can always add a driver, a capsule, to the operating system that is, let's say, content-aware of the message and does more. But this is something that differs from user to user. So we ship the operating system with a generic set of capsules, and it's you, the user, who decides whether or not you want to add something like this. But in a generic way, we pass the buffer.
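The buffer-passing model can be sketched as follows. Everything here is an assumption made for illustration: the `Mailbox` type, its 16-byte capacity, and the three-byte wire format are not taken from Tock or OxidOS. The point is only that the transport copies opaque bytes, and the receiving application decides how to interpret them, including rejecting input that has the wrong shape.

```rust
// Hypothetical sketch of IPC by buffer copy; not the real Tock IPC interface.

/// Fixed-size receive buffer owned by the receiving application.
pub struct Mailbox {
    buf: [u8; 16],
    len: usize,
}

impl Mailbox {
    pub fn new() -> Self {
        Mailbox { buf: [0; 16], len: 0 }
    }

    /// Copies as many bytes as fit and returns how many were copied.
    /// The transport does not look at the contents.
    pub fn deliver(&mut self, msg: &[u8]) -> usize {
        let n = msg.len().min(self.buf.len());
        self.buf[..n].copy_from_slice(&msg[..n]);
        self.len = n;
        n
    }

    pub fn bytes(&self) -> &[u8] {
        &self.buf[..self.len]
    }
}

/// Receiver-side interpretation. The assumed format is a little-endian
/// `u16` engine speed followed by one signed temperature byte.
pub fn parse_reading(bytes: &[u8]) -> Option<(u16, i8)> {
    if bytes.len() != 3 {
        return None; // reject anything that is not exactly one reading
    }
    let rpm = u16::from_le_bytes([bytes[0], bytes[1]]);
    Some((rpm, bytes[2] as i8))
}
```

A receiver that validates length before decoding, as `parse_reading` does, keeps a malformed message from turning into an out-of-bounds read inside its own sandbox.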
Matthias
00:44:06
If I was a manufacturer, I would ask, isn't that a security risk? Because you give me a blob of memory and I might interpret it in a certain way. For example, try to parse it as a struct that I can understand. There might be issues with the parsing process and maybe this could be an exploit. What would you answer there?
Alex
00:44:26
In the application, you mean? Well, the application needs to parse the, I mean, in automotive, they use it a little bit different. They won't pass buffers. They have something that is called the RTE, real time environment. I think real time environment. I'm not the automotive expert in the company. I'm the operating system expert in the company. Where they have a set of connections. So every application gets its values. So more or less automotive applications work like PLCs. They have a running cycle, and at every cycle, they just read some values. So what we do there, we just hand over to the application the values. We have a specialized capsule, which provides these values to the application. Let's say an application wants to know the RPM of the engine, the temperature of the engine, something like that. And then the application does something based on this. So yes, in the automotive space, we do have semantic passing of buffers.
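The PLC-style cycle described here, where an application reads a fixed set of values on every pass and acts on them, might look roughly like this. The `Signals` struct, the `SignalSource` trait, and the fan thresholds are all hypothetical; in the setup described above the values would come from a specialized capsule rather than from a replayed list.

```rust
// Hypothetical sketch of a cyclic, value-driven application.

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Signals {
    pub rpm: u16,
    pub coolant_c: i16,
}

/// Whatever hands the application its values once per cycle.
pub trait SignalSource {
    fn read(&mut self) -> Signals;
}

/// Test double that replays a fixed list of samples in a loop.
pub struct Replay {
    samples: Vec<Signals>,
    next: usize,
}

impl Replay {
    pub fn new(samples: Vec<Signals>) -> Self {
        assert!(!samples.is_empty(), "need at least one sample");
        Replay { samples, next: 0 }
    }
}

impl SignalSource for Replay {
    fn read(&mut self) -> Signals {
        let s = self.samples[self.next % self.samples.len()];
        self.next += 1;
        s
    }
}

/// Business logic for one cycle (the thresholds are made up).
pub fn fan_should_run(s: Signals) -> bool {
    s.coolant_c >= 95 || s.rpm > 6000
}

/// Runs `cycles` iterations: read the values, then decide.
pub fn run_cycles<S: SignalSource>(source: &mut S, cycles: usize) -> Vec<bool> {
    (0..cycles).map(|_| fan_should_run(source.read())).collect()
}
```

Since the application never parses raw bytes in this model, the parsing concern raised in the question moves out of the application and into whatever component produces the typed values.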
Matthias
00:45:22
But even if you didn't, it wouldn't matter because that's in the application space and that is inherently sandboxed.
Alex
00:45:29
Exactly. I mean, if the application tries to access anything outside its world of memory, it just faults. Of course, an application could generate some networking signals or CAN signals based on something that it receives. But then the application had the right to do that. An interesting thing that Tock provides, and inherently OxidOS also provides, is that we are able to digitally sign applications, or actually add credentials to applications. It doesn't matter if it's a digital signature or a hash; it's just a credential field. And we can check those credentials and have a firewall, as I call it, of what system calls an application is allowed to do. So, for instance, if you have an application that just needs to crunch some numbers but should never be able to access CAN, CAN being the communication bus, the credential that the application receives will forbid it from accessing CAN. Any system call to the CAN driver will just fail. The application won't even know that there is a CAN capability on the device, because the operating system will reject it immediately.
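As a rough illustration of that system call firewall, the sketch below filters calls by a per-application allow-list. The `Driver` enum, the `Credential` type, and the `NoDevice` error are assumptions; the real credential checking in Tock and OxidOS is more involved. Returning the same error that a missing driver would produce is one way to model "the application won't even know that there is a CAN capability".

```rust
// Hypothetical sketch of credential-based system call filtering.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Driver {
    Console,
    Can,
    Timer,
}

/// What a verified credential grants to one application.
pub struct Credential<'a> {
    allowed: &'a [Driver],
}

impl<'a> Credential<'a> {
    pub fn new(allowed: &'a [Driver]) -> Self {
        Credential { allowed }
    }
}

#[derive(Debug, PartialEq)]
pub enum SyscallError {
    /// Same answer the caller would get if the driver did not exist at all.
    NoDevice,
}

/// Kernel-side entry point: check the credential first, then dispatch.
/// The "driver" here just echoes its argument back.
pub fn command(cred: &Credential, driver: Driver, arg: u32) -> Result<u32, SyscallError> {
    if !cred.allowed.contains(&driver) {
        return Err(SyscallError::NoDevice);
    }
    Ok(arg)
}
```

A number-crunching application whose credential lists only `Console` and `Timer` gets `Err(SyscallError::NoDevice)` for every call to `Driver::Can`, regardless of what its code tries to do.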
Matthias
00:46:40
Right. Because that's another huge advantage there. If I understand you correctly, that means that you don't have to run all of this on real hardware to be able to test it. You can have your own little test environment. If you work against these traits, if you work against these capsules as you described them, that means you can test all your code even with unit tests locally and then deploy it once and be moderately sure that it will just work?
Alex
00:47:09
We actually do this. So we have an OxidOS simulator, which is publicly available in the cloud, where you can run OxidOS and real applications. That means the real binary of the application, compiled for ARM in this case, on top of OxidOS, which is compiled to Wasm. So OxidOS runs as a Wasm device; well, not a Wasm device, it's compiled to Wasm. And applications just run and do system calls. And incidentally, system calls in Tock and OxidOS are not function calls; they are real ABI calls. So they do real system calls with sending data in registers, which is perfect for core Wasm because that's what we can ship. So system calls are translated into really simple Wasm function calls. But you can test the exact same binary in a test environment. And you can do this today by accessing our platform.
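To see why a register-based system call ABI maps well onto plain integer function calls, consider this sketch. The register layout, the syscall number, and the "adder" driver are invented for the example and do not reflect Tock's actual ABI. What matters is that everything crossing the boundary is a handful of `u32` values, which is also the kind of signature a core Wasm function can have.

```rust
// Hypothetical sketch of a register-style system call boundary.

/// The argument registers as the kernel sees them on entry and exit.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Regs {
    pub r0: u32,
    pub r1: u32,
    pub r2: u32,
    pub r3: u32,
}

pub const SYSCALL_COMMAND: u32 = 2; // made-up numbering
pub const DRIVER_ADDER: u32 = 0x10; // made-up driver id
pub const STATUS_OK: u32 = 0;
pub const STATUS_FAIL: u32 = u32::MAX;

/// One function taking only integers: on hardware this would sit behind a
/// trap instruction, in a simulator it can be an ordinary function call.
pub fn syscall(number: u32, mut regs: Regs) -> Regs {
    match (number, regs.r0) {
        (SYSCALL_COMMAND, DRIVER_ADDER) => {
            // r2 and r3 carry the arguments, r1 carries the result.
            regs.r1 = regs.r2.wrapping_add(regs.r3);
            regs.r0 = STATUS_OK;
        }
        _ => regs.r0 = STATUS_FAIL,
    }
    regs
}
```

An unmodified application binary only ever fills registers and traps, so an interpreter that catches the trap can forward the register values to a function like `syscall` without the application noticing the difference.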
Matthias
00:48:04
Yeah, makes for a great demo. Just for some context, Wasm means WebAssembly. That's the runtime that gets built and can run not only on the web, and it's also not assembly. It's a format that is specified in the open and that allows you to have a very conservative set of inputs, mostly integers and floats, and a blob of memory that you can manipulate however you want. And it seems like TockOS maps very nicely into this model. And I guess others that came before TockOS that are used by current manufacturers, they don't really map to that model right away. It's probably very, very hard to run any existing system on WebAssembly.
Alex
00:48:56
They don't have the separation. So we were lucky in the sense that Tock and Oxid allowed us to compile the kernel in WebAssembly and then provide functions like networking or CAN in a simulated way, and then run the binary applications directly on top of it. We're just interpreting ARM assembly and making system calls. And this was fairly easy to do for us. And this is what you can see on our website. And we're hoping to have a nice day-long workshop at Oxidize about this.
Matthias
00:49:26
Amazing. Really looking forward to that. And can you run async code as well?
Alex
00:49:34
Actually, thank you for the question. Tock's kernel, and OxidOS as well, is fully asynchronous. So we are not allowed to do any synchronous action in the kernel. Every capsule and driver is a state machine. This makes the kernel more predictable. In user space, technically everything is asynchronous. Even in C, even in Rust, you'd have to do asynchronous things. We need to struggle to make it synchronous; I mean, make a call, wait for an answer, filter all the other answers out. The Rust library for Tock was initially asynchronous. Now we have one that is synchronous, and we at Oxid are trying to write a new library that uses async and await, and we're experimenting with whether we can use Embassy's API for that. Embassy is a library for embedded systems, and it seems that people are starting to use it, and we are looking into whether we could port Embassy as an application for OxidOS. Because the kernel actually helps us a lot in this.
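The "every driver is a state machine" idea, and the work needed to present it synchronously to an application, can be sketched like this. The `Sensor` type, its two-interrupt latency, and the constant reading of 42 are all invented; in a real system `on_interrupt` would be driven by hardware and the waiting loop would yield to the kernel instead of calling it directly.

```rust
// Hypothetical sketch: an asynchronous driver as a state machine,
// plus a synchronous wrapper of the kind user space has to build.

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum State {
    Idle,
    Busy { remaining: u8 },
    Done(u16),
}

#[derive(Debug, PartialEq)]
pub struct BusyError;

pub struct Sensor {
    state: State,
}

impl Sensor {
    pub fn new() -> Self {
        Sensor { state: State::Idle }
    }

    /// Starts a read and returns immediately; never blocks.
    pub fn start(&mut self) -> Result<(), BusyError> {
        if self.state != State::Idle {
            return Err(BusyError);
        }
        self.state = State::Busy { remaining: 2 };
        Ok(())
    }

    /// Advances the machine; stands in for a hardware interrupt.
    pub fn on_interrupt(&mut self) {
        self.state = match self.state {
            State::Busy { remaining } if remaining <= 1 => State::Done(42),
            State::Busy { remaining } => State::Busy { remaining: remaining - 1 },
            other => other,
        };
    }

    /// Collects a finished reading, if there is one, and resets to idle.
    pub fn take(&mut self) -> Option<u16> {
        if let State::Done(value) = self.state {
            self.state = State::Idle;
            Some(value)
        } else {
            None
        }
    }
}

/// Synchronous facade: start, then wait until the result shows up.
pub fn read_blocking(sensor: &mut Sensor) -> u16 {
    sensor.start().expect("sensor busy");
    loop {
        if let Some(value) = sensor.take() {
            return value;
        }
        sensor.on_interrupt(); // placeholder for yielding until the next event
    }
}
```

An `async`/`await` library would replace the loop in `read_blocking` with a future that is woken when the state reaches `Done`, which is the kind of job an executor such as Embassy's is designed to do.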
Matthias
00:50:47
If you don't use Embassy right now, does it mean you have your own executor at the moment?
Alex
00:50:52
The first Rust library had its own executor, but it wasn't great. The second one is fully synchronous for several reasons that were discussed in the Tock meetings.
Matthias
00:51:03
Okay. And what's so great about Embassy, other than maybe that it's a standard? Well, actually, that's probably the greatest benefit of it, to have something that works cross-platform, cross-application, cross-use case?
Alex
00:51:20
I'm not an Embassy expert. I just teach an embedded class at the university which uses Embassy. Why did we choose Embassy? Because it works. We use the Raspberry Pi Pico for this, and Embassy actually works on the Raspberry Pi Pico with Wi-Fi. So this is clearly the reason why we chose Embassy. When it comes to Embassy, I don't know if Embassy is standard, but it does provide the standard Rust Embedded async HAL. So the Rust Embedded group provided standard traits and Embassy just used them, which is great, because it actually doesn't matter if you use Embassy or some other framework, you will have the same API, more or less. That's why we are trying to port Embassy over. I mean, Embassy has a really nice asynchronous executor. It seems to work well and is of decent size. And we think it could be very useful in our applications. But more than that, I'm not an Embassy expert. So I'm just learning as I teach the class.
Matthias
00:52:23
When you teach Embassy to students in the class, do they get it? Do they struggle with this new environment? Or maybe to rephrase it, do they struggle more with Embassy and async execution, or with Rust and its syntax and the borrow checker?
Alex
00:52:42
I think none of them. They struggle with not being able to see the panic. So a problem that we have is that we have Raspberry Pi Picos, but we don't have a debugger for them. So everything goes through USB. And what Embassy is still not able to do is, if you get a panic, shoot it out over USB. So if students get a panic, the board doesn't work; maybe it blinks an LED. But shooting out the panic over USB is the problem that we are having with Embassy. Understanding Rust, yes, maybe in the first labs, but we did some intros and they have some boilerplate code. Understanding how to use the framework, that's really easy. I mean, the examples are fairly good, you have some tutorials, and the lab team did a great job in writing additional tutorials. Actually, the class is free. You can see it online. If anybody wants to use it, feel free to use it.
Matthias
00:53:41
Amazing. And is anyone working on this panic handler, a way to see the panic somewhere other than the blinking light?
Alex
00:53:51
No, none that I know of. We, as the team for the class and not OxidOS, are looking into this, because in Tock we can do this. We have a few chips that only have USB, and we can shoot out the panic over USB. And we are looking at whether we can contribute this back to Embassy to do the same thing, but it's very different from Tock, because it needs to touch Embassy's internals for this.
Matthias
00:54:19
And what other tools do you use other than Embassy and a Raspberry Pi? Do you use probe-rs in class?
Alex
00:54:27
Yes, of course, of course.
Matthias
00:54:30
Not everyone knows what probe-rs is. Maybe you can explain what it does and what it provides to you.
Alex
00:54:36
From a bird's-eye view, probe-rs is a really nice tool that allows you to flash embedded devices from a computer. And the really nice thing about it is that it knows a lot of boards. Technically, when you buy a board, you need to take a specific piece of software that would flash the board. With probe-rs, they somehow collected a lot of open source code and they have written a lot of open source code in Rust, which makes flashing a board really easy. It's like a Swiss army knife for connecting and flashing software to boards. And yes, we use probe-rs. We use it for flashing and for displaying the USB messages, the console messages.
Matthias
00:55:18
Nice. That means you have a very productive development environment, mostly, except for maybe this blinking panic handler. But how do students feel about this, and especially about Rust? Are they excited about the language? Can they even compare it with something else that existed before, like C, or do you start right away with Rust? How do they feel about this environment and about the language?
Alex
00:55:40
My students are a special case. They started in their second year of university, and during the first year they studied Java, only Java. So it was interesting, because it was easy to relate to: Java has interfaces, Rust has traits. They're not exactly the same thing, but they can relate to this. Java has classes, Rust has structs. Java has methods and Rust does have methods. So at least for me, from a teaching point of view, and for the students, it was easy to relate to Java. And then they understood this rather easily. If they like it or not, we will find out at the end of the class, because every student has to build a project. We'll see the projects and the feedback that we get from the students. But so far, we think it's okay. It's the first time that we do it. I mean, it's a completely brand new class.
Matthias
00:56:35
Once the students graduate and they are trying to look for jobs, what are you looking for when someone applies to OxidOS? What are the skills that are required to hire embedded Rust engineers?
Alex
00:56:47
I will say something that people might not really resonate with. I like students that are really good at making educated guesses. So I ask him something. I'm super happy if he doesn't know it, because it's something specific, but I'm really, really happy when he tries to figure out how to tackle the problem and makes some really good educated guesses. This means that the candidate uses his brain a lot and is able to understand complex problems and find an engineering solution. So I care less about the fact that he doesn't know the Rust language. Fair enough. If you know programming and understand the concepts, you will learn Rust in two days. And we've seen this before. So we have employees who knew nothing, literally nothing, about Rust when we employed them. And now they are super proficient in Rust and solving difficult problems, because it's not about Rust, it's about solving the problem.
Matthias
00:57:49
What are some great resources to learn more about embedded Rust development? You mentioned the course already, which is open. I hope that we can link that in the show notes. But is there anything else, any project ideas or any great resources that come to mind?
Alex
00:58:05
That's difficult. I think the ecosystem is lacking here. So I know the Rust embedded books that the Rust website has. I don't think they are really pedagogical. For me, it was hard to understand what they do. And I struggled with this. Good resources. I might not know every resource. So my problem is I don't know the resources, because I did my training myself. And I did my training myself a few years ago, reading the embedded books, contributing to Tock, and mostly seeing what others did in Tock. So asking other contributors, looking at what they did, reverse engineering. That's how I learned. So good resources, yes, start with the books. It's still better than a few years ago, because now they use the micro:bit. They used to have another obscure board that they were using for a demo, which at least for me was really difficult to get. Now they use the micro:bit, which is nice, and you can get it really easily, and it has a debugger, but they still need to work on how to do embedded. If I would recommend that somebody start with embedded, I would say read the Embassy tutorials. It works really easily, it's understandable, and at least it's a good starting point. But it's nowhere close to Arduino. I'm sorry, it's really nowhere close to Arduino. But we can, I mean, we can contribute to this ecosystem. That's why we made the class open source. I'm not saying we have a really good resource. I'm saying we have a resource, and we made it open source, and we gladly accept contributions and feedback. I mean, we would love to have contributions and feedback.
Matthias
00:59:52
A few things that come to mind from my side: I work very well in a project-based environment. So I take a challenge and then I try to fix it or solve the problem. So focusing more on the project itself and not necessarily on the language or environment or the libraries, that is one thing. The other thing is open source maintenance: contributing to open source can really help you get better at understanding how other people work and think. Where do you see Rust heading? What will be the future of Rust in 5 to 10 to 15 years?
Alex
01:00:32
I think it will do what Java and .NET did in '95 for the enterprise business. They completely wiped out C++ there. Some people still use COBOL, but those are isolated cases. But I think for embedded systems and for basic applications or command line applications, this will be the case. I don't think people will start new projects in C and C++. They will start them in Rust. It's going to take a few years. Still, it's going to take a few years, but you can already see sudo has been rewritten in Rust. The time server has been rewritten in Rust. And you see more and more projects that are being rewritten. So the ecosystem has exploded. It's moving nicely. And I think Rust, or, and this is my personal opinion, a language that applies the same principles as Rust, will take over the command line and embedded systems in the next 10 years. This is like Java and C#. They did Java, but C is probably used on a wider scale.
Matthias
01:01:39
What would that mean to the state of critical infrastructure if it was rewritten in Rust?
Alex
01:01:45
I think it's going to be at least safer from the undefined behavior point of view. I mean, a lot of critical infrastructure has been written in C and C++ for a good reason, don't get me wrong. They had no other choice. I mean, when they started writing that critical infrastructure, they had no choice. They had constrained systems. But I think it's going to become safer, because undefined behavior is less likely to occur. I mean, we can still have undefined behavior, and the compiler bug that we had like half a year ago proved that you can segfault a Rust program. And I think there are still ways to write code that will segfault, not the compiler, but the program written in Rust, because it happens. But at least on a general scale, it's less likely to happen. It's not that programmers are not good. Programmers are really good, but eventually you make a mistake. Eventually you forget to verify that a pointer is not null. Rust helps you with this. I think they would have written critical infrastructure in Java or C# if they could. But they couldn't.
Matthias
01:02:59
Yeah, because those environments were not supported by the runtime.
Alex
01:03:04
It was not possible. I mean, simply not possible. We also work with a software services company, and we have Rust in production. I have Rust in industrial systems, and I have Rust in kiosks, and it works really nicely.
Matthias
01:03:18
This is kind of a surprising thing to me, because I met you at Embedded World, and there were other companies there as well, which didn't really advertise Rust, but they used it. And for them, it was just a tool. And there were people that used it for their environments, for really solving customer problems. And that made me aware that we are making progress here, even in highly regulated environments. And also, people are generally excited about it, but they don't really see it as a language in and of itself, just a piece of the puzzle that fits in well and makes sense. It's the right tool for the job, and this was kind of eye-opening.
Alex
01:04:02
We shipped, three years ago, an industrial system that works today in industry. Before OxidOS, long before OxidOS, we shipped Rust software into the industry, and we chose Rust not because it was fancy. The client allowed us to rewrite the project because they liked it. And we said, we had a lot of problems in C, let's try it in Rust. We had one issue. Never heard from the client again. I mean, never heard from the client. No news is good news. I mean, it works. And that was more than three years ago when we shipped it.
Matthias
01:04:39
When you said "never heard from the client again", I wondered why that was the case. But yeah, I guess in this context...
Alex
01:04:45
No, no news is good news. I mean, it works. The system works, the client is happy, and they don't need patches to it because it works. It didn't fail.
Matthias
01:04:55
Yeah, I guess one of the biggest compliments for any programming language or tool is if it becomes boring to use, where you're not really worried that you will run into any huge issues or that you will need to have long patch cycles and so on. It just works. And from what you tell me, this is the experience that you had with this project.
Alex
01:05:16
Yes, it just worked. And imagine, it was the same team building both of them. On the first iteration, which was in C, we had a lot of problems, and not because the programmers were not good, but because of, like, a null pointer which misbehaved, or something that we didn't catch, or a concurrency problem, or a deadlock in a mutex, or something like that, which, yeah, Rust made easier. I mean, eventually, with the older project, we realized, because everybody in the team knew Rust but the client wanted C, that we had basically built Rust structures in C. And we said, hey, why are we doing this? Let's use Rust.
Matthias
01:05:54
Awesome. It has become sort of a tradition around here to ask this one final question. What would be your message to the Rust community? You can say literally anything, it can be technical or non-technical, but if you wanted to address the Rust community as a whole, what would you say? The stage is yours.
Alex
01:06:16
I would say use Rust if it is the right tool for the right job. Use it when it's really necessary. And don't try to use Rust where it's not necessary or where something else works better. That would be my message. And this will make Rust a success. If you're trying to push Rust everywhere, even if it doesn't fit really well, there will be a backlash. So for the community, the message will be, use it as the right tool.
Matthias
01:06:46
Awesome, and I guess that wraps it up pretty well. Unless you have any questions, we are done with our interview, and I thank you very much for taking the time. I hope to give OxidOS and TockOS a try in the future. You definitely put it on the map for me. And if anyone out there is listening, where can they learn more about OxidOS? How can they get started? What would be the first step? Also including Tock, of course.
Alex
01:07:21
For Tock, just go to the Tock repository and look at the Tock operating system. Start reading the documentation and pick a board, because Tock is really explicit about boards. Pick a board that you have and just run it on the board. That would be my message. This is the message that we send to everyone. For OxidOS: oxidos.io. Go to our developer platform, sign up, and you can use OxidOS in a simulated environment. Don't expect something super, super flashy. It's at the beginning. But we are trying to build the manuals and everything for OxidOS so you can see how to use it in automotive.
Matthias
01:07:57
Alex, thank you so much for all the insights and for being a guest.
Alex
01:08:02
Thank you for inviting me. Happy to talk to you and happy to talk about Rust.
Matthias
01:08:07
Thank you! Ciao
Alex
01:08:08
Bye
Matthias
01:08:11
Rust in Production is a podcast by corrode. It is hosted by me, Matthias Endler, and produced by Simon Brüggen. For show notes, transcripts, and to learn more about how we can help your company make the most of Rust, visit corrode.dev. Thanks for listening to Rust in Production.