Smoltcp: A small TCP/IP stack in Rust (github.com/m-labs)
260 points by adamnemecek on June 20, 2019 | 113 comments



This is a really good example of the kind of safety benefit you get when using Rust: this is a fairly low-level piece of software (not a kernel, but I'd say it's lower-level than at least 80% of what's usually written) and yet, only 76 lines of code are located inside unsafe blocks for the whole code-base. To be safe from memory vulnerabilities, all you need to do is audit these 76 lines. It can still be a tough job, and bugs may slip past the review, but it's a huge improvement over having to audit thousands of lines.


That's not quite right. To be safe from memory vulnerabilities, you need to understand how those 76 lines interact with the entire rest of the codebase. How hard this is depends on exactly what interfaces they expose and to where. If I remember correctly, this isn't just a theoretical issue: there was at least one memory safety issue in the Rust standard library that could only be spotted by considering the unsafe code, the safe code in that library, and how other safe code could make use of it at the same time.


> To be safe from memory vulnerabilities, you need to understand how those 76 lines interact with the entire rest of the codebase.

This is explained in more detail in the Rust documentation: https://doc.rust-lang.org/nomicon/working-with-unsafe.html

"[...] Because it relies on invariants of a struct field, this unsafe code does more than pollute a whole function: it pollutes a whole module. Generally, the only bullet-proof way to limit the scope of unsafe code is at the module boundary with privacy."

That is, it's not enough to read just the block of code marked "unsafe". You also have to consider which invariants that block of code depends on, and all the code which could affect those invariants. But it's often easy to constrain what can affect the invariant; in the example given on that page, the invariant is that the "capacity" field in the struct must exactly match the amount of memory allocated for the block stored in the "pointer" field (and the "length" field must not be greater than the "capacity" field). Since neither field is "pub", only code in the same module can modify either of them without using "unsafe" itself (with "unsafe", one could in theory use "transmute" to access even private fields).
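To make that concrete, here's a minimal sketch in the spirit of the Nomicon's Vec example (a hypothetical illustration, not std's actual Vec or this crate's code): the unsafe block in `push` is only sound because the fields it relies on are private to the module.

    pub struct MyVec {
        ptr: *mut u8, // invariant: points to an allocation of exactly `cap` bytes
        cap: usize,
        len: usize,   // invariant: len <= cap
    }

    impl MyVec {
        pub fn push(&mut self, byte: u8) {
            assert!(self.len < self.cap, "out of capacity");
            // Sound only while the invariants above hold; since `ptr`, `cap`
            // and `len` are private, only code in this module can break them.
            unsafe { self.ptr.add(self.len).write(byte) };
            self.len += 1;
        }
    }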


In other words, auditing one line of code within an unsafe{}-block in Rust, has about the same time-cost as auditing one line of a macro in a language that has them. You have to look at all the code that could be affected by the code, not just the code itself, because, like a macro, unsafe code “infects” other code with its semantics. (In macro code, that’s “the call sites”; in Rust’s unsafe code, that’s mostly code running concurrently to the executing code, given that you’re supposed to reintroduce a valid “safe” state before the end of the unsafe{} block.)

Though, keep in mind, this isn’t something especially costly about Rust. To properly analyze any low-level language without unsafe{} requires treating every LoC to this extended audit.


Rust modules using unsafe can be pretty self-contained and small.

For this particular library, most unsafe blocks are for calling glibc functions via FFI. They could probably use nix to avoid unsafe in FFI with libc.


> They could probably use nix to avoid unsafe in FFI with libc.

Though nix also just wraps the FFI unsafe blocks. I'm also very "meh" on nix -- it seemed like a neat project when I first started looking into using it but the interfaces they provide make it harder and less idiomatic to hook into libc.

For instance, all of the *Set interfaces are significantly more annoying to use than just a bitwise OR in C.


> Though nix also just wraps the FFI unsafe blocks.

So? How they provide safe APIs doesn't matter, what matters is that they are safe. If the C FFI API is always safe for all inputs, then wrapping it in an unsafe block is the right thing to do.
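For example (a minimal sketch assuming the libc crate as a dependency, not nix's actual code), a C function with no preconditions can soundly be exposed behind a safe wrapper:

    // getpid() cannot violate memory safety for any input, so wrapping
    // the unsafe FFI call in a safe function is the right thing to do.
    fn process_id() -> libc::pid_t {
        unsafe { libc::getpid() }
    }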

> but the interfaces they provide make it harder and less idiomatic to hook into libc.

The interfaces they provide are both safe and portable over pretty much all UNIX-like systems (MacOSX, Linux, FreeBSD, OpenBSD, NetBSD, DragonflyBSD, Solaris, ...). libc interfaces are platform specific.


I agree that unsafe { libc::foo() } is generally fine assuming the inputs are sane, but my point was more that nix is somewhat of a glorified unsafe wrapper that isn't always necessary.

But my issue with nix wasn't portability, it's that quite a few of the APIs they provide are more-frustrating-than-necessary.


Not the entire rest of the codebase, only stuff in the same module.


How does a module boundary help?


Modules are the privacy boundary in Rust. This means if you’ve got some invariant you depend on, that invariant is scoped to the module. https://doc.rust-lang.org/nomicon/working-with-unsafe.html


Right, that's how a module should be written. But the purpose of an audit is to find things that aren't as they should be. If you're using a module in the standard library, or another module that's widely used, then you can reasonably assume that any unsafe behavior is encapsulated. But that's more because you know the code quality is going to be at a certain level than because of the module boundary per se.


I think maybe we're splitting hairs here. The point is, the only way for something to be accessed outside of a module is through its boundary, so if something has gone wrong due to code outside, it must be because some code inside has been exposed incorrectly. That means the fault lies inside the boundary.


That is just to say that if an API that purports to be safe turns out not to be safe, then that's a problem with the API. The statement is true but tautological. Thus, I don't see how it could actually help to narrow down the source of, say, a segfault in a real-world debugging scenario.

You could establish the same convention for C++. Let's say that any C++ module that exposes an unsafe API is at fault for doing so. Great, now we can localize the "blame" for any given segfault to the module containing the code that actually dereferences the invalid pointer. Of course, the bug is just as easy or hard to fix as it would have been without this semantic convention. Maybe it's an easy one line fix. Maybe modifying the module to have a safe API would require a total rewrite of the rest of the codebase.


Because the root cause is smaller than “anything anywhere”, and you can control that scope. Rust gives you tools to deal with this. It is true that if you don’t use those tools, they can’t help you.

You can’t make a convention in C++ in the same way, because the safety aspect is not part of the language. I mean, you can, but it won’t help you the same way Rust will.


A more reasonable statement would be something like "Rust makes it easier than C++ to write modules that wrap unsafe operations behind safe interfaces". That's true, but it's not true that module boundaries somehow inherently contain the effects of unsafe code. It's pretty clear that they can't do this, given that the code all ends up being linked into one executable. Thus, makomk's original comment seems entirely correct to me.


In general, the framework for Rust unsafe code is that you're allowed to relax certain invariants in unsafe blocks (e.g. create additional mutable pointers to something already referenced), but you're still required to restore them when the block ends. If you have two mutable references to the same thing at the end of an unsafe block, you won't get an issue in the unsafe code, but it's still the unsafe code that's wrong by definition. This means that all you should have to do is inspect each unsafe block and ensure that the invariants hold at the end of it. (This of course assumes no bugs in the Rust compiler with regard to safety, but if you include compiler bugs, then even 100% safe code with no unsafe blocks could potentially have an issue.)


> If you have two mutable references to the same thing at the end of an unsafe block, you won't get an issue in the unsafe code

This is wrong AFAIK: having two mutable references to the same thing is undefined behavior even within an unsafe block. An unsafe block only allows extra functionality (like dereferencing raw pointers), it doesn't change the semantics of functionality that's already allowed outside an unsafe block. That is: what would be undefined behavior outside an unsafe block is still undefined behavior inside an unsafe block.
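A short sketch of what that means in practice (deliberately broken code, shown only to illustrate the point):

    let mut x = 0u32;
    let p = &mut x as *mut u32;
    unsafe {
        // `unsafe` lets you dereference the raw pointer, but it does not
        // make aliasing &mut references legal: this is still UB.
        let a = &mut *p;
        let b = &mut *p;
        *a += *b;
    }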


> safety benefit

Has "safety" become a buzzword? What we really strive for is correctness. Hypothetically, Rust could still be detrimental to software correctness compared to a more usual language, if some of its traits as a language (eg., complexity) encourage bugs. Sometimes it seems like Rust afficionados think that buffer overflows etc. are the main kind of bugs.

And if you need absolute correctness (when human life depends on the software) you would probably use Ada and Spark or something else that enables you to actually prove correctness.


> Has "safety" become a buzzword? What we really strive for is correctness.

Correctness is about the program doing what I meant as a programmer. Safety is a different thing. I see it as insulation on wires: it helps me avoid shocks while working with electricity. Rust's safety guarantees are like that: I have a pointer, and I can use that pointer without fear of a segmentation fault or a data race. It feels like safety, and it's named "safety", so I believe it's a good, coherent name.

> if you need absolute correctness (when human life depends on the software) you would probably use Ada and Spark or something else that enables you to actually prove correctness.

Maybe. But between absolutely buggy code and absolutely correct code there is a spectrum of not-so-buggy and not-so-correct code. We need a compromise that lets us create reasonably reliable software with reasonable effort.


> Sometimes it seems like Rust afficionados think that buffer overflows etc. are the main kind of bugs.

Have you looked at the CVE stats of the last decade recently? Memory errors make up around 3/4 of them. Even if Rust made the remaining quarter harder to get right (which in my experience it doesn't), it could still be worth it for many applications where you cannot afford full verification but want to avoid your users being pwned.


They were talking about bugs and correctness generally. CVEs are an extremely biased population of such things, and most types of bugs and incorrectness will never show up in a CVE.

Keeping planes from crashing and bank accounts correct matter too. Rust solves a subset of memory safety problems but it is not a programming language for high assurance applications and in that context enables many other types of unsafe behavior.


How many programs have you worked on that were high assurance programs?


Core infrastructure software like database kernels and protocol stacks should be implemented at least in part to high assurance standards. I've verified parts of database engines and other critical code many times with good effect, finding subtle bugs we never would have discovered in the wild and with (as expected) no defects discovered later.

Most systems programming languages don't make it simple and many people don't do it but it is definitely valuable and worth doing when the economics make sense.


So you end up stuck with languages like Ada, where the language can prove the correctness of your code (or rather, that your code follows the specification)?


Currently modern C++ plus a ton of specialized tooling that covers much of the ground of Ada, just not built into the language. It is a balancing act to keep development from becoming unwieldy and the 80/20 rule applies. Code that is easy to verify also tends to be slow, and that is not a tradeoff you can make for some applications. No one is running something as complex as a database kernel through an end-to-end theorem prover. Design verification scales well (model checkers and similar), implementation not so much due to practical limits on what you can prove and accumulated complexity/bugs in the specification, and verification of code generation is very limited (I use the LLVM stack). Nonetheless, this gets you to a very low defect rate and it isn't like this code is being written from scratch every time.

Even with a fully verified toolchain there will still be bugs. I once had a customer find a rare bug in a database engine that was ultimately caused by slight differences in behavior between microarchitectures running the same binary.


Seems like I recently read that the Ada folks might want to borrow some concepts from Rust. To me that says both languages are working toward similar goals.


> it is definitely valuable and worth doing when the economics make sense.

Tautologically so.


I would bet 99% of that 3/4 (or at least the part written in C) could have been caught with testing, fuzzing, and sanitizers we have today.


The latest JetBrains survey shows, once again, how little most devs care about them.

"Which of the following tools do you or your team use for guideline enforcement or other code quality/analysis?"

https://www.jetbrains.com/lp/devecosystem-2019/cpp/


Too bad neither Microsoft, Google, nor Apple ever heard of those tools…


Or even Linux, which, with its strict patch process, still managed to have 68% of its 2018 kernel CVEs related to memory corruption issues stemming from the use of C.


Is there a stat on what percentage of CVEs directly pertains to Microsoft, Google, or Apple products?


You could easily calculate such a stat from here. Microsoft, Apple, and Google are all in the top 5, with IBM and Oracle being the other two. Adobe used to be on top, but with the death of Flash they have been slipping. I checked out the breakdown of memory corruption/overflow bugs and it's well over 50% of CVEs for MS and Apple. Google is much better, with less than a quarter of their CVEs being memory related.

https://www.cvedetails.com/top-50-vendors.php

Microsoft 6508/116769 = 5.5%

Apple 4502/116769 = 3.8%

Google 4110/116769 = 3.5%

total (6508 + 4502 + 4110)/116769 = 13%


Microsoft's own research on this area claims it's closer to 70%: https://www.zdnet.com/article/microsoft-70-percent-of-all-se...


You're not talking about the same thing: the GGP asked how many CVEs pertain to Apple, Microsoft or Google products (a question that doesn't make much sense here, but the GP still went through the calculation). You are talking about what proportion of those big corps' CVEs are memory-related (which is the right question to ask in this context, but not the one asked…).


There's a lot of value in those tools, but it's better to entirely eliminate the family of bugs 'by construction'.


The claim that Rust’s complexity is a detriment to correctness is an awkward argument.

The complexity you’re probably referring to is the type system and/or borrow checker. The type system is strong enough that, if you take the time, correctness can be encoded into the types. Meaning, you can build types that make it a compilation error if the software is not “correct”. At that point even refactors can be more assured to meet those encoded correctness guarantees. So I’d say that the type system helps develop software that is provably correct.
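For instance (a toy sketch with hypothetical types, not from any real crate), you can make an illegal state transition a compile-time error:

    use std::marker::PhantomData;

    struct Closed;
    struct Established;

    struct Conn<State> { _state: PhantomData<State> }

    impl Conn<Closed> {
        fn new() -> Self { Conn { _state: PhantomData } }
        fn handshake(self) -> Conn<Established> { Conn { _state: PhantomData } }
    }

    impl Conn<Established> {
        fn send(&self, _data: &[u8]) { /* ... */ }
    }

    // Conn::new().send(b"hi");             // compile error: Conn<Closed> has no `send`
    // Conn::new().handshake().send(b"hi"); // OK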

Then there’s the borrow checker, and that’s about runtime memory access correctness, again increasing correctness.

Point being, Rust programs because of the “complexity” can get much closer to correct than other languages in its space.


Would the full phrase "memory safety/thread safety" have been something more gentle to your ears? Because memory safety has been a metric used to gauge software for as long as I've been alive and then some. If a program I write is correct on its face, but subject to issues caused by asynchronous reads/writes (perhaps completely unexpectedly), then it is not memory safe and will still cause problems. Whether you consider this a separate quality than correctness, or as a subset of correctness, it's still a useful metric to employ.

In my experience, lack of memory safety is one of the largest sources of non-trivial bugs I experience. Sure, if you are writing truly mission critical software for planes or pacemakers, you are going to need stronger (and different) guarantees. But Rust's goal is not provable correctness in the Coq sense. It's to provide a productive interface for designing efficient and memory safe/thread safe programs.


(Memory) safety and correctness are orthogonal issues. If I'm building a missile's aiming system, I absolutely want correctness. But if I run a web service on the cloud I care more about an attacker taking control of my machines than about another kind of bugs. Same if I deploy a fleet of connected devices to thousands of consumers' houses.

Computers are all around us, and they are a security nightmare, less because of their lack of correctness than because of their lack of safety. That's why Rust is important.

Formal proofs and tools like Ada are important too in their own domain, and Rust isn't competing with them (yet, at least; I know there are some people who would like to develop something akin to Spark for Rust, adding best-in-class correctness tools to Rust's toolkit).


No, they're different things. Correctness is nice, but it's much more important to me that my software doesn't let some bad person take over my computer than it is that my software works perfectly all the time. If I can get both at a reasonable cost, I'll take both, of course. But I'll settle for safety.


In something like a TCP/IP stack correctness is strongly related to safety. For example, it could be perfectly memory safe, but deliver packets to the wrong address, allow other programs to read all traffic, or allow easy denial-of-service attacks.


I realise we didn't define "safe" before we started, but I didn't just mean memory safety. Agreed that all of your examples would be incorrect, I'd just also term all of them as also unsafe.

An example of something that's incorrect but not unsafe would be, say, an error which would occasionally corrupt random TCP packets causing checksums to fail and the packets to be retransmitted. It's not working how it should, but it's not compromising your system's security or your data's safety (at least I don't think it is).


Just use C or C++ with testing, fuzzing, sanitizers. It will probably be done before an equivalent Rust project compiles.


I think the reason people are excited about improved safety is that in practice, it's really hard to be 100% sure that programs are correct in most circumstances, but safety at least gives you some insurance that if the program isn't correct, the types of failures you can get are more limited.


Modern Rust aficionados repeat the same fallacy spread by Java programmers in the 90s. Just because the language has a type system that is safer than C, it doesn't mean that they can automatically write "safe" code. Whether a piece of code is safe depends more on its design and on the ability of the implementation team than on the language in which it is implemented. Nowadays we have a big pile of Java code that has given us all the vulnerabilities found on the web in the last 20 years. It probably wouldn't be different if these were written in Rust.


How are web vulnerabilities related to Java?

There used to be a lot of vulnerabilities in the Java plugin for browsers, but that has nothing to do with Java as a language, it accounts for only a fraction of the security bugs that affected the web over the last decades (and the plugin is also pretty dead now).

The majority (by far) of those bugs were in fact memory issues[1], which would have been avoided from the beginning if the browsers (or the plugins, like Flash) had been written in Java (or Rust).

[1]: https://hacks.mozilla.org/2019/02/rewriting-a-browser-compon...


> And if you need absolute correctness (when human life depends on the software) you would probably use Ada and Spark or something else that enables you to actually prove correctness.

That's not even absolutely correct, because provability depends on correct preconditions, and there is a universe of incorrect preconditions that can throw a wrench into your provable system.

Your completely 'proven' system fails, when there's a short in jump resistor J35 on the motherboard, which invalidates the entire underpinning of the software system you assume to have.

For critical systems, you should strive for resiliency and failovers, not provability.


In Spark2014, you could at least prove the absence of runtime errors and, in the near future, memory safety, without having to specify them, since they're part of the semantics of the language. That's not bad for a start.


What exactly are your suggested alternatives to Rust in this sort of niche? C/C++? Both of which have worse type systems, worse tooling, more convoluted edge cases, more hidden complexity, etc.


For the curious, there is an example Spark2014 tcp/ip stack at https://github.com/AdaCore/spark2014/blob/master/testsuite/g... (see README). Not clear how much of it they proved and at which level... And the README says it's not thread-safe (but then Spark2014 should get rust-like protections soon).


I probably shouldn’t let it deter me from learning the language but what you’ve said exactly describes the feeling I got when a coworker was telling me about his experience with it.

Apparently there’s a way to have code execute at compile time (as an alternative to preprocessor macros I guess) and there’s some API in the compiler that either lets you change the grammar or interact with the AST in some way. This guy was using it to add something to the syntax (I think) to make something easier. It sounded awesome (this guy was always doing some of the neatest stuff for fun) but my immediate thought was “now you can’t really be sure other parts of the code aren’t doing this and that severely limits the assumptions you can make about code you’re interacting with.”

I probably should play with it anyway since they certainly have some neat ideas but some of the complexity the language allows sounds really bad for larger projects.


Is it not possible to write a 100% safe TCP/IP stack in Rust? Why is there a need for 76 lines of unsafe code?


> Why is there a need for 76 lines of unsafe code?

I took a quick look, and all of them seem to be for calling functions from the C library. Moreover, nearly all of these seem to be system calls (open, close, select, ioctl...). Since the Rust compiler can't verify that these calls are safe (for an obvious example: the "read" call can write to arbitrary memory), it requires an "unsafe" block for them - signalling that it's the programmer's responsibility to make sure it's safe (in the "read" example, receiving a "&mut [u8]" from outside the "unsafe" block, and passing that slice's pointer and length as the parameters to the "read" call, would be safe).
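A sketch of that "read" example (illustration only, not this crate's code; it assumes the libc crate as a dependency):

    use std::io;

    // The slice's pointer and length bound what the kernel may write,
    // so safe callers cannot misuse the unsafe block.
    fn read_fd(fd: i32, buf: &mut [u8]) -> io::Result<usize> {
        let n = unsafe {
            libc::read(fd, buf.as_mut_ptr() as *mut libc::c_void, buf.len())
        };
        if n < 0 { Err(io::Error::last_os_error()) } else { Ok(n as usize) }
    }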


The unsafe code uses libc API. All FFI functions are unsafe because they do not have compiler-enforced memory safety and other invariants (by definition of not being written in Rust themselves). It is the onus of the caller to read their docs and ensure that their Rust calling code does maintain those invariants before exiting the unsafe block.

The crate also has a nice design decision to have a dedicated module for using these FFI functions, and unsafe is only allowed in this module. The rest of the crate errors if it has any unsafe code.
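That pattern looks roughly like this (hypothetical module name, not necessarily the crate's exact layout):

    // At the crate root: no `unsafe` anywhere...
    #![deny(unsafe_code)]

    // ...except in the one module dedicated to FFI.
    #[allow(unsafe_code)]
    mod sys {
        pub fn close(fd: i32) -> i32 {
            unsafe { libc::close(fd) }
        }
    }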


I'm not convinced that "memory vulnerabilities" are more than a small part of correctness. Nothing says that you followed the details of a complex spec properly, for example. Or that memory use is bounded. Or that allocated memory isn't spread over disparate cache lines, with different results each run. Or that the architecture isn't open to DDoS opportunities.

It's "safe" at the lowest level. No information after that.


> I'm not convinced that "memory vulnerabilities" are more than a small part of correctness.

This is not at all the point.

So maybe your IP stack has a bug, but at least you won't have a remotely exploitable vulnerability. At least 3/4 of all CVEs are due to memory safety issues. It's 2019; why does this need pointing out?


Alternatively, you can use separate tools to verify that those 76 lines are correct, or at least that they use the unsafe calls safely.


Yes, like `miri`, which is awesome.


Is the amount of code in unsafe blocks really a good metric for safety?


Not the amount within the unsafe blocks itself, no. A more pertinent metric might be "number of lines of code in any modules that contain the `unsafe` keyword". It's true that certain operations can only happen within the unsafe block itself, but those operations can potentially invalidate safety-critical invariants being upheld by any code that the unsafe block can reach, which means the buck ultimately stops at the module boundary. If you have a module that contains unsafe code, factor out the bits that need `unsafe` into as small a submodule as possible with a well-considered interface, so that you have as little code to audit as possible.

The fortunate thing is that, in practice, you can program in Rust for years without ever needing to write an `unsafe` keyword, depending on your chosen domain. `unsafe` is mostly useful for calling external C code, for implementing maximally-efficient custom data structures, and interfacing with hardware.


Like many metrics, if you're not measuring it, yes.

It's easy to game, but with reasonably written code the number of lines in unsafe blocks should correlate fairly strongly with the number of times you did something scary. (Scary here includes things as basic as calling a C function or system call directly.)


I have a distant memory of Alan Cox saying something back in the day along the lines of "there are so many things left undocumented in the TCP/IP RFCs that if you just implement it straight from the spec your stack won't work at all."

But I can't find any reference to this now. Does anyone else remember this?

Is this true? Can you interoperate with the Internet by just implementing the spec?


No, it's not true. You can implement something that can load a webpage from 99.9% of the world's web servers from a couple of RFCs in a day. But this ease of implementation means that for basically any issue where the specs give any degree of freedom, somebody will have deviated from the norm. (I've implemented TCP from scratch three times, one of those implementations basically ran the traffic of entire countries.)

The real problem is that every step toward 100% gets progressively harder to find and debug. See e.g. [0] for a middle box that would mangle connections if it received two identical SYN packets, or [1] for a way in which almost all servers anyone currently runs are accidentally resilient to a certain kind of connection-killing packet corruption, but S3 isn't.

[0] https://www.snellman.net/blog/archive/2014-11-11-tcp-is-hard... [1] https://www.snellman.net/blog/archive/2017-07-20-s3-mystery/


> You can implement something that can load a webpage from 99.9% of the world's web servers from a couple of RFCs in a day.

As an alternative perspective... while the RFCs are great, it's taken me weeks (maybe months now) to hack together a not-yet-functional TCP/IP stack. I've never dug below the application layer before. I'm still working through it. I cannot even say if the RFCs are sufficient, but I'll take your word.


TCP/IP is a relatively stable ecology in which the stacks in routers, applications, etc. have evolved to survive alongside each other and intentionally malicious actors by developing coping mechanisms. Introducing a clean implementation of a TCP/IP stack is like introducing a new species. It has no immune system against others, and others don't know its 'signature behavior' either.

You can probably get interoperability at least part of the time. You may have deteriorated throughput and more broken connections with some stacks and not with others. But you have introduced a new species. If your brand new stack has a tiny bug or a new kind of misconfiguration and you start spreading it fast, all hell can break loose and you may ruin the day of many people before they find you and cut you off.


I designed smoltcp (and wrote most of the code currently in it). The original TCP/IP RFC (RFC 793) contains several ambiguous requirements, and as a result they do not specify a well-defined system. There are also some outright incorrect statements. There are a few follow up RFCs (e.g. RFC 1122) that clarify these issues, and there are more RFCs (e.g. RFC 7414) that describe the TCP features that you should avoid using.

By using this collection of TCP/IP RFCs that grew over the years, it is indeed possible to implement a stack from first principles and have it interoperate with other existing stacks without much trouble. (At least so long as you don't put the same bugs in your test suite as you do in your stack... which you will.)

However, being able to transmit some bytes reliably, and having a high-performance stack that works well in real world conditions are different. You might be able to do the former from RFCs, but the latter absolutely requires a nontrivial amount of tribal knowledge that you have to collect crumb by crumb, and often quite painfully, too.

Smoltcp is somewhere halfway between. It's pretty reliable, but I am sure there is much to be improved in its operation in adverse conditions, with obscure peers, and so on.


> Is this true? Can you interoperate with the Internet by just implementing the spec?

I've never implemented the entire stack. Few people have. But from the protocols I've implemented, that's only a slight exaggeration.

As soon as a popular implementation has a bug or decides not to behave according to spec, everyone has to adapt. You can typically use the spec to get 95% of the way to full interoperability.


And presumably 95% of the work is in the final 5%?


I don't think it's quite true of TCP/IP that you wouldn't work at all, although you do have to be careful, because there are a lot of RFCs, and it's not always clear which ones are important.

Also, one of the RFCs has wrong functions for calculating or adjusting checksums. I think there's also some convention on TCP option ordering that may be important but not well documented.
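On the checksum point, the baseline computation itself is short; a sketch of the RFC 1071-style one's-complement sum (not this project's code):

    fn internet_checksum(data: &[u8]) -> u16 {
        let mut sum: u32 = 0;
        for chunk in data.chunks(2) {
            // Treat the data as big-endian 16-bit words, padding the
            // last odd byte with a zero.
            let word = if chunk.len() == 2 {
                u16::from_be_bytes([chunk[0], chunk[1]]) as u32
            } else {
                (chunk[0] as u32) << 8
            };
            sum += word;
        }
        // Fold the carries back into the low 16 bits.
        while sum > 0xffff {
            sum = (sum & 0xffff) + (sum >> 16);
        }
        !(sum as u16)
    }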

Either way, I would keep a couple of other implementations close -- if not to peek at their code occasionally, at least to inspect their output.


> Its design anti-goals include complicated compile-time computations, such as macro or type tricks, even at cost of performance degradation.

Why? That sounds more like an ideological decision than a pragmatic, engineering-driven one. Especially for a TCP/IP stack, where performance is typically a major concern, be it in a desktop, server, or embedded environment.


At the time when I started working on smoltcp, there were a few Rust libraries for working with TCP/IP on the wire level, and they heavily used metaprogramming. Unfortunately, the task of implementing a generic binary protocol serializer/deserializer that can handle TCP/IP is not small, and as far as I could tell, it overtook implementing anything beyond that.

So I made the decision to do the simplest possible thing: write the packet serializers/deserializers entirely by hand. It took very little time and adding any new features was easy and predictable. I believe it was the right decision as it allowed me to focus on the hard parts of a TCP/IP stack.
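In practice that means accessors written directly over a byte slice, something like this (illustrative only, not smoltcp's actual types):

    struct UdpPacket<'a> { buf: &'a [u8] }

    impl<'a> UdpPacket<'a> {
        fn new(buf: &'a [u8]) -> Option<Self> {
            if buf.len() < 8 { return None; } // UDP header is 8 bytes
            Some(UdpPacket { buf })
        }
        // Each field is read at a fixed offset, in network byte order.
        fn src_port(&self) -> u16 { u16::from_be_bytes([self.buf[0], self.buf[1]]) }
        fn dst_port(&self) -> u16 { u16::from_be_bytes([self.buf[2], self.buf[3]]) }
        fn len(&self) -> u16 { u16::from_be_bytes([self.buf[4], self.buf[5]]) }
    }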


Thank you for answering. That sounds sensible, I guess.


Most of the performance of TCP stacks has more to do with the order that comparisons are made for incoming packets (sometimes called fast path -- check for and handle normal packets first), locking of data structures, and congestion strategies (including retransmit behavior, SACK, etc). Macros or typing is less likely to make that faster.

For a new stack, though, correctness and readability are more important than performance.


It can be an engineering decision: write readable, easy to understand and reason about code.


Somewhat relatedly, there exists a Python-like language for when you only have a few kb of memory, and can’t even run MicroPython. It’s called “Snek”:

https://keithp.com/snek/


You might not care for low level TCP/IP details but this project will make for a great learning experience for anyone wanting to dig deeper into Rust and network programming.

It works with tun/tap interfaces and there's a tcpdump clone in the example using raw sockets that works on any regular Linux network interfaces.

Pretty cool!


Here are memory-safe network services on Linux with smoltcp and an optional switch for running multiple userspace network stacks:

https://github.com/ANLAB-KAIST/usnet_sockets → Socket library for Rust using smoltcp as userspace network stack (provides types compatible with the standard library, tokio is unpublished WIP)

https://github.com/ANLAB-KAIST/usnetd → Memory-safe L4 Switch for Userspace Network Stacks (firewalls the kernel network stack, see Ideas section for alternatives, e.g., transparently piping the kernel network packets through smoltcp)


An educational question: if this is a TCP/IP implementation, where is it run? On a router? On a network interface controller? Both?


> It is used in embedded systems such as the ARTIQ core device and ionpak. [1]

and

> ARTIQ (Advanced Real-Time Infrastructure for Quantum physics) is a leading-edge control system for quantum information experiments. [2]

[1] - https://m-labs.hk/smoltcp.html

[2] - https://m-labs.hk/artiq/index.html


The org developing smoltcp makes embedded systems, so the point of smoltcp is probably to use it there.

smoltcp is also used by Redox, an OS written from scratch in Rust.


"smoltcp is a standalone, event-driven TCP/IP stack that is designed for bare-metal, real-time systems."

My understanding based on that description is that it is meant for applications that run directly on the hardware, without an OS in the middle. I'm thinking embedded applications.

So I'm thinking that this is meant for IoT-style appliances and the like. Maybe I'm wrong :)


I'm using it in my toy OS as my TCP/IP implementation. It's meant to be run in a wide variety of contexts, from embedded IoT-style appliances to userspace Linux. In fact, it has instructions for Hosted Usage[0] in the README.

[0]: https://github.com/m-labs/smoltcp#hosted-usage-examples


I think IoT devices run a lot more OS than you're expecting. Even the wifi chip on one might have a real OS.

This could be useful for a userspace TCP stack; sometimes it's better to do the whole thing in your process than the kernel.


Ah so, it can be something like a "network platform" on my (let's say) smart home thermostat? :D


Exactly, it can be the bit that allows your smart-home thermostat to talk to the app on your phone (or the cloud).


I think it would be run on a network interface. Isn't this, or an equivalent implementation, what comes packaged with every OS so that you can connect to a network?

I may be wrong here and others are more than welcome to correct me.

EDIT: Added "or an equivalent"


This is definitely not the implementation that comes packaged with every OS. Every OS has its own TCP/IP implementation that usually lives in the kernel - though most are derived from BSD's TCP/IP stack.

SmolTCP could (in theory) replace the implementation packaged with an OS, or even be used completely from userspace by taking over the raw network interface.


Apologies.

I didn't mean that this exact implementation is shipped with every OS. I meant an equivalent of it being shipped with OSes.


> Congestion control is not implemented.

Given that, don't bother with it for now. That's a toy. It would be dangerous on any real network. If you doubt it, please read up on "congestion collapse".


Not sure why you are being down-voted. Congestion control is pretty central to how TCP/IP works. Deploying without it will basically DoS your network.


Any idea what would it take to write this without using _any_ C library function?


It doesn't use any C library functions by itself. (In fact in all of its applications at M-Labs, there is no C code running on the same chip.)

The only uses of libc are in the TUN/TAP driver, which is necessary if you want to bind it to a virtual OS interface.


I haven't read the code, but probably very little, since libc has Rust implementations that you could plug in in its place.


Can anyone point out any resources to accomplish this in the JVM space?


A zero heap allocation _anything_ in the JVM? Probably impossible


Whoever downvoted my question can explain why? or it was just for fun?


I didn’t downvote you but running a TCP/IP stack inside the JVM doesn’t seem terribly useful. Doing user space TCP/IP can be useful for performance reasons but running on top of JVM is likely to wipe out that advantage.


Lots of warnings on `cargo build`, undefined behavior in a couple of places (e.g. creating &mut to uninitialized memory), raw calls to libc as opposed to using a safe wrapper over it like nix, ...


What a poor choice for naming. Please don't use stupid internet memes for naming, it makes all of us look silly in the eyes of non-techies.


Are you expecting non-techies to be paying attention to an embedded TCP/IP stack, and also worrying that they think you are having too much fun?



Naming your product after dumb internet memes is a great way to make me not want to use them.

Don’t get me wrong, from first look it seems like a perfectly fine product, but the name is atrocious.


This is dumb. Many products have dumb names, we just got accustomed to them, e.g. git. If you are hating only on this specific meme, I present to you SMOL-V, the leading (and only?) SPIR-V compressor. If you aren't going to use good things because of their naming, then you deserve to only use bad things.


Comments like these make me not want to read them.

Don’t get me wrong, from first look it seems like a perfectly fine comment, but the attitude is atrocious.


Evaluating others projects with Gordon Ramsay’s attitude is common on HN. These folks must assume that they’re “telling it like it is”.


It's not a product, and if you don't want to use it - don't :)


For the uninitiated, which meme is this ?


It's part of DoggoLingo[0], a meme-filled Internet lexicon (or... something). It's all in good fun, and I don't understand why anybody could be upset by it. Sure, he could've called this "SmallTCP" and been bland, but decided to inject some (silly) humour. Plenty of other projects are similarly named.[1]

It reminds me of the words "cromulent"[2] and "embiggen"[3]; coined by The Simpsons but used in everything from BBC articles to scientific research.

[0]: https://en.wikipedia.org/wiki/DoggoLingo

[1]: https://github.com/aras-p/smol-v

[2]: https://en.wiktionary.org/wiki/cromulent

[3]: https://en.wiktionary.org/wiki/embiggen


While we're at it, the arrangement of spikes on a Stegosaurus' tail is called a thagomizer, after a Far Side cartoon. And it's actually referred to as such by real live paleontologists. https://en.wikipedia.org/wiki/Thagomizer#Etymology


> It reminds me of the words "cromulent"[2] and "embiggen"[3]; coined by The Simpsons but used in everything from BBC articles to scientific research.

Popularized by The Simpsons; at least "cromulent" was a valid English word before the episode. I don't think "embiggen" was.


I don't think that's true. I can find no reference to "cromulent" existing pre-Simpsons, and every source explicitly says it was coined by David Cohen for an episode.

From Wiktionary: "A humorous, intentionally morphologically opaque neologism coined by American television writer David X. Cohen for 'Lisa the Iconoclast', a 1996 episode of the animated sitcom The Simpsons."

From Merriam-Webster: "It is safe to say that The Simpsons has contributed a great deal to the English language. One famous example is cromulent, which was coined specifically for the 1996 episode 'Lisa the Iconoclast'."






