This is a discussion that has happened here before and has generated its share of sparks, but I'll risk it :-). It is hardly a problem in the context of a userspace TCP/IP stack (I don't think Linux runs on any machine with non-8-bit bytes); it's just a good illustration of the kind of pitfalls that programming at this level sometimes poses.
I would suggest being a little more careful with this:
> unsigned char dmac[6];
because there are several platforms, all of them high-performance, non-legacy processors, where unsigned char is not 8 bits :-). E.g. on AD's SHARC, it's 32. You're declaring the struct with __attribute__((packed)), so I assume you're going to want to fill it automatically through DMA at some point in the future. On such a platform, it won't have the expected results. A colleague got bit by this, on a device that's sitting on my desk right now, and it's not a vintage piece of equipment from the 60s.
I would suggest using uint8_t instead, just like you're using uint16_t a few lines below.
It is impossible for unsigned char to be 32 bits and for uint8_t to exist on a conformant C/C++ implementation. The sizeof char / unsigned char is 1 ("byte" is defined this way). So if 1 byte is more than 8 bits, there cannot be a uint8_t type, since that would make sizeof uint8_t less than 1.
Actually, combined with the requirement for char to have at least 8 bits, it follows that uint8_t can only possibly exist if char has 8 bits.
Edit: quoting C99, to show that uint8_t must not have padding bits (an assumption in this reasoning). The requirement for uintN_t to have no padding seems omitted in the second paragraph (I assume by mistake), but the third one suggests it really should have no padding.
> The typedef name intN_t designates a signed integer type with width N, no padding bits, and a two’s complement representation. Thus, int8_t denotes a signed integer type with a width of exactly 8 bits.

> The typedef name uintN_t designates an unsigned integer type with width N. Thus, uint24_t denotes an unsigned integer type with a width of exactly 24 bits.

> These types are optional. However, if an implementation provides integer types with widths of 8, 16, 32, or 64 bits, no padding bits, and (for the signed types) that have a two’s complement representation, it shall define the corresponding typedef names.
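A cheap way to make that assumption explicit is a compile-time check, in the same _Static_assert spirit as the code further down the thread (a minimal sketch):

#include <limits.h>

/* Fail the build outright on platforms whose bytes are not 8 bits wide,
   instead of silently mis-parsing packets there. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");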
I don't know if you can have a conformant implementation, but to be honest, I never really thought about it. Non-conformant (subtly or not-so-subtly) compilers are something that I have come to accept with resignation. I may not like them, but software has to be written, and it's not going to wait for another team of developers to get their compiler right.
At the moment, each one of the two codebases I'm maintaining has about a dozen workarounds for compiler bugs. One of the compilers is already at its 6th major version.
It's things like these that bring a smile to my face whenever I see people valiantly calling for rewriting everything in Rust :-).
In my experience, these platforms with non-8-bit bytes or non-byte-addressable memory are things like DSPs and other niche processors, and if you want to port/write software for them, there are plenty of other quirks besides the size of a char.
Implementing ARP is an awesome introduction to networking and low-level packet processing. This was often the first task I would give out during training for Real Time / Embedded programmers.
A few things I'm missing for this to be a more complete ARP implementation:
1. Answer IP-to-MAC conversion requests from the stack above
2. Send an ARP request if the MAC is missing (+ retries)
3. ARP entry aging (a rough sketch follows below). Otherwise you'll run really quickly into "ERR: No free space in ARP translation table\n"
(And of course a much larger cache than 32 entries)
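As an illustration of point 3, entry aging can be as simple as stamping each entry and sweeping the table periodically. This is only a sketch under assumed names (arp_cache, ARP_ENTRY_TTL and friends are hypothetical, not from the article):

#include <stdint.h>
#include <string.h>
#include <time.h>

#define ARP_CACHE_LEN  32
#define ARP_ENTRY_TTL  60           /* seconds before an entry goes stale */

struct arp_entry {
    uint32_t ip;                    /* network byte order */
    uint8_t  mac[6];
    time_t   last_seen;             /* 0 means the slot is free */
};

static struct arp_entry arp_cache[ARP_CACHE_LEN];

/* Drop entries that have not been refreshed within ARP_ENTRY_TTL, freeing
   slots instead of failing with "No free space in ARP translation table". */
static void arp_cache_age(void)
{
    time_t now = time(NULL);

    for (int i = 0; i < ARP_CACHE_LEN; i++) {
        if (arp_cache[i].last_seen != 0 &&
            now - arp_cache[i].last_seen > ARP_ENTRY_TTL) {
            memset(&arp_cache[i], 0, sizeof(arp_cache[i]));
        }
    }
}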
I also did ICMP OS detection about 5 years before it became public. In those days, I used to run home from school to see which militaries had visited my page that day. They never used to hide their IP then. You'd get US Pacific Command, Korean and Japanese militaries, etc. I felt like I had a front row seat at the militarization of the internet. Unfortunately archive.org never indexed the page I announced it on, linked off this one: https://web.archive.org/web/19991128100455/http://pho.2600.o... .. but an early version of the list of interesting visitors was archived: https://web.archive.org/web/19991128111635/http://pho.2600.o...
Is there some kind of VM I could play with this on? Happy to go to something like Buildroot, but I've never thought of arp as a daemon; it's always just there.
This reminds me of an old book¹ on Open Transport²³⁴ I started reading for fun a few years ago (what I found particularly interesting about Open Transport was that it used Unix System V’s STREAMS⁵ instead of Berkeley sockets).
Anyway, I linked the book here because it’s actually quite well-written, and the introductory section in particular actually does a good job of describing basic networking concepts.
Were I to write a TCP/IP stack in 2016, I would make it IPv6-only. There are several methods for translating IPv4 into IPv6 so that you can get both protocols working anyway. Writing it IPv4-only is useless and harder to extend later.
And please don't tell me that we are not using IPv6... ( https://www.google.com/intl/en/ipv6/statistics.html )
[Edit: link]
You obviously need edge devices that translate for you, e.g. using NAT64, which maps IPv4 into a /96 IPv6 subnet. Since you then have an IPv6 address for each IPv4 address, you don't even need to maintain state on the gateway, so it scales very well.
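To illustrate (not part of the parent's setup): with the RFC 6052 well-known prefix 64:ff9b::/96, the mapping is literally a four-byte copy into the low bits of the IPv6 address, which is why the gateway can stay stateless. A minimal sketch:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/* Map an IPv4 address into the RFC 6052 well-known NAT64 prefix 64:ff9b::/96.
   Because the mapping is 1:1, no per-connection state is needed to invert it. */
static struct in6_addr nat64_map(struct in_addr v4)
{
    struct in6_addr v6;

    inet_pton(AF_INET6, "64:ff9b::", &v6);       /* the /96 prefix */
    memcpy(&v6.s6_addr[12], &v4.s_addr, 4);      /* last 32 bits = the IPv4 address */
    return v6;
}

int main(void)
{
    struct in_addr v4;
    char buf[INET6_ADDRSTRLEN];

    inet_pton(AF_INET, "192.0.2.1", &v4);
    struct in6_addr v6 = nat64_map(v4);
    printf("%s\n", inet_ntop(AF_INET6, &v6, buf, sizeof(buf)));
    /* prints 64:ff9b::c000:201, i.e. 64:ff9b:: + 192.0.2.1 */
    return 0;
}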
Some while ago I wrote an ARP implementation in Verilog for FPGA use; a very simple state machine. Testing it (and the other TCP parts of the system) involved using the TUN/TAP devices to connect the simulation to real Ethernet and the application it was supposed to talk to.
TUN/TAP is very useful for doing nonstandard things with networks, implementing your own VPN, etc.
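For anyone who hasn't played with it: on Linux, getting a TAP file descriptor that hands you raw Ethernet frames takes only a handful of lines (a sketch, error handling kept minimal):

#include <fcntl.h>
#include <linux/if.h>
#include <linux/if_tun.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Open (or create) a TAP interface and return a file descriptor from which
   raw Ethernet frames can be read and to which frames can be written. */
static int tap_open(const char *name)
{
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);

    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;      /* Ethernet frames, no extra header */
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        close(fd);
        return -1;
    }
    return fd;   /* read()/write() now move whole Ethernet frames */
}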
In my own TCP/IP stack I have defined things like IPv4 addresses and port numbers as unions instead of integers, and given them functions to convert to host order and back. e.g.
typedef union {
    uint32_t net_order;
    uint8_t byte[4];
} ipv4_address;

_Static_assert(sizeof(ipv4_address) == 4, "ipv4_address is not 4 bytes long");

static inline __attribute__((overloadable)) uint32_t as_host(ipv4_address a) {
    return other_order_32(a.net_order);
}
// more functions follow, like
// bool equal(a,b)
// int compare(a,b)
// ipv4_address ipv4_address_from_host( uint32_t)
The nice part of this is that it would take an act of willful ignorance on my part to accidentally miss or over include a conversion from network order to host order.
The compiler is still happy to slam these around in registers like integers so I haven't changed the runtime performance. The downside is that C won't let you compare structs or unions with ==, so I have to have that equal() function which dirties the source code a bit.
That __attribute__((overloadable)) makes as_host() an overloadable function so I don't have to have a giant family of incredibly_long_function_names to convert all my different types to host byte order or to compare them with themselves.
That _Static_assert isn't terribly useful in this case, but there is one on all of the structures where I expect to have explicitly specified a layout. That way when the compiler tries to pull a fast one on me because some language lawyer spent too long reading the spec, I'll find out at build time.
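For completeness, the equal() alluded to above can be tiny; a sketch, assuming the same ipv4_address union and the Clang overloadable extension:

#include <stdbool.h>

static inline __attribute__((overloadable)) bool equal(ipv4_address a, ipv4_address b)
{
    /* Both sides hold network-order bytes, so plain integer equality is enough. */
    return a.net_order == b.net_order;
}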
I have a small, safe, zero-copy partial IP/TCP/UDP parser implementation in Rust. My favorite part is the IP checksum calculation, because I can use Rust's iterators.
Each style is "more cohesive" than the other, but measured along different axes.
Declaring variables at the top keeps variable declarations close to each other; declaring variables close to where they're used keeps, obviously, declarations close to use.
> Declaring variables at the top keeps variable declarations close to each other
And that one is not very useful to a human. Declaring variables near their usage is useful to a human. Understanding the code requires less (human) working memory.
The reason that old C code declared all variables at the top is that it made the earliest compilers easier to implement in a single pass: the compiler could read the declarations, tabulate the stack space the function needed, and emit the code to reserve it before parsing and compiling the statements that followed. Modern compilers are multi-pass and can scan the whole function and compute the stack space before they begin generating machine code.
Writing code in that style is an example of humans optimizing for the machine, rather than vice versa. Modern compilers don't have this limitation. I haven't read an argument that code written in this style is actually preferable for humans, and in my opinion and experience, it is not.
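A contrived side-by-side of the two styles (purely illustrative, not from the article):

/* All declarations up front: the reader has to keep sum and i in mind
   before seeing what either is for. */
int sum_old_style(const int *v, int n)
{
    int sum;
    int i;

    sum = 0;
    for (i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

/* Declaration at first use (legal since C99): each variable appears with
   its initial value and the context that explains it, and is never
   visible in an uninitialized state. */
int sum_new_style(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += v[i];
    return sum;
}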
For those of us who can't spot this stuff as easily - is this detrimental, or simply out of style? Is there a recommended guide or body of source code that is "modern" C?
Here's a 2011 guide to modern C: [1] Discussed on YC last year.[2]
The 1970s string functions, such as "strcpy", that don't have a destination length were deprecated years ago, but they've never been removed from libraries. Microsoft's C compiler has warned about this for a decade.[3] This is one of the major causes of buffer overflows in C, and a large number of exploits involve it. There's a CERT advisory from Homeland Security about "strcpy".[4] Seeing a "strcpy" in new code is a big red flag. If you see that in an interview, don't hire that programmer. They're dangerous.
Since C99, you've been able to write
for (int i = 0; i < 10; ++i) ...
in C code, and that was in C++ from the early years. The trend in programming is strongly towards declaring and initializing variables at the same time. Remember, in C, local variables declared but not initialized have junk values until assigned. The "all variables declared at the top of a function" style is obsolete. Pointer variables declared without initialization are especially bad.
There's a long and painful history of classic C bugs, and the newer versions of C help, just a little, to avoid them.
Playing devil's advocate: local variables declared but not initialized before being used will always generate a warning (which can be turned into a hard error) on modern optimizing compilers like gcc or clang.
And having all variables declared at the top of the function can make it easier to visualize how much stack space the function is using, which can be useful when your stack is limited to a few kilobytes.
IMHO strcpy() is perfectly fine if you know the length of the source, and ensure the destination is always big enough. strcpy() will always write strlen(src)+1 bytes, nothing more and nothing less. That said, code should be written so those facts are obvious, and documentation is a preferred way to show that. (Being explicit about whether null-terminators are included or not in length specifications is one thing that seems to be often overlooked, for example.)
Microsoft's _s functions don't help if you don't know what the lengths should be, and if you do, they just make it more confusing.
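For example (a sketch of the "know the length" case): when the destination is sized from the source, strcpy's behaviour is fully determined and there is nothing to truncate:

#include <stdlib.h>
#include <string.h>

/* strcpy() is safe here because the destination is allocated from the
   source's length: exactly strlen(src) + 1 bytes are written. */
char *dup_string(const char *src)
{
    char *dst = malloc(strlen(src) + 1);

    if (dst != NULL)
        strcpy(dst, src);
    return dst;
}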
Yup, I'm gonna stick my neck out here and say I quite like strcpy()
In good code it spells out in one glance that there exists a guarantee on the size of the destination buffer.
Supplying the string's length just brings redundant code and, with it, ambiguity.
Discouraging strcpy() use is fine and understood; I'm aware of all the caveats and bugs and security implications.
But the kind of black-and-white thinking that says "always use strncpy instead of strcpy" is bad; it promotes the idea that truncating a string magically absolves us of any security implications or the need for exit paths.
And then let's look at how much code has made a mess of strncpy() whilst thinking it was doing the right thing.
I can probably count on one hand the number of times I've wanted to copy a string but been happy for it to be quietly trimmed, even in extreme cases. Whereas anything from alloca to flexible array members, or copying a string back to its original buffer, are all very appropriate uses of strcpy().
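The typical strncpy() mess looks something like this (a sketch of the pitfall, not anyone's actual code):

#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[8];
    const char *name = "a-rather-long-string";

    /* Quietly truncates and, because the source is longer than the buffer,
       leaves buf with NO terminating NUL at all... */
    strncpy(buf, name, sizeof(buf));

    /* ...so this read runs off the end of buf. */
    printf("%s\n", buf);

    /* The usual fix: truncate one byte short and terminate by hand. */
    strncpy(buf, name, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    printf("%s\n", buf);
    return 0;
}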
> Since C99, you've been able to write
>
> for (int i = 0; i < 10; ++i)
To be fair, Microsoft's compiler only began supporting that recently, so for cross-platform compatible code, people often declared the index outside the loop.
Declaring variables at the beginning of such a small function is certainly not out of style. IMHO, it's not detrimental. In fact, I find it quite useful (i.e. just by reading the declarations, I can infer the expected flow of the function -- fd will be used to open a device file, from which ifr is going to be populated through a call to read() or an ioctl). However, it's largely a question of style.
Using strcpy is a big no-no -- it's both detrimental, and out of style. Its manpage gives this warning (on FreeBSD -- the one you'll get if you type man strcpy may differ, but all man pages have had a similar warning since 1990something):
> The strcpy() function is easily misused in a manner which enables malicious users to arbitrarily change a running program's functionality through a buffer overflow attack.
That's because strcpy(dst, src) will basically do something like this:
int i = 0;
while (src[i] != '\0') {
    dst[i] = src[i];
    i++;
}
dst[i] = '\0';   /* the terminating NUL is copied too */
(the real implementation is usually more terse for reasons I won't go into right now, but this is what it basically does in more verbose terms).
without ever checking whether dst[i] is past the end of dst. This lets you write past the end of the buffer dst points to, and who knows what's there...
Nowadays, all that usually happens (on sane operating systems running on machines with an MMU) is that your application crashes, helped along by mitigations like ASLR. It used to be a big problem before that, though, and it still is (on insane operating systems, on machines that don't have an MMU and so on).
Either way, it's not a good idea, and it's considered to be a pretty big code smell. strncpy is the encouraged version. strlcpy is the sane one, but it's not available on all platforms (and even when it is, it's not always sanely implemented).
Old C code declared variables at the beginning of a function because it made compilers easier to implement, in a single pass. If you interleave variables and statements, then calculating the amount of stack space needed for the function call requires multiple passes. Knowing this, defenses of "variables at the top" often seem like rationalizations to me.
There's a reason that other programming languages don't share this convention, and that's because compilers had evolved enough in sophistication to allow interleaved variables and statements from their beginnings; and they had no legacy code bases anchoring their code style, like C does.
Declaring variables at the top is suboptimal compared to declaring them at first use. The latter requires less working memory to understand, takes fewer lines of code, and the resulting code is often easier to follow thanks to the surrounding context. Standalone declarations typically either require more documentation to make sense, or else they don't make sense until you see them used in context.
The C language has a lot of bad conventions and semantics that have led to thousands of bugs over the years. Variables whose scope spans an entire function are one of them. There's no articulable advantage, and the downsides are greater demands on working memory and a greater risk of defects (variables being visible before they're initialized, or outside the places where they make sense). Rust is on the right track in allowing variable lifetimes to be defined and managed in a precise way.
This will not necessarily NULL-terminate ifr.ifr_name for all values of the dev variable. NULL-terminated input is assumed later when ifr.ifr_name is used in strcpy.
In this case, it seems the ioctl call will do the null termination, and in fact the kernel itself uses strcpy for that field (presumably OK since the device structure has a null-terminated name).
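Still, relying on the kernel to clean up after the copy is fragile; the usual defensive idiom is to zero the struct and leave room for the terminator yourself (a sketch):

#include <net/if.h>
#include <string.h>

static void set_ifr_name(struct ifreq *ifr, const char *dev)
{
    /* Zero the whole request, then copy at most IFNAMSIZ - 1 bytes so
       ifr_name is NUL-terminated no matter how long dev is. */
    memset(ifr, 0, sizeof(*ifr));
    strncpy(ifr->ifr_name, dev, IFNAMSIZ - 1);
}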
Very interesting tutorial; more people should know about using tun/tap for directly accessing ethernet/ip frames.
However, please note that this is not production-capable code. Casting a struct directly onto the data as a way to "parse" it is easy and convenient, but ignores issues like endianness, alignment, etc. By all means use a struct, but for production code, pick the fields off byte-by-byte or word-by-word using appropriate bit operators and assign them to the struct members. Also, someone already mentioned the `char` issue on certain architectures. I'm sure I'm missing some other issues, since writing safe, portable C code isn't always obvious.
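For instance, pulling a 16-bit field out of the payload explicitly sidesteps alignment, padding and host byte order in one go (a minimal sketch):

#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit big-endian (network order) field at an arbitrary offset.
   No casts, no alignment requirements, no dependence on host endianness. */
static uint16_t read_be16(const uint8_t *buf, size_t off)
{
    return (uint16_t)((buf[off] << 8) | buf[off + 1]);
}

/* e.g. the EtherType sits at offset 12 of an Ethernet frame:
   uint16_t ethertype = read_be16(frame, 12); */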
I look forward to reading the next part of this series.
This is interesting and I hope it will get around to all of the modern implementation tweaks.
Hopefully not too off-topic, but I would love to learn how we would design a state-of-the-art global interconnect protocol today. There are of course efforts like http://named-data.net, but what about the lower levels, like congestion control? Is dropping packets really the best paradigm?
EDIT: to be clear, I mean if starting from scratch in a world where IP never existed (and no backwards compatibility mattered).
I would assume endianness to be an issue in such code. Where in these structs would the order need to be validated? Or am I wrong in thinking that byte order matters for x86-targeted C network code?