This is a discussion that has happened here before and has generated its share of sparks, but I'll risk it :-). It is hardly a problem in the context of a userspace TCP/IP stack (I don't think Linux runs on any machine with non-8-bit bytes); it's just a good illustration of the kind of pitfalls that programming at this level sometimes poses.
I would suggest being a little more careful with this:
> unsigned char dmac[6];
because there are several platforms, all of them high-performance, non-legacy processors, where unsigned char is not 8 bits :-). E.g. on AD's SHARC, it's 32. You're declaring the struct with __attribute__((packed)), so I assume you're going to want to fill it automatically through DMA at some point in the future. On such a platform, it won't have the expected results. A colleague got bit by this, on a device that's sitting on my desk right now, and it's not a vintage piece of equipment from the 60s.
I would suggest using uint8_t instead, just like you're using uint16_t a few lines below.
It is impossible for unsigned char to be 32 bits and for uint8_t to exist on a conformant C/C++ implementation. The sizeof char / unsigned char is 1 ("byte" is defined this way). So if 1 byte is more than 8 bits, there cannot be a uint8_t type, since that would make sizeof uint8_t less than 1.
Actually, combined with the requirement for char to have at least 8 bits, it follows that uint8_t can only possibly exist if char has 8 bits.
Edit: quoting C99, to show that uint8_t must not have padding bits (an assumption in this reasoning). The requirement for uintN_t to have no padding seems omitted in the second paragraph (I assume by mistake), but the third one suggests it really should have no padding.
> The typedef name intN_t designates a signed integer type with width N, no padding bits, and a two’s complement representation. Thus, int8_t denotes a signed integer type with a width of exactly 8 bits.

> The typedef name uintN_t designates an unsigned integer type with width N. Thus, uint24_t denotes an unsigned integer type with a width of exactly 24 bits.

> These types are optional. However, if an implementation provides integer types with widths of 8, 16, 32, or 64 bits, no padding bits, and (for the signed types) that have a two’s complement representation, it shall define the corresponding typedef names.
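A cheap way to make that assumption explicit is a compile-time check, in the same _Static_assert spirit as the code further down the thread (a minimal sketch):

#include <limits.h>

/* Fail the build outright on platforms whose bytes are not 8 bits wide,
   instead of silently mis-parsing packets there. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");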
I don't know if you can have a conformant implementation, but to be honest, I never really thought about it. Non-conformant (subtly or not-so-subtly) compilers are something that I have come to accept with resignation. I may not like them, but software has to be written, and it's not going to wait for another team of developers to get their compiler right.
At the moment, each one of the two codebases I'm maintaining has about a dozen workarounds for compiler bugs. One of the compilers is already at its 6th major version.
It's things like these that bring a smile to my face whenever I see people valiantly calling for rewriting everything in Rust :-).
In my experience, these platforms with non-8-bit bytes or non-byte-addressable memory are things like DSPs and other niche processors, and if you want to port/write software for them, there are plenty of other quirks besides the size of a char.
Implementing ARP is an awesome introduction to networking and low-level packet processing. This was often the first task I would give out during training for Real Time / Embedded programmers.
A few things I'm missing for this to be a more complete ARP implementation:
1. Answer IP-to-MAC conversion requests from the stack above
2. Send an ARP request if the MAC is missing (+ retries)
3. ARP entry aging (a rough sketch follows below). Otherwise you'll run really quickly into "ERR: No free space in ARP translation table\n"
(And of course a much larger cache than 32 entries)
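As an illustration of point 3, entry aging can be as simple as stamping each entry and sweeping the table periodically. This is only a sketch under assumed names (arp_cache, ARP_ENTRY_TTL and friends are hypothetical, not from the article):

#include <stdint.h>
#include <string.h>
#include <time.h>

#define ARP_CACHE_LEN  32
#define ARP_ENTRY_TTL  60           /* seconds before an entry goes stale */

struct arp_entry {
    uint32_t ip;                    /* network byte order */
    uint8_t  mac[6];
    time_t   last_seen;             /* 0 means the slot is free */
};

static struct arp_entry arp_cache[ARP_CACHE_LEN];

/* Drop entries that have not been refreshed within ARP_ENTRY_TTL, freeing
   slots instead of failing with "No free space in ARP translation table". */
static void arp_cache_age(void)
{
    time_t now = time(NULL);

    for (int i = 0; i < ARP_CACHE_LEN; i++) {
        if (arp_cache[i].last_seen != 0 &&
            now - arp_cache[i].last_seen > ARP_ENTRY_TTL) {
            memset(&arp_cache[i], 0, sizeof(arp_cache[i]));
        }
    }
}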
I also did ICMP OS detection about 5 years before it became public. In those days, I used to run home from school to see which militaries had visited my page that day. They never used to hide their IP then. You'd get US Pacific Command, Korean and Japanese militaries, etc. I felt like I had a front row seat at the militarization of the internet. Unfortunately archive.org never indexed the page I announced it on, linked off this one: https://web.archive.org/web/19991128100455/http://pho.2600.o... .. but an early version of the list of interesting visitors was archived: https://web.archive.org/web/19991128111635/http://pho.2600.o...
Is there some kind of VM I could play with this on? Happy to go to something like Buildroot, but I've never thought of arp as a daemon; it's always just there.
This reminds me of an old book¹ on Open Transport²³⁴ I started reading for fun a few years ago (what I found particularly interesting about Open Transport was that it used Unix System V’s STREAMS⁵ instead of Berkeley sockets).
Anyway, I linked the book here because it’s actually quite well-written, and the introductory section in particular actually does a good job of describing basic networking concepts.
Were I to write a TCP/IP stack in 2016, I would make it IPv6-only. There are several methods for translating IPv4 into IPv6 so that you can get both protocols working anyway. Writing it IPv4-only is useless and harder to extend later.
And please don't tell me that we are not using IPv6... ( https://www.google.com/intl/en/ipv6/statistics.html )
[Edit: link]
You obviously need edge devices that translate for you, e.g. using NAT64, which maps IPv4 into a /96 IPv6 subnet. Since you then have an IPv6 address for each IPv4 address, you don't even need to maintain state on the gateway, so it scales very well.
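To illustrate (not part of the parent's setup): with the RFC 6052 well-known prefix 64:ff9b::/96, the mapping is literally a four-byte copy into the low bits of the IPv6 address, which is why the gateway can stay stateless. A minimal sketch:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/* Map an IPv4 address into the RFC 6052 well-known NAT64 prefix 64:ff9b::/96.
   Because the mapping is 1:1, no per-connection state is needed to invert it. */
static struct in6_addr nat64_map(struct in_addr v4)
{
    struct in6_addr v6;

    inet_pton(AF_INET6, "64:ff9b::", &v6);       /* the /96 prefix */
    memcpy(&v6.s6_addr[12], &v4.s_addr, 4);      /* last 32 bits = the IPv4 address */
    return v6;
}

int main(void)
{
    struct in_addr v4;
    char buf[INET6_ADDRSTRLEN];

    inet_pton(AF_INET, "192.0.2.1", &v4);
    struct in6_addr v6 = nat64_map(v4);
    printf("%s\n", inet_ntop(AF_INET6, &v6, buf, sizeof(buf)));
    /* prints 64:ff9b::c000:201, i.e. 64:ff9b:: + 192.0.2.1 */
    return 0;
}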
Some while ago I wrote an ARP implementation in Verilog for FPGA use; a very simple state machine. Testing it (and the other TCP parts of the system) involved using the TUN/TAP devices to connect the simulation to real Ethernet and the application it was supposed to talk to.
TUN/TAP is very useful for doing nonstandard things with networks, implementing your own VPN, etc.
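For anyone who hasn't played with it: on Linux, getting a TAP file descriptor that hands you raw Ethernet frames takes only a handful of lines (a sketch, error handling kept minimal):

#include <fcntl.h>
#include <linux/if.h>
#include <linux/if_tun.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Open (or create) a TAP interface and return a file descriptor from which
   raw Ethernet frames can be read and to which frames can be written. */
static int tap_open(const char *name)
{
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);

    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;      /* Ethernet frames, no extra header */
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        close(fd);
        return -1;
    }
    return fd;   /* read()/write() now move whole Ethernet frames */
}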
In my own TCP/IP stack I have defined things like IPv4 addresses and port numbers as unions instead of integers, and given them functions to convert to host order and back. e.g.
typedef union {
    uint32_t net_order;
    uint8_t byte[4];
} ipv4_address;

_Static_assert(sizeof(ipv4_address) == 4, "ipv4_address is not 4 bytes long");

static inline __attribute__((overloadable)) uint32_t as_host(ipv4_address a) {
    return other_order_32(a.net_order);
}
// more functions follow, like
// bool equal(a,b)
// int compare(a,b)
// ipv4_address ipv4_address_from_host( uint32_t)
The nice part of this is that it would take an act of willful ignorance on my part to accidentally miss or over include a conversion from network order to host order.
The compiler is still happy to slam these around in registers like integers so I haven't changed the runtime performance. The downside is that C won't let you compare structs or unions with ==, so I have to have that equal() function which dirties the source code a bit.
That __attribute__((overloadable)) makes as_host() an overloadable function so I don't have to have a giant family of incredibly_long_function_names to convert all my different types to host byte order or to compare them with themselves.
That _Static_assert isn't terribly useful in this case, but there is one on all of the structures where I expect to have explicitly specified a layout. That way when the compiler tries to pull a fast one on me because some language lawyer spent too long reading the spec, I'll find out at build time.
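For completeness, the equal() alluded to above can be tiny; a sketch, assuming the same ipv4_address union and the Clang overloadable extension:

#include <stdbool.h>

static inline __attribute__((overloadable)) bool equal(ipv4_address a, ipv4_address b)
{
    /* Both sides hold network-order bytes, so plain integer equality is enough. */
    return a.net_order == b.net_order;
}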
I have a small, safe, zero-copy partial IP/TCP/UDP parser implementation in Rust. My favorite part is the IP checksum calculation, because I can use Rust's iterators.
Each style is "more cohesive" than the other, but measured along different axes.
Declaring variables at the top keeps variable declarations close to each other; declaring variables close to where they're used keeps, obviously, declarations close to use.
> Declaring variables at the top keeps variable declarations close to each other
And that one is not very useful to a human. Declaring variables near their usage is useful to a human. Understanding the code requires less (human) working memory.
The reason that old C code declared all variables at the top is that it made the earliest compilers easier to implement in a single pass: the compiler could read the declarations, tabulate the stack space the function needed, and emit the code to reserve it before parsing and compiling the statements that followed. Modern compilers are multi-pass and can scan the whole function and compute the stack space before they begin generating machine code.
Writing code in that style is an example of humans optimizing for the machine, rather than vice versa. Modern compilers don't have this limitation. I haven't read an argument that code written in this style is actually preferable for humans, and in my opinion and experience, it is not.
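A contrived side-by-side of the two styles (purely illustrative, not from the article):

/* All declarations up front: the reader has to keep sum and i in mind
   before seeing what either is for. */
int sum_old_style(const int *v, int n)
{
    int sum;
    int i;

    sum = 0;
    for (i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

/* Declaration at first use (legal since C99): each variable appears with
   its initial value and the context that explains it, and is never
   visible in an uninitialized state. */
int sum_new_style(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += v[i];
    return sum;
}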
For those of us who can't spot this stuff as easily - is this detrimental, or simply out of style? Is there a recommended guide or body of source code that is "modern" C?
Here's a 2011 guide to modern C: [1] Discussed on YC last year.[2]
The 1970s string functions, such as "strcpy", that don't have a destination length were deprecated years ago, but they've never been removed from libraries. Microsoft's C compiler has warned about this for a decade.[3] This is one of the major causes of buffer overflows in C, and a large number of exploits involve it. There's a CERT advisory from Homeland Security about "strcpy".[4] Seeing a "strcpy" in new code is a big red flag. If you see that in an interview, don't hire that programmer. They're dangerous.
Since C99, you've been able to write
for (int i = 0; i < 10; ++i) ...
in C code, and that was in C++ from the early years. The trend in programming is strongly towards declaring and initializing variables at the same time. Remember, in C, local variables declared but not initialized have junk values until assigned. The "all variables declared at the top of a function" style is obsolete. Pointer variables declared without initialization are especially bad.
There's a long and painful history of classic C bugs, and the newer versions of C help, just a little, to avoid them.
Playing devil's advocate: local variables declared but not initialized before being used will always generate a warning (which can be turned into a hard error) on modern optimizing compilers like gcc or clang.
And having all variables declared at the top of the function can make it easier to visualize how much stack space the function is using, which can be useful when your stack is limited to a few kilobytes.
IMHO strcpy() is perfectly fine if you know the length of the source, and ensure the destination is always big enough. strcpy() will always write strlen(src)+1 bytes, nothing more and nothing less. That said, code should be written so those facts are obvious, and documentation is a preferred way to show that. (Being explicit about whether null-terminators are included or not in length specifications is one thing that seems to be often overlooked, for example.)
Microsoft's _s functions don't help if you don't know what the lengths should be, and if you do, they just make it more confusing.
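For example (a sketch of the "know the length" case): when the destination is sized from the source, strcpy's behaviour is fully determined and there is nothing to truncate:

#include <stdlib.h>
#include <string.h>

/* strcpy() is safe here because the destination is allocated from the
   source's length: exactly strlen(src) + 1 bytes are written. */
char *dup_string(const char *src)
{
    char *dst = malloc(strlen(src) + 1);

    if (dst != NULL)
        strcpy(dst, src);
    return dst;
}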
Yup, I'm gonna stick my neck out here and say I quite like strcpy()
In good code it spells out in one glance that there exists a guarantee on the size of the destination buffer.
Supplying the string's length just brings redundant code and, with it, ambiguity.
Discouraging strcpy() use is fine and understood; I'm aware of all the caveats and bugs and security implications.
But the kind of black-and-white thinking that says "always use strncpy instead of strcpy" is bad; it promotes the idea that truncating a string magically absolves us of any security implications or the need for exit paths.
And then let's look at how much code has made a mess of strncpy() whilst thinking it was doing the right thing.
I can probably count on one hand the number of times I've wanted to copy a string but been happy for it to be quietly trimmed, even in extreme cases. Whereas anything from alloca to flexible array members, or copying a string back to its original buffer, are all very appropriate uses of strcpy().
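The typical strncpy() mess looks something like this (a sketch of the pitfall, not anyone's actual code):

#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[8];
    const char *name = "a-rather-long-string";

    /* Quietly truncates and, because the source is longer than the buffer,
       leaves buf with NO terminating NUL at all... */
    strncpy(buf, name, sizeof(buf));

    /* ...so this read runs off the end of buf. */
    printf("%s\n", buf);

    /* The usual fix: truncate one byte short and terminate by hand. */
    strncpy(buf, name, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    printf("%s\n", buf);
    return 0;
}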
> Since C99, you've been able to write
>
> for (int i = 0; i < 10; ++i)
To be fair, Microsoft's compiler only began supporting that recently, so for cross-platform compatible code, people often declared the index outside the loop.
Declaring variables at the beginning of such a small function is certainly not out of style. IMHO, it's not detrimental. In fact, I find it quite useful (i.e. just by reading the declarations, I can infer the expected flow of the function -- fd will be used to open a device file, from which ifr is going to be populated through a call to read() or an ioctl). However, it's largely a question of style.
Using strcpy is a big no-no -- it's both detrimental, and out of style. Its manpage gives this warning (on FreeBSD -- the one you'll get if you type man strcpy may differ, but all man pages have had a similar warning since 1990something):
> The strcpy() function is easily misused in a manner which enables malicious users to arbitrarily change a running program's functionality through a buffer overflow attack.
That's because strcpy(dst, src) will basically do something like this:
int i = 0;
while (src[i] != '\0') {
    dst[i] = src[i];
    i++;
}
dst[i] = '\0';   /* the terminating NUL is copied too */
(the real implementation is usually more terse for reasons I won't go into right now, but this is what it basically does in more verbose terms).
without ever checking whether dst[i] is past the end of dst. This lets you write past the end of the buffer dst points to, and who knows what's there...
Nowadays, all that usually happens (on sane operating systems running on machines with an MMU) is that your application crashes, helped along by mitigations like ASLR. It used to be a big problem before that, though, and it still is (on insane operating systems, on machines that don't have an MMU and so on).
Either way, it's not a good idea, and it's considered to be a pretty big code smell. strncpy is the encouraged version. strlcpy is the sane one, but it's not available on all platforms (and even when it is, it's not always sanely implemented).
Old C code declared variables at the beginning of a function because it made compilers easier to implement, in a single pass. If you interleave variables and statements, then calculating the amount of stack space needed for the function call requires multiple passes. Knowing this, defenses of "variables at the top" often seem like rationalizations to me.
There's a reason that other programming languages don't share this convention, and that's because compilers had evolved enough in sophistication to allow interleaved variables and statements from their beginnings; and they had no legacy code bases anchoring their code style, like C does.
Declaring variables at the top is suboptimal compared to declaring them at first use. The latter requires less working memory to understand, takes fewer lines of code, and the resulting code is often easier to follow thanks to the surrounding context. Standalone declarations typically either require more documentation to make sense, or else they don't make sense until you see them used in context.
The C language has a lot of bad conventions and semantics that have led to thousands of bugs over the years. Variables whose scope spans an entire function are one of them. There's no articulable advantage, and the downsides are greater demands on working memory and a greater risk of defects (variables being visible before they're initialized, or outside the places where they make sense). Rust is on the right track in allowing variable lifetimes to be defined and managed in a precise way.
This will not necessarily NULL-terminate ifr.ifr_name for all values of the dev variable. NULL-terminated input is assumed later when ifr.ifr_name is used in strcpy.
In this case, it seems the ioctl call will do the null termination, and in fact the kernel itself uses strcpy for that field (presumably OK since the device structure has a null-terminated name).
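Still, relying on the kernel to clean up after the copy is fragile; the usual defensive idiom is to zero the struct and leave room for the terminator yourself (a sketch):

#include <net/if.h>
#include <string.h>

static void set_ifr_name(struct ifreq *ifr, const char *dev)
{
    /* Zero the whole request, then copy at most IFNAMSIZ - 1 bytes so
       ifr_name is NUL-terminated no matter how long dev is. */
    memset(ifr, 0, sizeof(*ifr));
    strncpy(ifr->ifr_name, dev, IFNAMSIZ - 1);
}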
Very interesting tutorial; more people should know about using tun/tap for directly accessing ethernet/ip frames.
However, please note that this is not production-capable code. Casting a struct directly onto the data as a way to "parse" it is easy and convenient, but ignores issues like endianness, alignment, etc. By all means use a struct, but for production code, pick the fields off byte-by-byte or word-by-word using appropriate bit operators and assign them to the struct members. Also, someone already mentioned the `char` issue on certain architectures. I'm sure I'm missing some other issues, since writing safe, portable C code isn't always obvious.
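For instance, pulling a 16-bit field out of the payload explicitly sidesteps alignment, padding and host byte order in one go (a minimal sketch):

#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit big-endian (network order) field at an arbitrary offset.
   No casts, no alignment requirements, no dependence on host endianness. */
static uint16_t read_be16(const uint8_t *buf, size_t off)
{
    return (uint16_t)((buf[off] << 8) | buf[off + 1]);
}

/* e.g. the EtherType sits at offset 12 of an Ethernet frame:
   uint16_t ethertype = read_be16(frame, 12); */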
I look forward to reading the next part of this series.
This is interesting and I hope it will get around to all of the modern implementation tweaks.
Hopefully not too off-topic, but I would love to learn how we would design a state-of-the-art global interconnect protocol today. There are of course efforts like http://named-data.net, but what about the lower levels, like congestion control? Is dropping packets really the best paradigm?
EDIT: to be clear, I mean if starting from scratch in a world where IP never existed (and no backwards compatibility mattered).
I would assume endianness to be an issue in such code. Where in these structs would the order need to be validated? Or am I wrong in thinking that byte order matters for x86-targeted C network code?