How both TCP and Ethernet checksums fail (2015) (evanjones.ca)
107 points by mmastrac on Jan 30, 2017 | 43 comments



At ${PREVIOUS_JOB}, I changed offices and discovered my window wasn't the only thing I lost in the move: SSH sessions kept dying with MAC errors. It happened on multiple computers with different NICs running different OSes, so it wasn't computer-related. I tried swapping the patch cable, but that didn't change anything.

On a whim, I took the faceplate off the box containing my (100 Mbps) Ethernet and phone jacks and discovered both drops were provided over a single four pair cable originally installed in the late 1980s. More alarmingly, the outer jacket of the cable had been cut off about a foot from the punchdowns on the backs of the keystone jacks, and the entire foot of conductors emerging from the jacket were all untwisted and balled up in the box.

Cutting off about 10 inches of each wire, twisting the pairs back together, and punching the wires back down onto the backs of the jacks fixed the SSH problem...


You should have used Mosh /s


While getting my degree I worked as a sysadmin assistant for the CS dept. This was back in the 10BASE2 days. It wasn't unheard of to find a professor who had re-arranged his or her office and needed a longer patch cable to connect their workstation, so he or she would find some 75 ohm coax, use that, and then wonder why the networking wasn't so good.


> Mosh /s

Is also a unit of measurement - how many times per second the mosh session would've been broken if it were SSH

/s


I develop backup software among other things, and I can tell you that a CRC is neat, but I've gotten many, many reports where a CRC did not catch data corruption (but SHA-2 did). Far fewer reports indicated that the CRC caught it first.

My conclusion: a CRC32 over a couple megs of data is only good enough to likely catch "gross" corruption, like truncating the data, not the finer corruption that memory and disks tend to produce, especially when we're talking about bit rot.

Also, unless you're using an optimized implementation -- and zlib's is not optimized -- using e.g. PCLMUL or at least a slice-by-4/8 variant, it's a waste of time, because e.g. BLAKE2b isn't much slower than zlib's CRC, but will catch All The Corruption!1 [1] (Also, Python 3.6 added the full might of BLAKE2 to the stdlib, although it's the reference implementation and not some fancy SSE4/AVX(2) impl.)

[1] A technical note: there is a tradeoff between checksum length and data length; with a good checksum, increasing its length eventually increases the error rate over the same medium (aka false positives [the checksum is corrupted instead of the data]). However, it's usually more important these days to be certain about correctness than to minimize total indicated BER over a certain channel.
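
As a rough illustration (assuming Python 3.6+ and an arbitrary in-memory chunk, nothing specific to any particular backup tool), both checks are one-liners from the standard library:

    import hashlib
    import zlib

    data = b"some backup chunk" * 100000    # placeholder payload

    crc = zlib.crc32(data)                           # 32-bit CRC, zlib's (unoptimized) table-based code
    digest = hashlib.blake2b(data).hexdigest()       # 64-byte BLAKE2b digest, in the stdlib since 3.6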


You make an excellent point: probability says a 4-byte CRC32C provides much weaker guarantees as the length of the message gets longer. These CRCs are typically optimized for pretty short messages, and that is what I had in mind when I wrote that article (e.g. the kind of messages that might get exchanged as part of an online serving system).

For overkill discussions about these issues, see

"32-Bit Cyclic Redundancy Codes for Internet Applications" which talks about the kinds of errors that various CRCs are guaranteed to detect: https://users.ece.cmu.edu/~koopman/networks/dsn02/dsn02_koop...


I think it's an apple to orange comparison; sha1 needs to be compared to a proper crc160, and sha2 to a proper crc256.

(Proper means one nontrivial cycle, e.g. an irreducible generating polynomial of the proper size.)

I would be surprised if there's a material difference in error detection between sha2 and crc255 (other than the 1 bit difference).

However, crc is useless against an adversary - preimage attacks are trivial. At the same size, it becomes an apple to apple comparison, and which you choose depends on your speed/preimage requirements.


Or go the other way: compare the crc32 to a 4-byte blake2 hash (blake2 can vary its output length in multiples of 8 bits).

I'd expect that, no matter how many bits were corrupted, the pattern of the corruption, or the size of the file, a 4-byte blake2 hash would always have a 1 in 4,294,967,296 chance of not catching the error; for crc32, the chance of not catching the error would increase with the size of the file or the number of corrupted bits. On the other hand, a CRC is guaranteed to catch certain corruption patterns (like all burst errors up to a certain size), while for a cryptographic hash like blake2, there's no such guarantee.

(A cryptographic hash sent together with the data, by itself, is also useless against an adversary, since the adversary can change the hash. You need at least a keyed hash to protect against that.)
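
For illustration, a minimal sketch of that apples-to-apples setup (standard library only; the payload is just a placeholder): hashlib.blake2b takes a digest_size in bytes, so a 4-byte digest spends exactly as many bits as a CRC32.

    import hashlib
    import zlib

    data = b"example payload"                                     # placeholder data

    crc32_value = zlib.crc32(data)                                # 32 bits, linear, burst guarantees
    blake2_4byte = hashlib.blake2b(data, digest_size=4).digest()  # 32 bits, no guarantees, uniform miss odds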


I do not know why you would think that.

A crc32 will miss an error if and only if crc32(error-pattern)=0, which happens for 1 in 2^32 error patterns, assuming a full-cycle polynomial. While these patterns are easy to characterize, and are independent of the data (neither is true for a crypto hash), the probabilities are the same.
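
That independence is easy to see empirically; a small sketch (zlib.crc32, fixed-length random messages, one arbitrary error pattern): the XOR of the two CRCs comes out the same no matter what the underlying data is.

    import os
    import zlib

    def xor_bytes(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    error = bytearray(64)
    error[3] = 0x80                        # one flipped bit, fixed position
    error = bytes(error)

    syndromes = set()
    for _ in range(5):
        data = os.urandom(64)              # any 64-byte message at all
        corrupted = xor_bytes(data, error)
        syndromes.add(zlib.crc32(data) ^ zlib.crc32(corrupted))

    assert len(syndromes) == 1             # depends only on the error pattern
    # ...and the error goes undetected exactly when that value is zero.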


Is it really an apple to orange comparison? CRC can't do everything SHA does, but SHA can do what CRC does, so it's more like an apple to apple-and-orange comparison. And that's an easy comparison: pick the second one! You get a free orange!

Is there something that a proper CRC256 would be better at than SHA-256?


CRC is much simpler; depending on the polynomial, it can be as little as 2 XOR gates and a shift register; if you do your own ASIC/FPGA, I suspect it could be 10-100 times smaller and faster than SHA-256. On a modern CPU, you can do something like 100 bits/clock of CRC; SHA is 10-100 times slower.

The other thing that CRC can do that SHA-256 cannot is serve as an error correcting code of sorts: if your errors are limited to a small subset (say, only bursts), you can tabulate the CRCs of those errors, and then (expected-crc xor computed-crc) is the CRC of the error; if it's in your list, you know what the error is. I actually worked on a device where this was useful some 20 years ago.
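
Roughly how that can work (a toy sketch, not the original device's code: single-bit flips in a 64-byte message, corrected from the CRC alone by tabulating each error's syndrome):

    import zlib

    MSG_LEN = 64

    def build_table(length):
        """Map the CRC syndrome of each single-bit error to its position."""
        base = zlib.crc32(bytes(length))           # CRC of the all-zero message
        table = {}
        for i in range(length):
            for bit in range(8):
                e = bytearray(length)
                e[i] = 1 << bit
                table[zlib.crc32(bytes(e)) ^ base] = (i, bit)
        return table

    table = build_table(MSG_LEN)
    original = bytes(range(MSG_LEN))
    expected_crc = zlib.crc32(original)

    received = bytearray(original)
    received[10] ^= 0x04                           # a single-bit error in transit
    syndrome = zlib.crc32(bytes(received)) ^ expected_crc
    i, bit = table[syndrome]                       # look up the error...
    received[i] ^= 1 << bit                        # ...and undo it
    assert bytes(received) == original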


That second bit is pretty cool, I didn't know about that. I actually experimented a bit with using cryptographic hashes as an error correcting code, but you have to brute force it, so it's only practical for small amounts of data and small errors.


CRC64 will probably[1] suffice for any data package you'll see in your life. A cryptographic hash of 64 bits won't.

That said, it's probably easier to find a reliable implementation of SHA256 than of some CRC64, so, if the added CPU cost doesn't bother you[2], I'd say SHA256 is the best choice.

1 - CRC40 will likely be enough. Those extra 24 bits are a huge safety margin.

2 - And it normally shouldn't, not even for performance sensitive applications. The hash is very fast, and you are doing it before IO. Except for some fast disappearing niches, it just won't matter.


What would a CRC64 do better than, say, using the first 64 bits of the SHA-256 (other than performance, of course)? I see talk of CRCs always catching contiguous errors smaller than the CRC size, is it just that a CRC is guaranteed to do that and a 64-bit cryptographic hash isn't?


Yes. CRC is linear in the data and the error. This makes it easy to analyze the CRC of specific error sets, and if none of them has a zero CRC, then an error in that set is guaranteed to be found.


Of course it's a bit of an apples to oranges comparison.

An interesting question about CRCs is whether you can implement them efficiently. CRC32 is possible with some help from the CPU (CLMUL as mentioned, or dedicated instructions for specific polynomials, like crc32c); e.g. on Haswell you get to 14-16 GB/s with a CLMUL approach on CRC32. That's obviously much faster than the hashes I mentioned. With the zlib implementation, on the other hand, you only get about one GB/s, and there it's already close to BLAKE2b.

Table-based approaches (classic zlib implementation, or slice-by-n, which needs n-times as many tables) would not work well with longer polynomials; the tables would outgrow L1/L2 cache sizes rather quickly. I'd have to think a bit about the tableless CLMUL approach.


zlib does not use crc at all (zip does, but zlib doesn't). It uses adler-32 which shares most properties with crc32 (prob of missing a random error is almost the same, error patterns are easy to analyze and are 16-bit additive and independent of data).


However, because SHA-256 is so widely used, efficient implementations are available off the shelf; I've even seen talk about providing it in hardware on the CPU. Given that, you might as well just go ahead and use it.


Various CPUs have hardware SHA-1/2 support, eg. everything with ARM NEON = many smartphones and ARM chips, even some ARM boards. Most Sun/Oracle SPARC machines have crypto engines that do this, and many other algorithms; the same goes for POWER.

I'm not quite sure why no mainstream x86 extensions have extra stuff for this (VIA PadLock does SHA), but I can take a guess: ARX is already reasonably efficient in software (more so with newer designs, eg. BLAKE2 + AVX2) - unlike AES. And unlike AES there aren't really many problems with side channels, especially when we think about GCM.


I am not following this:

>"A CRC32 over a couple megs of data is only good enough to likely catch "gross"

The CRC on TCP MSS or on the ethernet frame would never be higher than 1460 or 1500 byes(or 9K in the case of jumbo frames), how or where would a CRC over multiple megs come into play in regular networking?

CRC is used on jumbo ethernet of of sized of 9K and it works just fine. In fact this is the reason that 9K is the limit on jumbo frames.


The author of the article was arguing that you should add a CRC to the data being sent across TCP/IP. So that would encompass the whole payload, not just per-packet.
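
Something along these lines, say (a minimal sketch of application-level framing, not the article's actual code: a length prefix plus a CRC32 trailer over the whole payload, verified on receipt):

    import struct
    import zlib

    def frame(payload):
        """Prefix the payload with its length and append a CRC32 trailer."""
        return struct.pack("!I", len(payload)) + payload + struct.pack("!I", zlib.crc32(payload))

    def unframe(data):
        """Recover the payload and reject it if the CRC doesn't match."""
        (length,) = struct.unpack_from("!I", data, 0)
        payload = data[4:4 + length]
        (crc,) = struct.unpack_from("!I", data, 4 + length)
        if zlib.crc32(payload) != crc:
            raise ValueError("payload corrupted in transit")
        return payload

    assert unframe(frame(b"hello")) == b"hello"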


I see, I misunderstood the context of the comment. Thanks.


This is what worries me about the HTTP "Cache-Control: immutable" proposal. Immutable caching should come with a reliable way to be sure that the content that is going to become immutable is not corrupted in the first place.


Not if it's over HTTPS, since it guarantees data integrity. Corrupted network bits will cause the connection to fail.

If the file is corrupted at the edge server an SRI attribute [1] can detect it, although it's not supported on IE and Safari yet.

[1] https://developer.mozilla.org/en-US/docs/Web/Security/Subres...
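
For reference, the integrity value is just a base64-encoded digest prefixed with the hash name; a small sketch ("app.js" is only a hypothetical file name):

    import base64
    import hashlib

    def sri_value(path, algo="sha384"):
        """Build a Subresource Integrity value: '<algo>-' + base64(digest)."""
        with open(path, "rb") as f:
            digest = hashlib.new(algo, f.read()).digest()
        return algo + "-" + base64.b64encode(digest).decode("ascii")

    # Used as: <script src="app.js" integrity="sha384-..." crossorigin="anonymous">
    print(sri_value("app.js"))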


Better hope the file isn't corrupted while it's being written to disk (or while read, or in place, or…).

IMHO, anything immutable should be identified with its SHA2+ hash.


Doesn't a CTRL-F5 force a refresh of all resources? That's always an option (although not one non-technical users will know).


Yep, a non-technical user will rather go buy a new computer than press Ctrl+F5.


Another case I've seen where TCP checksum failed was in old Eee PCs. There, the TCP checksum check on incoming packets was offloaded to the network interface. Unfortunately, some Linux driver and/or hardware bug occasionally corrupted the packet payload after the check passed. The workaround was to disable the hardware check.

https://www.tablix.org/~avian/blog/archives/2010/10/eeepc_no...


See the previous Hacker News discussion from October 2015: https://news.ycombinator.com/item?id=10360108

I'm glad people are still interested in this subject. :)


Can we somehow strengthen the TCP checksum? Asking all application-layer protocols to implement something that the lower layer is supposed to give for free seems unwise in the long run.


The end to end principle says no, you need application level checksums anyway


It's written into the standard, so it's hard to change. If you strengthen it, you run the risk of vendors offloading it and getting it wrong some tiny fraction of the time. The lower layer is always offloaded so it's actually the enemy: it's not an end-to-end checksum!


The end to end principle strikes again.


I still don't understand how 4-byte CRCs are sufficient. Can someone explain?


It's a statistical thing I suppose. 2^32 is 4 billion. So the chances of a garbled packet happening to have the same CRC as your original packet are 1 in 4 billion. Ultimately no system is completely impervious to unlikely freak collisions. For example even if you duplicated each packet there is still a chance your original and duplicated packet could freakishly be corrupted in exactly the same way. Having said all that I suppose a 64 bit CRC would get you a lot closer to monkeys typing Shakespeare territory.


After looking this up, this does not seem to be correct. I'll quote from Noway2 here http://www.eng-tips.com/viewthread.cfm?qid=263494.

"There are problems in trying to compute an error rate for which the CRC will fail to detect a bad communication. First, whether or not the error gets through will be highly dependent on the message. Second, CRC was designedto catch short bursts of contiguous bit errors. If the number of error bits is greater than this, there is a chance that CRC will fail.

Here is an example: Consider these two hex strings: 1) 4B04B6F9BF002F002C002E002CD12E 2) 00260010BF002F002C002E002CD12E

This is from an actual application and uses a 16 bit CRC algorithm to calculate a check value. In both cases the check value is 0x2ED1 (byte swapped). The CRC is the standard CRC used in Modbus, CCITT I believe.

Mathematically speaking, an X bit CRC will detect all consecutive bit errors less than X bits. When the error burst is bigger than X, then an X bit CRC will detect errors at a rate of 1-2^-X. For example, a 16 bit CRC has a 99.99847412109375% chance of catching the error burst.

As you can see from my example above with the two strings, in reality, error bursts longer than X bits DO HAPPEN AND DO GET THROUGH!"


Note that CRCs are designed to cover a certain message length, and are designed to catch bit errors within that message. The probability of a single bit error getting through is much, much lower than the naive 1 in 4 billion, the probability of a 2-bit error getting through is somewhat higher than for a 1-bit error (but still lower than 1 in 4 billion), and so on.

(e.g. https://users.ece.cmu.edu/~koopman/networks/dsn02/dsn02_koop... argues CRC32 will detect all 1, 2 and 3 bit errors in an ethernet frame)


Consider the volume of traffic going over big backbones though (e.g. 100 Gbps). Let's say one of the devices terminating this backbone has a piece of memory go bad that mangles IP payloads. At 8.3 Mpps (1500-byte MTU line rate), a 1 in 4 billion chance would trigger on average every 9 minutes or so.

That only assumes random checksum collisions, but it's interesting to see how the volumes of traffic moved around the Internet now have made 'rare' events much more frequent.
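
Back-of-the-envelope version of that figure (assuming every packet is mangled and a uniform 1-in-2^32 miss rate, the worst case described above):

    pps = 100e9 / (1500 * 8)           # ~8.3 million packets per second at line rate
    seconds_per_miss = 2 ** 32 / pps   # ~515 seconds between undetected errors
    print(seconds_per_miss / 60)       # ~8.6 minutes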


The ethernet CRC is designed for packets on the wire. On the wire you have a bit error rate and sometimes burst errors. For bit errors, as long as no packet has 32 bit errors the CRC will always catch them. For burst errors, if the error sets all bits to zero (or to one) then the CRC will catch it.

This is different from random memory corruption.

Note that while a packet is in the memory of a router, you typically only have the TCP checksum to protect it, which is a rather weak 16-bit sum. So it is safe to assume that TCP will fail to detect many failures, and if you care, add a SHA-2 hash.

Why SHA-2? Because while MD5 is perfectly fine for random errors, we should make sure to kill all use of insecure hash functions. Otherwise they keep popping up in contexts where errors are not random.
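
For reference, the TCP checksum is the 16-bit one's-complement sum from RFC 1071; a minimal sketch (the sample bytes are made up) shows why it's weak: two offsetting corruptions cancel out and go unnoticed.

    def internet_checksum(data):
        """RFC 1071 Internet checksum: one's-complement sum of 16-bit words."""
        if len(data) % 2:
            data += b"\x00"                     # pad odd-length input
        total = 0
        for i in range(0, len(data), 2):
            total += (data[i] << 8) | data[i + 1]
        while total >> 16:                      # fold carries back into 16 bits
            total = (total & 0xFFFF) + (total >> 16)
        return ~total & 0xFFFF

    good = bytes([0x12, 0x34, 0x56, 0x78])
    bad = bytes([0x13, 0x34, 0x55, 0x78])       # one word +0x0100, another -0x0100
    assert internet_checksum(good) == internet_checksum(bad)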


Current generation Cisco and Nokia core routers seem to be around 400Gbps per slot. Juniper is claiming something wildly higher (up to ~1.5Tbps) so I guess they are ahead but double counting like some of these guys do. But yes, at these speeds, definitely more common, and getting more frequent all the time.

http://www.juniper.net/techpubs/en_US/release-independent/ju...

EDIT: changed 3Tbps to 1.5Tbps.


>Consider the volume of traffic going over big backbones though (e.g. 100 Gbps). Let's say one of the devices terminating this backbone has a piece of memory go bad that mangles IP payloads. At 8.3 Mpps (1500-byte MTU line rate), a 1 in 4 billion chance would trigger on average every 9 minutes or so.

I think someone would notice when only one packet per nine minutes successfully traverses their 100 Gbps core switch. So mission accomplished. :)


One problem with simple CRC calculations is the question of whether it's correct to assume a random distribution of errors: if hardware failures produce clustered failures you might find that a particular device or model produces errors at much higher rates.

ftp://ftp.cis.upenn.edu/pub/mbgreen/papers/ton98.pdf

“First, a non-uniform distribution of data makes failure of the TCP checksum far more likely than one would naively expect. The undetected splice rate in our data for the 16-bit TCP checksum over real data is comparable to uniform data with a 10-bit checksum.

Second, checksum distributions on modest amounts of real data are substantially different from the distributions one would anticipate for uniformly distributed data. This skewed distribution does result in significantly higher failure rates of the TCP checksum. In particular, if a router or host has a buffering problem that causes adjacent packets to be merged, the TCP checksum might fail .1% of the time rather than the 0.0015% of the time that purely random data distribution would suggest.”

https://users.ece.cmu.edu/~koopman/pubs/paultisch05_dsn_crc_...

http://conferences.sigcomm.org/sigcomm/2000/conf/paper/sigco...


Interesting that it took 4 months to get the kernel bug too.



