The internet is big-endian, and data sent over the wire is generally converted to/from BE. For example, the numbers in IP and TCP headers are big-endian, and any RFC that defines a protocol involving binary data will generally go with big-endian numbers.
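A minimal C sketch of what that conversion looks like on the sending side, using the classic htons()/htonl() helpers from the BSD sockets API:

```c
#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htons()/htonl(): host-to-network-order (big-endian) */

int main(void) {
    uint16_t port = 8080;          /* host byte order */
    uint32_t addr = 0x7f000001u;   /* 127.0.0.1, host byte order */

    /* Header fields go on the wire in network byte order (big-endian),
       so a little-endian host byte-swaps before sending; on a
       big-endian host these calls are no-ops. */
    uint16_t wire_port = htons(port);
    uint32_t wire_addr = htonl(addr);

    printf("port: host %#06x -> wire %#06x\n", (unsigned)port, (unsigned)wire_port);
    printf("addr: host %#010x -> wire %#010x\n", (unsigned)addr, (unsigned)wire_addr);
    return 0;
}
```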
I believe this dates from Bolt Beranek and Newman (BBN) basing the IMP on a BE architecture. Similarly, computers tend to be LE these days because that's what the "winning" PC architecture (x86) uses.
The low-level parts of the network are big-endian because they date from a time when a lot of networking was done on big-endian machines. Most modern protocols and data encodings above UDP/TCP are explicitly little-endian because x86 and most modern ARM are little-endian. I can't remember the last time I had to write a big-endian protocol codec; that was common in the 1990s, but that was a long time ago. Even for protocols that explicitly support both big- and little-endian encodings, I never see an actual big-endian encoding in the wild, and some implementations don't bother to support them even though they are part of the standard, with seemingly little consequence.
There are vestiges of big-endian in the lower layers of the network, but that is a historical artifact from when many UNIX servers were big-endian. It makes no sense to do new development with big-endian formats, and in practice they have become quite rare, as one would reasonably expect.
Is it though? Because my experience is very different from GP's: git uses network byte order for its binary files, msgpack and cbor use network byte order, websocket uses network byte order, …
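To make the msgpack example concrete: its uint 32 encoding is a type byte followed by the value in big-endian (network) order. A from-memory sketch (the helper name is mine):

```c
#include <stdint.h>
#include <stddef.h>

/* msgpack encodes a uint 32 as the type byte 0xce followed by the
   value in big-endian (network) byte order. */
static size_t msgpack_put_u32(uint8_t *out, uint32_t v) {
    out[0] = 0xce;               /* uint 32 type tag */
    out[1] = (uint8_t)(v >> 24); /* most significant byte first */
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 8);
    out[4] = (uint8_t)(v);
    return 5;                    /* bytes written */
}
```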
> any RFC that defines a protocol including binary data will generally go with big-endian numbers
I'm not sure this is true, and if it is, it really shouldn't be. There are effectively no modern big endian CPUs. If you're designing a new protocol there is, afaict, zero benefit to serializing anything as big endian.
It's unfortunate that TCP headers and networking are big endian. It's a historical artifact.
Converting data to/from BE is a waste. I've designed and implemented a variety of simple communication protocols. They all define the wire format to be LE. Works great, zero issues, zero regrets.
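For what it's worth, an LE wire format doesn't even require detecting host endianness: shifts operate on values rather than memory, so helpers like these (names are mine) are portable to any host:

```c
#include <stdint.h>

/* Write a 32-bit value to the buffer in little-endian order,
   independent of host endianness. */
static void put_le32(uint8_t *buf, uint32_t v) {
    buf[0] = (uint8_t)(v);
    buf[1] = (uint8_t)(v >> 8);
    buf[2] = (uint8_t)(v >> 16);
    buf[3] = (uint8_t)(v >> 24);
}

/* Read it back the same way. */
static uint32_t get_le32(const uint8_t *buf) {
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}
```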
> There are effectively no modern big endian CPUs.
POWER9, Power10 and s390x/Telum/etc. all say hi. The first two in particular have a little endian mode and most Linuces run them little, but they all can run big, and on z/OS, AIX and IBM i, must do so.
I imagine you'll say effectively no one cares about them, but they do exist, are used in shipping systems you can buy today, and are fully supported.
Yeah, those are a teeny tiny fraction of CPUs on the market. Little-endian should be the default, and the rare big-endian CPU gets to run the slow path.
Almost no code anyone here writes will run on those chips. It's not something most programmers need to worry about, and those who do can easily add support where it's necessary.
The point is that big endian is an extreme outlier.
If you're writing an implementation of one of those "early protocols", sure. If not, call a well-known library, let it do whatever bit twiddling it needs to, and get on with what you were actually doing.
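For example, in C the standard sockets API does the twiddling for you: inet_pton() parses an address and stores it already in network byte order, so the caller never swaps bytes by hand:

```c
#include <stdio.h>
#include <arpa/inet.h>

int main(void) {
    struct in_addr a;
    /* inet_pton() writes the address into 'a' in network byte order;
       ntohl() is only used here to print it as a host-order integer. */
    if (inet_pton(AF_INET, "192.0.2.1", &a) == 1)
        printf("address as host-order int: 0x%08x\n", (unsigned)ntohl(a.s_addr));
    return 0;
}
```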