I don't think this is right. I've come to understand the 40 Gbps number to be the total bidirectional bandwidth of a single port, with overhead taken into account.
That adds up: a single port maxes out at 22 Gbps full duplex, so shave off a 10% overhead factor, double it, and you get roughly 40 Gbps.
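To make that arithmetic explicit, here is a rough sketch in Python. The 22 Gbps per-direction rate and the 10% overhead factor are just the figures from this post, not values taken from the Thunderbolt spec, and the function name is made up for illustration.

```python
def tb3_headline_gbps(per_direction_gbps=22.0, overhead=0.10):
    """Usable rate per direction after overhead, summed over both directions.

    Both defaults are the figures quoted in the post (assumptions, not spec values).
    """
    usable_one_way = per_direction_gbps * (1.0 - overhead)
    return 2 * usable_one_way


total = tb3_headline_gbps()
print(f"{total:.1f} Gbps")  # prints "39.6 Gbps", i.e. roughly the advertised 40
```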
If that's true, how is it possible for TB3 to drive dual 4K monitors at 24 bpp, 60 Hz and still have enough bandwidth left over to drive gigabit Ethernet and other peripherals?
Simple: it's not doing what you think. Actual 4K monitors do not consume the theoretical maximum bandwidth. In most cases they consume only about 40% of the maximum possible bandwidth of 8 lanes of DP (which is the same as a single TB3 port).
And yes, despite the generic claims of support for two 4K displays, it is possible to find two 4K displays that you can't run simultaneously on one TB3 port. It's just fairly unlikely.
In practice you'll have two 4K displays using about 17.28 Gbps of the bandwidth, leaving the rest for other things. And there's a good reason for that: it enables MST at 4K 60 Hz. An example is the Dell P2415Q. It supports MST daisy chaining, and you can use two of those monitors together (not with macOS, but with Windows) from a single DP 1.2 port. That's only possible because each display doesn't actually demand more than 8.5-ish Gbps of bandwidth.
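For anyone who wants to check how these numbers hang together, here is a small sketch. It takes the 17.28 Gbps and 40 Gbps figures from this thread as given, and assumes the "8 lanes of DP" are DP 1.2 (HBR2) lanes at 5.4 Gbps raw each; the function and constant names are only illustrative.

```python
HBR2_LANE_GBPS = 5.4       # raw DP 1.2 (HBR2) line rate per lane (assumed link mode)
DP_LANES = 8               # lane count quoted in the thread
TB3_TOTAL_GBPS = 40.0      # headline TB3 figure from the thread
TWO_DISPLAYS_GBPS = 17.28  # combined display traffic quoted in the thread


def display_budget(displays_gbps=TWO_DISPLAYS_GBPS, n_displays=2):
    """Break the quoted display traffic down against the quoted link figures."""
    raw_dp_gbps = HBR2_LANE_GBPS * DP_LANES
    return {
        "per_display_gbps": displays_gbps / n_displays,
        "share_of_raw_dp": displays_gbps / raw_dp_gbps,
        "left_for_other_traffic_gbps": TB3_TOTAL_GBPS - displays_gbps,
    }


budget = display_budget()
for name, value in budget.items():
    print(f"{name}: {value:.2f}")
```

Under those assumptions each display comes out at 8.64 Gbps (the "8.5-ish" above), the pair uses 40% of the 43.2 Gbps raw rate of 8 lanes, and about 22.7 Gbps is left for everything else.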
There are many TB3 docks available, like https://mymantiz.com/products/md-01-zeus, that offer dual 4K @ 60 Hz, and as far as I know they work just fine on a MacBook Pro.
Are you sure you're not confusing it with USB-C, which can only drive one 4k@60Hz monitor per connector? USB-C connectors are compatible with TB3.