If this hypothetical future is one where mixtures of experts is predominant, where each expert fits on a node, then the nodes only need the bandwidth to accept inputs and give responses — they won't need the much higher bandwidth required to spread a single model over the planet.
If you think RAM speeds are slow for the transformer or inference, imagine what 100Mbs would be like.