It's worth noting that the number you're quoting is for the embeddings passed between layers. If you split your model across 5 nodes, you will need to send this 32 KB 5 times. Also, it's per token: if you process 1K tokens, that comes to 32 MB of data per transfer; for 1M tokens, 32 GB...
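To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 KB = 1000 bytes, which is what makes 32 KB × 1K tokens come out to 32 MB) and a flat 32 KB per token per transfer; the function names and the `num_transfers` parameter are made up for illustration.

```python
KB = 1000  # assumption: decimal units, so 32 KB * 1K tokens = 32 MB

def transfer_bytes(num_tokens, bytes_per_token=32 * KB, num_transfers=1):
    """Bytes moved when each token's embedding is sent num_transfers times."""
    return num_tokens * bytes_per_token * num_transfers

def human(n):
    """Format a byte count using decimal units."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if n < 1000 or unit == "TB":
            return f"{n:g} {unit}"
        n /= 1000

for tokens in (1_000, 1_000_000):
    one = transfer_bytes(tokens)
    five = transfer_bytes(tokens, num_transfers=5)
    print(f"{tokens:>9} tokens: {human(one)} per transfer, {human(five)} over 5 transfers")
```

Whether a 5-node split really means 5 transfers or 4 depends on whether the result is sent back to the first node, so treat `num_transfers` as a knob rather than a fixed number.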