boroboro4 on July 17, 2024 \| on: Exo: Run your own AI cluster at home with everyday...

It's worth noting that the number you're quoting is for the embeddings passed between layers. If you split your model across 5 nodes, you will need to send this 32 KB 5 times. It's also per token: processing 1K tokens comes to 32 MB of data per transfer, and 1M tokens to 32 GB...
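A rough sketch of that arithmetic, for anyone who wants to plug in their own numbers. It assumes decimal units (32 KB = 32,000 bytes) and takes the comment's count of 5 transfers for a 5-node split at face value; the names `BYTES_PER_TOKEN`, `HOPS`, `traffic_bytes`, and `human` are made up for illustration and are not from Exo.

```python
# Back-of-the-envelope estimate; the figures come from the comment above.
BYTES_PER_TOKEN = 32_000  # assumption: 32 KB per token per transfer, decimal units
HOPS = 5                  # assumption: 5 transfers for a 5-node split, as the comment states


def traffic_bytes(tokens: int, hops: int = 1) -> int:
    """Bytes sent between nodes for `tokens` tokens over `hops` transfers."""
    return BYTES_PER_TOKEN * tokens * hops


def human(n: float) -> str:
    """Format a byte count using decimal (power-of-1000) units."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if n < 1000:
            return f"{n:g} {unit}"
        n /= 1000
    return f"{n:g} PB"


if __name__ == "__main__":
    for tokens in (1, 1_000, 1_000_000):
        per_transfer = human(traffic_bytes(tokens))
        all_transfers = human(traffic_bytes(tokens, HOPS))
        print(f"{tokens:>9} tokens: {per_transfer} per transfer, "
              f"{all_transfers} over {HOPS} transfers")
```

The per-transfer column reproduces the 32 KB, 32 MB, and 32 GB figures; the last column just multiplies by the number of transfers, so it depends entirely on how the model is actually split.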