This is neat. Model weights are split into their layers and distributed across several machines, which then register themselves in a big hash table when they're ready to perform inference or fine-tuning "as a team" over their subset of the layers.
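A rough sketch of that idea, in Python (not the project's actual API; the table, addresses, and layer split are made up for illustration): each peer announces the block of layers it hosts in a shared hash table, and a client walks the table to assemble a chain of peers covering the whole model.

```python
# Minimal sketch: peers announce which layers they host in a shared
# hash table, and a client chains them into a full forward pass.

PEER_TABLE = {}  # (first_layer, last_layer) -> peer address; stand-in for a real DHT

def announce(peer_addr, first_layer, last_layer):
    """A peer registers the contiguous block of layers it can serve."""
    PEER_TABLE[(first_layer, last_layer)] = peer_addr

def route(num_layers):
    """Pick a chain of peers that together cover layers 0..num_layers-1."""
    chain, needed = [], 0
    while needed < num_layers:
        # choose any peer whose block starts where the previous one left off
        block = next(((lo, hi) for (lo, hi) in PEER_TABLE if lo == needed), None)
        if block is None:
            raise RuntimeError(f"no peer is serving layer {needed}")
        chain.append((PEER_TABLE[block], block))
        needed = block[1] + 1
    return chain

# Example: three machines each host a third of a 24-layer model
announce("10.0.0.1:7000", 0, 7)
announce("10.0.0.2:7000", 8, 15)
announce("10.0.0.3:7000", 16, 23)
print(route(24))  # [('10.0.0.1:7000', (0, 7)), ('10.0.0.2:7000', (8, 15)), ...]
```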
It's early, but I've been working on hosting model weights in a Docker registry for https://github.com/jmorganca/ollama, mainly for the content addressability: Ollama verifies that the correct weights are downloaded every time, and ultimately weights can be fetched by their content instead of by a name or URL (which may change!). Perhaps a good next step would be to split models by layer and store each layer independently, for use cases like this (or even just for downloading and running larger models across several "local" machines).
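To make the content-addressability point concrete, here's a small sketch (not Ollama's actual implementation; the blob store, manifest, and digests are hypothetical): each weight blob is stored and fetched by its SHA-256 digest, so the client can verify it received exactly the bytes it asked for, and a manifest could list one digest per layer so machines only pull the layers they need.

```python
# Sketch of content-addressable weight blobs, keyed by SHA-256 digest
# the way an OCI registry keys its blobs.

import hashlib
from pathlib import Path

STORE = Path("blob-store")  # stand-in for a registry's blob storage

def push_blob(data: bytes) -> str:
    """Store a blob under its digest and return the digest."""
    digest = "sha256:" + hashlib.sha256(data).hexdigest()
    STORE.mkdir(exist_ok=True)
    (STORE / digest.replace(":", "_")).write_bytes(data)
    return digest

def pull_blob(digest: str) -> bytes:
    """Fetch a blob by digest and verify the content matches."""
    data = (STORE / digest.replace(":", "_")).read_bytes()
    if "sha256:" + hashlib.sha256(data).hexdigest() != digest:
        raise ValueError(f"digest mismatch for {digest}")
    return data

# A model manifest could then list one digest per layer, so different
# machines (or peers) only fetch the layers they are responsible for.
layer_digests = [push_blob(f"layer-{i} weights".encode()) for i in range(3)]
assert all(pull_blob(d) for d in layer_digests)
```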
Ah, is it possible to tone down the self-promotion? I've been seeing your comments for ollama on many LLM-related posts here.
> Please don't use HN primarily for promotion. It's ok to post your own stuff part of the time, but the primary use of the site should be for curiosity.
Surely in this case it would've been possible to comment about OP's work while leaving out the free backlink to your project. Just my 0.02