This would be so cool, but we need to think harder about how to actually do it, and how to make enough money down the line to train more models with even cooler features.
Neither of those looks like it has a generative AI component.
We (as a society) desperately need a way to train these models in a federated, distributed manner. I would be more than happy to commit some of my own compute to training open audio / text / image / you-name-it models.
But (if I understand correctly) the current training architecture makes this, if not impossible, then nearly so.
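To make the bandwidth problem concrete, here's a minimal federated-averaging (FedAvg-style) sketch in Python/NumPy. Everything in it is hypothetical toy code (`local_update`, `fedavg_round`, the client count, and the model size are invented for illustration); the point is just that every round, each participant downloads the full weight set and uploads it back, which is what makes naive federation of billion-parameter models so painful over consumer connections.

```python
# Toy federated-averaging (FedAvg) sketch. All names and sizes here are
# hypothetical; this illustrates the per-round traffic, not a real system.

import numpy as np

NUM_CLIENTS = 4
MODEL_SIZE = 1_000_000  # toy model; real diffusion/LLM weights are billions of params

def local_update(weights: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Stand-in for a few local SGD steps on a client's private data."""
    gradient_estimate = rng.normal(scale=0.01, size=weights.shape)
    return weights - gradient_estimate  # one toy "training" step

def fedavg_round(global_weights: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    # Each client downloads the full model, trains locally, uploads full weights.
    client_weights = [local_update(global_weights.copy(), rng)
                      for _ in range(NUM_CLIENTS)]
    # The coordinator averages the uploads. With N billion parameters, this
    # means shipping N billion floats per client, per round, in each direction.
    return np.mean(client_weights, axis=0)

rng = np.random.default_rng(0)
w = rng.normal(size=MODEL_SIZE).astype(np.float32)
for round_idx in range(3):
    w = fedavg_round(w, rng)
    # download + upload for every client in this round
    print(f"round {round_idx}: bytes exchanged ~= {2 * NUM_CLIENTS * w.nbytes:,}")
```

Gradient-compression and sparse-update schemes exist, but the basic tension (model size times rounds times participants) is why distributed volunteer training remains hard today.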
I'm just watching from the sidelines until there's a Stable Diffusion 1.5 equivalent for music generation: open license, runs in under 8GB of VRAM, large communities sharing fine-tuned models, plugins like ControlNet, etc. Then AI music generation will really take off and the results will be flawless.
I know it will happen, just like SD happened after DALL-E. Bonus points if whoever builds it uses C++ and Vulkan instead of PyTorch and CUDA. :-)
Any plans to release the model(s) under an open license?