"Unfit for deployment" or "not intended for deployment" is semi-standard wording for research models that are just raw language models with none of the safety/bias/offensiveness filtering that is usually desired for product applications. For example, if you deploy it as a customer-service chatbot, it might tell your customers to kill themselves, or call them racial slurs.
It doesn't mean that there's anything technically wrong with the language model per se as a model of language, just that there has been no effort made to ensure it's fit to be deployed as-is for any given generative-AI use case, and the model authors would prefer you didn't do that.
This requires retraining from scratch, so no, you can't reuse the Llama 2 pretrained weights.
As far as I can tell, you can take the Llama 2 modelling code, training infrastructure, and training data, apply the proposed modification (they provide a PyTorch nn.Module that should be a drop-in replacement for nn.Linear), and run the training if you have enough compute. That doesn't mean it will work in practice, since there are always lots of practical problems, but it should work in principle.
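To make the "drop-in replacement" idea concrete, here is a rough sketch of how such a swap could look in PyTorch. `ReplacementLinear` and `swap_linear_layers` are hypothetical names I made up for illustration, and the toy model is not Llama 2. The placeholder body is just an ordinary linear map, so treat it as a stand-in for the authors' actual module, not as their implementation.

```python
import torch
from torch import nn
import torch.nn.functional as F


class ReplacementLinear(nn.Module):
    """Hypothetical stand-in with the same input/output shape as nn.Linear."""

    def __init__(self, in_features: int, out_features: int, bias: bool = True):
        super().__init__()
        # Fresh random weights: nothing is copied from the layer being replaced,
        # which is why the model has to be trained from scratch afterwards.
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        if bias:
            self.bias = nn.Parameter(torch.zeros(out_features))
        else:
            self.register_parameter("bias", None)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Placeholder computation; the real proposed layer would go here.
        return F.linear(x, self.weight, self.bias)


def swap_linear_layers(module: nn.Module) -> int:
    """Recursively replace every nn.Linear in `module`; return how many were swapped."""
    swapped = 0
    for name, child in list(module.named_children()):
        if isinstance(child, nn.Linear):
            new_layer = ReplacementLinear(
                child.in_features, child.out_features, bias=child.bias is not None
            )
            setattr(module, name, new_layer)
            swapped += 1
        else:
            swapped += swap_linear_layers(child)
    return swapped


# Toy feedforward stack standing in for a transformer block's MLP.
toy = nn.Sequential(nn.Linear(16, 64), nn.GELU(), nn.Linear(64, 16))
num_swapped = swap_linear_layers(toy)
out = toy(torch.randn(2, 16))
```

The swap itself is the easy part. Since the new layers start from random weights, everything after this point is a normal (and expensive) pretraining run.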
https://huggingface.co/pbelcak/UltraFastBERT-1x11-long