Not OP, but yeah, ollama is super easy to install.
I just installed the Docker version and created a little wrapper script which starts and stops the container. Installing different models is trivial.
I think I already had CUDA set up, not sure if that made a difference. But it's quick and easy. Set it up, fuck around for an hour or so while you get things working, then you've got your own local LLM you can spin up whenever you want.
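For what it's worth, a wrapper like that might look something like this. It's just a sketch: the container name `ollama`, the `pull` subcommand, and the function name are my own choices, the image is the official `ollama/ollama` from Docker Hub, and `--gpus=all` assumes you have the NVIDIA container toolkit installed.

```shell
#!/bin/sh
# Sketch of a start/stop wrapper for an Ollama Docker container.
# Assumptions: Docker is installed, the NVIDIA container toolkit is set up
# (needed for --gpus=all), and the container is named "ollama".

ollama_ctl() {
  case "$1" in
    start)
      # Reuse the container if it already exists; otherwise create it
      # from the official ollama/ollama image.
      docker start ollama 2>/dev/null ||
        docker run -d --gpus=all -v ollama:/root/.ollama \
          -p 11434:11434 --name ollama ollama/ollama
      ;;
    stop)
      docker stop ollama
      ;;
    pull)
      # Install a model inside the running container,
      # e.g. `ollama_ctl pull llama3`.
      docker exec ollama ollama pull "$2"
      ;;
    *)
      echo "usage: ollama_ctl {start|stop|pull <model>}" >&2
      return 1
      ;;
  esac
}

# Example: ollama_ctl start && ollama_ctl pull llama3
```

The named volume (`-v ollama:/root/.ollama`) is what makes model installs persist across container restarts, and port 11434 is Ollama's default API port.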