For now it is CPU only yes, uses AVX instructions. But it's pretty fast anyway, try it out. I have it running on my mbp M1 and it's pretty decent. I think GPU support will come eventually. I wrote an app that uses the openai API and it was nice and simple to just point it at my own local service instead.
Edit: I think textgen itself can support this nowadays