I think this means I either have to train the -pt model with my own instruction tuning or use another provider :(
24b would be too big to run on device, and I'm trying to keep my cloud costs low (meaning I can't afford to host a 24b 24/7).