You can't on modest hardware: the VRAM you need is a function of model size, the KV cache (which grows with context length), and the quantization of both the model weights and the K/V cache. 16 GB isn't much, really. You need more VRAM, and for most folks the easiest route is a Mac with unified memory. You can get a 128 GB Mac, but it's not cheap. If you're handy and resourceful, you can build a GPU cluster instead.
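For a sense of where those numbers come from, here's a back-of-the-envelope calculator in Python. It's a minimal sketch: the function and the model shape in the example (layers, KV heads, head dim, roughly Llama-70B-like with grouped-query attention) are illustrative assumptions, not exact specs for any particular model.

```python
# Rough VRAM estimate for running an LLM locally.
# Hypothetical helper, not from any library.

def vram_gb(params_b, weight_bits, n_layers, n_kv_heads, head_dim,
            context_len, kv_bits, overhead_gb=1.5):
    # Weights: billions of params * bits per weight / 8 bits per byte -> GB
    weights_gb = params_b * weight_bits / 8
    # KV cache: 2 tensors (K and V) per layer, one vector per token
    kv_bytes = 2 * n_layers * n_kv_heads * head_dim * context_len * kv_bits / 8
    return weights_gb + kv_bytes / 1e9 + overhead_gb  # plus runtime overhead

# A 70B model at 4-bit quant, 8k context, fp16 KV cache (assumed shape):
print(vram_gb(params_b=70, weight_bits=4, n_layers=80, n_kv_heads=8,
              head_dim=128, context_len=8192, kv_bits=16))
# ~39 GB: well past 16 GB, but fits comfortably in a 128 GB Mac.
```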
I never thought I'd say it, but the 128 GB MacBook Pro is probably the most cost-efficient (and probably easiest) way of doing it. New NVIDIA cards (the 5090) are 32 GB and supposedly just shy of $2k, and a used A100 40 GB is about $8k.
All in all, not a cheap hobby (if you're not doing it for work).