7B uses about 4.5 GB max and runs at 203.38 ms per token; 13B uses about 8 GB and runs at 396.58 ms per token.
30B needs about 20 GB and basically hangs on a machine with 16 GB of RAM, I guess due to swapping.
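
For anyone who prefers tokens per second, here is a rough sketch that converts the numbers above and checks which models fit in 16 GB. The dictionary and function names are made up for illustration, and the memory values are just the approximate figures quoted in this post, not fresh measurements.

```python
# Figures copied from the post above; nothing here is newly measured.
MS_PER_TOKEN = {"7B": 203.38, "13B": 396.58}           # 30B never finished a run
APPROX_RAM_GB = {"7B": 4.5, "13B": 8.0, "30B": 20.0}   # rough peak memory use
SYSTEM_RAM_GB = 16.0                                   # RAM on the test machine


def tokens_per_second(ms_per_token: float) -> float:
    """Convert a latency in milliseconds per token to a throughput."""
    return 1000.0 / ms_per_token


def fits_in_ram(model: str, system_ram_gb: float = SYSTEM_RAM_GB) -> bool:
    """True if the model's approximate memory use is within physical RAM."""
    return APPROX_RAM_GB[model] <= system_ram_gb


for model, ram_gb in APPROX_RAM_GB.items():
    if model in MS_PER_TOKEN:
        speed = f"{tokens_per_second(MS_PER_TOKEN[model]):.2f} tokens/s"
    else:
        speed = "no timing available"
    status = "fits" if fits_in_ram(model) else "exceeds RAM, likely to swap"
    print(f"{model}: ~{ram_gb} GB, {speed}, {status}")
```

This ignores whatever the OS and other processes are using, so a model that "fits" by this check can still end up swapping in practice.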