It could go even further, in theory. The kind of ops that the current crop of LLMs needs is very simple, and at the same time there's no hard requirement for precision (which is why 4-bit quantization works so well). This means that unconventional approaches such as analog computing are potentially in play again - it's easy to do addition and multiplication in an analog circuit if you don't care about the answer being precise, and in theory one could pack a lot more of those circuits into the same space.
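
To make the "precision doesn't matter much" point concrete, here's a rough Python sketch. It is a toy model and not a simulation of any real analog hardware: the per-multiply Gaussian relative error, the 5% noise level, and the function names (`noisy_dot`, `quantize_4bit`) are all assumptions made up for illustration. It compares an exact dot product against one where every multiplication is slightly off, and against one where the weights are rounded to 16 levels.

```python
import random


def exact_dot(xs, ws):
    """Reference dot product in ordinary floating point."""
    return sum(x * w for x, w in zip(xs, ws))


def noisy_dot(xs, ws, rel_noise, rng):
    """Dot product where each multiply is off by a random relative error.

    The Gaussian error is an assumed stand-in for analog imprecision,
    not a model of any particular circuit.
    """
    return sum(x * w * (1.0 + rng.gauss(0.0, rel_noise)) for x, w in zip(xs, ws))


def quantize_4bit(w, scale):
    """Symmetric 4-bit quantization: snap w to one of 16 levels in [-8, 7] * scale."""
    q = max(-8, min(7, round(w / scale)))
    return q * scale


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    n = 4096
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ws = [rng.uniform(-1.0, 1.0) for _ in range(n)]

    scale = 1.0 / 7.0  # assumed scale so that weights in [-1, 1] fit the 4-bit range
    ws_q = [quantize_4bit(w, scale) for w in ws]

    print("exact          :", exact_dot(xs, ws))
    print("5% noisy mults :", noisy_dot(xs, ws, 0.05, rng))
    print("4-bit weights  :", exact_dot(xs, ws_q))
```

The individual errors are independent, so they partly cancel when summed: the absolute error of the noisy sum grows roughly with the square root of the number of terms rather than linearly. That is one plausible reason sloppy arithmetic is tolerable for this kind of workload, though how well it carries over to a full model is not something this toy shows.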