nabakin on Dec 8, 2023 | on: Mistral "Mixtral" 8x7B 32k model [magnet]
Not a hot take; I think you're right. If it were scaled up to 70b, I think it would be better than Llama 2 70b. Maybe if it were then scaled up to 180b and turned into a MoE, it would be better than GPT-4.