30B models are in no way comparable to GPT-4, or even to GPT-3. There is no spatial comprehension in models with fewer than 125B params (or at least I've had no access to such a model). The 130B GLM seems really interesting as a crowd-sourced starting point though, as does the 176B BLOOMZ, which requires additional training (it is underfitted as hell).
BLOOMZ was better than GPT-3.5 for sure, but yeah, underfitted...