Open AI vs. "OpenAI": very different things.

The fact that I can do this on commodity hardware with a 4GB model, one that understands both text and images, just absolutely blows my mind.

I almost feel like in the near future, a 100GB model may be able to handle speech -> text and video -> live scene graph entirely offline. That could give a robot a base-level physical understanding of our world, like a 4-year-old has (objects, their relationships to other objects, and their behaviors).