I'm with you there, but we still don't really know how it works inside, just that it does. The method, though, is roughly this: you take a huge pile of images, load them into multi-dimensional arrays (tensors, to use the fancy name), pair each one with a tag or caption, and train a model on those pairs. Then when you ask the system for an answer, it will put one out for you. So for example, with the astronaut riding the horse, there are, on some level, pictures of horses in the training data tagged 'horse,' and the model has learned the pixel patterns they share. Likewise with 'astronaut.' What is important is the scale: the data sets are absolutely massive, and the models themselves have billions of parameters.
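
If it helps to see the "images in a multi-dimensional array plus tags" part concretely, here's a toy sketch. Everything in it is made up for illustration: the image size, the random pixel values, and the helper names `fake_image` and `shape` are my own assumptions, and plain nested lists stand in for what a real system would hold in a proper tensor library. It only shows the data layout, not the training.

```python
import random

random.seed(0)  # make the fake pixel data reproducible

# Tiny made-up image size; real training images are far larger.
HEIGHT, WIDTH, CHANNELS = 4, 4, 3


def fake_image():
    """Return a HEIGHT x WIDTH x CHANNELS nested list of pixel values (0-255)."""
    return [
        [[random.randint(0, 255) for _ in range(CHANNELS)] for _ in range(WIDTH)]
        for _ in range(HEIGHT)
    ]


def shape(nested):
    """Walk down the first element of each level to report the dimensions."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# One tag per image: this pairing is the "tagging system".
tags = ["horse", "astronaut", "horse"]

# Stacking the images gives a 4-dimensional array:
# (number of images, height, width, colour channels).
batch = [fake_image() for _ in tags]

print(shape(batch))  # (3, 4, 4, 3)
for tag, image in zip(tags, batch):
    print(tag, "-> top-left pixel (R, G, B):", image[0][0])
```

The real thing is the same idea at an enormous scale, with a model trained on top of those image and tag pairs.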

Here's more of a 'not 15-year-old' explanation: https://ml.berkeley.edu/blog/posts/dalle2/