My guess is that human creativity is mostly just technical skill + taste + random noise. DALL-E has the first one, and we could probably approximate the last one, so the middle one, taste, is the only one that needs work. The lack of taste feels like a similar issue to how GPT often ends up trailing off. Maybe some kind of improved attention would work? Or an improved version of the sampling trick?
As something to play with to get interesting ideas, and to take the first few cuts at implementing them, DALL-E is great. After that, bring the human's technical skill and ability to curate into the mix.