I highly doubt it’s a product of the raw training dataset, because I had the opposite problem. The token for “background” introduced intense blur across the whole image almost regardless of how it was used in the prompt, which is interesting because the model’s prompt interpretation is otherwise much better.
It seems likely that they did heavy calibration of the text conditioning, as well as a lot of tuning effort to make the model prefer images that are “flux-y”.
Whatever process they’re following, they’ve inadvertently made the model overly sensitive to certain terms, to the point that merely including them has a stronger effect than a LoRA.
The photos you’re showing aren’t especially noteworthy in the grand scheme of things. It doesn’t take a lot of effort to “escape” the default image styling and get something hyperrealistic. Personally, I don’t think they’re trying to hide the hyperrealism so much as trying to default to imagery that people want.