Artificial bokeh (like in Portrait mode on an iPhone) is getting better and better. It went from looking cheesy to somewhat natural in only a few years. It’ll be a while before it’s close to the example you gave, but I can imagine a future where Camera.app has a “tap to blur” feature using live CV object detection.
I'm a serious luddite in a lot of ways but don't see a meaningful difference between blurring the background by focusing the lens a certain way, and blurring it after the fact using depth information captured in the photo.
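For what it's worth, the "blur after the fact" approach is conceptually simple once you have a depth map. Here's a rough Python/OpenCV sketch of the idea (the `fake_bokeh` function and the `subject_depth`/`max_blur` parameters are just illustrative names, not any phone vendor's actual pipeline):

```python
# Minimal sketch of depth-based background blur, assuming you already have
# an RGB image and a per-pixel depth map (e.g. from a portrait-mode capture).
import cv2
import numpy as np

def fake_bokeh(image, depth, subject_depth=1.0, max_blur=21):
    """Blur pixels more the farther they sit from the subject plane.

    max_blur is the Gaussian kernel size and should be odd.
    """
    blurred = cv2.GaussianBlur(image, (max_blur, max_blur), 0)
    # Normalize each pixel's distance from the subject plane into a 0..1 blend weight.
    weight = np.clip(np.abs(depth - subject_depth) / depth.max(), 0.0, 1.0)
    weight = weight[..., None]  # broadcast the weight over the color channels
    # Per-pixel blend between the sharp original and the blurred copy.
    return (image * (1 - weight) + blurred * weight).astype(image.dtype)

# Usage (hypothetical files): result = fake_bokeh(cv2.imread("dog.jpg"), depth_map)
```

Real pipelines do much fancier things (disk-shaped blur kernels, edge-aware matting around hair, highlight rendering), but the core "sharp where the subject is, blurry elsewhere" blend is basically this.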
The way phone image processing pipelines work these days, this doesn't seem too far from the truth. Images from phones are essentially AI renderings that use the sensor's pixel data as a prompt, and with each new device they bear less and less relation to that raw data.
If I could ask an AI for a photo of my dog and it looked better than one I took with my camera, why wouldn't I? We're representing the world as a series of pixels on a bitmap; it makes little difference to me where they came from.
In the hypothetical future scenario where the output is good enough, I don't think there's much point in distinguishing between something manipulated optically through the lens and something done in software.